This study focuses on an extreme rainfall event in East China during the mei-yu season, in which the capital city (Nanjing) of Jiangsu Province experienced a maximum 14-h rainfall accumulation of 209.6 mm and a peak hourly rainfall of 118.8 mm. The performance of two sets of convection-permitting ensemble forecast systems (CEFSs), each with 30 members and a 3-km horizontal grid spacing, is evaluated. The CEFS_ICBCs, using multiple initial and boundary conditions (ICs and BCs), and the CEFS_ICBCs Phys, which incorporates both multi-physics schemes and ICs/BCs, are compared to the CMA-REPS (China Meteorological Administration-Regional Ensemble Prediction System) with a coarser 10-km grid spacing. The two CEFSs demonstrate more uniform rank histograms and lower Brier scores (with higher resolution), improving precipitation intensity predictions and providing more reliable probability forecasts, although they overestimate precipitation over Mt. Dabie. It is challenging for the CEFSs to capture the evolution of mesoscale rainstorms that are known to be related to the errors in predicting the southwesterly low-level winds. Sensitivity experiments reveal that the microphysics and radiation schemes introduce considerable uncertainty in predicting the intensity and location of heavy rainfall in and near Nanjing and Mt. Dabie. In particular, the Asymmetric Convection Model 2 (ACM2) planetary boundary layer scheme combined with the Pleim-Xiu surface layer scheme tends to produce a biased northeastward extension of the boundary-layer jet, contributing to the northeastward bias of heavy precipitation around Nanjing in the CEFS_ICBCs.
The impacts of lateral boundary conditions (LBCs) provided by numerical models and data-driven networks on convective-scale ensemble forecasts are investigated in this study. Four experiments are conducted on the Hangzhou RDP (19th Hangzhou Asian Games Research Development Project on Convective-scale Ensemble Prediction and Application) testbed, with the LBCs respectively sourced from National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) forecasts with 33 vertical levels (Exp_GFS), Pangu forecasts with 13 vertical levels (Exp_Pangu), Fuxi forecasts with 13 vertical levels (Exp_Fuxi), and NCEP GFS forecasts with the vertical levels reduced to 13, the same as in Exp_Pangu and Exp_Fuxi (Exp_GFSRDV). In general, Exp_Pangu performs comparably to Exp_GFS, while Exp_Fuxi shows slightly inferior performance compared to Exp_Pangu, possibly due to its less accurate large-scale predictions. The ability of data-driven networks to efficiently provide LBCs for convective-scale ensemble forecasts is therefore demonstrated. Moreover, Exp_GFSRDV has the worst convective-scale forecasts among the four experiments, which indicates that LBCs from data-driven networks could potentially be improved by increasing the number of vertical levels of the networks. However, the ensemble spread of the four experiments barely increases with lead time. Thus, each experiment has insufficient ensemble spread to represent realistic forecast uncertainties, which will be investigated in a future study.
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Transcriptomic, epigenomic, proteomic, and other omics datasets generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), as well as accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. Their impact is significant on the operations, finances and reputation of affected companies. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists and the proposed solutions are not able to significantly stop the spread of ransomware attacks. The latest remarkable achievements of large language models (LLMs) in NLP tasks have caught the attention of cybersecurity researchers, who seek to integrate these models into security threat detection. These models offer high embedding capabilities, able to extract rich semantic representations and paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique, assigning each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
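The weighted voting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote`, the label strings, and the weights (standing in for per-classifier validation accuracy) are all hypothetical, chosen only to show how a performance-proportional weight changes the outcome compared with a plain majority vote.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier, for example its
             validation accuracy (an assumption made for this sketch).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    if not predictions:
        raise ValueError("need at least one prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # max() returns the first maximal key in insertion order, so ties
    # resolve in favour of the label that was seen first.
    return max(totals, key=totals.get)


# Hypothetical example: two weaker classifiers say "benign", one strong
# classifier says "ransomware". The strong classifier wins the vote.
decision = weighted_vote(["ransomware", "benign", "benign"], [0.95, 0.40, 0.45])
```

With equal weights the same three predictions would yield "benign", which is the behaviour a performance-proportional weighting is meant to avoid when one model is clearly more reliable.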
The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates transformer-based models (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to enhance smishing detection performance significantly. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among LLMs and a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% when compared to LLMs across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Distributed Denial of Service (DDoS) attacks are among the most severe threats to network infrastructure, sometimes bypassing traditional detection algorithms because of their evolving complexity. Present Machine Learning (ML) techniques for DDoS attack detection normally rely on statistical features of network traffic, such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complicated relations among various traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, allowing different GNN models to capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the proposed method's applicability and robustness for real-world implementation in contexts where several DDoS patterns coexist.
Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. In particular, flyrock travelling beyond the designated explosion area has been identified as the primary cause of fatal and non-fatal blasting hazards in open pit mining. Therefore, the accurate and reliable prediction of flyrock becomes crucial for effectively managing and mitigating associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model excelled over DT, SVM, and CART. This study highlights MVO-LightGBM's effectiveness and potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to specify the sensitivity of parameters. MPSA results indicated that the highest and lowest sensitivities correspond to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Android smartphones have become an integral part of our daily lives, which makes them targets for ransomware attacks. Such attacks encrypt user information and ask for payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, Bagging, and AutoML models. We aimed to showcase how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, to minimize manual effort while maintaining or enhancing performance compared to traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository, which contains features for Android ransomware network traffic. The dataset comprises 392,024 flow records, divided into eleven groups. There are ten classes for various ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin. There is also a class for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier achieved an accuracy of 99.84%. The FLAML AutoML framework was slightly more accurate, at 99.85%. This indicates how well AutoML performs with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method to detect Android ransomware, and it will facilitate the development of next-generation intrusion detection systems.
Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied for automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract and analyze the learned features using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and on a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared to existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. For this, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed for missing values, redundant features, and dataset imbalance to ensure fair learning. The results outperformed the state-of-the-art, with a Regularized Neural Network achieving 97.6% accuracy on behavioral data. On the multimodal data, the accuracy was 98.2%. Other models also did well, with accuracies consistently above 96%. We also used SHAP and LIME on a behavioral dataset for model explainability.
Satellite data obtained over synoptic data-sparse regions such as an ocean contribute toward improving the quality of the initial state of limited-area models. Background error covariances are crucial to the proper distribution of satellite-observed information in variational data assimilation. In the NMC (National Meteorological Center) method, background error covariances are underestimated over data-sparse regions such as an ocean because of small differences between different forecast times. Thus, it is necessary to reconstruct and tune the background error covariances so as to maximize the usefulness of the satellite data for the initial state of limited-area models, especially over an ocean where there is a lack of conventional data. In this study, we attempted to estimate background error covariances so as to provide adequate error statistics for data-sparse regions by using ensemble forecasts of optimal perturbations using bred vectors. The background error covariances estimated by the ensemble method reduced the overestimation of error amplitude obtained by the NMC method. By employing an appropriate horizontal length scale to exclude spurious correlations, the ensemble method produced better results than the NMC method in the assimilation of retrieved satellite data. Because the ensemble method distributes observed information over a limited local area, it would be more useful in the analysis of high-resolution satellite data. Accordingly, the performance of forecast models can be improved over the area where the satellite data are assimilated.
Based on the daily sea surface wind field prediction data of the Japan Meteorological Agency (JMA) forecast model, the National Centers for Environmental Prediction Global Forecast System (NCEP GFS) model, and the U.S. Navy Operational Global Atmospheric Prediction System (NOGAPS) model at 12:00 UTC from June 28 to August 10 in 2009, the bias-removed ensemble mean (BRE) was used to carry out forecast tests on the sea surface wind fields, and the root-mean-square error (RMSE) was used to evaluate the forecast results. The results showed that the BRE considerably reduced the RMSEs of 24 and 48 h sea surface wind field forecasts, and the forecast skill was superior to that of the single model forecasts. The RMSE decreases in the south of the central Bohai Sea and the middle of the Yellow Sea were the most obvious. In addition, the BRE forecast evidently improved the forecast skill for the gale processes that occurred on July 13-14 and August 7 in 2009. The forecast accuracy of the wind speed and the gale location was also improved.
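For readers unfamiliar with the two quantities named in this abstract, the following minimal sketch shows one common formulation of a bias-removed ensemble mean together with the RMSE used to score it. The formulation (training-period observed mean plus the average of each model's anomaly from its own training-period mean) is an assumption here, since the abstract does not give the equation; the function names, model labels, and numbers are hypothetical.

```python
from math import sqrt
from statistics import fmean


def bias_removed_ensemble_mean(train_forecasts, train_obs, new_forecasts):
    """Combine several models after removing each model's mean bias.

    train_forecasts: dict mapping model name -> list of past forecasts.
    train_obs: list of observations for the same training period.
    new_forecasts: dict mapping model name -> the forecast to be combined.
    Assumed formula: mean(obs) + average over models of
    (new forecast - that model's training-period mean forecast).
    """
    obs_mean = fmean(train_obs)
    anomalies = [
        new_forecasts[model] - fmean(past)
        for model, past in train_forecasts.items()
    ]
    return obs_mean + fmean(anomalies)


def rmse(forecasts, observations):
    """Root-mean-square error between paired forecasts and observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sqrt(fmean(squared))


# Hypothetical wind speeds in m/s: model A runs 1 m/s too high on average,
# model B runs 2 m/s too low. Both biases cancel in the combined forecast.
train = {"A": [6.0, 8.0], "B": [3.0, 5.0]}
observed = [5.0, 7.0]
combined = bias_removed_ensemble_mean(train, observed, {"A": 9.0, "B": 6.0})
```

In this toy case both models forecast 2 m/s above their own training mean, so the combined forecast is 2 m/s above the observed training mean.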
This paper preliminarily investigates the application of the orthogonal conditional nonlinear optimal perturbations (CNOPs)–based ensemble forecast technique in MM5 (Fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model). The results show that the ensemble forecast members generated by the orthogonal CNOPs present large spreads but tend to be located on the two sides of real tropical cyclone (TC) tracks and show good agreement between ensemble spreads and ensemble-mean forecast errors for TC tracks. Subsequently, these members reflect more reasonable forecast uncertainties and enhance the orthogonal CNOPs–based ensemble-mean forecasts to obtain higher skill for TC tracks than the orthogonal SVs (singular vectors)–, BVs (bred vectors)– and RPs (random perturbations)–based ones. The results indicate that orthogonal CNOPs of smaller magnitudes should be adopted to construct the initial ensemble perturbations for short lead–time forecasts, but those of larger magnitudes should be used for longer lead–time forecasts due to the effects of nonlinearities. The performance of the orthogonal CNOPs–based ensemble-mean forecasts is case-dependent, which encourages a statistical evaluation of the forecast skill with more TC cases. Finally, the results show that the ensemble forecasts with only initial perturbations in this work do not increase the forecast skill of TC intensity, which may be related to both the coarse model horizontal resolution and the model error.
On 21 July 2012, an extreme rainfall event with a recorded maximum 24-hour rainfall amount of 460 mm occurred in Beijing, China. Most operational models failed to predict such an extreme amount. In this study, a convective-permitting ensemble forecast system (CEFS), at 4-km grid spacing, covering the entire mainland of China, is applied to this extreme rainfall case. CEFS consists of 22 members and uses multiple physics parameterizations. For the event, the predicted maximum is 415 mm d^-1 in the probability-matched ensemble mean. The predicted high-probability heavy rain region is located in southwest Beijing, as was observed. Ensemble-based verification scores are then investigated. For a small verification domain covering Beijing and its surrounding areas, the precipitation rank histogram of CEFS is much flatter than that of a reference global ensemble. CEFS has a lower Brier score and a higher resolution than the global ensemble for precipitation, indicating more reliable probabilistic forecasting by CEFS. Additionally, forecasts of different ensemble members are compared and discussed. Most of the extreme rainfall comes from convection in the warm sector east of an approaching cold front. A few members of CEFS successfully reproduce such precipitation, and orographic lift of highly moist low-level flows with a significant southeasterly component is suggested to have played an important role in producing the initial convection. Comparisons between good and bad forecast members indicate a strong sensitivity of the extreme rainfall to the mesoscale environmental conditions and, to a lesser extent, the model physics.
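The two ensemble verification tools mentioned in this abstract, the Brier score and the rank histogram, follow standard textbook definitions that can be sketched in a few lines. The sketch below is generic rather than the study's verification code; the function names and sample values are hypothetical, and ties between an observation and a member are not randomized as they often are in operational practice.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome.

    probabilities: forecast probabilities in [0, 1], for example the
                   fraction of members exceeding a rainfall threshold.
    outcomes: 1 if the event occurred, 0 otherwise.
    Lower is better; 0 is a perfect forecast.
    """
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need equally long, non-empty inputs")
    total = sum((p - o) ** 2 for p, o in zip(probabilities, outcomes))
    return total / len(probabilities)


def rank_histogram(ensembles, observations):
    """Count where each observation ranks among its ensemble members.

    An ensemble with n members has n + 1 possible ranks. A flat histogram
    suggests the observation is statistically indistinguishable from the
    members; a U shape suggests too little spread.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members below the obs
        counts[rank] += 1
    return counts


# Hypothetical 3-member rainfall forecasts (mm) at three grid points.
members = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
histogram = rank_histogram(members, [0.5, 2.5, 9.0])
```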
The application of numerical weather prediction (NWP) products is increasing dramatically. Existing reports indicate that ensemble predictions have better skill than deterministic forecasts. In this study, numerical ensemble precipitation forecasts in the TIGGE database were evaluated using deterministic, dichotomous (yes/no), and probabilistic techniques over Iran for the period 2008-16. Thirteen rain gauges spread over eight homogeneous precipitation regimes were selected for evaluation. The Inverse Distance Weighting and Kriging methods were adopted for interpolation of the prediction values, downscaled to the stations at lead times of one to three days. To enhance the forecast quality, NWP values were post-processed via Bayesian Model Averaging. The results showed that ECMWF had better scores than other products. However, products of all centers underestimated precipitation in high precipitation regions while overestimating precipitation in other regions. This points to a systematic bias in forecasts and demands application of bias correction techniques. Based on dichotomous evaluation, NCEP did better at most stations, although all centers overpredicted the number of precipitation events. Compared to those of ECMWF and NCEP, UKMO yielded higher scores in mountainous regions, but performed poorly at other selected stations. Furthermore, the evaluations showed that all centers had better skill in wet than in dry seasons. The quality of post-processed predictions was better than that of the raw predictions. In conclusion, the accuracy of the NWP predictions made by the selected centers could be classified as medium over Iran, while post-processing of predictions is recommended to improve the quality.
A time-lagged ensemble method is used to improve 6-15 day precipitation forecasts from the Beijing Climate Center Atmospheric General Circulation Model, version 2.0.1. The approach averages the deterministic predictions of precipitation from the most recent model run and from earlier runs, all at the same forecast valid time. This lagged average forecast (LAF) method assigns equal weight to each ensemble member and produces a forecast by taking the ensemble mean. Our analyses of the Equitable Threat Score, the Hanssen and Kuipers Score, and the frequency bias indicate that the LAF using five members at time-lagged intervals of 6 h improves 6-15 day forecasts of precipitation frequency above 1 mm d^-1 and 5 mm d^-1 in many regions of China, and is more effective than the LAF method with time-lagged intervals of 12 or 24 h between ensemble members. In particular, significant improvements are seen over regions where the frequencies of rainfall days are higher than about 40%-50% in the summer season; these regions include northeastern and central to southern China, and the southeastern Tibetan Plateau.
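The lagged average forecast described in this abstract reduces to an equal-weight mean over runs that share a valid time. The sketch below illustrates only that averaging step; the data layout (one dictionary per model run, keyed by valid time), the function name, and the precipitation values are hypothetical and not taken from the study.

```python
from statistics import fmean


def lagged_average_forecast(runs, valid_time):
    """Equal-weight mean of all runs that provide a value at valid_time.

    runs: list of dicts, one per model initialization, each mapping a
          valid-time label to the forecast precipitation for that time.
    valid_time: the label of the time being forecast.
    """
    members = [run[valid_time] for run in runs if valid_time in run]
    if not members:
        raise ValueError("no run covers the requested valid time")
    return fmean(members)


# Hypothetical example: five runs initialized 6 h apart, all valid at the
# same time. Values are daily precipitation in mm.
target = "2010-07-10T00"
five_runs = [
    {target: 2.0},  # most recent run
    {target: 4.0},  # initialized 6 h earlier
    {target: 0.0},  # initialized 12 h earlier
    {target: 6.0},  # initialized 18 h earlier
    {target: 3.0},  # initialized 24 h earlier
]
laf = lagged_average_forecast(five_runs, target)
```

Because every member counts equally, the only tuning choices in this scheme are how many earlier runs to include and how far apart they were initialized, which is what the abstract compares (6, 12, or 24 h).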
A running mean bias (RMB) correction approach was applied to the forecasts of near-surface variables in a seasonal short-range ensemble forecasting experiment with 57 consecutive cases during summer 2010 in the northern China region. To determine a proper training window length for calculating RMB, window lengths from 2 to 20 days were evaluated, and 16 days was taken as an optimal window length, since it receives most of the benefit from extending the window length. The raw and 16-day RMB corrected ensembles were then evaluated for their ensemble mean forecast skills. The results show that the raw ensemble has obvious bias in all near-surface variables. The RMB correction can remove the bias reasonably well, and generate an unbiased ensemble. The bias correction not only reduces the ensemble mean forecast error, but also results in a better spread-error relationship. Moreover, two methods for computing calibrated probabilistic forecast (PF) were also evaluated through the 57 case dates: 1) using the relative frequency from the RMB-corrected ensemble; 2) computing the forecasting probabilities based on a historical rank histogram. The first method outperforms the second one, as it can improve both the reliability and the resolution of the PFs, while the second method only has a small effect on the reliability, indicating the necessity and importance of removing the systematic errors from the ensemble.
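A running mean bias correction of the kind described in this abstract can be sketched as follows. The abstract does not say whether the bias is computed per member or from the ensemble mean, so this sketch assumes a single bias estimated from past forecast-minus-observation errors and subtracted from every member; the function names and temperatures are hypothetical. The default window of 16 days matches the length the abstract reports as optimal.

```python
from statistics import fmean


def running_mean_bias(past_forecasts, past_obs, window=16):
    """Mean forecast-minus-observation error over the last `window` days.

    past_forecasts and past_obs are ordered oldest to newest. If fewer
    than `window` days are available, all of them are used.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    errors = [f - o for f, o in zip(past_forecasts, past_obs)]
    if not errors:
        raise ValueError("need at least one past forecast/observation pair")
    return fmean(errors[-window:])


def correct_ensemble(members, bias):
    """Subtract the same estimated bias from every ensemble member."""
    return [m - bias for m in members]


# Hypothetical 2-m temperatures (deg C): the forecast error grows from
# +1 to +4 over four days, so a short window tracks the recent bias.
forecasts = [11.0, 12.0, 13.0, 14.0]
observed = [10.0, 10.0, 10.0, 10.0]
recent_bias = running_mean_bias(forecasts, observed, window=2)
corrected = correct_ensemble([20.0, 21.0], recent_bias)
```

The window length trades responsiveness against noise: a short window follows recent regime changes, while a long window gives a steadier estimate.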
Despite the maturity of ensemble numerical weather prediction (NWP), the resulting forecasts are still, more often than not, under-dispersed. As such, forecast calibration tools have become popular. Among those tools, quantile regression (QR) is highly competitive in terms of both flexibility and predictive performance. Nevertheless, a long-standing problem of QR is quantile crossing, which greatly limits the interpretability of QR-calibrated forecasts. On this point, this study proposes a non-crossing quantile regression neural network (NCQRNN), for calibrating ensemble NWP forecasts into a set of reliable quantile forecasts without crossing. The overarching design principle of NCQRNN is to add on top of the conventional QRNN structure another hidden layer, which imposes a non-decreasing mapping between the combined output from nodes of the last hidden layer to the nodes of the output layer, through a triangular weight matrix with positive entries. The empirical part of the work considers a solar irradiance case study, in which four years of ensemble irradiance forecasts at seven locations, issued by the European Centre for Medium-Range Weather Forecasts, are calibrated via NCQRNN, as well as via an eclectic mix of benchmarking models, ranging from the naïve climatology to the state-of-the-art deep-learning and other non-crossing models. Formal and stringent forecast verification suggests that the forecasts post-processed via NCQRNN attain the maximum sharpness subject to calibration, amongst all competitors. Furthermore, the proposed conception to resolve quantile crossing is remarkably simple yet general, and thus has broad applicability as it can be integrated with many shallow- and deep-learning-based neural networks.
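The idea of enforcing non-crossing quantiles through a triangular, positive mapping can be illustrated with a deliberately simplified sketch. This is not the NCQRNN architecture: it assumes the simplest special case, a lower-triangular matrix of ones applied to non-negative increments, which is equivalent to a cumulative sum. The function names and the use of softplus to obtain the increments are assumptions made for illustration.

```python
from math import exp, log1p


def softplus(x):
    """Numerically stable log(1 + e^x); the result is never negative."""
    return max(x, 0.0) + log1p(exp(-abs(x)))


def non_crossing_quantiles(raw_outputs):
    """Map unconstrained values to a non-decreasing list of quantiles.

    The first raw value is used directly as the lowest quantile. Every
    later raw value is turned into a non-negative increment and added to
    the previous quantile, so a higher quantile level can never fall
    below a lower one.
    """
    if not raw_outputs:
        return []
    quantiles = [raw_outputs[0]]
    for raw in raw_outputs[1:]:
        quantiles.append(quantiles[-1] + softplus(raw))
    return quantiles


# Hypothetical raw network outputs for four quantile levels. Even the
# strongly negative values cannot make the sequence decrease.
q = non_crossing_quantiles([3.0, -2.0, 0.5, -10.0])
```

The design choice is to make monotonicity a property of the mapping itself, so that no post hoc sorting of the predicted quantiles is needed.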
An unprecedented heavy rainfall event occurred in Henan Province, China, during the period from 1200 UTC 19 to 1200 UTC 20 July 2021, with a record of 522 mm accumulated rainfall. Zhengzhou, the capital city of Henan, received 201.9 mm of rainfall in just one hour on that day. In the present study, the sensitivity of this event to atmospheric variables is investigated using the ECMWF ensemble forecasts. The sensitivity analysis first indicates that a local Yellow-Huai River low vortex (YHV) in the southern part of Henan played a crucial role in this extreme event. Meanwhile, the western Pacific subtropical high (WPSH) was stronger than the long-term average and to the west of its climatological position. Moreover, tropical cyclone (TC) In-Fa pushed into the periphery of the WPSH and brought an enhanced easterly flow between the TC and WPSH, channeling abundant moisture to inland China and feeding into the YHV. Members of the ECMWF ensemble are selected and grouped into the GOOD and the POOR groups based on their predicted maximum rainfall accumulations during the event. Some good members of the ECMWF Ensemble Prediction System (ECMWF-EPS) are able to capture the spatial distribution of the heavy rainfall well, but still underpredict its extremity. The better prediction ability of these members comes from the better prediction of the evolution characteristics (i.e., intensity and location) of the YHV and TC In-Fa. When the YHV was moving westward to the south of Henan, a relatively strong southerly wind in the southwestern part of Henan converged with the easterly flow from the channel wind between In-Fa and WPSH. The convergence and accompanying ascending motion induced heavy precipitation.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42030610 and 42205006) and the Startup Foundation for Introducing Talent of NUIST (2023r121).
Abstract: This study focuses on an extreme rainfall event in East China during the mei-yu season, in which the capital city (Nanjing) of Jiangsu Province experienced a maximum 14-h rainfall accumulation of 209.6 mm and a peak hourly rainfall of 118.8 mm. The performance of two sets of convection-permitting ensemble forecast systems (CEFSs), each with 30 members and a 3-km horizontal grid spacing, is evaluated. The CEFS_ICBCs, using multiple initial and boundary conditions (ICs and BCs), and the CEFS_ICBCs Phys, which incorporates both multi-physics schemes and ICs/BCs, are compared to the CMA-REPS (China Meteorological Administration-Regional Ensemble Prediction System) with a coarser 10-km grid spacing. The two CEFSs demonstrate more uniform rank histograms and lower Brier scores (with higher resolution), improving precipitation intensity predictions and providing more reliable probability forecasts, although they overestimate precipitation over Mt. Dabie. It is challenging for the CEFSs to capture the evolution of mesoscale rainstorms, which is known to be related to errors in predicting the southwesterly low-level winds. Sensitivity experiments reveal that the microphysics and radiation schemes introduce considerable uncertainty in predicting the intensity and location of heavy rainfall in and near Nanjing and Mt. Dabie. In particular, the Asymmetric Convection Model 2 (ACM2) planetary boundary layer scheme combined with the Pleim-Xiu surface layer scheme tends to produce a biased northeastward extension of the boundary-layer jet, contributing to the northeastward bias of heavy precipitation around Nanjing in the CEFS_ICBCs.
Funding: Supported by the Strategic Research and Consulting Project of the Chinese Academy of Engineering [grant number 2024-XBZD-14], the National Natural Science Foundation of China [grant numbers 42192553 and 41922036], and the Fundamental Research Funds for the Central Universities–Cemac "GeoX" Interdisciplinary Program [grant number 020714380207].
Abstract: The impacts of lateral boundary conditions (LBCs) provided by numerical models and data-driven networks on convective-scale ensemble forecasts are investigated in this study. Four experiments are conducted on the Hangzhou RDP (19th Hangzhou Asian Games Research Development Project on Convective-scale Ensemble Prediction and Application) testbed, with the LBCs respectively sourced from National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) forecasts with 33 vertical levels (Exp_GFS), Pangu forecasts with 13 vertical levels (Exp_Pangu), Fuxi forecasts with 13 vertical levels (Exp_Fuxi), and NCEP GFS forecasts with the vertical levels reduced to 13, the same as in Exp_Pangu and Exp_Fuxi (Exp_GFSRDV). In general, Exp_Pangu performs comparably to Exp_GFS, while Exp_Fuxi performs slightly worse than Exp_Pangu, possibly due to its less accurate large-scale predictions. The ability of data-driven networks to efficiently provide LBCs for convective-scale ensemble forecasts is therefore demonstrated. Moreover, Exp_GFSRDV has the worst convective-scale forecasts among the four experiments, which indicates that LBCs from data-driven networks could be improved by increasing the number of vertical levels in the networks. However, the ensemble spread of the four experiments barely increases with lead time. Thus, each experiment has insufficient ensemble spread to represent realistic forecast uncertainties, which will be investigated in a future study.
Funding: The Deanship of Research and Graduate Studies at King Khalid University, KSA, funded this work through the Large Research Project under grant number RGP2/164/46.
Abstract: Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Omics datasets (transcriptomic, epigenomic, proteomic, etc.) generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), as well as accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2024-02-01176).
Abstract: In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. Their impact on the operations, finances and reputation of affected companies is significant. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists and the proposed solutions are not able to significantly stop the spread of ransomware attacks. The latest remarkable achievements of large language models (LLMs) in NLP tasks have prompted cybersecurity researchers to integrate these models into security threat detection. These models offer high embedding capabilities, extracting rich semantic representations and paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique, assigning each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
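The weighted voting step in this abstract can be illustrated with a minimal sketch. The abstract does not say how "performance" is measured or how the weights are normalized, so the use of validation accuracy, the sum-to-one normalization, the 0.5 decision threshold, and all names below are assumptions made purely for illustration.

```python
def weighted_vote(probs, val_scores, threshold=0.5):
    """Merge per-model ransomware probabilities into one decision.

    probs      -- probability of "ransomware" from each classifier
    val_scores -- hypothetical validation accuracy of each classifier;
                  each weight is proportional to this score
    """
    total = sum(val_scores)
    weights = [s / total for s in val_scores]  # weights sum to 1
    combined = sum(w * p for w, p in zip(weights, probs))
    return combined, combined >= threshold


# Three hypothetical embedding-model classifiers scoring one sample.
probs = [0.9, 0.4, 0.7]
val_scores = [0.95, 0.80, 0.90]
score, is_ransomware = weighted_vote(probs, val_scores)
```

With these illustrative numbers, the weakest classifier disagrees but carries the smallest weight, so the combined score stays above the threshold.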
Funding: Funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant No. (GPIP:1074-612-2024).
Abstract: The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates transformer-based models (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to enhance smishing detection performance significantly. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among LLMs and a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% when compared to LLMs across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Abstract: Distributed Denial of Service (DDoS) attacks are one of the severe threats to network infrastructure, sometimes bypassing traditional diagnosis algorithms because of their evolving complexity. Present Machine Learning (ML) techniques for DDoS attack diagnosis normally apply network traffic statistical features such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complicated relations among various traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, letting different GNN models capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the proposed method's applicability and robustness for real-world implementation in contexts where several DDoS patterns coexist.
Funding: Funded by the Key Laboratory of Geological Safety of Coastal Urban Underground Space, Ministry of Natural Resources of China (Grant No. BHKF2022Y02), the Natural Science Foundation of Guangdong Province, China (Grant No. 2024A1515011162), and the Natural Science Foundation of Shandong Province, China (Grant No. ZR2024QE021).
Abstract: Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. Specifically, excess flying rock beyond the designated explosion area was identified as the primary cause of fatal and non-fatal blasting hazards in open pit mining. Therefore, the accurate and reliable prediction of flyrock becomes crucial for effectively managing and mitigating associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model outperformed DT, SVM, and CART. This study highlights MVO-LightGBM's effectiveness and potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to quantify the sensitivity of the parameters. MPSA results indicated that the highest and lowest sensitivities belong to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Funding: Supported through the Ongoing Research Funding Program (ORF-2025-498), King Saud University, Riyadh, Saudi Arabia.
Abstract: Android smartphones have become an integral part of our daily lives, and hence targets for ransomware attacks. Such attacks encrypt user information and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, Bagging, and AutoML models. We aimed to showcase how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while maintaining or enhancing performance compared to traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository, which contains features for Android ransomware network traffic. The dataset comprises 392,024 flow records, divided into eleven groups. There are ten classes for various ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin. There is also a class for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier reached an accuracy of 99.84%. The FLAML AutoML framework was even more accurate, reaching 99.85%. This indicates how well AutoML can improve results with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it will facilitate the development of next-generation intrusion detection systems.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12274131) and the Innovation Program for Quantum Science and Technology (Grant No. 2024ZD0300101).
Abstract: Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Funding: Partially supported by Begum Rokeya University, Rangpur, and the United Arab Emirates University, UAE.
Abstract: Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied for automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract and analyze the learned features using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared to existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Funding: The King Salman Center for Disability Research funded this work through Research Group No. KSRG-2024-050.
Abstract: Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. For this, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed for missing values, redundant features, and dataset imbalance to ensure fair learning. A Regularized Neural Network outperformed the state of the art, achieving 97.6% accuracy on the behavioral data and 98.2% on the multimodal data. Other models also did well, with accuracies consistently above 96%. We also applied SHAP and LIME to the behavioral dataset for model explainability.
Funding: Funded by the Korea Meteorological Administration Research and Development Program under Grant RACS 2010-2016 and supported by the Brain Korea 21 project of the Ministry of Education and Human Resources Development of the Korean government.
Abstract: Satellite data obtained over synoptic data-sparse regions such as an ocean contribute toward improving the quality of the initial state of limited-area models. Background error covariances are crucial to the proper distribution of satellite-observed information in variational data assimilation. In the NMC (National Meteorological Center) method, background error covariances are underestimated over data-sparse regions such as an ocean because of small differences between different forecast times. Thus, it is necessary to reconstruct and tune the background error covariances so as to maximize the usefulness of the satellite data for the initial state of limited-area models, especially over an ocean where there is a lack of conventional data. In this study, we attempted to estimate background error covariances so as to provide adequate error statistics for data-sparse regions by using ensemble forecasts of optimal perturbations using bred vectors. The background error covariances estimated by the ensemble method reduced the overestimation of error amplitude obtained by the NMC method. By employing an appropriate horizontal length scale to exclude spurious correlations, the ensemble method produced better results than the NMC method in the assimilation of retrieved satellite data. Because the ensemble method distributes observed information over a limited local area, it would be more useful in the analysis of high-resolution satellite data. Accordingly, the performance of forecast models can be improved over the area where the satellite data are assimilated.
Funding: Supported by the China Meteorological Administration's Special Funds (Meteorology) for Scientific Research on Public Causes (GYHY200906007) and the Gale Forecast Item of the Shengli Oil Field Observatory (2008001).
Abstract: Based on the daily sea surface wind field prediction data of the Japan Meteorological Agency (JMA) forecast model, the National Centers for Environmental Prediction (NCEP GFS) model and the U.S. Navy Operational Global Atmospheric Prediction System (NOGAPS) model at 12:00 UTC from June 28 to August 10 in 2009, the bias-removed ensemble mean (BRE) was used to run forecast tests on the sea surface wind fields, and the root-mean-square error (RMSE) was used to evaluate the forecast results. The results showed that the BRE considerably reduced the RMSEs of the 24 and 48 h sea surface wind field forecasts, and its forecast skill was superior to that of the single-model forecasts. The RMSE decreases were most obvious in the south of the central Bohai Sea and the middle of the Yellow Sea. In addition, the BRE forecast evidently improved the forecast skill for the gale processes that occurred during July 13-14 and on August 7 in 2009. The forecast accuracy of the wind speed and the gale location was also improved.
Funding: Jointly sponsored by the National Key Research and Development Program of China (2018YFC1506402), the National Natural Science Foundation of China (Grant Nos. 41475100 and 41805081), and the Global Regional Assimilation and Prediction System Development Program of the China Meteorological Administration (GRAPES-FZZX2018).
Abstract: This paper preliminarily investigates the application of the orthogonal conditional nonlinear optimal perturbations (CNOPs)-based ensemble forecast technique in MM5 (Fifth-generation Pennsylvania State University-National Center for Atmospheric Research Mesoscale Model). The results show that the ensemble forecast members generated by the orthogonal CNOPs present large spreads but tend to be located on the two sides of real tropical cyclone (TC) tracks, and show good agreement between ensemble spreads and ensemble-mean forecast errors for TC tracks. Consequently, these members reflect more reasonable forecast uncertainties and enable the orthogonal CNOPs-based ensemble-mean forecasts to obtain higher skill for TC tracks than the orthogonal SVs (singular vectors)-, BVs (bred vectors)- and RPs (random perturbations)-based ones. The results indicate that orthogonal CNOPs of smaller magnitudes should be adopted to construct the initial ensemble perturbations for short lead-time forecasts, but those of larger magnitudes should be used for longer lead-time forecasts due to the effects of nonlinearities. The performance of the orthogonal CNOPs-based ensemble-mean forecasts is case-dependent, which encourages evaluating the forecast skill statistically with more TC cases. Finally, the results show that the ensemble forecasts with only initial perturbations in this work do not increase the forecast skill of TC intensity, which may be related to both the coarse model horizontal resolution and the model error.
Funding: Supported by the National Fundamental Research (973) Program of China (Grant No. 2013CB430103), the Special Foundation of the China Meteorological Administration (Grant No. GYHY201506006), and the National Science Foundation of China (Grant No. 41405100).
Abstract: On 21 July 2012, an extreme rainfall event with a recorded maximum 24-hour rainfall amount of 460 mm occurred in Beijing, China. Most operational models failed to predict such an extreme amount. In this study, a convection-permitting ensemble forecast system (CEFS), at 4-km grid spacing, covering the entire mainland of China, is applied to this extreme rainfall case. CEFS consists of 22 members and uses multiple physics parameterizations. For the event, the predicted maximum is 415 mm d^-1 in the probability-matched ensemble mean. The predicted high-probability heavy rain region is located in southwest Beijing, as was observed. Ensemble-based verification scores are then investigated. For a small verification domain covering Beijing and its surrounding areas, the precipitation rank histogram of CEFS is much flatter than that of a reference global ensemble. CEFS has a lower Brier score and a higher resolution than the global ensemble for precipitation, indicating more reliable probabilistic forecasting by CEFS. Additionally, forecasts of different ensemble members are compared and discussed. Most of the extreme rainfall comes from convection in the warm sector east of an approaching cold front. A few members of CEFS successfully reproduce such precipitation, and orographic lift of highly moist low-level flows with a significant southeasterly component is suggested to have played an important role in producing the initial convection. Comparisons between good and bad forecast members indicate a strong sensitivity of the extreme rainfall to the mesoscale environmental conditions and, to a lesser extent, the model physics.
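The two ensemble verification scores named in this abstract, the Brier score and the rank histogram, can be sketched in a few lines. This is a generic textbook formulation rather than the authors' verification code; the function names and the toy inputs are assumptions, and ties between members and observations are not treated specially.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and
    binary outcomes (1 = event occurred, 0 = it did not). Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank among the
    ensemble members. A flat histogram suggests a well-dispersed ensemble."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members below the observation
        counts[rank] += 1
    return counts


# Toy example: probability of exceeding a rain threshold at four grid points.
bs = brier_score([0.9, 0.1, 0.5, 0.5], [1, 0, 1, 0])
hist = rank_histogram([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]], [0.5, 3.5])
```

In the toy rank histogram both observations fall outside the ensemble range, which produces the U shape typical of an under-dispersed ensemble.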
Abstract: The application of numerical weather prediction (NWP) products is increasing dramatically. Existing reports indicate that ensemble predictions have better skill than deterministic forecasts. In this study, numerical ensemble precipitation forecasts in the TIGGE database were evaluated using deterministic, dichotomous (yes/no), and probabilistic techniques over Iran for the period 2008-16. Thirteen rain gauges spread over eight homogeneous precipitation regimes were selected for evaluation. The Inverse Distance Weighting and Kriging methods were adopted for interpolation of the prediction values, downscaled to the stations at lead times of one to three days. To enhance the forecast quality, NWP values were post-processed via Bayesian Model Averaging. The results showed that ECMWF had better scores than other products. However, products of all centers underestimated precipitation in high precipitation regions while overestimating precipitation in other regions. This points to a systematic bias in forecasts and demands application of bias correction techniques. Based on dichotomous evaluation, NCEP did better at most stations, although all centers overpredicted the number of precipitation events. Compared to those of ECMWF and NCEP, UKMO yielded higher scores in mountainous regions, but performed poorly at other selected stations. Furthermore, the evaluations showed that all centers had better skill in wet than in dry seasons. The quality of post-processed predictions was better than that of the raw predictions. In conclusion, the accuracy of the NWP predictions made by the selected centers could be classified as medium over Iran, while post-processing of predictions is recommended to improve the quality.
Funding: Supported by the National Basic Research Program of China (973 Program; Grant No. 2010CB951902), the Special Program for China Meteorology Trade (Grant No. GYHY201306020), and the Technology Support Program of China (Grant No. 2009BAC51B03).
Abstract: A time-lagged ensemble method is used to improve 6-15 day precipitation forecasts from the Beijing Climate Center Atmospheric General Circulation Model, version 2.0.1. The approach averages the deterministic predictions of precipitation from the most recent model run and from earlier runs, all at the same forecast valid time. This lagged average forecast (LAF) method assigns equal weight to each ensemble member and produces a forecast by taking the ensemble mean. Our analyses of the Equitable Threat Score, the Hanssen and Kuipers Score, and the frequency bias indicate that the LAF using five members at time-lagged intervals of 6 h improves 6-15 day forecasts of precipitation frequency above 1 mm d^-1 and 5 mm d^-1 in many regions of China, and is more effective than the LAF method with a time-lagged interval of 12 or 24 h between ensemble members. In particular, significant improvements are seen over regions where the frequencies of rainfall days are higher than about 40%-50% in the summer season; these regions include northeastern and central to southern China, and the southeastern Tibetan Plateau.
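The lagged average forecast described in this abstract reduces to an equal-weight mean over runs started at different times but valid at the same time. The sketch below assumes a hypothetical data layout (a dictionary mapping each run's initialization hour to its forecasts keyed by valid hour) and illustrative precipitation values; only the equal weighting, the five members, and the 6-h spacing come from the abstract.

```python
def lagged_average_forecast(runs, valid_time, n_members=5):
    """Equal-weight mean of the most recent runs that cover valid_time.

    runs -- {init_hour: {valid_hour: precipitation}} (hypothetical layout)
    """
    available = sorted(
        (init for init, forecast in runs.items() if valid_time in forecast),
        reverse=True,  # most recent initialization first
    )
    chosen = available[:n_members]
    if not chosen:
        raise ValueError("no run covers the requested valid time")
    return sum(runs[init][valid_time] for init in chosen) / len(chosen)


# Six runs initialized 6 h apart, all valid at hour 240 (illustrative values, mm/day).
values = [2.0, 4.0, 3.0, 5.0, 1.0, 6.0]
runs = {init: {240: value} for init, value in zip(range(0, 36, 6), values)}
laf = lagged_average_forecast(runs, valid_time=240)
```

Here the oldest run (initialized at hour 0) is dropped, and the remaining five contribute equally to the mean.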
Funding: Supported by a project of the National Natural Science Foundation of China (Grant No. 41305099).
Abstract: A running mean bias (RMB) correction approach was applied to the forecasts of near-surface variables in a seasonal short-range ensemble forecasting experiment with 57 consecutive cases during summer 2010 in the northern China region. To determine a proper training window length for calculating RMB, window lengths from 2 to 20 days were evaluated, and 16 days was taken as an optimal window length, since it receives most of the benefit from extending the window length. The raw and 16-day RMB corrected ensembles were then evaluated for their ensemble mean forecast skills. The results show that the raw ensemble has obvious bias in all near-surface variables. The RMB correction can remove the bias reasonably well, and generate an unbiased ensemble. The bias correction not only reduces the ensemble mean forecast error, but also results in a better spread-error relationship. Moreover, two methods for computing calibrated probabilistic forecasts (PFs) were also evaluated through the 57 case dates: 1) using the relative frequency from the RMB-corrected ensemble; 2) computing the forecasting probabilities based on a historical rank histogram. The first method outperforms the second one, as it can improve both the reliability and the resolution of the PFs, while the second method only has a small effect on the reliability, indicating the necessity and importance of removing the systematic errors from the ensemble.
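A running mean bias correction of the kind described in this abstract can be sketched as follows. The abstract gives the 16-day window but not the exact formula, so the definition used here (the mean forecast-minus-observation error over the preceding window, subtracted from every member) and all names are assumptions.

```python
def running_mean_bias(forecasts, observations, day, window=16):
    """Mean forecast error over the `window` days before `day`.

    forecasts, observations -- daily series of ensemble-mean forecasts
    and verifying observations (hypothetical inputs).
    """
    start = max(0, day - window)
    errors = [f - o for f, o in zip(forecasts[start:day], observations[start:day])]
    if not errors:
        return 0.0  # no training data yet, so apply no correction
    return sum(errors) / len(errors)


def correct_ensemble(members, bias):
    """Subtract the estimated bias from every ensemble member."""
    return [m - bias for m in members]


# Toy series with a constant warm bias of 2 units.
past_forecasts = [3.0] * 20
past_observations = [1.0] * 20
bias = running_mean_bias(past_forecasts, past_observations, day=18)
corrected = correct_ensemble([3.0, 4.0, 2.5], bias)
```

Because the same offset is removed from every member, the ensemble spread is unchanged; only the systematic error in the mean is reduced.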
Funding: Supported by the National Natural Science Foundation of China (Project No. 42375192), the China Meteorological Administration Climate Change Special Program (CMA-CCSP; Project No. QBZ202315), and the Vector Stiftung through the Young Investigator Group "Artificial Intelligence for Probabilistic Weather Forecasting."
Abstract: Despite the maturity of ensemble numerical weather prediction (NWP), the resulting forecasts are still, more often than not, under-dispersed. As such, forecast calibration tools have become popular. Among those tools, quantile regression (QR) is highly competitive in terms of both flexibility and predictive performance. Nevertheless, a long-standing problem of QR is quantile crossing, which greatly limits the interpretability of QR-calibrated forecasts. On this point, this study proposes a non-crossing quantile regression neural network (NCQRNN) for calibrating ensemble NWP forecasts into a set of reliable quantile forecasts without crossing. The overarching design principle of NCQRNN is to add on top of the conventional QRNN structure another hidden layer, which imposes a non-decreasing mapping from the combined output of the nodes of the last hidden layer to the nodes of the output layer, through a triangular weight matrix with positive entries. The empirical part of the work considers a solar irradiance case study, in which four years of ensemble irradiance forecasts at seven locations, issued by the European Centre for Medium-Range Weather Forecasts, are calibrated via NCQRNN, as well as via an eclectic mix of benchmarking models, ranging from the naïve climatology to the state-of-the-art deep-learning and other non-crossing models. Formal and stringent forecast verification suggests that the forecasts post-processed via NCQRNN attain the maximum sharpness subject to calibration, amongst all competitors. Furthermore, the proposed approach to resolving quantile crossing is remarkably simple yet general, and thus has broad applicability, as it can be integrated with many shallow- and deep-learning-based neural networks.
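To make the non-crossing idea concrete, the sketch below enforces ordered quantiles by accumulating strictly positive increments, which is equivalent to multiplying a vector of positive values by a lower-triangular matrix of ones. This is a simplified stand-in and not the NCQRNN layer itself: the abstract does not give the layer's exact form, and the softplus transform, the function names, and the example values are all assumptions.

```python
import math


def softplus(x):
    """Smooth map from any real number to a strictly positive one."""
    return math.log1p(math.exp(x))


def non_crossing_quantiles(raw_outputs):
    """Turn unconstrained network outputs into ordered quantiles.

    The first output is taken as the lowest quantile; every later output is
    passed through softplus and added as a positive increment, so each
    quantile is at least as large as the one before it.
    """
    quantiles = [raw_outputs[0]]
    for raw in raw_outputs[1:]:
        quantiles.append(quantiles[-1] + softplus(raw))
    return quantiles


# Hypothetical raw outputs for four quantile levels of irradiance (W/m^2).
q = non_crossing_quantiles([100.0, -3.0, 0.5, -10.0])
```

Whatever the raw outputs are, the resulting sequence cannot decrease, so quantile crossing is ruled out by construction rather than penalized during training.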
Funding: National Natural Science Foundation of China (42175003, 42088101); Graduate Research and Innovation Projects of Jiangsu Province (KYCX22_1134).
Abstract: An unprecedented heavy rainfall event occurred in Henan Province, China, during the period of 1200 UTC 19 to 1200 UTC 20 July 2021, with a record of 522 mm accumulated rainfall. Zhengzhou, the capital city of Henan, received 201.9 mm of rainfall in just one hour on that day. In the present study, the sensitivity of this event to atmospheric variables is investigated using the ECMWF ensemble forecasts. The sensitivity analysis first indicates that a local Yellow-Huai River low vortex (YHV) in the southern part of Henan played a crucial role in this extreme event. Meanwhile, the western Pacific subtropical high (WPSH) was stronger than the long-term average and to the west of its climatological position. Moreover, tropical cyclone (TC) In-Fa pushed into the periphery of the WPSH and brought an enhanced easterly flow between the TC and the WPSH, channeling abundant moisture to inland China and feeding into the YHV. Members of the ECMWF ensemble are selected and grouped into the GOOD and the POOR groups based on their predicted maximum rainfall accumulations during the event. Some good members of the ECMWF Ensemble Prediction System (ECMWF-EPS) are able to capture the spatial distribution of the heavy rainfall well, but still underpredict its extremity. The better prediction ability of these members comes from the better prediction of the evolution characteristics (i.e., intensity and location) of the YHV and TC In-Fa. When the YHV was moving westward to the south of Henan, a relatively strong southerly wind in the southwestern part of Henan converged with the easterly flow from the channel wind between In-Fa and the WPSH. The convergence and accompanying ascending motion induced heavy precipitation.