The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into six classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across six dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the need to address challenges such as base model selection, measuring model robustness and uncertainty, and interpretability of ensemble models in practical applications is emphasized. Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
With the rapid development of the economy, air pollution caused by industrial expansion has done serious harm to human health and social development. Establishing an effective system for predicting air pollutant concentrations accurately and reliably is therefore of great scientific and practical significance. This paper proposes a combined point-interval prediction system for pollutant concentration that leverages neural networks, a meta-heuristic optimization algorithm, and fuzzy theory. Fuzzy information granulation is used in data preprocessing to transform numerical sequences into fuzzy particles for comprehensive feature extraction. The golden jackal optimization algorithm is employed in the optimization stage to fine-tune model hyperparameters. In the prediction stage, an ensemble learning method combines training results from multiple models to obtain final point predictions, while quantile regression and kernel density estimation are used for interval predictions on the test set. Experimental results demonstrate that the combined model achieves a coefficient of determination (R²) of 99.3%, and that its mean absolute percentage error (MAPE) differs from that of the benchmark models by up to 12.6%. This suggests that the integrated learning system proposed in this paper can provide more accurate deterministic predictions as well as reliable uncertainty analysis compared to traditional models, offering a practical reference for air quality early warning.
In this study, we conducted an experiment to construct multi-model ensemble (MME) predictions for the El Niño-Southern Oscillation (ENSO) using a neural network, based on hindcast data released from five coupled ocean-atmosphere models of varying complexity. This nonlinear approach proved clearly superior and effective in constructing the ENSO MME. Subsequently, we employed leave-one-out cross-validation and the moving base method to further validate the robustness of the neural network model in formulating the ENSO MME. In conclusion, the neural network algorithm outperforms the conventional approach of assigning a uniform weight to all models. This is evidenced by higher correlation coefficients and lower prediction errors, which have the potential to provide a more accurate ENSO forecast.
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Omics datasets (transcriptomic, epigenomic, proteomic, etc.) generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes it challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), and we report accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity: −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. They significantly affect the operations, finances, and reputation of the companies they hit. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists, and the proposed solutions have not been able to significantly stop the spread of ransomware. The recent achievements of large language models (LLMs) in NLP tasks have prompted cybersecurity researchers to integrate these models into security threat detection. These models offer strong embedding capabilities and can extract rich semantic representations, paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique that assigns each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
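The weighted voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how "performance" is measured or whether the vote is taken over labels or probabilities, so the sketch assumes soft voting over class probabilities with weights proportional to a validation score; the function name `weighted_vote` and the numbers in the usage example are hypothetical.

```python
from typing import List, Sequence, Tuple


def weighted_vote(
    probas: Sequence[Sequence[float]],
    scores: Sequence[float],
) -> Tuple[int, List[float]]:
    """Combine per-model class probabilities by weighted soft voting.

    probas: one probability vector per model (all of the same length).
    scores: one non-negative performance score per model (assumed here to be
            a validation metric such as accuracy; the source does not specify).
    Returns the winning class index and the combined probability vector.
    """
    if len(probas) != len(scores):
        raise ValueError("need exactly one score per model")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")

    # Each model's influence is proportional to its score.
    weights = [s / total for s in scores]

    n_classes = len(probas[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probas))
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Hypothetical outputs of three classifiers for one sample
# (class 0 = benign, class 1 = ransomware).
probas = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
scores = [0.95, 0.80, 0.75]
label, combined = weighted_vote(probas, scores)
```

In this made-up example an unweighted majority vote over hard labels would pick class 1 (two of three models), whereas the weighted soft vote picks class 0 because the strongest model is both more confident and more heavily weighted.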
Metal organic frameworks (MOFs) assembled through coordination bonds suffer from poor stability, which limits their application as stationary phases, whereas covalent organic frameworks (COFs) assembled through covalent bonds exhibit excellent structural stability. It has been shown that stationary phases prepared by combining MOF and COF can compensate for the poor stability of MOF@SiO₂, and that MOF/COF composites have superior chromatographic separation performance. However, the traditional methods for preparing COF/MOF-based stationary phases generally rely on solvothermal synthesis. In this study, a green and low-cost synthesis method was proposed for the preparation of a MOF/COF@SiO₂ stationary phase. First, COF@SiO₂ was prepared in a choline chloride/ethylene glycol based deep eutectic solvent (DES). Second, another acid-base tunable DES, prepared by mixing p-toluenesulfonic acid (PTSA) and 2-methylimidazole in different proportions, was introduced as both reaction solvent and reactant for the rapid synthesis of MOF/COF@SiO₂. In contrast to the toxic transition-metal-based MOFs selected in most previous studies, a lightweight and non-toxic s-block metal (calcium) based MOF was employed in this study. PTSA and calcium form a calcium/oxygen-containing organic acid framework in the acidic DES, which assembles with terephthalic acid dissolved in the basic DES to form the MOF. The strong hydrogen bonding effect of the DES facilitates rapid assembly of the Ca-MOF. The obtained Ca-MOF/COF@SiO₂ can be used for multi-mode chromatography to efficiently separate multiple isomeric, hydrophilic, and hydrophobic analytes. The synthesis of Ca-MOF/COF@SiO₂ is green and mild; in particular, the acid-base tunable DES promotes the rapid synthesis of non-toxic Ca-MOF/COF@silica composites. This offers an innovative route to the green synthesis of novel MOF/COF stationary phases and extends their applications in chromatography.
[Objectives] This study was conducted to achieve rapid and accurate detection of protein content in rice with a particle size of 1.0 mm. [Methods] A multi-model fusion strategy was proposed on the basis of Stacking ensemble learning. A base learner pool was constructed, containing Partial Least Squares (PLS), Support Vector Machine (SVM), Deep Extreme Learning Machine (DELM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Multilayer Perceptron (MLP). PLS, DELM, and Linear Regression (LR) were used as meta-learner candidates. Employing integer coding technology, systematic dynamic combinations of base learners and meta-learners were generated, resulting in a total of 40 non-repetitive fusion models. The optimal combination was selected through a comprehensive evaluation based on multiple assessment indicators. [Results] The combination "PLS-DELM-MLP-LR" (code 1367) achieved coefficients of determination of 0.9732 and 0.9780 on the validation set and independent test set, respectively, with relative root mean square errors of 2.35% and 2.36%, and residual predictive deviations of 6.1075 and 6.7479, respectively. [Conclusions] The Stacking fusion model significantly enhances the predictive accuracy and robustness of spectral quantitative analysis, providing an efficient and feasible solution for modeling complex agricultural product spectral data.
The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates transformer-based models (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to enhance smishing detection performance significantly. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among LLMs and a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% when compared to LLMs across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Distributed Denial of Service (DDoS) attacks are among the most severe threats to network infrastructure, and their evolving complexity sometimes lets them bypass traditional detection algorithms. Current Machine Learning (ML) techniques for DDoS attack detection normally rely on statistical features of network traffic, such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complicated relations among traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, allowing different GNN models to capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the proposed method's applicability and robustness for real-world deployment in contexts where several DDoS patterns coexist.
Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. Specifically, rock thrown beyond the designated blast area has been identified as the primary cause of fatal and non-fatal blasting accidents in open pit mining. Accurate and reliable prediction of flyrock is therefore crucial for effectively managing and mitigating the associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, the multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed that MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model outperformed DT, SVM, and CART. This study highlights MVO-LightGBM's effectiveness and potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to quantify the sensitivity of the parameters. MPSA results indicated that the highest and lowest sensitivities correspond to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Android smartphones have become an integral part of our daily lives, which also makes them targets for ransomware attacks. Such attacks encrypt user information and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, Bagging, and AutoML models. We aimed to showcase how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while matching or improving on the performance of traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository, which contains features for Android ransomware network traffic. The dataset comprises 392,024 flow records divided into eleven groups: ten classes for various ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin, and one class for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier achieved an accuracy of 99.84%, and the FLAML AutoML framework was slightly more accurate at 99.85%. This indicates how well AutoML can perform with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it will facilitate the development of next-generation intrusion detection systems.
Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied for automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract and analyze the learned features using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and on a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared to existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. To this end, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed to handle missing values, redundant features, and class imbalance to ensure fair learning. The results outperformed the state of the art, with a Regularized Neural Network achieving 97.6% accuracy on the behavioral data and 98.2% on the multimodal data. Other models also did well, with accuracies consistently above 96%. We also applied SHAP and LIME to the behavioral dataset for model explainability.
The 21-yr ensemble predictions of model precipitation and circulation in the East Asian and western North Pacific (Asia-Pacific) summer monsoon region (0°-50°N, 100°-150°E) were evaluated for nine different AGCMs used in the Asia-Pacific Economic Cooperation Climate Center (APCC) multi-model ensemble seasonal prediction system. The analysis indicates that the precipitation anomaly patterns of the model ensemble predictions are substantially different from their observed counterparts in this region, but the summer monsoon circulations are reasonably predicted. For example, all models reproduce well the interannual variability of the western North Pacific monsoon index (WNPMI) defined by 850 hPa winds, but they fail to predict the relationship between WNPMI and precipitation anomalies. In contrast to precipitation anomalies, the interannual variability of the 500 hPa geopotential height (GPH) is well predicted by the models. On the basis of these model performances and the relationship between the interannual variations of 500 hPa GPH and precipitation anomalies, we developed a statistical scheme to downscale the summer monsoon precipitation anomaly using EOF analysis and singular value decomposition (SVD). In this scheme, the three leading EOF modes of the 500 hPa GPH anomaly fields predicted by the models are first corrected by linear regression between the principal components of each model and those of the observations. Then, the corrected model GPH is chosen as the predictor to downscale the precipitation anomaly field, which is assembled from the forecasted expansion coefficients of model 500 hPa GPH and the three leading SVD modes of observed precipitation anomaly corresponding to the prediction of model 500 hPa GPH during a 19-year training period. The cross-validated forecasts suggest that this downscaling scheme has the potential to improve the forecast skill of the precipitation anomaly in the South China Sea, western North Pacific, and East Asia Pacific regions, where the anomaly correlation coefficient (ACC) improved by 0.14, corresponding to a 10.4% reduction in RMSE relative to the conventional multi-model ensemble (MME) forecast.
In order to reduce the uncertainty of offline land surface model (LSM) simulations of land evapotranspiration (ET), we used ensemble simulations based on three meteorological forcing datasets [Princeton, ITPCAS (Institute of Tibetan Plateau Research, Chinese Academy of Sciences), Qian] and four LSMs (BATS, VIC, CLM3.0 and CLM3.5) to explore the trends and spatiotemporal characteristics of ET, as well as the spatiotemporal pattern of ET in response to climate factors over China's Mainland during 1982-2007. The results showed that the simulations of each member and their arithmetic mean (Ens_Mean) captured the spatial distribution and seasonal pattern of ET sufficiently well, although they exhibited more pronounced spatial and seasonal variation in ET than the observation-based ET estimates (Obs_MTE). For the mean annual ET, we found that BATS forced by the Princeton forcing overestimated the annual mean ET compared with Obs_MTE for most of the basins in China, whereas VIC forced by the Princeton forcing underestimated it. By contrast, the Ens_Mean was closer to Obs_MTE, although it underestimated ET over Southeast China. Furthermore, both Obs_MTE and Ens_Mean exhibited a significant increasing trend during 1982-98; after 1998, when the last big El Niño event occurred, the Ens_Mean decreased significantly between 1999 and 2007, although the change was not significant for Obs_MTE. Changes in air temperature and shortwave radiation played key roles in the long-term variation in ET over the humid area of China, whereas precipitation mainly controlled the long-term variation in ET in the arid and semi-arid areas of China.
Seasonal prediction of summer rainfall over the Yangtze River valley (YRV) is valuable for agricultural and industrial production and freshwater resource management in China, but remains a major challenge. Earlier multi-model ensemble (MME) prediction schemes for summer rainfall over China focus on single-value prediction, which cannot provide the necessary uncertainty information, while commonly used ensemble schemes for probability density function (PDF) prediction are not adapted to YRV summer rainfall prediction. In the present study, an MME PDF prediction scheme is proposed based on the ENSEMBLES hindcasts. It is similar to the earlier Bayesian ensemble prediction scheme, but with optimization of ensemble members and a revision of the variance modeling of the likelihood function. The optimized ensemble members are regressed YRV summer rainfall, with factors selected from model outputs of synchronous 500-hPa geopotential height as predictors. The revised variance modeling of the likelihood function is a simple linear regression with ensemble spread as the predictor. The cross-validation skill of 1960–2002 YRV summer rainfall prediction shows that the new scheme produces a skillful PDF prediction, and is much better calibrated, sharper, and more accurate than the earlier Bayesian ensemble and raw ensemble.
This study investigates multi-model ensemble forecasts of track and intensity of tropical cyclones over the western Pacific, based on forecast outputs from the China Meteorological Administration, European Centre for Medium-Range Weather Forecasts, Japan Meteorological Agency and National Centers for Environmental Prediction in the THORPEX Interactive Grand Global Ensemble (TIGGE) datasets. The multi-model ensemble schemes, namely the bias-removed ensemble mean (BREM) and superensemble (SUP), are compared with the ensemble mean (EMN) and single-model forecasts. Moreover, a new model bias estimation scheme is investigated and applied to the BREM and SUP schemes. The results showed that, compared with single-model forecasts and EMN, the multi-model ensembles of the BREM and SUP schemes can have smaller errors in most cases. However, there were also circumstances where BREM was less skillful than EMN, indicating that using a time-averaged error as model bias is not optimal. A new model bias estimation scheme of the biweight mean is introduced. Through minimizing the negative influence of singular errors, this scheme can obtain a more accurate model bias estimation and improve the BREM forecast skill. The application of the biweight mean in the bias calculation of SUP also resulted in improved skill. The results indicate that the modification of multi-model ensemble schemes through this bias estimation method is feasible.
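To make the bias-estimation idea in this abstract concrete, the following is a minimal sketch of a biweight mean used as a robust bias estimate. The abstract gives no formula, so the sketch assumes the standard Tukey biweight location estimate (median-centred, scaled by the median absolute deviation, with the commonly used tuning constant c = 7.5); the function names and the sample errors are hypothetical and are not taken from the paper.

```python
from statistics import mean, median
from typing import Sequence


def biweight_mean(values: Sequence[float], c: float = 7.5) -> float:
    """Tukey biweight estimate of location (assumed form, c = 7.5).

    Points far from the median get smoothly reduced weight, and points
    further than c * MAD from the median get zero weight.
    """
    m = median(values)
    mad = median([abs(v - m) for v in values])
    if mad == 0:
        # All central values coincide; the median is the natural estimate.
        return m

    num = 0.0
    den = 0.0
    for v in values:
        u = (v - m) / (c * mad)
        if abs(u) < 1.0:
            w = (1.0 - u * u) ** 2
            num += (v - m) * w
            den += w
    return m + num / den


def remove_bias(forecast: float, past_errors: Sequence[float]) -> float:
    """Subtract a robust bias estimate (forecast minus observation) from a forecast."""
    return forecast - biweight_mean(past_errors)


# Hypothetical training-period errors for one model, with one singular error.
errors = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
robust_bias = biweight_mean(errors)   # stays near the bulk of the errors
plain_bias = mean(errors)             # pulled upward by the outlier
```

With these made-up numbers the time-averaged error is dominated by the single large value, while the biweight estimate remains close to the typical error, which is the behaviour the abstract attributes to the new scheme.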
Dissolved oxygen (DO) is an important indicator of aquaculture, and its accurate forecasting can effectively improve the quality of aquatic products. In this paper, a new DO hybrid forecasting model is proposed that includes three stages: multi-factor analysis, adaptive decomposition, and an optimization-based ensemble. First, considering the complex factors affecting DO, the grey relational (GR) degree method is used to screen out the environmental factors most closely related to DO. The consideration of multiple factors makes model fusion more effective. Second, the series of DO, water temperature, salinity, and oxygen saturation are decomposed adaptively into sub-series by means of the empirical wavelet transform (EWT) method. Then, five benchmark models are utilized to forecast the sub-series of the EWT decomposition. The ensemble weights of these five sub-forecasting models are calculated by particle swarm optimization and gravitational search algorithm (PSOGSA). Finally, a multi-factor ensemble model for DO is obtained by weighted allocation. The performance of the proposed model is verified with time-series data collected by the Pacific Islands Ocean Observing System (PacIOOS) from the WQB04 station at Hilo. The evaluation indicators involved in the experiment include the Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), mean absolute percent error (MAPE), standard deviation of error (SDE), and coefficient of determination (R²). Example analysis demonstrates that: ① the proposed model can obtain excellent DO forecasting results; ② the proposed model is superior to other comparison models; and ③ the forecasting model can be used to analyze the trend of DO and enable managers to make better management decisions.
A Bayesian probabilistic prediction scheme of the Yangtze River Valley (YRV) summer rainfall is proposed to combine forecast information from multi-model ensemble dataset provided by ENSEMBLES project.Due to the low f...A Bayesian probabilistic prediction scheme of the Yangtze River Valley (YRV) summer rainfall is proposed to combine forecast information from multi-model ensemble dataset provided by ENSEMBLES project.Due to the low forecast skill of rainfall in dynamic models,the time series of regressed YRV summer rainfall are selected as ensemble members in the new scheme,instead of commonly-used YRV summer rainfall simulated by models.Each time series of regressed YRV summer rainfall is derived from a simple linear regression.The predictor in each simple linear regression is the skillfully simulated circulation or surface temperature factor which is highly linear with the observed YRV summer rainfall in the training set.The high correlation between the ensemble mean of these regressed YRV summer rainfall and observation benefit extracting more sample information from the ensemble system.The results show that the cross-validated skill of the new scheme over the period of 1960 to 2002 is much higher than equally-weighted ensemble,multiple linear regression,and Bayesian ensemble with simulated YRV summer rainfall as ensemble members.In addition,the new scheme is also more skillful than reference forecasts (random forecast at a 0.01 significance level for ensemble mean and climatology forecast for probability density function).展开更多
Funding: National Natural Science Foundation of China (52075420); Fundamental Research Funds for the Central Universities (xzy022023049); National Key Research and Development Program of China (2023YFB3408600).
Abstract: The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Although numerous data-driven methods for battery SOH estimation have been reported, they often perform inconsistently across application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, the ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the paper emphasizes the need to address challenges such as base model selection, measuring model robustness and uncertainty, and the interpretability of ensemble models in practical applications. Finally, future research prospects are outlined, specifically noting that deep learning ensembles are poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Funding: Supported by the General Scientific Research Funding of the Science and Technology Development Fund (FDCT) in Macao (No. 0150/2022/A) and the Faculty Research Grants of Macao University of Science and Technology (No. FRG-22-074-FIE).
Abstract: With rapid economic development, air pollution caused by industrial expansion has seriously harmed human health and social development. Establishing an effective system for accurate and reliable prediction of air pollutant concentrations is therefore of great scientific and practical significance. This paper proposes a combined point-interval prediction system for pollutant concentration that leverages neural networks, a meta-heuristic optimization algorithm, and fuzzy theory. Fuzzy information granulation is used in data preprocessing to transform numerical sequences into fuzzy particles for comprehensive feature extraction. The golden jackal optimization algorithm is employed in the optimization stage to fine-tune model hyperparameters. In the prediction stage, an ensemble learning method combines the training results from multiple models to obtain final point predictions, while quantile regression and kernel density estimation are used for interval predictions on the test set. Experimental results demonstrate that the combined model achieves a coefficient of determination (R^(2)) of 99.3% and a maximum difference in mean absolute percentage error (MAPE) from the benchmark model of 12.6%. This suggests that the integrated learning system proposed in this paper can provide more accurate deterministic predictions as well as more reliable uncertainty analysis than traditional models, offering a practical reference for air quality early warning.
Funding: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) under contract No. SML2021SP310; the National Natural Science Foundation of China under contract Nos. 42227901 and 42475061; the Key R&D Program of Zhejiang Province under contract No. 2024C03257.
Abstract: In this study, we conducted an experiment to construct multi-model ensemble (MME) predictions for the El Niño-Southern Oscillation (ENSO) using a neural network, based on hindcast data released from five coupled ocean-atmosphere models of varying complexity. This nonlinear approach proved clearly superior and effective in constructing the ENSO MME. Subsequently, we employed leave-one-out cross-validation and the moving base method to further validate the robustness of the neural network model in the formulation of the ENSO MME. In conclusion, the neural network algorithm outperforms the conventional approach of assigning a uniform weight to all models. This is evidenced by higher correlation coefficients and lower prediction errors, which have the potential to provide a more accurate ENSO forecast.
Funding: Deanship of Research and Graduate Studies at King Khalid University, KSA, through the Large Research Project under grant number RGP2/164/46.
Abstract: Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Transcriptomic, epigenomic, proteomic, and other omics datasets generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), as well as accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML analysis of the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2024-02-01176.
Abstract: In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. Their impact on the operations, finances, and reputation of affected companies is significant. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists, and the proposed solutions have not significantly slowed the spread of ransomware. The recent achievements of large language models (LLMs) in NLP tasks have prompted cybersecurity researchers to integrate these models into security threat detection. These models offer strong embedding capabilities, extracting rich semantic representations and paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique that assigns each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of the representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
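The weighted voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which performance measure sets the weights, so the validation accuracies below are hypothetical, as are the function name `weighted_vote` and the class labels; this is an illustration of performance-proportional voting in general, not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(labels, weights):
    """Return the label that collects the largest share of normalized weight.

    labels  -- one predicted class label per model
    weights -- one non-negative performance score per model (assumed here
               to be a validation accuracy; the abstract does not specify)
    """
    if len(labels) != len(weights) or not labels:
        raise ValueError("labels and weights must be non-empty and equal in length")
    total = sum(weights)
    scores = defaultdict(float)
    for label, weight in zip(labels, weights):
        # Each model's influence is proportional to its performance score.
        scores[label] += weight / total
    return max(scores, key=scores.get)


# Hypothetical validation accuracies of three embedding + MLP pipelines.
val_accuracy = [0.95, 0.90, 0.85]
print(weighted_vote(["ransomware", "benign", "benign"], val_accuracy))
```

In this usage example the two weaker models agree and together outweigh the single strongest model, so the combined label is `benign`; a sufficiently dominant weight on one model would reverse that.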
Funding: Supported by the National Natural Science Foundation of China (Nos. 21906124, 32302202), the Natural Science Foundation of Hubei Province (No. 2017CFB220), and the Natural Science Foundation of Shandong Province (No. ZR2023MH278).
Abstract: Metal organic frameworks (MOFs) assembled with coordination bonds suffer from poor stability, which limits their application as stationary phases, while covalent organic frameworks (COFs) assembled through covalent bonds exhibit excellent structural stability. It has been shown that stationary phases prepared by combining MOF and COF can make up for the poor stability of MOF@SiO_(2), and that MOF/COF composites have superior chromatographic separation performance. However, the traditional methods for preparing COF/MOF based stationary phases generally rely on solvothermal synthesis. In this study, a green and low-cost synthesis method is proposed for the preparation of a MOF/COF@SiO_(2) stationary phase. First, COF@SiO_(2) was prepared in a choline chloride/ethylene glycol based deep eutectic solvent (DES). Second, another acid-base tunable DES, prepared by mixing p-toluenesulfonic acid (PTSA) and 2-methylimidazole in different proportions, was introduced as the reaction solvent and reactant for rapid synthesis of MOF/COF@SiO_(2). In contrast to the toxic transition metal-based MOFs selected in most previous studies, a lightweight and non-toxic s-block metal (calcium) based MOF was employed in this study. PTSA and calcium form a calcium/oxygen-containing organic acid framework in the acidic DES, which assembles with terephthalic acid dissolved in the basic DES to form the MOF. The strong hydrogen bonding effect of the DES facilitates rapid assembly of the Ca-MOF. The obtained Ca-MOF/COF@SiO_(2) can be used for multi-mode chromatography to efficiently separate multiple isomeric, hydrophilic, and hydrophobic analytes. The synthesis of Ca-MOF/COF@SiO_(2) is green and mild; in particular, the acid-base tunable DES promotes the rapid synthesis of non-toxic Ca-MOF/COF@silica composites, which offers an innovative route to the green synthesis of novel MOF/COF stationary phases and extends their applications in chromatography.
Abstract: [Objectives] This study was conducted to achieve rapid and accurate detection of protein content in rice with a particle size of 1.0 mm. [Methods] A multi-model fusion strategy was proposed on the basis of Stacking ensemble learning. A base learner pool was constructed, containing Partial Least Squares (PLS), Support Vector Machine (SVM), Deep Extreme Learning Machine (DELM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Multilayer Perceptron (MLP). PLS, DELM, and Linear Regression (LR) were used as meta-learner candidates. Using integer coding, systematic dynamic combinations of base learners and meta-learners were generated, resulting in a total of 40 non-repetitive fusion models. The optimal combination was selected through a comprehensive evaluation based on multiple assessment indicators. [Results] The combination "PLS-DELM-MLP-LR" (code 1367) achieved coefficients of determination of 0.9732 and 0.9780 on the validation set and independent test set, respectively, with relative root mean square errors of 2.35% and 2.36%, and residual predictive deviations of 6.1075 and 6.7479, respectively. [Conclusions] The Stacking fusion model significantly enhances the predictive accuracy and robustness of spectral quantitative analysis, providing an efficient and feasible solution for modeling complex agricultural product spectral data.
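The integer coding of stacking combinations can be illustrated with a small decoder. The abstract gives only one coded example ("PLS-DELM-MLP-LR" as 1367), so the digit table below is an inference from that example and from the order in which the base learners are listed; it is an assumption, not the paper's published scheme. The codes for PLS and DELM acting as meta-learners are not stated and are left out. The names `BASE_CODES`, `META_CODES`, and `decode_combination` are hypothetical.

```python
# Assumed digit assignment, inferred from the single example "1367":
# base learners numbered in the order the abstract lists them.
BASE_CODES = {"1": "PLS", "2": "SVM", "3": "DELM", "4": "RF", "5": "GBDT", "6": "MLP"}
# Only the meta-learner code for LR can be inferred from the abstract.
META_CODES = {"7": "LR"}


def decode_combination(code):
    """Split a code string into (base learner names, meta-learner name).

    The last digit is read as the meta-learner, all preceding digits as
    base learners. Unknown digits raise KeyError rather than being guessed.
    """
    *base_digits, meta_digit = code
    if not base_digits:
        raise ValueError("a stacking combination needs at least one base learner")
    base = [BASE_CODES[d] for d in base_digits]
    meta = META_CODES[meta_digit]
    return base, meta


base, meta = decode_combination("1367")
print("-".join(base + [meta]))
```

Under these assumptions the decoder reproduces the label reported in the abstract for code 1367; any other code should be treated as untested against the paper.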
Funding: Funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant No. GPIP:1074-612-2024.
Abstract: The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates a transformer-based model (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to significantly enhance smishing detection performance. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among the LLMs, followed by a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% compared to the LLMs, across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Abstract: Distributed Denial of Service (DDoS) attacks are among the most severe threats to network infrastructure, and their evolving complexity sometimes lets them bypass traditional detection algorithms. Current Machine Learning (ML) techniques for DDoS attack detection normally rely on statistical features of network traffic, such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complicated relations among traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, letting different GNN models capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the applicability and robustness of the proposed method for real-world deployment in contexts where several DDoS patterns coexist.
Funding: Funded by the Key Laboratory of Geological Safety of Coastal Urban Underground Space, Ministry of Natural Resources of China (Grant No. BHKF2022Y02), the Natural Science Foundation of Guangdong Province, China (Grant No. 2024A1515011162), and the Natural Science Foundation of Shandong Province, China (Grant No. ZR2024QE021).
Abstract: Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. Specifically, rock thrown beyond the designated blast area has been identified as the primary cause of fatal and non-fatal blasting accidents in open pit mining. Accurate and reliable prediction of flyrock is therefore crucial for effectively managing and mitigating the associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, the multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed that MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model outperformed DT, SVM, and CART. This study highlights the effectiveness of MVO-LightGBM and its potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to quantify the sensitivity of the parameters. MPSA results indicated that the highest and lowest sensitivities belong to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Funding: Supported through the Ongoing Research Funding Program (ORF-2025-498), King Saud University, Riyadh, Saudi Arabia.
Abstract: Android smartphones have become an integral part of our daily lives, which makes them targets for ransomware attacks. Such attacks encrypt user information and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, and Bagging, along with AutoML models. We aimed to show how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while matching or exceeding the performance of traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository that contains features of Android ransomware network traffic. The dataset comprises 392,024 flow records divided into eleven classes: ten for ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin, and one for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier reached an accuracy of 99.84%, and the FLAML AutoML framework was slightly more accurate at 99.85%. This indicates how well AutoML can perform with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it can facilitate the development of next-generation intrusion detection systems.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12274131) and the Innovation Program for Quantum Science and Technology (Grant No. 2024ZD0300101).
Abstract: Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Funding: Partially supported by Begum Rokeya University, Rangpur, and the United Arab Emirates University, UAE.
Abstract: Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied for automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract and analyze the learned features using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and on a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared to existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Funding: King Salman Center for Disability Research, through Research Group No. KSRG-2024-050.
Abstract: Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. To this end, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed to handle missing values, redundant features, and class imbalance to ensure fair learning. The results outperformed the state of the art, with a Regularized Neural Network achieving 97.6% accuracy on the behavioral data and 98.2% on the multimodal data. Other models also did well, with accuracies consistently above 96%. We also applied SHAP and LIME to the behavioral dataset for model explainability.
Funding: The National Natural Science Foundation of China (NSFC, Grant Nos. 90711003 and 40375014), the program GYHY200706005, and the APCC Visiting Scientist Program jointly supported this work.
Abstract: The 21-yr ensemble predictions of model precipitation and circulation in the East Asian and western North Pacific (Asia-Pacific) summer monsoon region (0°-50°N, 100°-150°E) were evaluated in nine different AGCMs used in the Asia-Pacific Economic Cooperation Climate Center (APCC) multi-model ensemble seasonal prediction system. The analysis indicates that the precipitation anomaly patterns of the model ensemble predictions are substantially different from the observed counterparts in this region, but the summer monsoon circulations are reasonably predicted. For example, all models reproduce well the interannual variability of the western North Pacific monsoon index (WNPMI) defined by 850 hPa winds, but they fail to predict the relationship between the WNPMI and precipitation anomalies. In contrast to precipitation anomalies, the interannual variability of the 500 hPa geopotential height (GPH) is well predicted by the models. On the basis of these model performances and the relationship between the interannual variations of 500 hPa GPH and precipitation anomalies, we developed a statistical scheme to downscale the summer monsoon precipitation anomaly on the basis of EOF and singular value decomposition (SVD). In this scheme, the three leading EOF modes of the 500 hPa GPH anomaly fields predicted by the models are first corrected by linear regression between the principal components in each model and in the observations. Then, the corrected model GPH is chosen as the predictor to downscale the precipitation anomaly field, which is assembled from the forecasted expansion coefficients of the model 500 hPa GPH and the three leading SVD modes of the observed precipitation anomaly corresponding to the prediction of model 500 hPa GPH during a 19-year training period. The cross-validated forecasts suggest that this downscaling scheme has the potential to improve the forecast skill of the precipitation anomaly in the South China Sea, western North Pacific, and East Asia Pacific regions, where the anomaly correlation coefficient (ACC) improved by 0.14, corresponding to a 10.4% reduction in RMSE relative to the conventional multi-model ensemble (MME) forecast.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41405083, 91437220 and 41305066), the Natural Science Foundation of Hunan Province (Grant No. 2015JJ3098), and the Fund Project for the Education Department of Hunan Province (Grant No. 14C0897).
Abstract: In order to reduce the uncertainty of offline land surface model (LSM) simulations of land evapotranspiration (ET), we used ensemble simulations based on three meteorological forcing datasets [Princeton, ITPCAS (Institute of Tibetan Plateau Research, Chinese Academy of Sciences), Qian] and four LSMs (BATS, VIC, CLM3.0 and CLM3.5) to explore the trends and spatiotemporal characteristics of ET, as well as the spatiotemporal pattern of ET in response to climate factors, over mainland China during 1982-2007. The results showed that the simulations of each member and their arithmetic mean (Ens_Mean) captured the spatial distribution and seasonal pattern of ET sufficiently well, while exhibiting more pronounced spatial and seasonal variation in ET than the observation-based ET estimates (Obs_MTE). For the mean annual ET, we found that BATS forced by the Princeton forcing overestimated the annual mean ET compared with Obs_MTE for most of the basins in China, whereas VIC forced by the Princeton forcing underestimated it. By contrast, the Ens_Mean was closer to Obs_MTE, although it underestimated ET over Southeast China. Furthermore, both the Obs_MTE and the Ens_Mean exhibited a significant increasing trend during 1982-98, whereas after 1998, when the last big El Niño event occurred, the Ens_Mean decreased significantly between 1999 and 2007, although the change was not significant for Obs_MTE. Changes in air temperature and shortwave radiation played key roles in the long-term variation in ET over the humid area of China, but precipitation mainly controlled the long-term variation in ET in the arid and semi-arid areas of China.
Funding: Co-supported by the National Natural Science Foundation (Grant Nos. 41005052 and 41375086), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA05110201), and the National Basic Research Program of China (Grant No. 2010CB950403).
Abstract: Seasonal prediction of summer rainfall over the Yangtze River valley (YRV) is valuable for agricultural and industrial production and freshwater resource management in China, but remains a major challenge. Earlier multi-model ensemble (MME) prediction schemes for summer rainfall over China focus on single-value prediction, which cannot provide the necessary uncertainty information, while commonly used ensemble schemes for probability density function (PDF) prediction are not adapted to YRV summer rainfall prediction. In the present study, an MME PDF prediction scheme is proposed based on the ENSEMBLES hindcasts. It is similar to the earlier Bayesian ensemble prediction scheme, but with optimization of ensemble members and a revision of the variance modeling of the likelihood function. The optimized ensemble members are regressed YRV summer rainfall, with factors selected from model outputs of synchronous 500-hPa geopotential height as predictors. The revised variance modeling of the likelihood function is a simple linear regression with ensemble spread as the predictor. The cross-validation skill of 1960–2002 YRV summer rainfall prediction shows that the new scheme produces a skillful PDF prediction, and is much better calibrated, sharper, and more accurate than the earlier Bayesian ensemble and the raw ensemble.
Funding: Special Research Program for Public Welfare (Meteorology) of China (GYHY200906009, GYHY201006015, GYHY200906007); National Natural Science Foundation of China (41075035, 41475044).
Abstract: This study investigates multi-model ensemble forecasts of track and intensity of tropical cyclones over the western Pacific, based on forecast outputs from the China Meteorological Administration, European Centre for Medium-Range Weather Forecasts, Japan Meteorological Agency and National Centers for Environmental Prediction in the THORPEX Interactive Grand Global Ensemble (TIGGE) datasets. The multi-model ensemble schemes, namely the bias-removed ensemble mean (BREM) and superensemble (SUP), are compared with the ensemble mean (EMN) and single-model forecasts. Moreover, a new model bias estimation scheme is investigated and applied to the BREM and SUP schemes. The results showed that, compared with single-model forecasts and EMN, the multi-model ensembles of the BREM and SUP schemes have smaller errors in most cases. However, there were also circumstances where BREM was less skillful than EMN, indicating that using a time-averaged error as model bias is not optimal. A new model bias estimation scheme based on the biweight mean is introduced. By minimizing the negative influence of singular errors, this scheme obtains a more accurate model bias estimate and improves the BREM forecast skill. Applying the biweight mean in the bias calculation of SUP also resulted in improved skill. The results indicate that modifying multi-model ensemble schemes through this bias estimation method is feasible.
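A short sketch may help show why a biweight mean can give a steadier bias estimate than a plain time average. It uses the commonly cited Tukey biweight location estimator, with the median as the centre and the median absolute deviation as the scale. The tuning constant `c = 7.5`, the function names, and the sample error values are assumptions made for illustration; the abstract does not give the authors' exact formulation or data.

```python
from statistics import mean, median


def biweight_mean(values, c=7.5):
    """Tukey biweight location: a mean that down-weights outlying values.

    c is a tuning constant (7.5 is an assumed, commonly quoted choice).
    Values farther than c * MAD from the median receive zero weight.
    """
    values = list(values)
    m = median(values)
    mad = median([abs(v - m) for v in values])
    if mad == 0:
        return m  # no spread around the median; nothing to re-weight
    num = den = 0.0
    for v in values:
        u = (v - m) / (c * mad)
        if abs(u) < 1:
            w = (1 - u * u) ** 2
            num += (v - m) * w
            den += w
    return m + num / den


def bias_removed_ensemble_mean(train_forecasts, train_obs, new_forecasts, bias_fn=mean):
    """Average the new forecasts after subtracting each model's training bias.

    train_forecasts -- one list of past forecasts per model
    train_obs       -- the matching past observations
    new_forecasts   -- one current forecast per model
    bias_fn         -- mean (time-averaged error) or biweight_mean
    """
    corrected = []
    for past, current in zip(train_forecasts, new_forecasts):
        errors = [f - o for f, o in zip(past, train_obs)]
        corrected.append(current - bias_fn(errors))
    return mean(corrected)


# Hypothetical training errors for one model, with a single singular error.
errors = [0.8, 1.0, 1.2, 0.9, 1.1, 20.0]
print(mean(errors), biweight_mean(errors))
```

With these made-up numbers the time-averaged error is pulled above 4 by the single large value, while the biweight estimate stays close to 1; passing `bias_fn=biweight_mean` to `bias_removed_ensemble_mean` applies the same robust estimate per model.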
基金the National Natural Science Foundation of China(61873283)the Changsha Science&Technology Project(KQ1707017)the innovation-driven project of the Central South University(2019CX005).
Abstract: Dissolved oxygen (DO) is an important indicator in aquaculture, and its accurate forecasting can effectively improve the quality of aquatic products. In this paper, a new DO hybrid forecasting model is proposed that includes three stages: multi-factor analysis, adaptive decomposition, and an optimization-based ensemble. First, considering the complex factors affecting DO, the grey relational (GR) degree method is used to screen out the environmental factors most closely related to DO. The consideration of multiple factors makes model fusion more effective. Second, the series of DO, water temperature, salinity, and oxygen saturation are decomposed adaptively into sub-series by means of the empirical wavelet transform (EWT) method. Then, five benchmark models are utilized to forecast the sub-series of the EWT decomposition. The ensemble weights of these five sub-forecasting models are calculated by the particle swarm optimization and gravitational search algorithm (PSOGSA). Finally, a multi-factor ensemble model for DO is obtained by weighted allocation. The performance of the proposed model is verified using time-series data collected by the Pacific Islands Ocean Observing System (PacIOOS) from the WQB04 station at Hilo. The evaluation indicators used in the experiment include the Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), mean absolute percentage error (MAPE), standard deviation of error (SDE), and coefficient of determination (R²). Example analysis demonstrates that: (1) the proposed model can obtain excellent DO forecasting results; (2) the proposed model is superior to the other comparison models; and (3) the forecasting model can be used to analyze the trend of DO and enable managers to make better management decisions.
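The weighted combination step and three of the evaluation indicators named in this abstract can be sketched with their standard definitions. The sketch assumes the ensemble weights are already available (the paper obtains them with PSOGSA, which is not reproduced here), and the function names are illustrative only.

```python
from math import sqrt
from statistics import mean, pstdev


def weighted_ensemble(forecasts, weights):
    """Combine base-model forecasts (one list per model) using fixed weights."""
    return [sum(w * f for w, f in zip(weights, step)) for step in zip(*forecasts)]


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mo = mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def kge(obs, pred):
    """Kling-Gupta efficiency from correlation, variability ratio and bias ratio."""
    mo, mp = mean(obs), mean(pred)
    so, sp = pstdev(obs), pstdev(pred)
    cov = mean([(o - mo) * (p - mp) for o, p in zip(obs, pred)])
    r = cov / (so * sp)
    return 1.0 - sqrt((r - 1.0) ** 2 + (sp / so - 1.0) ** 2 + (mp / mo - 1.0) ** 2)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be nonzero)."""
    return 100.0 * mean([abs((o - p) / o) for o, p in zip(obs, pred)])
```

An optimizer such as PSOGSA would search for the weights that minimize an error measure of weighted_ensemble on training data; here the weights are simply supplied by the caller.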
Funding: Supported by the Knowledge Innovation Key Project of the Chinese Academy of Sciences (CAS) under Grant No. KZCX2-YW-217 and the Doctor Research Startup Project at the Institute of Atmospheric Physics, CAS, under Grant No. 7-098300
Abstract: A Bayesian probabilistic prediction scheme for Yangtze River Valley (YRV) summer rainfall is proposed to combine forecast information from the multi-model ensemble dataset provided by the ENSEMBLES project. Because the forecast skill of rainfall in dynamic models is low, time series of regressed YRV summer rainfall are selected as ensemble members in the new scheme, instead of the commonly used YRV summer rainfall simulated by the models. Each time series of regressed YRV summer rainfall is derived from a simple linear regression. The predictor in each simple linear regression is a skillfully simulated circulation or surface temperature factor that is highly linearly correlated with the observed YRV summer rainfall in the training set. The high correlation between the ensemble mean of these regressed YRV summer rainfall series and the observations helps extract more sample information from the ensemble system. The results show that the cross-validated skill of the new scheme over the period 1960 to 2002 is much higher than that of the equally weighted ensemble, multiple linear regression, and the Bayesian ensemble with simulated YRV summer rainfall as ensemble members. In addition, the new scheme is also more skillful than the reference forecasts (random forecast at a 0.01 significance level for the ensemble mean and climatology forecast for the probability density function).
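The idea of using regressed rainfall series as ensemble members, evaluated under cross-validation, can be sketched as below. The abstract does not specify the cross-validation design, so leave-one-out is an assumption here, and the names fit_line, loo_regressed_series and ensemble_mean are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def loo_regressed_series(predictor, obs):
    """Leave-one-out regressed rainfall for one predictor (assumed design).

    For each year, the regression is fitted on all other years and then
    applied to the held-out year's predictor value.
    """
    series = []
    for k in range(len(obs)):
        x_train = predictor[:k] + predictor[k + 1:]
        y_train = obs[:k] + obs[k + 1:]
        a, b = fit_line(x_train, y_train)
        series.append(a + b * predictor[k])
    return series


def ensemble_mean(member_series):
    """Year-by-year mean across the regressed member series."""
    return [mean(year_values) for year_values in zip(*member_series)]
```

Each simulated circulation or surface temperature factor would yield one such regressed series, and the collection of series would then play the role of the ensemble members in the Bayesian combination.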