With rapid economic development, air pollution caused by industrial expansion has seriously harmed human health and social development. Establishing an effective air pollution concentration prediction system that delivers accurate and reliable predictions is therefore of great scientific and practical significance. This paper proposes a combined point-interval prediction system for pollutant concentration that leverages neural networks, a meta-heuristic optimization algorithm, and fuzzy theory. Fuzzy information granulation is used in data preprocessing to transform numerical sequences into fuzzy particles for comprehensive feature extraction. The golden jackal optimization algorithm is employed in the optimization stage to fine-tune model hyperparameters. In the prediction stage, an ensemble learning method combines the training results from multiple models to obtain the final point predictions, while quantile regression and kernel density estimation are used for interval predictions on the test set. Experimental results show that the combined model achieves a coefficient of determination (R²) of 99.3%, and that its mean absolute percentage error (MAPE) differs from that of the benchmark model by up to 12.6%. This suggests that the proposed ensemble learning system can provide more accurate deterministic predictions and more reliable uncertainty analysis than traditional models, offering a practical reference for air quality early warning.
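The two point-prediction metrics quoted in this abstract, R² and MAPE, can be written out as a short sketch. The observed and predicted values below are invented for illustration only and are not taken from the paper; the function names are likewise hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative pollutant concentrations (assumed values, not from the paper).
observed = [50.0, 60.0, 80.0, 100.0]
predicted = [52.0, 57.0, 84.0, 95.0]

r2_value = r_squared(observed, predicted)
mape_value = mape(observed, predicted)
print(f"R^2 = {r2_value:.4f}, MAPE = {mape_value:.2f}%")
```

A higher R² and a lower MAPE both indicate a better fit, which is why the abstract reports them together.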
Deep-learning-based analysis of computed tomography (CT) images contributes to the automated diagnosis of COVID-19, and ensemble learning can often provide a better solution. Here, we propose an ensemble learning method that integrates several component neural networks to jointly diagnose COVID-19. Two ensemble strategies are considered: (1) combining the output scores of all component models with weights that are adjusted adaptively by back-propagation of the cost function, and (2) a voting strategy. A database containing 8347 CT slices of COVID-19, common pneumonia, and normal subjects was used for training and testing. Results show that the method can reach an accuracy of 99.37% (recall: 0.9981; precision: 0.9893), an increase of about 7% over single-component models. The average test accuracy is 95.62% (recall: 0.9587; precision: 0.9559), a corresponding increase of 5.2%. Compared with several recent deep learning models on the identical test set, our method improved accuracy by up to 10.88%. The proposed method may be a promising solution for the diagnosis of COVID-19.
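The two ensemble strategies named in this abstract, weighted score fusion and voting, can be contrasted in a minimal sketch. The scores and weights below are invented for illustration; in the paper the weights are learned by back-propagation, whereas here they are simply fixed by hand, and all function names are hypothetical.

```python
from collections import Counter

CLASSES = ["COVID-19", "common pneumonia", "normal"]


def weighted_fusion(score_lists, weights):
    """Return the class index with the highest weighted-average score."""
    n_classes = len(score_lists[0])
    total = sum(weights)
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, score_lists)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__)


def majority_vote(score_lists):
    """Return the class index predicted by most component models."""
    votes = [max(range(len(s)), key=s.__getitem__) for s in score_lists]
    return Counter(votes).most_common(1)[0][0]


# Per-class scores from three hypothetical component networks for one CT slice.
scores = [
    [0.90, 0.05, 0.05],
    [0.30, 0.60, 0.10],
    [0.35, 0.55, 0.10],
]
weights = [0.6, 0.2, 0.2]  # assumed fixed weights, not learned

fused_class = weighted_fusion(scores, weights)
voted_class = majority_vote(scores)
print(CLASSES[fused_class], "|", CLASSES[voted_class])
```

In this toy case the two strategies disagree: one confident, heavily weighted model dominates the fused score, while voting follows the two less confident models.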
Driven by rapid technological advancement and economic growth, mineral extraction and metal refining have increased dramatically, generating huge volumes of tailings and mine waste (TMWs). Investigating the morphological fractions of heavy metals and metalloids (HMMs) in TMWs is key to evaluating their potential to leach into the environment; however, traditional experiments are time-consuming and labor-intensive. In this study, ten machine learning (ML) algorithms were applied and compared for rapidly predicting the morphological fractions of HMMs in TMWs. A dataset comprising 2376 data points was used, with mineral composition, elemental properties, and total concentration as inputs and the concentration of each morphological fraction as output. After grid search optimization, the extra trees model performed best, achieving a coefficient of determination (R²) of 0.946 and 0.942 on the validation and test sets, respectively. Electronegativity was found to have the greatest impact on the morphological fraction. Model performance was further enhanced by applying an ensemble method to the three best ML models: gradient boosting decision tree, extra trees, and categorical boosting. Overall, the proposed framework can accurately predict the concentrations of different morphological fractions of HMMs in TMWs. This approach can minimize detection time and aid in the safe management and recovery of TMWs.
This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combine Radial Basis Function Networks (RBFN) with three ensemble learning techniques: Dagging (DG), MultiBoost (MB), and AdaBoost (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighted method, Information Value (IV), and a benchmark machine learning (ML) model, the Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors: slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were used to predict the output variable, the probability of landslide occurrence. Statistical analysis of model performance indicated that the DG-RBFN model, with an Area Under the ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model 0.913, and the MLP model 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby supporting land use planning and decision-making.
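The AUC used to rank these models can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. A minimal sketch of that pairwise definition follows; the labels and scores are invented for illustration, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# 1 = landslide cell, 0 = stable cell (assumed toy data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc_value = roc_auc(labels, scores)
print(f"AUC = {auc_value:.3f}")
```

The quadratic pairwise loop is chosen for clarity; rank-based formulas give the same value more efficiently on large maps.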
Real-time and reliable measurements of effluent quality are essential for improving operating efficiency and reducing energy consumption in the wastewater treatment process. Because traditional effluent quality measurements suffer from low accuracy and unstable performance, we propose a selective ensemble extreme learning machine modeling method to enhance effluent quality predictions. The extreme learning machine algorithm is inserted into a selective ensemble framework as the component model because it runs much faster and generalizes better than other popular learning algorithms. Ensemble extreme learning machine models overcome the variation between different simulation trials of a single model. A selective ensemble based on a genetic algorithm is then used to exclude poorly performing components from the available ensemble members in order to reduce computational complexity and improve generalization performance. The proposed method is verified with data from an industrial wastewater treatment plant located in Shenyang, China. Experimental results show that the proposed method has stronger generalization and higher accuracy than partial least squares, neural network partial least squares, a single extreme learning machine, and an ensemble extreme learning machine model.
Metamaterial antennas are a special class of antennas that use metamaterials to enhance their performance. Antenna size affects the quality factor and the radiation loss of the antenna, and metamaterial antennas can overcome the bandwidth limitation of small antennas. Machine learning (ML) models have recently been applied to predict antenna parameters. ML can be used as an alternative to the trial-and-error process of finding proper parameters for the simulated antenna. The accuracy of the prediction depends mainly on the selected model. Ensemble models combine two or more base models to produce an enhanced model. In this paper, a weighted average ensemble model is proposed to predict the bandwidth of the metamaterial antenna. Two base models are used, namely Multilayer Perceptron (MLP) and Support Vector Machines (SVM). The Dynamic Group-Based Cooperative Optimizer (DGCO) is employed to search for the optimal weights of the base models in the ensemble. The proposed model is compared with three base models and the average ensemble model. The results show that the proposed model outperforms the other models and can predict antenna bandwidth efficiently.
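The idea of a weighted average ensemble with searched weights can be sketched as follows. This sketch swaps in a plain grid search over a single weight for the DGCO optimizer described in the abstract, and the target values and base-model predictions are invented for illustration; the names `mse` and `best_weight` are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_weight(y_true, pred_a, pred_b, steps=100):
    """Grid-search w in [0, 1] for the blend w * pred_a + (1 - w) * pred_b."""
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]
        err = mse(y_true, blended)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Assumed validation data: one model over-predicts, the other under-predicts.
bandwidth = [1.0, 2.0, 3.0, 4.0]
mlp_pred = [1.2, 2.2, 3.2, 4.2]   # bias of +0.2
svm_pred = [0.7, 1.7, 2.7, 3.7]   # bias of -0.3

w_opt, err_opt = best_weight(bandwidth, mlp_pred, svm_pred)
simple_avg = [(a + b) / 2.0 for a, b in zip(mlp_pred, svm_pred)]
err_avg = mse(bandwidth, simple_avg)
print(f"w = {w_opt:.2f}, weighted MSE = {err_opt:.2e}, average MSE = {err_avg:.2e}")
```

Because the two biases have opposite signs and different magnitudes, an unequal weighting cancels them better than the simple average, which is the motivation for optimizing the weights at all.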
Predictive analytics have been widely used in the literature on laparoscopic surgery and risk stratification. However, most predictive analytics in this field rely on generalized linear models, which are limited by model assumptions, including linearity between predictors and response variables and additive interactions between variables. In many instances, such assumptions may not hold, and the complex relationship between predictors and response variables is usually unknown. To address this limitation, machine learning algorithms can be employed to model the underlying data. The advantage of machine learning algorithms is that they usually do not require strict assumptions about data structure, and they can learn complex functional forms using a nonparametric approach. Furthermore, two or more machine learning algorithms can be combined to further improve predictive accuracy. Such a process is referred to as ensemble modeling, and it has been used broadly in various industries. However, this approach has not been widely reported in the laparoscopic surgical literature because of its complexity in both model training and interpretation. In this technical note, we provide a comprehensive overview of the ensemble modeling technique and a step-by-step tutorial on how to implement it.
Employing machine learning techniques to predict the parameters of metamaterial antennas significantly reduces the time needed to design an antenna with optimal parameters using simulation tools. In this paper, we propose a new approach for predicting the bandwidth of a metamaterial antenna using a novel ensemble model. The proposed ensemble model is composed of two levels of regression models. The first level consists of three strong models, namely random forest, support vector regression, and light gradient boosting machine. The second level is based on the ElasticNet regression model, which receives the predictions of the first-level models and refines them to produce the final result. To achieve the best performance from these regression models, the advanced squirrel search optimization algorithm (ASSOA) is used to search for the optimal set of hyperparameters of each model. Experimental results show that the proposed two-level ensemble model achieves a robust prediction of the bandwidth of the metamaterial antenna when compared with recently published ensemble models on the same publicly available benchmark dataset. The proposed approach yields a root mean square error (RMSE) of 0.013, a mean absolute error (MAE) of 0.004, and a mean bias error (MBE) of 0.0017. These results are superior to those of the competing ensemble models, indicating that the proposed model predicts the antenna bandwidth more accurately.
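The three error measures reported here differ in how they treat the sign and size of individual errors, which a short sketch makes concrete. The data are invented for illustration, and the sign convention for MBE (predicted minus observed) is an assumption, since the abstract does not state it.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more heavily."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average error magnitude."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mbe(y_true, y_pred):
    """Mean bias error (assumed convention: predicted minus observed)."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)


# Assumed toy bandwidth values, not taken from the benchmark dataset.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 5.0]

rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
mbe_value = mbe(observed, predicted)
print(f"RMSE = {rmse_value:.4f}, MAE = {mae_value:.2f}, MBE = {mbe_value:.2f}")
```

Positive and negative errors cancel in MBE but not in MAE or RMSE, so a small MBE alone does not imply accurate predictions.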
The Efficient Global Optimization (EGO) algorithm has been widely used in the numerical design optimization of engineering systems. However, its need for an uncertainty estimator limits the choice of surrogate model. In this paper, a Sequential Ensemble Optimization (SEO) algorithm based on an ensemble model is proposed. The proposed algorithm places no limitation on the choice of individual surrogate models. Specifically, SEO extends the EGO algorithm so that it can be used in combination with an ensemble model. In addition, a new uncertainty estimator that works with any surrogate model, named the General Uncertainty Estimator (GUE), is proposed. The performance of the proposed SEO algorithm is verified by simulations on ten well-known mathematical functions of varying dimensions. The results show that the proposed SEO algorithm outperforms the traditional EGO algorithm in terms of both the final optimization results and the convergence rate. Further, the proposed algorithm is applied to global optimization control for turbofan engine acceleration schedule design.
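The reason EGO needs an uncertainty estimate is visible in its standard expected improvement (EI) criterion, which a short sketch can show for a minimization problem. The abstract does not define the GUE, so the sketch below uses the spread of ensemble member predictions as a stand-in uncertainty; this is an assumption for illustration, not the paper's estimator, and all numbers are invented.

```python
import math
import statistics


def expected_improvement(mu, sigma, f_best):
    """Standard EI for minimization, given a predicted mean and std. deviation."""
    if sigma <= 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_best - mu) * cdf + sigma * pdf


def ensemble_mean_and_spread(member_predictions):
    """Stand-in uncertainty: population std. dev. across ensemble members."""
    return statistics.mean(member_predictions), statistics.pstdev(member_predictions)


f_best = 1.0  # best objective value observed so far (assumed)

# Predictions of three hypothetical surrogates at two candidate points.
mu_a, sigma_a = ensemble_mean_and_spread([0.9, 1.0, 1.1])  # members agree
mu_b, sigma_b = ensemble_mean_and_spread([0.5, 1.0, 1.5])  # members disagree

ei_a = expected_improvement(mu_a, sigma_a, f_best)
ei_b = expected_improvement(mu_b, sigma_b, f_best)
print(f"EI(A) = {ei_a:.4f}, EI(B) = {ei_b:.4f}")
```

Both candidates have the same predicted mean, but the point where the surrogates disagree gets the larger EI, so it would be sampled next. Without some value for sigma the criterion cannot balance exploration against exploitation.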
Stomatopods, better known as mantis shrimp, are of considerable ecological importance in coastal waters worldwide. Some stomatopod species are exploited commercially, including Oratosquilla oratoria in the Northwest Pacific. Yet few studies have been published on accurate habitat identification for stomatopods, which hinders the scientific management and conservation of these valuable organisms. This study provides an ensemble modeling framework for habitat suitability modeling of stomatopods, using the O. oratoria stock in the Bohai Sea as an example. Two modeling techniques, the generalized additive model (GAM) and geographically weighted regression (GWR), were applied to select the environmental predictors that best characterize O. oratoria distribution (in particular, to choose between two types of sediment metrics) and to build separate habitat suitability models (HSMs). The performance of the individual HSMs was compared in terms of interpolation accuracy and transferability. They were then integrated to check whether the ensemble model outperforms either individual model, according to fishers' knowledge and scientific survey data. Grain-size metrics of sediment outperformed sediment content metrics in modeling O. oratoria habitat, possibly because grain-size metrics not only reflect the effect of substrates on burrow development but are also linked to sediment heat capacity, which influences individual thermoregulation. Moreover, the GWR-based HSM outperformed the GAM-based HSM in interpolation accuracy, while the latter displayed better transferability. On balance, the ensemble HSM appeared to improve overall predictive performance, as it avoided dependence on a single model type and successfully identified fisher-recognized and survey-indicated suitable habitats in both sparsely sampled and well-investigated areas.
Root-cause identification plays a vital role in business decision making by providing effective future directions for organizations. Aspect extraction and sentiment extraction play a vital role in identifying root causes. This paper proposes the Ensemble-based Temporal weighting and Pareto ranking (ETP) model for root-cause identification. Aspect extraction is performed based on rules and is followed by opinion identification using the proposed boosted ensemble model. The obtained aspects are validated and ranked using the proposed aspect weighting scheme. Pareto-rule-based aspect selection is performed as the final selection mechanism, and the results are presented for business decision making. Experiments were performed with the standard five-product benchmark dataset. Results on all five product reviews indicate that the proposed model performs effectively. Comparisons are made with three standard state-of-the-art models, and effectiveness is measured in terms of F-measure and detection rate. The results indicate that the proposed model improves F-measure by 1%–15% and detection rate by 4%–24% compared to the state-of-the-art models.
Despite the advances of recent decades in the field of smart grids, energy consumption forecasting using meteorological features is still challenging. This paper proposes a genetic algorithm-based adaptive error curve learning ensemble (GA-ECLE) model. The proposed technique copes with stochastic variations to improve energy consumption forecasting using a machine learning-based ensemble approach. A modified ensemble model that uses the model's error as a feature is employed to improve forecast accuracy. This approach combines three models, namely CatBoost (CB), Gradient Boost (GB), and Multilayer Perceptron (MLP). The inner mechanism of the ensembled CB-GB-MLP model consists of generating metadata from the Gradient Boosting and CatBoost models and computing the final predictions with the Multilayer Perceptron network. A genetic algorithm is used to obtain the optimal features for the model. To demonstrate the proposed model's effectiveness, we used a four-phase technique with real energy consumption data from Jeju Island. In the first phase, we obtained results by applying the CB-GB-MLP model. In the second phase, we used a GA-ensembled model with optimal features. The third phase compares the energy forecasting results with the proposed ECL-based model. In the fourth and final phase, we applied the GA-ECLE model. We obtained a mean absolute error of 3.05 and a root mean square error of 5.05. Extensive experimental results are provided, demonstrating the superiority of the proposed GA-ECLE model over traditional ensemble models.
Climate change in mountainous regions has significant impacts on hydrological and ecological systems. This research studied future temperature, precipitation, and snowfall in the 21st century for the Tianshan and northern Kunlun Mountains (TKM), based on the general circulation model (GCM) simulation ensemble from the Coupled Model Intercomparison Project Phase 5 (CMIP5) under the representative concentration pathway (RCP) lower emission scenario RCP4.5 and higher emission scenario RCP8.5, using the Bayesian model averaging (BMA) technique. Results show that (1) BMA significantly outperformed the simple ensemble analysis, and the BMA mean matches all three observed climate variables; (2) at the end of the 21st century (2070–2099) under RCP8.5, compared to the control period (1976–2005), annual mean temperature and mean annual precipitation will rise considerably, by 4.8°C and 5.2%, respectively, while mean annual snowfall will decrease dramatically, by 26.5%; (3) precipitation will increase in the northern Tianshan region but decrease in the Amu Darya Basin, snowfall will decrease significantly in the western TKM, and the mean annual snowfall fraction will decrease from 0.56 in 1976–2005 to 0.42 in 2070–2099 under RCP8.5; and (4) snowfall shows a high sensitivity to temperature in autumn and spring but a low sensitivity in winter, with the highest sensitivity values occurring at the edges of the TKM. The projections imply that flood risk will increase and solid water storage will decrease.
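The contrast between a simple ensemble mean and Bayesian model averaging can be illustrated with a deliberately simplified sketch. Here each model's weight is proportional to a Gaussian likelihood of the historical observations with a fixed, assumed error standard deviation; full BMA implementations typically estimate weights and variances jointly, for example with an EM algorithm, so this is not the study's procedure. All temperatures and function names are invented for illustration.

```python
import math


def bma_weights(observed, simulations, sigma=1.0):
    """Weights proportional to each model's Gaussian likelihood (fixed sigma)."""
    log_liks = []
    for sim in simulations:
        sse = sum((o - s) ** 2 for o, s in zip(observed, sim))
        log_liks.append(-0.5 * sse / sigma ** 2)
    top = max(log_liks)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def weighted_mean(values, weights):
    """Weighted average; weights are assumed to sum to one."""
    return sum(v * w for v, w in zip(values, weights))


# Assumed historical temperatures and three hypothetical model hindcasts.
observed = [10.0, 11.0, 12.0]
hindcasts = [
    [10.1, 11.1, 12.1],  # close to observations
    [12.0, 13.0, 14.0],  # strong warm bias
    [9.0, 10.0, 11.0],   # moderate cold bias
]
projections = [15.0, 18.0, 14.0]  # each model's future projection (assumed)

weights = bma_weights(observed, hindcasts)
bma_projection = weighted_mean(projections, weights)
simple_projection = sum(projections) / len(projections)
print(f"weights = {[round(w, 3) for w in weights]}")
print(f"BMA = {bma_projection:.2f}, simple mean = {simple_projection:.2f}")
```

The weighted projection leans toward the model that reproduced the historical record best, whereas the simple mean treats the strongly biased model as equally credible.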
The strong mechanical vibration and acoustic signals of the grinding process contain useful information about the load parameters of ball mills. Extracting latent features from the high-dimensional frequency spectra of these signals and constructing a soft sensor model from them is a challenge. This paper develops a selective ensemble modeling approach based on nonlinear latent frequency spectral feature extraction for accurate measurement of the material-to-ball volume ratio. Latent features are first extracted from different vibration and acoustic spectral segments by kernel partial least squares. Bootstrap sampling and least squares support vector machines are then employed to produce candidate sub-models using these latent features as inputs. Ensemble sub-models are selected using a genetic algorithm optimization toolbox. Partial least squares regression is used to combine these sub-models and eliminate collinearity among their prediction outputs. Results indicate that the proposed modeling approach has better prediction performance than previous ones.
Numerous factors contribute to the temperature rise of a machine tool, including prolonged and high-intensity usage, tool-workpiece interaction, mechanical friction, and elevated ambient temperatures. Consequently, spindle thermal displacement occurs, and machining precision suffers. To prevent errors caused by the temperature rise of the spindle from affecting accuracy during machining, the factory typically warms up the machine before the manufacturing process. However, if there is no way to determine the thermal deformation of the tool spindle, machining quality will be greatly affected. To solve this problem, this study aims to predict the thermal displacement of the machine tool using intelligent algorithms. In the practical application, only a few temperature sensors are used to feed information into the prediction model for real-time thermal displacement prediction. This approach has greatly improved the quality of tool processing. However, each algorithm performs differently in different environments. In this study, an ensemble model is used to integrate Long Short-Term Memory (LSTM) with Support Vector Machine (SVM). The experimental results show that the prediction performance of LSTM-SVM is higher than that of other machine learning algorithms.
COVID-19 has caused severe health complications and produced a substantial adverse economic impact around the world. Forecasting the trend of COVID-19 infections could help in executing policies to effectively reduce the number of new cases. In this study, we apply a decomposition and ensemble model to forecast COVID-19 confirmed cases, deaths, and recoveries in Pakistan for the upcoming month until the end of July. The Ensemble Empirical Mode Decomposition (EEMD) technique is applied to decompose the data into small components called Intrinsic Mode Functions (IMFs). Each IMF is then modeled with an Autoregressive Integrated Moving Average (ARIMA) model. The data used in this study were obtained from Pakistan's official, publicly available website dedicated to the COVID-19 outbreak, which is updated daily. Our analyses reveal that the numbers of recoveries, new cases, and deaths are increasing exponentially in Pakistan. Based on the selected EEMD-ARIMA model, confirmed cases are expected to rise from 213,470 to 311,454 by 31 July 2020, an increase of almost 1.46 times, with a 95% prediction interval of 246,529 to 376,379. Recoveries are expected to increase almost twofold, from 100,802 to 193,495 by 31 July 2020, with a 95% prediction interval of 162,414 to 224,579. Deaths are expected to increase from 4,395 to 6,751, almost 1.54 times, with a 95% prediction interval of 5,617 to 7,885. Thus, the COVID-19 forecasts for Pakistan are alarming for the month up to 31 July 2020. They also confirm that the EEMD-ARIMA model is useful for short-term forecasting of COVID-19 and that it is capable of tracking the real COVID-19 data in nearly all scenarios. The decomposition and ensemble strategy can help decision-makers develop short-term strategies regarding the current number of disease occurrences until an appropriate vaccine is developed.
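The growth factors quoted in this abstract follow directly from the reported start and forecast counts, and the arithmetic can be checked in a few lines. All numbers below are taken from the abstract itself; only the variable names are introduced here.

```python
# (start count, forecast for 31 July 2020, lower 95% bound, upper 95% bound)
forecasts = {
    "confirmed": (213470, 311454, 246529, 376379),
    "recovered": (100802, 193495, 162414, 224579),
    "deaths": (4395, 6751, 5617, 7885),
}

growth = {}
inside_interval = {}
for name, (start, forecast, lower, upper) in forecasts.items():
    growth[name] = round(forecast / start, 2)           # multiplicative increase
    inside_interval[name] = lower <= forecast <= upper  # point forecast within bounds
    print(f"{name}: x{growth[name]}, interval contains forecast: {inside_interval[name]}")
```

The ratios come out at about 1.46 for confirmed cases, 1.92 for recoveries (the "almost two times" in the text), and 1.54 for deaths, and each point forecast lies inside its stated 95% prediction interval.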
Stance detection is the task of identifying the attitude expressed toward a standpoint. Previous work on stance detection has focused on feature extraction but ignored the fact that irrelevant features act as noise during higher-level abstraction. Moreover, because the target is not always mentioned in the text, most methods have ignored target information. To solve these problems, we propose a neural network ensemble method that combines the temporal dependence modeling of long short-term memory (LSTM) with the strong feature extraction performance of convolutional neural networks (CNNs). The method obtains multi-level features that account for both local and global information. We also introduce attention mechanisms to magnify features related to target information. Furthermore, we employ sparse coding to remove noise and obtain characteristic features. Applying sparse coding on top of attention and feature extraction improved performance. We evaluate our approach on the SemEval-2016 Task 6-A public dataset, achieving performance that exceeds the benchmark and that of the participating teams.
A revised support vector regression (SVR) ensemble model based on the boosting algorithm (SVR-Boosting) is presented in this paper for electricity price forecasting in the electric power market. In light of the characteristics of electricity price sequences, a new triangular-shaped loss function is constructed for training the forecasting model to inhibit learning from abnormal data in the price sequence. Results on actual data indicate that, compared with a single support vector regression model, the proposed SVR-Boosting ensemble model remarkably enhances the stability of the model output, achieves higher prediction accuracy, and possesses comparatively satisfactory generalization capability.
Breast cancer is one of the leading cancers among women. It has the second-highest mortality rate in women after lung cancer. Timely detection, especially in the early stages, can help increase survival rates. However, manual diagnosis of breast cancer is a tedious and time-consuming process, and the accuracy of detection depends on the quality of the images and the radiologist's experience. Computer-aided medical diagnosis has recently shown promising results, motivating the development of an efficient system that can aid radiologists in diagnosing breast cancer in its early stages. The research presented in this paper focuses on the multi-class classification of breast cancer. A deep transfer learning approach is used to train the deep learning models, and a pre-processing technique is used to improve the quality of the ultrasound dataset. The proposed technique uses two deep learning models, MobileNetV2 and DenseNet201, to compose the deep ensemble model. The deep learning models are fine-tuned, together with hyperparameter tuning, to achieve better results. Subsequently, entropy-based feature selection is applied. Breast cancer identification using the proposed classification approach attained an accuracy of 97.04%, while the sensitivity and F1 score were 96.87% and 96.76%, respectively. The proposed model outperforms other state-of-the-art techniques presented in the literature.
Aviation accidents have become one of the major causes of severe injuries and fatalities around the world. This has drawn the research community to study aviation safety by applying data analysis techniques based on advanced machine learning algorithms. An ensemble classification model based on the Aviation Safety Reporting System (ASRS) is proposed to analyze aviation safety, targeting the people injured in reported incidents. The ensemble classification model contains two modules: a data-driven module consisting of data cleaning, feature selection, and imbalanced data division and reorganization, and a model-driven module built by stacking Random Forest (RF), XGBoost (XGB), and Light Gradient Boosting Machine (LGBM). The results indicate that the ensemble model can handle the data imbalance while vastly improving accuracy. Among the single models, LGBM shows higher accuracy and faster run time on the ASRS-based imbalanced data, while the ensemble model has the best classification performance overall. The ensemble model proposed for imbalanced data classification can serve as a reference for similar data processing tasks while improving the safety of civil aviation.
Funding (air pollution concentration prediction study): supported by the General Scientific Research Funding of the Science and Technology Development Fund (FDCT) in Macao (No. 0150/2022/A) and the Faculty Research Grants of Macao University of Science and Technology (No. FRG-22-074-FIE).
Funding: The Sichuan Science and Technology Department Research and Development Key Project (No. 21ZDYF3607); the Weining Cloud Hospital Based AI Medical Software System Service and Demo Project (No. 2019K0JTS0159); the China Postdoctoral Science Foundation (No. 2020T130137ZX).
Abstract: Deep learning based analyses of computed tomography (CT) images contribute to the automated diagnosis of COVID-19, and ensemble learning may commonly provide a better solution. Here, we propose an ensemble learning method that integrates several component neural networks to jointly diagnose COVID-19. Two ensemble strategies are considered: (1) combining the output scores of all component models with weights that are adjusted adaptively by back-propagating a cost function, and (2) a voting strategy. A database containing 8347 CT slices of COVID-19, common pneumonia, and normal subjects was used as the training and testing sets. Results show that the novel method can reach a high accuracy of 99.37% (recall: 0.9981; precision: 0.9893), an increase of about 7% over single-component models. The average test accuracy is 95.62% (recall: 0.9587; precision: 0.9559), a corresponding increase of 5.2%. Compared with several recent deep learning models on the identical test set, our method improved accuracy by up to 10.88%. The proposed method may be a promising solution for the diagnosis of COVID-19.
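The two ensemble strategies described above can be illustrated with a short sketch. This is an assumption-laden simplification: the weights here are fixed numbers supplied by the caller, whereas the abstract states they are learned by back propagation, and the names `majority_vote` and `weighted_score_fusion` are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by most component models.

    Ties are resolved in favour of the label seen first.
    """
    return Counter(labels).most_common(1)[0][0]


def weighted_score_fusion(scores, weights):
    """Fuse per-model class scores with normalised weights.

    scores  : one list of class scores per component model
    weights : one non-negative weight per component model (illustrative only)
    Returns (index of the winning class, fused score vector).
    """
    total = sum(weights)
    n_classes = len(scores[0])
    fused = [
        sum(w * s[c] for w, s in zip(weights, scores)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused
```

For example, three models scoring the classes (COVID-19, common pneumonia, normal) can be fused with `weighted_score_fusion(scores, [3, 1, 1])`, which trusts the first model three times as much as the others.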
Funding: Project (2024JJ2074) supported by the Natural Science Foundation of Hunan Province, China; Project (22376221) supported by the National Natural Science Foundation of China; Project (2023QNRC001) supported by the Young Elite Scientists Sponsorship Program by CAST, China.
Abstract: Driven by rapid technological advancements and economic growth, mineral extraction and metal refining have increased dramatically, generating huge volumes of tailings and mine waste (TMWs). Investigating the morphological fractions of heavy metals and metalloids (HMMs) in TMWs is key to evaluating their leaching potential into the environment; however, traditional experiments are time-consuming and labor-intensive. In this study, 10 machine learning (ML) algorithms were used and compared for rapidly predicting the morphological fractions of HMMs in TMWs. A dataset comprising 2376 data points was used, with mineral composition, elemental properties, and total concentration as inputs and the concentration of each morphological fraction as output. After grid search optimization, the extra trees model performed best, achieving a coefficient of determination (R^2) of 0.946 and 0.942 on the validation and test sets, respectively. Electronegativity was found to have the greatest impact on the morphological fraction. The models' performance was further enhanced by applying an ensemble method to the top three ML models: gradient boosting decision tree, extra trees, and categorical boosting. Overall, the proposed framework can accurately predict the concentrations of different morphological fractions of HMMs in TMWs. This approach can minimize detection time and aid in the safe management and recovery of TMWs.
Funding: The University of Transport Technology, under the project entitled "Application of Machine Learning Algorithms in Landslide Susceptibility Mapping in Mountainous Areas" (grant number DTTD2022-16).
Abstract: This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combine Radial Basis Function Networks (RBFN) with three ensemble learning techniques: DAGGING (DG), MULTIBOOST (MB), and ADABOOST (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighted method, Information Value (IV), and a benchmark machine learning (ML) model, the Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors: slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were used to predict the output variable, the probability of landslide occurrence. Statistical analysis of the models' performance indicated that the DG-RBFN model, with an Area Under the ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model an AUC of 0.913, and the MLP model an AUC of 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby supporting land use planning and decision-making.
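The models above are compared by AUC. As a minimal sketch of what that number measures, the AUC can be computed as the fraction of (landslide, non-landslide) pairs that a model ranks in the correct order, with ties counted as one half. The function name `auc` is hypothetical, and both classes are assumed to be present in the labels.

```python
def auc(labels, scores):
    """Rank-based AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count correctly ordered positive/negative pairs; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.931 therefore means that, for roughly 93% of such pairs, the model assigns the higher susceptibility score to the location where a landslide actually occurred.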
Funding: Supported by the National Natural Science Foundation of China (Nos. 61203102 and 60874057) and the Postdoctoral Science Foundation of China (No. 20100471464).
Abstract: Real-time and reliable measurements of effluent quality are essential for improving operating efficiency and reducing energy consumption in the wastewater treatment process. Because traditional effluent quality measurements have low accuracy and unstable performance, we propose a selective ensemble extreme learning machine modeling method to enhance effluent quality predictions. The extreme learning machine algorithm is inserted into a selective ensemble framework as the component model, since it runs much faster and provides better generalization performance than other popular learning algorithms. Ensemble extreme learning machine models overcome the variation between simulation trials seen with a single model. A selective ensemble based on a genetic algorithm is then used to exclude poor components from the available ensemble members, in order to reduce computational complexity and improve generalization performance. The proposed method is verified with data from an industrial wastewater treatment plant located in Shenyang, China. Experimental results show that the proposed method has relatively stronger generalization and higher accuracy than partial least squares, neural network partial least squares, the single extreme learning machine, and the ensemble extreme learning machine model.
Abstract: Metamaterial antennas are a special class of antennas that use metamaterials to enhance their performance. Antenna size affects the quality factor and the radiation loss of the antenna. Metamaterial antennas can overcome the bandwidth limitation of small antennas. Machine learning (ML) models have recently been applied to predict antenna parameters. ML can be used as an alternative to the trial-and-error process of finding proper parameters for the simulated antenna. The accuracy of the prediction depends mainly on the selected model. Ensemble models combine two or more base models to produce an enhanced model. In this paper, a weighted average ensemble model is proposed to predict the bandwidth of the metamaterial antenna. Two base models are used, namely Multilayer Perceptron (MLP) and Support Vector Machines (SVM). An optimization algorithm, the Dynamic Group-Based Cooperative Optimizer (DGCO), is employed to search for the optimal weights of the base models in the ensemble. The proposed model is compared with three base models and the average ensemble model. The results show that the proposed model is better than the other models and can predict antenna bandwidth efficiently.
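A weighted average ensemble of two base models reduces to choosing a single blending weight. The sketch below is illustrative only: it swaps in a plain grid search where the paper uses DGCO, and the names `mse` and `best_weight` are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_weight(y_true, pred_a, pred_b, steps=100):
    """Grid-search w in [0, 1] for the blend w * pred_a + (1 - w) * pred_b.

    Grid search stands in for the optimiser here; it is not DGCO.
    Returns (best weight, validation MSE at that weight).
    """
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]
        err = mse(y_true, blended)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

In practice the weight should be fitted on held-out validation predictions from the two base models, so that the blend is not tuned on the data the models were trained on.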
Funding: The RUIYI emergency medical research fund (202013); the Open Foundation of the Artificial Intelligence Key Laboratory of Sichuan Province (2020RYY03); a research project of the Health and Family Planning Commission of Sichuan Province (17PJ136); and the Key Research & Development project of Zhejiang Province (2021C03071).
Abstract: Predictive analytics have been widely used in the literature on laparoscopic surgery and risk stratification. However, most predictive analytics in this field rely on generalized linear models, which are limited by model assumptions, including linearity between response variables and additive interactions between variables. In many instances such assumptions may not hold, and the complex relationship between predictors and response variables is usually unknown. To address this limitation, machine learning algorithms can be employed to model the underlying data. The advantage of machine learning algorithms is that they usually do not require strict assumptions about data structure, and they can learn complex functional forms using a nonparametric approach. Furthermore, two or more machine learning algorithms can be combined to further improve predictive accuracy. This process is referred to as ensemble modeling, and it has been used broadly in various industries. However, the approach has not been widely reported in the laparoscopic surgical literature because of its complexity in both model training and interpretation. In this technical note, we provide a comprehensive overview of the ensemble modeling technique and a step-by-step tutorial on how to implement it.
Funding: The authors received funding for this study from the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, through project number IFP2021-033.
Abstract: Employing machine learning techniques to predict the parameters of metamaterial antennas significantly reduces the time needed to design an antenna with optimal parameters using simulation tools. In this paper, we propose a new approach for predicting the bandwidth of a metamaterial antenna using a novel ensemble model. The proposed ensemble model is composed of two levels of regression models. The first level consists of three strong models, namely random forest, support vector regression, and light gradient boosting machine. The second level is based on the ElasticNet regression model, which receives the predictions of the first-level models and refines them to produce the final result. To achieve the best performance of these regression models, the advanced squirrel search optimization algorithm (ASSOA) is used to search for the optimal set of hyper-parameters of each model. Experimental results show that the proposed two-level ensemble model achieves a robust prediction of the bandwidth of the metamaterial antenna when compared with recently published ensemble models on the same publicly available benchmark dataset. The proposed approach yields a root mean square error (RMSE) of 0.013, a mean absolute error (MAE) of 0.004, and a mean bias error (MBE) of 0.0017. These results are superior to those of the competing ensemble models, indicating that the approach predicts the antenna bandwidth more accurately.
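The three error measures reported above can be computed as in the following minimal sketch. The sign convention for MBE (prediction minus observation) is an assumption, since the abstract does not state it, and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mbe(y_true, y_pred):
    """Mean bias error, taken here as mean(prediction - observation)."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)
```

RMSE penalizes large errors more heavily than MAE, while MBE shows whether the model systematically over- or under-predicts; positive and negative errors cancel in MBE, so it is always read alongside the other two.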
Funding: The National Natural Science Foundation of China (Nos. 52076180, 51876176 and 51906204) and the National Science and Technology Major Project, China (No. 2017-I0001-0001).
Abstract: The Efficient Global Optimization (EGO) algorithm has been widely used in the numerical design optimization of engineering systems. However, its need for an uncertainty estimator limits the choice of surrogate model. In this paper, a Sequential Ensemble Optimization (SEO) algorithm based on an ensemble model is proposed. The proposed algorithm places no restriction on the choice of individual surrogate models. Specifically, SEO extends the EGO algorithm so that it can be used in combination with the ensemble model. In addition, a new uncertainty estimator applicable to any surrogate model, named the General Uncertainty Estimator (GUE), is proposed. The performance of the proposed SEO algorithm is verified by simulations using ten well-known mathematical functions of varying dimensions. The results show that the proposed SEO algorithm performs better than the traditional EGO algorithm in terms of both the final optimization results and the convergence rate. Further, the proposed algorithm is applied to global optimization control for turbofan engine acceleration schedule design.
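To show why EGO needs an uncertainty estimate, the sketch below implements the standard expected improvement (EI) criterion for minimization that EGO is commonly built on. It is not the paper's GUE. The surrogate's predicted mean `mu` and standard deviation `sigma` at a candidate point are assumed to be given.

```python
import math


def expected_improvement(mu, sigma, f_min):
    """Expected improvement over the best observed value f_min (minimisation).

    mu, sigma : surrogate prediction and its uncertainty at a candidate point
    """
    if sigma <= 0.0:
        # Without uncertainty, the improvement is deterministic.
        return max(f_min - mu, 0.0)
    z = (f_min - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (f_min - mu) * cdf + sigma * pdf
```

The `sigma * pdf` term is what rewards exploring uncertain regions; a surrogate that cannot supply `sigma` cannot be scored this way, which is the limitation a general uncertainty estimator is meant to remove.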
Funding: The National Natural Science Foundation of China under contract No. 31902375; the David and Lucile Packard Foundation; the Innovation Team of Fishery Resources and Ecology in the Yellow Sea and Bohai Sea under contract No. 2020TD01; the Special Funds for Taishan Scholars Project of Shandong Province.
Abstract: Stomatopods, better known as mantis shrimp, are of considerable ecological importance in coastal waters worldwide. Some stomatopod species are exploited commercially, including Oratosquilla oratoria in the Northwest Pacific. Yet few studies have been published on accurate habitat identification of stomatopods, which hinders the scientific management and conservation of these valuable organisms. This study provides an ensemble modeling framework for habitat suitability modeling of stomatopods, using the O. oratoria stock in the Bohai Sea as an example. Two modeling techniques, the generalized additive model (GAM) and geographical weighted regression (GWR), were applied to select the environmental predictors that better characterize O. oratoria distribution (especially the choice between two types of sediment metrics) and to build separate habitat suitability models (HSMs). The performance of the individual HSMs was compared in terms of interpolation accuracy and transferability. They were then integrated to check whether the ensemble model outperforms either individual model, according to fishers' knowledge and scientific survey data. Grain-size metrics of sediment outperformed sediment content metrics in modeling O. oratoria habitat, possibly because grain-size metrics not only reflect the effect of substrates on burrow development but also relate to sediment heat capacity, which influences individual thermoregulation. Moreover, the GWR-based HSM outperformed the GAM-based HSM in interpolation accuracy, while the latter displayed better transferability. On balance, the ensemble HSM appeared to improve predictive performance overall, as it avoided dependence on a single model type and successfully identified fisher-recognized and survey-indicated suitable habitats in both sparsely sampled and well-investigated areas.
Abstract: Root-cause identification plays a vital role in business decision making by providing effective future directions for organizations. Aspect extraction and sentiment extraction play a vital role in identifying root causes. This paper proposes the Ensemble based temporal weighting and Pareto ranking (ETP) model for root-cause identification. Aspect extraction is performed based on rules and is followed by opinion identification using the proposed boosted ensemble model. The obtained aspects are validated and ranked using the proposed aspect weighting scheme. Pareto-rule based aspect selection is performed as the final selection mechanism, and the results are presented for business decision making. Experiments were performed on the standard five-product benchmark dataset. Results on all five product reviews indicate the effectiveness of the proposed model. Comparisons are made against three standard state-of-the-art models, and effectiveness is measured in terms of F-Measure and detection rate. The results indicate that the proposed model improves F-Measure by 1%–15% and detection rate by 4%–24% compared with the state-of-the-art models.
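The final Pareto-rule selection step can be sketched as keeping the highest-weighted aspects until they account for a chosen share of the total weight. The 80% threshold and the example aspect weights are assumptions for illustration; the abstract does not specify the cut-off, and `pareto_select` is a hypothetical name.

```python
def pareto_select(weights, share=0.8):
    """Keep top-ranked aspects until `share` of the total weight is covered.

    weights : dict mapping aspect name -> non-negative weight
    share   : fraction of total weight to cover (0.8 is an assumed default)
    """
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    target = share * sum(weights.values())
    selected, running = [], 0.0
    for name, weight in ranked:
        if running >= target:
            break
        selected.append(name)
        running += weight
    return selected
```

With weights `{"battery": 50, "screen": 30, "price": 15, "box": 5}`, the function keeps `battery` and `screen`, which together cover 80% of the total weight.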
Funding: This research was financially supported by the Ministry of Small and Medium-sized Enterprises (SMEs) and Startups (MSS), Korea, under the "Regional Specialized Industry Development Program (R&D, S2855401)" supervised by the Korea Institute for Advancement of Technology (KIAT).
Abstract: Despite the advances made in smart grids over recent decades, energy consumption forecasting using metrological features remains challenging. This paper proposes a genetic algorithm-based adaptive error curve learning ensemble (GA-ECLE) model. The proposed technique handles stochastic variations to improve energy consumption forecasting with a machine learning-based ensemble approach. A modified ensemble model that uses the model's error as a feature improves forecast accuracy. The approach combines three models, namely CatBoost (CB), Gradient Boost (GB), and Multilayer Perceptron (MLP). Internally, the ensembled CB-GB-MLP model generates meta-data from the Gradient Boosting and CatBoost models, and the Multilayer Perceptron network uses it to compute the final predictions. A genetic algorithm is used to obtain the optimal features for the model. To demonstrate the proposed model's effectiveness, we used a four-phase procedure on real energy consumption data from Jeju Island. In the first phase, we obtained results by applying the CB-GB-MLP model. In the second phase, we used a GA-ensembled model with optimal features. The third phase compares the energy forecasting results with the proposed ECL-based model. In the fourth and final phase, we applied the GA-ECLE model. We obtained a mean absolute error of 3.05 and a root mean square error of 5.05. Extensive experimental results are provided, demonstrating the superiority of the proposed GA-ECLE model over traditional ensemble models.
Funding: Supported by the Thousand Youth Talents Plan (Xinjiang Project), the National Natural Science Foundation of China (41630859), and the West Light Foundation of the Chinese Academy of Sciences (2016QNXZB12).
Abstract: Climate change in mountainous regions has significant impacts on hydrological and ecological systems. This research studied future temperature, precipitation, and snowfall in the 21st century for the Tianshan and northern Kunlun Mountains (TKM), based on the general circulation model (GCM) simulation ensemble from the Coupled Model Intercomparison Project Phase 5 (CMIP5) under the representative concentration pathway (RCP) lower emission scenario RCP4.5 and higher emission scenario RCP8.5, using the Bayesian model averaging (BMA) technique. Results show that (1) BMA significantly outperformed the simple ensemble analysis, and the BMA mean matches all three observed climate variables; (2) at the end of the 21st century (2070–2099) under RCP8.5, compared with the control period (1976–2005), annual mean temperature and mean annual precipitation will rise considerably, by 4.8°C and 5.2%, respectively, while mean annual snowfall will decrease dramatically, by 26.5%; (3) precipitation will increase in the northern Tianshan region but decrease in the Amu Darya Basin, snowfall will decrease significantly in the western TKM, and the mean annual snowfall fraction will decrease from 0.56 in 1976–2005 to 0.42 in 2070–2099 under RCP8.5; and (4) snowfall shows a high sensitivity to temperature in autumn and spring but a low sensitivity in winter, with the highest sensitivity values occurring at the edges of the TKM. The projections imply that flood risk will increase and solid water storage will decrease.
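The idea behind Bayesian model averaging can be shown with a deliberately simplified sketch: each GCM receives a weight proportional to its likelihood against observations, and projections are averaged with those weights. This version assumes equal prior weights and independent Gaussian errors with a fixed, known `sigma`; operational BMA schemes typically estimate weights and variances jointly, so this is not the study's procedure. All names are hypothetical.

```python
import math


def bma_weights(obs, model_sims, sigma=1.0):
    """Posterior model weights under equal priors and Gaussian errors.

    obs        : observed values over the control period
    model_sims : one list of simulated values per model, aligned with obs
    sigma      : assumed common error standard deviation
    """
    log_liks = []
    for sim in model_sims:
        sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
        log_liks.append(-sse / (2.0 * sigma ** 2))
    top = max(log_liks)  # subtract the maximum for numerical stability
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_mean(weights, model_projections):
    """Weighted average of the models' projections at each time step."""
    n_steps = len(model_projections[0])
    return [
        sum(w * proj[i] for w, proj in zip(weights, model_projections))
        for i in range(n_steps)
    ]
```

A simple ensemble mean corresponds to giving every model the same weight; BMA instead down-weights models that reproduce the observed record poorly.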
Funding: Supported partially by the Post Doctoral Natural Science Foundation of China (2013M532118, 2015T81082), the National Natural Science Foundation of China (61573364, 61273177, 61503066), the State Key Laboratory of Synthetical Automation for Process Industries, the National High Technology Research and Development Program of China (2015AA043802), and the Scientific Research Fund of Liaoning Provincial Education Department (L2013272).
Abstract: The strong mechanical vibration and acoustic signals of the grinding process contain useful information related to the load parameters of ball mills. It is a challenge to extract latent features and construct a soft sensor model from the high-dimensional frequency spectra of these signals. This paper aims to develop a selective ensemble modeling approach based on nonlinear latent frequency spectral feature extraction for accurate measurement of the material to ball volume ratio. Latent features are first extracted from different vibration and acoustic spectral segments by kernel partial least squares. Bootstrap sampling and least squares support vector machines are then employed to produce candidate sub-models that take these latent features as inputs. Ensemble sub-models are selected using a genetic algorithm optimization toolbox. Partial least squares regression is used to combine these sub-models and eliminate collinearity among their prediction outputs. Results indicate that the proposed modeling approach has better prediction performance than previous ones.
Funding: Supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2218-E-194-010.
Abstract: Numerous factors contribute to the temperature rise of a machine tool, including prolonged and high-intensity usage, tool-workpiece interaction, mechanical friction, and elevated ambient temperatures. Consequently, spindle thermal displacement occurs and machining precision suffers. To prevent errors caused by the temperature rise of the spindle from affecting accuracy during machining, factories typically warm up the machine before the manufacturing process. However, if the thermal deformation of the tool spindle cannot be understood, machining quality will be greatly affected. To solve this problem, this study aims to predict the thermal displacement of the machine tool using intelligent algorithms. In the practical application, only a few temperature sensors are used to feed information into the prediction model for real-time thermal displacement prediction. This approach has greatly improved the quality of tool processing. However, each algorithm performs differently in different environments. In this study, an ensemble model is used to integrate Long Short-Term Memory (LSTM) with Support Vector Machine (SVM). The experimental results show that the prediction performance of LSTM-SVM is higher than that of other machine learning algorithms.
Abstract: COVID-19 has caused severe health complications and produced a substantial adverse economic impact around the world. Forecasting the trend of COVID-19 infections could help in executing policies that effectively reduce the number of new cases. In this study, we apply a decomposition and ensemble model to forecast COVID-19 confirmed cases, deaths, and recoveries in Pakistan for the upcoming month, until the end of July. For the decomposition of data, the Ensemble Empirical Mode Decomposition (EEMD) technique is applied. EEMD decomposes the data into small components called Intrinsic Mode Functions (IMFs). To model the individual IMFs, we use the Autoregressive Integrated Moving Average (ARIMA) model. The data used in this study are obtained from Pakistan's official, publicly available website dedicated to the COVID-19 outbreak, which is updated daily. Our analyses reveal that the numbers of recoveries, new cases, and deaths are increasing exponentially in Pakistan. Based on the selected EEMD-ARIMA model, new confirmed cases are expected to rise from 213,470 to 311,454 by 31 July 2020, an increase of almost 1.46 times, with a 95% prediction interval of 246,529 to 376,379. Recoveries are expected to increase almost twofold, from 100,802 to 193,495 by 31 July 2020, with a 95% prediction interval of 162,414 to 224,579. Deaths are expected to increase from 4,395 to 6,751, which is almost 1.54 times, with a 95% prediction interval of 5,617 to 7,885. Thus, the COVID-19 forecasting results for Pakistan are alarming for the next month, until 31 July 2020. They also confirm that the EEMD-ARIMA model is useful for short-term forecasting of COVID-19 and that it is capable of tracking the real COVID-19 data in nearly all scenarios. The decomposition and ensemble strategy can help decision-makers develop short-term strategies regarding the current number of disease occurrences until an appropriate vaccine is developed.
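The growth factors quoted above follow directly from the start and forecast values in the abstract. The short check below recomputes them; the variable and function names are illustrative.

```python
def growth_factor(start, end):
    """Ratio of the forecast end value to the starting value."""
    return end / start


# (starting count, forecast for 31 July 2020), as quoted in the abstract.
forecasts = {
    "confirmed": (213470, 311454),
    "recovered": (100802, 193495),
    "deaths": (4395, 6751),
}

factors = {
    name: round(growth_factor(start, end), 2)
    for name, (start, end) in forecasts.items()
}
# factors == {"confirmed": 1.46, "recovered": 1.92, "deaths": 1.54}
```

The recovery factor of about 1.92 is what the abstract rounds to "almost two times".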
Funding: This work is supported by the Fundamental Research Funds for the Central Universities (Grant No. 2572019BH03).
Abstract: Stance detection is the task of identifying the attitude expressed toward a standpoint. Previous work on stance detection has focused on feature extraction but has ignored the fact that irrelevant features act as noise during higher-level abstraction. Moreover, because the target is not always mentioned in the text, most methods have ignored target information. To solve these problems, we propose a neural network ensemble method that combines the temporal dependence modeling of long short-term memory (LSTM) with the strong feature extraction performance of convolutional neural networks (CNNs). The method obtains multi-level features that capture both local and global information. We also introduce attention mechanisms to amplify features related to target information. Furthermore, we employ sparse coding to remove noise and obtain characteristic features. Applying sparse coding on top of attention and feature extraction improved performance. We evaluate our approach on the SemEval-2016 Task 6-A public dataset, achieving performance that exceeds the benchmark and the participating teams.
Funding: Sponsored by the National Outstanding Young Investigator Grant (Grant No. 6970025), the Key Project of the National Natural Science Foundation (Grant No. 59937150), the 863 High Tech Development Plan of China (Grant No. 2001AA413910), and the Project of the National Natural Science Foundation (Grant No. 60274054).
Abstract: A revised support vector regression (SVR) ensemble model based on the boosting algorithm (SVR-Boosting) is presented in this paper for electricity price forecasting in the electric power market. In light of the characteristics of electricity price sequences, a new triangular-shaped loss function is constructed for training the forecasting model, in order to inhibit learning from abnormal data in the electricity price sequence. Results on actual data indicate that, compared with the single support vector regression model, the proposed SVR-Boosting ensemble model remarkably enhances the stability of the model output, achieves higher prediction accuracy, and shows comparatively satisfactory generalization capability.
Funding: This research work was funded by Institutional Fund Projects under Grant No. IFPIP: 1614-611-1442 from the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.
Abstract: Breast cancer is one of the leading cancers among women. It has the second-highest mortality rate in women after lung cancer. Timely detection, especially in the early stages, can help increase survival rates. However, manual diagnosis of breast cancer is a tedious and time-consuming process, and the accuracy of detection depends on the quality of the images and the radiologist's experience. Computer-aided medical diagnosis has recently shown promising results, motivating the development of an efficient system that can aid radiologists in diagnosing breast cancer in its early stages. The research presented in this paper focuses on the multi-class classification of breast cancer. A deep transfer learning approach is used to train the deep learning models, and a pre-processing technique is used to improve the quality of the ultrasound dataset. The proposed technique uses two deep learning models, MobileNetV2 and DenseNet201, to compose the deep ensemble model. The deep learning models are fine-tuned along with hyperparameter tuning to achieve better results. Subsequently, entropy-based feature selection is used. Breast cancer identification using the proposed classification approach attained an accuracy of 97.04%, while the sensitivity and F1 score were 96.87% and 96.76%, respectively. The proposed model outperforms other state-of-the-art techniques presented in the literature.
Funding: Supported by the Joint Fund of the National Natural Science Foundation of China and the Civil Aviation Administration of China (U1833110).
Abstract: Aviation accidents have become one of the major causes of severe injuries and fatalities around the world. This has drawn the research community to study aviation safety by applying data analysis techniques based on advanced machine learning algorithms. An ensemble classification model based on the Aviation Safety Reporting System (ASRS) is proposed to analyze aviation safety, targeting the people injured in the system. The ensemble classification model contains two modules: a data-driven module consisting of data cleaning, feature selection, and imbalanced data division and reorganization, and a model-driven module that stacks Random Forest (RF), XGBoost (XGB), and Light Gradient Boosting Machine (LGBM). The results indicate that the ensemble model can address the data imbalance while vastly improving accuracy. Among single models, LGBM shows higher accuracy and faster run time on the ASRS-based imbalanced data, while the ensemble model has the best classification performance overall. The ensemble model proposed for imbalanced data classification can serve as a reference for similar data processing tasks while improving the safety of civil aviation.