Detecting minority-class attacks remains a difficult task for a Network Intrusion Detection System (NIDS) operating in complex network environments. To improve the detection capability for minority-class attacks, this study proposes an intrusion detection method based on a two-layer structure. The first layer employs a CNN-BiLSTM model incorporating an attention mechanism to classify network traffic into normal traffic, majority-class attacks, and merged minority-class attacks. The second layer further separates the merged minority-class attacks through Stacking ensemble learning. The datasets are the generic network datasets CIC-IDS2017 and NSL-KDD and the industrial network dataset Mississippi Gas Pipeline, selected to enhance the generalization and practical applicability of the model. Experimental results show that the proposed model achieves an overall detection accuracy of 99%, 99%, and 95% on the CIC-IDS2017, NSL-KDD, and industrial network datasets, respectively. It also significantly outperforms traditional methods in detection accuracy and recall for minority-class attacks. Compared with the single-layer deep learning model, the two-layer structure effectively reduces the false alarm rate while improving minority-class attack detection performance. This research not only improves the adaptability of NIDS to complex network environments but also provides a new solution for minority-class attack detection in industrial network security.
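The two-layer routing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `MERGED` label, and the stub classifiers are hypothetical, and the real layers (CNN-BiLSTM with attention, then a Stacking ensemble) are replaced by plain callables.

```python
MERGED = "minority"  # hypothetical name for the merged minority-attack class


def merge_minority_labels(labels, minority_classes):
    """Collapse every minority attack label into one merged class for layer 1."""
    return [MERGED if y in minority_classes else y for y in labels]


def two_layer_predict(x, layer1, layer2):
    """Classify with layer 1; only samples flagged as merged-minority reach layer 2."""
    coarse = layer1(x)
    return layer2(x) if coarse == MERGED else coarse


# Stub classifiers standing in for the trained models (assumptions for illustration).
layer1 = lambda x: MERGED if x["rare"] else "normal"
layer2 = lambda x: "u2r" if x["root"] else "r2l"

print(merge_minority_labels(["normal", "dos", "u2r", "r2l"], {"u2r", "r2l"}))
print(two_layer_predict({"rare": True, "root": True}, layer1, layer2))
```

The design point is that the second layer only ever sees traffic the first layer has already assigned to the merged class, so it can be trained on the minority samples alone.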
Existing web-based security applications have failed in many situations due to the sophistication of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the most dangerous, because it can modify an organization's or user's information. To address these security challenges, this article proposes a novel, all-encompassing combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on this representation, a novel idea for merging a stacking ensemble with web applications, termed "hybrid stacking", is proposed. To implement these methods, four distinct datasets, each containing both safe and unsafe content, are considered. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than existing detection methods. It produces XSS attack classification results above 99.5% (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
Flood susceptibility modeling is crucial for rapid flood forecasting, disaster reduction strategies, evacuation planning, and decision-making. Machine learning (ML) models have proven to be effective tools for assessing flood susceptibility. However, most previous studies have focused on individual models or comparative performance, underscoring the unique strengths and weaknesses of each model. In this study, we propose a stacking ensemble learning algorithm that harnesses the strengths of a diverse range of machine learning models. The findings reveal the following: (1) The stacking ensemble learning, using the RF-XGBCB-LR model, significantly enhances flood susceptibility simulation. (2) In addition to rainfall, key flood drivers in the study area include NDVI and impervious surfaces. Over 40% of the study area, primarily in the northeast and southeast, exhibits high flood susceptibility, with higher risks for populations compared to cropland. (3) In the northeast of the study area, heavy precipitation, low terrain, and NDVI values are key indicators contributing to high flood susceptibility, while long-duration precipitation, mountainous topography, and upper reach vegetation are the main drivers in the southeast. This study underscores the effectiveness of ML, particularly ensemble learning, in flood modeling. It identifies vulnerable areas and contributes to improved flood risk management.
Slope failures lead to catastrophic consequences in numerous countries, and thus the stability assessment of slopes is of high interest in geotechnical and geological engineering research. A hybrid stacking ensemble approach is proposed in this study for enhancing the prediction of slope stability. In the hybrid stacking ensemble approach, we used an artificial bee colony (ABC) algorithm to find the best combination of base classifiers (level 0) and determined a suitable meta-classifier (level 1) from a pool of 11 individual optimized machine learning (OML) algorithms. Finite element analysis (FEA) was conducted to form the synthetic database for the training stage (150 cases) of the proposed model, while 107 real field slope cases were used for the testing stage. The results of the hybrid stacking ensemble approach were then compared with those obtained by the 11 individual OML methods using the confusion matrix, F1-score, and area under the curve (AUC) score. The comparisons showed that the hybrid stacking ensemble achieved a significant improvement in the prediction of slope stability (AUC = 90.4%), which is 7% higher than the best of the 11 individual OML methods (AUC = 82.9%). A further comparison was then undertaken between the hybrid stacking ensemble method and a basic ensemble classifier on slope stability prediction. The results showed that the hybrid stacking ensemble method outperformed the basic ensemble method. Finally, the importance of the variables for slope stability was studied using the linear vector quantization (LVQ) method.
Numerical simulation of concrete-faced rockfill dams (CFRDs) considering the spatial variability of rockfill has become a popular research topic in recent years. In order to determine uncertain rockfill properties efficiently and reliably, this study developed an uncertainty inversion analysis method for rockfill material parameters using the stacking ensemble strategy and Jaya optimizer. The comprehensive implementation process of the proposed model was described with an illustrative CFRD example. First, the surrogate model method using the stacking ensemble algorithm was used to conduct the Monte Carlo stochastic finite element calculations with reduced computational cost and improved accuracy. Afterwards, the Jaya algorithm was used to inversely calculate the combination of the coefficient of variation of rockfill material parameters. This optimizer obtained higher accuracy and more significant uncertainty reduction than traditional optimizers. Overall, the developed model effectively identified the random parameters of rockfill materials. This study provided scientific references for uncertainty analysis of CFRDs. In addition, the proposed method can be applied to other similar engineering structures.
Recently, machine learning-based technologies have been developed to automate the classification of wafer map defect patterns during semiconductor manufacturing. The existing approaches to wafer map pattern classification include directly learning the image through a convolutional neural network and applying an ensemble method after extracting image features. This study aims to classify wafer map defects more effectively and derive robust algorithms even for datasets with insufficient defect patterns. First, because the number of defects observed during the actual process may be limited, additional data for the insufficient patterns are generated using a convolutional auto-encoder (CAE), and the expanded data are verified using the structural similarity index measure (SSIM). After extracting handcrafted features, a boosted stacking ensemble model that integrates four base-level classifiers with the extreme gradient boosting classifier as a meta-level classifier is designed and trained on the expanded data for final prediction. Since the proposed algorithm performs better than existing ensemble classifiers even for insufficient defect patterns, the results of this study will contribute to improving the product quality and yield of the actual semiconductor manufacturing process.
This study employs a stacking ensemble learning framework to establish a regression model for predicting the tribological properties of amide-based lubricating grease and determining the optimal additive ratios. Melamine cyanuric acid (MCA) was selected as the thickener, and three extreme-pressure anti-wear additives were used to prepare the lubricating grease. The tribological performance was tested using an MFT-R4000 reciprocating friction and wear machine. Based on the tribological experimental data, the synthetic minority oversampling technique (SMOTE) was utilized for data augmentation, and a stacking ensemble algorithm with Bayesian optimization of hyperparameters was used to construct a predictive model for tribological performance. Subsequently, within this model framework, single and multi-objective optimization models were developed, and the fruit fly algorithm was employed to find the optimal additive combination ratios, which were experimentally validated. The results demonstrated that the learning framework based on the stacking ensemble model could effectively predict the tribological properties of amide-based lubricating grease in small sample datasets, with the R² for the average friction coefficient prediction reaching 0.9939 and for the wear scar width prediction reaching 0.9535. In the experimental validation of the optimal additive ratios, the relative error of the friction coefficient ratio scheme was 0.51%, and the relative error of the wear scar width was 1.10%. This finding suggests that the learning framework provides a novel approach for predicting the performance of amide-based lubricating grease and studying additive combinations.
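The two figures of merit quoted in this abstract, the coefficient of determination and the relative error, follow standard definitions. The sketch below uses those textbook formulas with made-up numbers; it does not reproduce the study's data or its exact evaluation protocol.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_error(predicted, measured):
    """Absolute deviation of the prediction as a fraction of the measured value."""
    return abs(predicted - measured) / abs(measured)


# Illustrative values only (not taken from the paper).
measured = [0.10, 0.12, 0.15, 0.11]
predicted = [0.10, 0.13, 0.15, 0.10]
print(r_squared(measured, predicted))
print(relative_error(0.101, 0.100))
```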
Today, phishing is an online attack designed to obtain sensitive information such as credit card and bank account numbers, passwords, and usernames. We can find several anti-phishing solutions, such as heuristic detection, virtual similarity detection, black and white lists, and machine learning (ML). However, phishing attempts remain a problem, and establishing an effective anti-phishing strategy is a work in progress. Furthermore, while most anti-phishing solutions achieve the highest levels of accuracy on a given dataset, their methods suffer from an increased number of false positives. These methods are ineffective against zero-hour attacks. Phishing sites with a high False Positive Rate (FPR) are considered genuine because they can cause people to lose a lot of money by visiting them. Feature selection is critical when developing phishing detection strategies. Good feature selection helps improve accuracy; however, duplicate features can also increase noise in the dataset and reduce the accuracy of the algorithm. Therefore, a combination of filter-based feature selection methods is proposed to detect phishing attacks, including constant feature removal, duplicate feature removal, quasi-feature removal, correlated feature removal, mutual information extraction, and Analysis of Variance (ANOVA) testing. The technique has been tested with different Machine Learning classifiers: Random Forest, Artificial Neural Network (ANN), Ada-Boost, Extreme Gradient Boosting (XGBoost), Logistic Regression, Decision Trees, Gradient Boosting Classifiers, Support Vector Machine (SVM), and two types of ensemble models, stacking and majority voting, in order to achieve a low false positive rate. Stacked ensemble classifiers (gradient boosting, random forest, support vector machine) achieve 1.31% FPR and 98.17% accuracy on Dataset 1, while Dataset 3 shows 2.81% FPR and 97.61% accuracy and Dataset 2 shows 3.47% FPR and 96.47% accuracy.
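Three of the filter steps named above (constant, duplicate, and correlated feature removal) can be sketched in a few lines. This is an illustrative assumption about how such filters are commonly chained, not the paper's code: the function names, the 0.95 correlation threshold, and the toy columns are hypothetical, and the quasi-constant, mutual information, and ANOVA steps are left out.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def filter_features(columns, corr_threshold=0.95):
    """Return the feature names that survive the three filters, in input order."""
    kept = []
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # constant feature: carries no information
        if any(values == columns[k] for k in kept):
            continue  # exact duplicate of a feature already kept
        if any(abs(pearson(values, columns[k])) > corr_threshold for k in kept):
            continue  # nearly collinear with a feature already kept
        kept.append(name)
    return kept


columns = {
    "url_length": [1, 2, 3, 4],
    "always_zero": [5, 5, 5, 5],
    "url_length_copy": [1, 2, 3, 4],
    "scaled_length": [2, 4, 6, 8.5],
    "num_dots": [4, 1, 3, 2],
}
print(filter_features(columns))
```

Because features are examined in input order, the first member of a duplicate or correlated group is the one retained; a different ordering would keep a different representative.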
Self-powered neutron detectors (SPNDs) play a critical role in monitoring the safety margins and overall health of reactors, directly affecting safe operation within the reactor. In this work, a novel fault identification method based on graph convolutional networks (GCN) and Stacking ensemble learning is proposed for SPNDs. The GCN is employed to extract the spatial neighborhood information of SPNDs at different positions, and residuals are obtained by nonlinear fitting of SPND signals. In order to completely extract the time-varying features from residual sequences, the Stacking fusion model, integrated with various algorithms, is developed and enables the identification of five conditions for SPNDs: normal, drift, bias, precision degradation, and complete failure. The results demonstrate that the integration of diverse base-learners in the GCN-Stacking model exhibits advantages over a single model as well as enhances the stability and reliability in fault identification. Additionally, the GCN-Stacking model maintains higher accuracy in identifying faults at different reactor power levels.
The stability of underground entry-type excavations (UETEs) is of paramount importance for ensuring the safety of mining operations. As more engineering cases are accumulated, machine learning (ML) has demonstrated great potential for the stability evaluation of UETEs. In this study, a hybrid stacking ensemble method aggregating support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), random forest (RF), multilayer perceptron neural network (MLPNN) and extreme gradient boosting (XGBoost) algorithms was proposed to assess the stability of UETEs. Firstly, a total of 399 historical cases with two indicators were collected from seven mines. Subsequently, to pursue better evaluation performance, the hyperparameters of the base learners (SVM, KNN, DT, RF, MLPNN and XGBoost) and the meta learner (MLPNN) were tuned by combining five-fold cross validation (CV) with a simulated annealing (SA) approach. Based on the optimal hyperparameter configuration, the stacking ensemble models were constructed using the training set (75% of the data). Finally, the performance of the proposed approach was evaluated by two global metrics (accuracy and Cohen’s Kappa) and three within-class metrics (macro average of the precision, recall and F1-score) on the test set (25% of the data). In addition, the evaluation results were compared with six base learners optimized by SA. The hybrid stacking ensemble algorithm achieved better comprehensive performance, with accuracy, Kappa coefficient, and macro averages of precision, recall and F1-score of 0.92, 0.851, 0.885, 0.88 and 0.883, respectively. The rock mass rating (RMR) had the most important influence on the evaluation results. Moreover, the critical span graph (CSG) was updated based on the proposed model, representing a significant improvement compared with previous studies. This study can provide valuable guidance for stability analysis and risk management of UETEs. However, it is necessary to consider more indicators and collect a more extensive and balanced dataset to validate the model in future.
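The two global metrics mentioned above have standard definitions, sketched here for reference. The labels in the example are invented; the code illustrates the textbook formula for Cohen's Kappa, (p_o - p_e) / (1 - p_e), rather than the study's own evaluation script.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohens_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of the marginal frequencies, summed over classes.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical stability labels for four excavation cases.
y_true = ["stable", "stable", "unstable", "unstable"]
y_pred = ["stable", "stable", "unstable", "stable"]
print(accuracy(y_true, y_pred), cohens_kappa(y_true, y_pred))
```

Kappa is undefined when chance agreement equals 1 (for example, when both label lists contain a single class); a production version would need to handle that case.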
The accurate identification of smart meter (SM) fault types is crucial for enhancing the efficiency of operation and maintenance (O&M) and the reliability of power collection systems. However, the intelligent classification of SM fault types faces significant challenges owing to the complexity of features and the imbalance between fault categories. To address these issues, this study presents a fault diagnosis method for SM incorporating three distinct modules. The first module employs a combination of standardization, data imputation, and feature extraction to enhance the data quality, thereby facilitating improved training and learning by the classifiers. To enhance the classification performance, the data imputation method considers feature correlation measurement and sequential imputation, and the feature extractor utilizes the discriminative enhanced sparse autoencoder. To tackle the interclass imbalance of data with discrete and continuous features, the second module introduces an assisted classifier generative adversarial network, which includes a discrete feature generation module. Finally, a novel Stacking ensemble classifier for SM fault diagnosis is developed. In contrast to previous studies, we construct a two-layer heuristic optimization framework to address the synchronous dynamic optimization problem of the combinations and hyperparameters of the Stacking ensemble classifier, enabling better handling of complex classification tasks using SM data. The proposed fault diagnosis method for SM via two-layer stacking ensemble optimization and data augmentation is trained and validated using SM fault data collected from 2010 to 2018 in Zhejiang Province, China. Experimental results demonstrate the effectiveness of the proposed method in improving the accuracy of SM fault diagnosis, particularly for minority classes.
Intrusion detection is an urgent issue in healthcare networks because of the critical consequences of cyber threats and the extreme sensitivity of medical information. The Auto-Stack ID proposed in this study is a stacked ensemble of encoder-enhanced auctions that can be used to improve intrusion detection in healthcare networks. The model is trained and evaluated on the WUSTL-EHMS 2020 dataset, which has an imbalanced class distribution (87.46% normal traffic and 12.53% intrusion attacks). To address this imbalance, the study mitigates training bias through Stratified K-fold cross-validation (K=5), so that each class is represented similarly in the training and validation splits. Second, the Auto-Stack ID method combines several base classifiers: TabNet, LightGBM, Gaussian Naive Bayes, Histogram-Based Gradient Boosting (HGB), and Logistic Regression. We apply a two-stage training process: in the first stage, the base classifiers produce out-of-fold (OOF) predictions, which are used as inputs for the second-stage meta-learner, XGBoost. The meta-learner learns to refine predictions by capturing complicated interactions between base models, thus improving detection accuracy without introducing bias, overfitting, or requiring domain knowledge of the meta-data. The Auto-Stack ID model achieved 98.41% accuracy and a 93.45% F1 score, better than the individual classifiers. With 90.55% recall and 96.53% precision, it identifies intrusions with minimal false positives. These findings indicate its suitability for securing healthcare networks through ensemble learning. Ongoing efforts will be deployed in real time to improve response to evolving threats.
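The out-of-fold (OOF) step of the two-stage process can be sketched without any ML library. The code below is a simplified illustration under stated assumptions: `stratified_folds` assigns folds round-robin within each class without shuffling, `fit` is a hypothetical callable that returns a predictor, and the majority-class learner merely stands in for base classifiers such as TabNet or LightGBM.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k=5):
    """Assign each sample to one of k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    fold_of = [0] * len(labels)
    for indices in by_class.values():
        for position, i in enumerate(indices):
            fold_of[i] = position % k
    return fold_of


def out_of_fold_predictions(X, y, fit, k=5):
    """Predict every sample with a model that never saw it during training."""
    fold_of = stratified_folds(y, k)
    oof = [None] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(len(y)):
            if fold_of[i] == fold:
                oof[i] = model(X[i])
    return oof


def fit_majority(X_train, y_train):
    """Toy base learner (assumption): always predicts the most common training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return lambda x: majority


labels = [0] * 10 + [1] * 5          # imbalanced toy labels
features = [[float(i)] for i in range(15)]
oof = out_of_fold_predictions(features, labels, fit_majority, k=5)
print(oof)
```

In a full stacking pipeline, the OOF outputs of every base learner would be stacked column-wise to form the training matrix of the meta-learner, so that the meta-learner is never fitted on predictions made for samples the base learner was trained on.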
Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large number of operation data are generated, reflecting the interaction between the TBM system and surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes are obtained after data preprocessing. Then, through the tree-based feature selection method, 10 key TBM operation parameters are selected, and the mean values of the 10 selected features in the stable phase after removing outliers are calculated as the inputs of classifiers. The preprocessed data are randomly divided into the training set (90%) and test set (10%) using simple random sampling. Besides the stacking ensemble classifier, seven individual classifiers are established for comparison. These classifiers include support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR) and multilayer perceptron (MLP), where the hyper-parameters of each classifier are optimised using the grid search method. The prediction results show that the stacking ensemble classifier has a better performance than individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relatively balanced training set is obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.
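The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration, not the reference SMOTE implementation or the study's code: the function name, `k=3`, the fixed seed, and the toy points are assumptions, and at least two minority samples are required.

```python
import random


def smote_like_samples(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [p for j, p in enumerate(minority) if j != base_idx]
        # Keep the k nearest minority neighbours by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
samples = smote_like_samples(minority, n_new=10)
print(len(samples))
```

Oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.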
Rockburst is a kind of common geological disaster in deep tunnel engineering. It has the characteristics of causing great harm and occurring at random locations and times. These characteristics seriously affect tunnel construction and threaten the physical and mental health and safety of workers. Therefore, it is of great significance to study the tendency of rockburst in the early stage of tunnel survey, design and construction. At present, there is no unified method and selected parameters for rockburst prediction. In view of the large difference of different rockburst criteria and the imbalance of rockburst database categories, this paper presents a two-step rockburst prediction method based on multiple factors and the stacking ensemble algorithm. Considering the influence of rock physical and mechanical parameters, tunnel face conditions and excavation disturbance, multiple rockburst criteria are predicted by integrating multiple machine learning algorithms. A combined prediction model of rockburst criteria is established, and the results of each rockburst criterion index are weighted and combined, with the weight updated using the field rockburst record. The dynamic weight is combined with the cloud model to comprehensively evaluate the regional rockburst risk. Field results from applying the model in the Grand Canyon tunnel show that the rockburst prediction method proposed in this paper has better applicability and higher accuracy than the single rockburst criterion.
As a result of the increased number of COVID-19 cases, Ensemble Machine Learning (EML) would be an effective tool for combatting this pandemic outbreak. An ensemble of classifiers can improve the performance of single machine learning (ML) classifiers, especially stacking-based ensemble learning. Stacking utilizes heterogeneous base learners trained in parallel and combines their predictions using a meta-model to determine the final prediction results. However, building an ensemble often causes the model performance to decrease when a growing number of learners is not properly selected. Therefore, the goal of this paper is to develop and evaluate a generic, data-independent predictive method using stacking-based ensemble learning (GA-Stacking) optimized by a Genetic Algorithm (GA) for outbreak prediction and health decision-aided processes. GA-Stacking utilizes five well-known learners at its first level: Decision Tree (DT), Random Forest (RF), RIGID regression, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost). It also introduces the GA to determine the number, combination, and trust of these base learners, using the Mean Squared Error (MSE) as the fitness function. At the second level of the stacked ensemble model, a Linear Regression (LR) model is used to produce the final prediction. The performance of the model was evaluated using a publicly available dataset from the Center for Systems Science and Engineering, Johns Hopkins University, which consisted of 10,722 data samples. The experimental results indicated that the GA-Stacking model achieved outstanding performance, with an overall accuracy of 99.99% for the three selected countries. Furthermore, the proposed model achieved good performance when compared with existing bagging-based approaches. The proposed model can be used to predict the pandemic outbreak correctly and may be applied as a generic data-independent model to predict the epidemic trend for other countries when comparing preventive and control measures. (CMC, 2023, vol. 74, no. 2)
Cross-Site Scripting (XSS) remains a significant threat to web application security, exploiting vulnerabilities to hijack user sessions and steal sensitive data. Traditional detection methods often fail to keep pace with the evolving sophistication of cyber threats. This paper introduces a novel hybrid ensemble learning framework that leverages a combination of advanced machine learning algorithms: Logistic Regression (LR), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Deep Neural Networks (DNN). Utilizing the XSS-Attacks-2021 dataset, which comprises 460 instances across various real-world traffic-related scenarios, this framework significantly enhances XSS attack detection. Our approach, which includes rigorous feature engineering and model tuning, not only optimizes accuracy but also effectively minimizes false positives (FP) (0.13%) and false negatives (FN) (0.19%). This comprehensive methodology has been rigorously validated, achieving an unprecedented accuracy of 99.87%. The proposed system is scalable and efficient, capable of adapting to the increasing number of web applications and user demands without a decline in performance. It demonstrates exceptional real-time capabilities, with the ability to detect XSS attacks dynamically, maintaining high accuracy and low latency even under significant loads. Furthermore, despite the computational complexity introduced by the hybrid ensemble approach, strategic use of parallel processing and algorithm tuning ensures that the system remains scalable and performs robustly in real-time applications. Designed for easy integration with existing web security systems, our framework supports adaptable Application Programming Interfaces (APIs) and a modular design, facilitating seamless augmentation of current defenses. This innovation represents a significant advancement in cybersecurity, offering a scalable and effective solution for securing modern web applications against evolving threats.
BACKGROUND There is a lack of literature discussing the utilization of the stacking ensemble algorithm for predicting depression in patients with heart failure (HF). AIM To create a stacking model for predicting depression in patients with HF. METHODS This study analyzed data on 1084 HF patients from the National Health and Nutrition Examination Survey database spanning from 2005 to 2018. Through univariate analysis and the use of an artificial neural network algorithm, predictors significantly linked to depression were identified. These predictors were utilized to create a stacking model employing tree-based learners. The performances of both the individual models and the stacking model were assessed by using the test dataset. Furthermore, the SHapley additive exPlanations (SHAP) model was applied to interpret the stacking model. RESULTS The models included five predictors. Among these models, the stacking model demonstrated the highest performance, achieving an area under the curve of 0.77 (95%CI: 0.71-0.84), a sensitivity of 0.71, and a specificity of 0.68. The calibration curve supported the reliability of the models, and decision curve analysis confirmed their clinical value. The SHAP plot demonstrated that age had the most significant impact on the stacking model's output. CONCLUSION The stacking model demonstrated strong predictive performance. Clinicians can utilize this model to identify high-risk depression patients with HF, thus enabling early provision of psychological interventions.
Surveillance cameras have been widely used for monitoring in both private and public sectors as a security measure. Closed-Circuit Television (CCTV) cameras are used to monitor normal and anomalous incidents. Real-world anomaly detection is a significant challenge due to its complex and diverse nature. Manual analysis is difficult because surveillance systems generate vast amounts of video data, which has raised the need for automated techniques to enhance detection accuracy. This paper proposes a novel deep-stacked ensemble model integrated with a data augmentation approach, called Stack Ensemble Road Anomaly Detection (SERAD). SERAD is used to detect and classify, with high accuracy, the four most frequently occurring road anomalies (accidents, car fires, fighting, and snatching) in road surveillance videos. SERAD adapts three pre-trained Convolutional Neural Network (CNN) models, namely VGG19, ResNet50 and InceptionV3. The stacking technique is employed to combine these three models, resulting in much-improved accuracy for classifying road abnormalities compared to the individual models. Additionally, the paper presents a custom real-world Road Anomaly Dataset (RAD) comprising a comprehensive collection of road images and videos. The experimental results demonstrate the strength and reliability of the proposed SERAD model, which achieves a classification accuracy of 98.7%. The results indicate that the proposed SERAD model outperforms the individual CNN base models.
Tailings produced by mining and ore smelting are a major source of soil pollution. Understanding the speciation of heavy metals (HMs) in tailings is essential for soil remediation and sustainable development. Given the complex and time-consuming nature of traditional sequential laboratory extraction methods for determining the forms of HMs in tailings, a rapid and precise identification approach is urgently required. To address this issue, a general empirical prediction method for HM occurrence was developed using machine learning (ML). The compositional information of the tailings, properties of the HMs, and sequential extraction steps were used as inputs to calculate the percentages of the seven forms of HMs. After the models were tuned and compared, extreme gradient boosting, gradient boosting decision tree, and categorical boosting were found to be the top three performing ML models, with coefficient of determination (R²) values on the testing set exceeding 0.859. Feature importance analysis for these three optimal models indicated that electronegativity was the most important factor affecting the occurrence of HMs, with an average feature importance of 0.4522. The subsequent use of stacking as a model integration method further improved the ability of the ML models to predict HM occurrence forms, increasing R² to 0.879. Overall, this study developed a robust technique for predicting the occurrence forms of HMs in tailings and provides an important reference for the environmental assessment and recycling of tailings.
An anomaly-based intrusion detection system (A-IDS) is a critical component of modern computing infrastructure because it can discover new types of attacks. It commonly uses several machine learning (ML) algorithms for detecting and classifying network traffic. To date, many algorithms have been proposed to improve the detection performance of A-IDS, using either individual or ensemble learners. In particular, ensemble learners have shown remarkable performance over individual learners in many applications, including in the cybersecurity domain. However, most existing works still suffer from unsatisfactory results due to improper ensemble design. The aim of this study is to emphasize the effectiveness of a stacking ensemble-based model for A-IDS, where deep learning (e.g., a deep neural network [DNN]) is used as the base learner model. The effectiveness of the proposed model and the base DNN model is benchmarked empirically in terms of several performance metrics, i.e., Matthews correlation coefficient, accuracy, and false alarm rate. The results indicate that the proposed model is superior to the base DNN model as well as other existing ML algorithms found in the literature.
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) Innovative Human Resource Development for Local Intellectualization program grant funded by the Korea government (MSIT) (IITP-2025-RS-2022-00156334), and in part by the Liaoning Province Nature Fund Project (2024-BSLH-214).
Abstract: Detecting minority-class attacks is a persistent challenge for Network Intrusion Detection Systems (NIDS) in complex network environments. To improve the detection of minority-class attacks, this study proposes an intrusion detection method based on a two-layer structure. The first layer employs a CNN-BiLSTM model incorporating an attention mechanism to classify network traffic into normal traffic, majority-class attacks, and merged minority-class attacks. The second layer further separates the minority-class attacks through Stacking ensemble learning. The datasets comprise the generic network datasets CIC-IDS2017 and NSL-KDD and the industrial network dataset Mississippi Gas Pipeline, chosen to enhance the generalization and practical applicability of the model. Experimental results show that the proposed model achieves an overall detection accuracy of 99%, 99%, and 95% on the CIC-IDS2017, NSL-KDD, and industrial network datasets, respectively. It also significantly outperforms traditional methods in detection accuracy and recall for minority-class attacks. Compared with a single-layer deep learning model, the two-layer structure effectively reduces the false alarm rate while improving minority-class attack detection. This research not only improves the adaptability of NIDS to complex network environments but also provides a new solution for minority-class attack detection in industrial network security.
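The two-layer routing described in this abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: `first_layer` and `second_layer` are hypothetical stand-ins for the CNN-BiLSTM model and the Stacking ensemble, and the label strings and toy thresholds are assumptions made up for the example.

```python
from typing import Callable, Sequence

# Hypothetical coarse label that the first layer emits for all merged minority attacks.
MINORITY = "minority_attack"


def two_layer_predict(
    sample: Sequence[float],
    first_layer: Callable[[Sequence[float]], str],
    second_layer: Callable[[Sequence[float]], str],
) -> str:
    """Return the first-layer label, unless it is the merged minority class,
    in which case the second layer assigns the fine-grained attack type."""
    coarse = first_layer(sample)
    if coarse == MINORITY:
        return second_layer(sample)
    return coarse


# Toy stand-ins (assumptions for illustration only, not trained models).
def toy_first_layer(x: Sequence[float]) -> str:
    if x[0] < 0.3:
        return "normal"
    if x[0] < 0.7:
        return "DoS"  # stands for a majority attack class
    return MINORITY


def toy_second_layer(x: Sequence[float]) -> str:
    return "U2R" if x[1] > 0.5 else "R2L"  # example minority attack types
```

Only samples that the first layer places in the merged minority class reach the second layer, so the second model can be trained on the minority attacks alone.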
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST), No. 2015R1A3A2031159 and 2016R1A5A1008055.
Abstract: Existing web-based security applications have failed in many situations due to the ingenuity of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the most dangerous, allowing an organization's or user's information to be modified. To address these security challenges, this article proposes a novel, all-encompassing combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on the representation, a novel idea for merging stacking ensemble with web applications, termed “hybrid stacking”, is proposed. To implement these methods, four distinct datasets, each of which contains both safe and unsafe content, are considered. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than existing detection methods. It produces more than 99.5% accurate XSS attack classification results (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
Funding: National Natural Science Foundation of China, No. 42271037; Key Research and Development Program Project of Anhui Province, No. 2022m07020011; The University Synergy Innovation Program of Anhui Province, No. GXXT-2021-048; Science Foundation for Excellent Young Scholars of Anhui, No. 2108085Y13.
Abstract: Flood susceptibility modeling is crucial for rapid flood forecasting, disaster reduction strategies, evacuation planning, and decision-making. Machine learning (ML) models have proven to be effective tools for assessing flood susceptibility. However, most previous studies have focused on individual models or comparative performance, underscoring the unique strengths and weaknesses of each model. In this study, we propose a stacking ensemble learning algorithm that harnesses the strengths of a diverse range of machine learning models. The findings reveal the following: (1) Stacking ensemble learning, using the RF-XGBCB-LR model, significantly enhances flood susceptibility simulation. (2) In addition to rainfall, key flood drivers in the study area include NDVI and impervious surfaces. Over 40% of the study area, primarily in the northeast and southeast, exhibits high flood susceptibility, with higher risks for populations compared to cropland. (3) In the northeast of the study area, heavy precipitation, low terrain, and NDVI values are key indicators contributing to high flood susceptibility, while long-duration precipitation, mountainous topography, and upper-reach vegetation are the main drivers in the southeast. This study underscores the effectiveness of ML, particularly ensemble learning, in flood modeling. It identifies vulnerable areas and contributes to improved flood risk management.
Funding: We acknowledge the funding support from the Australian Research Council (Grant Nos. DP200100549 and IH180100010).
Abstract: Slope failures lead to catastrophic consequences in numerous countries, and thus the stability assessment of slopes is of high interest in geotechnical and geological engineering research. A hybrid stacking ensemble approach is proposed in this study for enhancing the prediction of slope stability. In the hybrid stacking ensemble approach, we used an artificial bee colony (ABC) algorithm to find the best combination of base classifiers (level 0) and determined a suitable meta-classifier (level 1) from a pool of 11 individual optimized machine learning (OML) algorithms. Finite element analysis (FEA) was conducted to form the synthetic database for the training stage (150 cases) of the proposed model, while 107 real field slope cases were used for the testing stage. The results of the hybrid stacking ensemble approach were then compared with those obtained by the 11 individual OML methods using the confusion matrix, F1-score, and area under the curve, i.e. AUC-score. The comparisons showed that a significant improvement in the prediction of slope stability was achieved by the hybrid stacking ensemble (AUC = 90.4%), which is 7% higher than the best of the 11 individual OML methods (AUC = 82.9%). Then, a further comparison was undertaken between the hybrid stacking ensemble method and a basic ensemble classifier on slope stability prediction. The results showed that the hybrid stacking ensemble method clearly outperformed the basic ensemble method. Finally, the importance of the variables for slope stability was studied using the linear vector quantization (LVQ) method.
Funding: Supported by the National Natural Science Foundation of China (Grants No. 51879185 and 52179139) and the Open Fund of the Hubei Key Laboratory of Construction and Management in Hydropower Engineering (Grant No. 2020KSD06).
Abstract: Numerical simulation of concrete-faced rockfill dams (CFRDs) considering the spatial variability of rockfill has become a popular research topic in recent years. To determine uncertain rockfill properties efficiently and reliably, this study developed an uncertainty inversion analysis method for rockfill material parameters using the stacking ensemble strategy and the Jaya optimizer. The comprehensive implementation process of the proposed model was described with an illustrative CFRD example. First, a surrogate model built with the stacking ensemble algorithm was used to conduct the Monte Carlo stochastic finite element calculations with reduced computational cost and improved accuracy. Afterwards, the Jaya algorithm was used to inversely calculate the combination of the coefficients of variation of the rockfill material parameters. This optimizer obtained higher accuracy and more significant uncertainty reduction than traditional optimizers. Overall, the developed model effectively identified the random parameters of rockfill materials. This study provides a scientific reference for uncertainty analysis of CFRDs. In addition, the proposed method can be applied to other similar engineering structures.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A5A8033165) and by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20214000000200).
Abstract: Recently, machine learning-based technologies have been developed to automate the classification of wafer map defect patterns during semiconductor manufacturing. The existing approaches to wafer map pattern classification include directly learning the image through a convolutional neural network and applying an ensemble method after extracting image features. This study aims to classify wafer map defects more effectively and to derive algorithms that remain robust even for datasets with insufficient defect patterns. Because the number of defects observed during the actual process may be limited, the insufficient data are augmented using a convolutional auto-encoder (CAE), and the expanded data are verified using the structural similarity index measure (SSIM). After handcrafted features are extracted, a boosted stacking ensemble model that integrates four base-level classifiers with the extreme gradient boosting classifier as a meta-level classifier is designed and trained on the expanded data for final prediction. Since the proposed algorithm performs better than existing ensemble classifiers even for insufficient defect patterns, the results of this study will contribute to improving the product quality and yield of the actual semiconductor manufacturing process.
Funding: Supported by the Beijing Natural Science Foundation (No. 2232066) and the Open Project Foundation of State Key Laboratory of Solid Lubrication (No. LSL-2212).
Abstract: This study employs a stacking ensemble learning framework to establish a regression model for predicting the tribological properties of amide-based lubricating grease and determining the optimal additive ratios. Melamine cyanuric acid (MCA) was selected as the thickener, and three extreme-pressure anti-wear additives were used to prepare the lubricating grease. The tribological performance was tested using an MFT-R4000 reciprocating friction and wear machine. Based on the tribological experimental data, the synthetic minority oversampling technique (SMOTE) was utilized for data augmentation, and a stacking ensemble algorithm with Bayesian optimization of hyperparameters was used to construct a predictive model for tribological performance. Subsequently, within this model framework, single- and multi-objective optimization models were developed, and the fruit fly algorithm was employed to find the optimal additive combination ratios, which were experimentally validated. The results demonstrated that the learning framework based on the stacking ensemble model could effectively predict the tribological properties of amide-based lubricating grease in small sample datasets, with R² reaching 0.9939 for the average friction coefficient prediction and 0.9535 for the wear scar width prediction. In the experimental validation of the optimal additive ratios, the relative error of the friction coefficient ratio scheme was 0.51%, and the relative error of the wear scar width was 1.10%. This finding suggests that the learning framework provides a novel approach for predicting the performance of amide-based lubricating grease and studying additive combinations.
Funding: Financially supported by the Deanship of Scientific Research and Graduate Studies at King Khalid University under research grant number (R.G.P.2/21/46), and in part by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia, under Grant KFU253116.
Abstract: Today, phishing is an online attack designed to obtain sensitive information such as credit card and bank account numbers, passwords, and usernames. Several anti-phishing solutions exist, such as heuristic detection, virtual similarity detection, black and white lists, and machine learning (ML). However, phishing attempts remain a problem, and establishing an effective anti-phishing strategy is a work in progress. Furthermore, while most anti-phishing solutions achieve the highest levels of accuracy on a given dataset, their methods suffer from an increased number of false positives. These methods are ineffective against zero-hour attacks. Phishing sites with a high False Positive Rate (FPR) are considered genuine because they can cause people to lose a lot of money by visiting them. Feature selection is critical when developing phishing detection strategies. Good feature selection helps improve accuracy; however, duplicate features can also increase noise in the dataset and reduce the accuracy of the algorithm. Therefore, a combination of filter-based feature selection methods is proposed to detect phishing attacks, including constant feature removal, duplicate feature removal, quasi-feature removal, correlated feature removal, mutual information extraction, and Analysis of Variance (ANOVA) testing. The technique has been tested with different machine learning classifiers (Random Forest, Artificial Neural Network (ANN), AdaBoost, Extreme Gradient Boosting (XGBoost), Logistic Regression, Decision Trees, Gradient Boosting Classifiers, and Support Vector Machine (SVM)) and two types of ensemble models, stacking and majority voting, to achieve a low false positive rate. Stacked ensemble classifiers (gradient boosting, random forest, support vector machine) achieve 1.31% FPR and 98.17% accuracy on Dataset 1, 3.47% FPR and 96.47% accuracy on Dataset 2, and 2.81% FPR and 97.61% accuracy on Dataset 3.
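For reference, the FPR and accuracy figures quoted in this abstract are ratios over a binary confusion matrix. The sketch below shows the standard definitions, assuming "positive" means "flagged as phishing"; the function name and the counts in the example are invented for illustration and are not taken from the paper's datasets.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification ratios from confusion-matrix counts.

    tp: phishing pages flagged as phishing   fp: legitimate pages flagged as phishing
    tn: legitimate pages passed as genuine   fn: phishing pages passed as genuine
    """
    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "fpr": ratio(fp, fp + tn),        # share of legitimate pages wrongly flagged
        "recall": ratio(tp, tp + fn),     # share of phishing pages caught
        "precision": ratio(tp, tp + fp),  # share of flagged pages that are phishing
    }


# Made-up counts, for illustration only.
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Because FPR is computed over legitimate pages only, a model can show high accuracy and still have a noticeable FPR when the two classes are unevenly represented.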
Funding: Supported by the Industry-University Cooperation Project in Fujian Province University (No. 2022H6020).
Abstract: Self-powered neutron detectors (SPNDs) play a critical role in monitoring the safety margins and overall health of reactors, directly affecting safe reactor operation. In this work, a novel fault identification method based on graph convolutional networks (GCN) and Stacking ensemble learning is proposed for SPNDs. The GCN is employed to extract the spatial neighborhood information of SPNDs at different positions, and residuals are obtained by nonlinear fitting of SPND signals. To fully extract the time-varying features from the residual sequences, a Stacking fusion model integrating various algorithms is developed; it identifies five SPND conditions: normal, drift, bias, precision degradation, and complete failure. The results demonstrate that integrating diverse base learners in the GCN-Stacking model offers advantages over a single model and enhances the stability and reliability of fault identification. Additionally, the GCN-Stacking model maintains higher accuracy in identifying faults at different reactor power levels.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52204117) and the Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ40601).
Abstract: The stability of underground entry-type excavations (UETEs) is of paramount importance for ensuring the safety of mining operations. As more engineering cases are accumulated, machine learning (ML) has demonstrated great potential for the stability evaluation of UETEs. In this study, a hybrid stacking ensemble method aggregating support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), random forest (RF), multilayer perceptron neural network (MLPNN), and extreme gradient boosting (XGBoost) algorithms was proposed to assess the stability of UETEs. First, a total of 399 historical cases with two indicators were collected from seven mines. Subsequently, to pursue better evaluation performance, the hyperparameters of the base learners (SVM, KNN, DT, RF, MLPNN, and XGBoost) and the meta learner (MLPNN) were tuned by combining five-fold cross validation (CV) with a simulated annealing (SA) approach. Based on the optimal hyperparameter configuration, the stacking ensemble models were constructed using the training set (75% of the data). Finally, the performance of the proposed approach was evaluated by two global metrics (accuracy and Cohen's Kappa) and three within-class metrics (macro averages of the precision, recall, and F1-score) on the test set (25% of the data). In addition, the evaluation results were compared with those of the six base learners optimized by SA. The hybrid stacking ensemble algorithm achieved better comprehensive performance, with accuracy, Kappa coefficient, and macro averages of precision, recall, and F1-score of 0.92, 0.851, 0.885, 0.88, and 0.883, respectively. The rock mass rating (RMR) had the most important influence on the evaluation results. Moreover, the critical span graph (CSG) was updated based on the proposed model, representing a significant improvement over previous studies. This study can provide valuable guidance for stability analysis and risk management of UETEs. However, it is necessary to consider more indicators and collect a more extensive and balanced dataset to validate the model in the future.
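The global and macro-averaged metrics named in this abstract can all be read off a multi-class confusion matrix. The following is a minimal sketch using the textbook definitions of accuracy, Cohen's Kappa, and macro F1; the function name and the 2×2 example matrix are assumptions for illustration and do not reproduce the paper's results.

```python
from typing import List, Tuple


def kappa_and_macro_f1(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (accuracy, Cohen's kappa, macro F1).

    cm[i][j] is the number of samples whose true class is i and predicted class is j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]                        # true counts per class
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts per class

    observed = sum(cm[i][i] for i in range(k)) / n                  # plain accuracy
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)  # chance agreement
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    f1_scores = []
    for i in range(k):
        precision = cm[i][i] / col_tot[i] if col_tot[i] else 0.0
        recall = cm[i][i] / row_tot[i] if row_tot[i] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return observed, kappa, sum(f1_scores) / k


# Made-up confusion matrix, for illustration only.
acc, kappa, macro_f1 = kappa_and_macro_f1([[40, 10], [10, 40]])
```

Kappa discounts the agreement expected by chance, which is why it can sit well below accuracy on the same predictions.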
Funding: Supported by the National Key R&D Program of China (No. 2022YFB2403800), the National Natural Science Foundation of China (No. 52277118), the Natural Science Foundation of Tianjin (No. 22JCZDJC00660), and the Open Fund in the State Key Laboratory of Alternate Electrical Power System With Renewable Energy Sources (No. LAPS23018).
Abstract: The accurate identification of smart meter (SM) fault types is crucial for enhancing the efficiency of operation and maintenance (O&M) and the reliability of power collection systems. However, the intelligent classification of SM fault types faces significant challenges owing to the complexity of features and the imbalance between fault categories. To address these issues, this study presents a fault diagnosis method for SMs incorporating three distinct modules. The first module employs a combination of standardization, data imputation, and feature extraction to enhance the data quality, thereby facilitating improved training and learning by the classifiers. To enhance the classification performance, the data imputation method considers feature correlation measurement and sequential imputation, and the feature extractor utilizes a discriminative enhanced sparse autoencoder. To tackle the interclass imbalance of data with discrete and continuous features, the second module introduces an assisted classifier generative adversarial network, which includes a discrete feature generation module. Finally, a novel Stacking ensemble classifier for SM fault diagnosis is developed. In contrast to previous studies, we construct a two-layer heuristic optimization framework to address the synchronous dynamic optimization problem of the combinations and hyperparameters of the Stacking ensemble classifier, enabling better handling of complex classification tasks using SM data. The proposed fault diagnosis method for SMs via two-layer stacking ensemble optimization and data augmentation is trained and validated using SM fault data collected from 2010 to 2018 in Zhejiang Province, China. Experimental results demonstrate the effectiveness of the proposed method in improving the accuracy of SM fault diagnosis, particularly for minority classes.
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R319), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; Prince Sultan University, which covered the article processing charges (APC) associated with this publication; and Researchers Supporting Project Number (RSPD2025R1107), King Saud University, Riyadh, Saudi Arabia.
Abstract: Intrusion detection in healthcare networks is an urgent issue due to the critical consequences of cyber threats and the extreme sensitivity of medical information. The Auto-Stack ID proposed in this study is a stacked ensemble of encoder-enhanced auctions that can be used to improve intrusion detection in healthcare networks. The model is trained and evaluated on the WUSTL-EHMS 2020 dataset, which has an imbalanced class distribution (87.46% normal traffic and 12.53% intrusion attacks). To address this imbalance, the study mitigates training bias through stratified K-fold cross-validation (K = 5), so that each class is represented similarly in the training and validation splits. The Auto-Stack ID method combines several base classifiers: TabNet, LightGBM, Gaussian Naive Bayes, Histogram-Based Gradient Boosting (HGB), and Logistic Regression. Training proceeds in two stages: in the first, the base classifiers produce out-of-fold (OOF) predictions, which serve as inputs to the second-stage meta-learner, XGBoost. The meta-learner learns to refine predictions by capturing complicated interactions between base models, thus improving detection accuracy without introducing bias, overfitting, or requiring domain knowledge of the meta-data. The Auto-Stack ID model achieved 98.41% accuracy and a 93.45% F1 score, better than the individual classifiers. With 90.55% recall and 96.53% precision, it identifies intrusions with minimal false positives. These findings indicate its suitability for securing healthcare networks through ensemble learning. Future work will address real-time deployment to improve the response to evolving threats.
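The first training stage described in this abstract, stratified K-fold splitting followed by out-of-fold (OOF) prediction, can be sketched in plain Python. This is a hedged illustration of the general technique only: the toy threshold learner stands in for the real base classifiers, the helper names are hypothetical, the data are invented, and the XGBoost meta-learner is not implemented here.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [0] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            fold_of[idx] = pos % k  # deal each class round-robin across the folds
    return fold_of


def out_of_fold_predictions(X, y, fit_fns, k=5, seed=0):
    """For every base learner, predict each sample with a model that never saw it."""
    fold_of = stratified_folds(y, k, seed)
    oof = [[None] * len(fit_fns) for _ in X]
    for fold in range(k):
        train = [i for i in range(len(X)) if fold_of[i] != fold]
        held_out = [i for i in range(len(X)) if fold_of[i] == fold]
        for m, fit in enumerate(fit_fns):
            predict = fit([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                oof[i][m] = predict(X[i])
    return oof  # one row of meta-features per sample, one column per base learner


def make_threshold_learner(feature):
    """Toy base learner: cut one feature halfway between the two class means.
    Assumes both classes are present in every training split."""
    def fit(X_train, y_train):
        pos = [x[feature] for x, t in zip(X_train, y_train) if t == 1]
        neg = [x[feature] for x, t in zip(X_train, y_train) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        cut = (mean_pos + mean_neg) / 2
        sign = 1 if mean_pos >= mean_neg else -1
        return lambda x: 1.0 if sign * (x[feature] - cut) > 0 else 0.0
    return fit


# Invented, imbalanced toy data: 15 "normal" (0) and 5 "intrusion" (1) samples.
X = [[0.01 * i, 1.0] for i in range(15)] + [[1.0 + 0.01 * i, 0.0] for i in range(5)]
y = [0] * 15 + [1] * 5
folds = stratified_folds(y, k=5)
oof = out_of_fold_predictions(X, y, [make_threshold_learner(0), make_threshold_learner(1)])
```

The rows of `oof` would then be the training inputs of the meta-learner; because each row comes from a model that did not see that sample, the meta-learner is not fitted on leaked labels.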
Funding: Funded by the National Natural Science Foundation of China (Grant No. 41941019) and the State Key Laboratory of Hydroscience and Engineering (Grant No. 2019-KY-03).
Abstract: Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large amount of operation data is generated, reflecting the interaction between the TBM system and the surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes were obtained after data preprocessing. Then, through a tree-based feature selection method, 10 key TBM operation parameters were selected, and the mean values of the 10 selected features in the stable phase after removing outliers were calculated as the inputs of the classifiers. The preprocessed data were randomly divided into a training set (90%) and a test set (10%) using simple random sampling. Besides the stacking ensemble classifier, seven individual classifiers were established for comparison. These classifiers include support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR), and multilayer perceptron (MLP), where the hyper-parameters of each classifier were optimised using the grid search method. The prediction results show that the stacking ensemble classifier performs better than the individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relatively balanced training set was obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.
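The core idea of SMOTE mentioned at the end of this abstract is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified pure-Python rendering of that idea, not the reference implementation from any library; the function name, defaults, and the toy points are assumptions.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points, each lying on the segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, nearest first (squared Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Invented minority samples (corners of the unit square), for illustration only.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=6)
```

Oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.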
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52078428) and the Sichuan Outstanding Young Science and Technology Talent Project, China (Grant No. 2020JDJQ0032).
Abstract: Rockburst is a common geological disaster in deep tunnel engineering. It causes great harm and occurs at random locations and times. These characteristics seriously affect tunnel construction and threaten the physical and mental health and safety of workers. Therefore, it is of great significance to study rockburst tendency in the early stages of tunnel survey, design, and construction. At present, there is no unified method or set of parameters for rockburst prediction. In view of the large differences between rockburst criteria and the class imbalance of rockburst databases, this paper presents a two-step rockburst prediction method based on multiple factors and the stacking ensemble algorithm. Considering the influence of rock physical and mechanical parameters, tunnel face conditions, and excavation disturbance, multiple rockburst criteria are predicted by integrating multiple machine learning algorithms. A combined prediction model of rockburst criteria is established, and the results of each rockburst criterion index are weighted and combined, with the weights updated using the field rockburst record. The dynamic weight is combined with the cloud model to comprehensively evaluate the regional rockburst risk. Field results from applying the model in the Grand Canyon tunnel show that the proposed rockburst prediction method has better applicability and higher accuracy than a single rockburst criterion.
Abstract: As a result of the increased number of COVID-19 cases, Ensemble Machine Learning (EML) would be an effective tool for combatting this pandemic outbreak. An ensemble of classifiers, especially stacking-based ensemble learning, can improve on the performance of single machine learning (ML) classifiers. Stacking utilizes heterogeneous base learners trained in parallel and combines their predictions using a meta-model to determine the final prediction results. However, building an ensemble often causes the model performance to decrease when a growing number of learners is not properly selected. Therefore, the goal of this paper is to develop and evaluate a generic, data-independent predictive method using stacking-based ensemble learning (GA-Stacking) optimized by a Genetic Algorithm (GA) for outbreak prediction and health decision-aided processes. GA-Stacking utilizes five well-known classifiers, including Decision Tree (DT), Random Forest (RF), RIGID regression, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost), at its first level. It also introduces GA to identify comparisons to forecast the number, combination, and trust of these base classifiers based on the Mean Squared Error (MSE) as a fitness function. At the second level of the stacked ensemble model, a Linear Regression (LR) classifier is used to produce the final prediction. The performance of the model was evaluated using a publicly available dataset from the Center for Systems Science and Engineering, Johns Hopkins University, which consisted of 10,722 data samples. The experimental results indicated that the GA-Stacking model achieved outstanding performance with an overall accuracy of 99.99% for the three selected countries. Furthermore, the proposed model achieved good performance when compared with existing bagging-based approaches. The proposed model can be used to predict the pandemic outbreak correctly and may be applied as a generic data-independent model to predict the epidemic trend for other countries when comparing preventive and control measures. Source: CMC, 2023, vol. 74, no. 2.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R513), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Cross-Site Scripting (XSS) remains a significant threat to web application security, exploiting vulnerabilities to hijack user sessions and steal sensitive data. Traditional detection methods often fail to keep pace with the evolving sophistication of cyber threats. This paper introduces a novel hybrid ensemble learning framework that leverages a combination of advanced machine learning algorithms: Logistic Regression (LR), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Deep Neural Networks (DNN). Utilizing the XSS-Attacks-2021 dataset, which comprises 460 instances across various real-world traffic-related scenarios, this framework significantly enhances XSS attack detection. Our approach, which includes rigorous feature engineering and model tuning, not only optimizes accuracy but also effectively minimizes false positives (FP) (0.13%) and false negatives (FN) (0.19%). This comprehensive methodology has been rigorously validated, achieving an unprecedented accuracy of 99.87%. The proposed system is scalable and efficient, capable of adapting to the increasing number of web applications and user demands without a decline in performance. It demonstrates exceptional real-time capabilities, with the ability to detect XSS attacks dynamically, maintaining high accuracy and low latency even under significant loads. Furthermore, despite the computational complexity introduced by the hybrid ensemble approach, strategic use of parallel processing and algorithm tuning ensures that the system remains scalable and performs robustly in real-time applications. Designed for easy integration with existing web security systems, our framework supports adaptable Application Programming Interfaces (APIs) and a modular design, facilitating seamless augmentation of current defenses. This innovation represents a significant advancement in cybersecurity, offering a scalable and effective solution for securing modern web applications against evolving threats.
Abstract: BACKGROUND: There is a lack of literature discussing the use of the stacking ensemble algorithm for predicting depression in patients with heart failure (HF). AIM: To create a stacking model for predicting depression in patients with HF. METHODS: This study analyzed data on 1084 HF patients from the National Health and Nutrition Examination Survey database, spanning 2005 to 2018. Univariate analysis and an artificial neural network algorithm were used to identify predictors significantly linked to depression. These predictors were used to build a stacking model employing tree-based learners. The performance of both the individual models and the stacking model was assessed on the test dataset. Furthermore, the SHapley Additive exPlanations (SHAP) method was applied to interpret the stacking model. RESULTS: The models included five predictors. Among these models, the stacking model demonstrated the highest performance, achieving an area under the curve of 0.77 (95% CI: 0.71-0.84), a sensitivity of 0.71, and a specificity of 0.68. The calibration curve supported the reliability of the models, and decision curve analysis confirmed their clinical value. The SHAP plot showed that age had the most significant impact on the stacking model's output. CONCLUSION: The stacking model demonstrated strong predictive performance. Clinicians can use this model to identify HF patients at high risk of depression, enabling early psychological intervention.
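The three figures reported in this abstract (area under the curve, sensitivity, specificity) can be computed from predicted risk scores as in the minimal sketch below. It is an illustration only, not the study's code: the function names, the toy labels and scores, and the 0.45 decision threshold are all assumptions. AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, counting ties as one half.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = depressed, 0 = not depressed.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc(y_true, scores))
print(sensitivity_specificity(y_true, scores, threshold=0.45))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why such abstracts usually report all three.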
Funding: Funded by King Saud University, Riyadh, Saudi Arabia, through Researchers Supporting Project Number RSPD2024R893.
Abstract: Surveillance cameras are widely used as a security measure for monitoring in both the private and public sectors. Closed-Circuit Television (CCTV) cameras are used to monitor normal and anomalous incidents. Real-world anomaly detection is a significant challenge because anomalies are complex and diverse. Manual analysis is difficult because surveillance systems generate vast amounts of video data, which has raised the need for automated techniques that improve detection accuracy. This paper proposes a novel deep stacked ensemble model integrated with a data augmentation approach, called Stack Ensemble Road Anomaly Detection (SERAD). SERAD detects and classifies the four most frequent road anomalies (accidents, car fires, fighting, and snatching) in road surveillance videos with high accuracy. SERAD adapts three pre-trained Convolutional Neural Network (CNN) models, namely VGG19, ResNet50, and InceptionV3. Stacking is used to combine these three models, yielding much higher accuracy in classifying road anomalies than the individual models. Additionally, the paper presents a custom real-world Road Anomaly Dataset (RAD) comprising a comprehensive collection of road images and videos. The experimental results demonstrate the strength and reliability of the proposed SERAD model, which achieves a classification accuracy of 98.7%. The results indicate that the proposed SERAD model outperforms the individual CNN base models.
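The general idea of stacking already-trained base models can be sketched as follows: each base model's output on held-out samples becomes one input feature of a small meta-learner, which learns how much to trust each model. The sketch below is a simplified binary illustration under stated assumptions, not the SERAD implementation: the scores are invented, the three columns only stand in for three hypothetical base models, and the logistic-regression meta-learner trained by plain gradient descent is a choice made here for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(meta_features, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights on base-model outputs by gradient descent."""
    n_features = len(meta_features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(meta_features, labels):
            # Prediction error of the current meta-learner on this sample.
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (anomaly) if the stacked probability is at least 0.5, else 0."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical held-out anomaly probabilities from three base models (one column each).
held_out_scores = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.2],
    [0.7, 0.9, 0.6],
    [0.3, 0.2, 0.4],
]
held_out_labels = [1, 1, 0, 0, 1, 0]

weights, bias = fit_meta_learner(held_out_scores, held_out_labels)
print([predict(weights, bias, x) for x in held_out_scores])
```

The meta-learner is fitted on predictions for samples the base models were not trained on; fitting it on their training-set outputs would let it inherit their overfitting.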
Funding: Financially supported by the Natural Science Foundation of Hunan Province, China (No. 2024JJ2074), the National Natural Science Foundation of China (No. 22376221), and the Young Elite Scientists Sponsorship Program by CAST, China (No. 2023QNRC001).
Abstract: Tailings produced by mining and ore smelting are a major source of soil pollution. Understanding the speciation of heavy metals (HMs) in tailings is essential for soil remediation and sustainable development. Because traditional sequential laboratory extraction methods for determining the forms of HMs in tailings are complex and time-consuming, a rapid and precise identification approach is urgently required. To address this issue, a general empirical prediction method for HM occurrence was developed using machine learning (ML). The compositional information of the tailings, the properties of the HMs, and the sequential extraction steps were used as inputs to calculate the percentages of the seven forms of HMs. After the models were tuned and compared, extreme gradient boosting, gradient boosting decision tree, and categorical boosting were found to be the three best-performing ML models, with coefficient of determination (R^2) values on the testing set exceeding 0.859. Feature importance analysis for these three models indicated that electronegativity was the most important factor affecting the occurrence of HMs, with an average feature importance of 0.4522. Using stacking as a model integration method further improved the models' ability to predict HM occurrence forms, raising R^2 to 0.879. Overall, this study developed a robust technique for predicting HM occurrence forms in tailings and provides an important reference for the environmental assessment and recycling of tailings.
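For reference, the coefficient of determination used to compare these regression models is R^2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the observations from their mean. A minimal sketch follows; the function name and the toy values are assumptions for illustration and are unrelated to the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (hypothetical): measured vs. predicted percentages of one HM form.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(measured, predicted))
```

An R^2 of 1 means the predictions match the measurements exactly, while 0 means the model does no better than always predicting the mean.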
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1F1A1059346) and by the 2020 Research Fund (Project No. 1.180090.01) of UNIST (Ulsan National Institute of Science and Technology).
Abstract: An anomaly-based intrusion detection system (A-IDS) is a critical component of modern computing infrastructure because it can discover new types of attacks. It typically uses machine learning (ML) algorithms to detect and classify network traffic. To date, many algorithms have been proposed to improve the detection performance of A-IDS, using either individual or ensemble learners. In particular, ensemble learners have shown remarkable performance over individual learners in many applications, including the cybersecurity domain. However, most existing works still suffer from unsatisfactory results due to improper ensemble design. The aim of this study is to demonstrate the effectiveness of a stacking ensemble-based model for A-IDS in which deep learning (e.g., a deep neural network [DNN]) is used as the base learner. The proposed model and the base DNN model are benchmarked empirically on several performance metrics, i.e., Matthews correlation coefficient, accuracy, and false alarm rate. The results indicate that the proposed model is superior to the base DNN model as well as to other existing ML algorithms found in the literature.
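The three benchmark metrics named in this abstract can all be derived from a binary confusion matrix, as the minimal sketch below shows. The function name and the example counts are assumptions made for illustration; treating "attack" as the positive class is also an assumption, under which the false alarm rate is the share of normal traffic wrongly flagged as an attack.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, false alarm rate, MCC) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # False alarm rate = FP / (FP + TN): normal samples flagged as attacks.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, false_alarm_rate, mcc


# Hypothetical counts: 100 attack samples and 100 normal samples.
print(binary_metrics(tp=90, fp=5, tn=95, fn=10))
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when attack and normal traffic are heavily imbalanced.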