Distribution transformers play a vital role in power distribution systems, and their reliable operation is crucial for grid stability. This study presents a simulation-based framework for active fault diagnosis and early warning of distribution transformers, integrating Sample Ensemble Learning (SEL) with a Self-Optimizing Support Vector Machine (SO-SVM). The SEL technique enhances data diversity and mitigates class imbalance, while SO-SVM adaptively tunes its hyperparameters to improve classification accuracy. A comprehensive transformer model was developed in MATLAB/Simulink to simulate diverse fault scenarios, including inter-turn winding faults, core saturation, and thermal aging. Feature vectors were extracted from voltage, current, and temperature measurements to train and validate the proposed hybrid model. Quantitative analysis shows that the SEL–SO-SVM framework achieves a classification accuracy of 97.8%, a precision of 96.5%, and an F1-score of 97.2%. Beyond classification, the model identified incipient faults, providing an early-warning lead time of up to 2.5 s before significant deviations in operational parameters. This predictive capability underscores its potential for preventing catastrophic transformer failures and enabling timely maintenance actions. The proposed approach demonstrates strong applicability for enhancing the reliability and operational safety of distribution transformers in simulated environments, offering a promising foundation for future real-time and field-level implementations.
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment options. Omics datasets (transcriptomic, epigenomic, proteomic, and others) generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite advances in sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis) and report accuracy, precision, F1-score, recall, cross-validation, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR−), and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. They significantly affect the operations, finances, and reputation of targeted companies. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists, and the proposed solutions have not significantly slowed the spread of ransomware. The recent achievements of large language models (LLMs) in NLP tasks have prompted cybersecurity researchers to integrate these models into security threat detection. These models offer strong embedding capabilities: they extract rich semantic representations, paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique, assigning each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
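The weighted voting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model names, the class probabilities, and the use of a validation score as the "performance" weight are all assumptions made for illustration.

```python
from typing import Dict, List


def weighted_vote(probabilities: Dict[str, List[float]],
                  scores: Dict[str, float]) -> List[float]:
    """Blend per-model class probabilities, weighting each model
    in proportion to its (assumed) validation score."""
    total = sum(scores[name] for name in probabilities)
    if total <= 0:
        raise ValueError("model scores must sum to a positive value")
    n_classes = len(next(iter(probabilities.values())))
    combined = [0.0] * n_classes
    for name, probs in probabilities.items():
        weight = scores[name] / total  # influence proportional to performance
        for k, p in enumerate(probs):
            combined[k] += weight * p
    return combined


def predict_label(combined: List[float]) -> int:
    """Return the index of the class with the highest blended probability."""
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical outputs of three MLP classifiers, one per embedding model.
# Class 0 = ransomware, class 1 = benign.
example_probs = {
    "embed_a": [0.9, 0.1],
    "embed_b": [0.4, 0.6],
    "embed_c": [0.3, 0.7],
}
# Hypothetical performance scores used as voting weights.
example_scores = {"embed_a": 0.5, "embed_b": 0.25, "embed_c": 0.25}

blended = weighted_vote(example_probs, example_scores)
label = predict_label(blended)
```

In this toy case an unweighted majority vote would pick class 1 (two of three models), whereas the weighted blend picks class 0 because the strongest model is both confident and heavily weighted.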
With the increasing depth and intensity of coal mining operations, high-energy mine tremors have become a major trigger for rockburst disasters, posing severe threats to mine safety. Conventional rockburst risk assessment methods either lack real-time adaptability or rely heavily on qualitative analysis of microseismic (MS) data, limiting their effectiveness in dynamic early warning. To address these limitations, this study proposed a predictive framework for rockburst risk assessment by integrating ensemble learning algorithms with Bayesian optimization. A dataset was constructed using a sliding time window approach, linking the highest MS energy in the subsequent days with predefined risk levels. Both undersampling and oversampling strategies were employed to mitigate class imbalance, and their performance was evaluated. Three ensemble models, i.e., CatBoost, Random Forest, and LightGBM, were developed, and their hyperparameters were optimized using Bayesian techniques to enhance predictive performance. The models were validated using MS data from the 6303 and 6306 working faces at the Dongtan Coal Mine. All three ensemble models outperformed conventional classification methods, particularly in accurately predicting high-risk categories. Among them, the CatBoost model exhibited the best performance, with an accuracy of 89.47% and an F1-score of 90.62%. Furthermore, SHapley Additive exPlanations analysis was used to enhance model interpretability, identifying key MS indicators influencing rockburst risk predictions. This study provides a systematic approach for leveraging MS data and machine learning to improve an early warning system for rockburst hazards, offering valuable insights for underground mining safety management.
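The sliding-window dataset construction mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the window length, the look-ahead horizon, the energy thresholds, and the use of raw daily energies as features are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def label_windows(daily_energy: Sequence[float],
                  window: int,
                  horizon: int,
                  thresholds: Sequence[float]) -> List[Tuple[List[float], int]]:
    """Pair each `window`-day history with a risk level derived from the
    highest energy observed in the following `horizon` days.

    The risk level is the number of thresholds the peak energy reaches,
    so 0 is the lowest level and len(thresholds) the highest.
    """
    if window <= 0 or horizon <= 0:
        raise ValueError("window and horizon must be positive")
    samples = []
    last_start = len(daily_energy) - window - horizon
    for start in range(last_start + 1):
        history = list(daily_energy[start:start + window])
        future = daily_energy[start + window:start + window + horizon]
        peak = max(future)  # highest MS energy in the subsequent days
        risk = sum(peak >= t for t in thresholds)
        samples.append((history, risk))
    return samples


# Hypothetical daily maximum MS energies (J) and risk thresholds (J).
energy = [1e3, 2e3, 5e4, 3e3, 2e4, 1e3, 2e5]
samples = label_windows(energy, window=3, horizon=2, thresholds=[1e4, 1e5])
risks = [risk for _, risk in samples]
```

Each sample's label depends only on days after its feature window, which keeps future information out of the inputs; a classifier such as those named in the abstract would then be trained on these pairs.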
We propose a novel cooling protocol within a triple-Laguerre-Gaussian cavity optomechanical system, which is designed to suppress the thermal vibrations of a rotating mirror so that it reaches its quantum ground state. The system incorporates two auxiliary cavities and an atomic ensemble coupled to a Laguerre-Gaussian rotational cavity. By carefully selecting system parameters, the cooling process of the rotating mirror is significantly enhanced, while the heating process is effectively suppressed, enabling efficient ground-state cooling even in the unresolved sideband regime. Compared to previous works, our scheme reduces the stringent restrictions on auxiliary systems, making it more experimentally feasible under broader parameter conditions. These findings provide a robust approach for achieving ground-state cooling in mechanical resonators.
Distributed Denial of Service (DDoS) attacks are among the most severe threats to network infrastructure, sometimes bypassing traditional detection algorithms because of their evolving complexity. Current Machine Learning (ML) techniques for DDoS attack detection typically rely on statistical features of network traffic, such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complex relations among traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, allowing different GNN models to capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the proposed method's applicability and robustness for real-world deployment in contexts where several DDoS patterns coexist.
The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates transformer-based models (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to significantly enhance smishing detection performance. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among LLMs and a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% compared to LLMs across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Ovarian cancer (OC) is one of the leading causes of death related to gynecological cancer; its main difficulties are early diagnosis and the heterogeneous nature of tumor biomarkers. Machine learning (ML) has the potential to process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models tend to be biased, prone to overfitting, sensitive to noise, and poorly generalizable. Moreover, their black-box nature reduces interpretability and limits their practical clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs tree-based learners such as Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost) as base learners, and Logistic Regression (LR) as the meta-learner to enhance OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Extensive preprocessing includes handling missing data using iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
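The multicollinearity step in this abstract (removing features whose correlation coefficient exceeds 0.7) can be sketched in plain Python. The abstract does not say which feature of a correlated pair is dropped, so the greedy "keep the first one seen" rule, the use of the absolute Pearson coefficient, and the feature names below are assumptions.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def drop_correlated(features: Dict[str, List[float]],
                    threshold: float = 0.7) -> List[str]:
    """Greedily keep a feature only if its absolute correlation with every
    previously kept feature does not exceed `threshold`."""
    kept: List[str] = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns; "marker_b" is almost a multiple of "marker_a".
table = {
    "marker_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "marker_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(table, threshold=0.7)
```

Because the rule is order-dependent, a different column ordering can retain a different member of each correlated pair; a real pipeline would fix that choice explicitly, for example by preferring the feature with fewer missing values.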
This study investigated the impacts of key parameters in CAM6's deep convection and cloud physics schemes on the simulation of summer-mean precipitation over East Asia by conducting perturbed parameter ensemble (PPE) experiments. Using the CAM6 experimental platform, a suite of 128 PPE simulations spanning 1979–2014 was generated by simultaneously perturbing 12 selected parameters. Using EOF analysis, this study first extracted the two leading modes of the precipitation simulation biases. The authors then used generalized linear model analysis to identify the parameters with the greatest influence on those biases. The first leading mode of precipitation simulation biases is primarily influenced by parameters from the cloud physics scheme, including the linear effects of dcs and eii, and the nonlinear effect of rhminl*dcs. These parameters influence the simulated total precipitation (PrecT) mainly by altering the large-scale precipitation (PrecL). The second leading mode is predominantly governed by the convection scheme parameter dmpdz, reflecting a competition between the changes in convective precipitation (PrecC) and PrecL in response to variations in dmpdz. An increase in dmpdz decreases PrecC and increases PrecL over East Asia, and the two changes together shape the ultimate PrecT response to the adjusted dmpdz. Lastly, the nonlinear effect due to interactions among parameters warrants attention when multiple parameters are adjusted concurrently, and the precipitation biases from the PPE simulations resemble those identified through EOF analysis of the AMIP simulations, implying that our findings may provide a useful reference for other AGCMs.
Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. In particular, rock thrown beyond the designated blast area has been identified as the primary cause of fatal and non-fatal blasting hazards in open pit mining. Therefore, accurate and reliable prediction of flyrock is crucial for effectively managing and mitigating the associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, the multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed that MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model outperformed DT, SVM, and CART. This study highlights the effectiveness of MVO-LightGBM and its potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to quantify the sensitivity of the parameters. MPSA results indicated that the highest and lowest sensitivities correspond to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Android smartphones have become an integral part of our daily lives, which makes them targets for ransomware attacks. Such attacks encrypt user information and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, and Bagging, as well as AutoML models. We aimed to show how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while matching or improving on the performance of traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository, which contains features of Android ransomware network traffic. The dataset comprises 392,024 flow records, divided into eleven groups: ten classes for various ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin, and one class for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier achieved an accuracy of 99.84%, and the FLAML AutoML framework a slightly higher accuracy of 99.85%. This indicates how well AutoML performs with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it will facilitate the development of next-generation intrusion detection systems.
Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied for automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract and analyze the learned features using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and on a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared to existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Intrusion detection in Internet of Things (IoT) environments presents challenges due to heterogeneous devices, diverse attack vectors, and highly imbalanced datasets. Existing research on the ToN-IoT dataset has largely emphasized binary classification and single-model pipelines, which often show strong performance but limited generalizability, probabilistic reliability, and operational interpretability. This study proposes a stacked ensemble deep learning framework that integrates random forest, extreme gradient boosting, and a deep neural network as base learners, with CatBoost as the meta-learner. On the ToN-IoT Linux process dataset, the model achieved near-perfect discrimination (macro area under the curve = 0.998), robust calibration, and superior F1-scores compared with standalone classifiers. Interpretability was achieved through SHapley Additive exPlanations-based feature attribution, which highlights actionable drivers of malicious behavior, such as command-line patterns, process scheduling anomalies, and CPU usage spikes, and aligns these indicators with MITRE ATT&CK tactics and techniques. Complementary analyses, including cumulative lift and sensitivity-specificity trade-offs, revealed the framework's suitability for deployment in security operations centers, where calibrated risk scores, transparent explanations, and resource-aware triage are essential. These contributions bridge methodological rigor in artificial intelligence/machine learning with operational priorities in cybersecurity, delivering a scalable and explainable intrusion detection system suitable for real-world deployment in IoT environments.
Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. For this, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed to handle missing values, redundant features, and class imbalance to ensure fair learning. The results outperformed the state of the art, with a Regularized Neural Network achieving 97.6% accuracy on behavioral data; on the multimodal data, the accuracy was 98.2%. Other models also performed well, with accuracies consistently above 96%. We also applied SHAP and LIME to the behavioral dataset to explain the models.
This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combined Radial Basis Function Networks (RBFN) with three ensemble learning techniques: DAGGING (DG), MULTIBOOST (MB), and ADABOOST (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighted method, Information Value (IV), and a benchmark machine learning (ML) model, Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors: slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were used to predict the output variable, the probability of landslide occurrence. Statistical analysis of the models' performance indicated that the DG-RBFN model, with an Area Under ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model had an AUC of 0.913, and the MLP model recorded an AUC of 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby supporting land use planning and decision-making.
The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are meticulously analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are comprehensively compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, addressing challenges such as base model selection, measuring model robustness and uncertainty, and interpretability of ensemble models in practical applications is emphasized. Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Funding: Deanship of Research and Graduate Studies at King Khalid University, KSA, through the Large Research Project under grant number RGP2/164/46.
Abstract: Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment options. Transcriptomic, epigenomic, proteomic, and other omics datasets generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite advances in sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), and we report accuracy, precision, F1-score, recall, cross-validation, specificity, positive and negative likelihood ratios (LR+, LR−), and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML analysis of the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
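The likelihood ratios and Youden index listed among the evaluation metrics follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are illustrative assumptions and are not taken from the RankXLAN study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp, fp, tn, fn are illustrative inputs, not results from the paper.
    """
    sensitivity = tp / (tp + fn)          # recall / true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    # LR+ = sensitivity / (1 - specificity); infinite if there are no false positives
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # LR- = (1 - sensitivity) / specificity; infinite if there are no true negatives
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    # Youden's J = sensitivity + specificity - 1
    youden_j = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "youden_j": youden_j,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

A higher LR+ and a lower LR− both indicate a more informative test, while Youden's J summarizes sensitivity and specificity in a single value between −1 and 1.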
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2024-02-01176.
Abstract: In recent years, ransomware attacks have become one of the most common and destructive types of cyberattacks. Their impact on the operations, finances, and reputation of affected companies is significant. Despite the efforts of researchers and security experts to protect information systems from these attacks, the threat persists, and the proposed solutions have not significantly slowed the spread of ransomware. The recent achievements of large language models (LLMs) in NLP tasks have prompted cybersecurity researchers to integrate these models into security threat detection. These models offer strong embedding capabilities and can extract rich semantic representations, paving the way for more accurate and adaptive solutions. In this context, we propose a new approach for ransomware detection based on an ensemble method that leverages three distinct LLM embedding models. This ensemble strategy takes advantage of the variety of embedding methods and the strengths of each model. In the proposed solution, each embedding model is associated with an independently trained MLP classifier. The predictions obtained are then merged using a weighted voting technique that assigns each model an influence proportional to its performance. This approach makes it possible to exploit the complementarity of representations, improve detection accuracy and robustness, and offer a more reliable solution in the face of the growing diversity and complexity of modern ransomware.
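The weighted voting step described in this abstract can be sketched as follows. This is a minimal illustration that assumes each classifier outputs a probability that a sample is ransomware and that "performance" means a validation score such as accuracy; the abstract does not specify either detail, and the function name and numbers are hypothetical.

```python
def weighted_vote(probabilities, scores):
    """Combine per-model probabilities with weights proportional to model scores.

    probabilities: one P(ransomware) per model (assumed classifier output).
    scores: one validation score per model (assumed performance measure).
    Returns the combined probability and the normalized weights.
    """
    total = sum(scores)
    weights = [s / total for s in scores]          # weights sum to 1
    combined = sum(w * p for w, p in zip(weights, probabilities))
    return combined, weights


# Three hypothetical embedding-model classifiers and their validation accuracies.
combined, weights = weighted_vote([0.9, 0.2, 0.6], [0.98, 0.90, 0.95])
is_ransomware = combined >= 0.5   # 0.5 is an illustrative decision threshold
```

Normalizing the scores keeps the combined value a valid probability, so a single decision threshold can be applied regardless of how many models take part in the vote.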
Funding: Funded by the National Natural Science Foundation of China (Grant No. 42477208), the Natural Science Foundation of Hubei Province, China (Grant No. 2024AFA072), and the Open Research Fund of State Key Laboratory of Geomechanics and Geotechnical Engineering Safety (Grant No. SKLGME-JBGS2402).
Abstract: With the increasing depth and intensity of coal mining operations, high-energy mine tremors have become a major trigger for rockburst disasters, posing severe threats to mine safety. Conventional rockburst risk assessment methods either lack real-time adaptability or rely heavily on qualitative analysis of microseismic (MS) data, limiting their effectiveness in dynamic early warning. To address these limitations, this study proposes a predictive framework for rockburst risk assessment that integrates ensemble learning algorithms with Bayesian optimization. A dataset was constructed using a sliding time window approach, linking the highest MS energy in the subsequent days with predefined risk levels. Both undersampling and oversampling strategies were employed to mitigate class imbalance, and their performance was evaluated. Three ensemble models, i.e., CatBoost, Random Forest, and LightGBM, were developed, and their hyperparameters were optimized using Bayesian techniques to enhance predictive performance. The models were validated using MS data from the 6303 and 6306 working faces at the Dongtan Coal Mine. All three ensemble models outperformed conventional classification methods, particularly in accurately predicting high-risk categories. Among them, the CatBoost model exhibited the best performance, with an accuracy of 89.47% and an F1-score of 90.62%. Furthermore, SHapley Additive exPlanations analysis was used to enhance model interpretability, identifying key MS indicators influencing rockburst risk predictions. This study provides a systematic approach for leveraging MS data and machine learning to improve early warning systems for rockburst hazards, offering valuable insights for underground mining safety management.
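The sliding time window construction mentioned in this abstract can be illustrated with a short sketch. The window length, the look-ahead horizon, and the energy thresholds below are hypothetical placeholders; the abstract does not give the actual values or the input features used in the study.

```python
def build_samples(daily_energy, window=7, horizon=3, thresholds=(1e4, 1e5)):
    """Build (features, label) pairs from a daily MS energy series.

    Features: the `window` most recent daily energies (an assumed feature set).
    Label: risk level derived from the highest energy in the next `horizon`
    days, where 0 = low, 1 = medium, 2 = high (thresholds are hypothetical).
    """
    samples = []
    last_start = len(daily_energy) - window - horizon
    for start in range(last_start + 1):
        features = daily_energy[start:start + window]
        future = daily_energy[start + window:start + window + horizon]
        future_max = max(future)
        # Count how many thresholds the future peak reaches.
        label = sum(future_max >= t for t in thresholds)
        samples.append((features, label))
    return samples


# Hypothetical daily maximum MS energies in joules.
energy = [1e3] * 5 + [5e4, 1e3, 2e5, 1e3, 1e3]
samples = build_samples(energy, window=3, horizon=2)
labels = [label for _, label in samples]
```

Because high-risk windows are typically rare in such a series, the resulting labels tend to be imbalanced, which is the situation the undersampling and oversampling strategies in the abstract are meant to address.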
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 62471180).
Abstract: We propose a novel cooling protocol within a triple-Laguerre-Gaussian cavity optomechanical system, designed to suppress the thermal vibrations of a rotating mirror so that it reaches its quantum ground state. The system incorporates two auxiliary cavities and an atomic ensemble coupled to a Laguerre-Gaussian rotational cavity. By carefully selecting system parameters, the cooling process of the rotating mirror is significantly enhanced while the heating process is effectively suppressed, enabling efficient ground-state cooling even in the unresolved sideband regime. Compared with previous works, our scheme relaxes the stringent restrictions on auxiliary systems, making it more experimentally feasible under broader parameter conditions. These findings provide a robust approach for achieving ground-state cooling in mechanical resonators.
Abstract: Distributed Denial of Service (DDoS) attacks are among the most severe threats to network infrastructure, sometimes bypassing traditional detection algorithms because of their evolving complexity. Current machine learning (ML) techniques for DDoS attack detection typically rely on statistical features of network traffic, such as packet sizes and inter-arrival times. However, such techniques sometimes fail to capture complicated relations among traffic flows. In this paper, we present a new multi-scale ensemble strategy based on Graph Neural Networks (GNNs) for improving DDoS detection. Our technique divides traffic into macro- and micro-level elements, allowing different GNN models to capture both coarse-scale anomalies and subtle, stealthy attack patterns. By modeling network traffic as graph-structured data, GNNs efficiently learn intricate relations among network entities. The proposed ensemble learning algorithm combines the results of several GNNs to improve generalization, robustness, and scalability. Extensive experiments on three benchmark datasets (UNSW-NB15, CICIDS2017, and CICDDoS2019) show that our approach outperforms traditional machine learning and deep learning models in detecting both high-rate and low-rate (stealthy) DDoS attacks, with significant improvements in accuracy and recall. These findings demonstrate the proposed method's applicability and robustness for real-world deployment in contexts where several DDoS patterns coexist.
Funding: Funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant No. GPIP:1074-612-2024.
Abstract: The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates a transformer-based model (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to significantly enhance smishing detection performance. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among the LLMs, followed by a final ensemble vote that classifies messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared with the best standalone transformer, and from 93% to 98.5% compared with the LLMs, across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) under grant number IMSIU-DDRSP2601.
Abstract: Ovarian cancer (OC) is one of the leading causes of death related to gynecological cancer; its main challenges are the difficulty of early diagnosis and the heterogeneous nature of tumor biomarkers. Machine learning (ML) has the potential to process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models tend to be biased, prone to overfitting, sensitive to noise, and poorly generalized. Moreover, their black-box nature reduces interpretability and limits their practical clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs tree-based learners, namely Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost), as base learners and Logistic Regression (LR) as the meta-learner to enhance OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Extensive preprocessing includes handling missing data using iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
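The multicollinearity step in this abstract (removing features with correlation coefficients above 0.7) can be sketched in plain Python. The abstract does not say which member of a correlated pair is dropped, so the greedy keep-first rule below is an assumption, as are the function names and the toy data; the sketch also assumes every column is non-constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_correlated(columns, threshold=0.7):
    """Greedily keep a feature only if |r| <= threshold with every kept feature.

    columns: dict mapping feature name to its list of values.
    The keep-first ordering rule is an illustrative assumption.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: "b" is almost a linear function of "a"; "c" is only weakly related.
features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 1, 4, 2, 3],
}
selected = drop_correlated(features, threshold=0.7)
```

With a keep-first rule the result depends on column order, so in practice the order is usually chosen deliberately, for example by placing clinically preferred markers first.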
Funding: Jointly supported by the National Key Research and Development Program of China [grant number 2022YFF0802004], the Excellent Youth Natural Science Foundation of Jiangsu Province [grant number BK20230061], and the Joint Open Project of KLME & CIC-FEMD [grant number KLME202501].
Abstract: This study investigated the impacts of key parameters in CAM6's deep convection and cloud physics schemes on the simulation of summer-mean precipitation over East Asia by conducting perturbed parameter ensemble (PPE) experiments. Using the CAM6 experimental platform, a suite of 128 PPE simulations spanning 1979–2014 was generated by simultaneously perturbing 12 selected parameters. Using EOF analysis, this study first extracted the two leading modes of the precipitation simulation biases. The authors then identified the parameters with the greatest influence on these biases through generalized linear model analysis. The first leading mode of precipitation simulation biases is primarily influenced by parameters from the cloud physics scheme, including the linear effects of dcs and eii and the nonlinear effect of rhminl*dcs. These parameters influence the simulated total precipitation (PrecT) mainly by altering the large-scale precipitation (PrecL). The second leading mode is predominantly governed by the convection scheme parameter dmpdz, reflecting a competition between the changes in convective precipitation (PrecC) and PrecL in response to variations in dmpdz. An increase in dmpdz decreases PrecC and increases PrecL in East Asia, and the two changes together shape the ultimate PrecT response to the adjusted dmpdz. Lastly, the nonlinear effect due to interactions among parameters warrants attention when multiple parameters are adjusted concurrently, and the precipitation biases from the PPE simulations resemble those identified through EOF analysis of the AMIP simulations, implying that our findings may provide a useful reference for other AGCMs.
Funding: Funded by the Key Laboratory of Geological Safety of Coastal Urban Underground Space, Ministry of Natural Resources of China (Grant No. BHKF2022Y02), the Natural Science Foundation of Guangdong Province, China (Grant No. 2024A1515011162), and the Natural Science Foundation of Shandong Province, China (Grant No. ZR2024QE021).
Abstract: Traditional mining in open pit mines often uses explosives, leading to environmental hazards, with flyrock being a critical issue. Specifically, rock thrown beyond the designated blast area has been identified as the primary cause of fatal and non-fatal blasting accidents in open pit mining. Accurate and reliable prediction of flyrock is therefore crucial for effectively managing and mitigating the associated problems. This study used the Light Gradient Boosting Machine (LightGBM) model to predict flyrock in a lead-zinc mine, with promising results. To improve its accuracy, the multi-verse optimizer (MVO) and ant lion optimizer (ALO) metaheuristic algorithms were introduced. Results showed that MVO-LightGBM outperformed conventional LightGBM. Additionally, decision tree (DT), support vector machine (SVM), and classification and regression tree (CART) models were trained and compared with MVO-LightGBM. The MVO-LightGBM model outperformed DT, SVM, and CART. This study highlights the effectiveness of MVO-LightGBM and its potential for broader applications. Furthermore, a multiple parametric sensitivity analysis (MPSA) algorithm was employed to quantify the sensitivity of the parameters. MPSA results indicated that the highest and lowest sensitivities correspond to blasted rock per hole and spacing, with γ = 1752.12 and γ = 49.52, respectively.
Funding: Supported through the Ongoing Research Funding Program (ORF-2025-498), King Saud University, Riyadh, Saudi Arabia.
Abstract: Android smartphones have become an integral part of our daily lives, which also makes them targets for ransomware attacks. Such attacks encrypt user information and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed various ensemble classifiers, such as Random Forest, Gradient Boosting, and Bagging, along with AutoML models. We aimed to show how AutoML can automate processes such as model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while maintaining or improving performance compared with traditional approaches. We tested this framework on a publicly available dataset from the Kaggle repository, which contains network-traffic features for Android ransomware. The dataset comprises 392,024 flow records divided into eleven classes: ten for various ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin, and one for regular traffic. We applied a three-step procedure to select the most relevant features: filter, wrapper, and embedded methods. The Bagging classifier achieved an accuracy of 99.84%, and the FLAML AutoML framework was even more accurate at 99.85%. This indicates that AutoML can improve results with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it will facilitate the development of next-generation intrusion detection systems.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12274131) and the Innovation Program for Quantum Science and Technology (Grant No. 2024ZD0300101).
Abstract: Optical non-reciprocity is a fundamental phenomenon in photonics. It is crucial for developing devices that rely on directional signal control, such as optical isolators and circulators. However, most research in this field has focused on systems in equilibrium or steady states. In this work, we demonstrate a room-temperature Rydberg atomic platform where the unidirectional propagation of light acts as a switch to mediate time-crystalline-like collective oscillations through atomic synchronization.
Funding: Partially supported by Begum Rokeya University, Rangpur, and the United Arab Emirates University, UAE.
Abstract: Rice is one of the most important staple crops globally. Rice plant diseases can severely reduce crop yields and, in extreme cases, lead to total production loss. Early diagnosis enables timely intervention, mitigates disease severity, supports effective treatment strategies, and reduces reliance on excessive pesticide use. Traditional machine learning approaches have been applied to automated rice disease diagnosis; however, these methods depend heavily on manual image preprocessing and handcrafted feature extraction, which are labor-intensive and time-consuming and often require domain expertise. Recently, end-to-end deep learning (DL) models have been introduced for this task, but they often lack robustness and generalizability across diverse datasets. To address these limitations, we propose a novel end-to-end training framework for convolutional neural network (CNN) and attention-based model ensembles (E2ETCA). This framework integrates features from two state-of-the-art (SOTA) CNN models, Inception V3 and DenseNet-201, and an attention-based vision transformer (ViT) model. The fused features are passed through an additional fully connected layer with softmax activation for final classification. The entire process is trained end-to-end, enhancing its suitability for real-world deployment. Furthermore, we extract the learned features and analyze them using a support vector machine (SVM), a traditional machine learning classifier, to provide comparative insights. We evaluate the proposed E2ETCA framework on three publicly available datasets (the Mendeley Rice Leaf Disease Image Samples dataset, the Kaggle Rice Diseases Image dataset, and the Bangladesh Rice Research Institute dataset) and on a combined version of all three. Using standard evaluation metrics (accuracy, precision, recall, and F1-score), our framework demonstrates superior performance compared with existing SOTA methods in rice disease diagnosis, with potential applicability to other agricultural disease detection tasks.
Abstract: Intrusion detection in Internet of Things (IoT) environments presents challenges due to heterogeneous devices, diverse attack vectors, and highly imbalanced datasets. Existing research on the ToN-IoT dataset has largely emphasized binary classification and single-model pipelines, which often show strong performance but limited generalizability, probabilistic reliability, and operational interpretability. This study proposes a stacked ensemble deep learning framework that integrates random forest, extreme gradient boosting, and a deep neural network as base learners, with CatBoost as the meta-learner. On the ToN-IoT Linux process dataset, the model achieved near-perfect discrimination (macro area under the curve = 0.998), robust calibration, and superior F1-scores compared with standalone classifiers. Interpretability was achieved through feature attribution based on SHapley Additive exPlanations, which highlights actionable drivers of malicious behavior, such as command-line patterns, process scheduling anomalies, and CPU usage spikes, and aligns these indicators with MITRE ATT&CK tactics and techniques. Complementary analyses, including cumulative lift and sensitivity-specificity trade-offs, revealed the framework's suitability for deployment in security operations centers, where calibrated risk scores, transparent explanations, and resource-aware triage are essential. These contributions bridge methodological rigor in artificial intelligence/machine learning with operational priorities in cybersecurity, delivering a scalable and explainable intrusion detection system suitable for real-world deployment in IoT environments.
Funding: Funded by the King Salman Center for Disability Research through Research Group No. KSRG-2024-050.
Abstract: Artificial Intelligence (AI) is changing healthcare by helping with diagnosis. However, for doctors to trust AI tools, the tools need to be both accurate and easy to understand. In this study, we created a new machine learning system for the early detection of Autism Spectrum Disorder (ASD) in children. Our main goal was to build a model that is not only good at predicting ASD but also clear in its reasoning. To this end, we combined several different models, including Random Forest, XGBoost, and Neural Networks, into a single, more powerful framework. We used two different types of datasets: (i) a standard behavioral dataset and (ii) a more complex multimodal dataset with images, audio, and physiological information. The datasets were carefully preprocessed to handle missing values, redundant features, and class imbalance to ensure fair learning. The results outperformed the state of the art, with a Regularized Neural Network achieving 97.6% accuracy on behavioral data, whereas on the multimodal data the accuracy was 98.2%. Other models also performed well, with accuracies consistently above 96%. We also applied SHAP and LIME to the behavioral dataset to explain the models' predictions.
Funding: Supported by the University of Transport Technology under the project entitled "Application of Machine Learning Algorithms in Landslide Susceptibility Mapping in Mountainous Areas" (grant number DTTD2022-16).
Abstract: This study aimed to prepare landslide susceptibility maps for the Pithoragarh district in Uttarakhand, India, using advanced ensemble models that combine Radial Basis Function Networks (RBFN) with three ensemble learning techniques: DAGGING (DG), MULTIBOOST (MB), and ADABOOST (AB). This combination resulted in three distinct ensemble models: DG-RBFN, MB-RBFN, and AB-RBFN. Additionally, a traditional weighting method, Information Value (IV), and a benchmark machine learning (ML) model, Multilayer Perceptron Neural Network (MLP), were employed for comparison and validation. The models were developed using ten landslide conditioning factors: slope, aspect, elevation, curvature, land cover, geomorphology, overburden depth, lithology, distance to rivers, and distance to roads. These factors were used to predict the output variable, the probability of landslide occurrence. Statistical analysis of the models' performance indicated that the DG-RBFN model, with an Area Under the ROC Curve (AUC) of 0.931, outperformed the other models. The AB-RBFN model achieved an AUC of 0.929, the MB-RBFN model had an AUC of 0.913, and the MLP model recorded an AUC of 0.926. These results suggest that the advanced ensemble ML model DG-RBFN was more accurate than the traditional statistical model, the single MLP model, and the other ensemble models in preparing trustworthy landslide susceptibility maps, thereby enhancing land use planning and decision-making.
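The AUC values used to rank the models above measure how often a landslide location receives a higher predicted probability than a non-landslide location. The sketch below computes AUC from that pairwise definition; the function name and the toy scores are illustrative assumptions and are unrelated to the study's data.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: predicted probabilities; labels: 1 for positive, 0 for negative.
    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: four locations with hypothetical susceptibility scores.
auc = pairwise_auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

This quadratic-time form is meant for illustration; on large maps a rank-based computation gives the same value more efficiently.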
Funding: National Natural Science Foundation of China (52075420), Fundamental Research Funds for the Central Universities (xzy022023049), and National Key Research and Development Program of China (2023YFB3408600).
Abstract: The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite the numerous data-driven methods for battery SOH estimation reported in existing research, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. These ensemble methods are then examined from the perspectives of weighted ensembles and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the paper emphasizes the need to address challenges in practical applications, such as base model selection, the measurement of model robustness and uncertainty, and the interpretability of ensemble models. Finally, future research prospects are outlined, specifically noting that deep learning ensembles are poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.