As batteries become increasingly essential for energy storage technologies, battery prognosis and diagnosis remain central to ensuring reliable operation and effective management, as well as to aiding the in-depth investigation of degradation mechanisms. However, dynamic operating conditions, cell-to-cell inconsistencies, and the limited availability of labeled data pose significant challenges to accurate and robust prognosis and diagnosis. Herein, we introduce a time-series-decomposition-based ensembled lightweight learning model (TELL-Me), which employs a synergistic dual-module framework to facilitate accurate and reliable forecasting. The feature module formulates features with physical implications and sheds light on battery aging mechanisms, while the gradient module monitors capacity degradation rates and captures the aging trend. TELL-Me achieves high accuracy in end-of-life prediction using minimal historical data from a single battery, without requiring an offline training dataset, and demonstrates impressive generality and robustness across various operating conditions and battery types. Additionally, by correlating feature contributions with degradation mechanisms across different datasets, TELL-Me gains a diagnostic ability that not only enhances prediction reliability but also provides critical insights into the design and optimization of next-generation batteries.
6G is expected to support more intelligent networks, and this trend makes self-healing capability important when degradation emerges in cellular networks. As a primary component of self-healing networks, fault detection is investigated in this paper. Given its fast response and low time and computational cost, the Online Broad Learning System (OBLS) is applied, for the first time, to identify outages in cellular networks. In addition, the Automatic-constructed Online Broad Learning System (AOBLS) is put forward to rationalize its structure and consequently avoid over-fitting and under-fitting. Furthermore, a multi-layer classification structure is proposed to further improve the classification performance. To face the challenges caused by imbalanced data in fault detection problems, a novel weighting strategy is derived to obtain the Multilayer Automatic-constructed Weighted Online Broad Learning System (MAWOBLS), and ensemble learning with a retrained Support Vector Machine (SVM), denoted as EMAWOBLS, is used to handle this imbalance more effectively. Simulation results show that the proposed algorithm detects faults with excellent performance and satisfactory time usage.
Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Transcriptomic, epigenomic, proteomic, and other omics datasets generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis) and reported accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. By integrating bioinformatics and ML approaches on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
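The evaluation measures named in the Methods above (specificity, LR+, LR−, and the Youden index) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the RankXLAN implementation; the function name and the example counts are hypothetical and chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic measures from confusion-matrix counts.

    The counts passed in are illustrative; they are not taken from the study.
    """
    sensitivity = tp / (tp + fn)  # recall, or true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # LR+ = sensitivity / (1 - specificity); infinite when there are no false positives.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # LR- = (1 - sensitivity) / specificity; infinite when specificity is zero.
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    # Youden index J = sensitivity + specificity - 1.
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "youden": youden,
    }


# Hypothetical counts for a 200-sample test set.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

A Youden index near 1 indicates that sensitivity and specificity are both high, which is why it is often reported alongside the two likelihood ratios.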
The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are meticulously analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are comprehensively compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the need to address challenges in practical applications, such as base model selection, measurement of model robustness and uncertainty, and interpretability of ensemble models, is emphasized. Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Deep learning algorithms have been rapidly incorporated into many different applications due to the increase in computational power and the availability of massive amounts of data. Recently, both deep learning and ensemble learning have been used to recognize underlying structures and patterns from high-level features to make predictions and decisions. With their growth in popularity, deep learning and ensemble learning algorithms have received significant attention from both scientists and the industrial community due to their superior ability to learn features from big data. Ensemble deep learning has exhibited significant performance in enhancing learning generalization through the use of multiple deep learning algorithms. Although ensemble deep learning has large quantities of training parameters, which results in time and space overheads, it performs much better than traditional ensemble learning. Ensemble deep learning has been successfully used in several areas, such as bioinformatics, finance, and health care. In this paper, we review and investigate recent ensemble deep learning algorithms and techniques in health care domains, including medical imaging, health care data analytics, genomics, diagnosis, disease prevention, and drug discovery. We cover several widely used deep learning algorithms along with their architectures, including deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs). Common healthcare tasks, such as medical imaging, electronic health records, and genomics, are also demonstrated. Furthermore, in this review, the challenges inherent in reducing the burden on the healthcare system are discussed and explored. Finally, future directions and opportunities for enhancing healthcare model performance are discussed.
Autism spectrum disorder (ASD) is a multifaceted neurological developmental condition that manifests in several ways. Nearly all autistic children remain undiagnosed before the age of three. Developmental problems affecting facial features are often associated with fundamental brain disorders. The facial development of newborns with ASD is quite different from that of typically developing children. Early recognition is very important for helping families and parents overcome superstition and denial. Distinguishing their facial features from those of typically developing children is a direct way to detect children diagnosed with ASD. Presently, artificial intelligence (AI) contributes significantly to the emerging computer-aided diagnosis (CAD) of autism and to the evolving interactive methods that aid in the treatment and reintegration of autistic patients. This study introduces an ensemble of deep learning models based on the autism spectrum disorder detection in facial images (EDLM-ASDDFI) model. The overarching goal of the EDLM-ASDDFI model is to recognize the difference between facial images of individuals with ASD and normal controls. In the EDLM-ASDDFI method, the initial data pre-processing stage is performed with Gabor filtering (GF). The EDLM-ASDDFI technique then applies the MobileNetV2 model to learn complex features from the pre-processed data. For the ASD detection process, the EDLM-ASDDFI method uses an ensemble of classifiers that encompasses long short-term memory (LSTM), deep belief network (DBN), and hybrid kernel extreme learning machine (HKELM). Finally, the hyperparameter selection of the three deep learning (DL) models is carried out using the crested porcupine optimizer (CPO) technique. An extensive experiment was conducted to demonstrate the improved ASD detection performance of the EDLM-ASDDFI method. The simulation outcomes indicated that the EDLM-ASDDFI technique improved on other existing models across numerous performance measures.
Ensemble learning, a pivotal branch of machine learning, amalgamates multiple base models to enhance the overarching performance of predictive models, capitalising on the diversity and collective wisdom of the ensemble to surpass individual models and mitigate overfitting. In this review, a four-layer research framework is established for the research of ensemble learning, which can offer a comprehensive and structured review of ensemble learning from bottom to top. Firstly, this survey commences by introducing fundamental ensemble learning techniques, including bagging, boosting, and stacking, while also exploring the ensemble's diversity. Then, deep ensemble learning and semi-supervised ensemble learning are studied in detail. Furthermore, the utilisation of ensemble learning techniques to navigate challenging datasets, such as imbalanced and high-dimensional data, is discussed. The application of ensemble learning techniques across various research domains, including healthcare, transportation, finance, manufacturing, and the Internet, is also examined. The survey concludes by discussing challenges intrinsic to ensemble learning.
Hepatocellular carcinoma (HCC) remains a leading cause of cancer-related mortality globally, necessitating advanced diagnostic tools to improve early detection and personalized targeted therapy. This review synthesizes evidence on explainable ensemble learning approaches for HCC classification, emphasizing their integration with clinical workflows and multi-omics data. A systematic analysis [including datasets such as The Cancer Genome Atlas, Gene Expression Omnibus, and the Surveillance, Epidemiology, and End Results (SEER) datasets] revealed that explainable ensemble learning models achieve high diagnostic accuracy by combining clinical features, serum biomarkers such as alpha-fetoprotein, imaging features such as computed tomography and magnetic resonance imaging, and genomic data. For instance, SHapley Additive exPlanations (SHAP)-based random forests trained on NCBI GSE14520 microarray data (n = 445) achieved 96.53% accuracy, while stacking ensembles applied to the SEER program data (n = 1897) demonstrated an area under the receiver operating characteristic curve of 0.779 for mortality prediction. Despite promising results, challenges persist, including the computational costs of SHAP and local interpretable model-agnostic explanations analyses (e.g., TreeSHAP requiring distributed computing for metabolomics datasets) and dataset biases (e.g., SEER’s Western population dominance limiting generalizability). Future research must address inter-cohort heterogeneity, standardize explainability metrics, and prioritize lightweight surrogate models for resource-limited settings. This review presents the potential of explainable ensemble learning frameworks to bridge the gap between predictive accuracy and clinical interpretability, though rigorous validation in independent, multi-center cohorts is critical for real-world deployment.
This paper proposes a novel hybrid fraud detection framework that integrates multi-stage feature selection, unsupervised clustering, and ensemble learning to improve classification performance in financial transaction monitoring systems. The framework is structured into three core layers: (1) feature selection using Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and Mutual Information (MI) to reduce dimensionality and enhance input relevance; (2) anomaly detection through unsupervised clustering using K-Means, Density-Based Spatial Clustering (DBSCAN), and Hierarchical Clustering to flag suspicious patterns in unlabeled data; and (3) final classification using a voting-based hybrid ensemble of Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Classifier (GBC). The experimental evaluation is conducted on a synthetically generated dataset comprising one million financial transactions, with 5% labelled as fraudulent, simulating realistic fraud rates and behavioural features, including transaction time, origin, amount, and geo-location. The proposed model demonstrated a significant improvement over baseline classifiers, achieving an accuracy of 99%, a precision of 99%, a recall of 97%, and an F1-score of 99%. Compared to individual models, it yielded a 9% gain in overall detection accuracy and reduced the false positive rate to below 3.5%, thereby minimising the operational costs associated with manually reviewing false alerts. The model’s interpretability is enhanced by the integration of Shapley Additive Explanations (SHAP) values for feature importance, supporting transparency and regulatory auditability. These results affirm the practical relevance of the proposed system for deployment in real-time fraud detection scenarios such as credit card transactions, mobile banking, and cross-border payments. The study also highlights future directions, including the deployment of lightweight models and the integration of multimodal data for scalable fraud analytics.
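The third layer described above combines three classifiers by voting. The abstract does not state whether the vote is hard or soft, so the sketch below assumes hard (majority) voting; the function name and the toy label lists are hypothetical stand-ins for real SVM, RF, and GBC outputs.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by majority vote.

    predictions: list of lists, one list of predicted labels per model,
    all of the same length. Hard voting is an assumption here; the
    abstract only says "voting-based".
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Count the labels that the individual models assigned to sample i.
        votes = Counter(model_preds[i] for model_preds in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy labels (1 = fraudulent, 0 = legitimate) standing in for model outputs.
svm_preds = [0, 1, 0, 0, 1]
rf_preds = [0, 1, 1, 0, 0]
gbc_preds = [1, 1, 0, 0, 1]
ensemble_preds = majority_vote([svm_preds, rf_preds, gbc_preds])
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of base classifiers for binary fraud labels.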
Glaucoma, a chronic eye disease affecting millions worldwide, poses a substantial threat to eyesight and can result in permanent vision loss if left untreated. Manual identification of glaucoma is a complicated and time-consuming practice that requires specialized expertise, and its results may be subjective. To address these challenges, this research proposes a computer-aided diagnosis (CAD) approach using Artificial Intelligence (AI) techniques for binary and multiclass classification of glaucoma stages. An ensemble fusion mechanism that combines the outputs of three pre-trained convolutional neural network (ConvNet) models, ResNet-50, VGG-16, and InceptionV3, is utilized in this paper. This fusion technique enhances diagnostic accuracy and robustness by ensemble-averaging the predictions from individual models, leveraging their complementary strengths. The objective of this work is to assess the model’s capability for early-stage glaucoma diagnosis. Classification is performed on a dataset collected from the Harvard Dataverse repository. With the proposed technique, for Normal vs. Advanced glaucoma classification, a validation accuracy of 98.04% and a testing accuracy of 98.03% are achieved, with a specificity of 100%, which outperforms state-of-the-art methods. For multiclass classification, the suggested ensemble approach achieved a precision and sensitivity of 97%, and a specificity and testing accuracy of 98.57% and 96.82%, respectively. The proposed E-GlauNet model has significant potential in assisting ophthalmologists in the screening and fast diagnosis of glaucoma, leading to more reliable, efficient, and timely diagnosis, particularly for early-stage detection and staging of the disease. While the proposed method demonstrates high accuracy and robustness, the study is limited by its evaluation on a single dataset. Future work will focus on external validation across diverse datasets and enhancing interpretability using explainable AI techniques.
Artificial intelligence (AI) serves as a key technology in global industrial transformation and technological restructuring and as the core driver of the fourth industrial revolution. Currently, deep learning techniques, such as convolutional neural networks, enable intelligent information collection in fields such as tongue and pulse diagnosis owing to their robust feature-processing capabilities. Natural language processing models, including long short-term memory and transformers, have been applied to traditional Chinese medicine (TCM) for diagnosis, syndrome differentiation, and prescription generation. Traditional machine learning algorithms, such as neural networks, support vector machines, and random forests, are also widely used in TCM diagnosis and treatment because of their strong regression and classification performance on small structured datasets. Future research on AI in TCM diagnosis and treatment may emphasize building large-scale, high-quality TCM datasets with unified criteria based on syndrome elements; identifying algorithms suited to TCM theoretical data distributions; and leveraging AI multimodal fusion and ensemble learning techniques for diverse raw features, such as images, text, and manually processed structured data, to increase the clinical efficacy of TCM diagnosis and treatment.
The biomass and coal co-pyrolysis (BCP) technology combines the advantages of both resources, achieving efficient resource complementarity, reducing reliance on coal, and minimizing pollutant emissions. However, this process still encounters numerous challenges in attaining optimal economic and environmental performance. Therefore, an ensemble learning (EL) framework is proposed for the BCP process in this study to optimize the synergistic benefits while minimizing negative environmental impacts. Six different ensemble learning models are developed to investigate the impact of input features, such as biomass characteristics, coal characteristics, and pyrolysis conditions, on the product profit and CO₂ emissions of the BCP process. The Optuna method is further employed to automatically optimize the hyperparameters of the BCP process models to enhance their predictive accuracy and robustness. The results indicate that the categorical boosting (CAB) model of the BCP process demonstrates exceptional performance in accurately predicting product profit and CO₂ emissions (R² > 0.92) under five-fold cross-validation. To enhance the interpretability of this preferred model, Shapley additive explanations and partial dependence plot analyses are conducted to evaluate the impact and importance of biomass characteristics, coal characteristics, and pyrolysis conditions on the product profitability and CO₂ emissions of the BCP process. Finally, the preferred model is coupled with a reference vector guided evolutionary algorithm to identify the optimal conditions for maximizing the product profit of the BCP process while minimizing CO₂ emissions. The results indicate that the optimal BCP process can achieve high product profits (5290.85 CNY·t⁻¹) and low CO₂ emissions (7.45 kg·t⁻¹).
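The R² > 0.92 figure above is reported under five-fold cross-validation. As a rough illustration of those two ingredients only, the sketch below computes the coefficient of determination and generates contiguous fold indices in plain Python; it is not the study's pipeline, the helper names are hypothetical, and the example values are made up.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


# Illustrative use: 10 samples split into 5 folds of 2 test samples each.
folds = list(k_fold_indices(10, k=5))
perfect_fit = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only_fit = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

In a full workflow, a model would be fitted on each `train_idx` and scored with `r2_score` on the matching `test_idx`, with the five scores then averaged.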
Breast cancer is among the leading causes of cancer mortality globally, and its diagnosis through histopathological image analysis is often prone to inter-observer variability and misclassification. Existing machine learning (ML) methods struggle with intra-class heterogeneity and inter-class similarity, necessitating more robust classification models. This study presents a hybrid model that combines deep feature extraction using deep learning (DL), an ensemble of ML classifiers, and Bat Swarm Optimization (BSO) for hyperparameter optimization to improve breast cancer histopathology (BCH) image classification. A dataset of 804 Hematoxylin and Eosin (H&E) stained images classified into Benign, in situ, Invasive, and Normal categories (ICIAR2018_BACH_Challenge) was used. ResNet50 was used for feature extraction, while Support Vector Machine (SVM), Random Forest (RF), XGBoost (XGB), Decision Tree (DT), and AdaBoost (ADB) classifiers were used for classification. BSO was applied for hyperparameter optimization in a soft voting ensemble approach. Accuracy, precision, recall, specificity, F1-score, Receiver Operating Characteristic (ROC), and Precision-Recall (PR) were used as model performance metrics. The ensemble model outperformed individual classifiers, achieving greater accuracy (~90.0%), precision (~86.4%), recall (~86.3%), and specificity (~96.6%). The robustness of the model was verified by both ROC and PR curves, which showed AUC values of 1.00, 0.99, and 0.98 for Benign, Invasive, and in situ instances, respectively. This ensemble model delivers a strong and clinically valid methodology for breast cancer classification that enhances precision and minimizes diagnostic errors. Future work should focus on explainable AI, multi-modal fusion, few-shot learning, and edge computing for real-world deployment.
Objective: This study investigates the auxiliary role of resting-state electroencephalography (EEG) in the clinical diagnosis of attention-deficit hyperactivity disorder (ADHD) using machine learning techniques. Methods: Resting-state EEG recordings were obtained from 57 children, comprising 28 typically developing children and 29 children diagnosed with ADHD. The EEG signal data from both groups were analyzed. To ensure analytical accuracy, artifacts and noise in the EEG signals were removed using the EEGLAB toolbox within the MATLAB environment. Following preprocessing, a comparative analysis was conducted using various ensemble learning algorithms, including AdaBoost, GBM, LightGBM, RF, XGB, and CatBoost. Model performance was systematically evaluated and optimized, validating the superior efficacy of ensemble learning approaches in identifying ADHD. Conclusion: Applying machine learning techniques to extract features from resting-state EEG signals enabled the development of effective ensemble learning models. Differential entropy and energy features across multiple frequency bands proved particularly valuable for these models. This approach significantly enhances the detection rate of ADHD in children, demonstrating high diagnostic efficacy and sensitivity, and providing a promising tool for clinical application.
The rapid increase in the number of Internet of Things (IoT) devices, coupled with a rise in sophisticated cyberattacks, demands robust intrusion detection systems. This study presents a holistic, intelligent intrusion detection system. It uses a combined method that integrates machine learning (ML) and deep learning (DL) techniques to improve the protection of contemporary information technology (IT) systems. Unlike traditional signature-based or single-model methods, this system integrates the strengths of ensemble learning for binary classification and deep learning for multi-class classification. This combination provides a more nuanced and adaptable defense. The research utilizes the NF-UQ-NIDS-v2 dataset, a recent, comprehensive benchmark for evaluating network intrusion detection systems (NIDS). Our methodological framework employs advanced artificial intelligence techniques. Specifically, we use ensemble learning algorithms (Random Forest, Gradient Boosting, AdaBoost, and XGBoost) for binary classification. Deep learning architectures are also employed to address the complexities of multi-class classification, allowing for fine-grained identification of intrusion types. To mitigate class imbalance, a common problem in multi-class intrusion detection that biases model performance, we use oversampling and data augmentation. These techniques ensure equitable class representation. The results demonstrate the efficacy of the proposed hybrid ML-DL system. It achieves significant improvements in intrusion detection accuracy and reliability. This research contributes substantively to cybersecurity by providing a more robust and adaptable intrusion detection solution.
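The abstract above mentions oversampling to mitigate class imbalance but does not say which variant is used. As one simple possibility, the sketch below implements random oversampling, which duplicates minority-class samples until every class matches the largest one; the function name, feature values, and class labels are hypothetical and not drawn from NF-UQ-NIDS-v2.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size.

    Random oversampling is an assumed variant; the abstract only says "oversampling".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy flows with made-up single-feature vectors and labels.
X = [[0.1], [0.2], [0.3], [0.4], [0.9], [1.5]]
y = ["benign", "benign", "benign", "benign", "dos", "scan"]
X_bal, y_bal = random_oversample(X, y)
counts = Counter(y_bal)
```

Oversampling should be applied to the training split only; duplicating samples before the train/test split would leak copies of test samples into training.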
Identifying druggable proteins, which are capable of binding therapeutic compounds, remains a critical and resource-intensive challenge in drug discovery. To address this, we propose CEL-IDP (Comparison of Ensemble Learning Methods for Identification of Druggable Proteins), a computational framework combining three feature extraction methods, Dipeptide Deviation from Expected Mean (DDE), Enhanced Amino Acid Composition (EAAC), and Enhanced Grouped Amino Acid Composition (EGAAC), with ensemble learning strategies (Bagging, Boosting, Stacking) to classify druggable proteins from sequence data. DDE captures dipeptide frequency deviations, EAAC encodes positional amino acid information, and EGAAC groups residues by physicochemical properties to generate discriminative feature vectors. These features were analyzed using ensemble models to overcome the limitations of single classifiers. EGAAC outperformed DDE and EAAC, with Random Forest (Bagging) and XGBoost (Boosting) achieving the highest accuracy of 71.66%, demonstrating superior performance in capturing critical biochemical patterns. Stacking showed intermediate results (68.33%), while EAAC and DDE-based models yielded lower accuracies (56.66%–66.87%). CEL-IDP streamlines large-scale druggability prediction, reduces reliance on costly experimental screening, and aligns with global initiatives like Target 2035 to expand actionable drug targets. This work advances machine learning-driven drug discovery by systematizing feature engineering and ensemble model optimization, providing a scalable workflow to accelerate target identification and validation.
The potential applications of multimodal physiological signals in healthcare, pain monitoring, and clinical decision support systems have garnered significant attention in biomedical research. Subjective self-reporting is the foundation of conventional pain assessment methods, which may be unreliable. Deep learning is a promising alternative to resolve this limitation through automated pain classification. This paper proposes an ensemble deep-learning framework for pain assessment. The framework makes use of features collected from electromyography (EMG), skin conductance level (SCL), and electrocardiography (ECG) signals. We integrate Convolutional Neural Networks (CNN), Long Short-Term Memory Networks (LSTM), Bidirectional Gated Recurrent Units (BiGRU), and Deep Neural Networks (DNN) models. We then aggregate their predictions using a weighted averaging ensemble technique to increase the classification’s robustness. To improve computing efficiency and remove redundant features, we use Particle Swarm Optimization (PSO) for feature selection. This enables us to reduce the features’ dimensionality without sacrificing the classification’s accuracy. With improved accuracy, precision, recall, and F1-score across all pain levels, the experimental results show that the suggested ensemble model performs better than individual deep learning classifiers. In our experiments, the suggested model achieved over 98% accuracy, suggesting promising automated pain assessment performance. However, due to differences in validation protocols, comparisons with previous studies are still limited. Combining deep learning and feature selection techniques significantly improves model generalization, reducing overfitting and enhancing classification performance. The evaluation was conducted using the BioVid Heat Pain Dataset, confirming the model’s effectiveness in distinguishing between different pain intensity levels.
Accurately evaluating the safety status of lithium-ion battery systems in electric vehicles is imperative because potential battery failure risks are difficult to predict under stochastic operating profiles. Complex battery fault mechanisms and limited, poor-quality data collection impede fault detection for battery systems under real-world conditions. This paper proposes a novel graph-guided fault detection method designed to recognize concealed anomalies in realistic data. Graphs guided by physical relationships are constructed to learn the dynamic evolution of physical quantities under normal conditions and their potential change characteristics in fault scenarios. An ensemble Graph Sample and Aggregate Network model is developed to tackle sample distribution imbalances and non-uniform battery system specifications across vehicles. Failure risk probabilities for diverse battery charging and discharging segments are derived. An ablation study verifies the necessity of ensemble learning in addressing imbalanced datasets. Analysis of 102,095 segments across 86 vehicles with different battery material systems, battery capacities, and numbers of cells and temperature sensors confirms the robustness and generalization of the proposed method, yielding a recall of 98.37%. By introducing the graph, spatio-temporal global fault characteristics of battery systems are automatically extracted. The coupling relationship and evolution of physical quantities under both normal and faulty states are established, effectively uncovering fault information hidden in collected battery data without observable anomalies. The safety state of battery systems is reflected in terms of failure risk probability, providing reliable data support for battery system maintenance.
Tailings produced by mining and ore smelting are a major source of soil pollution. Understanding the speciation of heavy metals (HMs) in tailings is essential for soil remediation and sustainable development. Given the complex and time-consuming nature of traditional sequential laboratory extraction methods for determining the forms of HMs in tailings, a rapid and precise identification approach is urgently required. To address this issue, a general empirical prediction method for HM occurrence was developed using machine learning (ML). The compositional information of the tailings, properties of the HMs, and sequential extraction steps were used as inputs to calculate the percentages of the seven forms of HMs. After the models were tuned and compared, extreme gradient boosting, gradient boosting decision tree, and categorical boosting were found to be the top three performing ML models, with coefficient of determination (R²) values on the testing set exceeding 0.859. Feature importance analysis for these three models indicated that electronegativity was the most important factor affecting the occurrence of HMs, with an average feature importance of 0.4522. The subsequent use of stacking as a model integration method further improved the ability of the ML models to predict HM occurrence forms, increasing R² to 0.879. Overall, this study developed a robust technique for predicting the occurrence forms of HMs in tailings and provides an important reference for the environmental assessment and recycling of tailings.
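The coefficient of determination quoted in the abstract above can be illustrated with a minimal sketch. The function name `r_squared` and the sample values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical percentages of one HM form in four samples (illustrative only).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(round(r_squared(measured, predicted), 3))
```

A value close to 1 means the predictions explain most of the variance in the measurements; the thresholds reported in the abstract (0.859 and 0.879) are read on this same scale.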
Cloud computing (CC) provides infrastructure, storage services, and applications to users, all of which should be secured by appropriate procedures or policies. Security in the cloud environment is essential to safeguard infrastructure and user information from unauthorized access through timely intrusion detection systems (IDS). Ensemble learning harnesses the collective power of multiple machine learning (ML) methods, and a feature selection (FS) process helps improve the robustness and overall precision of intrusion detection. Therefore, this article presents a meta-heuristic feature selection by ensemble learning-based anomaly detection (MFS-ELAD) algorithm for CC platforms. To realize this objective, the proposed approach first applies a min-max standardization technique. Then, the dimensionality of the feature set is reduced by the Prairie Dogs Optimizer (PDO) algorithm. For the recognition procedure, the MFS-ELAD method uses a group of three deep learning (DL) techniques: sparse auto-encoder (SAE), stacked long short-term memory (SLSTM), and Elman neural network (ENN). Finally, the parameters of the DL algorithms are fine-tuned using the sand cat swarm optimizer (SCSO), which helps improve the recognition outcomes. Simulation of the MFS-ELAD system on the CSE-CIC-IDS2018 dataset shows promising performance compared with other methods, with a maximum precision of 99.71%.
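The min-max step mentioned in the abstract above can be sketched as follows. This is a minimal illustration that rescales one feature column to the range [0, 1]; the function name `min_max_scale` and the handling of constant columns are assumptions, not details from the paper.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros
        # (an assumption made here to avoid division by zero).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical packet counts for three network flows.
print(min_max_scale([2, 4, 6]))
```

Scaling every feature to a common range keeps features with large raw magnitudes from dominating distance-based or gradient-based learners.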
Funding: Supported by the National Natural Science Foundation of China (22379021 and 22479021).
Abstract: As batteries become increasingly essential for energy storage technologies, battery prognosis and diagnosis remain central to ensuring reliable operation and effective management, as well as to aiding the in-depth investigation of degradation mechanisms. However, dynamic operating conditions, cell-to-cell inconsistencies, and the limited availability of labeled data pose significant challenges to accurate and robust prognosis and diagnosis. Herein, we introduce a time-series-decomposition-based ensembled lightweight learning model (TELL-Me), which employs a synergistic dual-module framework to facilitate accurate and reliable forecasting. The feature module formulates features with physical implications and sheds light on battery aging mechanisms, while the gradient module monitors capacity degradation rates and captures the aging trend. TELL-Me achieves high accuracy in end-of-life prediction using minimal historical data from a single battery, without requiring an offline training dataset, and demonstrates impressive generality and robustness across various operating conditions and battery types. Additionally, by correlating feature contributions with degradation mechanisms across different datasets, TELL-Me gains a diagnostic ability that not only enhances prediction reliability but also provides critical insights into the design and optimization of next-generation batteries.
Funding: Supported in part by the National Key Research and Development Project under Grant 2020YFB1806805; partially funded through a grant from Qualcomm.
Abstract: 6G is expected to support more intelligent networks, and this trend places importance on self-healing capability when degradation emerges in cellular networks. As a primary component of self-healing networks, fault detection is investigated in this paper. Considering its fast response and low time and computational consumption, the Online Broad Learning System (OBLS) is applied for the first time to identify outages in cellular networks. In addition, the Automatic-constructed Online Broad Learning System (AOBLS) is put forward to rationalize its structure and consequently avoid over-fitting and under-fitting. Furthermore, a multi-layer classification structure is proposed to further improve the classification performance. To face the challenges caused by imbalanced data in fault detection problems, a novel weighting strategy is derived to achieve the Multilayer Automatic-constructed Weighted Online Broad Learning System (MAWOBLS), which is combined through ensemble learning with a retrained Support Vector Machine (SVM), denoted as EMAWOBLS, to better handle this imbalance issue. Simulation results show that the proposed algorithm has excellent performance in detecting faults with satisfactory time usage.
Funding: The Deanship of Research and Graduate Studies at King Khalid University, KSA, funded this work through the Large Research Project under grant number RGP2/164/46.
Abstract: Background: Stomach cancer (SC) is one of the most lethal malignancies worldwide due to late-stage diagnosis and limited treatment. Omics datasets (transcriptomic, epigenomic, proteomic, and others) generated by high-throughput sequencing technology have become prominent in biomedical research, and they reveal molecular aspects of cancer diagnosis and therapy. Despite the development of advanced sequencing technology, the high dimensionality of multi-omics data makes the data challenging to interpret. Methods: In this study, we introduce RankXLAN, an explainable ensemble-based multi-omics framework that integrates feature selection (FS), ensemble learning, bioinformatics, and in-silico validation for robust biomarker detection, identification of potential therapeutic drug-repurposing candidates, and classification of SC. To enhance the interpretability of the model, we incorporated explainable artificial intelligence (SHapley Additive exPlanations analysis), and we report accuracy, precision, F1-score, recall, cross-validation, specificity, likelihood ratio (LR)+, LR−, and Youden index results. Results: The experimental results showed that the top four FS algorithms achieved improved results when applied to the ensemble learning classification model. The proposed ensemble model produced an area under the curve (AUC) score of 0.994 for gene expression, 0.97 for methylation, and 0.96 for miRNA expression data. Through the integration of bioinformatics and an ML approach on the transcriptomic and epigenomic multi-omics dataset, we identified potential marker genes, namely UBE2D2, HPCAL4, IGHA1, DPT, and FN3K. In-silico molecular docking revealed a strong binding affinity between ANKRD13C and the FDA-approved drug Everolimus (binding affinity −10.1 kcal/mol), identifying ANKRD13C as a potential therapeutic drug-repurposing target for SC. Conclusion: The proposed framework RankXLAN outperforms other existing frameworks for serum biomarker identification, therapeutic target identification, and SC classification with multi-omics datasets.
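The likelihood ratios and Youden index listed among the evaluation measures above follow directly from a binary confusion matrix. The sketch below uses the standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not results from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, LR+, LR- and Youden index.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    # LR+ = sensitivity / (1 - specificity); infinite if no false positives.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # LR- = (1 - sensitivity) / specificity; infinite if specificity is zero.
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "youden": youden,
    }


# Hypothetical counts: 100 diseased and 100 healthy samples.
for name, value in diagnostic_metrics(tp=90, fp=20, tn=80, fn=10).items():
    print(f"{name}: {value:.3f}")
```

A larger LR+ and a smaller LR− both indicate a more informative test, while the Youden index summarizes sensitivity and specificity in a single number between −1 and 1.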
Funding: National Natural Science Foundation of China (52075420); Fundamental Research Funds for the Central Universities (xzy022023049); National Key Research and Development Program of China (2023YFB3408600).
Abstract: The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the paper emphasizes challenges in practical applications, such as base model selection, measurement of model robustness and uncertainty, and interpretability of ensemble models. Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Funding: Funded by Taif University, Saudi Arabia, project No. (TU-DSPP-2024-263).
Abstract: Deep learning algorithms have been rapidly incorporated into many different applications due to the increase in computational power and the availability of massive amounts of data. Recently, both deep learning and ensemble learning have been used to recognize underlying structures and patterns from high-level features to make predictions and decisions. With their growing popularity, deep learning and ensemble learning algorithms have received significant attention from both scientists and the industrial community due to their superior ability to learn features from big data. Ensemble deep learning has exhibited significant performance in enhancing learning generalization through the use of multiple deep learning algorithms. Although ensemble deep learning has large quantities of training parameters, which results in time and space overheads, it performs much better than traditional ensemble learning. Ensemble deep learning has been successfully used in several areas, such as bioinformatics, finance, and health care. In this paper, we review and investigate recent ensemble deep learning algorithms and techniques in health care domains: medical imaging, health care data analytics, genomics, diagnosis, disease prevention, and drug discovery. We cover several widely used deep learning algorithms along with their architectures, including deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs). Common healthcare tasks, such as medical imaging, electronic health records, and genomics, are also demonstrated. Furthermore, the challenges inherent in reducing the burden on the healthcare system are discussed and explored. Finally, future directions and opportunities for enhancing healthcare model performance are discussed.
Funding: Researchers Supporting Project number (RSPD2025R1107), King Saud University, Riyadh, Saudi Arabia.
Abstract: Autism spectrum disorder (ASD) is a multifaceted neurological developmental condition that manifests in several ways. Nearly all autistic children remain undiagnosed before the age of three. Developmental problems affecting facial features are often associated with fundamental brain disorders. The facial development of newborns with ASD is quite different from that of typically developing children. Early recognition is important for supporting families and parents who may respond with superstition or denial. Distinguishing their facial features from those of typically developing children is a straightforward way to detect children diagnosed with ASD. Presently, artificial intelligence (AI) contributes significantly to the emerging computer-aided diagnosis (CAD) of autism and to the evolving interactive methods that aid in the treatment and reintegration of autistic patients. This study introduces an Ensemble of deep learning models based on the autism spectrum disorder detection in facial images (EDLM-ASDDFI) model. The overarching goal of the EDLM-ASDDFI model is to recognize the difference between facial images of individuals with ASD and normal controls. In the EDLM-ASDDFI method, the first stage of data pre-processing is performed by Gabor filtering (GF). The EDLM-ASDDFI technique then applies the MobileNetV2 model to learn complex features from the pre-processed data. For the ASD detection process, the EDLM-ASDDFI method uses an ensemble for classification that encompasses long short-term memory (LSTM), deep belief network (DBN), and hybrid kernel extreme learning machine (HKELM). Finally, the hyperparameters of the three deep learning (DL) models are selected using the crested porcupine optimizer (CPO) technique. An extensive experiment was conducted to demonstrate the improved ASD detection performance of the EDLM-ASDDFI method. The simulation outcomes indicated that the EDLM-ASDDFI technique outperformed other existing models on numerous performance measures.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 92467109 and U21A20478), the National Key R&D Program of China (2023YFA1011601), and the Major Key Project of PCL (Grant PCL2024A05).
Abstract: Ensemble learning, a pivotal branch of machine learning, amalgamates multiple base models to enhance the overall performance of predictive models, capitalising on the diversity and collective wisdom of the ensemble to surpass individual models and mitigate overfitting. In this review, a four-layer research framework is established for ensemble learning, offering a comprehensive and structured review of the field from bottom to top. First, the survey introduces fundamental ensemble learning techniques, including bagging, boosting, and stacking, while also exploring the ensemble's diversity. Then, deep ensemble learning and semi-supervised ensemble learning are studied in detail. Furthermore, the use of ensemble learning techniques to handle challenging datasets, such as imbalanced and high-dimensional data, is discussed. The application of ensemble learning techniques across various research domains, including healthcare, transportation, finance, manufacturing, and the Internet, is also examined. The survey concludes by discussing challenges intrinsic to ensemble learning.
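Of the fundamental techniques named in the abstract above, bagging is the simplest to sketch: train each base model on a bootstrap resample of the data and combine predictions by majority vote. The example below is a minimal illustration with a one-dimensional threshold classifier as the base learner; the function names, the toy data, and the choice of base learner are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in points}):
        errors = sum((x >= threshold) != bool(y) for x, y in points)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(points, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        thresholds.append(fit_stump(sample))
    return thresholds


def bagging_predict(thresholds, x):
    """Combine the stumps by majority vote (n_models is odd, so no ties)."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Toy data: small x values belong to class 0, large ones to class 1.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 1), bagging_predict(models, 9))
```

Boosting and stacking differ in how the base models are trained and combined (sequential reweighting and a learned meta-model, respectively), but the vote-over-diverse-models idea shown here is common to all three.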
Abstract: Hepatocellular carcinoma (HCC) remains a leading cause of cancer-related mortality globally, necessitating advanced diagnostic tools to improve early detection and personalized targeted therapy. This review synthesizes evidence on explainable ensemble learning approaches for HCC classification, emphasizing their integration with clinical workflows and multi-omics data. A systematic analysis [including datasets such as The Cancer Genome Atlas, Gene Expression Omnibus, and the Surveillance, Epidemiology, and End Results (SEER) datasets] revealed that explainable ensemble learning models achieve high diagnostic accuracy by combining clinical features, serum biomarkers such as alpha-fetoprotein, imaging features such as computed tomography and magnetic resonance imaging, and genomic data. For instance, SHapley Additive exPlanations (SHAP)-based random forests trained on NCBI GSE14520 microarray data (n=445) achieved 96.53% accuracy, while stacking ensembles applied to the SEER program data (n=1897) demonstrated an area under the receiver operating characteristic curve of 0.779 for mortality prediction. Despite promising results, challenges persist, including the computational costs of SHAP and local interpretable model-agnostic explanations analyses (e.g., TreeSHAP requiring distributed computing for metabolomics datasets) and dataset biases (e.g., SEER’s Western population dominance limiting generalizability). Future research must address inter-cohort heterogeneity, standardize explainability metrics, and prioritize lightweight surrogate models for resource-limited settings. This review presents the potential of explainable ensemble learning frameworks to bridge the gap between predictive accuracy and clinical interpretability, though rigorous validation in independent, multi-center cohorts is critical for real-world deployment.
Funding: Funded by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. KFU241683].
Abstract: This paper proposes a novel hybrid fraud detection framework that integrates multi-stage feature selection, unsupervised clustering, and ensemble learning to improve classification performance in financial transaction monitoring systems. The framework is structured into three core layers: (1) feature selection using Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and Mutual Information (MI) to reduce dimensionality and enhance input relevance; (2) anomaly detection through unsupervised clustering using K-Means, Density-Based Spatial Clustering (DBSCAN), and Hierarchical Clustering to flag suspicious patterns in unlabeled data; and (3) final classification using a voting-based hybrid ensemble of Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Classifier (GBC). The experimental evaluation is conducted on a synthetically generated dataset comprising one million financial transactions, with 5% labelled as fraudulent, simulating realistic fraud rates and behavioural features, including transaction time, origin, amount, and geo-location. The proposed model demonstrated a significant improvement over baseline classifiers, achieving an accuracy of 99%, a precision of 99%, a recall of 97%, and an F1-score of 99%. Compared to individual models, it yielded a 9% gain in overall detection accuracy. It also reduced the false positive rate to below 3.5%, thereby minimising the operational costs associated with manually reviewing false alerts. The model’s interpretability is enhanced by the integration of Shapley Additive Explanations (SHAP) values for feature importance, supporting transparency and regulatory auditability. These results affirm the practical relevance of the proposed system for deployment in real-time fraud detection scenarios such as credit card transactions, mobile banking, and cross-border payments. The study also highlights future directions, including the deployment of lightweight models and the integration of multimodal data for scalable fraud analytics.
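Mutual Information, one of the feature selection criteria named in the first layer above, scores how much knowing a feature reduces uncertainty about the label. The sketch below computes it for discrete variables from empirical counts, in nats. The function name `mutual_information` and the toy transaction data are hypothetical; the paper's own estimator is not described in the abstract.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in nats for discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical binary feature ("foreign origin") against a fraud label.
is_fraud = [0, 0, 1, 1]
informative_feature = [0, 0, 1, 1]    # perfectly aligned with the label
uninformative_feature = [0, 1, 0, 1]  # independent of the label
print(mutual_information(informative_feature, is_fraud))
print(mutual_information(uninformative_feature, is_fraud))
```

Features would then be ranked by this score and the lowest-scoring ones dropped; continuous features such as transaction amount would need binning or a different estimator first.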
Funding: Funded by the Department of Robotics and Mechatronics Engineering, Kennesaw State University, Marietta, GA 30060, USA.
Abstract: Glaucoma, a chronic eye disease affecting millions worldwide, poses a substantial threat to eyesight and can result in permanent vision loss if left untreated. Manual identification of glaucoma is a complicated and time-consuming practice that requires specialized expertise, and the results may be subjective. To address these challenges, this research proposes a computer-aided diagnosis (CAD) approach using Artificial Intelligence (AI) techniques for binary and multiclass classification of glaucoma stages. The paper uses an ensemble fusion mechanism that combines the outputs of three pre-trained convolutional neural network (ConvNet) models: ResNet-50, VGG-16, and InceptionV3. This fusion technique enhances diagnostic accuracy and robustness by ensemble-averaging the predictions from individual models, leveraging their complementary strengths. The objective of this work is to assess the model’s capability for early-stage glaucoma diagnosis. Classification is performed on a dataset collected from the Harvard Dataverse repository. With the proposed technique, Normal vs. Advanced glaucoma classification achieves a validation accuracy of 98.04% and a testing accuracy of 98.03%, with a specificity of 100%, which outperforms state-of-the-art methods. For multiclass classification, the suggested ensemble approach achieved a precision and sensitivity of 97%, and a specificity and testing accuracy of 98.57% and 96.82%, respectively. The proposed E-GlauNet model has significant potential in assisting ophthalmologists in the screening and fast diagnosis of glaucoma, leading to more reliable, efficient, and timely diagnosis, particularly for early-stage detection and staging of the disease. While the proposed method demonstrates high accuracy and robustness, the study is limited by evaluation on a single dataset. Future work will focus on external validation across diverse datasets and on enhancing interpretability using explainable AI techniques.
Funding: Supported by grants from the National Natural Science Foundation of China (Key Program) (No. 82230124); the Traditional Chinese Medicine Inheritance and Innovation “Ten million” talent project, Qihuang Project Chief Scientist Project (No. 0201000401); the State Administration of Traditional Chinese Medicine 2nd National Traditional Chinese Medicine Inheritance Studio Construction Project (Official Letter of the State Office of Traditional Chinese Medicine [2022] No. 245); and the National Natural Science Foundation of China (General Program) (No. 81974556).
Abstract: Artificial intelligence (AI) serves as a key technology in global industrial transformation and technological restructuring and as the core driver of the fourth industrial revolution. Currently, deep learning techniques, such as convolutional neural networks, enable intelligent information collection in fields such as tongue and pulse diagnosis owing to their robust feature-processing capabilities. Natural language processing models, including long short-term memory and transformers, have been applied to traditional Chinese medicine (TCM) for diagnosis, syndrome differentiation, and prescription generation. Traditional machine learning algorithms, such as neural networks, support vector machines, and random forests, are also widely used in TCM diagnosis and treatment because of their strong regression and classification performance on small structured datasets. Future research on AI in TCM diagnosis and treatment may emphasize building large-scale, high-quality TCM datasets with unified criteria based on syndrome elements; identifying algorithms suited to TCM theoretical data distributions; and leveraging AI multimodal fusion and ensemble learning techniques for diverse raw features, such as images, text, and manually processed structured data, to increase the clinical efficacy of TCM diagnosis and treatment.
Funding: Support from the National Natural Science Foundation of China (22108052).
Abstract: The biomass and coal co-pyrolysis (BCP) technology combines the advantages of both resources, achieving efficient resource complementarity, reducing reliance on coal, and minimizing pollutant emissions. However, this process still encounters numerous challenges in attaining optimal economic and environmental performance. Therefore, an ensemble learning (EL) framework is proposed for the BCP process in this study to optimize the synergistic benefits while minimizing negative environmental impacts. Six different ensemble learning models are developed to investigate the impact of input features, such as biomass characteristics, coal characteristics, and pyrolysis conditions, on the product profit and CO₂ emissions of BCP processes. The Optuna method is further employed to automatically optimize the hyperparameters of the BCP process models to enhance their predictive accuracy and robustness. The results indicate that the categorical boosting (CAB) model of the BCP process predicts product profit and CO₂ emissions accurately (R² > 0.92) under five-fold cross-validation. To enhance the interpretability of this preferred model, Shapley additive explanations and partial dependence plot analyses are conducted to evaluate the impact and importance of biomass characteristics, coal characteristics, and pyrolysis conditions on the product profitability and CO₂ emissions of BCP processes. Finally, the preferred model is coupled with a reference vector guided evolutionary algorithm to identify the optimal conditions for maximizing the product profit of the BCP process while minimizing CO₂ emissions. The results indicate that the optimal BCP process can achieve high product profits (5290.85 CNY·t⁻¹) and low CO₂ emissions (7.45 kg·t⁻¹).
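The five-fold cross-validation mentioned in the abstract above partitions the samples into five folds and uses each fold once as the held-out test set. The sketch below generates the index splits only; the function name `k_fold_indices` is hypothetical, and the absence of shuffling is a simplification rather than a detail from the study.

```python
def k_fold_indices(n_samples, k=5):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Ten hypothetical samples split into five folds of two samples each.
for train_idx, test_idx in k_fold_indices(10, k=5):
    print("train:", train_idx, "test:", test_idx)
```

A model would be fitted on each training split and scored on the matching test split, with the five scores averaged to give the cross-validated estimate.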
Abstract: Breast cancer is among the leading causes of cancer mortality globally, and its diagnosis through histopathological image analysis is often prone to inter-observer variability and misclassification. Existing machine learning (ML) methods struggle with intra-class heterogeneity and inter-class similarity, necessitating more robust classification models. This study presents a hybrid model that combines deep learning (DL) feature extraction, an ensemble of ML classifiers, and Bat Swarm Optimization (BSO) hyperparameter optimization to improve breast cancer histopathology (BCH) image classification. A dataset of 804 Hematoxylin and Eosin (H&E) stained images classified into Benign, in situ, Invasive, and Normal categories (ICIAR2018_BACH_Challenge) was used. ResNet50 was used for feature extraction, while Support Vector Machine (SVM), Random Forest (RF), XGBoost (XGB), Decision Tree (DT), and AdaBoost (ADB) classifiers were used for classification. BSO was used for hyperparameter optimization in a soft voting ensemble approach. Accuracy, precision, recall, specificity, F1-score, Receiver Operating Characteristic (ROC), and Precision-Recall (PR) curves were used as model performance metrics. The ensemble model outperformed individual classifiers, with greater accuracy (~90.0%), precision (~86.4%), recall (~86.3%), and specificity (~96.6%). The robustness of the model was verified by both ROC and PR curves, which showed AUC values of 1.00, 0.99, and 0.98 for Benign, Invasive, and in situ instances, respectively. This ensemble model delivers a strong and clinically valid methodology for breast cancer classification that enhances precision and minimizes diagnostic errors. Future work should focus on explainable AI, multi-modal fusion, few-shot learning, and edge computing for real-world deployment.
Funding: This study received financial support from the Jilin Province Health and Technology Capacity Enhancement Project (Project Number: 222Lc132).
Abstract: Objective: This study investigates the auxiliary role of resting-state electroencephalography (EEG) in the clinical diagnosis of attention-deficit hyperactivity disorder (ADHD) using machine learning techniques. Methods: Resting-state EEG recordings were obtained from 57 children, comprising 28 typically developing children and 29 children diagnosed with ADHD. The EEG signal data from both groups were analyzed. To ensure analytical accuracy, artifacts and noise in the EEG signals were removed using the EEGLAB toolbox within the MATLAB environment. Following preprocessing, a comparative analysis was conducted using various ensemble learning algorithms, including AdaBoost, GBM, LightGBM, RF, XGB, and CatBoost. Model performance was systematically evaluated and optimized, validating the superior efficacy of ensemble learning approaches in identifying ADHD. Conclusion: Applying machine learning techniques to extract features from resting-state EEG signals enabled the development of effective ensemble learning models. Differential entropy and energy features across multiple frequency bands proved particularly valuable for these models. This approach significantly enhances the detection rate of ADHD in children, demonstrating high diagnostic efficacy and sensitivity, and providing a promising tool for clinical application.
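The two feature types highlighted in the conclusion above can be sketched for a single band-limited signal segment. The abstract does not state how the features were computed, so the sketch assumes the common simplification that the band-filtered signal is approximately Gaussian, in which case differential entropy reduces to 0.5·ln(2πe·σ²). The function names are hypothetical and band-pass filtering is not shown.

```python
import math


def band_energy(samples):
    """Signal energy: sum of squared amplitudes in the segment."""
    return sum(x * x for x in samples)


def differential_entropy_gaussian(samples):
    """Differential entropy in nats, assuming Gaussian-distributed samples.

    For a Gaussian with variance s2, h = 0.5 * ln(2 * pi * e * s2).
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    if variance <= 0:
        raise ValueError("variance must be positive")
    return 0.5 * math.log(2 * math.pi * math.e * variance)


# Hypothetical, already band-filtered segment (arbitrary units).
segment = [1.0, -1.0, 1.0, -1.0]
print(band_energy(segment))
print(round(differential_entropy_gaussian(segment), 4))
```

Computing both quantities per channel and per frequency band yields a feature vector that could be passed to any of the ensemble classifiers listed in the abstract.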
Abstract: The rapid increase in the number of Internet of Things (IoT) devices, coupled with a rise in sophisticated cyberattacks, demands robust intrusion detection systems. This study presents a holistic, intelligent intrusion detection system. It uses a combined method that integrates machine learning (ML) and deep learning (DL) techniques to improve the protection of contemporary information technology (IT) systems. Unlike traditional signature-based or single-model methods, this system integrates the strengths of ensemble learning for binary classification and deep learning for multi-class classification. This combination provides a more nuanced and adaptable defense. The research utilizes the NF-UQ-NIDS-v2 dataset, a recent, comprehensive benchmark for evaluating network intrusion detection systems (NIDS). Our methodological framework employs advanced artificial intelligence techniques. Specifically, we use ensemble learning algorithms (Random Forest, Gradient Boosting, AdaBoost, and XGBoost) for binary classification. Deep learning architectures are also employed to address the complexities of multi-class classification, allowing for fine-grained identification of intrusion types. To mitigate class imbalance, a common problem in multi-class intrusion detection that biases model performance, we use oversampling and data augmentation. These techniques ensure equitable class representation. The results demonstrate the efficacy of the proposed hybrid ML-DL system. It achieves significant improvements in intrusion detection accuracy and reliability. This research contributes substantively to cybersecurity by providing a more robust and adaptable intrusion detection solution.
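The oversampling step described above can be illustrated with plain random oversampling, which duplicates minority-class samples until every class matches the largest one. This is a minimal sketch; the abstract does not specify which oversampling or augmentation method was used, and the function name `random_oversample` and the toy labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement to reach the majority size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical flows: four benign records and one attack record.
flows = ["f1", "f2", "f3", "f4", "f5"]
kinds = ["benign", "benign", "benign", "benign", "attack"]
balanced_flows, balanced_kinds = random_oversample(flows, kinds)
print(Counter(balanced_kinds))
```

Oversampling should be applied to the training split only, so that duplicated records do not leak into the evaluation data.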
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Centre) support program (IITP-2024-RS-2024-00437191) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Identifying druggable proteins, which are capable of binding therapeutic compounds, remains a critical and resource-intensive challenge in drug discovery. To address this, we propose CEL-IDP (Comparison of Ensemble Learning Methods for Identification of Druggable Proteins), a computational framework that combines three feature extraction methods, namely Dipeptide Deviation from Expected Mean (DDE), Enhanced Amino Acid Composition (EAAC), and Enhanced Grouped Amino Acid Composition (EGAAC), with ensemble learning strategies (Bagging, Boosting, Stacking) to classify druggable proteins from sequence data. DDE captures dipeptide frequency deviations, EAAC encodes positional amino acid information, and EGAAC groups residues by physicochemical properties to generate discriminative feature vectors. These features were analyzed using ensemble models to overcome the limitations of single classifiers. EGAAC outperformed DDE and EAAC, with Random Forest (Bagging) and XGBoost (Boosting) achieving the highest accuracy of 71.66%. Stacking showed intermediate results (68.33%), while EAAC- and DDE-based models yielded lower accuracies (56.66%–66.87%). CEL-IDP streamlines large-scale druggability prediction, reduces reliance on costly experimental screening, and aligns with global initiatives such as Target 2035 to expand actionable drug targets. This work advances machine learning-driven drug discovery by systematizing feature engineering and ensemble model optimization, providing a scalable workflow to accelerate target identification and validation.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2023-02-02341.
Abstract: The potential applications of multimodal physiological signals in healthcare, pain monitoring, and clinical decision support systems have garnered significant attention in biomedical research. Conventional pain assessment methods rely on subjective self-reporting, which may be unreliable. Deep learning is a promising alternative that addresses this limitation through automated pain classification. This paper proposes an ensemble deep-learning framework for pain assessment. The framework uses features extracted from electromyography (EMG), skin conductance level (SCL), and electrocardiography (ECG) signals. We integrate Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional Gated Recurrent Unit (BiGRU), and Deep Neural Network (DNN) models, and aggregate their predictions using a weighted averaging ensemble technique to increase the robustness of the classification. To improve computing efficiency and remove redundant features, we use Particle Swarm Optimization (PSO) for feature selection, which reduces feature dimensionality without sacrificing classification accuracy. The experimental results show that the proposed ensemble model outperforms individual deep learning classifiers, with improved accuracy, precision, recall, and F1-score across all pain levels. In our experiments, the proposed model achieved over 98% accuracy, suggesting promising automated pain assessment performance. However, due to differences in validation protocols, comparisons with previous studies are still limited. Combining deep learning and feature selection techniques improves model generalization, reducing overfitting and enhancing classification performance. The evaluation was conducted using the BioVid Heat Pain Dataset, confirming the model's effectiveness in distinguishing between different pain intensity levels.
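The weighted averaging step described in this abstract can be illustrated with a minimal sketch that combines per-class probabilities from several models. The probabilities, the weights, and the function names are hypothetical; the abstract does not state how the weights were chosen.

```python
def weighted_average_ensemble(prob_lists, weights):
    """Combine per-model class probabilities into one weighted-average distribution."""
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]

def predict_class(probs):
    """Index of the class with the highest combined probability."""
    return max(range(len(probs)), key=lambda c: probs[c])

# Hypothetical outputs of four models (e.g. CNN, LSTM, BiGRU, DNN) for one sample
# over three pain levels; each inner list sums to 1.
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
    [0.5, 0.4, 0.1],
]
model_weights = [0.2, 0.3, 0.3, 0.2]  # assumed weights, e.g. from validation scores

combined = weighted_average_ensemble(model_probs, model_weights)
label = predict_class(combined)
```

Dividing by the sum of the weights keeps the combined vector a valid probability distribution even when the weights are not normalized beforehand.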
Funding: Funded by the National Natural Science Foundation of China (Grant No. 52222708).
Abstract: Accurately evaluating the safety status of lithium-ion battery systems in electric vehicles is imperative, yet potential battery failure risks are difficult to predict under stochastic operating profiles. Complex battery fault mechanisms and limited, poor-quality data impede fault detection for battery systems under real-world conditions. This paper proposes a novel graph-guided fault detection method designed to recognize concealed anomalies in realistic data. Graphs guided by physical relationships are constructed to learn the dynamic evolution of physical quantities under normal conditions and their potential change characteristics in fault scenarios. An ensemble Graph Sample and Aggregate Network model is developed to tackle imbalanced sample distributions and non-uniform battery system specifications across vehicles. Failure risk probabilities are derived for diverse battery charging and discharging segments. An ablation study verifies the necessity of ensemble learning in addressing imbalanced datasets. Analysis of 102,095 segments across 86 vehicles with different battery material systems, battery capacities, and numbers of cells and temperature sensors confirms the robustness and generalization of the proposed method, yielding a recall of 98.37%. By introducing the graph, spatio-temporal global fault characteristics of battery systems are extracted automatically. The coupling relationship and evolution of physical quantities under both normal and faulty states are established, effectively uncovering fault information hidden in collected battery data without observable anomalies. The safety state of battery systems is expressed as a failure risk probability, providing reliable data support for battery system maintenance.
Funding: Financially supported by the Natural Science Foundation of Hunan Province, China (No. 2024JJ2074), the National Natural Science Foundation of China (No. 22376221), and the Young Elite Scientists Sponsorship Program by CAST, China (No. 2023QNRC001).
Abstract: Tailings produced by mining and ore smelting are a major source of soil pollution. Understanding the speciation of heavy metals (HMs) in tailings is essential for soil remediation and sustainable development. Because traditional sequential laboratory extraction methods for determining the forms of HMs in tailings are complex and time-consuming, a rapid and precise identification approach is urgently required. To address this issue, a general empirical prediction method for HM occurrence was developed using machine learning (ML). The compositional information of the tailings, properties of the HMs, and sequential extraction steps were used as inputs to calculate the percentages of the seven forms of HMs. After the models were tuned and compared, extreme gradient boosting, gradient boosting decision tree, and categorical boosting were found to be the three best-performing ML models, with coefficient of determination (R^(2)) values on the testing set exceeding 0.859. Feature importance analysis for these three models indicated that electronegativity was the most important factor affecting the occurrence of HMs, with an average feature importance of 0.4522. Using stacking as a model integration method further improved the ability of the ML models to predict HM occurrence forms, increasing R^(2) to 0.879. Overall, this study developed a robust technique for predicting the occurrence forms of HMs in tailings and provides an important reference for the environmental assessment and recycling of tailings.
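For readers unfamiliar with the metric reported in this abstract, the coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses hypothetical observed and predicted percentages, not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured vs. predicted percentages of one HM form.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(observed, predicted)
```

Here the residual sum of squares is 26 and the total sum of squares is 500, so the score is 0.948.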
Abstract: Cloud computing (CC) provides infrastructure, storage services, and applications to users, all of which must be secured by appropriate procedures or policies. Security in the cloud environment is essential to safeguard infrastructure and user information from unauthorized access, which calls for timely intrusion detection systems (IDS). Ensemble learning harnesses the collective power of multiple machine learning (ML) methods and, aided by a feature selection (FS) process, improves the robustness and overall precision of intrusion detection. Therefore, this article presents a meta-heuristic feature selection by ensemble learning-based anomaly detection (MFS-ELAD) algorithm for CC platforms. The proposed approach first applies min-max normalization to the input data. The high-dimensional feature set is then reduced by the Prairie Dogs Optimizer (PDO) algorithm. For recognition, the MFS-ELAD method uses an ensemble of three deep learning (DL) techniques: sparse auto-encoder (SAE), stacked long short-term memory (SLSTM), and Elman neural network (ENN). Finally, the parameters of the DL algorithms are fine-tuned with the sand cat swarm optimizer (SCSO), which helps improve the recognition results. Simulation of the MFS-ELAD system on the CSE-CIC-IDS2018 dataset shows promising performance compared with other methods, with a maximum precision of 99.71%.
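The min-max preprocessing step named in this abstract can be illustrated with a minimal sketch that rescales one feature column to the range [0, 1]. The function name and the sample values are hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
def min_max_scale(column):
    """Rescale one feature column linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: no spread to rescale, so map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical values of a single network-flow feature.
flow_durations = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_scale(flow_durations)
```

The minimum and maximum should be taken from the training data and reused for test data, so that both are scaled consistently.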