The selection of hyperparameters in regularized least squares plays an important role in large-scale system identification. Traditional methods select hyperparameters either by experience or by marginal likelihood maximization, which are inaccurate or computationally expensive, respectively. In this paper, two posterior methods are proposed to select hyperparameters based on different prior knowledge (constraints); they obtain the optimal hyperparameters using optimization theory. Moreover, we give the theoretically optimal constraints and verify their effectiveness. Numerical simulation shows that the hyperparameters and parameter vector estimates obtained by the proposed methods are the optimal ones.
Analyzing big data, especially medical data, helps provide good health care to patients and manage the risk of death. The COVID-19 pandemic has had a significant impact on public health worldwide, emphasizing the need for effective risk prediction models. Machine learning (ML) techniques have shown promise in analyzing complex data patterns and predicting disease outcomes. The accuracy of these techniques depends strongly on their parameter settings, so hyperparameter optimization plays a crucial role in improving model performance. In this work, the Particle Swarm Optimization (PSO) algorithm was used to search the hyperparameter space and improve the predictive power of the machine learning models by identifying the hyperparameters that provide the highest accuracy. A dataset with a variety of clinical and epidemiological characteristics linked to COVID-19 cases was used in this study. Various machine learning models, including Random Forests, Decision Trees, Support Vector Machines, and Neural Networks, were used to capture the complex relationships present in the data. Predictive performance was evaluated with the accuracy metric. The experimental findings showed that the suggested method of estimating COVID-19 risk is effective: the optimized machine learning models outperformed the baseline models.
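The PSO-based search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `pso_minimize`, the inertia and acceleration coefficients, and the toy objective `toy_validation_error` (a stand-in for a cross-validated error over two hypothetical log-scaled hyperparameters) are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5):
    """Minimize `objective` over a box given by `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for a validation error; minimum at (1.0, -2.0)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)]
)
```

In a real study the objective would train a model with the candidate hyperparameters and return a validation error (for example, one minus accuracy), which makes each evaluation far more expensive than in this toy case.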
Neural networks (NNs), as one of the most robust and efficient machine learning methods, have been commonly used to solve a wide range of problems. However, the choice of hyperparameters (e.g., the numbers of layers and of neurons in each layer) has a significant influence on the accuracy of these methods. Therefore, a considerable number of studies have been carried out to optimize NN hyperparameters. In this study, the genetic algorithm is applied to an NN to find the optimal hyperparameters. The deep energy method, which contains a deep neural network, is first applied to a Timoshenko beam and a plate with a hole. Subsequently, the numbers of hidden layers, integration points, and neurons in each layer are optimized to reach the highest accuracy in predicting the stress distribution through these structures. Across the various examples, applying a proper optimization method to the NN leads to a significant increase in prediction accuracy.
To predict stall and surge in advance so that the aero-engine compressor operates safely, a stall prediction model based on deep learning theory is established in the current study. Long Short-Term Memory (LSTM), which originates from the recurrent neural network, is used, and a set of measured dynamic pressure datasets including the stall process is used to learn the weights of the neural network nodes. Subsequently, the structure and function hyperparameters of the model are optimized in depth, and a set of measured pressure data is used to verify the prediction performance of the model. On the basis of this good predictive capability, stall in low- and high-speed compressors is predicted using the established model. When a period of non-stall pressure data is used as input, the model can quickly predict the subsequent time series data through its self-learning and prediction mechanism. Comparison with the real-time measured pressure data demonstrates that the starting point of the predicted stall is basically the same as that of the measured stall, and that the stall can be predicted more than 1 s in advance so that its occurrence can be avoided. The stall prediction model in the current study can compensate for the uncertainty of threshold selection in existing stall warning methods based on measured-data signal processing. It has great application potential for predicting stall in aero-engine compressors in advance and avoiding accidents.
To overcome the challenges associated with predicting gas extraction performance and mitigating the gradual decline in extraction volume, which adversely impacts gas utilization efficiency in mines, a prediction model for the pure volume of extracted gas was developed using Support Vector Regression (SVR) and Random Forest (RF), with hyperparameters fine-tuned via the Genetic Algorithm (GA). Building upon this, an adaptive control model for gas extraction negative pressure was formulated to maximize the extracted gas volume within the pipeline network, followed by field validation experiments. Experimental results indicate that the GA-SVR model surpasses comparable models in terms of mean absolute error, root mean square error, and mean absolute percentage error. In the extraction process of bedding boreholes, the influence of negative pressure on gas extraction concentration diminishes over time, yet it remains a critical factor in determining the extracted pure volume. In contrast, throughout the entire extraction period of cross-layer boreholes, both extracted pure volume and concentration exhibit pronounced sensitivity to fluctuations in extraction negative pressure. Field experiments demonstrated that the adaptive control model enhanced the average extracted gas volume by 5.08% in the experimental borehole group compared to the control group during the later extraction stage, with a more pronounced increase of 7.15% in the first 15 days. The research findings offer essential technical support for gas disaster mitigation and for the efficient, long-term sustainable utilization of mine gas resources.
Traffic forecasting with high precision aids Intelligent Transport Systems (ITS) in formulating and optimizing traffic management strategies. The algorithms used for tuning the hyperparameters of deep learning models often achieve accurate results at the expense of high computational complexity. To address this problem, this paper uses the Tree-structured Parzen Estimator (TPE) to tune the hyperparameters of the Long Short-Term Memory (LSTM) deep learning framework. TPE uses a probabilistic approach with an adaptive search mechanism that classifies the objective function values into good and bad samples. This ensures fast convergence when tuning the hyperparameter values of the deep learning model while still maintaining a certain degree of accuracy. It also overcomes the problem of converging to local optima and avoids time-consuming random search, and therefore avoids high computational complexity. The proposed scheme first performs data smoothing and normalization on the input data, which is then fed to the TPE for tuning the hyperparameters. The traffic data is then input to the LSTM model with tuned hyperparameters to perform the traffic prediction. Three optimizers, Adaptive Moment Estimation (Adam), Root Mean Square Propagation (RMSProp), and Stochastic Gradient Descent with Momentum (SGDM), are also evaluated for prediction accuracy, and the best optimizer is chosen for the final traffic prediction in the TPE-LSTM model. Simulation results verify the effectiveness of the proposed model in terms of prediction accuracy over the benchmark schemes.
In radiology, magnetic resonance imaging (MRI) is an essential diagnostic tool that provides detailed images of a patient’s anatomical and physiological structures. MRI is particularly effective for detecting soft tissue anomalies. Traditionally, radiologists interpret these images manually, which can be labor-intensive and time-consuming due to the vast amount of data. To address this challenge, machine learning and deep learning approaches can be used to improve the accuracy and efficiency of anomaly detection in MRI scans. This manuscript presents the use of the Deep AlexNet50 model for MRI classification with discriminative learning methods. Learning proceeds in three stages: in the first stage, the whole dataset is used to learn the features; in the second stage, some layers of AlexNet50 are frozen and the model is trained with an augmented dataset; and in the third stage, AlexNet50 is trained with the augmented dataset. The method was analyzed on three publicly available MRI classification datasets: the Harvard whole brain atlas (HWBA-dataset), the School of Biomedical Engineering of Southern Medical University dataset (SMU-dataset), and the National Institute of Neuroscience and Hospitals brain MRI dataset (NINS-dataset). Various optimizers, including Adam, stochastic gradient descent (SGD), root mean square propagation (RMSProp), Adamax, and AdamW, were used to compare the performance of the learning process. The HWBA-dataset registers the maximum classification performance. We evaluated the proposed classification model using several quantitative metrics, achieving an average accuracy of 98%.
Fire can cause significant damage to the environment, the economy, and human lives. If fire can be detected early, the damage can be minimized. Advances in technology, particularly in computer vision powered by deep learning, have enabled automated fire detection in images and videos. Several deep learning models have been developed for object detection, including applications in fire and smoke detection. This study focuses on optimizing the training hyperparameters of YOLOv8 and YOLOv10 models using Bayesian Tuning (BT). Experimental results on the large-scale D-Fire dataset demonstrate that this approach enhances detection performance. Specifically, the proposed approach improves the mean average precision at an Intersection over Union (IoU) threshold of 0.5 (mAP50) of the YOLOv8s, YOLOv10s, YOLOv8l, and YOLOv10l models by 0.26, 0.21, 0.84, and 0.63, respectively, compared to models trained with the default hyperparameters. The performance gains are more pronounced in the larger models, YOLOv8l and YOLOv10l, than in their smaller counterparts, YOLOv8s and YOLOv10s. Furthermore, YOLOv8 models consistently outperform YOLOv10, with mAP50 improvements of 0.26 for YOLOv8s over YOLOv10s and 0.65 for YOLOv8l over YOLOv10l when trained with BT. These results establish YOLOv8 as the preferred model for fire detection applications where detection performance is prioritized.
With the rapid adoption of artificial intelligence (AI) in domains such as power, transportation, and finance, the number of machine learning and deep learning models has grown exponentially. However, challenges such as delayed retraining, inconsistent version management, insufficient drift monitoring, and limited data security still hinder efficient and reliable model operations. To address these issues, this paper proposes the Intelligent Model Lifecycle Management Algorithm (IMLMA). The algorithm employs a dual-trigger mechanism based on both data volume thresholds and time intervals to automate retraining, and applies Bayesian optimization for adaptive hyperparameter tuning to improve performance. A multi-metric replacement strategy, incorporating MSE, MAE, and R², ensures that new models replace existing ones only when performance improvements are guaranteed. A versioning and traceability database supports comparison and visualization, while real-time monitoring with stability analysis enables early warnings of latency and drift. Finally, hash-based integrity checks secure both model files and datasets. Experimental validation in a power metering operation scenario demonstrates that IMLMA reduces model update delays, enhances predictive accuracy and stability, and maintains low latency under high concurrency. This work provides a practical, reusable, and scalable solution for intelligent model lifecycle management, with broad applicability to complex systems such as smart grids.
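The dual-trigger retraining, multi-metric replacement, and hash-based integrity check named in this abstract can be sketched as below. This is only an illustration under stated assumptions, not the IMLMA implementation: the function names, the default thresholds (10,000 samples, 7 days), and the rule that a candidate must improve on all three metrics are hypothetical choices.

```python
import hashlib
from datetime import datetime, timedelta


def should_retrain(new_samples, last_trained, now,
                   sample_threshold=10_000, max_age=timedelta(days=7)):
    """Dual trigger: retrain when enough new data arrived OR the model is too old.

    Both thresholds are illustrative assumptions.
    """
    return new_samples >= sample_threshold or (now - last_trained) >= max_age


def should_replace(current, candidate):
    """Multi-metric rule (assumed): replace only if MSE and MAE drop and R2 rises."""
    return (candidate["mse"] < current["mse"]
            and candidate["mae"] < current["mae"]
            and candidate["r2"] > current["r2"])


def sha256_of_bytes(payload: bytes) -> str:
    """Fingerprint of a model file or dataset, used to detect tampering."""
    return hashlib.sha256(payload).hexdigest()


trained_at = datetime(2024, 1, 1)
retrain_by_age = should_retrain(500, trained_at, trained_at + timedelta(days=8))
retrain_by_volume = should_retrain(20_000, trained_at, trained_at + timedelta(days=1))
no_retrain = should_retrain(500, trained_at, trained_at + timedelta(days=1))

current_metrics = {"mse": 0.40, "mae": 0.50, "r2": 0.80}
better_metrics = {"mse": 0.30, "mae": 0.45, "r2": 0.85}
mixed_metrics = {"mse": 0.30, "mae": 0.55, "r2": 0.85}
```

Requiring improvement on every metric is the strictest reading of "guaranteed" improvement; a weighted score or a tolerance band would be an equally plausible design.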
Background: The existence of doublets in single-cell RNA sequencing (scRNA-seq) data poses a great challenge in downstream data analysis. Computational doublet-detection methods have been developed to remove doublets from scRNA-seq data. Yet, the default hyperparameter settings of those methods may not provide optimal performance. Methods: We propose a strategy to tune hyperparameters for a cutting-edge doublet-detection method. We utilize a full factorial design to explore the relationship between hyperparameters and detection accuracy on 16 real scRNA-seq datasets. The optimal hyperparameters are obtained by a response surface model and convex optimization. Results: We show that the optimal hyperparameters provide top performance across scRNA-seq datasets under various biological conditions. Our tuning strategy can be applied to other computational doublet-detection methods. It also offers insights into hyperparameter tuning for broader computational methods in scRNA-seq data analysis. Conclusions: The hyperparameter configuration significantly impacts the performance of computational doublet-detection methods. Our study is the first attempt to systematically explore the optimal hyperparameters under various biological conditions and optimization objectives. Our study provides much-needed guidance for hyperparameter tuning in computational doublet-detection methods.
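A full factorial design, as mentioned in the Methods above, evaluates every combination of the chosen hyperparameter levels. The sketch below shows only that enumeration step; the hyperparameter names, their levels, and `toy_accuracy` are hypothetical placeholders, and the response surface fit and convex optimization used in the study are not reproduced here.

```python
from itertools import product


def full_factorial(levels):
    """Return one configuration dict per combination of factor levels."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


# Hypothetical factors and levels, chosen only for illustration.
levels = {
    "n_neighbors": [10, 20, 30],
    "n_components": [5, 10],
    "threshold": [0.3, 0.5],
}
design = full_factorial(levels)  # 3 * 2 * 2 = 12 runs


def toy_accuracy(config):
    """Stand-in for measured detection accuracy; peaks at (20, 10, 0.5)."""
    return -((config["n_neighbors"] - 20) ** 2
             + (config["n_components"] - 10) ** 2
             + (config["threshold"] - 0.5) ** 2)


best_config = max(design, key=toy_accuracy)
```

Picking the best grid point, as done here, is the simplest use of the design; fitting a smooth surface to the measured scores allows the optimum to fall between grid levels.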
With the advancement of artificial intelligence, traffic forecasting is attracting growing interest as a means of optimizing route planning and enhancing service quality. Traffic volume is an influential parameter for planning and operating traffic structures. This study proposes an improved ensemble-based deep learning method to solve traffic volume prediction problems. A set of optimal hyperparameters is also applied to the suggested approach to improve the performance of the learning process. The fusion of these methodologies aims to harness ensemble empirical mode decomposition’s capacity to discern complex traffic patterns and long short-term memory’s proficiency in learning temporal relationships. First, a dataset from automatic vehicle identification is obtained and used in the preprocessing stage of the ensemble empirical mode decomposition model. Second, traffic volume is predicted using the long short-term memory algorithm. Next, the study employs a trial-and-error approach to select a set of optimal hyperparameters, including the lookback window, the number of neurons in the hidden layers, and the gradient descent optimization. Finally, the fusion of the obtained results leads to a final traffic volume prediction. The experimental results show that the proposed method outperforms other benchmarks on various evaluation measures, including mean absolute error, root mean squared error, mean absolute percentage error, and R-squared. The achieved R-squared value reaches 98%, and the other evaluation indices surpass those of the competing methods. These findings highlight the accuracy of the traffic pattern prediction and offer promising prospects for enhancing transportation management systems and urban infrastructure planning.
Hydrological models are developed to simulate river flows over a watershed for many practical applications in the field of water resource management. The present paper compares the performance of two recurrent neural networks for rainfall-runoff modeling in the Zou River basin at the Atchérigbé outlet. To this end, we used daily precipitation data over the period 1988-2010 as input to the models, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, to simulate river discharge in the study area. The investigated models give good results in calibration (R² = 0.888, NSE = 0.886, and RMSE = 0.42 for LSTM; R² = 0.9, NSE = 0.9, and RMSE = 0.397 for GRU) and in validation (R² = 0.865, NSE = 0.851, and RMSE = 0.329 for LSTM; R² = 0.9, NSE = 0.865, and RMSE = 0.301 for GRU). This good performance of the LSTM and GRU models confirms the importance of machine-learning-based models in modeling hydrological phenomena for better decision-making.
Recently, anomaly detection (AD) in streaming data has gained significant attention among research communities due to its applicability in finance, business, healthcare, education, etc. Recent developments in deep learning (DL) models have proven helpful in the detection and classification of anomalies. This article designs an oversampling with optimal deep learning-based streaming data classification (OS-ODLSDC) model. The aim of the OS-ODLSDC model is to recognize and classify anomalies in streaming data. The proposed OS-ODLSDC model first performs a preprocessing step. Since streaming data is unbalanced, the support vector machine (SVM)-Synthetic Minority Over-sampling Technique (SVM-SMOTE) is applied for oversampling. In addition, the OS-ODLSDC model employs bidirectional long short-term memory (BiLSTM) for AD and classification. Finally, the root mean square propagation (RMSProp) optimizer is applied for optimal hyperparameter tuning of the BiLSTM model. To assess the performance of the OS-ODLSDC model, a wide-ranging experimental analysis is performed using three benchmark datasets: CICIDS 2018, KDD-Cup 1999, and NSL-KDD.
Boosting algorithms have been widely used in landslide susceptibility mapping (LSM) studies. However, these algorithms have distinct computational strategies and hyperparameters, making it challenging to propose an ideal LSM model. To investigate the impact of different boosting algorithms and hyperparameter optimization algorithms on LSM, this study constructed a geospatial database comprising 12 conditioning factors, such as elevation, stratum, and annual average rainfall. The XGBoost (XGB), LightGBM (LGBM), and CatBoost (CB) algorithms were employed to construct the LSM model. Furthermore, the Bayesian optimization (BO), particle swarm optimization (PSO), and Hyperband optimization (HO) algorithms were applied to optimize the LSM model. The boosting algorithms exhibited varying performance, with CB demonstrating the highest precision, followed by LGBM, and XGB showing the lowest precision. The hyperparameter optimization algorithms also performed differently, with HO outperforming PSO, and BO performing worst. The HO-CB model achieved the highest precision, with an accuracy of 0.764, an F1-score of 0.777, an area under the curve (AUC) value of 0.837 for the training set, and an AUC value of 0.863 for the test set. The model was interpreted using SHapley Additive exPlanations (SHAP), revealing that slope, curvature, topographic wetness index (TWI), degree of relief, and elevation significantly influenced landslides in the study area. This study offers a scientific reference for LSM and disaster prevention research. It examines the use of various boosting algorithms and hyperparameter optimization algorithms in Wanzhou District and proposes the HO-CB-SHAP framework as an effective approach to accurately forecast landslide disasters and interpret LSM models. However, limitations exist concerning the generalizability of the model and the data processing, which require further exploration in subsequent studies.
Credit card fraud is a major issue for financial organizations and individuals. As fraudulent actions become more complex, the demand for better fraud detection systems is rising. Deep learning approaches have shown promise in several fields, including credit card fraud detection. However, the efficacy of these models depends heavily on the careful selection of appropriate hyperparameters. This paper introduces models that integrate deep learning with hyperparameter tuning techniques to learn the patterns and relationships within credit card transaction data, thereby improving fraud detection. Three deep learning models, AutoEncoder (AE), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), are proposed to investigate how hyperparameter adjustment impacts the efficacy of deep learning models used to identify credit card fraud. Experiments conducted on a European credit card fraud dataset using different hyperparameters and the three deep learning models demonstrate that the proposed models achieve a tradeoff between detection rate and precision, making them effective in accurately predicting credit card fraud. The results demonstrate that LSTM significantly outperformed AE and CNN in terms of accuracy (99.2%), detection rate (93.3%), and area under the curve (96.3%). The proposed models surpass those of existing studies and are expected to make a significant contribution to the field of credit card fraud detection.
In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to effectively utilize the inherent prior knowledge in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering performance. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
The recent development of the Internet of Things (IoT) has resulted in the growth of IoT-based DDoS attacks. Botnet detection in IoT systems applies advanced cybersecurity measures to detect and reduce malicious botnets in interconnected devices. Anomaly detection models evaluate transmission patterns, network traffic, and device behaviour to detect deviations from usual activities. Machine learning (ML) techniques detect patterns signalling botnet activity, namely sudden traffic increases, unusual command and control patterns, or irregular device behaviour. In addition, intrusion detection systems (IDSs) and signature-based techniques are applied to recognize known malware signatures related to botnets. Various ML and deep learning (DL) techniques have been developed to detect botnet attacks in IoT systems. To overcome security issues in an IoT environment, this article designs a gorilla troops optimizer with DL-enabled botnet attack detection and classification (GTODL-BADC) technique. The GTODL-BADC technique follows feature selection (FS) with optimal DL-based classification to achieve security in an IoT environment. For data preprocessing, the min-max data normalization approach is used first. The GTODL-BADC technique uses the GTO algorithm to select features and choose optimal feature subsets. Moreover, the multi-head attention-based long short-term memory (MHA-LSTM) technique is applied for botnet detection. Finally, the tree seed algorithm (TSA) is used to select the optimal hyperparameters for the MHA-LSTM method. The GTODL-BADC technique is validated experimentally on a benchmark dataset. The simulation results show that the GTODL-BADC technique demonstrates promising performance in the botnet detection process.
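The min-max normalization used for preprocessing in this abstract rescales each feature to the range [0, 1]. A minimal sketch follows; the function name `min_max_normalize` and the handling of constant columns (mapped to 0.0) are assumptions, not details from the paper.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: no spread to rescale, so map everything to 0.0 (assumed).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


packet_counts = [2, 4, 6]          # hypothetical feature values
scaled = min_max_normalize(packet_counts)
constant = min_max_normalize([7, 7, 7])
```

In practice the minimum and maximum would be computed on the training split only and reused for test data, so that no information leaks from the evaluation set.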
This study explores the impact of hyperparameter optimization on machine learning models for predicting cardiovascular disease using data from an IoST (Internet of Sensing Things) device. Ten distinct machine learning approaches were implemented and systematically evaluated before and after hyperparameter tuning. Significant improvements were observed across various models, with SVM and Neural Networks consistently showing enhanced performance metrics such as F1-score, recall, and precision. The study underscores the critical role of tailored hyperparameter tuning in optimizing these models, revealing diverse outcomes among algorithms. Decision Trees and Random Forests exhibited stable performance throughout the evaluation. While enhancing accuracy, hyperparameter optimization also led to increased execution time. Visual representations and comprehensive results support the findings, confirming the hypothesis that optimizing parameters can effectively enhance predictive capabilities in cardiovascular disease. This research contributes to advancing the understanding and application of machine learning in healthcare, particularly in improving predictive accuracy for cardiovascular disease management and intervention strategies.
Breast cancer stands as one of the world’s most perilous and formidable diseases, having recently surpassed lung cancer as the most prevalent cancer type. The disease arises when cells in the breast undergo unregulated proliferation, resulting in the formation of a tumor that has the capacity to invade surrounding tissues. It is not confined to a specific gender; both men and women can be diagnosed with breast cancer, although it is more frequently observed in women. Early detection is the key to reducing its mortality rate. However, it is crucial to explain the black-box machine learning algorithms in this field to gain the trust of medical professionals and patients. In this study, we experimented with various machine learning models to predict breast cancer using the Wisconsin Breast Cancer Dataset (WBCD). We applied Random Forest, XGBoost, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Gradient Boost classifiers. A comparative analysis of the methods was carried out after performing hyperparameter tuning on each of them. The analysis showed that the Random Forest model outperforms the others, yielding the highest accuracy of 99.46%. After performance evaluation, two Explainable Artificial Intelligence (XAI) methods, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), were used to explain the Random Forest model.
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic features than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, or comparing impacts. Here, an ensemble of pre-trained language models is used to classify the sentences of a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. The resulting label sequences are used to analyze the progress of the conversation and to predict its pecking order. An ensemble of the Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. Hyperparameter tuning is carried out to improve performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
Abstract: The selection of hyperparameters in regularized least squares plays an important role in large-scale system identification. Traditional methods for selecting hyperparameters rely on experience or on marginal likelihood maximization, which are inaccurate or computationally expensive. In this paper, two posterior methods are proposed to select hyperparameters based on different prior knowledge (constraints); both obtain the optimal hyperparameters using optimization theory. Moreover, we give the theoretical optimal constraints and verify their effectiveness. Numerical simulation shows that the hyperparameters and parameter vector estimate obtained by the proposed methods are the optimal ones.
Abstract: Analyzing big data, especially medical data, helps to provide good health care to patients and to address the risk of death. The COVID-19 pandemic has had a significant impact on public health worldwide, emphasizing the need for effective risk prediction models. Machine learning (ML) techniques have shown promise in analyzing complex data patterns and predicting disease outcomes. The accuracy of these techniques depends strongly on their parameter settings, so hyperparameter optimization plays a crucial role in improving model performance. In this work, the Particle Swarm Optimization (PSO) algorithm was used to search the hyperparameter space and improve the predictive power of the machine learning models by identifying the hyperparameters that provide the highest accuracy. A dataset with a variety of clinical and epidemiological characteristics linked to COVID-19 cases was used in this study. Various machine learning models, including Random Forests, Decision Trees, Support Vector Machines, and Neural Networks, were utilized to capture the complex relationships present in the data. Accuracy was used as the metric to evaluate the predictive performance of the models. The experimental findings showed that the suggested method of estimating COVID-19 risk is effective: compared to baseline models, the optimized machine learning models performed better.
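To make the idea of PSO-driven hyperparameter search concrete, the following is a minimal sketch and not the authors' implementation. The function name `pso_minimize`, the swarm settings (`w`, `c1`, `c2`), and the toy objective are all assumptions made for illustration; in practice the objective would be a validation error computed by training a model with the candidate hyperparameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box given by `bounds` with a basic PSO.

    bounds is a list of (low, high) pairs, one per hyperparameter.
    Returns (best_position, best_value).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the updated position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    # Hypothetical stand-in for a validation error surface over two hyperparameters.
    toy_error = lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2
    print(pso_minimize(toy_error, [(-10.0, 10.0), (-10.0, 10.0)]))
```

Because each objective evaluation would normally involve a full training run, the swarm size and iteration count directly control the computational budget.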
Abstract: Neural networks (NNs), as one of the most robust and efficient machine learning methods, have been commonly used in solving several problems. However, choosing proper hyperparameters (e.g., the numbers of layers and neurons in each layer) has a significant influence on the accuracy of these methods. Therefore, a considerable number of studies have been carried out to optimize the NN hyperparameters. In this study, the genetic algorithm is applied to NN to find the optimal hyperparameters. The deep energy method, which contains a deep neural network, is applied first on a Timoshenko beam and a plate with a hole. Subsequently, the numbers of hidden layers, integration points, and neurons in each layer are optimized to reach the highest accuracy in predicting the stress distribution through these structures. Across the various examples, applying the proper optimization method to the NN leads to a significant increase in its prediction accuracy.
Funding: Funded by the National Natural Science Foundation of China (Nos. 52376039 and U24A20138), the Beijing Natural Science Foundation of China (No. JQ24017), the National Science and Technology Major Project of China (Nos. J2019-II-0005-0025 and Y2022-Ⅱ-0002-0005), and the Special Fund for the Member of Youth Innovation Promotion Association of Chinese Academy of Sciences (No. 2018173).
Abstract: To predict stall and surge in advance so that the aero-engine compressor can operate safely, a stall prediction model based on deep learning theory is established in the current study. The Long Short-Term Memory (LSTM) network, which originates from the recurrent neural network, is used, and a set of measured dynamic pressure datasets including the stall process is used to learn what determines the weights of the neural network nodes. Subsequently, the structure and function hyperparameters in the model are deeply optimized, and a set of measured pressure data is used to verify the prediction effects of the model. On the basis of this good predictive capability, stall in low- and high-speed compressors is predicted using the established model. When a period of non-stall pressure data is used as input, the model can quickly complete the prediction of subsequent time series data through its self-learning and prediction mechanism. Comparison with the real-time measured pressure data demonstrates that the starting point of the predicted stall is basically the same as that of the measured stall, and that the stall can be predicted more than 1 s in advance so that its occurrence can be avoided. The stall prediction model in the current study can make up for the uncertainty of threshold selection in existing stall warning methods based on measured data signal processing. It has great application potential for predicting the occurrence of stall in aero-engine compressors in advance and avoiding accidents.
Funding: Funded by the National Key Research and Development Program of China, grant number 2023YFF0615404.
Abstract: To overcome the challenges associated with predicting gas extraction performance and mitigating the gradual decline in extraction volume, which adversely impacts gas utilization efficiency in mines, a gas extraction pure volume prediction model was developed using Support Vector Regression (SVR) and Random Forest (RF), with hyperparameters fine-tuned via the Genetic Algorithm (GA). Building upon this, an adaptive control model for gas extraction negative pressure was formulated to maximize the extracted gas volume within the pipeline network, followed by field validation experiments. Experimental results indicate that the GA-SVR model surpasses comparable models in terms of mean absolute error, root mean square error, and mean absolute percentage error. In the extraction process of bedding boreholes, the influence of negative pressure on gas extraction concentration diminishes over time, yet it remains a critical factor in determining the extracted pure volume. In contrast, throughout the entire extraction period of cross-layer boreholes, both extracted pure volume and concentration exhibit pronounced sensitivity to fluctuations in extraction negative pressure. Field experiments demonstrated that the adaptive control model enhanced the average extracted gas volume by 5.08% in the experimental borehole group compared to the control group during the later extraction stage, with a more pronounced increase of 7.15% in the first 15 days. The research findings offer essential technical support for gas disaster mitigation and for the efficient utilization and long-term sustainable development of mine gas resources.
Abstract: Traffic forecasting with high precision aids Intelligent Transport Systems (ITS) in formulating and optimizing traffic management strategies. The algorithms used for tuning the hyperparameters of deep learning models often achieve accurate results at the expense of high computational complexity. To address this problem, this paper uses the Tree-structured Parzen Estimator (TPE) to tune the hyperparameters of the Long Short-Term Memory (LSTM) deep learning framework. TPE uses a probabilistic approach with an adaptive searching mechanism that classifies the objective function values into good and bad samples. This ensures fast convergence in tuning the hyperparameter values of the deep learning model while still maintaining a certain degree of prediction accuracy. It also overcomes the problem of converging to local optima and avoids time-consuming random search, thereby avoiding high computational complexity in achieving prediction accuracy. The proposed scheme first performs data smoothing and normalization on the input data, which is then fed to the TPE for tuning the hyperparameters. The traffic data is then input to the LSTM model with tuned parameters to perform the traffic prediction. Three optimizers, Adaptive Moment Estimation (Adam), Root Mean Square Propagation (RMSProp), and Stochastic Gradient Descent with Momentum (SGDM), are also evaluated for prediction accuracy, and the best optimizer is then chosen for the final traffic prediction in the TPE-LSTM model. Simulation results verify the effectiveness of the proposed model in terms of prediction accuracy over the benchmark schemes.
Abstract: In radiology, magnetic resonance imaging (MRI) is an essential diagnostic tool that provides detailed images of a patient’s anatomical and physiological structures. MRI is particularly effective for detecting soft tissue anomalies. Traditionally, radiologists manually interpret these images, which can be labor-intensive and time-consuming due to the vast amount of data. To address this challenge, machine learning and deep learning approaches can be utilized to improve the accuracy and efficiency of anomaly detection in MRI scans. This manuscript presents the use of the Deep AlexNet50 model for MRI classification with discriminative learning methods. Learning proceeds in three stages: in the first stage, the whole dataset is used to learn the features; in the second stage, some layers of AlexNet50 are frozen and the model is trained with an augmented dataset; and in the third stage, AlexNet50 is used with the augmented dataset. The method was analyzed on three publicly available MRI classification datasets: the Harvard whole brain atlas (HWBA-dataset), the School of Biomedical Engineering of Southern Medical University (SMU-dataset), and the National Institute of Neuroscience and Hospitals brain MRI dataset (NINS-dataset). Various optimizers, including Adam, stochastic gradient descent (SGD), root mean square propagation (RMSProp), Adamax, and AdamW, were used to compare the performance of the learning process. The HWBA-dataset registers the maximum classification performance. We evaluated the performance of the proposed classification model using several quantitative metrics, achieving an average accuracy of 98%.
Funding: Supported by the MSIT (Ministry of Science and ICT), Republic of Korea, under the ITRC (Information Technology Research Center) Support Program (IITP-2024-RS-2022-00156354) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and by the Technology Development Program (RS-2023-00264489) funded by the Ministry of SMEs and Startups (MSS, Republic of Korea).
Abstract: Fire can cause significant damage to the environment, economy, and human lives. If fire can be detected early, the damage can be minimized. Advances in technology, particularly in computer vision powered by deep learning, have enabled automated fire detection in images and videos. Several deep learning models have been developed for object detection, including applications in fire and smoke detection. This study focuses on optimizing the training hyperparameters of YOLOv8 and YOLOv10 models using Bayesian Tuning (BT). Experimental results on the large-scale D-Fire dataset demonstrate that this approach enhances detection performance. Specifically, the proposed approach improves the mean average precision at an Intersection over Union (IoU) threshold of 0.5 (mAP50) of the YOLOv8s, YOLOv10s, YOLOv8l, and YOLOv10l models by 0.26, 0.21, 0.84, and 0.63, respectively, compared to models trained with the default hyperparameters. The performance gains are more pronounced in the larger models, YOLOv8l and YOLOv10l, than in their smaller counterparts, YOLOv8s and YOLOv10s. Furthermore, YOLOv8 models consistently outperform YOLOv10, with mAP50 improvements of 0.26 for YOLOv8s over YOLOv10s and 0.65 for YOLOv8l over YOLOv10l when trained with BT. These results establish YOLOv8 as the preferred model for fire detection applications where detection performance is prioritized.
Funding: Funded by Anhui NARI ZT Electric Co., Ltd., under the project entitled “Research on the Shared Operation and Maintenance Service Model for Metering Equipment and Platform Development for the Modern Industrial Chain” (Grant No. 524636250005).
Abstract: With the rapid adoption of artificial intelligence (AI) in domains such as power, transportation, and finance, the number of machine learning and deep learning models has grown exponentially. However, challenges such as delayed retraining, inconsistent version management, insufficient drift monitoring, and limited data security still hinder efficient and reliable model operations. To address these issues, this paper proposes the Intelligent Model Lifecycle Management Algorithm (IMLMA). The algorithm employs a dual-trigger mechanism based on both data volume thresholds and time intervals to automate retraining, and applies Bayesian optimization for adaptive hyperparameter tuning to improve performance. A multi-metric replacement strategy, incorporating MSE, MAE, and R², ensures that new models replace existing ones only when performance improvements are guaranteed. A versioning and traceability database supports comparison and visualization, while real-time monitoring with stability analysis enables early warnings of latency and drift. Finally, hash-based integrity checks secure both model files and datasets. Experimental validation in a power metering operation scenario demonstrates that IMLMA reduces model update delays, enhances predictive accuracy and stability, and maintains low latency under high concurrency. This work provides a practical, reusable, and scalable solution for intelligent model lifecycle management, with broad applicability to complex systems such as smart grids.
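The dual-trigger retraining rule and the multi-metric replacement check described above can be sketched as follows. This is an illustrative reading of the abstract, not the IMLMA implementation: the function names, the default threshold of 1000 samples, the 7-day interval, and the rule that a candidate must improve on all three metrics are assumptions, since the abstract does not state how the triggers or metrics are combined.

```python
from datetime import datetime, timedelta


def should_retrain(new_samples, last_trained, now,
                   volume_threshold=1000, max_interval=timedelta(days=7)):
    """Dual trigger: retrain when enough new data has accumulated
    OR when enough time has passed since the last training run.
    Both default limits are hypothetical."""
    volume_hit = new_samples >= volume_threshold
    time_hit = (now - last_trained) >= max_interval
    return volume_hit or time_hit


def should_replace(current, candidate):
    """Assumed replacement rule: the candidate model must be strictly better
    on every metric (lower MSE, lower MAE, higher R2)."""
    return (candidate["mse"] < current["mse"]
            and candidate["mae"] < current["mae"]
            and candidate["r2"] > current["r2"])


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1)
    print(should_retrain(1500, t0, t0 + timedelta(days=1)))  # volume trigger fires
    print(should_replace({"mse": 1.0, "mae": 0.8, "r2": 0.90},
                         {"mse": 0.9, "mae": 0.7, "r2": 0.92}))
```

Requiring improvement on every metric is the most conservative choice; a weighted score would be an alternative if some regressions are tolerable.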
Abstract: Background: The existence of doublets in single-cell RNA sequencing (scRNA-seq) data poses a great challenge in downstream data analysis. Computational doublet-detection methods have been developed to remove doublets from scRNA-seq data. Yet, the default hyperparameter settings of those methods may not provide optimal performance. Methods: We propose a strategy to tune hyperparameters for a cutting-edge doublet-detection method. We utilize a full factorial design to explore the relationship between hyperparameters and detection accuracy on 16 real scRNA-seq datasets. The optimal hyperparameters are obtained by a response surface model and convex optimization. Results: We show that the optimal hyperparameters provide top performance across scRNA-seq datasets under various biological conditions. Our tuning strategy can be applied to other computational doublet-detection methods. It also offers insights into hyperparameter tuning for broader computational methods in scRNA-seq data analysis. Conclusions: The hyperparameter configuration significantly impacts the performance of computational doublet-detection methods. Our study is the first attempt to systematically explore the optimal hyperparameters under various biological conditions and optimization objectives. Our study provides much-needed guidance for hyperparameter tuning in computational doublet-detection methods.
Abstract: With the advancement of artificial intelligence, traffic forecasting is gaining more and more interest for optimizing route planning and enhancing service quality. Traffic volume is an influential parameter for planning and operating traffic structures. This study proposes an improved ensemble-based deep learning method to solve traffic volume prediction problems. A set of optimal hyperparameters is also applied to the suggested approach to improve the performance of the learning process. The fusion of these methodologies aims to harness ensemble empirical mode decomposition’s capacity to discern complex traffic patterns and long short-term memory’s proficiency in learning temporal relationships. First, a dataset for automatic vehicle identification is obtained and utilized in the preprocessing stage of the ensemble empirical mode decomposition model. Second, traffic volume is predicted using the long short-term memory algorithm. Next, the study employs a trial-and-error approach to select a set of optimal hyperparameters, including the lookback window, the number of neurons in the hidden layers, and the gradient descent optimization. Finally, the fusion of the obtained results leads to a final traffic volume prediction. The experimental results show that the proposed method outperforms other benchmarks on various evaluation measures, including mean absolute error, root mean squared error, mean absolute percentage error, and R-squared. The achieved R-squared value reaches 98%, while the other evaluation indices surpass those of the competing methods. These findings highlight the accuracy of traffic pattern prediction and offer promising prospects for enhancing transportation management systems and urban infrastructure planning.
Abstract: Hydrological models are developed to simulate river flows over a watershed for many practical applications in the field of water resource management. The present paper compares the performance of two recurrent neural networks for rainfall-runoff modeling in the Zou River basin at the Atchérigbé outlet. To this end, we used daily precipitation data over the period 1988-2010 as input to the models, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, to simulate river discharge in the study area. The investigated models give good results in calibration (R² = 0.888, NSE = 0.886, and RMSE = 0.42 for LSTM; R² = 0.9, NSE = 0.9, and RMSE = 0.397 for GRU) and in validation (R² = 0.865, NSE = 0.851, and RMSE = 0.329 for LSTM; R² = 0.9, NSE = 0.865, and RMSE = 0.301 for GRU). This good performance of the LSTM and GRU models confirms the importance of machine learning-based models in modeling hydrological phenomena for better decision-making.
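For readers unfamiliar with the three scores reported above, the sketch below shows one common way to compute them. The abstract does not define its metrics, so two things here are assumptions: that NSE is the standard Nash-Sutcliffe efficiency, and that R² is the squared Pearson correlation between observed and simulated discharge, as is usual in hydrology. The function names and sample values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the model's squared error
    to the squared deviation of the observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]  # right shape, constant bias of +1
    print(rmse(observed, simulated), nse(observed, simulated),
          r_squared(observed, simulated))
```

Under these definitions a constant bias leaves R² untouched but lowers NSE, which is one reason the two scores are commonly reported together.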
Abstract: Recently, anomaly detection (AD) in streaming data has gained significant attention among research communities due to its applicability in finance, business, healthcare, education, etc. Recent developments in deep learning (DL) models have proved helpful in the detection and classification of anomalies. This article designs an oversampling with an optimal deep learning-based streaming data classification (OS-ODLSDC) model. The aim of the OS-ODLSDC model is to recognize and classify the presence of anomalies in the streaming data. The proposed OS-ODLSDC model first performs a preprocessing step. Since streaming data is unbalanced, the support vector machine (SVM)-Synthetic Minority Over-sampling Technique (SVM-SMOTE) is applied for the oversampling process. In addition, the OS-ODLSDC model employs bidirectional long short-term memory (BiLSTM) for AD and classification. Finally, the root mean square propagation (RMSProp) optimizer is applied for optimal hyperparameter tuning of the BiLSTM model. To assess the performance of the OS-ODLSDC model, a wide-ranging experimental analysis is performed using three benchmark datasets: CICIDS 2018, KDD-Cup 1999, and NSL-KDD.
Funding: Funded by the Natural Science Foundation of Chongqing (Grant No. CSTB2022NSCQ-MSX0594) and the Humanities and Social Sciences Research Project of the Ministry of Education (Grant No. 16YJCZH061).
Abstract: Boosting algorithms have been widely utilized in landslide susceptibility mapping (LSM) studies. However, these algorithms possess distinct computational strategies and hyperparameters, making it challenging to propose an ideal LSM model. To investigate the impact of different boosting algorithms and hyperparameter optimization algorithms on LSM, this study constructed a geospatial database comprising 12 conditioning factors, such as elevation, stratum, and annual average rainfall. The XGBoost (XGB), LightGBM (LGBM), and CatBoost (CB) algorithms were employed to construct the LSM model. Furthermore, the Bayesian optimization (BO), particle swarm optimization (PSO), and Hyperband optimization (HO) algorithms were applied to optimize the LSM model. The boosting algorithms exhibited varying performances, with CB demonstrating the highest precision, followed by LGBM, and XGB showing poorer precision. Additionally, the hyperparameter optimization algorithms displayed different performances, with HO outperforming PSO and BO showing poorer performance. The HO-CB model achieved the highest precision, with an accuracy of 0.764, an F1-score of 0.777, an area under the curve (AUC) value of 0.837 for the training set, and an AUC value of 0.863 for the test set. The model was interpreted using SHapley Additive exPlanations (SHAP), revealing that slope, curvature, topographic wetness index (TWI), degree of relief, and elevation significantly influenced landslides in the study area. This study offers a scientific reference for LSM and disaster prevention research. It examines the utilization of various boosting algorithms and hyperparameter optimization algorithms in Wanzhou District and proposes the HO-CB-SHAP framework as an effective approach to accurately forecast landslide disasters and interpret LSM models. However, limitations exist concerning the generalizability of the model and the data processing, which require further exploration in subsequent studies.
Abstract: Credit card fraud is a major issue for financial organizations and individuals. As fraudulent actions become more complex, the demand for better fraud detection systems is rising. Deep learning approaches have shown promise in several fields, including detecting credit card fraud. However, the efficacy of these models is heavily dependent on the careful selection of appropriate hyperparameters. This paper introduces models that integrate deep learning with hyperparameter tuning techniques to learn the patterns and relationships within credit card transaction data, thereby improving fraud detection. Three deep learning models, AutoEncoder (AE), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), are proposed to investigate how hyperparameter adjustment impacts the efficacy of deep learning models used to identify credit card fraud. Experiments conducted on a European credit card fraud dataset using different hyperparameters and the three deep learning models demonstrate that the proposed models achieve a tradeoff between detection rate and precision, making them effective in accurately predicting credit card fraud. The results demonstrate that LSTM significantly outperformed AE and CNN in terms of accuracy (99.2%), detection rate (93.3%), and area under the curve (96.3%). The proposed models surpass those of existing studies and are expected to make a significant contribution to the field of credit card fraud detection.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62171203, in part by the Jiangsu Province “333 Project” High-Level Talent Cultivation Subsidized Project, in part by the Suzhou Key Supporting Subjects for Health Informatics under Grant SZFCXK202147, in part by the Changshu Science and Technology Program under Grants CS202015 and CS202246, and in part by Changshu Key Laboratory of Medical Artificial Intelligence and Big Data under Grants CYZ202301 and CS202314.
Abstract: In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to effectively utilize the inherent prior knowledge in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering performance. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
Abstract: The recent development of the Internet of Things (IoT) has resulted in the growth of IoT-based DDoS attacks. Botnet detection in IoT systems implements advanced cybersecurity measures to detect and reduce malevolent botnets in interconnected devices. Anomaly detection models evaluate transmission patterns, network traffic, and device behaviour to detect deviations from usual activities. Machine learning (ML) techniques detect patterns signalling botnet activity, namely sudden traffic increases, unusual command and control patterns, or irregular device behaviour. In addition, intrusion detection systems (IDSs) and signature-based techniques are applied to recognize known malware signatures related to botnets. Various ML and deep learning (DL) techniques have been developed to detect botnet attacks in IoT systems. To overcome security issues in an IoT environment, this article designs a gorilla troops optimizer with DL-enabled botnet attack detection and classification (GTODL-BADC) technique. The GTODL-BADC technique follows feature selection (FS) with optimal DL-based classification to accomplish security in an IoT environment. For data preprocessing, the min-max data normalization approach is used. The GTODL-BADC technique uses the GTO algorithm to select features and elect optimal feature subsets. Moreover, the multi-head attention-based long short-term memory (MHA-LSTM) technique is applied for botnet detection. Finally, the tree seed algorithm (TSA) is used to select the optimum hyperparameters for the MHA-LSTM method. The GTODL-BADC technique is experimentally validated on a benchmark dataset. The simulation results highlight that the GTODL-BADC technique demonstrates promising performance in the botnet detection process.
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU), Grant Number IMSIU-RG23151.
Abstract: This study explores the impact of hyperparameter optimization on machine learning models for predicting cardiovascular disease using data from an IoST (Internet of Sensing Things) device. Ten distinct machine learning approaches were implemented and systematically evaluated before and after hyperparameter tuning. Significant improvements were observed across various models, with SVM and Neural Networks consistently showing enhanced performance metrics such as F1-score, recall, and precision. The study underscores the critical role of tailored hyperparameter tuning in optimizing these models, revealing diverse outcomes among algorithms. Decision Trees and Random Forests exhibited stable performance throughout the evaluation. While enhancing accuracy, hyperparameter optimization also led to increased execution time. Visual representations and comprehensive results support the findings, confirming the hypothesis that optimizing parameters can effectively enhance predictive capabilities in cardiovascular disease. This research contributes to advancing the understanding and application of machine learning in healthcare, particularly in improving predictive accuracy for cardiovascular disease management and intervention strategies.
Funding: Supported by the Researchers Supporting Project (RSPD2024R846), King Saud University, Riyadh, Saudi Arabia.
Abstract: Breast cancer stands as one of the world’s most perilous and formidable diseases, having recently surpassed lung cancer as the most prevalent cancer type. This disease arises when cells in the breast undergo unregulated proliferation, resulting in the formation of a tumor that has the capacity to invade surrounding tissues. It is not confined to a specific gender; both men and women can be diagnosed with breast cancer, although it is more frequently observed in women. Early detection is pivotal in mitigating its mortality rate. However, it is crucial to explain the black-box machine learning algorithms in this field to gain the trust of medical professionals and patients. In this study, we experimented with various machine learning models to predict breast cancer using the Wisconsin Breast Cancer Dataset (WBCD). We applied Random Forest, XGBoost, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Gradient Boost classifiers, with the Random Forest model outperforming the others. A comparative analysis of the methods was done after performing hyperparameter tuning on each method. The analysis showed that the random forest performs better and yields the highest result, with 99.46% accuracy. After performance evaluation, two Explainable Artificial Intelligence (XAI) methods, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), were utilized to explain the random forest machine learning model.
Abstract: Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, and comparing impacts. An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are used for analyzing the conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. A hyperparameter tuning approach is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1-score of 0.88.