The surge in smishing attacks underscores the urgent need for robust, real-time detection systems powered by advanced deep learning models. This paper introduces PhishNet, a novel ensemble learning framework that integrates transformer-based models (RoBERTa) and large language models (LLMs) (GPT-OSS 120B, LLaMA 3.3 70B, and Qwen3 32B) to enhance smishing detection performance significantly. To mitigate class imbalance, we apply synthetic data augmentation using T5 and leverage various text preprocessing techniques. Our system employs a dual-layer voting mechanism: weighted majority voting among LLMs and a final ensemble vote to classify messages as ham, spam, or smishing. Experimental results show an average accuracy improvement from 96% to 98.5% compared to the best standalone transformer, and from 93% to 98.5% when compared to LLMs across datasets. Furthermore, we present a real-time, user-friendly application to operationalize our detection model for practical use. PhishNet demonstrates superior scalability, usability, and detection accuracy, filling critical gaps in current smishing detection methodologies.
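The dual-layer voting described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the model names, the weights, and the way the final vote combines the LLM layer with the transformer are all assumptions made for the example.

```python
from collections import defaultdict

LABELS = ("ham", "spam", "smishing")


def weighted_majority_vote(votes, weights):
    """Return the label with the largest total weight.

    votes   -- dict mapping a voter name to its predicted label
    weights -- dict mapping a voter name to its weight (default 1.0)
    Ties are broken by the order of LABELS so the result is deterministic.
    """
    totals = defaultdict(float)
    for voter, label in votes.items():
        totals[label] += weights.get(voter, 1.0)
    return max(LABELS, key=lambda label: totals[label])


def dual_layer_vote(llm_votes, llm_weights, transformer_label, final_weights):
    """Layer 1: weighted vote among LLMs. Layer 2: vote between that result
    and the transformer prediction (hypothetical combination rule)."""
    llm_label = weighted_majority_vote(llm_votes, llm_weights)
    final_votes = {"llm_layer": llm_label, "transformer": transformer_label}
    return weighted_majority_vote(final_votes, final_weights)


# Hypothetical predictions and weights, for illustration only.
llm_votes = {"llm_a": "smishing", "llm_b": "spam", "llm_c": "smishing"}
llm_weights = {"llm_a": 0.40, "llm_b": 0.35, "llm_c": 0.25}
final_weights = {"llm_layer": 0.6, "transformer": 0.4}

print(dual_layer_vote(llm_votes, llm_weights, "spam", final_weights))
```

With these assumed weights the LLM layer outvotes the transformer; in practice the weights would be derived from validation performance.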
Deep learning based analyses of computed tomography (CT) images contribute to automated diagnosis of COVID-19, and ensemble learning may commonly provide a better solution. Here, we propose an ensemble learning method that integrates several component neural networks to jointly diagnose COVID-19. Two ensemble strategies are considered: (1) combining the output scores of all component models with weights adjusted adaptively by back propagation of a cost function, and (2) a voting strategy. A database containing 8347 CT slices of COVID-19, common pneumonia, and normal subjects was used as training and testing sets. Results show that the novel method can reach a high accuracy of 99.37% (recall: 0.9981; precision: 0.9893), an increase of about 7% in comparison to single-component models. The average test accuracy is 95.62% (recall: 0.9587; precision: 0.9559), with a corresponding increase of 5.2%. Compared with several recent deep learning models on the identical test set, our method improved accuracy by up to 10.88%. The proposed method may be a promising solution for the diagnosis of COVID-19.
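The two ensemble strategies named in this abstract (weighted score fusion and voting) can be contrasted in a short sketch. The scores and raw weights below are made up; in the paper the weights are learned by back propagation, whereas here they are fixed values passed through a softmax, which is an assumption of this example.

```python
import math
from collections import Counter


def softmax(values):
    """Normalize raw weights so they are positive and sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(model_scores, raw_weights):
    """Weighted average of per-class scores across component models."""
    weights = softmax(raw_weights)
    n_classes = len(model_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, model_scores))
        for c in range(n_classes)
    ]


def hard_vote(model_scores):
    """Each model votes for its top-scoring class; the most common wins."""
    predictions = [max(range(len(s)), key=s.__getitem__) for s in model_scores]
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical class scores (COVID-19, common pneumonia, normal) from 3 models.
scores = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.6, 0.3, 0.1]]
fused = fuse_scores(scores, raw_weights=[0.0, 0.0, 0.0])  # equal weights
print(fused, hard_vote(scores))
```

Score fusion keeps the confidence information that hard voting discards, which is one reason the two strategies can disagree on borderline cases.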
The oil industry is an important part of a country's economy. The price of crude oil is influenced by a wide range of variables. Therefore, how accurately countries can predict its behavior and which predictors to employ are two main questions. In this view, we propose utilizing deep learning and ensemble learning techniques to boost crude oil price forecasting performance. The suggested method is based on a deep learning snapshot ensemble of the Transformer model. To examine the superiority of the proposed model, this paper compares the proposed deep learning ensemble model against different machine learning and statistical models for daily Organization of the Petroleum Exporting Countries (OPEC) oil price forecasting. Experimental results demonstrate that the proposed method outperforms statistical and machine learning methods. More precisely, the proposed snapshot ensemble of Transformer method achieved relative improvements in forecasting performance over the autoregressive integrated moving average ARIMA(1,1,1), ARIMA(0,1,1), autoregressive moving average ARMA(0,1), vector autoregression (VAR), random walk (RW), support vector machine (SVM), and random forests (RF) models of 99.94%, 99.62%, 99.87%, 99.65%, 7.55%, 98.38%, and 99.35%, respectively, according to the mean square error metric.
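The abstract does not define "relative improvement", so the sketch below assumes the common definition, the percentage reduction in mean square error relative to the baseline. The function names and the numbers in the usage lines are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mse, proposed_mse):
    """Percentage reduction in MSE relative to the baseline (assumed definition)."""
    return (baseline_mse - proposed_mse) / baseline_mse * 100.0


# Hypothetical values, not taken from the paper.
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
print(relative_improvement(baseline_mse=2.0, proposed_mse=0.5))
```

Under this definition, an improvement of 99.94% would mean the proposed model's MSE is 0.06% of the baseline's, while 7.55% over the random walk indicates a much closer contest.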
Covid-19 is a deadly virus that spread rapidly around the world towards the end of 2020. The consequences of this virus are quite frightening, especially when accompanied by an underlying disease. The novelty of the virus, the constant emergence of different variants, and its rapid spread have a negative impact on the control and treatment process. Although the new test kits provide almost certain results, chest X-rays are extremely important to detect the progression and degree of the disease. In addition to the Covid-19 virus, pneumonia and harmless opacity of the lungs also complicate the diagnosis. Considering the negative results caused by the virus and the treatment costs, the importance of fast and accurate diagnosis is clear. In this context, deep learning methods have emerged as an extremely popular approach. In this study, a hybrid model design that combines the superior properties of convolutional neural networks is presented to correctly classify the Covid-19 disease. In addition, in order to contribute to the literature, a suitable dataset with balanced case numbers that can be used in all artificial intelligence classification studies is presented. With this ensemble model design, quite remarkable results are obtained for three-class and four-class Covid-19 diagnosis. The proposed model classifies normal, pneumonia, and Covid-19 with 92.6% accuracy, and normal, pneumonia, Covid-19, and lung opacity with 82.6% accuracy.
A single model cannot satisfy high-precision prediction requirements given the high nonlinearity between variables. By contrast, ensemble models can effectively solve this problem. Three key factors for improving the accuracy of ensemble models are the high accuracy of each submodel, the diversity between subsample sets, and the optimal ensemble method. This study presents an improved ensemble modeling method to improve the prediction precision and generalization capability of the model. Our proposed method first uses a bagging algorithm to generate multiple subsample sets. Second, an indicator vector is defined to describe these subsample sets. Third, subsample sets are selected on the basis of the results of agglomerative nesting clustering on indicator vectors to maximize the diversity between subsets. Subsequently, these subsample sets are placed in a stacked autoencoder for training. Finally, the XGBoost algorithm, rather than the traditional simple average ensemble method, is used to ensemble the submodels. Three public machine learning datasets and an atmospheric column dry point dataset from a practical industrial process show that our proposed method demonstrates high precision and improved prediction ability.
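The first two steps of this pipeline (bagging, then describing each subsample set with an indicator vector) can be sketched as below. The abstract does not define the indicator vector, so a binary membership vector and a Hamming distance are used here purely as stand-ins; the clustering, stacked autoencoder, and XGBoost stages are omitted.

```python
import random


def bootstrap_indices(n_samples, n_sets, seed=0):
    """Bagging step: draw n_sets bootstrap samples (with replacement)."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_sets)
    ]


def indicator_vector(indices, n_samples):
    """Assumed indicator: 1 if the original sample appears in the subset."""
    present = set(indices)
    return [1 if i in present else 0 for i in range(n_samples)]


def hamming(a, b):
    """Number of positions where two indicator vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical sizes, for illustration only.
subsets = bootstrap_indices(n_samples=20, n_sets=5)
vectors = [indicator_vector(s, 20) for s in subsets]
print(hamming(vectors[0], vectors[1]))
```

A pairwise distance matrix built this way is the kind of input a hierarchical clustering step could use to pick mutually diverse subsets.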
Proper waste management models using recent technologies such as computer vision, machine learning (ML), and deep learning (DL) are needed to effectively handle the massive and increasing quantity of waste. Waste classification therefore becomes a crucial topic: it categorizes waste as hazardous or non-hazardous and thereby assists decision making in the waste management process. This study concentrates on the design of a hazardous waste detection and classification using ensemble learning (HWDC-EL) technique to reduce toxicity and improve human health. The goal of the HWDC-EL technique is to detect multiple classes of waste, particularly hazardous and non-hazardous wastes. The HWDC-EL technique involves an ensemble of three feature extractors combined by a Model Averaging technique, namely discrete local binary patterns (DLBP), EfficientNet, and DenseNet121. In addition, flower pollination algorithm (FPA) based hyperparameter optimizers are used to optimally adjust the parameters of the EfficientNet and DenseNet121 models. Moreover, a weighted voting-based ensemble classifier is derived using three machine learning algorithms, namely support vector machine (SVM), extreme learning machine (ELM), and gradient boosting tree (GBT). The performance of the HWDC-EL technique is tested on a benchmark Garbage dataset, where it obtains a maximum accuracy of 98.85%.
Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) approach is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well in the single classifier model, with an accuracy of 94%, while the ensemble model (MobileNetV2+DenseNet169) achieved an accuracy of 96.9%. Using the data synchronous parallel model in Distributed TensorFlow, the training process accelerated performance by 98.6% and outperformed other conventional approaches.
As the COVID-19 pandemic swept the globe, social media platforms became an essential source of information and communication for many. International students, in particular, turned to Twitter to express their struggles and hardships during this difficult time. To better understand the sentiments and experiences of these international students, we developed the Situational Aspect-Based Annotation and Classification (SABAC) text mining framework. This framework uses a three-layer approach, combining baseline Deep Learning (DL) models with Machine Learning (ML) models as meta-classifiers to accurately predict the sentiments and aspects expressed in tweets from our collected Student-COVID-19 dataset. Using the proposed aspect2class annotation algorithm, we labeled bulk unlabeled tweets according to the aspect terms they contain. However, we also recognized the challenge of reducing the data's high dimensionality and sparsity to improve performance and annotation on unlabeled datasets. To address this issue, we proposed the Volatile Stopwords Filtering (VSF) technique to reduce sparsity and enhance classifier performance. On the resulting Student-COVID Twitter dataset, the framework achieved an accuracy of 93.21% when using random forest as the meta-classifier. Through testing on three benchmark datasets, we found that the SABAC ensemble framework performed exceptionally well. Our findings showed that international students faced various issues during the pandemic, including stress, uncertainty, health concerns, financial stress, and difficulties with online classes and returning to school. By analyzing and summarizing these annotated tweets, decision-makers can better understand and address the real-time problems international students face during the ongoing pandemic.
Pneumonia is a dangerous respiratory disease that makes breathing incredibly difficult and painful; thus, catching it early is crucial. Medical physicians' time is limited in outdoor situations due to many patients; therefore, automated systems can come to the rescue. The input images from the X-ray equipment are also highly unpredictable due to variances in radiologists' experience. Therefore, radiologists require an automated system that can swiftly and accurately detect pneumonic lungs from chest X-rays (CXR). In medical classification, deep convolutional neural networks are commonly used. This research aims to use deep pretrained transfer learning models to accurately categorize CXR images into binary classes, i.e., Normal and Pneumonia. The proposed MDEV is a novel ensemble approach that concatenates four heterogeneous transfer learning models: MobileNet, DenseNet-201, EfficientNet-B0, and VGG-16, which have been fine-tuned and trained on 5,856 CXR images. The evaluation metrics used in this research to contrast different deep transfer learning architectures include precision, accuracy, recall, AUC-ROC, and F1-score. The model effectively decreases training loss while increasing accuracy. The findings show that the proposed MDEV model outperformed cutting-edge deep transfer learning models and obtains an overall precision of 92.26%, an accuracy of 92.15%, a recall of 90.90%, an AUC-ROC score of 90.9%, and an F1-score of 91.49% with minimal data pre-processing, data augmentation, fine-tuning, and hyperparameter adjustment in classifying Normal and Pneumonia chests.
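For reference, the threshold-based metrics listed in this abstract can be computed from binary confusion-matrix counts as in the sketch below (AUC-ROC is omitted because it needs ranked scores rather than counts). The function name and the counts in the usage line are hypothetical and unrelated to the paper's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from binary confusion counts.

    A metric whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts: Pneumonia is treated as the positive class.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is consistent with the reported 91.49% sitting between 90.90% and 92.26%.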
The Coronavirus Disease (COVID-19) pandemic has exposed the vulnerabilities of medical services across the globe, especially in underdeveloped nations. In the aftermath of the COVID-19 outbreak, a strong demand exists for novel computer-assisted diagnostic tools that can perform rapid and cost-effective screenings in locations where many screenings cannot be carried out using conventional methods. Medical imaging has become a crucial component of the disease diagnosis process, with X-ray and Computed Tomography (CT) scan images fed to deep networks to diagnose diseases. In general, image-based diagnostics and disease classification with neural networks follow four steps: network training, feature extraction, model performance testing, and optimal feature selection. The current research article devises a Chaotic Flower Pollination Algorithm with a Deep Learning-Driven Fusion (CFPA-DLDF) approach for detecting and classifying COVID-19. The presented CFPA-DLDF model is developed by integrating two DL models to recognize COVID-19 in medical images. Initially, the proposed CFPA-DLDF technique employs the Gabor Filtering (GF) approach to pre-process the input images. In addition, a weighted voting-based ensemble model is employed for feature extraction, in which both the VGG-19 and MixNet models are included. Finally, the CFPA with a Recurrent Neural Network (RNN) model is utilized for classification, which constitutes the work's novelty. A comparative analysis was conducted to demonstrate the enhanced performance of the proposed CFPA-DLDF model, and the results established its superiority over recent approaches.
N6-Methyladenine is a dynamic and reversible post-translational modification, which plays an essential role in various biological processes. Because of the current inability to identify m6A-containing mRNAs, computational approaches have been developed to identify m6A sites in DNA sequences. Aiming to improve prediction performance, we introduced a novel ensemble computational approach based on three hybrid deep neural networks, including a convolutional neural network, a capsule network, and a bidirectional gated recurrent unit (BiGRU) with a self-attention mechanism, to identify m6A sites in four tissues of three species. Across a total of 11 datasets, we selected different feature subsets, optimized from 4933-dimensional features, as input for the deep hybrid neural networks. In addition, to reduce the bias caused by the relatively small number of experimentally verified samples, we constructed an ensemble model by integrating five sub-classifiers based on different training datasets. When compared through 5-fold cross-validation and independent tests, our model showed its superiority over previous methods, im6A-TS-CNN and iRNA-m6A.
Predicting Tropical Cyclone (TC) intensity helps the government take proper precautions and disseminate appropriate warnings to civilians. Intensity prediction for TCs is a very challenging task due to their dynamically changing internal and external impact factors. We propose a system to predict TC intensity using CNN-based ensemble deep-learning models that are trained on both satellite images and numerical data of the TC. This paper presents a thorough examination of several deep-learning models, such as CNN, Recurrent Neural Networks (RNN), and transfer learning models (AlexNet and VGG), to determine their effectiveness in forecasting TC intensity. Our focus is on four widely recognized models, AlexNet, VGG16, RNN, and a customized CNN-based ensemble model, all of which were trained exclusively on image data, as well as an ensemble model that utilized both image and numerical datasets for training. Our analysis evaluates the performance of each model in terms of the loss incurred. The results provide a comparative assessment of the selected deep learning models and offer insights into their respective prediction loss, with a Mean Square Error (MSE) of 194 after 100 epochs and an execution time of 1229 s for forecasting TC intensity. We also emphasize the potential benefits of incorporating both image and numerical data into an ensemble model, which can lead to improved prediction accuracy. This research provides valuable knowledge to the field of meteorology and disaster management, paving the way for more resilient and precise TC intensity forecasting models.
Detection of cracks in concrete structures is critical for their safety and the sustainability of maintenance processes. Traditional inspection techniques are costly, time-consuming, and inefficient in their use of human resources. Deep learning architectures have become more widespread in recent years by accelerating these processes and increasing their efficiency. Deep learning models (DLMs) stand out as an effective solution in crack detection due to features such as end-to-end learning capability, model adaptation, and automatic learning processes. However, achieving an optimal balance between the model performance and computational efficiency of DLMs is a vital research topic. In this article, three different methods are proposed for detecting cracks in concrete structures. In the first method, a Separable Convolutional with Attention and Multi-layer Enhanced Fusion Network (SCAMEFNet) deep neural network, which has a deep architecture and can balance the depth of DLMs against the number of model parameters, has been developed. This model was designed using a convolutional neural network, multi-head attention, and various fusion techniques. The second method proposes a modified vision transformer (ViT) model. A two-stage ensemble learning model, the deep feature-based two-stage ensemble model (DFTSEM), is proposed in the third method. In this method, deep features and machine learning methods are used. The proposed approaches are evaluated using the Concrete Cracks Image Data set, which the authors collected and which contains concrete cracks on building surfaces. The results show that the SCAMEFNet model achieved an accuracy rate of 98.83%, the ViT model 97.33%, and the DFTSEM model 99.00%. These findings show that the proposed techniques successfully detect surface cracks and deformations and can provide practical solutions to real-world problems. In addition, the developed methods can contribute as a tool for BIM platforms in smart cities for building health.
The rapid growth of mobile applications, the popularity of the Android system, and its openness have attracted many hackers and even criminals, who are creating large amounts of Android malware. However, current methods of Android malware detection need a lot of time in the feature engineering phase. Furthermore, these models suffer from low detection rates, high complexity, and poor practicability. We analyze Android malware samples and the distribution of malware and benign software in application programming interface (API) calls, permissions, and other attributes. We classify the software's threat levels based on the correlation of features. Then, we propose deep neural networks and convolutional neural networks with ensemble learning (DCEL), a new classifier fusion model for Android malware detection. First, DCEL preprocesses the malware data to remove redundant data and converts the one-dimensional data into a two-dimensional gray image. Then, the ensemble learning approach is used to combine the deep neural network with the convolutional neural network, and the final classification results are obtained by voting on the prediction of each single classifier. Experiments based on the Drebin and Malgenome datasets show that, compared with current state-of-the-art models, the proposed DCEL has a higher detection rate, higher recall rate, and lower computational cost.
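The two DCEL steps that this abstract spells out, reshaping a one-dimensional feature vector into a two-dimensional gray image and voting over classifier predictions, might look roughly like the sketch below. The square layout, zero padding, and plain majority vote are assumptions; the abstract does not specify these details.

```python
import math
from collections import Counter


def to_gray_image(features, pad_value=0):
    """Reshape a 1-D list of pixel intensities into the smallest square
    2-D grid that holds it, padding the tail with pad_value."""
    side = math.isqrt(len(features))
    if side * side < len(features):
        side += 1
    padded = list(features) + [pad_value] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def majority_vote(predictions):
    """Return the most common label among the classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical binary permission/API features scaled to 0 or 255.
binary_features = [1, 0, 1, 1, 0]
image = to_gray_image([255 * f for f in binary_features])
print(image)
print(majority_vote(["malware", "benign", "malware"]))
```

The 2-D layout lets a convolutional network consume the same features a dense network sees as a flat vector, so the two classifiers can be trained on one preprocessing pipeline.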
Widely used deep neural networks currently face limitations in achieving optimal performance for purchase intention prediction due to constraints on data volume and hyperparameter selection. To address this issue, this paper proposes a novel Deep Adaptive Evolutionary Ensemble (DAEE) model, based on the deep forest algorithm and further integrating evolutionary ensemble learning methods. This model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. Moreover, this paper optimizes the methods of obtaining feature vectors, enhancement vectors, and prediction results within the deep forest algorithm to enhance the model's predictive accuracy. Results demonstrate that the improved deep forest model is not only more robust but also shows an increase of 5.02% in AUC value compared to the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
Many plant species have a startling degree of morphological similarity, making it difficult to segment and categorize them reliably. Unknown plant species can be challenging to classify and segment using deep learning. While deep learning architectures have helped improve classification accuracy, the resulting models often need to be more flexible and require a large dataset to train. For the sake of taxonomy, this research proposes a hybrid method for categorizing guava, potato, and java plum leaves. Two new approaches are used to form the hybrid model suggested here. The guava, potato, and java plum plant species have been successfully segmented using the first model, built on the MobileNetV2-UNET architecture. As a second model, we use a Plant Species Detection Stacking Ensemble Deep Learning Model (PSD-SE-DLM) to identify potatoes, java plums, and guava. The proposed models were trained using data collected in Punjab, Pakistan, consisting of images of healthy and sick leaves from guava, java plum, and potatoes. These datasets are known as PLSD and PLSSD. Accuracy levels of 99.84% and 96.38% were achieved for the suggested PSD-SE-DLM and MobileNetV2-UNET models, respectively.
Funding (PhishNet smishing detection study): This work was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant No. (GPIP:1074-612-2024).
Funding (COVID-19 CT ensemble diagnosis study): The Sichuan Science and Technology Department Research and Development Key Project (No. 21ZDYF3607), the Weining Cloud Hospital Based AI Medical Software System Service and Demo Project (No. 2019K0JTS0159), and the China Postdoctoral Science Foundation (No. 2020T130137ZX).
Funding (improved ensemble modeling study): The authors are grateful for the support of the National Natural Science Foundation of China (21878081), the Fundamental Research Funds for the Central Universities under Grant of China (222201717006), and the Program of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017.
Funding: This work was funded by the Deanship of Scientific Research at King Khalid University under Grant Number (RGP 2/209/42) and by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R136), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4210118DSR27).
Abstract: Proper waste management models using recent technologies like computer vision, machine learning (ML), and deep learning (DL) are needed to effectively handle the massive and increasing quantity of waste. Waste classification therefore becomes a crucial topic: it categorizes waste as hazardous or non-hazardous and thereby assists decision making in the waste management process. This study concentrates on the design of a hazardous waste detection and classification using ensemble learning (HWDC-EL) technique to reduce toxicity and improve human health. The goal of the HWDC-EL technique is to detect multiple classes of waste, particularly hazardous and non-hazardous waste. The HWDC-EL technique involves an ensemble of three feature extractors combined by model averaging, namely discrete local binary patterns (DLBP), EfficientNet, and DenseNet121. In addition, flower pollination algorithm (FPA) based hyperparameter optimizers are used to optimally adjust the parameters of the EfficientNet and DenseNet121 models. Moreover, a weighted voting-based ensemble classifier is derived using three machine learning algorithms, namely support vector machine (SVM), extreme learning machine (ELM), and gradient boosting tree (GBT). The performance of the HWDC-EL technique is tested on a benchmark Garbage dataset, where it obtains a maximum accuracy of 98.85%.
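The weighted voting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are chosen or how ties are resolved, so the function name `weighted_vote`, the example weights, and the tie-breaking rule below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical values,
             e.g. each classifier's validation accuracy).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Assumption: ties are broken by label order so the result is deterministic.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical outputs of the SVM, ELM, and GBT classifiers for one image.
label = weighted_vote(
    ["hazardous", "non-hazardous", "hazardous"],
    [0.30, 0.45, 0.25],
)
```

Here the two lower-weighted classifiers together outweigh the single highest-weighted one, which is the behavior that distinguishes weighted voting from simply trusting the best model.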
Abstract: Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single classifiers, with an accuracy of 94%, while the ensemble model (MobileNetV2+DenseNet169) achieved an accuracy of 96.9%. Using the data synchronous parallel model in Distributed TensorFlow, the training process accelerated performance by 98.6% and outperformed other conventional approaches.
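The core idea of SMOTE, as named in this abstract, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step on toy two-dimensional feature vectors; the helper name `smote_sample`, the toy data, and the choice of `k` are assumptions, and the abstract does not say how the technique was applied to the X-ray images.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority point by SMOTE-style interpolation."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # k nearest minority neighbours of the base point (squared Euclidean distance).
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    lam = rng.random()  # interpolation factor in [0, 1)
    # The new point lies on the segment between base and neighbour.
    return [a + lam * (b - a) for a, b in zip(base, neighbour)]


# Toy minority class: the four corners of the unit square (illustrative only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = [smote_sample(minority, k=2, rng=random.Random(seed)) for seed in range(5)]
```

With this toy data each corner's two nearest neighbours are the adjacent corners, so every synthetic point falls on an edge of the square. In practice a maintained implementation such as the one in imbalanced-learn would normally be preferred over a hand-written version.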
Funding: Supported by the National Natural Science Foundation of China [Grant Number: 92067106] and the Ministry of Education of the People's Republic of China [Grant Number: E-GCCRC20200309].
Abstract: As the COVID-19 pandemic swept the globe, social media platforms became an essential source of information and communication for many. International students, in particular, turned to Twitter to express their struggles and hardships during this difficult time. To better understand the sentiments and experiences of these international students, we developed the Situational Aspect-Based Annotation and Classification (SABAC) text mining framework. This framework uses a three-layer approach, combining baseline Deep Learning (DL) models with Machine Learning (ML) models as meta-classifiers, to accurately predict the sentiments and aspects expressed in tweets from our collected Student-COVID-19 dataset. Using the proposed aspect2class annotation algorithm, we labeled bulk unlabeled tweets according to the aspect terms they contain. However, we also recognized the challenge of reducing the data's high dimensionality and sparsity to improve performance and annotation on unlabeled datasets. To address this issue, we proposed the Volatile Stopwords Filtering (VSF) technique to reduce sparsity and enhance classifier performance. On the resulting Student-COVID Twitter dataset, the framework achieved an accuracy of 93.21% when using random forest as the meta-classifier. In tests on three benchmark datasets, the SABAC ensemble framework also performed exceptionally well. Our findings showed that international students faced various issues during the pandemic, including stress, uncertainty, health concerns, financial stress, and difficulties with online classes and returning to school. By analyzing and summarizing these annotated tweets, decision-makers can better understand and address the real-time problems international students face during the ongoing pandemic.
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2021R1I1A1A01052299).
Abstract: Pneumonia is a dangerous respiratory disease that makes breathing incredibly difficult and painful; thus, catching it early is crucial. Physicians' time is limited in outpatient settings because of the large number of patients; therefore, automated systems can come to the rescue. The input images from X-ray equipment are also highly unpredictable due to variance in radiologists' experience. Therefore, radiologists require an automated system that can swiftly and accurately detect pneumonic lungs from chest X-rays (CXR). In medical classification, deep convolutional neural networks are commonly used. This research aims to use deep pretrained transfer learning models to accurately categorize CXR images into two classes, i.e., Normal and Pneumonia. The proposed MDEV is a novel ensemble approach that concatenates four heterogeneous transfer learning models, MobileNet, DenseNet-201, EfficientNet-B0, and VGG-16, which have been fine-tuned and trained on 5,856 CXR images. The evaluation metrics used in this research to compare different deep transfer learning architectures include precision, accuracy, recall, AUC-ROC, and F1-score. The model effectively decreases training loss while increasing accuracy. The findings show that the proposed MDEV model outperformed cutting-edge deep transfer learning models and obtains an overall precision of 92.26%, an accuracy of 92.15%, a recall of 90.90%, an AUC-ROC score of 90.9%, and an F1-score of 91.49% with minimal data pre-processing, data augmentation, fine-tuning, and hyperparameter adjustment in classifying Normal and Pneumonia chests.
Abstract: The Coronavirus Disease (COVID-19) pandemic has exposed the vulnerabilities of medical services across the globe, especially in underdeveloped nations. In the aftermath of the COVID-19 outbreak, a strong demand exists for novel computer-assisted diagnostic tools that enable rapid and cost-effective screening in locations where large-scale screening cannot be carried out using conventional methods. Medical imaging has become a crucial component of the disease diagnosis process, with X-ray and Computed Tomography (CT) scan images fed to deep networks to diagnose diseases. In general, image-based diagnostics and disease classification with neural networks follow four steps, namely network training, feature extraction, model performance testing, and optimal feature selection. The current research article devises a Chaotic Flower Pollination Algorithm with a Deep Learning-Driven Fusion (CFPA-DLDF) approach for detecting and classifying COVID-19. The presented CFPA-DLDF model is developed by integrating two DL models to recognize COVID-19 in medical images. Initially, the proposed CFPA-DLDF technique employs the Gabor Filtering (GF) approach to pre-process the input images. In addition, a weighted voting-based ensemble model is employed for feature extraction, in which both the VGG-19 and MixNet models are included. Finally, the CFPA with a Recurrent Neural Network (RNN) model is utilized for classification, which constitutes the novelty of the work. A comparative analysis was conducted to demonstrate the enhanced performance of the proposed CFPA-DLDF model, and the results established its superiority over recent approaches.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62071079 and 61803065).
Abstract: N6-Methyladenine (m6A) is a dynamic and reversible post-translational modification, which plays an essential role in various biological processes. Because of the current inability to identify m6A-containing mRNAs, computational approaches have been developed to identify m6A sites in DNA sequences. Aiming to improve prediction performance, we introduced a novel ensemble computational approach based on three hybrid deep neural networks, including a convolutional neural network, a capsule network, and a bidirectional gated recurrent unit (BiGRU) with a self-attention mechanism, to identify m6A sites in four tissues of three species. Across a total of 11 datasets, we selected different feature subsets, optimized from 4933-dimensional features, as input for the deep hybrid neural networks. In addition, to correct the deviation caused by the relatively small number of experimentally verified samples, we constructed an ensemble model by integrating five sub-classifiers based on different training datasets. When compared through 5-fold cross-validation and independent tests, our model showed its superiority to previous methods, im6A-TS-CNN and iRNA-m6A.
Abstract: Predicting Tropical Cyclone (TC) intensity helps governments take proper precautions and disseminate appropriate warnings to civilians. Intensity prediction for TCs is a very challenging task because of dynamically changing internal and external impact factors. We propose a system to predict TC intensity using CNN-based ensemble deep learning models that are trained on both satellite images and numerical data of the TC. This paper presents a thorough examination of several deep learning models, such as CNN, Recurrent Neural Networks (RNN), and transfer learning models (AlexNet and VGG), to determine their effectiveness in forecasting TC intensity. Our focus is on four widely recognized models, AlexNet, VGG16, RNN, and a customized CNN-based ensemble model, all of which were trained exclusively on image data, as well as an ensemble model that used both image and numerical datasets for training. Our analysis evaluates the performance of each model in terms of the loss incurred. The results provide a comparative assessment of the selected deep learning models and offer insights into their respective prediction loss, with a Mean Square Error (MSE) of 194 after 100 epochs and an execution time of 1229 s for forecasting TC intensity. We also emphasize the potential benefits of incorporating both image and numerical data into an ensemble model, which can lead to improved prediction accuracy. This research provides valuable knowledge to the fields of meteorology and disaster management, paving the way for more resilient and precise TC intensity forecasting models.
Abstract: Detection of cracks in concrete structures is critical for their safety and for the sustainability of maintenance processes. Traditional inspection techniques are costly, time-consuming, and inefficient in their use of human resources. Deep learning architectures have become more widespread in recent years by accelerating these processes and increasing their efficiency. Deep learning models (DLMs) stand out as an effective solution for crack detection because of features such as end-to-end learning capability, model adaptation, and automatic learning processes. However, striking an optimal balance between the model performance and computational efficiency of DLMs is a vital research topic. In this article, three different methods are proposed for detecting cracks in concrete structures. In the first method, a Separable Convolutional with Attention and Multi-layer Enhanced Fusion Network (SCAMEFNet) deep neural network, which has a deep architecture and can balance the depth of DLMs against the number of model parameters, has been developed. This model was designed using a convolutional neural network, multi-head attention, and various fusion techniques. The second method proposes a modified vision transformer (ViT) model. In the third method, a two-stage ensemble learning model, the deep feature-based two-stage ensemble model (DFTSEM), is proposed, which uses deep features and machine learning methods. The proposed approaches are evaluated using the Concrete Cracks Image Data set, which the authors collected and which contains concrete cracks on building surfaces. The results show that the SCAMEFNet model achieved an accuracy of 98.83%, the ViT model 97.33%, and the DFTSEM model 99.00%. These findings show that the proposed techniques successfully detect surface cracks and deformations and can provide practical solutions to real-world problems. In addition, the developed methods can serve as a tool for BIM platforms in smart cities for building health.
Funding: Supported by the National Natural Science Foundation of China (62072255).
Abstract: The rapid growth of mobile applications, together with the popularity and openness of the Android system, has attracted many hackers and even criminals, who are creating large amounts of Android malware. However, current methods of Android malware detection require a lot of time in the feature engineering phase. Furthermore, these models suffer from a low detection rate, high complexity, and poor practicability. We analyze Android malware samples and the distribution of malware and benign software across application programming interface (API) calls, permissions, and other attributes. We classify the software's threat levels based on the correlation of features. Then, we propose deep neural networks and convolutional neural networks with ensemble learning (DCEL), a new classifier fusion model for Android malware detection. First, DCEL preprocesses the malware data to remove redundant data and converts the one-dimensional data into a two-dimensional gray image. Then, the ensemble learning approach is used to combine the deep neural network with the convolutional neural network, and the final classification results are obtained by voting on the predictions of the individual classifiers. Experiments on the Drebin and Malgenome datasets show that, compared with current state-of-the-art models, the proposed DCEL has a higher detection rate, a higher recall rate, and a lower computational cost.
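Two steps in this abstract lend themselves to a short illustration: converting a one-dimensional feature vector into a two-dimensional gray image, and voting over the predictions of individual classifiers. The sketch below makes several assumptions that the abstract does not confirm: features are non-negative, the image is square and zero-padded, pixel values are scaled to 0 to 255, and the vote is a simple unweighted majority. The function names are hypothetical.

```python
import math
from collections import Counter


def vector_to_gray_image(features, max_value=255):
    """Reshape a 1-D feature vector into a square 2-D grid, zero-padded."""
    side = math.ceil(math.sqrt(len(features)))
    top = max(features, default=0) or 1  # avoid division by zero
    # Scale each (assumed non-negative) feature to a gray level.
    pixels = [round(v / top * max_value) for v in features]
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def majority_vote(labels):
    """Return the most common label among the classifiers' predictions."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical binary permission/API-call features for one app.
image = vector_to_gray_image([1, 0, 1, 1, 0])
# Hypothetical predictions from three individual classifiers.
verdict = majority_vote(["malware", "benign", "malware"])
```

The five features are padded to a 3 by 3 grid, which is the smallest square that holds them; a real pipeline would fix the image size across all samples so that the convolutional network sees inputs of a constant shape.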
Funding: Supported by the Ningxia Key R&D Program (Key) Project (2023BDE02001); the Ningxia Key R&D Program (Talent Introduction Special) Project (2022YCZX0013); the North Minzu University 2022 School-Level Research Platform "Digital Agriculture Empowering Ningxia Rural Revitalization Innovation Team", Project Number: 2022PT_S10; the Yinchuan City School-Enterprise Joint Innovation Project (2022XQZD009); and the "Innovation Team for Imaging and Intelligent Information Processing" of the National Ethnic Affairs Commission.
Abstract: Widely used deep neural networks currently face limitations in achieving optimal performance for purchase intention prediction because of constraints on data volume and hyperparameter selection. To address this issue, this paper builds on the deep forest algorithm, further integrates evolutionary ensemble learning methods, and proposes a novel Deep Adaptive Evolutionary Ensemble (DAEE) model. This model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. Moreover, this paper optimizes the methods of obtaining feature vectors, enhancement vectors, and prediction results within the deep forest algorithm to enhance the model's predictive accuracy. Results demonstrate that the improved deep forest model not only is more robust but also shows an increase of 5.02% in AUC value compared to the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
Funding: This work was funded through the Research Group Program under Grant Number (R.G.P.2/382/44).
Abstract: Many plant species have a startling degree of morphological similarity, making it difficult to separate and categorize them reliably. Unknown plant species can be challenging to classify and segment using deep learning. While deep learning architectures have helped improve classification accuracy, the resulting models are often insufficiently flexible and require a large dataset to train. For the sake of taxonomy, this research proposes a hybrid method for categorizing guava, potato, and java plum leaves. Two new approaches are used to form the hybrid model suggested here. The first model, built on the MobileNetV2-UNET architecture, successfully segments the guava, potato, and java plum plant species. As a second model, we use a Plant Species Detection Stacking Ensemble Deep Learning Model (PSD-SE-DLM) to identify potatoes, java plums, and guava. The proposed models were trained using data collected in Punjab, Pakistan, consisting of images of healthy and diseased leaves from guava, java plum, and potato plants. These datasets are known as PLSD and PLSSD. Accuracy levels of 99.84% and 96.38% were achieved for the suggested PSD-SE-DLM and MobileNetV2-UNET models, respectively.