Funding: Supported through the Ongoing Research Funding Program (ORF-2025-498), King Saud University, Riyadh, Saudi Arabia.
Abstract: Android smartphones have become an integral part of daily life, which makes them targets for ransomware attacks. Such attacks encrypt user data and demand payment to recover it. Conventional detection mechanisms, such as signature-based and heuristic techniques, often fail to detect new and polymorphic ransomware samples. To address this challenge, we employed several ensemble classifiers (Random Forest, Gradient Boosting, and Bagging) alongside AutoML models. We aimed to show how AutoML can automate model selection, feature engineering, and hyperparameter optimization, minimizing manual effort while matching or improving on the performance of traditional approaches. We evaluated the framework on a publicly available dataset from the Kaggle repository that contains features of Android ransomware network traffic. The dataset comprises 392,024 flow records divided into eleven classes: ten ransomware types, including SVpeng, PornDroid, Koler, WannaLocker, and Lockerpin, and one class for regular traffic. We applied a three-step procedure that combines filter, wrapper, and embedded methods to select the most relevant features. The Bagging classifier achieved an accuracy of 99.84%, and the FLAML AutoML framework was slightly more accurate at 99.85%. This indicates that AutoML can match or exceed manually configured ensembles with minimal human assistance. Our findings indicate that AutoML is an efficient, scalable, and flexible method for detecting Android ransomware, and that it can facilitate the development of next-generation intrusion detection systems.
Funding: Supported by the National Defense Science and Technology Innovation program (18-163-15-LZ-001-004-13).
Abstract: Recent successes in artificial intelligence have revitalized interest in the spacecraft pursuit-evasion game, an interception problem with a non-cooperative maneuvering target. This paper presents an automated machine learning (AutoML) based method for generating optimal trajectories in long-distance scenarios. Compared with conventional deep neural network (DNN) methods, the proposed method dramatically reduces the reliance on manual intervention and machine learning expertise. First, the trajectory optimization problem is formulated under the assumption of continuous thrust, based on differential game theory and a costate normalization technique. Second, an AutoML technique based on the sequential model-based optimization (SMBO) framework is introduced to automate DNN design in the deep learning process. If a recommended DNN architecture exists, the tree-structured Parzen estimator (TPE) is used; otherwise, efficient neural architecture search (NAS) with network morphism is used. The result is a novel trajectory optimization method with high computational efficiency. Finally, numerical results demonstrate the feasibility and efficiency of the proposed method.
Abstract: Disturbances such as forest fires, intense winds, and insect damage exert strong impacts on forest ecosystems by shaping their structure and growth dynamics, with contributions from climate change. Consequently, there is a need for reliable and operational methods to monitor and map these disturbances in order to develop suitable management strategies. Although susceptibility assessment using machine learning methods has increased, most studies have focused on a single disturbance. Moreover, the use of Automated Machine Learning (AutoML) has seen limited exploration in the literature. In this study, susceptibility assessment for multiple forest disturbances (fires, insect damage, and wind damage) was conducted using the PyCaret AutoML framework in the Izmir Regional Forest Directorate (RFD) in Turkey. The AutoML framework compared 14 machine learning algorithms and ranked the best models by AUC (area under the curve). The extra trees classifier (ET) was selected for modeling the susceptibility of each disturbance because of its good performance (AUC values > 0.98). The study evaluated susceptibilities for both individual and multiple disturbances, creating a total of four susceptibility maps from fifteen driving factors. According to the results, 82.5% of forested areas in the Izmir RFD are susceptible to multiple disturbances at high and very high levels. Additionally, a potential forest disturbances map was created, revealing that 15.6% of forested areas in the Izmir RFD may experience no damage from the disturbances considered, while 54.2% could face damage from all three disturbances. The SHAP (SHapley Additive exPlanations) methodology was applied to evaluate the importance of features for prediction and the nonlinear relationship between explanatory features and susceptibility to disturbance.
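The model-ranking step in the abstract above (compare candidate classifiers, sort by AUC) can be illustrated without PyCaret itself. The following is a minimal sketch in plain Python, assuming binary susceptibility labels and per-sample scores. The model names and score values are hypothetical toy data, not results from the study, and the pairwise AUC formula is a standard identity rather than PyCaret's internal implementation.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs that are ordered correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_models(labels, model_scores):
    """Return (model name, AUC) pairs sorted from best to worst AUC."""
    results = [(name, auc(labels, s)) for name, s in model_scores.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = disturbed cell, 0 = undisturbed cell.
labels = [1, 1, 0, 0, 1, 0]
model_scores = {
    "model_a": [0.9, 0.8, 0.3, 0.2, 0.7, 0.4],  # separates the classes perfectly
    "model_b": [0.6, 0.4, 0.5, 0.2, 0.7, 0.3],  # one positive/negative pair mis-ordered
}
ranking = rank_models(labels, model_scores)
for name, value in ranking:
    print(f"{name}: AUC = {value:.3f}")
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a real susceptibility study with many raster cells would use a sort-based implementation from an established library.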
Abstract: The aims of this study were threefold: 1) to study the research gap in carpark and price index research via big data and natural language processing, 2) to examine the research gap in carpark indices, and 3) to construct carpark price indices via repeat sales methods and predict them via AutoML. By searching for the keyword "carpark" in Google Scholar, the largest electronic academic database, which covers Web of Science and Scopus indexed articles, this study obtained 999 articles and book chapters published from 1910 to 2019. It confirmed that most carpark research focused on multi-storey carparks, management and ventilation systems, and reinforced concrete carparks. The most common research method was the case study. Regarding price index research, many previous studies focused on consumer, stock, press and futures, with many keywords related to finance and economics. These results indicated that no research had predicted carpark price indices with an AutoML approach. This study constructed repeat sales indices for 18 districts in Hong Kong using 34,562 carpark transaction records from December 2009 to June 2019. Wanchai's carpark price was about four times that of Yuen Long, indicating considerable carpark price differences within Hong Kong. The research identified the features that affected the carpark price index models most: gold price ranked first in all 19 models; oil price or Link stock price ranked second, depending on the district; and carpark affordability ranked third.
Funding: Enrique Frias-Martinez would like to thank the IBM-UNIR Chair on Data Science in Education and the Research Institute for Innovation and Technology in Education (UNIR iTED) for partially funding this research.
Abstract: Alzheimer's disease (AD) diagnosis and prognosis increasingly rely on machine learning (ML) models. Although these models provide good results, clinical adoption is limited by the need for technical expertise and by the lack of trustworthy and consistent model explanations. SHAP (SHapley Additive exPlanations) is commonly used to interpret AD models, but existing studies tend to focus on explanations for isolated tasks, providing little evidence about their robustness across disease stages, model architectures, or prediction objectives. This paper proposes a multilevel explainability framework that measures the coherence, stability, and consistency of explanations by integrating: (1) within-model coherence metrics between feature importance and SHAP, (2) SHAP stability across AD boundaries, and (3) SHAP cross-task consistency between diagnosis and prognosis. Using AutoML to optimize classifiers on the NACC dataset, we trained four diagnostic and four prognostic models covering the standard AD progression stages: normal control (NC), mild cognitive impairment (MCI), and AD. For each model, we generated SHAP and feature importance (FI) plots. Stability was then evaluated using correlation metrics (Spearman, Kendall), top-k feature overlap (Jaccard@10/20), SHAP sign consistency, and domain-level contribution ratios. Results show that cognitive and functional markers (e.g., MEMORY, JUDGMENT, ORIENT, PAYATTN) dominate SHAP explanations in both diagnosis and prognosis. SHAP-SHAP consistency between diagnostic and prognostic models was high across all classifiers (ρ = 0.61–0.94), with 100% sign stability and minimal shifts in explanatory magnitude (mean Δ|SHAP| < 0.03). Domain-level contributions also remained stable, with only minimal increases in genetic features for prognosis. These results demonstrate that SHAP explanations can be quantitatively validated for robustness and transferability, providing clinicians with more reliable interpretations of ML predictions. The proposed framework provides a reproducible methodology for evaluating explainability stability and coherence, supporting the deployment of trustworthy ML systems in AD clinical settings.
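Three of the stability metrics named in the abstract above (Spearman correlation, top-k Jaccard overlap, and sign consistency) can be sketched in plain Python. This is a minimal illustration under stated assumptions: each model is summarized by one signed mean SHAP value per feature, the feature `OTHER` is a placeholder, and all numeric values are invented toy data, not figures from the paper. The paper's exact definitions may differ.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def jaccard_at_k(shap_a, shap_b, k):
    """Jaccard overlap of the k features with the largest |SHAP| in each model."""
    top_a = set(sorted(shap_a, key=lambda f: abs(shap_a[f]), reverse=True)[:k])
    top_b = set(sorted(shap_b, key=lambda f: abs(shap_b[f]), reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)


def sign_consistency(shap_a, shap_b):
    """Fraction of shared features whose SHAP sign agrees across the two models."""
    shared = [f for f in shap_a if f in shap_b]
    return sum((shap_a[f] > 0) == (shap_b[f] > 0) for f in shared) / len(shared)


# Hypothetical signed mean SHAP values for one diagnostic and one prognostic model.
diag = {"MEMORY": 0.40, "JUDGMENT": 0.30, "ORIENT": 0.20, "PAYATTN": 0.10, "OTHER": -0.05}
prog = {"MEMORY": 0.35, "JUDGMENT": 0.25, "ORIENT": 0.30, "PAYATTN": 0.05, "OTHER": -0.10}

features = list(diag)
rho = spearman([abs(diag[f]) for f in features], [abs(prog[f]) for f in features])
overlap_top2 = jaccard_at_k(diag, prog, 2)
overlap_top3 = jaccard_at_k(diag, prog, 3)
signs = sign_consistency(diag, prog)
print(rho, overlap_top2, overlap_top3, signs)
```

In this toy case the two models agree on the top three features but not on their order, so Jaccard@3 is perfect while Jaccard@2 and Spearman's ρ both drop. That is the kind of difference the combined metrics are meant to expose.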
Abstract: AutoML (Automated Machine Learning) is an emerging field that aims to automate the process of building machine learning models. AutoML emerged to increase productivity and efficiency by automating, as far as possible, the inefficient work that is repeated whenever machine learning is applied. In particular, there has been long-standing research on technologies that can develop high-quality models effectively by minimizing the intervention of model developers across the process, from data preprocessing to algorithm selection and tuning. In this semantic review, we summarize the data processing requirements for AutoML approaches and provide a detailed explanation. We place greater emphasis on neural architecture search (NAS), as it is currently a highly popular sub-topic within AutoML. NAS methods use machine learning algorithms to search a large space of possible architectures and find the one that performs best on a given task. We summarize the performance achieved by representative NAS algorithms on CIFAR-10, CIFAR-100, ImageNet, and other well-known benchmark datasets. Additionally, we examine several noteworthy research directions in NAS, including one/two-stage NAS, one-shot NAS, and joint hyperparameter and architecture optimization. We discuss how the size and complexity of the NAS search space can vary depending on the specific problem being addressed. To conclude, we examine several open problems (SOTA problems) in current AutoML methods that warrant further investigation in future research.
Funding: Supported by the University of Alabama at Birmingham, the National Institutes of Health/National Cancer Institute Award (LRP0000018407), and the National Center for Advancing Translational Sciences (5KL2TR003097-05).
Abstract: Background: Segmentation of abdominal organs in computed tomography (CT) images within clinical oncological workflows is crucial for ensuring effective treatment planning and follow-up. However, manually generated segmentations are time-consuming and labor-intensive, and they are subject to inter-observer variability. Many deep learning and automated machine learning (AutoML) frameworks have emerged as a solution to this challenge and show promise in clinical workflows. Objective: This study aims to provide a comprehensive evaluation of existing AutoML frameworks (Auto3DSeg, nnU-Net) against a state-of-the-art non-AutoML framework, the Shifted Window U-Net Transformer (SwinUNETR). Methods: Each framework was trained on the same 122 training images, taken from the Abdominal Multi-Organ Segmentation (AMOS) Grand Challenge. Frameworks were compared using the Dice similarity coefficient (DSC), surface DSC (sDSC), and 95th percentile Hausdorff distance (HD95) on an additional 72 holdout-validation images. The perceived clinical viability of 30 auto-contoured test cases was assessed by three physicians in a blinded evaluation. Results: Comparisons show significantly better performance by AutoML methods: nnU-Net (average DSC: 0.924, average sDSC: 0.938, average HD95: 4.26, median Likert: 4.57), Auto3DSeg (average DSC: 0.902, average sDSC: 0.919, average HD95: 8.76, median Likert: 4.49), and SwinUNETR (average DSC: 0.837, average sDSC: 0.844, average HD95: 13.93). AutoML frameworks were quantitatively preferred (13/13 organs at risk [OARs] P < 0.05 in DSC and sDSC and 12/13 OARs P < 0.05 in HD95 when comparing Auto3DSeg to SwinUNETR, and all OARs P < 0.05 in all metrics when comparing SwinUNETR to nnU-Net). Qualitatively, nnU-Net was preferred over Auto3DSeg (P = 0.0027). Conclusion: The findings suggest that AutoML frameworks offer a significant advantage in the segmentation of abdominal organs, and they underscore the potential of AutoML methods to enhance the efficiency of oncological workflows.
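The primary metric in the abstract above, the Dice similarity coefficient, is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it in plain Python. It assumes binary masks stored as nested lists and uses a hypothetical 3×3 two-dimensional toy example, whereas the study works with 3D CT volumes. Surface DSC and HD95 need boundary extraction and distance computations and are not shown.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (nested lists of 0/1)."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(a * b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * intersection / total


# Hypothetical toy masks: 1 marks pixels labelled as the organ.
reference = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
predicted = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
score = dice(reference, predicted)
print(f"DSC = {score:.2f}")
```

Both masks contain four foreground pixels and share three of them, so the score is 2 × 3 / (4 + 4) = 0.75.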
Abstract: Objective: To develop and evaluate an automated system for digitizing audiograms and classifying hearing loss levels, and to compare its performance with traditional methods and otolaryngologists' interpretations. Design and Methods: We conducted a retrospective diagnostic study using 1,959 audiogram images from patients aged 7 years and older at the Faculty of Medicine, Vajira Hospital, Navamindradhiraj University. We employed an object detection approach to digitize audiograms and developed multiple machine learning models to classify six hearing loss levels. The dataset was split into 70% training (1,407 images) and 30% testing (352 images) sets. We compared our model's performance with classifications based on manually extracted audiogram values and with otolaryngologists' interpretations. Results: Our object detection-based model achieved an F1-score of 94.72% in classifying hearing loss levels, comparable to the 96.43% F1-score obtained using manually extracted values. The Light Gradient Boosting Machine (LGBM) model, used as the classifier for the manually extracted data, achieved top performance with 94.72% accuracy, 94.72% F1-score, 94.72% recall, and 94.72% precision. In the object detection-based model, the Random Forest Classifier (RFC) showed the highest accuracy in predicting hearing loss level (96.43%), with an F1-score of 96.43%, recall of 96.43%, and precision of 96.45%. Conclusion: Our proposed automated approach for audiogram digitization and hearing loss classification performs comparably to traditional methods and otolaryngologists' interpretations. This system can potentially assist otolaryngologists in providing more timely and effective treatment by quickly and accurately classifying hearing loss.
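The abstract above reports accuracy, precision, recall, and F1-score for a multi-class classifier but does not state how the per-class scores were averaged. The following is a minimal sketch assuming macro averaging (an unweighted mean over classes). The three class names and all predictions are hypothetical toy data, and the study itself uses six hearing loss levels.

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 for label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (no predictions or no true samples) are scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels; the class names are placeholders, not the study's levels.
y_true = ["normal", "normal", "mild", "mild", "severe", "severe"]
y_pred = ["normal", "mild", "mild", "mild", "severe", "severe"]
scores = macro_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With macro averaging the four numbers generally differ, as they do here. Micro or support-weighted averaging can make several of them coincide, so the averaging mode matters when reading reported values.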