Short Message Service (SMS) is a widely used and cost-effective communication medium that has unfortunately become a frequent target for unsolicited messages, commonly known as SMS spam. With the rapid adoption of smartphones and increased Internet connectivity, SMS spam has emerged as a prevalent threat. Spammers have recognized the critical role SMS plays in modern communication, making it a prime target for abuse. As cybersecurity threats continue to evolve, the volume of SMS spam has increased substantially in recent years. Moreover, the unstructured format of SMS data creates significant challenges for spam detection, making spam attacks more difficult to combat. In this paper, we present an optimized, fine-tuned transformer-based language model to address the problem of SMS spam detection. We analyze this spam detection model on a benchmark SMS spam dataset. Additionally, we apply pre-processing techniques to obtain clean, noise-free data and address the class imbalance problem by leveraging text augmentation techniques. Overall, our optimized, fine-tuned RoBERTa model, a variant of BERT (Bidirectional Encoder Representations from Transformers), obtained a high accuracy of 99.84%. To further enhance model transparency, we incorporate Explainable Artificial Intelligence (XAI) techniques that compute positive and negative coefficient scores, offering insight into the model's decision-making process. Additionally, we evaluate the performance of traditional machine learning models as a baseline for comparison. This comprehensive analysis demonstrates the significant impact language models can have on addressing complex text-based challenges within the cybersecurity landscape.
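The RoBERTa fine-tuning itself needs a transformer stack, but the positive/negative per-token coefficient scores that the XAI step surfaces can be illustrated with a from-scratch Naive Bayes sketch; the corpus, labels, and function names below are illustrative assumptions, not the paper's pipeline.

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Train a multinomial Naive Bayes spam model and return per-token
    log-odds 'coefficient' scores (positive -> spam, negative -> ham)."""
    spam, ham = Counter(), Counter()
    for doc, y in zip(docs, labels):
        (spam if y == 1 else ham).update(doc.lower().split())
    vocab = set(spam) | set(ham)
    n_spam, n_ham = sum(spam.values()), sum(ham.values())
    coef = {}
    for w in vocab:
        # Laplace-smoothed per-class token probabilities
        p_spam = (spam[w] + alpha) / (n_spam + alpha * len(vocab))
        p_ham = (ham[w] + alpha) / (n_ham + alpha * len(vocab))
        coef[w] = math.log(p_spam / p_ham)
    return coef

# Toy corpus: 1 = spam, 0 = ham
docs = ["win a free prize now", "free cash prize win",
        "see you at lunch", "are you free for lunch"]
labels = [1, 1, 0, 0]
coef = train_nb(docs, labels)
```

Tokens like "prize" end up with positive scores and tokens like "lunch" with negative ones, mirroring the sign-based interpretation the abstract describes.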
In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable progress. Accurate predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce overtreatment. However, the widespread adoption of these models in clinical practice demands predictive performance along with interpretability and transparency. This paper proposes a novel association-rule-based, feature-integrated machine learning model that shows better classification and prediction accuracy than present state-of-the-art models. Our study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a powerful tool for explaining thyroid cancer prediction models. In the proposed method, the association-rule-based feature integration framework identifies frequently occurring attribute combinations in the dataset. The original dataset is used to train machine learning models, which in turn are used to generate SHAP values. In the next phase, the dataset is integrated with the dominant feature sets identified through association-rule-based analysis. This new integrated dataset is used to re-train the machine learning models. The new SHAP values generated from these models help validate the contributions of feature sets in predicting malignancy. Conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making systems. In this study, SHAP values are introduced along with association-rule-based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the predictions. The study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework for explainability. The proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUROC) are also higher than those of the baseline models. The results help identify the dominant feature sets that impact thyroid cancer classification and prediction. The features {calcification} and {shape} consistently emerged as the top-ranked features associated with thyroid malignancy, in both the association-rule-based interestingness metrics and the SHAP analysis. The paper highlights the potential of rule-based integrated models with SHAP in bridging the gap between machine learning predictions and the interpretability required for real-world medical applications.
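The association-rule side of such a framework, mining frequent attribute combinations and scoring an interestingness metric such as lift, can be sketched from scratch. The transaction encoding and thresholds below are invented; only the feature names echo the abstract.

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support=0.5, max_len=2):
    """Count itemsets of up to max_len items that meet a minimum support."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def lift(freq, a, b):
    """lift(A -> B) = support(A ∪ B) / (support(A) * support(B))."""
    return freq[tuple(sorted(a + b))] / (freq[a] * freq[b])

# Invented toy transactions: per-patient attribute sets
tx = [["calcification", "shape", "malignant"],
      ["calcification", "shape", "malignant"],
      ["calcification", "benign"],
      ["shape", "malignant"]]
freq = frequent_itemsets(tx, min_support=0.5)
```

A lift above 1 (here for {shape} → {malignant}) signals an attribute combination occurring more often than chance, which is the kind of "dominant feature set" the paper feeds back into re-training.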
Wildfires significantly disrupt the physical and hydrologic conditions of the environment, leading to vegetation loss and altered surface geo-material properties. These complex dynamics promote post-fire gully erosion, yet the key conditioning factors (e.g., topography, hydrology) remain insufficiently understood. This study proposes a novel artificial intelligence (AI) framework that integrates four machine learning (ML) models with the SHapley Additive exPlanations (SHAP) method, offering a hierarchical, global-to-local perspective on the dominant factors controlling gully distribution in wildfire-affected areas. In a case study of the Xiangjiao catchment, burned on March 28, 2020, in Muli County, Sichuan Province, Southwest China, we derived 21 geo-environmental factors to assess post-fire gully erosion susceptibility using logistic regression (LR), support vector machine (SVM), random forest (RF), and convolutional neural network (CNN) models. SHAP-based model interpretation revealed eight key conditioning factors: topographic position index (TPI), topographic wetness index (TWI), distance to stream, mean annual precipitation, differenced normalized burn ratio (dNBR), land use/cover, soil type, and distance to road. Comparative model evaluation demonstrated that reduced-variable models incorporating these dominant factors achieved accuracy comparable to that of the initial-variable models, with AUC values exceeding 0.868 across all ML algorithms. These findings provide critical insights into gully erosion behavior in wildfire-affected areas, supporting decision-making in environmental management and hazard mitigation.
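SHAP values approximate the Shapley attribution illustrated below. For a toy two-factor "susceptibility model" the exact Shapley values are tractable by enumerating feature orderings; the factor names TPI and TWI follow the abstract, but the value function is invented for illustration.

```python
from itertools import permutations

def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution of each
    feature over all orderings (tractable only for small feature sets)."""
    phi = {f: 0.0 for f in features}
    perms = list(permutations(features))
    for order in perms:
        seen = set()
        for f in order:
            before = value_fn(frozenset(seen))
            seen.add(f)
            phi[f] += value_fn(frozenset(seen)) - before
    return {f: v / len(perms) for f, v in phi.items()}

# Toy model: TPI contributes 2, TWI contributes 1, plus +1 interaction.
def model(subset):
    v = 0.0
    if "TPI" in subset: v += 2
    if "TWI" in subset: v += 1
    if {"TPI", "TWI"} <= subset: v += 1
    return v

phi = shapley_values(["TPI", "TWI"], model)
```

The attributions split the interaction evenly and sum to the full-model output, which is the efficiency property that makes SHAP rankings comparable across factors.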
Self-Explaining Autonomous Systems (SEAS) have emerged as a strategic frontier within Artificial Intelligence (AI), responding to growing demands for transparency and interpretability in autonomous decision-making. This study presents a comprehensive bibliometric analysis of SEAS research published between 2020 and February 2025, drawing upon 1380 documents indexed in Scopus. The analysis applies co-citation mapping, keyword co-occurrence, and author collaboration networks using VOSviewer, MASHA, and Python to examine scientific production, intellectual structure, and global collaboration patterns. The results indicate a sustained annual growth rate of 41.38%, with an h-index of 57 and an average of 21.97 citations per document. A normalized citation rate was computed to address temporal bias, enabling balanced evaluation across publication cohorts. Thematic analysis reveals four consolidated research fronts: interpretability in machine learning, explainability in deep neural networks, transparency in generative models, and optimization strategies in autonomous control. Author co-citation analysis identifies four distinct research communities, and keyword evolution shows growing interdisciplinary links with medicine, cybersecurity, and industrial automation. Geographically, the United States leads in scientific output and citation impact, while countries such as India and China show high productivity with varied influence. However, international collaboration remains limited at 7.39%, reflecting a fragmented research landscape. As discussed in this study, SEAS research is expanding rapidly yet remains epistemologically dispersed, with uneven integration of ethical and human-centered perspectives. This work offers a structured, data-driven perspective on SEAS development, highlights key contributors and thematic trends, and outlines critical directions for advancing responsible and transparent autonomous systems.
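The reported h-index follows the standard definition and is easy to compute from a citation list; the citation counts below are invented, not the study's data.

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers have >= h citations."""
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i       # the i-th most-cited paper still has >= i citations
        else:
            break
    return h
```

For example, `h_index([10, 8, 5, 4, 3])` is 4: four papers have at least four citations each, but not five papers with at least five.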
In the evolving landscape of cyber threats, phishing attacks pose significant challenges, particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy. Conventional and machine learning (ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics. Furthermore, newer phishing websites exhibit subtle, expertly concealed indicators that are not readily detectable, so effective detection depends on identifying the most critical features. Traditional feature selection (FS) methods often fail to enhance ML model performance and may instead degrade it. To combat these issues, we propose an innovative method that uses explainable AI (XAI) to enhance FS in ML models and improve the identification of phishing websites. Specifically, we employ SHapley Additive exPlanations (SHAP) for a global perspective and aggregated Local Interpretable Model-agnostic Explanations (LIME) to determine specific localized patterns. The proposed SHAP- and LIME-aggregated FS (SLA-FS) framework pinpoints the most informative features, enabling more precise, swift, and adaptable phishing detection. Applying this approach to an up-to-date web phishing dataset, we evaluate the performance of three ML models before and after FS. Our findings reveal that random forest (RF), with an accuracy of 97.41%, and XGBoost (XGB), at 97.21%, benefit significantly from the SLA-FS framework, while k-nearest neighbors lags behind. Our framework increases the accuracy of RF and XGB by 0.65% and 0.41%, respectively, outperforming traditional filter and wrapper methods as well as prior methods evaluated on this dataset, showcasing its potential.
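The abstract does not spell out how the global SHAP scores and aggregated LIME weights are merged in SLA-FS. One plausible sketch, under the assumption of min-max normalization followed by simple averaging, is below; all feature names, scores, and the function name are illustrative.

```python
def sla_fs(global_scores, local_weights, k):
    """Hypothetical sketch of SHAP+LIME-aggregated feature selection:
    average a global importance score with the mean |local weight|
    per feature, then keep the top-k features."""
    agg_local = {f: sum(abs(w[f]) for w in local_weights) / len(local_weights)
                 for f in global_scores}

    def norm(d):
        # min-max scale to [0, 1] so the two score scales are comparable
        lo, hi = min(d.values()), max(d.values())
        return {f: (v - lo) / (hi - lo) if hi > lo else 0.0 for f, v in d.items()}

    g, l = norm(global_scores), norm(agg_local)
    combined = {f: (g[f] + l[f]) / 2 for f in global_scores}
    return sorted(combined, key=combined.get, reverse=True)[:k]

# Invented global (SHAP-like) scores and per-sample local (LIME-like) weights
global_scores = {"url_len": 0.9, "has_ip": 0.7, "num_dots": 0.2, "age": 0.1}
locals_ = [{"url_len": 0.5, "has_ip": -0.6, "num_dots": 0.1, "age": 0.0},
           {"url_len": 0.4, "has_ip": -0.5, "num_dots": 0.0, "age": 0.1}]
top = sla_fs(global_scores, locals_, k=2)
```

Taking absolute local weights matters: a feature that strongly pushes predictions toward "legitimate" is just as informative for selection as one pushing toward "phishing".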
Cardiovascular disease (CVD) remains a leading global health challenge due to its high mortality rate and the complexity of early diagnosis, driven by risk factors such as hypertension, high cholesterol, and irregular pulse rates. Traditional diagnostic methods often struggle with the nuanced interplay of these risk factors, making early detection difficult. In this research, we propose a novel artificial-intelligence-enabled (AI-enabled) framework for CVD risk prediction that integrates machine learning (ML) with eXplainable AI (XAI) to provide both high-accuracy predictions and transparent, interpretable insights. Compared to existing studies that typically focus either on optimizing ML performance or on using XAI separately for local or global explanations, our approach uniquely combines local and global interpretability using Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). This dual integration enhances the interpretability of the model and helps clinicians comprehensively understand not just what the model predicts but also why, by identifying the contribution of different risk factors, which is crucial for transparent and informed decision-making in healthcare. The framework uses ML techniques such as K-nearest neighbors (KNN), gradient boosting, random forest, and decision tree, trained on a cardiovascular dataset. Additionally, the integration of LIME and SHAP provides patient-specific insights alongside global trends, ensuring that clinicians receive comprehensive and actionable information. Our experiments achieve 98% accuracy with the random forest model, with precision, recall, and F1-scores of 97%, 98%, and 98%, respectively. The innovative combination of SHAP and LIME sets a new benchmark in CVD prediction by integrating advanced ML accuracy with robust interpretability, filling a critical gap in existing approaches. This framework paves the way for more explainable and transparent decision-making in healthcare, ensuring that the model is not only accurate but also trustworthy and actionable for clinicians.
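LIME's core idea, fitting a proximity-weighted linear surrogate around one patient's feature vector, can be sketched in one dimension. The "risk model", kernel width, and sample counts below are all invented; real LIME perturbs many features at once and fits a sparse linear model.

```python
import math, random

def lime_1d(f, x0, width=1.0, n=200, seed=0):
    """Toy LIME-style local surrogate for a 1-D model: sample near x0,
    weight samples by an RBF proximity kernel, and fit a weighted
    linear model whose slope approximates the local feature effect."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0, width) for _ in range(n)]
    ws = [math.exp(-((x - x0) ** 2) / (2 * width ** 2)) for x in xs]
    ys = [f(x) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    # weighted least-squares slope
    return (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
            / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))

# Invented risk model: probability rises with standardized blood pressure.
risk = lambda bp: 1 / (1 + math.exp(-3 * bp))
slope = lime_1d(risk, x0=0.0)
```

The positive slope is the "this factor pushes risk up for this patient" signal a clinician would read off a LIME explanation.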
This study presents a framework for the semi-automatic detection of rock discontinuities from a three-dimensional (3D) point cloud (PC). The process begins by selecting an appropriate neighborhood size, a critical step for feature extraction from the PC. The effects of different neighborhood sizes (k = 5, 10, 20, 50, and 100) were evaluated to assess their impact on classification performance. After that, 17 geometric and spatial features were extracted from the PC. Next, the ensemble methods AdaBoost.M2, random forest, and decision tree were compared with artificial neural networks for classifying the main discontinuity sets. The McNemar test indicates that the differences between classifiers are statistically significant. The random forest classifier consistently achieves the highest performance, with an accuracy exceeding 95% when using a neighborhood size of k = 100, while recall, F-score, and Cohen's kappa are also high. SHapley Additive exPlanations (SHAP), an explainable AI technique, was used to evaluate feature importance and improve the explainability of black-box machine learning models in the context of rock discontinuity classification. The analysis reveals that features such as normal vectors, verticality, and Z-values have the greatest influence on identifying the main discontinuity sets, while linearity, planarity, and eigenvalues contribute less, making the model more transparent and easier to understand. After classification, individual discontinuities were extracted from the main discontinuity sets using a revised DBSCAN. Finally, the orientation parameters of the plane fitted to each discontinuity were derived from the plane parameters obtained using Random Sample Consensus (RANSAC). Two real-world datasets (obtained from SfM and LiDAR) and one synthetic dataset were used to validate the proposed method, which successfully identified rock discontinuities and their orientation parameters (dip angle/direction).
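A toy version of the final stage, RANSAC plane fitting followed by deriving the dip angle from the plane normal, can be sketched on a synthetic planar point set with outliers. The tolerances, iteration counts, and test geometry are illustrative, not the study's settings.

```python
import math, random

def fit_plane_ransac(points, n_iter=200, tol=0.05, seed=1):
    """Toy RANSAC: repeatedly fit a plane through 3 random points and
    keep the candidate with the most inliers; returns the unit normal."""
    rng = random.Random(seed)
    best_normal, best_inliers = None, -1
    for _ in range(n_iter):
        p1, p2, p3 = rng.sample(points, 3)
        u = [p2[i] - p1[i] for i in range(3)]
        v = [p3[i] - p1[i] for i in range(3)]
        n = [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]
        norm = math.sqrt(sum(c * c for c in n))
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        n = [c / norm for c in n]
        d = -sum(n[i] * p1[i] for i in range(3))
        inliers = sum(abs(sum(n[i]*p[i] for i in range(3)) + d) < tol for p in points)
        if inliers > best_inliers:
            best_inliers, best_normal = inliers, n
    return best_normal

def dip_angle(normal):
    """Dip angle (degrees) of a plane from its unit normal vector."""
    return math.degrees(math.acos(abs(normal[2])))

# Synthetic points on the plane z = 0.5x (dip = atan(0.5)), plus two outliers.
rng = random.Random(0)
pts = [(x, y, 0.5 * x + rng.gauss(0, 0.01)) for x in range(5) for y in range(5)]
pts += [(1.0, 1.0, 5.0), (2.0, 3.0, -4.0)]
normal = fit_plane_ransac(pts)
```

RANSAC's inlier vote makes the fit robust to the two gross outliers, which is why it is the standard choice for orientation extraction from noisy point-cloud segments.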
Machine learning (ML) models are widely used for predicting undrained shear strength (USS), but interpretability has been a limitation in many studies. Therefore, this study introduces SHapley Additive exPlanations (SHAP) to clarify the contribution of each input feature to USS prediction. Three ML models, artificial neural network (ANN), extreme gradient boosting (XGBoost), and random forest (RF), were employed, with accuracy evaluated using mean squared error, mean absolute error, and the coefficient of determination (R²). The RF achieved the highest performance, with an R² of 0.82. SHAP analysis identified pre-consolidation stress as a key contributor to USS prediction. SHAP dependence plots reveal that the ANN captures smoother, near-linear feature-output relationships, while the RF handles complex, non-linear interactions more effectively. This suggests a non-linear relationship between USS and the input features, with RF outperforming ANN. These findings highlight SHAP's role in enhancing interpretability and promoting transparency and reliability in ML predictions for geotechnical applications.
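The three accuracy measures used to score the USS models are standard and can be computed directly; the numbers below are invented, not the study's data.

```python
def regression_metrics(y_true, y_pred):
    """Mean squared error, mean absolute error, and coefficient of
    determination R^2 for a set of regression predictions."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mean_t = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot    # fraction of variance explained
    return mse, mae, r2

mse, mae, r2 = regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```

An R² of 0.82, as reported for the RF, means the model explains 82% of the variance in USS that a mean-only predictor would leave unexplained.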
The rapid growth of online communities has brought an increase in cyber threats, including cyberbullying, hate speech, misinformation, and online harassment, making content moderation a pressing necessity. Traditional single-modal AI-based detection systems, which analyze text, images, or videos in isolation, have proven ineffective at capturing multi-modal threats, in which malicious actors spread harmful content across multiple formats. To address these challenges, we propose a multi-modal deep learning framework that integrates Natural Language Processing (NLP), Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to identify and mitigate online threats effectively. Our proposed model combines BERT for text classification, ResNet50 for image processing, and a hybrid LSTM-3D CNN network for video content analysis. We constructed a large-scale dataset comprising 500,000 textual posts, 200,000 offensive images, and 50,000 annotated videos from multiple platforms, including Twitter, Reddit, YouTube, and online gaming forums. The system was carefully evaluated using standard machine learning metrics, including accuracy, precision, recall, F1-score, and ROC-AUC curves. Experimental results demonstrate that our multi-modal approach significantly outperforms single-modal AI classifiers, achieving an accuracy of 92.3%, precision of 91.2%, recall of 90.1%, and an AUC score of 0.95. The findings validate the necessity of integrating multi-modal AI for real-time, high-accuracy online threat detection and moderation. Future work will focus on improving adversarial robustness, enhancing scalability for real-world deployment, and addressing ethical concerns associated with AI-driven content moderation.
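The abstract does not state how the three modality branches are combined; a common (here assumed) choice is weighted late fusion of per-modality threat probabilities. All scores and weights below are illustrative.

```python
def late_fusion(scores, weights):
    """Hypothetical late-fusion sketch: combine per-modality threat
    probabilities (text, image, video) with a weighted average."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[m] * s for m, s in scores.items())

scores = {"text": 0.9, "image": 0.4, "video": 0.6}   # branch classifier outputs
weights = {"text": 0.5, "image": 0.3, "video": 0.2}  # e.g., tuned on validation data
p = late_fusion(scores, weights)
```

Late fusion keeps each branch independently trainable; an alternative is early fusion of embeddings, which can capture cross-modal interactions at the cost of a joint architecture.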
In the field of precision healthcare, where accurate decision-making is paramount, this study underscores the indispensability of eXplainable Artificial Intelligence (XAI) in the context of epilepsy management within the Internet of Medical Things (IoMT). The methodology entails meticulous preprocessing, involving the application of a band-pass filter and epoch segmentation to optimize the quality of electroencephalograph (EEG) data. The subsequent extraction of statistical features facilitates the differentiation between seizure and non-seizure patterns. The classification phase integrates Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and random forest classifiers. Notably, SVM attains an accuracy of 97.26%, excelling in precision, recall, specificity, and F1 score for identifying seizure and non-seizure instances. Conversely, KNN achieves an accuracy of 72.69%, accompanied by certain trade-offs. The random forest classifier stands out with a remarkable accuracy of 99.89%, coupled with exceptional precision (99.73%), recall (100%), specificity (99.80%), and F1 score (99.86%), surpassing both the SVM and KNN performances. XAI techniques, namely Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), enhance the system's transparency. This combination of machine learning and XAI not only improves the reliability and accuracy of the seizure detection system but also enhances trust and interpretability. Healthcare professionals can leverage the identified important features and their dependencies to gain deeper insights into the decision-making process, aiding informed diagnosis and treatment decisions for patients with epilepsy.
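Epoch segmentation and per-epoch statistical feature extraction can be sketched as follows. Band-pass filtering is omitted here, and the sampling rate, epoch length, and test signal are invented.

```python
import math

def epochs(signal, fs, epoch_s):
    """Split an EEG channel into non-overlapping epochs of epoch_s seconds."""
    step = int(fs * epoch_s)
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]

def stats(epoch):
    """Simple statistical features per epoch: mean, std, peak-to-peak."""
    n = len(epoch)
    mean = sum(epoch) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in epoch) / n)
    return {"mean": mean, "std": std, "ptp": max(epoch) - min(epoch)}

# Invented test signal: a 10 Hz sinusoid, 2 s at 256 Hz
sig = [math.sin(2 * math.pi * 10 * t / 256) for t in range(512)]
eps = epochs(sig, fs=256, epoch_s=1)
feats = [stats(e) for e in eps]
```

Each epoch then becomes one row of the feature matrix fed to the SVM/KNN/random forest stage, with the epoch's seizure/non-seizure annotation as its label.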
Accurate and interpretable fault diagnosis in industrial gear systems is essential for ensuring safety, reliability, and predictive maintenance. This study presents an intelligent diagnostic framework utilizing Gradient Boosting (GB) for fault detection in gear systems, applied to the Aalto Gear Fault Dataset, which features a wide range of synthetic and realistic gear failure modes under varied operating conditions. The dataset was preprocessed and analyzed using an ensemble GB classifier, yielding high performance across multiple metrics: an accuracy of 96.77%, precision of 95.44%, recall of 97.11%, and an F1-score of 96.22%. To enhance trust in model predictions, the study integrates an explainable AI (XAI) framework using SHAP (SHapley Additive exPlanations) to visualize feature contributions and support diagnostic transparency. A flowchart-based architecture is proposed to guide real-world deployment of interpretable fault detection pipelines. The results demonstrate the feasibility of combining predictive performance with interpretability, offering a robust approach for condition monitoring in safety-critical systems.
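The four reported metrics derive from confusion-matrix counts in the standard way; the counts below are invented and chosen only to make the arithmetic easy to check.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of predicted faults, how many are real
    recall = tp / (tp + fn)             # of real faults, how many were caught
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```

In a safety-critical gear system, recall is usually the metric to guard most closely, since a missed fault (false negative) is costlier than a spurious alarm.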
Predicting the health status of stroke patients at different stages of the disease is a critical clinical task. The onset and development of stroke are affected by an array of factors, encompassing genetic predisposition, environmental exposure, unhealthy lifestyle habits, and existing medical conditions. Although existing machine-learning-based methods for predicting stroke patients' health status have made significant progress, limitations remain in terms of prediction accuracy, model explainability, and system optimization. This paper proposes a multi-task learning approach based on Explainable Artificial Intelligence (XAI) for predicting the health status of stroke patients. First, we design a comprehensive multi-task learning framework that exploits the correlation among various health status indicators, enabling the parallel prediction of multiple indicators. Second, we develop a multi-task Area Under the Curve (AUC) optimization algorithm based on adaptive low-rank representation, which removes irrelevant information from the model structure to enhance multi-task AUC optimization performance. Additionally, the model's explainability is analyzed through a stability analysis of its SHAP values. Experimental results demonstrate that our approach outperforms comparison algorithms on the key prognostic metrics of F1 score and efficiency.
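AUC, the quantity the multi-task algorithm optimizes, has a simple rank-based definition: the probability that a randomly chosen positive is scored above a randomly chosen negative. The scores and labels below are invented.

```python
def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney) formulation: fraction of
    positive/negative pairs ranked correctly, ties counting half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

a = auc([0.9, 0.7, 0.4, 0.5, 0.2], [1, 1, 1, 0, 0])
```

This pairwise formulation is also why AUC is attractive for imbalanced clinical outcomes: it is invariant to the class ratio, unlike plain accuracy.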
Skin cancer is the most prevalent cancer globally, primarily due to extensive exposure to ultraviolet (UV) radiation. Early identification of skin cancer enhances the likelihood of effective treatment, as delays may lead to severe tumor advancement. This study proposes a novel hybrid deep learning strategy to address the complex issue of skin cancer diagnosis, with an architecture that integrates a Vision Transformer, a bespoke convolutional neural network (CNN), and an Xception module. The models were evaluated using two benchmark datasets, HAM10000 and Skin Cancer ISIC. On HAM10000, the model achieves a precision of 95.46%, an accuracy of 96.74%, a recall of 96.27%, a specificity of 96.00%, and an F1-score of 95.86%. It obtains an accuracy of 93.19%, a precision of 93.25%, a recall of 92.80%, a specificity of 92.89%, and an F1-score of 93.19% on the Skin Cancer ISIC dataset. The findings demonstrate that the proposed model is robust and trustworthy for skin lesion classification. In addition, the use of explainable AI techniques, such as Grad-CAM visualizations, helps highlight the lesion areas that most influence the model's decisions.
The diagnosis of brain tumors is a lengthy process that depends significantly on the expertise and skills of radiologists. The rise in patient numbers has substantially increased the data processing volume, making conventional methods both costly and inefficient. Recently, Artificial Intelligence (AI) has gained prominence for developing automated systems that can accurately diagnose or segment brain tumors in a shorter time frame. Many researchers have examined algorithms that provide both speed and accuracy in detecting and classifying brain tumors. This paper proposes a new AI-based model, called the Brain Tumor Detection (BTD) model, based on brain tumor Magnetic Resonance Images (MRIs). The proposed BTD model comprises three main modules: (i) the Image Processing Module (IPM), (ii) the Patient Detection Module (PDM), and (iii) Explainable AI (XAI). In the first module (i.e., IPM), the dataset is preprocessed in two stages: feature extraction and feature selection. First, the MRI is preprocessed; then the images are converted into a set of features using several feature extraction methods: gray-level co-occurrence matrix, histogram of oriented gradients, local binary patterns, and Tamura features. Next, the most effective features are selected separately using Improved Gray Wolf Optimization (IGWO). IGWO is a hybrid methodology consisting of a Filter Selection Step (FSS) using the information gain ratio as an initial selection stage, followed by Binary Gray Wolf Optimization (BGWO) to further optimize and refine the chosen features, improving the method's ability to detect tumors. These features are then fed to the PDM, where several classifiers are applied and the final decision is based on weighted majority voting. Finally, Local Interpretable Model-agnostic Explanations (LIME), an XAI method, provides interpretability and transparency in the decision-making process. The experiments are performed on a publicly available brain MRI dataset that consists of 98 normal cases and 154 abnormal cases. During the experiments, the dataset was divided into 70% (177 cases) for training and 30% (75 cases) for testing. The numerical findings demonstrate that the BTD model outperforms its competitors in terms of accuracy, precision, recall, and F-measure, achieving 98.8% accuracy, 97% precision, 97.5% recall, and 97.2% F-measure. The results demonstrate the potential of the proposed model to revolutionize brain tumor diagnosis, contribute to better treatment strategies, and improve patient outcomes.
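The PDM's final decision rule, weighted majority voting, can be sketched directly; the classifier names and weights below are illustrative assumptions, not the paper's configuration.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Weighted majority voting: each classifier's vote counts with its
    weight, and the label with the largest weighted tally wins."""
    tally = defaultdict(float)
    for clf, label in predictions.items():
        tally[label] += weights[clf]
    return max(tally, key=tally.get)

preds = {"svm": "abnormal", "rf": "normal", "knn": "abnormal"}
weights = {"svm": 0.5, "rf": 0.3, "knn": 0.2}   # e.g., validation accuracies
decision = weighted_vote(preds, weights)
```

Here "abnormal" wins with a tally of 0.7 against 0.3, even though the single strongest classifier alone would not decide the outcome.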
Although the digital transformation of power systems has added more ways to monitor and control them, it has also introduced new cyber-attack risks, mainly from False Data Injection (FDI) attacks. When sensors and operations are compromised, the result can be serious disruptions, failures, and blackouts. In response to this challenge, this paper presents a reliable and innovative detection framework that leverages Bidirectional Long Short-Term Memory (Bi-LSTM) networks and employs explainability methods from Artificial Intelligence (AI). The suggested architecture not only detects potential attacks with high accuracy but also makes its decisions transparent, enabling operators to take appropriate action. The method utilizes model-free, interpretable tools to identify essential input elements, thereby making predictions more understandable and usable. Detection performance is further enhanced by correcting class imbalance with Synthetic Minority Over-sampling Technique (SMOTE)-based data balancing. Detailed experiments on benchmark power system data confirm that the model functions correctly. Experimental results showed that Bi-LSTM + Explainable AI (XAI) achieved an average accuracy of 94%, surpassing XGBoost (89%) and bagging (84%), while ensuring explainability and a high level of robustness across various operating scenarios. An ablation study finds that bidirectional recurrent modeling and ReLU activation help improve generalization and model predictability. Additionally, examining model decisions through LIME identifies which features are crucial for making smart grid operational decisions in real time. The research offers a practical and flexible approach for detecting FDI attacks, improving the security of cyber-physical systems, and facilitating the deployment of AI in energy infrastructure.
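SMOTE's interpolation idea can be sketched from scratch as a minimal nearest-neighbour variant; the 2-D minority samples below are invented, and the real technique operates in the full feature space of the training data.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: create synthetic minority samples by
    interpolating between a sample and one of its k nearest neighbours."""
    rng = random.Random(seed)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        nbrs = sorted((p for p in minority if p is not x),
                      key=lambda p: dist(x, p))[:k]
        nb = rng.choice(nbrs)
        lam = rng.random()   # random point on the segment between x and nb
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new = smote(minority, n_new=3)
```

Because synthetic points lie between real minority samples rather than duplicating them, the classifier sees a denser attack-class region without exact copies that invite overfitting.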
Neonatal sepsis is the third most common cause of neonatal mortality and a serious public health problem, especially in developing countries. There has been research on human sepsis, vaccine response, and immunity, and machine learning methodologies have been used to predict infant mortality from features such as age, birth weight, gestational weeks, and the Appearance, Pulse, Grimace, Activity, and Respiration (APGAR) score. However, sepsis, one of the most decisive conditions for infant mortality, has not previously been considered in mortality prediction. We therefore deploy a state-of-the-art deep neural model and perform a comparative analysis against machine learning models to predict mortality among infants based on the most important features, including sepsis. Additionally, to assess the prediction reliability of the deep neural model, which is a black box, Explainable AI tools such as Dalex and LIME have been deployed. This helps non-technical personnel, such as doctors and practitioners, understand the predictions and make decisions accordingly.
Machine fault diagnostics are essential for industrial operations, and advancements in machine learning have significantly improved these systems by providing accurate predictions and expedited solutions. Machine learning models, especially those utilizing complex algorithms like deep learning, have demonstrated major potential in extracting important information from large operational datasets. Despite their efficiency, machine learning models face challenges, making Explainable AI (XAI) crucial for improving their understandability and fine-tuning. This study examines the importance of feature contribution and selection using XAI in the diagnosis of machine faults. The technique is applied to evaluate different machine learning algorithms: Extreme Gradient Boosting, Support Vector Machine, Gaussian Naive Bayes, and random forest classifiers are used alongside Logistic Regression (LR) as a baseline model, and their efficacy and simplicity are evaluated thoroughly with empirical analysis. XAI is used as a targeted feature selection technique to choose among 29 time- and frequency-domain features. The XAI-based approach is lightweight, trained with only the targeted features, and achieves results similar to the traditional approach: accuracy on the baseline LR without XAI is 79.57%, whereas with XAI it is 80.28%.
With the advancements in artificial intelligence (AI) technology, attackers are increasingly using sophisticated techniques, including ChatGPT. Endpoint Detection & Response (EDR) is a system that detects and responds to suspicious activities or security threats occurring on computers or endpoint devices within an organization. Unlike traditional antivirus software, EDR is more about responding to a threat after it has already occurred than blocking it. This study aims to overcome challenges in security control, such as increased log size, emerging security threats, and technical demands faced by control staff. Previous studies have focused on AI detection models, emphasizing detection rates and model performance. However, the underlying reasons behind the detection results were often insufficiently understood, leading to varying outcomes based on the learning model. Additionally, the presence of both structured and unstructured logs, the growth in new security threats, and increasing technical disparities among control staff members pose further challenges for effective security control. This study proposes to improve the problems of the existing EDR system and overcome the limitations of security control. It analyzes data during the preprocessing stage to identify potential threat factors that influence the detection process and its outcomes. Explainable AI (XAI) techniques are employed to assess the impact of preprocessing on the learning process outcomes. To ensure objectivity and versatility in the analysis, five widely recognized datasets were used, and eleven commonly used machine learning (ML) models for malware detection were tested, with the five models showing the highest performance selected for further analysis. The results indicate that the eXtreme Gradient Boosting (XGBoost) model outperformed the others. Moreover, the study conducts an in-depth analysis of the preprocessing phase, tracing backward from the detection result to infer potential threats and classify the primary variables influencing the model's prediction. This analysis includes the application of SHapley Additive exPlanations (SHAP), an XAI method, which provides insight into the influence of specific features on detection outcomes, and suggests potential breaches by identifying common parameters in malware through file backtracking and providing weights. This study also proposes a counter-detection analysis process to overcome the limitations of existing deep learning outcomes, understand the decision-making process of AI, and enhance reliability. These contributions are expected to significantly enhance EDR systems and address existing limitations in security control.
Teaching students the concepts behind computational thinking is a difficult task, often gated by the inherent difficulty of programming languages. In the classroom, teaching assistants may be required to interact with students to help them learn the material. Time spent grading and offering feedback on assignments takes away from the time available to help students directly. As such, we offer a framework for developing an explainable artificial intelligence that performs automated analysis of student code while offering feedback and partial credit. The creation of this system depends on three core components: a knowledge base, a set of conditions to be analyzed, and a formal set of inference rules. In this paper, we develop such a system for our own language by employing π-calculus and Hoare logic. Our system can also perform self-learning of rules. Given solution files, the system is able to extract the important aspects of the program and develop feedback that explicitly details the errors students make when they veer away from these aspects. The level of detail and expected precision can be easily modified through parameter tuning and variety in sample solutions.
The issue of opacity within data-driven artificial intelligence (AI) algorithms has become an impediment to these algorithms' extensive utilization, especially within sensitive domains concerning health, safety, and high profitability, such as chemical engineering (CE). In order to promote reliable AI utilization in CE, this review discusses the concept of transparency within AI utilizations, which is defined based on both explainable AI (XAI) concepts and key features from within the CE field. This review also highlights the requirements of reliable AI from the aspects of causality (i.e., the correlations between the predictions and inputs of an AI), explainability (i.e., the operational rationales of the workflows), and informativeness (i.e., the mechanistic insights of the investigated systems). Related techniques are evaluated together with state-of-the-art applications to highlight the significance of establishing reliable AI applications in CE. Furthermore, a comprehensive transparency analysis case study is provided as an example to enhance understanding. Overall, this work provides a thorough discussion of this subject matter in a way that, for the first time, is particularly geared toward chemical engineers in order to raise awareness of responsible AI utilization. With this vital missing link, AI is anticipated to serve as a novel and powerful tool that can tremendously aid chemical engineers in solving bottleneck challenges in CE.
Abstract: Short Message Service (SMS) is a widely used and cost-effective communication medium that has unfortunately become a frequent target for unsolicited messages, commonly known as SMS spam. With the rapid adoption of smartphones and increased Internet connectivity, SMS spam has emerged as a prevalent threat. Spammers have recognized the critical role SMS plays in modern communication, making it a prime target for abuse. As cybersecurity threats continue to evolve, the volume of SMS spam has increased substantially in recent years. Moreover, the unstructured format of SMS data creates significant challenges for SMS spam detection, making it more difficult to successfully combat spam attacks. In this paper, we present an optimized and fine-tuned transformer-based language model to address the problem of SMS spam detection. We use a benchmark SMS spam dataset to analyze this spam detection model. Additionally, we utilize pre-processing techniques to obtain clean and noise-free data and address the class imbalance problem by leveraging text augmentation techniques. The overall experiment showed that our optimized, fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variant model, RoBERTa, obtained a high accuracy of 99.84%. To further enhance model transparency, we incorporate Explainable Artificial Intelligence (XAI) techniques that compute positive and negative coefficient scores, offering insight into the model's decision-making process. Additionally, we evaluate the performance of traditional machine learning models as a baseline for comparison. This comprehensive analysis demonstrates the significant impact language models can have on addressing complex text-based challenges within the cybersecurity landscape.
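The class-imbalance handling above relies on text augmentation. As a minimal illustration (the paper's exact augmentation technique is not specified here, so `augment_sms` and its adjacent-swap strategy are purely hypothetical), a label-preserving variant generator for short messages might look like this:

```python
import random

def augment_sms(text: str, n_variants: int = 3, seed: int = 0) -> list[str]:
    """Generate noisy variants of an SMS by randomly swapping one pair of
    adjacent words per variant. A hypothetical sketch of text augmentation
    for balancing a spam/ham dataset, not the paper's method."""
    rng = random.Random(seed)
    words = text.split()
    variants = []
    for _ in range(n_variants):
        w = words[:]
        if len(w) > 1:
            i = rng.randrange(len(w) - 1)
            w[i], w[i + 1] = w[i + 1], w[i]  # swap one adjacent pair
        variants.append(" ".join(w))
    return variants
```

Each variant keeps the original vocabulary, so the spam/ham label plausibly carries over to the synthetic example.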
Abstract: In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable progress. Accurate predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce overtreatment. However, the widespread adoption of these models in clinical practice demands predictive performance along with interpretability and transparency. This paper proposes a novel association-rule-based, feature-integrated machine learning model which shows better classification and prediction accuracy than present state-of-the-art models. Our study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a powerful tool for explaining thyroid cancer prediction models. In the proposed method, the association-rule-based feature integration framework identifies frequently occurring attribute combinations in the dataset. The original dataset is used in training machine learning models, and further used in generating SHAP values from these models. In the next phase, the dataset is integrated with the dominant feature sets identified through association-rule-based analysis. This new integrated dataset is used in re-training the machine learning models. The new SHAP values generated from these models help in validating the contributions of feature sets in predicting malignancy. Conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making systems. In this study, the SHAP values are introduced along with association-rule-based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the predictions. The study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework for explainability. The proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUROC) are also higher than those of the baseline models. The results of the proposed model help us identify the dominant feature sets that impact thyroid cancer classification and prediction. The features {calcification} and {shape} consistently emerged as the top-ranked features associated with thyroid malignancy, in both association-rule-based interestingness metric values and SHAP methods. The paper highlights the potential of rule-based integrated models with SHAP in bridging the gap between machine learning predictions and the interpretability required for real-world medical applications.
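The association-rule feature-integration step above hinges on finding frequently co-occurring attribute combinations. A toy sketch of that support computation, assuming each record is a set of categorical attribute values (the function name, the size limit of two, and the data are illustrative, not from the paper):

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(records, min_support):
    """Return itemsets (singletons and pairs) whose support, i.e. the
    fraction of records containing them, meets min_support."""
    n = len(records)
    counts = Counter()
    for rec in records:
        items = sorted(rec)
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}
```

In the paper's terms, the surviving itemsets are the "dominant feature sets" that get appended to the dataset before re-training.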
Funding: the National Natural Science Foundation of China (42377170, 42407212); the National Funded Postdoctoral Researcher Program (GZB20230606); the Postdoctoral Research Foundation of China (2024M752679); the Sichuan Natural Science Foundation (2025ZNSFSC1205); the National Key R&D Program of China (2022YFC3005704); the Sichuan Province Science and Technology Support Program (2024NSFSC0100).
Abstract: Wildfires significantly disrupt the physical and hydrologic conditions of the environment, leading to vegetation loss and altered surface geo-material properties. These complex dynamics promote post-fire gully erosion, yet the key conditioning factors (e.g., topography, hydrology) remain insufficiently understood. This study proposes a novel artificial intelligence (AI) framework that integrates four machine learning (ML) models with the SHapley Additive exPlanations (SHAP) method, offering a hierarchical, global-to-local perspective on the dominant factors controlling gully distribution in wildfire-affected areas. In a case study of the Xiangjiao catchment, burned on March 28, 2020, in Muli County, Sichuan Province, Southwest China, we derived 21 geo-environmental factors to assess the susceptibility of post-fire gully erosion using logistic regression (LR), support vector machine (SVM), random forest (RF), and convolutional neural network (CNN) models. SHAP-based model interpretation revealed eight key conditioning factors: topographic position index (TPI), topographic wetness index (TWI), distance to stream, mean annual precipitation, differenced normalized burn ratio (dNBR), land use/cover, soil type, and distance to road. Comparative model evaluation demonstrated that reduced-variable models incorporating these dominant factors achieved accuracy comparable to that of the initial-variable models, with AUC values exceeding 0.868 across all ML algorithms. These findings provide critical insights into gully erosion behavior in wildfire-affected areas, supporting the decision-making process behind environmental management and hazard mitigation.
Funding: partially funded by the Programa Nacional de Becas y Crédito Educativo of Peru and the Universitat de València, Spain.
Abstract: Self-Explaining Autonomous Systems (SEAS) have emerged as a strategic frontier within Artificial Intelligence (AI), responding to growing demands for transparency and interpretability in autonomous decision-making. This study presents a comprehensive bibliometric analysis of SEAS research published between 2020 and February 2025, drawing upon 1380 documents indexed in Scopus. The analysis applies co-citation mapping, keyword co-occurrence, and author collaboration networks using VOSviewer, MASHA, and Python to examine scientific production, intellectual structure, and global collaboration patterns. The results indicate a sustained annual growth rate of 41.38%, with an h-index of 57 and an average of 21.97 citations per document. A normalized citation rate was computed to address temporal bias, enabling balanced evaluation across publication cohorts. Thematic analysis reveals four consolidated research fronts: interpretability in machine learning, explainability in deep neural networks, transparency in generative models, and optimization strategies in autonomous control. Author co-citation analysis identifies four distinct research communities, and keyword evolution shows growing interdisciplinary links with medicine, cybersecurity, and industrial automation. At the geographical level, the United States leads in scientific output and citation impact, while countries like India and China show high productivity with varied influence. However, international collaboration remains limited at 7.39%, reflecting a fragmented research landscape. As discussed in this study, SEAS research is expanding rapidly yet remains epistemologically dispersed, with uneven integration of ethical and human-centered perspectives. This work offers a structured and data-driven perspective on SEAS development, highlights key contributors and thematic trends, and outlines critical directions for advancing responsible and transparent autonomous systems.
Abstract: In the evolving landscape of cyber threats, phishing attacks pose significant challenges, particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy. Conventional and machine learning (ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics. Furthermore, newer phishing websites exhibit subtle and expertly concealed indicators that are not readily detectable. Hence, effective detection depends on identifying the most critical features. Traditional feature selection (FS) methods often fail to enhance ML model performance and instead decrease it. To combat these issues, we propose an innovative method using explainable AI (XAI) to enhance FS in ML models and improve the identification of phishing websites. Specifically, we employ SHapley Additive exPlanations (SHAP) for a global perspective and aggregated Local Interpretable Model-Agnostic Explanations (LIME) to determine specific localized patterns. The proposed SHAP and LIME-aggregated FS (SLA-FS) framework pinpoints the most informative features, enabling more precise, swift, and adaptable phishing detection. Applying this approach to an up-to-date web phishing dataset, we evaluate the performance of three ML models before and after FS to assess their effectiveness. Our findings reveal that random forest (RF), with an accuracy of 97.41%, and XGBoost (XGB), at 97.21%, significantly benefit from the SLA-FS framework, while k-nearest neighbors lags behind. Our framework increases the accuracy of RF and XGB by 0.65% and 0.41%, respectively, outperforming traditional filter or wrapper methods and any prior methods evaluated on this dataset, showcasing its potential.
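The SLA-FS idea of combining a global SHAP view with aggregated local LIME weights can be sketched as a simple score-blending heuristic. The actual weighting used in the paper is not specified, so the `alpha` parameter, the max-normalization, and the example feature names below are assumptions:

```python
def sla_fs_rank(shap_global, lime_local, alpha=0.5):
    """Rank features by blending a global importance score with averaged
    local scores. shap_global: {feature: mean |SHAP| value};
    lime_local: list of {feature: LIME weight} dicts, one per instance."""
    # Aggregate local explanations by averaging absolute weights per feature
    agg = {f: 0.0 for f in shap_global}
    for expl in lime_local:
        for f, w in expl.items():
            agg[f] = agg.get(f, 0.0) + abs(w) / len(lime_local)

    def norm(d):  # scale each view to [0, 1] so they are comparable
        top = max(d.values()) or 1.0
        return {f: v / top for f, v in d.items()}

    g, l = norm(shap_global), norm(agg)
    blended = {f: alpha * g.get(f, 0.0) + (1 - alpha) * l.get(f, 0.0)
               for f in set(g) | set(l)}
    return sorted(blended, key=blended.get, reverse=True)
```

Features at the top of the returned ranking would be kept for re-training; the rest are dropped, which is the FS step the abstract describes.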
Funding: funded by Researchers Supporting Project Number (RSPD2025R947), King Saud University, Riyadh, Saudi Arabia.
Abstract: Cardiovascular disease (CVD) remains a leading global health challenge due to its high mortality rate and the complexity of early diagnosis, driven by risk factors such as hypertension, high cholesterol, and irregular pulse rates. Traditional diagnostic methods often struggle with the nuanced interplay of these risk factors, making early detection difficult. In this research, we propose a novel artificial intelligence-enabled (AI-enabled) framework for CVD risk prediction that integrates machine learning (ML) with eXplainable AI (XAI) to provide both high-accuracy predictions and transparent, interpretable insights. Compared to existing studies that typically focus on either optimizing ML performance or using XAI separately for local or global explanations, our approach uniquely combines both local and global interpretability using Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). This dual integration enhances the interpretability of the model and enables clinicians to comprehensively understand not just what the model predicts but also why those predictions are made, by identifying the contribution of different risk factors, which is crucial for transparent and informed decision-making in healthcare. The framework uses ML techniques such as K-nearest neighbors (KNN), gradient boosting, random forest, and decision tree, trained on a cardiovascular dataset. Additionally, the integration of LIME and SHAP provides patient-specific insights alongside global trends, ensuring that clinicians receive comprehensive and actionable information. Our experimental results achieve 98% accuracy with the random forest model, with precision, recall, and F1-scores of 97%, 98%, and 98%, respectively. The innovative combination of SHAP and LIME sets a new benchmark in CVD prediction by integrating advanced ML accuracy with robust interpretability, filling a critical gap in existing approaches. This framework paves the way for more explainable and transparent decision-making in healthcare, ensuring that the model is not only accurate but also trustworthy and actionable for clinicians.
Abstract: This study presents a framework for the semi-automatic detection of rock discontinuities using a three-dimensional (3D) point cloud (PC). The process begins by selecting an appropriate neighborhood size, a critical step for feature extraction from the PC. The effects of different neighborhood sizes (k = 5, 10, 20, 50, and 100) have been evaluated to assess their impact on classification performance. After that, 17 geometric and spatial features were extracted from the PC. Next, the ensemble methods AdaBoost.M2, random forest, and decision tree have been compared with artificial neural networks to classify the main discontinuity sets. The McNemar test indicates that the differences between classifiers are statistically significant. The random forest classifier consistently achieves the highest performance, with an accuracy exceeding 95% when using a neighborhood size of k = 100, while recall, F-score, and Cohen's kappa also demonstrate high success. SHapley Additive exPlanations (SHAP), an explainable AI technique, has been used to evaluate feature importance and improve the explainability of black-box machine learning models in the context of rock discontinuity classification. The analysis reveals that features such as normal vectors, verticality, and Z-values have the greatest influence on identifying main discontinuity sets, while linearity, planarity, and eigenvalues contribute less, making the model more transparent and easier to understand. After classification, individual discontinuity sets were detected from the main discontinuity sets using a revised DBSCAN. Finally, the orientation parameters of the plane fitted to each discontinuity were derived from the plane parameters obtained using Random Sample Consensus (RANSAC). Two real-world datasets (obtained from SfM and LiDAR) and one synthetic dataset were used to validate the proposed method, which successfully identified rock discontinuities and their orientation parameters (dip angle/direction).
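The final step above converts RANSAC plane parameters into dip angle and dip direction. This is the standard geometric conversion from a plane normal (a generic sketch assuming an x=east, y=north, z=up frame, not code from the paper):

```python
import math

def dip_from_normal(nx, ny, nz):
    """Convert a plane normal (x=east, y=north, z=up) into dip angle and
    dip direction in degrees. Dip is the tilt from horizontal; dip
    direction is the azimuth (clockwise from north) of steepest descent."""
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    if nz < 0:  # force the normal to point upward
        nx, ny, nz = -nx, -ny, -nz
    dip = math.degrees(math.acos(nz))
    dip_dir = math.degrees(math.atan2(nx, ny)) % 360.0
    return dip, dip_dir
```

For example, a plane with normal (1, 0, 1) dips 45° toward the east (dip direction 090°).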
Funding: Ho Chi Minh City University of Technology (HCMUT), VNU-HCM, for supporting this study.
Abstract: Machine learning (ML) models are widely used for predicting undrained shear strength (USS), but interpretability has been a limitation in various studies. Therefore, this study introduced SHapley Additive exPlanations (SHAP) to clarify the contribution of each input feature in USS prediction. Three ML models, artificial neural network (ANN), extreme gradient boosting (XGBoost), and random forest (RF), were employed, with accuracy evaluated using mean squared error, mean absolute error, and the coefficient of determination (R²). The RF achieved the highest performance, with an R² of 0.82. SHAP analysis identified pre-consolidation stress as a key contributor to USS prediction. SHAP dependence plots reveal that the ANN captures smoother, linear feature-output relationships, while the RF handles complex, non-linear interactions more effectively. This suggests a non-linear relationship between USS and the input features, with RF outperforming ANN. These findings highlight SHAP's role in enhancing interpretability and promoting transparency and reliability in ML predictions for geotechnical applications.
Abstract: The rapid growth of online communities has brought about an increase in cyber threats, including cyberbullying, hate speech, misinformation, and online harassment, making content moderation a pressing necessity. Traditional single-modal AI-based detection systems, which analyze text, images, or videos in isolation, have proven ineffective at capturing multi-modal threats, in which malicious actors spread harmful content across multiple formats. To address these challenges, we propose a multi-modal deep learning framework that integrates Natural Language Processing (NLP), Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to identify and mitigate online threats effectively. Our proposed model combines BERT for text classification, ResNet50 for image processing, and a hybrid LSTM-3D CNN network for video content analysis. We constructed a large-scale dataset comprising 500,000 textual posts, 200,000 offensive images, and 50,000 annotated videos from multiple platforms, including Twitter, Reddit, YouTube, and online gaming forums. The system was rigorously evaluated using standard machine learning metrics, including accuracy, precision, recall, F1-score, and ROC-AUC curves. Experimental results demonstrate that our multi-modal approach significantly outperforms single-modal AI classifiers, achieving an accuracy of 92.3%, precision of 91.2%, recall of 90.1%, and an AUC score of 0.95. The findings validate the necessity of integrating multi-modal AI for real-time, high-accuracy online threat detection and moderation. Future work will focus on improving adversarial robustness, enhancing scalability for real-world deployment, and addressing ethical concerns associated with AI-driven content moderation.
文摘In the field of precision healthcare,where accurate decision-making is paramount,this study underscores the indispensability of eXplainable Artificial Intelligence(XAI)in the context of epilepsy management within the Internet of Medical Things(IoMT).The methodology entails meticulous preprocessing,involving the application of a band-pass filter and epoch segmentation to optimize the quality of Electroencephalograph(EEG)data.The subsequent extraction of statistical features facilitates the differentiation between seizure and non-seizure patterns.The classification phase integrates Support Vector Machine(SVM),K-Nearest Neighbor(KNN),and Random Forest classifiers.Notably,SVM attains an accuracy of 97.26%,excelling in the precision,recall,specificity,and F1 score for identifying seizures and non-seizure instances.Conversely,KNN achieves an accuracy of 72.69%,accompanied by certain trade-offs.The Random Forest classifierstands out with a remarkable accuracy of 99.89%,coupled with an exceptional precision(99.73%),recall(100%),specificity(99.80%),and F1 score(99.86%),surpassing both SVM and KNN performances.XAI techniques,namely Local Interpretable ModelAgnostic Explanations(LIME)and SHapley Additive exPlanation(SHAP),enhance the system’s transparency.This combination of machine learning and XAI not only improves the reliability and accuracy of the seizure detection system but also enhances trust and interpretability.Healthcare professionals can leverage the identified important features and their dependencies to gain deeper insights into the decision-making process,aiding in informed diagnosis and treatment decisions for patients with epilepsy.
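The preprocessing described above (band-pass filtering followed by epoch segmentation) can be sketched with a hard FFT mask; a clinical EEG pipeline would normally use a proper IIR/FIR band-pass filter, so treat this as an assumption-laden simplification:

```python
import numpy as np

def bandpass_and_epoch(signal, fs, low, high, epoch_s):
    """Band-pass a 1-D signal by zeroing out-of-band FFT bins, then split
    it into fixed-length epochs of epoch_s seconds (shape: epochs x samples)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0  # zero out-of-band bins
    filtered = np.fft.irfft(spectrum, n=len(signal))
    step = int(epoch_s * fs)
    n_epochs = len(filtered) // step
    return filtered[: n_epochs * step].reshape(n_epochs, step)
```

Statistical features (mean, variance, skewness, etc.) would then be computed per epoch before classification.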
Abstract: Accurate and interpretable fault diagnosis in industrial gear systems is essential for ensuring safety, reliability, and predictive maintenance. This study presents an intelligent diagnostic framework utilizing Gradient Boosting (GB) for fault detection in gear systems, applied to the Aalto Gear Fault Dataset, which features a wide range of synthetic and realistic gear failure modes under varied operating conditions. The dataset was preprocessed and analyzed using an ensemble GB classifier, yielding high performance across multiple metrics: accuracy of 96.77%, precision of 95.44%, recall of 97.11%, and an F1-score of 96.22%. To enhance trust in model predictions, the study integrates an explainable AI (XAI) framework using SHAP (SHapley Additive exPlanations) to visualize feature contributions and support diagnostic transparency. A flowchart-based architecture is proposed to guide real-world deployment of interpretable fault detection pipelines. The results demonstrate the feasibility of combining predictive performance with interpretability, offering a robust approach for condition monitoring in safety-critical systems.
Funding: funded by the Excellent Talent Training Funding Project in Dongcheng District, Beijing, with project number 2024-dchrcpyzz-9.
Abstract: Predicting the health status of stroke patients at different stages of the disease is a critical clinical task. The onset and development of stroke are affected by an array of factors, encompassing genetic predisposition, environmental exposure, unhealthy lifestyle habits, and existing medical conditions. Although existing machine learning-based methods for predicting stroke patients' health status have made significant progress, limitations remain in terms of prediction accuracy, model explainability, and system optimization. This paper proposes a multi-task learning approach based on Explainable Artificial Intelligence (XAI) for predicting the health status of stroke patients. First, we design a comprehensive multi-task learning framework that exploits the correlation among the tasks of predicting various health status indicators, enabling the parallel prediction of multiple health indicators. Second, we develop a multi-task Area Under the Curve (AUC) optimization algorithm based on adaptive low-rank representation, which removes irrelevant information from the model structure to enhance the performance of multi-task AUC optimization. Additionally, the model's explainability is analyzed through the stability analysis of SHAP values. Experimental results demonstrate that our approach outperforms comparison algorithms in the key prognostic metrics of F1 score and efficiency.
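The quantity being optimized per task above is the AUC, i.e. the probability that a randomly chosen positive instance is scored above a randomly chosen negative one (ties count half). A pure-Python illustration of that rank-based definition, separate from the paper's low-rank optimization algorithm:

```python
def auc_score(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly,
    with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Perfect separation yields 1.0, random scoring about 0.5, which is why AUC is robust to the class imbalance common in clinical outcome data.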
Abstract: Skin cancer is the most prevalent cancer globally, primarily due to extensive exposure to ultraviolet (UV) radiation. Early identification of skin cancer enhances the likelihood of effective treatment, as delays may lead to severe tumor advancement. This study proposes a novel hybrid deep learning strategy to address the complex issue of skin cancer diagnosis, with an architecture that integrates a Vision Transformer, a bespoke convolutional neural network (CNN), and an Xception module. The models were evaluated using two benchmark datasets, HAM10000 and Skin Cancer ISIC. On HAM10000, the model achieves a precision of 95.46%, an accuracy of 96.74%, a recall of 96.27%, a specificity of 96.00%, and an F1-score of 95.86%. It obtains an accuracy of 93.19%, a precision of 93.25%, a recall of 92.80%, a specificity of 92.89%, and an F1-score of 93.19% on the Skin Cancer ISIC dataset. The findings demonstrate that the proposed model is robust and trustworthy in classifying skin lesions. In addition, the utilization of explainable AI techniques, such as Grad-CAM visualizations, helps highlight the most significant lesion areas that influence the model's decisions.
Abstract: The diagnosis of brain tumors is an extended process that depends significantly on the expertise and skills of radiologists. The rise in patient numbers has substantially elevated the data processing volume, making conventional methods both costly and inefficient. Recently, Artificial Intelligence (AI) has gained prominence for developing automated systems that can accurately diagnose or segment brain tumors in a shorter time frame. Many researchers have examined various algorithms that provide both speed and accuracy in detecting and classifying brain tumors. This paper proposes a new AI-based model, called the Brain Tumor Detection (BTD) model, based on brain tumor Magnetic Resonance Images (MRIs). The proposed BTD model comprises three main modules: (i) an Image Processing Module (IPM), (ii) a Patient Detection Module (PDM), and (iii) Explainable AI (XAI). In the first module (i.e., IPM), the dataset is preprocessed through two stages: feature extraction and feature selection. First, the MRI is preprocessed; then the images are converted into a set of features using several feature extraction methods: gray-level co-occurrence matrix, histogram of oriented gradients, local binary patterns, and Tamura features. Next, the most effective features are selected from these features separately using Improved Gray Wolf Optimization (IGWO). IGWO is a hybrid methodology consisting of a Filter Selection Step (FSS) that uses the information gain ratio as an initial selection stage, followed by Binary Gray Wolf Optimization (BGWO), which further optimizes the chosen features to improve tumor detection. These features are then fed to the PDM using several classifiers, and the final decision is based on weighted majority voting. Finally, through Local Interpretable Model-Agnostic Explanations (LIME) XAI, interpretability and transparency in the decision-making process are provided. The experiments are performed on a publicly available brain MRI dataset that consists of 98 normal cases and 154 abnormal cases. During the experiments, the dataset was divided into 70% (177 cases) for training and 30% (75 cases) for testing. The numerical findings demonstrate that the BTD model outperforms its competitors in terms of accuracy, precision, recall, and F-measure, achieving 98.8% accuracy, 97% precision, 97.5% recall, and 97.2% F-measure. The results demonstrate the potential of the proposed model to revolutionize brain tumor diagnosis, contribute to better treatment strategies, and improve patient outcomes.
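The PDM's final decision above is a weighted majority vote over several classifiers. A minimal sketch of that combination rule (the weights and labels here are illustrative, e.g. per-classifier validation accuracies; the paper's exact weighting scheme is not given):

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Combine classifier outputs by summing each classifier's weight onto
    its predicted label and returning the label with the largest total.

    predictions: list of class labels, one per classifier
    weights: matching list of classifier weights
    """
    tally = defaultdict(float)
    for label, w in zip(predictions, weights):
        tally[label] += w
    return max(tally, key=tally.get)
```

A single highly weighted classifier can thus override two weaker ones, which is the point of weighting rather than counting votes equally.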
Funding: the Deanship of Scientific Research and Libraries at Princess Nourah bint Abdulrahman University for funding this research work through the Research Group project, Grant No. (RG-1445-0064).
Abstract: Although digital changes in power systems have added more ways to monitor and control them, these changes have also led to new cyber-attack risks, mainly from False Data Injection (FDI) attacks. If such an attack succeeds, the sensors and operations are compromised, which can lead to major problems, disruptions, failures, and blackouts. In response to this challenge, this paper presents a reliable and innovative detection framework that leverages Bidirectional Long Short-Term Memory (Bi-LSTM) networks and employs explanatory methods from Artificial Intelligence (AI). Not only does the suggested architecture detect potential attacks with high accuracy, but it also makes its decisions transparent, enabling operators to take appropriate action. The method developed here utilizes model-free, interpretable tools to identify essential input elements, thereby making predictions more understandable and usable. Enhancing detection performance is made possible by correcting class imbalance using Synthetic Minority Over-sampling Technique (SMOTE)-based data balancing. Detailed experiments on benchmark power system data confirm that the model functions correctly. Experimental results showed that Bi-LSTM + Explainable AI (XAI) achieved an average accuracy of 94%, surpassing XGBoost (89%) and Bagging (84%), while ensuring explainability and a high level of robustness across various operating scenarios. An ablation study finds that bidirectional recurrent modeling and ReLU activation help improve generalization and model predictability. Additionally, examining model decisions through LIME enables us to identify which features are crucial for making smart-grid operational decisions in real time. The research offers a practical and flexible approach for detecting FDI attacks, improving the security of cyber-physical systems, and facilitating the deployment of AI in energy infrastructure.
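The SMOTE-based balancing mentioned above synthesizes minority-class samples by interpolating between neighboring minority points. A SMOTE-style sketch of that interpolation (an assumption-laden simplification, not the reference implementation used with the Bi-LSTM pipeline):

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples: pick a minority point,
    pick one of its k nearest minority neighbors, and place a new sample
    at a random position on the line segment between them."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1 : k + 1]  # skip the point itself
        j = rng.choice(neighbors)
        gap = rng.random()
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)
```

Because each synthetic point is a convex combination of two real minority samples, it stays inside the minority class's feature region rather than simply duplicating existing rows.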
Abstract: Neonatal sepsis is the third most common cause of neonatal mortality and a serious public health problem, especially in developing countries. There has been research on human sepsis, vaccine response, and immunity, and machine learning methodologies have been used to predict infant mortality from features such as age, birth weight, gestational weeks, and Appearance, Pulse, Grimace, Activity, and Respiration (APGAR) score. Sepsis, however, although considered the most decisive condition for infant mortality, has never been included in mortality prediction. We therefore deploy a state-of-the-art deep neural model and perform a comparative analysis against machine learning models to predict mortality among infants based on the most important features, including sepsis. In addition, to assess the prediction reliability of the deep neural model, which is a black box, explainable AI tools such as Dalex and LIME have been deployed. This helps non-technical personnel such as doctors and practitioners understand the predictions and make decisions accordingly.
Funding: This work was funded by Woosong University Academic Research 2024.
Abstract: Machine fault diagnostics are essential for industrial operations, and advancements in machine learning have significantly improved these systems by providing accurate predictions and expedited solutions. Machine learning models, especially those utilizing complex algorithms such as deep learning, have demonstrated major potential in extracting important information from large operational datasets. Despite their efficiency, machine learning models face challenges, making Explainable AI (XAI) crucial for improving their understandability and fine-tuning. This study examines the importance of feature contribution and selection using XAI in the diagnosis of machine faults. The technique is applied to evaluate different machine learning algorithms: Extreme Gradient Boosting, Support Vector Machine, Gaussian Naive Bayes, and Random Forest classifiers are used alongside Logistic Regression (LR) as a baseline model, and their efficacy and simplicity are evaluated thoroughly through empirical analysis. XAI is used as a targeted feature-selection technique to choose among 29 time- and frequency-domain features. The XAI approach is lightweight, trained with only the targeted features, and achieves results similar to the traditional approach. The accuracy without XAI on the baseline LR is 79.57%, whereas the XAI-based approach on LR reaches 80.28%.
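The targeted feature-selection step described here reduces to ranking features by an explainer's importance scores and retraining on the top ones. A minimal sketch of that ranking step follows; the scores and the cutoff `k=3` are illustrative assumptions, not values from the study:

```python
def select_top_features(importances, k):
    """Return the indices of the k most important features, i.e. the
    targeted feature-selection step driven by XAI importance scores."""
    return sorted(range(len(importances)),
                  key=lambda i: importances[i], reverse=True)[:k]

# Hypothetical importance scores for 6 of the 29 time/frequency features
scores = [0.02, 0.31, 0.05, 0.22, 0.01, 0.18]
top3 = select_top_features(scores, 3)
print(top3)  # [1, 3, 5]
```

The model is then retrained using only the columns at these indices, which is why the resulting pipeline is lightweight while matching the full-feature accuracy reported above.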
Funding: Supported by the Innovative Human Resource Development for Local Intellectualization program through the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2024-RS-2022-00156287, 50%), and by the MSIT (Ministry of Science and ICT), Republic of Korea, under the Convergence Security Core Talent Training Business Support Program (IITP-2024-RS-2022-II221203, 50%), supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation).
Abstract: With advancements in artificial intelligence (AI) technology, attackers are increasingly using sophisticated techniques, including ChatGPT. Endpoint Detection & Response (EDR) is a system that detects and responds to suspicious activities or security threats occurring on computers or endpoint devices within an organization. Unlike traditional antivirus software, EDR is more about responding to a threat after it has already occurred than blocking it. This study aims to overcome challenges in security control, such as increased log size, emerging security threats, and the technical demands faced by control staff. Previous studies have focused on AI detection models, emphasizing detection rates and model performance. However, the underlying reasons behind the detection results were often insufficiently understood, leading to varying outcomes across learning models. Additionally, the presence of both structured and unstructured logs, the growth of new security threats, and increasing technical disparities among control staff pose further challenges for effective security control. This study proposes improvements to the existing EDR system that overcome the limitations of security control. Data were analyzed during the preprocessing stage to identify potential threat factors that influence the detection process and its outcomes. To ensure objectivity and versatility in the analysis, five widely recognized datasets were used. Eleven commonly used machine learning (ML) models for malware detection were then tested, with the five models showing the highest performance selected for further analysis, and Explainable AI (XAI) techniques were employed to assess the impact of preprocessing on the learning outcomes. The results indicate that the eXtreme Gradient Boosting (XGBoost) model outperformed the others. Moreover, the study conducts an in-depth analysis of the preprocessing phase, tracing backward from the detection result to infer potential threats and classify the primary variables influencing the model's prediction. This analysis includes the application of SHapley Additive exPlanations (SHAP), an XAI technique, which provides insight into the influence of specific features on detection outcomes and suggests potential breaches by identifying common parameters in malware through file backtracking and weighting. This study also proposes a counter-detection analysis process to overcome the limitations of existing deep learning outcomes, understand the decision-making process of AI, and enhance reliability. These contributions are expected to significantly enhance EDR systems and address existing limitations in security control.
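SHAP values, which this abstract uses to attribute detection outcomes to specific features, are an application of the Shapley value: each feature's score is its average marginal contribution over all feature orderings. An exact brute-force sketch for a tiny model follows; the coalition value function and feature count are invented for illustration (real SHAP implementations approximate this for many features):

```python
from itertools import combinations
from math import factorial

def shapley_values(value, n):
    """Exact Shapley attribution for n features. `value(S)` returns the
    model output when only the features in frozenset S are 'present'."""
    phis = []
    for i in range(n):
        phi = 0.0
        others = [f for f in range(n) if f != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Shapley weight |S|! * (n-|S|-1)! / n! for this coalition
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi += w * (value(frozenset(S) | {i}) - value(frozenset(S)))
        phis.append(phi)
    return phis

# Toy additive detection score: feature 0 adds 3, feature 1 adds 1
value = lambda S: 3 * (0 in S) + 1 * (1 in S)
print(shapley_values(value, 2))  # [3.0, 1.0]
```

For an additive model the Shapley values recover each feature's individual contribution exactly, which is the sanity check usually applied to such explainers.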
Funding: Supported by general funding at the IoT and Robotics Education Lab and the FURI program at Arizona State University.
Abstract: Teaching students the concepts behind computational thinking is a difficult task, often gated by the inherent difficulty of programming languages. In the classroom, teaching assistants may be required to interact with students to help them learn the material. Time spent grading and offering feedback on assignments takes away from the time available to help students directly. We therefore offer a framework for developing an explainable artificial intelligence that performs automated analysis of student code while offering feedback and partial credit. The system is built on three core components: a knowledge base, a set of conditions to be analyzed, and a formal set of inference rules. In this paper, we develop such a system for our own language by employing π-calculus and Hoare logic. Our system can also self-learn rules: given solution files, it extracts the important aspects of a program and develops feedback that explicitly details the errors students make when they deviate from these aspects. The level of detail and expected precision can be easily modified through parameter tuning and variety in sample solutions.
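The partial-credit idea in this abstract, checking a submission against a set of graded conditions in the spirit of Hoare-style postconditions, can be sketched minimally as follows. The `grade` helper, rubric format, and sample rubric are hypothetical illustrations, not the paper's actual π-calculus/Hoare-logic system:

```python
def grade(student_fn, rubric):
    """Award partial credit per rubric condition and collect explicit
    feedback when a postcondition fails or the submission crashes.
    Rubric rows: (name, points, list_of_input_tuples, postcondition)."""
    score, feedback = 0, []
    for name, points, inputs, postcondition in rubric:
        try:
            if all(postcondition(student_fn(*args)) for args in inputs):
                score += points
            else:
                feedback.append(f"condition '{name}' violated")
        except Exception as e:
            feedback.append(f"condition '{name}' raised {type(e).__name__}")
    return score, feedback

# A student who sorted in the wrong direction earns partial credit
student = lambda xs: sorted(xs)[::-1]
rubric = [
    ("returns a list",   1, [([3, 1, 2],)], lambda out: isinstance(out, list)),
    ("sorted ascending", 2, [([3, 1, 2],)], lambda out: out == [1, 2, 3]),
]
print(grade(student, rubric))  # (1, ["condition 'sorted ascending' violated"])
```

Each failed condition yields both a point deduction and a human-readable message, mirroring the explicit, explainable feedback the framework aims to produce.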
Abstract: The opacity of data-driven artificial intelligence (AI) algorithms has become an impediment to their extensive utilization, especially within sensitive domains concerning health, safety, and high profitability, such as chemical engineering (CE). To promote reliable AI utilization in CE, this review discusses the concept of transparency in AI applications, defined on the basis of both explainable AI (XAI) concepts and key features of the CE field. The review also highlights the requirements of reliable AI in terms of causality (i.e., the correlations between an AI's predictions and its inputs), explainability (i.e., the operational rationales of the workflows), and informativeness (i.e., the mechanistic insights into the systems under investigation). Related techniques are evaluated together with state-of-the-art applications to highlight the significance of establishing reliable AI applications in CE. Furthermore, a comprehensive transparency-analysis case study is provided as an example to enhance understanding. Overall, this work provides a thorough discussion of the subject in a way that, for the first time, is particularly geared toward chemical engineers in order to raise awareness of responsible AI utilization. With this vital missing link in place, AI is anticipated to serve as a novel and powerful tool that can tremendously aid chemical engineers in solving bottleneck challenges in CE.