This study presents an enhanced convolutional neural network (CNN) model integrated with Explainable Artificial Intelligence (XAI) techniques for accurate prediction and interpretation of wheat crop diseases. The aim is to streamline the detection process while offering transparent insights into the model’s decision-making to support effective disease management. To evaluate the model, a dataset was collected from wheat fields in Kotli, Azad Kashmir, Pakistan, and tested across multiple data splits. The proposed model demonstrates improved stability, faster convergence, and higher classification accuracy. The results show significant improvements in prediction accuracy and stability compared to prior works, achieving up to 100% accuracy in certain configurations. In addition, XAI methods such as Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP) were employed to explain the model’s predictions, highlighting the most influential features contributing to classification decisions. The combined use of CNN and XAI offers a dual benefit: strong predictive performance and clear interpretability of outcomes, which is especially critical in real-world agricultural applications. These findings underscore the potential of integrating deep learning models with XAI to advance automated plant disease detection. The study offers a precise, reliable, and interpretable solution for improving wheat production and promoting agricultural sustainability. Future extensions of this work may include scaling the dataset across broader regions and incorporating additional modalities such as environmental data to enhance model robustness and generalization.
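As a minimal sketch of the explanation step this abstract describes, the snippet below runs LIME on a single prediction of a Keras image classifier; the model file `wheat_cnn.h5`, the image `leaf.jpg`, and the 224×224 input size are hypothetical placeholders, since the paper's exact architecture is not reproduced here.

```python
# Hedged sketch: explaining one CNN prediction with LIME's image explainer.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from lime import lime_image
from skimage.segmentation import mark_boundaries

model = tf.keras.models.load_model("wheat_cnn.h5")  # hypothetical trained classifier
img = tf.keras.utils.load_img("leaf.jpg", target_size=(224, 224))  # hypothetical field image
img = np.array(img, dtype=np.double)

def predict_fn(batch):
    # LIME perturbs the image into a batch; rescale to the range the model expects.
    return model.predict(batch / 255.0, verbose=0)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    img, predict_fn, top_labels=3, hide_color=0, num_samples=1000
)
# Overlay the superpixels that most support the top predicted disease class.
overlay, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
plt.imshow(mark_boundaries(overlay / 255.0, mask))
plt.axis("off")
plt.show()
```

The same model can be handed to SHAP's image explainers for a complementary attribution view, as the abstract notes.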
Alzheimer’s disease (AD) is a neurological disorder that predominantly affects the brain. It is expected to become far more prevalent in the coming years, while progress in diagnostic techniques remains limited. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data. However, recent developments in ML have enabled the application of these methods to multiple data sources and input modalities for AD prediction. In this study, we developed a framework that utilizes multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. As part of the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images: graph neural networks were employed for knowledge-graph creation, and a region-based convolutional neural network approach for image-to-knowledge-graph generation. Additionally, we integrated various explainable AI (XAI) techniques to interpret and elucidate the prediction outcomes derived from multimodal data. Layer-wise relevance propagation was used to explain the layer-wise outcomes in the MRI images. We also incorporated submodular pick local interpretable model-agnostic explanations to interpret the decision-making process based on the tabular data provided. Genetic expression values play a crucial role in AD analysis, so we used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display the XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.
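Submodular pick LIME (SP-LIME) selects a small, diverse set of local explanations that together cover the model's behavior. The sketch below shows the standard `lime` usage on a synthetic stand-in for the study's clinical table; the feature names and the random-forest surrogate are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: SP-LIME on tabular data with synthetic placeholder features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer
from lime import submodular_pick

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
names = ["age", "mmse_score", "education_years", "apoe4_carrier", "bmi", "systolic_bp"]  # illustrative
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=names, class_names=["control", "AD"], discretize_continuous=True
)
# Explain a sample of instances, then pick a diverse, high-coverage subset of them.
sp = submodular_pick.SubmodularPick(
    explainer, X, clf.predict_proba,
    method="sample", sample_size=200, num_features=4, num_exps_desired=5
)
for exp in sp.sp_explanations:
    print(exp.as_list())  # (feature condition, weight) pairs for one picked instance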
Wind power forecasting (WPF) is important for the safe, stable, and reliable integration of new energy technologies into power systems. Machine learning (ML) algorithms have recently attracted increasing attention in the field of WPF. However, the opaque decisions and lack of trustworthiness of black-box models for WPF could cause scheduling risks. This study develops a method for identifying risky models in practical applications and avoiding those risks. First, a local interpretable model-agnostic explanations algorithm is introduced and improved for WPF model analysis. On that basis, a novel index is presented to quantify the level at which neural networks or other black-box models can trust the features involved in training. Then, by revealing the operational mechanism for local samples, the human interpretability of the black-box model is examined under different accuracies, time horizons, and seasons. This interpretability provides a basis for several technical routes for WPF from the viewpoint of the forecasting model. Moreover, further improvements in WPF accuracy are explored by evaluating interpretable ML models that use multi-horizon global trust modeling and multi-season interpretable feature selection methods. Experimental results from a wind farm in China show that error can be robustly reduced. Funding: supported by the National Key R&D Program of China (Technology and application of wind power/photovoltaic power prediction for promoting renewable energy consumption) under Grant 2018YFB0904200.
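The abstract does not define its trust index, so the following is only one plausible reading: aggregate absolute LIME weights over many local samples and normalize, so a feature the model leans on consistently scores near 1. The data, feature names, and the gradient-boosting surrogate are synthetic placeholders.

```python
# Hedged sketch: a hypothetical LIME-based global feature-trust index for a WPF model.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from lime.lime_tabular import LimeTabularExplainer

X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)
names = ["wind_speed", "wind_direction", "temperature", "pressure", "humidity"]  # illustrative
model = GradientBoostingRegressor(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=names, mode="regression")
weights = np.zeros(len(names))
rows = X[np.random.default_rng(0).choice(len(X), 50, replace=False)]
for row in rows:
    exp = explainer.explain_instance(row, model.predict, num_features=len(names))
    for feat_idx, w in exp.as_map()[1]:  # (feature index, local weight) pairs
        weights[feat_idx] += abs(w)

trust = weights / weights.sum()  # normalized trust score per feature
for name, t in sorted(zip(names, trust), key=lambda p: -p[1]):
    print(f"{name}: {t:.3f}")
```

Recomputing this index per horizon and per season would mirror the multi-horizon, multi-season analysis the abstract describes.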
Nowadays, deepfakes are wreaking havoc on society. Deepfake content is created with the help of artificial intelligence and machine learning to replace one person’s likeness with another person’s in pictures or recorded videos. Although visual media manipulations are not new, the introduction of deepfakes has marked a breakthrough in creating fake media and information. These manipulated pictures and videos will undoubtedly have an enormous societal impact. Deepfakes use the latest technology, such as Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), to construct automated methods for creating fake content that is becoming increasingly difficult to detect with the human eye. Therefore, automated DL-based solutions can be an efficient approach for detecting deepfakes. Although the “black-box” nature of DL systems allows for robust predictions, they cannot be completely trusted. Explainability is the first step toward achieving transparency, but the existing incapacity of DL to explain its own decisions to human users limits the efficacy of these systems. Explainable Artificial Intelligence (XAI) can solve this problem by interpreting the predictions of these systems. This work provides a comprehensive study of deepfake detection using DL methods and analyzes the result of the most effective algorithm with Local Interpretable Model-Agnostic Explanations (LIME) to assure its validity and reliability. The study identifies real and deepfake images using different Convolutional Neural Network (CNN) models to obtain the best accuracy, and explains which part of the image caused the model to make a specific classification using the LIME algorithm. The dataset is taken from Kaggle and includes 70k real images from the Flickr dataset collected by Nvidia and 70k fake faces generated by StyleGAN at 256 px resolution. For the experiments, Jupyter Notebook, TensorFlow, NumPy, and Pandas were used as software, and InceptionResNetV2, DenseNet201, InceptionV3, and ResNet152V2 were used as CNN models. All of these models performed well: InceptionV3 achieved 99.68% accuracy, ResNet152V2 99.19%, and DenseNet201 99.81%. InceptionResNetV2 achieved the highest accuracy of 99.87%, and its predictions were subsequently verified with the LIME algorithm for XAI, where the proposed method performed best. The obtained results and their dependability demonstrate its suitability for detecting deepfake images effectively. Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R193), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; Taif University Researchers Supporting Project (TURSP-2020/26), Taif University, Taif, Saudi Arabia.
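A sketch of the transfer-learning setup the study describes appears below: an ImageNet-pretrained InceptionResNetV2 backbone with a binary real-vs-fake head on 256 px inputs. The head layers, dropout rate, and optimizer are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: InceptionResNetV2 backbone with a binary deepfake-detection head.
import tensorflow as tf

base = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights="imagenet", input_shape=(256, 256, 3)
)
base.trainable = False  # freeze the backbone for the initial training phase

inputs = tf.keras.Input(shape=(256, 256, 3))
x = tf.keras.applications.inception_resnet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # P(fake)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The fitted model's `predict` can then be wrapped as a LIME image-explainer prediction function, as in the wheat-disease sketch above.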
Owing to the convenience of online loans, an increasing number of people are borrowing money on online platforms. With the emergence of machine learning technology, predicting loan defaults has become a popular topic. However, machine learning models have a black-box problem that cannot be disregarded. To make the prediction model’s rules more understandable and thereby increase users’ faith in the model, an explanatory model must be used. Logistic regression, decision tree, XGBoost, and LightGBM models are employed to predict loan defaults. The prediction results show that LightGBM and XGBoost outperform the logistic regression and decision tree models in predictive ability: the area under the curve (AUC) for LightGBM is 0.7213, the accuracies of LightGBM and XGBoost exceed 0.8, and their precisions exceed 0.55. We then employed the local interpretable model-agnostic explanations approach to undertake an explainable analysis of the prediction findings. The results show that factors such as the loan term, loan grade, credit rating, and loan amount affect the predicted outcomes. Funding: supported by the Fundamental Research Funds for the Central Universities (WUT: 2022IVA067).
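A minimal sketch of this pipeline follows: train LightGBM, report AUC/accuracy/precision, then explain one borrower's prediction with LIME. The data and feature names are synthetic placeholders for the study's loan records.

```python
# Hedged sketch: LightGBM loan-default model plus a LIME explanation of one case.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score, precision_score
from lightgbm import LGBMClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=5000, n_features=6, weights=[0.8, 0.2], random_state=0)
names = ["loan_term", "loan_grade", "credit_rating", "loan_amount", "income", "dti"]  # illustrative
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LGBMClassifier(n_estimators=300, learning_rate=0.05, random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)
print("AUC:", roc_auc_score(y_te, proba))
print("accuracy:", accuracy_score(y_te, pred), "precision:", precision_score(y_te, pred))

explainer = LimeTabularExplainer(X_tr, feature_names=names, class_names=["repaid", "default"])
exp = explainer.explain_instance(X_te[0], model.predict_proba, num_features=4)
print(exp.as_list())  # which features pushed this borrower toward default
```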
Deep mixing, also known as deep stabilization, is a widely used ground improvement method in the Nordic countries, particularly in urban and infrastructure projects, aiming to enhance the properties of soft, sensitive clays. Understanding the shear strength of stabilized soils and identifying the key influencing factors are essential for ensuring the structural stability and durability of engineering structures. This study introduces a novel explainable artificial intelligence framework to investigate the critical soil properties affecting shear strength, utilizing a data set derived from stabilization tests conducted on laboratory samples from the 1990s. The proposed framework first examines the statistical variability and distribution of the crucial parameters affecting shear strength within the collected data set. Subsequently, machine learning models are trained and tested to predict soil shear strength from input features such as the water/binder ratio and water content. Global model analysis using feature importance and Shapley additive explanations is conducted to understand the influence of soil input features on shear strength. Further exploration is carried out using partial dependence plots, individual conditional expectation plots, and accumulated local effects to uncover the degree of dependency and important thresholds between key stabilized-soil parameters and shear strength. Heat-map and feature-interaction analysis techniques are then utilized to investigate soil property interactions and correlations. Lastly, a more specific investigation is conducted on particular soil samples to highlight the most influential soil properties locally, employing the local interpretable model-agnostic explanations technique. The validation of the framework involves analyzing laboratory test results obtained from uniaxial compression tests. The framework predicts the shear strength of stabilized soil samples with an accuracy surpassing 90%. Importantly, the explainability results underscore the substantial impact of water content and the water/binder ratio on shear strength.
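The sketch below covers two of the global-explanation steps the abstract names: SHAP values for a tree model, then PDP and ICE curves via scikit-learn. The soil features, data, and random-forest regressor are synthetic placeholders; the study's 1990s data set is not reproduced here.

```python
# Hedged sketch: SHAP summary plus PDP/ICE curves for a shear-strength regressor.
import matplotlib.pyplot as plt
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=500, n_features=4, noise=5.0, random_state=0)
names = ["water_content", "water_binder_ratio", "binder_amount", "curing_time"]  # illustrative
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Global feature attribution with SHAP (fast exact path for tree ensembles).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X, feature_names=names, show=False)
plt.show()

# PDP + ICE in one figure: average effect plus per-sample curves for two features,
# revealing the dependency shapes and thresholds the abstract mentions.
PartialDependenceDisplay.from_estimator(
    model, X, features=[0, 1], kind="both", feature_names=names
)
plt.show()
```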
Heart disease remains a leading cause of mortality worldwide, emphasizing the urgent need for reliable and interpretable predictive models to support early diagnosis and timely intervention. However, existing Deep Learning (DL) approaches often face several limitations, including inefficient feature extraction, class imbalance, suboptimal classification performance, and limited interpretability, which collectively hinder their deployment in clinical settings. To address these challenges, we propose a novel DL framework for heart disease prediction that integrates a comprehensive preprocessing pipeline with an advanced classification architecture. The preprocessing stage involves label encoding and feature scaling. To address the class imbalance inherent in the personal key indicators of the heart disease dataset, the localized random affine shadowsampling technique is employed, which enhances minority-class representation while minimizing overfitting. At the core of the framework lies the Deep Residual Network (DeepResNet), which employs hierarchical residual transformations to facilitate efficient feature extraction and capture complex, non-linear relationships in the data. Experimental results demonstrate that the proposed model significantly outperforms existing techniques, achieving improvements of 3.26% in accuracy, 3.16% in area under the receiver operating characteristic curve, 1.09% in recall, and 1.07% in F1-score. Robustness is validated using 10-fold cross-validation, confirming the model’s generalizability across diverse data distributions. Moreover, model interpretability is ensured through the integration of Shapley additive explanations and local interpretable model-agnostic explanations, offering valuable insights into the contribution of individual features to model predictions. Overall, the proposed DL framework presents a robust, interpretable, and clinically applicable solution for heart disease prediction. Funding: funded by the Ongoing Research Funding Program, project number ORF-2025-648, King Saud University, Riyadh, Saudi Arabia.
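The abstract does not spell out DeepResNet's internals, so the block below is one plausible reading of "hierarchical residual transformations": a generic residual MLP for tabular data in Keras. The widths, depth, and the 17-feature input are illustrative assumptions.

```python
# Hedged sketch: a residual MLP for tabular heart-disease features (assumed layout).
import tensorflow as tf

def residual_block(x, units):
    # Dense -> BN -> Dense with a (projected) skip connection.
    shortcut = x
    h = tf.keras.layers.Dense(units, activation="relu")(x)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.Dense(units)(h)
    if shortcut.shape[-1] != units:
        shortcut = tf.keras.layers.Dense(units)(shortcut)  # match widths for the add
    return tf.keras.layers.Activation("relu")(tf.keras.layers.Add()([shortcut, h]))

inputs = tf.keras.Input(shape=(17,))  # hypothetical count of encoded/scaled indicators
x = tf.keras.layers.Dense(64, activation="relu")(inputs)
for units in (64, 64, 32):
    x = residual_block(x, units)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # P(heart disease)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auroc")])
```

The skip connections let gradients bypass each block, which is what makes stacking such transformations deep yet trainable.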
Understanding the characteristics of the time and distance gaps between primary crashes (PC) and secondary crashes (SC) is crucial for preventing SC occurrences and improving road safety. Although previous studies have tried to analyse the variation of these gaps, there is limited evidence quantifying the relationships between different gaps and various influential factors. This study proposed a two-layer stacking framework to model the time and distance gaps. Specifically, the framework took random forests (RF), gradient boosting decision tree (GBDT), and eXtreme gradient boosting as the base classifiers in the first layer and applied logistic regression (LR) as the combiner in the second layer. On this basis, the local interpretable model-agnostic explanations (LIME) technique was used to interpret the output of the stacking model from both local and global perspectives. Through SC identification and feature selection, 346 SCs and 22 crash-related factors were collected from California interstate freeways. The results showed that the stacking model outperformed the base models as evaluated by accuracy, precision, and recall indicators. The explanations based on LIME suggest that collision type, distance, speed, and volume are the critical features affecting the time and distance gaps. Higher volume can prolong queue length and increase the distance gap between SCs and PCs, and certain collision types, peak periods, workdays, truck involvement, and tow-away crashes are likely to induce a long distance gap. Conversely, the distance gap is shorter when secondary crashes occur in the same direction of travel and close to the primary crashes. Lower speed is a significant factor resulting in a long time gap, while higher speed is correlated with a short time gap. These results are expected to provide insights into how contributory features affect the time and distance gaps and to help decision-makers develop accurate countermeasures to prevent SCs. Funding: funded in part by the Innovation-Driven Project of Central South University (Grant No. 2020CX041) and the Fundamental Research Funds for the Central Universities of Central South University (Grant No. 2022ZZTS0717).
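A compact sketch of the two-layer stacking design follows: RF, GBDT, and XGBoost as first-layer base classifiers, with logistic regression as the second-layer combiner fed their out-of-fold probabilities. The crash data are replaced by a synthetic placeholder shaped like the study's 346 SCs with 22 factors; hyperparameters are illustrative.

```python
# Hedged sketch: two-layer stacking (RF + GBDT + XGBoost -> logistic regression).
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=346, n_features=22, random_state=0)  # placeholder crash data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gbdt", GradientBoostingClassifier(random_state=0)),
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss", random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),  # second-layer combiner
    cv=5,                          # out-of-fold predictions feed the combiner
    stack_method="predict_proba",  # pass class probabilities, not hard labels
)
stack.fit(X_tr, y_tr)
print("held-out accuracy:", stack.score(X_te, y_te))
```

The fitted `stack.predict_proba` can then be passed to a LIME tabular explainer exactly as in the loan-default sketch above, yielding the local and global explanations the study reports.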