Funding: supported by the Youth Innovation Promotion Association of the Chinese Academy of Sciences under Grant No. 2023112 and the National Natural Science Foundation of China under Grant No. 62206266; Zhao Zhang is supported by the China Postdoctoral Science Foundation under Grant No. 2021M703273.
Abstract: Predicting the future trajectories of multiple agents is essential for various real-life applications, such as surveillance systems, autonomous driving, and social robots. The trajectory prediction task is influenced by many factors, including the individual historical trajectory, interactions between agents, and the fuzzy nature of the observed agents' motion. While existing methods have made great progress on trajectory prediction, they treat all the information uniformly, which limits the effectiveness of information utilization. To this end, in this paper, we propose a model-agnostic framework that organizes all the information in a two-level hierarchical view. The first level is the inter-trajectory view. At this level, we observe that the difficulty of predicting different trajectory samples varies. We define trajectory difficulty and train the proposed framework in an "easy-to-hard" scheme. The second level is the intra-trajectory view. We find that the influencing factors for a particular trajectory can be divided into two parts. The first part is global features, which remain stable within a trajectory, e.g., the expected destination. The second part is local features, which change over time, e.g., the current position. We believe that these two types of information should be handled in different ways. The hierarchical view makes it possible to take full advantage of the information in a fine-grained way. Experimental results validate the effectiveness of the proposed model-agnostic framework.
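The abstract does not specify how trajectory difficulty is measured, so the following is only a minimal sketch of the "easy-to-hard" idea: a hypothetical difficulty score (deviation from straight-line motion) orders the samples, and training stages progressively add harder trajectories.

```python
import numpy as np

def trajectory_difficulty(traj):
    """Hypothetical difficulty score: deviation from constant-velocity motion.

    traj: array of shape (T, 2) with (x, y) positions. A straight-line
    interpolation between the endpoints serves as an 'easy' baseline;
    larger residuals mean harder-to-predict motion.
    """
    T = traj.shape[0]
    t = np.linspace(0.0, 1.0, T)[:, None]
    straight = traj[0] + t * (traj[-1] - traj[0])  # constant-velocity baseline
    return float(np.mean(np.linalg.norm(traj - straight, axis=1)))

def easy_to_hard_batches(trajs, n_stages=3):
    """Order samples by difficulty; each stage re-includes all easier samples."""
    order = np.argsort([trajectory_difficulty(t) for t in trajs])
    stages = np.array_split(order, n_stages)
    seen = []
    for s in stages:
        seen.extend(s.tolist())
        yield list(seen)  # curriculum schedule: easy first, hard added later

# Toy data: one straight walk, one curved walk.
straight = np.stack([np.linspace(0, 9, 10), np.zeros(10)], axis=1)
curved = np.stack([np.linspace(0, 9, 10), np.sin(np.linspace(0, 3, 10)) * 4], axis=1)
scores = [trajectory_difficulty(straight), trajectory_difficulty(curved)]
print(scores[0] < scores[1])  # the curved trajectory scores as harder
```

Any curriculum schedule could replace the stage splitting here; the paper's actual difficulty definition may differ.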
Abstract: This study presents an enhanced convolutional neural network (CNN) model integrated with Explainable Artificial Intelligence (XAI) techniques for accurate prediction and interpretation of wheat crop diseases. The aim is to streamline the detection process while offering transparent insights into the model's decision-making to support effective disease management. To evaluate the model, a dataset was collected from wheat fields in Kotli, Azad Kashmir, Pakistan, and tested across multiple data splits. The proposed model demonstrates improved stability, faster convergence, and higher classification accuracy. The results show significant improvements in prediction accuracy and stability compared to prior works, achieving up to 100% accuracy in certain configurations. In addition, XAI methods such as Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP) were employed to explain the model's predictions, highlighting the most influential features contributing to classification decisions. The combined use of CNN and XAI offers a dual benefit: strong predictive performance and clear interpretability of outcomes, which is especially critical in real-world agricultural applications. These findings underscore the potential of integrating deep learning models with XAI to advance automated plant disease detection. The study offers a precise, reliable, and interpretable solution for improving wheat production and promoting agricultural sustainability. Future extensions of this work may include scaling the dataset across broader regions and incorporating additional modalities, such as environmental data, to enhance model robustness and generalization.
Funding: supported by the National Natural Science Foundation of China (62176061).
Abstract: Knowledge distillation (KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To reduce computational expenditure and memory utilization, self-knowledge distillation (SKD) extracts dark knowledge from the model itself rather than from an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we present a novel two-stage approach that provides targeted knowledge on specific samples, named two-stage approach self-knowledge distillation (TOAST). We first soften the hard targets using class medoids generated from the per-class logit vectors. Then, we iteratively distill the under-trained data with past predictions of half the batch size. The two-stage knowledge is linearly combined, efficiently enhancing model performance. Extensive experiments conducted on five backbone architectures show that our method is model-agnostic and achieves the best generalization performance. Besides, TOAST is strongly compatible with existing augmentation-based regularization methods. Our method also obtains a speedup of up to 2.95x compared with a recent state-of-the-art method.
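The first TOAST stage, softening hard targets with class medoids of logit vectors, can be sketched as follows. The blending weight `alpha` and the use of a plain L2 medoid are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def class_medoid(logits):
    """Medoid of a set of logit vectors: the member minimizing total L2 distance."""
    d = np.linalg.norm(logits[:, None, :] - logits[None, :, :], axis=-1)
    return logits[d.sum(axis=1).argmin()]

def soften_targets(logits, labels, num_classes, alpha=0.7):
    """Blend one-hot targets with the softmax of each class's medoid logits.

    alpha is a hypothetical mixing weight between hard and medoid-softened targets.
    """
    medoids = np.stack([class_medoid(logits[labels == c]) for c in range(num_classes)])
    one_hot = np.eye(num_classes)[labels]
    return alpha * one_hot + (1.0 - alpha) * softmax(medoids[labels])

# Toy logits: 7 samples per class, shifted toward their own class dimension.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 7)
logits = rng.normal(size=(21, 3)) + 4.0 * np.eye(3)[labels]
soft = soften_targets(logits, labels, num_classes=3)
print(soft.shape)  # (21, 3)
```

The softened rows remain valid probability distributions, so they can feed a standard KL-divergence distillation loss.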
Abstract: Alzheimer's disease (AD) is a neurological disorder that predominantly affects the brain. In the coming years, it is expected to spread rapidly, with limited progress in diagnostic techniques. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data. However, recent developments in ML have enabled the application of these methods to multiple data sources and input modalities for AD prediction. In this study, we developed a framework that utilizes multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. As part of the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images. We employed graph neural networks for knowledge graph creation and a region-based convolutional neural network approach for image-to-knowledge-graph generation. Additionally, we integrated various explainable AI (XAI) techniques to interpret and elucidate the prediction outcomes derived from multimodal data. Layer-wise relevance propagation was used to explain the layer-wise outcomes in the MRI images. We also incorporated submodular pick local interpretable model-agnostic explanations to interpret the decision-making process based on the tabular data provided. Genetic expression values play a crucial role in AD analysis; we used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R193), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; Taif University Researchers Supporting Project (TURSP-2020/26), Taif University, Taif, Saudi Arabia.
Abstract: Nowadays, deepfake is wreaking havoc on society. Deepfake content is created with the help of artificial intelligence and machine learning to replace one person's likeness with another person's in pictures or recorded videos. Although visual media manipulations are not new, the introduction of deepfakes has marked a breakthrough in creating fake media and information. These manipulated pictures and videos will undoubtedly have an enormous societal impact. Deepfakes use the latest technologies, such as Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), to construct automated methods for creating fake content that is becoming increasingly difficult to detect with the human eye. Therefore, automated DL-based solutions can be an efficient approach for detecting deepfakes. Although the "black-box" nature of DL systems allows for robust predictions, they cannot be completely trusted. Explainability is the first step toward achieving transparency, but the existing incapacity of DL to explain its own decisions to human users limits the efficacy of these systems. Explainable Artificial Intelligence (XAI) can solve this problem by interpreting the predictions of these systems. This work provides a comprehensive study of deepfake detection using DL methods and analyzes the result of the most effective algorithm with Local Interpretable Model-Agnostic Explanations (LIME) to assure its validity and reliability. The study identifies real and deepfake images using different Convolutional Neural Network (CNN) models to obtain the best accuracy. It also explains which part of the image caused the model to make a specific classification, using the LIME algorithm. The dataset was taken from Kaggle and includes 70k real images from the Flickr dataset collected by Nvidia and 70k fake faces of 256 px in size generated by StyleGAN. For the experiments, Jupyter Notebook, TensorFlow, NumPy, and Pandas were used as software, and InceptionResNetV2, DenseNet201, InceptionV3, and ResNet152V2 were used as CNN models. All of these models performed well: InceptionV3 gained 99.68% accuracy, ResNet152V2 achieved an accuracy of 99.19%, and DenseNet201 reached 99.81% accuracy. However, InceptionResNetV2 achieved the highest accuracy of 99.87%, which was verified later with the LIME algorithm for XAI, where the proposed method performed best. The obtained results and dependability demonstrate its suitability for detecting deepfake images effectively.
Funding: supported by the Fundamental Research Funds for the Central Universities (WUT: 2022IVA067).
Abstract: Owing to the convenience of online loans, an increasing number of people are borrowing money on online platforms. With the emergence of machine learning technology, predicting loan defaults has become a popular topic. However, machine learning models have a black-box problem that cannot be disregarded. To make the prediction model's rules more understandable and thereby increase the user's faith in the model, an explanatory model must be used. Logistic regression, decision tree, XGBoost, and LightGBM models are employed to predict loan defaults. The prediction results show that LightGBM and XGBoost outperform the logistic regression and decision tree models in terms of predictive ability: the area under the curve (AUC) for LightGBM is 0.7213, the accuracies of LightGBM and XGBoost exceed 0.8, and their precisions exceed 0.55. We also employed the local interpretable model-agnostic explanations (LIME) approach to undertake an explainable analysis of the prediction findings. The results show that factors such as the loan term, loan grade, credit rating, and loan amount affect the predicted outcomes.
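The LIME idea used above can be sketched without the actual LIME library: perturb one instance, query the black-box model, weight the perturbations by proximity, and fit a weighted linear surrogate whose coefficients are the local explanation. The scorer below is a hypothetical stand-in for the trained LightGBM model, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical black-box default scorer: probability of default rises with
# loan amount (feature 0) and falls with credit rating (feature 1).
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1])))

def lime_style_explanation(instance, predict, n_samples=2000, width=0.75):
    """Fit a locally weighted linear surrogate around one instance (LIME's core idea)."""
    X = instance + rng.normal(scale=1.0, size=(n_samples, instance.size))
    y = predict(X)
    dist = np.linalg.norm(X - instance, axis=1)
    w = np.exp(-(dist ** 2) / width ** 2)         # proximity kernel
    Xb = np.hstack([np.ones((n_samples, 1)), X])  # intercept column
    W = np.sqrt(w)[:, None]                       # weighted least squares
    coef, *_ = np.linalg.lstsq(Xb * W, y * W[:, 0], rcond=None)
    return coef[1:]                               # per-feature local weights

weights = lime_style_explanation(np.array([0.2, -0.1]), black_box)
print(weights)  # positive for loan amount, negative for credit rating
```

The real LIME package adds feature discretization and sample selection on top of this core, but the sign and magnitude of the surrogate coefficients carry the same interpretation.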
Funding: supported by the Joint Funds of the National Natural Science Foundation of China (No. U24A2052), the Program of Shanghai Academic Research Leader (No. 23XD1404600), the Postdoctoral Fellowship Program of the China Postdoctoral Science Foundation (No. GZC20232825), and the Shanghai Eastern Talent Plan (No. QNKJ2024026).
Abstract: To meet the demands of advanced electronic devices, inorganic glasses are required to have comprehensive dielectric, thermal, and mechanical properties. However, the complex composition–property relationship and vast compositional diversity hinder optimization. This study developed machine learning models to predict permittivity, dielectric loss, thermal conductivity, coefficient of thermal expansion, and Young's modulus based on the composition features of inorganic glasses. The optimal models achieve R² values of 0.9614, 0.7411, 0.9454, 0.9684, and 0.8164, respectively. By integrating domain knowledge with model-agnostic interpretation methods, feature contributions and interactions were analyzed. The mixed alkali effect is crucial for property regulation, especially Na-K for dielectric loss and Na-Li for thermal conductivity. The boron anomaly shifts the high-λ region toward a balanced composition of alkali metals with rising B%. Multiobjective optimization of the properties was realized using a genetic algorithm framework. After 23 iterations, the optimal material in the MgO-Al₂O₃-B₂O₃-SiO₂ system exhibits ε_r = 4.78, tan δ = 0.00063, λ = 2.59 W/(m·K), α = 50.27×10⁻⁷ K⁻¹, and E = 82.41 GPa, outperforming all materials in the dataset. The computational effort was reduced to 1/19 of that required by exhaustive search methods. This study provides a model interpretation framework and an effective multiobjective optimization strategy for glass design.
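The genetic-algorithm loop behind such composition optimization can be sketched as follows. Everything here is a toy stand-in: the objective scalarizes two invented property surrogates with an arbitrary weight, whereas the paper optimizes five ML-predicted properties; only the selection/crossover/mutation skeleton is the point.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy scalarized objective over a 4-component composition x (fractions sum to 1):
# a quadratic "property target" term plus a weighted linear term (both invented).
def objectives(x):
    perm = np.sum((x - np.array([0.3, 0.2, 0.3, 0.2])) ** 2)  # minimize
    modulus = -np.sum(x * np.array([1.0, 3.0, 2.0, 0.5]))     # maximize -> negate
    return perm + 0.1 * modulus                                # weighted-sum scalarization

def normalize(p):
    p = np.clip(p, 1e-6, None)  # keep fractions positive
    return p / p.sum(axis=-1, keepdims=True)

def genetic_optimize(n_pop=40, n_gen=30, n_var=4, mut=0.05):
    pop = normalize(rng.random((n_pop, n_var)))
    for _ in range(n_gen):
        fit = np.array([objectives(x) for x in pop])
        parents = pop[np.argsort(fit)[: n_pop // 2]]      # truncation selection (elitist)
        kids = parents.copy()
        partners = parents[rng.permutation(len(parents))]
        cuts = rng.integers(1, n_var, size=len(kids))
        for i, c in enumerate(cuts):                       # one-point crossover
            kids[i, c:] = partners[i, c:]
        kids += rng.normal(scale=mut, size=kids.shape)     # Gaussian mutation
        pop = normalize(np.vstack([parents, kids]))
    fit = np.array([objectives(x) for x in pop])
    return pop[fit.argmin()], fit.min()

best_x, best_f = genetic_optimize()
print(best_x.round(3), best_f)
```

In the paper's setting, the trained property models would replace `objectives`, and a Pareto-based selection (e.g., non-dominated sorting) could replace the weighted sum.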
Abstract: Deep mixing, also known as deep stabilization, is a widely used ground improvement method in the Nordic countries, particularly in urban and infrastructure projects, aiming to enhance the properties of soft, sensitive clays. Understanding the shear strength of stabilized soils and identifying key influencing factors are essential for ensuring the structural stability and durability of engineering structures. This study introduces a novel explainable artificial intelligence framework to investigate critical soil properties affecting shear strength, utilizing a dataset derived from stabilization tests conducted on laboratory samples from the 1990s. The proposed framework examines the statistical variability and distribution of crucial parameters affecting shear strength within the collected dataset. Subsequently, machine learning models are trained and tested to predict soil shear strength based on input features such as the water/binder ratio and water content. Global model analysis using feature importance and Shapley additive explanations is conducted to understand the influence of soil input features on shear strength. Further exploration is carried out using partial dependence plots, individual conditional expectation plots, and accumulated local effects to uncover the degree of dependency and important thresholds between key stabilized soil parameters and shear strength. Heat map and feature interaction analysis techniques are then utilized to investigate interactions and correlations among soil properties. Lastly, a more specific investigation is conducted on particular soil samples to highlight the most influential soil properties locally, employing the local interpretable model-agnostic explanations technique. The validation of the framework involves analyzing laboratory test results obtained from uniaxial compression tests. The framework demonstrates an ability to predict the shear strength of stabilized soil samples with an accuracy surpassing 90%. Importantly, the explainability results underscore the substantial impact of water content and the water/binder ratio on shear strength.
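The partial dependence plots mentioned above rest on a simple computation: sweep one feature over a grid while freezing it across all data rows, and average the model's predictions at each grid value. The linear "shear-strength model" below is a hypothetical stand-in for the study's trained model, with invented coefficients.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical stand-in for the trained shear-strength model: strength falls
# with the water/binder ratio (feature 0) and water content (feature 1).
def model(X):
    return 200.0 - 80.0 * X[:, 0] - 1.5 * X[:, 1] + 5.0 * X[:, 2]

X = np.column_stack([rng.uniform(0.5, 2.0, 300),   # water/binder ratio
                     rng.uniform(30, 90, 300),     # water content (%)
                     rng.normal(size=300)])        # some other soil property

def partial_dependence(predict, X, feature, grid):
    """PD: average prediction over the data with one feature forced to each grid value."""
    pd_vals = []
    for g in grid:
        Xg = X.copy()
        Xg[:, feature] = g      # freeze the feature at the grid value for every row
        pd_vals.append(predict(Xg).mean())
    return np.array(pd_vals)

grid = np.linspace(0.5, 2.0, 5)
pd_curve = partial_dependence(model, X, feature=0, grid=grid)
print(np.all(np.diff(pd_curve) < 0))  # → True: strength drops as water/binder rises
```

Individual conditional expectation plots are the same computation without the final mean, one curve per row, which is what exposes thresholds that averaging hides.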
Funding: This research was funded in part by the Innovation-Driven Project of Central South University (Grant No. 2020CX041) and the Fundamental Research Funds for the Central Universities of Central South University (Grant No. 2022ZZTS0717).
Abstract: Understanding the characteristics of the time and distance gaps between primary crashes (PCs) and secondary crashes (SCs) is crucial for preventing SC occurrences and improving road safety. Although previous studies have tried to analyse the variation of these gaps, there is limited evidence quantifying the relationships between different gaps and various influential factors. This study proposes a two-layer stacking framework to model the time and distance gaps. Specifically, the framework takes random forests (RF), gradient boosting decision trees (GBDT), and eXtreme gradient boosting (XGBoost) as the base classifiers in the first layer and applies logistic regression (LR) as the combiner in the second layer. On this basis, the local interpretable model-agnostic explanations (LIME) technique was used to interpret the output of the stacking model from both local and global perspectives. Through SC identification and feature selection, 346 SCs and 22 crash-related factors were collected from California interstate freeways. The results show that the stacking model outperformed the base models as evaluated by accuracy, precision, and recall indicators. The explanations based on LIME suggest that collision type, distance, speed, and volume are the critical features affecting the time and distance gaps. Higher volume can prolong queue length and increase the distance gap from the SC to the PC. Collision type, peak periods, workdays, truck involvement, and tow-away crashes are likely to induce a long distance gap. Conversely, the distance gap is shorter when secondary crashes occur in the same direction as, and close to, the primary crashes. Lower speed is a significant factor resulting in a long time gap, while higher speed is correlated with a short time gap. These results are expected to provide insights into how contributory features affect the time and distance gaps and to help decision-makers develop accurate strategies to prevent SCs.