39 articles found
Corrigendum to “Meta databases of steel frame buildings for surrogate modelling and machine learning-based feature importance analysis” [Journal of Resilient Cities and Structures Volume 3 Issue 1 (2024) 20-43]
1
Authors: Delbaz Samadian, Jawad Fayaz, Imrose B. Muhit, Annalisa Occhipinti, Nashwan Dawood. Source: Resilient Cities and Structures, 2025, No. 1, p. 124 (1 page)
The authors regret that the original publication of this paper did not include Jawad Fayaz as a co-author. After further discussions and a thorough review of the research contributions, it was agreed that his significant contributions to the foundational aspects of the research warranted recognition, and he has now been added as a co-author.
Keywords: machine learning; meta databases; Jawad Fayaz; surrogate modelling; feature importance analysis; steel frame buildings
Unveiling hidden biases in machine learning feature importance
2
Authors: Yoshiyasu Takefuji. Source: Journal of Energy Chemistry, 2025, No. 3, pp. 49-51 (3 pages)
Nirmal et al. presented a machine learning-based design of ternary organic solar cells, utilizing feature importance [1]. This paper highlights the alarming potential biases in the use of feature importance in machine learning, which can lead to incorrect conclusions and outcomes. Many scientists and researchers, including Nirmal et al., are unaware that feature importances in machine learning are in general model-specific and do not necessarily represent true associations between the target and features.
Keywords: Machine learning; Feature importance; Potential bias; Chi-squared and P-value
An explainable deep learning approach to enhance the prediction of shield tunnel deviation
3
Authors: Jiajie Zhen, Fengwen Lai, Ming Huang, Junjie Zheng, Jim S. Shiau, Ping Wang, Jinhuo Zheng. Source: Journal of Rock Mechanics and Geotechnical Engineering, 2026, No. 1, pp. 566-579 (14 pages)
Although machine learning models have achieved sufficiently high accuracy in predicting shield position deviations, their “black box” nature makes the prediction mechanisms and decision-making processes opaque, leading to weaker explanations and practicability. This study introduces a novel explainable deep learning framework comprising the Informer model with enhanced attention mechanisms (EAMInfor) and deep learning important features (DeepLIFT), aimed at improving the prediction accuracy of shield position deviations and providing interpretability for predictive results. The EAMInfor model attempts to integrate channel attention, spatial attention, and simple attention modules to improve the Informer model's performance. The framework is tested on datasets for four different geological conditions generated from Xiamen Metro Line 3, China. Results show that the EAMInfor model outperforms the traditional Informer and comparison models. The analysis with the DeepLIFT method indicates that the push thrust of the push cylinder and the earth chamber pressure are the most significant features, while the stroke length of the push cylinder shows lower importance. Furthermore, the variation trends in the significance of data points within input sequences exhibit substantial differences between single and composite strata. This framework not only improves predictive accuracy but also strengthens the credibility and reliability of the results.
Keywords: Shield tunnel position deviation; Machine learning; Explainable AI; Deep learning important features
Layered Feature Engineering for E-Commerce Purchase Prediction: A Hierarchical Evaluation on Taobao User Behavior Datasets
4
Authors: Liqiu Suo, Lin Xia, Yoona Chung, Eunchan Kim. Source: Computers, Materials & Continua, 2026, No. 4, pp. 1865-1889 (25 pages)
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features. This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers: Basic, Conversion & Stability (efficiency and volatility across actions), and Advanced Interactions & Activity (cross-behavior synergies and intensity). Using real Taobao (Alibaba’s primary e-commerce platform) logs (57,976 records for 10,203 users; 25 November–03 December 2017), we conducted a hierarchical, layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution. Across logistic regression (LR), decision tree, random forest, XGBoost, and CatBoost models with stratified 5-fold cross-validation, the performance improved monotonically from Basic to Conversion & Stability to Advanced features. With LR, F1 increased from 0.613 (Basic) to 0.962 (Advanced); boosted models achieved high discrimination (AUC of 0.995) and an F1 score of up to 0.983. Calibration and precision–recall analyses indicated strong ranking quality, and we acknowledge potential dataset and period biases given the short (9-day) window. By making feature contributions measurable and reproducible, the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioral modeling. The code and processed artifacts are publicly available, and future work will extend the validation to longer, seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
Keywords: Hierarchical feature engineering; purchase prediction; user behavior dataset; feature importance; e-commerce platform; Taobao
Analysis of Feature Importance and Interpretation for Malware Classification (Cited by: 2)
5
Authors: Dong-Wook Kim, Gun-Yoon Shin, Myung-Mook Han. Source: Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 1891-1904 (14 pages)
This study was conducted to enable prompt classification of malware, which was becoming increasingly sophisticated. To do this, we analyzed the important features of malware and the relative importance of selected features according to a learning model to assess how those important features were identified. Initially, the analysis features were extracted using Cuckoo Sandbox, an open-source malware analysis tool, and the features were then divided into five categories using the extracted information. The 804 extracted features were reduced by 70% after selecting only the most suitable ones for malware classification using a learning model-based feature selection method called recursive feature elimination. Next, these important features were analyzed. The level of contribution from each one was assessed by the Random Forest classifier method. The results showed that System call features were mostly allocated. In the end, it was possible to accurately identify the malware type using only 36 to 76 features for each of the four types of malware with the most analysis samples available. These were the Trojan, Adware, Downloader, and Backdoor malware.
Keywords: Recursive feature elimination; model interpretability; feature importance; malware classification
A machine learning framework for accelerating the development of highly efficient methanol synthesis catalysts (Cited by: 3)
6
Authors: Weixian Li, Yi Dong, Mingchu Ran, Saisai Lin, Peng Liu, Hao Song, Jundong Yi, Chaoyang Zhu, Zhifu Qi, Chenghang Zheng, Xiao Zhang, Xiang Gao. Source: Journal of Energy Chemistry, 2025, No. 5, pp. 372-381 (10 pages)
Converting CO2 with green hydrogen to methanol as a carbon-neutral liquid fuel is a promising route for the long-term storage and distribution of intermittent renewable energy. Nevertheless, attaining highly efficient methanol synthesis catalysts from the vast composition space remains a significant challenge. Here we present a machine learning framework for accelerating the development of high space-time yield (STY) methanol synthesis catalysts. A database of methanol synthesis catalysts has been compiled, consisting of catalyst composition, preparation parameters, structural characteristics, reaction conditions and their corresponding catalytic performance. A methodology for constructing catalyst features based on the intrinsic physicochemical properties of the catalyst components has been developed, which significantly reduced the data dimensionality and enhanced the efficiency of machine learning operations. Two high-precision machine learning prediction models for the activity and product selectivity of catalysts were trained. Using this machine learning framework, an efficient search was achieved within the catalyst composition space, leading to the successful identification of high-STY multielement oxide methanol synthesis catalysts. Notably, the CuZnAlTi catalyst achieved high STYs of 0.49 and 0.65 g_MeOH/(g_catalyst·h) for CO2 and CO hydrogenation to methanol at 250 °C, respectively, and the STY was further increased to 2.63 g_MeOH/(g_catalyst·h) in CO and CO2 co-hydrogenation.
Keywords: Methanol synthesis; Machine learning; Cu-based catalysts; CO/CO2 hydrogenation; Feature importance analysis
Studying corrosion resistance of ODS steels in supercritical water by machine learning
7
Authors: Tian-xing Yang, Peng Dou. Source: Journal of Iron and Steel Research International, 2025, No. 8, pp. 2609-2629 (21 pages)
The corrosion performance of oxide dispersion strengthened (ODS) steel is crucial for supercritical water-cooled reactor (SCWR) applications. Machine learning (ML) models were established to predict the mass gain of ODS steels under corrosion conditions (i.e., supercritical water), thereby evaluating their corrosion resistance. The grain and particle morphologies and the crystal and interface structures of nanoparticles of six ODS steels were studied by transmission electron microscopy, scanning transmission electron microscopy, and high-resolution transmission electron microscopy. Among the six ML models employed, the LightGBM (LGBM) model shows the highest accuracy in predicting the mass gain of ODS steels (root mean square error of 43.18 mg/dm² and 50.21 mg/dm², mean absolute error of 25.91 mg/dm² and 27.82 mg/dm², and coefficient of determination R² of 0.97 and 0.96 for the training set and testing set, respectively). The LGBM feature importance coefficients were also applied to quantify the influence of each feature on corrosion resistance. For microstructural features, the parameters that most strongly influence corrosion resistance are inter-particle spacing and grain diameter, with importance scores of 73 and 63, respectively. Moreover, there is a strong synergistic influence between Cr and Al on the corrosion resistance of ODS steels. Developing this efficient and accurate LGBM model not only enhances the understanding of ODS steel corrosion mechanisms but also provides valuable insights for the targeted optimization and design of high-performance ODS alloys.
Keywords: ODS steel; Machine learning; Corrosion resistance; Nanoparticle; Feature importance
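The three error metrics quoted in the abstract above (RMSE, MAE and R²) can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions, not the authors' evaluation code; the `observed` and `predicted` values are hypothetical numbers chosen for the example.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical mass-gain values in mg/dm^2, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

RMSE and MAE carry the unit of the target (here mg/dm²), whereas R² is dimensionless, which is why the abstract reports the first two with units and the third without.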
A data-driven approach to predict fracture intensity using machine learning for presalt carbonate reservoirs: A feasibility study in the Mero Field, Santos Basin, Brazil
8
Authors: Eberton Rodrigues de Oliveira Neto, Fábio Júnior Damasceno Fernandes, Tuany Younis Abdul Fatah, Raquel Macedo Dias, Zoraida Roxana Tejada da Piedade, Antonio Fernando Menezes Freire, Wagner Moreira Lupinacci. Source: Energy Geoscience, 2025, No. 2, pp. 352-371 (20 pages)
Predicting fracture intensity is essential for optimising reservoir production and mitigating drilling risks in the Brazilian pre-salt layer. However, previous studies rely excessively on conceptual models and typically do not integrate multiple types of data to perform such a task. Moreover, to date, no feasibility-like studies have assessed the reasonableness of such approaches. We propose a data-driven approach that utilises upscaled well logs (Young's modulus, Poisson's ratio, and silica content) alongside seismic attributes (curvature, distance to fault) to predict fracture intensity. The distance to fault is measured using the fault probability volume estimated by a pre-trained convolutional neural network (CNN). We evaluate the effectiveness of this data-driven approach by employing two tree-ensemble models, eXtreme Gradient Boosting (XGBoost) and Random Forest, to estimate the volumetric fracture intensity (P32) in the wells. Regression and residual analyses indicate that XGBoost outperforms Random Forest. Results from feature importance methods, such as permutation importance and SHapley Additive exPlanations (SHAP), highlight curvature as the most important feature, followed by distance to fault, Young's modulus (or P-Impedance), silica content, and Poisson's ratio. The approach has been validated with rock sampling information and two blind tests. Consequently, we believe this workflow can be applied to other wells in nearby fields. The study offers a valuable tool for quantitatively estimating fracture intensity in pre-salt reservoirs. Future research may use this study as a reference for estimating fracture intensity within a seismic volume. The predicted fracture intensity estimates can enhance the reliability of reservoir porosity models and serve as a geohazard indicator to mitigate drilling risks.
Keywords: K1 curvature; Naturally fractured reservoirs; P32; Machine learning; Feature importance
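Permutation importance, one of the feature importance methods named in the abstract above, can be sketched without any library: shuffle one feature column at a time and record how much the model's error grows. The sketch below is a generic illustration under stated assumptions, not the study's workflow; the toy `predict` function, the data `X` and `y`, and the helper names are all hypothetical.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical toy data: the target depends on feature 0 only.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [3.0 * row[0] for row in X]

def predict(row):
    # Stand-in for a fitted model; it ignores feature 1 entirely.
    return 3.0 * row[0]

importances = permutation_importance(predict, X, y)
print(importances)
```

Because the stand-in model never reads feature 1, shuffling that column leaves the error unchanged and its importance is zero, while shuffling feature 0 raises the error. The score describes what the model relies on, which need not match the true relationship in the data.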
How well do machine learning models in finance work?
9
Authors: Yeonchan Kang, Doojin Ryu, Robert I. Webb. Source: Financial Innovation, 2025, No. 1, pp. 3702-3731 (30 pages)
We examine how machine learning models predict stock returns in the Korean market. By analyzing various firm characteristics and macroeconomic variables, we find that tree-based models outperform other machine learning approaches. This finding suggests that, in data-constrained contexts, moderately complex models outperform advanced methods that require extensive datasets. Using PFI, SHAP, and LIME, we consistently identify the 36-month momentum as the key predictor. PDP, ICE, and ALE analyses reveal threshold effects of 36-month momentum that diminish at higher return levels. Our findings underscore the value of ensemble-based methods in settings characterized by short data histories and heightened volatility. This study illustrates how multimethod interpretability can yield deeper economic insights, ultimately guiding more effective investment strategies and policy decisions.
Keywords: Feature importance; Interpretable machine learning; Stock market prediction; Visualization
Interpreting hourly mass concentrations of PM2.5 chemical components with an optimal deep-learning model
10
Authors: Hongyi Li, Ting Yang, Yiming Du, Yining Tan, Zifa Wang. Source: Journal of Environmental Sciences, 2025, No. 5, pp. 125-139 (15 pages)
PM2.5 constitutes a complex and diverse mixture that significantly impacts the environment, human health, and climate change. However, existing observation and numerical simulation techniques have limitations, such as a lack of data, high acquisition costs, and multiple uncertainties. These limitations hinder the acquisition of comprehensive information on PM2.5 chemical composition and the effective implementation of refined air pollution protection and control strategies. In this study, we developed an optimal deep learning model to acquire hourly mass concentrations of key PM2.5 chemical components without complex chemical analysis. The model was trained using a randomly partitioned multivariate dataset arranged in chronological order, including atmospheric state indicators, which previous studies did not consider. Our results showed that the correlation coefficients of key chemical components were no less than 0.96, and the root mean square errors ranged from 0.20 to 2.11 μg/m³ for the entire process (training and testing combined). The model accurately captured the temporal characteristics of key chemical components, outperforming typical machine-learning models, previous studies, and global reanalysis datasets (such as the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) and the Copernicus Atmosphere Monitoring Service ReAnalysis (CAMSRA)). We also quantified the feature importance using the random forest model, which showed that PM2.5, PM1, visibility, and temperature were the most influential variables for key chemical components. In conclusion, this study presents a practical approach to accurately obtain chemical composition information that can contribute to filling in missing data and to improved air pollution monitoring and source identification. This approach has the potential to enhance air pollution control strategies and promote public health and environmental sustainability.
Keywords: PM2.5 chemical composition; Hourly mass concentration; Deep learning; Bayesian optimization; Feature importance
Augmentation of PM1.0 measurements based on machine learning model and environmental factors
11
Authors: Hyemin Hwang, Chang Hyeok Kim, Jong-Sung Park, Sechan Park, Jong Bum Kim, Jae Young Lee. Source: Journal of Environmental Sciences, 2025, No. 10, pp. 91-101 (11 pages)
PM1.0, particulate matter with an aerodynamic diameter smaller than 1.0 μm, can adversely affect human health. However, fewer stations are capable of measuring PM1.0 concentrations than PM2.5 and PM10 concentrations in real time in South Korea (i.e., only 9 locations for PM1.0 vs. 623 locations for PM2.5 or PM10), making it impossible to conduct a nationwide health risk analysis of PM1.0. Thus, this study aimed to develop a PM1.0 prediction model using a random forest algorithm based on PM1.0 data from the nine measurement stations and various environmental input factors. Cross validation, in which the model was trained on eight stations and tested on the remaining station, achieved an average R² of 0.913. The high R² value achieved under mutually exclusive training and test locations in the cross validation can be ascribed to the fact that all the locations had similar relationships between PM1.0 and the input factors, which were captured by our model. Moreover, results of feature importance analysis showed that PM2.5 and PM10 concentrations were the two most important input features in predicting PM1.0 concentration. Finally, the model was used to estimate the PM1.0 concentrations in 623 locations, where input factors such as PM2.5 and PM10 can be obtained. Based on the augmented profile, we identified Seoul and Ansan to be PM1.0 concentration hotspots. These regions are large cities or centers of anthropogenic and industrial activities. The proposed model and the augmented PM1.0 profiles can be used for large epidemiological studies to understand the health impacts of PM1.0.
Keywords: Particulate matter; Random forest; Input factor; PM1.0 prediction model; Cross validation; Feature importance analysis
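The cross validation described in the abstract above, training on eight stations and testing on the ninth, is a leave-one-group-out scheme. A minimal sketch of how such folds can be generated follows; the station labels `S1` to `S9`, the two samples per station, and the helper name `leave_one_group_out` are assumptions made for illustration, not details from the paper.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group label."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical station labels: nine stations with two samples each.
stations = [f"S{k}" for k in range(1, 10) for _ in range(2)]
folds = list(leave_one_group_out(stations))

for held_out, train_idx, test_idx in folds:
    print(held_out, len(train_idx), len(test_idx))
```

Keeping every sample from the held-out station out of the training set is what makes the training and test locations mutually exclusive, so the score reflects how well the model transfers to an unseen site.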
Effectiveness of predicting tunneling-induced ground settlements using machine learning methods with small datasets (Cited by: 12)
12
Authors: Linan Liu, Wendy Zhou, Marte Gutierrez. Source: Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2022, No. 4, pp. 1028-1041 (14 pages)
Prediction of tunneling-induced ground settlements is an essential task, particularly for tunneling in urban settings. Ground settlements should be limited within a tolerable threshold to avoid damage to aboveground structures. Machine learning (ML) methods are becoming popular in many fields, including tunneling and underground excavations, as a powerful learning and predicting technique. However, the available datasets collected from a tunneling project are usually small from the perspective of applying ML methods. Can ML algorithms effectively predict tunneling-induced ground settlements when the available datasets are small? In this study, seven ML methods are utilized to predict tunneling-induced ground settlement using 14 contributing factors measured before or during tunnel excavation. These methods include multiple linear regression (MLR), decision tree (DT), random forest (RF), gradient boosting (GB), support vector regression (SVR), back-propagation neural network (BPNN), and permutation importance-based BPNN (PI-BPNN) models. All methods except BPNN and PI-BPNN are shallow-structure ML methods. The effectiveness of these seven ML approaches on small datasets is evaluated using model accuracy and stability. The model accuracy is measured by the coefficient of determination (R²) of training and testing datasets, and the stability of a learning algorithm indicates robust predictive performance. Also, the quantile error (QE) criterion is introduced to assess model predictive performance considering underpredictions and overpredictions. Our study reveals that the RF algorithm outperforms all the other models with the highest model prediction accuracy (0.9) and stability (3.02 × 10⁻²⁷). Deep-structure ML models do not perform well for small datasets, with relatively low model accuracy (0.59) and stability (5.76). The PI-BPNN architecture is proposed and designed for small datasets, showing better performance than typical BPNN. Six important contributing factors of ground settlements are identified, including tunnel depth, the distance between tunnel face and surface monitoring points (DTM), weighted average soil compressibility modulus (ACM), grouting pressure, penetrating rate and thrust force.
Keywords: Ground settlements; Tunneling; Machine learning; Small dataset; Model accuracy; Model stability; Feature importance
Machine learning-based identification for the main influencing factors of alluvial fan development in the Lhasa River Basin, Qinghai-Tibet Plateau (Cited by: 5)
13
Authors: CHEN Tongde, WEI Wei, JIAO Juying, ZHANG Ziqi, LI Jianjun. Source: Journal of Geographical Sciences (SCIE, CSCD), 2022, No. 8, pp. 1557-1580 (24 pages)
Alluvial fans are an important land resource in the Qinghai-Tibet Plateau with the expansion of human activities. However, the factors of alluvial fan development are poorly understood. According to our previous investigation and research, approximately 826 alluvial fans exist in the Lhasa River Basin (LRB). The main purpose of this work is to identify the main influencing factors by using machine learning. A development index (Di) of alluvial fans was created by combining their area, perimeter, height and gradient. 72% of the data, comprising Di and 11 types of environmental parameters of the matching catchment of each alluvial fan, were used with 10 commonly used machine learning algorithms to train and build models. 18% of the data were used to validate the models. The remaining 10% of the data were used to test the model accuracy. The feature importance of the model was used to illustrate the significance of the 11 types of environmental parameters to Di. The primary modelling results showed that the accuracy (R²) of the ensemble models, including Gradient Boosting Decision Tree, Random Forest and XGBoost, is not less than 0.5. The accuracy of the Gradient Boosting Decision Tree and XGBoost improved after grid search, and their R² values are 0.782 and 0.870, respectively. XGBoost was selected as the final model due to its optimal accuracy and generalisation ability at the sites closest to the LRB. Morphology parameters are the main factors in alluvial fan development, with a cumulative relative feature importance of 74.60% in XGBoost. The final model will have better accuracy and generalisation ability after adding training samples from other regions.
Keywords: alluvial fan; machine learning; feature importance; XGBoost; Lhasa River Basin
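The 72% / 18% / 10% partition into training, validation and test sets described in the abstract above can be sketched as a shuffled index split. This is a generic illustration; the seed, the rounding rule and the function name `split_indices` are assumptions, and the sample count of 826 is taken from the abstract's "approximately 826" figure, so the resulting subset sizes need not match the paper's.

```python
import random

def split_indices(n, fractions=(0.72, 0.18, 0.10), seed=0):
    """Shuffle indices 0..n-1 and cut them into train / validation / test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder, about 10% of the data
    return train, val, test

train, val, test = split_indices(826)
print(len(train), len(val), len(test))
```

Assigning the remainder to the test set, rather than rounding all three fractions independently, guarantees that every sample lands in exactly one subset.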
Application of an interpretable artificial neural network to predict the interface strength of a near-surface mounted fiber-reinforced polymer to concrete joint (Cited by: 3)
14
Authors: Miao SU, Hui PENG, Shao-fan LI. Source: Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2021, No. 6, pp. 427-440 (14 pages)
Accurately estimating the interfacial bond capacity of the near-surface mounted (NSM) carbon fiber-reinforced polymer (CFRP) to concrete joint is a fundamental task in the strengthening and retrofit of existing reinforced concrete (RC) structures. The machine learning (ML) approach may provide an alternative to the commonly used semi-empirical or semi-analytical methods. Therefore, in this work we have developed a predictive model based on an artificial neural network (ANN) approach, i.e. using a back propagation neural network (BPNN), to map the complex data pattern obtained from an NSM CFRP to concrete joint. It involves a set of nine material and geometric input parameters and one output value. Moreover, by employing the neural interpretation diagram (NID) technique, the BPNN model becomes interpretable, as the influence of each input variable on the model can be tracked and quantified based on the connection weights of the neural network. An extensive database including 163 pull-out testing samples, collected from the authors’ research group and from published results in the literature, is used to train and verify the ANN. Our results show that the prediction given by the BPNN model agrees well with the experimental data and yields a coefficient of determination of 0.957 on the whole database. After removing one non-significant feature, the BPNN becomes even more computationally efficient and accurate. In addition, compared with the existing semi-analytical model, the ANN-based approach gives a more accurate estimation. Therefore, the proposed ML method may be a promising alternative for predicting the bond strength of NSM CFRP to concrete joints for structural engineers.
Keywords: Fiber-reinforced polymer (FRP); Bond strength; Machine learning (ML); Neural interpretation diagram (NID); Regression; Feature importance; Connection weights approach
Data-driven casting defect prediction model for sand casting based on random forest classification algorithm (Cited by: 2)
15
Authors: Bang Guan, Dong-hong Wang, Da Shu, Shou-qin Zhu, Xiao-yuan Ji, Bao-de Sun. Source: China Foundry (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 137-146 (10 pages)
The complex sand-casting process, combined with the interactions between process parameters, makes it difficult to control the casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency; it includes the random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which include four types of defects and the corresponding process parameters, were used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, this model serves as an experimental process in the Monte Carlo method to estimate a better temperature distribution. The prediction model, when applied to the factory, greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
Keywords: sand casting process; data-driven method; classification model; quality prediction; feature importance
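The Gini index mentioned in the abstract above measures how mixed the class labels in a tree node are; impurity-based random forest importance accumulates the impurity decrease that splits on a given feature achieve. A minimal sketch of the two underlying quantities follows. The labels `gas_porosity` and `sound` and the example splits are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted

# Hypothetical node with four defective and four sound castings.
parent = ["gas_porosity"] * 4 + ["sound"] * 4

# A split that separates the classes perfectly.
perfect = gini_decrease(parent, parent[:4], parent[4:])

# A split that leaves both children as mixed as the parent.
mixed_child = ["gas_porosity", "gas_porosity", "sound", "sound"]
useless = gini_decrease(parent, mixed_child, mixed_child)

print(gini(parent), perfect, useless)
```

A process parameter whose splits repeatedly yield large decreases ends up with a high importance score, while one whose splits leave the children as mixed as the parent contributes nothing.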
Time Delay Identification in Dynamical Systems Based on Interpretable Machine Learning (Cited by: 2)
16
Authors: XIA Meng, WU Yuzhe, WANG Zhijie. Source: Journal of Donghua University (English Edition) (CAS), 2022, No. 4, pp. 332-339 (8 pages)
Time delay is a common phenomenon in complex industrial processes and dynamical systems, and it is a difficult problem to deal with in industrial control systems, including those in the textile field. Accurate identification of the time delay can greatly improve the efficiency of the design of industrial process control systems. Time delay identification methods based on mathematical modeling require prior knowledge of the structural information of the model, especially for nonlinear systems. Neural network-based identification methods can predict the time delay of the system, but cannot accurately obtain the specific parameters of the time delay. Benefiting from the interpretability of machine learning, a novel method for delay identification based on an interpretable regression decision tree is proposed. Utilizing the self-explanatory analysis of the decision tree model, the parameters with the highest feature importance are obtained to identify the time delay of the system. Excellent results are obtained on simulation data of linear and nonlinear control systems, and the time delay of the systems can be accurately identified.
Keywords: time delay; dynamical system; interpretability; regression tree; feature importance
Smart prediction of liquefaction-induced lateral spreading 被引量:1
17
作者 Muhammad Nouman Amjad Raja Tarek Abdoun Waleed El-Sekelly 《Journal of Rock Mechanics and Geotechnical Engineering》 SCIE CSCD 2024年第6期2310-2325,共16页
The prediction of liquefaction-induced lateral spreading/displacement(Dh)is a challenging task for civil/geotechnical engineers.In this study,a new approach is proposed to predict Dh using gene expression programming(... The prediction of liquefaction-induced lateral spreading/displacement(Dh)is a challenging task for civil/geotechnical engineers.In this study,a new approach is proposed to predict Dh using gene expression programming(GEP).Based on statistical reasoning,individual models were developed for two topographies:free-face and gently sloping ground.Along with a comparison with conventional approaches for predicting the Dh,four additional regression-based soft computing models,i.e.Gaussian process regression(GPR),relevance vector machine(RVM),sequential minimal optimization regression(SMOR),and M5-tree,were developed and compared with the GEP model.The results indicate that the GEP models predict Dh with less bias,as evidenced by the root mean square error(RMSE)and mean absolute error(MAE)for training(i.e.1.092 and 0.815;and 0.643 and 0.526)and for testing(i.e.0.89 and 0.705;and 0.773 and 0.573)in free-face and gently sloping ground topographies,respectively.The overall performance for the free-face topology was ranked as follows:GEP>RVM>M5-tree>GPR>SMOR,with a total score of 40,32,24,15,and 10,respectively.For the gently sloping condition,the performance was ranked as follows:GEP>RVM>GPR>M5-tree>SMOR with a total score of 40,32,21,19,and 8,respectively.Finally,the results of the sensitivity analysis showed that for both free-face and gently sloping ground,the liquefiable layer thickness(T_(15))was the major parameter with percentage deterioration(%D)value of 99.15 and 90.72,respectively. 展开更多
Keywords: Lateral spreading Intelligent modeling Gene expression programming (GEP) Closed-form solution Feature importance
Towards an improved prediction of soil-freezing characteristic curve based on extreme gradient boosting model (Cited by: 1)
18
Authors: Kai-Qi Li, Hai-Long He. Geoscience Frontiers, SCIE, CAS, CSCD, 2024, Issue 6, pp. 229-243 (15 pages)
As an essential property of frozen soils, the change of unfrozen water content (UWC) with temperature, namely the soil-freezing characteristic curve (SFCC), plays a significant role in numerous physical, hydraulic and mechanical processes in cold regions, including heat and water transfer within soils and at the land–atmosphere interface, frost heave and thaw settlement, as well as the simulation of coupled thermo-hydro-mechanical interactions. Although various models have been proposed to estimate the SFCC, their applicability remains limited because they were derived from specific soil types, soil treatments, and test devices. Accordingly, this study proposes a novel data-driven model to predict the SFCC using an extreme gradient boosting (XGBoost) model. A systematic database for the SFCC of frozen soils, compiled from extensive experimental investigations via various testing methods, was used to train the XGBoost model. The SFCCs (UWC as a function of temperature) predicted by the well-trained XGBoost model were compared with the original experimental data and with three conventional models. The results demonstrate the superior performance of the proposed XGBoost model over the traditional models in predicting the SFCC. This study provides valuable insights for future investigations regarding the SFCC of frozen soils.
Keywords: Soil freezing characteristic curve (SFCC) Soil temperature Unfrozen water content XGBoost model Machine Learning Feature importance
Decipher Clinical and Genetic Underpins of Breast Cancer Survival with Machine Learning Methods
19
Author: Zhengkai Zhuang. Advances in Breast Cancer Research, 2023, Issue 4, pp. 163-185 (23 pages)
Breast cancer is one of the most common cancers among women worldwide, with more than two million new cases every year. The disease is associated with numerous clinical and genetic characteristics. In recent years, machine learning has been increasingly applied in the medical field, including for predicting the risk of malignant tumors such as breast cancer. Based on clinical and targeted sequencing data from 1980 primary breast cancer samples, this article aimed to analyze these data and predict survival status after breast cancer. After data engineering, feature selection, and a comparison of machine learning methods, the light gradient boosting machine model was found to perform best after hyperparameter tuning (precision = 0.818, recall = 0.816, F1 score = 0.817, ROC-AUC = 0.867). The top five determinants were the clinical features age at diagnosis, Nottingham Prognostic Index, and cohort, and the genetic features rheb and nr3c1. The study sheds light on the rational allocation of medical resources and provides insights into the early prevention, diagnosis and treatment of breast cancer based on the identified clinical and genetic risk factors.
Keywords: Machine Learning Breast Cancer Prediction Data Analysis Feature importance Comparison
Building Safe, Green and Efficient Energy System in China
20
China Oil & Gas, CAS, 2015, Issue 3, p. 1 (1 page)
The energy issue is of strategic importance to China's overall economic and social development and needs systematic planning and far-sighted deliberation. At present, the revolution in energy technology is advancing rapidly. Global innovation in energy technology has entered a highly dynamic period featuring multi-point breakthroughs,
Keywords: revolution featured advancing strategic producing downstream sector currently import refining