BACKGROUND Ischemic heart disease (IHD) impacts quality of life and has the highest mortality rate among cardiovascular diseases globally. AIM To compare variations in the parameters of the single-lead electrocardiogram (ECG) at rest and during physical exertion in individuals with and without diagnosed IHD, using vasodilator-induced stress computed tomography (CT) myocardial perfusion imaging as the diagnostic reference standard. METHODS This single-center observational study included 80 participants aged ≥40 years who gave informed written consent. Both groups, G1 (n=31) with and G2 (n=49) without a post-stress-induced myocardial perfusion defect, underwent cardiologist consultation, anthropometric measurements, blood pressure and pulse rate measurement, echocardiography, cardio-ankle vascular index assessment, bicycle ergometry, and a 3-min single-lead ECG recording (Cardio-Qvark) before and immediately after bicycle ergometry, followed by CT myocardial perfusion imaging. LASSO regression with nested cross-validation was used to find the association between Cardio-Qvark parameters and the presence of the perfusion defect. Statistical processing was performed with the R programming language v4.2, Python v3.10, and Statistica 12. RESULTS Bicycle ergometry yielded an area under the receiver operating characteristic curve of 50.7% [95% confidence interval (CI): 0.388-0.625], specificity of 53.1% (95%CI: 0.392-0.673), and sensitivity of 48.4% (95%CI: 0.306-0.657). In contrast, the Cardio-Qvark test performed notably better, with an area under the receiver operating characteristic curve of 67% (95%CI: 0.530-0.801), specificity of 75.5% (95%CI: 0.628-0.88), and sensitivity of 51.6% (95%CI: 0.333-0.695). CONCLUSION With machine learning models, the single-lead ECG showed higher diagnostic accuracy than bicycle ergometry, but the difference was not statistically significant. Further investigations are required to uncover the hidden capabilities of single-lead ECG in IHD diagnosis.
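As an illustration of the statistical pipeline this abstract describes, the following minimal Python sketch runs LASSO-penalized logistic regression with nested cross-validation: the inner loop tunes the penalty strength, the outer loop gives an unbiased AUC estimate. All data, feature counts, and settings below are synthetic placeholders, not the study's actual Cardio-Qvark variables.

```python
# Minimal sketch: LASSO logistic regression with nested cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))          # 80 participants, 20 ECG-derived features (hypothetical)
y = rng.integers(0, 2, size=80)        # 1 = perfusion defect present

# Inner loop tunes the L1 penalty strength; outer loop estimates AUC honestly.
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
lasso = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", max_iter=5000),
)
grid = GridSearchCV(lasso, {"logisticregression__C": np.logspace(-2, 2, 10)},
                    cv=inner, scoring="roc_auc")
auc = cross_val_score(grid, X, y, cv=outer, scoring="roc_auc")
print(f"nested-CV AUC: {auc.mean():.3f} +/- {auc.std():.3f}")
```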
Background Cotton is one of the most important commercial crops after food crops, especially in countries like India, where it is grown extensively under rainfed conditions. Because of its usage in multiple industries, such as the textile, medicine, and automobile industries, it has great commercial importance. The crop's performance is strongly influenced by prevailing weather dynamics. As the climate changes, assessing how weather variations affect crop performance is essential. Among the available techniques, crop models are the most effective and widely used tools for predicting yields. Results This study compares statistical and machine learning models to assess their ability to predict cotton yield across the major producing districts of Karnataka, India, using a long-term dataset (1990-2023) of yield and weather factors. Artificial neural networks (ANNs) performed best, with acceptable yield deviations within ±10% during both the vegetative stage (F1) and mid stage (F2) of cotton. The model evaluation metrics, root mean square error (RMSE), normalized root mean square error (nRMSE), and modelling efficiency (EF), were also within acceptable limits in most districts. Furthermore, the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district. Specifically, morning relative humidity as an individual parameter, and its interaction with maximum and minimum temperature, had a major influence on cotton yield in most of the districts with predicted yields. These differences highlight the district-specific interactions of weather factors in cotton yield formation, reflecting the individual response of each weather factor under different soils and management conditions across the major cotton-growing districts of Karnataka. Conclusions Compared with statistical models, machine learning models such as ANNs proved more efficient in forecasting cotton yield because of their ability to capture the interactive effects of weather factors on yield formation at different growth stages. This highlights the suitability of ANNs for yield forecasting under rainfed conditions and for studying the relative impacts of weather factors on yield. The study thus provides valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
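The three evaluation metrics named above are easy to compute directly. A minimal sketch follows, assuming the common conventions that nRMSE is RMSE expressed as a percentage of the observed mean and that EF is the Nash-Sutcliffe form of modelling efficiency; the yield values are hypothetical.

```python
# Minimal sketch of RMSE, nRMSE (% of observed mean), and modelling efficiency (EF).
import numpy as np

def evaluate(obs: np.ndarray, pred: np.ndarray) -> dict:
    resid = obs - pred
    rmse = np.sqrt(np.mean(resid ** 2))
    nrmse = 100.0 * rmse / obs.mean()                      # percent of mean yield
    ef = 1.0 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {"RMSE": rmse, "nRMSE(%)": nrmse, "EF": ef}

obs = np.array([410.0, 385.0, 450.0, 402.0, 395.0])        # hypothetical yields (kg/ha)
pred = np.array([398.0, 401.0, 441.0, 416.0, 388.0])
print(evaluate(obs, pred))
```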
The backwater effect caused by tributary inflow can significantly elevate the water level profile upstream of a confluence point. However, the influence of mainstream and confluence discharges on the backwater effect in a river reach remains unclear. In this study, hydrological data collected from the Jingjiang Reach of the Yangtze River in China were statistically analyzed to determine the backwater degree and range for three representative mainstream discharges. The results indicated that the backwater degree increased with mainstream discharge, and a positive relationship was observed between the runoff ratio and the backwater degree at specific representative mainstream discharges. Following the operation of the Three Gorges Project, the backwater effect in the Jingjiang Reach diminished. For instance, mean backwater degrees for low, moderate, and high mainstream discharges were 0.83 m, 1.61 m, and 2.41 m during 1990-2002, but decreased to 0.30 m, 0.95 m, and 2.08 m during 2009-2020. The backwater range extended upstream as mainstream discharge increased from 7000 m3/s to 30000 m3/s. Moreover, a random forest-based machine learning model was used to quantify the backwater effect under varying mainstream and confluence discharges, accounting for the impacts of mainstream discharge, confluence discharge, and channel degradation in the Jingjiang Reach. At the Jianli Hydrological Station, a decrease in mainstream discharge during flood seasons resulted in a 7%-15% increase in monthly mean backwater degree, while an increase in mainstream discharge during dry seasons led to a 1%-15% decrease. Furthermore, increasing confluence discharge from Dongting Lake during June-July and September-November resulted in an 11%-42% increase in monthly mean backwater degree. Continuous channel degradation in the Jingjiang Reach contributed to a 6%-19% decrease in monthly mean backwater degree. Under the combined influence of these factors, the monthly mean backwater degree in 2017 ranged from a 53% decrease to a 37% increase relative to the corresponding values in 1991.
Understanding spatial heterogeneity in groundwater responses to multiple factors is critical for water resource management in coastal cities. Daily groundwater depth (GWD) data from 43 wells (2018-2022) were collected in three coastal cities of Jiangsu Province, China. Seasonal and trend decomposition using Loess (STL), together with wavelet analysis and empirical mode decomposition, was applied to identify tide-influenced wells, while the remaining wells were grouped by hierarchical clustering analysis (HCA). Machine learning models were developed to predict GWD, and their responses to natural conditions and human activities were assessed with the Shapley Additive exPlanations (SHAP) method. Results showed that eXtreme Gradient Boosting (XGB) was superior to the other models in prediction performance and computational efficiency (R² > 0.95). GWD in Yancheng and southern Lianyungang was greater than in Nantong and exhibited larger fluctuations. Groundwater within 5 km of the coastline was affected by tides, with more pronounced effects in agricultural areas than in urban areas. Shallow groundwater (3-7 m depth) responded immediately (0-1 day) to rainfall and was primarily influenced by farmland and topography (slope and distance from rivers). Rainfall recharge to groundwater peaked at 50% farmland coverage, but this effect was suppressed by high temperatures (>30 °C), and the suppression intensified with distance from rivers, especially in forest and grassland. Deep groundwater (>10 m) showed delayed responses to rainfall (1-4 days) and temperature (10-15 days), with GDP as the primary influence, followed by agricultural irrigation and population density. Farmland helped maintain stable GWD in low-population-density regions, while excessive farmland coverage (>90%) led to overexploitation. In the early stages of GDP development, increased industrial and agricultural water demand led to GWD decline, but as GDP levels rose significantly, groundwater consumption pressure gradually eased. This methodological framework is applicable not only to coastal cities in China but could also be extended to coastal regions worldwide.
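The STL step above is straightforward to reproduce. Below is a minimal sketch that decomposes a synthetic daily groundwater-depth series (standing in for one well's record) into trend, annual seasonal, and residual components with statsmodels; the series parameters are illustrative only.

```python
# Minimal sketch: STL decomposition of a daily groundwater-depth series.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

idx = pd.date_range("2018-01-01", "2022-12-31", freq="D")
t = np.arange(len(idx))
gwd = (5.0 + 0.0005 * t                          # slow deepening trend
       + 0.8 * np.sin(2 * np.pi * t / 365.25)    # annual cycle
       + np.random.default_rng(0).normal(0, 0.1, len(idx)))
series = pd.Series(gwd, index=idx, name="GWD_m")

res = STL(series, period=365).fit()              # trend / seasonal / residual
print(res.trend.iloc[-1], res.seasonal.std(), res.resid.std())
```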
This research investigates the influence of indoor and outdoor factors on photovoltaic (PV) power generation at Utrecht University, aiming to accurately predict PV system performance by identifying critical impact factors and thereby improve renewable energy efficiency. To predict plant efficiency, nineteen variables are analyzed: nine indoor photovoltaic panel characteristics (Open Circuit Voltage (Voc), Short Circuit Current (Isc), Maximum Power (Pmpp), Maximum Voltage (Umpp), Maximum Current (Impp), Fill Factor (FF), Parallel Resistance (Rp), Series Resistance (Rs), and Module Temperature) and ten environmental factors (Air Temperature, Air Humidity, Dew Point, Air Pressure, Irradiation, Irradiation Propagation, Wind Speed, Wind Speed Propagation, Wind Direction, and Wind Direction Propagation). This study provides a new perspective not previously addressed in the literature. Different machine learning methods, Multilayer Perceptron (MLP), Multivariate Adaptive Regression Splines (MARS), Multiple Linear Regression (MLR), and Random Forest (RF), are used to predict power values using data from the installed PV panels. Panel values obtained under real field conditions were used to train the models, and the results were compared. The MLP model achieved the highest accuracy, 0.990. The machine learning models used for solar energy forecasting show high performance and produce results close to actual values. Models such as MLP and RF can be applied in diverse locations based on load demand.
The Indian Himalayan region frequently experiences climate change-induced landslides. Landslide susceptibility assessment therefore assumes great significance for lessening the impact of landslide hazards. This paper assesses landslide susceptibility in the Shimla district of the northwest Indian Himalayan region. It examined the effectiveness of random forest (RF), multilayer perceptron (MLP), sequential minimal optimization regression (SMOreg), and bagging ensemble (B-RF, B-SMOreg, B-MLP) models. A landslide inventory map comprising 1052 locations of past landslide occurrences was split into training (70%) and testing (30%) datasets. The site-specific influencing factors were selected using a multicollinearity test. The relationship between past landslide occurrences and influencing factors was established using the frequency ratio method. The effectiveness of the machine learning models was verified through performance assessors, and the landslide susceptibility maps were validated by the area under the receiver operating characteristic curve (ROC-AUC), accuracy, precision, recall, and F1-score. The key performance metrics and map validation demonstrated that the B-RF model (correlation coefficient: 0.988, mean absolute error: 0.010, root mean square error: 0.058, relative absolute error: 2.964, ROC-AUC: 0.947, accuracy: 0.778, precision: 0.819, recall: 0.917, and F1-score: 0.865) outperformed the single classifiers and the other bagging ensemble models for landslide susceptibility. The results show that the largest area fell within the very high susceptibility zone (33.87%), followed by the low (27.30%), high (20.68%), and moderate (18.16%) susceptibility zones. Average annual rainfall, slope, lithology, soil texture, and earthquake magnitude were identified as the influencing factors for very high landslide susceptibility; soil texture, lineament density, and elevation were attributed to high and moderate susceptibility. The study thus calls for devising suitable landslide mitigation measures in the study area. Structural measures, an immediate response system, community participation, and coordination among stakeholders may help lessen the detrimental impact of landslides. The findings could aid decision-makers in mitigating future catastrophes and devising suitable strategies in other geographical regions with similar geological characteristics.
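The bagging ensembles compared above (B-RF shown here) can be sketched by wrapping a base learner in scikit-learn's BaggingClassifier and scoring with ROC-AUC. The factor stack, inventory labels, and hyperparameters below are synthetic placeholders, not the study's data.

```python
# Minimal sketch: a bagged random forest (B-RF) scored by ROC-AUC.
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1052, 12))            # 12 conditioning factors per location (hypothetical)
y = rng.integers(0, 2, size=1052)          # 1 = past landslide occurrence

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
b_rf = BaggingClassifier(
    # scikit-learn >= 1.2 uses `estimator`; older releases call it `base_estimator`.
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    n_estimators=10, random_state=0,
)
b_rf.fit(X_tr, y_tr)
print("ROC-AUC:", roc_auc_score(y_te, b_rf.predict_proba(X_te)[:, 1]))
```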
BACKGROUND Colorectal cancer significantly impacts global health, and unplanned reoperations after surgery are key determinants of patient outcomes. Existing predictive models for these reoperations lack precision in integrating complex clinical data. AIM To develop and validate a machine learning model for predicting the risk of unplanned reoperation in colorectal cancer patients. METHODS Data from patients treated for colorectal cancer (n=2044) at the First Affiliated Hospital of Wenzhou Medical University and Wenzhou Central Hospital from March 2020 to March 2022 were retrospectively collected. Patients were divided into an experimental group (n=60) and a control group (n=1984) according to the occurrence of unplanned reoperation, and into training and validation groups (7:3 ratio). We used three different machine learning methods to screen characteristic variables. A nomogram was created based on multifactor logistic regression, and model performance was assessed using the receiver operating characteristic curve, calibration curve, Hosmer-Lemeshow test, and decision curve analysis. The risk scores of the two groups were calculated and compared to validate the model. RESULTS Compared with the control group, more patients in the experimental group were ≥60 years old, male, and had histories of hypertension, laparotomy, and hypoproteinemia. Multiple logistic regression analysis confirmed the following as independent risk factors for unplanned reoperation (P<0.05): Prognostic Nutritional Index value; history of laparotomy, hypertension, or stroke; hypoproteinemia; age; tumor-node-metastasis staging; surgical time; gender; and American Society of Anesthesiologists classification. Receiver operating characteristic curve analysis showed that the model had good discrimination and clinical utility. CONCLUSION This study used a machine learning approach to build a model that accurately predicts the risk of postoperative unplanned reoperation in patients with colorectal cancer, which can improve treatment decisions and prognosis.
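The decision curve analysis mentioned above reduces to a simple net-benefit computation at each threshold probability. Below is a minimal sketch comparing a model's net benefit against the treat-all strategy; labels and predicted probabilities are hypothetical.

```python
# Minimal sketch: net benefit for decision curve analysis.
import numpy as np

def net_benefit(y: np.ndarray, p: np.ndarray, pt: float) -> float:
    """Net benefit at threshold pt: TP/n - FP/n * pt/(1-pt)."""
    treated = p >= pt
    tp = np.sum(treated & (y == 1))
    fp = np.sum(treated & (y == 0))
    n = len(y)
    return tp / n - fp / n * pt / (1 - pt)

y = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])                  # 1 = unplanned reoperation
p = np.array([0.1, 0.3, 0.8, 0.2, 0.6, 0.4, 0.1, 0.7, 0.2, 0.3])
for pt in (0.1, 0.2, 0.3):
    nb_model = net_benefit(y, p, pt)
    nb_all = net_benefit(y, np.ones_like(p, dtype=float), pt)  # operate on everyone
    print(f"pt={pt:.1f}  model={nb_model:+.3f}  treat-all={nb_all:+.3f}")
```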
BACKGROUND Liver transplantation (LT) is a life-saving intervention for patients with end-stage liver disease. However, the equitable allocation of scarce donor organs remains a formidable challenge. Prognostic tools are pivotal in identifying the most suitable transplant candidates. Traditionally, scoring systems like the model for end-stage liver disease have been instrumental in this process. Nevertheless, the landscape of prognostication is being transformed by the integration of machine learning (ML) and artificial intelligence models. AIM To assess the utility of ML models in prognostication for LT, comparing their performance and reliability with established traditional scoring systems. METHODS Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, we conducted a thorough and standardized literature search of the PubMed/MEDLINE database. Our search imposed no restrictions on publication year, age, or gender. Exclusion criteria encompassed non-English studies, review articles, case reports, conference papers, studies with missing data, and those exhibiting evident methodological flaws. RESULTS Our search yielded a total of 64 articles, of which 23 met the inclusion criteria. Among the selected studies, 60.8% originated from the United States and China combined, and only one pediatric study met the criteria. Notably, 91% of the studies were published within the past five years. ML models consistently demonstrated satisfactory to excellent area under the receiver operating characteristic curve values (ranging from 0.6 to 1) across all studies, surpassing the performance of traditional scoring systems. Random forest exhibited superior predictive capability for 90-day mortality following LT, sepsis, and acute kidney injury (AKI), whereas gradient boosting excelled in predicting the risk of graft-versus-host disease, pneumonia, and AKI. CONCLUSION This study underscores the potential of ML models in guiding decisions related to allograft allocation and LT, marking a significant evolution in the field of prognostication.
Deformation monitoring is a critical measure for intuitively reflecting the operational behavior of a dam. However, deformation monitoring data are often incomplete because of environmental changes, monitoring instrument faults, and human operational errors, which hinders accurate assessment of actual deformation patterns. This study proposes a method for quantifying the deformation similarity between measurement points by recognizing the spatiotemporal characteristics of concrete dam deformation monitoring data. It introduces a spatiotemporal clustering analysis of concrete dam deformation behavior and employs a support vector machine model to fill in the missing data in concrete dam deformation monitoring. The proposed method was validated on a concrete dam project, with the model error remaining within 5%, demonstrating its effectiveness in processing missing deformation data. This approach enhances the capability of early-warning systems and contributes to improved dam safety management.
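One plausible reading of the gap-filling step above, sketched below on synthetic data, is to regress a target point's deformation on its most similar neighboring points with a support vector machine and predict the missing epochs; the signal shapes, gap location, and SVR settings are illustrative assumptions.

```python
# Minimal sketch: SVR gap-filling of a deformation series from correlated neighbors.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 400)
neighbors = np.column_stack([np.sin(t), np.sin(t + 0.2), np.cos(t)])  # similar points
target = 0.6 * np.sin(t + 0.1) + rng.normal(0, 0.02, t.size)
missing = slice(250, 300)                       # a faulty-instrument gap

mask = np.ones(t.size, dtype=bool)
mask[missing] = False
model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.01))
model.fit(neighbors[mask], target[mask])        # learn target from its neighbors
filled = target.copy()
filled[missing] = model.predict(neighbors[missing])
err = np.abs(filled[missing] - 0.6 * np.sin(t[missing] + 0.1)).mean()
print(f"mean absolute gap-fill error: {err:.4f}")
```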
Background and Objective The effectiveness of radiofrequency ablation (RFA) in improving long-term survival outcomes for patients with a solitary hepatocellular carcinoma (HCC) measuring 5 cm or less remains uncertain. This study was designed to elucidate the impact of RFA therapy on the survival outcomes of these patients and to construct a prognostic model for patients following RFA. Methods The study used the Surveillance, Epidemiology, and End Results (SEER) database from 2004 to 2017, focusing on patients diagnosed with a solitary HCC lesion ≤5 cm. We compared the overall survival (OS) and cancer-specific survival (CSS) rates of these patients with those of patients who received hepatectomy, radiotherapy, or chemotherapy, or who were part of a blank control group. To enhance the reliability of our findings, we employed stabilized inverse probability treatment weighting (sIPTW) and stratified analyses, and we conducted Cox regression analysis to identify prognostic factors. XGBoost models were developed to predict 1-, 3-, and 5-year CSS and were evaluated with receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis (DCA) curves, among other tools. Results Whether the data were unadjusted or adjusted with sIPTW, the 5-year OS (46.7%) and CSS (58.9%) rates were higher in the RFA group than in the radiotherapy (27.1%/35.8%), chemotherapy (32.9%/43.7%), and blank control (18.6%/30.7%) groups, but lower than in the hepatectomy group (69.4%/78.9%). Stratified analysis based on age and cirrhosis status revealed that RFA and hepatectomy yielded similar OS and CSS outcomes for patients with cirrhosis aged over 65 years. Age, race, marital status, grade, cirrhosis status, tumor size, and AFP level were selected to construct the XGBoost models based on the training cohort. The areas under the curve (AUCs) for 1, 3, and 5 years in the validation cohort were 0.88, 0.81, and 0.79, respectively. Calibration plots further demonstrated the consistency between predicted and actual values in both the training and validation cohorts. Conclusion RFA can improve the survival of patients diagnosed with a solitary HCC lesion ≤5 cm, and in certain clinical scenarios it achieves survival outcomes comparable to those of hepatectomy. The XGBoost models developed in this study performed admirably in predicting the CSS of patients with solitary HCC tumors smaller than 5 cm following RFA.
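The sIPTW adjustment above follows a standard recipe: estimate propensity scores, then weight each patient by the marginal treatment probability over their propensity. A minimal sketch, with synthetic covariates standing in for the SEER variables:

```python
# Minimal sketch: stabilized inverse probability treatment weights (sIPTW).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                 # age, tumor size, AFP, etc. (hypothetical)
treat = rng.integers(0, 2, size=1000)          # 1 = received RFA

ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]
p_treat = treat.mean()                         # marginal probability of treatment
weights = np.where(treat == 1, p_treat / ps, (1 - p_treat) / (1 - ps))
print("weight range:", weights.min().round(2), "-", weights.max().round(2))
# These weights would then enter a weighted Kaplan-Meier or Cox model.
```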
Spatial heterogeneity refers to the variation or differences in characteristics or features across different locations or areas in space. Spatial data are information that explicitly or implicitly belongs to a particular geographic region or location, also known as geospatial data or geographic information. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the random forest regressor and a CNN. The model is fine-tuned using cross-validation for hyperparameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran's I for examining global spatial autocorrelation and local Moran's I for assessing local spatial autocorrelation in the residuals. To validate the approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. Results indicate superior performance, with an R-squared of 0.90, outperforming RF (0.84) and CNN (0.74). This study contributes a detailed understanding of spatial variations in the data by exploiting the geographic information (longitude and latitude) present in the dataset. Assessed by root mean squared error (RMSE), the hybrid model also yielded lower errors, deviating by 53.65% from the RF model and 63.24% from the CNN model. The global Moran's I index of the residuals was observed to be 0.10. The study shows that the hybrid model correctly predicted house prices both in clusters and in dispersed areas.
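Global Moran's I on residuals can be computed from scratch with a row-standardized k-nearest-neighbor weights matrix, as in the minimal sketch below; the coordinates, residuals, and choice of k are synthetic and illustrative.

```python
# Minimal sketch: global Moran's I of model residuals with kNN spatial weights.
import numpy as np
from scipy.spatial import cKDTree

def morans_i(values: np.ndarray, coords: np.ndarray, k: int = 8) -> float:
    n = len(values)
    z = values - values.mean()
    _, nbrs = cKDTree(coords).query(coords, k=k + 1)   # first hit is the point itself
    w = np.zeros((n, n))
    for i, row in enumerate(nbrs):
        w[i, row[1:]] = 1.0 / k                        # row-standardized weights
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))                    # (longitude, latitude)
resid = rng.normal(size=200)                           # model residuals
print(f"Moran's I: {morans_i(resid, coords):.3f}")     # near 0 for random residuals
```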
This study evaluates the performance of advanced machine learning (ML) models in predicting the mechanical properties of eco-friendly self-compacting concrete (SCC), focusing on compressive strength, V-funnel time, L-box ratio, and slump flow. The motivation stems from the increasing need to optimize concrete mix designs while minimizing environmental impact and reducing reliance on costly physical testing. Six ML models were trained and validated on a comprehensive dataset of 239 mix designs: backpropagation neural network (BPNN), random forest regression (RFR), K-nearest neighbors (KNN), stacking, bagging, and eXtreme gradient boosting (XGBoost). The models' predictive accuracies were assessed using the coefficient of determination, mean squared error, root mean squared error, and mean absolute error. XGBoost consistently outperformed the other models, achieving coefficient of determination values of 0.999, 0.933, and 0.935 for compressive strength on the training, validation, and testing datasets, respectively. Sensitivity analysis revealed that cement, silica fume, coarse aggregate, and superplasticizer positively influenced compressive strength, while water content had a negative impact. These findings highlight the potential of ML models, particularly XGBoost and RFR, for optimizing SCC mix designs, reducing reliance on physical testing, and enhancing sustainability in construction. Applying these models can lead to more efficient and eco-friendly concrete mix designs, benefiting real-world construction projects through improved quality control and reduced costs.
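The abstract does not specify the sensitivity-analysis method; one common substitute, sketched below, is permutation importance on a fitted XGBoost regressor. The mix-design features, toy response, and hyperparameters are synthetic placeholders.

```python
# Minimal sketch: XGBoost strength model + permutation-based sensitivity check.
import numpy as np
from xgboost import XGBRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
names = ["cement", "silica_fume", "coarse_agg", "superplasticizer", "water"]
X = rng.uniform(size=(239, len(names)))
y = (60*X[:, 0] + 15*X[:, 1] + 8*X[:, 2] + 5*X[:, 3] - 25*X[:, 4]
     + rng.normal(0, 1, 239))                   # toy compressive strength

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.1).fit(X_tr, y_tr)
imp = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(names, imp.importances_mean), key=lambda p: -p[1]):
    print(f"{name:>16}: {score:.3f}")
```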
Zhou et al's investigation of a non-invasive deep learning (DL) method for evaluating the colorectal tumor immune microenvironment using preoperative computed tomography (CT) radiomics, published in the World Journal of Gastrointestinal Oncology, is thorough and scientific. The study analyzed preoperative CT images of 315 confirmed colorectal cancer patients, using manual regions of interest to extract DL features, and developed a DL model from CT images and histopathological images to predict immune-related indicators in colorectal cancer patients. Pathological parameters (tumor-stroma ratio, tumor-infiltrating lymphocyte infiltration, immunohistochemistry, tumor immune microenvironment, and immune score) and radiomics data (CT imaging and model construction) were combined to generate artificial intelligence-powered models. Clinical benefit and goodness of fit of the models were assessed using receiver operating characteristic analysis, area under the curve, and decision curve analysis. The developed DL-based radiomics prediction model for non-invasive evaluation of tumor markers demonstrated potential for personalized treatment planning and immunotherapy strategies in colorectal cancer patients. However, the study involved a small group from a single medical center and lacks inclusion/exclusion criteria; it should also include clinicopathological features to provide valuable insights for therapeutic practice in colorectal cancer patients.
Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid, given the expanding demand for green power and the worldwide shift toward green energy resources. Particularly in light of aggressive GHG emission targets, accurate GHI forecasting is essential for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and a performance analysis of various forecasting models, ARIMA (autoregressive integrated moving average), Elman NN (Elman neural network), RBFN (radial basis function neural network), SVM (support vector machine), LSTM (long short-term memory), persistence, BPN (backpropagation neural network), MLP (multilayer perceptron neural network), RF (random forest), and XGBoost (eXtreme gradient boosting), for multi-seasonal GHI forecasting. Data from the India region were used to evaluate the models' performance and forecasting ability across the winter, spring, summer, monsoon, and autumn seasons. Performance was quantified with mean absolute error (MAE), root mean squared error (RMSE), and R-squared (R²), implemented in Python. The analysis showed that random forest and eXtreme gradient boosting produced the most accurate forecasts in all seasons and were the top competing models. For winter, XGBoost was the best model (MAE: 1.6325, RMSE: 4.8338, R²: 0.9998); for spring, XGBoost was again the best (MAE: 2.599599, RMSE: 5.58539, R²: 0.999784); for summer, RF was the best (MAE: 1.03843, RMSE: 2.116325, R²: 0.999967); for the monsoon, RF was the best (MAE: 0.892385, RMSE: 2.417587, R²: 0.999942); and for autumn, RF was the best (MAE: 0.810462, RMSE: 1.928215, R²: 0.999958). Based on seasonal variations and computing constraints, these findings enable energy system operators to choose the most effective forecasting models.
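The persistence model listed among the candidates is the natural baseline for this kind of study: tomorrow's forecast is simply today's observation, scored per season. A minimal sketch follows; the synthetic GHI series and the rough five-season month split are illustrative assumptions.

```python
# Minimal sketch: seasonal scoring of a persistence GHI baseline.
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, r2_score

idx = pd.date_range("2022-01-01", periods=365, freq="D")
t = np.arange(365)
ghi = (5.5 + 2.0 * np.sin(2 * np.pi * (t - 80) / 365)
       + np.random.default_rng(0).normal(0, 0.3, 365))       # toy daily GHI (kWh/m2)
df = pd.DataFrame({"ghi": ghi}, index=idx)
df["forecast"] = df["ghi"].shift(1)             # persistence: y_hat(t) = y(t-1)

def season(m: int) -> str:                      # rough five-season split for India
    if m in (12, 1, 2): return "winter"
    if m in (3, 4):     return "spring"
    if m in (5, 6):     return "summer"
    if m in (7, 8, 9):  return "monsoon"
    return "autumn"

df["season"] = df.index.month.map(season)
for name, g in df.dropna().groupby("season"):
    mae = mean_absolute_error(g["ghi"], g["forecast"])
    rmse = np.sqrt(np.mean((g["ghi"] - g["forecast"]) ** 2))
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f} R2={r2_score(g['ghi'], g['forecast']):.3f}")
```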
Soil desiccation cracking is ubiquitous in nature and has significant potential impacts on the engineering geological properties of soils. Previous studies have extensively examined the various factors affecting soil cracking behavior through numerous small-sample experiments, but experimental studies alone cannot accurately describe soil cracking behavior. In this study, we first propose a modeling framework for predicting the surface crack ratio of soil desiccation cracking based on machine learning and interpretability analysis. The framework utilizes 1040 sets of soil cracking experimental data and employs random forest (RF), extreme gradient boosting (XGBoost), and artificial neural network (ANN) models to predict the surface crack ratio. To clarify the influence of input features on soil cracking behavior, feature importance and Shapley additive explanations (SHAP) are applied for interpretability analysis. The results reveal that the ensemble methods (RF and XGBoost) provide better predictive performance than the deep learning model (ANN). The feature importance analysis shows that soil desiccation cracking is primarily influenced by initial water content, plasticity index, final water content, liquid limit, sand content, clay content, and thickness. Moreover, SHAP-based interpretability analysis further explores how soil cracking responds to each input variable. This study provides new insight into the evolution of soil cracking behavior, enhancing the understanding of its physical mechanisms and facilitating the assessment of potential regional development of soil desiccation cracking.
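The SHAP step above is straightforward with the shap library's tree explainer. A minimal sketch follows, ranking features by mean absolute SHAP value; the feature set mirrors the abstract, but the data and toy response are synthetic stand-ins for the 1040 experiments.

```python
# Minimal sketch: SHAP interpretability for a tree-based crack-ratio model.
import numpy as np
import shap
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
names = ["init_water", "plasticity_idx", "final_water", "liquid_limit",
         "sand_pct", "clay_pct", "thickness"]
X = rng.uniform(size=(1040, len(names)))
crack_ratio = (0.5*X[:, 0] + 0.15*X[:, 1] - 0.2*X[:, 2]
               + rng.normal(0, 0.02, 1040))     # toy surface crack ratio

model = XGBRegressor(n_estimators=200, max_depth=3).fit(X, crack_ratio)
shap_values = shap.TreeExplainer(model).shap_values(X)
for name, s in sorted(zip(names, np.abs(shap_values).mean(axis=0)), key=lambda p: -p[1]):
    print(f"{name:>15}: {s:.4f}")
```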
To overcome the challenge of limited experimental data and improve on the accuracy of empirical formulas, we propose a low-cycle fatigue (LCF) life prediction model for nickel-based superalloys based on a data augmentation method. This method uses a variational autoencoder (VAE) to generate low-cycle fatigue data and form an augmented dataset. The Pearson correlation coefficient (PCC) is employed to verify the similarity of feature distributions between the original and augmented datasets. Six machine learning models, random forest (RF), artificial neural network (ANN), support vector machine (SVM), gradient-boosted decision tree (GBDT), eXtreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost), are used to predict the LCF life of nickel-based superalloys. Results indicate that the proposed VAE-based data augmentation method can effectively expand the dataset, and the mean absolute error (MAE), root mean square error (RMSE), and R-squared (R²) values achieved with the CatBoost model, 0.0242, 0.0391, and 0.9538, respectively, are superior to those of the other models. The proposed method reduces the cost and time associated with LCF experiments and accurately establishes the relationship between fatigue characteristics and the LCF life of nickel-based superalloys.
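The abstract does not spell out how the PCC check is applied; one plausible reading, sketched below on synthetic data, is to compare the per-feature correlation structures of the original and augmented datasets. The VAE itself is replaced here by a simple noisy copy, so this is only a sketch of the verification step, not of the augmentation.

```python
# Minimal sketch: PCC check that augmented data preserve feature relationships.
import numpy as np

rng = np.random.default_rng(0)
d = 6
A = rng.normal(size=(d, d))
cov = A @ A.T + d * np.eye(d)                    # random positive-definite covariance
original = rng.multivariate_normal(np.zeros(d), cov, size=300)
augmented = original + rng.normal(0, 0.1, original.shape)   # stand-in for VAE samples

r_orig = np.corrcoef(original, rowvar=False)
r_aug = np.corrcoef(augmented, rowvar=False)
iu = np.triu_indices(d, k=1)                     # the 15 distinct feature pairs
pcc = np.corrcoef(r_orig[iu], r_aug[iu])[0, 1]
print(f"PCC between the two correlation structures: {pcc:.3f}")
```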
In this paper, we apply nonlinear time series analysis to small-time-scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, a local support vector machine prediction method is used to predict the traffic measurements, and a BIC-based neighbouring-point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method, with its neighbouring points optimized, can effectively predict small-time-scale traffic measurement data and reproduce the statistical features of real traffic measurements.
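The local prediction scheme above can be sketched in a few lines: reconstruct the phase space by time-delay embedding, find the k nearest neighbours of the current state, and fit a local SVR on just those neighbours to predict the next value. The embedding dimension, delay, k, and the synthetic trace below are illustrative choices, not the paper's settings.

```python
# Minimal sketch: time-delay embedding + local SVR one-step prediction.
import numpy as np
from sklearn.svm import SVR

def embed(x: np.ndarray, m: int, tau: int) -> tuple[np.ndarray, np.ndarray]:
    """Delay vectors X[i] = (x[i], x[i+tau], ..., x[i+(m-1)tau]) and next values."""
    n = len(x) - (m - 1) * tau - 1
    X = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])
    y = x[(m - 1) * tau + 1 : (m - 1) * tau + 1 + n]
    return X, y

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 60, 1200)) + 0.05 * rng.normal(size=1200)  # stand-in traffic trace
X, y = embed(x, m=4, tau=2)

query = X[-1]                                    # current reconstructed state
dist = np.linalg.norm(X[:-1] - query, axis=1)
nbrs = np.argsort(dist)[:30]                     # k = 30 nearest neighbours
local = SVR(C=10.0, epsilon=0.001).fit(X[:-1][nbrs], y[:-1][nbrs])
print(f"predicted: {local.predict(query[None, :])[0]:+.4f}  actual: {y[-1]:+.4f}")
```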
Numerical simulation and slope stability prediction are the focus of slope disaster research. Machine learning models are now commonly used for slope stability prediction, but they suffer from problems such as poor nonlinear performance, convergence to local optima, and incomplete feature extraction of the influencing factors. These issues can reduce the accuracy of slope stability prediction. Therefore, a deep learning algorithm, long short-term memory (LSTM), is proposed here to predict slope stability. Taking Ganzhou City in China as the study area, the landslide inventory and the characteristics of its geotechnical parameters, slope heights, and slope angles are analyzed. Based on these characteristics, typical soil slopes are constructed using the Geo-Studio software. Five control factors affecting slope stability, slope height, slope angle, internal friction angle, cohesion, and volumetric weight, are selected to form different slopes and construct the model input variables. The limit equilibrium method is then used to calculate the stability coefficients of these typical soil slopes under the different control factors. Each slope stability coefficient and its corresponding control factors constitute one slope sample, giving a total of 2160 training samples and 450 testing samples. These sample sets are imported into the LSTM for modelling and compared with support vector machine (SVM), random forest (RF), and convolutional neural network (CNN) models. The results show that the LSTM overcomes the difficulty that commonly used machine learning models have in extracting global features. Furthermore, the LSTM achieves better slope stability prediction performance than the SVM, RF, and CNN models.
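The abstract leaves the LSTM architecture unspecified; one minimal reading, sketched below in PyTorch, feeds the five control factors as a short input sequence into an LSTM whose last hidden state regresses the stability coefficient. The data, toy target, and hyperparameters are synthetic placeholders; the real study generated samples with limit-equilibrium analysis in Geo-Studio.

```python
# Minimal sketch: an LSTM regressor for slope stability coefficients.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.rand(2160, 5, 1)                      # 2160 samples, 5 factors as a sequence
y = (1.2 - 0.5 * X[:, 0, 0] + 0.4 * X[:, 2, 0]).unsqueeze(1)  # toy stability coefficient

class SlopeLSTM(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):
        out, _ = self.lstm(x)                   # (batch, seq, hidden)
        return self.head(out[:, -1, :])         # last step -> stability coefficient

model = SlopeLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for epoch in range(200):                        # full-batch training for brevity
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.5f}")
```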
To perform landslide susceptibility prediction (LSP), it is important to select an appropriate mapping unit and landslide-related conditioning factors. The efficient and automatic multi-scale segmentation (MSS) method previously proposed by the authors promotes the application of slope units. However, LSP modeling based on these slope units has not yet been performed. Moreover, the heterogeneity of conditioning factors within slope units has been neglected, leaving the input variables of LSP modeling incomplete. In this study, the slope units extracted by the MSS method are used for LSP modeling, and the heterogeneity of conditioning factors is represented by their internal variations within each slope unit, using the descriptive statistics of mean, standard deviation, and range. On this basis, slope unit-based machine learning models that consider the internal variations of conditioning factors (variant slope-machine learning models) are proposed. Chongyi County is selected as the case study and is divided into 53,055 slope units. Fifteen original slope unit-based conditioning factors are expanded to 38 slope unit-based conditioning factors by considering their internal variations. Random forest (RF) and multi-layer perceptron (MLP) machine learning models are used to construct variant Slope-RF and Slope-MLP models. Meanwhile, Slope-RF and Slope-MLP models that ignore the internal variations of conditioning factors, as well as conventional grid unit-based machine learning (Grid-RF and Grid-MLP) models, are built for comparison through LSP performance assessments. Results show that the variant slope-machine learning models achieve higher LSP performance than the plain slope-machine learning models, and their LSP results have stronger directivity and practical applicability than those of the grid-machine learning models. It is concluded that slope units extracted by the MSS method are appropriate for LSP modeling and that the heterogeneity of conditioning factors within slope units more comprehensively reflects the relationships between conditioning factors and landslides. The research results provide an important reference for land use planning and landslide prevention.
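The feature expansion described above amounts to aggregating each grid-level conditioning factor within its slope unit into mean, standard deviation, and range. A minimal pandas sketch follows; the unit ids, factor columns, and values are hypothetical.

```python
# Minimal sketch: per-slope-unit mean / std / range features from grid cells.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
grid = pd.DataFrame({
    "slope_unit": rng.integers(0, 5, size=1000),   # unit id for each grid cell
    "elevation": rng.normal(800, 120, 1000),
    "slope_deg": rng.uniform(5, 45, 1000),
})

def value_range(s: pd.Series) -> float:
    return s.max() - s.min()

unit_features = grid.groupby("slope_unit").agg(["mean", "std", value_range])
unit_features.columns = ["_".join(c) for c in unit_features.columns]
print(unit_features.head())                        # e.g. elevation_mean, slope_deg_value_range
```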
Fine-grained weather forecasting data, i.e., grid data with high resolution, have attracted increasing attention in recent years, especially for specific applications such as the Winter Olympic Games. Although the European Centre for Medium-Range Weather Forecasts (ECMWF) provides grid predictions up to 240 hours ahead, the coarse data cannot meet the high requirements of such major events. In this paper, we propose a method called model residual machine learning (MRML) to generate high-resolution grid predictions based on high-precision station forecasts. MRML applies model output machine learning (MOML) to station forecasting and then uses these forecasts to improve the quality of the grid data by fitting a machine learning (ML) model to the residuals. We demonstrate that MRML performs well for diverse meteorological elements, specifically temperature, relative humidity, and wind speed. In addition, MRML can easily be extended to other post-processing methods by invoking different techniques. In our experiments, MRML outperforms traditional downscaling methods such as piecewise linear interpolation (PLI) on the testing data.
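The residual-learning idea at the core of MRML can be sketched generically: make a first-pass estimate with a coarse base model, fit an ML model to the base model's residuals at the stations, and add the learned correction back on the target grid. In the sketch below, a linear regression stands in for the raw numerical forecast and a random forest for the residual learner; all data are synthetic.

```python
# Minimal sketch: residual machine learning (base model + learned correction).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
stations = rng.uniform(0, 1, size=(300, 2))            # station coordinates
truth = lambda p: np.sin(3 * p[:, 0]) + np.cos(4 * p[:, 1])
obs = truth(stations) + rng.normal(0, 0.05, 300)       # station observations

grid_pts = rng.uniform(0, 1, size=(1000, 2))           # target high-resolution grid

# Step 1: coarse base model (stands in for the raw numerical forecast).
base = LinearRegression().fit(stations, obs)
# Step 2: ML model fitted to the base model's residuals at the stations.
resid_model = RandomForestRegressor(n_estimators=200, random_state=0)
resid_model.fit(stations, obs - base.predict(stations))
# Step 3: corrected high-resolution field = base prediction + learned residual.
corrected = base.predict(grid_pts) + resid_model.predict(grid_pts)

rmse = lambda a: np.sqrt(np.mean((a - truth(grid_pts)) ** 2))
print(f"base RMSE: {rmse(base.predict(grid_pts)):.3f}  corrected RMSE: {rmse(corrected):.3f}")
```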
基金Supported by Government Assignment,No.1023022600020-6RSF Grant,No.24-15-00549Ministry of Science and Higher Education of the Russian Federation within the Framework of State Support for the Creation and Development of World-Class Research Center,No.075-15-2022-304.
文摘BACKGROUND Ischemic heart disease(IHD)impacts the quality of life and has the highest mortality rate of cardiovascular diseases globally.AIM To compare variations in the parameters of the single-lead electrocardiogram(ECG)during resting conditions and physical exertion in individuals diagnosed with IHD and those without the condition using vasodilator-induced stress computed tomography(CT)myocardial perfusion imaging as the diagnostic reference standard.METHODS This single center observational study included 80 participants.The participants were aged≥40 years and given an informed written consent to participate in the study.Both groups,G1(n=31)with and G2(n=49)without post stress induced myocardial perfusion defect,passed cardiologist consultation,anthropometric measurements,blood pressure and pulse rate measurement,echocardiography,cardio-ankle vascular index,bicycle ergometry,recording 3-min single-lead ECG(Cardio-Qvark)before and just after bicycle ergometry followed by performing CT myocardial perfusion.The LASSO regression with nested cross-validation was used to find the association between Cardio-Qvark parameters and the existence of the perfusion defect.Statistical processing was performed with the R programming language v4.2,Python v.3.10[^R],and Statistica 12 program.RESULTS Bicycle ergometry yielded an area under the receiver operating characteristic curve of 50.7%[95%confidence interval(CI):0.388-0.625],specificity of 53.1%(95%CI:0.392-0.673),and sensitivity of 48.4%(95%CI:0.306-0.657).In contrast,the Cardio-Qvark test performed notably better with an area under the receiver operating characteristic curve of 67%(95%CI:0.530-0.801),specificity of 75.5%(95%CI:0.628-0.88),and sensitivity of 51.6%(95%CI:0.333-0.695).CONCLUSION The single-lead ECG has a relatively higher diagnostic accuracy compared with bicycle ergometry by using machine learning models,but the difference was not statistically significant.However,further investigations are required to uncover the hidden capabilities of single-lead ECG in IHD diagnosis.
基金funded through India Meteorological Department,New Delhi,India under the Forecasting Agricultural output using Space,Agrometeorol ogy and Land based observations(FASAL)project and fund number:No.ASC/FASAL/KT-11/01/HQ-2010.
文摘Background Cotton is one of the most important commercial crops after food crops,especially in countries like India,where it’s grown extensively under rainfed conditions.Because of its usage in multiple industries,such as textile,medicine,and automobile industries,it has greater commercial importance.The crop’s performance is greatly influenced by prevailing weather dynamics.As climate changes,assessing how weather changes affect crop performance is essential.Among various techniques that are available,crop models are the most effective and widely used tools for predicting yields.Results This study compares statistical and machine learning models to assess their ability to predict cotton yield across major producing districts of Karnataka,India,utilizing a long-term dataset spanning from 1990 to 2023 that includes yield and weather factors.The artificial neural networks(ANNs)performed superiorly with acceptable yield deviations ranging within±10%during both vegetative stage(F1)and mid stage(F2)for cotton.The model evaluation metrics such as root mean square error(RMSE),normalized root mean square error(nRMSE),and modelling efficiency(EF)were also within the acceptance limits in most districts.Furthermore,the tested ANN model was used to assess the importance of the dominant weather factors influencing crop yield in each district.Specifically,the use of morning relative humidity as an individual parameter and its interaction with maximum and minimum tempera-ture had a major influence on cotton yield in most of the yield predicted districts.These differences highlighted the differential interactions of weather factors in each district for cotton yield formation,highlighting individual response of each weather factor under different soils and management conditions over the major cotton growing districts of Karnataka.Conclusions Compared with statistical models,machine learning models such as ANNs proved higher efficiency in forecasting the cotton yield due to their ability to consider the interactive effects of weather factors on yield forma-tion at different growth stages.This highlights the best suitability of ANNs for yield forecasting in rainfed conditions and for the study on relative impacts of weather factors on yield.Thus,the study aims to provide valuable insights to support stakeholders in planning effective crop management strategies and formulating relevant policies.
基金supported by the National Key Research and Development Program of China(Grant No.2023YFC3209504)the National Natural Science Foundation of China(Grants No.U2040215 and 52479075)the Natural Science Foundation of Hubei Province(Grant No.2021CFA029).
文摘The backwater effect caused by tributary inflow can significantly elevate the water level profile upstream of a confluence point.However,the influence of mainstream and confluence discharges on the backwater effect in a river reach remains unclear.In this study,various hydrological data collected from the Jingjiang Reach of the Yangtze River in China were statistically analyzed to determine the backwater degree and range with three representative mainstream discharges.The results indicated that the backwater degree increased with mainstream discharge,and a positive relationship was observed between the runoff ratio and backwater degree at specific representative mainstream discharges.Following the operation of the Three Gorges Project,the backwater effect in the Jingjiang Reach diminished.For instance,mean backwater degrees for low,moderate,and high mainstream discharges were recorded as 0.83 m,1.61 m,and 2.41 m during the period from 1990 to 2002,whereas these values decreased to 0.30 m,0.95 m,and 2.08 m from 2009 to 2020.The backwater range extended upstream as mainstream discharge increased from 7000 m3/s to 30000 m3/s.Moreover,a random forest-based machine learning model was used to quantify the backwater effect with varying mainstream and confluence discharges,accounting for the impacts of mainstream discharge,confluence discharge,and channel degradation in the Jingjiang Reach.At the Jianli Hydrological Station,a decrease in mainstream discharge during flood seasons resulted in a 7%–15%increase in monthly mean backwater degree,while an increase in mainstream discharge during dry seasons led to a 1%–15%decrease in monthly mean backwater degree.Furthermore,increasing confluence discharge from Dongting Lake during June to July and September to November resulted in an 11%–42%increase in monthly mean backwater degree.Continuous channel degradation in the Jingjiang Reach contributed to a 6%–19%decrease in monthly mean backwater degree.Under the influence of these factors,the monthly mean backwater degree in 2017 varied from a decrease of 53%to an increase of 37%compared to corresponding values in 1991.
基金supported by the Natural Science Foundation of Jiangsu province,China(BK20240937)the Belt and Road Special Foundation of the National Key Laboratory of Water Disaster Prevention(2022491411,2021491811)the Basal Research Fund of Central Public Welfare Scientific Institution of Nanjing Hydraulic Research Institute(Y223006).
文摘Understanding spatial heterogeneity in groundwater responses to multiple factors is critical for water resource management in coastal cities.Daily groundwater depth(GWD)data from 43 wells(2018-2022)were collected in three coastal cities in Jiangsu Province,China.Seasonal and Trend decomposition using Loess(STL)together with wavelet analysis and empirical mode decomposition were applied to identify tide-influenced wells while remaining wells were grouped by hierarchical clustering analysis(HCA).Machine learning models were developed to predict GWD,then their response to natural conditions and human activities was assessed by the Shapley Additive exPlanations(SHAP)method.Results showed that eXtreme Gradient Boosting(XGB)was superior to other models in terms of prediction performance and computational efficiency(R^(2)>0.95).GWD in Yancheng and southern Lianyungang were greater than those in Nantong,exhibiting larger fluctuations.Groundwater within 5 km of the coastline was affected by tides,with more pronounced effects in agricultural areas compared to urban areas.Shallow groundwater(3-7 m depth)responded immediately(0-1 day)to rainfall,primarily influenced by farmland and topography(slope and distance from rivers).Rainfall recharge to groundwater peaked at 50%farmland coverage,but this effect was suppressed by high temperatures(>30℃)which intensified as distance from rivers increased,especially in forest and grassland.Deep groundwater(>10 m)showed delayed responses to rainfall(1-4 days)and temperature(10-15 days),with GDP as the primary influence,followed by agricultural irrigation and population density.Farmland helped to maintain stable GWD in low population density regions,while excessive farmland coverage(>90%)led to overexploitation.In the early stages of GDP development,increased industrial and agricultural water demand led to GWD decline,but as GDP levels significantly improved,groundwater consumption pressure gradually eased.This methodological framework is applicable not only to coastal cities in China but also could be extended to coastal regions worldwide.
文摘This research investigates the influence of indoor and outdoor factors on photovoltaic(PV)power generation at Utrecht University to accurately predict PV system performance by identifying critical impact factors and improving renewable energy efficiency.To predict plant efficiency,nineteen variables are analyzed,consisting of nine indoor photovoltaic panel characteristics(Open Circuit Voltage(Voc),Short Circuit Current(Isc),Maximum Power(Pmpp),Maximum Voltage(Umpp),Maximum Current(Impp),Filling Factor(FF),Parallel Resistance(Rp),Series Resistance(Rs),Module Temperature)and ten environmental factors(Air Temperature,Air Humidity,Dew Point,Air Pressure,Irradiation,Irradiation Propagation,Wind Speed,Wind Speed Propagation,Wind Direction,Wind Direction Propagation).This study provides a new perspective not previously addressed in the literature.In this study,different machine learning methods such as Multilayer Perceptron(MLP),Multivariate Adaptive Regression Spline(MARS),Multiple Linear Regression(MLR),and Random Forest(RF)models are used to predict power values using data from installed PVpanels.Panel values obtained under real field conditions were used to train the models,and the results were compared.The Multilayer Perceptron(MLP)model was achieved with the highest classification accuracy of 0.990%.The machine learning models used for solar energy forecasting show high performance and produce results close to actual values.Models like Multi-Layer Perceptron(MLP)and Random Forest(RF)can be used in diverse locations based on load demand.
文摘The Indian Himalayan region is frequently experiencing climate change-induced landslides.Thus,landslide susceptibility assessment assumes greater significance for lessening the impact of a landslide hazard.This paper makes an attempt to assess landslide susceptibility in Shimla district of the northwest Indian Himalayan region.It examined the effectiveness of random forest(RF),multilayer perceptron(MLP),sequential minimal optimization regression(SMOreg)and bagging ensemble(B-RF,BSMOreg,B-MLP)models.A landslide inventory map comprising 1052 locations of past landslide occurrences was classified into training(70%)and testing(30%)datasets.The site-specific influencing factors were selected by employing a multicollinearity test.The relationship between past landslide occurrences and influencing factors was established using the frequency ratio method.The effectiveness of machine learning models was verified through performance assessors.The landslide susceptibility maps were validated by the area under the receiver operating characteristic curves(ROC-AUC),accuracy,precision,recall and F1-score.The key performance metrics and map validation demonstrated that the BRF model(correlation coefficient:0.988,mean absolute error:0.010,root mean square error:0.058,relative absolute error:2.964,ROC-AUC:0.947,accuracy:0.778,precision:0.819,recall:0.917 and F-1 score:0.865)outperformed the single classifiers and other bagging ensemble models for landslide susceptibility.The results show that the largest area was found under the very high susceptibility zone(33.87%),followed by the low(27.30%),high(20.68%)and moderate(18.16%)susceptibility zones.The factors,namely average annual rainfall,slope,lithology,soil texture and earthquake magnitude have been identified as the influencing factors for very high landslide susceptibility.Soil texture,lineament density and elevation have been attributed to high and moderate susceptibility.Thus,the study calls for devising suitable landslide mitigation measures in the study area.Structural measures,an immediate response system,community participation and coordination among stakeholders may help lessen the detrimental impact of landslides.The findings from this study could aid decision-makers in mitigating future catastrophes and devising suitable strategies in other geographical regions with similar geological characteristics.
基金This study has been reviewed and approved by the Clinical Research Ethics Committee of Wenzhou Central Hospital and the First Hospital Affiliated to Wenzhou Medical University,No.KY2024-R016.
文摘BACKGROUND Colorectal cancer significantly impacts global health,with unplanned reoperations post-surgery being key determinants of patient outcomes.Existing predictive models for these reoperations lack precision in integrating complex clinical data.AIM To develop and validate a machine learning model for predicting unplanned reoperation risk in colorectal cancer patients.METHODS Data of patients treated for colorectal cancer(n=2044)at the First Affiliated Hospital of Wenzhou Medical University and Wenzhou Central Hospital from March 2020 to March 2022 were retrospectively collected.Patients were divided into an experimental group(n=60)and a control group(n=1984)according to unplanned reoperation occurrence.Patients were also divided into a training group and a validation group(7:3 ratio).We used three different machine learning methods to screen characteristic variables.A nomogram was created based on multifactor logistic regression,and the model performance was assessed using receiver operating characteristic curve,calibration curve,Hosmer-Lemeshow test,and decision curve analysis.The risk scores of the two groups were calculated and compared to validate the model.RESULTS More patients in the experimental group were≥60 years old,male,and had a history of hypertension,laparotomy,and hypoproteinemia,compared to the control group.Multiple logistic regression analysis confirmed the following as independent risk factors for unplanned reoperation(P<0.05):Prognostic Nutritional Index value,history of laparotomy,hypertension,or stroke,hypoproteinemia,age,tumor-node-metastasis staging,surgical time,gender,and American Society of Anesthesiologists classification.Receiver operating characteristic curve analysis showed that the model had good discrimination and clinical utility.CONCLUSION This study used a machine learning approach to build a model that accurately predicts the risk of postoperative unplanned reoperation in patients with colorectal cancer,which can improve treatment decisions and prognosis.
文摘BACKGROUND Liver transplantation(LT)is a life-saving intervention for patients with end-stage liver disease.However,the equitable allocation of scarce donor organs remains a formidable challenge.Prognostic tools are pivotal in identifying the most suitable transplant candidates.Traditionally,scoring systems like the model for end-stage liver disease have been instrumental in this process.Nevertheless,the landscape of prognostication is undergoing a transformation with the integration of machine learning(ML)and artificial intelligence models.AIM To assess the utility of ML models in prognostication for LT,comparing their performance and reliability to established traditional scoring systems.METHODS Following the Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines,we conducted a thorough and standardized literature search using the PubMed/MEDLINE database.Our search imposed no restrictions on publication year,age,or gender.Exclusion criteria encompassed non-English studies,review articles,case reports,conference papers,studies with missing data,or those exhibiting evident methodological flaws.RESULTS Our search yielded a total of 64 articles,with 23 meeting the inclusion criteria.Among the selected studies,60.8%originated from the United States and China combined.Only one pediatric study met the criteria.Notably,91%of the studies were published within the past five years.ML models consistently demonstrated satisfactory to excellent area under the receiver operating characteristic curve values(ranging from 0.6 to 1)across all studies,surpassing the performance of traditional scoring systems.Random forest exhibited superior predictive capabilities for 90-d mortality following LT,sepsis,and acute kidney injury(AKI).In contrast,gradient boosting excelled in predicting the risk of graft-versus-host disease,pneumonia,and AKI.CONCLUSION This study underscores the potential of ML models in guiding decisions related to allograft allocation and LT,marking a significant evolution in the field of prognostication.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFC3005401), the Fundamental Research Funds for the Central Universities (Grant No. B230201013), the National Natural Science Foundation of China (Grant Nos. 52309152, U2243223, and U23B20150), the Natural Science Foundation of Jiangsu Province (Grant No. BK20220978), and the Open Fund of the National Dam Safety Research Center (Grant No. CX2023B03).
Abstract: Deformation monitoring is a critical measure for intuitively reflecting the operational behavior of a dam. However, deformation monitoring data are often incomplete due to environmental changes, monitoring instrument faults, and human operational errors, which hinders the accurate assessment of actual deformation patterns. This study proposed a method for quantifying deformation similarity between measurement points by recognizing the spatiotemporal characteristics of concrete dam deformation monitoring data. It introduces a spatiotemporal clustering analysis of concrete dam deformation behavior and employs a support vector machine model to fill in the missing data in concrete dam deformation monitoring. The proposed method was validated on a concrete dam project, with the model error remaining within 5%, demonstrating its effectiveness in processing missing deformation data. This approach enhances the capability of early-warning systems and contributes to improved dam safety management.
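A minimal sketch of the imputation idea: deformation series from similar measurement points serve as predictors in a support vector regression that reconstructs gaps at a target point. The data are synthetic, and a simple correlation screen stands in for the paper's spatiotemporal clustering.

```python
# Sketch: fill gaps at a target monitoring point with SVR trained on
# correlated neighbouring points (a stand-in for the spatiotemporal clustering).
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
t = np.arange(365)
base = 5 * np.sin(2 * np.pi * t / 365)  # shared seasonal deformation signal
df = pd.DataFrame({f"P{i:02d}": base * (1 + 0.1 * i) + rng.normal(0, 0.2, t.size)
                   for i in range(1, 5)})
df.loc[rng.choice(t, size=30, replace=False), "P01"] = np.nan  # inject gaps

target = "P01"
# Rank candidate points by absolute correlation with the target series.
corr = df.corr()[target].drop(target).abs().sort_values(ascending=False)
neighbours = corr.index[:3].tolist()

observed = df[target].notna()
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(df.loc[observed, neighbours], df.loc[observed, target])
df.loc[~observed, target] = model.predict(df.loc[~observed, neighbours])
```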
Abstract: Background and Objective The effectiveness of radiofrequency ablation (RFA) in improving long-term survival outcomes for patients with a solitary hepatocellular carcinoma (HCC) measuring 5 cm or less remains uncertain. This study was designed to elucidate the impact of RFA therapy on the survival outcomes of these patients and to construct a prognostic model for patients following RFA. Methods This study was performed using the Surveillance, Epidemiology, and End Results (SEER) database from 2004 to 2017, focusing on patients diagnosed with a solitary HCC lesion ≤ 5 cm in size. We compared the overall survival (OS) and cancer-specific survival (CSS) rates of these patients with those of patients who received hepatectomy, radiotherapy, or chemotherapy or who were part of a blank control group. To enhance the reliability of our findings, we employed stabilized inverse probability treatment weighting (sIPTW) and stratified analyses, and we conducted a Cox regression analysis to identify prognostic factors. XGBoost models were developed to predict 1-, 3-, and 5-year CSS and were evaluated via receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis (DCA) curves, among other measures. Results Regardless of whether the data were unadjusted or adjusted with sIPTW, the 5-year OS (46.7%) and CSS (58.9%) rates were greater in the RFA group than in the radiotherapy (27.1%/35.8%), chemotherapy (32.9%/43.7%), and blank control (18.6%/30.7%) groups, but lower than those in the hepatectomy group (69.4%/78.9%). Stratified analysis based on age and cirrhosis status revealed that RFA and hepatectomy yielded similar OS and CSS outcomes for patients with cirrhosis aged over 65 years. Age, race, marital status, grade, cirrhosis status, tumor size, and AFP level were selected to construct the XGBoost models based on the training cohort. The areas under the curve (AUCs) for 1, 3, and 5 years in the validation cohort were 0.88, 0.81, and 0.79, respectively. Calibration plots further demonstrated the consistency between the predicted and actual values in both the training and validation cohorts. Conclusion RFA can improve the survival of patients diagnosed with a solitary HCC lesion ≤ 5 cm, and in certain clinical scenarios it achieves survival outcomes comparable to those of hepatectomy. The XGBoost models developed in this study performed admirably in predicting the CSS of patients with solitary HCC tumors smaller than 5 cm following RFA.
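One way to read the XGBoost setup is as one binary classifier per horizon, predicting whether a patient survives cancer-specifically past 1, 3, or 5 years. That framing, the synthetic data, and the omission of censoring handling are all assumptions of this sketch (the `xgboost` package is assumed available).

```python
# Sketch: one XGBoost classifier per horizon for 1-, 3-, and 5-year CSS.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 3000
X = np.column_stack([
    rng.normal(65, 10, n),    # age
    rng.integers(0, 2, n),    # cirrhosis status
    rng.normal(3.0, 1.0, n),  # tumor size (cm)
    rng.normal(50, 30, n),    # AFP level
])
# Synthetic survival times loosely tied to age and tumor size (illustration only).
surv_months = rng.exponential(np.exp(4.5 - 0.01 * X[:, 0] - 0.15 * X[:, 2]))

for years in (1, 3, 5):
    y = (surv_months > 12 * years).astype(int)  # survived past the horizon
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = XGBClassifier(n_estimators=300, max_depth=3, learning_rate=0.05,
                        eval_metric="logloss").fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{years}-year CSS AUC: {auc:.2f}")
```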
Abstract: Spatial heterogeneity refers to the variation or differences in characteristics or features across different locations or areas in space. Spatial data refers to information that explicitly or implicitly belongs to a particular geographic region or location, also known as geo-spatial data or geographic information. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the Random Forest Regressor and a CNN. The model is fine-tuned using cross-validation for hyper-parameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran's I for examining global autocorrelation and local Moran's I for assessing local spatial autocorrelation in the residuals. To validate our approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. Results indicate superior performance, with an R-squared of 0.90, outperforming the RF (0.84) and the CNN (0.74). This study contributes to a detailed understanding of spatial variations in data by considering the geographical information (longitude and latitude) present in the dataset. Our results, also assessed using the Root Mean Squared Error (RMSE), indicated that the hybrid model yielded lower errors, showing a deviation of 53.65% from the RF model and 63.24% from the CNN model. Additionally, the global Moran's I index of the residuals was observed to be 0.10. This study underscores that the hybrid model was able to correctly predict house prices both in clusters and in dispersed areas.
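To make the spatial-autocorrelation check concrete, here is a small self-contained computation of global Moran's I on model residuals using row-standardized k-nearest-neighbour weights. The coordinates and residuals are simulated, and the kNN weighting scheme is an assumption, as the abstract does not specify the weight matrix used.

```python
# Sketch: global Moran's I of regression residuals with kNN spatial weights.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(500, 2))  # (longitude, latitude) stand-ins
resid = rng.normal(0, 1, 500)                # model residuals stand-in

k = 8
nn = NearestNeighbors(n_neighbors=k + 1).fit(coords)
_, idx = nn.kneighbors(coords)               # first neighbour is the point itself
W = np.zeros((500, 500))
for i, row in enumerate(idx):
    W[i, row[1:]] = 1.0 / k                  # row-standardized binary weights

# Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z centered.
z = resid - resid.mean()
moran_i = (len(z) / W.sum()) * (z @ W @ z) / (z @ z)
print("global Moran's I:", round(moran_i, 4))  # near 0 for spatially random residuals
```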
Abstract: This study evaluates the performance of advanced machine learning (ML) models in predicting the mechanical properties of eco-friendly self-compacting concrete (SCC), with a focus on compressive strength, V-funnel time, L-box ratio, and slump flow. The motivation for this study stems from the increasing need to optimize concrete mix designs while minimizing environmental impact and reducing the reliance on costly physical testing. Six ML models were trained and validated using a comprehensive dataset of 239 mix designs: backpropagation neural network (BPNN), random forest regression (RFR), K-nearest neighbors (KNN), stacking, bagging, and eXtreme Gradient Boosting (XGBoost). The models' predictive accuracies were assessed using the coefficient of determination, mean squared error, root mean squared error, and mean absolute error. XGBoost consistently outperformed the other models, achieving coefficient of determination values of 0.999, 0.933, and 0.935 for compressive strength in the training, validation, and testing datasets, respectively. Sensitivity analysis revealed that cement, silica fume, coarse aggregate, and superplasticizer positively influenced compressive strength, while water content had a negative impact. These findings highlight the potential of ML models, particularly XGBoost and RFR, in optimizing SCC mix designs, reducing reliance on physical testing, and enhancing sustainability in construction. The application of these models can lead to more efficient and eco-friendly concrete mix designs, benefiting real-world construction projects by improving quality control and reducing costs.
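Of the six models compared, stacking is the least self-explanatory, so the sketch below shows one plausible configuration: RFR and KNN base learners combined under a ridge meta-learner for a single target such as compressive strength. The mix-design features and synthetic strength values are invented for illustration.

```python
# Sketch: stacking ensemble for SCC compressive strength (one of the four targets).
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
n = 239  # matches the dataset size reported above
X = np.column_stack([
    rng.normal(400, 50, n),  # cement (kg/m^3)
    rng.normal(40, 15, n),   # silica fume
    rng.normal(800, 80, n),  # coarse aggregate
    rng.normal(180, 15, n),  # water
    rng.normal(6, 2, n),     # superplasticizer
])
# Synthetic strength with signs matching the reported sensitivity analysis.
y = 0.12 * X[:, 0] + 0.3 * X[:, 1] - 0.15 * X[:, 3] + rng.normal(0, 3, n)

stack = StackingRegressor(
    estimators=[("rfr", RandomForestRegressor(n_estimators=200, random_state=0)),
                ("knn", KNeighborsRegressor(n_neighbors=5))],
    final_estimator=Ridge(),
)
scores = cross_val_score(stack, X, y, cv=5, scoring="r2")
print("stacked R^2 (5-fold CV):", scores.mean().round(3))
```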
Abstract: Zhou et al's investigation of a non-invasive deep learning (DL) method for colorectal tumor immune microenvironment evaluation using preoperative computed tomography (CT) radiomics, published in the World Journal of Gastrointestinal Oncology, is thorough and scientific. The study analyzed preoperative CT images of 315 confirmed colorectal cancer patients, using manually delineated regions of interest to extract DL features, and developed a DL model combining CT images and histopathological images to predict immune-related indicators in colorectal cancer patients. Pathological parameters (tumor-stroma ratio, tumor-infiltrating lymphocyte infiltration, immunohistochemistry, tumor immune microenvironment, and immune score) and radiomics data (CT imaging and model construction) were combined to generate artificial intelligence-powered models. The clinical benefit and goodness of fit of the models were assessed using receiver operating characteristic curves, the area under the curve, and decision curve analysis. The developed DL-based radiomics prediction model for non-invasive evaluation of tumor markers demonstrated potential for personalized treatment planning and immunotherapy strategies in colorectal cancer patients. However, the study involved a small cohort from a single medical center and lacks inclusion/exclusion criteria; it should also incorporate clinicopathological features to yield valuable insights for therapeutic practice in colorectal cancer patients.
Abstract: Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift toward green energy resources. Particularly considering the implications of aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and a performance analysis of various forecasting models, including ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistence, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting), for multi-seasonal GHI forecasting. Data from the India region were used to evaluate the models' performance and forecasting ability across the winter, spring, summer, monsoon, and autumn seasons. Performance was substantiated through evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R^(2)), coded in Python. The performance analysis inferred that Random Forest and eXtreme Gradient Boosting are the superior and competing models, yielding the most accurate forecasts in all seasons compared with the other forecasting models. For winter, XGBoost is the best forecasting model (MAE: 1.6325, RMSE: 4.8338, R^(2): 0.9998); for spring, XGBoost is the best (MAE: 2.599599, RMSE: 5.58539, R^(2): 0.999784); for summer, RF is the best (MAE: 1.03843, RMSE: 2.116325, R^(2): 0.999967); for monsoon, RF is the best (MAE: 0.892385, RMSE: 2.417587, R^(2): 0.999942); and for autumn, RF is the best (MAE: 0.810462, RMSE: 1.928215, R^(2): 0.999958). Based on seasonal variations and computing constraints, these findings enable energy system operators to choose the most effective forecasting models.
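A condensed sketch of the per-season evaluation loop: split each season's series into train and test sets, fit a Random Forest on lagged GHI values, and report MAE, RMSE, and R^(2). The lag-feature encoding is an assumption, since the abstract does not state the model inputs.

```python
# Sketch: season-wise GHI forecasting with lagged features and Random Forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(0)
ghi = np.clip(500 + 300 * np.sin(np.linspace(0, 20 * np.pi, 2000))
              + rng.normal(0, 40, 2000), 0, None)  # synthetic GHI series (W/m^2)
season = np.resize(["winter", "spring", "summer", "monsoon", "autumn"], 2000)

def lag_matrix(series, n_lags=24):
    # Rows: [x_t, ..., x_{t+n_lags-1}] predicting x_{t+n_lags}.
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    return X, series[n_lags:]

for s in ["winter", "spring", "summer", "monsoon", "autumn"]:
    X, y = lag_matrix(ghi[season == s])
    split = int(0.8 * len(y))
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X[:split], y[:split])
    pred = rf.predict(X[split:])
    print(s, "MAE %.2f RMSE %.2f R2 %.3f" % (
        mean_absolute_error(y[split:], pred),
        np.sqrt(mean_squared_error(y[split:], pred)),
        r2_score(y[split:], pred)))
```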
Funding: Supported by the National Key Research and Development Program of China (Grant Nos. 2023YFC3707900 and 2024YFC3012700) and the National Natural Science Foundation of China (Grant No. 42230710).
Abstract: Soil desiccation cracking is ubiquitous in nature and has significant potential impacts on the engineering geological properties of soils. Previous studies have extensively examined various factors affecting soil cracking behavior through numerous small-sample experiments. However, experimental studies alone cannot accurately describe soil cracking behavior. In this study, we first propose a modeling framework for predicting the surface crack ratio of soil desiccation cracking based on machine learning and interpretable analysis. The framework utilizes 1040 sets of soil cracking experimental data and employs random forest (RF), extreme gradient boosting (XGBoost), and artificial neural network (ANN) models to predict the surface crack ratio. To clarify the influence of input features on soil cracking behavior, feature importance and Shapley additive explanations (SHAP) are applied for interpretability analysis. The results reveal that the ensemble methods (RF and XGBoost) provide better predictive performance than the deep learning model (ANN). The feature importance analysis shows that soil desiccation cracking is primarily influenced by initial water content, plasticity index, final water content, liquid limit, sand content, clay content, and thickness. Moreover, the SHAP-based interpretability analysis further explores how soil cracking responds to the various input variables. This study provides new insight into the evolution of soil cracking behavior, enhancing the understanding of its physical mechanisms and facilitating the assessment of potential regional development of soil desiccation cracking.
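The interpretability step can be sketched as follows: fit the tree ensemble, compute per-sample Shapley values with a TreeExplainer, and rank features by mean absolute SHAP value. This assumes the Python `shap` package; the feature names follow the factors listed above, and the data are synthetic placeholders.

```python
# Sketch: XGBoost + SHAP interpretability for surface crack ratio prediction.
import numpy as np
import shap
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n = 1040  # matches the number of experimental records reported above
feature_names = ["initial_water_content", "plasticity_index", "final_water_content",
                 "liquid_limit", "sand_content", "clay_content", "thickness"]
X = rng.uniform(0, 1, size=(n, len(feature_names)))
# Synthetic crack ratio loosely tied to a few features (illustration only).
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 4] + rng.normal(0, 0.05, n)

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# Mean |SHAP| gives a global importance ranking comparable to feature importance.
for name, imp in sorted(zip(feature_names, np.abs(shap_values).mean(axis=0)),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.4f}")
```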
Funding: Financial support from the Fundamental Research Funds for the Central Universities (ZJ2022-003, JG2022-27, J2020-060, and J2021-060), the Sichuan Province Engineering Technology Research Center of General Aircraft Maintenance (GAMRC2021YB08), and the Young Scientists Fund of the National Natural Science Foundation of China (No. 52105417) is acknowledged.
Abstract: To overcome the challenges of limited experimental data and improve the accuracy of empirical formulas, we propose a low-cycle fatigue (LCF) life prediction model for nickel-based superalloys using a data augmentation method. This method utilizes a variational autoencoder (VAE) to generate low-cycle fatigue data and form an augmented dataset. The Pearson correlation coefficient (PCC) is employed to verify the similarity of feature distributions between the original and augmented datasets. Six machine learning models, namely random forest (RF), artificial neural network (ANN), support vector machine (SVM), gradient-boosted decision tree (GBDT), eXtreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost), are used to predict the LCF life of nickel-based superalloys. Results indicate that the proposed VAE-based data augmentation method can effectively expand the dataset, and the mean absolute error (MAE), root mean square error (RMSE), and R-squared (R^(2)) values achieved with the CatBoost model, 0.0242, 0.0391, and 0.9538 respectively, are superior to those of the other models. The proposed method reduces the cost and time associated with LCF experiments and accurately establishes the relationship between fatigue characteristics and the LCF life of nickel-based superalloys.
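A minimal PyTorch sketch of the augmentation idea: train a small VAE on standardized fatigue features, sample new records from the latent prior, and check quantile-wise Pearson correlation between real and synthetic columns. The network sizes, KL weight, and six-feature layout are assumptions for illustration, and the correlation check is one plausible reading of the PCC comparison described above.

```python
# Sketch: VAE-based tabular data augmentation for LCF records (PyTorch).
import numpy as np
import torch
import torch.nn as nn
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(200, 6)).astype(np.float32)  # standardized fatigue features
x = torch.from_numpy(X)

class VAE(nn.Module):
    def __init__(self, d_in=6, d_lat=3):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU())
        self.mu = nn.Linear(32, d_lat)
        self.logvar = nn.Linear(32, d_lat)
        self.dec = nn.Sequential(nn.Linear(d_lat, 32), nn.ReLU(), nn.Linear(32, d_in))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.dec(z), mu, logvar

vae = VAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
for _ in range(500):
    recon, mu, logvar = vae(x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = nn.functional.mse_loss(recon, x) + 0.1 * kl
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    X_aug = vae.dec(torch.randn(400, 3)).numpy()  # sample new synthetic records
# Quantile-wise Pearson correlation as a rough distribution-similarity check.
print([round(pearsonr(np.sort(X[:, j]), np.sort(X_aug[:200, j]))[0], 3) for j in range(6)])
```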
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60573065), the Natural Science Foundation of Shandong Province, China (Grant No. Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No. XTD0708).
Abstract: In this paper we apply nonlinear time series analysis methods to small-time-scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and a BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method, with its neighbouring points optimized, can effectively predict small-time-scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
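A compact sketch of local SVM prediction on a reconstructed phase space: embed the series in delay coordinates, find the nearest neighbours of the current state, and fit an SVR on those neighbours alone to predict the next value. The embedding dimension and neighbour count are fixed here for brevity, whereas the paper selects them with a prediction-based method and BIC, respectively.

```python
# Sketch: local SVM prediction in a reconstructed phase space.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVR

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 60, 1500)) + 0.05 * rng.normal(size=1500)  # traffic stand-in

m, tau = 4, 1  # embedding dimension and delay (the paper selects m by prediction error)
N = len(x) - m * tau
states = np.column_stack([x[i * tau: i * tau + N] for i in range(m)])  # delay vectors
targets = x[m * tau: m * tau + N]                                      # next values

train_n = 1200
query = states[train_n]  # current state whose successor we want to predict
nn = NearestNeighbors(n_neighbors=30).fit(states[:train_n])
_, idx = nn.kneighbors(query.reshape(1, -1))

local = SVR(kernel="rbf", C=10.0, epsilon=1e-3)
local.fit(states[idx[0]], targets[idx[0]])  # model fitted only on local neighbours
print("predicted:", local.predict(query.reshape(1, -1))[0],
      "actual:", targets[train_n])
```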
Funding: This work was funded by the National Natural Science Foundation of China (41807285).
Abstract: Numerical simulation and slope stability prediction are the focus of slope disaster research. Recently, machine learning models have been commonly used in slope stability prediction. However, these models have some shortcomings, such as poor nonlinear performance, convergence to local optima, and incomplete feature extraction, which can affect the accuracy of slope stability prediction. Therefore, a deep learning algorithm, long short-term memory (LSTM), is innovatively proposed here to predict slope stability. Taking Ganzhou City in China as the study area, the landslide inventory and the characteristics of its geotechnical parameters, slope heights, and slope angles are analyzed. Based on these characteristics, typical soil slopes are constructed using the Geo-Studio software. Five control factors affecting slope stability, namely slope height, slope angle, internal friction angle, cohesion, and volumetric weight, are selected to form different slopes and construct the model input variables. Then, the limit equilibrium method is used to calculate the stability coefficients of these typical soil slopes under the different control factors. Each slope stability coefficient and its corresponding control factors constitute one slope sample. As a result, a total of 2160 training samples and 450 testing samples are constructed. These sample sets are imported into the LSTM for modelling and compared with the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). The results show that the LSTM overcomes the difficulty that commonly used machine learning models have in extracting global features. Furthermore, the LSTM achieves better prediction performance for slope stability than the SVM, RF, and CNN models.
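A minimal PyTorch sketch of the LSTM regressor: the five control factors are presented as a short sequence and the network regresses the stability coefficient from the last hidden state. Treating the factor vector as a length-5 sequence is an assumption, since the abstract does not state how the inputs are arranged.

```python
# Sketch: LSTM regression of slope stability coefficients from five control factors.
import torch
import torch.nn as nn

torch.manual_seed(0)
# 2160 training samples x 5 factors: height, angle, friction angle, cohesion, unit weight
X = torch.rand(2160, 5, 1)           # each factor treated as one step of a sequence
y = 1.0 + 0.5 * torch.rand(2160, 1)  # synthetic stability coefficients

class SlopeLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)         # out: (batch, seq, hidden)
        return self.head(out[:, -1])  # regress from the last hidden state

model = SlopeLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()
print("final training MSE:", float(loss))
```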
Funding: This work was funded by the National Natural Science Foundation of China (Grant Nos. 41807285, 41972280, and 52179103).
Abstract: To perform landslide susceptibility prediction (LSP), it is important to select an appropriate mapping unit and landslide-related conditioning factors. The efficient and automatic multi-scale segmentation (MSS) method proposed by the authors promotes the application of slope units. However, LSP modeling based on these slope units has not yet been performed. Moreover, the heterogeneity of conditioning factors within slope units is usually neglected, leading to incomplete input variables for LSP modeling. In this study, the slope units extracted by the MSS method are used for LSP modeling, and the heterogeneity of conditioning factors is represented by their internal variations within each slope unit using the descriptive statistics of mean, standard deviation, and range. Thus, slope unit-based machine learning models considering the internal variations of conditioning factors (variant Slope-machine learning models) are proposed. Chongyi County is selected as the case study area and is divided into 53,055 slope units. Fifteen original slope unit-based conditioning factors are expanded to 38 slope unit-based conditioning factors by considering their internal variations. Random forest (RF) and multi-layer perceptron (MLP) machine learning models are used to construct variant Slope-RF and Slope-MLP models. Meanwhile, Slope-RF and Slope-MLP models without consideration of the internal variations of conditioning factors, as well as conventional grid unit-based machine learning (Grid-RF and Grid-MLP) models, are built for comparison through LSP performance assessments. Results show that the variant Slope-machine learning models have higher LSP performance than the Slope-machine learning models, and the LSP results of the variant Slope-machine learning models have stronger directivity and practical applicability than those of the Grid-machine learning models. It is concluded that slope units extracted by the MSS method are appropriate for LSP modeling, and that the heterogeneity of conditioning factors within slope units can more comprehensively reflect the relationships between conditioning factors and landslides. The research results have important reference significance for land use planning and landslide prevention.
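The feature-expansion step can be illustrated with a small pandas aggregation: grid-cell factor values are grouped by slope-unit ID, and each factor contributes its mean, standard deviation, and range as candidate model inputs. The column names are hypothetical, and the sketch shows only the mechanical aggregation; the paper's own expansion yields 38 columns from 15 factors, presumably because not every factor receives all three statistics.

```python
# Sketch: expanding slope-unit conditioning factors with internal-variation stats.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Grid-cell table: each raster cell carries its slope-unit ID and factor values.
cells = pd.DataFrame({
    "slope_unit": rng.integers(0, 1000, 50000),
    "elevation": rng.normal(500, 120, 50000),
    "slope_angle": rng.uniform(0, 60, 50000),
    "ndvi": rng.uniform(0, 1, 50000),
})

# mean / std / range of every factor within each slope unit.
agg = cells.groupby("slope_unit").agg(["mean", "std", lambda s: s.max() - s.min()])
agg.columns = [f"{factor}_{'range' if stat.startswith('<lambda') else stat}"
               for factor, stat in agg.columns]
print(agg.head())  # one row per slope unit -> LSP model input table
```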
Funding: Project supported by the National Natural Science Foundation of China (Nos. 12101072 and 11421101), the National Key Research and Development Program of China (No. 2018YFF0300104), the Beijing Municipal Science and Technology Project (No. Z201100005820002), and the Open Research Fund of Shenzhen Research Institute of Big Data (No. 2019ORF01001).
Abstract: Fine-grained weather forecasting data, i.e., grid data with high resolution, have attracted increasing attention in recent years, especially for specific applications such as the Winter Olympic Games. Although the European Centre for Medium-Range Weather Forecasts (ECMWF) provides grid predictions up to 240 hours ahead, these coarse data are unable to meet the high requirements of such major events. In this paper, we propose a method, called model residual machine learning (MRML), to generate high-resolution grid predictions based on high-precision station forecasting. MRML applies model output machine learning (MOML) for station forecasting and then utilizes these forecasts to improve the quality of the grid data by fitting a machine learning (ML) model to the residuals. We demonstrate that MRML achieves high capability for diverse meteorological elements, specifically temperature, relative humidity, and wind speed. In addition, MRML can be easily extended to other post-processing methods by invoking different techniques. In our experiments, MRML outperforms traditional downscaling methods such as piecewise linear interpolation (PLI) on the testing data.
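The core of MRML, fitting a simple baseline and then learning its residuals with an ML model, can be sketched as follows: station forecasts are interpolated to the grid, and a random forest learns the spatially structured residual. Linear interpolation and the random forest are placeholder choices here, not the paper's exact MOML configuration, and all fields are synthetic.

```python
# Sketch: residual machine learning - baseline interpolation plus an ML residual fit.
import numpy as np
from scipy.interpolate import griddata
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
stations = rng.uniform(0, 1, size=(60, 2))                      # station coordinates
station_temp = 15 + 5 * stations[:, 0] + rng.normal(0, 0.5, 60)  # station forecasts

gx, gy = np.meshgrid(np.linspace(0, 1, 40), np.linspace(0, 1, 40))
grid_pts = np.column_stack([gx.ravel(), gy.ravel()])
baseline = griddata(stations, station_temp, grid_pts, method="linear")

truth = 15 + 5 * grid_pts[:, 0] + np.sin(6 * grid_pts[:, 1])  # synthetic analysis field
ok = ~np.isnan(baseline)                                      # drop points outside the hull
resid = truth[ok] - baseline[ok]

# Learn the spatially structured residual from coordinates (plus any extra predictors).
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(grid_pts[ok], resid)
corrected = baseline[ok] + rf.predict(grid_pts[ok])
print("baseline RMSE:", np.sqrt(np.mean((truth[ok] - baseline[ok]) ** 2)),
      "corrected RMSE:", np.sqrt(np.mean((truth[ok] - corrected) ** 2)))
```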