Excellent detonation performance and low sensitivity are prerequisites for the deployment of energetic materials. Identifying the factors that affect impact sensitivity and detonation performance, and learning how to obtain materials with desired properties, remains a long-term challenge. Machine learning, with its ability to solve complex tasks and perform robust data processing, can reveal the relationship between performance and descriptive indicators, potentially accelerating the development of energetic materials. Against this background, impact sensitivity, detonation performance, and 28 physicochemical parameters for 222 energetic materials were compiled from density functional theory calculations and published literature. Four machine learning algorithms were employed to predict various properties of energetic materials, including impact sensitivity, detonation velocity, detonation pressure, and Gurney energy. Analysis of Pearson coefficients and feature importance showed that the heat of explosion, oxygen balance, decomposition products, and HOMO energy levels correlate strongly with impact sensitivity, while oxygen balance, decomposition products, and density correlate strongly with detonation performance. Using the impact sensitivity of 2,3,4-trinitrotoluene and the detonation performance of 2,4,6-trinitrobenzene-1,3,5-triamine as benchmarks, the analysis of feature importance rankings and statistical data revealed the optimal range of key features balancing impact sensitivity and detonation performance: oxygen balance should be between -40% and -30%, density should range from 1.66 to 1.72 g/cm³, HOMO energy levels should be between -6.34 and -6.31 eV, and lipophilicity should be between -1.0 and 0.1, 4.49 and 5.59. These findings not only offer important insights into the impact sensitivity and detonation performance of energetic materials, but also provide theoretical guidance for the design and development of new energetic materials with optimal detonation performance and reduced sensitivity.
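The reported ranges can be applied as a simple screening filter. The following is a minimal sketch, not the authors' workflow: the dictionary keys, the function name, and the candidate values are hypothetical, and lipophilicity is left out because its range is not clearly stated in the abstract.

```python
# Ranges taken from the abstract above; the key names are hypothetical.
OPTIMAL_RANGES = {
    "oxygen_balance_pct": (-40.0, -30.0),
    "density_g_cm3": (1.66, 1.72),
    "homo_ev": (-6.34, -6.31),
}


def out_of_range(candidate):
    """Return the names of features that fall outside the reported ranges."""
    misses = []
    for name, (low, high) in OPTIMAL_RANGES.items():
        if not (low <= candidate[name] <= high):
            misses.append(name)
    return misses


# Illustrative candidate (made-up values, not data from the study).
candidate = {"oxygen_balance_pct": -35.0, "density_g_cm3": 1.70, "homo_ev": -6.50}
print(out_of_range(candidate))
```

An empty list means the candidate sits inside all three reported windows.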
The presence of aluminum (Al³⁺) and fluoride (F⁻) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. This paper presents an approach that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of Al³⁺ and F⁻ ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes. Fluorescence is enhanced upon interaction with Al³⁺ ions, with a detection limit of 4.2 nmol/L; in the subsequent presence of F⁻ ions, fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of the fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on principal component analysis is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model performs well in terms of accuracy and sensitivity, and that it addresses the need for effective environmental monitoring and risk assessment, making it a useful tool for safeguarding ecosystems and public health.
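The average Silhouette Coefficient mentioned above can be computed directly from its definition. The sketch below is an illustration only: it assumes one-dimensional features, at least two clusters, and already-assigned labels, and the sample values are invented rather than taken from the study.

```python
def mean_silhouette(points, labels):
    """Average silhouette coefficient for 1-D points with given cluster labels.

    Assumes at least two distinct labels are present.
    """
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the same cluster
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # common convention: a singleton cluster scores 0
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / sum(1 for l in labels if l == other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated groups give a score close to 1.
print(mean_silhouette([1.0, 1.2, 5.0, 5.2], [0, 0, 1, 1]))
```

Values near 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned clusters.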
To better understand the migration behavior of plastic fragments in the environment, rapid non-destructive methods for in-situ identification and characterization of plastic fragments are needed. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), and have thus underestimated their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boosting, support vector machine, and random forest classifiers. The effects of polymer color, type, thickness, and background on plastic fragment classification were evaluated. PLS-DA gave the best and most stable results, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method was therefore proposed, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background. The method achieved an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and simultaneous identification of colored and colorless plastic fragments under complex environmental backgrounds.
The application of machine learning to pyrite discrimination establishes a robust foundation for reconstructing the ore-forming history of multi-stage deposits; however, published models face challenges related to limited, imbalanced datasets and oversampling. In this study, the dataset was expanded to approximately 500 samples for each type, including 508 sedimentary, 573 orogenic gold, 548 sedimentary exhalative (SEDEX), and 364 volcanogenic massive sulfide (VMS) pyrites, and random forest (RF) and support vector machine (SVM) methods were used to enhance the reliability of the classifier models. The RF classifier achieved an overall accuracy of 99.8%, and the SVM classifier attained an overall accuracy of 100%. Under five-fold cross-validation, accuracy was 93.8% for the RF classifier and 94.9% for the SVM classifier. These results demonstrate the strong feasibility of pyrite classification, supported by a relatively large, balanced dataset and high accuracy rates. The classifiers were applied to the controversial Keketale Pb-Zn deposit in NW China, whose genesis has been debated among SEDEX, VMS, and a SEDEX-VMS transition. Petrographic investigations indicated that the deposit comprises early fine-grained layered pyrite (Py1) and late recrystallized pyrite (Py2). Majority voting classified Py1 as the VMS type, with accuracies of 72.2% and 75% for RF and SVM, respectively, and confirmed Py2 as an orogenic type, with accuracies of 74.3% and 77.1%, respectively. The new findings indicate that the Keketale deposit originated from a submarine VMS mineralization system, followed by late orogenic-type overprinting during metamorphism and deformation, which is consistent with geological and geochemical observations. This study further emphasizes the advantages of machine learning (ML) methods in accurately and directly discriminating deposit types and reconstructing the formation history of multi-stage deposits.
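The majority-voting step can be illustrated with a short sketch. It assumes that each analyzed spot receives one predicted class and that the reported percentage corresponds to the share of spots voting for the winning class; that reading, the function name, and the spot labels below are assumptions for illustration, not details confirmed by the abstract.

```python
from collections import Counter


def majority_vote(spot_predictions):
    """Return the most frequent predicted class and its share of the votes."""
    counts = Counter(spot_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(spot_predictions)


# Hypothetical per-spot classifier outputs for one pyrite generation.
spots = ["VMS", "VMS", "SEDEX", "VMS", "orogenic", "VMS", "SEDEX", "VMS"]
label, share = majority_vote(spots)
print(label, share)
```

Ties are resolved by first occurrence in this sketch; a real workflow would need an explicit tie-breaking rule.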
Liposomes serve as critical carriers for drugs and vaccines, and their biological effects are influenced by their size. The microfluidic method, known for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small liposomes and rely predominantly on experimental data analysis. The production of larger liposomes, which are equally significant, remains underexplored. In this work, we investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
Arsenic (As) pollution in soils is a pervasive environmental issue. Biochar immobilization offers a promising solution for addressing soil As contamination. The efficiency of biochar in immobilizing As in soils primarily hinges on the characteristics of both the soil and the biochar. However, the influence of a specific property on As immobilization varies among studies, and the development and application of biochar-based arsenic passivation materials often rely on empirical knowledge. To enhance immobilization efficiency and reduce labor and time costs, a machine learning (ML) model was employed to predict As immobilization efficiency before biochar application. In this study, we collected a dataset comprising 182 data points on As immobilization efficiency from 17 publications to construct three ML models. The results demonstrated that the random forest (RF) model outperformed the gradient boosting regression tree and support vector regression models in predictive performance. Relative importance analysis and partial dependence plots based on the RF model were used to identify the most crucial factors influencing As immobilization. These findings highlighted the significant roles of biochar application time and biochar pH in As immobilization efficiency in soils. Furthermore, the study revealed that Fe-modified biochar substantially improved As immobilization. These insights can facilitate targeted biochar property design and optimization of biochar application conditions to enhance As immobilization efficiency.
Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability to diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed in OptumG2 with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can help reduce computational costs while improving reliability in the early stages of foundation design.
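Four of the six metrics listed above can be sketched in a few lines. The formulas below are the commonly used definitions and are an assumption about what the study computed; in particular, RSR is taken here as RMSE divided by the population standard deviation of the observations. IOS and VAF are omitted because their exact definitions vary between sources. The sample values are invented.

```python
import math


def regression_metrics(observed, predicted):
    """Compute R2, MAE, RMSE, and RSR using common textbook definitions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)      # total sum of squares
    rmse = math.sqrt(ss_res / n)
    std_obs = math.sqrt(ss_tot / n)                          # population standard deviation
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": rmse,
        "RSR": rmse / std_obs,
    }


# Illustrative values only (not Nc data from the study).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(metrics)
```

With this choice of standard deviation, RSR equals the square root of 1 - R², so the two metrics carry the same information on a given dataset.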
BACKGROUND To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process. AIM To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction. METHODS A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an eXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP (package: Shapviz) algorithm was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups. RESULTS Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC = 0.8825) and external validation (AUC = 0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively. CONCLUSION Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model was reasonably effective in predicting TO before surgery. The SHAP algorithm provided intuitive visualization of the machine learning prediction process, enhancing its interpretability.
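The AUC values quoted above summarize how well predicted probabilities rank TO-achieving patients above the rest. As a minimal sketch of the metric itself (not of the study's XGBoost pipeline), AUC can be computed as the fraction of positive-negative pairs that are ranked correctly, with ties counted as one half. The labels and scores below are invented for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    Assumes binary labels coded 1 (positive) and 0 (negative),
    with at least one example of each.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predictions: 1 = achieved TO, 0 = did not.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise form is quadratic in the number of patients, which is fine for cohorts of a few hundred; rank-based formulas are preferable for large datasets.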
Integrating exhaled breath analysis into the diagnosis of cardiovascular diseases holds significant promise as a tool for future clinical use, particularly for ischemic heart disease (IHD). However, the volatilome (exhaled breath composition) in heart disease remains underexplored, and current research lacks sufficient evidence to confirm its clinical validity. Key challenges hindering the application of breath analysis in diagnosing IHD include the scarcity of studies (only three published papers to date), substantial methodological bias in two of these studies, and the absence of standardized protocols for clinical implementation. Additionally, methodologies, including sample collection, analytical techniques, machine learning (ML) approaches, and result interpretation, vary widely across studies, further complicating reproducibility and comparability. To address these gaps, there is an urgent need to establish unified guidelines that define best practices for breath sample collection, data analysis, ML integration, and biomarker annotation. Until these challenges are systematically resolved, the widespread adoption of exhaled breath analysis as a reliable diagnostic tool for IHD remains a distant goal rather than an imminent reality.
The application of machine learning in alloy design is increasingly widespread, yet traditional models still face challenges when dealing with limited datasets and complex nonlinear relationships. This work proposes an interpretable machine learning method based on data augmentation and reconstruction for discovering high-performance low-alloyed magnesium (Mg) alloys. The data augmentation technique expands the original dataset through Gaussian noise. The data reconstruction method reorganizes and transforms the original data to extract more representative features, significantly improving the model's generalization ability and prediction accuracy, with a coefficient of determination (R²) of 95.9% for the ultimate tensile strength (UTS) model and an R² of 95.3% for the elongation-to-failure (EL) model. The correlation coefficient assisted screening (CCAS) method is proposed to filter low-alloyed target alloys. A new Mg-2.2Mn-0.4Zn-0.2Al-0.2Ca (MZAX2000, wt%) alloy is designed and extruded into bar at given processing parameters, achieving room-temperature strength-ductility synergy with an excellent UTS of 395 MPa and a high EL of 17.9%. This is closely related to the hetero-structured character of the as-extruded MZAX2000 alloy, which consists of coarse grains (16%), fine grains (75%), and fiber regions (9%). This work therefore offers new insights into optimizing alloy compositions and processing parameters to obtain new strong and ductile low-alloyed Mg alloys.
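Expanding a small dataset with Gaussian noise can be sketched as follows. This is a hedged illustration, not the authors' procedure: the number of noisy copies, the noise scale `sigma`, the choice to perturb every column equally, and the sample rows are all assumptions.

```python
import random


def augment_with_noise(rows, copies=3, sigma=0.01, seed=0):
    """Return the original rows plus `copies` noisy duplicates of each row.

    Each duplicate adds zero-mean Gaussian noise with standard deviation
    `sigma` to every value. All parameter defaults are illustrative.
    """
    rng = random.Random(seed)            # fixed seed for reproducibility
    augmented = [list(row) for row in rows]
    for _ in range(copies):
        for row in rows:
            augmented.append([v + rng.gauss(0.0, sigma) for v in row])
    return augmented


# Hypothetical composition rows (wt% Mn, Zn, Al, Ca).
original = [[2.2, 0.4, 0.2, 0.2], [1.0, 0.5, 0.3, 0.1]]
expanded = augment_with_noise(original)
print(len(expanded))
```

In practice the noise scale would usually be set per feature, for example relative to measurement uncertainty, so that augmented samples stay physically plausible.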
Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide, and accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emissions, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R² of 0.906 and RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and less economically developed counties displaying low-emission clustering. Our findings show that NTL data combined with the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for using satellite remote sensing imagery in carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
BACKGROUND Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery. AIM To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality. METHODS Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC. RESULTS Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95%CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95%CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95%CI: 5.85-10.93, P < 0.0001) between the two groups. INTRODUCTION Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers[1] and the third leading cause of cancer-related mortality[2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases[3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the most widely used framework for diagnosing and treating HCC[5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70%[6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors[7]. Recently, peripheral blood immune indicators, such as neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases[8-10]. However, the relationship between these combinations of immune markers and the outcomes in patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms are capable of handling large and complex datasets, generating more accurate and personalized predictions through unique training algorithms that manage nonlinear statistical relationships better than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases[11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and predicting overall survival (OS) in patients with HCC[14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. Through ML, a better understanding of the risk factors for early-stage HCC prognosis can be achieved. This aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
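The three peripheral blood indicators named in the introduction are simple ratios of cell counts. A minimal sketch follows; the function name and the input values are illustrative and are not patient data from the study, and all counts are assumed to share the same unit (for example 10⁹ cells/L).

```python
def immune_ratios(neutrophils, lymphocytes, platelets, monocytes):
    """Compute NLR, PLR, and LMR from counts given in the same unit."""
    return {
        "NLR": neutrophils / lymphocytes,   # neutrophil-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,     # platelet-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,     # lymphocyte-to-monocyte ratio
    }


# Illustrative counts only.
ratios = immune_ratios(neutrophils=4.0, lymphocytes=2.0, platelets=200.0, monocytes=0.5)
print(ratios)
```

Ratios like these would then enter a prognostic model as input features alongside the other clinical variables.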
With the rapid development of artificial intelligence, magnetocaloric materials, like other materials, are being developed with increased efficiency and enhanced performance. However, most studies do not take phase transitions into account, and as a result the predictions are usually not accurate enough. In this context, we have established an explicable relationship between alloy compositions and phase transitions by feature imputation. A facile machine learning approach is proposed to screen candidate NiMn-based Heusler alloys with the desired magnetic entropy change and magnetic transition temperature, with a high accuracy of R² ≈ 0.98. As expected, the measured properties of the prepared NiMn-based alloys, including phase transition type, magnetic entropy change, and transition temperature, are all in good agreement with the ML predictions. In addition to being the first to demonstrate an explicable relationship between alloy compositions, phase transitions, and magnetocaloric properties, the proposed ML model is highly predictive and interpretable, and can provide a strong theoretical foundation for identifying high-performance magnetocaloric materials in the future.
Superconducting radio-frequency (SRF) cavities are the core components of SRF linear accelerators, so their stable operation is of considerable importance. However, operational experience from different accelerator laboratories has revealed that SRF faults are the leading cause of short machine downtime trips. When a cavity fault occurs, system experts analyze the time-series data recorded by low-level RF systems and identify the fault type. This requires expertise and intuition, posing a major challenge for control-room operators. Here, we propose an expert feature-based machine learning model for automating SRF cavity fault recognition. The main challenge in converting the "expert reasoning" process for SRF faults into a "model inference" process lies in feature extraction, because the associated time-series waveforms are multidimensional and complex. Existing autoregression-based feature-extraction methods require the signal to be stable and autocorrelated, which makes it difficult to capture the abrupt features present in several SRF failure patterns. To address these issues, we introduce expertise into the classification model through reasonable feature engineering. We demonstrate the feasibility of this method using the SRF cavities of the China accelerator facility for superheavy elements (CAFE2). Although specific faults in SRF cavities may vary across accelerators, similarities exist in the RF signals. Therefore, this study provides valuable guidance for fault analysis across the entire SRF community.
Daytime and nighttime temperatures play different roles in providing heat accumulation and meeting chilling requirements for vegetation spring phenology. Although previous studies have established a stronger correlation between leaf onset and diurnal temperature than between leaf onset and average temperature, research on modeling spring phenology based on diurnal temperature indicators remains limited. In this study, we confirmed the sensitivity of the start of the growing season (SOS) to diurnal temperature and average temperature in boreal forest. SOS was estimated with a K-Nearest Neighbor Regression (KNR-TDN) model, a Random Forest Regression (RFR-TDN) model, an eXtreme Gradient Boosting (XGB-TDN) model, and a Light Gradient Boosting Machine (LightGBM-TDN) model driven by diurnal temperature indicators during 1982-2015, and SOS was projected from 2015 to 2100 based on the Coupled Model Intercomparison Project Phase 6 (CMIP6) climate scenario datasets. The sensitivity of boreal forest SOS to daytime temperature is greater than that to average temperature and nighttime temperature. The LightGBM-TDN model performs best across all vegetation types, exhibiting the lowest RMSE and bias compared with the KNR-TDN, RFR-TDN, and XGB-TDN models. Incorporating diurnal temperature indicators, instead of relying only on average temperature indicators, improves the accuracy of spring phenology simulation. Furthermore, the preseason accumulated daytime temperature, daytime temperature, and snow cover end date emerged as significant drivers of the SOS simulation in the study area. The simulation results based on the LightGBM-TDN model exhibit a trend of advancing SOS followed by stabilization under future climate scenarios. This study underscores the potential of diurnal temperature indicators as a viable alternative to average temperature indicators in driving spring phenology models, offering a promising new method for simulating spring phenology.
Accurate and rapid identification of shale lithofacies is important for the evaluation and prediction of sweet spots in shale oil and gas reservoirs. To address the low resolution of logging curves, this study establishes a grayscale-phase model based on high-resolution grayscale curves, using clustering analysis algorithms for shale lithofacies identification in the Shahejie Formation, Bohai Bay Basin, China. The grayscale phase is defined as the sum of absolute grayscale and relative amplitude as well as their features. The absolute grayscale is the absolute magnitude of the gray values and is used to evaluate the material composition (mineral composition + total organic carbon) of shale, while the relative amplitude is the difference between adjacent gray values and is used to identify the shale structure type. The results show that the grayscale-phase model identifies shale lithofacies well; its accuracy and applicability were verified by the fitted relationship between absolute grayscale and shale mineral composition, as well as the correspondence between relative amplitudes and laminae development in shales. Four lithofacies are identified in the target layer of the study area: massive mixed shale, laminated mixed shale, massive calcareous shale, and laminated calcareous shale. This method not only effectively characterizes the material composition of shale but also numerically characterizes the degree of laminae development, and it overcomes the difficulty of identifying millimeter-scale laminae from logging curves. It can provide technical support for shale lithofacies identification and for sweet spot evaluation and prediction in complex continental lacustrine basins.
As energy demands continue to rise in modern society, the development of high-performance lithium-ion batteries (LIBs) has become crucial. However, traditional research methods in materials science face challenges such as lengthy timelines and complex processes. In recent years, the integration of machine learning (ML) into research on LIB materials, including electrolytes, solid-state electrolytes, and electrodes, has yielded remarkable achievements. This comprehensive review explores the latest applications of ML in predicting LIB material performance, covering the core principles and recent advancements in three key inverse material design strategies: high-throughput virtual screening, global optimization, and generative models. These strategies have played a pivotal role in fostering LIB material innovations. The paper also briefly discusses the challenges associated with applying ML to materials research and offers insights and directions for future research.
Hydrogen partitioning between liquid iron alloys and silicate melts governs the distribution and cycling of hydrogen in Earth’s deep interior. Existing models based on simplified Fe-H systems predict strong hydrogen sequestration into the core. However, these models do not account for the modulating effects of major light elements such as oxygen and silicon in the core during Earth’s primordial differentiation. In this study, we use first-principles molecular dynamics simulations, augmented by machine learning techniques, to quantify hydrogen chemical potentials in quaternary Fe-O-Si-H systems under early core-mantle boundary conditions (135 GPa, 5000 K). Our results demonstrate that the presence of 5.2 wt% oxygen and 4.8 wt% silicon reduces the siderophile affinity of hydrogen by 35%, decreasing its alloy-silicate partition coefficient from 18.2 (in the case of Fe-H) to 11.8 (in the case of Fe-O-Si-H). These findings suggest that previous estimates of the core hydrogen content derived from binary system models require downward revision. Our study underscores the critical role of multicomponent interactions in core formation models and provides first-principles-derived constraints to reconcile Earth’s present-day hydrogen reservoirs with its accretionary history.
In engineering practice, it is often necessary to determine functional relationships between dependent and independent variables. These relationships can be highly nonlinear, and classical regression approaches cannot always provide sufficiently reliable solutions. Machine Learning (ML) techniques, which offer advanced regression tools to address complicated engineering issues, have therefore been developed and widely explored. This study investigates selected ML techniques to evaluate their suitability for describing the hot deformation behavior of metallic materials. The ML-based regression methods of Artificial Neural Networks (ANNs), Support Vector Machine (SVM), Decision Tree Regression (DTR), and Gaussian Process Regression (GPR) are applied to mathematically describe hot flow stress curve datasets acquired experimentally for a medium-carbon steel. Although the GPR method has not been used for such a regression task before, the results showed that its performance is the most favorable and practically unrivaled; neither the ANN method nor the other studied ML techniques provide such precise results for the regression analysis.
Ensuring the consistent mechanical performance of three-dimensional (3D)-printed continuous fiber-reinforced composites is a significant challenge in additive manufacturing. The current reliance on manual monitoring exacerbates this challenge by rendering the process vulnerable to environmental changes and unexpected factors, resulting in defects and inconsistent product quality, particularly in unmanned long-term operations or printing in extreme environments. To address these issues, we developed a process monitoring and closed-loop feedback control strategy for the 3D printing process. Real-time printing image data were captured and analyzed using a well-trained neural network model, and a real-time control module was developed to enable closed-loop feedback control of the flow rate. The neural network model, which was based on image processing and artificial intelligence, enabled the recognition of flow rate values with an accuracy of 94.70%. The experimental results showed significant improvements in both the surface performance and mechanical properties of printed composites, with three- to six-fold improvements in tensile strength and elastic modulus, demonstrating the effectiveness of the strategy. This study provides a generalized process monitoring and feedback control method for the 3D printing of continuous fiber-reinforced composites, and offers a potential solution for remote online monitoring and closed-loop adjustment in unmanned or extreme space environments.
Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. 2682024GF019).
Abstract: Excellent detonation performances and low sensitivity are prerequisites for the deployment of energetic materials. Exploring the underlying factors that affect impact sensitivity and detonation performances, as well as how to obtain materials with desired properties, remains a long-term challenge. Machine learning, with its ability to solve complex tasks and perform robust data processing, can reveal the relationship between performance and descriptive indicators, potentially accelerating the development process of energetic materials. Against this background, impact sensitivity, detonation performances, and 28 physicochemical parameters for 222 energetic materials were compiled from density functional theory calculations and published literature. Four machine learning algorithms were employed to predict various properties of energetic materials, including impact sensitivity, detonation velocity, detonation pressure, and Gurney energy. Analysis of Pearson coefficients and feature importance showed that the heat of explosion, oxygen balance, decomposition products, and HOMO energy levels have a strong correlation with the impact sensitivity of energetic materials. Oxygen balance, decomposition products, and density have a strong correlation with detonation performances. Utilizing the impact sensitivity of 2,3,4-trinitrotoluene and the detonation performances of 2,4,6-trinitrobenzene-1,3,5-triamine as the benchmark, the analysis of feature importance rankings and statistical data revealed the optimal range of key features balancing impact sensitivity and detonation performances: oxygen balance values should be between -40% and -30%, density should range from 1.66 to 1.72 g/cm³, HOMO energy levels should be between -6.34 and -6.31 eV, and lipophilicity should be between -1.0 and 0.1, 4.49 and 5.59. These findings not only offer important insights into the impact sensitivity and detonation performances of energetic materials, but also provide a theoretical guidance paradigm for the design and development of new energetic materials with optimal detonation performances and reduced sensitivity.
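The Pearson analysis mentioned in this abstract can be illustrated with a minimal sketch. The function name `pearson_r` and the sample values are assumptions made for illustration only; they are not taken from the study's dataset or code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustrative values (density in g/cm^3, velocity in km/s);
# these are NOT data from the study.
density = [1.60, 1.65, 1.70, 1.80, 1.90]
velocity = [6.9, 7.4, 7.8, 8.5, 9.0]
print(round(pearson_r(density, velocity), 3))
```

A coefficient near +1 or -1 indicates a strong linear association between a descriptor and a target property; it does not capture nonlinear dependence, which is presumably why the abstract pairs it with feature importance.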
Funding: Supported by the National Natural Science Foundation of China (No. U21A20290); the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011656); the Projects of Talents Recruitment of GDUPT (No. 2023rcyj1003); the 2022 “Sail Plan” Project of Maoming Green Chemical Industry Research Institute (No. MMGCIRI2022YFJH-Y-024); and the Maoming Science and Technology Project (No. 2023382).
Abstract: The presence of aluminum (Al³⁺) and fluoride (F⁻) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. This paper presents an approach that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of Al³⁺ and F⁻ ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, with fluorescence enhancement upon interaction with Al³⁺ ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F⁻ ions, fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. The model addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
Funding: Supported by the National Natural Science Foundation of China (No. 22276139) and the Shanghai’s Municipal State-owned Assets Supervision and Administration Commission (No. 2022028).
Abstract: To better understand the migration behavior of plastic fragments in the environment, rapid non-destructive methods for in-situ identification and characterization of plastic fragments are needed. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), and have thus underestimated their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boost, support vector machine and random forest classifiers. The effects of polymer color, type, thickness, and background on plastic fragment classification were evaluated. PLS-DA presented the best and most stable outcome, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background, was proposed. The method presented an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and synchronous identification of colored and colorless plastic fragments under complex environmental backgrounds.
Funding: The National Key Research and Development Program of China (2021YFC2900300); the Natural Science Foundation of Guangdong Province (2024A1515030216); the MOST Special Fund from State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences (GPMR202437); the Guangdong Province Introduced of Innovative R&D Team (2021ZT09H399); and the Third Xinjiang Scientific Expedition Program (2022xjkk1301).
Abstract: The application of machine learning for pyrite discrimination establishes a robust foundation for constructing the ore-forming history of multi-stage deposits; however, published models face challenges related to limited, imbalanced datasets and oversampling. In this study, the dataset was expanded to approximately 500 samples for each type, including 508 sedimentary, 573 orogenic gold, 548 sedimentary exhalative (SEDEX), and 364 volcanogenic massive sulfide (VMS) pyrites, and random forest (RF) and support vector machine (SVM) methodologies were used to enhance the reliability of the classifier models. The RF classifier achieved an overall accuracy of 99.8%, and the SVM classifier attained an overall accuracy of 100%. The models were evaluated by five-fold cross-validation, with 93.8% accuracy for the RF and 94.9% for the SVM classifier. These results demonstrate the strong feasibility of pyrite classification, supported by a relatively large, balanced dataset and high accuracy rates. The classifiers were employed to reveal the genesis of the controversial Keketale Pb-Zn deposit in NW China, whose classification as SEDEX, VMS, or a SEDEX-VMS transition has remained inconclusive. Petrographic investigations indicated that the deposit comprises early fine-grained layered pyrite (Py1) and late recrystallized pyrite (Py2). Majority voting classified Py1 as the VMS type, with RF and SVM accuracies of 72.2% and 75%, respectively, and confirmed Py2 as an orogenic type with 74.3% and 77.1% accuracy, respectively. The new findings indicated that the Keketale deposit originated from a submarine VMS mineralization system, followed by late orogenic-type overprinting of metamorphism and deformation, which is consistent with the geological and geochemical observations. This study further emphasizes the advantages of machine learning (ML) methods in accurately and directly discriminating deposit types and reconstructing the formation history of multi-stage deposits.
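The majority-voting step described in this abstract can be sketched as follows. This is only an illustration under assumptions: the function name `majority_vote` and the per-analysis label counts are hypothetical, and the abstract does not state how the authors implemented the vote or how many analyses were involved.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label and the share of votes it received.

    If several labels are tied, the one that appears first in `labels` wins,
    because Counter.most_common keeps first-encountered order for ties.
    """
    if not labels:
        raise ValueError("need at least one predicted label")
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical per-analysis predictions for one pyrite generation;
# the counts are invented for illustration, not taken from the paper.
predictions = ["VMS"] * 26 + ["SEDEX"] * 7 + ["orogenic"] * 3
deposit_type, share = majority_vote(predictions)
print(deposit_type, f"{share:.1%}")
```

Under this reading, the percentage reported for each pyrite generation is the share of individual analyses assigned to the winning class, not a held-out classification accuracy.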
Funding: Supported by the National Key Research and Development Plan of the Ministry of Science and Technology, China (Grant No.: 2022YFE0125300); the National Natural Science Foundation of China (Grant No.: 81690262); the National Science and Technology Major Project, China (Grant No.: 2017ZX09201004-021); the Open Project of National facility for Translational Medicine (Shanghai), China (Grant No.: TMSK-2021-104); and the Shanghai Jiao Tong University STAR Grant, China (Grant Nos.: YG2022ZD024 and YG2022QN111).
Abstract: Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small-sized liposomes, predominantly through experimental data analysis. However, the production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
Funding: Supported by the National Key Research and Development Program of China (No. 2020YFC1808701).
Abstract: Arsenic (As) pollution in soils is a pervasive environmental issue. Biochar immobilization offers a promising solution for addressing soil As contamination. The efficiency of biochar in immobilizing As in soils primarily hinges on the characteristics of both the soil and the biochar. However, the influence of a specific property on As immobilization varies among different studies, and the development and application of arsenic passivation materials based on biochar often rely on empirical knowledge. To enhance immobilization efficiency and reduce labor and time costs, a machine learning (ML) model was employed to predict As immobilization efficiency before biochar application. In this study, we collected a dataset comprising 182 data points on As immobilization efficiency from 17 publications to construct three ML models. The results demonstrated that the random forest (RF) model outperformed the gradient boost regression tree and support vector regression models in predictive performance. Relative importance analysis and partial dependence plots based on the RF model were used to identify the most crucial factors influencing As immobilization. These findings highlighted the significant roles of biochar application time and biochar pH in As immobilization efficiency in soils. Furthermore, the study revealed that Fe-modified biochar exhibited a substantial improvement in As immobilization. These insights can facilitate targeted biochar property design and optimization of biochar application conditions to enhance As immobilization efficiency.
Abstract: Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
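The six metrics named in this abstract can be computed as in the sketch below. The abstract does not give formulas, so the definitions used here are commonly cited ones and should be read as assumptions: IOS as RMSE divided by the mean observed value, RSR as RMSE divided by the population standard deviation of the observations, and VAF as the percentage of observed variance not left in the residuals. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, MAE, RMSE, IOS, RSR and VAF for paired observations.

    Assumed definitions (not stated in the abstract):
      IOS = RMSE / mean(y_true)
      RSR = RMSE / population std of y_true
      VAF = (1 - var(residuals) / var(y_true)) * 100
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("need two non-empty sequences of equal length")
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0 or mean_true == 0:
        raise ValueError("metrics undefined for constant or zero-mean targets")
    rmse = math.sqrt(ss_res / n)
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": rmse,
        "IOS": rmse / mean_true,
        "RSR": rmse / math.sqrt(ss_tot / n),
        "VAF": (1 - var_res / (ss_tot / n)) * 100,
    }


# Hypothetical values for illustration only, not results from the study.
observed = [6.2, 7.1, 8.4, 9.0, 10.3]
predicted = [6.3, 7.0, 8.5, 8.9, 10.2]
for name, value in regression_metrics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

Note that with this definition VAF is insensitive to a constant bias in the predictions, whereas R², MAE and RMSE are not, which is one reason to report several metrics together.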
Funding: Supported by the National Key Research and Development Program, No. 2022YFC2407304; the Major Research Project for Middle-Aged and Young Scientists of Fujian Provincial Health Commission, No. 2021ZQNZD013; the National Natural Science Foundation of China, No. 62275050; the Fujian Province Science and Technology Innovation Joint Fund Project, No. 2019Y9108; and the Major Science and Technology Projects of Fujian Province, No. 2021YZ036017.
Abstract:
BACKGROUND: To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process.
AIM: To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction.
METHODS: A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an EXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP (package: Shapviz) algorithm was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups.
RESULTS: Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC = 0.8825) and external validation (AUC = 0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively.
CONCLUSION: Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model showed some effectiveness in predicting TO before surgery. The SHAP algorithm provided intuitive visualization of the machine learning prediction process, enhancing its interpretability.
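The AUC values quoted in this abstract summarize how well predicted scores rank patients who achieved TO above those who did not. A minimal sketch of that calculation follows, using the pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the sample labels and scores are assumptions for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case, counting ties as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = TO achieved) and model scores, for illustration.
outcomes = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(outcomes, model_scores))
```

The quadratic loop is fine for cohorts of a few hundred patients; a rank-based implementation would be preferable for much larger datasets.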
Funding: Supported by the government assignment, No. 1023022600020-6; the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of the World-Class Research Center “Digital Biodesign and Personalized Healthcare,” No. 075-15-2022-304; and RSF grant, No. 24-15-00549.
Abstract: Integrating exhaled breath analysis into the diagnosis of cardiovascular diseases holds significant promise as a valuable tool for future clinical use, particularly for ischemic heart disease (IHD). However, the volatilome (exhaled breath composition) in heart disease remains underexplored, and current research lacks sufficient evidence to confirm its clinical validity. Key challenges hindering the application of breath analysis in diagnosing IHD include the scarcity of studies (only three published papers to date), substantial methodological bias in two of these studies, and the absence of standardized protocols for clinical implementation. Additionally, methodologies for sample collection, analytical techniques, machine learning (ML) approaches, and result interpretation vary widely across studies, further complicating their reproducibility and comparability. To address these gaps, there is an urgent need to establish unified guidelines that define best practices for breath sample collection, data analysis, ML integration, and biomarker annotation. Until these challenges are systematically resolved, the widespread adoption of exhaled breath analysis as a reliable diagnostic tool for IHD remains a distant goal rather than an imminent reality.
Funding: Funded by the National Natural Science Foundation of China (No. 52204407); the Natural Science Foundation of Jiangsu Province (No. BK20220595); the China Postdoctoral Science Foundation (No. 2022M723689); and the Industrial Collaborative Innovation Project of Shanghai (No. XTCX-KJ-2022-2-11).
Abstract: The application of machine learning in alloy design is increasingly widespread, yet traditional models still face challenges when dealing with limited datasets and complex nonlinear relationships. This work proposes an interpretable machine learning method based on data augmentation and reconstruction for discovering high-performance low-alloyed magnesium (Mg) alloys. The data augmentation technique expands the original dataset through Gaussian noise. The data reconstruction method reorganizes and transforms the original data to extract more representative features, significantly improving the model's generalization ability and prediction accuracy, with a coefficient of determination (R²) of 95.9% for the ultimate tensile strength (UTS) model and an R² of 95.3% for the elongation-to-failure (EL) model. The correlation coefficient assisted screening (CCAS) method is proposed to filter low-alloyed target alloys. A new Mg-2.2Mn-0.4Zn-0.2Al-0.2Ca (MZAX2000, wt%) alloy is designed and extruded into bar at given processing parameters, achieving room-temperature strength-ductility synergy with an excellent UTS of 395 MPa and a high EL of 17.9%. This is closely related to the hetero-structured characteristic of the as-extruded MZAX2000 alloy, which consists of coarse grains (16%), fine grains (75%), and fiber regions (9%). Therefore, this work offers new insights into optimizing alloy compositions and processing parameters for attaining new strong and ductile low-alloyed Mg alloys.
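Data augmentation through Gaussian noise, as mentioned in this abstract, can be sketched as below. The abstract gives no details, so everything specific here is an assumption: the function name `augment_with_gaussian_noise`, the number of noisy copies, and the choice of a noise standard deviation proportional to each feature value.

```python
import random


def augment_with_gaussian_noise(rows, copies=3, rel_sigma=0.02, seed=0):
    """Return the original rows followed by noisy copies of each row.

    Each noisy value is v + N(0, (rel_sigma * |v|)^2). Both `copies` and
    `rel_sigma` are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)  # local generator keeps results reproducible
    augmented = [list(row) for row in rows]
    for row in rows:
        for _ in range(copies):
            augmented.append(
                [v + rng.gauss(0.0, rel_sigma * abs(v)) for v in row]
            )
    return augmented


# Hypothetical feature rows (e.g. composition in wt% and a process
# temperature); invented for illustration only.
samples = [[2.2, 0.4, 0.2, 300.0], [1.0, 0.5, 0.3, 350.0]]
expanded = augment_with_gaussian_noise(samples, copies=3)
print(len(samples), "->", len(expanded), "rows")
```

In practice the noisy copies should be generated only from training rows, after the train/test split, so that perturbed versions of test samples do not leak into training.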
Funding: Supported by the Key Research and Development Program in Shaanxi Province, China (No. 2022ZDLSF07-05) and the Fundamental Research Funds for the Central Universities, CHD (No. 300102352901).
Abstract: Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emissions, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R² of 0.906 and RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and economically less developed counties displaying low-emission clustering. Our findings show that the use of NTL data and the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for the use of satellite remote sensing image data in carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
Funding: Supported by the High-Level Chinese Medicine Key Discipline Construction Project, No. zyyzdxk-2023005; the Capital Health Development Research Project, No. 2024-1-2173; and the National Natural Science Foundation of China, No. 82474426 and No. 82474419.
Abstract:
BACKGROUND: Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery.
AIM: To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality.
METHODS: Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC.
RESULTS: Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95%CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95%CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95%CI: 5.85-10.93, P < 0.0001) between the two groups.
INTRODUCTION: Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers [1] and the third leading cause of cancer-related mortality [2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases [3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the most widely used framework for diagnosing and treating HCC [5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70% [6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors [7]. Recently, peripheral blood immune indicators, such as neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases [8-10]. However, the relationship between these combinations of immune markers and the outcomes in patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms are capable of handling large and complex datasets, generating more accurate and personalized predictions through unique training algorithms that better manage nonlinear statistical relationships than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases [11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and predicting overall survival (OS) in patients with HCC [14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. Through ML, a better understanding of the risk factors for early-stage HCC prognosis can be achieved. This aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
Funding: Supported by the National Key R&D Program of China (No. 2022YFE0109500); the National Natural Science Foundation of China (Nos. 52071255, 52301250, 52171190 and 12304027); the Key R&D Project of Shaanxi Province (No. 2022GXLH-01-07); the Fundamental Research Funds for the Central Universities (China); and the World-Class Universities (Disciplines) and the Characteristic Development Guidance Funds for the Central Universities.
Abstract: With the rapid development of artificial intelligence, magnetocaloric materials as well as other materials are being developed with increased efficiency and enhanced performance. However, most studies do not take phase transitions into account, and as a result, the predictions are usually not accurate enough. In this context, we have established an explicable relationship between alloy compositions and phase transition by feature imputation. A facile machine learning (ML) method is proposed to screen candidate NiMn-based Heusler alloys with the desired magnetic entropy change and magnetic transition temperature with a high accuracy of R² ≈ 0.98. As expected, the measured properties of the prepared NiMn-based alloys, including phase transition type, magnetic entropy change and transition temperature, are all in good agreement with the ML predictions. As well as being the first to demonstrate an explicable relationship between alloy compositions, phase transitions and magnetocaloric properties, our proposed ML model is highly predictive and interpretable, and can provide a strong theoretical foundation for identifying high-performance magnetocaloric materials in the future.
Funding: Supported by the studies of intelligent LLRF control algorithms for superconducting RF cavities (No. E129851YR0); the National Natural Science Foundation of China (No. U22A20261); and Applications of Artificial Intelligence in the Stability Study of Superconducting Linear Accelerators (No. E429851YR0).
Abstract: Superconducting radio-frequency (SRF) cavities are the core components of SRF linear accelerators, making their stable operation considerably important. However, operational experience from different accelerator laboratories has revealed that SRF faults are the leading cause of short machine downtime trips. When a cavity fault occurs, system experts analyze the time-series data recorded by low-level RF systems and identify the fault type. However, this requires expertise and intuition, posing a major challenge for control-room operators. Here, we propose an expert feature-based machine learning model for automating SRF cavity fault recognition. The main challenge in converting the "expert reasoning" process for SRF faults into a "model inference" process lies in feature extraction, owing to the multidimensional and complex time-series waveforms involved. Existing autoregression-based feature-extraction methods require the signal to be stable and autocorrelated, which makes it difficult to capture the abrupt features present in several SRF failure patterns. To address these issues, we introduce expertise into the classification model through reasonable feature engineering. We demonstrate the feasibility of this method using the SRF cavity of the China accelerator facility for superheavy elements (CAFE2). Although specific faults in SRF cavities may vary across different accelerators, similarities exist in the RF signals. Therefore, this study provides valuable guidance for fault analysis across the entire SRF community.
Funding: Under the auspices of the National Natural Science Foundation of China (Nos. 42201374, 42071359).
Abstract: The roles of diurnal temperature in providing heat accumulation and chilling requirements for vegetation spring phenology differ. Although previous studies have established a stronger correlation between leaf onset and diurnal temperature than between leaf onset and average temperature, current research on modeling spring phenology based on diurnal temperature indicators remains limited. In this study, we confirmed the sensitivity of the start of the growing season (SOS) to diurnal temperature and average temperature in boreal forest. The SOS was estimated using a K-Nearest Neighbor Regression (KNR-TDN) model, a Random Forest Regression (RFR-TDN) model, an eXtreme Gradient Boosting (XGB-TDN) model and a Light Gradient Boosting Machine (LightGBM-TDN) model driven by diurnal temperature indicators during 1982-2015, and the SOS was projected from 2015 to 2100 based on the Coupled Model Intercomparison Project Phase 6 (CMIP6) climate scenario datasets. The sensitivity of boreal forest SOS to daytime temperature is greater than that to average temperature and nighttime temperature. The LightGBM-TDN model performs best across all vegetation types, exhibiting the lowest RMSE and bias compared to the KNR-TDN, RFR-TDN and XGB-TDN models. Incorporating diurnal temperature indicators, instead of relying only on average temperature indicators, improves the accuracy of the simulated spring phenology. Furthermore, the preseason accumulated daytime temperature, daytime temperature and snow cover end date emerged as significant drivers of the SOS simulation in the study area. The simulation results based on the LightGBM-TDN model exhibit a trend of advancing SOS followed by stabilization under future climate scenarios. This study underscores the potential of diurnal temperature indicators as a viable alternative to average temperature indicators in driving spring phenology models, offering a promising new method for simulating spring phenology.
Funding: Supported by the National Natural Science Foundation of China (42122017, 41821002) and the Independent Innovation Research Program of China University of Petroleum (East China) (21CX06001A).
Abstract: Accurate and rapid identification of shale lithofacies is of great significance for the evaluation and prediction of sweet spots in shale oil and gas reservoirs. To address the problem of low resolution in logging curves, this study establishes a grayscale-phase model based on high-resolution grayscale curves using clustering analysis algorithms for shale lithofacies identification, working with the Shahejie Formation, Bohai Bay Basin, China. The grayscale phase is defined as the sum of absolute grayscale and relative amplitude as well as their features. The absolute grayscale is the absolute magnitude of the gray values and is utilized for evaluating the material composition (mineral composition + total organic carbon) of shale, while the relative amplitude is the difference between adjacent gray values and is used to identify the shale structure type. The research results show that the grayscale-phase model can identify shale lithofacies well, and the accuracy and applicability of this model were verified by the fitting relationship between absolute grayscale and shale mineral composition, as well as by the corresponding relationships between relative amplitudes and laminae development in shales. Four lithofacies are identified in the target layer of the study area: massive mixed shale, laminated mixed shale, massive calcareous shale and laminated calcareous shale. This method can not only effectively characterize the material composition of shale, but also numerically characterize the development degree of shale laminae, and it overcomes the difficulty of identifying millimeter-scale laminae from logging curves. It can therefore provide technical support for shale lithofacies identification and for sweet spot evaluation and prediction in complex continental lacustrine basins.
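The two quantities that make up the grayscale phase can be illustrated with a short sketch. It assumes the grayscale curve is a one-dimensional sequence of gray values sampled along depth; the function name `grayscale_components`, the signed difference, and the sample values are assumptions for illustration, since the abstract does not give the exact formulas or the clustering step.

```python
def grayscale_components(gray):
    """Split a grayscale curve into the two quantities described above.

    gray: sequence of gray values ordered by depth (hypothetical input).
    Returns (absolute, relative), where `absolute` is the gray values
    themselves and `relative` is the signed difference between each
    pair of adjacent gray values.
    """
    if len(gray) < 2:
        raise ValueError("need at least two gray values")
    absolute = list(gray)
    relative = [b - a for a, b in zip(gray, gray[1:])]
    return absolute, relative


# Made-up gray values, for illustration only.
absolute, relative = grayscale_components([120, 124, 90, 95])
print(relative)
```

Large swings in `relative` would point to laminated intervals and small ones to massive intervals, in the sense used by the abstract; features derived from both series would then be passed to a clustering algorithm.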
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 22225801, W2441009, 22408228).
Abstract: As energy demands continue to rise in modern society, the development of high-performance lithium-ion batteries (LIBs) has become crucial. However, traditional research methods in materials science face challenges such as lengthy timelines and complex processes. In recent years, the integration of machine learning (ML) into research on LIB materials, including electrolytes, solid-state electrolytes, and electrodes, has yielded remarkable achievements. This comprehensive review explores the latest applications of ML in predicting LIB material performance, covering the core principles and recent advancements in three key inverse material design strategies: high-throughput virtual screening, global optimization, and generative models. These strategies have played a pivotal role in fostering LIB material innovations. The paper also briefly discusses the challenges associated with applying ML to materials research and offers insights and directions for future research.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFF0503203); National Natural Science Foundation of China (NSFC) projects (Grant Nos. 42441826 and 42173041); the Key Research Program of the Institute of Geology and Geophysics, Chinese Academy of Sciences (Grant No. IGGCAS-202204); and the computational facilities of the Computer Simulation Laboratory at IGGCAS and the Beijing Super Cloud Computing Center (BSCC).
Abstract: Hydrogen partitioning between liquid iron alloys and silicate melts governs its distribution and cycling in Earth’s deep interior. Existing models based on simplified Fe-H systems predict strong hydrogen sequestration into the core. However, these models do not account for the modulating effects of major light elements such as oxygen and silicon in the core during Earth’s primordial differentiation. In this study, we use first-principles molecular dynamics simulations, augmented by machine learning techniques, to quantify hydrogen chemical potentials in quaternary Fe-O-Si-H systems under early core-mantle boundary conditions (135 GPa, 5000 K). Our results demonstrate that the presence of 5.2 wt% oxygen and 4.8 wt% silicon reduces the siderophile affinity of hydrogen by 35%, decreasing its alloy-silicate partition coefficient from 18.2 (in the case of Fe-H) to 11.8 (in the case of Fe-O-Si-H). These findings suggest that previous estimates of the core hydrogen content derived from binary system models require downward revision. Our study underscores the critical role of multicomponent interactions in core formation models and provides first-principles-derived constraints to reconcile Earth’s present-day hydrogen reservoirs with its accretionary history.
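The 35% figure quoted above follows directly from the two reported partition coefficients. A short check, assuming the reduction is taken relative to the Fe-H value (the helper name `relative_reduction` is hypothetical):

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Partition coefficients reported in the abstract.
d_fe_h = 18.2       # binary Fe-H system
d_fe_o_si_h = 11.8  # quaternary Fe-O-Si-H system

reduction = relative_reduction(d_fe_h, d_fe_o_si_h)
print(f"{reduction:.1%}")  # 35.2%
```

The result, (18.2 − 11.8) / 18.2 ≈ 0.352, is consistent with the stated reduction of about 35%.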
Funding: Supported by the SP2024/089 Project of the Faculty of Materials Science and Technology, VŠB-Technical University of Ostrava.
Abstract: In engineering practice, it is often necessary to determine functional relationships between dependent and independent variables. These relationships can be highly nonlinear, and classical regression approaches cannot always provide sufficiently reliable solutions. Machine Learning (ML) techniques, which offer advanced regression tools for complicated engineering problems, have therefore been developed and widely explored. This study investigates selected ML techniques to evaluate their suitability for describing the hot deformation behavior of metallic materials. The ML-based regression methods of Artificial Neural Networks (ANNs), Support Vector Machine (SVM), Decision Tree Regression (DTR), and Gaussian Process Regression (GPR) are applied to mathematically describe hot flow stress curve datasets acquired experimentally for a medium-carbon steel. Although the GPR method has not been used for such a regression task before, the results showed that its performance is the most favorable and practically unrivaled; neither the ANN method nor the other studied ML techniques provide such precise results for the regression analysis.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFB4604100); the National Key Research and Development Program of China (Grant No. 2022YFB3806104); the Key Research and Development Program in Shaanxi Province (Grant No. 2021LLRH-08-17); the Young Elite Scientists Sponsorship Program by CAST (No. 2023QNRC001); the K C Wong Education Foundation of China; the Youth Innovation Team of Shaanxi Universities of China; and the Key Research and Development Program of Shaanxi Province (Grant 2021LLRH-08-3.1).
Abstract: Ensuring the consistent mechanical performance of three-dimensional (3D)-printed continuous fiber-reinforced composites is a significant challenge in additive manufacturing. The current reliance on manual monitoring exacerbates this challenge by rendering the process vulnerable to environmental changes and unexpected factors, resulting in defects and inconsistent product quality, particularly in unmanned long-term operations or printing in extreme environments. To address these issues, we developed a process monitoring and closed-loop feedback control strategy for the 3D printing process. Real-time printing image data were captured and analyzed using a well-trained neural network model, and a real-time control module enabling closed-loop feedback control of the flow rate was developed. The neural network model, which was based on image processing and artificial intelligence, recognized flow rate values with an accuracy of 94.70%. The experimental results showed significant improvements in both the surface performance and mechanical properties of the printed composites, with a three- to six-fold improvement in tensile strength and elastic modulus, demonstrating the effectiveness of the strategy. This study provides a generalized process monitoring and feedback control method for the 3D printing of continuous fiber-reinforced composites, and offers a potential solution for remote online monitoring and closed-loop adjustment in unmanned or extreme space environments.