Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with data preprocessing that incorporates both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks' hunting strategies. HHO identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, SVM, and RF as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was recorded with an accuracy of 99.44% for UNSW-NB15, demonstrating the model's effectiveness. After balancing, the model demonstrated a clear improvement in detecting the attacks. We tested the model on four datasets to show the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
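The oversampling step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the basic SMOTE idea only: a synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name smote_sketch, the parameter values, and the sample points are hypothetical and are not taken from the study, which does not state its SMOTE settings.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical 2-D minority-class samples (illustrative, not from the paper).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic samples created")
```

As the abstract notes, oversampling is applied to the training data only, so the test set keeps its original class distribution.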
Sudden wildfires cause significant global ecological damage. While satellite imagery has advanced early fire detection and mitigation, image-based systems face limitations including high false alarm rates, visual obstructions, and substantial computational demands, especially in complex forest terrains. To address these challenges, this study proposes a novel forest fire detection model utilizing audio classification and machine learning. We developed an audio-based pipeline using real-world environmental sound recordings. Sounds were converted into Mel-spectrograms and classified via a Convolutional Neural Network (CNN), enabling the capture of distinctive fire acoustic signatures (e.g., crackling, roaring) that are minimally impacted by visual or weather conditions. Internet of Things (IoT) sound sensors were crucial for generating complex environmental parameters to optimize feature extraction. The CNN model achieved high performance in stratified 5-fold cross-validation (92.4% ± 1.6 accuracy, 91.2% ± 1.8 F1-score) and on test data (94.93% accuracy, 93.04% F1-score), with 98.44% precision and 88.32% recall, demonstrating reliability across environmental conditions. These results indicate that the audio-based approach not only improves detection reliability but also markedly reduces computational overhead compared to traditional image-based methods. The findings suggest that acoustic sensing integrated with machine learning offers a powerful, low-cost, and efficient solution for real-time forest fire monitoring in complex, dynamic environments.
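As a quick check on how the test-set metrics above relate to each other, the F1-score is the harmonic mean of precision and recall. The sketch below plugs in the reported precision (98.44%) and recall (88.32%). It yields roughly 93.1%, close to but not identical to the reported 93.04%; the abstract does not say how its F1-score was averaged, so the small gap may reflect a different averaging scheme.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the test data in the abstract.
f1 = f1_score(0.9844, 0.8832)
print(f"F1 from reported precision/recall: {f1:.4f}")
```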
The spatial information of rockhead is crucial for the design and construction of tunneling or underground excavation. Although conventional site investigation methods (e.g., borehole drilling) can provide local engineering geological information, accurate prediction of the rockhead position from limited borehole data is still challenging because of its spatial variation and the large uncertainties involved. With the development of computer science, machine learning (ML) has proven to be a promising way to avoid subjective human judgment and to establish complex relationships from large datasets automatically. However, few studies have reported the adoption of ML models for predicting the rockhead position. In this paper, we propose a robust probabilistic ML model for predicting the rockhead distribution using spatial geographic information. The framework of the natural gradient boosting (NGBoost) algorithm, combined with extreme gradient boosting (XGBoost) as the base learner, is used. The XGBoost model was also compared with other ML models, namely the gradient boosting regression tree (GBRT), the light gradient boosting machine (LightGBM), multivariate linear regression (MLR), the artificial neural network (ANN), and the support vector machine (SVM). The results demonstrate that the XGBoost algorithm, the core algorithm of the probabilistic N-XGBoost model, outperformed the other conventional ML models, with a coefficient of determination (R^(2)) of 0.89 and a root mean squared error (RMSE) of 5.8 m for the prediction of rockhead position based on limited borehole data. The probabilistic N-XGBoost model not only achieved a higher prediction accuracy but also provided an estimate of the predictive uncertainty. Thus, the proposed N-XGBoost probabilistic model has the potential to be used as a reliable and effective ML algorithm for the prediction of rockhead position in rock and geotechnical engineering.
With the rapid development of artificial intelligence, magnetocaloric materials as well as other materials are being developed with increased efficiency and enhanced performance. However, most studies do not take phase transitions into account, and as a result, the predictions are usually not accurate enough. In this context, we have established an explicable relationship between alloy compositions and phase transition by feature imputation. A facile machine learning approach is proposed to screen candidate NiMn-based Heusler alloys with the desired magnetic entropy change and magnetic transition temperature with high accuracy (R^(2)≈0.98). As expected, the measured properties of the prepared NiMn-based alloys, including phase transition type, magnetic entropy change, and transition temperature, are all in good agreement with the ML predictions. Besides being the first to demonstrate an explicable relationship between alloy compositions, phase transitions, and magnetocaloric properties, our proposed ML model is highly predictive and interpretable, and it can provide a strong theoretical foundation for identifying high-performance magnetocaloric materials in the future.
Excellent detonation performance and low sensitivity are prerequisites for the deployment of energetic materials. Exploring the underlying factors that affect impact sensitivity and detonation performance, as well as how to obtain materials with desired properties, remains a long-term challenge. Machine learning, with its ability to solve complex tasks and perform robust data processing, can reveal the relationship between performance and descriptive indicators, potentially accelerating the development of energetic materials. Against this background, impact sensitivity, detonation performance, and 28 physicochemical parameters for 222 energetic materials were compiled from density functional theory calculations and published literature. Four machine learning algorithms were employed to predict various properties of energetic materials, including impact sensitivity, detonation velocity, detonation pressure, and Gurney energy. Analysis of Pearson coefficients and feature importance showed that the heat of explosion, oxygen balance, decomposition products, and HOMO energy levels correlate strongly with the impact sensitivity of energetic materials. Oxygen balance, decomposition products, and density correlate strongly with detonation performance. Using the impact sensitivity of 2,3,4-trinitrotoluene and the detonation performance of 2,4,6-trinitrobenzene-1,3,5-triamine as benchmarks, the analysis of feature importance rankings and statistical data revealed the optimal range of key features balancing impact sensitivity and detonation performance: oxygen balance values should be between -40% and -30%, density should range from 1.66 to 1.72 g/cm^(3), HOMO energy levels should be between -6.34 and -6.31 eV, and lipophilicity should be between -1.0 and 0.1, 4.49 and 5.59. These findings not only offer important insights into the impact sensitivity and detonation performance of energetic materials, but also provide theoretical guidance for the design and development of new energetic materials with optimal detonation performance and reduced sensitivity.
The presence of aluminum (Al^(3+)) and fluoride (F^(−)) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an innovative approach is presented that leverages machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of aluminum (Al^(3+)) and fluoride (F^(−)) ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, with fluorescence enhancement upon interaction with Al^(3+) ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F^(−) ions, fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. This model not only showcases exceptional performance but also addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
To better understand the migration behavior of plastic fragments in the environment, the development of rapid non-destructive methods for in-situ identification and characterization of plastic fragments is necessary. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), thus underestimating their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boosting, support vector machine, and random forest classifiers. The effects of polymer color, type, thickness, and background on the classification of plastic fragments were evaluated. PLS-DA presented the best and most stable outcome, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background, was proposed. The method presented an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and synchronous identification of colored and colorless plastic fragments under complex environmental backgrounds.
The application of machine learning for pyrite discrimination establishes a robust foundation for reconstructing the ore-forming history of multi-stage deposits; however, published models face challenges related to limited, imbalanced datasets and oversampling. In this study, the dataset was expanded to approximately 500 samples for each type, including 508 sedimentary, 573 orogenic gold, 548 sedimentary exhalative (SEDEX), and 364 volcanogenic massive sulfide (VMS) pyrites, and random forest (RF) and support vector machine (SVM) methods were used to enhance the reliability of the classifier models. The RF classifier achieved an overall accuracy of 99.8%, and the SVM classifier attained an overall accuracy of 100%. The models were evaluated by five-fold cross-validation, with 93.8% accuracy for the RF classifier and 94.9% for the SVM classifier. These results demonstrate the strong feasibility of pyrite classification, supported by a relatively large, balanced dataset and high accuracy rates. The classifiers were employed to reveal the genesis of the controversial Keketale Pb-Zn deposit in NW China, whose classification as SEDEX, VMS, or a SEDEX-VMS transition has remained inconclusive. Petrographic investigations indicated that the deposit comprises early fine-grained layered pyrite (Py1) and late recrystallized pyrite (Py2). Majority voting classified Py1 as the VMS type, with RF and SVM accuracies of 72.2% and 75%, respectively, and confirmed Py2 as an orogenic type, with accuracies of 74.3% and 77.1%, respectively. The new findings indicate that the Keketale deposit originated from a submarine VMS mineralization system, followed by late orogenic-type overprinting of metamorphism and deformation, which is consistent with the geological and geochemical observations. This study further emphasizes the advantages of machine learning (ML) methods in accurately and directly discriminating deposit types and reconstructing the formation history of multi-stage deposits.
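The majority-voting step described in this abstract can be sketched as follows. The sketch assumes that each analysed pyrite spot receives one predicted class and that the reported percentage is the share of spots voting for the winning class; that reading is an assumption, and the label list below is hypothetical rather than the study's data.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label and the fraction of votes it received."""
    counts = Counter(labels)
    winner, n_votes = counts.most_common(1)[0]
    return winner, n_votes / len(labels)

# Hypothetical per-spot predictions for one pyrite generation (illustrative only).
spot_predictions = ["VMS"] * 13 + ["SEDEX"] * 3 + ["orogenic"] * 2
label, share = majority_vote(spot_predictions)
print(f"majority class: {label}, vote share: {share:.1%}")
```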
Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small-sized liposomes, predominantly through experimental data analysis. However, the production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
Arsenic (As) pollution in soils is a pervasive environmental issue. Biochar immobilization offers a promising solution for addressing soil As contamination. The efficiency of biochar in immobilizing As in soils primarily hinges on the characteristics of both the soil and the biochar. However, the influence of a specific property on As immobilization varies among different studies, and the development and application of arsenic passivation materials based on biochar often rely on empirical knowledge. To enhance immobilization efficiency and reduce labor and time costs, a machine learning (ML) model was employed to predict As immobilization efficiency before biochar application. In this study, we collected a dataset comprising 182 data points on As immobilization efficiency from 17 publications to construct three ML models. The results demonstrated that the random forest (RF) model outperformed the gradient boosting regression tree and support vector regression models in predictive performance. Relative importance analysis and partial dependence plots based on the RF model were used to identify the most crucial factors influencing As immobilization. These findings highlighted the significant roles of biochar application time and biochar pH in As immobilization efficiency in soils. Furthermore, the study revealed that Fe-modified biochar exhibited a substantial improvement in As immobilization. These insights can facilitate targeted biochar property design and optimization of biochar application conditions to enhance As immobilization efficiency.
Finding materials with specific properties is a hot topic in materials science. Traditional materials design relies on empirical and trial-and-error methods, requiring extensive experiments and time, resulting in high costs. With the development of physics, statistics, computer science, and other fields, machine learning offers opportunities for systematically discovering new materials. In particular, in machine learning-based inverse design, algorithms analyze the mapping relationships between materials and their properties to find materials with desired properties. This paper first outlines the basic concepts of materials inverse design and the challenges faced by machine learning-based approaches to it. Then, three main inverse design methods (exploration-based, model-based, and optimization-based) are analyzed in the context of different application scenarios. Finally, the applications of inverse design methods in alloys, optical materials, and acoustic materials are elaborated on, and the prospects for materials inverse design are discussed. The authors hope to accelerate the discovery of new materials and provide new possibilities for advancing materials science and innovative design methods.
Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
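The six evaluation metrics named in this abstract can be computed as in the sketch below. The formulas are commonly used definitions (IOS as RMSE divided by the mean of the observations, RSR as RMSE divided by the standard deviation of the observations, VAF as one minus the ratio of residual variance to observed variance, in percent). The abstract does not give the exact forms used, so these definitions are assumptions, and the sample values are hypothetical.

```python
import math
from statistics import mean, pstdev, pvariance

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE, IOS, RSR and VAF for paired observations/predictions."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    y_bar = mean(y_true)
    ss_res = sum(r ** 2 for r in residuals)           # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)    # total sum of squares
    rmse = math.sqrt(ss_res / len(y_true))
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": mean([abs(r) for r in residuals]),
        "RMSE": rmse,
        "IOS": rmse / y_bar,                           # assumed definition
        "RSR": rmse / pstdev(y_true),                  # assumed definition
        "VAF": (1 - pvariance(residuals) / pvariance(y_true)) * 100,
    }

# Hypothetical bearing-capacity factors (illustrative values, not from the study).
y_true = [6.0, 7.5, 9.0, 10.5, 12.0]
y_pred = [6.1, 7.4, 9.2, 10.4, 11.9]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With the population standard deviation used here, RSR equals the square root of 1 − R², so the two metrics carry the same information on a given dataset.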
Recent studies have shown that synergistic precipitation of continuous precipitates (CPs) and discontinuous precipitates (DPs) is a promising method to simultaneously improve the strength and electrical conductivity of Cu-Ni-Si alloys. However, the complex relationship between the precipitates and the two-stage aging process presents a significant challenge for the optimization of process parameters. In this study, machine learning models were established on the basis of orthogonal experiments to mine the relationship between two-stage aging parameters and the properties of a Cu-5.3Ni-1.3Si-0.12Nb alloy with preferred formation of DPs. Two-stage aging parameters of 400℃/75 min + 400℃/30 min were then obtained by multi-objective optimization combined with an experimental iteration strategy, resulting in a tensile strength of 875 MPa and a conductivity of 41.43% IACS. This excellent comprehensive performance is attributed to the combined precipitation of DPs and CPs (with a total volume fraction of 5.4% and a volume ratio of CPs to DPs of 6.7). This study could provide a new approach and insight for improving the comprehensive properties of Cu-Ni-Si alloys.
BACKGROUND To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process. AIM To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction. METHODS A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an eXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP (package: Shapviz) algorithm was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups. RESULTS Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC = 0.8825) and external validation (AUC = 0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively. CONCLUSION Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model showed a degree of effectiveness in predicting TO before surgery. The SHAP algorithm provided intuitive visualization of the machine learning prediction process, enhancing its interpretability.
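The AUC values reported in this abstract summarise how well predicted probabilities rank patients who achieved TO above those who did not. A minimal sketch of that calculation is given below, using the pairwise (Mann-Whitney) formulation: the AUC is the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as half. The labels and scores are hypothetical and are not patient data from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count pairs where the positive outranks the negative; ties count as 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = TO achieved) and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients such as the one described here.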
Integrating exhaled breath analysis into the diagnosis of cardiovascular diseases holds significant promise as a valuable tool for future clinical use, particularly for ischemic heart disease (IHD). However, the volatilome (exhaled breath composition) in heart disease remains underexplored, and current research lacks sufficient evidence to confirm its clinical validity. Key challenges hindering the application of breath analysis in diagnosing IHD include the scarcity of studies (only three published papers to date), substantial methodological bias in two of these studies, and the absence of standardized protocols for clinical implementation. Additionally, methodologies (such as sample collection, analytical techniques, machine learning (ML) approaches, and result interpretation) vary widely across studies, further complicating their reproducibility and comparability. To address these gaps, there is an urgent need to establish unified guidelines that define best practices for breath sample collection, data analysis, ML integration, and biomarker annotation. Until these challenges are systematically resolved, the widespread adoption of exhaled breath analysis as a reliable diagnostic tool for IHD remains a distant goal rather than an imminent reality.
The application of machine learning in alloy design is increasingly widespread, yet traditional models still face challenges when dealing with limited datasets and complex nonlinear relationships. This work proposes an interpretable machine learning method based on data augmentation and reconstruction for discovering high-performance low-alloyed magnesium (Mg) alloys. The data augmentation technique expands the original dataset through Gaussian noise. The data reconstruction method reorganizes and transforms the original data to extract more representative features, significantly improving the model's generalization ability and prediction accuracy, with a coefficient of determination (R^(2)) of 95.9% for the ultimate tensile strength (UTS) model and an R^(2) of 95.3% for the elongation-to-failure (EL) model. The correlation coefficient assisted screening (CCAS) method is proposed to filter low-alloyed target alloys. A new Mg-2.2Mn-0.4Zn-0.2Al-0.2Ca (MZAX2000, wt%) alloy is designed and extruded into a bar at given processing parameters, achieving room-temperature strength-ductility synergy with an excellent UTS of 395 MPa and a high EL of 17.9%. This is closely related to the hetero-structured characteristic of the as-extruded MZAX2000 alloy, which consists of coarse grains (16%), fine grains (75%), and fiber regions (9%). Therefore, this work offers new insights into optimizing alloy compositions and processing parameters for attaining new high-strength, ductile low-alloyed Mg alloys.
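The Gaussian-noise augmentation mentioned in this abstract can be sketched as follows. The sketch assumes zero-mean noise whose standard deviation is a fixed fraction of each feature value; the abstract does not state the noise level, the number of copies, or whether values are clipped, so the function name and the parameters copies and rel_sigma are hypothetical. The first row reuses the MZAX2000 composition given in the abstract and the second row is invented for illustration.

```python
import random

def augment_with_gaussian_noise(rows, copies=3, rel_sigma=0.02, seed=42):
    """Return the original rows followed by noisy copies of each row.

    Each noisy value is v + N(0, (rel_sigma * |v|)^2). Values are not clipped,
    so a real pipeline would need to handle physically invalid results.
    """
    rng = random.Random(seed)
    augmented = [list(r) for r in rows]
    for row in rows:
        for _ in range(copies):
            augmented.append([v + rng.gauss(0.0, rel_sigma * abs(v)) for v in row])
    return augmented

# Columns: Mn, Zn, Al, Ca in wt% (first row from the abstract, second row hypothetical).
compositions = [[2.2, 0.4, 0.2, 0.2], [1.5, 0.6, 0.3, 0.1]]
expanded = augment_with_gaussian_noise(compositions)
print(len(compositions), "rows expanded to", len(expanded))
```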
Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emissions, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R^(2) of 0.906 and an RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and less economically developed counties displaying low-emission clustering. Our findings show that the use of NTL data and the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for using satellite remote sensing image data in carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
BACKGROUND Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery. AIM To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality. METHODS Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC. RESULTS Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95%CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95%CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95%CI: 5.85-10.93, P<0.0001) between the two cohorts. INTRODUCTION Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers[1] and the third leading cause of cancer-related mortality[2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases[3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the
most widely used framework for diagnosing and treating HCC[5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70%[6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors[7]. Recently, peripheral blood immune indicators, such as neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases[8-10]. However, the relationship between these combinations of immune markers and the outcomes in patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms are capable of handling large and complex datasets, generating more accurate and personalized predictions through unique training algorithms that better manage nonlinear statistical relationships than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases[11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and predicting overall survival (OS) in patients with HCC[14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. Through ML, a better understanding of the risk factors for early-stage HCC prognosis can be achieved. This
aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
Superconducting radio-frequency (SRF) cavities are the core components of SRF linear accelerators, making their stable operation considerably important. However, the operational experience from different accelerator laboratories has revealed that SRF faults are the leading cause of short machine downtime trips. When a cavity fault occurs, system experts analyze the time-series data recorded by low-level RF systems and identify the fault type. However, this requires expertise and intuition, posing a major challenge for control-room operators. Here, we propose an expert feature-based machine learning model for automating SRF cavity fault recognition. The main challenge in converting the "expert reasoning" process for SRF faults into a "model inference" process lies in feature extraction, which is attributed to the associated multidimensional and complex time-series waveforms. Existing autoregression-based feature-extraction methods require the signal to be stable and autocorrelated, resulting in difficulty in capturing the abrupt features that exist in several SRF failure patterns. To address these issues, we introduce expertise into the classification model through reasonable feature engineering. We demonstrate the feasibility of this method using the SRF cavity of the China accelerator facility for superheavy elements (CAFE2). Although specific faults in SRF cavities may vary across different accelerators, similarities exist in the RF signals. Therefore, this study provides valuable guidance for fault analysis for the entire SRF community.
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with sophisticated data preprocessing, incorporating both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Critical dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks’ hunting strategies. HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, SVM, and RF as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was recorded with an accuracy of 99.44% for UNSW-NB15, demonstrating the model’s effectiveness. After balancing, the model demonstrated a clear improvement in detecting the attacks. We tested the model on four datasets to show the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
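The synthetic oversampling step described in this abstract rests on a simple idea: a new minority-class sample is placed on the line segment between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that interpolation in plain Python. It is not the implementation used in the study, and the function name, `k`, and the sample points are assumptions.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority : list of feature vectors from the under-represented class
    k        : number of nearest neighbours considered (assumed value)
    """
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class samples (not from any of the datasets named above).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_like(minority_samples, n_new=5)
```

As the abstract notes, oversampling of this kind is applied to the training data only, so that the test data still reflect the original class distribution.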
Funding: Funded by the Directorate of Research and Community Service, Directorate General of Research and Development, Ministry of Higher Education, Science and Technology, in accordance with the Implementation Contract for the Operational Assistance Program for State Universities, Research Program Number: 109/C3/DT.05.00/PL/2025.
Abstract: Sudden wildfires cause significant global ecological damage. While satellite imagery has advanced early fire detection and mitigation, image-based systems face limitations including high false alarm rates, visual obstructions, and substantial computational demands, especially in complex forest terrains. To address these challenges, this study proposes a novel forest fire detection model utilizing audio classification and machine learning. We developed an audio-based pipeline using real-world environmental sound recordings. Sounds were converted into Mel-spectrograms and classified via a Convolutional Neural Network (CNN), enabling the capture of distinctive fire acoustic signatures (e.g., crackling, roaring) that are minimally impacted by visual or weather conditions. Internet of Things (IoT) sound sensors were crucial for generating complex environmental parameters to optimize feature extraction. The CNN model achieved high performance in stratified 5-fold cross-validation (92.4% ± 1.6 accuracy, 91.2% ± 1.8 F1-score) and on test data (94.93% accuracy, 93.04% F1-score), with 98.44% precision and 88.32% recall, demonstrating reliability across environmental conditions. These results indicate that the audio-based approach not only improves detection reliability but also markedly reduces computational overhead compared to traditional image-based methods. The findings suggest that acoustic sensing integrated with machine learning offers a powerful, low-cost, and efficient solution for real-time forest fire monitoring in complex, dynamic environments.
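For readers relating the reported metrics to one another, the F1-score is the harmonic mean of precision and recall. The short sketch below applies that definition to the test-set precision and recall quoted in the abstract. The result, about 93.1%, is close to but not identical with the reported 93.04%. The abstract does not say how the F1-score was averaged, so the small gap may reflect per-class averaging, which is an assumption here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as quoted in the abstract, expressed as fractions.
f1 = f1_score(0.9844, 0.8832)
```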
Funding: Supported by the National Research Foundation (NRF) of Singapore, under its Virtual Singapore program (Grant No. NRF2019VSG-GMS-001), and by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister’s Office under the Land and Livability National Innovation Challenge (L2 NIC) Research Program (Grant No. L2NICCFP2-2015-1).
Abstract: The spatial information of rockhead is crucial for the design and construction of tunneling or underground excavation. Although conventional site investigation methods (i.e., borehole drilling) can provide local engineering geological information, the accurate prediction of the rockhead position with limited borehole data is still challenging due to its spatial variation and the great uncertainties involved. With the development of computer science, machine learning (ML) has proved to be a promising way to avoid subjective human judgments and to establish complex relationships from big data automatically. However, few studies have been reported on the adoption of ML models for the prediction of the rockhead position. In this paper, we propose a robust probabilistic ML model for predicting the rockhead distribution using spatial geographic information. The framework of the natural gradient boosting (NGBoost) algorithm combined with extreme gradient boosting (XGBoost) is used as the basic learner. The XGBoost model was also compared with other ML models such as the gradient boosting regression tree (GBRT), the light gradient boosting machine (LightGBM), multivariate linear regression (MLR), the artificial neural network (ANN), and the support vector machine (SVM). The results demonstrate that the XGBoost algorithm, the core algorithm of the probabilistic N-XGBoost model, outperformed the other conventional ML models with a coefficient of determination (R^(2)) of 0.89 and a root mean squared error (RMSE) of 5.8 m for the prediction of rockhead position based on limited borehole data. The probabilistic N-XGBoost model not only achieved a higher prediction accuracy, but also provided a predictive estimation of the uncertainty. Thus, the proposed N-XGBoost probabilistic model has the potential to be used as a reliable and effective ML algorithm for the prediction of rockhead position in rock and geotechnical engineering.
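The two accuracy measures quoted in this abstract, together with the kind of uncertainty estimate a probabilistic model can provide, can be written out as a short sketch. The depth values are hypothetical, and the 95% interval assumes the model outputs a normal predictive distribution (a mean and a standard deviation), which the abstract does not state explicitly.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (here metres)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def normal_interval_95(mean, std):
    """Approximate 95% interval under an assumed normal predictive distribution."""
    return mean - 1.96 * std, mean + 1.96 * std

# Hypothetical rockhead depths (m) at four boreholes and their predictions.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
error = rmse(observed, predicted)
fit = r2(observed, predicted)
low, high = normal_interval_95(20.0, 5.0)
```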
Funding: Supported by the National Key R&D Program of China (No. 2022YFE0109500), the National Natural Science Foundation of China (Nos. 52071255, 52301250, 52171190 and 12304027), the Key R&D Project of Shaanxi Province (No. 2022GXLH-01-07), the Fundamental Research Funds for the Central Universities (China), the World-Class Universities (Disciplines) and the Characteristic Development Guidance Funds for the Central Universities.
Abstract: With the rapid development of artificial intelligence, magnetocaloric materials as well as other materials are being developed with increased efficiency and enhanced performance. However, most studies do not take phase transitions into account, and as a result, the predictions are usually not accurate enough. In this context, we have established an explicable relationship between alloy compositions and phase transition by feature imputation. A facile machine learning approach is proposed to screen candidate NiMn-based Heusler alloys with the desired magnetic entropy change and magnetic transition temperature with a high accuracy of R^(2)≈0.98. As expected, the measured properties of the prepared NiMn-based alloys, including phase transition type, magnetic entropy change, and transition temperature, are all in good agreement with the ML predictions. As well as being the first to demonstrate an explicable relationship between alloy compositions, phase transitions, and magnetocaloric properties, our proposed ML model is highly predictive and interpretable, which can provide a strong theoretical foundation for identifying high-performance magnetocaloric materials in the future.
Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. 2682024GF019).
Abstract: Excellent detonation performances and low sensitivity are prerequisites for the deployment of energetic materials. Exploring the underlying factors that affect impact sensitivity and detonation performances, as well as how to obtain materials with desired properties, remains a long-term challenge. Machine learning, with its ability to solve complex tasks and perform robust data processing, can reveal the relationship between performance and descriptive indicators, potentially accelerating the development process of energetic materials. Against this background, impact sensitivity, detonation performances, and 28 physicochemical parameters for 222 energetic materials from density functional theory calculations and published literature were compiled. Four machine learning algorithms were employed to predict various properties of energetic materials, including impact sensitivity, detonation velocity, detonation pressure, and Gurney energy. Analysis of Pearson coefficients and feature importance showed that the heat of explosion, oxygen balance, decomposition products, and HOMO energy levels have a strong correlation with the impact sensitivity of energetic materials. Oxygen balance, decomposition products, and density have a strong correlation with detonation performances. Utilizing the impact sensitivity of 2,3,4-trinitrotoluene and the detonation performances of 2,4,6-trinitrobenzene-1,3,5-triamine as the benchmark, the analysis of feature importance rankings and statistical data revealed the optimal range of key features balancing impact sensitivity and detonation performances: oxygen balance values should be between -40% and -30%, density should range from 1.66 to 1.72 g/cm^(3), HOMO energy levels should be between -6.34 and -6.31 eV, and lipophilicity should be between -1.0 and 0.1, 4.49 and 5.59. These findings not only offer important insights into the impact sensitivity and detonation performances of energetic materials, but also provide a theoretical guidance paradigm for the design and
development of new energetic materials with optimal detonation performances and reduced sensitivity.
Funding: Supported by the National Natural Science Foundation of China (No. U21A20290), Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011656), the Projects of Talents Recruitment of GDUPT (No. 2023rcyj1003), the 2022 “Sail Plan” Project of Maoming Green Chemical Industry Research Institute (No. MMGCIRI2022YFJH-Y-024), and Maoming Science and Technology Project (No. 2023382).
Abstract: The presence of aluminum (Al^(3+)) and fluoride (F^(−)) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an approach is presented that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of Al^(3+) and F^(−) ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, with fluorescence enhancement upon interaction with Al^(3+) ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F^(−) ions, fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. The model addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
Funding: Supported by the National Natural Science Foundation of China (No. 22276139) and the Shanghai Municipal State-owned Assets Supervision and Administration Commission (No. 2022028).
Abstract: To better understand the migration behavior of plastic fragments in the environment, the development of rapid non-destructive methods for in-situ identification and characterization of plastic fragments is necessary. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), thus underestimating their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boost, support vector machine, and random forest classifiers. The effects of polymer color, type, thickness, and background on the classification of plastic fragments were evaluated. PLS-DA presented the best and most stable outcome, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background, was proposed. The method presented an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and synchronous identification of colored and colorless plastic fragments under complex environmental backgrounds.
Funding: The National Key Research and Development Program of China (2021YFC2900300), the Natural Science Foundation of Guangdong Province (2024A1515030216), MOST Special Fund from State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences (GPMR202437), the Guangdong Province Introduced of Innovative R&D Team (2021ZT09H399), and the Third Xinjiang Scientific Expedition Program (2022xjkk1301).
Abstract: The application of machine learning for pyrite discrimination establishes a robust foundation for constructing the ore-forming history of multi-stage deposits; however, published models face challenges related to limited, imbalanced datasets and oversampling. In this study, the dataset was expanded to approximately 500 samples for each type, including 508 sedimentary, 573 orogenic gold, 548 sedimentary exhalative (SEDEX), and 364 volcanogenic massive sulfide (VMS) pyrites, and random forest (RF) and support vector machine (SVM) methodologies were used to enhance the reliability of the classifier models. The RF classifier achieved an overall accuracy of 99.8%, and the SVM classifier attained an overall accuracy of 100%. The models were evaluated by a five-fold cross-validation approach, with 93.8% accuracy for the RF and 94.9% for the SVM classifier. These results demonstrate the strong feasibility of pyrite classification, supported by a relatively large, balanced dataset and high accuracy rates. The classifier was employed to reveal the genesis of the controversial Keketale Pb-Zn deposit in NW China, whose classification as SEDEX, VMS, or a SEDEX-VMS transition has remained inconclusive. Petrographic investigations indicated that the deposit comprises early fine-grained layered pyrite (Py1) and late recrystallized pyrite (Py2). Majority voting classified Py1 as the VMS type, with accuracies of 72.2% and 75% for RF and SVM, respectively, and confirmed Py2 as an orogenic type with 74.3% and 77.1% accuracy, respectively. The new findings indicated that the Keketale deposit originated from a submarine VMS mineralization system, followed by late orogenic-type overprinting of metamorphism and deformation, which is consistent with the geological and geochemical observations. This study further emphasizes the advantages of machine learning (ML) methods in accurately and directly discriminating the deposit types and reconstructing the formation history of multi-stage deposits.
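The majority-voting step described in this abstract assigns a pyrite generation to whichever deposit type most of its individual analyses are classified as. A minimal sketch follows. The per-analysis labels are hypothetical and were not taken from the paper; they were chosen so that the winning share (26 of 36, about 72.2%) is of the same order as the figures reported.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label and the share of votes it received."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)

# Hypothetical classifier outputs for 36 analyses of one pyrite generation.
spot_predictions = ["VMS"] * 26 + ["SEDEX"] * 6 + ["orogenic"] * 4
winner, share = majority_vote(spot_predictions)
```

Under this reading, the quoted percentages are the fraction of analyses that agree with the winning class rather than accuracies measured against an independent ground truth, which is an interpretation of the abstract rather than something it states.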
Funding: Supported by the National Key Research and Development Plan of the Ministry of Science and Technology, China (Grant No.: 2022YFE0125300), the National Natural Science Foundation of China (Grant No.: 81690262), the National Science and Technology Major Project, China (Grant No.: 2017ZX09201004-021), the Open Project of National Facility for Translational Medicine (Shanghai), China (Grant No.: TMSK-2021-104), and Shanghai Jiao Tong University STAR Grant, China (Grant Nos.: YG2022ZD024 and YG2022QN111).
Abstract: Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small-sized liposomes, predominantly through experimental data analysis. However, the production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
Funding: Supported by the National Key Research and Development Program of China (No. 2020YFC1808701).
Abstract: Arsenic (As) pollution in soils is a pervasive environmental issue. Biochar immobilization offers a promising solution for addressing soil As contamination. The efficiency of biochar in immobilizing As in soils primarily hinges on the characteristics of both the soil and the biochar. However, the influence of a specific property on As immobilization varies among different studies, and the development and application of arsenic passivation materials based on biochar often rely on empirical knowledge. To enhance immobilization efficiency and reduce labor and time costs, a machine learning (ML) model was employed to predict As immobilization efficiency before biochar application. In this study, we collected a dataset comprising 182 data points on As immobilization efficiency from 17 publications to construct three ML models. The results demonstrated that the random forest (RF) model outperformed the gradient boost regression tree and support vector regression models in predictive performance. Relative importance analysis and partial dependence plots based on the RF model were used to identify the most crucial factors influencing As immobilization. These findings highlighted the significant roles of biochar application time and biochar pH in As immobilization efficiency in soils. Furthermore, the study revealed that Fe-modified biochar exhibited a substantial improvement in As immobilization. These insights can facilitate targeted biochar property design and optimization of biochar application conditions to enhance As immobilization efficiency.
Funding: Funded by the National Natural Science Foundation of China (52061020), Major Science and Technology Projects in Yunnan Province (202302AG050009), and Yunnan Fundamental Research Projects (202301AV070003).
Abstract: Finding materials with specific properties is a hot topic in materials science. Traditional materials design relies on empirical and trial-and-error methods, requiring extensive experiments and time, resulting in high costs. With the development of physics, statistics, computer science, and other fields, machine learning offers opportunities for systematically discovering new materials. In machine learning-based inverse design in particular, machine learning algorithms analyze the mapping relationships between materials and their properties to find materials with desired properties. This paper first outlines the basic concepts of materials inverse design and the challenges faced by machine learning-based approaches to materials inverse design. Then, three main inverse design methods (exploration-based, model-based, and optimization-based) are analyzed in the context of different application scenarios. Finally, the applications of inverse design methods in alloys, optical materials, and acoustic materials are elaborated on, and the prospects for materials inverse design are discussed. The authors hope to accelerate the discovery of new materials and provide new possibilities for advancing materials science and innovative design methods.
Abstract: Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
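The 70/30 split and two of the less familiar metrics named in this abstract can be sketched as follows. The abstract does not give the formulas, so the definitions used here (RSR as RMSE divided by the standard deviation of the observations, VAF as the percentage of variance explained) are commonly used forms and should be read as assumptions, as should the random seed and the rounding of the test-set size.

```python
import math
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle row indices and split them into train and test subsets."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test

def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def rsr(y_true, y_pred):
    """RMSE divided by the standard deviation of the observations."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / math.sqrt(variance(y_true))

def vaf(y_true, y_pred):
    """Variance accounted for, in percent."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1.0 - variance(residuals) / variance(y_true)) * 100.0

# 1188 simulations, as stated in the abstract; the rows here are just indices.
train_rows, test_rows = train_test_split(list(range(1188)))
```

With 1188 rows, this split yields 832 training and 356 test samples; a perfect model would give an RSR of 0 and a VAF of 100.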
Funding: Financially supported by the National Key Research and Development Program of China (No. 2023YFB3812601), the National Natural Science Foundation of China (Nos. 51925401, 92066205 and 92266301), and the Young Elite Scientists Sponsorship Program by CAST (No. 2022QNRC001).
Abstract: Recent studies have shown that synergistic precipitation of continuous precipitates (CPs) and discontinuous precipitates (DPs) is a promising method to simultaneously improve the strength and electrical conductivity of Cu-Ni-Si alloys. However, the complex relationship between precipitates and the two-stage aging process presents a significant challenge for the optimization of process parameters. In this study, machine learning models were established based on orthogonal experiments to mine the relationship between two-stage aging parameters and the properties of a Cu-5.3Ni-1.3Si-0.12Nb alloy with preferred formation of DPs. Two-stage aging parameters of 400℃/75 min + 400℃/30 min were then obtained by multi-objective optimization combined with an experimental iteration strategy, resulting in a tensile strength of 875 MPa and a conductivity of 41.43%IACS. This excellent comprehensive performance is attributed to the combined precipitation of DPs and CPs (with a total volume fraction of 5.4% and a volume ratio of CPs to DPs of 6.7). This study could provide a new approach and insight for improving the comprehensive properties of Cu-Ni-Si alloys.
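Because strength and conductivity usually trade off against each other, multi-objective optimization of this kind typically looks for non-dominated (Pareto-optimal) candidates rather than a single maximum. The abstract does not name the optimization algorithm, so the Pareto filter below is only one possible reading, and the candidate values are hypothetical predictions, not data from the study.

```python
def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both objectives.

    Both 'strength' (MPa) and 'conductivity' (%IACS) are to be maximised.
    """
    front = []
    for c in candidates:
        dominated = any(
            o["strength"] >= c["strength"]
            and o["conductivity"] >= c["conductivity"]
            and (o["strength"] > c["strength"] or o["conductivity"] > c["conductivity"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

# Hypothetical model predictions for four aging schedules.
predicted = [
    {"name": "A", "strength": 850.0, "conductivity": 42.0},
    {"name": "B", "strength": 875.0, "conductivity": 41.0},
    {"name": "C", "strength": 840.0, "conductivity": 40.0},  # worse than A on both
    {"name": "D", "strength": 860.0, "conductivity": 41.5},
]
best = pareto_front(predicted)
```

The surviving candidates would then be the natural choices for the next round of experiments in an iterative strategy of the kind the abstract describes.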
Funding: Supported by National Key Research and Development Program, No. 2022YFC2407304; Major Research Project for Middle-Aged and Young Scientists of Fujian Provincial Health Commission, No. 2021ZQNZD013; The National Natural Science Foundation of China, No. 62275050; Fujian Province Science and Technology Innovation Joint Fund Project, No. 2019Y9108; and Major Science and Technology Projects of Fujian Province, No. 2021YZ036017.
Abstract: BACKGROUND To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process. AIM To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction. METHODS A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an EXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP (package: Shapviz) algorithm was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups. RESULTS Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC=0.8825) and external validation (AUC=0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively. CONCLUSION Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model had certain effectiveness in
predicting TO before surgery.The SHAP algorithm provided intuitive visualization of the machine learning prediction process,enhancing its interpretability.
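The abstract above reports model quality as AUC values. As a reading aid, the following minimal sketch shows what an AUC measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. It is not the authors' code; the function name `roc_auc` and the toy inputs are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and scores, not study data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The quadratic pairwise loop is chosen for clarity; for cohorts of a few hundred patients it is fast enough, while larger datasets would normally use a rank-based formulation.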
Funding: Supported by the government assignment, No. 1023022600020-6; the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of the World-Class Research Center "Digital Biodesign and Personalized Healthcare," No. 075-15-2022-304; RSF grant, No. 24-15-00549.
Abstract: Integrating exhaled breath analysis into the diagnosis of cardiovascular diseases holds significant promise as a valuable tool for future clinical use, particularly for ischemic heart disease (IHD). However, the volatilome (exhaled breath composition) in heart disease remains underexplored, and current research lacks sufficient evidence to confirm its clinical validity. Key challenges hindering the application of breath analysis in diagnosing IHD include the scarcity of studies (only three published papers to date), substantial methodological bias in two of these studies, and the absence of standardized protocols for clinical implementation. Additionally, methodologies (sample collection, analytical techniques, machine learning (ML) approaches, and result interpretation) vary widely across studies, further complicating their reproducibility and comparability. To address these gaps, there is an urgent need to establish unified guidelines that define best practices for breath sample collection, data analysis, ML integration, and biomarker annotation. Until these challenges are systematically resolved, the widespread adoption of exhaled breath analysis as a reliable diagnostic tool for IHD remains a distant goal rather than an imminent reality.
Funding: Funded by the National Natural Science Foundation of China (No. 52204407); the Natural Science Foundation of Jiangsu Province (No. BK20220595); the China Postdoctoral Science Foundation (No. 2022M723689); the Industrial Collaborative Innovation Project of Shanghai (No. XTCX-KJ-2022-2-11).
Abstract: The application of machine learning in alloy design is increasingly widespread, yet traditional models still face challenges when dealing with limited datasets and complex nonlinear relationships. This work proposes an interpretable machine learning method based on data augmentation and reconstruction to discover high-performance low-alloyed magnesium (Mg) alloys. The data augmentation technique expands the original dataset through Gaussian noise. The data reconstruction method reorganizes and transforms the original data to extract more representative features, significantly improving the model's generalization ability and prediction accuracy, with a coefficient of determination (R²) of 95.9% for the ultimate tensile strength (UTS) model and an R² of 95.3% for the elongation-to-failure (EL) model. The correlation coefficient assisted screening (CCAS) method is proposed to filter low-alloyed target alloys. A new Mg-2.2Mn-0.4Zn-0.2Al-0.2Ca (MZAX2000, wt%) alloy is designed and extruded into bar under given processing parameters, achieving room-temperature strength-ductility synergy with a UTS of 395 MPa and an EL of 17.9%. This is closely related to the heterostructure of the as-extruded MZAX2000 alloy, which consists of coarse grains (16%), fine grains (75%), and fiber regions (9%). This work therefore offers new insights into optimizing alloy compositions and processing parameters to obtain new high-strength, ductile low-alloyed Mg alloys.
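The abstract above states that the dataset is expanded through Gaussian noise but gives no details. One common way to do this is sketched below, under explicit assumptions: each feature is perturbed with zero-mean noise whose standard deviation is a small fraction (`rel_sigma`) of that feature's spread, and targets are copied unchanged. The function name, the parameters, and the toy values are hypothetical and are not taken from the paper.

```python
import random
import statistics


def augment_with_gaussian_noise(features, targets, copies=3, rel_sigma=0.02, seed=0):
    """Return the original rows plus `copies` noisy replicas of each row.

    Assumption: noise for feature j is drawn from N(0, (rel_sigma * std_j)^2),
    where std_j is the population standard deviation of column j.
    Targets are duplicated without noise.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    stds = [statistics.pstdev(col) for col in zip(*features)]
    aug_x = [list(row) for row in features]
    aug_y = list(targets)
    for _ in range(copies):
        for row, y in zip(features, targets):
            noisy = [v + rng.gauss(0.0, rel_sigma * s) for v, s in zip(row, stds)]
            aug_x.append(noisy)
            aug_y.append(y)
    return aug_x, aug_y


# Toy example with made-up composition-like features and a made-up target.
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
y = [0.1, 0.2, 0.3]
aug_x, aug_y = augment_with_gaussian_noise(x, y, copies=2)
print(len(aug_x), len(aug_y))  # 3 originals + 2 * 3 noisy copies = 9 9
```

Scaling the noise by each feature's spread keeps the perturbation comparable across features measured on different scales; augmentation should be applied to training data only, so that evaluation data stays untouched.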
Funding: Supported by the Key Research and Development Program in Shaanxi Province, China (No. 2022ZDLSF07-05); the Fundamental Research Funds for the Central Universities, CHD (No. 300102352901).
Abstract: Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning approach for estimating carbon emissions, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R² of 0.906 and an RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province and shifting from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and less economically developed counties showing low-emission clustering. Our findings show that NTL data combined with the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for using satellite remote sensing imagery in carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
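The two regression metrics quoted above, R² and RMSE, can be computed from observed and predicted values as in the following minimal sketch. The definitions are the standard ones; the function name and the toy numbers are assumptions for illustration and do not reproduce the study's data.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired observed and predicted values.

    R^2  = 1 - SS_res / SS_tot
    RMSE = sqrt(SS_res / n)
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Toy check: predicting the mean everywhere gives R^2 = 0.
observed = [1.0, 2.0, 3.0, 4.0]
r2, rmse = r2_and_rmse(observed, [2.5, 2.5, 2.5, 2.5])
print(r2, rmse)
```

RMSE carries the units of the target variable, so an RMSE value is only interpretable alongside the unit and range of the emissions data it was computed on.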
Funding: Supported by the High-Level Chinese Medicine Key Discipline Construction Project, No. zyyzdxk-2023005; Capital Health Development Research Project, No. 2024-1-2173; the National Natural Science Foundation of China, No. 82474426 and No. 82474419.
Abstract: BACKGROUND: Patients with early-stage hepatocellular carcinoma (HCC) generally have good survival rates following surgical resection. However, a subset of these patients experience recurrence within five years post-surgery. AIM: To develop predictive models utilizing machine learning (ML) methods to detect early-stage patients at a high risk of mortality. METHODS: Eight hundred and eight patients with HCC at Beijing Ditan Hospital were randomly allocated to training and validation cohorts in a 2:1 ratio. Prognostic models were generated using random survival forests and artificial neural networks (ANNs). These ML models were compared with other classic HCC scoring systems. A decision-tree model was established to validate the contribution of immune-inflammatory indicators to the long-term outlook of patients with early-stage HCC. RESULTS: Immune-inflammatory markers, albumin-bilirubin scores, alpha-fetoprotein, tumor size, and International Normalized Ratio were closely associated with the 5-year survival rates. Among various predictive models, the ANN model generated using these indicators through ML algorithms exhibited superior performance, with a 5-year area under the curve (AUC) of 0.85 (95%CI: 0.82-0.88). In the validation cohort, the 5-year AUC was 0.82 (95%CI: 0.74-0.85). According to the ANN model, patients were classified into high-risk and low-risk groups, with an overall survival hazard ratio of 7.98 (95%CI: 5.85-10.93, P < 0.0001) between the two groups.
INTRODUCTION: Hepatocellular carcinoma (HCC) is one of the six most prevalent cancers[1] and the third leading cause of cancer-related mortality[2]. China has some of the highest incidence and mortality rates for liver cancer, accounting for half of global cases[3,4]. The Barcelona Clinic Liver Cancer (BCLC) Staging System is the most widely used framework for diagnosing and treating HCC[5]. The optimal candidates for surgical treatment are those with early-stage HCC, classified as BCLC stage 0 or A. Patients with early-stage liver cancer typically have a better prognosis after surgical resection, achieving a 5-year survival rate of 60%-70%[6]. However, the high postoperative recurrence rates of HCC remain a major obstacle to long-term efficacy. To improve the prognosis of patients with early-stage HCC, it is necessary to develop models that can identify those with poor prognoses, enabling stratified and personalized treatment and follow-up strategies. Chronic inflammation is linked to the development and advancement of tumors[7]. Recently, peripheral blood immune indicators, such as the neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and lymphocyte-to-monocyte ratio (LMR), have garnered extensive attention and have been used to predict survival in various tumors and inflammation-related diseases[8-10]. However, the relationship between combinations of these immune markers and the outcomes of patients with early-stage HCC requires further investigation. Machine learning (ML) algorithms can handle large and complex datasets, generating more accurate and personalized predictions through training algorithms that manage nonlinear statistical relationships better than traditional analytical methods. Commonly used ML models include artificial neural networks (ANNs) and random survival forests (RSFs), which have shown satisfactory accuracy in prognostic predictions across various cancers and other diseases[11-13]. ANNs have performed well in identifying the progression from liver cirrhosis to HCC and in predicting overall survival (OS) in patients with HCC[14,15]. However, no studies have confirmed the ability of ML models to predict post-surgical survival in patients with early-stage HCC. ML can provide a better understanding of the risk factors for early-stage HCC prognosis, which aids in surgical decision-making, identifying patients at a high risk of mortality, and selecting subsequent treatment strategies. In this study, we aimed to establish a 5-year prognostic model for patients with early-stage HCC after surgical resection, based on ML and systemic immune-inflammatory indicators. This model seeks to improve the early monitoring of high-risk patients and provide personalized treatment plans.
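The peripheral blood indicators named in the introduction above (NLR, PLR, LMR) are simple ratios of cell counts, as their names state. The sketch below computes them from one blood count; it assumes all four counts are expressed in the same unit (for example 10^9 cells/L). The function name and the toy values are hypothetical and are not patient data from the study.

```python
def immune_ratios(neutrophils, lymphocytes, platelets, monocytes):
    """Compute NLR, PLR and LMR from absolute cell counts in the same unit.

    NLR = neutrophils / lymphocytes
    PLR = platelets   / lymphocytes
    LMR = lymphocytes / monocytes
    """
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "LMR": lymphocytes / monocytes,
    }


# Toy blood count (made-up values, all in 10^9 cells/L).
print(immune_ratios(neutrophils=4.0, lymphocytes=2.0, platelets=200.0, monocytes=0.5))
# -> {'NLR': 2.0, 'PLR': 100.0, 'LMR': 4.0}
```

Ratios like these would typically be computed per patient from preoperative laboratory results and then supplied as input features to a prognostic model.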
Funding: Supported by Studies of Intelligent LLRF Control Algorithms for Superconducting RF Cavities (No. E129851YR0); the National Natural Science Foundation of China (No. U22A20261); Applications of Artificial Intelligence in the Stability Study of Superconducting Linear Accelerators (No. E429851YR0).
Abstract: Superconducting radio-frequency (SRF) cavities are the core components of SRF linear accelerators, so their stable operation is essential. However, operational experience from different accelerator laboratories has revealed that SRF faults are the leading cause of short machine downtime trips. When a cavity fault occurs, system experts analyze the time-series data recorded by low-level RF systems and identify the fault type. However, this requires expertise and intuition, posing a major challenge for control-room operators. Here, we propose an expert feature-based machine learning model for automating SRF cavity fault recognition. The main challenge in converting the "expert reasoning" process for SRF faults into a "model inference" process lies in feature extraction, because the associated time-series waveforms are multidimensional and complex. Existing autoregression-based feature-extraction methods require the signal to be stable and autocorrelated, which makes it difficult to capture the abrupt features present in several SRF failure patterns. To address these issues, we introduce expertise into the classification model through reasonable feature engineering. We demonstrate the feasibility of this method using the SRF cavities of the China Accelerator Facility for Superheavy Elements (CAFE2). Although specific faults in SRF cavities may vary across different accelerators, similarities exist in the RF signals. Therefore, this study provides valuable guidance for fault analysis across the SRF community.
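The abstract above does not list the expert features it uses. Purely as an illustration of what a feature aimed at abrupt changes might look like, the sketch below extracts the largest sample-to-sample jump in a waveform and where it occurs. The feature choice, the function name `abrupt_change_features`, and the toy waveform are all assumptions and should not be read as the authors' method.

```python
def abrupt_change_features(signal):
    """Describe the sharpest step in a 1-D waveform.

    Returns the signed size of the largest absolute sample-to-sample
    difference and the index of the sample at which that jump lands.
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    idx = max(range(len(diffs)), key=lambda i: abs(diffs[i]))
    return {"max_step": diffs[idx], "step_index": idx + 1}


# Toy amplitude trace (made-up values): flat, then a sudden collapse.
trace = [1.0, 1.0, 0.98, 0.2, 0.1]
features = abrupt_change_features(trace)
print(features["step_index"])  # the collapse lands at sample 3
```

Unlike autoregressive coefficients, a difference-based feature of this kind does not assume the signal is stationary, which is why it can respond to a single sudden drop.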