Nasopharyngeal carcinoma (NPC) is a malignant tumor prevalent in southern China and Southeast Asia, where early detection is crucial for improving patient prognosis and reducing mortality. However, existing screening methods suffer from limitations in accuracy and accessibility, hindering their application in large-scale population screening. In this work, a surface-enhanced Raman spectroscopy (SERS)-based method was established to explore the profiles of different stratified components in saliva from NPC patients and healthy subjects after fractionation. The findings indicate that all fractionated samples exhibit disease-associated molecular signaling differences, with the small-molecule fraction (molecular weight cut-off of 10 kDa) demonstrating superior classification capability: a sensitivity of 90.5%, a specificity of 75.6%, and an area under the receiver operating characteristic (ROC) curve of 0.925 ± 0.031. The primary objective of this study was to qualitatively explore patterns in saliva composition across groups. The proposed SERS detection strategy for fractionated saliva offers novel insights for enhancing the sensitivity and reliability of noninvasive NPC screening, laying the foundation for translational application in large-scale clinical settings.
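The three figures of merit quoted in this abstract (sensitivity, specificity, area under the ROC curve) can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data from the study, and the 0.5 decision threshold is an assumption; the AUC is computed with the rank-based (Mann-Whitney) formulation rather than whatever procedure the authors used.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = patient, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for 4 patients and 4 healthy subjects.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance over all thresholds, which is why an abstract can report a high AUC alongside a more modest specificity.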
Replicating the chaotic characteristics inherent in nonlinear dynamical systems via machine learning (ML) is a key challenge in this rapidly advancing interdisciplinary field. In this work, we explore the potential of variational quantum circuits (VQC) for learning the stochastic properties of classical nonlinear dynamical systems. Specifically, we focus on the one- and two-dimensional logistic maps, which, while simple, remain under-explored in the context of learning dynamical characteristics. Our findings reveal that, even for such simple dynamical systems, accurately replicating long-term characteristics is hindered by a pronounced sensitivity to overfitting. While increasing the parameter complexity of the ML model typically enhances short-term prediction accuracy, it also degrades the model's ability to replicate long-term characteristics, primarily because overfitting harms generalization. By comparing the VQC with two widely recognized classical ML techniques, namely long short-term memory (LSTM) networks for time-series processing and reservoir computing, we demonstrate that the VQC outperforms these methods in replicating long-term characteristics. Our results suggest that the ML of dynamics calls for more compact and efficient models (such as VQC) rather than more complicated and larger-scale ones.
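For readers unfamiliar with the one-dimensional logistic map x_{n+1} = r x_n (1 - x_n), the following minimal sketch iterates it and estimates the Lyapunov exponent, one commonly used long-term characteristic. The abstract does not say which long-term statistics the authors compared, so the choice of the Lyapunov exponent, the function names, and the parameter values are all assumptions made for illustration.

```python
import math


def logistic_map(x, r):
    """One step of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def lyapunov_exponent(r, x0=0.2, n_transient=1000, n_steps=20000):
    """Average log|f'(x)| along an orbit; positive values indicate chaos."""
    x = x0
    for _ in range(n_transient):  # discard the initial transient
        x = logistic_map(x, r)
    total = 0.0
    for _ in range(n_steps):
        deriv = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(deriv, 1e-300))  # floor avoids log(0) at x = 0.5
        x = logistic_map(x, r)
    return total / n_steps


lam_periodic = lyapunov_exponent(3.2)  # stable period-2 orbit
lam_chaotic = lyapunov_exponent(4.0)   # fully chaotic regime
print(lam_periodic, lam_chaotic)
```

A learned surrogate could be assessed the same way: iterate the trained model in place of `logistic_map` and compare the resulting long-run statistic with that of the true map, which probes long-term fidelity rather than one-step accuracy.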
The glass transition temperature (T_(g)) of styrene-butadiene rubber (SBR) is a key parameter determining its low-temperature flexibility and processing performance. Accurate prediction of T_(g) is crucial for material design and application optimisation. Addressing the limitations of traditional experimental measurements and theoretical models in efficiency, cost, and accuracy, this study proposes a machine learning prediction framework that integrates multi-model ensembling and Bayesian optimization, built on a multi-component feature dataset and an algorithm optimization strategy. Based on the constructed high-quality dataset of 96 SBR samples, nine machine learning models were employed to predict the T_(g) of SBR, and their prediction performance was compared. Ultimately, a GPR-XGBoost hybrid model was constructed through model ensembling, achieving high-precision prediction with R^(2) values greater than 0.9 on both the training and test sets. Further feature attribution and local effect analysis were conducted using feature analysis methods such as SHAP and ALE, revealing the nonlinear influence patterns of the various components on T_(g) and providing a theoretical basis for SBR formulation design and T_(g) regulation. The machine learning prediction framework established in this study combines high-precision prediction with interpretability, significantly enhancing the prediction performance for the T_(g) of SBR. It offers an efficient tool for SBR molecular design and holds great potential for wider application.
The complex interactions and conflicting performance demands in multi-component composites pose significant challenges for achieving balanced multi-property optimization through conventional trial-and-error approaches. Machine learning (ML) offers a promising solution, markedly improving materials discovery efficiency. However, the high dimensionality of feature spaces in such systems has long impeded effective ML-driven feature representation and inverse design. To overcome this, we present an Intelligent Screening System (ISS) framework to accelerate the discovery of optimal formulations balancing four key properties in 15-component PTFE-based copper-clad laminate composites (PTFE-CCLCs). ISS adopts modular descriptors based on the physical information of component volume fractions, thereby simplifying the feature representation. By leveraging the inverse prediction capability of ML models and constructing a performance-driven virtual candidate database, ISS significantly reduced the computational complexity associated with high-dimensional spaces. Experimental validation confirmed that ISS-optimized formulations exhibited superior synergy, notably resolving the trade-off between thermal conductivity and peel strength, and outperformed many commercial counterparts. Despite limited data and inherent process variability, ISS achieved an average prediction accuracy of 76.5%, with thermal conductivity predictions exceeding 90%, demonstrating robust reliability. This work provides an innovative, efficient strategy for multifunctional optimization and accelerated discovery in ultra-complex composite systems, highlighting the integration of ML and advanced materials design.
The detection and characterization of non-metallic inclusions are essential for clean steel production. Recently, imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence (AI)-based machine learning (ML) has developed rapidly. This technique has achieved impressive results in the field of inclusion classification in process metallurgy. The present study surveys the ML modeling of inclusion prediction in advanced steels, including the detection, classification, and feature prediction of inclusions in different steel grades. Studies on clean steel with different features based on data and image analysis via ML are summarized. Regarding the data analysis, the inclusion prediction methodology based on ML establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters. Regarding the image analysis, the focus is placed on the classification of different types of inclusions via deep learning, in comparison with data analysis. Finally, further development of inclusion analyses using ML-based methods is recommended. This work paves the way for the application of AI-based methodologies for ultraclean-steel studies from a sustainable metallurgy perspective.
During electrochemical machining (ECM), the passivation film formed on the surface of titanium alloy can lead to uneven dissolution and pitting. Solid particle erosion can effectively remove this passivation film. In this paper, the electrochemical dissolution behavior of Ti-6.5Al-2Zr-1Mo-1V (TA15) titanium alloy was investigated without particle impact and under low-angle (15°) and high-angle (90°) particle impact, and the influence of Al_(2)O_(3) particles on ECM was systematically expounded. It was found that without particle erosion, the surface of the electrochemically processed titanium alloy showed serious pitting corrosion due to the passivation film, and the local surface roughness (S_(a)) reached 10.088 μm. At a high impact angle (90°), owing to strain hardening and particle embedding, only the edge of the surface was dissolved, while the central area remained almost undissolved, with the surface roughness (S_(a)) reaching 16.086 μm. In contrast, at a low impact angle (15°), the machining efficiency and surface quality of the material were significantly improved by the ploughing effect and galvanic corrosion, and the surface roughness (S_(a)) reached 2.823 μm. Based on these findings, an electrochemical dissolution model of TA15 titanium alloy under different particle erosion conditions was established.
Accurate prediction of flood events is important for flood control and risk management. Machine learning techniques have contributed greatly to advances in flood prediction, and existing studies have mainly focused on predicting flood resource variables using single or hybrid machine learning techniques. However, class-based flood predictions, which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies, have rarely been investigated. This study proposed an approach for predicting flood regime metrics and event classes that couples machine learning algorithms with clustering-deduced membership degrees. Five algorithms were adopted for this exploration. Results showed that the class membership degrees accurately determined event classes, with class hit rates up to 100% compared with the four classes clustered from nine regime metrics. The nonlinear algorithms (Multiple Linear Regression, Random Forest, and least squares-Support Vector Machine) outperformed the linear techniques (Multiple Linear Regression and Stepwise Regression) in predicting flood regime metrics. The proposed approach predicted flood event classes well, with average class hit rates of 66.0%-85.4% and 47.2%-76.0% in the calibration and validation periods, respectively, particularly for the slow and late flood events. The predictive capability of the proposed approach for flood regime metrics and classes was considerably stronger than that of the hydrological modeling approach.
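The idea of deriving an event class from membership degrees and scoring it with a class hit rate can be sketched as follows. This is only an illustration under stated assumptions: the class is taken as the one with the largest membership degree, the hit rate is the fraction of events whose assigned class matches the reference (clustered) class, and all numbers are hypothetical. The abstract does not give the authors' exact definitions.

```python
def assign_class(memberships):
    """Pick the index of the class with the largest membership degree."""
    return max(range(len(memberships)), key=lambda k: memberships[k])


def class_hit_rate(membership_rows, reference_classes):
    """Fraction of events whose assigned class equals the reference class."""
    hits = sum(
        1
        for row, ref in zip(membership_rows, reference_classes)
        if assign_class(row) == ref
    )
    return hits / len(reference_classes)


# Hypothetical membership degrees of four flood events to four classes.
memberships = [
    [0.70, 0.10, 0.15, 0.05],
    [0.20, 0.55, 0.15, 0.10],
    [0.10, 0.20, 0.30, 0.40],
    [0.05, 0.05, 0.80, 0.10],
]
reference = [0, 1, 2, 2]  # hypothetical classes from clustering

rate = class_hit_rate(memberships, reference)
print(rate)
```

In this toy case the third event is assigned to class 3 instead of class 2, so three of four events are hits.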
Oxide dispersion strengthened (ODS) alloys are extensively used owing to the high thermostability and creep strength contributed by uniformly dispersed fine oxide particles. However, these strengthening particles also deteriorate processability, so it is of great importance to establish accurate processing maps that guide thermomechanical processing and enhance formability. In this study, we developed a particle swarm optimization-based back propagation artificial neural network model to predict the high-temperature flow behavior of 0.25 wt% Al_(2)O_(3) particle-reinforced Cu alloys, and compared its accuracy with that of an Arrhenius-type constitutive model and a back propagation artificial neural network model. To train these models, we obtained the raw data by fabricating ODS Cu alloys using the internal oxidation and reduction method and conducting systematic hot compression tests between 400 and 800 ℃ at strain rates of 10^(-2)-10 s^(-1). Finally, processing maps for ODS Cu alloys were proposed by combining processing parameters, mechanical behavior, and microstructure characterization, and the modeling results achieved a coefficient of determination higher than 99%.
Recent digital advancements have intensified the need for adaptive, data-driven, and socially centered learning ecosystems. This paper presents the formulation of a cross-platform, innovative, gamified, and personalized Learning Ecosystem, which integrates 3D/VR environments, machine learning algorithms, and business intelligence frameworks to enhance learner-centered education and inference-based decision-making. The Learning System makes use of immersive, analytically assessed virtual learning spaces, facilitating real-time monitoring not only of learning performance but also of overall engagement and behavioral patterns, via a comprehensive set of sustainability-oriented, ESG-aligned Key Performance Indicators (KPIs). Machine learning models support predictive analysis, personalized feedback, and hybrid recommendation mechanisms, whilst dedicated dashboards translate complex educational data into actionable insights for all Use Cases of the System (Educational Institutions, Educators, and Learners). Additionally, the presented Learning System introduces a structured Mentoring and Consulting Subsystem, reinforcing human-centered guidance alongside automated intelligence. The Platform's modular architecture and simulation-centered evaluation approach actively support personalized and continuously optimized learning pathways. It thus exemplifies a mature, adaptive Learning Ecosystem that combines immersive technologies, analytics, and pedagogical support, contributing to contemporary digital learning innovation and sociotechnical transformation in education.
Background: Lumbar disc degeneration (LDD) displays considerable heterogeneity in terms of clinical features and pathological changes. However, researchers have not clearly determined whether the transcriptome variations in LDD could be used to identify or interpret the causes of heterogeneity in clinical features. This study aimed to identify the transcriptomic classification of degenerated discs in LDD patients and to determine whether the molecular subtypes of LDD could be accurately predicted using clinical features. Methods: One hundred and twenty-two nucleus pulposus (NP) tissues from 108 patients were consecutively collected for bulk RNA sequencing (RNA-seq). An unsupervised clustering method was employed to analyze the bulk RNA matrix. Differential analysis was performed to characterize the transcriptional signatures and subtype-specific extracellular matrix (ECM) dysregulation. The cell subpopulation states of each subtype were inferred by integrating bulk and single-cell sequencing datasets. Transwell and dual-luciferase reporter gene assays were employed to investigate the possible molecular mechanisms involved. Machine learning diagnostic prediction models were developed to correlate molecular classification with clinical features. Results: LDD was classified into 4 subtypes with distinct molecular signatures and ECM remodeling: C1 with collagenesis, C2 with ossification, C3 with low chondrogenesis, and C4 with fibrogenesis. Chond1-3 in C1 dominated disc collagenesis via the activation of the mechanosensors TRPV4 and PIEZO1; NP progenitor cells in C2 exhibited chondrogenic and osteogenic phenotypes; Chond1 in C3 was linked to a disrupted hypoxic microenvironment leading to reduced chondrogenesis; macrophages in C4 played a crucial role in disc fibrogenesis via the secretion of tumor necrosis factor-α (TNF-α). Furthermore, the random forest diagnostic prediction model was proven to have a robust performance [area under the receiver operating characteristic (ROC) curve: 0.9312; accuracy: 0.84] in stratifying the molecular subtypes of LDD based on 12 clinical features. Conclusions: Our study delineates 4 distinct molecular subtypes of LDD that can be accurately stratified on the basis of clinical features. The identification of these subtypes would facilitate precise diagnostics and guide the development of personalized treatment strategies for LDD.
Efficient surface passivation is critical for achieving high-performance perovskite solar cells (PSCs), yet the discovery of optimal passivators remains a time-consuming, trial-and-error process. Here, we report a synergistic machine learning (ML) and density functional theory (DFT) approach that enables predictive and rapid identification of effective passivation materials. By training an XGBoost model (91.3% accuracy) with DFT-derived molecular descriptors and activity calculations, we identify 2-(4-aminophenyl)-3H-benzimidazol-5-amine (APBIA) as a promising passivator. Experimental validation demonstrates that APBIA effectively removes surface impurities and passivates defects within perovskite films, leading to a significant increase in power conversion efficiency (PCE) from 22.48% to 25.55% (certified as 25.02%). This ML-DFT framework provides a generalizable pathway for accelerating the development of advanced functional materials for photovoltaic applications.
Post-kidney transplant rejection is a critical factor influencing transplant success rates and the survival of transplanted organs. With the rapid advancement of artificial intelligence technologies, machine learning (ML) has emerged as a powerful data analysis tool, widely applied in the prediction, diagnosis, and mechanistic study of kidney transplant rejection. This mini-review systematically summarizes the recent applications of ML techniques in post-kidney transplant rejection, covering areas such as the construction of predictive models, identification of biomarkers, analysis of pathological images, assessment of immune cell infiltration, and formulation of personalized treatment strategies. By integrating multi-omics data and clinical information, ML has significantly enhanced the accuracy of early rejection diagnosis and the capability for prognostic evaluation, driving the development of precision medicine in the field of kidney transplantation. Furthermore, this article discusses the challenges faced in existing research and potential future directions, providing a theoretical basis and technical references for related studies.
Sudden wildfires cause significant global ecological damage. While satellite imagery has advanced early fire detection and mitigation, image-based systems face limitations including high false alarm rates, visual obstructions, and substantial computational demands, especially in complex forest terrains. To address these challenges, this study proposes a novel forest fire detection model utilizing audio classification and machine learning. We developed an audio-based pipeline using real-world environmental sound recordings. Sounds were converted into Mel-spectrograms and classified via a Convolutional Neural Network (CNN), enabling the capture of distinctive fire acoustic signatures (e.g., crackling, roaring) that are minimally affected by visual or weather conditions. Internet of Things (IoT) sound sensors were crucial for generating complex environmental parameters to optimize feature extraction. The CNN model achieved high performance in stratified 5-fold cross-validation (92.4% ± 1.6% accuracy, 91.2% ± 1.8% F1-score) and on test data (94.93% accuracy, 93.04% F1-score), with 98.44% precision and 88.32% recall, demonstrating reliability across environmental conditions. These results indicate that the audio-based approach not only improves detection reliability but also markedly reduces computational overhead compared to traditional image-based methods. The findings suggest that acoustic sensing integrated with machine learning offers a powerful, low-cost, and efficient solution for real-time forest fire monitoring in complex, dynamic environments.
This study developed a modeling methodology for statistical optimization-based geologic hazard susceptibility assessment, aiming to enhance the comprehensive performance and classification accuracy of the assessment models. First, the cumulative probability method revealed that, between any two geologic hazard points, there was only a low probability (15%) of geologic hazards occurring outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). The training dataset was then established, consisting of negative samples (non-hazard points) randomly generated based on the distance threshold, positive samples (i.e., historical hazards), and 13 conditioning factors. Next, models were built using five machine learning algorithms, namely random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The comprehensive performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators, revealing that RF exhibited the best performance, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed by considering the distance threshold outperformed those built using the unoptimized dataset. The characteristic factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, the geologic hazard susceptibility classification was assessed using the natural breaks method combined with a clustering algorithm. The results indicate that the clustering algorithm exhibited higher classification accuracy than the natural breaks method. The findings of this study demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
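One plausible reading of "negative samples randomly generated based on the distance threshold" is rejection sampling: draw random points in the study area and keep only those at least 2297 m from every historical hazard point. The sketch below implements that reading. The rectangular study area, the planar coordinates in metres, the hazard locations, and the function name `sample_negatives` are all hypothetical, and the authors' actual procedure may differ.

```python
import math
import random


def sample_negatives(hazards, n_samples, threshold_m, bounds, seed=0, max_tries=100000):
    """Draw random non-hazard points at least threshold_m away from every hazard point."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    tries = 0
    while len(negatives) < n_samples and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject candidates that fall inside any hazard buffer zone.
        if all(math.dist(candidate, h) >= threshold_m for h in hazards):
            negatives.append(candidate)
    return negatives


# Hypothetical hazard coordinates (metres) in a 20 km x 20 km study area.
hazards = [(5000.0, 5000.0), (12000.0, 8000.0), (3000.0, 15000.0)]
negatives = sample_negatives(
    hazards,
    n_samples=3,
    threshold_m=2297.0,  # distance threshold quoted in the abstract
    bounds=(0.0, 0.0, 20000.0, 20000.0),
)
print(negatives)
```

Keeping negatives outside the buffers reduces the chance of labelling an unrecorded hazard-prone location as a non-hazard, which is consistent with the abstract's report that threshold-based datasets gave better models.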
Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emission estimation, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R^(2) of 0.906 and an RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and less economically developed counties displaying low-emission clustering. Our findings show that NTL data combined with the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for using satellite remote sensing imagery in carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
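The six metrics named in this abstract can be computed from observed and predicted values as in the sketch below. The formulas are the commonly used definitions and are assumptions here, since the abstract does not state them: IOS is taken as RMSE divided by the mean observed value, RSR as RMSE divided by the (population) standard deviation of the observations, and VAF as the percentage of observed variance not left in the residuals. The data are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE, IOS, RSR and VAF (%) under the assumed definitions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)                # sum of squared errors
    sst = sum((t - mean_y) ** 2 for t in y_true)    # total sum of squares
    rmse = math.sqrt(sse / n)
    mean_e = sum(errors) / n
    var_e = sum((e - mean_e) ** 2 for e in errors) / n
    var_y = sst / n
    return {
        "R2": 1.0 - sse / sst,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "IOS": rmse / mean_y,
        "RSR": rmse / math.sqrt(var_y),
        "VAF": (1.0 - var_e / var_y) * 100.0,
    }


# Hypothetical observed and predicted bearing capacity factors.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
m = regression_metrics(y_true, y_pred)
print(m)
```

Under these definitions R² equals 1 - RSR², so the two metrics carry the same information; an R² reported as a percentage, as in the abstract, corresponds to this value multiplied by 100.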
Pharmaceutical pollution is becoming an increasing threat to aquatic environments because inactive compounds do not break down and drug products accumulate in living organisms. The ability of a drug to dissolve in water (expressed as LogS) is an important parameter for assessing a drug's environmental fate, bioavailability, and toxicity. LogS is typically measured in a laboratory setting, which can be costly and time-consuming and does not allow large-scale analyses. This research develops and evaluates machine learning models that can produce LogS estimates and may improve the environmental risk assessment of toxic pharmaceutical pollutants. We used a dataset from the ChEMBL database that contained 8832 molecular compounds. Data preprocessing and cleaning techniques were applied (e.g., removing missing values), after which the chemical properties were normalized and feature selection techniques were applied. We evaluated LogS prediction with several machine learning and deep learning models, including linear regression, random forests (RF), support vector machines (SVM), gradient boosting (GBM), and artificial neural networks (ANNs). We assessed model performance using root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R^(2)). The findings show that the Least Angle Regression (LAR) model performed best, with an R^(2) value close to 1.0000, indicating high predictive accuracy. The OMP model achieved good accuracy (R^(2) = 0.8727) while remaining computationally cheap, whereas other models (e.g., neural networks, random forests) performed well but were too computationally expensive. Finally, an error analysis conducted to assess the robustness of the results indicated that residuals were evenly distributed around zero, supporting the results from the LAR model. The current research illustrates the potential of AI in anticipating drug solubility, providing support for green pharmaceutical design and environmental risk assessment. Future work should extend predictions to include degradation and toxicity to enhance predictive power and applicability.
The presence of aluminum (Al^(3+)) and fluoride (F^(−)) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an innovative approach is presented that leverages the power of machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of aluminum (Al^(3+)) and fluoride (F^(−)) ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, with fluorescence enhancement upon interaction with Al^(3+) ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F^(−) ions, fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. Subsequently, the fingerprint data is subjected to cluster analysis using the K-means model from machine learning, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on the principal component analysis method is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. This groundbreaking model not only showcases exceptional performance but also addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding our ecosystems and public health.
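The average Silhouette Coefficient used here to judge the clustering can be illustrated with a small standard-library sketch. It computes only the silhouette score for a given labelling and does not reimplement K-means; the points and labels are hypothetical and stand in for the fluorescence fingerprint features, whose actual form the abstract does not describe.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance (needs at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 2-D features forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # sensible labelling
bad = silhouette_score(points, [0, 1, 0, 1])   # deliberately poor labelling
print(good, bad)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.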
Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small-sized liposomes, predominantly through experimental data analysis. However, the production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experimental validation was conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interrelating in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
To better understand the migration behavior of plastic fragments in the environment, rapid non-destructive methods for in-situ identification and characterization of plastic fragments are needed. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), thus underestimating their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boosting, support vector machine, and random forest classifiers. The effects of polymer color, type, thickness, and background on plastic fragment classification were evaluated. PLS-DA presented the best and most stable outcome, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method was proposed, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background. The method presented an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and synchronous identification of colored and colorless plastic fragments under complex environmental backgrounds.
Funding: Financially supported by the National Natural Science Foundation of China (No. 12374405), the Provincial Science Foundation for Distinguished Young Scholars of Fujian (No. 2024J010024), the Natural Science Foundation of Fujian Province of China (No. 2023J011267), and the Major Research Projects for Young and Middle-aged Researchers of Fujian Provincial Health Commission (No. 2021ZQNZD010).
Abstract: Nasopharyngeal carcinoma (NPC) is a malignant tumor prevalent in southern China and Southeast Asia, where its early detection is crucial for improving patient prognosis and reducing mortality rates. However, existing screening methods suffer from limitations in accuracy and accessibility, hindering their application in large-scale population screening. In this work, a surface-enhanced Raman spectroscopy (SERS)-based method was established to explore the profiles of different stratified components in saliva from NPC and healthy subjects after fractionation processing. The findings indicate that all fractionated samples exhibit disease-associated molecular signaling differences, with the small-molecule fraction (molecular weight cut-off value of 10 kDa) demonstrating superior classification capability: a sensitivity of 90.5%, a specificity of 75.6%, and an area under the receiver operating characteristic (ROC) curve of 0.925 ± 0.031. The primary objective of this study was to qualitatively explore patterns in saliva composition across groups. The proposed SERS detection strategy for fractionated saliva offers novel insights for enhancing the sensitivity and reliability of noninvasive NPC screening, laying the foundation for translational application in large-scale clinical settings.
Funding: Project supported in part by Beijing Natural Science Foundation (Grant No. 1232025), Peng Huanwu Visiting Professor Program, and Academy for Multidisciplinary Studies, Capital Normal University.
Abstract: Replicating the chaotic characteristics inherent in nonlinear dynamical systems via machine learning (ML) is a key challenge in this rapidly advancing interdisciplinary field. In this work, we explore the potential of variational quantum circuits (VQC) for learning the stochastic properties of classical nonlinear dynamical systems. Specifically, we focus on the one- and two-dimensional logistic maps, which, while simple, remain under-explored in the context of learning dynamical characteristics. Our findings reveal that, even for such simple dynamical systems, accurately replicating long-term characteristics is hindered by a pronounced sensitivity to overfitting. While increasing the parameter complexity of the ML model typically enhances short-term prediction accuracy, it also degrades the model's ability to replicate long-term characteristics, primarily due to the detrimental effects of overfitting on generalization power. By comparing the VQC with two widely recognized classical ML techniques, namely long short-term memory (LSTM) networks for time-series processing and reservoir computing, we demonstrate that VQC outperforms these methods in terms of replicating long-term characteristics. Our results suggest that, for the ML of dynamics, more compact and efficient models (such as VQC) are needed rather than more complicated and large-scale ones.
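For readers unfamiliar with the system named in this abstract, the following is a minimal sketch of the one-dimensional logistic map, x_{n+1} = r·x_n·(1 − x_n), and of the sensitivity to initial conditions that makes long-term replication hard. The parameter value r = 4, the starting point, and the function name `logistic_map` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the VQC, LSTM, or reservoir models.

```python
def logistic_map(x0, r=4.0, steps=50):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two trajectories whose starting points differ by one part in a billion
# (r = 4 and x0 = 0.2 are assumed values chosen for illustration).
a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-9)

# Pointwise separation between the two trajectories at each step.
gap = [abs(u - v) for u, v in zip(a, b)]
print(f"initial gap: {gap[0]:.1e}, largest gap: {max(gap):.3f}")
```

Because nearby trajectories separate quickly, a learned model is usually judged on long-term statistics of the orbit rather than on pointwise agreement far into the future.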
Funding: Supported by the National Natural Science Foundation of China (grant numbers 52250357 and 52203003).
Abstract: The glass transition temperature (T_g) of styrene-butadiene rubber (SBR) is a key parameter determining its low-temperature flexibility and processing performance. Accurate prediction of T_g is crucial for material design and application optimisation. Addressing the limitations of traditional experimental measurements and theoretical models in terms of efficiency, cost, and accuracy, this study proposes a machine learning prediction framework that integrates multi-model ensemble and Bayesian optimization by constructing a multi-component feature dataset and an algorithm optimization strategy. Based on the constructed high-quality dataset containing 96 SBR samples, nine machine learning models were employed to predict the T_g of SBR and their prediction performance was compared. Ultimately, a GPR-XGBoost mixed model was constructed through model ensemble, achieving high-precision prediction with R² values greater than 0.9 on both the training and test sets. Further feature attribution and local effect analysis were conducted using feature analysis methods such as SHAP and ALE, revealing the nonlinear influence patterns of various components on T_g and providing a theoretical basis for SBR formulation design and T_g regulation. The machine learning prediction framework established in this study combines high-precision prediction with interpretability, significantly enhancing the prediction performance for the T_g of SBR. It offers an efficient tool for SBR molecular design and holds great potential for promotion and application.
Funding: Financially supported by the National Key Research and Development Project of China (No. 2022YFB3806900).
Abstract: The complex interactions and conflicting performance demands in multi-component composites pose significant challenges for achieving balanced multi-property optimization through conventional trial-and-error approaches. Machine learning (ML) offers a promising solution, markedly improving materials discovery efficiency. However, the high dimensionality of feature spaces in such systems has long impeded effective ML-driven feature representation and inverse design. To overcome this, we present an Intelligent Screening System (ISS) framework to accelerate the discovery of optimal formulations balancing four key properties in 15-component PTFE-based copper-clad laminate composites (PTFE-CCLCs). ISS adopts modular descriptors based on the physical information of component volume fractions, thereby simplifying the feature representation. By leveraging the inverse prediction capability of ML models and constructing a performance-driven virtual candidate database, ISS significantly reduced the computational complexity associated with high-dimensional spaces. Experimental validation confirmed that ISS-optimized formulations exhibited superior synergy, notably resolving the trade-off between thermal conductivity and peel strength, and outperformed many commercial counterparts. Despite limited data and inherent process variability, ISS achieved an average prediction accuracy of 76.5%, with thermal conductivity predictions exceeding 90%, demonstrating robust reliability. This work provides an innovative, efficient strategy for multifunctional optimization and accelerated discovery in ultra-complex composite systems, highlighting the integration of ML and advanced materials design.
Funding: Support from the National Key Research and Development Program of China (No. 2024YFB3713705) is acknowledged. Wangzhong Mu would like to acknowledge the Strategic Mobility, Sweden (SSF, No. SM22-0039), the Swedish Foundation for International Cooperation in Research and Higher Education (STINT, No. IB2022-9228), and the Jernkontoret (Sweden) for supporting this clean steel research. Gonghao Lian would like to acknowledge China Scholarship Council (CSC, No. 202306080032).
Abstract: The detection and characterization of non-metallic inclusions are essential for clean steel production. Recently, imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence (AI)-based machine learning (ML) has developed rapidly. This technique has achieved impressive results in the field of inclusion classification in process metallurgy. The present study surveys the ML modeling of inclusion prediction in advanced steels, including the detection, classification, and feature prediction of inclusions in different steel grades. Studies on clean steel with different features based on data and image analysis via ML are summarized. Regarding the data analysis, the inclusion prediction methodology based on ML establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters. Regarding the image analysis, the focus is placed on the classification of different types of inclusions via deep learning, in comparison with data analysis. Finally, further development of inclusion analyses using ML-based methods is recommended. This work paves the way for the application of AI-based methodologies for ultraclean-steel studies from a sustainable metallurgy perspective.
Funding: Supported by the National Natural Science Foundation of China (No. 52175414), the Natural Science Foundation of Jiangsu Province of China (No. BK20220134), the Fundamental Research Funds for the Central Universities, China (No. NE2023002), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (No. KYCX24_0559).
Abstract: During electrochemical machining (ECM), the passivation film formed on the surface of titanium alloy can lead to uneven dissolution and pitting. Solid particle erosion can effectively remove this passivation film. In this paper, the electrochemical dissolution behavior of Ti-6.5Al-2Zr-1Mo-1V (TA15) titanium alloy was investigated without particle impact and under low-angle (15°) and high-angle (90°) particle impact, and the influence of Al₂O₃ particles on ECM was systematically examined. Without particle erosion, the surface of the electrochemically processed titanium alloy showed serious pitting corrosion due to the influence of the passivation film, and the surface roughness (S_a) of the local area reached 10.088 μm. At a high impact angle (90°), strain hardening and particle embedding meant that only the edge of the surface was dissolved, while the central area remained almost undissolved, with the surface roughness (S_a) reaching 16.086 μm. In contrast, at a low impact angle (15°), the machining efficiency and surface quality of the material were significantly improved by the ploughing effect and galvanic corrosion, and the surface roughness (S_a) reached 2.823 μm. Based on these findings, an electrochemical dissolution model of TA15 titanium alloy under different particle erosion conditions was established.
Funding: National Key Research and Development Program of China, No. 2023YFC3006704; National Natural Science Foundation of China, No. 42171047; CAS-CSIRO Partnership Joint Project of 2024, No. 177GJHZ2023097MI.
Abstract: Accurate prediction of flood events is important for flood control and risk management. Machine learning techniques have contributed greatly to advances in flood prediction, and existing studies have mainly focused on predicting flood resource variables using single or hybrid machine learning techniques. However, class-based flood predictions, which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies, have rarely been investigated. This study proposed an approach for predicting flood regime metrics and event classes that couples machine learning algorithms with clustering-deduced membership degrees. Five algorithms were adopted for this exploration. Results showed that the class membership degrees accurately determined event classes, with class hit rates up to 100% compared with the four classes clustered from nine regime metrics. The nonlinear algorithms (Multiple Nonlinear Regression, Random Forest, and least squares-Support Vector Machine) outperformed the linear techniques (Multiple Linear Regression and Stepwise Regression) in predicting flood regime metrics. The proposed approach predicted flood event classes well, with average class hit rates of 66.0%-85.4% and 47.2%-76.0% in the calibration and validation periods, respectively, particularly for the slow and late flood events. The predictive capability of the proposed approach for flood regime metrics and classes was considerably stronger than that of the hydrological modeling approach.
Funding: Financial support from the National Natural Science Foundation of China (No. 52371103), the Fundamental Research Funds for the Central Universities, China (No. 2242023K40028), the Open Research Fund of Jiangsu Key Laboratory for Advanced Metallic Materials, China (No. AMM2023B01), the Research Fund of Shihezi Key Laboratory of Aluminum-Based Advanced Materials, China (No. 2023PT02), and the Guangdong Province Science and Technology Major Project, China (No. 2021B0301030005).
Abstract: Oxide dispersion strengthened (ODS) alloys are extensively used owing to the high thermostability and creep strength contributed by uniformly dispersed fine oxide particles. However, these strengthening particles also deteriorate processability, so it is important to establish accurate processing maps that guide thermomechanical processing and enhance formability. In this study, we used a particle swarm optimization-based back propagation artificial neural network model to predict the high-temperature flow behavior of 0.25 wt% Al₂O₃ particle-reinforced Cu alloys, and compared its accuracy with that of an Arrhenius-type constitutive model and a back propagation artificial neural network model. To train these models, we obtained the raw data by fabricating ODS Cu alloys using the internal oxidation and reduction method and conducting systematic hot compression tests between 400 and 800 °C at strain rates of 10⁻² to 10 s⁻¹. Finally, processing maps for ODS Cu alloys were proposed by combining processing parameters, mechanical behavior, microstructure characterization, and the modeling results, which achieved a coefficient of determination higher than 99%.
Abstract: Recent digital advancements have intensified the need for adaptive, data-driven, and socially centered learning ecosystems. This paper presents the formulation of a cross-platform, gamified, and personalized Learning Ecosystem, which integrates 3D/VR environments, machine learning algorithms, and business intelligence frameworks to enhance learner-centered education and inference-based decision-making. The Learning System makes use of immersive, analytically assessed virtual learning spaces, facilitating real-time monitoring not only of learning performance but also of overall engagement and behavioral patterns, via a comprehensive set of sustainability-oriented, ESG-aligned Key Performance Indicators (KPIs). Machine learning models support predictive analysis, personalized feedback, and hybrid recommendation mechanisms, while dedicated dashboards translate complex educational data into actionable insights for all Use Cases of the System (Educational Institutions, Educators, and Learners). Additionally, the Learning System introduces a structured Mentoring and Consulting Subsystem, reinforcing human-centered guidance alongside automated intelligence. The Platform's modular architecture and simulation-centered evaluation approach support personalized and continuously optimized learning pathways. It thus exemplifies a mature, adaptive Learning Ecosystem that combines immersive technologies, analytics, and pedagogical support, contributing to contemporary digital learning innovation and sociotechnical transformation in education.
Funding: Supported by the National Natural Science Foundation of China (32270887, 82272507, 32200654, 82430079, and 82472519), the National Key Research and Development Program of China (2022YFA1103202), the Chongqing High-End Medical Talents for Middle-aged and Young (YXGD202408), the Army Scientific and Technological Innovation Talents Prioritized Support Program (2023-124), the Natural Science Foundation of Chongqing (CSTB2023NSCQ-ZDJO008), the Postdoctoral Innovative Talent Support Program (BX20220397), the Open Project of State Key Laboratory of Trauma, Burns and Combined Injury (SFLKF202201), the Project for Enhancing Innovation of Army Medical University (2023XJS39), and the Talent Innovation Training Program at the Army Medical Center (ZXZYTSYS09).
Abstract: Background: Lumbar disc degeneration (LDD) displays considerable heterogeneity in terms of clinical features and pathological changes. However, researchers have not clearly determined whether the transcriptome variations in LDD could be used to identify or interpret the causes of heterogeneity in clinical features. This study aimed to identify the transcriptomic classification of degenerated discs in LDD patients and whether the molecular subtypes of LDD could be accurately predicted using clinical features. Methods: One hundred and twenty-two nucleus pulposus (NP) tissues from 108 patients were consecutively collected for bulk RNA sequencing (RNA-seq). An unsupervised clustering method was employed to analyze the bulk RNA matrix. Differential analysis was performed to characterize the transcriptional signatures and subtype-specific extracellular matrix (ECM) dysregulation. The cell subpopulation states of each subtype were inferred by integrating bulk and single-cell sequencing datasets. Transwell and dual-luciferase reporter gene assays were employed to investigate possible molecular mechanisms involved. Machine learning diagnostic prediction models were developed to correlate molecular classification with clinical features. Results: LDD was classified into 4 subtypes with distinct molecular signatures and ECM remodeling: C1 with collagenesis, C2 with ossification, C3 with low chondrogenesis, and C4 with fibrogenesis. Chond1-3 in C1 dominated disc collagenesis via the activation of the mechanosensors TRPV4 and PIEZO1; NP progenitor cells in C2 exhibited chondrogenic and osteogenic phenotypes; Chond1 in C3 was linked to a disrupted hypoxic microenvironment leading to reduced chondrogenesis; macrophages in C4 played a crucial role in disc fibrogenesis via the secretion of tumor necrosis factor-α (TNF-α). Furthermore, the random forest diagnostic prediction model showed robust performance [area under the receiver operating characteristic (ROC) curve: 0.9312; accuracy: 0.84] in stratifying the molecular subtypes of LDD based on 12 clinical features. Conclusions: Our study delineates 4 distinct molecular subtypes of LDD that can be accurately stratified on the basis of clinical features. The identification of these subtypes would facilitate precise diagnostics and guide the development of personalized treatment strategies for LDD.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2024YFB4205101), the National Natural Science Foundation of China (No. 62274098 and No. 62074084), the Natural Science Foundation of Tianjin (No. 22JCYBJC01300, No. 23JCYBJC01620 and No. 21JCYBJC00270), the Overseas Expertise Introduction Project for Discipline Innovation of Higher Education of China (Grant No. B16027), and the Fundamental Research Funds for the Central Universities, Nankai University (No. 63241568).
Abstract: Efficient surface passivation is critical for achieving high-performance perovskite solar cells (PSCs), yet the discovery of optimal passivators remains a time-consuming, trial-and-error process. Here, we report a synergistic machine learning (ML) and density functional theory (DFT) approach that enables predictive and rapid identification of effective passivation materials. By training an XGBoost model (91.3% accuracy) with DFT-derived molecular descriptors and activity calculations, we identify 2-(4-aminophenyl)-3H-benzimidazol-5-amine (APBIA) as a promising passivator. Experimental validation demonstrates that APBIA effectively removes surface impurities and passivates defects within perovskite films, leading to a significant increase in power conversion efficiency (PCE) from 22.48% to 25.55% (certified as 25.02%). This ML-DFT framework provides a generalizable pathway for accelerating the development of advanced functional materials for photovoltaic applications.
Abstract: Post-kidney transplant rejection is a critical factor influencing transplant success rates and the survival of transplanted organs. With the rapid advancement of artificial intelligence technologies, machine learning (ML) has emerged as a powerful data analysis tool, widely applied in the prediction, diagnosis, and mechanistic study of kidney transplant rejection. This mini-review systematically summarizes the recent applications of ML techniques in post-kidney transplant rejection, covering areas such as the construction of predictive models, identification of biomarkers, analysis of pathological images, assessment of immune cell infiltration, and formulation of personalized treatment strategies. By integrating multi-omics data and clinical information, ML has significantly enhanced the accuracy of early rejection diagnosis and the capability for prognostic evaluation, driving the development of precision medicine in the field of kidney transplantation. Furthermore, this article discusses the challenges faced in existing research and potential future directions, providing a theoretical basis and technical references for related studies.
Funding: Funded by the Directorate of Research and Community Service, Directorate General of Research and Development, Ministry of Higher Education, Science and Technology, in accordance with the Implementation Contract for the Operational Assistance Program for State Universities, Research Program Number: 109/C3/DT.05.00/PL/2025.
Abstract: Sudden wildfires cause significant global ecological damage. While satellite imagery has advanced early fire detection and mitigation, image-based systems face limitations including high false alarm rates, visual obstructions, and substantial computational demands, especially in complex forest terrains. To address these challenges, this study proposes a novel forest fire detection model utilizing audio classification and machine learning. We developed an audio-based pipeline using real-world environmental sound recordings. Sounds were converted into Mel-spectrograms and classified via a Convolutional Neural Network (CNN), enabling the capture of distinctive fire acoustic signatures (e.g., crackling, roaring) that are minimally impacted by visual or weather conditions. Internet of Things (IoT) sound sensors were crucial for generating complex environmental parameters to optimize feature extraction. The CNN model achieved high performance in stratified 5-fold cross-validation (accuracy 92.4% ± 1.6, F1-score 91.2% ± 1.8) and on test data (94.93% accuracy, 93.04% F1-score), with 98.44% precision and 88.32% recall, demonstrating reliability across environmental conditions. These results indicate that the audio-based approach not only improves detection reliability but also markedly reduces computational overhead compared to traditional image-based methods. The findings suggest that acoustic sensing integrated with machine learning offers a powerful, low-cost, and efficient solution for real-time forest fire monitoring in complex, dynamic environments.
Funding: Supported by a project entitled "Loess Plateau Region-Watershed-Slope Geological Hazard Multi-Scale Collaborative Intelligent Early Warning System" of the National Key R&D Program of China (2022YFC3003404), a project of the Shaanxi Youth Science and Technology Star (2021KJXX-87), and public welfare geological survey projects of Shaanxi Institute of Geologic Survey (20180301, 201918, 202103, and 202413).
Abstract: This study developed a modeling methodology for statistical optimization-based geologic hazard susceptibility assessment, aiming to enhance the comprehensive performance and classification accuracy of the assessment models. First, the cumulative probability method revealed that a low probability (15%) of geologic hazards between any two geologic hazard points occurred outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). The training dataset was established, consisting of negative samples (non-hazard points) randomly generated based on the distance threshold, positive samples (i.e., historical hazards), and 13 conditioning factors. Then, models were built using five machine learning algorithms, namely random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The comprehensive performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators, revealing that RF exhibited the best performance, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed by considering the distance threshold outperformed those built using the unoptimized dataset. The characteristic factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, the geologic hazard susceptibility classification was assessed using the natural breaks method combined with a clustering algorithm. The results indicate that the clustering algorithm exhibited higher classification accuracy than the natural breaks method. The findings of this study demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
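The negative-sample step described in this abstract can be sketched as rejection sampling: draw random points in the study area and keep only those at least the distance threshold away from every known hazard point. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `sample_negatives`, the rectangular study area, the planar coordinates in metres, and the two example hazard locations are all hypothetical; only the 2297 m threshold comes from the abstract.

```python
import math
import random


def sample_negatives(hazards, bounds, threshold_m, n, seed=0, max_tries=100_000):
    """Draw up to n random points lying at least threshold_m from every hazard.

    hazards: list of (x, y) positions in metres (planar coordinates assumed).
    bounds:  (xmin, ymin, xmax, ymax) of a rectangular study area (assumed).
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    tries = 0
    while len(negatives) < n and tries < max_tries:
        tries += 1
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject candidates that fall inside any hazard's buffer zone.
        if all(math.dist(p, h) >= threshold_m for h in hazards):
            negatives.append(p)
    return negatives


# Hypothetical hazard points in a 20 km x 20 km area; 2297 m is the
# distance threshold reported in the abstract.
hazards = [(5000.0, 5000.0), (12000.0, 8000.0)]
negatives = sample_negatives(hazards, (0.0, 0.0, 20000.0, 20000.0), 2297.0, n=20)
print(len(negatives), "negative samples drawn")
```

The `max_tries` cap keeps the loop from running forever when the buffers cover most of the study area; in that case fewer than `n` points are returned.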
Funding: Supported by the Key Research and Development Program in Shaanxi Province, China (No. 2022ZDLSF07-05) and the Fundamental Research Funds for the Central Universities, CHD (No. 300102352901).
Abstract: Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emissions, a Bayesian-optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R² of 0.906 and an RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to a contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and economically less developed counties displaying low-emission clustering. Our findings show that the use of NTL data and the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for satellite remote sensing image data to serve carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.
Abstract: Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
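The six metrics listed in this abstract can be computed as in the sketch below. The abstract does not give formulas, so the definitions used here are commonly used ones and should be read as assumptions: IOS as RMSE divided by the mean of the observed values, RSR as RMSE divided by the population standard deviation of the observed values, and VAF as (1 − Var(residuals)/Var(observed)) × 100. The function name `regression_metrics` and the four sample values are illustrative only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE, IOS, RSR and VAF under the assumed definitions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_e = sum(residuals) / n
    var_e = sum((e - mean_e) ** 2 for e in residuals) / n
    var_y = ss_tot / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in residuals) / n,
        "RMSE": rmse,
        "IOS": rmse / mean_y,              # assumed: RMSE over mean observation
        "RSR": rmse / math.sqrt(var_y),    # assumed: RMSE over std of observations
        "VAF": (1.0 - var_e / var_y) * 100.0,
    }


# Illustrative values only, not data from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
m = regression_metrics(y_true, y_pred)
print({k: round(v, 4) for k, v in m.items()})
```

With these definitions RSR² equals 1 − R², so the two metrics carry the same information on a given dataset; VAF differs from R² × 100 only when the residuals have a non-zero mean.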
Abstract: Pharmaceutical pollution is an increasing threat to aquatic environments because inactive compounds do not break down and drug products accumulate in living organisms. The ability of a drug to dissolve in water (expressed as LogS) is an important parameter for assessing a drug's environmental fate, bioavailability, and toxicity. LogS is typically measured in a laboratory setting, which can be costly and time-consuming and does not allow large-scale analyses. This research develops and evaluates machine learning models that can produce LogS estimates and may improve the environmental risk assessment of toxic pharmaceutical pollutants. We used a dataset from the ChEMBL database that contained 8832 molecular compounds. Data preprocessing and cleaning were applied (such as removing missing values), after which the chemical properties were normalized and feature selection techniques were applied. We estimated LogS with several machine learning and deep learning models, including linear regression, random forests (RF), support vector machines (SVM), gradient boosting (GBM), and artificial neural networks (ANNs). We assessed model performance using root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R^(2)). The findings show that the Least Angle Regression (LAR) model performed best, with an R^(2) value close to 1.0000, indicating high predictive accuracy. The OMP model achieved good accuracy (R^(2) = 0.8727) at low computational cost, whereas other models (e.g., neural networks, random forests) performed well but were too computationally expensive. Finally, an error analysis carried out to assess the robustness of the results indicated that residuals were evenly distributed around zero, supporting the results from the LAR model. The current research illustrates the potential of AI in anticipating drug solubility, providing support for green pharmaceutical design and environmental risk assessment. Future work should extend predictions to include degradation and toxicity to enhance predictive power and applicability.
Funding: Supported by the National Natural Science Foundation of China (No. U21A20290), the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011656), the Projects of Talents Recruitment of GDUPT (No. 2023rcyj1003), the 2022 “Sail Plan” Project of Maoming Green Chemical Industry Research Institute (No. MMGCIRI2022YFJH-Y-024), and the Maoming Science and Technology Project (No. 2023382).
Abstract: The presence of aluminum (Al^(3+)) and fluoride (F^(−)) ions in the environment can be harmful to ecosystems and human health, highlighting the need for accurate and efficient monitoring. In this paper, an approach is presented that uses machine learning to enhance the accuracy and efficiency of fluorescence-based detection for sequential quantitative analysis of Al^(3+) and F^(−) ions in aqueous solutions. The proposed method involves the synthesis of sulfur-functionalized carbon dots (C-dots) as fluorescence probes, whose fluorescence is enhanced upon interaction with Al^(3+) ions, achieving a detection limit of 4.2 nmol/L. Subsequently, in the presence of F^(−) ions, the fluorescence is quenched, with a detection limit of 47.6 nmol/L. The fingerprints of the fluorescence images are extracted using a cross-platform computer vision library in Python, followed by data preprocessing. The fingerprint data are then subjected to cluster analysis using the K-means model, and the average Silhouette Coefficient indicates excellent model performance. Finally, a regression analysis based on principal component analysis is employed to achieve more precise quantitative analysis of aluminum and fluoride ions. The results demonstrate that the developed model excels in terms of accuracy and sensitivity. The model also addresses the urgent need for effective environmental monitoring and risk assessment, making it a valuable tool for safeguarding ecosystems and public health.
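To make the clustering quality check mentioned in the abstract above concrete, here is a minimal sketch of the average Silhouette Coefficient computed from already-assigned cluster labels. It is not the authors' code: the function name `mean_silhouette`, the use of Euclidean distance, and the convention of scoring a single-member cluster as 0 are assumptions for illustration.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette coefficient for points with given cluster labels.

    For each point: a = mean distance to the other members of its own
    cluster, b = smallest mean distance to the members of any other
    cluster, score = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for single-member clusters
            continue
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.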
Funding: Supported by the National Key Research and Development Plan of the Ministry of Science and Technology, China (Grant No.: 2022YFE0125300), the National Natural Science Foundation of China (Grant No.: 81690262), the National Science and Technology Major Project, China (Grant No.: 2017ZX09201004-021), the Open Project of National Facility for Translational Medicine (Shanghai), China (Grant No.: TMSK-2021-104), and the Shanghai Jiao Tong University STAR Grant, China (Grant Nos.: YG2022ZD024 and YG2022QN111).
Abstract: Liposomes serve as critical carriers for drugs and vaccines, with their biological effects influenced by their size. The microfluidic method, renowned for its precise control, reproducibility, and scalability, has been widely employed for liposome preparation. Although some studies have explored factors affecting liposomal size in microfluidic processes, most focus on small liposomes, predominantly through experimental data analysis. The production of larger liposomes, which are equally significant, remains underexplored. In this work, we thoroughly investigate multiple variables influencing liposome size during microfluidic preparation and develop a machine learning (ML) model capable of accurately predicting liposomal size. Experiments were conducted using a staggered herringbone micromixer (SHM) chip. Our findings reveal that most investigated variables significantly influence liposomal size, often interacting in complex ways. We evaluated the predictive performance of several widely used ML algorithms, including ensemble methods, through cross-validation (CV) for both liposome size and polydispersity index (PDI). A standalone dataset was experimentally validated to assess the accuracy of the ML predictions, with results indicating that ensemble algorithms provided the most reliable predictions. Specifically, gradient boosting was selected for size prediction, while random forest was employed for PDI prediction. We successfully produced uniform large (600 nm) and small (100 nm) liposomes using the optimised experimental conditions derived from the ML models. In conclusion, this study presents a robust methodology that enables precise control over liposome size distribution, offering valuable insights for medicinal research applications.
Funding: Supported by the National Natural Science Foundation of China (No. 22276139) and the Shanghai Municipal State-owned Assets Supervision and Administration Commission (No. 2022028).
Abstract: To better understand the migration behavior of plastic fragments in the environment, rapid non-destructive methods for in-situ identification and characterization of plastic fragments are needed. However, most studies have focused only on colored plastic fragments, ignoring colorless plastic fragments and the effects of different environmental media (backgrounds), and have thus underestimated their abundance. To address this issue, the present study used near-infrared spectroscopy to compare the identification of colored and colorless plastic fragments based on partial least squares-discriminant analysis (PLS-DA), extreme gradient boosting, support vector machine, and random forest classifiers. The effects of polymer color, type, thickness, and background on plastic fragment classification were evaluated. PLS-DA gave the best and most stable outcome, with higher robustness and a lower misclassification rate. All models frequently confused colorless plastic fragments with their background when the fragment thickness was less than 0.1 mm. A two-stage modeling method was proposed, which first distinguishes the plastic types and then identifies colorless plastic fragments that had been misclassified as background. The method achieved an accuracy higher than 99% in different backgrounds. In summary, this study developed a novel method for rapid and synchronous identification of colored and colorless plastic fragments under complex environmental backgrounds.
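The two-stage idea described in the abstract above can be sketched as a small routing function. This is only an illustration of the control flow under stated assumptions: `stage_one` and `stage_two` are hypothetical callables standing in for trained classifiers, and the `"background"` label name is invented here, not taken from the study.

```python
def two_stage_predict(samples, stage_one, stage_two, background_label="background"):
    """Classify samples with a first model, then re-examine 'background' calls.

    stage_one: hypothetical classifier returning a polymer type or the
               background label for one sample.
    stage_two: hypothetical classifier applied only to samples the first
               stage called background, to recover colorless fragments.
    """
    predictions = []
    for sample in samples:
        label = stage_one(sample)
        if label == background_label:
            # Second pass: the sample may be a thin colorless fragment
            # that the first model confused with its background.
            label = stage_two(sample)
        predictions.append(label)
    return predictions
```

Keeping the second model restricted to first-stage background calls means it only has to separate colorless fragments from true background, rather than relearn all polymer types.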