BACKGROUND Microvascular invasion (MVI) is an important prognostic factor in hepatocellular carcinoma (HCC), but its preoperative prediction remains challenging. AIM To develop and validate a 2.5-dimensional (2.5D) deep learning-based multi-instance learning (MIL) model (MIL signature) for predicting MVI in HCC, to compare its performance against the radiomics signature and clinical signature, and to assess its prognostic predictive value in both surgical resection and transcatheter arterial chemoembolization (TACE) cohorts. METHODS A retrospective cohort of 192 patients with pathologically confirmed HCC was included, of whom 68 were MVI-positive and 124 were MVI-negative. The patients were randomly assigned to a training set (134 patients) and a validation set (58 patients) in a 7:3 ratio. An additional 45 HCC patients undergoing TACE treatment were included in the TACE validation cohort. A modeling strategy based on computed tomography arterial-phase images was implemented, combining 2.5D deep learning with a MIL framework to predict MVI in HCC. This method was compared with the radiomics and clinical signatures. The predictive performance of the models was evaluated using receiver operating characteristic curves and decision curve analysis (DCA), with DeLong's test applied to compare the area under the curve (AUC) between models. Kaplan-Meier curves were used to analyze differences in recurrence-free survival (RFS) or progression-free survival (PFS) among HCC treatment cohorts stratified by MIL signature risk. RESULTS The MIL signature demonstrated superior performance in the validation set (AUC = 0.877), significantly surpassing the radiomics signature (AUC = 0.727, P = 0.047) and the clinical signature (AUC = 0.631, P = 0.004). DCA curves indicated that the MIL signature provided a greater clinical net benefit across the full spectrum of risk thresholds. In the prognostic analysis, high- and low-risk groups stratified by the MIL signature exhibited significant differences in RFS within the surgical resection cohort (training set P = 0.0058, validation set P = 0.031) and in PFS within the TACE treatment cohort (P = 0.045). CONCLUSION The MIL signature predicts MVI in HCC more accurately than the radiomics and clinical signatures and offers precise prognostic stratification, thereby providing new technical support for personalized HCC treatment strategies.
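The RFS and PFS comparisons in this abstract rest on Kaplan-Meier curves. As a rough illustration only, the sketch below computes the Kaplan-Meier survival estimate for one risk group in plain Python. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate for one group.

    times  : follow-up durations (any consistent unit)
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events observed exactly at time t
        leaving = 0    # subjects leaving the risk set at time t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months) for one risk group.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: S(t)={s:.2f}")
```

In practice the curves of the high- and low-risk groups would be estimated separately and compared with a log-rank test, which this sketch does not implement.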
Sinter is the core raw material for blast furnaces. Flue pressure, an important state parameter, affects sinter quality. In this paper, flue pressure prediction and optimization were studied based on Shapley additive explanations (SHAP) in order to predict the flue pressure and take targeted adjustment measures. First, the sintering process data were collected and processed. A flue pressure prediction model was then constructed using SHAP + extremely randomized trees (ET), after comparing different feature selection methods and model algorithms. The prediction accuracy of the model within an error range of ±0.25 kPa was 92.63%. SHAP analysis was employed to improve the interpretability of the prediction model. The effects of various sintering operation parameters on flue pressure, the relationship between the numerical range of key operation parameters and flue pressure, the effect of operation parameter combinations on flue pressure, and the prediction process of the model for a single sample were analyzed. A flue pressure optimization module was also constructed and analyzed when the prediction satisfied the judgment conditions, and the operating parameter combination was then pushed. The flue pressure was increased by 5.87% during the verification process, achieving a good optimization effect.
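The reported accuracy is a hit rate: the share of samples whose absolute prediction error falls within ±0.25 kPa. A minimal sketch of that metric follows, assuming the tolerance is inclusive. The function name and the pressure values are hypothetical illustrations, not data from the paper.

```python
def accuracy_within_tolerance(y_true, y_pred, tol=0.25):
    """Fraction of predictions whose absolute error is at most `tol`."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Hypothetical flue pressures in kPa (measured vs. predicted).
measured = [-15.0, -15.5, -16.0, -14.8]
predicted = [-15.1, -15.9, -16.2, -14.8]
print(accuracy_within_tolerance(measured, predicted))  # 3 of 4 within tolerance
```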
Photovoltaic (PV) modules, as essential components of solar power generation systems, significantly influence unit power generation costs. The service life of these modules directly affects these costs. Over time, the performance of PV modules gradually declines due to internal degradation and external environmental factors. This cumulative degradation impacts the overall reliability of photovoltaic power generation. This study addresses the complex degradation process of PV modules by developing a two-stage Wiener process model. This approach accounts for the distinct phases of degradation resulting from module aging and environmental influences. A power degradation model based on the two-stage Wiener process is constructed to describe individual differences in module degradation processes. To estimate the model parameters, a combination of the Expectation-Maximization (EM) algorithm and the Bayesian method is employed. Furthermore, the Schwarz Information Criterion (SIC) is utilized to identify critical change points in PV module degradation trajectories. To validate the universality and effectiveness of the proposed method, a comparative analysis is conducted against other established life prediction techniques for PV modules.
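A two-stage Wiener process can be pictured as a drifting Brownian path whose drift and diffusion change at a change point. The sketch below simulates such a path in plain Python. It is only an illustration under assumed parameters: the function name, the parameter values, and the fixed, known change point are hypothetical, and the EM/Bayesian estimation and SIC-based change-point detection described in the abstract are not implemented here.

```python
import random


def simulate_two_stage_wiener(n_steps, dt, change_step,
                              mu1, sigma1, mu2, sigma2, seed=None):
    """Simulate cumulative degradation X(t) with a change point.

    Before `change_step` the increments follow N(mu1*dt, sigma1^2*dt);
    from `change_step` onward they follow N(mu2*dt, sigma2^2*dt).
    Returns the path as a list of n_steps + 1 values starting at 0.
    """
    rng = random.Random(seed)
    path = [0.0]
    for k in range(n_steps):
        mu, sigma = (mu1, sigma1) if k < change_step else (mu2, sigma2)
        increment = mu * dt + sigma * rng.gauss(0.0, 1.0) * dt ** 0.5
        path.append(path[-1] + increment)
    return path


# Hypothetical parameters: slow early degradation, faster later degradation.
path = simulate_two_stage_wiener(n_steps=10, dt=1.0, change_step=6,
                                 mu1=0.2, sigma1=0.05,
                                 mu2=0.6, sigma2=0.10, seed=42)
print([round(x, 3) for x in path])
```

Setting both diffusion coefficients to zero reduces the path to a piecewise-linear trend, which is a convenient sanity check for the change-point logic.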
Objective: To compare the clinical efficacy of mifepristone-misoprostol medical management versus surgical curettage for first-trimester missed miscarriage, and to establish evidence-based sonographic cutoff values predictive of incomplete abortion requiring surgical intervention. Methods: We retrospectively analyzed a cohort of 702 women diagnosed with first-trimester missed miscarriage between January 2020 and May 2023. Demographic characteristics and ultrasound parameters were systematically recorded. Receiver operating characteristic (ROC) curve analysis was performed to establish optimal sonographic cutoff values for predicting incomplete abortion requiring surgical intervention. Results: A total of 146 patients received medical treatment (mifepristone and misoprostol) and 556 underwent surgical curettage. At the 1-month follow-up, the medical group showed significantly greater endometrial thickness and longer postoperative bleeding duration than the surgical group (P < 0.05). The menstrual volume reduction rate (23.56%) was significantly lower in the medical group than in the surgical group. The incomplete abortion rate was higher in the medical group (17.12%, 25/146) than in the surgical group (2.88%, 16/556). In the medical group, 14 patients (9.59%) required curettage due to incomplete abortion, while 11 cases resolved spontaneously after prolonged medication. ROC curve analysis identified two cutoff values indicating the need for surgical intervention: endometrial thickness > 1.21 cm at 24 h post-medical abortion, and residual mass diameter > 0.95 cm at 7 days post-medical abortion. Conclusions: Medical management of first-trimester missed miscarriage using mifepristone-misoprostol demonstrates comparable efficacy to surgical curettage. An endometrial thickness > 1.21 cm at 24 h or a residual tissue diameter > 0.95 cm at 7 days post-medical abortion should prompt consideration of incomplete abortion.
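The abstract does not state how the optimal cutoff was chosen from the ROC curve. One common choice is the Youden index, J = sensitivity + specificity - 1, and the sketch below uses it purely as an assumption. The function name and the measurements are hypothetical and unrelated to the study data.

```python
def youden_cutoff(values, labels):
    """Pick the cutoff maximizing Youden's J for the rule `value > cutoff`.

    values : measurements (e.g. a thickness in cm)
    labels : 1 = outcome present, 0 = outcome absent
    Returns (cutoff, J). Candidate cutoffs are the observed values.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for c in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v > c and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v > c and y == 0)
        j = tp / pos - fp / neg  # sensitivity - (1 - specificity)
        if best is None or j > best[1]:
            best = (c, j)
    return best


# Hypothetical measurements and outcomes.
thickness = [0.6, 0.8, 0.9, 1.0, 1.1, 1.3, 1.4, 1.6]
needs_surgery = [0, 0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(thickness, needs_surgery))
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can yield a different cutoff from the same data.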
Accurate channel state information (CSI) is crucial for 6G wireless communication systems to accommodate the growing demands of mobile broadband services. In massive multiple-input multiple-output (MIMO) systems, traditional CSI feedback approaches face challenges such as performance degradation due to feedback delay and channel aging caused by user mobility. To address these issues, we propose a novel spatio-temporal predictive network (STPNet) that jointly integrates CSI feedback and prediction modules. STPNet employs stacked Inception modules to learn the spatial correlation and temporal evolution of CSI, capturing both local and global spatio-temporal features. In addition, a signal-to-noise ratio (SNR) adaptive module is designed to adapt flexibly to diverse feedback channel conditions. Simulation results demonstrate that STPNet outperforms existing channel prediction methods under various channel conditions.
Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. By integrating multi-view information into phenotypic prediction, a multi-view best linear unbiased prediction (MVBLUP) method is proposed in this paper. To measure the importance of the multiple data views, a differential evolution algorithm with an early stopping mechanism is used, yielding a multi-view kinship matrix that is then incorporated into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we perform empirical experiments on four multi-view datasets from different crops. Compared with the single-view method, the MVBLUP method improves prediction accuracy by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
Influenced by complex external factors, the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity. It is difficult for existing methods, including regression models and neural network models, to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics. This paper integrates the creep characteristics of landslides with non-linear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model (BG), Convolutional Neural Network (CNN), and Long Short-Term Memory network (LSTM). This prediction approach improves three different biological growth models, thereby effectively extracting landslide creep characteristic parameters. Simultaneously, it integrates external factors (rainfall and reservoir water level) to construct an internal and external comprehensive dataset for data augmentation, which is input into the improved CNN-LSTM model. Thereafter, harnessing the robust feature extraction capabilities and spatial translation invariance of CNN, the model autonomously captures short-term local fluctuation characteristics of landslide displacement, and combines them with LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance. An evaluation on the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy and excellent generalization capabilities when dealing with various types of landslides. The research provides an innovative approach to achieving whole-process, real-time, high-precision displacement prediction for multi-characteristic coupled landslides.
In this article, our nonlinear theory and technology for reducing the uncertainties of high-impact ocean-atmosphere event predictions, with the conditional nonlinear optimal perturbation (CNOP) method as its core, are reviewed, and the “spring predictability barrier” problem for El Niño-Southern Oscillation events and targeted observation issues for tropical cyclone forecasts are taken as two representative examples. Nonlinear theory reveals that initial errors of particular spatial structures, environmental conditions, and nonlinear processes contribute to significant prediction errors, whereas nonlinear technology provides a pioneering approach for reducing observational and forecast errors via targeted observations through the application of the CNOP method. Follow-up research further validates the scientific rigor of the theory in revealing the nonlinear mechanism of significant prediction errors, and relevant practical field campaigns for targeted observations verify the effectiveness of the technology in reducing prediction uncertainties. The CNOP method has achieved international recognition; furthermore, its applications have been extended to ensemble forecasts for weather and climate, further enriching the nonlinear technology for reducing prediction uncertainties. It is expected that this nonlinear theory and technology will play an important role in reducing prediction uncertainties for high-impact weather and climate events.
Stock price prediction is a typical complex time series prediction problem characterized by dynamics, nonlinearity, and complexity. This paper introduces a generative adversarial network model that incorporates an attention mechanism (GAN-LSTM-Attention) to improve the accuracy of stock price prediction. Firstly, the generator of this model combines a Long Short-Term Memory network (LSTM), an attention mechanism, and a fully-connected layer, focusing on generating the predicted stock price. The discriminator combines a Convolutional Neural Network (CNN) and a fully-connected layer to discriminate between real and generated stock prices. Secondly, to evaluate the practical application ability and generalization ability of the GAN-LSTM-Attention model, four representative stocks in the United States of America (USA) stock market, namely, Standard & Poor's 500 Index stock, Apple Incorporated stock, Advanced Micro Devices Incorporated stock, and Google Incorporated stock, were selected for prediction experiments, and the prediction performance was comprehensively evaluated using three evaluation metrics, namely, mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). Finally, the specific effects of the attention mechanism, convolutional layer, and fully-connected layer on the prediction performance of the model are systematically analyzed through an ablation study. The experimental results show that the GAN-LSTM-Attention model exhibits excellent performance and robustness in stock price prediction.
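The three evaluation metrics named in this abstract have standard definitions, shown in the short sketch below. The function name and the price series are hypothetical illustrations, not results from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical closing prices vs. model predictions.
actual = [2.0, 4.0, 6.0, 8.0]
forecast = [3.0, 3.0, 7.0, 7.0]
print(regression_metrics(actual, forecast))
```

MAE and RMSE are in the same units as the price, so they are not directly comparable across stocks at different price levels, whereas R² is unitless.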
Objective The Asia-Pacific region has a high chronic obstructive pulmonary disease (COPD) burden, but studies on its trends are limited. Using the Global Burden of Disease (GBD) 2019 data, we analyzed COPD trends in 36 countries and territories from 1990 to 2019 and predicted future incidence trends through 2034. Methods COPD data by age and sex from the GBD 2019 database were analyzed for incidence, prevalence, mortality, and disability-adjusted life years (DALY) rates from 1990 to 2019. Joinpoint regression identified significant annual trends, and age-standardized incidence rates were predicted through 2034 using age-period-cohort models. Results The incidence, prevalence, mortality, and disease burden of COPD have been decreasing, and the incidence rates will continue to decrease or remain stable until 2034 in most selected countries and territories, except for a few Southeast Asian countries. The Lao People's Democratic Republic and Vietnam are projected to experience an increase in COPD incidence from 165.3 per 100,000 in 2019 to 177 per 100,000 in 2034 and from 179.9 per 100,000 in 2019 to 192.5 per 100,000 in 2034, respectively. Older males had a higher incidence than any other sex or age group. The sex gap in incidence rates continues to widen, though it is smaller and less significant in the younger age group than in the older one. Conclusion COPD rates are expected to decline until 2034 but remain a health risk, especially in countries with rising rates. Urgent action on tobacco control, air pollution, and public education is needed.
BACKGROUND To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process. AIM To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction. METHODS A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an EXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP (package: Shapviz) algorithm was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups. RESULTS Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC = 0.8825) and external validation (AUC = 0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively. CONCLUSION Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model showed a degree of effectiveness in predicting TO before surgery. The SHAP algorithm provided intuitive visualization of the machine learning prediction process, enhancing its interpretability.
Personalized drug response prediction from molecular data is an important challenge in precision medicine for treating cancer. Computational methods have been widely explored and have become increasingly accurate in recent years. However, the clinical application of prediction methods is still in its infancy due to large discrepancies between preclinical models and patients. We present a novel disentangled synthesis transfer network (DiSyn) for drug response prediction, specifically designed for transfer learning from preclinical models to clinical patients. DiSyn uses a domain separation network (DSN) to disentangle drug response-related features, employs data synthesis technology to increase the sample size, and trains iteratively for better feature disentanglement. DiSyn is pretrained on large-scale unlabeled cancer samples and validated on three datasets, The Cancer Genome Atlas (TCGA), Investigation of Serial Studies to Predict Your Therapeutic Response With Imaging And moLecular Analysis 2 (I-SPY2), and Novartis Institutes for Biomedical Research Patient-Derived Xenograft Encyclopedia (NIBR PDXE), achieving competitive performance with state-of-the-art methods on cancer patients and mice. Furthermore, the application of DiSyn to thousands of breast cancer patients shows the heterogeneity in drug responses and demonstrates its potential value in biomarker discovery and drug combination prediction.
This study presents a machine learning-based method for predicting fragment velocity distribution in warhead fragmentation under explosive loading conditions. The fragment resultant velocities are correlated with key design parameters, including casing dimensions and detonation positions. The paper details the finite element analysis for fragmentation, the characterization of the dynamic hardening and fracture models, the generation of comprehensive datasets, and the training of the artificial neural network (ANN) model. The results show the influence of casing dimensions on fragment velocity distributions, with the trends indicating increased resultant velocity with reduced thickness and with increased length and diameter. The model's predictive capability is demonstrated through accurate predictions for both training and testing datasets, showing its potential for real-time prediction of fragmentation performance.
Harnessing solar power is essential for addressing the dual challenges of global warming and the depletion of traditional energy sources. However, the fluctuations and intermittency of photovoltaic (PV) power pose challenges for its extensive incorporation into power grids. Thus, enhancing the precision of PV power prediction is particularly important. Although existing studies have made progress in short-term prediction, issues persist, particularly in the underutilization of temporal features and the neglect of correlations between satellite cloud images and PV power data. These factors hinder improvements in PV power prediction performance. To overcome these challenges, this paper proposes a novel PV power prediction method based on multi-stage temporal feature learning. First, the improved LSTM and SA-ConvLSTM are employed to extract the temporal features of PV power and the spatial-temporal features of satellite cloud images, respectively. Subsequently, a novel hybrid attention mechanism is proposed to identify the interplay between the two modalities, enhancing the capacity to focus on the most relevant features. Finally, the Transformer model is applied to further capture the short-term temporal patterns and long-term dependencies within the multi-modal feature information. The paper also compares the proposed method with various competitive methods. The experimental results demonstrate that the proposed method outperforms the competitive methods in terms of accuracy and reliability in short-term PV power prediction.
Stroke, a major cerebrovascular disease, has high morbidity and mortality. Effective methods to reduce the risk and improve the prognosis are lacking. Uric acid (UA) has been associated with the pathological mechanism, prognosis, and therapy of stroke. UA plays pro/anti-oxidative and pro-inflammatory roles in vivo. The specific role of UA in stroke, which may have both neuroprotective and damaging effects, remains unclear. There is a U-shaped association between serum uric acid (SUA) levels and ischemic stroke (IS). UA therapy provides neuroprotection during reperfusion therapy for acute ischemic stroke (AIS). Urate-lowering therapy (ULT) plays a protective role in IS with hyperuricemia or gout. SUA levels are associated with the cerebrovascular injury mechanism, risk, and outcomes of hemorrhagic stroke. In this review, we summarize the current research on the role of UA in stroke, providing potential targets for its prediction and treatment.
The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and a multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
The negative logarithm of the acid dissociation constant (pKa) significantly influences the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of molecules and is a crucial indicator in drug research. Given the rapid and accurate characteristics of computational methods, their role in predicting drug properties is increasingly important. Although many pKa prediction models currently exist, they often focus on enhancing model precision while neglecting interpretability. In this study, we present GraFpKa, a pKa prediction model using graph neural networks (GNNs) and molecular fingerprints. The results show that our acidic and basic models achieved mean absolute errors (MAEs) of 0.621 and 0.402, respectively, on the test set, demonstrating good predictive performance. Notably, to improve interpretability, GraFpKa also incorporates Integrated Gradients (IGs), providing a clearer visual description of the atoms significantly affecting the pKa values. The high reliability and interpretability of GraFpKa ensure accurate pKa predictions while also facilitating a deeper understanding of the relationship between molecular structure and pKa values, making it a valuable tool in the field of pKa prediction.
Accurate prediction of drug-target interactions (DTIs) plays a pivotal role in drug discovery, facilitating optimization of lead compounds, drug repurposing, and elucidation of drug side effects. However, traditional DTI prediction methods are often limited by incomplete biological data and insufficient representation of protein features. In this study, we proposed KG-CNNDTI, a novel knowledge graph-enhanced framework for DTI prediction, which integrates heterogeneous biological information to improve model generalizability and predictive performance. The proposed model utilized protein embeddings derived from a biomedical knowledge graph via the Node2Vec algorithm, which were further enriched with contextualized sequence representations obtained from ProteinBERT. For compound representation, multiple molecular fingerprint schemes alongside the Uni-Mol pre-trained model were evaluated. The fused representations served as inputs to both classical machine learning models and a convolutional neural network-based predictor. Experimental evaluations across benchmark datasets demonstrated that KG-CNNDTI achieved superior performance compared to state-of-the-art methods, particularly in terms of Precision, Recall, F1-Score, and area under the precision-recall curve (AUPR). Ablation analysis highlighted the substantial contribution of knowledge graph-derived features. Moreover, KG-CNNDTI was employed for virtual screening of natural products against Alzheimer's disease, resulting in 40 candidate compounds. Of these, 5 were supported by literature evidence, among which 3 were further validated in in vitro assays.
Accurate prediction of molecular properties is crucial for selecting compounds with ideal properties and reducing the costs and risks of trials. Traditional methods based on manually crafted features and graph-based methods have shown promising results in molecular property prediction. However, traditional methods rely on expert knowledge and often fail to capture the complex structures and interactions within molecules. Similarly, graph-based methods typically overlook the chemical structure and function hidden in molecular motifs and struggle to effectively integrate global and local molecular information. To address these limitations, we propose a novel fingerprint-enhanced hierarchical graph neural network (FH-GNN) for molecular property prediction that simultaneously learns information from hierarchical molecular graphs and fingerprints. The FH-GNN captures diverse hierarchical chemical information by applying directed message-passing neural networks (D-MPNN) on a hierarchical molecular graph that integrates atomic-level, motif-level, and graph-level information along with their relationships. Additionally, we used an adaptive attention mechanism to balance the importance of hierarchical graphs and fingerprint features, creating a comprehensive molecular embedding that integrated hierarchical molecular structures with domain knowledge. Experiments on eight benchmark datasets from MoleculeNet showed that FH-GNN outperformed the baseline models in both classification and regression tasks for molecular property prediction, validating its capability to comprehensively capture molecular information. By integrating molecular structure and chemical knowledge, FH-GNN provides a powerful tool for the accurate prediction of molecular properties and aids in the discovery of potential drug candidates.
Landslide susceptibility prediction (LSP) is significantly affected by uncertainty in the selection of landslide-related conditioning factors. However, most of the literature only compares particular conditioning factor selection methods rather than studying this uncertainty systematically. To address this gap, this study systematically explores how various commonly used conditioning factor selection methods influence LSP and, on this basis, proposes a universally applicable principle for the optimal selection of conditioning factors. An'yuan County in southern China is taken as an example, with 431 landslides and 29 types of conditioning factors. Five commonly used factor selection methods, namely correlation analysis (CA), linear regression (LR), principal component analysis (PCA), rough set (RS), and artificial neural network (ANN), are applied to select the optimal factor combinations from the original 29 conditioning factors. The factor selection results are then used as inputs to four types of common machine learning models to construct 20 types of combined models, such as CA-multilayer perceptron and CA-random forest. Additionally, multifactor-based multilayer perceptron and random forest models, which select conditioning factors according to the proposed principle of “accurate data, rich types, clear significance, feasible operation and avoiding duplication”, are constructed for comparison. Finally, the LSP uncertainties are evaluated by accuracy, susceptibility index distribution, etc. Results show that: (1) multifactor-based models generally have higher LSP performance and lower uncertainties than factor selection-based models; (2) the choice of machine learning model influences LSP accuracy more than the choice of factor selection method. In conclusion, the commonly used conditioning factor selection methods above are not ideal for improving LSP performance and may complicate the LSP process. In contrast, a satisfactory combination of conditioning factors can be constructed according to the proposed principle.
Funding: Supported by the National Natural Science Foundation of China, No. 81560278; the “Summit Plan (New Departure)” Project for the Development of Doctoral Degree Authorization Points and Professional Disciplines at the Affiliated Hospital of Youjiang Medical University for Nationalities, No. DF20244433; the Self-funded Research Project by the Guangxi Health and Wellness Committee, No. ZL20240824 and No. Z-L20240834; and the Project to Enhance the Research Foundations of Young and Mid-career Faculty in Guangxi Universities, No. 2024KY0562 and No. 2024KY0559.
Abstract: BACKGROUND: Microvascular invasion (MVI) is an important prognostic factor in hepatocellular carcinoma (HCC), but its preoperative prediction remains challenging. AIM: To develop and validate a 2.5-dimensional (2.5D) deep learning-based multi-instance learning (MIL) model (MIL signature) for predicting MVI in HCC, to evaluate and compare its performance against the radiomics signature and clinical signature, and to assess its prognostic predictive value in both surgical resection and transcatheter arterial chemoembolization (TACE) cohorts. METHODS: A retrospective cohort of 192 patients with pathologically confirmed HCC was included, of whom 68 were MVI-positive and 124 were MVI-negative. The patients were randomly assigned to a training set (134 patients) and a validation set (58 patients) in a 7:3 ratio. An additional 45 HCC patients undergoing TACE treatment were included in the TACE validation cohort. A modeling strategy based on computed tomography arterial phase images was implemented, using 2.5D deep learning in combination with a MIL framework to predict MVI in HCC. This method was compared with the radiomics signature and clinical signature, and the predictive performance of the models was evaluated using receiver operating characteristic curves and decision curve analysis (DCA), with DeLong's test applied to compare the area under the curve (AUC) between models. Kaplan-Meier curves were used to analyze differences in recurrence-free survival (RFS) or progression-free survival (PFS) among different HCC treatment cohorts stratified by MIL signature risk. RESULTS: The MIL signature demonstrated superior performance in the validation set (AUC = 0.877), significantly surpassing the radiomics signature (AUC = 0.727, P = 0.047) and the clinical signature (AUC = 0.631, P = 0.004). DCA curves indicated that the MIL signature provided a greater clinical net benefit across the full spectrum of risk thresholds. In the prognostic analysis, high- and low-risk groups stratified by the MIL signature exhibited significant differences in RFS within the surgical resection cohort (training set P = 0.0058, validation set P = 0.031) and in PFS within the TACE treatment cohort (P = 0.045). CONCLUSION: The MIL signature predicts MVI in HCC more accurately than the radiomics signature and clinical signature and offers precise prognostic stratification, thereby providing new technical support for personalized HCC treatment strategies.
Funding: Supported by the General Program of the National Natural Science Foundation of China (No. 52274326), the China Baowu Low Carbon Metallurgy Innovation Foundation (No. BWLCF202109), and the Seventh Batch of Ten Thousand Talents Plan of China (No. ZX20220553).
Abstract: Sinter is the core raw material for blast furnaces. Flue pressure, an important state parameter, affects sinter quality. In this paper, flue pressure prediction and optimization were studied based on Shapley additive explanations (SHAP) in order to predict the flue pressure and take targeted adjustment measures. First, the sintering process data were collected and processed. After different feature selection methods and model algorithms were compared, a flue pressure prediction model was constructed using SHAP + extremely randomized trees (ET). The prediction accuracy of the model within the error range of ±0.25 kPa was 92.63%. SHAP analysis was employed to improve the interpretability of the prediction model. The effects of various sintering operation parameters on flue pressure, the relationship between the numerical range of key operation parameters and flue pressure, the effect of operation parameter combinations on flue pressure, and the prediction process of the model on a single sample were analyzed. A flue pressure optimization module was also constructed; when the prediction satisfied the judgment conditions, the module performed its analysis and then pushed an operating parameter combination. The flue pressure was increased by 5.87% during the verification process, achieving a good optimization effect.
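The accuracy figure quoted above is a hit rate: the share of predictions whose absolute error stays within ±0.25 kPa. A minimal sketch of that metric follows. The function name and the pressure values are invented for illustration and are not taken from the paper.

```python
def accuracy_within_tolerance(y_true, y_pred, tol=0.25):
    """Fraction of predictions whose absolute error is at most `tol` (kPa here)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Illustrative flue pressures in kPa (made-up values, not data from the study).
measured = [-15.0, -14.6, -15.3, -14.9]
predicted = [-15.1, -14.2, -15.3, -15.1]

# Three of the four predictions fall inside the tolerance band.
print(accuracy_within_tolerance(measured, predicted))  # 0.75
```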
Funding: Supported by the National Natural Science Foundation of China (51767017), the Basic Research Innovation Group Project of Gansu Province (18JR3RA133), and the Industrial Support and Guidance Project of Universities in Gansu Province (2022CYZC-22).
Abstract: Photovoltaic (PV) modules, as essential components of solar power generation systems, significantly influence unit power generation costs. The service life of these modules directly affects these costs. Over time, the performance of PV modules gradually declines due to internal degradation and external environmental factors. This cumulative degradation impacts the overall reliability of photovoltaic power generation. This study addresses the complex degradation process of PV modules by developing a two-stage Wiener process model. This approach accounts for the distinct phases of degradation resulting from module aging and environmental influences. A power degradation model based on the two-stage Wiener process is constructed to describe individual differences in module degradation processes. To estimate the model parameters, a combination of the Expectation-Maximization (EM) algorithm and the Bayesian method is employed. Furthermore, the Schwarz Information Criterion (SIC) is used to identify critical change points in PV module degradation trajectories. To validate the universality and effectiveness of the proposed method, a comparative analysis is conducted against other established life prediction techniques for PV modules.
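To make the two-stage idea concrete, the sketch below simulates a single degradation path whose drift and diffusion switch at a change point. It is only an illustration under stated assumptions: the function name, the fixed parameters, and the known change point are hypothetical, and the paper's unit-to-unit random effects, EM/Bayesian estimation, and SIC change-point detection are not reproduced here.

```python
import math
import random


def simulate_two_stage_wiener(n_steps, dt, change_step,
                              mu1, sigma1, mu2, sigma2, seed=0):
    """Simulate X(t) with drift/diffusion (mu1, sigma1) before `change_step`
    and (mu2, sigma2) from `change_step` onward. Returns n_steps + 1 values."""
    rng = random.Random(seed)
    path = [0.0]
    for k in range(n_steps):
        mu, sigma = (mu1, sigma1) if k < change_step else (mu2, sigma2)
        # Independent Gaussian increment with mean mu*dt and variance sigma^2*dt.
        increment = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + increment)
    return path


# With the noise switched off, the path is piecewise linear with a kink
# at the change point, which makes the two stages easy to see.
print(simulate_two_stage_wiener(4, 1.0, 2, 1.0, 0.0, 3.0, 0.0))
# [0.0, 1.0, 2.0, 5.0, 8.0]
```

With nonzero `sigma1` and `sigma2` the same function produces noisy paths, which is the setting in which a change point would have to be estimated rather than assumed.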
Funding: Supported by the National Natural Science Foundation of China (project approval number 82201825).
Abstract: Objective: To compare the clinical efficacy of mifepristone-misoprostol medical management versus surgical curettage for first-trimester missed miscarriage, and to establish evidence-based sonographic cutoff values predictive of incomplete abortion requiring surgical intervention. Methods: We retrospectively analyzed a cohort of 702 women diagnosed with first-trimester missed miscarriage between January 2020 and May 2023. Demographic characteristics and ultrasound parameters were systematically recorded. Receiver operating characteristic (ROC) curve analysis was performed to establish optimal sonographic cutoff values for predicting incomplete abortion requiring surgical intervention. Results: A total of 146 patients received medical treatment (mifepristone and misoprostol) and 556 underwent surgical curettage. At the 1-month follow-up, the medical group showed significantly greater endometrial thickness and longer postoperative bleeding duration than the surgical group (P < 0.05). The menstrual volume reduction rate (23.56%) was significantly lower in the medical group than in the surgical group. The incomplete abortion rate was higher in the medical group (17.12%, 25/146) than in the surgical group (2.88%, 16/556). In the medical group, 14 patients (9.59%) required curettage due to incomplete abortion, while 11 cases resolved spontaneously after prolonged medication. ROC curve analysis identified two cutoff values indicating the need for surgical intervention: endometrial thickness > 1.21 cm at 24 h post-medical abortion, and residual mass diameter > 0.95 cm at 7 days post-medical abortion. Conclusions: Medical management of first-trimester missed miscarriage using mifepristone-misoprostol demonstrates comparable efficacy to surgical curettage. An endometrial thickness > 1.21 cm at 24 h or a residual tissue diameter > 0.95 cm at 7 days post-medical abortion should prompt consideration of incomplete abortion.
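The abstract does not say how the ROC cutoffs were chosen. One common convention is to maximize Youden's J (sensitivity + specificity - 1), and the sketch below shows that convention as an assumption, not as the authors' procedure. The function name and the measurements are invented for illustration.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A case is called positive when its value is strictly greater than the cutoff.
    `labels` holds 1 for the event (e.g. incomplete abortion) and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both positive and negative cases")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v > cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v <= cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # keep the smallest cutoff among ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up endometrial thickness values in cm (not data from the study).
thickness = [0.8, 1.0, 1.2, 1.3, 1.5, 1.6]
incomplete = [0, 0, 0, 1, 1, 1]
print(youden_cutoff(thickness, incomplete))  # (1.2, 1.0)
```

On real data the classes overlap, so J stays below 1 and the chosen cutoff trades sensitivity against specificity.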
Funding: Supported in part by the Natural Science Foundation of China under Grant Nos. U2468201 and 62221001, and by the ZTE Industry-University-Institute Cooperation Funds under Grant No. IA20240420002.
Abstract: Accurate channel state information (CSI) is crucial for 6G wireless communication systems to accommodate the growing demands of mobile broadband services. In massive multiple-input multiple-output (MIMO) systems, traditional CSI feedback approaches face challenges such as performance degradation due to feedback delay and channel aging caused by user mobility. To address these issues, we propose a novel spatio-temporal predictive network (STPNet) that jointly integrates CSI feedback and prediction modules. STPNet employs stacked Inception modules to learn the spatial correlation and temporal evolution of CSI, capturing both local and global spatio-temporal features. In addition, a signal-to-noise ratio (SNR) adaptive module is designed to adapt flexibly to diverse feedback channel conditions. Simulation results demonstrate that STPNet outperforms existing channel prediction methods under various channel conditions.
Funding: Supported by the National Natural Science Foundation of China (32122066, 32201855) and STI2030-Major Projects (2023ZD04076).
Abstract: Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. This paper proposes a multi-view best linear unbiased prediction (MVBLUP) method that integrates multi-view information into phenotypic prediction. To measure the importance of the data views, a differential evolution algorithm with an early stopping mechanism is used; this yields a multi-view kinship matrix, which is then incorporated into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we performed empirical experiments on four multi-view datasets from different crops. Compared with the single-view method, the prediction accuracy of the MVBLUP method improved by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
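The abstract does not spell out how the per-view importances turn into a single kinship matrix. One plausible reading, shown here purely as an assumption, is a normalized weighted sum of per-view kinship matrices, with the weights being what an optimizer such as differential evolution would search over. The function name and the toy matrices are hypothetical.

```python
def combine_kinship(view_matrices, weights):
    """Weighted sum of per-view kinship matrices, weights rescaled to sum to 1.

    Assumed form: K = sum_v (w_v / sum(w)) * K_v. This is an illustrative
    guess at the combination step, not the paper's confirmed formula.
    """
    if len(view_matrices) != len(weights) or not view_matrices:
        raise ValueError("need one weight per view")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    n = len(view_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, w in zip(view_matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += (w / total) * matrix[i][j]
    return combined


# Two toy 2x2 kinship matrices for two individuals (invented values).
k_view_a = [[1.0, 0.5], [0.5, 1.0]]
k_view_b = [[1.0, 0.0], [0.0, 1.0]]
print(combine_kinship([k_view_a, k_view_b], [1, 1]))
# [[1.0, 0.25], [0.25, 1.0]]
```

In a full pipeline the weights would be scored by the prediction accuracy of the downstream BLUP model on held-out data, which is the part the optimizer would drive.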
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52308340), the Chongqing Talent Innovation and Entrepreneurship Demonstration Team Project (Grant No. cstc2024ycjh-bgzxm0012), and the Science and Technology Projects supported by China Coal Technology and Engineering Chongqing Design and Research Institute (Group) Co., Ltd. (Grant No. H20230317).
Abstract: Influenced by complex external factors, the displacement-time curve of reservoir landslides demonstrates both short-term and long-term diversity and dynamic complexity. It is difficult for existing methods, including regression models and neural network models, to perform multi-characteristic coupled displacement prediction because they fail to consider landslide creep characteristics. This paper integrates the creep characteristics of landslides with non-linear intelligent algorithms and proposes a dynamic intelligent landslide displacement prediction method based on a combination of the Biological Growth model (BG), Convolutional Neural Network (CNN), and Long Short-Term Memory network (LSTM). The approach improves three different biological growth models, thereby effectively extracting landslide creep characteristic parameters. It also integrates external factors (rainfall and reservoir water level) to construct a comprehensive internal and external dataset for data augmentation, which is input into the improved CNN-LSTM model. Harnessing the robust feature extraction capabilities and spatial translation invariance of the CNN, the model then autonomously captures short-term local fluctuation characteristics of landslide displacement and combines them with the LSTM's efficient handling of long-term nonlinear temporal data to improve prediction performance. An evaluation on the Liangshuijing landslide in the Three Gorges Reservoir Area indicates that BG-CNN-LSTM exhibits high prediction accuracy and excellent generalization capability when dealing with various types of landslides. The research provides an innovative approach to achieving whole-process, real-time, high-precision displacement prediction for multi-characteristic coupled landslides.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 42330111).
Abstract: In this article, our nonlinear theory and technology for reducing the uncertainties of high-impact ocean-atmosphere event predictions, with the conditional nonlinear optimal perturbation (CNOP) method as its core, are reviewed, and the “spring predictability barrier” problem for El Niño-Southern Oscillation events and targeted observation issues for tropical cyclone forecasts are taken as two representative examples. Nonlinear theory reveals that initial errors of particular spatial structures, environmental conditions, and nonlinear processes contribute to significant prediction errors, whereas nonlinear technology provides a pioneering approach for reducing observational and forecast errors via targeted observations through the application of the CNOP method. Follow-up research further validates the scientific rigor of the theory in revealing the nonlinear mechanism of significant prediction errors, and relevant practical field campaigns for targeted observations verify the effectiveness of the technology in reducing prediction uncertainties. The CNOP method has achieved international recognition; furthermore, its applications have been extended to ensemble forecasts for weather and climate, which enriches the nonlinear technology for reducing prediction uncertainties. It is expected that this nonlinear theory and technology will play a considerably important role in reducing prediction uncertainties for high-impact weather and climate events.
Funding: Funded by the Natural Science Foundation of Heilongjiang Province (Grant Number LH2023F033) and the Science and Technology Innovation Talent Project of Harbin (Grant Number 2022CXRCCG006).
Abstract: Stock price prediction is a typical complex time series prediction problem characterized by dynamics, nonlinearity, and complexity. This paper introduces a generative adversarial network model that incorporates an attention mechanism (GAN-LSTM-Attention) to improve the accuracy of stock price prediction. First, the generator of this model combines a Long Short-Term Memory (LSTM) network, an attention mechanism, and a fully connected layer, and focuses on generating the predicted stock price. The discriminator combines a Convolutional Neural Network (CNN) and a fully connected layer to discriminate between real and generated stock prices. Second, to evaluate the practical applicability and generalization ability of the GAN-LSTM-Attention model, four representative stocks in the United States of America (USA) stock market, namely the Standard & Poor's 500 Index, Apple Incorporated, Advanced Micro Devices Incorporated, and Google Incorporated, were selected for prediction experiments, and the prediction performance was comprehensively evaluated using three metrics: mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). Finally, the specific effects of the attention mechanism, convolutional layer, and fully connected layer on the prediction performance of the model are systematically analyzed through an ablation study. The experimental results show that the GAN-LSTM-Attention model exhibits excellent performance and robustness in stock price prediction.
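The three evaluation metrics named above have standard definitions, sketched below in plain Python. The function name and the price series are invented for illustration and are not results from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Made-up closing prices and predictions.
actual = [100.0, 102.0, 104.0, 106.0]
forecast = [101.0, 101.0, 105.0, 105.0]
mae, rmse, r2 = regression_metrics(actual, forecast)
print(mae, rmse, round(r2, 3))  # 1.0 1.0 0.8
```

Lower MAE and RMSE indicate smaller errors, while R² closer to 1 indicates that the predictions explain more of the variance in the true series.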
Funding: Supported by a major project of the Zhejiang Natural Science Foundation (LD21G030001).
Abstract: Objective: The Asia-Pacific region has a high chronic obstructive pulmonary disease (COPD) burden, but studies on its trends are limited. Using the Global Burden of Disease (GBD) 2019 data, we analyzed COPD trends in 36 countries and territories from 1990 to 2019 and predicted future incidence trends through 2034. Methods: COPD data by age and sex from the GBD 2019 database were analyzed for incidence, prevalence, mortality, and disability-adjusted life years (DALY) rates from 1990 to 2019. Joinpoint regression identified significant annual trends, and age-standardized incidence rates were predicted through 2034 using age-period-cohort models. Results: The incidence, prevalence, mortality, and disease burden of COPD have been decreasing, and the incidence rates will continue to decrease or remain stable until 2034 in most selected countries and territories, except for a few Southeast Asian countries. The Lao People's Democratic Republic and Vietnam are projected to experience an increase in COPD incidence from 165.3 per 100,000 in 2019 to 177 per 100,000 in 2034 and from 179.9 per 100,000 in 2019 to 192.5 per 100,000 in 2034, respectively. Older males had a higher incidence than any other sex or age group. The sex gap in incidence rates continues to widen, though it is smaller and less significant in the younger age group than in the older one. Conclusion: COPD rates are expected to decline until 2034 but remain a health risk, especially in countries with rising rates. Urgent action on tobacco control, air pollution, and public education is needed.
Funding: Supported by the National Key Research and Development Program, No. 2022YFC2407304; the Major Research Project for Middle-Aged and Young Scientists of Fujian Provincial Health Commission, No. 2021ZQNZD013; the National Natural Science Foundation of China, No. 62275050; the Fujian Province Science and Technology Innovation Joint Fund Project, No. 2019Y9108; and the Major Science and Technology Projects of Fujian Province, No. 2021YZ036017.
Abstract: BACKGROUND: To investigate the preoperative factors influencing textbook outcomes (TO) in intrahepatic cholangiocarcinoma (ICC) patients and evaluate the feasibility of an interpretable machine learning model for preoperative prediction of TO, we developed a machine learning model for preoperative prediction of TO and used the SHapley Additive exPlanations (SHAP) technique to illustrate the prediction process. AIM: To analyze the factors influencing textbook outcomes before surgery and to establish interpretable machine learning models for preoperative prediction. METHODS: A total of 376 patients diagnosed with ICC were retrospectively collected from four major medical institutions in China, covering the period from 2011 to 2017. Logistic regression analysis was conducted to identify preoperative variables associated with achieving TO. Based on these variables, an EXtreme Gradient Boosting (XGBoost) machine learning prediction model was constructed using the XGBoost package. The SHAP algorithm (package: Shapviz) was employed to visualize each variable's contribution to the model's predictions. Kaplan-Meier survival analysis was performed to compare the prognostic differences between the TO-achieving and non-TO-achieving groups. RESULTS: Among 376 patients, 287 were included in the training group and 89 in the validation group. Logistic regression identified the following preoperative variables influencing TO: Child-Pugh classification, Eastern Cooperative Oncology Group (ECOG) score, hepatitis B, and tumor size. The XGBoost prediction model demonstrated high accuracy in internal validation (AUC = 0.8825) and external validation (AUC = 0.8346). Survival analysis revealed that the disease-free survival rates for patients achieving TO at 1, 2, and 3 years were 64.2%, 56.8%, and 43.4%, respectively. CONCLUSION: Child-Pugh classification, ECOG score, hepatitis B, and tumor size are preoperative predictors of TO. In both the training group and the validation group, the machine learning model showed some effectiveness in predicting TO before surgery. The SHAP algorithm provided intuitive visualization of the machine learning prediction process, enhancing its interpretability.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos.: 32170680 and T2122018), the Natural Science Foundation of Shanghai, China (Grant No.: 21ZR1476000), and the CAS Youth Innovation Promotion Association, China (Grant No.: Y2022076).
Abstract: Personalized drug response prediction from molecular data is an important challenge in precision medicine for treating cancer. Computational methods have been widely explored and have become increasingly accurate in recent years. However, the clinical application of prediction methods is still in its infancy due to large discrepancies between preclinical models and patients. We present a novel disentangled synthesis transfer network (DiSyn) for drug response prediction, specifically designed for transfer learning from preclinical models to clinical patients. DiSyn uses a domain separation network (DSN) to disentangle drug response-related features, employs data synthesis technology to increase the sample size, and trains iteratively for better feature disentanglement. DiSyn is pretrained on large-scale unlabeled cancer samples and validated on three datasets, The Cancer Genome Atlas (TCGA), Investigation of Serial Studies to Predict Your Therapeutic Response With Imaging And moLecular Analysis 2 (I-SPY2), and Novartis Institutes for Biomedical Research Patient-Derived Xenograft Encyclopedia (NIBR PDXE), achieving performance competitive with state-of-the-art methods on cancer patients and mice. Furthermore, the application of DiSyn to thousands of breast cancer patients shows the heterogeneity in drug responses and demonstrates its potential value in biomarker discovery and drug combination prediction.
Funding: Supported by the Poongsan-KAIST Future Research Center Project and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Grant No. 2023R1A2C2005661).
Abstract: This study presents a machine learning-based method for predicting fragment velocity distribution in warhead fragmentation under explosive loading conditions. The fragment resultant velocities are correlated with key design parameters, including casing dimensions and detonation positions. The paper details the finite element analysis for fragmentation, the characterization of the dynamic hardening and fracture models, the generation of comprehensive datasets, and the training of the artificial neural network (ANN) model. The results show the influence of casing dimensions on fragment velocity distributions, with the resultant velocity tending to increase with reduced thickness and with increased length and diameter. The model's predictive capability is demonstrated through accurate predictions for both training and testing datasets, showing its potential for real-time prediction of fragmentation performance.
Funding: Supported by the Science and Technology Project of Jiangsu Coastal Power Infrastructure Intelligent Engineering Research Center, “Photovoltaic Power Prediction System Driven by Deep Learning and Multi-Source Data Fusion” (F2024-5044).
Abstract: Harnessing solar power is essential for addressing the dual challenges of global warming and the depletion of traditional energy sources. However, the fluctuations and intermittency of photovoltaic (PV) power pose challenges for its extensive incorporation into power grids. Thus, enhancing the precision of PV power prediction is particularly important. Although existing studies have made progress in short-term prediction, issues persist, particularly the underutilization of temporal features and the neglect of correlations between satellite cloud images and PV power data. These factors hinder improvements in PV power prediction performance. To overcome these challenges, this paper proposes a novel PV power prediction method based on multi-stage temporal feature learning. First, the improved LSTM and SA-ConvLSTM are employed to extract the temporal features of PV power and the spatial-temporal features of satellite cloud images, respectively. Subsequently, a novel hybrid attention mechanism is proposed to identify the interplay between the two modalities, enhancing the capacity to focus on the most relevant features. Finally, the Transformer model is applied to further capture the short-term temporal patterns and long-term dependencies within the multi-modal feature information. The paper also compares the proposed method with various competitive methods. The experimental results demonstrate that the proposed method outperforms the competitive methods in terms of accuracy and reliability in short-term PV power prediction.
Funding: Supported by the National Natural Science Foundation of China (82371300), the Zhejiang Provincial Natural Science Foundation of China (LY23H090014), and the Zhejiang Province Traditional Chinese Medicine Science and Technology Project (2024ZL1215).
Abstract: Stroke, a major cerebrovascular disease, has high morbidity and mortality. Effective methods to reduce the risk and improve the prognosis are lacking. Uric acid (UA) has been associated with the pathological mechanism, prognosis, and therapy of stroke. UA plays pro-oxidative, anti-oxidative, and pro-inflammatory roles in vivo. The specific role of UA in stroke, which may include both neuroprotective and damaging effects, remains unclear. There is a U-shaped association between serum uric acid (SUA) levels and ischemic stroke (IS). UA therapy provides neuroprotection during reperfusion therapy for acute ischemic stroke (AIS). Urate-lowering therapy (ULT) plays a protective role in IS with hyperuricemia or gout. SUA levels are associated with the cerebrovascular injury mechanism, risk, and outcomes of hemorrhagic stroke. In this review, we summarize the current research on the role of UA in stroke, providing potential targets for its prediction and treatment.
Funding: Supported by the research on key technologies for monitoring and identifying drug abuse of anesthetic drugs and psychotropic drugs, and intervention for addiction (No. 2023YFC3304200); the program of a study on the diagnosis of addiction to synthetic cannabinoids and methods of assessing the risk of abuse (No. 2022YFC3300905); and the program of ab initio design and generation of AI models for small molecule ligands based on target structures (No. 2022PE0AC03), ZHIJIANG LAB.
Abstract: The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties is a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and a multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model tends to underestimate volume of distribution (VD), indicating room for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Funding: Supported by the National Key Research and Development Program of China (Grant No.: 2023YFF1204904), the National Natural Science Foundation of China (Grant Nos.: U23A20530 and 82173746), and the Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism (Shanghai Municipal Education Commission, China).
Abstract: The negative logarithm of the acid dissociation constant (pKa) significantly influences the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of molecules and is a crucial indicator in drug research. Because computational methods are fast and accurate, their role in predicting drug properties is increasingly important. Although many pKa prediction models exist, they often focus on enhancing model precision while neglecting interpretability. In this study, we present GraFpKa, a pKa prediction model using graph neural networks (GNNs) and molecular fingerprints. The results show that our acidic and basic models achieved mean absolute errors (MAEs) of 0.621 and 0.402, respectively, on the test set, demonstrating good predictive performance. Notably, to improve interpretability, GraFpKa also incorporates Integrated Gradients (IGs), providing a clearer visual description of the atoms that significantly affect the pKa values. The high reliability and interpretability of GraFpKa ensure accurate pKa predictions while also facilitating a deeper understanding of the relationship between molecular structure and pKa values, making it a valuable tool in the field of pKa prediction.
Funding: Supported by the National Natural Science Foundation of China (Nos. 82173746 and U23A20530) and the Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism (Shanghai Municipal Education Commission).
Abstract: Accurate prediction of drug-target interactions (DTIs) plays a pivotal role in drug discovery, facilitating optimization of lead compounds, drug repurposing, and elucidation of drug side effects. However, traditional DTI prediction methods are often limited by incomplete biological data and insufficient representation of protein features. In this study, we proposed KG-CNNDTI, a novel knowledge graph-enhanced framework for DTI prediction, which integrates heterogeneous biological information to improve model generalizability and predictive performance. The proposed model used protein embeddings derived from a biomedical knowledge graph via the Node2Vec algorithm, which were further enriched with contextualized sequence representations obtained from ProteinBERT. For compound representation, multiple molecular fingerprint schemes alongside the Uni-Mol pre-trained model were evaluated. The fused representations served as inputs to both classical machine learning models and a convolutional neural network-based predictor. Experimental evaluations across benchmark datasets demonstrated that KG-CNNDTI achieved superior performance compared to state-of-the-art methods, particularly in terms of Precision, Recall, F1-Score, and area under the precision-recall curve (AUPR). Ablation analysis highlighted the substantial contribution of knowledge graph-derived features. Moreover, KG-CNNDTI was employed for virtual screening of natural products against Alzheimer's disease, yielding 40 candidate compounds. Five were supported by literature evidence, of which three were further validated in in vitro assays.
Funding: Supported by the Macao Science and Technology Development Fund, Macao SAR, China (Grant No.: 0043/2023/AFJ), the National Natural Science Foundation of China (Grant No.: 22173038), and Macao Polytechnic University, Macao SAR, China (Grant No.: RP/FCA-01/2022).
Abstract: Accurate prediction of molecular properties is crucial for selecting compounds with ideal properties and reducing the costs and risks of trials. Traditional methods based on manually crafted features and graph-based methods have shown promising results in molecular property prediction. However, traditional methods rely on expert knowledge and often fail to capture the complex structures and interactions within molecules. Similarly, graph-based methods typically overlook the chemical structure and function hidden in molecular motifs and struggle to effectively integrate global and local molecular information. To address these limitations, we propose a novel fingerprint-enhanced hierarchical graph neural network (FH-GNN) for molecular property prediction that simultaneously learns information from hierarchical molecular graphs and fingerprints. The FH-GNN captures diverse hierarchical chemical information by applying directed message-passing neural networks (D-MPNN) on a hierarchical molecular graph that integrates atomic-level, motif-level, and graph-level information along with their relationships. Additionally, we used an adaptive attention mechanism to balance the importance of hierarchical graphs and fingerprint features, creating a comprehensive molecular embedding that integrates hierarchical molecular structures with domain knowledge. Experiments on eight benchmark datasets from MoleculeNet showed that FH-GNN outperformed the baseline models in both classification and regression tasks for molecular property prediction, validating its capability to comprehensively capture molecular information. By integrating molecular structure and chemical knowledge, FH-GNN provides a powerful tool for the accurate prediction of molecular properties and aids in the discovery of potential drug candidates.
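One way to picture the adaptive attention step is as a softmax weighting over the graph embedding and the fingerprint embedding. The sketch below is an assumption about the general idea, not the FH-GNN implementation: the function name `attention_fuse`, the scoring vector `w`, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def attention_fuse(graph_emb, fp_emb, w):
    """Fuse two same-sized embeddings with softmax attention weights.

    w is a hypothetical learned scoring vector: each embedding receives a
    scalar score (its dot product with w), and the two scores are normalized
    with a softmax so that the weights sum to 1.
    """
    if not (len(graph_emb) == len(fp_emb) == len(w)):
        raise ValueError("embeddings and scoring vector must share one length")
    scores = [sum(x * wi for x, wi in zip(emb, w)) for emb in (graph_emb, fp_emb)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * g + weights[1] * f for g, f in zip(graph_emb, fp_emb)]
    return fused, weights


# Toy stand-ins for the hierarchical graph embedding and the fingerprint embedding.
graph_emb = [1.0, 0.0]
fp_emb = [0.0, 1.0]
fused, weights = attention_fuse(graph_emb, fp_emb, w=[1.0, 0.0])
```

With this choice of `w` the graph embedding scores higher, so it receives the larger weight; a zero `w` would weight both sources equally. In a trained model the scoring parameters would be learned jointly with the rest of the network.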
Funding: Funded by the Natural Science Foundation of China (Grant Nos. 42377164 and 41972280) and the Badong National Observation and Research Station of Geohazards (Grant No. BNORSG-202305).
Abstract: Landslide susceptibility prediction (LSP) is significantly affected by uncertainty in the selection of landslide-related conditioning factors. However, most of the literature only compares particular conditioning factor selection methods rather than studying this uncertainty systematically. This study therefore aims to systematically explore how various commonly used conditioning factor selection methods influence LSP and, on this basis, to propose a universally applicable principle for the optimal selection of conditioning factors. An'yuan County in southern China is taken as an example, considering 431 landslides and 29 types of conditioning factors. Five commonly used factor selection methods, namely correlation analysis (CA), linear regression (LR), principal component analysis (PCA), rough set (RS) and artificial neural network (ANN), are applied to select the optimal factor combinations from the original 29 conditioning factors. The factor selection results are then used as inputs to four types of common machine learning models to construct 20 types of combined models, such as CA-multilayer perceptron and CA-random forest. Additionally, multifactor-based multilayer perceptron and random forest models that select conditioning factors based on the proposed principle of “accurate data, rich types, clear significance, feasible operation and avoiding duplication” are constructed for comparison. Finally, the LSP uncertainties are evaluated by accuracy, susceptibility index distribution, etc. Results show that: (1) multifactor-based models generally have higher LSP performance and lower uncertainties than factor selection-based models; (2) the choice of machine learning model influences LSP accuracy more than the choice of factor selection method. In conclusion, the above commonly used conditioning factor selection methods are not ideal for improving LSP performance and may complicate the LSP process. In contrast, a satisfactory combination of conditioning factors can be constructed according to the proposed principle.
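The count of 20 combined models follows from pairing each of the five factor selection methods with each of the four machine learning models. A small sketch of that pairing is given here; only two of the four model types are named in the abstract, so the remaining two entries are explicit placeholders rather than the models used in the study.

```python
from itertools import product

# The five factor selection methods named in the abstract.
selectors = ["CA", "LR", "PCA", "RS", "ANN"]

# Only the first two model types are named in the abstract; the last two
# entries are placeholders, not the models actually used in the study.
models = [
    "multilayer perceptron",
    "random forest",
    "model_3 (unnamed)",
    "model_4 (unnamed)",
]

# Every selector-model pairing, labeled in the "CA-random forest" style.
combined = [f"{selector}-{model}" for selector, model in product(selectors, models)]
```

The multifactor-based comparison models sit outside this grid, since they skip the automated selection step and use factors chosen by the proposed principle.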