Identifying the community structure of complex networks is crucial to extracting insights and understanding network properties. Although several community detection methods have been proposed, many are unsuitable for social networks due to significant limitations. Specifically, most approaches depend mainly on user-user structural links while overlooking service-centric, semantic, and multi-attribute drivers of community formation, and they also lack flexible filtering mechanisms for large-scale, service-oriented settings. Our proposed approach, called community discovery-based service (CDBS), leverages user profiles and their interactions with consulted web services. The method introduces a novel similarity measure, global similarity interaction profile (GSIP), which goes beyond typical similarity measures by unifying user and service profiles for all attribute types into a coherent representation. It applies multiple filtering criteria related to user attributes, accessed services, and interaction patterns. Experimental comparisons against Louvain, Hierarchical Agglomerative Clustering, Label Propagation, and Infomap show that CDBS achieves higher performance, with 0.74 modularity, 0.13 conductance, 0.77 coverage, and a fast response time of 9.8 s, even with 10,000 users and 400 services. Moreover, CDBS consistently detects a larger number of communities with distinct topics of interest, underscoring its capacity to generate detailed and efficient structures in complex networks. These results confirm both the efficiency and effectiveness of the proposed method. Beyond controlled evaluation, CDBS is applicable to targeted recommendations, group-oriented marketing, access control, and service personalization, where communities are shaped not only by user links but also by service engagement.
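The three quality metrics quoted in the abstract above (modularity, conductance, coverage) can be illustrated on a toy graph. The following is a minimal sketch using the standard textbook definitions for an undirected, unweighted graph; it is not the CDBS implementation, and the function names, the toy edge list, and the community labels are all assumptions made for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q: sum over communities of (intra/m - (deg_sum/2m)^2)."""
    m = len(edges)
    intra = defaultdict(int)    # edges with both endpoints in the same community
    deg_sum = defaultdict(int)  # total degree of the nodes in each community
    for u, v in edges:
        deg_sum[community[u]] += 1
        deg_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)

def conductance(edges, community, c):
    """Cut edges of community c divided by the smaller of the two volumes."""
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if community[node] == c:
                vol_in += 1
            else:
                vol_out += 1
        if (community[u] == c) != (community[v] == c):
            cut += 1
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0

def coverage(edges, community):
    """Fraction of edges whose endpoints share a community."""
    return sum(community[u] == community[v] for u, v in edges) / len(edges)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print(modularity(edges, community))        # higher is better
print(conductance(edges, community, "a"))  # lower is better
print(coverage(edges, community))          # higher is better
```

Higher modularity and coverage together with lower conductance indicate denser, better-separated communities, which is the direction of the figures reported for CDBS.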
Most Convolutional Neural Network (CNN) interpretation techniques visualize only the dominant cues that the model relies on, but there is no guarantee that these represent all the evidence the model uses for classification. This limitation becomes critical when hidden secondary cues, potentially more meaningful than the visualized ones, remain undiscovered. This study introduces CasCAM (Cascaded Class Activation Mapping) to address this fundamental limitation through counterfactual reasoning. By asking “if this dominant cue were absent, what other evidence would the model use?”, CasCAM progressively masks the most salient features and systematically uncovers the hierarchy of classification evidence hidden beneath them. Experimental results demonstrate that CasCAM effectively discovers the full spectrum of reasoning evidence and can be applied with nine existing interpretation methods.
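The progressive masking idea described in the abstract above can be sketched in a few lines. This is only an illustration of the general "mask the top evidence, then ask again" loop, not the CasCAM algorithm itself: the input is a flat list rather than an image, `saliency_fn` is a hypothetical stand-in for any class activation mapping method, and the parameters `stages`, `top_k`, and `fill` are assumptions.

```python
def cascade_masks(image, saliency_fn, stages=3, top_k=2, fill=0.0):
    """Return, per stage, the indices of the most salient unmasked inputs."""
    current = list(image)
    masked = set()
    evidence = []
    for _ in range(stages):
        scores = saliency_fn(current)  # re-run the interpretation method
        candidates = [i for i in range(len(current)) if i not in masked]
        if not candidates:
            break
        # Pick the strongest remaining cues at this stage.
        top = sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_k]
        evidence.append(top)
        # Counterfactual step: remove those cues before the next stage.
        for i in top:
            current[i] = fill
            masked.add(i)
    return evidence

def toy_saliency(values):
    """Hypothetical stand-in for a CAM method: saliency equals input magnitude."""
    return list(values)

toy_image = [0.1, 0.9, 0.8, 0.2, 0.5, 0.4]
hierarchy = cascade_masks(toy_image, toy_saliency)
print(hierarchy)  # [[1, 2], [4, 5], [3, 0]]
```

Each inner list is one level of the evidence hierarchy: the first holds the dominant cues, and later ones hold cues that only surface once the stronger ones have been masked.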
Recent years have witnessed significant breakthroughs in new materials discovery brought about by artificial intelligence (AI). AI has been successfully applied to predicting formability, revealing properties, and guiding the experimental synthesis of materials. Rapid progress has been made through the integration of growing databases and improved computing power. Although some reviews present these developments from their own particular angles, few cover both synergistic aspects together: how AI empowers the discovery of new materials and the cognition of existing materials. Here, the newest developments in AI-empowered materials are systematically reviewed, reflecting the advanced design of intelligent systems for the discovery, synthesis, prediction, and validation of materials. First, the background and mechanisms are briefly introduced, after which the design of AI systems comprising data, machine learning, and automated laboratories is illustrated. Next, strategies for obtaining AI systems for materials with improved performance are summarized, covering both the in-depth cognition of existing materials and the rapid discovery of new materials, and the design thinking for future AI systems in materials science is then pointed out. Finally, some perspectives are put forward.
Drug discovery is a complex and highly systematic process encompassing multiple critical stages, including target identification, bioactive molecule discovery, preclinical research, clinical trials, regulatory review, post-marketing surveillance, and others [1]. This process typically spans many years and is often accompanied by high failure rates and substantial resource consumption. In recent years, driven by large amounts of biomedical data, artificial intelligence (AI) has begun to reshape every stage of drug discovery [2], particularly by integrating diverse, high-dimensional datasets with powerful predictive and generative models.
In the realm of drug discovery, recent advancements have paved the way for innovative approaches and methodologies. This comprehensive review encapsulates six distinct yet interrelated mini-reviews, each shedding light on novel strategies in drug development. (a) The resurgence of covalent drugs is highlighted, focusing on targeted covalent inhibitors (TCIs) and their role in enhancing selectivity and affinity. (b) The potential of the quantum mechanics-based computer-aided drug design (CADD) tool, Cov_DOX, is introduced for predicting protein-covalent ligand binding structures and affinities. (c) The scaffolding function of proteins is proposed as a new avenue for drug design, with a focus on modulating protein-protein interactions through small molecules and proteolysis targeting chimeras (PROTACs). (d) The concept of pro-PROTACs is explored as a promising strategy for cancer therapy, combining the principles of prodrugs and PROTACs to enhance specificity and reduce toxicity. (e) The design of prodrugs through carbon-carbon bond cleavage is discussed, offering a new perspective for the activation of drugs with limited modifiable functional groups. (f) The targeting of programmed cell death pathways in cancer therapies with small molecules is reviewed, emphasizing the induction of autophagy-dependent cell death, ferroptosis, and cuproptosis. These insights collectively contribute to a deeper understanding of the dynamic landscape of drug discovery.
Transformer models have emerged as pivotal tools in drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capability of transformer architectures to capture intricate hierarchical dependencies in sequential data, these models show remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models makes them indispensable assets for driving data-centric advancements in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters synergistic collaboration and exchange of ideas among disparate fields. In our review, we elucidate the many applications of transformers in drug discovery, as well as chemistry and biology, spanning protein design and protein engineering, molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single-cell data. Finally, we conclude the survey by discussing promising trends in transformer models within the context of drug discovery and other sciences.
Anemoside B4 (AB4), a triterpenoidal saponin derived from Pulsatilla chinensis, has garnered considerable attention for its potent anti-inflammatory and immunomodulatory activities, culminating in its approval for clinical trials by the Center for Drug Evaluation, National Medical Products Administration, for the treatment of mild to moderate ulcerative colitis. Despite this, AB4’s therapeutic potential remained underexplored until the development of its injection formulation. This review discusses the scientific rationale and theoretical framework behind AB4’s development, offering a new paradigm and innovative research strategy for discovering lead compounds or drug candidates from natural medicines. In-depth investigations into AB4’s cellular targets, biochemical pathways, and administration routes have provided valuable insights into its druggability evaluation and clinical potential. The high water solubility of AB4, attributable to its multiple sugar units, imposes limitations on its bioavailability and pharmacokinetic profile. To address this, structural modification via chemical methods and enzymatic hydrolysis has been employed, resulting in derivatives with reduced molecular weight, improved bioavailability, enhanced pharmacological activity, and greater clinical potential. These advances lay a solid foundation for the continued development of AB4 and its derivatives as promising therapeutic agents.
Synthetic biology (SynBio) is an emerging field of study with great potential in designing, engineering, and constructing new microbial synthetic cells that do not pre-exist in nature, or re-engineering existing cells to accomplish industrial purposes. Systems biology seeks to understand biology at multiple dimensions, beginning with the molecular and cellular level and progressing to the tissue and organismal level, and characterizes cells as complex information-processing systems. SynBio, on the other hand, goes further and strives to develop and create its systems from scratch. SynBio is now applied to develop novel therapeutic drugs for the prevention of human diseases, scale up industrial processes, and accomplish previously unfeasible industrial outcomes. This is made possible through significant breakthroughs in DNA sequencing and synthesis technology, as well as insights gained from synthetic chemistry and systems biology. SynBio technologies have allowed for the introduction of improved and synthetic metabolic functionalities in microorganisms to enable the synthesis of a range of pharmacologically relevant compounds for pharmaceutical exploration. SynBio applications range from making industrial chemical synthesis processes more sustainable to the microbial synthesis of improved therapeutic modalities. Hence, this study underscores several innovations, promising potentials, and future directions afforded by SynBio for improved industrial microbial synthesis for pharmaceutical exploration.
This review presents a comprehensive and forward-looking analysis of how Large Language Models (LLMs) are transforming knowledge discovery in the rational design of advanced micro/nano electrocatalyst materials. Electrocatalysis is central to sustainable energy and environmental technologies, but traditional catalyst discovery is often hindered by high complexity, fragmented knowledge, and inefficiencies. LLMs, particularly those based on Transformer architectures, offer unprecedented capabilities in extracting, synthesizing, and generating scientific knowledge from vast unstructured textual corpora. This work provides the first structured synthesis of how LLMs have been leveraged across various electrocatalysis tasks, including automated information extraction from literature, text-based property prediction, hypothesis generation, synthesis planning, and knowledge graph construction. We comparatively analyze leading LLMs and domain-specific frameworks (e.g., CatBERTa, CataLM, CatGPT) in terms of methodology, application scope, performance metrics, and limitations. Through curated case studies across key electrocatalytic reactions (HER, OER, ORR, and CO2RR), we highlight emerging trends such as the growing use of embedding-based prediction, retrieval-augmented generation, and fine-tuned scientific LLMs. The review also identifies persistent challenges, including data heterogeneity, hallucination risks, lack of standard benchmarks, and limited multimodal integration. Importantly, we articulate future research directions, such as the development of multimodal and physics-informed MatSci-LLMs, enhanced interpretability tools, and the integration of LLMs with self-driving laboratories for autonomous discovery. By consolidating fragmented advances and outlining a unified research roadmap, this review provides valuable guidance for both materials scientists and AI practitioners seeking to accelerate catalyst innovation through large language model technologies.
Endometrial cancer is the most common gynecologic cancer diagnosed in the United States, and mortality is on the rise. Advanced and recurrent endometrial cancer represents a treatment challenge, as historically there have been limited therapeutic options for patients. In the last several years, multiple practice-changing clinical trials have led to significant improvements in the treatment landscape. This review covers updates in the treatment and management of advanced and recurrent endometrial cancer, with a focus on novel therapeutics such as anti-PD-L1 and PD-1 inhibitors, poly ADP-ribose polymerase (PARP) inhibitors, antibody-drug conjugates, and hormonal therapy.
The learning algorithms of causal discovery mainly include score-based methods and genetic algorithms (GA). Score-based algorithms are prone to search-space explosion. The classical GA is slow to converge and prone to falling into local optima. To address these issues, an improved GA with domain knowledge (IGADK) is proposed. First, domain knowledge is incorporated into the causal learning process to construct a new fitness function. Second, a dynamic mutation operator is introduced to accelerate convergence. Finally, an experiment is conducted on simulation data, comparing the classical GA with IGADK supplied with domain knowledge of varying accuracy. IGADK can greatly reduce the number of iterations, populations, and samples required for learning, which illustrates the efficiency and effectiveness of the proposed algorithm.
Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activity and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One deep molecular generative approach preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes it difficult to modify scaffold-based molecular structures and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. A dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE) is introduced to encode both bonding and non-bonding information based on strong and weak atomic interactions. Additionally, a gate multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and accelerating drug discovery.
Zeolites are crystalline microporous materials widely used in catalysis, adsorption, and ion exchange owing to their tunable pore structures and acid centers [1]. Traditional zeolites, however, often suffer from limitations such as restricted molecular diffusion and rapid coking, which hinder their efficiency in processing large molecules.
Crystal structure prediction (CSP) is a foundational computational technique for determining the atomic arrangements of crystalline materials, especially under high-pressure conditions. While CSP plays a critical role in materials science, traditional approaches often encounter significant challenges related to computational efficiency and scalability, particularly when applied to complex systems. Recent advances in machine learning (ML) have shown great promise in addressing these limitations, enabling the rapid and accurate prediction of crystal structures across a wide range of chemical compositions and external conditions. This review provides a concise overview of recent progress in ML-assisted CSP methodologies, with a particular focus on machine learning potentials and generative models. By critically analyzing these advances, we highlight the impact of ML in accelerating materials discovery, enhancing computational efficiency, and broadening the applicability of CSP. Additionally, we discuss emerging opportunities and challenges in this rapidly evolving field.
A polyketide synthase-nonribosomal peptide synthetase gene cluster, twn, in Talaromyces sp. HDN1820200 was activated by overexpression of the pathway-specific transcriptional factor TwnD. Large-scale fermentation and chemical investigation of the mutant strain HDN1820200/TwnD led to the discovery of one new polyketide-amino acid conjugate, bipolamide C, and one new polyketide compound, variotin A. The structures of the new compounds were determined by nuclear magnetic resonance (NMR) analysis, high-resolution electrospray ionization mass spectrometry, feeding experiments, NMR calculation, and DP4+ analysis. This study shows that overexpression of a pathway-specific transcriptional factor is a promising approach for the discovery of new natural products in fungi from specialized habitats.
Semi-supervised new intent discovery is a significant research focus in natural language understanding. To address the limitations of current semi-supervised training data and the underutilization of implicit information, a Semi-supervised New Intent Discovery for Elastic Neighborhood Syntactic Elimination and Fusion model (SNID-ENSEF) is proposed. Syntactic elimination contrast learning leverages verb-dominant syntactic features, systematically replacing specific words to enhance data diversity. The radius of the positive sample neighborhood is elastically adjusted to eliminate invalid samples and improve training efficiency. A neighborhood sample fusion strategy, based on sample distribution patterns, dynamically adjusts neighborhood size and fuses sample vectors to reduce noise and improve implicit information utilization and discovery accuracy. Experimental results show that SNID-ENSEF achieves average improvements of 0.88%, 1.27%, and 1.30% in Normalized Mutual Information (NMI), Accuracy (ACC), and Adjusted Rand Index (ARI), respectively, outperforming the PTJN, DPN, MTP-CLNN, and DWG models on the Banking77, StackOverflow, and Clinc150 datasets. The code is available at https://github.com/qsdesz/SNID-ENSEF, accessed on 16 January 2025.
Existing text truth discovery methods fail to address two challenges: the inherent long-distance dependencies and thematic diversity of long texts, and the inherent subjective sentiment that obscures objective evaluation of source reliability. To address these challenges, a novel truth discovery method named large language model (LLM)-enhanced text truth discovery with dual attention (LTDDA) is proposed. First, LLMs generate embedded representations of text claims and enhance the feature space to tackle long-distance dependencies and thematic diversity. Then, the complex relationship between source reliability and claim credibility is captured by integrating semantic and sentiment features. Finally, dual-layer attention is applied to extract key semantic information and assign consistent weights to similar sources, resulting in accurate truth outputs. Extensive experiments on three real-world datasets demonstrate that LTDDA outperforms state-of-the-art methods, providing new insights for building more reliable and accurate text truth discovery systems.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2023-00249743).
Funding: Supported by the Hong Kong Polytechnic University (Project Nos. 4-ZZW1, 4-YWER, 97D9, 4-W443).
Funding: Supported by grants from the National Natural Science Foundation of China (No. 82273770) and the Foundation for Innovative Research Groups of the National Natural Science Foundation of Sichuan Province (No. 24NSFTD0051).
Funding: Supported in part by the National Institutes of Health (NIH), USA (Grant Nos.: R01GM126189, R01AI164266, and R35GM148196); the National Science Foundation, USA (Grant Nos.: DMS2052983, DMS-1761320, and IIS-1900473); the National Aeronautics and Space Administration (NASA), USA (Grant No.: 80NSSC21M0023); the Michigan State University (MSU) Foundation, USA; Bristol-Myers Squibb, USA (Grant No.: 65109); and Pfizer, USA. Also supported by the National Natural Science Foundation of China (Grant Nos.: 11971367, 12271416, and 11972266).
Abstract: Transformer models have emerged as pivotal tools in drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capability of transformer architectures to capture the hierarchical dependencies inherent in sequential data, these models show remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models makes them indispensable for driving data-centric advances in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters collaboration and the exchange of ideas among disparate fields. In our review, we elucidate the applications of transformers in drug discovery, as well as in chemistry and biology, spanning protein design and protein engineering, molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single cell data. Finally, we conclude the survey by deliberating on promising trends in transformer models within the context of drug discovery and other sciences.
Funding: Supported by the National Natural Science Foundation of China (82341087, 82073912, and 81903896) and a project funded by the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions.
Abstract: Anemoside B4 (AB4), a triterpenoidal saponin derived from Pulsatilla chinensis, has garnered considerable attention for its potent anti-inflammatory and immunomodulatory activities, culminating in its approval for clinical trials by the Center for Drug Evaluation, National Medical Products Administration, for the treatment of mild to moderate ulcerative colitis. Despite this, AB4’s therapeutic potential remained underexplored until the development of its injection formulation. This review discusses the scientific rationale and theoretical framework behind AB4’s development, offering a new paradigm and innovative research strategy for discovering lead compounds or drug candidates from natural medicines. In-depth investigations into AB4’s cellular targets, biochemical pathways, and administration routes have provided valuable insights into its druggability evaluation and clinical potential. The high water solubility of AB4, attributable to its multiple sugar units, imposes limitations on its bioavailability and pharmacokinetic profiles. To address this, structural modification via chemical methods and enzymatic hydrolysis has been employed, resulting in derivatives with reduced molecular weight, improved bioavailability, enhanced pharmacological activity, and greater clinical potential. These advances lay a solid foundation for the continued development of AB4 and its derivatives as promising therapeutic agents.
Abstract: Synthetic biology (SynBio) is an emerging field of study with great potential in designing, engineering, and constructing new microbial synthetic cells that do not pre-exist in nature, or in re-engineering existing cells to accomplish industrial purposes. Systems biology seeks to understand biology at multiple dimensions, beginning with the molecular and cellular level and progressing to the tissue and organismal level, and characterizes cells as complex information-processing systems. SynBio, on the other hand, goes further and strives to develop and create its systems from scratch. SynBio is now applied to develop novel therapeutic drugs for the prevention of human diseases, to scale up industrial processes, and to accomplish previously unfeasible industrial outcomes. This is made possible through significant breakthroughs in DNA sequencing and synthesis technology, as well as insights gained from synthetic chemistry and systems biology. SynBio technologies have allowed for the introduction of improved and synthetic metabolic functionalities in microorganisms to enable the synthesis of a range of pharmacologically relevant compounds for pharmaceutical exploration. SynBio applications range from finding new ways to make industrial chemical synthesis processes more sustainable to the microbial synthesis of improved therapeutic modalities. Hence, this study underscores several innovations, promising potentials, and future directions afforded by SynBio that point toward improved industrial microbial synthesis for pharmaceutical exploration.
Abstract: This review presents a comprehensive and forward-looking analysis of how Large Language Models (LLMs) are transforming knowledge discovery in the rational design of advanced micro/nano electrocatalyst materials. Electrocatalysis is central to sustainable energy and environmental technologies, but traditional catalyst discovery is often hindered by high complexity, fragmented knowledge, and inefficiencies. LLMs, particularly those based on Transformer architectures, offer unprecedented capabilities in extracting, synthesizing, and generating scientific knowledge from vast unstructured textual corpora. This work provides the first structured synthesis of how LLMs have been leveraged across various electrocatalysis tasks, including automated information extraction from literature, text-based property prediction, hypothesis generation, synthesis planning, and knowledge graph construction. We comparatively analyze leading LLMs and domain-specific frameworks (e.g., CatBERTa, CataLM, CatGPT) in terms of methodology, application scope, performance metrics, and limitations. Through curated case studies across key electrocatalytic reactions (HER, OER, ORR, and CO2RR), we highlight emerging trends such as the growing use of embedding-based prediction, retrieval-augmented generation, and fine-tuned scientific LLMs. The review also identifies persistent challenges, including data heterogeneity, hallucination risks, lack of standard benchmarks, and limited multimodal integration. Importantly, we articulate future research directions, such as the development of multimodal and physics-informed MatSci-LLMs, enhanced interpretability tools, and the integration of LLMs with self-driving laboratories for autonomous discovery. By consolidating fragmented advances and outlining a unified research roadmap, this review provides valuable guidance for both materials scientists and AI practitioners seeking to accelerate catalyst innovation through large language model technologies.
Abstract: Endometrial cancer is the most common gynecologic cancer diagnosed in the United States, and mortality is on the rise. Advanced and recurrent endometrial cancer represents a treatment challenge, as historically there have been limited therapeutic options for patients. In the last several years, multiple practice-changing clinical trials have led to significant improvements in the treatment landscape. This review will cover updates in the treatment and management of advanced and recurrent endometrial cancer with a focus on novel therapeutics, such as anti-PD-L1 and PD-1 inhibitors, poly ADP-ribose polymerase (PARP) inhibitors, antibody-drug conjugates, and hormonal therapy.
Funding: Supported by the National Social Science Fund of China (2022-SKJJ-B-084).
Abstract: Learning algorithms for causal discovery mainly include score-based methods and genetic algorithms (GA). Score-based algorithms are prone to search-space explosion. The classical GA is slow to converge and prone to falling into local optima. To address these issues, an improved GA with domain knowledge (IGADK) is proposed. First, domain knowledge is incorporated into the causal learning process to construct a new fitness function. Second, a dynamic mutation operator is introduced to accelerate convergence. Finally, an experiment on simulated data compares the classical GA with IGADK under domain knowledge of varying accuracy. IGADK greatly reduces the number of iterations, the population size, and the number of samples required for learning, which illustrates the efficiency and effectiveness of the proposed algorithm.
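The two ingredients named in this abstract, a knowledge-aware fitness function and a dynamic mutation operator, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, since the abstract does not give the actual forms used by IGADK: the additive bonus/penalty for required and forbidden edges, the linear decay schedule, and the names `fitness`, `mutation_rate`, and `mutate` are all hypothetical. The sketch also does not enforce acyclicity, which a real causal-graph search would need.

```python
import random


def fitness(edges, data_score, required, forbidden, weight=1.0):
    """Score a candidate graph (a set of directed edges).

    ASSUMPTION: domain knowledge is encoded as sets of required and
    forbidden edges and enters the score additively; the paper's actual
    fitness function is not given in the abstract.
    """
    bonus = len(edges & required)      # edges that agree with domain knowledge
    penalty = len(edges & forbidden)   # edges that contradict it
    return data_score(edges) + weight * (bonus - penalty)


def mutation_rate(generation, max_generations, high=0.3, low=0.01):
    """Hypothetical dynamic schedule: decay linearly from `high` to `low`."""
    frac = generation / max(1, max_generations - 1)
    return high - (high - low) * min(1.0, frac)


def mutate(edges, candidates, rate, rng):
    """Flip each candidate edge in or out of the graph with probability `rate`."""
    flipped = {e for e in candidates if rng.random() < rate}
    return frozenset(edges ^ flipped)


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [("A", "B"), ("B", "C"), ("C", "A")]
    graph = frozenset({("A", "B")})
    for gen in range(5):
        graph = mutate(graph, candidates, mutation_rate(gen, 5), rng)
    # Placeholder data score that simply prefers sparse graphs.
    print(fitness(graph, lambda e: -len(e), {("A", "B")}, {("C", "A")}))
```

A high early mutation rate favors exploration and a low late rate favors refinement, which is one common way a dynamic operator can speed up convergence; whether IGADK uses this particular schedule is not stated in the abstract.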
Funding: National Natural Science Foundation of China (32471265).
Abstract: Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this elegant and exciting work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the groundbreaking discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Funding: Supported by the National Key R&D Program of China (Grant No.: 2023YFF1205102), the National Natural Science Foundation of China (Grant Nos.: 82273856, 22077143, and 21977127), and the Science Foundation of Guangzhou, China (Grant No.: 2024A04J2172).
Abstract: Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activities and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One deep molecular generative modeling approach preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes it difficult to modify scaffold-based molecular structures and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. The dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE) is introduced to encode both bonding and non-bonding information based on strong and weak atomic interactions. Additionally, a gate multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and accelerating drug discovery.
Funding: Supported by the National Natural Science Foundation of China (Nos. 22205207 and 22378369).
Abstract: Zeolites are crystalline microporous materials widely used in catalysis, adsorption, and ion exchange owing to their tunable pore structures and acid centers [1]. Traditional zeolites, however, often suffer from limitations such as restricted molecular diffusion and rapid coking, which hinder their efficiency in processing large molecules.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFA1402304), the National Natural Science Foundation of China (Grant Nos. 12034009, 12374005, 52288102, 52090024, and T2225013), the Fundamental Research Funds for the Central Universities, and the Program for JLU Science and Technology Innovative Research Team.
Abstract: Crystal structure prediction (CSP) is a foundational computational technique for determining the atomic arrangements of crystalline materials, especially under high-pressure conditions. While CSP plays a critical role in materials science, traditional approaches often encounter significant challenges related to computational efficiency and scalability, particularly when applied to complex systems. Recent advances in machine learning (ML) have shown tremendous promise in addressing these limitations, enabling the rapid and accurate prediction of crystal structures across a wide range of chemical compositions and external conditions. This review provides a concise overview of recent progress in ML-assisted CSP methodologies, with a particular focus on machine learning potentials and generative models. By critically analyzing these advances, we highlight the transformative impact of ML in accelerating materials discovery, enhancing computational efficiency, and broadening the applicability of CSP. Additionally, we discuss emerging opportunities and challenges in this rapidly evolving field.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFC2807502), the Qingdao Marine Science and Technology Center (Grant No. 2022QNLM030003-1), and the Taishan Scholar Distinguished Expert Program in Shandong Province (Grant No. tstp20240504).
Abstract: A polyketide synthase-nonribosomal peptide synthetase gene cluster, twn, in Talaromyces sp. HDN1820200 was activated by overexpression of the pathway-specific transcriptional factor TwnD. Large-scale fermentation and chemical investigation of the mutant strain HDN1820200/TwnD led to the discovery of one new polyketide-amino acid conjugate, bipolamide C, and one new polyketide compound, variotin A. The structures of the new compounds were determined by nuclear magnetic resonance (NMR) analysis, high-resolution electrospray ionization mass spectrometry, feeding experiments, NMR calculation, and DP4+ analysis. This study revealed that overexpression of a pathway-specific transcriptional factor is a promising approach for the discovery of new natural products in fungi from specialized habitats.
Funding: Supported by Research Projects of the Natural Science Foundation of Hebei Province (F2021402005).
Abstract: Semi-supervised new intent discovery is a significant research focus in natural language understanding. To address the limitations of current semi-supervised training data and the underutilization of implicit information, a Semi-supervised New Intent Discovery for Elastic Neighborhood Syntactic Elimination and Fusion model (SNID-ENSEF) is proposed. Syntactic elimination contrast learning leverages verb-dominant syntactic features, systematically replacing specific words to enhance data diversity. The radius of the positive sample neighborhood is elastically adjusted to eliminate invalid samples and improve training efficiency. A neighborhood sample fusion strategy, based on sample distribution patterns, dynamically adjusts neighborhood size and fuses sample vectors to reduce noise and improve implicit information utilization and discovery accuracy. Experimental results show that SNID-ENSEF achieves average improvements of 0.88%, 1.27%, and 1.30% in Normalized Mutual Information (NMI), Accuracy (ACC), and Adjusted Rand Index (ARI), respectively, outperforming the PTJN, DPN, MTP-CLNN, and DWG models on the Banking77, StackOverflow, and Clinc150 datasets. The code is available at https://github.com/qsdesz/SNID-ENSEF, accessed on 16 January 2025.
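Two of the clustering metrics reported above, ARI and NMI, can be computed from label assignments alone. The sketch below is a minimal pure-Python illustration, not the evaluation code from the SNID-ENSEF repository; the function names are hypothetical, and normalizing NMI by the arithmetic mean of the two entropies is an assumption, since the abstract does not state which normalization was used. Clustering accuracy (ACC) is omitted because it additionally requires an optimal one-to-one matching between predicted clusters and true classes.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(true, pred):
    """Adjusted Rand Index from pair counts in the contingency table."""
    n = len(true)
    if n < 2:
        return 1.0  # convention chosen for this sketch: no pairs to disagree on
    sum_ij = sum(comb(c, 2) for c in Counter(zip(true, pred)).values())
    sum_a = sum(comb(c, 2) for c in Counter(true).values())
    sum_b = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (sum_ij - expected) / (max_index - expected)


def normalized_mutual_info(true, pred):
    """NMI with arithmetic-mean normalization (ASSUMPTION, see lead-in)."""
    n = len(true)
    ct, cp = Counter(true), Counter(pred)
    joint = Counter(zip(true, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_t = -sum(c / n * log(c / n) for c in ct.values())
    h_p = -sum(c / n * log(c / n) for c in cp.values())
    if h_t == 0 or h_p == 0:
        return 1.0 if h_t == h_p else 0.0
    return mi / ((h_t + h_p) / 2)


if __name__ == "__main__":
    y_true = [0, 0, 1, 1]
    print(adjusted_rand_index(y_true, [1, 1, 0, 0]))    # same partition, labels swapped
    print(normalized_mutual_info(y_true, [0, 1, 0, 1]))  # partition independent of the truth
```

Both metrics are invariant to how cluster IDs are named, which is why they suit intent discovery, where discovered clusters carry no predefined labels.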
Abstract: Existing text truth discovery methods fail to address two challenges: the inherent long-distance dependencies and thematic diversity of long texts, and the inherent subjective sentiment that obscures objective evaluation of source reliability. To address these challenges, a novel truth discovery method named large language model (LLM)-enhanced text truth discovery with dual attention (LTDDA) is proposed. First, LLMs generate embedded representations of text claims and enhance the feature space to tackle long-distance dependencies and thematic diversity. Then, the complex relationship between source reliability and claim credibility is captured by integrating semantic and sentiment features. Finally, dual-layer attention is applied to extract key semantic information and assign consistent weights to similar sources, resulting in accurate truth outputs. Extensive experiments on three real-world datasets demonstrate that LTDDA outperforms state-of-the-art methods, providing new insights for building more reliable and accurate text truth discovery systems.