Transformer models have emerged as pivotal tools within the realm of drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capabilities of transformer architectures to comprehend intricate hierarchical dependencies inherent in sequential data, these models showcase remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models renders them indispensable assets for driving data-centric advancements in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, seamlessly combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters synergistic collaborations and exchange of ideas among disparate fields. In our review, we elucidate the myriad applications of transformers in drug discovery, as well as chemistry and biology, spanning from protein design and protein engineering to molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single cell data. Finally, we conclude the survey by deliberating on promising trends in transformer models within the context of drug discovery and other sciences.
This review presents a comprehensive and forward-looking analysis of how Large Language Models (LLMs) are transforming knowledge discovery in the rational design of advanced micro/nano electrocatalyst materials. Electrocatalysis is central to sustainable energy and environmental technologies, but traditional catalyst discovery is often hindered by high complexity, fragmented knowledge, and inefficiencies. LLMs, particularly those based on Transformer architectures, offer unprecedented capabilities in extracting, synthesizing, and generating scientific knowledge from vast unstructured textual corpora. This work provides the first structured synthesis of how LLMs have been leveraged across various electrocatalysis tasks, including automated information extraction from literature, text-based property prediction, hypothesis generation, synthesis planning, and knowledge graph construction. We comparatively analyze leading LLMs and domain-specific frameworks (e.g., CatBERTa, CataLM, CatGPT) in terms of methodology, application scope, performance metrics, and limitations. Through curated case studies across key electrocatalytic reactions (HER, OER, ORR, and CO₂RR), we highlight emerging trends such as the growing use of embedding-based prediction, retrieval-augmented generation, and fine-tuned scientific LLMs. The review also identifies persistent challenges, including data heterogeneity, hallucination risks, lack of standard benchmarks, and limited multimodal integration. Importantly, we articulate future research directions, such as the development of multimodal and physics-informed MatSci-LLMs, enhanced interpretability tools, and the integration of LLMs with self-driving laboratories for autonomous discovery. By consolidating fragmented advances and outlining a unified research roadmap, this review provides valuable guidance for both materials scientists and AI practitioners seeking to accelerate catalyst innovation through large language model technologies.
In the realm of drug discovery, recent advancements have paved the way for innovative approaches and methodologies. This comprehensive review encapsulates six distinct yet interrelated mini-reviews, each shedding light on novel strategies in drug development. (a) The resurgence of covalent drugs is highlighted, focusing on targeted covalent inhibitors (TCIs) and their role in enhancing selectivity and affinity. (b) The potential of the quantum mechanics-based computer-aided drug design (CADD) tool, Cov_DOX, is introduced for predicting protein-covalent ligand binding structures and affinities. (c) The scaffolding function of proteins is proposed as a new avenue for drug design, with a focus on modulating protein-protein interactions through small molecules and proteolysis targeting chimeras (PROTACs). (d) The concept of pro-PROTACs is explored as a promising strategy for cancer therapy, combining the principles of prodrugs and PROTACs to enhance specificity and reduce toxicity. (e) The design of prodrugs through carbon-carbon bond cleavage is discussed, offering a new perspective for the activation of drugs with limited modifiable functional groups. (f) The targeting of programmed cell death pathways in cancer therapies with small molecules is reviewed, emphasizing the induction of autophagy-dependent cell death, ferroptosis, and cuproptosis. These insights collectively contribute to a deeper understanding of the dynamic landscape of drug discovery.
Synthetic biology (SynBio) is an emerging field of study with great potential in designing, engineering, and constructing new microbial synthetic cells that do not pre-exist in nature, or in re-engineering existing cells to accomplish industrial purposes. Systems biology seeks to understand biology at multiple dimensions, beginning with the molecular and cellular level and progressing to the tissue and organismal level, and characterizes cells as complex information-processing systems. SynBio, on the other hand, goes further and strives to develop and create its systems from scratch. SynBio is now applied to develop novel therapeutic drugs for the prevention of human diseases, to scale up industrial processes, and to accomplish previously unfeasible industrial outcomes. This is made possible through significant breakthroughs in DNA sequencing and synthesis technology, as well as insights gained from synthetic chemistry and systems biology. SynBio technologies have allowed for the introduction of improved and synthetic metabolic functionalities in microorganisms to enable the synthesis of a range of pharmacologically relevant compounds for pharmaceutical exploration. SynBio applications range from making industrial chemical synthesis processes more sustainable to the microbial synthesis of improved therapeutic modalities. Hence, this study highlights several innovations, promising opportunities, and future directions afforded by SynBio that point toward improved industrial microbial synthesis for pharmaceutical exploration.
Endometrial cancer is the most common gynecologic cancer diagnosed in the United States, and mortality is on the rise. Advanced and recurrent endometrial cancer represents a treatment challenge, as historically there have been limited therapeutic options for patients. In the last several years, multiple practice-changing clinical trials have led to significant improvements in the treatment landscape. This review will cover updates in the treatment and management of advanced and recurrent endometrial cancer with a focus on novel therapeutics, such as anti-PD-L1 and PD-1 inhibitors, poly ADP-ribose polymerase (PARP) inhibitors, antibody-drug conjugates, and hormonal therapy.
The learning algorithms of causal discovery mainly include score-based methods and genetic algorithms (GA). Score-based algorithms are prone to search-space explosion. Classical GA is slow to converge and prone to falling into local optima. To address these issues, an improved GA with domain knowledge (IGADK) is proposed. Firstly, domain knowledge is incorporated into the learning process of causality to construct a new fitness function. Secondly, a dynamic mutation operator is introduced in the algorithm to accelerate the convergence rate. Finally, an experiment is conducted on simulation data, which compares the classical GA against IGADK supplied with domain knowledge of varying accuracy. IGADK can greatly reduce the number of iterations, populations, and samples required for learning, which illustrates the efficiency and effectiveness of the proposed algorithm.
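The two ingredients named in this abstract, a knowledge-aware fitness function and a dynamic mutation operator, can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the additive bonus/penalty form of `knowledge_fitness`, the linear decay in `mutation_rate`, and all parameter names and defaults are hypothetical and are not the IGADK definitions.

```python
import random


def knowledge_fitness(edges, data_score, required, forbidden, weight=1.0):
    """Adjust a data-fit score by agreement with prior edge knowledge.

    Hypothetical form: +weight for each edge the domain expert requires,
    -weight for each edge the expert forbids. `data_score` is whatever
    score the data assign to the graph and is supplied by the caller.
    """
    bonus = len(edges & required)
    penalty = len(edges & forbidden)
    return data_score + weight * (bonus - penalty)


def mutation_rate(generation, max_generations, high=0.3, low=0.01):
    """One possible 'dynamic' schedule: decay linearly from high to low."""
    frac = min(1.0, generation / max(1, max_generations))
    return high - (high - low) * frac


def mutate(edges, candidate_edges, rate, rng):
    """Flip each candidate edge in or out of the graph with probability `rate`.

    Acyclicity is not checked here; a real causal-discovery GA would need
    to repair or reject cyclic graphs.
    """
    mutated = set(edges)
    for edge in candidate_edges:
        if rng.random() < rate:
            mutated ^= {edge}
    return mutated
```

A call such as `mutate(graph, all_pairs, mutation_rate(g, 200), random.Random(0))` would then explore broadly in early generations and make only small changes near the end, which is one common way to trade exploration for convergence speed.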
Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activities and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One class of deep molecular generative approaches preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, the existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes scaffold-based molecular structures difficult to modify and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. The dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE) is introduced to encode both the bonding and non-bonding information based on strong and weak atomic interactions. Additionally, a gate multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and accelerating drug discovery.
Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this elegant and exciting work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the groundbreaking discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Crystal structure prediction (CSP) is a foundational computational technique for determining the atomic arrangements of crystalline materials, especially under high-pressure conditions. While CSP plays a critical role in materials science, traditional approaches often encounter significant challenges related to computational efficiency and scalability, particularly when applied to complex systems. Recent advances in machine learning (ML) have shown tremendous promise in addressing these limitations, enabling the rapid and accurate prediction of crystal structures across a wide range of chemical compositions and external conditions. This review provides a concise overview of recent progress in ML-assisted CSP methodologies, with a particular focus on machine learning potentials and generative models. By critically analyzing these advances, we highlight the transformative impact of ML in accelerating materials discovery, enhancing computational efficiency, and broadening the applicability of CSP. Additionally, we discuss emerging opportunities and challenges in this rapidly evolving field.
Semi-supervised new intent discovery is a significant research focus in natural language understanding. To address the limitations of current semi-supervised training data and the underutilization of implicit information, a Semi-supervised New Intent Discovery for Elastic Neighborhood Syntactic Elimination and Fusion model (SNID-ENSEF) is proposed. Syntactic elimination contrast learning leverages verb-dominant syntactic features, systematically replacing specific words to enhance data diversity. The radius of the positive sample neighborhood is elastically adjusted to eliminate invalid samples and improve training efficiency. A neighborhood sample fusion strategy, based on sample distribution patterns, dynamically adjusts neighborhood size and fuses sample vectors to reduce noise and improve implicit information utilization and discovery accuracy. Experimental results show that SNID-ENSEF achieves average improvements of 0.88%, 1.27%, and 1.30% in Normalized Mutual Information (NMI), Accuracy (ACC), and Adjusted Rand Index (ARI), respectively, outperforming the PTJN, DPN, MTP-CLNN, and DWG models on the Banking77, StackOverflow, and Clinc150 datasets. The code is available at https://github.com/qsdesz/SNID-ENSEF, accessed on 16 January 2025.
Discovering floating waste, especially bottles on water, is a crucial research problem in environmental hygiene. Nevertheless, real-world applications often face challenges such as interference from irrelevant objects and the high cost associated with data collection. Consequently, devising algorithms capable of accurately localizing specific objects within a scene when annotated data is limited remains a formidable challenge. To solve this problem, this paper proposes an object-discovery-by-request problem setting and a corresponding algorithmic framework. The proposed problem setting aims to identify specified objects in scenes, and the associated algorithmic framework comprises pseudo-data generation and an object-discovery-by-request network. Pseudo-data generation produces images resembling natural scenes through various data augmentation rules, using a small number of object samples and scene images. The object-discovery-by-request network uses the pre-trained Vision Transformer (ViT) model as the backbone, employs object-centric methods to learn the latent representations of foreground objects, and applies patch-level reconstruction constraints to the model. During the validation phase, we use the generated pseudo datasets as training sets and evaluate the performance of our model on the original test sets. Experiments show that our method achieves state-of-the-art performance on the Unmanned Aerial Vehicles-Bottle Detection (UAV-BD) dataset and on the self-constructed Bottle dataset, especially in multi-object scenarios.
Tauopathies, diseases characterized by neuropathological aggregates of tau, including Alzheimer's disease and subtypes of frontotemporal dementia, make up the vast majority of dementia cases. Although there have been recent developments in tauopathy biomarkers and disease-modifying treatments, ongoing progress is required to ensure these are effective, economical, and accessible for the globally ageing population. As such, continued identification of new potential drug targets and biomarkers is critical. "Big data" studies, such as proteomics, can generate information on thousands of possible new targets for dementia diagnostics and therapeutics, but currently remain underutilized due to the lack of a clear process by which targets are selected for future drug development. In this review, we discuss current tauopathy biomarkers and therapeutics, and highlight areas in need of improvement, particularly when addressing the needs of frail, comorbid, and cognitively impaired populations. We highlight biomarkers which have been developed from proteomic data, and outline possible future directions in this field. We propose new criteria by which potential targets in proteomics studies can be objectively ranked as favorable for drug development, and demonstrate their application to our group's recent tau interactome dataset as an example.
Unmanned and aerial systems, acting as intermediaries for communication among different system components, have opened up great opportunities for truth data discovery in Mobile Crowd Sensing (MCS), a problem that has not been properly solved in the literature. In this paper, an Unmanned Aerial Vehicles-supported Intelligent Truth Discovery (UAV-ITD) scheme is proposed to obtain truth data at low communication cost for MCS. The main innovations of the UAV-ITD scheme are as follows: (1) The UAV-ITD scheme takes the first step in employing UAVs jointly with Deep Matrix Factorization (DMF) to discover truth data based on the trust mechanism for an Information Elicitation Without Verification (IEWV) problem in MCS. (2) This paper introduces, for the first time, a truth data discovery scheme that only needs to collect a part of n data samples to infer the data of the entire network with high accuracy, which saves more communication cost than most previous data collection schemes, which collect n or kn data samples. Finally, we conducted extensive experiments to evaluate the UAV-ITD scheme. The results show that, compared with previous schemes, our scheme can reduce estimated truth error by 52.25%–96.09%, increase the accuracy of workers' trust evaluation by 0.68–61.82 times, and save recruitment costs by 24.08%–54.15% in truth data discovery.
In unmanned aerial vehicle (UAV) networks, the high mobility of nodes leads to frequent changes in network topology, which brings challenges to neighbor discovery (ND) for UAV networks. Integrated sensing and communication (ISAC), as an emerging technology in 6G mobile networks, has shown great potential in improving communication performance with the assistance of sensing information. ISAC obtains prior information about node distribution, reducing the ND time. However, the prior information obtained through ISAC may be imperfect. Hence, an ND algorithm based on reinforcement learning is proposed. The learning automaton (LA) is applied to interact with the environment and continuously adjust the probability of selecting beams to accelerate the convergence speed of ND algorithms. Besides, an efficient ND algorithm in the neighbor maintenance phase is designed, which applies the Kalman filter to predict node movement. Simulation results show that the LA-based ND algorithm reduces the ND time by up to 32% compared with the Scan-Based Algorithm (SBA), which proves the efficiency of the proposed ND algorithms.
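The idea of a learning automaton that keeps adjusting beam-selection probabilities can be sketched as follows. The abstract does not say which update rule the authors use; this example assumes the classical linear reward-inaction scheme, and the function names, the learning rate `lr`, and the uniform starting distribution are all illustrative assumptions rather than the paper's algorithm.

```python
import random


def choose_beam(probs, rng):
    """Sample a beam index according to the current selection probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]


def update_beam_probs(probs, chosen, found_neighbor, lr=0.1):
    """Linear reward-inaction update (assumed rule, not taken from the paper).

    On success, probability mass moves toward the chosen beam; on failure,
    the distribution is left unchanged. The result still sums to 1.
    """
    if not found_neighbor:
        return list(probs)
    return [
        p + lr * (1.0 - p) if i == chosen else p * (1.0 - lr)
        for i, p in enumerate(probs)
    ]
```

Starting from a uniform distribution (or from an ISAC-derived prior, if one is available), repeated calls to `choose_beam` followed by `update_beam_probs` concentrate probability on beams that have actually produced neighbor discoveries, so imperfect prior information is gradually corrected by feedback.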
The latest review published in Nature Reviews Drug Discovery by Michael W. Mullowney and co-authors focuses on the use of artificial intelligence techniques, specifically machine learning, in natural product drug discovery. The authors discuss various applications of AI in this field, such as genome and metabolome mining, structural characterization of natural products, and predicting targets and biological activities of these compounds. They also highlight the challenges associated with creating and managing large datasets for training algorithms, as well as strategies to address these obstacles. Additionally, the authors examine common pitfalls in algorithm training and offer suggestions for avoiding them.
The continued expansion of the world population, an increasingly inconsistent climate, and shrinking agricultural resources present major challenges to crop breeding. Fortunately, the increasing ability to discover and manipulate genes creates new opportunities to develop more productive and resilient cultivars. Many genes have been described in papers as being beneficial for yield increase. However, few of them have been translated into increased yield on farms. Meanwhile, commercial breeders are facing "gene decidophobia", i.e., they are puzzled about which gene to choose for breeding among the many identified, which reflects a huge chasm between gene discovery and cultivar innovation. The purpose of this paper is to draw attention to the shortfalls in current gene discovery research and to emphasise the need to align with cultivar innovation. The methodology requires that genetic studies not only focus on gene discovery but also pay close attention to genetic backgrounds, experimental validation in relevant environments, appropriate crop management, and data reusability. Closing these gaps should accelerate the application of molecular studies in breeding and contribute to future global food security.
Marine natural products (MNPs) are valuable resources for drug development. To date, 17 drugs from marine sources are in clinical use, and 33 pharmaceutical compounds are in clinical trials. Presently, the success rate of drug development from marine resources is higher than the industry average. Discovering drug lead compounds based on marine chemical ecology, by fully exploiting the pharmacological potential of marine chemical defense substances, is a feasible strategy. In the search for bioactive MNPs, our group has constructed a biological resources library including more than 1500 strains of fungi. Focusing on the strategy of the Blue Drug Library, we have discovered a series of novel MNPs with abundant biological functions. Highly efficient and scalable total syntheses of (+)-aniduquinolone A (44) and pesimquinolone I (48) have been completed, which will facilitate access to sufficient quantities of candidates for in vivo pharmacological and toxicological studies. As a nucleoprotein (NP) inhibitor, QLA (75) possesses significant anti-influenza A virus (IAV) activities both in vitro and in vivo. CHNQD-00803 (76) is a potent and selective AMP-activated kinase (AMPK) activator that can effectively inhibit metabolic disorders and metabolic dysfunction-associated steatohepatitis (MASH) progression. Moreover, we identified two new candidate molecules with potent anti-hepatocellular carcinoma effects. In particular, as a prodrug of a natural inhibitor of guanine-nucleotide exchange factors for ADP-ribosylation factor GTPases (Arf-GEFs), CHNQD-01255 (78) is qualified to be developed as a targeted candidate anticancer drug and may hold promise for cancer immunotherapy. Hence, it is evident that MNPs play an important role in drug development.
In this paper, we propose a Multi-token Sector Antenna Neighbor Discovery (M-SAND) protocol to enhance the efficiency of neighbor discovery in asynchronous directional ad hoc networks. The central concept of our work involves maintaining multiple tokens across the network. To prevent mutual interference among multi-token holders, we introduce the time and space non-interference theorems. Furthermore, we propose a master-slave strategy between tokens. When the master token holder (MTH) performs the neighbor discovery, it decides which 1-hop neighbor is the next MTH and which 2-hop neighbors can be the new slave token holders (STHs). Using this approach, the MTH and multiple STHs can simultaneously discover their neighbors without causing interference with each other. Building on this foundation, we provide a comprehensive procedure for the M-SAND protocol. We also conduct theoretical analyses on the maximum number of STHs and the lower bound of multi-token generation probability. Finally, simulation results demonstrate the time efficiency of the M-SAND protocol. When compared to the QSAND protocol, which uses only one token, the total neighbor discovery time is reduced by 28% when 6 beams and 112 nodes are employed.
Funding (review of transformer models in drug discovery): Supported in part by the National Institutes of Health (NIH), USA (Grant Nos. R01GM126189, R01AI164266, and R35GM148196); the National Science Foundation, USA (Grant Nos. DMS2052983, DMS-1761320, and IIS-1900473); the National Aeronautics and Space Administration (NASA), USA (Grant No. 80NSSC21M0023); the Michigan State University (MSU) Foundation, USA; Bristol-Myers Squibb (Grant No. 65109), USA; and Pfizer, USA. Also supported by the National Natural Science Foundation of China (Grant Nos. 11971367, 12271416, and 11972266).
Funding (review of novel strategies in drug discovery): Supported by grants from the National Natural Science Foundation of China (No. 82273770) and the Foundation for Innovative Research Groups of the National Natural Science Foundation of Sichuan Province (No. 24NSFTD0051).
文摘In the realm of drug discovery,recent advancements have paved the way for innovative approaches and methodologies.This comprehensive review encapsulates six distinct yet interrelated mini-reviews,each shedding light on novel strategies in drug development.(a)The resurgence of covalent drugs is highlighted,focusing on the targeted covalent inhibitors(TCIs)and their role in enhancing selectivity and affinity.(b)The potential of the quantum mechanics-based computational aid drug design(CADD)tool,Cov_DOX,is introduced for predicting protein-covalent ligand binding structures and affinities.(c)The scaffolding function of proteins is proposed as a new avenue for drug design,with a focus on modulating protein-protein interactions through small molecules and proteolysis targeting chimeras(PROTACs).(d)The concept of pro-PROTACs is explored as a promising strategy for cancer therapy,combining the principles of prodrugs and PROTACs to enhance specificity and reduce toxicity.(e)The design of prodrugs through carbon-carbon bond cleavage is discussed,offering a new perspective for the activation of drugs with limited modifiable functional groups.(f)The targeting of programmed cell death pathways in cancer therapies with small molecules is reviewed,emphasizing the induction of autophagy-dependent cell death,ferroptosis,and cuproptosis.These insights collectively contribute to a deeper understanding of the dynamic landscape of drug discovery.
Abstract: Synthetic biology (SynBio) is an emerging field of study with great potential in designing, engineering, and constructing new microbial synthetic cells that do not pre-exist in nature, or re-engineering existing cells to accomplish industrial purposes. Systems biology seeks to understand biology at multiple dimensions, beginning with the molecular and cellular level and progressing to the tissue and organismal level, and characterizes cells as complex information-processing systems. SynBio, on the other hand, goes further and strives to develop and create its systems from scratch. SynBio is now applied to develop novel therapeutic drugs for the prevention of human diseases, scale up industrial processes, and accomplish previously unfeasible industrial outcomes. This is made possible through significant breakthroughs in DNA sequencing and synthesis technology, as well as insights gained from synthetic chemistry and systems biology. SynBio technologies have allowed for the introduction of improved and synthetic metabolic functionalities in microorganisms to enable the synthesis of a range of pharmacologically relevant compounds for pharmaceutical exploration. SynBio applications range from finding new ways to make industrial chemical synthesis processes more sustainable to the microbial synthesis of improved therapeutic modalities. Hence, this study underscores several innovations, promising potentials, and future directions afforded by SynBio that point toward improved industrial microbial synthesis for pharmaceutical exploration.
Abstract: Endometrial cancer is the most common gynecologic cancer diagnosed in the United States, and mortality is on the rise. Advanced and recurrent endometrial cancer represents a treatment challenge, as historically there have been limited therapeutic options for patients. In the last several years, multiple practice-changing clinical trials have led to significant improvements in the treatment landscape. This review will cover updates in the treatment and management of advanced and recurrent endometrial cancer with a focus on novel therapeutics, such as anti-PD-L1 and PD-1 inhibitors, poly ADP-ribose polymerase (PARP) inhibitors, antibody-drug conjugates, and hormonal therapy.
Funding: Supported by the National Social Science Fund of China (2022-SKJJ-B-084).
Abstract: The learning algorithms of causal discovery mainly include score-based methods and genetic algorithms (GA). Score-based algorithms are prone to search-space explosion. Classical GA is slow to converge and prone to falling into local optima. To address these issues, an improved GA with domain knowledge (IGADK) is proposed. First, domain knowledge is incorporated into the learning process of causality to construct a new fitness function. Second, a dynamical mutation operator is introduced in the algorithm to accelerate the convergence rate. Finally, an experiment is conducted on simulation data, comparing the classical GA with IGADK under domain knowledge of varying accuracy. IGADK can greatly reduce the number of iterations, populations, and samples required for learning, which illustrates the efficiency and effectiveness of the proposed algorithm.
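The two ingredients named in the abstract above (a fitness function that incorporates domain knowledge, and a mutation operator that changes during the run) can be illustrated with a minimal Python sketch. This is not the authors' IGADK implementation: it assumes that domain knowledge is supplied as sets of required and forbidden directed edges, that a data-driven score is computed elsewhere, and that the mutation rate decays linearly over generations. The names `knowledge_fitness`, `mutation_rate`, `weight`, `high`, and `low` are hypothetical.

```python
def knowledge_fitness(edges, data_score, required, forbidden, weight=1.0):
    """Combine a data-driven score with agreement to prior knowledge.

    Assumption: each edge is a (cause, effect) tuple, `required` lists edges
    an expert believes exist, and `forbidden` lists edges believed absent.
    """
    edges = set(edges)
    # +1 for every required edge present, -1 for every forbidden edge present.
    agreement = len(edges & set(required)) - len(edges & set(forbidden))
    return data_score + weight * agreement


def mutation_rate(generation, max_generations, high=0.2, low=0.01):
    """Hypothetical schedule: explore early (high rate), refine late (low rate)."""
    if max_generations <= 0:
        return low
    # Clamp progress to [0, 1] so out-of-range generations stay well defined.
    frac = min(max(generation / max_generations, 0.0), 1.0)
    return high - (high - low) * frac


if __name__ == "__main__":
    candidate = {("A", "B"), ("C", "A")}
    score = knowledge_fitness(
        candidate,
        data_score=-10.0,
        required={("A", "B")},
        forbidden={("C", "A")},
        weight=2.0,
    )
    print(score, mutation_rate(10, 100))
```

In this sketch, a candidate graph that matches the expert's required edges and avoids forbidden ones is ranked above an equally well-fitting graph that contradicts them, which is one plausible way inaccurate or partial domain knowledge could be weighted through `weight`.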
Funding: Supported by the National Key R&D Program of China (Grant No. 2023YFF1205102), the National Natural Science Foundation of China (Grant Nos. 82273856, 22077143, and 21977127), and the Science Foundation of Guangzhou, China (Grant No. 2024A04J2172).
Abstract: Structural optimization of lead compounds is a crucial step in drug discovery. One optimization strategy is to modify the molecular structure of a scaffold to improve both its biological activities and its absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. One class of deep molecular generative models preserves the scaffold while generating drug-like molecules, thereby accelerating the molecular optimization process. Deep molecular diffusion generative models simulate a gradual process that creates novel, chemically feasible molecules from noise. However, existing models lack direct interatomic constraint features and struggle to capture long-range dependencies in macromolecules, which makes it difficult to modify scaffold-based molecular structures and limits the stability and diversity of the generated molecules. To address these challenges, we propose a deep molecular diffusion generative model, the three-dimensional (3D) equivariant diffusion-driven molecular generation (3D-EDiffMG) model. The dual strong and weak atomic interaction force-based long-range dependency capturing equivariant encoder (dual-SWLEE) is introduced to encode both bonding and non-bonding information based on strong and weak atomic interactions. Additionally, a gated multilayer perceptron (gMLP) block with tiny attention is incorporated to explicitly model complex long-sequence feature interactions and long-range dependencies. The experimental results show that 3D-EDiffMG effectively generates unique, novel, stable, and diverse drug-like molecules, highlighting its potential for lead optimization and accelerating drug discovery.
Funding: National Natural Science Foundation of China (32471265).
Abstract: Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this elegant and exciting work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the groundbreaking discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFA1402304), the National Natural Science Foundation of China (Grant Nos. 12034009, 12374005, 52288102, 52090024, and T2225013), the Fundamental Research Funds for the Central Universities, and the Program for JLU Science and Technology Innovative Research Team.
Abstract: Crystal structure prediction (CSP) is a foundational computational technique for determining the atomic arrangements of crystalline materials, especially under high-pressure conditions. While CSP plays a critical role in materials science, traditional approaches often encounter significant challenges related to computational efficiency and scalability, particularly when applied to complex systems. Recent advances in machine learning (ML) have shown tremendous promise in addressing these limitations, enabling the rapid and accurate prediction of crystal structures across a wide range of chemical compositions and external conditions. This review provides a concise overview of recent progress in ML-assisted CSP methodologies, with a particular focus on machine learning potentials and generative models. By critically analyzing these advances, we highlight the transformative impact of ML in accelerating materials discovery, enhancing computational efficiency, and broadening the applicability of CSP. Additionally, we discuss emerging opportunities and challenges in this rapidly evolving field.
Funding: Supported by Research Projects of the Natural Science Foundation of Hebei Province (F2021402005).
Abstract: Semi-supervised new intent discovery is a significant research focus in natural language understanding. To address the limitations of current semi-supervised training data and the underutilization of implicit information, a Semi-supervised New Intent Discovery for Elastic Neighborhood Syntactic Elimination and Fusion model (SNID-ENSEF) is proposed. Syntactic elimination contrastive learning leverages verb-dominant syntactic features, systematically replacing specific words to enhance data diversity. The radius of the positive sample neighborhood is elastically adjusted to eliminate invalid samples and improve training efficiency. A neighborhood sample fusion strategy, based on sample distribution patterns, dynamically adjusts neighborhood size and fuses sample vectors to reduce noise and improve implicit information utilization and discovery accuracy. Experimental results show that SNID-ENSEF achieves average improvements of 0.88%, 1.27%, and 1.30% in Normalized Mutual Information (NMI), Accuracy (ACC), and Adjusted Rand Index (ARI), respectively, outperforming the PTJN, DPN, MTP-CLNN, and DWG models on the Banking77, StackOverflow, and Clinc150 datasets. The code is available at https://github.com/qsdesz/SNID-ENSEF (accessed on 16 January 2025).
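For readers unfamiliar with the clustering metrics cited in the abstract above, the following is a minimal pure-Python sketch of the Adjusted Rand Index under its standard pair-counting definition. The abstract does not say how the authors computed their metrics, so this is only an illustration of the quantity, and the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions trivially agree

    # Pairs of samples grouped together in both, in truth, and in prediction.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values()
    )
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs  # chance agreement
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every sample in its own cluster
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    # Cluster IDs are arbitrary, so a relabeled but identical partition scores 1.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because the index is corrected for chance, it is invariant to how discovered intent clusters are numbered, which is why it suits settings where new intents have no predefined label IDs.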
Abstract: Discovering floating waste, especially bottles on water, is a crucial research problem in environmental hygiene. Nevertheless, real-world applications often face challenges such as interference from irrelevant objects and the high cost associated with data collection. Consequently, devising algorithms capable of accurately localizing specific objects within a scene when annotated data is limited remains a formidable challenge. To solve this problem, this paper proposes an object discovery by request problem setting and a corresponding algorithmic framework. The proposed problem setting aims to identify specified objects in scenes, and the associated algorithmic framework comprises pseudo-data generation and an object discovery by request network. Pseudo-data generation produces images resembling natural scenes through various data augmentation rules, using a small number of object samples and scene images. The object discovery by request network uses the pre-trained Vision Transformer (ViT) model as the backbone, employs object-centric methods to learn the latent representations of foreground objects, and applies patch-level reconstruction constraints to the model. During the validation phase, we use the generated pseudo datasets as training sets and evaluate the performance of our model on the original test sets. Experiments show that our method achieves state-of-the-art performance on the Unmanned Aerial Vehicles-Bottle Detection (UAV-BD) dataset and the self-constructed Bottle dataset, especially in multi-object scenarios.
Funding: Supported by funding from the Bluesand Foundation, Alzheimer's Association (AARG-21-852072 and Blas Frangione Early Career Achievement Award) to ED; an Australian Government Research Training Program scholarship and the University of Sydney's Brain and Mind Centre fellowship to AH.
Abstract: Tauopathies, diseases characterized by neuropathological aggregates of tau, including Alzheimer's disease and subtypes of frontotemporal dementia, make up the vast majority of dementia cases. Although there have been recent developments in tauopathy biomarkers and disease-modifying treatments, ongoing progress is required to ensure these are effective, economical, and accessible for the globally ageing population. As such, continued identification of new potential drug targets and biomarkers is critical. "Big data" studies, such as proteomics, can generate information on thousands of possible new targets for dementia diagnostics and therapeutics, but currently remain underutilized due to the lack of a clear process by which targets are selected for future drug development. In this review, we discuss current tauopathy biomarkers and therapeutics, and highlight areas in need of improvement, particularly when addressing the needs of frail, comorbid, and cognitively impaired populations. We highlight biomarkers that have been developed from proteomic data and outline possible future directions in this field. We propose new criteria by which potential targets in proteomics studies can be objectively ranked as favorable for drug development, and demonstrate their application to our group's recent tau interactome dataset as an example.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62072475.
Abstract: Unmanned and aerial systems, acting as interactors among different system components for communications, have opened up great opportunities for truth data discovery in Mobile Crowd Sensing (MCS), a problem that has not been properly solved in the literature. In this paper, an Unmanned Aerial Vehicles-supported Intelligent Truth Discovery (UAV-ITD) scheme is proposed to obtain truth data at low communication cost for MCS. The main innovations of the UAV-ITD scheme are as follows: (1) The UAV-ITD scheme takes the first step in employing UAVs jointly with Deep Matrix Factorization (DMF) to discover truth data based on the trust mechanism for an Information Elicitation Without Verification (IEWV) problem in MCS. (2) This paper introduces, for the first time, a truth data discovery scheme that only needs to collect a part of n data samples to infer the data of the entire network with high accuracy, which saves more communication cost than most previous data collection schemes, which collect n or kn data samples. Finally, we conducted extensive experiments to evaluate the UAV-ITD scheme. The results show that, compared with previous schemes, our scheme can reduce estimated truth error by 52.25%–96.09%, increase the accuracy of workers' trust evaluation by 0.68–61.82 times, and save recruitment costs by 24.08%–54.15% in truth data discovery.
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2024ZCJH01, in part by the National Natural Science Foundation of China (NSFC) under Grant No. 62271081, and in part by the National Key Research and Development Program of China under Grant No. 2020YFA0711302.
Abstract: In unmanned aerial vehicle (UAV) networks, the high mobility of nodes leads to frequent changes in network topology, which brings challenges to neighbor discovery (ND) for UAV networks. Integrated sensing and communication (ISAC), as an emerging technology in 6G mobile networks, has shown great potential in improving communication performance with the assistance of sensing information. ISAC obtains prior information about node distribution, reducing the ND time. However, the prior information obtained through ISAC may be imperfect. Hence, an ND algorithm based on reinforcement learning is proposed. The learning automaton (LA) is applied to interact with the environment and continuously adjust the probability of selecting beams to accelerate the convergence of ND algorithms. In addition, an efficient ND algorithm for the neighbor maintenance phase is designed, which applies the Kalman filter to predict node movement. Simulation results show that the LA-based ND algorithm reduces the ND time by up to 32% compared with the Scan-Based Algorithm (SBA), which demonstrates the efficiency of the proposed ND algorithms.
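The idea of a learning automaton that keeps adjusting beam-selection probabilities can be sketched in a few lines of Python. The abstract above does not state which update rule the authors use, so this sketch assumes the classical linear reward-inaction scheme; the names `select_beam`, `update_probs`, `found_neighbor`, and `lr` are hypothetical, and the toy environment in the usage section (a single neighbor in beam 2) is invented for illustration.

```python
import random


def select_beam(probs, rng=random):
    """Sample a beam index according to the current selection probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update_probs(probs, chosen, found_neighbor, lr=0.1):
    """Linear reward-inaction update (an assumed rule, not the paper's).

    On success the chosen beam gains probability and all others shrink
    proportionally; on failure the distribution is left unchanged.
    """
    if not found_neighbor:
        return list(probs)
    return [
        p + lr * (1.0 - p) if i == chosen else p * (1.0 - lr)
        for i, p in enumerate(probs)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.25] * 4           # uniform prior over four beams
    neighbor_beam = 2            # toy environment: one neighbor, in beam 2
    for _ in range(200):
        beam = select_beam(probs, rng)
        probs = update_probs(probs, beam, found_neighbor=(beam == neighbor_beam))
    print([round(p, 3) for p in probs])
```

The update preserves a valid probability distribution at every step, and an imperfect prior (for example, one obtained from sensing) could be used as the initial `probs` instead of the uniform vector assumed here.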
Funding: Supported in part by the National Key Research and Development Program of China (2021YFD1700100, 2023YFD1700500), the National Natural Science Foundation of China (22177051), the Fundamental Research Funds for the Central Universities (KYCYXT2022010), and the Sichuan Key Research and Development Program (22ZDYF0186, 2021YFN0134).
Abstract: The latest review published in Nature Reviews Drug Discovery by Michael W. Mullowney and co-authors focuses on the use of artificial intelligence techniques, specifically machine learning, in natural product drug discovery. The authors discuss various applications of AI in this field, such as genome and metabolome mining, structural characterization of natural products, and prediction of the targets and biological activities of these compounds. They also highlight the challenges associated with creating and managing large datasets for training algorithms, as well as strategies to address these obstacles. Additionally, the authors examine common pitfalls in algorithm training and offer suggestions for avoiding them.
Funding: Supported by the Sichuan Province Science & Technology Department Crops Breeding Project (2021YFYZ0002).
Abstract: The continued expansion of the world population, an increasingly inconsistent climate, and shrinking agricultural resources present major challenges to crop breeding. Fortunately, the increasing ability to discover and manipulate genes creates new opportunities to develop more productive and resilient cultivars. Many genes have been described in papers as being beneficial for yield increase. However, few of them have been translated into increased yield on farms. Meanwhile, commercial breeders face gene decidophobia, i.e., uncertainty about which gene to choose for breeding among the many identified, which reflects a huge chasm between gene discovery and cultivar innovation. The purpose of this paper is to draw attention to the shortfalls in current gene discovery research and to emphasise the need to align it with cultivar innovation. The methodology dictates that genetic studies not only focus on gene discovery but also pay close attention to genetic backgrounds, experimental validation in relevant environments, appropriate crop management, and data reusability. Closing these gaps should accelerate the application of molecular studies in breeding and contribute to future global food security.
Funding: Supported by the Shandong Province Special Fund ‘Frontier Technology and Free Exploration’ from Laoshan Laboratory (No. 8-01), the National Natural Science Foundation of China (No. 42376116), the Special Funds of Shandong Province for Qingdao National Laboratory of Marine Science and Technology (No. 2022QNLM030003), the State Key Laboratory for Chemistry and Molecular Engineering of Medicinal Resources, Guangxi Normal University (No. CMEMR2023-B16), the National Key Research and Development Program of China (No. 2022YFC2601305), the Innovation Center for Academicians of Hainan Province, and the Fundamental Research Funds for the Central Universities (No. 202461059).
Abstract: Marine natural products (MNPs) are valuable resources for drug development. To date, 17 drugs from marine sources are in clinical use, and 33 pharmaceutical compounds are in clinical trials. At present, the success rate of drug development from marine resources is higher than the industry average. A feasible strategy is to discover drug lead compounds based on marine chemical ecology by fully exploiting the pharmacological potential of marine chemical defense substances. In the search for bioactive MNPs, our group has constructed a biological resources library including more than 1500 strains of fungi. Focusing on the strategy of the Blue Drug Library, we have discovered a series of novel MNPs with abundant biological functions. Highly efficient and scalable total syntheses of (+)-aniduquinolone A (44) and pesimquinolone I (48) have been completed, which will facilitate access to sufficient quantities of candidates for in vivo pharmacological and toxicological studies. As a nucleoprotein (NP) inhibitor, QLA (75) possesses significant anti-influenza A virus (IAV) activities both in vitro and in vivo. CHNQD-00803 (76) is a potent and selective AMP-activated kinase (AMPK) activator that can effectively inhibit metabolic disorders and metabolic dysfunction-associated steatohepatitis (MASH) progression. Moreover, we identified two new candidate molecules with potent anti-hepatocellular carcinoma effects. In particular, as a natural inhibitor prodrug of guanine-nucleotide exchange factors for ADP-ribosylation factor GTPases (Arf-GEFs), CHNQD-01255 (78) is qualified to be developed as a targeted candidate anticancer drug and may hold promise for cancer immunotherapy. Hence, it is evident that MNPs play an important role in drug development.
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 61771392, 61771390, 61871322, and 61501373) and by the Science and Technology on Avionics Integration Laboratory and the Aeronautical Science Foundation of China (Grant Nos. 201955053002 and 20185553035).
Abstract: In this paper, we propose a Multi-token Sector Antenna Neighbor Discovery (M-SAND) protocol to enhance the efficiency of neighbor discovery in asynchronous directional ad hoc networks. The central concept of our work involves maintaining multiple tokens across the network. To prevent mutual interference among multi-token holders, we introduce the time and space non-interference theorems. Furthermore, we propose a master-slave strategy between tokens. When the master token holder (MTH) performs neighbor discovery, it decides which 1-hop neighbor is the next MTH and which 2-hop neighbors can be the new slave token holders (STHs). Using this approach, the MTH and multiple STHs can simultaneously discover their neighbors without interfering with each other. Building on this foundation, we provide a comprehensive procedure for the M-SAND protocol. We also conduct theoretical analyses of the maximum number of STHs and the lower bound of the multi-token generation probability. Finally, simulation results demonstrate the time efficiency of the M-SAND protocol. Compared with the QSAND protocol, which uses only one token, the total neighbor discovery time is reduced by 28% when 6 beams and 112 nodes are employed.