Funding: The National Natural Science Foundation of China, Grant/Award Numbers: 62372204, 62072206, 62102158, 61772381; the Fundamental Research Funds for the Central Universities, Grant/Award Numbers: 2662022JC004, 2662021JC008.
Abstract: Drug discovery aims to design novel molecules with specific chemical properties for the treatment of targeted diseases. Molecular optimization, which improves the physical and chemical properties of a molecule, is an important step in drug discovery. Artificial intelligence techniques have recently shown considerable success in drug discovery, emerging as a new strategy for addressing the challenges of drug design, including molecular optimization, and for drastically reducing the cost and time of drug discovery. We review the latest advances in molecular optimization for artificial intelligence-based drug discovery, covering data resources, molecular properties, optimization methodologies, and assessment criteria. Specifically, we classify the optimization methodologies into molecular mapping-based, molecular distribution matching-based, and guided search-based methods, and discuss the principles of these methods as well as their pros and cons. Moreover, we highlight the current challenges in molecular optimization and offer perspectives, including interpretability, multidimensional optimization, and model generalization, on potential lines of future research. This comprehensive review points out both the challenges and the new prospects, and is intended to guide researchers interested in artificial intelligence-based molecular optimization.
Funding: Supported by the National Key Research and Development Program of China (Grant Nos.: 2022YFC3400504 and 2023YFC2305904); the Strategic Priority Research Program of the Chinese Academy of Sciences, China (Grant Nos.: XDB0830203 and XDB0830200); the National Natural Science Foundation of China (Grant Nos.: 82204278, 31960198, T2225002, and 82273855); the SIMM-SHUTCM Traditional Chinese Medicine Innovation Joint Research Program, China (Grant No.: E2G805H); the Shanghai Municipal Science and Technology Major Project, China; and the Key Technologies R&D Program of Guangdong Province, China (Grant No.: 2023B1111030004).
Abstract: Kirsten rat sarcoma viral oncogene homolog (KRAS) protein inhibitors are a promising class of therapeutics, but research on molecules that effectively penetrate the blood-brain barrier (BBB), which is crucial for treating central nervous system (CNS) malignancies, remains limited. Although molecular generation models have recently advanced drug discovery, they often overlook the complexity of biological and chemical factors, leaving room for improvement. In this study, we present a structure-constrained molecular generation workflow designed to optimize lead compounds for both drug efficacy and drug absorption properties. Our approach uses a variational autoencoder (VAE) generative model integrated with reinforcement learning for multi-objective optimization. The method specifically aims to enhance BBB permeability (BBBp) while maintaining the high-affinity substructures of KRAS inhibitors. To support this, we incorporate a specialized KRAS BBB predictor based on active learning and an affinity predictor employing comparative learning models. Additionally, we introduce two novel metrics, the knowledge-integrated reproduction score (KIRS) and the composite diversity score (CDS), to assess structural performance and biological relevance. Retrospective validation with the KRAS inhibitors AMG510 and MRTX849 demonstrates the framework's effectiveness in optimizing BBBp and highlights its potential for real-world drug development applications. This study provides a robust framework for accelerating the structural enhancement of lead compounds, advancing the drug development process across diverse targets.
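The abstract does not specify how the two objectives are combined in the reinforcement-learning reward. As a minimal sketch only, one common choice is a weighted sum of the two predictor outputs, zeroed when the required substructure is lost. The function name, the weights, and the assumption that both scores are already scaled to [0, 1] are hypothetical and not taken from the paper.

```python
def multi_objective_reward(bbbp_prob, affinity_score, keeps_substructure,
                           w_bbbp=0.5, w_affinity=0.5):
    """Scalarized reward for one generated molecule (illustrative only).

    bbbp_prob          -- predicted BBB permeability probability in [0, 1]
    affinity_score     -- predicted affinity, assumed pre-scaled to [0, 1]
    keeps_substructure -- True if the high-affinity substructure is retained
    w_bbbp, w_affinity -- hypothetical objective weights
    """
    # Structure constraint: molecules that lose the substructure get no reward.
    if not keeps_substructure:
        return 0.0
    # Weighted sum of the two objectives.
    return w_bbbp * bbbp_prob + w_affinity * affinity_score


if __name__ == "__main__":
    print(multi_objective_reward(0.8, 0.6, True))
    print(multi_objective_reward(0.9, 0.9, False))
```

A hard gate on the substructure is the simplest way to express a structure constraint; a soft penalty would be an equally plausible alternative.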
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62272288, 61972451, and U22A2041) and the Shenzhen Key Laboratory of Intelligent Bioinformatics (No. ZDSYS20220422103800001).
Abstract: Generating novel molecules that satisfy specific properties is a challenging task in modern drug discovery, as it requires optimizing a specific objective while obeying chemical rules. Herein, we aim to optimize a given source molecule so that the generated molecule satisfies specified properties. Matched Molecular Pairs (MMPs), each containing a source and a target molecule, are used, and logD and solubility are selected as the properties to optimize. The main innovative work lies in the calculation related to a specific transformer, examined from the perspective of matrix dimensions. Threshold intervals and state changes are then used to encode logD and solubility for subsequent tests. During the experiments, we screen the data based on the proportion of heavy atoms to all atoms in the groups and select 12365, 1503, and 1570 MMPs as the training, validation, and test sets, respectively. Transformer models are compared with baseline models with respect to their ability to generate molecules with specific properties. Results show that the transformer model can accurately optimize the source molecules to satisfy specific properties.
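To make the encoding step concrete, the sketch below shows one way a logD change could be mapped to a threshold-interval token and a solubility change to a state-change token, with both tokens prepended to the source molecule. The interval edges, the solubility threshold, the token spellings, and the character-level SMILES split are all illustrative assumptions, not values or conventions from the paper.

```python
from bisect import bisect_left

# Hypothetical interval edges for the logD change (target minus source).
LOGD_EDGES = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Hypothetical cut-off separating "low" from "high" solubility.
SOL_THRESHOLD = 50.0


def encode_logd_change(delta):
    """Return the token of the half-open interval (lo, hi] containing delta."""
    i = bisect_left(LOGD_EDGES, delta)
    lo = "-inf" if i == 0 else f"{LOGD_EDGES[i - 1]:.1f}"
    if i == len(LOGD_EDGES):
        return f"LogD_({lo},inf)"
    return f"LogD_({lo},{LOGD_EDGES[i]:.1f}]"


def solubility_state(value):
    """Classify a solubility value as 'low' or 'high'."""
    return "high" if value > SOL_THRESHOLD else "low"


def encode_solubility_change(src, tgt):
    """Return a state-change token such as 'Solubility_low->high'."""
    s, t = solubility_state(src), solubility_state(tgt)
    return "Solubility_no_change" if s == t else f"Solubility_{s}->{t}"


def build_source_sequence(smiles, src_logd, tgt_logd, src_sol, tgt_sol):
    """Prepend the two property tokens to the source SMILES tokens."""
    tokens = [
        encode_logd_change(tgt_logd - src_logd),
        encode_solubility_change(src_sol, tgt_sol),
    ]
    # Naive character-level split; multi-character atoms such as Cl or Br
    # would need a proper SMILES tokenizer.
    return tokens + list(smiles)


if __name__ == "__main__":
    print(build_source_sequence("CCO", 1.0, 1.25, 20.0, 80.0))
```

Discretizing the desired change into a small token vocabulary lets a sequence model condition on it like any other input token, which is the motivation for this kind of encoding.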
Funding: Supported by the National Natural Science Foundation of China (22209006 and 21935001); the Natural Science Foundation of Shandong Province (ZR2022QE009); the Fundamental Research Funds for the Central Universities (buctrc202307); and the National Key Beijing Natural Science Foundation (Z210016).