Funding: Supported by the Shenzhen Key Laboratory of Intelligent Bioinformatics (No. ZDSYS20220422103800001), the Shenzhen Science and Technology Program (No. JCYJ20230807140709020), the National Natural Science Foundation of China (Nos. 62402489, U22A2041, and 62373172), the China Postdoctoral Science Foundation (No. 2023M743688), and the Guangdong Basic and Applied Basic Research Foundation (Nos. 2024A1515011960 and 2023A1515110570).
Abstract: Artificial intelligence (AI) researchers and cheminformatics specialists strive to identify effective drug precursors while reducing costs and accelerating development. Digital molecular representation plays a crucial role in this effort by making molecules machine-readable, thereby improving the accuracy of molecular prediction tasks and supporting evidence-based decision-making. This study presents a comprehensive review of small-molecule representations and the AI-driven downstream drug discovery tasks that use them. The review begins by compiling small-molecule databases, then analyzes fundamental molecular representations and the models that learn representations from these initial forms, capturing patterns and salient features across extensive chemical spaces. It then examines drug discovery downstream tasks, including drug-target interaction (DTI) prediction, drug-target affinity (DTA) prediction, drug property (DP) prediction, and drug generation, all based on learned representations. The analysis concludes by highlighting the challenges and opportunities of machine learning (ML) methods for molecular representation and for improving downstream task performance. Additionally, small-molecule representations and AI-based downstream tasks show significant potential for identifying medicinal substances in traditional Chinese medicine (TCM) and for facilitating TCM target discovery.