Funding: Financially supported by the National Natural Science Foundation of China (NSFC, No. 82073692) and the CAMS Innovation Fund for Medical Sciences (CIFMS, No. 2021-I2M-1-028).
Abstract: Chemical reactions, which transform one set of substances into another, drive research in chemistry and biology. Computer-aided chemical reaction prediction has recently attracted rapidly growing interest, and various deep learning-based algorithms have been proposed. However, current efforts focus primarily on models that support specific applications, with less emphasis on unified frameworks for predicting chemical reactions. Here, we developed Bidirectional Chemical Intelligent Net (BiCINet), a prediction framework based on Bidirectional and Auto-Regressive Transformers (BARTs), for predicting chemical reactions across various tasks, including the bidirectional prediction of organic synthesis and enzyme-mediated chemical reactions. This versatile framework was trained on general chemical reactions and achieved top-1 forward and backward accuracies of 80.7% and 48.6%, respectively, on the public benchmark dataset USPTO_50K. Through multitask transfer learning and the integration of various task prompts into the model, BiCINet enables retrosynthetic planning and metabolic prediction for small molecules, as well as retrosynthetic analysis and enzyme-catalyzed product prediction for natural products. These results demonstrate the superiority of our multifunctional framework for comprehensively understanding chemical reactions.
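The top-1 accuracies quoted in the abstract are ranking metrics: a prediction counts as correct when the reference structure is the model's first-ranked candidate. The sketch below shows one common way to compute such a top-k score. It is not the authors' evaluation code. The function name `top_k_accuracy`, the toy SMILES strings, and the use of exact string matching (which assumes both predictions and references were canonicalized beforehand) are all illustrative assumptions.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of examples whose reference appears among the first k candidates.

    Assumption: predictions and references are already canonical SMILES,
    so plain string equality is a valid match test.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("each reference needs exactly one ranked candidate list")
    if not references:
        return 0.0
    hits = sum(
        1
        for candidates, reference in zip(ranked_predictions, references)
        if reference in candidates[:k]
    )
    return hits / len(references)


# Hypothetical toy data: three examples, two ranked candidates each.
predictions = [
    ["CCO", "CC=O"],          # reference is ranked first
    ["CC(=O)O", "CCO"],       # reference is ranked second
    ["c1ccccc1", "C1CCCCC1"], # reference is ranked second
]
references = ["CCO", "CCO", "C1CCCCC1"]

top1 = top_k_accuracy(predictions, references, k=1)
top2 = top_k_accuracy(predictions, references, k=2)
```

The same function applies to either direction: for forward prediction the candidates are product strings, and for backward (retrosynthetic) prediction they are reactant sets written as single strings.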
Funding: Financial support from the National Key Research and Development Program of China (2018YFA0208600), the National Natural Science Foundation of China (12188101, 22033003, 91945301, 91745201, 92145302, 22122301, and 92061112), the Tencent Foundation for XPLORER PRIZE, and the Fundamental Research Funds for the Central Universities (20720220011).
Abstract: The past decade has seen a sharp increase in machine learning (ML) applications in scientific research. This review introduces the basic constituents of ML, including databases, features, and algorithms, and highlights a few important achievements in chemistry that have been aided by ML techniques. The databases described include some of the most popular chemical databases for molecules and materials, obtained from either experiments or computational calculations. Important two-dimensional (2D) and three-dimensional (3D) features representing the chemical environment of molecules and solids are briefly introduced. Decision tree and deep learning neural network algorithms are reviewed with emphasis on their frameworks and typical application scenarios. Three important fields of ML in chemistry are discussed: (1) retrosynthesis, in which ML predicts the likely routes of organic synthesis; (2) atomic simulations, which use ML potentials to accelerate potential energy surface sampling; and (3) heterogeneous catalysis, in which ML assists in various aspects of catalytic design, ranging from synthetic condition optimization to reaction mechanism exploration. Finally, an outlook on future ML applications is provided.