Funding: Professor of Chang Jiang Scholars Program; NSFC (81230090, 81520108030)
Abstract: The Connectivity Map (CMAP) database was initially established to connect biology, chemistry, and clinical conditions, helping to uncover disease-gene-drug connections. The CMAP approach has been widely applied and recognized in drug discovery and development. In recent years, CMAP analysis has also been applied to research on Chinese materia medica (CMM). The study of CMM faces a wide range of challenges, such as complex ingredients, multiple targets, multiple pathways of action, and complex mechanisms of action. Applying CMAP to CMM research offers researchers a new perspective and provides a systematic method for elucidating the mechanisms of CMM.
Funding: Supported by the National Natural Science Foundation of China (NNSFC) (Grant No. 12264025).
Abstract: The photovoltaic performance of organic solar cells (OSCs) is largely determined by the electron donor and acceptor materials in the active layer. Traditional trial-and-error experiments for finding high-performance materials suffer from long development cycles, high experimental costs, and low screening efficiency. Herein, a database of 547 donor-acceptor pairs was established, integrating photovoltaic parameters and molecular representations. Thirty molecular structure descriptors closely related to power conversion efficiency (PCE) were extracted. Long short-term memory (LSTM) networks, convolutional neural networks (CNN), and symbolic regression (SR) were trained to predict the PCE of OSCs. After hyperparameter optimization via grid search, the metrics indicate that the trained models predicted PCE with high precision, and the LSTM model outperformed the other models. Dual validation by SHapley Additive exPlanations (SHAP) interpretability analysis and the SR formulas revealed that the number of structural units with two or more rings in the acceptor molecules was significantly correlated with PCE. Based on a dataset constructed using a molecular fragment recombination strategy, the LSTM generative model generated 210,660 novel donor molecules and 878,268 novel acceptor molecules. After screening 185,015,936,880 donor-acceptor pairs with the LSTM prediction model, 5,753 pairs with a predicted PCE exceeding 18.50% were identified, the highest predicted PCE being 18.66%. This approach provides theoretical guidance for the discovery of organic photovoltaic materials, may accelerate the development of high-performance OSCs, and can also be generalized to functional molecular design.
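The screening figures in this abstract can be cross-checked with simple arithmetic. The short Python sketch below assumes that every generated donor was paired with every generated acceptor; the abstract does not state this explicitly, but the reported pair count is consistent with it. The variable names are illustrative only.

```python
# Counts quoted in the abstract.
n_donors = 210_660                   # generated donor molecules
n_acceptors = 878_268                # generated acceptor molecules
n_pairs_reported = 185_015_936_880   # donor-acceptor pairs screened
n_hits = 5_753                       # pairs with predicted PCE > 18.50%

# Assumption: exhaustive pairing of each donor with each acceptor.
n_pairs = n_donors * n_acceptors

# Fraction of screened pairs that passed the PCE threshold.
hit_fraction = n_hits / n_pairs

print("pair count matches abstract:", n_pairs == n_pairs_reported)
print(f"hit fraction: {hit_fraction:.2e}")
```

Under that assumption the product reproduces the reported pair count exactly, and the selected pairs amount to roughly 3 in every 100 million screened.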