Funding: Funded by the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant No. XDB 0470201.
Abstract: Data-driven material innovation has the potential to revolutionize the traditional Edisonian process and significantly shorten development cycles. However, the scarcity of data in materials science and the poor interpretability of machine learning pose serious obstacles to the adoption of this new paradigm. Here, we propose a pipeline that integrates data production, virtual screening, and theoretical innovation, using high-throughput all-atom molecular dynamics (MD) as a data flywheel. Using this pipeline, we explored high-performance viscosity index improver (VII) polymers and constructed a VII dataset of 1166 entries starting from only five types of polymers. Under multiobjective constraints, 366 polymers with potentially high viscosity-temperature performance were identified, and six representative polymers were validated through direct MD simulations. Starting from high-dimensional physical features, we conducted an unbiased, systematic analysis of the quantitative structure-property relationships of polymer VIIs and obtained an explicit mathematical model with promising applications in the VII industry. This work demonstrates the capability and reliability of the proposed pipeline in initiating material innovation cycles in data-scarce fields, and the VII dataset and models established here will serve as a critical starting point for the data-driven design of polymers with high viscosity-temperature performance.