The ground state electron density, obtainable using Kohn-Sham Density Functional Theory (KS-DFT) simulations, contains a wealth of material information, making its prediction via machine learning (ML) models attractive. However, the computational expense of KS-DFT scales cubically with system size, which tends to stymie training data generation and makes it difficult to develop quantifiably accurate ML models that are applicable across many scales and system configurations. Here, we address this fundamental challenge by employing transfer learning to leverage the multi-scale nature of the training data, while comprehensively sampling system configurations using thermalization. Our ML models are less reliant on heuristics and, being based on Bayesian neural networks, enable uncertainty quantification. We show that our models incur significantly lower data generation costs while allowing confident (and, when verifiable, accurate) predictions for a wide variety of bulk systems well beyond training, including systems with defects, different alloy compositions, and multi-million-atom scales. Moreover, such predictions can be carried out using only modest computational resources.
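As a rough illustration of why cubic scaling makes large training systems expensive, the sketch below compares the cost of a KS-DFT calculation to that of a smaller reference calculation. It assumes an idealized cost proportional to N^3 with no prefactor or lower-order terms; the function name `relative_ksdft_cost` and the 100-atom reference size are hypothetical choices made for this example, not values from the abstract.

```python
def relative_ksdft_cost(n_atoms: int, n_ref: int = 100) -> float:
    """Cost of an n_atoms calculation relative to an n_ref-atom one.

    ASSUMPTION: idealized cubic scaling, cost proportional to N**3.
    Real codes have prefactors and lower-order terms that are ignored here.
    """
    if n_atoms <= 0 or n_ref <= 0:
        raise ValueError("atom counts must be positive")
    return (n_atoms / n_ref) ** 3


if __name__ == "__main__":
    # Print how the relative cost grows as the system is enlarged.
    for n in (100, 200, 1000):
        print(f"{n:5d} atoms -> {relative_ksdft_cost(n):.0f}x the reference cost")
```

Under this assumption, doubling the number of atoms multiplies the cost by eight, which is the motivation for learning mostly from small systems and using only a few larger ones.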
We propose machine learning (ML) models to predict the electron density, the fundamental unknown of a material's ground state, across the composition space of concentrated alloys. From this, other physical properties can be inferred, enabling accelerated exploration. A significant challenge is that the number of descriptors and sampled compositions required for accurate prediction grows rapidly with the number of species. To address this, we employ Bayesian Active Learning (AL), which minimizes training data requirements by leveraging the uncertainty quantification capabilities of Bayesian Neural Networks. Compared to the strategic tessellation of the composition space, Bayesian-AL reduces the number of training data points by a factor of 2.5 for ternary (SiGeSn) and 1.7 for quaternary (CrFeCoNi) systems. We also introduce easy-to-optimize, body-attached-frame descriptors, which respect physical symmetries while keeping descriptor-vector size nearly constant as alloy complexity increases. Our ML models demonstrate high accuracy and generalizability in predicting both electron density and energy across composition space.
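The general idea of uncertainty-driven active learning can be sketched as a greedy loop that repeatedly labels the composition where the model is least certain. The code below is only a toy illustration, not the method of the paper: `toy_posterior_samples` is a hypothetical stand-in for samples drawn from a Bayesian neural network, and it simply assumes that the spread of predictions grows with the distance to the nearest training composition. All function names and the ternary grid spacing are assumptions made for this example.

```python
import math
import statistics


def ternary_grid(n):
    """All compositions (x_A, x_B, x_C) with spacing 1/n that sum to 1."""
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1) for j in range(n + 1 - i)]


def toy_posterior_samples(x, train):
    """HYPOTHETICAL stand-in for Bayesian neural network posterior samples.

    ASSUMPTION: prediction spread grows with the distance from x to the
    nearest composition already in the training set.
    """
    d = min(math.dist(x, t) for t in train)
    return [-d, 0.0, d]


def uncertainty(x, train):
    """Predictive uncertainty: standard deviation across the samples."""
    return statistics.pstdev(toy_posterior_samples(x, train))


def active_learning(pool, train, budget):
    """Greedily add the most uncertain pool composition, `budget` times."""
    train = list(train)
    pool = [x for x in pool if x not in train]
    for _ in range(budget):
        if not pool:
            break
        pick = max(pool, key=lambda x: uncertainty(x, train))
        pool.remove(pick)
        # In a real workflow, a KS-DFT calculation would label `pick`
        # here and the model would be retrained before the next round.
        train.append(pick)
    return train


if __name__ == "__main__":
    selected = active_learning(ternary_grid(4), [(1.0, 0.0, 0.0)], budget=2)
    print(selected)
```

With this toy uncertainty the loop reduces to farthest-point sampling, so it fills the composition space evenly; a trained Bayesian model would instead concentrate new samples where its predictions actually disagree.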
Funding: This work was supported by grant DE-SC0023432 funded by the U.S. Department of Energy, Office of Science. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, using NERSC awards BES-ERCAP0025205, BES-ERCAP0025168, and BES-ERCAP0028072. SG and SP acknowledge the Research Excellence Fund from MTU. ASB acknowledges startup support from the Samueli School of Engineering at UCLA, as well as funding from UCLA's Council on Research (COR) Faculty Research Grant. ASB also acknowledges support through a UCLA SoHub seed grant and a Faculty Career Development Award from UCLA's Office of Equity, Diversity, and Inclusion.
Funding: This work was supported by grant DE-SC0023432 funded by the U.S. Department of Energy, Office of Science. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, using NERSC awards BES-ERCAP0033206, BES-ERCAP0025205, BES-ERCAP0025168, and BES-ERCAP0028072. JM acknowledges support from the U.S. Department of Energy under contracts DE-SC0018410 (FES) and DE-SC0020314 (BES). ASB and JM acknowledge funding through a UCLA SoHub seed grant. SP acknowledges the Doctoral Finishing Fellowship awarded by the Graduate School at MTU. The authors would like to thank UCLA's Institute for Digital Research and Education (IDRE), the Superior HPC facility at MTU, and the MRI GPU cluster at MTU for making available some of the computing resources used in this work. The authors acknowledge the use of the GPT-4o (OpenAI) model to polish the language and correct grammatical errors in some sections of this manuscript. The authors subsequently inspected, validated, and edited the text generated by the AI model before incorporation.