The physical growth of Polycyclic Aromatic Compounds (PACs) into soot particles plays a significant role in understanding the chemistry of soot formation. Insights into the process can be gained from the free-energy landscape of PAC dimerization. However, because the infeasibly large space of possible PAC dimers cannot be exhaustively simulated, researchers must train machine learning models on a subset of the data to impute the rest. To this end, we propose and assess an active learning approach to discovering the optimal PACs for training a machine learning model to predict PACs' association and dissociation free energies. A comparison between active learning and random sampling showed that active learning converges faster in loss, requiring fewer training samples to reach the same level of accuracy. The trained model accurately modeled unseen PACs and was robust to changes in the sampling space used to train it. More broadly, this work shows how active learning can optimize the design and improve the understanding of more expensive models in specific domains.
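The pool-based loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the acquisition rule here is a simple distance-based greedy sampling in input space, the surrogate is a 1-nearest-neighbour predictor, and `expensive_oracle`, the one-dimensional pool, and all function names are hypothetical stand-ins for the costly free-energy simulations and PAC descriptors.

```python
def expensive_oracle(x):
    # Hypothetical stand-in for a costly free-energy simulation.
    return x * x


def predict(x, labeled):
    # 1-nearest-neighbour surrogate: return the label of the closest labeled point.
    nearest_x = min(labeled, key=lambda lx: abs(x - lx))
    return labeled[nearest_x]


def select_next(pool, labeled):
    # Greedy acquisition (assumed rule): pick the unlabeled candidate
    # that lies farthest from every point labeled so far.
    candidates = [x for x in pool if x not in labeled]
    return max(candidates, key=lambda x: min(abs(x - lx) for lx in labeled))


def active_learning(pool, oracle, start, budget):
    # Label the start point, then query the oracle until the budget is spent.
    labeled = {start: oracle(start)}
    while len(labeled) < budget:
        x_new = select_next(pool, labeled)
        labeled[x_new] = oracle(x_new)
    return labeled


def mean_abs_error(pool, labeled, oracle):
    # Error of the surrogate over the whole pool (only affordable in a toy setting).
    return sum(abs(predict(x, labeled) - oracle(x)) for x in pool) / len(pool)


pool = [float(i) for i in range(11)]
labeled = active_learning(pool, expensive_oracle, start=0.0, budget=3)
error = mean_abs_error(pool, labeled, expensive_oracle)
```

With this toy pool and start point, the loop queries 10.0 and then 5.0, spreading the labeled points across the input range. A random-sampling baseline would draw the same number of points with `random.sample` and compare the resulting `mean_abs_error`.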
Considering temporally evolving processes, the search for optimal input selection in Machine Learning (ML) algorithms is extended here beyond (i) the readily available independent variables defining the process and (ii) the dependent variables suggested by feature extraction methods, by considering the time scale that characterizes the process. The analysis is based on the process of homogeneous autoignition, which is fully determined by the initial temperature T(0) and pressure p(0) of the mixture and the equivalence ratio 𝜙 that specifies the initial mixture composition. The aim is to find the optimal input for predicting the time at which the mixture ignites. The Multilayer Perceptron (MLP) and Principal Component Analysis (PCA) algorithms are employed for prediction and feature extraction, respectively. It is demonstrated that the time scale that characterizes the initiation of the process, 𝜐𝑓(0), provides much better accuracy as input to the MLP than any pair of the three independent parameters T(0), p(0) and 𝜙, or their two principal components. Indicatively, using 𝜐𝑓(0) as input results in a coefficient of determination R² in the range of 0.953 to 0.982, while the maximum value of R² when using the independent parameters or principal components is 0.660. The physical grounds on which the success of 𝜐𝑓(0) is based are discussed. The results suggest the need for further research to develop methodologies for selecting optimal inputs among those that characterize the process.
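For reference, the coefficient of determination quoted above has the standard definition below. The symbols $y_i$ (observed ignition time), $\hat{y}_i$ (MLP prediction), $\bar{y}$ (mean of the observations) and $N$ (number of samples) are notation assumed here, not taken from the abstract.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2},
\qquad
\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i
```

A value of 1 corresponds to perfect prediction, and a value of 0 to a model that does no better than always predicting $\bar{y}$.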
Funding: Funded in part by the National Science Foundation Environmental Convergence Opportunities in Chemical, Bioengineering, Environmental, and Transport Systems (ECO-CBET) [Award Number: AWD024893] and by the Michigan Institute for Computational Discovery and Engineering (MICDE) Research Scholar Program.