Uncertainty quantification (UQ) to detect samples with large expected errors (outliers) is applied to reactive molecular potential energy surfaces (PESs). Three methods, namely ensembles, deep evidential regression (DER), and Gaussian mixture models (GMM), were applied to the H-transfer reaction between syn-Criegee and vinyl hydroxyperoxide. The results indicate that ensemble models provide the best results for detecting outliers, followed by GMM. For example, from a pool of 1000 structures with the largest uncertainty, the detection quality for outliers is ~90% and ~50%, respectively, if 25 or 1000 structures with large errors are sought. On the contrary, the limitations of the statistical assumptions of DER greatly impact its prediction capabilities. Finally, a structure-based indicator was found to be correlated with large average error, which may help to rapidly classify new structures into those that provide an advantage for refining the neural network.
Funding: Supported by the Swiss National Science Foundation through grants 200020_219779 and 200021_215088 and the University of Basel. L.I.V.S. acknowledges funding from the Swiss National Science Foundation (Grant P500PN_222297) to develop the last stages of this work.
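The detection-quality figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the ensemble uncertainty is the standard deviation of the predictions across ensemble members, and that "detection quality" is the fraction of the N largest-error structures that also appear among the M most uncertain structures. The abstract does not spell out either definition, and the function names `ensemble_uncertainty` and `detection_quality` are hypothetical.

```python
import numpy as np


def ensemble_uncertainty(predictions):
    """Mean and spread of an ensemble.

    predictions: array of shape (n_models, n_samples), one row per ensemble member.
    Returns (mean, std) over the model axis. Using the standard deviation as the
    uncertainty is an assumption of this sketch.
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.mean(axis=0), predictions.std(axis=0)


def detection_quality(errors, uncertainties, n_errors, n_pool):
    """Fraction of the n_errors largest-error samples that are also among
    the n_pool samples with the largest uncertainty (assumed definition)."""
    errors = np.asarray(errors, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    # Indices sorted from largest to smallest value.
    top_error = np.argsort(errors)[::-1][:n_errors]
    top_uncertain = np.argsort(uncertainties)[::-1][:n_pool]
    return np.intersect1d(top_error, top_uncertain).size / n_errors


if __name__ == "__main__":
    # Synthetic stand-in data; these are not results from the paper.
    rng = np.random.default_rng(0)
    reference = rng.normal(size=5000)
    members = reference + rng.normal(scale=0.1, size=(4, 5000))
    mean, sigma = ensemble_uncertainty(members)
    abs_error = np.abs(mean - reference)
    print(detection_quality(abs_error, sigma, n_errors=25, n_pool=1000))
```

Under this definition, a value near 1 means that almost every sought large-error structure lies inside the high-uncertainty pool, so ranking by uncertainty would be a useful filter for selecting structures to refine the network.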