Abstract
The Deep Potential (DP) scheme has extended the temporal and spatial scales of molecular dynamics simulations while maintaining ab initio accuracy. DeePMD-kit is an outstanding application that implements the DP scheme efficiently. However, current performance models cannot accurately measure the resource utilization of DeePMD-kit operators or predict the execution time. We introduce DP-perf, an interpretable performance model for DeePMD-kit. By exploiting physical system properties and machine configurations, DP-perf can accurately measure the resource utilization of the individual DeePMD-kit operators, the communication pattern, and the overall application. It can be easily applied to mainstream supercomputers, including Tianhe-3F, the new Sunway, Fugaku, and Summit. With DP-perf, users can select the optimal machine and decide on the corresponding configuration for various purposes (e.g., lower cost, less time) without real runs. Evaluation on four top supercomputers shows that DP-perf can fit the overall execution time with a low mean absolute percentage error of 5.7%, 8.1%, 14.3%, and 13.1% on Tianhe-3F, the new Sunway, Fugaku, and Summit, respectively. In the prediction scenario, DP-perf can predict the total execution time with a mean absolute percentage error of less than 20%.
Funding
This work was supported by the following funding: the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDB0500102);
National Science Foundation of China (Nos. 61972416, T2125013, and 92270206);
China National Postdoctoral Program for Innovative Talents (No. BX20240383);
the Natural Science Foundation of Shandong Province (No. ZR2022LZH009);
GHfound C (No. 202407035455);
National Key R&D Program of China (No. 2021YFA1000103-3).