We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters: NBODY6++ and Bonsai. Both are pioneers in their own fields, and their ranges of application overlap at certain scales. We benchmark the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs: their performance approaches half of the maximum single-precision performance of the underlying GPU cards. With such performance, we predict that a speed-up of 200-300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We compare the two codes quantitatively and find that, for the same cases, Bonsai adopts larger time steps and shows larger relative energy errors than NBODY6++, typically 10-50 times larger, depending on the chosen code parameters. Although the two codes are built for different astrophysical applications, under specified conditions their performance may overlap at certain physical scales, allowing the user to choose either one by tuning parameters accordingly.
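As a rough illustration of two quantities named in the abstract, the minimal Python sketch below computes a parallel efficiency from a speed-up and a relative energy error from two total-energy values. The function names are hypothetical, the definitions are the standard textbook ones rather than anything taken from NBODY6++ or Bonsai, and reading "1k" as 1000 processing units is an assumption.

```python
def parallel_efficiency(speedup: float, n_units: int) -> float:
    """Speed-up divided by the number of processing units (standard definition)."""
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return speedup / n_units


def relative_energy_error(e_initial: float, e_final: float) -> float:
    """|E_final - E_initial| / |E_initial| for a total energy that should be conserved."""
    if e_initial == 0.0:
        raise ValueError("initial energy must be non-zero")
    return abs(e_final - e_initial) / abs(e_initial)


if __name__ == "__main__":
    # Assumption: "1k processors and GPUs" is read as 1000 units.
    for speedup in (200.0, 300.0):
        eff = parallel_efficiency(speedup, 1000)
        print(f"speed-up {speedup:.0f} on 1000 units: efficiency {eff:.2f}")
```

Under that reading, the predicted speed-up of 200-300 would correspond to a parallel efficiency of roughly 0.2 to 0.3.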
Funding: Support by the Chinese Academy of Sciences through the Silk Road Project at NAOC and through the Chinese Academy of Sciences Visiting Professorship for Senior International Scientists, Grant Number 2009S1-5 (RS); the “Qianren” special foreign experts program of China; funded by the Ministry of Finance of the People’s Republic of China under the grant ZDY Z2008-2, has been used for the simulations; the supercomputer “The Milky Way System” at Julich Supercomputing Centre in Germany, built for SFB881 at the University of Heidelberg, Germany; the special support by the NAS Ukraine under the Main Astronomical Observatory GPU/GRID computing cluster project.