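Abstract: Multi-core processors are increasingly common, and how to use software techniques to maximize the utilization of every CPU core has become a hot topic. This paper introduces the multi-core parallel programming model Threading Building Blocks (TBB), compares it in detail with raw threads and OpenMP, and analyzes its strengths and weaknesses. It also studies how TBB can be combined with MPI to build efficient hybrid parallel applications on SMP cluster systems. The study finds that TBB has significant advantages for multi-core programming, and that combining TBB with MPI gives clusters of multi-core nodes a hierarchical parallel structure that greatly improves cluster performance.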
Funding: Supported by the "985" Basic Research Foundation of Tsinghua University of China (No. JC2001024)
Abstract: A hybrid decomposition method for molecular dynamics simulations is presented that uses spatial decomposition and force decomposition simultaneously to fit the architecture of a cluster of symmetric multi-processor (SMP) nodes. The method distributes particles between nodes using the spatial decomposition strategy to reduce inter-node communication costs, and partitions particle pairs within each node using the force decomposition strategy to improve the load balance within each node. Simulation results for a nucleation process with 4 000 000 particles show that the hybrid method achieves better parallel performance than either spatial or force decomposition alone, especially when applied to a large-scale particle system with non-uniform spatial density.
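The two-level partitioning described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `spatial_decompose` and `force_decompose`, the one-dimensional slab layout, and the round-robin dealing of pairs are all assumptions made for illustration. The sketch also ignores the interaction cutoff and pairs that straddle node boundaries, both of which a real molecular dynamics code must handle.

```python
from itertools import combinations


def spatial_decompose(positions, n_nodes, box_length):
    """Assign particle indices to nodes by slab along x.

    Assumed layout (hypothetical): the box [0, box_length) is cut into
    n_nodes equal slabs, and each node owns the particles in its slab.
    """
    slab = box_length / n_nodes
    owners = [[] for _ in range(n_nodes)]
    for i, (x, _y, _z) in enumerate(positions):
        node = min(int(x / slab), n_nodes - 1)  # clamp particles at the upper edge
        owners[node].append(i)
    return owners


def force_decompose(particle_ids, n_workers):
    """Deal a node's particle pairs out to its workers round-robin.

    Each worker's pair count differs from the others by at most one,
    regardless of where the particles sit inside the node's slab.
    """
    shares = [[] for _ in range(n_workers)]
    for k, pair in enumerate(combinations(particle_ids, 2)):
        shares[k % n_workers].append(pair)
    return shares


# Non-uniform density: four particles crowd the first slab, one sits in the second.
positions = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0),
             (0.4, 0.0, 0.0), (3.5, 0.0, 0.0)]
owners = spatial_decompose(positions, n_nodes=2, box_length=4.0)
shares = force_decompose(owners[0], n_workers=2)
```

In this toy input the spatial step alone leaves node 0 with four particles and node 1 with one, while the pair-level split inside node 0 still gives each of its two workers three of the six pairs. That is the intra-node balancing effect the abstract attributes to force decomposition.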