Performance Evaluation of the Simulation Software on Dawning 5000A for Large Scale Celestial Bodies

Cited by: 1
Abstract: Performance and scalability experiments were carried out for a large-scale numerical simulation software package for celestial bodies (planetary fluid dynamics) on Dawning 5000A, a domestically built 100 TFlops supercomputer. The test programs in the software, the test environment, and the test procedure are described in detail, and the results are analyzed. For a mesh size of 80 × 80 × 50, runs with 4 processes per node covered 16 to 128 processor cores, and runs with 8 and 16 processes per node each covered 16 to 512 processor cores. The final relative speedups were 5.33, 10.48, and 12.57, and the corresponding parallel efficiencies were 66.66%, 32.58%, and 32.29%, respectively. For a mesh size of 160 × 160 × 100, runs with 16 processes per node covered 64 to 8192 processor cores, giving a maximum relative speedup of 12.46 and a parallel efficiency of 9.73%. The results indicate that Dawning 5000A performs well, and they provide important guidance for the next stage of optimizing the software.
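The abstract reports relative speedups and parallel efficiencies without stating how they are defined. The sketch below assumes the usual definitions: relative speedup is the baseline wall-clock time divided by the time on more cores, and parallel efficiency is that speedup divided by the ideal speedup (the ratio of core counts). The function names and the choice of the smallest core count in each series as the baseline are assumptions made for illustration, not details taken from the paper. Under these assumptions, the 4-processes-per-node case gives 5.33 / (128 / 16) ≈ 66.6% and the 160 × 160 × 100 case gives 12.46 / (8192 / 64) ≈ 9.73%, close to the reported 66.66% and 9.73%.

```python
def relative_speedup(t_base: float, t_p: float) -> float:
    """Speedup of a run relative to the baseline (smallest core count) run."""
    if t_base <= 0 or t_p <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_base / t_p


def parallel_efficiency(speedup: float, base_cores: int, cores: int) -> float:
    """Relative speedup divided by the ideal speedup, cores / base_cores."""
    if base_cores <= 0 or cores < base_cores:
        raise ValueError("expected 0 < base_cores <= cores")
    return speedup / (cores / base_cores)


# Speedups are the values reported in the abstract. The baselines
# (16 and 64 cores) are ASSUMED to be the smallest core count of each series.
eff_small_mesh = parallel_efficiency(5.33, base_cores=16, cores=128)
eff_large_mesh = parallel_efficiency(12.46, base_cores=64, cores=8192)

print(f"80x80x50, 4 processes/node:     {eff_small_mesh:.1%}")
print(f"160x160x100, 16 processes/node: {eff_large_mesh:.1%}")
```

Measured timings would be passed through `relative_speedup` first; the abstract gives only the resulting speedups, so the example starts from those.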
Source: Journal of Xi'an Jiaotong University (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2009, No. 10, pp. 71-75 (5 pages)
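Funding: Key Program of the National Natural Science Foundation of China (60533020); National Basic Research Program of China (2005CB321702); National Natural Science Foundation of China (60303020, 10801125); National High Technology Research and Development Program of China (2006AA01A102, 2006AA01A125)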
Keywords: Dawning 5000A; numerical simulation; performance evaluation; scalability
References (13)

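  • 1 ZHANG Yunquan, SUN Jiachang, YUAN Guoxing, ZHANG Linbo. Analysis and outlook of the 2008 China high performance computer TOP100 list [J]. E-Science Technology & Application, 2008(3): 71-78. (in Chinese) Cited by: 2
  • 2 Mathematical Software Branch of the China Software Industry Association, National 863 High Performance Computer Evaluation Center, Technical Committee on High Performance Computing of the China Computer Federation. 2008 China high performance computer performance TOP100 list [EB/OL]. (2009-03-13)[2009-03-28]. http://www.samss.org.cn/sites/shuxue/ndhyC.jsp?contentId=2473512102846. (in Chinese)
  • 3 CHAN K H, LI Ligang, LIAO Xinhao. Modelling the core convection using finite element and finite difference methods [J]. Phys Earth Planet Interiors, 2006, 157(2): 124-138.
  • 4 LI Ligang, LIAO Xinhao, ZHANG Keke. Linear and nonlinear instabilities in rotating cylindrical Rayleigh-Benard convection [J/OL]. Physical Review: E, 2008, 78(5): 12 [2009-02-20]. http://link.aps.org/doi/10.1103/PhysRevE.78.056303.
  • 5 LI Ligang, LIAO Xinhao, ZHANG Keke. Countertraveling waves in rotating Rayleigh-Benard convection [J/OL]. Physical Review: E, 2008, 77(2): 4 [2009-02-18]. http://link.aps.org/doi/10.1103/PhysRevE.77.027301.
  • 6 LI Ligang. Finite difference method for the planetary fluid dynamics equations in a spherical shell [R] // Technical Report of the National 863 High Productivity Computer and Grid Service Environment Project. Beijing: Chinese Academy of Sciences, 2008: 10-20. (in Chinese)
  • 7 YANG Chao. Notes on the finite difference program for spherical shell convection [R] // Technical Report of the National 863 High Productivity Computer and Grid Service Environment Project. Beijing: Chinese Academy of Sciences, 2008: 30-40. (in Chinese)
  • 8 DUKOWICZ J K, DVINSKY A S. Approximate factorization as a high order splitting for the implicit incompressible flow equations [J]. Journal of Computational Physics, 1992, 102(2): 336-347.
  • 9 TUMINARO R S, HEROUX M, HUTCHINSON S A, et al. Official AZTEC users guide: version 2.1 [DB/OL]. [2009-02-20]. http://www.cs.sandia.gov/CRF/pspapers/Aztec_ug_2.1.ps.
  • 10 ZHANG Yunquan, SUN Jiachang, CHI Xuebin, TANG Zhimin. Memory complexity analysis of numerical programs [J]. Chinese Journal of Computers, 2000, 23(4): 362-373. (in Chinese) Cited by: 17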
