Performance Evaluation of the Simulation Software on Dawning 5000A for Large Scale Celestial Bodies

Original title (Chinese): 曙光5000A天体大规模数值模拟软件性能测试

Cited by: 1

Abstract: Performance and scalability tests of a large-scale numerical simulation software package for celestial bodies (planetary fluid dynamics) were carried out on Dawning 5000A, a domestically built 100 TFlops supercomputer. The test programs in the software, the test environment, and the test procedure are described in detail, and the results are analyzed. For a mesh of 80 × 80 × 50, runs with 4 processes per node covered 16 to 128 processor cores, and runs with 8 and 16 processes per node each covered 16 to 512 processor cores; the relative speedups finally reached 5.33, 10.48, and 12.57, with parallel efficiencies of 66.66%, 32.58%, and 32.29%, respectively. For a mesh of 160 × 160 × 100, runs with 16 processes per node covered 64 to 8192 processor cores; the maximum relative speedup was 12.46, with a parallel efficiency of 9.73%. The results indicate that Dawning 5000A performs well, and they provide important guidance for the next stage of optimizing the software.
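The relationship between the relative speedup and parallel efficiency figures quoted in the abstract can be illustrated with a short calculation. This is a minimal sketch, assuming the usual definition of relative parallel efficiency (relative speedup divided by the ratio of core counts); the abstract itself does not spell out the formula, and the function name `parallel_efficiency` is hypothetical. Two of the reported cases are used as inputs.

```python
def parallel_efficiency(speedup: float, base_cores: int, cores: int) -> float:
    """Relative parallel efficiency, assuming efficiency = speedup / (cores / base_cores).

    `speedup` is measured relative to the run on `base_cores` cores.
    """
    if base_cores <= 0 or cores < base_cores:
        raise ValueError("expected 0 < base_cores <= cores")
    return speedup / (cores / base_cores)


# 80 x 80 x 50 mesh, 4 processes per node: 16 -> 128 cores, reported speedup 5.33
eff_small = parallel_efficiency(5.33, base_cores=16, cores=128)

# 160 x 160 x 100 mesh, 16 processes per node: 64 -> 8192 cores, reported speedup 12.46
eff_large = parallel_efficiency(12.46, base_cores=64, cores=8192)

print(f"small mesh: {eff_small:.2%}")
print(f"large mesh: {eff_large:.2%}")
```

Under this assumed definition, the large-mesh case comes out at about 9.73%, matching the reported value, and the small-mesh case comes out close to the reported 66.66%; the small gap is consistent with the speedup having been rounded to two decimals before publication.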

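Authors: WANG Ting (王婷), SUN Xiangzheng (孙相征), ZHANG Yunquan (张云泉), YANG Chao (杨超), LI Ligang (李力刚), LIU Fangfang (刘芳芳), GUAN Wenhua (管文华), TANG Yuxin (唐雨新), YAO Jifeng (姚继峰)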

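Affiliations: Laboratory of Parallel Computing, Institute of Software, Chinese Academy of Sciences; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences; Shanghai Astronomical Observatory, Chinese Academy of Sciences; Shanghai Supercomputer Center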

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2009, No. 10, pp. 71-75 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).

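Funding: Key Program of the National Natural Science Foundation of China (60533020); National Basic Research Program of China (2005CB321702); National Natural Science Foundation of China (60303020, 10801125); National High Technology Research and Development Program of China (2006AA01A102, 2006AA01A125)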

Keywords: Dawning 5000A; numerical simulation; performance evaluation; scalability

Classification: TP393 [Automation and Computer Technology: Computer Application Technology]

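References (13)

1. ZHANG Yunquan, SUN Jiachang, YUAN Guoxing, ZHANG Linbo. Analysis and outlook of the 2008 China high performance computer TOP100 list [J]. E-Science Technology & Application (科研信息化技术与应用), 2008(3): 71-78. Cited by: 2
2. Mathematical Software Branch of the China Software Industry Association, National 863 High Performance Computer Evaluation Center, Technical Committee on High Performance Computing of the China Computer Federation. 2008 China high performance computer performance TOP100 list [EB/OL]. (2009-03-13) [2009-03-28]. http://www.samss.org.cn/sites/shuxue/ndhyC.jsp?contentId=2473512102846.
3. CHAN K H, LI Ligang, LIAO Xinhao. Modelling the core convection using finite element and finite difference methods [J]. Phys Earth Planet Interiors, 2006, 157(2): 124-138.
4. LI Ligang, LIAO Xinhao, ZHANG Keke. Linear and nonlinear instabilities in rotating cylindrical Rayleigh-Benard convection [J/OL]. Physical Review: E, 2008, 78(5): 12 [2009-02-20]. http://link.aps.org/doi/10.1103/PhysRevE.78.056303.
5. LI Ligang, LIAO Xinhao, ZHANG Keke. Countertraveling waves in rotating Rayleigh-Benard convection [J/OL]. Physical Review: E, 2008, 77(2): 4 [2009-02-18]. http://link.aps.org/doi/10.1103/PhysRevE.77.027301.
6. LI Ligang. Finite difference method for the planetary fluid dynamics equations in a spherical shell [R] // Technical Report of the National 863 Project on High Productivity Computers and Grid Service Environment. Beijing: Chinese Academy of Sciences, 2008: 10-20.
7. YANG Chao. Notes on the finite difference program for spherical shell convection [R] // Technical Report of the National 863 Project on High Productivity Computers and Grid Service Environment. Beijing: Chinese Academy of Sciences, 2008: 30-40.
8. DUKOWICZ J K, DVINSKY A S. Approximate factorization as a high order splitting for the implicit incompressible flow equations [J]. Journal of Computational Physics, 1992, 102(2): 336-347.
9. TUMINARO R S, HEROUX M, HUTCHINSON S A, et al. Official AZTEC users guide: version 2.1 [DB/OL]. [2009-02-20]. http://www.cs.sandia.gov/CRF/ps-papers/Aztec-ug-2.1.ps.
10. ZHANG Yunquan, SUN Jiachang, CHI Xuebin, TANG Zhimin. Analysis of the memory complexity of numerical computing programs [J]. Chinese Journal of Computers (计算机学报), 2000, 23(4): 362-373. Cited by: 17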

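Citing documents (1)

1. SUN Xiangzheng, ZHANG Yunquan, WANG Ting, YANG Chao, LI Ligang. Performance optimization of large-scale numerical simulation software for celestial bodies [J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2010, 38(S1): 51-54.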
