A Self-Adaptive Compilation Framework for Heterogeneous Multi-Core Architecture
Abstract: Applications ported to heterogeneous multi-core high-performance computing systems suffer from poor portability and are difficult to optimize. To address these problems, this paper proposes a self-adaptive compilation framework for heterogeneous multi-core architectures. The framework uses source-to-source compilation to map applications written in traditional parallel programming models onto the heterogeneous multi-core architecture. It also uses dynamic profiling information to adaptively adjust instrumentation and configure the optimization strategy, forming an iterative, automatic optimization process. By combining the software-to-hardware mapping mechanism with the optimization strategy, the framework effectively addresses the migration of homogeneous parallel applications to heterogeneous multi-core architectures and improves overall application performance. Experimental results show that a prototype implemented on the Cell architecture resolves application portability problems on heterogeneous multi-core architectures while also improving application performance.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 7, pp. 1548-1559 (12 pages)
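Funding: National Natural Science Foundation of China (61173039); National High Technology Research and Development Program of China ("863" Program) (2012AA010904, 2012AA01A306); National Key Technology R&D Program of China (2011BAH04B03)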
Keywords: heterogeneous multi-core; source-to-source compilation; instrumentation; iterative optimization
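
The iterative process described in the abstract (profile, adjust the instrumentation, reconfigure the optimization strategy, repeat) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the paper's implementation: the region names, the strategy names, the `COSTS` table, and the functions `profile`, `total_time`, and `adaptive_optimize` are all hypothetical, and a lookup table stands in for real instrumented runs on the hardware.

```python
# Hypothetical cost model standing in for real profiled runs: seconds spent
# in each code region under each strategy. None of these names or numbers
# come from the paper.
COSTS = {
    "load":    {"baseline": 2.0, "double_buffer": 1.2, "vectorize": 2.0},
    "compute": {"baseline": 8.0, "double_buffer": 7.5, "vectorize": 3.0},
    "store":   {"baseline": 1.0, "double_buffer": 0.9, "vectorize": 1.0},
}
STRATEGIES = ("baseline", "double_buffer", "vectorize")


def profile(config, probes):
    """Simulate one instrumented run; only probed regions report a time."""
    return {region: COSTS[region][config[region]] for region in probes}


def total_time(config):
    """Total simulated run time of a configuration."""
    return sum(COSTS[region][strategy] for region, strategy in config.items())


def adaptive_optimize(max_rounds=10):
    """Iteratively retune the hottest region and prune settled probes."""
    config = {region: "baseline" for region in COSTS}
    probes = set(COSTS)  # start by instrumenting every region
    for _ in range(max_rounds):
        timings = profile(config, probes)
        hot = max(timings, key=timings.get)  # hottest instrumented region
        # Try every strategy on the hot region and keep the fastest one.
        best = min(
            STRATEGIES,
            key=lambda s: profile({**config, hot: s}, {hot})[hot],
        )
        if best == config[hot]:
            probes.discard(hot)  # nothing left to gain: stop instrumenting it
        else:
            config[hot] = best   # reconfigure and re-profile next round
        if not probes:
            break
    return config


if __name__ == "__main__":
    final = adaptive_optimize()
    print(final, round(total_time(final), 2))
```

Each round spends its instrumentation budget only on regions that might still improve, which is one plausible reading of "adaptively adjusting instrumentation"; the actual framework may select probes and strategies quite differently.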