
Automatic Generation of Pipeline Parallel Code in Distributed Memory System (cited by: 4)
Abstract: Parallel loops are divided into two kinds, DOALL and DOACROSS. DOACROSS loops carry data dependences, so executing them in parallel requires communication support. When the compiler can analyze the dependences of a DOACROSS loop precisely, pipeline parallelism can be used to improve its performance. This paper discusses the automatic generation of pipeline parallel code, including the construction of the data dependence relation graph and the pipeline relation graph, the criteria for deciding whether a loop can be pipelined, and the automatic generation of the pipeline code. Experimental results show that pipeline parallelization achieves a good speedup.
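Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2008, No. 11, pp. 77-79 (3 pages)
Funding: Key project supported by a national ministry-level research fund
Keywords: pipeline parallel; data dependence relation graph; pipeline relation graph; pipeline communication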
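
The pipelining idea in the abstract can be illustrated with a minimal sketch. It is not the code the paper's compiler generates; the loop nest, the function names `sequential` and `pipelined`, the `workers` and `strip` parameters, and the use of Python threads with `queue.Queue` as a stand-in for message passing between distributed-memory nodes are all assumptions made for illustration. Rows are block-partitioned across workers, each worker keeps only its own rows plus one halo row, and the column range is cut into strips so that a worker can start on a strip as soon as its upstream neighbour has sent the matching boundary values.

```python
import queue
import threading


def sequential(n, m):
    """Reference DOACROSS nest: each cell depends on its upper and left neighbour."""
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def pipelined(n, m, workers=3, strip=4):
    """Same computation, with rows block-partitioned and columns pipelined in strips."""
    # Worker p owns rows bounds[p] .. bounds[p + 1] - 1 (hypothetical block partition).
    bounds = [1 + (n * p) // workers for p in range(workers + 1)]
    # chans[p] carries boundary-row strips from worker p - 1 to worker p.
    chans = [queue.Queue() for _ in range(workers)]
    blocks = [None] * workers

    def run(p):
        lo, hi = bounds[p], bounds[p + 1]
        # Local storage only: row 0 is the halo (global row lo - 1), then the owned rows.
        loc = [[1] * (m + 1) for _ in range(hi - lo + 1)]
        for j0 in range(1, m + 1, strip):
            j1 = min(j0 + strip, m + 1)
            if p > 0:
                # "Receive": block until the upstream worker has finished this strip.
                loc[0][j0:j1] = chans[p].get()
            for r in range(1, hi - lo + 1):
                for j in range(j0, j1):
                    loc[r][j] = loc[r - 1][j] + loc[r][j - 1]
            if p + 1 < workers:
                # "Send": pass the last owned row of this strip downstream.
                chans[p + 1].put(loc[-1][j0:j1])
        blocks[p] = loc[1:]

    threads = [threading.Thread(target=run, args=(p,)) for p in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Gather the distributed blocks back into one array for comparison.
    result = [[1] * (m + 1)]
    for block in blocks:
        result.extend(block)
    return result
```

Calling `pipelined(7, 10)` should return the same array as `sequential(7, 10)`. In this sketch the strip width trades communication against pipeline fill time: wider strips mean fewer messages, but downstream workers wait longer before they can start.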
