
Parallel Method for Selective Loop (选择性循环的并行方法) Cited by: 1
Abstract: For sequential programs that contain a large number of loops, this paper proposes a loop selection scheme based on Thread-Level Speculation (TLS). The scheme selects loops optimally and builds a set of loops that can run in parallel. For the loops in this set, the code segments with high parallel efficiency are chosen for parallel execution, which speeds up the serial program. Simulation results show that, compared with simply parallelizing the inner loops or the outer loops, the scheme raises the speedup of 9 benchmark applications by 23.8% on average, improving the efficiency of running serial programs in parallel.
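Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 9, pp. 35-37, 40 (4 pages)
Funding: Supported by the Shanghai Key Discipline Construction Fund (J50103)
Keywords: Thread-Level Speculation (TLS); loop selection; parallel operation; Chip Multi-core Processors (CMP)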
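
The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of one way such a loop set could be built. It assumes that a speculatively parallelized loop excludes its inner loops from also being parallelized. The `Loop` class, the `est_cycles_saved` estimate, and `select_loops` are hypothetical names introduced for illustration and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Loop:
    """One node of a loop-nesting tree (hypothetical structure, not from the paper)."""
    name: str
    # Assumed profile-based estimate of the gain if this loop runs speculatively.
    est_cycles_saved: float
    children: list[Loop] = field(default_factory=list)


def select_loops(loop: Loop) -> tuple[float, list[str]]:
    """Return (best total estimated saving, chosen loop names) for this subtree.

    Either the loop itself is parallelized, in which case none of its inner
    loops are, or the best selections of its inner loops are combined.
    """
    inner_gain = 0.0
    inner_choice: list[str] = []
    for child in loop.children:
        gain, choice = select_loops(child)
        inner_gain += gain
        inner_choice += choice
    # Prefer this loop only if it beats the combined gain of its inner loops.
    if loop.est_cycles_saved > 0 and loop.est_cycles_saved >= inner_gain:
        return loop.est_cycles_saved, [loop.name]
    return inner_gain, inner_choice


# Two nests with made-up estimates: one favors the inner loops, one the outer loop.
nest_a = Loop("outer_a", 40, [Loop("inner_a1", 30), Loop("inner_a2", 25)])
nest_b = Loop("outer_b", 70, [Loop("inner_b1", 30), Loop("inner_b2", 25)])
print(select_loops(nest_a))
print(select_loops(nest_b))
```

Under these assumed estimates the sketch picks the two inner loops for the first nest and the outer loop for the second, which illustrates why a per-nest selection could outperform a fixed always-inner or always-outer policy.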

