A Pre-Execution Mechanism Based on Value Prediction and Instruction Reuse for In-Order Processors
Abstract: To improve the performance and energy efficiency of in-order processors, this paper proposes a hardware mechanism called pre-execution based on value prediction and instruction reuse (PVPIR). Unlike conventional pre-execution approaches, when a load instruction incurs a long-latency cache miss, PVPIR predicts the load's data value during pre-execution and uses the predicted value to execute the subsequent instructions that depend on that load. Dependent loads that would themselves incur long-latency cache misses can therefore issue their memory accesses early, which improves performance. After exiting pre-execution, PVPIR reuses the valid pre-executed results so that instructions that already completed correctly are not executed again, which reduces the energy overhead of pre-execution. PVPIR also implements a hybrid value predictor that combines stride prediction with address-value delta (AVD) prediction. The predictor records history only for loads that have incurred long-latency cache misses, so it achieves good prediction results with little hardware overhead. Experimental results show that, compared with Runahead-AVD and iEA, PVPIR improves performance by 7.5% and 9.2% and reduces energy consumption by 11.3% and 4.9%, respectively, thereby improving energy efficiency by 17.5% and 12.9%.
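Source: Acta Electronica Sinica (《电子学报》), 2011, No. 12, pp. 2880-2883 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2006AA010202); China Postdoctoral Science Foundation (No. 20110490208)
Keywords: pre-execution; value prediction; instruction reuse; load latency tolerance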
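
The hybrid value predictor described in the abstract can be illustrated with a small software model. The following is only a minimal sketch under stated assumptions, not the paper's hardware design: the table organization, the 2-bit saturating confidence counters, the threshold, the priority of AVD over stride, and the names `HybridValuePredictor`, `Entry`, `predict`, and `train` are all hypothetical. The sketch takes the address-value delta to be the load address minus the loaded value, and it allocates an entry only when a load incurs a long-latency miss, matching the abstract's statement that only such loads are recorded.

```python
from dataclasses import dataclass
from typing import Dict, Optional

CONF_MAX = 3        # assumed 2-bit saturating confidence counter
CONF_THRESHOLD = 2  # assumed minimum confidence before predicting


@dataclass
class Entry:
    last_value: int       # most recent value loaded by this instruction
    stride: int = 0       # last observed difference between loaded values
    stride_conf: int = 0  # confidence that the stride repeats
    avd: int = 0          # address-value delta: address minus value
    avd_conf: int = 0     # confidence that the delta repeats


def _bump(conf: int, matched: bool) -> int:
    """Saturating increment on a match, reset to zero on a mismatch."""
    return min(conf + 1, CONF_MAX) if matched else 0


class HybridValuePredictor:
    """Toy model of a stride + AVD predictor indexed by load PC (hypothetical)."""

    def __init__(self) -> None:
        self.table: Dict[int, Entry] = {}

    def predict(self, pc: int, addr: int) -> Optional[int]:
        """Return a predicted load value, or None if no confident prediction."""
        entry = self.table.get(pc)
        if entry is None:
            return None
        if entry.avd_conf >= CONF_THRESHOLD:     # assumed priority: AVD first
            return addr - entry.avd
        if entry.stride_conf >= CONF_THRESHOLD:
            return entry.last_value + entry.stride
        return None

    def train(self, pc: int, addr: int, value: int, long_latency_miss: bool) -> None:
        """Update history once the real value of a load is known."""
        entry = self.table.get(pc)
        if entry is None:
            # Allocate only for loads that incurred a long-latency miss.
            if long_latency_miss:
                self.table[pc] = Entry(last_value=value, avd=addr - value)
            return
        new_stride = value - entry.last_value
        entry.stride_conf = _bump(entry.stride_conf, new_stride == entry.stride)
        entry.stride = new_stride
        new_avd = addr - value
        entry.avd_conf = _bump(entry.avd_conf, new_avd == entry.avd)
        entry.avd = new_avd
        entry.last_value = value
```

In this model, a load whose values grow by a constant amount becomes predictable through the stride component, while a pointer-chasing load whose value stays at a fixed offset from its address becomes predictable through the AVD component. Loads that never miss occupy no table entry, which is how the sketch mirrors the low-overhead claim.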

References (10 of 17 shown)

  • [1] K Asanovic, et al. The landscape of parallel computing research: A view from Berkeley[R]. California, USA: Dept of EECS, University of California at Berkeley, 2006.
  • [2] P Kongetira, et al. Niagara: A 32-way multithreaded Sparc processor[J]. IEEE Micro, 2005, 25(2): 21-29.
  • [3] Wang Xiaoyin, Tong Dong, Dang Xianglei, Feng Yi, Cheng Xu. An energy-efficient pre-execution mechanism for single-issue in-order processors[J]. Acta Electronica Sinica, 2011, 39(2): 458-463. (in Chinese)
  • [4] J Dundas, T Mudge. Improving data cache performance by pre-executing instructions under a cache miss[A]. Int'l Conference on Supercomputing[C]. Vienna, Austria: IEEE Computer Society, 1997. 68-75.
  • [5] O Mutlu, et al. Runahead execution: An effective alternative to large instruction windows[J]. IEEE Micro, 2003, 23(6): 20-25.
  • [6] R D Barnes, et al. Tolerating cache-miss latency with multipass pipelines[J]. IEEE Micro, 2006, 26(1): 40-47.
  • [7] O Mutlu, et al. Address-value delta (AVD) prediction: A hardware technique for efficiently parallelizing dependent cache misses[J]. IEEE Transactions on Computers, 2006, 55(12): 1491-1508.
  • [8] Y Sazeides, J E Smith. The predictability of data values[A]. Int'l Symposium on Microarchitecture[C]. Los Alamitos, California, USA: IEEE Computer Society, 1997. 248-258.
  • [9] O Mutlu, et al. On reusing the results of pre-executed instructions in a runahead execution processor[J]. IEEE Computer Architecture Letters, 2005, 4(1): 2-5.
  • [10] A Sodani, G S Sohi. Dynamic instruction reuse[A]. Int'l Symposium on Computer Architecture[C]. Denver, Colorado, USA: IEEE Computer Society, 1997. 194-205.
