Research on Prefetching in High-Bandwidth Remote Memory Architectures (cited by: 2)

The Study on Prefetching of Remote Memory Architecture

Abstract: Advances in high-speed circuits and optical interconnects have greatly increased network speed and bandwidth. This makes it possible to break away from the traditional high-performance computer design in which CPU and memory are tightly coupled. The coupling is no longer limited by distance, which is bound to change computer architecture. Reference [1] proposes the DSAG architecture, in which CPU and memory are spatially separated: each CPU node keeps only a small amount of local memory, while the bulk of memory is placed remotely and managed centrally as a memory server, with CPU nodes and memory servers connected by a high-speed network. This architecture offers better sharing and scalability, but it also makes the imbalance between CPU and memory harder to address. To reduce the additional memory access latency that a remote memory architecture such as DSAG introduces, we observe that normal CPU memory accesses do not fully use the network's high bandwidth, so the spare bandwidth can be used to prefetch data from remote memory. In this paper, we record the addresses of accesses that miss in local memory (local as opposed to remote memory) while applications run, align them to page boundaries, and analyze the statistical characteristics of the page frame streams they contain. We argue that a prefetching mechanism based on page frame streams can reduce memory access latency and improve system performance. Finally, we use simulation to verify the feasibility and correctness of this claim, propose three prefetching policies, and compare and analyze the factors that affect prefetching effectiveness.
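The analysis step described in the abstract (record local-miss addresses, page-align them, look for page frame streams) can be illustrated with a minimal sketch. The abstract does not define a page frame stream precisely, so this sketch assumes a stream is a run of consecutive page frame numbers in the miss sequence, and it assumes 4 KiB pages. The function names and the simple next-page prefetcher are hypothetical stand-ins for illustration; they are not the three policies proposed in the paper.

```python
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def to_page_frames(miss_addresses):
    """Page-align miss addresses and collapse back-to-back misses on the same page."""
    frames = []
    for addr in miss_addresses:
        pfn = addr >> PAGE_SHIFT
        if not frames or frames[-1] != pfn:
            frames.append(pfn)
    return frames


def stream_lengths(frames):
    """Split the frame sequence into runs of consecutive frame numbers
    (assumed stride of +1) and return a histogram {run_length: count}."""
    hist = Counter()
    if not frames:
        return hist
    run = 1
    for prev, cur in zip(frames, frames[1:]):
        if cur == prev + 1:
            run += 1
        else:
            hist[run] += 1
            run = 1
    hist[run] += 1  # close the final run
    return hist


def sequential_prefetch_hits(frames, depth=1):
    """Count misses that a hypothetical next-`depth`-pages prefetcher would
    have covered. The prefetched set is unbounded (no eviction is modeled)."""
    prefetched = set()
    hits = 0
    for pfn in frames:
        if pfn in prefetched:
            hits += 1
        prefetched.update(range(pfn + 1, pfn + 1 + depth))
    return hits


if __name__ == "__main__":
    misses = [0x1000, 0x1008, 0x2000, 0x3000, 0x9000, 0xA000, 0x5000]
    frames = to_page_frames(misses)
    print(frames)
    print(dict(stream_lengths(frames)))
    print(sequential_prefetch_hits(frames, depth=1))
```

A histogram dominated by long runs would suggest that page-granularity sequential prefetching has room to help, while mostly length-1 runs would suggest the opposite. A real evaluation would also need to model local memory capacity and the network bandwidth that prefetching consumes.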

Authors: Xu Jianwei (许建卫), Chen Mingyu (陈明宇), Bao Yungang (包云岗)

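Affiliation: National Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences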

Source: Computer Science (《计算机科学》), 2005, No. 8, pp. 15-20 (6 pages). Indexed in CSCD and the Peking University Core Journals list.

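Keywords: DSAG architecture; page frame stream; memory architecture; prefetching policy; high bandwidth; remote; high-performance computer; network interconnect; high-speed circuits; computer architecture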

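Classification: TP363.1 [Automation and Computer Technology: Computer System Architecture]; TP301.6 [Automation and Computer Technology: Computer System Architecture]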

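References

[1] Fan Jianping, Chen Mingyu. DSAG: a grid-enabled dynamic self-organizing high-performance computer architecture [J]. Journal of Computer Research and Development (计算机研究与发展), 2003, 40(12): 1737-1742. Cited by: 18.
[2] Hu Weiwu, Shi Weisong, Tang Zhimin. A shared virtual memory system based on a new cache coherence protocol [J]. Chinese Journal of Computers (计算机学报), 1999, 22(5): 467-475. Cited by: 15.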
