A Write Coalescing Design and Verification for Write-Through Cache (cited by: 2)


Abstract: To improve application-processor performance with on-chip buffering, a write coalescing design for write-through caches is proposed. Using the single-write mode of synchronous dynamic random access memory (SDRAM) together with an on-chip write buffer, the design coalesces writes to local data within the same SDRAM row, which improves the access efficiency of external memory. A cache-to-memory data coherence scheme covering both consecutive and single cache reads and writes is also presented. The design was tested on a Leon2 processor by running MP3 decoding in a register-transfer level (RTL) simulation environment. The results show that, with the buffer configured as 3 rows by 8 columns, an average of 7.8 words are written per SDRAM row activation, and the read/write efficiency of external memory rises from 12% to 19%. In a TSMC 0.18 μm process the synthesized area is 0.263 mm², and the fabricated chip runs at a clock frequency of 100 MHz.
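The coalescing idea in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' RTL: it assumes that "3 rows by 8 columns" means three buffer entries, each collecting up to eight words that belong to one SDRAM row, that a full entry is drained in a single row activation, and that the oldest entry is evicted when a fourth row arrives. The class name `WriteCoalescingBuffer`, the parameter `sdram_row_words`, and the eviction policy are hypothetical choices made for illustration.

```python
from collections import OrderedDict

NUM_ENTRIES = 3       # "3 rows" (assumed: SDRAM rows tracked at once)
WORDS_PER_ENTRY = 8   # "8 columns" (assumed: words held per buffer entry)


class WriteCoalescingBuffer:
    """Toy model of a write buffer between a write-through cache and SDRAM."""

    def __init__(self, sdram_row_words=256):
        self.sdram_row_words = sdram_row_words  # hypothetical SDRAM row size, in words
        self.entries = OrderedDict()            # SDRAM row -> {column: word}, oldest first
        self.memory = {}                        # stands in for external SDRAM
        self.row_activations = 0
        self.words_written = 0

    def _flush(self, row):
        """Open the SDRAM row once and write all of its pending words."""
        pending = self.entries.pop(row)
        self.row_activations += 1
        for col, word in pending.items():
            self.memory[row * self.sdram_row_words + col] = word
            self.words_written += 1

    def write(self, addr, word):
        row, col = divmod(addr, self.sdram_row_words)
        if row not in self.entries:
            if len(self.entries) == NUM_ENTRIES:
                self._flush(next(iter(self.entries)))  # evict the oldest entry
            self.entries[row] = {}
        self.entries[row][col] = word                  # coalesce into the row's entry
        if len(self.entries[row]) == WORDS_PER_ENTRY:
            self._flush(row)                           # entry full: drain it

    def read(self, addr):
        """Coherent read: buffered data takes priority over SDRAM contents."""
        row, col = divmod(addr, self.sdram_row_words)
        if row in self.entries and col in self.entries[row]:
            return self.entries[row][col]
        return self.memory.get(addr)

    def drain(self):
        """Write back every pending entry, one row activation per entry."""
        for row in list(self.entries):
            self._flush(row)


buf = WriteCoalescingBuffer()
for addr in range(16):          # 16 sequential word writes, all in SDRAM row 0
    buf.write(addr, addr * 10)
buf.drain()
print(buf.words_written / buf.row_activations)  # 8.0 words per row activation
```

Without the buffer, each of these single writes would open the row on its own; the model shows how grouping writes by row raises the number of words written per activation. The 7.8-word average reported in the abstract comes from the authors' MP3 decoding workload and is not reproduced by this toy example.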

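Authors: Mei Kuizhi (梅魁志), Li Guohui (李国辉), Zhang Bin (张斌)

Affiliation: Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2010, No. 4, pp. 1-4 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China, Young Scientists Fund (60905007); National High Technology Research and Development Program of China (2009AA01Z307, 2009AA011709)

Keywords: write-through; write coalescing; processor; synchronous dynamic random access memory; memory access efficiency

Classification: TP302 [Automation and Computer Technology: Computer System Architecture]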


