期刊文献+
共找到2篇文章
< 1 >
每页显示 20 50 100
基于缓存访问模式的C-AMAT测量方法及其在图计算中的应用
1
作者 陈炳彰 刘伟 于萧钰 《计算机研究与发展》 EI CSCD 北大核心 2024年第4期824-839,共16页
图应用是大数据领域的一个重要分支,尽管图分析在显示表示实体之间关系的能力相比传统的关系数据库具有更显著的性能优势,但图处理中大量的随机访问所导致的不规则访存模式破坏了访存的时间和空间局部性,从而对片外内存系统造成了很大... 图应用是大数据领域的一个重要分支,尽管图分析在显示表示实体之间关系的能力相比传统的关系数据库具有更显著的性能优势,但图处理中大量的随机访问所导致的不规则访存模式破坏了访存的时间和空间局部性,从而对片外内存系统造成了很大的性能压力.因此如何正确度量图应用在内存系统中的性能,对于高效的图应用体系结构优化设计至关重要.并发式平均存储访问时间(concurrent average memory access time,C-AMAT)模型作为平均存储访问时间(average memory access time,AMAT)的扩展,同时考虑了存储器访问的局部性和并发性,能够更准确地对现代处理器下图应用在存储系统中的性能进行评估分析.但C-AMAT模型忽略了处理器下级cache层串行访问的事实,这会导致计算的不准确性,同时由于计算所需参数纯粹缺失周期等难以获取的原因,也使得C-AMAT难以进行实际应用.为了使CAMAT的计算模型与现代计算机中的存储器访问模式相匹配,基于C-AMAT提出了PC-AMAT(parallel CAMAT),SC-AMAT(serial C-AMAT),其中PC-AMAT,SC-AMAT分别从cache的并行和串行访问模式对C-AMAT的计算模型进行了细粒度的扩展和表征,并在此基础上设计并实现了纯粹缺失周期的提取算法,避免直接测量带来的巨大硬件开销.实验结果表明,在单核和多核模式下,PC-AMAT和SC-AMAT与IPC之间的相关性比C-AMAT更强,最终利用PC-AMAT和SC-AMAT度量和分析了图应用的存储器性能并据此提出图应用访存优化策略. 展开更多
关键词 图应用 图分析 平均存储访问时间 并发式平均存储访问时间 纯粹缺失周期 缓存
在线阅读 下载PDF
Reevaluating Data Stall Time with the Consideration of Data Access Concurrency
2
作者 刘宇航 孙贤和 《Journal of Computer Science & Technology》 SCIE EI CSCD 2015年第2期227-245,共19页
Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency a... Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency are the two essential factors influencing the performance of modern memory systems. However, existing studies in reducing data stall time rarely focus on utilizing data access concurrency because the impact of memory concurrency on overall memory system performance is not well understood. In this study, a pair of novel data stall time models, the L-C model for the combined effort of locality and concurrency and the P-M model for the effect of pure miss on data stall time, are presented. The models provide a new understanding of data access delay and provide new directions for performance optimization. Based on these new models, a summary table of advanced cache optimizations is presented. It has 38 entries contributed by data concurrency while only has 21 entries contributed by data locality, which shows the value of data concurrency. The L-C and P-M models and their associated results and opportunities introduced in this study are important and necessary for future data-centric architecture and algorithm design of modern computing systems. 展开更多
关键词 memory wall data stall time memory concurrency concurrent average memory access time c-amat
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部