Abstract
In the cache-coherent Network-on-Chip (NoC) of many-core CPUs, the snooping and snooping response process (SNP process) incurs long latency. To address this, this paper proposes two techniques to accelerate the process: multicast routing and adaptive routing. Based on the requirements of these two techniques, NoC packet formats are designed for the Snooping Request Channel (SNP REQ Ch) and the Snooping Response Channel (SNP RESP Ch), and the NoC routers for both channels, together with an 8×8 network, are designed and implemented. The implementation results show that, in a 22 nm process, the proposed NoC routers occupy 85940.3 μm² or 103518.5 μm², and the 8×8 snooping request and snooping response network occupies 5.57 mm², an acceptable complexity that is feasible for large-scale chips. Simulations compare the SNP process latency of four configurations: unicast deterministic routing, unicast adaptive routing, multicast deterministic routing, and multicast adaptive routing. The results show that, when a snooping request must snoop all 252 processor cores, multicast adaptive routing cuts the SNP process latency of a single snooping request by 45% compared with unicast deterministic routing, leaving it far below the DDR/HBM access latency. When the outstanding technique is additionally employed at the Point of Coherency (PoC), the proposed techniques cut the SNP process latency of 32 consecutive snooping requests by 73%. The simulation results confirm the effectiveness of the proposed multicast and adaptive routing techniques.
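To give a rough feel for why multicast helps, the following minimal sketch counts link traversals when one snooping request is delivered from a single source router to every other router of an 8×8 mesh. It is an illustration only, not the router design described in the paper: it assumes dimension-ordered XY routing as the deterministic baseline, one snoop target per router, and a multicast that replicates a packet only where XY paths diverge. The function names `xy_path` and `link_traversals` are hypothetical, and the counts are link traversals, not latency.

```python
from itertools import product


def xy_path(src, dst):
    """Return the directed links (node, next_node) on the XY route src -> dst."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:  # travel along the X dimension first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:  # then along the Y dimension
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def link_traversals(src, size=8):
    """Count link traversals needed to reach every other router from src.

    Unicast sends one packet per destination, so every hop is counted.
    Multicast (assumed here) sends one packet per distinct link of the
    union of all XY paths, so shared links are counted once.
    """
    dests = [n for n in product(range(size), repeat=2) if n != src]
    unicast = sum(len(xy_path(src, d)) for d in dests)
    multicast = len({link for d in dests for link in xy_path(src, d)})
    return unicast, multicast


if __name__ == "__main__":
    for source in [(0, 0), (3, 3)]:
        u, m = link_traversals(source)
        print(f"source={source}: unicast={u} traversals, multicast={m} traversals")
```

Under these assumptions the union of XY paths from one source forms a spanning tree of the mesh, so multicast uses each of its links exactly once, while unicast repeats the links near the source for every destination. Adaptive routing, which the paper combines with multicast, is not modeled here.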
Authors
胡东伟
巴晓辉
刘耿亭
王力男
雷岳俊
HU Dongwei; BA Xiaohui; LIU Gengting; WANG Linan; LEI Yuejun (54th Institute of CETC, Shijiazhuang 050080, China; School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China; Department of Computer Science, University of Manchester, Manchester M13 9PL; School of Information Engineering, Minzu University of China, Beijing 100081, China)
Source
《集成电路与嵌入式系统》
2025, Issue 8, pp. 81-90 (10 pages)
Integrated Circuits and Embedded Systems
Keywords
network-on-chip
cache coherency
adaptive routing
multicast routing