Abstract
This paper analyzes floating-point processing performance, based on the concept of interlock distance, using widely used industry floating-point benchmarks such as fpspec92, fpspec95, linpack, whetstone, and flops. Extensive experiments show that performance is best when the latency of floating-point division is between 8 and 12 cycles. Accordingly, a radix-256 floating-point divider with a latency of 11 cycles was designed for the "LongTeng R1" (龙腾R1) microprocessor and implemented in SMIC 0.18 μm 1P5M CMOS technology. Under worst-case conditions, it runs at 233 MHz and occupies an area of about 0.174 mm^2.
Source
《微电子学与计算机》
CSCD
Peking University Core Journal (北大核心)
2006, No. 1, pp. 64-66, 70 (4 pages)
Microelectronics & Computer
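Funding
National Defense "Tenth Five-Year Plan" Pre-Research Project (41308010108)
Northwestern Polytechnical University Graduate Entrepreneurship Seed Fund (Z20040050)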
Keywords
Floating-point benchmarks
Interlock distance
Performance analysis
Very-high radix
Floating-point divider