Decoupled Vector Processing Unit: Past, Present, and Future
1
Authors: Ruo-Xi Wang, Dun-Bo Zhang, Qing-Jie Lang, Dong-Huan Xie, Zhi-Wei Wang, Zhen-Yu Gao, Li Shen. Journal of Computer Science & Technology, 2025, No. 5, pp. 1368-1385 (18 pages)
Vector architectures are widely employed in modern processors due to their high performance and energy efficiency in exploiting data-level parallelism through single instruction multiple data (SIMD) paradigms. The built-in scalar cores and the vector processing units (VPUs) can be organized as integrated or decoupled. The decoupled vector architecture primarily offers the advantage of independent operation, allowing the VPU and the scalar core to execute concurrently at different frequencies, enhancing overall throughput and performance. This enables specialized VPU optimization for long vectors, complex vector operations, and separate power management, which excels in computation-intensive applications. This paper comprehensively reviews processors with decoupled VPUs, discussing their advantages and various implementations. Design challenges and corresponding potential solutions are also included.
Keywords: decoupled organization; single instruction multiple data (SIMD); vector architecture; processing in memory
XB-SIM*: A Simulation Framework for Modeling and Exploration of ReRAM-Based CNN Acceleration Design (cited by: 4)
2
Authors: Xiang Fei, Youhui Zhang, Weimin Zheng. Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2021, No. 3, pp. 322-334 (13 pages)
Resistive Random Access Memory (ReRAM)-based neural network accelerators have the potential to surpass their digital counterparts in computational efficiency and performance. However, the design of these accelerators faces a number of challenges, including imperfections of the ReRAM device and the large amount of calculation required to simulate those imperfections accurately. We present XB-SIM, a simulation framework for ReRAM-crossbar-based Convolutional Neural Network (CNN) accelerators. XB-SIM can be flexibly configured to simulate the accelerator’s structure and clock-driven behaviors at the architecture level. This framework also includes a ReRAM-aware Neural Network (NN) training algorithm and a CNN-oriented mapper to train an NN and map it onto the simulated design efficiently. The behavior of the simulator has been verified against the corresponding circuit simulation of a real chip. Furthermore, a batch processing mode is proposed for the massive calculations required to mimic the behavior of ReRAM-crossbar circuits, so as to fully exploit the computational concurrency of the mapping strategy. On CPU/GPGPU, this batch processing mode can improve the simulation speed by a factor of up to 5.02 or 34.29, respectively. Within this framework, comprehensive architectural exploration and end-to-end evaluation have been achieved, which provide some insights for systemic optimization.
Keywords: deep neural network; Resistive Random Access Memory (ReRAM); simulation; accelerator; processing in memory