Funding: Supported by the National Key Research and Development Program of China (No. 2018AAA0103300, 2017YFA0700900, 2017YFA0700902, 2017YFA0700901, 2019AAA0103802, 2020AAA0103802).
Abstract: With the increase in data and model sizes, deep neural networks (DNNs) show outstanding performance in many artificial intelligence (AI) applications. However, the large model size makes it challenging to run DNNs with high performance and low power on processors such as the central processing unit (CPU), graphics processing unit (GPU), and tensor processing unit (TPU). This paper proposes LOGNN, an 8-bit data representation, and LACC, a hardware/software co-designed deep neural network accelerator, to meet this challenge. The LOGNN data representation replaces multiply operations with add and shift operations when running DNNs. The LACC accelerator achieves higher efficiency than state-of-the-art DNN accelerators through domain-specific arithmetic computing units. Overall, LACC improves performance per watt by 1.5 times on average compared with state-of-the-art DNN accelerators.
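The key idea behind replacing multiplies with adds and shifts can be illustrated with generic logarithmic quantization: store each value as a sign plus an integer base-2 exponent, so that a multiplication becomes an addition of exponents (a shift in hardware). The sketch below shows this general idea only; the exact LOGNN encoding in the paper (bit layout, zero handling, rounding) is not specified here, and the function names are illustrative.

```python
import math

def log_quantize(x):
    """Quantize |x| to the nearest power of two, keeping the sign.
    Stores (sign, integer exponent) instead of the value itself.
    This illustrates generic logarithmic quantization, not the
    paper's exact LOGNN format."""
    if x == 0:
        return (1, None)  # special-case zero
    sign = 1 if x > 0 else -1
    exp = int(round(math.log2(abs(x))))
    return (sign, exp)

def log_multiply(a, b):
    """Multiply two log-quantized numbers: the multiply becomes an
    addition of exponents, realizable as a shift in hardware."""
    (sa, ea), (sb, eb) = a, b
    if ea is None or eb is None:
        return 0.0  # anything times zero is zero
    return sa * sb * 2.0 ** (ea + eb)

# Example: weight ~0.25 (2^-2) times activation 8 (2^3)
w = log_quantize(0.25)
x = log_quantize(8.0)
print(log_multiply(w, x))  # 2.0
```

In integer hardware, multiplying an activation by a power-of-two weight `2^e` reduces to shifting the activation left (or right, for negative `e`) by `|e|` bits, which is far cheaper in area and energy than a full multiplier.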
Abstract: With the coming of the exascale computing era, programming systems and operating systems (including runtime systems) face several challenges. On the architecture side, increasingly deep levels of parallelism, heterogeneity, and the adoption of diverse domain-specific accelerators create urgent needs for programmability, performance optimization, and portability. On the application side, big data analytics and machine learning applications need to be ported to and optimized for modern HPC systems. This issue focuses on novel ideas, methods, and system software development efforts that address the above challenges and fill the gap between applications and the underlying hardware systems.