期刊文献+
共找到2篇文章
< 1 >
每页显示 20 50 100
Design and Optimization of Winograd Convolution on Array Accelerator 被引量:1
1
作者 Ji Lai Lixin Yang +4 位作者 Dejian Li Chongfei Shen Xi Feng Jizeng Wei Yu Liu 《Journal of Beijing Institute of Technology》 EI CAS 2023年第1期69-81,共13页
With the rapid development and popularization of artificial intelligence technology,convolutional neural network(CNN)is applied in many fields,and begins to replace most traditional algorithms and gradually deploys to... With the rapid development and popularization of artificial intelligence technology,convolutional neural network(CNN)is applied in many fields,and begins to replace most traditional algorithms and gradually deploys to terminal devices.However,the huge data movement and computational complexity of CNN bring huge power consumption and performance challenges to the hardware,which hinders the application of CNN in embedded devices such as smartphones and smart cars.This paper implements a convolutional neural network accelerator based on Winograd convolution algorithm on field-programmable gate array(FPGA).Firstly,a convolution kernel decomposition method for Winograd convolution is proposed.The convolution kernel larger than 3×3 is divided into multiple 3×3 convolution kernels for convolution operation,and the unsynchronized long convolution operation is processed.Then,we design Winograd convolution array and use configurable multiplier to flexibly realize multiplication for data with different accuracy.Experimental results on VGG16 and AlexNet network show that our accelerator has the most energy efficient and 101 times that of the CPU,5.8 times that of the GPU.At the same time,it has higher energy efficiency than other convolutional neural network accelerators. 展开更多
关键词 convolutional neural network winograd convolution algorithm ACCELERATOR
在线阅读 下载PDF
Design of high parallel CNN accelerator based on FPGA for AIoT
2
作者 Lin Zhijian Gao Xuewei +3 位作者 Chen Xiaopei Zhu Zhipeng Du Xiaoyong Chen Pingping 《The Journal of China Universities of Posts and Telecommunications》 EI CSCD 2022年第5期1-9,61,共10页
To tackle the challenge of applying convolutional neural network(CNN)in field-programmable gate array(FPGA)due to its computational complexity,a high-performance CNN hardware accelerator based on Verilog hardware desc... To tackle the challenge of applying convolutional neural network(CNN)in field-programmable gate array(FPGA)due to its computational complexity,a high-performance CNN hardware accelerator based on Verilog hardware description language was designed,which utilizes a pipeline architecture with three parallel dimensions including input channels,output channels,and convolution kernels.Firstly,two multiply-and-accumulate(MAC)operations were packed into one digital signal processing(DSP)block of FPGA to double the computation rate of the CNN accelerator.Secondly,strategies of feature map block partitioning and special memory arrangement were proposed to optimize the total amount of off-chip access memory and reduce the pressure on FPGA bandwidth.Finally,an efficient computational array combining multiplicative-additive tree and Winograd fast convolution algorithm was designed to balance hardware resource consumption and computational performance.The high parallel CNN accelerator was deployed in ZU3 EG of Alinx,using the YOLOv3-tiny algorithm as the test object.The average computing performance of the CNN accelerator is 127.5 giga operations per second(GOPS).The experimental results show that the hardware architecture effectively improves the computational power of CNN and provides better performance compared with other existing schemes in terms of power consumption and the efficiency of DSPs and block random access memory(BRAMs). 展开更多
关键词 artificial intelligence of things(AIoT) convolutional neural network(CNN)accelerator winograd convolution field-programmable gate array(FPGA)
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部