Abstract
To faithfully map neural networks onto resource-constrained embedded devices, a mixed-precision quantization method for convolutional neural networks based on layer sensitivity analysis is proposed. The sensitivity of each convolutional layer's parameters is measured by the average trace of the Hessian matrix, which provides the basis for bit-width allocation. Bit-widths are then assigned with a layer-by-layer increase/decrease procedure, completing the mixed-precision quantization of the network model. Experimental results show that, compared with the fixed-precision quantization methods DoReFa and LSQ+, the proposed method improves recognition accuracy by 10.2% and 1.7%, respectively, at an average bit-width of 3 bits; compared with other mixed-precision quantization methods, it improves recognition accuracy by more than 1%. In addition, noise-injected training effectively improves the robustness of the mixed-precision quantization method, raising recognition accuracy by 16% at a noise standard deviation of 0.5.
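The abstract does not say how the average Hessian trace is computed. As a hedged illustration only, one common way to approximate it without forming the Hessian is Hutchinson's estimator, which averages v^T H v over random ±1 probe vectors v. The sketch below assumes toy, explicitly constructed Hessians; the names `hutchinson_avg_trace`, `toy_hessian`, and the layer names `conv1` to `conv3` are hypothetical and do not come from the paper. In a real network the Hessian-vector product would come from automatic differentiation.

```python
import random


def hutchinson_avg_trace(hvp, dim, num_samples=500, seed=0):
    """Estimate trace(H) / dim, using E[v^T H v] = trace(H) for Rademacher v."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hv = hvp(v)  # Hessian-vector product supplied by the caller
        total += sum(vi * hvi for vi, hvi in zip(v, hv))
    return total / (num_samples * dim)


def matvec(h, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in h]


def toy_hessian(dim, scale, seed):
    """Hypothetical stand-in for a layer's Hessian: scale * A A^T / dim."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    return [
        [scale * sum(a[i][k] * a[j][k] for k in range(dim)) / dim
         for j in range(dim)]
        for i in range(dim)
    ]


# Assumed toy setup: three "layers" whose curvature differs by a known scale.
layers = {
    "conv1": toy_hessian(16, 5.0, seed=1),
    "conv2": toy_hessian(16, 1.0, seed=2),
    "conv3": toy_hessian(16, 0.2, seed=3),
}

sensitivity = {
    name: hutchinson_avg_trace(lambda v, h=h: matvec(h, v), len(h))
    for name, h in layers.items()
}

# Layers sorted from most to least sensitive; a bit-width allocator could
# then favor higher precision for the layers at the front of this list.
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
print(ranking)
```

The ranking is only an input to bit-width allocation. The layer-by-layer increase/decrease procedure itself is not detailed in the abstract, so it is not sketched here.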
Authors
刘海军
张晨曦
王析羽
陈长林
陈军
李智炜
LIU Haijun; ZHANG Chenxi; WANG Xiyu; CHEN Changlin; CHEN Jun; LI Zhiwei (College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China)
Source
《国防科技大学学报》
Peking University Core Journal
2025, No. 4, pp. 143-150 (8 pages)
Journal of National University of Defense Technology
Funding
National Natural Science Foundation of China (62074166, 62304254, 62104256, 62404253, U23A20322).
Keywords
convolutional neural network
model quantization
artificial intelligence
mixed precision