Abstract
The Vision Transformer has achieved outstanding performance on many tasks in computer vision, but its complex network structure typically requires substantial storage and computational resources, making wide deployment on resource-constrained devices difficult. To address this, a compression method for the Vision Transformer based on pruning and distillation is proposed, aiming to reduce model size while preserving performance. First, through a structural analysis of the Vision Transformer, the targets of width pruning are identified as the attention heads in the multi-head self-attention mechanism and the hidden-layer neurons of the multi-layer perceptron, and their importance is assessed with a parameter-importance evaluation strategy based on changes in the model's loss function. Next, a post-pruning distillation strategy prunes the model in the width dimension and restores the accuracy of the pruned width subnetwork. Finally, in the depth dimension, the final compressed model is obtained through post-pruning distillation. The proposed method is evaluated by compressing Vision Transformers on the Tiny ImageNet, CIFAR-100, and CIFAR-10 datasets. On Tiny ImageNet, with the parameter count and computational cost reduced by 30%, the accuracy of the ViT-S model drops by only 0.3%, while that of the ViT-B model even improves by 0.6%. The experimental results show that the proposed method effectively balances model accuracy and compression ratio.
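The loss-change-based importance criterion described in the abstract can be illustrated with a first-order Taylor approximation, a common way to estimate how much the loss would change if a structure (here, an attention head) were removed. A minimal stdlib-only sketch under that assumption; the function names and the per-head flattened weight/gradient lists are illustrative, not taken from the paper:

```python
def head_importance(weights, grads):
    """Approximate |ΔL| for zeroing out one attention head via a
    first-order Taylor expansion of the loss: |sum_i g_i * w_i|,
    accumulated over the head's (flattened) parameters."""
    return abs(sum(w * g for w, g in zip(weights, grads)))

def select_heads_to_keep(heads, keep_ratio):
    """heads: list of (weights, grads) pairs, one per attention head.
    Scores every head, then keeps the top fraction given by keep_ratio.
    Returns the sorted indices of the heads that survive pruning."""
    scores = [head_importance(w, g) for w, g in heads]
    k = max(1, round(len(heads) * keep_ratio))
    ranked = sorted(range(len(heads)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In this sketch, heads whose parameters interact weakly with the gradient of the loss receive low scores and are pruned first; the surviving subnetwork would then be fine-tuned with distillation from the unpruned model, as the abstract outlines.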
Authors
ZHENG Yang; JIANG Xiaotian; FU Donghao; GUO Kaitai; LIANG Jimin (School of Electronic Engineering, Xidian University, Xi'an 710071, China)
Source
Journal of Xidian University (《西安电子科技大学学报》, Peking University core journal), 2025, Issue 4, pp. 55-65 (11 pages)
Funding
National Natural Science Foundation of China Youth Program (62101416, 62301405); National Natural Science Foundation of China (62476205).