Abstract
When convolutional neural networks (CNNs) are used to extract cow face features, global context is often ignored and only local features of the cow face image are captured. The global receptive field of the Vision Transformer (ViT) model can effectively mitigate this limitation of the CNN's local receptive field. A cow face recognition algorithm based on the ViT model is proposed. First, a patch-shift network layer is added to the ViT model; by capturing the global and local features of the cow face image, together with the correlations between local features, it effectively alleviates the influence of dirt on the cow face image. Then, a learnable mask matrix is added after the patch-shift layer; the mask matrix learns the importance of each image patch, so that the model pays more attention to patches containing the cow face and suppresses interference from background noise. Simulation experiments are carried out on three normal image databases (frontal face, left profile, and right profile) and on special image databases. Compared with CNN-based cow face recognition algorithms, the proposed algorithm effectively reduces the false rejection rate at zero false acceptance and improves Top-1 ranking performance.
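The two components named in the abstract can be illustrated with a small sketch. The abstract does not specify how the patch-shift layer or the mask matrix is implemented, so everything below is an assumption made for illustration: the patch-shift step is modeled as concatenating each patch with the same patch position taken from four diagonally shifted copies of the image (a common shifted-patch tokenization scheme), and the mask is modeled as one sigmoid-squashed logit per patch. The function names (`patch_shift_tokens`, `apply_patch_mask`) are hypothetical, and the mask logits are fixed by hand here rather than learned by gradient descent.

```python
import math


def split_into_patches(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches


def shift_image(image, dy, dx):
    """Shift image content by (dy, dx) pixels, zero-filling the uncovered border."""
    h, w = len(image), len(image[0])
    return [[image[r - dy][c - dx]
             if 0 <= r - dy < h and 0 <= c - dx < w else 0.0
             for c in range(w)]
            for r in range(h)]


def patch_shift_tokens(image, patch):
    """Assumed patch-shift step: each token is the original patch concatenated
    with the patches at the same position in four diagonally shifted copies."""
    s = patch // 2  # shift by half a patch (an assumption, not from the paper)
    offsets = ((-s, -s), (-s, s), (s, -s), (s, s))
    views = [image] + [shift_image(image, dy, dx) for dy, dx in offsets]
    per_view = [split_into_patches(view, patch) for view in views]
    n_patches = len(per_view[0])
    return [[x for view in per_view for x in view[i]] for i in range(n_patches)]


def apply_patch_mask(tokens, mask_logits):
    """Assumed mask step: scale each patch token by sigmoid(logit), so that
    patches with low learned importance (e.g. background) are suppressed."""
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mask_logits]
    return [[w * x for x in token] for token, w in zip(tokens, weights)]


if __name__ == "__main__":
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    tokens = patch_shift_tokens(image, patch=2)
    masked = apply_patch_mask(tokens, [0.0, 50.0, -50.0, 0.0])
    print(len(tokens), len(tokens[0]))
```

In a full model the concatenated tokens would be linearly projected to the embedding dimension before entering the ViT encoder, and the mask logits would be trainable parameters; both steps are omitted here to keep the sketch dependency-free.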
Authors
郑鹏
沈雷
刘浩
牟家乐
ZHENG Peng; SHEN Lei; LIU Hao; MOU Jiale (School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China)
Source
《杭州电子科技大学学报(自然科学版)》
2022, No. 6, pp. 40-46 (7 pages)
Journal of Hangzhou Dianzi University: Natural Sciences
Funding
General Scientific Research Funding Project of the Zhejiang Provincial Department of Education (Y202046969).