Cross-Modal Person Re-identification Based on Triple Embedding Extension and Feature Aggregation
Abstract: The main challenge in cross-modal person re-identification is the large modality gap between visible and infrared images, which lowers recognition accuracy. To address this, the authors propose a method based on triple embedding extension and feature aggregation. First, channel-level data augmentation is applied to visible images to generate third-modality images, which are used as additional input. Second, a triple embedding extension module expands the visible, infrared, and third-modality images to generate more embeddings, enlarging the embedding space and further narrowing the modality gap. Finally, a cross-modal feature aggregation module aggregates features from different stages, highlighting the important shared features in the images while the embeddings are enriched and reducing the influence of irrelevant features on the model. Experiments show that the method achieves Rank-1 and mAP of 75.10% and 71.11% in the all-search mode of the SYSU-MM01 dataset, 92.06% and 84.44% in the visible-to-infrared mode of the RegDB dataset, and 63.77% and 66.38% in the visible-to-infrared mode of the low-light LLCM dataset, outperforming current comparable methods.
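The Rank-1 and mAP figures quoted in the abstract are standard retrieval metrics. As a rough illustration of how such numbers are typically computed, and not the authors' evaluation code, the sketch below ranks gallery items by distance for each query and averages the results. The function name `rank1_and_map` and its arguments are hypothetical, and the dataset-specific protocol details (such as the camera filtering used for SYSU-MM01) are deliberately left out.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1, mAP) as fractions in [0, 1].

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar). Hypothetical helper for illustration.
    """
    hits = 0
    average_precisions: List[float] = []
    for i, qid in enumerate(query_ids):
        # Sort gallery indices from most to least similar to query i.
        order = sorted(range(len(gallery_ids)), key=lambda j: dist[i][j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # queries with no correct gallery item are skipped
        # Rank-1: is the single closest gallery item a correct match?
        hits += int(matches[0])
        # Average precision: mean of the precision at each correct match.
        found = 0
        precisions: List[float] = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        raise ValueError("no query has a matching gallery identity")
    return hits / n, sum(average_precisions) / n


# Toy example: two queries (e.g. infrared) against three gallery images.
toy_dist = [[0.1, 0.5, 0.3], [0.2, 0.4, 0.9]]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
print(f"Rank-1 = {rank1:.2%}, mAP = {mean_ap:.2%}")
```

In the toy example the first query is matched correctly at rank 1 and the second only at rank 2, so Rank-1 is 50% and mAP is 75%.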
Authors: LIU Suolan; XIA Yangyang (School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213159, Jiangsu, China; Jiangsu Key Laboratory of Image and Video Understanding for Social Security, Nanjing University of Science and Technology, Nanjing 210094, China)
Source: Journal of Kunming University of Science and Technology (Natural Science), PKU Core Journal, 2025, No. 6, pp. 45-56 (12 pages)
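Funding: National Natural Science Foundation of China (61976028); Project of the Jiangsu Key Laboratory of Image and Video Understanding for Social Security (J2021-2).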
Keywords: person re-identification; cross-modal; diverse embedding; self-attention mechanism; feature aggregation