RGB-T Tracking Network Based on Multi-Modal Feature Fusion
Abstract: In recent years, RGB-T tracking methods have been widely used in visual tracking because visible and thermal infrared (TIR) images are complementary. However, existing methods are still limited in how they exploit this complementary information; in particular, Transformer-based algorithms lack direct interaction between the two modalities, which makes it difficult to fully mine the semantic information of the RGB and TIR modalities. To address these problems, this paper proposes a Multi-Modal Feature Fusion Tracking Network for RGB-T (MMFFTN). First, after the backbone network extracts preliminary features, a Channel Feature Fusion Module (CFFM) is introduced to enable direct interaction and fusion of RGB and TIR channel features. Second, to address the unsatisfactory fusion that differences between the RGB and TIR modalities can cause, a Cross-Modal Feature Fusion Module (CMFM) is designed, which further fuses the global features of RGB and TIR through an adaptive fusion strategy to improve tracking accuracy. The proposed tracking model is evaluated in detail on three datasets: GTOT, RGBT234, and LasHeR. Experimental results show that, compared with the current advanced Transformer-based tracker ViPT, MMFFTN improves the success rate and precision rate by 3.0% and 4.7%, respectively; compared with the Transformer-based tracker SDSTrack, the success rate and precision rate improve by 2.4% and 3.3%, respectively.
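The abstract reports gains in success rate and precision rate without stating how they are computed. The sketch below follows the definitions commonly used in tracking benchmarks: precision rate as the fraction of frames whose predicted box center lies within a pixel threshold of the ground truth, and success rate as the area under the curve of frames whose overlap (IoU) exceeds each threshold in [0, 1]. The 20-pixel threshold, the 21 overlap thresholds, the `(x, y, width, height)` box format, and all function names are assumptions for illustration; the paper's exact evaluation protocol is not given here.

```python
import math
from typing import Sequence, Tuple

# Hypothetical box format: (x, y, width, height), with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between the two box centers, in pixels."""
    dx = (a[0] + a[2] / 2) - (b[0] + b[2] / 2)
    dy = (a[1] + a[3] / 2) - (b[1] + b[3] / 2)
    return math.hypot(dx, dy)


def precision_rate(preds: Sequence[Box], gts: Sequence[Box],
                   threshold: float = 20.0) -> float:
    """Fraction of frames whose center error is within `threshold` pixels.

    The 20-pixel default is an assumed, commonly used value.
    """
    if not gts or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and equally long")
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


def success_rate(preds: Sequence[Box], gts: Sequence[Box],
                 steps: int = 21) -> float:
    """Mean of the success curve over `steps` evenly spaced IoU thresholds.

    Each curve point is the fraction of frames with IoU strictly above the
    threshold; averaging the points approximates the area under the curve.
    """
    if not gts or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and equally long")
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve)
```

Under these definitions, a tracker that localizes the center well but estimates the box scale poorly can score high on precision rate and low on success rate, which is why results such as those above are usually reported on both metrics.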
Authors: JIN Jing; LIU Jianqin; ZHAI Fengwen (School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)
Source: Optics and Precision Engineering (《光学精密工程》), Peking University Core Journal, 2025, No. 12, pp. 1940-1954 (15 pages)
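Funding: Gansu Provincial University Teachers' Innovation Fund Project (No. 2025B-060); Ningxia Natural Science Foundation Project (No. 2023AAC03741); Gansu Provincial Science and Technology Program, Key R&D Program, Industrial Category (No. 23YFGA0047).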
Keywords: RGB-T tracking; Transformer; channel feature fusion; cross-modal feature fusion