Abstract
RGB and thermal infrared (RGBT) tracking is a multi-modal object tracking approach that combines information from visible-light and thermal infrared sensors. It aims to overcome the limitations of a single sensor under specific conditions and to improve tracking robustness and accuracy by fusing data from multiple sensors. However, most existing RGBT tracking methods directly fuse the features extracted from visible-light and thermal infrared images, ignoring the homogeneity and heterogeneity of the two modalities. In addition, RGBT tracking is often affected by multiple challenging factors, such as fast target motion, scale variation, illumination variation, thermal crossover, and occlusion. Existing work typically relies on a single structure to handle all of these challenges simultaneously, which requires a sufficiently complex model and a large amount of training data. This paper proposes a novel network for RGBT tracking, CMHHNet (facing different Challenges and combining Multi-modal Homogeneous and Heterogeneous information separation and integration Network). In each layer of the backbone, a challenge-aware module fuses the features of the visible-light and thermal infrared modalities under each challenge separately and then adaptively aggregates the fused features across all challenges. An attention enhancement module and a multi-scale auxiliary module are added to strengthen the features extracted by the backbone. Finally, based on the homogeneity and heterogeneity of the visible-light and thermal infrared modalities, their modality-specific and shared features are extracted separately and adaptively fused. Extensive experiments on the GTOT, RGBT234, and LasHeR datasets show that the proposed tracker is highly competitive with existing RGBT tracking methods.
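The per-challenge fusion and adaptive aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion operator or how the aggregation weights are computed, so everything below is an assumption made for illustration: flat feature vectors stand in for feature maps, each challenge branch uses a hypothetical scalar gate to mix the two modalities, and the aggregation weights come from a softmax over hypothetical linear scores. It is not the CMHHNet implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_per_challenge(rgb, tir, gate):
    """Assumed fusion for one challenge branch: a convex mix of the
    visible-light (rgb) and thermal infrared (tir) feature vectors."""
    return [gate * r + (1.0 - gate) * t for r, t in zip(rgb, tir)]

def aggregate(rgb, tir, gates, score_weights):
    """Fuse the two modalities once per challenge, then combine the
    per-challenge results with softmax weights (assumed scoring rule)."""
    # One fused feature vector per challenge branch.
    fused = [fuse_per_challenge(rgb, tir, g) for g in gates]
    # Hypothetical linear score for each branch.
    scores = [sum(w * f for w, f in zip(score_weights, feat)) for feat in fused]
    weights = softmax(scores)
    # Weighted sum of the per-challenge fused features.
    aggregated = [
        sum(w * feat[i] for w, feat in zip(weights, fused))
        for i in range(len(rgb))
    ]
    return aggregated, weights
```

Here `gates` would hold one value per challenge (for example, five values for fast motion, scale variation, illumination variation, thermal crossover, and occlusion). In a trained network both the gates and the scoring parameters would be learned rather than fixed by hand.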
Authors
FANG Xin (方鑫)
CHEN Zhe (陈柘)
LIU Zhan-wen (刘占文)
LI Xiao-peng (李小鹏)
SU Yu-xin (宿雨心)
School of Information Engineering, Chang'an University, Xi'an, Shaanxi 710064, China
Source
《电子学报》 (Acta Electronica Sinica)
Peking University Core Journal (北大核心)
2025, No. 3, pp. 910-925 (16 pages)
Funding
National Natural Science Foundation of China (No. 52172302)
Key Research and Development Program of Shaanxi Province (No. 2022GY-063)
Keywords
RGBT tracking
challenge-aware
separation of homogeneous and heterogeneous information
adaptive aggregation
attention mechanism
multi-scale features