Abstract
Object tracking is an important research direction in computer vision. Its task is to continuously detect and accurately localize objects of interest in video sequences. With the rapid development of deep learning, visual object tracking has made significant progress in feature modeling, temporal association, and end-to-end learning. This paper systematically reviews the current state of research on deep learning-based visual object tracking. Starting from single-object tracking and multi-object tracking, it analyzes the basic ideas and implementation mechanisms of different frameworks and describes the evolution from traditional hand-crafted feature methods to deep feature-driven models. It also summarizes commonly used tracking datasets and performance evaluation metrics, and discusses the main problems facing current research, including limited long-term modeling capability, poor tracking robustness in complex scenes, and limited cross-scene generalization. Finally, the paper looks ahead to future trends in visual object tracking, pointing out that unified deep frameworks that integrate multimodal information with spatiotemporal modeling will become an important research direction in the field.
Authors
罗元
马文龙
唐小平
LUO Yuan; MA Wenlong; TANG Xiaoping (School of Electronic Science and Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, CHN)
Source
《半导体光电》
Peking University Core Journal (北大核心)
2026, No. 1, pp. 13-27 (15 pages)
Semiconductor Optoelectronics
Funding
University-enterprise cooperation project (E020H2022009).