Visual object tracking (VOT), which aims to track a target object through a continuous video, is a fundamental and critical task in computer vision. However, reliance on third-party resources (e.g., datasets) for training poses concealed threats to the security of VOT models. In this paper, we reveal that VOT models are vulnerable to a poison-only, targeted backdoor attack, in which the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data. Specifically, we first define and formulate three variants of the targeted attack: size-manipulation, trajectory-manipulation, and hybrid attacks. To implement these, we introduce Random Video Poisoning (RVP), a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences. Extensive experiments demonstrate that RVP effectively injects controllable backdoors, enabling precise manipulation of tracking behavior upon trigger activation while maintaining high performance on benign data, thus ensuring stealth. Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses, such as dataset watermarking for copyright protection.
To improve small object detection and trajectory estimation from an aerial moving perspective, we propose the Aerial View Attention-PRB (AVA-PRB) model. AVA-PRB integrates two attention mechanisms, Coordinate Attention (CA) and the Convolutional Block Attention Module (CBAM), to enhance detection accuracy. Additionally, Shape-IoU is employed as the loss function to refine localization precision. Our model further incorporates an adaptive feature fusion mechanism, which optimizes multi-scale object representation, ensuring robust tracking in complex aerial environments. We evaluate the performance of AVA-PRB on two benchmark datasets: Aerial Person Detection and VisDrone2019-Det. The model achieves 60.9% mAP@0.5 on the Aerial Person Detection dataset and 51.2% mAP@0.5 on VisDrone2019-Det, demonstrating its effectiveness in aerial object detection. Beyond detection, we propose a novel trajectory estimation method that improves movement path prediction under aerial motion. Experimental results indicate that our approach reduces path deviation by up to 64%, effectively mitigating errors caused by rapid camera movements and background variations. By optimizing feature extraction and enhancing spatial-temporal coherence, our method significantly improves object tracking under aerial moving perspectives. This research addresses the limitations of fixed-camera tracking, enhancing flexibility and accuracy in aerial tracking applications. The proposed approach has broad potential for real-world applications, including surveillance, traffic monitoring, and environmental observation.
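The mAP@0.5 figures quoted above count a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows plain IoU only; it is not the Shape-IoU loss used by AVA-PRB, and the function names and the (x1, y1, x2, y2) box layout are assumptions made for illustration. It also hints at why small aerial objects are hard: the same 2-pixel localization error that is harmless on a large box fails the 0.5 threshold on a 4x4 box.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred, gt) >= threshold


# Same 2-pixel horizontal error on a large box and on a small box.
large_iou = iou((0, 0, 100, 100), (2, 0, 102, 100))
small_iou = iou((0, 0, 4, 4), (2, 0, 6, 4))
print(f"large box IoU: {large_iou:.3f}, small box IoU: {small_iou:.3f}")
```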
Advancements in animal behavior quantification methods have driven the development of computational ethology, enabling fully automated behavior analysis. Existing multi-animal pose estimation workflows rely on tracking-by-detection frameworks for either bottom-up or top-down approaches, requiring retraining to accommodate diverse animal appearances. This study introduces InteBOMB, an integrated workflow that enhances top-down approaches by incorporating generic object tracking, eliminating the need for prior knowledge of target animals while maintaining broad generalizability. InteBOMB includes two key strategies for tracking and segmentation in laboratory environments and two techniques for pose estimation in natural settings. The "background enhancement" strategy optimizes foreground-background contrastive loss, generating more discriminative correlation maps. The "online proofreading" strategy stores human-in-the-loop long-term memory and dynamic short-term memory, enabling adaptive updates to object visual features. The "automated labeling suggestion" technique reuses the visual features saved during tracking to identify representative frames for training set labeling. Additionally, the "joint behavior analysis" technique integrates these features with multimodal data, expanding the latent space for behavior classification and clustering. To evaluate the framework, six datasets of mice and six datasets of nonhuman primates were compiled, covering laboratory and natural scenes. Benchmarking results demonstrated a 24% improvement in zero-shot generic tracking and a 21% enhancement in joint latent space performance across datasets, highlighting the effectiveness of this approach in robust, generalizable behavior analysis.
An improved method for estimating the motion vectors of feature points is proposed for tracking moving objects in dynamic image sequences. Feature points are first extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated using the parameters of an affine transform that models it. With reasonable morphological filtering, the moving objects are completely extracted from the background and then tracked accurately. Experimental results show that the improved method succeeds at background motion compensation and offers great promise for tracking moving objects in dynamic image sequences.
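As a rough illustration of the RANSAC step described above, the following sketch fits an affine background-motion model to matched points and flags the points that do not follow it as candidates for moving objects. It is a minimal sketch under stated assumptions, not the authors' implementation: the correspondences are synthetic (the paper obtains them from MIC feature points and adaptive rood pattern search), and the function names, the 1-pixel tolerance, and the iteration count are all hypothetical choices.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A with dst ~= src @ A[:, :2].T + A[:, 2]."""
    design = np.hstack([src, np.ones((len(src), 1))])      # rows are [x, y, 1]
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)   # shape (3, 2)
    return params.T                                         # shape (2, 3)


def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (n, 2) array of points."""
    return pts @ A[:, :2].T + A[:, 2]


def ransac_affine(src, dst, n_iters=200, tol=1.0, seed=0):
    """Return the affine model with the most inliers and the inlier mask."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), size=3, replace=False)  # minimal sample
        A = fit_affine(src[sample], dst[sample])
        residuals = np.linalg.norm(apply_affine(A, src) - dst, axis=1)
        inliers = residuals < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers


# Synthetic scene: 40 background points follow the camera motion A_true,
# 10 points on an object additionally move on their own.
data_rng = np.random.default_rng(1)
background = data_rng.uniform(0, 100, size=(40, 2))
moving = data_rng.uniform(40, 60, size=(10, 2))
A_true = np.array([[1.0, 0.02, 3.0], [-0.02, 1.0, -2.0]])
src = np.vstack([background, moving])
dst = apply_affine(A_true, src)
dst[40:] += np.array([8.0, 5.0])  # independent object motion

A_est, inliers = ransac_affine(src, dst)
print("background points kept:", int(inliers[:40].sum()), "of 40")
print("object points rejected:", int((~inliers[40:]).sum()), "of 10")
```

Once the background model is known, warping the previous frame with it and differencing against the current frame leaves mostly the independently moving regions, which is where the morphological filtering mentioned in the abstract would come in.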
This paper describes a new framework for object detection and tracking on an autonomous underwater vehicle (AUV), comprising underwater acoustic data interpolation, underwater acoustic image segmentation, and underwater object tracking. The framework is applied to the design of a vision-based method for an AUV equipped with a forward-looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved double-threshold segmentation method is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created using the Gaussian particle filter. A weighted integration strategy combining area and invariant moments is proposed to refine the particle weights and to enhance tracking robustness. Results obtained on the real acoustic vision platform of the AUV during sea trials are presented and discussed. They show that the proposed method can detect and track moving underwater objects online, and that it is effective and robust.
This paper addresses the problem of real-time object tracking for unmanned aerial vehicles. We treat object tracking as a classification problem. Training a good classifier usually requires a huge number of samples, which is time-consuming and unsuitable for real-time applications. In this paper, we transform the large-scale least-squares problem in the spatial domain into a series of small-scale least-squares problems with constraints in the Fourier domain using the correlation filter technique. This problem is then solved efficiently in two stages. In the first stage, a fast method based on recursive least squares is used to solve the correlation filter problem without constraints in the Fourier domain. In the second stage, a weight matrix is constructed to prune the solution attained in the first stage so that it approaches the constraints in the spatial domain. The pruned classifier is then used for tracking. To evaluate the proposed tracker's performance, comprehensive experiments are conducted on challenging aerial sequences in the UAV123 dataset. Experimental results demonstrate that the proposed approach achieves state-of-the-art tracking performance on aerial sequences and operates at a mean speed above 40 frames/s. To further analyze the proposed tracker's robustness, extensive experiments are also performed on the recent benchmarks OTB50, OTB100, and VOT2016.
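The reason the Fourier domain helps is that, for circularly shifted training samples, the regularized least-squares problem decouples into one scalar problem per frequency, so the filter is obtained by element-wise division. The sketch below shows only this unconstrained, single-sample closed form (in the style of a MOSSE filter); the recursive least-squares update and the spatial-constraint pruning stage of the paper are not reproduced. The function names, the regularizer `lam`, the label width `sigma`, and the random test patch are assumptions for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at index (0, 0), wrapped circularly."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    xs = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(ys ** 2 + xs ** 2) / (2 * sigma ** 2))


def train_filter(patch, label, lam=1e-2):
    """Per-frequency ridge regression: H* = G conj(F) / (F conj(F) + lam)."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(label)
    return G * np.conj(F) / (F * np.conj(F) + lam)


def locate(filter_conj, patch):
    """Correlate a new patch with the filter and return the response peak."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * filter_conj))
    return np.unravel_index(np.argmax(response), response.shape)


def unwrap(peak, shape):
    """Map circular peak indices to signed displacements."""
    return tuple(int(p) if p <= s // 2 else int(p) - s for p, s in zip(peak, shape))


rng = np.random.default_rng(0)
template = rng.standard_normal((32, 32))           # stand-in for an image patch
H_conj = train_filter(template, gaussian_label(template.shape))

# Simulate target motion with a circular shift of 5 rows down, 3 columns left.
shifted = np.roll(template, shift=(5, -3), axis=(0, 1))
displacement = unwrap(locate(H_conj, shifted), template.shape)
print("estimated displacement (rows, cols):", displacement)
```

The cost is dominated by the FFTs, which is what makes this family of trackers attractive for onboard, real-time use.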
A method for moving object recognition and tracking in intelligent traffic monitoring systems is presented. To address the shortcomings of the frame-subtraction method, a moving object recognition algorithm based on the redundant discrete wavelet transform (RDWT) is put forward, which detects moving objects directly in the RDWT domain. An improved adaptive mean-shift algorithm is used to track the moving object in subsequent frames. Experimental results show that the algorithm can effectively extract the moving object even when the object is similar to the background, and that the results are better than those of the traditional frame-subtraction method. Object tracking remains accurate when the size of the object changes. The algorithm therefore has practical value and application prospects.
There are two main trends in the development of unmanned aerial vehicle (UAV) technologies: miniaturization and intellectualization. Within these trends, realizing object tracking capabilities on a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system built on a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking, and a micro positioning deck is mounted to provide accurate pose estimation. To be robust against object appearance variations, a novel object tracking algorithm, denoted RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, in which an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
Collaborative robotics is a topic of high research interest in both academia and industry. It has been progressively adopted in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve several object detection and tracking tasks. These tasks are considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. To show the importance of top-view surveillance, a collaborative robotics framework is presented. It can assist in the detection and tracking of multiple objects in top-view surveillance. The framework consists of a smart robotic camera with an embedded visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, together with the detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, their generalization performance is also investigated by testing them on various sequences of a top-view data set. The detection models achieved a maximum True Detection Rate of 90% to 93%, with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed and guidelines for future work are given.
Inspired by human behavior, a robot object tracking model is proposed on the basis of a visual attention mechanism that is consistent with the theory of topological perception. The model integrates image-driven, bottom-up attention with object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. The bottom-up component segments the whole scene into the ground region and the salient regions. Guided by a top-down strategy implemented with a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. To evaluate the model, a mobile robot platform is developed, on which experiments are carried out. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and that the object regions are complete. A comparison of the proposed model with an existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. First, objects are detected with an adaptive kernel density estimation method, which uses an adaptive bandwidth and features combining colour and gradient. Second, models of the objects are built to describe motion, shape, and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusion, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are used to update the object models. Extensive experiments show that this method improves the accuracy and validity of object tracking even under occlusion and can be used in real-time visual surveillance systems.
In this paper, a novel object tracking method based on a particle filter and speeded up robust features (SURF) is proposed, which uses both color and SURF features. The SURF feature makes the tracking result more robust, while particle selection saves computation time. In addition, the matched particles are used to calculate the SURF weight. Because color, spatial, and SURF features are all adopted, this method is more robust than the traditional color-based appearance model. Experimental results on challenging sequences demonstrate robust and accurate tracking. Moreover, the proposed method outperforms other methods when objects of similar color intersect and when the object is partially occluded.
Video object tracking is an important research topic in computer vision, with a wide range of applications in video surveillance, robotics, human-computer interaction, and so on. Although many moving object tracking algorithms have been proposed, many difficulties remain in the actual tracking process, such as illumination change, occlusion, motion blur, scale change, and changes in the object's own appearance. Therefore, the development of object tracking technology is still challenging. The emergence of deep learning theory and methods provides a new opportunity for object tracking research, and deep learning is also the main theoretical framework for the study of moving object tracking algorithms in this paper. In this paper, the existing deep-learning-based target tracking algorithms are classified and organized. Based on previous knowledge and my own understanding, several solutions are proposed for the existing methods. In addition, existing deep learning target tracking methods still struggle to meet real-time requirements; how to design the network and the tracking process to improve both speed and accuracy leaves considerable room for research.
Single object tracking based on deep learning has achieved advanced performance in many computer vision applications. However, existing trackers have certain limitations owing to deformation, occlusion, movement, and other conditions. We propose a Siamese attentional dense network called SiamADN, trained offline in an end-to-end manner and aimed especially at unmanned aerial vehicle (UAV) tracking. First, it applies a dense network to reduce vanishing gradients, which strengthens feature transfer. Second, a channel attention mechanism is incorporated into the DenseNet structure in order to focus on the possible key regions. An advanced corner detection network is introduced to improve the subsequent tracking process. Extensive experiments are carried out on four main tracking benchmarks: OTB-2015, UAV123, LaSOT, and VOT. The accuracy rate on UAV123 is 78.9%, and the running speed is 32 frames per second (FPS), which demonstrates its efficiency in practical applications.
A literature analysis has shown that object search, recognition, and tracking systems are becoming increasingly popular. However, such systems do not achieve good practical results when analyzing small moving living objects ranging from 8 to 14 mm. This article examines methods and tools for recognizing and tracking the class of small moving objects, such as ants. To that end, a customized You Only Look Once Ants Recognition (YOLO_AR) Convolutional Neural Network (CNN) has been trained to recognize Messor structor ants in the laboratory, using the LabelImg object marker tool. The proposed model is an extension of the You Only Look Once v4 (YOLOv4) 512×512 model with an additional Self Regularized Non-Monotonic (Mish) activation function. Additionally, a scalable solution for continuous object recognition and tracking was implemented. This solution is based on the OpenDataCam system, with extended Object Tracking modules that allow for tracking and counting objects that have crossed a custom boundary line. During the study, the alignment algorithm used to find the trajectories of moving objects was modified. I discovered that the Hungarian algorithm showed better results in tracking small objects than the k-dimensional tree (k-d tree) matching algorithm used in OpenDataCam. Remarkably, this algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives (FP). Therefore, I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking (MOT) benchmark. Furthermore, additional customization parameters for parsing and filtering object recognition and tracking results were added, such as the boundary angle threshold (BAT) and past frames trajectory prediction (PFTP). Experimental tests confirmed the results of the study on a mobile device. During the experiment, parameters such as the quality of recognition and tracking of moving objects, the PFTP and BAT, and the configuration parameters of the neural network and boundary line model were analyzed. The results showed that the proposed methods increased tracking accuracy by 50%. The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.
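To make the matching step concrete, here is a minimal sketch of Hungarian (optimal assignment) matching between existing tracks and new detections, using Euclidean distance between centers as the cost. This is not the OpenDataCam tracker module or the author's code: the function names, the choice of cost, and the requirement that there are at least as many detections as tracks are assumptions for illustration. The solver is the standard O(n^3) potentials-based formulation of the Hungarian algorithm, written in pure Python.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-based)
    v = [0.0] * (m + 1)          # column potentials (1-based)
    p = [0] * (m + 1)            # p[j] = row currently matched to column j
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free column: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


def match_tracks(tracks, detections):
    """Pair each track center with a detection center, minimizing total distance."""
    cost = [[math.hypot(tx - dx, ty - dy) for dx, dy in detections]
            for tx, ty in tracks]
    return list(enumerate(hungarian(cost)))


# Hypothetical centers: three tracked ants and four detections in the next frame.
tracks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
detections = [(9.0, 1.0), (1.0, 9.0), (1.0, 1.0), (50.0, 50.0)]
pairs = match_tracks(tracks, detections)
print(pairs)
```

Unlike a nearest-neighbor lookup, which decides each track independently, the assignment is chosen jointly, so two nearby tracks cannot both claim the same detection. In practice a distance gate would usually be added so that a far-away detection starts a new track instead of being forced onto an old one.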
Object tracking with abrupt motion is an important research topic and has attracted wide attention. To obtain accurate tracking results, an improved particle filter tracking algorithm based on sparse representation and nonlinear resampling is proposed in this paper. First, sparse representation is used to compute particle weights, exploiting the fact that the weights are sparse when the object moves abruptly, so the potential object region can be predicted more precisely. Then, a nonlinear resampling process based on a nonlinear sorting strategy is proposed, which solves the problem of particle diversity impoverishment caused by traditional resampling methods. Experimental results on videos containing objects with various abrupt motions demonstrate the effectiveness of the proposed algorithm.
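The impoverishment problem mentioned above can be seen with a few lines of code. The sketch below implements traditional systematic resampling, i.e. the kind of baseline the abstract argues against, and not the paper's nonlinear resampling. When the weights are very sparse, as they are after an abrupt motion, nearly every resampled particle is a copy of the same ancestor. The function names and the example weights are assumptions for illustration.

```python
import bisect
import itertools
import random


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; small values mean degeneracy."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng):
    """Traditional systematic resampling; returns the index of each survivor."""
    n = len(weights)
    cdf = list(itertools.accumulate(weights))
    cdf[-1] = 1.0                      # guard against floating-point rounding
    start = rng.random() / n           # single random offset in [0, 1/n)
    return [min(bisect.bisect_right(cdf, start + i / n), n - 1)
            for i in range(n)]


# Sparse weights: one particle happens to sit on the object after a jump.
weights = [0.001] * 9 + [0.991]
indices = systematic_resample(weights, random.Random(0))

print("effective sample size:", round(effective_sample_size(weights), 3))
print("distinct particles after resampling:", len(set(indices)))
```

With so few distinct particles left, the filter can no longer explore the state space, which is the motivation for resampling schemes that preserve diversity.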
If a fairly fast-moving object exists in a complicated tracking environment, the snake's nodes may fall into inaccurate local minima. We propose a mean shift snake algorithm to solve this problem. However, if the object moves beyond the operating range of the mean shift snake module in successive frames, the mean shift snake's nodes may also fall into local minima while moving to the new object position. This paper presents a motion compensation strategy using a particle filter; accordingly, a new Particle Filter Mean Shift Snake (PFMSS) algorithm is proposed, which combines the particle filter with the mean shift snake to estimate the contour of a fast-moving object. First, the fast-moving object is tracked by the particle filter to obtain a coarse position, which is used to initialize the mean shift algorithm. Second, all relevant motion information is used to compensate the snake's node positions. Finally, the snake algorithm is used to extract the exact object contour, and the useful information about the object is fed back. Several real-world sequences are tested, and the results show that the new tracking method performs well, with high accuracy, on fast-moving objects in cluttered backgrounds.
Visual object tracking plays a crucial role in computer vision. In recent years, researchers have proposed various methods to achieve high-performance object tracking. Among these, Transformer-based methods have become a research hotspot due to their ability to model global context. However, current Transformer-based object tracking methods still face challenges such as low tracking accuracy and redundant feature information. In this paper, we introduce the self-calibration multi-head self-attention Transformer (SMSTracker) as a solution to these challenges. It employs a hybrid tensor decomposition self-organizing multi-head self-attention Transformer mechanism, which not only compresses and accelerates Transformer operations but also significantly reduces redundant data, thereby enhancing the accuracy and efficiency of tracking. Additionally, we introduce a self-calibration attention fusion block to resolve the attention ambiguities and inconsistencies commonly found in traditional tracking methods, ensuring stable and reliable tracking performance across various scenarios. By integrating a hybrid tensor decomposition approach with a self-organizing multi-head self-attentive Transformer mechanism, SMSTracker enhances the efficiency and accuracy of the tracking process. Experimental results show that SMSTracker achieves competitive performance in visual object tracking, demonstrating its potential to provide more robust and efficient tracking solutions in real-world applications.
Effective object tracking makes the activity of an object of interest directly observable. Accurate object tracking requires robust tracking procedures, irrespective of hardware assistance. Such approaches incur high computational complexity in order to track an object with high accuracy within a stipulated processing time. Moreover, tracking is degraded by various quality-diminishing factors such as occlusion, illumination changes, and shadows. To address these shortcomings, this paper proposes a novel background normalization procedure based on a textural pattern. After an acquired image is preprocessed, an Environmental Succession Prediction algorithm is applied to discriminate between disparate background environments through background clustering. Textural characteristics are then abstracted using a Probability based Gradient Pattern (PGP) approach to recognize the similarity between the patterns obtained so far. Comparing the previously obtained standardized frame with the processed patterns detects the motion exhibited by an object, and the object concerned is identified within a blob. Hence, the system is resistant to illumination variations. These illumination variations were examined for object tracking within a dynamic background. The devised approach outperforms other object tracking methodologies such as Group Target Tracking (GTT), ViPER-GT, GrabCut, and snakes in terms of accuracy and average time. The proposed PGP-based texture pattern analysis is also compared with the Gamifying Video Object (GVO) approach and outperforms it in terms of precision, recall, and F1 measure.
Recently, Siamese-based trackers have achieved excellent performance in object tracking. However, the high speed and deformation of objects during movement make tracking difficult. Therefore, we have incorporated cascaded region-proposal-network (RPN) fusion and coordinate attention into Siamese trackers. The proposed network framework consists of three parts: a feature-extraction sub-network, a coordinate attention block, and a cascaded RPN block. We exploit the coordinate attention block, which can embed location information into channel attention, to establish long-term spatial location dependence while maintaining channel associations. Thus, the features of different layers are enhanced by the coordinate attention block. We then send these features separately into the cascaded RPN for classification and regression. The final position of the target is obtained from the two classification and regression results. To verify the effectiveness of the proposed method, we conducted comprehensive experiments on the OTB100, VOT2016, UAV123, and GOT-10k datasets. Compared with other state-of-the-art trackers, the proposed tracker achieved good performance and can run at real-time speed.
Funding (RVP backdoor-attack paper): supported in part by the "Pioneer" and "Leading Goose" R&D Program of Zhejiang under Grant No. 2024C01169 and the National Natural Science Foundation of China under Grant Nos. 62441238 and U2441240.
Funding (AVA-PRB paper): funded by the National Science and Technology Council (NSTC), Taiwan, under grant numbers NSTC 113-2634-F-A49-007 and NSTC 112-2634-F-A49-007.
Funding: supported by the STI 2030-Major Projects (2022ZD0211900, 2022ZD0211902), the STI 2030-Major Projects (2021ZD0204500, 2021ZD0204503), the National Natural Science Foundation of China (32171461), and the National Key Research and Development Program of China (2023YFC3208303).
Abstract: Advancements in animal behavior quantification methods have driven the development of computational ethology, enabling fully automated behavior analysis. Existing multi-animal pose estimation workflows rely on tracking-by-detection frameworks for either bottom-up or top-down approaches, requiring retraining to accommodate diverse animal appearances. This study introduces InteBOMB, an integrated workflow that enhances top-down approaches by incorporating generic object tracking, eliminating the need for prior knowledge of target animals while maintaining broad generalizability. InteBOMB includes two key strategies for tracking and segmentation in laboratory environments and two techniques for pose estimation in natural settings. The "background enhancement" strategy optimizes foreground-background contrastive loss, generating more discriminative correlation maps. The "online proofreading" strategy stores human-in-the-loop long-term memory and dynamic short-term memory, enabling adaptive updates to object visual features. The "automated labeling suggestion" technique reuses the visual features saved during tracking to identify representative frames for training set labeling. Additionally, the "joint behavior analysis" technique integrates these features with multimodal data, expanding the latent space for behavior classification and clustering. To evaluate the framework, six datasets of mice and six datasets of nonhuman primates were compiled, covering laboratory and natural scenes. Benchmarking results demonstrated a 24% improvement in zero-shot generic tracking and a 21% enhancement in joint latent space performance across datasets, highlighting the effectiveness of this approach in robust, generalizable behavior analysis.
Abstract: An improved estimation of the motion vectors of feature points is proposed for tracking moving objects in dynamic image sequences. Feature points are first extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated using the parameters of an affine transform of the background motion. With reasonable morphological filtering, the moving objects are completely extracted from the background and then tracked accurately. Experimental results show that the improved method succeeds at compensating background motion and offers great promise for tracking moving objects in dynamic image sequences.
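The RANSAC step described above can be sketched as follows: repeatedly fit an affine transform to three random point matches and keep the model that most matches agree with, so that matches on moving objects are rejected as outliers. This is a minimal pure-Python illustration, not the paper's implementation; the function names, the 200-iteration budget, and the 1-pixel tolerance are assumptions, and the MIC extraction and rood pattern search stages are not modelled.

```python
import random
from math import hypot


def _det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def _solve3(m, b):
    """Solve the 3x3 system m x = b by Cramer's rule; None if m is singular."""
    d = _det3(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(_det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Affine map from exactly three matches: x' = a*x + b*y + c, y' = d*x + e*y + f."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = _solve3(m, [p[0] for p in dst])
    row_y = _solve3(m, [p[1] for p in dst])
    if row_x is None or row_y is None:
        return None  # the three source points are collinear
    return row_x, row_y


def apply_affine(model, point):
    (a, b, c), (d, e, f) = model
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Return (model, inlier_indices) for the affine model with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(range(len(src)), 3)
        model = fit_affine([src[i] for i in sample], [dst[i] for i in sample])
        if model is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(zip(src, dst)):
            px, py = apply_affine(model, p)
            if hypot(px - q[0], py - q[1]) <= tol:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


if __name__ == "__main__":
    # Nine background points shifted by the camera, two points on a moving object.
    background = [(10.0 * x, 10.0 * y) for x in range(3) for y in range(3)]
    moving = [(15.0, 15.0), (16.0, 15.0)]
    src = background + moving
    dst = [(x + 5.0, y - 2.0) for x, y in background] + \
          [(x + 20.0, y + 15.0) for x, y in moving]
    model, inliers = ransac_affine(src, dst)
    print(len(inliers), "background matches kept")
```

Warping the previous frame with the recovered model and differencing it against the current frame would leave mostly the independently moving objects, which is where the morphological filtering mentioned in the abstract comes in.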
Funding: supported by the National Natural Science Foundation of China (Grant No. 51009040), the Heilongjiang Postdoctoral Fund (Grant No. LBH-Z11205), the National High Technology Research and Development Program of China (863 Program, Grant No. 2011AA09A106), and the China Postdoctoral Science Foundation (Grant No. 2012M510928).
Abstract: This paper describes a new framework for object detection and tracking for an AUV, including underwater acoustic data interpolation, underwater acoustic image segmentation, and underwater object tracking. The framework is applied to the design of a vision-based method for an AUV equipped with a forward-looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved double-threshold segmentation method is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created in light of the Gaussian particle filter. A weighted integration strategy combining the area and invariant moment is proposed to refine the particle weights and to enhance tracking robustness. Results obtained on the real acoustic vision platform of an AUV during sea trials are presented and discussed. They show that the proposed method can detect and track moving underwater objects online, and that it is effective and robust.
Funding: supported by the National Natural Science Foundation of China (No. 61671002).
Abstract: This paper addresses the problem of real-time object tracking for unmanned aerial vehicles. We consider the task of object tracking as a classification problem. Training a good classifier requires a huge number of samples, which is time-consuming and not suitable for real-time applications. In this paper, we transform the large-scale least-squares problem in the spatial domain into a series of small-scale least-squares problems with constraints in the Fourier domain using the correlation filter technique. This problem is then solved efficiently in two stages. In the first stage, a fast method based on recursive least squares is used to solve the correlation filter problem without constraints in the Fourier domain. In the second stage, a weight matrix is constructed to prune the solution attained in the first stage so that it approaches the constraints in the spatial domain. The pruned classifier is then used for tracking. To evaluate the proposed tracker's performance, comprehensive experiments are conducted on challenging aerial sequences in the UAV123 dataset. Experimental results demonstrate that the proposed approach achieves state-of-the-art tracking performance on aerial sequences and operates at a mean speed of more than 40 frames/s. For further analysis of the proposed tracker's robustness, extensive experiments are also performed on the recent benchmarks OTB50, OTB100, and VOT2016.
Abstract: A method for moving object recognition and tracking in intelligent traffic monitoring systems is presented. To address the shortcomings of the frame-subtraction method, a moving object recognition algorithm based on the redundant discrete wavelet transform (RDWT) is put forward, which detects moving objects directly in the RDWT domain. An improved adaptive mean-shift algorithm is used to track the moving object in subsequent frames. Experimental results show that the algorithm can effectively extract the moving object even when the object is similar to the background, and that the results are better than those of the traditional frame-subtraction method. The object tracking is accurate and is not affected by changes in the size of the object. The algorithm therefore has practical value and good prospects.
Funding: supported in part by the Institute for Guo Qiang of Tsinghua University (2019GQG1023), in part by the Graduate Education and Teaching Reform Project of Tsinghua University (202007J007), in part by the National Natural Science Foundation of China (U19B2029, 62073028, 61803222), and in part by the Independent Research Program of Tsinghua University (2018Z05JDX002).
Abstract: There are two main trends in the development of unmanned aerial vehicle (UAV) technologies: miniaturization and intellectualization, in which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking capabilities, and a micro positioning deck is mounted to provide accurate pose estimation. In order to be robust against object appearance variations, a novel object tracking algorithm, denoted by RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
Funding: supported by the Framework of International Cooperation Program managed by the National Research Foundation of Korea (2019K1A3A1A8011295711).
Abstract: Collaborative robotics is a high-interest research topic in both academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks are considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. To show the importance of top-view surveillance, a collaborative robotics framework is presented. It can assist in the detection and tracking of multiple objects in top-view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with the detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, their generalization performance is also investigated by testing the models on various sequences of a top-view data set. The detection models achieved a maximum True Detection Rate of 90% to 93% with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with future guidelines.
Funding: supported by the National Basic Research Program of China (973 Program) (No. 2006CB300407) and the National Natural Science Foundation of China (No. 50775017).
Abstract: Inspired by human behavior, a robot object tracking model is proposed on the basis of a visual attention mechanism that fits the theory of topological perception. The model integrates image-driven, bottom-up attention with object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. In the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by a top-down strategy realized through a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. To evaluate the model, a mobile robot platform is developed, on which experiments are carried out. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and that the object regions are complete. A comparison of the proposed model with the existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
Funding: supported by the National Natural Science Foundation of China (60835004, 60775047, 60872130) and the National High Technology Research and Development Program of China (863 Program) (2007AA04Z244, 2008AA04Z214).
Abstract: An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, models of objects are built to describe motion, shape, and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusions, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are utilized for updating the object models. Extensive experiments show that this method improves the accuracy and validity of object tracking even under occlusions and is used in real-time visual surveillance systems.
Funding: supported by the NSC under Grant No. NSC101-2221-E-259-032-MY3.
Abstract: In this paper, a novel object tracking method based on a particle filter and speeded up robust features (SURF) is proposed, which uses both color and SURF features. The SURF feature makes the tracking result more robust, while the particle selection saves time. In addition, the matched particles are used to calculate the SURF weight. Because color, spatial, and SURF features are all adopted, this method is more robust than the traditional color-based appearance model. Experimental results demonstrate robust and accurate tracking on challenging sequences. Moreover, the proposed method outperforms other methods when objects of similar color intersect and when the object is partially occluded.
Funding: supported by the National Natural Science Foundation of China (Grant No. 51874300), the National Natural Science Foundation of China and Shanxi Provincial People's Government Jointly Funded Project of China for Coal Base and Low Carbon (Grant No. U1510115), the National Natural Science Foundation of China (51104157), the Qing Lan Project, the China Postdoctoral Science Foundation (Grant No. 2013T60574), and the Scientific Instrument Developing Project of the Chinese Academy of Sciences (Grant No. YJKYYQ20170074).
Abstract: Video object tracking is an important research topic in computer vision, with a wide range of applications in video surveillance, robotics, human-computer interaction, and so on. Although many moving object tracking algorithms have been proposed, there are still many difficulties in the actual tracking process, such as illumination change, occlusion, motion blurring, scale change, and self-change. Therefore, the development of object tracking technology is still challenging. The emergence of deep learning theory and methods provides a new opportunity for research on object tracking, and it is also the main theoretical framework for the research on moving object tracking algorithms in this paper. In this paper, the existing deep tracking-based target tracking algorithms are classified and sorted out. Based on previous knowledge and my own understanding, several solutions are proposed for the existing methods. In addition, existing deep learning target tracking methods still struggle to meet real-time requirements; how to design the network and tracking process to improve both speed and effectiveness leaves considerable room for research.
Funding: supported by the Zhejiang Key Laboratory of General Aviation Operation Technology (No. JDGA2020-7), the National Natural Science Foundation of China (No. 62173237), the Natural Science Foundation of Liaoning Province (No. 2019-MS-251), the Talent Project of Revitalization Liaoning Province (No. XLYC1907022), the Key R&D Projects of Liaoning Province (No. 2020JH2/10100045), and the High-Level Innovation Talent Project of Shenyang (No. RC190030).
Abstract: Single object tracking based on deep learning has achieved advanced performance in many computer vision applications. However, existing trackers have certain limitations owing to deformation, occlusion, movement, and other conditions. We propose a siamese attentional dense network called SiamADN, trained in an end-to-end offline manner and aimed especially at unmanned aerial vehicle (UAV) tracking. First, it applies a dense network to mitigate the vanishing-gradient problem, which strengthens feature transfer. Second, a channel attention mechanism is incorporated into the DenseNet structure in order to focus on the possible key regions. An advanced corner detection network is introduced to improve the subsequent tracking process. Extensive experiments are carried out on four main tracking benchmarks: OTB-2015, UAV123, LaSOT, and VOT. The accuracy rate on UAV123 is 78.9%, and the running speed is 32 frames per second (FPS), which demonstrates its efficiency in practical applications.
Abstract: A literature analysis has shown that object search, recognition, and tracking systems are becoming increasingly popular. However, such systems do not achieve high practical results in analyzing small moving living objects ranging from 8 to 14 mm. This article examines methods and tools for recognizing and tracking the class of small moving objects, such as ants. To fulfill those aims, a customized You Only Look Once Ants Recognition (YOLO_AR) Convolutional Neural Network (CNN) has been trained to recognize Messor structor ants in the laboratory using the LabelImg object marker tool. The proposed model is an extension of the You Only Look Once v4 (YOLOv4) 512×512 model with an additional Self Regularized Non-Monotonic (Mish) activation function. Additionally, a scalable solution for continuous object recognition and tracking was implemented. This solution is based on the OpenDataCam system, with extended Object Tracking modules that allow for tracking and counting objects that have crossed a custom boundary line. During the study, the methods of the alignment algorithm for finding the trajectory of moving objects were modified. I discovered that the Hungarian algorithm showed better results in tracking small objects than the k-dimensional tree (k-d tree) matching algorithm used in OpenDataCam. Remarkably, this algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives (FP). Therefore, I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking (MOT) benchmark. Furthermore, additional customization parameters for parsing and filtering object recognition and tracking results were added, such as the boundary angle threshold (BAT) and past frames trajectory prediction (PFTP). Experimental tests confirmed the results of the study on a mobile device. During the experiment, parameters such as the quality of recognition and tracking of moving objects, the PFTP and BAT, and the configuration parameters of the neural network and boundary line model were analyzed. The results showed that the proposed methods increased tracking accuracy by 50%. The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.
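To illustrate why a global assignment can beat per-track nearest-neighbour matching, the sketch below compares the two on a tiny example. It is an assumption-laden illustration rather than the tracker module described above: the greedy matcher is a simplified stand-in for nearest-neighbour lookup (no k-d tree is built), and the optimal matcher uses exhaustive search over permutations, which finds the same minimum-cost assignment the Hungarian algorithm would but only scales to a handful of objects. All function names are hypothetical.

```python
from itertools import permutations
from math import hypot


def cost_matrix(tracks, detections):
    """Euclidean distance between every track position and every detection."""
    return [[hypot(tx - dx, ty - dy) for (dx, dy) in detections]
            for (tx, ty) in tracks]


def greedy_match(cost):
    """Each track, in order, claims its nearest detection that is still free."""
    taken, pairs = set(), []
    for i, row in enumerate(cost):
        free = [j for j in range(len(row)) if j not in taken]
        if not free:
            break
        j = min(free, key=lambda col: row[col])
        taken.add(j)
        pairs.append((i, j))
    return pairs


def optimal_match(cost):
    """Minimum-total-cost one-to-one assignment for a square cost matrix.

    Exhaustive search is used here for brevity; the Hungarian algorithm
    solves the same problem in polynomial time.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(enumerate(best))


def total_cost(cost, pairs):
    return sum(cost[i][j] for i, j in pairs)


if __name__ == "__main__":
    tracks = [(0.0, 0.0), (2.0, 0.0)]       # last known positions
    detections = [(1.0, 0.0), (-3.0, 0.0)]  # detections in the new frame
    cost = cost_matrix(tracks, detections)
    print("greedy :", greedy_match(cost), total_cost(cost, greedy_match(cost)))
    print("optimal:", optimal_match(cost), total_cost(cost, optimal_match(cost)))
```

In this example the first track greedily takes the detection that the second track needs, so the greedy total distance is larger than the optimal one. With many small, similar-looking objects moving close together, such conflicts are plausible, which is consistent with the abstract's observation.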
Funding: supported by the National Natural Science Foundation of China (61701029).
Abstract: Object tracking with abrupt motion is an important research topic and has attracted wide attention. To obtain accurate tracking results, an improved particle filter tracking algorithm based on sparse representation and nonlinear resampling is proposed in this paper. First, the sparse representation is used to compute particle weights by considering the fact that the weights are sparse when the object moves abruptly, so the potential object region can be predicted more precisely. Then, a nonlinear resampling process is proposed that utilizes a nonlinear sorting strategy, which can solve the problem of particle diversity impoverishment caused by traditional resampling methods. Experimental results on videos containing objects with various abrupt motions demonstrate the effectiveness of the proposed algorithm.
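The impoverishment problem mentioned above can be seen with a conventional resampler. The sketch below implements standard systematic resampling and the effective sample size N_eff = 1 / Σ w_i²; it illustrates the traditional baseline only, not the nonlinear resampling proposed in the paper, and the function names are assumptions.

```python
import random


def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for weights that already sum to one."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng=random):
    """Return the indices of the particles kept by systematic resampling.

    One uniform offset is drawn, then n evenly spaced pointers are swept
    across the cumulative weight distribution.
    """
    n = len(weights)
    start = rng.random() / n
    indices = []
    i, cumulative = 0, weights[0]
    for k in range(n):
        pointer = start + k / n
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


if __name__ == "__main__":
    # After an abrupt motion, one particle may carry almost all of the weight.
    peaked = [0.97, 0.01, 0.01, 0.01]
    print("N_eff:", effective_sample_size(peaked))
    print("kept :", systematic_resample(peaked, random.Random(0)))
```

With weights this peaked, nearly every resampled particle is a copy of particle 0, so the particle set loses the diversity needed to recover when the object jumps again. That loss is the failure mode the abstract attributes to traditional resampling.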
Funding: supported by the National Natural Science Foundation of China (No. 60672094).
Abstract: If a somewhat fast-moving object exists in a complicated tracking environment, the snake's nodes may fall into inaccurate local minima. We propose a mean shift snake algorithm to solve this problem. However, if the object goes beyond the limits of the mean shift snake module's operation in successive sequences, the mean shift snake's nodes may also fall into local minima while moving to the new object position. This paper presents a motion compensation strategy using a particle filter; a new Particle Filter Mean Shift Snake (PFMSS) algorithm is therefore proposed, which combines the particle filter with the mean shift snake to estimate the contour of a fast-moving object. Firstly, the fast-moving object is tracked by the particle filter to create a coarse position, which is used to initialize the mean shift algorithm. Secondly, the whole relevant motion information is used to compensate the snake's node positions. Finally, the snake algorithm is used to extract the exact object contour, and the useful information about the object is fed back. Several real-world sequences are tested, and the results show that the novel tracking method performs well, with high accuracy, on fast-moving objects in cluttered backgrounds.
Funding: supported by the National Natural Science Foundation of China under Grant 62177029 and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21_0740), China.
Abstract: Visual object tracking plays a crucial role in computer vision. In recent years, researchers have proposed various methods to achieve high-performance object tracking. Among these, methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information. However, current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information. In this paper, we introduce the self-calibration multi-head self-attention Transformer (SMSTracker) as a solution to these challenges. It employs a hybrid tensor decomposition self-organizing multi-head self-attention transformer mechanism, which not only compresses and accelerates Transformer operations but also significantly reduces redundant data, thereby enhancing the accuracy and efficiency of tracking. Additionally, we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional tracking methods, ensuring the stability and reliability of tracking performance across various scenarios. By integrating a hybrid tensor decomposition approach with a self-organizing multi-head self-attentive transformer mechanism, SMSTracker enhances the efficiency and accuracy of the tracking process. Experimental results show that SMSTracker achieves competitive performance in visual object tracking, demonstrating its potential to provide more robust and efficient tracking solutions in real-world applications.
Abstract: Effective object tracking makes the activity of the object of interest immediately apparent. Accurate tracking requires robust procedures that do not depend on hardware assistance. Such approaches incur a high computational complexity to track an object with high accuracy within a stipulated processing time. Tracking is also affected by various quality-diminishing factors such as occlusion, illumination changes, and shadows. To rectify these inadequacies, this paper proposes a novel background normalization procedure based on a textural pattern. After an acquired image is preprocessed, an Environmental Succession Prediction algorithm is employed to discriminate disparate background environments through a background clustering approach. Afterward, textural characterizations are abstracted using a Probability based Gradient Pattern (PGP) approach to recognize the similarity between the patterns obtained so far. Comparing the previously obtained standardized frame with the processed patterns detects the motion exhibited by an object, and the object concerned is identified within a blob. Hence, the system is resistant to illumination variations, which were examined for object tracking within a dynamic background. The devised approach outperforms other object tracking methodologies such as Group Target Tracking (GTT), ViPER-GT, GrabCut, and snakes in terms of accuracy and average time. The proposed PGP-based pattern texture analysis is also compared with the Gamifying Video Object (GVO) approach and outperforms it in terms of precision, recall, and F1 measure.
Funding: supported in part by the National Natural Science Foundation of China under Grants 61972056 and 61901061, the Science Fund for Creative Research Groups of Hunan Province under Grant 2020JJ1006, the Natural Science Foundation of Hunan Province under Grant 2020JJ5603, the Postgraduate Training Innovation Base Construction Project of Hunan Province under Grant 2019-248-51, the Basic Research Fund of Zhongye Changtian International Engineering Co., Ltd. under Grant 2020JCYJ07, and the Scientific Research Fund of Hunan Provincial Education Department under Grant 19C0031.
Abstract: Recently, Siamese-based trackers have achieved excellent performance in object tracking. However, the high speed and deformation of objects during movement make tracking difficult. Therefore, we have incorporated cascaded region-proposal-network (RPN) fusion and coordinate attention into Siamese trackers. The proposed network framework consists of three parts: a feature-extraction sub-network, a coordinate attention block, and a cascaded RPN block. We exploit the coordinate attention block, which can embed location information into channel attention, to establish long-term spatial location dependence while maintaining channel associations. Thus, the features of different layers are enhanced by the coordinate attention block. We then send these features separately into the cascaded RPN for classification and regression. The final position of the target is obtained from the two classification and regression results. To verify the effectiveness of the proposed method, we conducted comprehensive experiments on the OTB100, VOT2016, UAV123, and GOT-10k datasets. Compared with other state-of-the-art trackers, the proposed tracker achieved good performance and can run at real-time speed.