Understanding fish movement trajectories in aquaculture is essential for practical applications such as disease warning, feeding optimization, and breeding management. These trajectories reveal key information about the fish's behavior, health, and environmental adaptability. However, when multi-object tracking (MOT) algorithms are applied to high-density aquaculture environments, occlusion and overlap among fish may cause missed detections, false detections, and identity switches, all of which limit tracking accuracy. To address these issues, this paper proposes FishTracker, an MOT algorithm built on a Tracking-by-Detection framework. First, the neck of the YOLOv8 model is enhanced with a Multi-Scale Dilated Attention (MSDA) module to improve object localization and classification confidence. Second, an Adaptive Kalman Filter (AKF) is employed in the tracking phase to dynamically adjust motion prediction parameters, thereby handling target adhesion and nonlinear motion in complex scenarios. Experimental results show that FishTracker achieves a multi-object tracking accuracy (MOTA) of 93.22% and 87.24% under bright and dark illumination conditions, respectively. Further validation in a real aquaculture scenario reveals that FishTracker achieves a MOTA of 76.70%, which is 5.34% higher than the baseline model. The higher order tracking accuracy (HOTA) reaches 50.5%, which is 3.4% higher than the benchmark. In conclusion, FishTracker can provide reliable technical support for accurate tracking and behavioral analysis of high-density fish populations.
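For readers unfamiliar with the MOTA figures quoted above, the following is a minimal sketch of the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames. The function name and argument names are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multi-object tracking accuracy (CLEAR MOT definition).

    All four counts are totals accumulated over every frame of a sequence:
    missed detections, false detections, identity switches, and the number
    of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    # MOTA can be negative when the tracker makes more errors than
    # there are ground-truth objects.
    return 1.0 - errors / num_gt


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(f"MOTA = {mota(10, 5, 2, 100):.2%}")
```

The three error terms correspond directly to the failure modes the abstract lists: missed detections, false detections, and identity switches.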
Embodied visual exploration is critical for building intelligent visual agents. This paper presents neural exploration with feature-based visual odometry and tracking-failure-reduction policy (NeOR), a framework for embodied visual exploration that retains the efficient exploration capabilities of deep reinforcement learning (DRL)-based exploration policies and leverages feature-based visual odometry (VO) for more accurate mapping and positioning. An improved local policy is also proposed to reduce tracking failures of feature-based VO in weakly textured scenes through a refined multi-discrete action space, keyframe fusion, and an auxiliary task. The experimental results demonstrate that NeOR achieves better mapping and positioning accuracy than other entirely learning-based exploration frameworks and improves the robustness of feature-based VO by significantly reducing tracking failures in weakly textured scenes.
Siamese tracking algorithms usually take convolutional neural networks (CNNs) as feature extractors owing to their capability of extracting deep discriminative features. However, the convolution kernels in CNNs have limited receptive fields, making it difficult to capture global feature dependencies, which are important for object detection, especially when the target undergoes large-scale variations or movement. In view of this, we develop a novel network called effective convolution mixed Transformer Siamese network (SiamCMT) for visual tracking, which integrates CNN-based and Transformer-based architectures to capture both local information and long-range dependencies. Specifically, we design a Transformer-based module named lightweight multi-head attention (LWMHA), which can be flexibly embedded into stage-wise CNNs and improves the network's representation ability. Additionally, we introduce a stage-wise feature aggregation mechanism that integrates features learned from multiple stages. By leveraging both location and semantic information, this mechanism helps SiamCMT to better locate the target. Moreover, to distinguish the contributions of different channels, a channel-wise attention mechanism is introduced to enhance the important channels and suppress the others. Extensive experiments on seven challenging benchmarks, i.e., OTB2015, UAV123, GOT10K, LaSOT, DTB70, UAVTrack112_L, and VOT2018, demonstrate the effectiveness of the proposed algorithm. In particular, the proposed method outperforms the baseline by 3.5% and 3.1% in terms of precision and success rates, respectively, with a real-time speed of 59.77 FPS on UAV123.
Target tracking is an essential task in contemporary computer vision applications. However, its effectiveness is susceptible to model drift caused by changes in target appearance, which often compromises tracking robustness and precision. In this paper, a universally applicable method based on correlation filters is introduced to mitigate model drift in complex scenarios. It employs temporal-confidence samples as a prior to guide the model update process and ensure its precision and consistency over a long period. An improved update mechanism based on the peak side-lobe to peak correlation energy (PSPCE) criterion is proposed, which selects high-confidence samples along the temporal dimension to update the temporal-confidence samples. Extensive experiments on various benchmarks demonstrate that the proposed method achieves competitive performance compared with state-of-the-art methods. In particular, when the target appearance changes significantly, our method is more robust and achieves a balance between precision and speed. Specifically, on the object tracking benchmark (OTB-100) dataset, compared to the baseline, the tracking precision of our model improves by 8.8%, 8.8%, 5.1%, 5.6%, and 6.9% for background clutter, deformation, occlusion, rotation, and illumination variation, respectively. The results indicate that the proposed method can significantly enhance the robustness and precision of target tracking in dynamic and challenging environments, offering a reliable solution for applications such as real-time monitoring, autonomous driving, and precision guidance.
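The abstract does not define the PSPCE criterion, so it is not reproduced here. As a rough illustration of how a confidence-gated model update works in correlation-filter trackers, the sketch below uses the well-known average peak-to-correlation energy (APCE) instead, which is a different but related measure. The function names, the `ratio` threshold, and the running-mean gate are assumptions for illustration only.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2D response map.

    APCE = |F_max - F_min|^2 / mean((F - F_min)^2).
    A sharp single peak gives a high value; a flat or multi-peak
    response (typical under occlusion or drift) gives a low value.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    mean_sq = sum((v - f_min) ** 2 for v in values) / len(values)
    if mean_sq == 0:
        return 0.0  # completely flat map: no confidence
    return (f_max - f_min) ** 2 / mean_sq


def should_update(response, history_mean, ratio=0.5):
    """Assumed gate: update the model only if the current confidence
    is at least `ratio` times the historical mean confidence."""
    return apce(response) >= ratio * history_mean


if __name__ == "__main__":
    sharp = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    blurry = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    print(apce(sharp), apce(blurry))
```

Gating updates this way skips frames where the response map is ambiguous, which is the general idea behind selecting high-confidence samples along the temporal dimension.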
Purpose: With the rapid advancement of China's high-speed rail network, the density of train operations is on the rise. To address the challenge of shortening train tracking intervals while enhancing transportation efficiency, the multi-objective dynamic optimization of the train operation process has emerged as a critical issue. Design/methodology/approach: A train dynamic model is established by analyzing the forces acting on the train during tracking operation. The train tracking operation model is established according to the dynamic mechanical model of the train tracking process, and a dynamic optimization analysis is carried out with comfort, energy saving and punctuality as optimization objectives. To achieve multi-objective dynamic optimization, a novel train tracking operation calculation method is proposed, utilizing the improved grey wolf optimization algorithm (MOGWO). The proposed method is simulated and verified based on the train characteristics and line data of CR400AF electric multiple units. Findings: The simulation results show that the optimized MOGWO algorithm can be computed quickly during train tracking, that the optimum results can be given within 5 s, and that the algorithm converges effectively in different optimization target directions. The speed profile optimized by the MOGWO algorithm is smoother and more stable and meets the target requirements of energy saving, punctuality and comfort while maximally respecting the speed limit profile. Originality/value: The MOGWO train tracking interval optimization method enhances the tracking process while ensuring a safe tracking interval. This approach enables the trailing train to operate more comfortably, energy-efficiently and punctually, aligning with passenger needs and industry trends. The method offers valuable insights for optimizing the high-speed train tracking process.
Visual object tracking (VOT), which aims to track a target object in a continuous video, is a fundamental and critical task in computer vision. However, the reliance on third-party resources (e.g., datasets) for training poses concealed threats to the security of VOT models. In this paper, we reveal that VOT models are vulnerable to a poison-only and targeted backdoor attack, where the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data. Specifically, we first define and formulate three different variants of the targeted attack: size-manipulation, trajectory-manipulation, and hybrid attacks. To implement these, we introduce Random Video Poisoning (RVP), a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences. Extensive experiments demonstrate that RVP effectively injects controllable backdoors, enabling precise manipulation of tracking behavior upon trigger activation while maintaining high performance on benign data, thus ensuring stealth. Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses, such as dataset watermarking for copyright protection.
In this paper, the Kalman filter is used to predict the image feature position, around which an image-processing window is then established to reduce the feature-searching area and to increase the image-processing speed. According to the fundamentals of image-based visual servoing (IBVS), the cerebellar model articulation controller (CMAC) neural network is inserted into the visual servo control loop to implement the nonlinear mapping from the error signal in the image space to the control signal in the input space, instead of the iterative adjustment and complicated inverse solution of the image Jacobian. Simulation results show that the feature point can be predicted efficiently using the Kalman filter, that on-line supervised learning can be realized using the CMAC neural network, and that the end-effector can track the target object very well.
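The idea of predicting a feature position and searching only inside a window around it can be sketched as follows. This is a minimal illustration that assumes a constant-velocity model with one independent filter per image axis; the class name, the noise values `q` and `r`, and the `n_sigma` window rule are assumptions and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (state: pos, vel)."""
    pos: float
    vel: float = 0.0
    p00: float = 100.0  # covariance entries, initially uncertain
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0
    q: float = 1.0      # process noise added to both diagonal terms
    r: float = 4.0      # measurement noise variance (pixels^2)

    def predict(self, dt: float = 1.0) -> float:
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        self.pos += self.vel * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.p00 = a + dt * (b + c) + dt * dt * d + self.q
        self.p01 = b + dt * d
        self.p10 = c + dt * d
        self.p11 = d + self.q
        return self.pos

    def update(self, z: float) -> None:
        s = self.p00 + self.r              # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s
        y = z - self.pos                   # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        p00, p01 = self.p00, self.p01
        # P <- (I - K H) P with H = [1, 0]
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01


def predict_window(kx: AxisKalman, ky: AxisKalman, n_sigma: float = 3.0):
    """Advance both filters one frame and return the search window
    (x_min, y_min, x_max, y_max) centred on the predicted position."""
    cx, cy = kx.predict(), ky.predict()
    hx = n_sigma * math.sqrt(kx.p00 + kx.r)
    hy = n_sigma * math.sqrt(ky.p00 + ky.r)
    return (cx - hx, cy - hy, cx + hx, cy + hy)


if __name__ == "__main__":
    kx, ky = AxisKalman(pos=0.0), AxisKalman(pos=50.0)
    for t in range(1, 21):          # feature moves 2 px/frame along x
        predict_window(kx, ky)
        kx.update(2.0 * t)
        ky.update(50.0)
    print(predict_window(kx, ky))   # window for frame 21
```

Only the pixels inside the returned window need to be searched for the feature in the next frame, which is what shrinks the processing area.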
To cope with multi-object tracking under real-world complex situations, a new video-based method is proposed. In the detection step, the moving objects are segmented with the third-level DWT (discrete wavelet transform) and background difference. In the tracking step, the Kalman filter and a scale parameter are used first to estimate the object position and bounding box. Then, the center-association-based projection ratio and the region-association-based occlusion ratio are defined and combined to judge object behaviours. Finally, the tracking scheme and Kalman parameters are adaptively adjusted according to object behaviour. Under occlusion, partial observability is utilized to obtain the object measurements and optimum box dimensions. This method is robust in tracking moving objects under situations such as occlusion, newly appearing objects and stabilization. Experimental results show that the proposed method is efficient.
Three-dimensional (3D) visual tracking of a multicopter (where the camera is fixed while the multicopter is moving) means continuously recovering the six-degree-of-freedom pose of the multicopter relative to the camera. It can be used in many applications, such as precision terminal guidance and control algorithm validation for multicopters. However, it is difficult for many researchers to build a 3D visual tracking system for multicopters (VTSM) using cheap, off-the-shelf cameras. This paper first gives an overview of the three key technologies of a 3D VTSM: multi-camera placement, multi-camera calibration and pose estimation for multicopters. Then, some representative 3D visual tracking systems for multicopters are introduced. Finally, the future development of 3D VTSMs is analyzed and summarized.
There are two main trends in the development of unmanned aerial vehicle (UAV) technologies: miniaturization and intellectualization. Within these trends, realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking capabilities, and a micro positioning deck is mounted to provide accurate pose estimation. In order to be robust against object appearance variations, a novel object tracking algorithm, denoted by RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
A real-time arc welding robot visual control system based on a local network with a multi-level hierarchy is developed in this paper. It consists of an intelligence and human-machine interface level, a motion planning level, a motion control level and a servo control level. The last three levels form a local real-time open robot controller, which realizes motion planning and motion control of a robot. A camera calibration method based on the relative movement of the end-effector connected to a robot is proposed, and a method for tracking the weld seam based on structured light stereovision is provided. Combining the parameters of the cameras and laser plane, three groups of position values in Cartesian space are obtained for each feature point in a stripe projected on the weld seam. The accurate three-dimensional position of the edge points in the weld seam can be calculated from the obtained parameters with an information fusion algorithm. By calculating the weld seam parameters from position and image data, the movement parameters of the robot used for tracking can be determined. A swing welding experiment on a V-groove weld is successfully conducted, the results of which show that the system achieves high-resolution seam tracking in real time and works stably and efficiently.
The generic Meanshift is susceptible to interference of background pixels with the target pixels in the kernel of the reference model, which compromises the tracking performance. In this paper, we enhance the target color feature by attenuating the background color within the kernel through enlarging the pixel weightings which map to the pixels on the target. In this way, the background pixel interference is largely suppressed in the color histogram in the course of constructing the target reference model. In addition, the proposed method also reduces the number of Meanshift iterations, which speeds up the algorithmic convergence. Two tests validate the proposed approach, showing improved tracking robustness on real-world video sequences.
Recently, deep learning has achieved great success in visual tracking tasks, particularly in single-object tracking. This paper provides a comprehensive review of state-of-the-art single-object tracking algorithms based on deep learning. First, we introduce basic knowledge of deep visual tracking, including fundamental concepts, existing algorithms, and previous reviews. Second, we briefly review existing deep learning methods by categorizing them into data-invariant and data-adaptive methods based on whether they can dynamically change their model parameters or architectures. Then, we summarize the general components of deep trackers. In this way, we systematically analyze the novelties of several recently proposed deep trackers. Thereafter, popular datasets such as Object Tracking Benchmark (OTB) and Visual Object Tracking (VOT) are discussed, along with the performances of several deep trackers. Finally, based on observations and experimental results, we discuss three different characteristics of deep trackers, i.e., the relationships between their general components, exploration of more effective tracking frameworks, and interpretability of their motion estimation components.
An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, models of objects are built to describe motion, shape and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusions, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are utilized for updating object models. Extensive experiments show that this method improves the accuracy and validity of tracking objects even under occlusions and can be used in real-time visual surveillance systems.
Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks, and unmanned aerial vehicles (UAVs) are one of their typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify them. In order to solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, in order to reduce the complexity of the detection model, we apply lightweight optimization to it. Experimental results show that the proposed re-ID network can effectively reduce the number of identity switches and surpass current state-of-the-art algorithms. Meanwhile, the optimized detector increases the speed by 27% owing to its lightweight design, which enables it to further meet the requirements of UAV tracking tasks.
Pig farmers want an effective solution for automatically detecting and tracking multiple pigs and raising alerts about their condition in order to recognize disease risk factors quickly. In this paper, therefore, we propose a novel monitoring system using an Artificial Intelligence of Things (AIoT) technique that combines artificial intelligence and the Internet of Things (IoT). The proposed system consists of AIoT edge devices and a central monitoring server. First, an AIoT edge device extracts video frame images from a CCTV camera installed in a pig pen by a frame extraction method, detects multiple pigs in the images by a faster region-based convolutional neural network (RCNN) model, and tracks them by an object center-point tracking algorithm (OCTA) based on the bounding box regression outputs of the faster RCNN. Finally, it sends multi-pig tracking images to the central monitoring server, which forwards them as alerts to pig farmers through a social networking service (SNS) agent in cooperation with a oneM2M-compliant IoT alerting method. Experimental results showed that the multi-pig tracking method achieved a multi-object tracking accuracy of about 77%. In addition, we verified the alerting operation by confirming the images received in the SNS smartphone application.
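The abstract does not give the details of OCTA, so the following is only a generic sketch of center-point tracking on detector bounding boxes: each detection is matched to the nearest existing track center, greedily by distance. The class name, the `max_distance` threshold, and the policy of dropping unmatched tracks immediately are assumptions made for illustration.

```python
import math
from itertools import count


def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


class CentroidTracker:
    """Greedy nearest-center association between consecutive frames."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance
        self.tracks = {}        # track id -> last known center
        self._ids = count(1)    # source of fresh track ids

    def update(self, boxes):
        """Return one track id per input box, in input order."""
        centers = [box_center(b) for b in boxes]
        # Every (track, detection) pair, closest pairs first.
        pairs = sorted(
            (math.dist(c, t), tid, i)
            for tid, t in self.tracks.items()
            for i, c in enumerate(centers)
        )
        assigned, used = {}, set()
        for d, tid, i in pairs:
            if d > self.max_distance:
                break           # remaining pairs are even farther apart
            if tid in used or i in assigned:
                continue
            assigned[i] = tid
            used.add(tid)
        ids, new_tracks = [], {}
        for i, c in enumerate(centers):
            tid = assigned.get(i)
            if tid is None:
                tid = next(self._ids)   # unmatched detection starts a track
            new_tracks[tid] = c
            ids.append(tid)
        self.tracks = new_tracks        # unmatched tracks are dropped
        return ids


if __name__ == "__main__":
    tracker = CentroidTracker()
    print(tracker.update([(0, 0, 10, 10), (100, 100, 110, 110)]))
    print(tracker.update([(102, 101, 112, 111), (1, 1, 11, 11)]))
```

A production tracker would usually keep unmatched tracks alive for a few frames to survive brief occlusions; that is omitted here to keep the sketch short.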
Inspired by human behaviors, a robot object tracking model is proposed on the basis of a visual attention mechanism that is consistent with the theory of topological perception. The model integrates image-driven, bottom-up attention and object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. By the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by a top-down strategy, which is achieved by a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. In order to evaluate the model, a mobile robot platform is developed, on which experiments are carried out. The experimental results indicate that processing an image with a resolution of 752×480 pixels takes less than 200 ms and that the object regions are extracted intact. A comparison of the proposed model with the existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
Most sensors or cameras discussed in the sensor network community are usually 3D homogeneous, even though their 2D coverage areas in the ground plane are heterogeneous. Meanwhile, observed objects of camera networks are usually simplified as 2D points in previous literature. However, in actual application scenes, not only are cameras heterogeneous, with different heights and action radii, but the observed objects also have 3D features (i.e., height). This paper presents a sensor planning formulation addressing the efficiency enhancement of visual tracking in 3D heterogeneous camera networks that track and detect people traversing a region. The problem of sensor planning consists of three issues: (i) how to model the 3D heterogeneous cameras; (ii) how to rank the visibility, which ensures that the object of interest is visible in a camera's field of view; (iii) how to reconfigure the 3D viewing orientations of the cameras. This paper studies the geometric properties of 3D heterogeneous camera networks and presents an evaluation formulation to rank the visibility of observed objects. Then a sensor planning method is proposed to improve the efficiency of visual tracking. Finally, the numerical results show that the proposed method can improve the tracking performance of the system compared to conventional strategies.
According to the main tools of TRIZ, the theory of inventive problem solving, a new flowchart of the product conceptual design process for solving contradictions in TRIZ is proposed. In order to realize autonomous movement and automatic weld seam tracking for a welding robot in Tailored Welded Blanks, a creative design of a robotic visual tracking system based on CMOS has been developed using the flowchart. The new system is used not only to inspect the workpiece ahead of the welding torch and measure the joint orientation and lateral deviation caused by curvature or discontinuity in the joint part, but also to record and measure the image size of the weld pool. Moreover, the hardware and software components are discussed in brief.
To improve the reliability and accuracy of a visual tracker, a robust visual tracking algorithm based on multi-cue fusion under a Bayesian framework is proposed. The weighted color and texture cues of the object are applied to describe the moving object. An adjustable observation model is incorporated into particle filtering, which exploits the particle filter's ability to cope with non-linear, non-Gaussian conditions and to predict the position of the moving object in a cluttered environment. Two complementary attributes are employed to estimate the matching similarity dynamically in terms of the likelihood ratio factors. Furthermore, the algorithm tunes the weight values on-line and adaptively according to the confidence map of the color and texture features in order to reconfigure the optimal observation likelihood model, which ensures that the maximum likelihood ratio is attained in the tracking scenario even when the object is occluded or when illumination, pose and scale are time-variant. The experimental results show that the algorithm can track a moving object accurately, and the reliability of tracking in challenging cases is validated experimentally.
Funding (FishTracker): This work was funded by the Fundamental Research Funds for the Central Universities (Grant No. 106-YDZX2025022), the Startup Foundation of New Professor at Nanjing Agricultural University (Grant No. 106-804005), and the "Qing Lan Project" of Jiangsu Higher Education Institutions.
Funding (NeOR): This work was supported by the National Natural Science Foundation of China (No. 62202137), the China Postdoctoral Science Foundation (No. 2023M730599), and the Zhejiang Provincial Natural Science Foundation of China (No. LMS25F020009).
Funding (SiamCMT): This work was supported by the National Natural Science Foundation of China (Grant No. 62033007) and the Major Fundamental Research Program of Shandong Province (Grant No. ZR2023ZD37).
Funding: Supported by the Natural Science Foundation of Sichuan Province of China under Grant No. 2025ZNSFSC0522 and partially supported by the National Natural Science Foundation of China under Grants No. 61775030 and No. 61571096.
Abstract: Target tracking is an essential task in contemporary computer vision applications. However, its effectiveness is susceptible to model drift caused by changes in target appearance, which often compromises tracking robustness and precision. In this paper, a universally applicable method based on correlation filters is introduced to mitigate model drift in complex scenarios. It employs temporal-confidence samples as a prior to guide the model update process and ensure its precision and consistency over a long period. An improved update mechanism based on the peak side-lobe to peak correlation energy (PSPCE) criterion is proposed, which selects high-confidence samples along the temporal dimension to update the temporal-confidence samples. Extensive experiments on various benchmarks demonstrate that the proposed method achieves competitive performance compared with state-of-the-art methods. In particular, when the target appearance changes significantly, our method is more robust and achieves a balance between precision and speed. Specifically, on the object tracking benchmark (OTB-100) dataset, compared to the baseline, the tracking precision of our model improves by 8.8%, 8.8%, 5.1%, 5.6%, and 6.9% for background clutter, deformation, occlusion, rotation, and illumination variation, respectively. The results indicate that the proposed method can significantly enhance the robustness and precision of target tracking in dynamic and challenging environments, offering a reliable solution for applications such as real-time monitoring, autonomous driving, and precision guidance.
Funding: Funded by the China Academy of Railway Sciences Corporation Limited Scientific Research Project (No. 2023YJ080).
Abstract: Purpose: With the rapid advancement of China’s high-speed rail network, the density of train operations is on the rise. To address the challenge of shortening train tracking intervals while enhancing transportation efficiency, the multi-objective dynamic optimization of the train operation process has emerged as a critical issue. Design/methodology/approach: A train dynamic model is established by analyzing the forces acting on the train during tracking operation. The train tracking operation model is established according to the dynamic mechanical model of the train tracking process, and the dynamic optimization analysis is carried out with comfort, energy saving and punctuality as optimization objectives. To achieve multi-objective dynamic optimization, a novel train tracking operation calculation method is proposed, utilizing the improved grey wolf optimization algorithm (MOGWO). The proposed method is simulated and verified based on the train characteristics and line data of CR400AF electric multiple units. Findings: The simulation results show that the optimized MOGWO algorithm can be computed quickly during train tracking, that the optimum results can be given within 5 s, and that the algorithm converges effectively in different optimization target directions. The speed profile optimized by the MOGWO algorithm is smoother and more stable and meets the target requirements of energy saving, punctuality and comfort while respecting the speed limit profile as far as possible. Originality/value: The MOGWO train tracking interval optimization method enhances the tracking process while ensuring a safe tracking interval. This approach enables the trailing train to operate more comfortably, energy-efficiently and punctually, aligning with passenger needs and industry trends. The method offers valuable insights for optimizing the high-speed train tracking process.
Funding: Supported in part by the "Pioneer" and "Leading Goose" R&D Program of Zhejiang under Grant No. 2024C01169 and the National Natural Science Foundation of China under Grant Nos. 62441238 and U2441240.
Abstract: Visual object tracking (VOT), aiming to track a target object in a continuous video, is a fundamental and critical task in computer vision. However, the reliance on third-party resources (e.g., datasets) for training poses concealed threats to the security of VOT models. In this paper, we reveal that VOT models are vulnerable to a poison-only and targeted backdoor attack, where the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data. Specifically, we first define and formulate three different variants of the targeted attacks: size-manipulation, trajectory-manipulation, and hybrid attacks. To implement these, we introduce Random Video Poisoning (RVP), a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences. Extensive experiments demonstrate that RVP effectively injects controllable backdoors, enabling precise manipulation of tracking behavior upon trigger activation while maintaining high performance on benign data, thus ensuring stealth. Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses, such as dataset watermarking for copyright protection.
Funding: The National Natural Science Foundation of China (59990470).
Abstract: In this paper, the Kalman filter is used to predict the image feature position, around which an image-processing window is then established to reduce the feature-searching area and increase the image-processing speed. According to the fundamentals of image-based visual servoing (IBVS), the cerebellar model articulation controller (CMAC) neural network is inserted into the visual servo control loop to implement the nonlinear mapping from the error signal in the image space to the control signal in the input space, instead of the iterative adjustment and complicated inverse solution of the image Jacobian. Simulation results show that the feature point can be predicted efficiently using the Kalman filter, that on-line supervised learning can be realized using the CMAC neural network, and that the end-effector can track the target object very well.
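The idea of predicting a feature position and searching only inside a window around it can be illustrated with a minimal sketch. It assumes a constant-velocity motion model applied independently to each image coordinate; the class `ConstantVelocityKalman1D`, the function `search_window`, and the noise variances are hypothetical choices for illustration and are not taken from the paper.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one image coordinate, state = [position, velocity].

    Assumed model (not from the paper): F = [[1, dt], [0, 1]], H = [1, 0],
    Q = process_var * I, R = meas_var.
    """

    def __init__(self, pos, process_var=1.0, meas_var=4.0):
        self.x = [float(pos), 0.0]               # state estimate
        self.P = [[100.0, 0.0], [0.0, 100.0]]    # state covariance
        self.q = process_var
        self.r = meas_var

    def predict(self, dt=1.0):
        """Propagate the state one step and return the predicted position."""
        p, v = self.x
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, z):
        """Correct the predicted state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                # innovation variance
        k0, k1 = a / s, c / s         # Kalman gain
        y = z - self.x[0]             # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [
            [(1.0 - k0) * a, (1.0 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]


def search_window(kf_x, kf_y, half_size, width, height):
    """Predict the next feature position and return a clipped window
    (left, top, right, bottom) centred on that prediction."""
    cx, cy = kf_x.predict(), kf_y.predict()
    left = max(0, int(round(cx - half_size)))
    top = max(0, int(round(cy - half_size)))
    right = min(width - 1, int(round(cx + half_size)))
    bottom = min(height - 1, int(round(cy + half_size)))
    return left, top, right, bottom
```

In each frame, `search_window` would be called first, the feature detector would run only inside the returned rectangle, and the detected position would then be passed to `update` on both filters. The window half-size trades robustness to prediction error against processing cost.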
Funding: The National Natural Science Foundation of China (No. 60574006, 60804017).
Abstract: To cope with multi-object tracking in complex real-world situations, a new video-based method is proposed. In the detection step, the moving objects are segmented using the third-level discrete wavelet transform (DWT) and background difference. In the tracking step, the Kalman filter and a scale parameter are first used to estimate the object position and bounding box. Then, the center-association-based projection ratio and the region-association-based occlusion ratio are defined and combined to judge object behaviours. Finally, the tracking scheme and Kalman parameters are adaptively adjusted according to object behaviour. Under occlusion, partial observability is utilized to obtain the object measurements and optimum box dimensions. This method is robust in tracking moving objects in situations such as occlusion, new appearance and stabilization. Experimental results show that the proposed method is efficient.
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFB1300102) and the National Natural Science Foundation of China (No. 61803025).
Abstract: Three-dimensional (3D) visual tracking of a multicopter (where the camera is fixed while the multicopter is moving) means continuously recovering the six-degree-of-freedom pose of the multicopter relative to the camera. It can be used in many applications, such as precision terminal guidance and control algorithm validation for multicopters. However, it is difficult for many researchers to build a 3D visual tracking system for multicopters (VTSM) using cheap, off-the-shelf cameras. This paper first gives an overview of the three key technologies of a 3D VTSM: multi-camera placement, multi-camera calibration and pose estimation for multicopters. Then, some representative 3D visual tracking systems for multicopters are introduced. Finally, the future development of 3D VTSMs is analyzed and summarized.
Funding: Supported in part by the Institute for Guo Qiang of Tsinghua University (2019GQG1023), in part by the Graduate Education and Teaching Reform Project of Tsinghua University (202007J007), in part by the National Natural Science Foundation of China (U19B2029, 62073028, 61803222), and in part by the Independent Research Program of Tsinghua University (2018Z05JDX002).
Abstract: There are two main trends in the development of unmanned aerial vehicle (UAV) technologies: miniaturization and intellectualization, in which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking capabilities, and a micro positioning deck is mounted to provide accurate pose estimation. In order to be robust against object appearance variations, a novel object tracking algorithm, denoted by RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
Funding: This work was supported by the National High Technology Research and Development Program of China under Grant 2002AA422160 and by the National Key Fundamental Research and Development Project of China (973) under Grant 2002CB312200.
Abstract: A real-time arc welding robot visual control system based on a local network with a multi-level hierarchy is developed in this paper. It consists of an intelligence and human-machine interface level, a motion planning level, a motion control level and a servo control level. The last three levels form a local real-time open robot controller, which realizes motion planning and motion control of a robot. A camera calibration method based on the relative movement of the end-effector connected to a robot is proposed, and a method for tracking the weld seam based on structured light stereovision is provided. Combining the parameters of the cameras and the laser plane, three groups of position values in Cartesian space are obtained for each feature point in a stripe projected on the weld seam. The accurate three-dimensional position of the edge points in the weld seam can be calculated from the obtained parameters with an information fusion algorithm. By calculating the weld seam parameters from position and image data, the movement parameters of the robot used for tracking can be determined. A swing welding experiment on a V-groove weld is successfully conducted, and the results show that the system achieves high-resolution seam tracking in real time and works stably and efficiently.
Funding: Supported by the Program for Technology Innovation Team of Ningbo Government (No. 2011B81002) and the Ningbo University Science Research Foundation (No. xkl11075).
Abstract: The generic Meanshift is susceptible to interference of background pixels with the target pixels in the kernel of the reference model, which compromises the tracking performance. In this paper, we enhance the target color feature by attenuating the background color within the kernel through enlarging the pixel weightings which map to the pixels on the target. In this way, the background pixel interference is largely suppressed in the color histogram in the course of constructing the target reference model. In addition, the proposed method also reduces the number of Meanshift iterations, which speeds up the algorithmic convergence. Two tests validate the proposed approach, showing improved tracking robustness on real-world video sequences.
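The abstract does not give the weighting formula, so the sketch below substitutes a related, well-known technique from the kernel-based tracking literature, the background-weighted histogram: bins that are prominent in the surrounding background are down-weighted in the target model. The function names and the quantized-bin input format are hypothetical, and this should not be read as the authors' method.

```python
def epanechnikov(r2):
    """Epanechnikov kernel profile on the squared normalized radius r2."""
    return 1.0 - r2 if r2 < 1.0 else 0.0


def kernel_histogram(bins, radii2, n_bins):
    """Kernel-weighted, normalized color histogram.

    bins   -- quantized color bin index of each pixel in the target region
    radii2 -- squared normalized distance of each pixel from the region centre
    """
    hist = [0.0] * n_bins
    for b, r2 in zip(bins, radii2):
        hist[b] += epanechnikov(r2)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def background_weights(bg_hist):
    """Per-bin weights v_u = min(o_min / o_u, 1), where o_min is the smallest
    non-zero background bin; bins absent from the background keep weight 1."""
    nonzero = [h for h in bg_hist if h > 0]
    if not nonzero:
        return [1.0] * len(bg_hist)
    smallest = min(nonzero)
    return [min(smallest / h, 1.0) if h > 0 else 1.0 for h in bg_hist]


def background_weighted_model(target_hist, bg_hist):
    """Attenuate background-dominated bins in the target model, then renormalize."""
    weights = background_weights(bg_hist)
    weighted = [q * v for q, v in zip(target_hist, weights)]
    total = sum(weighted)
    return [w / total for w in weighted] if total > 0 else weighted
```

For a two-bin example with target histogram `[0.5, 0.5]` and background histogram `[0.9, 0.1]`, the first bin is mostly background, so its share of the reference model drops from 0.5 to 0.1 after reweighting.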
Funding: Supported by the National Natural Science Foundation of China (Nos. 61922064 and U2033210), the Zhejiang Provincial Natural Science Foundation (Nos. LR17F030001 and LQ19F020005), and the Project of Science and Technology Plans of Wenzhou City (Nos. C20170008 and ZG2017016).
Abstract: Recently, deep learning has achieved great success in visual tracking tasks, particularly in single-object tracking. This paper provides a comprehensive review of state-of-the-art single-object tracking algorithms based on deep learning. First, we introduce basic knowledge of deep visual tracking, including fundamental concepts, existing algorithms, and previous reviews. Second, we briefly review existing deep learning methods by categorizing them into data-invariant and data-adaptive methods based on whether they can dynamically change their model parameters or architectures. Then, we summarize the general components of deep trackers. In this way, we systematically analyze the novelties of several recently proposed deep trackers. Thereafter, popular datasets such as Object Tracking Benchmark (OTB) and Visual Object Tracking (VOT) are discussed, along with the performances of several deep trackers. Finally, based on observations and experimental results, we discuss three different characteristics of deep trackers, i.e., the relationships between their general components, exploration of more effective tracking frameworks, and interpretability of their motion estimation components.
Funding: Supported by the National Natural Science Foundation of China (60835004, 60775047, 60872130) and the National High Technology Research and Development Program of China (863 Program) (2007AA04Z244, 2008AA04Z214).
Abstract: An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, models of the objects are built to describe motion, shape and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusions, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are utilized for updating object models. Extensive experiments show that this method improves the accuracy and validity of tracking objects even under occlusions and is used in real-time visual surveillance systems.
Funding: Supported by the Research Foundation of Nanjing University of Posts and Telecommunications (No. NY219076).
Abstract: Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks, and unmanned aerial vehicles (UAVs) are one of their typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify the targets. To solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, to reduce the complexity of the detection model, we apply lightweight optimization to it. Experimental results show that the proposed re-ID network can effectively reduce the number of identity switches and surpass current state-of-the-art algorithms. Meanwhile, the optimized detector increases the speed by 27% owing to its lightweight design, which enables it to better meet the requirements of UAV tracking tasks.
Funding: Supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00387, Development of ICT based Intelligent Smart Welfare Housing System for the Prevention and Control of Livestock Disease).
Abstract: Pig farmers want an effective solution for automatically detecting and tracking multiple pigs and reporting their conditions in order to recognize disease risk factors quickly. In this paper, therefore, we propose a novel monitoring system using an Artificial Intelligence of Things (AIoT) technique combining artificial intelligence and the Internet of Things (IoT). The proposed system consists of AIoT edge devices and a central monitoring server. First, an AIoT edge device extracts video frame images from a CCTV camera installed in a pig pen by a frame extraction method, detects multiple pigs in the images by a faster region-based convolutional neural network (RCNN) model, and tracks them by an object center-point tracking algorithm (OCTA) based on the bounding box regression outputs of the faster RCNN. Finally, it sends multi-pig tracking images to the central monitoring server, which forwards them as alerts to pig farmers through a social networking service (SNS) agent in cooperation with a oneM2M-compliant IoT alerting method. Experimental results showed that the multi-pig tracking method achieved a multi-object tracking accuracy of about 77%. In addition, we verified the alerting operation by confirming the images received in the SNS smartphone application.
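For readers unfamiliar with the metric, multi-object tracking accuracy (MOTA) is commonly defined in the CLEAR MOT framework as one minus the ratio of accumulated errors (misses, false positives, identity switches) to the total number of ground-truth objects. The sketch below assumes that definition and a hypothetical per-frame tuple format; the abstract does not state how its 77% figure was computed.

```python
def mota(frames):
    """Multi-object tracking accuracy under the common CLEAR MOT definition.

    frames -- sequence of per-frame tuples
              (num_ground_truth, misses, false_positives, id_switches)
    Returns 1 - (sum of all errors) / (sum of ground-truth objects).
    The value can be negative when errors outnumber ground-truth objects.
    """
    frames = list(frames)
    total_gt = sum(f[0] for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f[1] + f[2] + f[3] for f in frames)
    return 1.0 - errors / total_gt
```

For example, two frames with 10 ground-truth objects each and 5 errors in total give `mota([(10, 1, 1, 0), (10, 0, 2, 1)]) == 0.75`.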
Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2006CB300407) and the National Natural Science Foundation of China (No. 50775017).
Abstract: Inspired by human behaviors, a robot object tracking model is proposed on the basis of a visual attention mechanism that is consistent with the theory of topological perception. The model integrates image-driven, bottom-up attention and object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. By the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by a top-down strategy, which is realized by a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. In order to evaluate the model, a mobile robot platform is developed, on which experiments are carried out. The experimental results indicate that processing an image with a resolution of 752×480 pixels takes less than 200 ms and that the object regions are intact. A comparison of the proposed model with the existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
Funding: Supported by the National Natural Science Foundation of China (61100207), the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (2014BAK14B03), and the Fundamental Research Funds for the Central Universities (2013PT132013XZ12).
Abstract: Most sensors or cameras discussed in the sensor network community are usually 3D homogeneous, even though their 2D coverage areas in the ground plane are heterogeneous. Meanwhile, observed objects of camera networks are usually simplified as 2D points in previous literature. However, in actual application scenes, not only are cameras always heterogeneous, with different heights and action radii, but the observed objects also have 3D features (i.e., height). This paper presents a sensor planning formulation addressing the efficiency enhancement of visual tracking in 3D heterogeneous camera networks that track and detect people traversing a region. The problem of sensor planning consists of three issues: (i) how to model the 3D heterogeneous cameras; (ii) how to rank the visibility, which ensures that the object of interest is visible in a camera's field of view; (iii) how to reconfigure the 3D viewing orientations of the cameras. This paper studies the geometric properties of 3D heterogeneous camera networks and presents an evaluation formulation to rank the visibility of observed objects. Then a sensor planning method is proposed to improve the efficiency of visual tracking. Finally, the numerical results show that the proposed method can improve the tracking performance of the system compared to conventional strategies.
Abstract: According to the main tools of TRIZ, the theory of inventive problem solving, a new flowchart of the product conceptual design process for solving contradictions in TRIZ is proposed. In order to realize autonomous moving and automatic weld seam tracking for a welding robot in tailor-welded blanks, a creative design of a robotic visual tracking system based on CMOS has been developed by using the flowchart. The new system is used not only to inspect the workpiece ahead of the welding torch and measure the joint orientation and lateral deviation caused by curvature or discontinuity in the joint part, but also to record and measure the image size of the weld pool. Moreover, the hardware and software components are discussed in brief.
Abstract: To improve the reliability and accuracy of visual trackers, a robust visual tracking algorithm based on multi-cue fusion under a Bayesian framework is proposed. Weighted color and texture cues are used to describe the moving object. An adjustable observation model is incorporated into particle filtering, which exploits the particle filter's ability to cope with non-linear, non-Gaussian conditions and to predict the position of the moving object in a cluttered environment. The two complementary cues are used to estimate the matching similarity dynamically in terms of likelihood ratio factors. Furthermore, the weight values are tuned adaptively on-line according to the confidence maps of the color and texture features to reconfigure the optimal observation likelihood model, which ensures that the maximum likelihood ratio is attained in the tracking scenario even when the object is occluded or when illumination, pose and scale vary over time. The experimental results show that the algorithm can track a moving object accurately and that tracking remains reliable in challenging cases.