In dynamic scenarios, visual simultaneous localization and mapping (SLAM) algorithms often incorrectly incorporate dynamic points during camera pose computation, leading to reduced accuracy and robustness. This paper presents a dynamic SLAM algorithm that leverages object detection and regional dynamic probability. First, a parallel thread employs the YOLOX object detection model to gather 2D semantic information and compensate for missed detections. Next, an improved K-means++ clustering algorithm clusters bounding-box regions, adaptively determining the threshold for extracting dynamic object contours as the dynamic points change. This process divides the image into low-dynamic, suspicious-dynamic, and high-dynamic regions. In the tracking thread, the dynamic point removal module assigns dynamic probability weights to the feature points in these regions and, combined with geometric methods, detects and removes the dynamic points. Evaluation on the public TUM RGB-D dataset shows that the proposed dynamic SLAM algorithm surpasses most existing SLAM algorithms, providing better pose estimation accuracy and robustness in dynamic environments.
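The geometric side of such a dynamic-point check is commonly an epipolar-constraint residual. The sketch below fuses that residual with a per-region prior in the spirit of the regional dynamic probability idea; the fundamental matrix, region weights, and thresholds are all illustrative assumptions, not the paper's values.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance of p2 = (x2, y2) from the epipolar line l = F @ [x1, y1, 1]."""
    x1, y1 = p1
    x2, y2 = p2
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    return abs(a * x2 + b * y2 + c) / math.hypot(a, b)

# Hypothetical prior weights for the three region classes.
REGION_WEIGHT = {"low": 0.2, "suspicious": 0.5, "high": 0.9}

def is_dynamic(F, p1, p2, region, geo_thresh=1.0, prob_thresh=0.5):
    """Fuse the geometric residual with the regional prior into one score."""
    geo_score = min(epipolar_distance(F, p1, p2) / geo_thresh, 1.0)
    score = 0.5 * geo_score + 0.5 * REGION_WEIGHT[region]
    return score > prob_thresh
```

Under this toy fusion, a point with zero epipolar residual scores below the threshold even in a high-dynamic region and is kept, while a large residual in the same region tips it into removal.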
Background: As visual simultaneous localization and mapping (SLAM) is primarily based on the assumption of a static scene, the presence of dynamic objects in the frame causes problems such as deteriorated system robustness and inaccurate position estimation. In this study, we propose YGC-SLAM for indoor dynamic environments, built on the ORB-SLAM2 framework and combining semantic and geometric constraints to improve the positioning accuracy and robustness of the system. Methods: First, the recognition accuracy of YOLOv5 was improved by introducing the convolutional block attention module and an improved EIOU loss function, whereby the prediction box converges quickly for better detection. The improved YOLOv5 was then added to the tracking thread for dynamic target detection to eliminate dynamic points. Subsequently, multi-view geometric constraints were used for re-judging to further eliminate dynamic points while retaining more useful feature points, preventing the semantic approach from over-eliminating feature points and causing map-building failure. The K-means clustering algorithm was used to accelerate this process, quickly determining the motion state of each cluster of pixels. Finally, a keyframe selection strategy with de-redundancy was implemented to construct a clear 3D dense static point-cloud map. Results: Testing on the TUM dataset and in a real environment shows that our algorithm reduces the absolute trajectory error by 98.22% and the relative trajectory error by 97.98% compared with the original ORB-SLAM2, and is more accurate with better real-time performance than similar algorithms such as DynaSLAM and DS-SLAM. Conclusions: The YGC-SLAM proposed in this study can effectively eliminate the adverse effects of dynamic objects, and the system can better complete positioning and map-building tasks in complex environments.
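The clustering acceleration can be sketched as a plain K-means over per-pixel motion residuals, labelling a whole cluster dynamic when its centre exceeds a threshold. The 1-D feature, k = 2, and the threshold below are illustrative simplifications of the paper's pipeline, not its actual parameters.

```python
import random

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Plain 1-D K-means: returns final centers and per-value labels."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, assign

def cluster_motion_states(residuals, thresh=2.0):
    """Label each pixel dynamic/static via its cluster's mean residual."""
    centers, assign = kmeans_1d(residuals)
    dynamic = [c > thresh for c in centers]
    return [dynamic[a] for a in assign]
```

Deciding one motion state per cluster instead of per pixel is what makes the check cheap: one thresholding per cluster replaces thousands of per-pixel tests.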
Dynamic visual SLAM (Simultaneous Localization and Mapping) is an important research area, but existing methods struggle to balance real-time performance and accuracy when removing dynamic feature points, especially when semantic information is missing. This paper presents a novel dynamic SLAM system that uses optical flow tracking and epipolar geometry to identify dynamic feature points and applies a regional dynamic probability method to improve removal accuracy. We developed two innovative algorithms for precise pruning of dynamic regions: first, using optical flow and epipolar geometry to identify and prune dynamic areas while preserving static regions on stationary dynamic objects to optimize tracking performance; second, propagating dynamic probabilities across frames to mitigate the impact of semantic information loss in some frames. Experiments show that our system significantly reduces trajectory and pose errors in dynamic scenes, achieving dynamic feature point removal accuracy close to that of semantic segmentation methods while maintaining high real-time performance. The system performs especially well in highly dynamic environments with complex dynamic objects, demonstrating its advantage in handling dynamic scenarios. The experiments also show that while traditional methods may fail to track when semantic information is lost, our approach effectively reduces the misidentification of dynamic regions caused by such loss, improving system robustness and accuracy.
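The second idea, carrying dynamic probability across frames when semantic cues drop out, can be sketched as a decayed-prior update: with no observation the belief decays toward static rather than being discarded, and with an observation the prior and measurement are blended. The decay and blend factors are illustrative assumptions.

```python
def propagate_probability(prev_prob, observation=None, decay=0.8, blend=0.6):
    """Carry a region's dynamic probability to the next frame.

    When no semantic/geometric observation is available for this frame
    (observation is None), decay the previous belief instead of dropping
    it; otherwise blend the observation with the decayed prior.
    """
    prior = decay * prev_prob
    if observation is None:
        return prior
    return blend * observation + (1.0 - blend) * prior

# A region seen as dynamic keeps a usable belief through two blind frames.
p = 0.9
for obs in [0.85, None, None, 0.8]:
    p = propagate_probability(p, obs)
```

The key property is that after the two observation-free frames the belief is still well above zero, so a re-detection at frame four restores a confident dynamic label instead of starting cold.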
A great number of visual simultaneous localization and mapping (VSLAM) systems need to assume static features in the environment. However, moving objects can vastly impair the performance of a VSLAM system that relies on the static-world assumption. To cope with this challenge, a real-time and robust VSLAM system for dynamic environments, based on ORB-SLAM2, is proposed. To reduce the influence of dynamic content, we incorporate a deep-learning-based object detection method into the visual odometry, and a dynamic object probability model is added to raise the efficiency of the object detection network and enhance the real-time performance of the system. Experiments on the TUM and KITTI benchmark datasets, as well as in a real-world environment, show that our method can significantly reduce tracking error and drift, and enhance the robustness, accuracy, and stability of the VSLAM system in dynamic scenes.
Visual simultaneous localization and mapping (SLAM) is crucial in robotics and autonomous driving. However, traditional visual SLAM faces challenges in dynamic environments. To address this issue, researchers have proposed semantic SLAM, which combines object detection, semantic segmentation, instance segmentation, and visual SLAM. Despite the growing body of literature on semantic SLAM, there is currently a lack of comprehensive research on the integration of object detection and visual SLAM. Therefore, this study gathers information from multiple databases and reviews the relevant literature using specific keywords, focusing on visual SLAM based on object detection. It first discusses the current research status and challenges in this field, highlighting methods for incorporating semantic information from object detection networks into odometry, loop-closure detection, and map construction. It also compares the characteristics and performance of various object-detection-based visual SLAM algorithms. Lastly, it provides an outlook on future research directions and emerging trends in visual SLAM. Research has shown that visual SLAM based on object detection offers significant improvements over traditional SLAM in dynamic point removal, data association, point-cloud segmentation, and other technologies; it can improve the robustness and accuracy of the entire SLAM system and can run in real time. With the continuous optimization of algorithms and improvements in hardware, object-aware visual SLAM has great potential for development.
This article presents a brief survey of visual simultaneous localization and mapping (SLAM) systems applied to multiple independently moving agents, such as a team of ground or aerial vehicles, or a group of users holding augmented- or virtual-reality devices. Such a visual SLAM system, known as collaborative visual SLAM, differs from a typical visual SLAM system deployed on a single agent in that information is exchanged or shared among agents to achieve better robustness, efficiency, and accuracy. We review representative works on this topic from the past ten years and describe the key components involved in designing such a system, including collaborative pose estimation and mapping, as well as the emerging topic of decentralized architecture. We believe this brief survey will be helpful to those working on this topic or developing multi-agent applications, particularly micro-aerial-vehicle swarms or collaborative augmented/virtual reality.
Feature selection is always an important issue in the visual SLAM (simultaneous localization and mapping) literature. Considering that location estimation can be improved by tracking features with longer visible time, a new feature selection method based on motion estimation is proposed. First, a k-step iteration algorithm is presented for visible-time estimation using an affine motion model; then a delayed feature detection method is introduced for efficiently detecting features with the maximum visible time. As a means of validation, both simulation and real-data experiments are carried out. Results show that the proposed method improves both estimation performance and computational performance compared with the existing random feature selection method.
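The visible-time criterion can be illustrated with a simplified constant-motion model standing in for the paper's affine k-step iteration: predict how many future steps each feature stays inside the image and keep the longest-lived ones. Image size, motion model, and ranking rule here are assumptions.

```python
def visible_time(pos, vel, width, height, max_steps=100):
    """Steps a feature stays inside the image under constant per-step
    motion (a stand-in for the k-step iteration with an affine model)."""
    x, y = pos
    vx, vy = vel
    steps = 0
    while steps < max_steps:
        x, y = x + vx, y + vy
        if not (0 <= x < width and 0 <= y < height):
            break
        steps += 1
    return steps

def select_features(features, width, height, n):
    """Keep the n features with the longest predicted visible time.
    Each feature is ((x, y), (vx, vy))."""
    ranked = sorted(features,
                    key=lambda f: visible_time(f[0], f[1], width, height),
                    reverse=True)
    return ranked[:n]
```

A feature near the image border moving outward (short visible time) loses to a central feature with the same velocity, which is precisely why ranking by predicted visible time beats random selection.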
This paper presents a hierarchical simultaneous localization and mapping (SLAM) system for a small unmanned aerial vehicle (UAV) using the output of an inertial measurement unit (IMU) and the bearing-only observations from an onboard monocular camera. A homography-based approach is used to calculate the motion of the vehicle in 6 degrees of freedom from image feature matches. This visual measurement is fused with the inertial outputs by an indirect extended Kalman filter (EKF) for attitude and velocity estimation. Then, another EKF is employed to estimate the position of the vehicle and the locations of the features in the map. Both simulations and experiments are carried out to test the performance of the proposed system. Comparison with a referential global positioning system/inertial navigation system (GPS/INS) navigation solution indicates that the proposed SLAM can provide reliable and stable state estimation for small UAVs in GPS-denied environments.
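The predict-with-IMU / correct-with-camera pattern of such an EKF fusion reduces, in one dimension, to the familiar Kalman cycle. The scalar model and noise values below are placeholders for illustration, not the paper's indirect error-state formulation.

```python
def kf_predict(x, P, u, Q):
    """Propagate the state with an IMU-style input u; inflate covariance."""
    return x + u, P + Q

def kf_update(x, P, z, R):
    """Correct the state with a visual measurement z."""
    K = P / (P + R)                      # Kalman gain
    return x + K * (z - x), (1 - K) * P  # corrected state, shrunk covariance

# One cycle: IMU integration says we moved +1; the camera measures 1.2.
x, P = 0.0, 1.0
x, P = kf_predict(x, P, u=1.0, Q=0.1)
x, P = kf_update(x, P, z=1.2, R=0.5)
```

The update lands between the prediction (1.0) and the measurement (1.2), weighted by their uncertainties, and the covariance shrinks, which is what makes the fused estimate more stable than either sensor alone.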
In dynamic scenes, the pose estimation and map consistency of visual simultaneous localisation and mapping (visual SLAM) are affected by intermittent changes in object motion states. An adaptive motion-state estimation and feature-reuse mechanism is proposed that restores features once objects become stationary. Camera ego-motion is compensated via projection-based point-to-point red-green-blue-depth (RGB-D) Iterative Closest Point; the alignment residual yields a short-term jitter score. An Extended Kalman Filter fuses the centre-pixel trajectory and depth of the object, using depth innovation as strong evidence to suppress false triggers. The adaptive decision thresholds involve resolution, ego-motion intensity, jitter, and reference depth, and are combined with dual/single triggering and hysteresis to achieve robust switching. When an object is considered static, its feature points are reused. On the Bonn RGB-D Dynamic Dataset (BONN) and the TUM RGB-D SLAM Dataset and Benchmark (TUM), the proposed method matches or exceeds the baselines: in the intermittent-motion-dominated BONN sequence Placing_non_box, it reduces the root-mean-square of the absolute trajectory error (ATE-RMSE) by 27% relative to the baseline, remains comparable to Ellipsoid-SLAM on TUM, and consistently outperforms ORB-SLAM3 in dynamic scenes. The hysteresis counter reading on Placing_non_box2 shows that the proposed method can reduce the motion-state misclassification rate by nearly 40%. The ablation experiments confirm that the adaptive thresholds yield the most significant optimisation effect. The approach improves robustness and map completeness in dynamic environments without degrading performance in low-dynamic settings.
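The hysteresis idea can be sketched as a consecutive-vote counter that flips the motion state only after several agreeing observations, suppressing jitter-induced flapping. The counter length is an illustrative parameter, not the paper's tuned value.

```python
class HysteresisSwitch:
    """Flip the motion state only after `need` consecutive votes for the
    opposite state; any agreeing vote resets the counter."""

    def __init__(self, need=3, moving=True):
        self.need = need
        self.moving = moving
        self.count = 0

    def update(self, vote_moving):
        if vote_moving == self.moving:
            self.count = 0              # agreement: reset the counter
        else:
            self.count += 1             # disagreement: accumulate evidence
            if self.count >= self.need:
                self.moving = vote_moving
                self.count = 0
        return self.moving
```

A single spurious "static" vote (e.g. from depth noise) cannot flip the state; only a sustained run of consistent votes does, which is the behaviour the misclassification-rate result relies on.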
To address the insufficient real-time performance of deep-learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems for dynamic scenes, as well as pose-estimation bias caused by undesired camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm uses a lightweight YOLO-fastest network combined with background subtraction to detect moving objects, builds a cross-domain mask segmentation mechanism from the depth map with depth-threshold segmentation, and designs a camera-motion geometric correction strategy to compensate for detection-box coordinate errors, achieving moving-object segmentation while improving processing speed. To optimize feature-point utilization, pyramid optical flow is used to continuously track and update dynamic feature points across frames, while ensuring that only static feature points participate in pose estimation. Systematic evaluation on the TUM dataset shows that, compared with ORB-SLAM3, the algorithm reduces the absolute pose error by 97.1% on average; compared with the dynamic SLAM algorithms DynaSLAM and DS-SLAM, which rely on deep-learning segmentation networks, its per-frame tracking time is greatly reduced, achieving a better balance between accuracy and efficiency.
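The depth-threshold part of such a cross-domain mask can be sketched as keeping, inside a detection box, only pixels close to the object's median depth, so the mask hugs the object instead of the whole box. The median statistic and tolerance are illustrative assumptions.

```python
def depth_mask(depth, box, margin=0.3):
    """Boolean mask: pixels inside box (x0, y0, x1, y1) whose depth (in
    metres, 2-D list) lies within `margin` of the box's median depth."""
    x0, y0, x1, y1 = box
    vals = sorted(depth[y][x]
                  for y in range(y0, y1) for x in range(x0, x1)
                  if depth[y][x] > 0)          # ignore invalid zero depths
    med = vals[len(vals) // 2]
    return [[(y0 <= y < y1 and x0 <= x < x1
              and abs(depth[y][x] - med) <= margin)
             for x in range(len(depth[0]))]
            for y in range(len(depth))]
```

Background pixels that fall inside the detection box but sit far behind the object are excluded by the depth test, which is what lets the mask be tighter than the raw bounding box.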
Although deep learning methods have been widely applied in SLAM visual odometry (VO) over the past decade with impressive improvements, accuracy remains limited in complex dynamic environments. In this paper, a composite-mask-based generative adversarial network (CMGAN) is introduced to predict camera motion and binocular depth maps. Specifically, a perceptual generator is constructed to obtain the corresponding parallax map and optical flow between two neighboring frames. Then, an iterative pose improvement strategy is proposed to improve the accuracy of pose estimation. Finally, a composite mask is embedded in the discriminator to sense structural deformation in the synthesized virtual image, thereby increasing the overall structural constraints of the network model, improving the accuracy of camera pose estimation, and reducing drift in the VO. Detailed quantitative and qualitative evaluations on the KITTI dataset show that the proposed framework outperforms existing conventional, supervised-learning, and unsupervised depth VO methods, providing better results in both pose estimation and depth estimation.
Visual SLAM (Simultaneous Localization and Mapping) is a solution for achieving localization and mapping of robots simultaneously. Significant achievements have been made during the past decades, and geometry-based methods have become increasingly successful in dealing with static environments. However, they still cannot handle challenging environments. With the great achievements of deep learning in computer vision, there is a trend of applying deep learning methods to visual SLAM. In this paper, the latest research progress in applying deep learning to visual SLAM is reviewed. Outstanding research results in deep-learning visual odometry and deep-learning loop-closure detection are summarized. Finally, future development directions of visual SLAM based on deep learning are discussed.
Objective: Coal mine underground environments commonly exhibit feature-degraded scenes with low illumination, weak texture, and strong structural regularity, leaving visual SLAM (visual simultaneous localization and mapping) systems with too few valid features or high mismatch rates and severely limiting their localization accuracy and robustness. Methods: A visual SLAM method based on edge-aware enhancement is proposed. First, a low-light image enhancement module with edge-aware constraints is constructed: the Retinex algorithm is optimized with an adaptive-scale gradient-domain guided filter to obtain images with clear texture and even illumination, significantly improving feature extraction under low and uneven lighting. Second, an edge-aware enhanced feature extraction and matching module is built into the visual odometry, where a point-line feature fusion strategy effectively improves the detectability and matching accuracy of features in weak-texture and structured scenes. Specifically, line features are extracted with the edge drawing lines algorithm (EDLines), point features with the oriented FAST and rotated BRIEF algorithm (ORB), and precise matching is performed with grid-based motion statistics (GMS) and a ratio test. Finally, the method is comprehensively compared with ORB-SLAM2 and ORB-SLAM3 on the TUM dataset and a real underground coal-mine dataset, covering image enhancement, feature matching, and localization. Results and Conclusions: (1) On the TUM dataset, compared with ORB-SLAM2, the proposed method reduces the root-mean-square error of the absolute trajectory error by 4%-38.46% and of the relative trajectory error by 8.62%-50%; compared with ORB-SLAM3, the reductions are 0-61.68% and 3.63%-47.05%, respectively. (2) In real underground experiments, the estimated trajectory is closer to the camera's reference trajectory. (3) The method effectively improves the accuracy and robustness of visual SLAM in feature-degraded underground scenes and provides a technical solution for applying visual SLAM in coal mines. Research on visual SLAM for degraded underground scenes is of great significance for the robotization of underground mobile equipment.
In underground coal-mine visual SLAM (Simultaneous Localization and Mapping) applications, illumination changes and low-texture scenes severely affect feature extraction and matching, causing pose estimation to fail and degrading localization accuracy. SL-SLAM, a stereo visual localization algorithm for underground mobile robots based on an improved ORB (Oriented FAST and Rotated BRIEF)-SLAM3, is proposed. For scenes with illumination changes, the illumination-robust SuperPoint feature extraction network replaces the original ORB feature extractor in the front end, and a feature-point grid restriction method is proposed to effectively remove invalid feature regions and increase the stability of pose estimation. For low-texture scenes, the stable line segment detector (LSD) is introduced in the front end, and a point-line joint algorithm is proposed: line features are grouped by the feature-point grids and matched according to the point-matching results, reducing line-matching complexity and saving pose-estimation time. Reprojection error models for point and line features are constructed, an angle constraint is added to the line-feature residual model, and a unified point-line reprojection cost function is established through the pose-increment Jacobians of the point and line features. The local mapping thread uses ORB-SLAM3's classic local optimization to adjust point features, line features, and keyframe poses, while the back-end thread performs loop correction, submap fusion, and global bundle adjustment (BA). Experiments on the EuRoC dataset show that SL-SLAM outperforms the compared algorithms on the absolute pose error (APE) metric and produces the trajectory prediction closest to ground truth, with a root-mean-square error 17.3% lower than that of ORB-SLAM3. Experiments in a simulated underground coal-mine environment show that SL-SLAM adapts to illumination changes and low-texture scenes and can meet the localization accuracy and stability requirements of underground mobile robots.
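The point half of such a unified point-line reprojection cost reduces to the standard pinhole residual. The sketch below omits the line term and angle constraint, and the intrinsics, pose representation (R, t as nested lists), and numbers are illustrative.

```python
def reproject(K, pose, pw):
    """Project world point pw into pixels with a minimal pinhole model."""
    R, t = pose
    # Camera-frame point: pc = R @ pw + t
    pc = [sum(R[i][j] * pw[j] for j in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * pc[0] / pc[2] + K[0][2]
    v = K[1][1] * pc[1] / pc[2] + K[1][2]
    return u, v

def point_residual(K, pose, pw, observed):
    """Reprojection residual: projected minus observed pixel position."""
    u, v = reproject(K, pose, pw)
    return (u - observed[0], v - observed[1])
```

The optimizer sums squared residuals of this form (plus the line terms) over all matches and adjusts the pose increment to drive them toward zero.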
The Internet of Vehicles (IoV) has become an important direction in intelligent transportation, in which vehicle positioning is a crucial part. SLAM (Simultaneous Localization and Mapping) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for static environments and can perform poorly in terms of accuracy and robustness in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that detects both known and unknown moving objects. A separate semantic segmentation thread segments dynamic target instances, with the Mask R-CNN algorithm run on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only keyframes are segmented to identify known dynamic objects, and a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, improving from 0.2730 m to 0.0135 m. Moreover, its processing time is significantly reduced compared with other dynamic-scene SLAM algorithms, illustrating its efficacy in locating objects in dynamic scenes.
Funding: National Natural Science Foundation of China (No. 62063006); Guangxi Natural Science Foundation (Nos. 2023GXNSFAA026025, AA24010001); Innovation Fund of Chinese Universities Industry-University-Research (ID: 2023RY018); Guangxi Industry and Information Technology Department, Textile and Pharmaceutical Division Special Project (ID: 2021 No. 231); Special Research Project of Hechi University (ID: 2021GCC028); Key Laboratory of AI and Information Processing, Education Department of Guangxi Zhuang Autonomous Region (Hechi University) (No. 2024GXZDSY009).
Funding: Jiangsu Key R&D Program (BE2021622); Jiangsu Postgraduate Practice and Innovation Program (SJCX23_0395).
Funding: National Natural Science Foundation of China (No. 62063006); Guangxi Natural Science Foundation (Nos. 2023GXNSFAA026025, AA24010001); Innovation Fund of Chinese Universities Industry-University-Research (ID: 2023RY018); Guangxi Industry and Information Technology Department, Textile and Pharmaceutical Division Special Project (ID: 2021 No. 231); Special Research Project of Hechi University (ID: 2021GCC028); Key Laboratory of AI and Information Processing, Education Department of Guangxi Zhuang Autonomous Region (Hechi University) (No. 2024GXZDSY009).
Funding: National Natural Science Foundation of China (No. 61671470).
Funding: National Natural Science Foundation of China (No. 62063006); Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026025); Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); Research Project for Young and Middle-aged Teachers in Guangxi Universities (ID: 2020KY15013); Special Research Project of Hechi University (ID: 2021GCC028); Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi; Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Funding: Project Grant JZX7Y2-0190258055601; National Natural Science Foundation of China (61402283).
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2011AA040202); National Science Foundation of China (No. 51005008).
文摘This paper presents a hierarchical simultaneous localization and mapping(SLAM) system for a small unmanned aerial vehicle(UAV) using the output of an inertial measurement unit(IMU) and the bearing-only observations from an onboard monocular camera.A homography based approach is used to calculate the motion of the vehicle in 6 degrees of freedom by image feature match.This visual measurement is fused with the inertial outputs by an indirect extended Kalman filter(EKF) for attitude and velocity estimation.Then,another EKF is employed to estimate the position of the vehicle and the locations of the features in the map.Both simulations and experiments are carried out to test the performance of the proposed system.The result of the comparison with the referential global positioning system/inertial navigation system(GPS/INS) navigation indicates that the proposed SLAM can provide reliable and stable state estimation for small UAVs in GPS-denied environments.
Abstract: In dynamic scenes, the pose estimation and map consistency of visual simultaneous localisation and mapping (visual SLAM) are affected by intermittent changes in object motion states. An adaptive motion-state estimation and feature-reuse mechanism is proposed that restores features once objects become stationary. Camera ego-motion is compensated via projection-based point-to-point red-green-blue-depth (RGB-D) Iterative Closest Point; the alignment residual yields a short-term jitter score. An extended Kalman filter fuses the object's centre-pixel trajectory and depth, using depth innovation as strong evidence to suppress false triggers. Adaptive decision thresholds that account for resolution, ego-motion intensity, jitter, and reference depth are combined with dual/single triggering and hysteresis to achieve robust switching. When an object is considered static, its feature points are reused. On the Bonn RGB-D Dynamic Dataset (BONN) and the TUM RGB-D SLAM Dataset and Benchmark (TUM), the proposed method matches or exceeds the baselines: on the intermittent-motion-dominated BONN sequence Placing_non_box, it reduces the root mean square of the absolute trajectory error (ATE-RMSE) by 27% relative to the baseline, remains comparable to Ellipsoid-SLAM on TUM, and consistently outperforms ORB-SLAM3 in dynamic scenes. The hysteresis counter reading on Placing_non_box2 shows that the proposed method reduces the motion-state misclassification rate by nearly 40%. The ablation results confirm that the adaptive thresholds yield the most significant optimisation effect. The approach improves robustness and map completeness in dynamic environments without degrading performance in low-dynamic settings.
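The dual-threshold hysteresis switching can be illustrated with a minimal counter. The thresholds and consecutive-frame counts (`hi`, `lo`, `on_count`, `off_count`) are hypothetical values for illustration; the paper derives its thresholds adaptively.

```python
class HysteresisSwitch:
    """Dual-threshold hysteresis: an object is flagged dynamic only after
    on_count consecutive scores above hi, and reverts to static only after
    off_count consecutive scores below lo, suppressing single-frame noise."""

    def __init__(self, hi=0.6, lo=0.3, on_count=3, off_count=3):
        self.hi, self.lo = hi, lo
        self.on_count, self.off_count = on_count, off_count
        self.counter = 0
        self.dynamic = False

    def update(self, score):
        if self.dynamic:
            # count consecutive low scores before reverting to static
            self.counter = self.counter + 1 if score < self.lo else 0
            if self.counter >= self.off_count:
                self.dynamic, self.counter = False, 0
        else:
            # count consecutive high scores before switching to dynamic
            self.counter = self.counter + 1 if score > self.hi else 0
            if self.counter >= self.on_count:
                self.dynamic, self.counter = True, 0
        return self.dynamic
```

Because the two thresholds differ and several consecutive frames are required, a score that jitters around a single boundary cannot cause rapid state flipping.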
Abstract: To address the insufficient real-time performance of deep-learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems in dynamic scenes, as well as pose-estimation bias caused by undesired camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm uses the lightweight YOLO-fastest network combined with background subtraction to detect moving objects, builds a cross-domain mask segmentation mechanism by combining the depth map with depth-threshold segmentation, and designs a geometric correction strategy for camera motion to compensate for detection-box coordinate errors, achieving moving-object segmentation while improving processing speed. To improve the utilization of feature points, pyramidal optical flow is used to continuously track and update dynamic feature points across frames, while ensuring that only static feature points participate in pose estimation. Systematic evaluation on the TUM dataset shows that, compared with ORB-SLAM3, the algorithm reduces the absolute pose error by 97.1% on average; compared with the dynamic SLAM algorithms DynaSLAM and DS-SLAM, which rely on deep-learning segmentation networks, its per-frame tracking time is greatly reduced, achieving a better balance between accuracy and efficiency.
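The depth-threshold segmentation inside a detection box might be sketched as below. The median-depth heuristic and the `margin` value are assumptions for illustration; they are not the paper's exact cross-domain mask mechanism.

```python
import numpy as np

def depth_mask(depth, box, margin=0.3):
    """Segment the moving object inside a detection box by depth
    thresholding: pixels whose depth lies within `margin` metres of the
    box's median depth are treated as the object (the dynamic mask)."""
    x0, y0, x1, y1 = box
    roi = depth[y0:y1, x0:x1]
    med = np.median(roi[roi > 0])            # ignore invalid zero depth
    mask = np.zeros(depth.shape, dtype=bool)
    mask[y0:y1, x0:x1] = np.abs(roi - med) < margin
    return mask
```

Background pixels that fall inside the detection box but sit at a different depth are excluded, so their feature points can still be kept for pose estimation.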
Funding: supported by the Program of Graduate Education and Teaching Reform in Tianjin University of Technology (Nos. YBXM2204 and ZDXM2202) and the National Natural Science Foundation of China (Nos. 62203331 and 62103299).
Abstract: Although deep learning methods have been widely applied to SLAM visual odometry (VO) over the past decade with impressive improvements, their accuracy remains limited in complex dynamic environments. In this paper, a composite mask-based generative adversarial network (CMGAN) is introduced to predict camera motion and binocular depth maps. Specifically, a perceptual generator is constructed to obtain the corresponding disparity map and optical flow between two neighboring frames. Then, an iterative pose-improvement strategy is proposed to improve the accuracy of pose estimation. Finally, a composite mask is embedded in the discriminator to sense structural deformation in the synthesized virtual image, thereby strengthening the overall structural constraints of the network model, improving the accuracy of camera pose estimation, and reducing drift in the VO. Detailed quantitative and qualitative evaluations on the KITTI dataset show that the proposed framework outperforms existing conventional, supervised-learning, and unsupervised depth VO methods, providing better results in both pose estimation and depth estimation.
Abstract: Visual SLAM (Simultaneous Localization and Mapping) is a solution for achieving localization and mapping of robots simultaneously. Significant achievements have been made during the past decades, and geometry-based methods have become increasingly successful in dealing with static environments. However, they still cannot handle challenging environments. With the great achievements of deep learning in the field of computer vision, there is a trend of applying deep learning methods to visual SLAM. In this paper, the latest research progress in applying deep learning to visual SLAM is reviewed. Outstanding results on deep-learning visual odometry and deep-learning loop-closure detection are summarized. Finally, future development directions of visual SLAM based on deep learning are discussed.
Abstract: [Objective] Low-illumination, weak-texture, and structured feature-degraded scenes are common in underground coal mines, leaving visual SLAM (visual simultaneous localization and mapping) systems with insufficient valid features or high mismatch rates, which severely limits their localization accuracy and robustness. [Methods] An edge-aware enhanced visual SLAM method is proposed. First, an edge-aware constrained low-light image enhancement module is constructed: the Retinex algorithm is optimized with an adaptive-scale gradient-domain guided filter to obtain images with clear texture and uniform illumination, significantly improving feature-extraction performance under low and uneven lighting. Second, an edge-aware enhanced feature extraction and matching module is built into the visual odometry, where a point-line feature fusion strategy effectively improves feature detectability and matching accuracy in weak-texture and structured scenes. Specifically, line features are extracted with the edge drawing lines algorithm (EDLines), point features with the oriented FAST and rotated BRIEF algorithm (ORB), and accurate matching is performed using grid-based motion statistics (GMS) and a ratio-test matching algorithm. Finally, the method is comprehensively validated against ORB-SLAM2 and ORB-SLAM3 on the TUM dataset and a real underground coal-mine dataset, covering image enhancement, feature matching, and localization. [Results and Conclusions] The results show that: (1) On the TUM dataset, compared with ORB-SLAM2, the proposed method reduces the root-mean-square error of the absolute and relative trajectory errors by 4%-38.46% and 8.62%-50%, respectively; compared with ORB-SLAM3, by 0-61.68% and 3.63%-47.05%, respectively. (2) In the real underground experiments, the estimated trajectory is closer to the camera's reference trajectory. (3) The method effectively improves the accuracy and robustness of visual SLAM in feature-degraded underground scenes, providing a technical solution for applying visual SLAM in coal mines. Research on visual SLAM for feature-degraded underground scenes is significant for advancing the robotization of mobile equipment in coal mines.
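The Retinex-style log-domain enhancement can be illustrated minimally. As a deliberate simplification, a plain box blur stands in for the paper's adaptive-scale gradient-domain guided filter; kernel size `k` is an assumed parameter.

```python
import numpy as np

def box_blur(img, k=3):
    """Box-filter illumination estimate (stand-in for the guided filter)."""
    pad = k // 2
    p = np.pad(img, pad, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def single_scale_retinex(img, k=3):
    """Retinex: subtract the estimated illumination in the log domain,
    leaving the reflectance (texture) component."""
    eps = 1e-6                               # avoid log(0)
    return np.log(img + eps) - np.log(box_blur(img, k) + eps)
```

A uniformly lit image yields a zero reflectance map, while texture edges survive as non-zero log-ratios, which is what makes subsequent feature extraction more reliable.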
Abstract: In underground coal-mine visual SLAM (Simultaneous Localization and Mapping) applications, illumination changes and low-texture scenes severely affect feature-point extraction and matching, causing pose-estimation failures and degrading localization accuracy. A stereo visual localization algorithm for underground mobile robots, SL-SLAM, based on an improved ORB (Oriented FAST and Rotated BRIEF)-SLAM3 is proposed. For illumination-varying scenes, the front end replaces the original ORB feature extractor with the illumination-robust SuperPoint feature-extraction network, and a feature-point grid restriction method is proposed to effectively remove invalid feature regions and increase pose-estimation stability. For low-texture scenes, the stable line segment detector (LSD) line-feature extraction algorithm is introduced in the front end, and a point-line joint algorithm is proposed: line features are grouped by the feature-point grid and matched according to the point-matching results, reducing line-matching complexity and saving pose-estimation time. Reprojection-error models for point and line features are constructed, an angle constraint is added to the line-feature residual model, and a unified cost function over point and line reprojection errors is established through the pose-increment Jacobians of both feature types. The local mapping thread adjusts point features, line features, and keyframe poses using the classical local optimization of ORB-SLAM3, and the back-end thread performs loop correction, submap fusion, and global bundle adjustment (BA). Experiments on the EuRoC dataset show that SL-SLAM outperforms the compared algorithms in absolute pose error (APE) and achieves the trajectory prediction closest to ground truth, with the root-mean-square error reduced by 17.3% compared with ORB-SLAM3. Experiments in a simulated underground coal-mine scene show that SL-SLAM adapts to illumination changes and low-texture scenes and meets the localization accuracy and stability requirements of underground mobile robots.
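The point and line reprojection residuals can be sketched as below. This is a simplified illustration: the line residual is the signed distance of the projected 3-D segment endpoints to the observed 2-D line, omitting the paper's angle constraint and pose-increment Jacobians; the intrinsics and geometry in the usage are made up.

```python
import numpy as np

def project(K, X):
    """Pinhole projection of a 3-D point X (camera frame) with intrinsics K."""
    x = K @ X
    return x[:2] / x[2]

def point_residual(K, X, obs):
    """Point reprojection error: projected point minus observed pixel."""
    return project(K, X) - np.asarray(obs, dtype=float)

def line_residual(K, P1, P2, line2d):
    """Line reprojection error: signed distances of the two projected
    segment endpoints to the observed 2-D line a*x + b*y + c = 0
    (with (a, b) normalised)."""
    a, b, c = line2d
    d = lambda p: a * p[0] + b * p[1] + c
    return np.array([d(project(K, P1)), d(project(K, P2))])
```

Stacking both residual types into one cost lets a single optimizer refine the pose using points and lines jointly, which is the core idea of the unified cost function above.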
Funding: funded by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant number 22KJD440001) and the Changzhou Science & Technology Program (grant number CJ20220232).
Abstract: The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part, and simultaneous localization and mapping (SLAM) technology plays a key role in vehicle localization and navigation. Traditional SLAM systems are designed for static environments and can suffer poor accuracy and robustness in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic-target detection process that detects both known and unknown moving objects. In this process, a separate semantic segmentation thread segments dynamic target instances, and the Mask R-CNN algorithm runs on the graphics processing unit (GPU) to accelerate segmentation. To reduce computational cost, only keyframes are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, improving from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared with other dynamic-scene SLAM algorithms, illustrating its efficacy in localizing in dynamic scenes.
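Multi-view geometry checks for unknown moving objects commonly test the epipolar constraint: a static point matched across two views must lie on its epipolar line. The sketch below assumes a known fundamental matrix `F`, and the threshold `thresh` is an illustrative value, not a parameter from MG-SLAM.

```python
import numpy as np

def epipolar_distance(F, p1, p2):
    """Distance of point p2 (second view) from the epipolar line F @ p1."""
    l = F @ np.array([p1[0], p1[1], 1.0])
    return abs(l @ np.array([p2[0], p2[1], 1.0])) / np.hypot(l[0], l[1])

def flag_dynamic(F, matches, thresh=1.0):
    """A match violating the epipolar constraint is likely on a moving
    object; return one boolean flag per (p1, p2) match."""
    return [epipolar_distance(F, p1, p2) > thresh for p1, p2 in matches]
```

Points flagged this way are removed before pose optimization, which is how geometric methods complement the semantic segmentation that only recognizes known object classes.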