Funding: Supported by the National Natural Science Foundation of China (61772379).
Abstract: To address system error and noise in simultaneous localization and mapping (SLAM), we propose a calibration model based on the Project Tango device and a loop closure detection algorithm based on a visual vocabulary with memory management. Graph optimization is also incorporated to produce a running application. First, color images and depth information of the environment are collected to establish the calibration model of system error and noise. Second, under the constraints provided by the loop closure detection algorithm, speeded-up robust features (SURF) are computed and matched. Finally, the motion pose model is solved, and the optimal scene model is determined by graph optimization. The method is compared with Open Constructor for reconstruction in several experimental scenarios. The results show that the reconstructed models contain more points and faces than Open Constructor's and that the scanning time is shorter, demonstrating the feasibility and efficiency of the proposed algorithm.
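As an illustration of loop closure detection with a visual vocabulary, the following is a minimal sketch and not the authors' implementation. It assumes each frame has already been quantized into a list of visual word IDs. The function names, the cosine-similarity score, the 0.8 threshold, and the `min_gap` parameter are all hypothetical, and the memory management described in the abstract is omitted.

```python
from collections import Counter
from math import sqrt


def bow_histogram(word_ids):
    """Count how often each visual word occurs in one frame."""
    return Counter(word_ids)


def cosine_similarity(h1, h2):
    """Cosine similarity between two sparse word histograms."""
    dot = sum(count * h2.get(word, 0) for word, count in h1.items())
    norm1 = sqrt(sum(c * c for c in h1.values()))
    norm2 = sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def find_loop_candidates(frames, query_index, threshold=0.8, min_gap=2):
    """Return (index, score) pairs of earlier frames that resemble the query.

    Frames closer than `min_gap` to the query are skipped, since
    neighbouring frames look alike without being loop closures.
    """
    query = bow_histogram(frames[query_index])
    candidates = []
    for i in range(query_index - min_gap + 1):
        score = cosine_similarity(query, bow_histogram(frames[i]))
        if score >= threshold:
            candidates.append((i, score))
    return candidates


if __name__ == "__main__":
    # Hypothetical word IDs: frame 3 revisits the place seen in frame 0.
    frames = [[1, 2, 3, 3], [7, 8, 9], [4, 5, 6], [1, 2, 3, 3]]
    print(find_loop_candidates(frames, query_index=3))
```

In a full pipeline, each accepted candidate would still need geometric verification (for example by matching SURF features) before being added as a constraint to the pose graph.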
Funding: Supported by the National Natural Science Foundation of China (61501034).
Abstract: This paper proposes a semi-direct visual odometry and mapping system for an RGB-D camera that combines the merits of feature-based and direct methods. The system directly estimates the camera motion between two consecutive RGB-D frames by minimizing the photometric error. To tolerate outliers and noise, a robust sensor model built upon the t-distribution and an error function mixing depth and photometric errors are used to enhance accuracy and robustness. Local graph optimization based on key frames is used to reduce the accumulated error and refine the local map. The loop closure detection method, which combines appearance similarity with spatial location constraints, increases the speed of detection. Experimental results demonstrate that the proposed approach achieves higher accuracy in motion estimation and environment reconstruction than other state-of-the-art methods. Moreover, the approach runs in real time on a laptop without a GPU, which makes it attractive for robots with limited computational resources.
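To make the idea of a t-distribution sensor model concrete, the sketch below shows one common way such a model yields per-residual weights for iteratively reweighted least squares. It is an illustrative assumption, not the paper's exact formulation: the degrees of freedom (`dof=5.0`), the fixed-point scale estimate, and the function name `t_distribution_weights` are chosen here for the example only.

```python
def t_distribution_weights(residuals, dof=5.0, iters=20, tol=1e-6):
    """Compute robust weights from a zero-mean Student t model.

    Each residual r gets weight (dof + 1) / (dof + r^2 / sigma^2), so
    large residuals (likely outliers) are down-weighted. The scale
    sigma^2 is estimated by fixed-point iteration.
    Returns (weights, sigma).
    """
    n = len(residuals)
    eps = 1e-12  # guards against division by zero when all residuals are 0
    sigma2 = max(sum(r * r for r in residuals) / n, eps)
    for _ in range(iters):
        updated = sum(
            r * r * (dof + 1.0) / (dof + r * r / sigma2) for r in residuals
        ) / n
        updated = max(updated, eps)
        converged = abs(updated - sigma2) < tol
        sigma2 = updated
        if converged:
            break
    weights = [(dof + 1.0) / (dof + r * r / sigma2) for r in residuals]
    return weights, sigma2 ** 0.5


if __name__ == "__main__":
    # Hypothetical photometric residuals; the last one is an outlier.
    weights, sigma = t_distribution_weights([0.1, -0.2, 0.05, 5.0])
    print(weights, sigma)
```

The weights would multiply the squared photometric (and depth) residuals in each Gauss-Newton iteration, so a few badly matched pixels do not dominate the motion estimate.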
Funding: Supported by the Autonomous Intelligent Unmanned Systems (No. NSFC 62088101), the National Natural Science Foundation of China (No. 62306096), and in part by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LD24F030001.
Abstract: This paper presents a visual simultaneous localization and mapping (SLAM) system designed for highly dynamic environments, capable of eliminating dynamic objects using only visual information. The proposed system integrates learning-based and geometry-based methods to address the challenges posed by moving objects. The learning-based approach uses image segmentation to remove objects of previously trained classes, whereas the geometry-based approach uses point correlation to eliminate unseen objects. By complementing each other, these methods enhance the robustness of the SLAM system in dynamic scenarios. Experimental results demonstrate that the proposed method effectively removes dynamic objects. Comparative studies with state-of-the-art algorithms further show that the proposed method achieves superior accuracy and robustness.
Abstract: An improved landmark selection method using a single camera is presented, with better selection capability than the previous method. To improve performance, two techniques are applied to landmark selection in an unfamiliar indoor environment. First, a modified visual attention method is proposed to automatically select candidate regions as more useful landmarks. In visual attention, candidate landmark regions are selected according to how their color and intensity differ from their surroundings in the image. Then, the more useful landmarks are selected by combining the candidate regions using clustering. As generally implemented, automatic landmark selection in vision-based simultaneous localization and mapping (SLAM) yields many useless landmarks, because image features that stand out from the surrounding environment are nevertheless detected repeatedly. These useless landmarks create a serious problem for the SLAM system because they complicate data association. To address this, the robot first collects landmarks through automatic detection while traversing the entire area in which it performs SLAM, and then selects only those landmarks that exhibit high rarity through clustering, which enhances system performance. Experimental results show that this method of automatic landmark selection yields high-rarity landmarks. The average SLAM error decreases by 52% compared with conventional methods, and the accuracy of data association increases.
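The rarity-based selection described above can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's method: landmarks are represented by hypothetical descriptor tuples, clustering is a simple greedy radius-based grouping, and a landmark counts as "rare" when its cluster holds at most `max_cluster_size` members. The function names and parameter values are invented for the example.

```python
from math import dist


def cluster_descriptors(descriptors, radius):
    """Greedy clustering: a descriptor joins the first cluster whose seed
    lies within `radius`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of landmark indices."""
    clusters = []  # list of (seed_descriptor, member_indices)
    for index, descriptor in enumerate(descriptors):
        for seed, members in clusters:
            if dist(seed, descriptor) <= radius:
                members.append(index)
                break
        else:
            clusters.append((descriptor, [index]))
    return [members for _, members in clusters]


def select_rare_landmarks(descriptors, radius=0.5, max_cluster_size=1):
    """Keep only landmarks whose appearance is rare, i.e. whose cluster
    contains at most `max_cluster_size` landmarks."""
    rare = []
    for members in cluster_descriptors(descriptors, radius):
        if len(members) <= max_cluster_size:
            rare.extend(members)
    return sorted(rare)


if __name__ == "__main__":
    # Three near-identical landmarks (repeated texture) and two unique ones.
    descriptors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (9.0, 1.0)]
    print(select_rare_landmarks(descriptors))
```

Discarding the repeated landmarks leaves fewer ambiguous matches, which is why rarity helps data association.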
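Abstract: To address the trajectory drift of point-feature-based visual SLAM (simultaneous localization and mapping) algorithms in weakly textured environments with varying illumination, a fast point-line fusion stereo visual SLAM algorithm is proposed, based on an improved adaptive-threshold ELSED algorithm (Adaptive-ELSED). An adaptive threshold matrix is added to the ELSED algorithm to dynamically adjust the gradient threshold under different lighting conditions, and length suppression and short-line merging strategies are used to improve the quality of line features. Line segment features are triangulated quickly using stereo geometric constraints and image structural similarity (SSIM). An initial pose is obtained from historical poses and error analysis, and an adaptive factor enables more effective fusion of point and line features during bundle adjustment. Experimental results show that, while improving line feature quality, the proposed algorithm takes only 50% of the time of the LSD algorithm, line feature matching is 67% faster than with the traditional LBD algorithm, trajectory error in challenging scenes is 62.2% lower than that of ORB-SLAM3, and the average tracking frame rate is 27 frames/s. The algorithm thus significantly improves accuracy and robustness in weakly textured and varying-illumination environments while maintaining real-time performance.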
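Abstract: vSLAM (visual simultaneous localization and mapping) is a technique that performs simultaneous localization and mapping with visual sensors. It serves ground robots and also has very important applications in the localization and navigation of unmanned aerial vehicles (UAVs). This paper surveys the development of UAV-based vSLAM and introduces the state of research in several key directions, mainly vSLAM combined with an IMU and vSLAM combined with optical flow sensors, and summarizes the problems and shortcomings that remain in current research. Drawing on classical theory and recent research trends, it offers an outlook on the key research topics and future directions of UAV-based vSLAM.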
Funding: Supported by the NIBIB and the NEI of the National Institutes of Health (R01EB018117).
Abstract: There are about 253 million people with visual impairment worldwide. Many of them use a white cane and/or a guide dog as a mobility tool for daily travel. Despite decades of effort, an electronic navigation aid that can replace the white cane is still a work in progress. In this paper, we propose an RGB-D camera based visual positioning system (VPS) for real-time localization of a robotic navigation aid (RNA) in an architectural floor plan for assistive navigation. The core of the system is the combination of a new 6-DOF depth-enhanced visual-inertial odometry (DVIO) method and a particle filter localization (PFL) method. DVIO estimates the RNA's pose using data from an RGB-D camera and an inertial measurement unit (IMU). It extracts the floor plane from the camera's depth data and tightly couples the floor plane, the visual features (with and without depth data), and the IMU's inertial data in a graph optimization framework to estimate the device's 6-DOF pose. Because it uses the floor plane and the depth data from the RGB-D camera, DVIO achieves better pose estimation accuracy than the conventional VIO method. To reduce the accumulated pose error of DVIO during navigation in a large indoor space, we developed the PFL method to locate the RNA in the floor plan. PFL leverages geometric information from the architectural CAD drawing of an indoor space to further reduce the error of the DVIO-estimated pose. Based on VPS, an assistive navigation system is developed for the RNA prototype to assist a visually impaired person in navigating a large indoor space. Experimental results demonstrate that: 1) the DVIO method achieves better pose estimation accuracy than the state-of-the-art VIO method and performs real-time pose estimation (18 Hz pose update rate) on a UP Board computer; 2) PFL reduces the DVIO-accrued pose error by 82.5% on average and allows for accurate wayfinding (endpoint position error ≤ 45 cm) in large indoor spaces.
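The way a particle filter can pull an odometry estimate back toward a known map can be sketched with a deliberately simplified example. This is not the paper's PFL: the world is one-dimensional and the "floor plan" is a single wall at a known coordinate `wall_x`. The odometry increment stands in for the DVIO output, and the range measurement, the noise parameters, and the function name `particle_filter_step` are all assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, odom_delta, measured_range, wall_x, rng,
                         motion_std=0.05, meas_std=0.1):
    """One predict / weight / resample cycle for 1D position particles."""
    # Predict: apply the odometry increment plus motion noise.
    predicted = [p + odom_delta + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: Gaussian likelihood of the measured distance to the mapped wall.
    weights = [
        math.exp(-0.5 * ((wall_x - p - measured_range) / meas_std) ** 2)
        for p in predicted
    ]
    n = len(predicted)
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / n] * n  # no particle explains the data: keep all
    else:
        weights = [w / total for w in weights]

    # Resample (systematic): draw n particles in proportion to their weights.
    step = 1.0 / n
    u = rng.random() * step
    resampled = []
    cumulative = weights[0]
    i = 0
    for _ in range(n):
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(predicted[i])
        u += step
    return resampled


if __name__ == "__main__":
    rng = random.Random(0)
    wall_x, true_x = 10.0, 2.0
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    for _ in range(10):
        true_x += 0.5
        particles = particle_filter_step(
            particles, 0.5, wall_x - true_x, wall_x, rng)
    print(sum(particles) / len(particles), true_x)
```

In the real system the particles would carry a planar pose in the floor plan and the likelihood would come from comparing observed geometry against the CAD drawing; the predict, weight, resample structure is the part this sketch is meant to show.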
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2011AA040202) and the National Science Foundation of China (No. 51005008).
Abstract: This paper presents a hierarchical simultaneous localization and mapping (SLAM) system for a small unmanned aerial vehicle (UAV) using the output of an inertial measurement unit (IMU) and the bearing-only observations from an onboard monocular camera. A homography-based approach is used to calculate the motion of the vehicle in 6 degrees of freedom from image feature matching. This visual measurement is fused with the inertial outputs by an indirect extended Kalman filter (EKF) for attitude and velocity estimation. Then, another EKF is employed to estimate the position of the vehicle and the locations of the features in the map. Both simulations and experiments are carried out to test the performance of the proposed system. Comparison with a reference global positioning system/inertial navigation system (GPS/INS) solution indicates that the proposed SLAM system can provide reliable and stable state estimation for small UAVs in GPS-denied environments.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61473042 and 61105092) and the Beijing Higher Education Young Elite Teacher Project (No. YETP1215).
Abstract: In recent years, there has been considerable interest in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach to generating an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to the system are stereo color images from a moving vehicle. First, the dense 3D space around the vehicle is constructed, and the motion of the camera is estimated by visual odometry. Meanwhile, semantic segmentation is performed online through deep learning, and the semantic labels are also used to verify the feature matching in visual odometry. These three processes compute the motion, depth, and semantic label of every pixel in the input views. Then, a voxel conditional random field (CRF) inference is introduced to fuse semantic labels into voxels. After that, we present a method to remove moving objects by incorporating the semantic labels, which improves the motion segmentation accuracy. The last step is to generate the dense 3D semantic map of an urban environment from an arbitrarily long image sequence. We evaluate our approach on the KITTI vision benchmark, and the results show that the proposed method is effective.