This paper presents a novel approach for camera pose refinement based on neural radiance fields (NeRF) by introducing semantic feature consistency to enhance robustness. NeRF has been successfully applied to camera pose estimation by inverting the rendering process given an observed RGB image and an initial pose estimate. However, previous methods adopted only photometric consistency for pose optimization, which is prone to getting trapped in local minima. To address this problem, we introduce semantic feature consistency into the existing framework. Specifically, we utilize high-level features extracted from a convolutional neural network (CNN) pre-trained for image recognition, and maintain consistency of these features between the observed and rendered images during the optimization procedure. Unlike the color values at each pixel, these features contain rich semantic information shared within local regions and are more robust to appearance changes across viewpoints. Since it is computationally expensive to render a full image with NeRF for CNN feature extraction, we propose an efficient way to estimate the features of individually rendered pixels by projecting them onto a nearby reference image and interpolating its feature maps. Extensive experiments show that our method greatly outperforms the baseline on both synthetic objects and real-world large indoor scenes, increasing pose estimation accuracy by over 6.4%.
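The per-pixel feature lookup described above — projecting a rendered pixel onto a nearby reference image and bilinearly interpolating that image's CNN feature map — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function names and the simple L2 consistency loss are assumptions.

```python
import numpy as np

def interpolate_features(feat_map, x, y):
    """Bilinearly interpolate a feature map (H, W, C) at continuous pixel (x, y)."""
    H, W, _ = feat_map.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * feat_map[y0, x0] + wx * feat_map[y0, x1]
    bot = (1 - wx) * feat_map[y1, x0] + wx * feat_map[y1, x1]
    return (1 - wy) * top + wy * bot

def feature_consistency_loss(obs_feats, rendered_feats):
    """Mean squared distance between observed and rendered per-pixel features
    (the semantic counterpart of a photometric loss)."""
    return float(np.mean(np.sum((obs_feats - rendered_feats) ** 2, axis=-1)))
```

In an optimization loop, the projected coordinates (x, y) would come from the camera model and the current pose estimate, and the loss gradient would drive the pose update.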
Image-based relocalization has attracted renewed interest in outdoor environments because it is an important problem with many applications. PoseNet introduced a Convolutional Neural Network (CNN) for the first time to realize real-time camera pose estimation from a single image. To address the precision and robustness problems of PoseNet and its improved variants in complex environments, this paper proposes and implements a new visual relocalization method based on deep convolutional neural networks (VNLSTM-PoseNet). First, the method directly resizes the input image without cropping to increase the receptive field of the training image. Then, the image and the corresponding pose labels are fed into the improved Long Short-Term Memory based (LSTM-based) PoseNet network for training, and the network is optimized with the Nadam optimizer. Finally, the trained network is used for image localization to obtain the camera pose. Experimental results on outdoor public datasets show that our VNLSTM-PoseNet leads to drastic improvements in relocalization performance compared to existing state-of-the-art CNN-based methods.
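The pose regression target described above can be illustrated with the standard PoseNet-style loss, which combines translation error with a weighted quaternion orientation error. This is a minimal sketch; the function name and the beta weighting value are assumptions, not taken from the paper.

```python
import numpy as np

def pose_loss(t_pred, q_pred, t_true, q_true, beta=500.0):
    """PoseNet-style regression loss: Euclidean translation error plus a
    beta-weighted quaternion orientation error. The predicted quaternion is
    normalized before comparison, as in the original PoseNet formulation."""
    q_pred = q_pred / np.linalg.norm(q_pred)
    return float(np.linalg.norm(t_pred - t_true)
                 + beta * np.linalg.norm(q_pred - q_true))
```

The beta hyperparameter balances the very different scales of translation (meters) and rotation (unit-quaternion distance) and is typically tuned per scene.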
In this paper we present a novel feature-based RGB-D camera pose optimization algorithm for real-time 3D reconstruction systems. During camera pose estimation, current online systems suffer from fast-scanned RGB-D data or generate inaccurate relative transformations between consecutive frames. Our approach improves on current methods by utilizing matched features across all frames and is robust for RGB-D data with large shifts between consecutive frames. We directly estimate the camera pose for each frame by efficiently solving a quadratic minimization problem that maximizes the consistency of 3D points in global space across frames corresponding to matched feature points. We have implemented our method within two state-of-the-art online 3D reconstruction platforms. Experimental results confirm that our method is efficient and reliable in estimating camera poses for RGB-D data with large shifts.
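Maximizing the consistency of matched 3D points across frames reduces, for a single frame pair, to a least-squares rigid alignment problem with a well-known closed-form solution (Kabsch/Umeyama via SVD). The sketch below shows that closed form as an illustration of the quadratic-minimization idea; it is not the paper's multi-frame solver.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) aligning src to dst, where src and
    dst are (N, 3) arrays of matched 3D points. Solves argmin ||R@src + t - dst||^2
    in closed form via the Kabsch algorithm."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)       # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # correct an improper (reflected) solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t
```

A multi-frame system would stack such point-consistency terms over all matched features and all frames into one larger quadratic problem.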
Virtual reality, augmented reality, robotics, and autonomous driving have recently attracted much attention from both academic and industrial communities, and image-based camera localization is a key task in all of them. However, there has been no complete review of image-based camera localization, and it is urgent to map this topic so that newcomers can enter the field quickly. In this paper, an overview of image-based camera localization is presented. A new and complete classification of image-based camera localization approaches is provided, the related techniques are introduced, and trends for future development are discussed. This will be useful not only to researchers but also to engineers and other individuals interested in this field.
Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during an image sequence, and such changes play a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposes a real-time indoor camera localization system based on a recurrent neural network that detects scene changes during the image sequence. The proposed system is trained on an annotated image dataset and predicts the camera pose in real time. The system improves the localization performance of indoor cameras mainly by predicting the camera pose more accurately; it also recognizes scene changes during the sequence and evaluates their effects. The system achieves high accuracy and real-time performance. Scene change detection is performed using visual rhythm together with the proposed recurrent deep architecture, which carries out camera pose prediction and scene-change impact evaluation. Overall, this study proposes a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
Transcranial magnetic stimulation (TMS) is a neuromodulation technique. In clinical practice, the TMS coil pose is set manually based on the physician's experience, leading to inaccurate coil position and orientation and poor repeatability of placement. To address this problem, a robot-assisted TMS coil positioning system is proposed that replaces the binocular infrared cameras of conventional navigation systems with an RGB camera and adopts a marker-free, neural-network-based method for robot-assisted TMS coil positioning. A neural network is built to map the coil pose in camera space to the joint angles of the manipulator, and training on simulated data verifies that this network architecture is suitable for the TMS coil placement problem. Experiments then confirm the feasibility of the method and show that the trained network generalizes well to the TMS coil positioning task. Finally, pose validation in Cartesian space shows an average 3D position error of 2.16 mm and an overall orientation error of 0.055 rad for the TMS coil. The RGB-camera-based robot-assisted positioning system thus matches the accuracy of research and commercial systems that use binocular infrared cameras, meets the requirements of clinical TMS treatment, and is feasible for clinical application.
Camera pose estimation from point and line correspondences is critical in various applications, including robotics, augmented reality, 3D reconstruction, and autonomous navigation. Existing methods, such as the Perspective-n-Point (PnP) and Perspective-n-Line (PnL) approaches, offer limited accuracy and robustness in environments with occlusions, noise, or sparse feature data. This paper presents a unified solution, Efficient and Accurate Pose Estimation from Point and Line Correspondences (EAPnPL), which combines point-based and line-based constraints to improve pose estimation accuracy and computational efficiency, particularly in low-altitude UAV navigation and obstacle avoidance. The proposed method uses a quaternion parameterization of the rotation matrix to overcome singularity issues and address challenges in traditional rotation-matrix-based formulations. A hybrid optimization framework is developed to integrate both point and line constraints, providing a more robust and stable solution in complex scenarios. The method is evaluated on synthetic and real-world datasets, demonstrating significant improvements over existing techniques. The results indicate that EAPnPL enhances accuracy and reduces computational complexity, making it suitable for real-time applications in autonomous UAV systems. This approach offers a promising solution to the limitations of existing camera pose estimation methods, with potential applications in low-altitude navigation, autonomous robotics, and 3D scene reconstruction.
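The quaternion parameterization mentioned above avoids the singularities of minimal rotation parameterizations such as Euler angles. The sketch below shows the standard unit-quaternion-to-rotation-matrix conversion and a point reprojection residual of the kind such a hybrid optimizer would minimize; the function names and the normalized-image-plane camera model are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a quaternion (w, x, y, z); normalized internally,
    so the mapping is singularity-free for any nonzero q."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def reprojection_residuals(q, t, pts3d, pts2d):
    """Point residuals for pose (q, t): error between projected 3D points and
    observed 2D points on the normalized image plane."""
    R = quat_to_rot(np.asarray(q, dtype=float))
    cam = pts3d @ R.T + t                      # world -> camera coordinates
    proj = cam[:, :2] / cam[:, 2:3]            # pinhole projection
    return (proj - pts2d).ravel()
```

Line correspondences would contribute analogous residuals (e.g., distances from projected 3D line endpoints to observed 2D lines) stacked into the same least-squares problem.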
In dynamic scenes, visual simultaneous localization and mapping (SLAM) is often combined with deep learning methods to improve localization accuracy. To address the problem that the latency introduced by deep learning methods makes it hard for such systems to meet streaming-processing requirements, a streaming-aware visual localization method for SLAM in dynamic scenes is proposed. First, since traditional evaluation metrics consider only localization accuracy, a streaming evaluation metric is proposed that accounts for both localization accuracy and latency and thus accurately reflects a system's streaming performance. Second, since traditional visual SLAM methods cannot achieve streaming processing, a streaming-aware visual localization method is proposed that combines multi-threaded parallelism with camera pose prediction to obtain continuous and stable camera pose output. Experiments on the BONN dataset and in real scenes show that the proposed method effectively improves the streaming performance of deep-learning-based visual localization in dynamic scenes. Under the streaming evaluation protocol on the BONN dataset, compared with DynaSLAM, the absolute trajectory error (APE), relative translation error (RPE_trans), and relative rotation error (RPE_angle) of the proposed method decrease by 80.438%, 56.180%, and 54.676%, respectively. Experiments in real scenes show that the proposed method produces camera trajectories consistent with the actual motion.
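A streaming evaluation metric of the kind described — one that penalizes latency as well as inaccuracy — can be sketched by scoring each ground-truth pose against the most recent estimate the system has emitted by that time, so a delayed output is compared using stale predictions. This is a hypothetical illustration of the idea, not the paper's metric.

```python
def streaming_ate(gt, est):
    """Streaming-aware absolute trajectory error sketch.
    gt:  list of (timestamp, position) ground-truth samples.
    est: list of (emission_time, position) estimates, sorted by emission time.
    Positions are tuples of coordinates. Each ground-truth sample is matched to
    the latest estimate already emitted at its timestamp, so latency appears
    directly as trajectory error."""
    errors = []
    for t_gt, p_gt in gt:
        available = [p for t_e, p in est if t_e <= t_gt]
        if not available:
            continue                      # no estimate emitted yet
        p = available[-1]                 # latest estimate available at t_gt
        errors.append(sum((a - b) ** 2 for a, b in zip(p, p_gt)) ** 0.5)
    return sum(errors) / len(errors) if errors else float("nan")
```

Under such a metric, a slow but accurate system can score worse than a faster, slightly less accurate one, which is exactly the trade-off a streaming-aware SLAM method targets.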
Event cameras are a new type of bio-inspired vision sensor with high dynamic range, low latency, and no motion blur. This paper proposes an event-and-image data fusion algorithm named EI-Fusion (Event and Image Fusion), which exploits the complementarity of event cameras and conventional frame cameras and effectively improves image quality under complex lighting conditions. In addition, a 3-DoF pose estimation system based on optical-flow tracking is designed, and the fusion results are used as its input to further evaluate the algorithm in a pose estimation application. Experimental results show that the average APE (Absolute Pose Error) of EI-Fusion is 69% lower than that obtained with the original images, substantially improving the performance of the pose estimation framework in dim scenes.
Funding: This work is supported by the National Key R&D Program of China [grant number 2018YFB0505400], the National Natural Science Foundation of China (NSFC) [grant number 41901407], the LIESMARS Special Research Funding [grant number 2021], and the College Students' Innovative Entrepreneurial Training Plan Program [grant number S2020634016].
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61421004, 61572499, and 61632003.
Funding: Funded by the Jiangsu Province Postgraduate Scientific Research and Practice Innovation Program (SJCX240449) project and the Nanjing University of Information Science and Technology Talent Startup Fund (2022r078).