Funding: National Natural Science Foundation of China (No. 50275040)
Abstract: A new motion model and estimation algorithm are proposed to compute a rigid object's general 6-DOF motion parameters and center of rotation based on stereo vision. The 6-DOF motion model is derived from the rigid object's motion characteristics under two defined reference frames. From this motion model and rigid-body dynamics, a corresponding algorithm for computing the 6-DOF motion parameters is developed. Using the pure-rotation model and the geometry of spheres in space, the center of rotation can be calculated after the translational component is eliminated from the 6-DOF motion. The motion equations are derived from the model and their closed-form solutions are obtained. To improve the robustness of the estimation, the RANSAC algorithm is applied to reject outliers. Simulation and real experiments are conducted and analyzed; the results confirm the correctness of the motion model and the validity of the algorithm.
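To make this pipeline concrete, below is a minimal sketch, assuming 3D point pairs triangulated from stereo, the Kabsch/SVD closed form for the rigid motion, and a least-squares fixed-point solve for the rotation center; the function names and threshold are illustrative, not the paper's implementation.

```python
# Minimal sketch (not the paper's code): closed-form rigid motion (R, t)
# between two 3D point sets, RANSAC outlier rejection, and a rotation-center
# estimate as the least-squares fixed point of the motion.
import numpy as np

def rigid_motion(P, Q):
    """Closed-form R, t minimizing ||R P + t - Q||^2 (Kabsch / SVD)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cQ - R @ cP

def ransac_rigid(P, Q, iters=200, tol=0.01, seed=0):
    """Fit (R, t) robustly; refit on the largest inlier set found."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)  # minimal 3-point sample
        R, t = rigid_motion(P[idx], Q[idx])
        inliers = np.linalg.norm(P @ R.T + t - Q, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return rigid_motion(P[best], Q[best])

def rotation_center(R, t):
    """Least-squares fixed point c with R c + t = c. (I - R) is singular along
    the rotation axis, so use lstsq rather than a direct inverse."""
    return np.linalg.lstsq(np.eye(3) - R, t, rcond=None)[0]
```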
Funding: Project (2012CB720003) supported by the National Basic Research Program of China; Projects (61320106010, 61127007, 61121003, 61573019) supported by the National Natural Science Foundation of China; Project (2013DFE13040) supported by the Special Program for International Science and Technology Cooperation from the Ministry of Science and Technology of China
Abstract: Because of its simple algorithm and hardware requirements, optical flow-based motion estimation has become an active research field, especially in GPS-denied environments. Optical flow can be used to recover aircraft motion information, but existing methods still cannot accurately estimate the full six-degree-of-freedom (6-DOF) motion. This work provides a motion estimation method based on optical flow from forward- and downward-looking cameras that does not rely on the assumption of level flight. First, the distribution and decoupling of the optical flow from the forward camera are used to obtain attitude. The resulting angular velocities are then used to compute the translational optical flow of the downward camera, which eliminates the influence of rotational motion on velocity estimation. In addition, the translational motion estimation equation is simplified by relating the depths of feature points to the aircraft altitude. Finally, simulation results show that the presented method is accurate and robust.
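To make the derotation step concrete, here is a minimal sketch assuming the standard pinhole flow model in normalized image coordinates and flat ground, so depth Z is approximately the altitude for the downward camera; the variable names and least-squares solve are illustrative, not the paper's formulation.

```python
# Minimal sketch (illustrative, not the paper's formulation): subtract the
# rotational component of optical flow using angular velocity from the
# forward camera, then solve for velocity assuming depth Z ~ altitude.
import numpy as np

def derotate_flow(flow, pts, omega):
    """flow: Nx2 measured flow; pts: Nx2 normalized coords; omega: (wx, wy, wz)."""
    x, y = pts[:, 0], pts[:, 1]
    wx, wy, wz = omega
    # Pinhole rotational flow field (unit focal length, normalized coordinates).
    u_rot = x * y * wx - (1 + x ** 2) * wy + y * wz
    v_rot = (1 + y ** 2) * wx - x * y * wy - x * wz
    return flow - np.stack([u_rot, v_rot], axis=1)

def velocity_from_flow(t_flow, pts, altitude):
    """Least-squares (vx, vy, vz) from translational flow, taking all depths = altitude."""
    A, b = [], []
    for (x, y), (u, v) in zip(pts, t_flow):
        # Translational flow model: u = (-vx + x*vz)/Z, v = (-vy + y*vz)/Z.
        A.append([-1 / altitude, 0.0, x / altitude]); b.append(u)
        A.append([0.0, -1 / altitude, y / altitude]); b.append(v)
    return np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)[0]
```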
Funding: Co-supported by the Key Research and Development Plan Project of Sichuan Province, China (No. 2022YFG0153)
Abstract: 6D pose estimation from a single RGB image is important for the safe take-off and landing of aircraft. Because of the large scene and depth range, existing pose estimation methods perform unsatisfactorily in accuracy. To achieve precise 6D pose estimation of aircraft, an end-to-end method using an RGB image is proposed. In this method, the 2D and 3D information of aircraft keypoints serves as intermediate supervision, from which the 6D pose of the aircraft is recovered. Specifically, an off-the-shelf object detector detects the Region of Interest (RoI) of the aircraft to eliminate background distractions. The 2D projections and 3D spatial coordinates of the pre-designed keypoints are predicted by a keypoint coordinate estimator (KpNet). The method is trained in an end-to-end fashion. In addition, to address the lack of related datasets, this paper builds an Aircraft 6D Pose dataset for training and testing, which captures the take-off and landing of three types of aircraft from 11 views. Compared with the latest Wide-Depth-Range method on this dataset, the proposed method improves the average 3D distance of model points (ADD) metric and the 5°, 5 m metric by 86.8% and 30.1%, respectively. Furthermore, it runs in 9.30 ms, 61.0% faster than YOLO6D at 23.86 ms.
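A keypoint-then-pose pipeline like this is commonly closed with a PnP solver; the sketch below assumes that closing step and uses OpenCV's solvePnP (the paper's own solver may differ), with kp_3d, kp_2d and K as illustrative names.

```python
# Minimal sketch (assumed closing step, not the paper's code): recover a 6D
# pose from predicted 2D keypoints and their known 3D model coordinates.
import numpy as np
import cv2

def pose_from_keypoints(kp_3d, kp_2d, K):
    """kp_3d: Nx3 model points; kp_2d: Nx2 predicted pixels; K: 3x3 intrinsics."""
    ok, rvec, tvec = cv2.solvePnP(
        kp_3d.astype(np.float64), kp_2d.astype(np.float64),
        K.astype(np.float64), None, flags=cv2.SOLVEPNP_EPNP)
    assert ok, "PnP failed; need >= 4 non-degenerate keypoints"
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec.reshape(3)
```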
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2018YFB1305300); the China Postdoctoral Science Foundation (Grant Nos. 2020TQ0039 and 2021M700425); and the National Natural Science Foundation of China (Grant Nos. 61733001, 62103054, U2013602, 61873039, U1913211 and U1713215)
Abstract: In unstructured environments such as disaster sites and mine tunnels, it is challenging for robots to estimate object poses under complex lighting, which limits their operation. Owing to the shadows produced by a point light source, the brightness of the operation scene is severely unbalanced, and it is difficult to accurately extract object features. It is especially difficult to accurately label the poses of objects with weak corners and textures. This study proposes an automatic pose annotation method for such objects, which combines 3D-2D matching projection with rendering technology to improve the efficiency of dataset annotation. A 6D object pose estimation method for low-light conditions (LP_TGC) is then proposed, including (1) a light preprocessing neural network based on a low-light preprocessing module (LPM) that balances image brightness and improves image quality; and (2) a keypoint-matching-based 6D pose estimation model (TGC). Four typical datasets are constructed to verify the method, and the experimental results demonstrate its effectiveness. Running the estimation model on preprocessed images accurately estimates object poses in the mentioned unstructured environments and improves accuracy by an average of ~3% on the ADD metric.
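Since the gains here and in the aircraft result above are reported on the ADD metric, a minimal sketch of that metric follows; a pose is commonly counted correct when the ADD falls below 10% of the model diameter (that threshold is the usual convention, not taken from the paper).

```python
# Minimal sketch: ADD (average 3D distance of model points) between a
# ground-truth and a predicted 6D pose, evaluated on the object's model points.
import numpy as np

def add_metric(R_gt, t_gt, R_pred, t_pred, model_pts):
    """Mean distance of model points transformed by the two poses."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    return np.linalg.norm(gt - pred, axis=1).mean()

def pose_correct(R_gt, t_gt, R_pred, t_pred, model_pts, diameter):
    """Common acceptance rule: ADD below 10% of the model diameter."""
    return add_metric(R_gt, t_gt, R_pred, t_pred, model_pts) < 0.1 * diameter
```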
Funding: The National Key Research and Development Program of China under Grant No. 2021YFB1715900; the National Natural Science Foundation of China under Grant Nos. 12022117 and 61802406; the Beijing Natural Science Foundation under Grant No. Z190004; the Beijing Advanced Discipline Fund under Grant No. 115200S001; and Alibaba Group through the Alibaba Innovative Research Program
Abstract: We propose a feature-fusion network for pose estimation directly from RGB images without any depth information. First, we introduce a two-stream architecture consisting of segmentation and regression streams. The segmentation stream processes the spatial embedding features and obtains the corresponding image crop; these features are then coupled with the crop in the fusion network. Second, we use an efficient perspective-n-point (EPnP) algorithm in the regression stream to extract robust spatial relations between 3D and 2D keypoints. Finally, we perform iterative refinement in an end-to-end manner to improve estimation performance. We conduct experiments on two public datasets, YCB-Video and the challenging Occluded-LineMOD. The results show that our method outperforms state-of-the-art approaches in both speed and accuracy.
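As an illustration of the two-stream fusion idea, the sketch below fuses a segmentation stream's spatial embedding with features re-encoded from the image crop by channel concatenation; layer sizes and module names are assumed for illustration and are not the paper's architecture.

```python
# Minimal sketch (illustrative layer sizes, not the paper's architecture):
# fuse the segmentation stream's spatial embedding with features from the
# image crop via channel concatenation followed by a 1x1 convolution.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.seg_stream = nn.Sequential(   # produces spatial embedding features
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.crop_enc = nn.Sequential(     # encodes the image crop
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, image, crop):
        # Assumes the crop has been resized to the image's spatial resolution.
        f = torch.cat([self.seg_stream(image), self.crop_enc(crop)], dim=1)
        return self.fuse(f)
```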
Abstract: To address the difficulty of fully exploiting both color and depth information when estimating a six-degree-of-freedom (6-DOF) object pose from a single RGB-D image, a deep learning framework built from multiple networks (a pyramid pooling network and PointNet++ combined with a feature-fusion network) is proposed. The method estimates the 6-DOF poses of a set of known objects in highly cluttered scenes. First, semantic segmentation is performed on the RGB image; the mask of each known object is applied to the depth map, and the color and depth images are segmented along the mask's bounding box. Next, keypoints are selected from the resulting point cloud with the FPS algorithm and mapped back into the color and depth images for feature extraction. Treating the color and depth channels of the RGB-D image as heterogeneous data, and considering that keypoints must fuse local and global information, the pyramid scene parsing network (PSPNet) and PointNet++ are used to extract the color and geometric features, respectively. A novel keypoint feature-fusion scheme deeply fuses the extracted local and global color and geometric features and embeds them into the selected keypoints. A multilayer perceptron (MLP) then outputs a 6-DOF pose and a confidence for every pixel, and the per-pixel confidences let the network select the best estimate autonomously. Finally, an end-to-end iterative pose-refinement network further improves the accuracy of the 6-DOF estimate. The network is tested on the public LineMOD and YCB-Video datasets. The experimental results show that the predicted 6-DOF poses are more accurate than those of existing methods of the same type: under the same evaluation criteria, the average accuracy reaches 97.2% and 95.1%, improvements of 2.9% and 3.9%, respectively. The network also meets real-time requirements, needing only 0.06 s to predict the 6-DOF pose for each frame.
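The keypoint selection step above names the FPS (farthest point sampling) algorithm; a minimal numpy sketch of it follows, an illustrative version rather than the paper's implementation.

```python
# Minimal sketch: farthest point sampling (FPS) for choosing k keypoints from
# a point cloud, greedily maximizing distance to the points already chosen.
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """points: Nx3 array; returns k points spread out over the cloud."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]   # random initial point
    dist = np.full(len(points), np.inf)         # min distance to chosen set
    for _ in range(k - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(dist.argmax()))       # farthest from all chosen so far
    return points[np.array(chosen)]
```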