Funding: supported by the Hunan Provincial Natural Science Foundation for Excellent Young Scholars (Grant No. 2023JJ20045) and the National Natural Science Foundation of China (Grant No. 12372189).
Abstract: Photomechanics is a crucial branch of solid mechanics. The localization of point targets is a fundamental problem in optical experimental mechanics, with extensive applications in unmanned aerial vehicle missions. Localizing moving targets is essential for analyzing their motion characteristics and dynamic properties. Reconstructing point trajectories from asynchronous cameras is a significant challenge: it comprises two coupled sub-problems, trajectory reconstruction and camera synchronization, and existing methods typically address only one of them. This paper proposes a 3D trajectory reconstruction method for point targets observed by asynchronous cameras that solves both sub-problems simultaneously. First, we extend the trajectory intersection method to asynchronous cameras, removing the requirement of traditional triangulation that the cameras be synchronized. Second, we build models of the camera timing information and the target motion based on the imaging mechanism and the target's dynamic characteristics; their parameters are optimized jointly, so trajectories can be reconstructed without accurate time parameters. Third, we optimize the camera rotations together with the camera timing and target motion parameters, exploiting the tighter and more continuous constraints provided by moving points; this significantly improves reconstruction accuracy, especially when the camera rotations are inaccurate. Finally, simulated and real-world experiments demonstrate the feasibility and accuracy of the proposed method: in the real-world test, the algorithm achieved a localization error of 112.95 m at observation distances of 15-20 km.
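As a concrete illustration of the kind of joint estimation the abstract describes, the minimal sketch below fits a polynomial trajectory and per-camera time offsets to asynchronous 2D observations by least squares; the cubic motion model, the fixed reference offset, and all names are assumptions for this example, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation): jointly estimating per-camera
# time offsets and a polynomial 3D trajectory from asynchronous 2D observations by
# minimizing reprojection error. The cubic motion model and all names are assumptions.
import numpy as np
from scipy.optimize import least_squares

def trajectory(coeffs, t):
    """Cubic polynomial motion model: returns Nx3 positions for times t."""
    C = coeffs.reshape(4, 3)                    # rows: constant, t, t^2, t^3
    T = np.stack([np.ones_like(t), t, t**2, t**3], axis=1)
    return T @ C

def project(P, X):
    """Pinhole projection with a 3x4 camera matrix P."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

def residuals(params, cams):
    """params = [12 trajectory coeffs | one time offset per camera (first fixed to 0)]."""
    coeffs, offsets = params[:12], np.concatenate([[0.0], params[12:]])
    res = []
    for k, (P, t_frame, uv) in enumerate(cams):    # uv: observed 2D points per frame
        X = trajectory(coeffs, t_frame + offsets[k])
        res.append((project(P, X) - uv).ravel())
    return np.concatenate(res)

# cams = [(P_k, frame_times_k, observations_k), ...]  built from calibrated cameras
# x0 = np.zeros(12 + len(cams) - 1)
# sol = least_squares(residuals, x0, args=(cams,))   # coefficients and offsets jointly
```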
Funding: supported by the National Cancer Institute (NCI) of the National Institutes of Health (Grant No. R44CA250877); the Office of Research Infrastructure Programs (ORIP), Office of the Director, National Institutes of Health, and the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (Grant No. R44OD024879); the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health (Grant No. R43EB030979); the National Science Foundation (Grant Nos. 2036439 and 2238845); the Duke Coulter Translational Partnership Award; and the Fitzpatrick Institute at Duke University.
Abstract: We present the Fourier lightfield multiview stereoscope (FiLM-Scope), an imaging device that combines concepts from Fourier lightfield microscopy and multiview stereo imaging to capture high-resolution 3D videos over large fields of view. The FiLM-Scope optical hardware consists of a multi-camera array of 48 individual microcameras placed behind a high-throughput primary lens. This allows the FiLM-Scope to simultaneously capture 48 unique 12.8-megapixel images of a 28×37 mm field of view, from unique angular perspectives over a 21°×29° range, with lateral resolution down to 22 μm. We also describe a self-supervised algorithm to reconstruct 3D height maps from these images, which achieves height accuracy down to 11 μm. To showcase the utility of the system, we perform tool tracking over the surface of an ex vivo rat skull and visualize the 3D deformation of stretching human skin, with videos captured at up to 100 frames per second. The FiLM-Scope has the potential to improve 3D visualization in a range of microsurgical settings.
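For readers unfamiliar with self-supervised height-map reconstruction, the sketch below shows one generic form of photometric-consistency cost that such multiview methods often minimize; the linear height-to-parallax model and every name here are assumptions, not the FiLM-Scope algorithm.

```python
# Illustrative sketch (not the FiLM-Scope code): a self-supervised photometric
# consistency cost of the kind such multiview height-map methods typically use.
# The linear height-to-parallax model and all names are assumptions for this example.
import numpy as np
from scipy.ndimage import map_coordinates

def photometric_cost(height, ref_img, other_imgs, baselines, gain):
    """Warp each view toward the reference using the candidate height map and
    accumulate the photometric error; lower cost = more consistent height map."""
    H, W = ref_img.shape
    yy, xx = np.mgrid[0:H, 0:W].astype(float)
    cost = 0.0
    for img, (bx, by) in zip(other_imgs, baselines):
        # Parallax assumed proportional to height and to the view's angular baseline.
        u = xx + gain * bx * height
        v = yy + gain * by * height
        warped = map_coordinates(img, [v, u], order=1, mode='nearest')
        cost += np.mean((warped - ref_img) ** 2)
    return cost
```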
Funding: supported by the National Key Research and Development Program of China (2019YFB1707505) and the National Natural Science Foundation of China (Grant No. 52005436).
Abstract: The trend towards automation and intelligence in aircraft final assembly testing has created a new demand for autonomous perception of unknown cockpit operation scenes in robotic collaborative airborne-system testing. To address this demand, a robotic automated 3D reconstruction cell that can autonomously plan the robot end-camera's trajectory is developed for image acquisition and 3D modeling of the cockpit operation scene. A continuous viewpoint path planning algorithm is proposed that incorporates both 3D reconstruction quality and robot path quality into the optimization process. Smoothness metrics for viewpoint position paths and orientation paths are introduced together for the first time in 3D reconstruction. To ensure safe and effective movement, two spatial constraints, the Domain of View Admissible Position (DVAP) and the Domain of View Admissible Orientation (DVAO), are imposed to account for robot reachability and collision avoidance. Using a diffeomorphism, the orientation path is mapped into 3D, consistent with the position path, so that orientation and position paths can be optimized in a unified framework to maximize the gain in reconstruction quality and path smoothness within the DVAP and DVAO. The reconstruction cell is capable of automatic data acquisition and fine scene modeling using the generated robot C-space trajectory. Simulation and physical-scene experiments confirm the effectiveness of the proposed method in achieving high-precision 3D reconstruction while optimizing robot motion quality.
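To make the notion of a path smoothness metric concrete, the sketch below scores a discretized viewpoint position path by its summed squared second differences; this is a generic example under assumed names, not the paper's metric, and the same form can be applied to an orientation path once it is mapped into a 3D vector space.

```python
# Illustrative sketch (not the paper's metric): one common way to score the smoothness
# of a discretized viewpoint position path is the summed squared second difference,
# which penalizes sharp direction changes along the path. Names are assumptions.
import numpy as np

def path_smoothness_cost(points):
    """points: Nx3 array of viewpoint positions along the planned path."""
    second_diff = points[2:] - 2.0 * points[1:-1] + points[:-2]   # discrete curvature
    return float(np.sum(second_diff ** 2))

# Example: a straight, evenly spaced path has zero cost.
line = np.linspace([0, 0, 0], [1, 1, 1], 10)
assert path_smoothness_cost(line) < 1e-20
```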
Funding: co-supported by the National Natural Science Foundation of China (Nos. 12102284, 12172242, 12332017), the Shanxi Province Science Foundation for Youths, China (No. 20210302124262), and the Chunhui Project Foundation of the Education Department of China (No. 202200257).
Abstract: A novel single-color-camera trichromatic-mask 3D-PIV technique suitable for measuring complex flow fields in confined spaces is presented in this paper. By using a trichromatic mask to modulate the imaging optical path of a color camera, the RGB (Red, Green, and Blue) channels of the photosensitive chip record full-frame, full-resolution images of the tracer particles from three viewing angles. The MLOS-SMART particle reconstruction algorithm is used to obtain the three-dimensional particle distribution from the trichromatic-mask particle images. The impact of parameters such as the inter-hole spacing and hole diameter of the trichromatic mask on the quality of particle reconstruction is analyzed. Numerical simulations on synthetic three-dimensional flow fields of Gaussian vortex rings are used to examine the practicality of the technique for measuring three-dimensional transient velocity fields and the accuracy of the velocity measurements. The accuracy and feasibility of the technique are further illustrated with experimental measurements of a zero-net-mass-flux jet.
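As background on the reconstruction step, the sketch below shows one common form of the multiplicative SMART update used in tomographic particle reconstruction; the dense weight matrix, relaxation factor, and all names are assumptions, and practical codes use sparse line-of-sight weights with an MLOS initialization.

```python
# Illustrative sketch (not the paper's code) of a multiplicative SMART update:
# voxel intensities E are corrected by the ratio between measured pixel intensities I
# and the forward projection W @ E. The dense weight matrix W, the relaxation factor
# mu, and the weighted geometric-mean variant used here are assumptions.
import numpy as np

def smart_iteration(E, W, I, mu=0.5, eps=1e-12):
    """E: voxel intensities (Nvox,), W: pixel-voxel weights (Npix, Nvox), I: pixels (Npix,)."""
    proj = W @ E + eps                       # forward projection of the current volume
    ratio = (I + eps) / proj                 # per-pixel correction factors
    # Each voxel is multiplied by a weighted geometric mean of the corrections
    # from all pixels whose lines of sight intersect it.
    log_corr = (W * np.log(ratio)[:, None]).sum(axis=0) / (W.sum(axis=0) + eps)
    return E * np.exp(mu * log_corr)
```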
Funding: This work was supported by a Grant-in-Aid for Scientific Research (C) (No. 17500119).
Abstract: This paper describes a multiple-camera method to reconstruct the 3D shape of a human foot. From a foot database, an initial 3D model of the foot, represented by a cloud of points, is built. The shape parameters, which can characterize more than 92% of a foot, are defined using principal component analysis. Then, using active shape models, the initial 3D model is adapted to the real foot captured in multiple images by applying constraints on edge-point distance and color variance. We focus here on the experimental part, where we demonstrate the efficiency of the proposed method on a plastic foot model as well as on real human feet of various shapes. We propose and compare different ways of texturing the foot, which is needed for reconstruction. Based on the results of experiments on the plastic foot model and on human feet, we propose two improvements to the accuracy of the final 3D shape. The first is densification of the cloud of points used to represent the initial model and the foot database; the second concerns the projected patterns used to texture the foot. We conclude by showing the results obtained for a human foot, with an average computed shape error of only 1.06 mm.
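The shape parameterization can be illustrated with a minimal PCA sketch: a database of aligned foot point clouds is reduced to the modes that explain roughly 92% of the shape variance, and new feet are synthesized as the mean plus weighted modes. The names and the exact truncation rule below are assumptions, not the paper's code.

```python
# Illustrative sketch (not the paper's code): a PCA shape model of the kind described,
# built from a database of aligned foot point clouds (each flattened to one row) and
# truncated to the modes explaining ~92% of shape variance. Names are assumptions.
import numpy as np

def build_shape_model(shapes, variance_kept=0.92):
    """shapes: (n_samples, 3*n_points) matrix of aligned, flattened point clouds."""
    mean = shapes.mean(axis=0)
    U, S, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    var = S**2 / (len(shapes) - 1)
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), variance_kept)) + 1
    return mean, Vt[:k]                      # mean shape and k principal modes

def synthesize(mean, modes, b):
    """Reconstruct a foot shape from shape parameters b (one weight per mode)."""
    return (mean + b @ modes).reshape(-1, 3)
```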
Funding: National Natural Science Foundation of China (61732016).
Abstract: Three-dimensional (3D) modeling is an important topic in computer graphics and computer vision. In recent years, the introduction of consumer-grade depth cameras has led to profound advances in 3D modeling. Starting from the basic data structures, this survey reviews the latest developments in 3D modeling based on depth cameras, including research on camera tracking, 3D object and scene reconstruction, and high-quality texture reconstruction. We also discuss future work and possible solutions for 3D modeling based on depth cameras.
Funding: Supported by the Natural Science Foundation of China (69775022) and the State High-Technology Development Program of China (863 306ZT04 06 3).
Abstract: Instead of the traditional 3D physical model carrying many control points, a calibration plate with a printed chess grid, movable along its normal direction, is used to provide a large area of 3D control points with variable Z values. Experiments show that the presented approach is effective for reconstructing 3D color objects in a computer vision system.
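The idea can be illustrated with a short sketch, assuming a fixed camera, a known board geometry, and known plate offsets: corners detected at each plate position supply 3D control points with variable Z, and a standard DLT then estimates the camera's projection matrix. This is a generic example with assumed names, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation): with the chessboard plate imaged
# at several known offsets along its normal, the corner grid yields 3D control points
# with variable Z; a standard DLT then estimates the 3x4 projection matrix of the fixed
# camera. Board size, square size, and function names are assumptions for this example.
import numpy as np
import cv2

def collect_control_points(images_by_z, board=(9, 6), square=20.0):
    """images_by_z: list of (gray_image, z_offset) pairs; camera is fixed, plate moves."""
    grid = np.zeros((board[0] * board[1], 3), np.float64)
    grid[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    X, x = [], []
    for gray, z in images_by_z:
        ok, corners = cv2.findChessboardCorners(gray, board)
        if ok:
            pts = grid.copy()
            pts[:, 2] = z                    # plate position along its normal
            X.append(pts)
            x.append(corners.reshape(-1, 2))
    return np.vstack(X), np.vstack(x)

def dlt_projection_matrix(X, x):
    """Linear estimate of P (3x4) from 3D points X and their 2D images x."""
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        p = [Xw, Yw, Zw, 1.0]
        A.append([*p, 0, 0, 0, 0, *(-u * np.array(p))])
        A.append([0, 0, 0, 0, *p, *(-v * np.array(p))])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)
```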
Abstract: This work explores an alternative 3D geometry measurement method for guiding navigation and proximity operations around non-cooperative spacecraft. From a single snapshot of an unfocused light-field camera, the 3D point cloud of a non-cooperative spacecraft can be calculated from sub-aperture images with an epipolar plane image (EPI) based light-field rendering algorithm. A Chang'e-3 model (7.2 cm × 5.6 cm × 7.0 cm) is used to validate the proposed technique. Three measurement distances (1.0 m, 1.2 m, 1.5 m) are considered to simulate different approach stages. Measurement errors are quantified by comparing the light-field camera data with a high-precision commercial laser scanner. The mean error distances for the three cases are 0.837 mm, 0.743 mm, and 0.973 mm, respectively, indicating that the method can reconstruct the 3D geometry of a non-cooperative spacecraft as a densely distributed point cloud and is thus promising for space-related missions.
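To illustrate the EPI principle, the sketch below estimates the slope of EPI lines with a structure tensor and converts the resulting disparity to depth; the orientation formula, parameter names, and sign conventions are assumptions made for this example, not the paper's algorithm.

```python
# Illustrative sketch (not the paper's algorithm): in an epipolar plane image (EPI),
# a scene point traces a straight line whose slope gives its disparity across the
# sub-aperture views; depth then follows from the usual triangulation relation
# depth = focal_length * baseline / disparity. The structure-tensor slope estimate
# and all parameter names here are assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def epi_disparity(epi, sigma=1.5):
    """epi: 2D array (n_views x n_pixels); returns per-pixel disparity along one row."""
    gu = sobel(epi, axis=1)                  # gradient across the image coordinate
    gs = sobel(epi, axis=0)                  # gradient across the view coordinate
    Juu = gaussian_filter(gu * gu, sigma)
    Jss = gaussian_filter(gs * gs, sigma)
    Jus = gaussian_filter(gu * gs, sigma)
    # Orientation of the local EPI line from the 2x2 structure tensor.
    slope = np.tan(0.5 * np.arctan2(2 * Jus, Juu - Jss))
    return slope[epi.shape[0] // 2]          # disparity estimate on the central view

def depth_from_disparity(disparity, focal_px, baseline_m, eps=1e-9):
    return focal_px * baseline_m / (disparity + eps)
```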
Abstract: In this paper, we present a new technique for 3D face reconstruction from a sequence of images taken with cameras having varying parameters, without the need for a calibration grid. The method is based on estimating the projection matrices of the cameras from a symmetry property that characterizes the face; these projection matrices are used together with point matches in each pair of images to determine the 3D point cloud. A 3D mesh of the face is then constructed with the 3D Crust algorithm, and finally the 2D image is projected onto the 3D model to generate the texture map. The strength of the proposed approach is that it minimizes the constraints of the calibration system: the cameras are calibrated from a symmetry property of the face, which lets us fix some 3D face points in a well-chosen global reference frame and formulate a system of linear and nonlinear equations relating these 3D points, their projections in the image plane, and the elements of the projection matrix. To solve these equations we use a genetic algorithm, which searches for the global optimum without requiring an initial estimate and avoids the local minima of the formulated cost function. The study is conducted on real data to demonstrate the validity and performance of the proposed approach in terms of robustness, simplicity, stability, and convergence.
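The triangulation step mentioned above can be illustrated with the standard linear (DLT) method: given the two projection matrices and a pair of matched points, the 3D point follows from a small SVD. The sketch below is a generic example under assumed names, not the paper's code.

```python
# Illustrative sketch (not the paper's code): once the projection matrices P1, P2 are
# known, each pair of matched image points can be triangulated into a 3D point by the
# standard linear (DLT) method. Function and variable names are assumptions here.
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """P1, P2: 3x4 projection matrices; x1, x2: matched 2D points (u, v)."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                      # homogeneous -> Euclidean 3D point

def triangulate_cloud(P1, P2, matches1, matches2):
    return np.array([triangulate_point(P1, P2, a, b)
                     for a, b in zip(matches1, matches2)])
```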
Abstract: Precise and robust three-dimensional object detection (3DOD) presents a promising opportunity in the field of mobile robot (MR) navigation. Monocular 3DOD techniques typically extend existing two-dimensional object detection (2DOD) frameworks to predict the three-dimensional bounding boxes (3DBBs) of objects captured in 2D RGB images. However, these methods often require multiple images, making them less feasible for many real-time scenarios. To address these challenges, the emergence of agile convolutional neural networks (CNNs) capable of inferring depth from a single image opens a new avenue for investigation. This paper proposes a novel network, ELDENet, designed to produce cost-effective 3D bounding box estimation (3D-BBE) from a single image. The framework comprises PP-LCNet as the encoder and a fast convolutional decoder, and integrates a Squeeze-Exploit (SE) module using the Math Kernel Library for Deep Neural Networks (MKLDNN) optimizer to improve convolutional efficiency and keep the model compact during training. The proposed multi-scale sub-pixel decoder generates high-quality depth maps while maintaining a compact structure, and the generated depth maps provide a clear view of object distances in the environment. These depth estimates are combined with 2DOD to obtain precise 3D bounding boxes, facilitating scene understanding and optimal route planning for mobile robots. Based on the estimated object center of the 3DBB, a Deep Reinforcement Learning (DRL)-based obstacle avoidance strategy for MRs is developed. Experimental results demonstrate that the model achieves state-of-the-art performance across three datasets: NYU-V2, KITTI, and Cityscapes. Overall, this framework shows significant potential for adaptation in intelligent mechatronic systems, particularly for knowledge-driven mobile robot navigation.
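One simple way to combine a depth map with a 2D detection, in the spirit described above, is to back-project the box center through the camera intrinsics at a robust depth taken inside the box; the sketch below is an illustrative example with assumed names and a median-depth heuristic, not the ELDENet pipeline.

```python
# Illustrative sketch (not the ELDENet code): combining a 2D detection with a predicted
# depth map to estimate a 3D box center by back-projecting the box center pixel through
# the camera intrinsics. The median-depth heuristic and all names are assumptions.
import numpy as np

def lift_box_to_3d(box2d, depth_map, K):
    """box2d: (x1, y1, x2, y2) in pixels; depth_map: HxW metres; K: 3x3 intrinsics."""
    x1, y1, x2, y2 = [int(round(v)) for v in box2d]
    patch = depth_map[y1:y2, x1:x2]
    z = float(np.median(patch))              # robust depth for the detected object
    u, v = (x1 + x2) / 2.0, (y1 + y2) / 2.0  # box center pixel
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    X = (u - cx) * z / fx                    # pinhole back-projection
    Y = (v - cy) * z / fy
    return np.array([X, Y, z])
```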
Abstract: To address spacecraft relative navigation, the surface of a space station is treated as a "special terrain" and a relative navigation algorithm based on surface inspection of a large spacecraft is proposed. First, a TOF (Time of Flight) camera on the inspection vehicle measures local point clouds of the space station surface; these point clouds serve as the real-time map, while prior point clouds of the station surface serve as the reference map. Then, exploiting the one-to-one correspondence between 3D Zernike moments and 3D terrain, the 3D terrain matching problem is converted into matching feature vectors based on 3D Zernike moments. On this basis, the relative position and attitude between the real-time map and the matched reference map are solved, which determines the relative navigation parameters between the two spacecraft; experiments analyze the main factors affecting matching accuracy and speed. Finally, these relative navigation parameters are fused with those propagated by the inertial system in an extended Kalman filter framework to estimate the relative position and attitude between the inspection vehicle and the space station. Experimental results show that the relative position accuracy is better than 0.002 m and the relative attitude accuracy is better than 0.1°.
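The fusion step can be illustrated with a generic extended Kalman filter measurement update, in which the pose obtained from terrain matching corrects the inertially propagated state; the state layout, linear measurement model, and names in the sketch below are assumptions, not the paper's filter.

```python
# Illustrative sketch (not the paper's filter): a generic extended Kalman filter
# measurement update in which the pose obtained from terrain matching corrects the
# relative state propagated from the inertial system. The state layout, linear
# measurement model H, and all names are assumptions made for this example.
import numpy as np

def ekf_update(x_pred, P_pred, z_meas, H, R):
    """x_pred, P_pred: inertially propagated state and covariance;
    z_meas: relative position/attitude from 3D Zernike terrain matching."""
    y = z_meas - H @ x_pred                  # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new
```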
Funding: supported by the National Key Research and Development Program of China (No. 2017YFB1300102) and the National Natural Science Foundation of China (No. 61803025).
Abstract: Three-dimensional (3D) visual tracking of a multicopter, where the camera is fixed while the multicopter moves, means continuously recovering the six-degree-of-freedom pose of the multicopter relative to the camera. It can be used in many applications, such as precision terminal guidance and control-algorithm validation for multicopters. However, it is difficult for many researchers to build a 3D visual tracking system for multicopters (VTSM) using cheap, off-the-shelf cameras. This paper first gives an overview of the three key technologies of a 3D VTSM: multi-camera placement, multi-camera calibration, and pose estimation for multicopters. Then, some representative 3D visual tracking systems for multicopters are introduced. Finally, the future development of 3D VTSMs is analyzed and summarized.
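The pose-estimation technology named above is commonly posed as a Perspective-n-Point problem; the sketch below recovers a multicopter's 6-DoF pose from known body-frame marker positions and their detections in one calibrated camera. The marker layout and names are assumptions, not a specific VTSM implementation.

```python
# Illustrative sketch (not a specific VTSM implementation): the pose-estimation step
# can be posed as a Perspective-n-Point problem, recovering the multicopter's 6-DoF
# pose from known 3D marker positions on its body and their detected 2D image
# locations in one calibrated camera. Marker layout and names are assumptions here.
import numpy as np
import cv2

def multicopter_pose(markers_body, markers_img, K, dist):
    """markers_body: Nx3 marker coordinates in the body frame; markers_img: Nx2 pixels."""
    ok, rvec, tvec = cv2.solvePnP(
        markers_body.astype(np.float64),
        markers_img.astype(np.float64),
        K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)               # rotation body -> camera frame
    return R, tvec                           # 6-DoF pose relative to the camera
```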
Funding: supported by the Program of Entrepreneurship and Innovation Ph.D. in Jiangsu Province (JSSCBS20211175), the School Ph.D. Talent Funding (Z301B2055), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520002).
Abstract: 3D human pose estimation is a major focus area in computer vision and plays an important role in practical applications. This article summarizes the frameworks and research progress related to estimation from monocular RGB images and videos, giving an overall perspective on methods that integrate deep learning. Image-based and video-based inputs are used as the analysis framework, and common problems are discussed from this viewpoint. The diversity of human postures often leads to problems such as occlusion and ambiguity, and the lack of training datasets often results in poor generalization ability; regression methods are crucial for addressing such problems. For image-based input, multi-view methods are commonly used to resolve occlusion, and they are analyzed comprehensively here. For video-based input, prior knowledge of restricted human motion is used to predict postures, and structural constraints are widely used as priors. Weakly supervised learning methods are also studied and discussed for both types of input as a way to improve model generalization. The problem of insufficient training data must be considered as well, especially because 3D datasets are usually biased and limited. Finally, emerging and popular datasets and evaluation indicators are discussed, with the characteristics of the datasets and the relationships among the indicators explained and highlighted. This article is therefore useful and instructive for researchers who lack experience and find the field confusing; by providing an overview of 3D human pose estimation, it sorts and refines recent studies, describes the core problems and commonly useful methods, and discusses the scope for further research.
Funding: National Natural Science Foundation of China (No. 41701534); Open Fund of the State Key Laboratory of Coal Resources and Safe Mining (No. SKLCRSM19KFA01); Ecological and Smart Mine Joint Foundation of Hebei Province (No. E2020402086); State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (No. SKLGP2019K015).
Abstract: Structure-from-Motion (SfM) techniques have been widely used for 3D geometry reconstruction from multi-view images. Nevertheless, the efficiency and quality of the reconstructed geometry depend on multiple factors, such as the base-height ratio, intersection angle, overlap, and ground control points, which are rarely quantified in real-world applications. To answer this question, this paper takes a data-driven approach, analyzing hundreds of terrestrial stereo image configurations through a typical SfM algorithm. Two main meta-parameters, the base-height ratio and the intersection angle, are analyzed. Following the results, we propose a Skeletal Camera Network (SCN) and embed it into SfM, leading to a novel scheme called SCN-SfM that limits tie-point matching to the image pairs that remain connected in the SCN. The proposed method was applied to three terrestrial datasets. Experimental results demonstrate that SCN-SfM achieves 3D geometry with higher accuracy and faster runtime than the typical SfM method, while the completeness of the geometry is comparable.
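The two meta-parameters analyzed in the paper can be computed for any stereo pair from the camera centers and a representative scene point, as in the sketch below; treating the mean viewing distance as the "height" and the names used here are assumptions for this example, not the paper's code.

```python
# Illustrative sketch (not the paper's code): computing the two meta-parameters the
# paper analyzes for a stereo pair -- the base-height ratio and the intersection
# angle -- from the two camera centers and a representative scene point. Names and
# the use of the mean viewing distance as "height" are assumptions for this example.
import numpy as np

def stereo_meta_parameters(c1, c2, point):
    """c1, c2: camera centers (3,); point: a triangulated scene point (3,)."""
    baseline = np.linalg.norm(c2 - c1)
    r1, r2 = point - c1, point - c2           # viewing rays toward the scene point
    height = 0.5 * (np.linalg.norm(r1) + np.linalg.norm(r2))
    cos_a = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return baseline / height, angle           # base-height ratio, intersection angle (deg)
```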