Real-time hand gesture recognition significantly improves the user experience in virtual reality/augmented reality (VR/AR) applications, and it relies on identifying the orientation of the hand in captured images or videos. A new three-stage pipeline is proposed for fast and accurate hand segmentation from a single depth image. First, a depth frame is segmented into several regions by a histogram-based threshold selection algorithm and by tracing the exterior boundaries of objects after thresholding. Second, each segmentation proposal is evaluated by a shallow three-layer convolutional neural network (CNN) to determine whether the boundary is associated with the hand. Finally, all hand components are merged to form the hand segmentation result. Compared with algorithms based on random decision forests (RDF), the experimental results demonstrate that the approach achieves better performance, with high accuracy (88.34% mean intersection over union, mIoU) and a shorter processing time (≤8 ms).
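As an illustration of the accuracy metric quoted above, the following is a minimal sketch of intersection over union for binary hand masks, averaged over a set of images. The function names `iou` and `mean_iou` are hypothetical, and averaging per-image scores is an assumption; the abstract does not state how its mIoU is aggregated.

```python
def iou(pred, truth):
    """IoU of two binary masks given as equally sized lists of rows of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1  # pixel is hand in both masks
            if p or t:
                union += 1  # pixel is hand in at least one mask
    # Assumed convention: two empty masks agree perfectly, so IoU is 1.0.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (prediction, ground truth) mask pairs.

    Assumption: mIoU is taken as the mean of per-image scores.
    """
    scores = [iou(pred, truth) for pred, truth in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    truth = [[1, 0], [0, 0]]
    print(mean_iou([(pred, truth), (truth, truth)]))
```

A predicted mask that covers the true hand pixel plus one extra pixel scores 0.5 here, which shows that the metric penalises over-segmentation as well as missed pixels.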
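To address the insufficient real-time performance of deep-learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems operating in dynamic scenes, and the pose estimation bias caused by unintended camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm detects moving objects by combining the lightweight YOLO-fastest network with background subtraction, builds a cross-domain mask segmentation mechanism by combining the depth map with depth-threshold segmentation, and introduces a geometric correction strategy for camera motion that compensates for coordinate errors in the detection boxes, so that moving objects are segmented while processing speed improves. To make better use of feature points, pyramidal optical flow is used to track and update dynamic feature points continuously across frames, while ensuring that only static feature points take part in pose estimation. In a systematic evaluation on the TUM dataset, the algorithm reduces the absolute pose error by 97.1% on average relative to ORB-SLAM3. Compared with DynaSLAM and DS-SLAM, dynamic SLAM algorithms that use deep-learning segmentation networks, its per-frame tracking time is substantially lower, achieving a better balance between accuracy and efficiency.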
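The idea of turning a detection box into a pixel mask with a depth threshold can be sketched as follows. This is not the authors' method: the function `box_depth_mask`, the use of the median depth inside the box as the reference, the tolerance `tol`, and the treatment of zero as "no depth reading" are all assumptions made for illustration.

```python
from statistics import median


def box_depth_mask(depth, box, tol):
    """Mark pixels inside a detection box whose depth is close to the box median.

    depth: list of rows of depth values (0 is assumed to mean "no reading").
    box:   (x0, y0, x1, y1), half-open: columns x0..x1-1, rows y0..y1-1.
    tol:   assumed depth tolerance around the reference depth.
    """
    x0, y0, x1, y1 = box
    mask = [[0] * len(row) for row in depth]
    # Valid depth samples inside the box.
    inside = [depth[y][x]
              for y in range(y0, y1)
              for x in range(x0, x1)
              if depth[y][x] > 0]
    if not inside:
        return mask  # nothing to segment without valid depth
    ref = median(inside)  # assumed reference depth of the detected object
    for y in range(y0, y1):
        for x in range(x0, x1):
            d = depth[y][x]
            if d > 0 and abs(d - ref) <= tol:
                mask[y][x] = 1
    return mask


if __name__ == "__main__":
    depth = [
        [5.0, 5.0, 5.0, 5.0],
        [5.0, 1.0, 1.1, 5.0],
        [5.0, 1.0, 0.0, 5.0],
        [5.0, 5.0, 5.0, 5.0],
    ]
    for row in box_depth_mask(depth, (1, 1, 4, 3), 0.3):
        print(row)
```

Feature points that fall on masked pixels would then be treated as dynamic and excluded from pose estimation, which is the role the abstract assigns to the mask.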
Holoscopic 3D imaging is a true 3D imaging system that mimics the fly's-eye technique to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required, one that provides an individual feature detector for integrating 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm. This is distinctive for each Feature-Edge (FE) block, i.e. the large contrast changes (gradients) in an FE are easier to localise. The novelty of this work lies in generating a noise-free 3D-Map (3DM) according to a correlation analysis of region contours. This automatically combines the available depth estimation technique with an edge-based feature shape recognition technique. The application area consists of two varied domains, which demonstrate the efficiency and robustness of the approach: a) extracting a set of setting feature-edges for both the tracking and mapping processes in 3D depth-map estimation, and b) separating and recognising focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.