Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672375 and 61170118).
Abstract: Stacked hourglass networks have recently shown outstanding performance in human pose estimation. However, the repeated bottom-up and top-down strided convolution operations in deep convolutional neural networks significantly reduce the initial image resolution. To address this problem, we propose incorporating an affinage module and a residual attention module into the stacked hourglass network for human pose estimation. This paper introduces a novel architecture that replaces the up-sampling operation of the stacked hourglass network in order to obtain high-resolution features. We refer to this architecture as an affinage module, which is critical to improving the performance of the stacked hourglass network. We also propose a novel residual attention module to strengthen supervision of the up-sampling process. The effectiveness of the introduced modules is evaluated on standard benchmarks. Experimental results demonstrate that our method achieves more accurate and more robust human pose estimation in images with complex backgrounds.
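To make the resolution loss concrete, the sketch below applies the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, to a chain of strided convolutions. The 64-pixel input, 3x3 kernel, padding of 1, and stride of 2 are illustrative assumptions and are not taken from the paper.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution, using the standard floor formula."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample_chain(size, steps, kernel=3, stride=2, padding=1):
    """Feature-map sizes before and after each of `steps` strided convolutions."""
    sizes = [size]
    for _ in range(steps):
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Assumed example: a 64x64 feature map passed through four stride-2 convolutions.
print(downsample_chain(64, 4))  # [64, 32, 16, 8, 4]
```

Each stride-2 step halves the side length, so four steps leave only a 4x4 map. That is the loss of detail the up-sampling path then has to recover.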
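Abstract: To address the inaccuracy of 3D hand pose estimation from a single RGB image caused by occlusion and self-similarity, a 3D hand pose estimation algorithm combining an attention mechanism with multi-scale feature fusion is proposed. First, a Sensory Enhancement Module (SEM) that combines dilated convolution with the CBAM (Convolutional Block Attention Module) attention mechanism is proposed to replace the basic block (Basicblock) in the Hourglass Network (HGNet); it enlarges the receptive field while increasing sensitivity to spatial information, thereby improving the extraction of hand features. Second, a multi-scale information fusion module improved with SPCNet (Spatial Preserve and Content-aware Network) and Soft-Attention, named SS-MIFM (SPCNet and Soft-attention-Multi-scale Information Fusion Module), is designed; it effectively aggregates multi-level features while fully accounting for the spatial content-awareness mechanism, and significantly improves the accuracy of 2D hand keypoint detection. Finally, a 2.5D pose conversion module converts the 2D pose into a 3D pose, avoiding the loss of spatial information that occurs when the 3D pose is regressed directly from 2D keypoint coordinates. Experimental results show that on the InterHand2.6M dataset, the proposed algorithm achieves a two-hand Mean Per Joint Position Error (MPJPE), a single-hand MPJPE, and a Mean Relative-Root Position Error (MRRPE) of 12.32, 9.96, and 29.57 mm, respectively; on RHD (Rendered Hand pose Dataset), its End-Point Error (EPE) is 2.68 and 0.38 mm lower than that of InterNet and QMGR-Net, respectively. These results indicate that the proposed algorithm estimates hand pose more accurately and is more robust in scenes with two-hand interaction and occlusion.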
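For readers unfamiliar with the MPJPE metric reported above, the following is a minimal sketch of its usual definition: the Euclidean distance between each predicted and ground-truth 3D joint, averaged over joints. The function name and the toy coordinates are hypothetical, and benchmark-specific steps such as root alignment are assumed to have been applied beforehand.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance between matching 3D joints (same unit as input)."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("joint lists must be non-empty and of equal length")
    total = 0.0
    for (px, py, pz), (gx, gy, gz) in zip(predicted, ground_truth):
        total += math.sqrt((px - gx) ** 2 + (py - gy) ** 2 + (pz - gz) ** 2)
    return total / len(predicted)


# Toy example with two joints, coordinates in millimetres (made-up values).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, truth))  # 2.5
```

The first joint matches exactly and the second is 5 mm off, so the mean over the two joints is 2.5 mm.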
Abstract: The detection of Oracle Bone Inscriptions (OBIs), which aims to locate OBIs on rubbing images, is one of the most fundamental tasks in the study of oracle bones. Existing methods rely on anchor boxes, which involve complex network design and a large number of anchor boxes. To overcome this problem, this paper proposes a simpler but more effective OBI detector based on an anchor-free scheme, in which shape-adaptive Gaussian kernels represent the spatial regions of different OBIs. More specifically, to address misdetection caused by regional overlap between tightly distributed OBIs, the character regions are simultaneously represented by multiscale Gaussian kernels to obtain regions with sharp edges. In addition, based on the kernel predictions at different scales, a novel post-processing pipeline is used to obtain accurate bounding-box predictions. Experiments show that our OBI detector achieves significant results on the OBIs dataset and greatly outperforms several mainstream object detectors in both speed and efficiency. The dataset is available at http://jgw.aynu.edu.cn.
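One way to picture a shape-adaptive Gaussian kernel is as a 2D Gaussian whose spread along each axis follows the width and height of the character's bounding box. The sketch below is an illustration under that assumption, not the paper's formulation: the function name, the choice of sigma as one quarter of the box extent, and the `shrink` factor standing in for the multiscale kernels are all hypothetical.

```python
import math


def gaussian_region(map_h, map_w, box, shrink=1.0):
    """Render one box (x0, y0, x1, y1) as an axis-aligned Gaussian heatmap."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Assumption: sigma is a quarter of the box extent, scaled by `shrink`.
    sigma_x = max((x1 - x0) * shrink / 4.0, 1e-6)
    sigma_y = max((y1 - y0) * shrink / 4.0, 1e-6)
    heat = []
    for y in range(map_h):
        row = []
        for x in range(map_w):
            dx = (x - cx) / sigma_x
            dy = (y - cy) / sigma_y
            row.append(math.exp(-0.5 * (dx * dx + dy * dy)))
        heat.append(row)
    return heat


# A wide box (8 x 4) rendered at two scales on a 12 x 12 map.
box = (2, 4, 10, 8)
full = gaussian_region(12, 12, box, shrink=1.0)
core = gaussian_region(12, 12, box, shrink=0.5)
print(full[6][6], full[6][2] > core[6][2])  # 1.0 True
```

The smaller kernel falls off faster toward the box border, which is the property that would let kernels of neighbouring, tightly packed characters stay separated.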
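Abstract: Objective: 6D pose estimation is an important problem in 3D object recognition and reconstruction. Many objects have smooth, textureless surfaces from which features are difficult to extract, which makes detection difficult. Many algorithms rely on post-processing to improve pose estimation accuracy, which reduces their speed. To address these problems, this paper proposes a heatmap-based 6D object pose estimation algorithm. Method: First, a segmentation mask is used to avoid the drop in keypoint prediction accuracy caused by heatmap contamination from occlusion. Second, the algorithm is built on an hourglass network architecture and needs no post-processing, which keeps it efficient. In the object detection stage, a segmentation network structure with the relatively fast YOLOv3 (you only look once v3) as its backbone predicts the mask segmentation map of the target object, reducing the influence of unrelated objects that occlude it. To improve mask accuracy, deconvolution layers are added to raise the resolution of the feature layers, which are then fused. Next, an hourglass network predicts the keypoints, avoiding the drop in keypoint detection accuracy that residual network modules suffer from the loss of local features. Finally, the pose is computed from the detected keypoints, and the 6D pose of the object is recovered with the PnP (perspective-n-point) algorithm. Result: Experiments are conducted on the challenging Linemod dataset. The results show that the proposed algorithm achieves a 3D error accuracy of 82.7%, an improvement of 10% over the heatmap method, and a 2D projection accuracy of 98.9%, 4% higher than mainstream algorithms, while reaching a detection speed of 15 frames/s. Conclusion: The proposed algorithm, based on masks and keypoint detection, not only effectively improves the accuracy of 6D pose estimation but also maintains an efficient detection speed.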
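The role of the segmentation mask can be illustrated with a small sketch: when reading a keypoint from its heatmap, only pixels inside the object mask are considered, so a strong response on an occluding object cannot win. How the paper actually combines mask and heatmap is not stated in the abstract; the function name and the toy values below are hypothetical.

```python
def masked_keypoint(heatmap, mask):
    """Return (x, y, score) of the strongest heatmap response inside the mask.

    Returns None if the mask is empty. Assumes both grids have the same shape.
    """
    best = None
    for y, (heat_row, mask_row) in enumerate(zip(heatmap, mask)):
        for x, (score, inside) in enumerate(zip(heat_row, mask_row)):
            if inside and (best is None or score > best[2]):
                best = (x, y, score)
    return best


# Toy 3x3 heatmap: the 0.9 peak lies on an occluder outside the object mask.
heatmap = [
    [0.1, 0.2, 0.9],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.1],
]
mask = [
    [0, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
]
print(masked_keypoint(heatmap, mask))  # (1, 1, 0.6)
```

The 2D keypoints obtained this way, paired with their known 3D model points, are what a PnP solver would take as input to recover the 6D pose.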