Abstract
Three-dimensional (3D) human pose estimation is an active research topic in computer vision. To address the lack of depth labels for depth images and the poor generalization of models trained on a limited variety of poses, this paper proposes a 3D human pose estimation method based on weakly-supervised learning from multi-source images. First, training on a fusion of multi-source images improves the generalization ability of the model. Second, a weakly-supervised learning approach compensates for the shortage of labels. Third, the design of the residual module is improved to raise pose estimation accuracy. Experimental results show that, compared with the original network, the improved network raises accuracy by 0.2% while reducing training time by about 28%, and that the proposed method achieves good estimation results on both depth images and color images.
Authors
蔡轶珩
王雪艳
胡绍斌
刘嘉琦
CAI Yiheng; WANG Xueyan; HU Shaobin; LIU Jiaqi (Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source
《北京航空航天大学学报》
EI
CAS
CSCD
PKU Core Journal
2019, Issue 12, pp. 2375-2384 (10 pages)
Journal of Beijing University of Aeronautics and Astronautics
Funding
National Key R&D Program of China (2017YFC1703302)
Science and Technology Project of Beijing Municipal Education Commission (KM201710005028)
Keywords
human pose estimation
hourglass networks
weakly-supervised learning
multi-source image
depth image