Abstract
Human pose estimation is one of the key supporting technologies of Industrial Manufacturing 5.0 and has been applied in scenarios such as action recognition, human-computer interaction, and digital twins. In complex industrial scenes, however, objects such as notice boards, pipes, and columns easily occlude workers partially or completely, which biases keypoint localization and lowers the accuracy of human pose estimation models. To address this problem, this article proposes a method for enhancing human pose estimation performance in complex industrial scenes. First, the method structurally models human keypoints with a quantized autoencoder (a VQ-VAE model), mapping joint features to a quantized latent space to improve estimation accuracy when the body is partially occluded. Second, because worker occlusion datasets are difficult to build, it introduces a dynamic data augmentation training method for human pose occlusion: during training, the model's estimates for each keypoint on the dataset are evaluated, and real occluding objects from industrial scenes are used to dynamically generate worker occlusion images that match industrial scene characteristics for the next round of training, further improving the model's robustness in human pose estimation. Experimental results show that, on a self-constructed dataset, the proposed method improves average precision (AP) and average recall (AR) by 3.8% and 2.7%, respectively, over the advanced method PCT, and that it copes effectively with worker occlusion in complex industrial scenes.
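The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no architectural or sampling details, so every function name here (`quantize`, `occlusion_probs`, `paste_occluder`, `augment`) is hypothetical. In particular, the sketch assumes that joints with larger estimation error are occluded more often via a softmax over per-joint errors; the abstract only states that occlusion images are generated from the evaluation of per-keypoint results.

```python
import numpy as np

def quantize(features, codebook):
    """Nearest-neighbour codebook lookup, the quantization step of a VQ-VAE.
    features: (K, D) joint features; codebook: (M, D) learned entries."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)           # index of the closest entry per joint
    return idx, codebook[idx]

def occlusion_probs(per_joint_error, temperature=1.0):
    """Assumed sampling rule: softmax over per-joint errors, so poorly
    estimated joints are chosen for occlusion more often."""
    z = np.asarray(per_joint_error, dtype=float) / temperature
    z = z - z.max()                   # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def paste_occluder(image, occluder, center_xy):
    """Paste an occluder patch centred on (x, y), clipped to the image."""
    out = image.copy()
    H, W = out.shape[:2]
    h, w = occluder.shape[:2]
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))
    x0, y0 = cx - w // 2, cy - h // 2
    xa, ya = max(x0, 0), max(y0, 0)
    xb, yb = min(x0 + w, W), min(y0 + h, H)
    if xa < xb and ya < yb:           # skip patches entirely outside the image
        out[ya:yb, xa:xb] = occluder[ya - y0:yb - y0, xa - x0:xb - x0]
    return out

def augment(image, keypoints, per_joint_error, occluders, rng):
    """Build one occluded training image for the next training round."""
    j = rng.choice(len(keypoints), p=occlusion_probs(per_joint_error))
    occ = occluders[rng.integers(len(occluders))]
    return paste_occluder(image, occ, keypoints[j]), j
```

In a training loop, `per_joint_error` would be recomputed after each evaluation pass and `occluders` would hold crops of real industrial objects such as pipes or notice boards, so that the generated occlusions follow the model's current weaknesses.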
Authors
李帆雅
张泽辉
陈博洋
徐晓滨
管聪
Li Fanya; Zhang Zehui; Chen Boyang; Xu Xiaobin; Guan Cong (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China; Ningxia Petrochemical Yinjun Safety Technology Consulting Co., Ltd., Yinchuan 750000, China; School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan 430070, China)
Source
《仪器仪表学报》
PKU Core Journal
2025, Issue 8, pp. 255-265 (11 pages)
Chinese Journal of Scientific Instrument
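Funding
Zhejiang Provincial Natural Science Foundation (LTGG24F030004)
Open Fund of the National Engineering Research Center for Water Transport Safety (A202403)
National Key Research and Development Program of China (2022YFE0210700)
National Natural Science Foundation of China (52401376)
Young Talent Support Program of the Zhejiang Association for Science and Technology (ZJSKXQT2015049)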
Keywords
human pose estimation
industrial occlusion
data augmentation
computer vision