Abstract
Head pose estimation has become an important research area in computer vision, with wide application in robotics, surveillance, and driver attention monitoring. One of its hardest challenges is handling the head occlusions that frequently occur in real-world scenarios. To address this, the paper proposes a head pose estimation method based on latent space regression. In the feature extraction stage, a Vision Transformer captures global information from the image, and a purpose-built feature enhancement module uses a pyramid structure to extract local feature information. To handle occlusion, latent space regression is introduced so that the latent features of an occluded image approximate those of the corresponding non-occluded image; the angle prediction for head pose is also improved, and a multi-loss function is designed. Experimental results show that on an occluded version of the AFLW2000 dataset, the method reduces the mean absolute error to 9.872, outperforming existing approaches and demonstrating its effectiveness in handling head occlusion.
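The abstract does not give the form of the multi-loss function, so the following is only a minimal sketch of how a pose term and a latent space regression term could be combined. The choice of mean squared error for the latent term, the `latent_weight` parameter, and all function names are assumptions made for illustration, not the author's implementation.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over the predicted angles (e.g. yaw, pitch, roll)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def latent_regression_loss(z_occluded: Sequence[float], z_clean: Sequence[float]) -> float:
    """Mean squared distance between the latent vector of an occluded image
    and the latent vector of its non-occluded counterpart (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(z_occluded, z_clean)) / len(z_occluded)


def combined_loss(
    pred_angles: Sequence[float],
    true_angles: Sequence[float],
    z_occluded: Sequence[float],
    z_clean: Sequence[float],
    latent_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Pose error plus a weighted latent-matching term."""
    pose_term = mean_absolute_error(pred_angles, true_angles)
    latent_term = latent_regression_loss(z_occluded, z_clean)
    return pose_term + latent_weight * latent_term


# Toy values, chosen only to make the arithmetic easy to follow.
pose_error = mean_absolute_error([10.0, -5.0, 2.0], [12.0, -4.0, 2.0])
latent_error = latent_regression_loss([0.5, 1.0], [0.0, 1.0])
total = combined_loss(
    [10.0, -5.0, 2.0], [12.0, -4.0, 2.0], [0.5, 1.0], [0.0, 1.0], latent_weight=2.0
)
```

In this sketch the latent term pulls the occluded-image features toward the non-occluded ones, while the pose term is the same mean absolute error the abstract reports as its evaluation metric.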
Author
Tengfeng Zhou (周腾锋), School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai
Source
Modeling and Simulation (《建模与仿真》), 2025, No. 5, pp. 194-202 (9 pages)
Keywords
Head Pose Estimation
Latent Space Regression
Head Occlusion
Feature Enhancement