Funding: Supported by the Special Project on Basic Research of Frontier Leading Technology of Jiangsu Province, China (Grant No. BK20192004C).
Abstract: 3D pose transfer over unorganized point clouds is a challenging generation task: it transfers a source's pose to a target shape while preserving the target's identity. Recent deep models learn deformations and use the target's identity as a style to modulate either the combined features of the two shapes or the aligned vertices of the source shape. However, all operations in these models are point-wise and independent, ignoring the geometric information carried by the surface and structure of the input shapes. This drawback severely limits their generation and generalization capabilities. In this study, we propose a geometry-aware method based on a novel transformer autoencoder to address this problem. An efficient self-attention mechanism, cross-covariance attention, is used throughout our framework to capture correlations between points at different distances. Specifically, the transformer encoder extracts the target shape's local geometric details for identity attributes and the source shape's global geometric structure for pose information. Our transformer decoder efficiently learns deformations and recovers identity properties by fusing and decoding the extracted features in a geometry-attentional manner, requiring neither correspondence information nor modulation steps. Experiments demonstrate that the geometry-aware method achieves state-of-the-art performance on the 3D pose transfer task. The implementation code and data are available at https://github.com/SEULSH/Geometry-Aware-3D-Pose-Transfer-Using-Transformer-Autoencoder.
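For concreteness, below is a minimal PyTorch sketch of cross-covariance attention, the mechanism the abstract names, following the XCiT formulation (attention computed over feature channels rather than over points, so the cost is linear in the number of points). The class name, layer sizes, and head count are illustrative assumptions; the authors' actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossCovarianceAttention(nn.Module):
    """Sketch of cross-covariance attention (XCA): a (head_dim x head_dim)
    channel-attention map replaces the usual (N x N) token-attention map."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        # learnable per-head temperature, as in XCiT
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (B, N, C) per-point features of a point cloud
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 4, 1)        # each: (B, heads, head_dim, N)
        q = F.normalize(q, dim=-1)                  # L2-normalize along the point axis
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature  # (B, heads, head_dim, head_dim)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).permute(0, 3, 1, 2).reshape(B, N, C)  # back to (B, N, C)
        return self.proj(out)

# hypothetical usage: 2 point clouds, 1024 points each, 128-dim features
xca = CrossCovarianceAttention(dim=128, num_heads=8)
out = xca(torch.randn(2, 1024, 128))  # (2, 1024, 128)
```

Because the attention map is head_dim x head_dim rather than N x N, this mechanism scales to dense point clouds, which is presumably why the abstract calls it efficient.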
Abstract: To address the image distortion caused by improper feature processing during the encoding stage in existing human pose transfer methods, a multi-resolution human pose transfer method based on the Pose-Attentional Transfer Network (PATN) and self-attention is proposed. First, a pose-guided self-attention module is designed: multi-head attention increases the weights of the feature channels corresponding to key body regions, reduces the influence of background-irrelevant features, and adaptively explores the correlations between the features of the two branches. Second, a multi-scale attention module is added in the decoding stage to strengthen the expression of pose information at different scales, effectively improving the fidelity of local details and overall texture. Finally, a triplet pixel loss is introduced to constrain the generated images, improving their feature consistency and structural consistency. Experiments on the DeepFashion and Market-1501 datasets show that the proposed method outperforms the existing PATN method on Structural Similarity (SSIM), Inception Score (IS), and Learned Perceptual Image Patch Similarity (LPIPS), improves visual quality and edge texture, and has strong potential for the downstream task of person re-identification.
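The abstract does not give the module's exact equations, so the following is only a rough sketch of the general idea of pose features guiding attention over appearance features across the two PATN branches. The class name, the query/key/value assignment, and the residual fusion are assumptions for illustration, not the authors' published design.

```python
import torch
import torch.nn as nn

class PoseGuidedAttention(nn.Module):
    """Hypothetical sketch: pose-branch features act as queries over
    appearance-branch features, so regions relevant to the target pose
    are emphasized and background-irrelevant features are suppressed."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, pose_feat, img_feat):
        # pose_feat, img_feat: (B, HW, C) flattened feature maps from the two branches
        fused, _ = self.attn(query=pose_feat, key=img_feat, value=img_feat)
        # residual connection keeps the original appearance signal intact
        return self.norm(img_feat + fused)

# hypothetical usage on 16x16 feature maps with 256 channels
module = PoseGuidedAttention(dim=256, num_heads=8)
out = module(torch.randn(4, 256, 256), torch.randn(4, 256, 256))  # (4, 256, 256)
```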