Funding: Supported by the National Key R&D Programme of China (2022YFF0902200).
Abstract: Realistic human reconstruction has found a wide range of applications as depth sensors advance. However, current state-of-the-art methods with RGB-D input still suffer from artefacts such as noisy surfaces, non-human shapes, and depth ambiguity, especially in the invisible parts. The authors observe that the main issue is a lack of geometric semantics, because the priors in the depth input are not fully used. This paper focuses on improving the representation ability of the implicit function, exploring a method that uses depth-related semantics effectively and efficiently. The proposed geometry-enhanced implicit function strengthens the geometric semantics with extra voxel-aligned features from point clouds, promoting the completion of unseen regions while preserving the local details of the input. To combine multi-scale pixel-aligned and voxel-aligned features, the authors use Squeeze-and-Excitation attention to capture and fully exploit channel interdependencies. For multi-view reconstruction, the proposed depth-enhanced attention explicitly excites the network to "sense" the geometric structure for a more reasonable feature aggregation. Experiments show that the proposed method outperforms current RGB-based and depth-based SOTA methods on challenging data from Twindom and Thuman3.0, and achieves a detailed and complete human reconstruction that balances performance and efficiency well.
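Editorial illustration (not from the paper): the abstract names Squeeze-and-Excitation attention as the mechanism for fusing pixel-aligned and voxel-aligned feature channels, without giving details. The sketch below shows only the generic squeeze, excite, and scale steps in plain Python. The function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the tiny two-channel input are all hypothetical, and how the authors actually wire the block into their network is not stated in the source.

```python
import math


def squeeze_excite(features, w1, w2):
    """Generic Squeeze-and-Excitation over channels (illustrative only).

    features: list of C channels, each a flat list of sampled values.
    w1: R x C weight matrix of the bottleneck layer (hypothetical values).
    w2: C x R weight matrix of the expansion layer (hypothetical values).
    Returns (rescaled_features, gates).
    """
    # Squeeze: summarise each channel by its global average.
    z = [sum(channel) / len(channel) for channel in features]
    # Excite, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excite, step 2: expansion layer followed by a sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in channel] for g, channel in zip(gates, features)]
    return scaled, gates


# Two channels standing in for one pixel-aligned and one voxel-aligned feature.
example_features = [[1.0, 3.0], [0.0, 2.0]]
example_w1 = [[1.0, -1.0]]    # C=2 -> R=1
example_w2 = [[1.0], [-1.0]]  # R=1 -> C=2
example_scaled, example_gates = squeeze_excite(example_features, example_w1, example_w2)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate channels relative to one another. That is the sense in which it models "channel interdependencies": the gate for one channel depends on the pooled statistics of all channels.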
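Abstract: To overcome the artifacts and missing limbs that arise in single-view 3D human reconstruction, this paper proposes an implicit 3D human reconstruction algorithm based on a parametric model and normal inference (Parametric-Model and Normal Inference, PMNI), which reconstructs a clothed 3D human from a single RGB image. The only network input is one RGB image containing the person's full body. The method first predicts the corresponding SMPL parametric model with a graph convolutional neural network, then generates a back-view image of the person with a conditional GAN (Generative Adversarial Network), and extracts normal features from the front and back views separately. Finally, these are passed to the deep implicit function as additional parameters to assist training. Experimental results show that, compared with traditional methods, the method effectively improves the overall quality and surface detail of the reconstruction. With the parametric body and the normals serving as priors, the method also handles some complex human poses well.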
Funding: Supported by the National Natural Science Foundation of China (Nos. 62001213 and 62025108).
Abstract: Neural implicit representation (NIR) has attracted significant attention in 3D shape representation for its efficiency, generalizability, and flexibility compared with traditional explicit representations. Previous works usually parameterize shapes with neural feature grids/volumes, which prove inefficient because of the discrete position constraints of these representations. While recent advances make it possible to optimize continuous positions for the latent codes, they still lack the self-adaptability needed to represent various kinds of shapes well. In this paper, we introduce a hierarchical adaptive code cloud (HACC) model to achieve an accurate and compact implicit 3D shape representation. Specifically, we begin by assigning adaptive influence fields and dynamic positions to latent codes, both of which are optimizable during training, and propose an adaptive aggregation function to fuse the contributions of candidate latent codes with respect to query points. In addition, these basic modules are stacked hierarchically with gradually narrowing influence-field thresholds and are therefore heuristically forced to focus on capturing finer structures at higher levels. These formulations greatly improve the distribution and effectiveness of local latent codes and reconstruct shapes from coarse to fine with high accuracy. Extensive qualitative and quantitative evaluations on both single-shape reconstruction and large-scale dataset representation tasks demonstrate the superiority of our method over state-of-the-art approaches.
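Editorial illustration (not from the paper): the abstract describes latent codes that carry their own position and influence field, fused per query point by an adaptive aggregation function, but it does not give the function itself. The sketch below assumes a Gaussian falloff truncated at each code's radius, followed by normalisation. That weighting, the name `aggregate_codes`, and the `(position, radius, latent)` tuple layout are all hypothetical choices made to show the general idea of a code cloud.

```python
import math


def aggregate_codes(query, codes):
    """Fuse latent codes around a query point (illustrative only).

    query: coordinates of the query point, e.g. (x, y, z).
    codes: list of (position, radius, latent) tuples; in a trained model the
           position and radius would be learned, here they are fixed inputs.
    Returns the normalised weighted average of the latent vectors, or a zero
    vector when the query lies outside every influence field.
    """
    weights = []
    for position, radius, _latent in codes:
        dist_sq = sum((q - p) ** 2 for q, p in zip(query, position))
        if dist_sq <= radius ** 2:
            # Assumed weighting: Gaussian falloff scaled by the code's radius.
            weights.append(math.exp(-dist_sq / radius ** 2))
        else:
            # Outside the influence field the code contributes nothing.
            weights.append(0.0)
    dim = len(codes[0][2])
    total = sum(weights)
    if total == 0.0:
        return [0.0] * dim
    return [
        sum(w * latent[k] for w, (_pos, _rad, latent) in zip(weights, codes)) / total
        for k in range(dim)
    ]


# Two codes with one-hot latents, so the output directly shows the mixing weights.
example_codes = [
    ((0.0, 0.0, 0.0), 1.0, [1.0, 0.0]),
    ((1.0, 0.0, 0.0), 1.0, [0.0, 1.0]),
]
midpoint_latent = aggregate_codes((0.5, 0.0, 0.0), example_codes)
far_latent = aggregate_codes((5.0, 0.0, 0.0), example_codes)
```

Under these assumptions, shrinking the radii restricts each code to a smaller neighbourhood, which is one way to read the abstract's "gradually narrowing influence field thresholds" at higher levels of the hierarchy.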