Funding: Supported in part by the National Natural Science Foundation of China under Grants 62222110, 62172259, and 62571303; the High-end Foreign Experts Recruitment Plan of the Chinese Ministry of Science and Technology under Grant G2023150003L; the exchange project of the Royal Society and the National Natural Science Foundation of China under Grant 62311530104 (IEC\NSFC\223076); the Taishan Scholar Project of Shandong Province (tsqn202103001); and the Natural Science Foundation of Shandong Province under Grant ZR2022ZD38.
Abstract: Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers. Our method also includes a module that adaptively selects the resolution of the voxels used to voxelize the input point cloud. Sparse tensors represent the voxelized point cloud, and sparse convolutions process them, ensuring computational efficiency. To the best of our knowledge, this is the first application of GANs to compressing point cloud attributes. Our experimental results show that our method outperforms existing learning-based techniques and rivals the latest G-PCC test model (TMC13v23) in terms of visual quality.
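The voxelization step the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: a fixed `voxel_size` stands in for the paper's adaptive resolution-selection module, and per-voxel attribute averaging is one common choice for merging points that fall into the same voxel.

```python
import numpy as np

def voxelize(points, attributes, voxel_size):
    """Quantize point coordinates to a voxel grid and average the
    attributes (e.g. RGB) of points falling into the same voxel.
    Returns (occupied voxel coordinates, per-voxel attributes) --
    the coordinate/feature pair a sparse tensor is built from."""
    # Integer voxel coordinates for each point.
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Unique occupied voxels; `inverse` maps each point to its voxel.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    # Sum, then average, the attributes of the points in each voxel.
    summed = np.zeros((len(uniq), attributes.shape[1]))
    np.add.at(summed, inverse, attributes)
    counts = np.bincount(inverse)[:, None]
    return uniq, summed / counts

# Toy cloud: four points; the first two share a voxel at size 1.0.
pts = np.array([[0.1, 0.2, 0.3],
                [0.4, 0.6, 0.9],
                [1.2, 0.1, 0.0],
                [2.5, 2.5, 2.5]])
rgb = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [1.0, 1.0, 1.0]])
voxels, vox_rgb = voxelize(pts, rgb, voxel_size=1.0)
print(len(voxels))       # 3 occupied voxels
print(vox_rgb[0])        # [0.5 0.5 0. ] -- averaged red and green
```

The resulting coordinate list and attribute array are exactly the sparse representation that sparse-convolution libraries consume: only occupied voxels are stored, so compute scales with occupancy rather than grid volume.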
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 51921006 and 52008138) and the Heilongjiang Touyan Innovation Team Program (No. AUEA5640200320).
Abstract: Generative adversarial networks (GANs) are unsupervised generative models that learn a data distribution through adversarial training. However, recent experiments indicate that GANs are difficult to train because of the need to optimize in a high-dimensional parameter space and the zero-gradient problem. In this work, we propose a self-sparse generative adversarial network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero-gradient problem. In Self-Sparse GAN, we design a self-adaptive sparse transform module (SASTM), comprising sparsity decomposition and feature-map recombination, which can be applied to multi-channel feature maps to obtain sparse feature maps. The key idea of Self-Sparse GAN is to add the SASTM after every deconvolution layer in the generator, which adaptively reduces the parameter space by exploiting the sparsity of multi-channel feature maps. We theoretically prove that the SASTM not only reduces the search space of the generator's convolution kernel weights but also alleviates the zero-gradient problem by maintaining meaningful features in the batch normalization layer and driving the deconvolution layer weights away from being negative. Experimental results show that our method achieves the best Fréchet inception distance (FID) scores for image generation compared with Wasserstein GAN with gradient penalty (WGAN-GP) on the MNIST, Fashion-MNIST, CIFAR-10, STL-10, mini-ImageNet, CELEBA-HQ, and LSUN bedrooms datasets, with relative FID decreases of 4.76%–21.84%. An architectural sketch dataset (Sketch) is also used to validate the superiority of the proposed method.
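The general idea behind the SASTM (sparsify a multi-channel feature map, then recombine) can be illustrated with a deliberately simplified sketch. The abstract does not specify the decomposition, so the top-k channel-energy criterion below is a hypothetical stand-in, not the paper's method; it only shows what "obtaining sparse feature maps from a multi-channel feature map" might look like mechanically.

```python
import numpy as np

def sastm_sketch(feature_maps, keep_ratio=0.5):
    """Toy sparsity decomposition + recombination: keep the channels
    with the largest mean absolute activation (a hypothetical
    criterion) and zero out the rest, yielding sparse feature maps."""
    # feature_maps: (C, H, W) -- one multi-channel feature map.
    c = feature_maps.shape[0]
    energy = np.abs(feature_maps).mean(axis=(1, 2))  # per-channel energy
    k = max(1, int(round(c * keep_ratio)))
    keep = np.argsort(energy)[-k:]                   # top-k channels
    mask = np.zeros(c, dtype=bool)
    mask[keep] = True
    # Recombination: masked channels contribute nothing downstream,
    # shrinking the effective parameter space of the next layer.
    sparse = feature_maps * mask[:, None, None]
    return sparse, mask

rng = np.random.default_rng(0)
fm = rng.normal(size=(8, 4, 4))                      # e.g. deconv output
sparse, mask = sastm_sketch(fm, keep_ratio=0.5)
print(mask.sum())  # 4 of 8 channels kept
```

In the actual method the module sits after every generator deconvolution layer and its sparsification is learned adaptively during training, rather than being a fixed hard threshold as here.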