Abstract: Denoising diffusion models have demonstrated tremendous success in modeling data distributions and synthesizing high-quality samples. In the 2D image domain, they have become the state of the art and are capable of generating photo-realistic images with high controllability. More recently, researchers have begun to explore how to utilize diffusion models to generate 3D data, which has greater potential in real-world applications. This requires careful design choices in two key respects: identifying a suitable 3D representation and determining how to apply the diffusion process. In this survey, we provide the first comprehensive review of diffusion models for manipulating 3D content, including 3D generation, reconstruction, and 3D-aware image synthesis. We classify existing methods into three major categories: 2D-space diffusion with pretrained models, 2D-space diffusion without pretrained models, and 3D-space diffusion. We also summarize popular datasets used for 3D generation with diffusion models. Along with this survey, we maintain a repository at https://github.com/cwchenwang/awesome-3d-diffusion to track the latest relevant papers and codebases. Finally, we pose current challenges for diffusion models for 3D generation and suggest future research directions.
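As a minimal illustration of the reverse (denoising) process that the surveyed models share — our own sketch, not code from the survey — one DDPM-style denoising step for a scalar sample can be written as follows. The noise estimate `eps_pred` would normally come from a trained network, which is assumed here:

```python
import math
import random

def ddpm_step(x_t, t, eps_pred, alphas, alphas_cumprod, rnd=random):
    """One reverse-diffusion step x_t -> x_{t-1} for a scalar sample
    (DDPM-style posterior mean plus Gaussian noise)."""
    a_t = alphas[t]
    abar_t = alphas_cumprod[t]
    # Posterior mean given the predicted noise eps_pred.
    mean = (x_t - (1.0 - a_t) / math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(a_t)
    if t == 0:
        return mean  # final step is deterministic
    sigma = math.sqrt(1.0 - a_t)  # a common simple variance choice
    return mean + sigma * rnd.gauss(0.0, 1.0)

# Toy linear beta schedule with T = 10 steps (values are arbitrary).
T = 10
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alphas = [1.0 - b for b in betas]
alphas_cumprod = []
acc = 1.0
for a in alphas:
    acc *= a
    alphas_cumprod.append(acc)
```

The 3D methods in the survey differ mainly in *where* this step runs — over 2D renderings with a pretrained image model, or directly over a 3D representation.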
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11302057, 11302056), the Fundamental Research Funds for the Central Universities (Grant No. HEUCF140115), and the Research Funds of the State Key Laboratory of Ocean Engineering at Shanghai Jiao Tong University (Grant No. 1310).
Abstract: This paper presents a review of work on fluid/structure impact based on the assumptions of an inviscid, incompressible liquid and irrotational flow. The focus is on velocity potential theory together with the boundary element method (BEM). Fully nonlinear boundary conditions are imposed on the unknown free surface and the wetted surface of the moving body. The review covers (1) vertical and oblique water entry of a body at constant or prescribed varying speed, as well as free-fall motion; (2) liquid droplet or column impact, as well as wave impact on a body; and (3) similarity solutions for an expanding body. It covers two-dimensional (2D), axisymmetric, and three-dimensional (3D) cases. Key techniques used in the numerical simulation are outlined, including mesh generation on the multivalued free surface, the stretched coordinate system for an expanding domain, the auxiliary function method for decoupling the mutual dependence of the pressure and the body motion, and treatment of the jet or thin liquid film developed during impact.
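For context, the standard mathematical setting behind this formulation (stated generically here, not reproduced from the review) is a velocity potential satisfying the Laplace equation, fully nonlinear free-surface conditions followed in Lagrangian form, an impermeability condition on the wetted body surface, and pressure recovered from the Bernoulli equation:

```latex
\nabla^2 \phi = 0 \quad \text{in the fluid domain}, \qquad
\frac{\mathrm{D}\mathbf{x}}{\mathrm{D}t} = \nabla\phi, \quad
\frac{\mathrm{D}\phi}{\mathrm{D}t} = \tfrac{1}{2}\lvert\nabla\phi\rvert^{2} - gz
\quad \text{on the free surface},
```
```latex
\frac{\partial \phi}{\partial n} = \mathbf{U}\cdot\mathbf{n}
\quad \text{on the wetted body surface}, \qquad
p = -\rho\left(\frac{\partial \phi}{\partial t}
  + \tfrac{1}{2}\lvert\nabla\phi\rvert^{2} + gz\right).
```

The auxiliary function method mentioned above addresses the fact that the pressure depends on \(\partial\phi/\partial t\), which in free-fall problems is itself coupled to the unknown body acceleration.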
Funding: Supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2009-C1090-0902-0018).
Abstract: Several approaches for fast generation of digital holograms of a three-dimensional (3D) object are discussed. Among them, the novel look-up table (N-LUT) method is analyzed: by introducing the concept of principal fringe patterns, it dramatically reduces the number of pre-calculated fringe patterns required for computing digital holograms of a 3D object, thereby considerably alleviating the computational complexity and huge memory requirements of the conventional ray-tracing and look-up table methods. Meanwhile, since 3D video images contain a great deal of temporally and spatially redundant data in their inter- and intra-frames, the computation time of 3D video holograms can also be reduced simply by removing these redundant data. Thus, computational methods for generating 3D video holograms that combine the N-LUT method with data compression algorithms are also presented and discussed. Experimental results reveal that this approach achieves a substantial reduction in the computation time of 3D video holograms.
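To make the look-up-table principle concrete, a toy sketch — precompute one principal fringe pattern per depth plane, then shift and accumulate it for each object point — might look as follows. This illustrates the general N-LUT concept only, not the authors' implementation, and all parameter values are arbitrary:

```python
import math

def principal_fringe(N, z, wavelength=633e-6, pitch=0.01):
    """Zone-plate-like principal fringe of a point source at depth z,
    sampled on an N x N grid (paraxial phase, cosine fringe)."""
    k = 2.0 * math.pi / wavelength
    return [[math.cos(k * ((i * pitch) ** 2 + (j * pitch) ** 2) / (2.0 * z))
             for j in range(-N // 2, N // 2)]
            for i in range(-N // 2, N // 2)]

def nlut_hologram(points, fringes, N):
    """Accumulate shifted principal fringes for object points
    given as (ix, iy, depth_index, amplitude) tuples."""
    H = [[0.0] * N for _ in range(N)]
    for ix, iy, zi, amp in points:
        F = fringes[zi]  # one precomputed pattern per depth plane
        for i in range(N):
            for j in range(N):
                si, sj = i - ix, j - iy  # shift to the point's lateral position
                if 0 <= si < N and 0 <= sj < N:
                    H[i][j] += amp * F[si][sj]
    return H
```

The memory saving comes from storing only one principal fringe per depth plane instead of one fringe per (x, y, z) object point, as in a conventional LUT.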
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2024A1515012595), the Department of Education of Guangdong Province (2023ZDZX4078), and the Shenzhen Science and Technology Innovation Committee (WDZC20231129201240001).
Abstract: In modern architectural design, as complexity increases and diverse demands emerge, reconstructing 3D spaces has become a crucial method. However, existing methods remain limited to small-scale scenarios and exhibit poor reconstruction accuracy when applied to building-scale environments, resulting in unstable mesh quality and reduced design productivity. Furthermore, the lack of real-time, interactive editing tools prolongs design iteration cycles and impedes workflow efficiency. To address these issues, we make the following contributions: (1) We construct ArchiNet++, an architectural dataset that includes 710,180 multi-view images, 5200 SketchUp models, and corresponding camera parameters from the conceptual design phase of architectural projects. (2) We introduce Drag2Build++, an interactive 3D mesh reconstruction framework featuring drag-based editing and three core innovations: a differentiable geometry module for fine-grained deformation, a 2D-3D rendering bridge for supervision, and a GAN-based refinement module for photorealistic texture synthesis. (3) Comprehensive experiments demonstrate that our model excels at generating high-quality 3D meshes and enables rapid mesh editing via drag-based interactions. Furthermore, by incorporating textured mesh generation into this interactive workflow, it improves both efficiency and modeling flexibility. We hope this combination can contribute to a more intuitive modeling process and offer a practical tool set supporting digital transformation efforts within architectural design.
Funding: Partly supported by the JSPS Grant-in-Aid for Scientific Research #17300032.
Abstract: A full-parallax light-field is captured by a small-scale 3D image scanning system and applied to holographic display. A vertical camera array is scanned horizontally to capture full-parallax imagery, and the vertical views between cameras are interpolated by a depth image-based rendering technique. An improved depth-estimation technique reduces the estimation error, and a high-density light-field is obtained. The captured data are employed to calculate a computer hologram using the ray-sampling plane. This technique enables high-resolution display even for deep 3D scenes although the hologram is calculated from ray information, and thus preserves an important advantage of holographic 3D display.
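The depth image-based rendering step — synthesizing an intermediate view by forward-warping pixels with a depth-derived disparity — can be sketched in miniature on a single scanline. This is a generic DIBR illustration under a simple pinhole-disparity assumption, not the paper's actual algorithm:

```python
def dibr_warp(src, depth, baseline, focal):
    """Forward-warp a 1D scanline to a virtual view offset by `baseline`,
    using per-pixel disparity = baseline * focal / depth."""
    n = len(src)
    dst = [None] * n  # (value, depth) pairs; None marks disocclusion holes
    for x in range(n):
        d = round(baseline * focal / depth[x])
        xt = x + d
        if 0 <= xt < n:
            # z-buffer style conflict resolution: the nearer pixel wins
            if dst[xt] is None or depth[x] < dst[xt][1]:
                dst[xt] = (src[x], depth[x])
    return [p[0] if p else None for p in dst]
```

In practice the holes (`None` entries) left by disocclusions are filled by inpainting or by blending warps from neighboring cameras; accurate depth is what keeps these holes small, which is why the improved depth estimation matters here.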
Abstract: Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on SDS optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained stable video diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine the Gaussian Splatting reconstruction from these sparse inpainted views. By leveraging these techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.
Funding: Supported by the National Natural Science Foundation of China Key International Cooperation Program (Grant No. 61720106002), the Key Research and Development Project of the Ministry of Science and Technology (Grant No. 2017YFC1405100), and the Heading Wild Goose Plan of Heilongjiang Province, China.
Abstract: In the field of remote sensing imaging, multispectral imaging can capture an image of the observed scene in several bands, while light detection and ranging (LiDAR) can acquire accurate 3D geometric information about the scene. With the development of remote sensing technology, how to effectively integrate these two imaging technologies to collect and process simultaneous spectral and 3D geometric information has become a frontier problem. Most existing research on simultaneous spectral and geometric data acquisition focuses on the design of physical multispectral LiDAR systems, which inevitably leads to imaging systems of heavy weight and high power consumption that are inconvenient in practice. In contrast, this paper introduces a UAV-based integrated multispectral-LiDAR system. Through simultaneous multi-sensor data collection and multispectral point cloud generation, a low-cost, portable, UAV-based system for acquiring 3D geometric and spectral information can be achieved.
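The core fusion step — generating a multispectral point cloud by projecting each LiDAR point into co-registered band images and sampling reflectance — can be sketched as follows. This is a generic illustration under a simple pinhole-camera assumption; the paper's actual calibration and registration pipeline is not shown:

```python
def project_point(p, f, cx, cy):
    """Pinhole projection of a 3D point (X, Y, Z) in camera
    coordinates to a pixel position (u, v)."""
    X, Y, Z = p
    return (f * X / Z + cx, f * Y / Z + cy)

def colorize_points(points, bands, f, cx, cy):
    """Attach per-band spectral values to each LiDAR point via
    nearest-pixel lookup. bands: dict band_name -> 2D reflectance image."""
    first = next(iter(bands.values()))
    h, w = len(first), len(first[0])
    out = []
    for p in points:
        u, v = project_point(p, f, cx, cy)
        iu, iv = round(u), round(v)
        if 0 <= iv < h and 0 <= iu < w:  # drop points outside the image
            spec = {name: img[iv][iu] for name, img in bands.items()}
            out.append((p, spec))
    return out
```

Repeating this lookup per band yields the multispectral point cloud; real systems must additionally handle boresight calibration, occlusion, and time synchronization between the sensors.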
Funding: Supported by the International (Regional) Cooperation and Exchange Program of the National Natural Science Foundation of China under Grant No. 62161146002, and the Shenzhen Collaborative Innovation Program under Grant No. CJGJZD2021048092601003.
Abstract: We present SinGRAV, an attempt to learn a generative radiance volume from multi-view observations of a single natural scene, in stark contrast to existing category-level 3D generative models that learn from images of many object-centric scenes. Inspired by SinGAN, we likewise learn the internal distribution of the input scene, which necessitates our key designs with respect to the scene representation and network architecture. Unlike popular multi-layer perceptron (MLP)-based architectures, we employ convolutional generators and discriminators, which inherently possess a spatial locality bias, to operate over voxelized volumes and learn the internal distribution over a plethora of overlapping regions. On the other hand, localizing the adversarial generators and discriminators over confined areas with limited receptive fields easily leads to highly implausible spatial arrangements of geometric structures. Our remedy is to use a spatial inductive bias and joint discrimination on geometric clues in the form of 2D depth maps. This strategy is effective in improving spatial arrangement while incurring negligible additional computational cost. Experimental results demonstrate the ability of SinGRAV to generate plausible and diverse variations from a single scene, the merits of SinGRAV over state-of-the-art generative neural scene models, and the versatility of SinGRAV across a variety of applications. Code and data will be released to facilitate further research.
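The depth-map clue used for joint discrimination can be derived from a density volume with standard volume-rendering weights. A minimal sketch of that derivation — our own illustration of the general idea, not SinGRAV's code — for axis-aligned rays through a `[Z][Y][X]` grid:

```python
import math

def expected_depth(density, step=1.0):
    """Expected ray-termination depth along the z-axis of a [Z][Y][X]
    density grid, using volume-rendering transmittance weights."""
    Z, Y, X = len(density), len(density[0]), len(density[0][0])
    depth = [[0.0] * X for _ in range(Y)]
    for y in range(Y):
        for x in range(X):
            T = 1.0   # transmittance accumulated so far
            d = 0.0   # expected depth
            for z in range(Z):
                alpha = 1.0 - math.exp(-density[z][y][x] * step)
                d += T * alpha * (z + 0.5) * step  # sample at slab center
                T *= 1.0 - alpha
            depth[y][x] = d
    return depth
```

Because such a depth map is differentiable in the densities, a 2D discriminator applied to it can back-propagate a geometric plausibility signal into the 3D generator, which is the mechanism the joint-discrimination strategy relies on.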