Deep learning empowers generation and presentation of virtual-real fusion scenarios in holographic metaverse: development and prospects (invited)

Cited by: 1

Abstract: The metaverse is a pioneering and supporting technology for the transformation of the internet, indicating that the expansion of information dimensions and the renewal of immersive experience are the future trends of the internet. Digital 3D content is the core element of the metaverse and the main medium for carrying information and delivering feedback. 3D content generation based on digital rendering and 3D content presentation based on holographic display offer significant advantages in image quality, device cost, and application flexibility, and have broad prospects in the metaverse field. This paper compares the performance of commonly used digital rendering techniques, introduces the role of monocular depth estimation in the 3D digitization of real scenes, reviews the development of supervised and unsupervised monocular depth estimation techniques based on artificial intelligence, and emphasizes that breaking through the bottlenecks of estimation accuracy and speed is the main challenge for monocular depth estimation in metaverse content generation. It then introduces potential solutions, including optimization of the regression estimation interval, compression of redundant feature parameters, and correlation of multi-dimensional features. The paper also introduces the application of artificial intelligence to the generation of computer-generated holograms, reviews the development of data-driven and model-driven hologram generation networks, and concludes that the limited reconstructable depth range of holographic display results is the main challenge for hologram generation networks in metaverse content presentation. It then introduces potential solutions, including filtering of hologram frequency components, optimization of initial calculation conditions, and selection of model convergence paths. In short, improving the quality and efficiency of 3D content generation and presentation is an inevitable requirement that the metaverse places on computer-generated holographic 3D display.

Significance: The metaverse is a guiding and supporting technology for the transformation of the internet. It can enhance visual experience and interactive efficiency, with prominent economic and social benefits. Digital 3D content is a core element of the metaverse, serving as the primary medium for visual information and interactive feedback. The generation and presentation of 3D content are therefore critical to the construction of the metaverse (Fig. 1 and Fig. 2). Generating 3D content through digital rendering and presenting it through holographic display is a sound combination for metaverse construction, because it balances visual fidelity, device cost, and deployment complexity. In the task of real-world digitization, however, this combination often faces bottlenecks in calculation speed and presentation quality caused by the massive computational load. Advances in neural networks provide a powerful tool for breaking through these bottlenecks.

Progress: Digital 3D rendering of 2D images, also known as depth estimation, can be categorized into multi-view estimation, motion estimation, and monocular estimation. Monocular depth estimation takes single-view 2D images as input, which gives it high deployment flexibility and low device cost. Neural networks for monocular depth estimation can be categorized into supervised and unsupervised types (Fig. 3). Supervised networks require depth-labeled datasets as supervisory signals for parameter training, and their practical application is often limited by the difficulty of obtaining labeled datasets. Unsupervised networks rely primarily on mathematical priors to estimate depth, which significantly reduces the dependence on labeled datasets, but their performance still needs continued improvement. Currently, monocular depth estimation networks suffer from insufficient estimation robustness and inadequate calculation speed. To rapidly construct high-quality 3D content for the metaverse, the constraints used in monocular depth estimation require further in-depth investigation. Potential research directions include optimization of estimation intervals, reduction of feature redundancy in depth estimation, and enhancement of correlations between monocular estimation and multi-view estimation (Fig. 4).

Holographic display is an ideal solution for presenting digital 3D content in the metaverse. The phase-only hologram, with its high energy efficiency and absence of twin-image artifacts, is a superior medium for dynamic 3D content. However, generating a phase-only hologram is an ill-posed problem, which limits computational speed and accuracy. Neural networks, which are well suited to solving ill-posed problems, provide a powerful tool for calculating phase-only holograms. Generation networks for phase-only holograms can be categorized into data-driven and model-driven types (Fig. 5). Data-driven networks require 3D targets and the corresponding phase-only holograms to update the network parameters, but obtaining high-quality hologram datasets demands significant computational resources. Model-driven networks use physical constraints to train the network, which removes the limit that dataset quality places on the inference capability of the network. Currently, holographic display often suffers from a limited depth range in optical reconstructions. To extend the depth range, it is critical to address the constraints that computational strategies impose on solving ill-posed problems. Further research directions include frequency filtering of phase-only holograms, optimization of initial calculation conditions, and selection of solution paths (Fig. 6).

Conclusion and prospect: The integration of metaverse technology with internet technology has the potential to revolutionize many fields, including education, social interaction, healthcare, and industry. Neural networks, as a rapid and accurate calculation tool, provide an ideal solution for the generation and presentation of 3D content in the metaverse. Limited estimation robustness and calculation speed are a bottleneck for 3D content generation, and research on the constraints in monocular depth estimation should be conducted to break through this bottleneck. The limited depth range of optical reconstructions is a major challenge for holographic presentation of 3D content, and addressing it requires optimizing the calculation strategies for solving ill-posed problems. Based on this research, 3D acquisition and projection systems can be constructed in the foreseeable future, which would inject strong momentum into the sustainable development of virtual-real interaction in the metaverse.
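To illustrate why phase-only hologram generation is described above as ill-posed, the following is a minimal sketch of the classical Gerchberg-Saxton iteration. It is not one of the networks reviewed in the article. It assumes a far-field (Fourier) propagation model, a unit-amplitude hologram plane, and a random initial phase; the function name `gerchberg_saxton` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def gerchberg_saxton(target_amp, n_iter=30, seed=0):
    """Iteratively estimate a phase-only hologram for a target amplitude.

    Assumption: the image plane is the 2D Fourier transform of the
    hologram plane (far-field model), and the hologram has unit amplitude.
    """
    rng = np.random.default_rng(seed)
    # Random initial phase: different seeds generally converge to different
    # holograms, which reflects the non-uniqueness of the inverse problem.
    phase = rng.uniform(0.0, 2.0 * np.pi, target_amp.shape)
    errors = []
    for _ in range(n_iter):
        # Forward propagation: hologram plane -> image plane.
        field = np.fft.fft2(np.exp(1j * phase), norm="ortho")
        # Record the RMS amplitude error of the current reconstruction.
        errors.append(float(np.sqrt(np.mean((np.abs(field) - target_amp) ** 2))))
        # Image-plane constraint: impose the target amplitude, keep the phase.
        field = target_amp * np.exp(1j * np.angle(field))
        # Backward propagation and hologram-plane constraint: keep phase only.
        phase = np.angle(np.fft.ifft2(field, norm="ortho"))
    return phase, errors


# Usage: a bright square on a dark background, scaled so that its energy
# matches the unit-amplitude hologram.
target = np.zeros((64, 64))
target[16:48, 16:48] = 1.0
target *= np.sqrt(target.size / np.sum(target ** 2))
hologram_phase, rms_errors = gerchberg_saxton(target, n_iter=30)
print(rms_errors[0], rms_errors[-1])
```

Because many phase patterns reproduce the target amplitude about equally well, the result depends on the initial phase and on the path the iteration takes, which relates to the research directions on initial calculation conditions and solution paths named in the abstract.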

Authors: HE Zehao (何泽浩); GAO Yunhui (高云晖); CAO Liangcai (曹良才); ZHANG Yan (张岩) (Department of Physics, Capital Normal University, Beijing 100048, China; Department of Precision Instrument, Tsinghua University, Beijing 100084, China)

Affiliations: Department of Physics, Capital Normal University; Department of Precision Instrument, Tsinghua University

Source: Infrared and Laser Engineering (《红外与激光工程》), PKU Core Journal, 2025, No. 7, pp. 54-67 (14 pages)

Funding: National Natural Science Foundation of China (62205173, 62441613).

Keywords: metaverse; depth estimation; computer-generated holography; 3D imaging; 3D display

Classification code: O436 [Mechanical Engineering: Optical Engineering]
