Journal Articles
226 articles found
Unsupervised Monocular Depth Estimation with Edge Enhancement for Dynamic Scenes
1
Authors: Peicheng Shi, Yueyue Tang, Yi Li, Xinlong Dong, Yu Sun, Aixi Yang. 《Computers, Materials & Continua》, 2025, No. 8, pp. 3321-3343 (23 pages)
In dynamic scenes encountered by autonomous vehicles, monocular depth estimation often suffers from inaccurate depth at object edges. To solve this problem, we propose an unsupervised monocular depth estimation model based on edge enhancement, aimed specifically at the depth-perception challenge in dynamic scenes. The model consists of two core networks, a depth prediction network and a motion estimation network, both adopting an encoder-decoder architecture. The depth prediction network is based on a ResNet18 U-Net structure and generates the scene depth map; the motion estimation network is based on a FlowNet U-Net structure and focuses on motion estimation of dynamic targets. In the decoding stage of the motion estimation network, we innovatively introduce an edge-enhanced decoder that integrates a convolutional block attention module (CBAM) to strengthen recognition of the edge features of moving objects. We also design a strip convolution module to improve the model's ability to capture discrete moving targets. To further improve performance, we propose a novel edge regularization method based on the Laplace operator, which effectively accelerates model convergence. Experimental results on the KITTI and Cityscapes datasets show that, compared with current advanced dynamic unsupervised monocular models, the proposed model significantly improves depth estimation accuracy and convergence speed: the root mean square error (RMSE) is reduced by 4.8% compared with the DepthMotion algorithm, while training convergence is 36% faster, demonstrating the model's superior performance on depth estimation in dynamic scenes.
Keywords: dynamic scenes, unsupervised learning, monocular depth, edge enhancement
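The entry above regularizes edges with the Laplace operator. As an illustrative sketch only (the paper's exact loss term is not reproduced in the abstract), a discrete 5-point Laplacian vanishes on planar depth regions and responds strongly at depth discontinuities, which is what makes it usable as an edge-aware regularization signal:

```python
def laplacian(depth):
    """Discrete 5-point Laplacian of a 2D depth map (list of lists).

    Interior approximation of d_xx + d_yy; border pixels are left at 0.
    An edge-aware regularizer can penalize |Laplacian| away from image
    edges, keeping depth smooth except at object contours.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = (depth[i - 1][j] + depth[i + 1][j] +
                         depth[i][j - 1] + depth[i][j + 1]
                         - 4.0 * depth[i][j])
    return out

# A planar (constant-slope) depth map has zero Laplacian response...
plane = [[float(i + j) for j in range(4)] for i in range(4)]
flat = laplacian(plane)

# ...while a step edge produces a strong response at the discontinuity.
step = [[0.0, 0.0, 5.0, 5.0] for _ in range(4)]
resp = laplacian(step)
```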
Self-Supervised Monocular Depth Estimation with Scene Dynamic Pose
2
Authors: Jing He, Haonan Zhu, Chenhao Zhao, Minrui Zhao. 《Computers, Materials & Continua》, 2025, No. 6, pp. 4551-4573 (23 pages)
Self-supervised monocular depth estimation has emerged as a major research focus in recent years, primarily because it eliminates dependence on ground-truth depth. However, prevailing architectures suffer from inherent limitations: existing pose-network branches infer camera ego-motion exclusively under static-scene and Lambertian-surface assumptions. These assumptions are often violated in real-world scenarios by dynamic objects, non-Lambertian reflectance, and unstructured background elements, leading to pervasive artifacts such as depth discontinuities ("holes"), structural collapse, and ambiguous reconstruction. To address these challenges, we propose a novel framework that integrates scene dynamic pose estimation into the conventional self-supervised depth network, enhancing its ability to model complex scene dynamics. Our contributions are threefold: (1) a pixel-wise dynamic pose estimation module that jointly resolves the pose transformations of moving objects and localized scene perturbations; (2) a physically informed loss function that couples dynamic pose and depth predictions, designed to mitigate depth errors arising from high-speed distant objects and geometrically inconsistent motion profiles; (3) an efficient SE(3) transformation parameterization that streamlines network complexity and temporal pre-processing. Extensive experiments on the KITTI and NYU-V2 benchmarks show that our framework achieves state-of-the-art performance in both quantitative metrics and qualitative visual fidelity, significantly improving the robustness and generalization of monocular depth estimation under dynamic conditions.
Keywords: monocular depth estimation, self-supervised learning, scene dynamic pose estimation, dynamic-depth constraint, pixel-wise dynamic pose
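The entry above mentions an efficient SE(3) transformation parameterization. A common minimal choice for the rotation part, sketched here purely for illustration (the paper's own parameterization is not given in the abstract), is an axis-angle vector mapped to a rotation matrix via Rodrigues' formula:

```python
import math

def axis_angle_to_matrix(rx, ry, rz):
    """Rodrigues' formula: map an axis-angle vector (the so(3) part of a
    minimal se(3) pose parameterization) to a 3x3 rotation matrix.

    R = I + sin(t)*K + (1 - cos(t))*K^2, with t = |r| and K = [r/t]_x.
    """
    t = math.sqrt(rx * rx + ry * ry + rz * rz)
    if t < 1e-12:                       # near-zero rotation -> identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = rx / t, ry / t, rz / t    # unit rotation axis
    c, s = math.cos(t), math.sin(t)
    C = 1.0 - c
    return [
        [c + x * x * C,     x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C,     y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]

# A 90-degree rotation about the z-axis maps the x-axis onto the y-axis.
R = axis_angle_to_matrix(0.0, 0.0, math.pi / 2)
```

Pose networks commonly regress such a 6-vector (3 for rotation, 3 for translation) per frame pair; the closed-form exponential map keeps the output on the rotation manifold without any projection step.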
Monocular Depth Estimation with Sharp Boundary
3
Authors: Xin Yang, Qingling Chang, Shiting Xu, Xinlin Liu, Yan Cui. 《Computer Modeling in Engineering & Sciences》 SCIE EI, 2023, No. 7, pp. 573-592 (20 pages)
Monocular depth estimation is a basic task in computer vision, and its accuracy has improved tremendously over the past decade with the development of deep learning. However, blurry boundaries in the depth map remain a serious problem. Researchers find that boundary blur is mainly caused by two factors. First, low-level features containing boundary and structure information may be lost in deep networks during convolution. Second, the model ignores the errors introduced by the boundary area during backpropagation, because the boundary occupies only a small portion of the whole image. Focusing on these factors, two countermeasures are proposed to mitigate the boundary-blur problem. First, we design a scene understanding module and a scale transform module to build a lightweight fused feature pyramid, which handles low-level feature loss effectively. Second, we propose a boundary-aware depth loss function that attends to the depth values at boundaries. Extensive experiments show that our method predicts depth maps with clearer boundaries, and its depth accuracy on NYU-Depth V2, SUN RGB-D, and iBims-1 is competitive.
Keywords: monocular depth estimation, object boundary, blurry boundary, scene global information, feature fusion, scale transform, boundary aware
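The entry above re-weights the loss toward boundary pixels, which would otherwise barely influence the total because they are so few. A hypothetical minimal version of the idea (the paper's exact formulation is not shown in the abstract) scales a per-pixel L1 term by the local ground-truth depth gradient, shown on 1D rows for brevity:

```python
def boundary_aware_l1(gt, pred, alpha=4.0):
    """Per-pixel L1 loss up-weighted near depth boundaries.

    The weight 1 + alpha * |backward gradient of gt| is a hypothetical
    stand-in for a boundary-aware term: errors at depth discontinuities
    are amplified so the optimizer cannot ignore them.
    """
    total, norm = 0.0, 0.0
    for j in range(len(gt)):
        grad = abs(gt[j] - gt[j - 1]) if j > 0 else 0.0
        w = 1.0 + alpha * grad
        total += w * abs(gt[j] - pred[j])
        norm += w
    return total / norm

gt   = [1.0, 1.0, 5.0, 5.0]   # step boundary between index 1 and 2
pred = [1.0, 1.0, 3.0, 5.0]   # the only error sits on the boundary pixel
plain = sum(abs(g - p) for g, p in zip(gt, pred)) / len(gt)
weighted = boundary_aware_l1(gt, pred)
```

With these toy values the boundary pixel's weight is 1 + 4·4 = 17, so the weighted loss is several times larger than the plain mean, steering training toward sharp edges.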
Boosting Unsupervised Monocular Depth Estimation with Auxiliary Semantic Information
4
Authors: Hui Ren, Nan Gao, Jia Li. 《China Communications》 SCIE CSCD, 2021, No. 6, pp. 228-243 (16 pages)
Learning-based multi-task models have been widely used in various scene understanding tasks, where the tasks complement each other, i.e., prior semantic information helps to better infer depth. We boost unsupervised monocular depth estimation using semantic segmentation as an auxiliary task. To address the lack of cross-domain datasets and the catastrophic forgetting encountered in multi-task training, we use existing methodology to obtain redundant segmentation maps and build a cross-domain dataset, which not only provides a new way to conduct multi-task training but also lets us evaluate results against other algorithms. In addition, to comprehensively use the features extracted by the two tasks in the early perception stage, we share weights in the network to fuse cross-domain features, and introduce a novel multi-task loss function to further smooth the depth values. Extensive experiments on the KITTI and Cityscapes datasets show that our method achieves state-of-the-art performance on depth estimation, as well as improved semantic segmentation.
Keywords: unsupervised monocular depth estimation, semantic segmentation, multi-task model
RADepthNet: Reflectance-aware monocular depth estimation
5
Authors: Chuxuan LI, Ran YI, Saba Ghazanfar ALI, Lizhuang MA, Enhua WU, Jihong WANG, Lijuan MAO, Bin SHENG. 《Virtual Reality & Intelligent Hardware》, 2022, No. 5, pp. 418-431 (14 pages)
Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features, without avoiding the interference of depth-irrelevant information, which degrades depth-estimation accuracy. Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Our method predicts depth maps in three steps: (1) Intrinsic image decomposition. A reflectance extraction module with an encoder-decoder structure extracts the depth-related reflectance; an ablation study demonstrates that the module reduces the influence of illumination on depth estimation. (2) Boundary detection. A boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, predicts depth better at object boundaries by exploiting gradient constraints. (3) Depth prediction. An encoder different from that of step (2) obtains depth features from the reflectance map and fuses boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios. Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
Keywords: monocular depth estimation, deep learning, intrinsic image decomposition
Monocular depth estimation based on deep learning for intraoperative guidance using surface-enhanced Raman scattering imaging
6
Authors: ANIWAT JUHONG, BO LI, YIFAN LIU, CHENG-YOU YAO, CHIA-WEI YANG, A.K.M. ATIQUE ULLAH, KUNLI LIU, RYAN P. LEWANDOWSKI, JACK R. HARKEMA, DALEN W. AGNEW, YU LEO LEI, GARY D. LUKER, XUEFEI HUANG, WIBOOL PIYAWATTANAMETHA, ZHEN QIU. 《Photonics Research》, 2025, No. 2, pp. 550-560 (11 pages)
Imaging of surface-enhanced Raman scattering (SERS) nanoparticles (NPs) has been intensively studied for cancer detection due to its high sensitivity even at low signal-to-noise ratios and its multiplexed detection capability. Furthermore, conjugating SERS NPs with various biomarkers is straightforward, which has enabled numerous successful studies on cancer detection and diagnosis. However, Raman spectroscopy only provides spectral data from an imaging area, without co-registered anatomic context.
Keywords: Raman spectroscopy, cancer detection, surface-enhanced Raman scattering imaging, intraoperative guidance, monocular depth estimation, anatomic context, deep learning, SERS nanoparticles
High Quality Monocular Video Depth Estimation Based on Mask Guided Refinement
7
Authors: Huixiao Pan, Qiang Zhao. 《Journal of Beijing Institute of Technology》, 2025, No. 1, pp. 18-27 (10 pages)
Depth maps play a crucial role in practical applications such as computer vision, augmented reality, and autonomous driving, and obtaining clear and accurate depth information in video depth estimation is a significant challenge in computer vision. Existing monocular video depth estimation models tend to produce blurred or inaccurate depth in regions with object edges and low texture. To address this issue, we propose a monocular depth estimation architecture guided by semantic segmentation masks, which introduces semantic information into the model to correct ambiguous depth regions. Experimental results show that our method improves the accuracy of edge depth, demonstrating the effectiveness of our approach.
Keywords: monocular video depth estimation, depth refinement, edge depth accuracy, semantic segmentation
Monocular depth estimation based on deep learning: An overview (cited 28 times)
8
Authors: ZHAO ChaoQiang, SUN QiYu, ZHANG ChongZhen, TANG Yang, QIAN Feng. 《Science China (Technological Sciences)》 SCIE EI CAS CSCD, 2020, No. 9, pp. 1612-1627 (16 pages)
Depth information is important for autonomous systems to perceive their environment and estimate their own state. Traditional depth estimation methods, such as structure from motion and stereo matching, are built on feature correspondences across multiple viewpoints, and the predicted depth maps are sparse. Inferring depth from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, deep-learning-based monocular depth estimation has been widely studied and has achieved promising accuracy; dense depth maps are estimated from single images in an end-to-end manner. To improve accuracy, various network frameworks, loss functions, and training strategies have been proposed. This review surveys current deep-learning-based monocular depth estimation methods. We first summarize widely used datasets and evaluation indicators, then review representative methods according to their training manner: supervised, unsupervised, and semi-supervised. Finally, we discuss the challenges and offer directions for future research in monocular depth estimation.
Keywords: autonomous systems, monocular depth estimation, deep learning, unsupervised learning
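The survey above mentions widely used evaluation indicators, which also recur across the other entries (RMSE in entry 1, Abs Rel in entry 14). As a minimal sketch not tied to any particular paper, the three most common metrics — absolute relative error (Abs Rel), root mean square error (RMSE), and the δ < 1.25 threshold accuracy — computed over toy, hypothetical depth values:

```python
import math

def depth_metrics(gt, pred):
    """Standard monocular depth metrics over flat lists of depths."""
    assert len(gt) == len(pred) and len(gt) > 0
    n = len(gt)
    # Abs Rel: mean of |gt - pred| / gt
    abs_rel = sum(abs(g - p) / g for g, p in zip(gt, pred)) / n
    # RMSE: root of the mean squared error
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in zip(gt, pred)) / n)
    # delta accuracy: fraction of pixels with max(gt/pred, pred/gt) < 1.25
    delta1 = sum(max(g / p, p / g) < 1.25 for g, p in zip(gt, pred)) / n
    return abs_rel, rmse, delta1

gt   = [10.0, 20.0, 30.0, 40.0]
pred = [11.0, 19.0, 33.0, 80.0]   # last pixel is badly wrong
abs_rel, rmse, delta1 = depth_metrics(gt, pred)
```

Note how a single outlier dominates RMSE (squared error) far more than Abs Rel, which is why papers typically report both alongside the δ thresholds.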
DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation (cited 10 times)
9
Authors: Zhenyu Li, Zehui Chen, Xianming Liu, Junjun Jiang. 《Machine Intelligence Research》 EI CSCD, 2023, No. 6, pp. 837-854 (18 pages)
This paper addresses supervised monocular depth estimation. We start with a meticulous pilot study demonstrating that long-range correlation is essential for accurate depth estimation, and that the Transformer and convolution are good at long-range and close-range depth estimation, respectively. We therefore adopt a parallel encoder consisting of a Transformer branch and a convolution branch: the former models global context through an effective attention mechanism, while the latter preserves local information, since the Transformer lacks the spatial inductive bias to model such content. However, independent branches leave features poorly connected. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module that enhances the Transformer features and models the affinity between heterogeneous features in a set-to-set translation manner. Because global attention on high-resolution feature maps incurs unbearable memory cost, we adopt a deformable scheme to reduce complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods by prominent margins. The effectiveness of each proposed module is evaluated through meticulous and intensive ablation studies.
Keywords: autonomous driving, 3D reconstruction, monocular depth estimation, Transformer, convolution
ArthroNet: a monocular depth estimation technique with 3D segmented maps for knee arthroscopy (cited 1 time)
10
Authors: Shahnewaz Ali, Ajay K. Pandey. 《Intelligent Medicine》 CSCD, 2023, No. 2, pp. 129-138 (10 pages)
Background: Lack of depth perception is one of the long-standing technological limitations of medical imaging systems used in minimally invasive surgery. Visualizing anatomical structures in 3D can improve conventional arthroscopic surgery, as a full 3D semantic representation of the surgical site directly aids the surgeon, and it opens the possibility of intraoperative image registration with preoperative clinical records for semi-autonomous and fully autonomous platforms. This study presents a novel monocular depth prediction model that infers depth maps from a single color arthroscopic video frame. Methods: We applied a novel technique that combines supervised and self-supervised loss terms, eliminating the drawbacks of each and enabling the estimation of edge-preserving depth maps from a single untextured arthroscopic frame. The proposed image acquisition technique projects artificial textures onto the surface to improve the quality of disparity maps from stereo images. Moreover, by integrating attention-aware multi-scale feature extraction with scene-level global contextual constraints and multi-scale depth fusion, the model predicts reliable, accurate tissue depth that complies with scene geometry. Results: A total of 4,128 stereo frames from a knee phantom were used to pre-train the network, which learned disparity maps from the stereo images. The fine-tuning phase used 12,695 knee arthroscopic stereo frames from cadaver experiments along with their corresponding coarse disparity maps obtained by stereo matching. In a supervised fashion, the network learns the transformation from left image to disparity map, while the self-supervised loss term refines the coarse depth map by minimizing reprojection, gradient, and structural-dissimilarity losses. Together, our method produces high-quality 3D maps with minimal reprojection loss: 0.0004132 (structural similarity index), 0.00036120156 (L1 error distance), and 6.591908×10^(-5) (L1 gradient error distance). Conclusion: Machine learning techniques for monocular depth prediction are studied to infer accurate depth maps from a single color arthroscopic video frame. The study also integrates a segmentation model, so 3D segmented maps are inferred, providing extended perception and tissue awareness.
Keywords: monocular depth estimation, 3D segmented maps, knee arthroscopy
Self-Supervised Monocular Depth Estimation by Digging into Uncertainty Quantification
11
Authors: 李远珍, 郑圣杰, 谭梓欣, 曹拓, 罗飞, 肖春霞. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2023, No. 3, pp. 510-525 (16 pages)
Based on well-designed network architectures and objective functions, self-supervised monocular depth estimation has made great progress. However, lacking a specific mechanism to make the network learn more about regions containing moving objects or occlusions, existing depth estimation methods often produce poor results there. We therefore propose an uncertainty quantification method that improves the performance of existing depth estimation networks without changing their architectures. It consists of uncertainty measurement, uncertainty-guided learning, and ultimate adaptive determination. First, with Snapshot and Siam learning strategies, we measure the degree of uncertainty by computing the variance of pre-converged epochs or twins during training. Second, we use the uncertainty to guide the network to strengthen learning in regions of higher uncertainty. Finally, we use the uncertainty to adaptively produce the final depth estimates, balancing accuracy and robustness. To demonstrate effectiveness, we apply the method to two state-of-the-art models, Monodepth2 and Hints. Experimental results show improved depth estimation on seven evaluation metrics compared with the two baselines, exceeding the existing uncertainty method.
Keywords: self-supervised monocular depth estimation, uncertainty quantification, variance
Self-supervised coarse-to-fine monocular depth estimation using a lightweight attention module
12
Authors: Yuanzhen Li, Fei Luo, Chunxia Xiao. 《Computational Visual Media》 SCIE EI CSCD, 2022, No. 4, pp. 631-647 (17 pages)
Self-supervised monocular depth estimation has been widely investigated and applied. However, existing methods suffer from texture copy, depth drift, and incomplete structure. It is difficult for ordinary CNNs to fully understand the relationship between an object and its surrounding environment, and hard to design a depth smoothness loss that balances smoothness against sharpness. To address these issues, we propose a coarse-to-fine method with a normalized convolutional block attention module (NCBAM). In the coarse estimation stage, we incorporate the NCBAM into the depth and pose networks to overcome the texture-copy and depth-drift problems. In the refinement stage, a new network refines the coarse depth, guided by the color image, to produce a structure-preserving depth result. Our method produces results competitive with state-of-the-art methods, and comprehensive experiments prove the effectiveness of the two-stage design using the NCBAM.
Keywords: monocular depth estimation, texture copy, depth drift, attention module
TalentDepth: A Monocular Depth Estimation Model for Complex Weather Scenes Based on Multi-Scale Attention
13
Authors: 张航, 卫守林, 殷继彬. 《计算机科学》 (Computer Science), PKU Core, 2025, Supplement S1, pp. 442-448 (7 pages)
Complex weather scenes suffer from image blur, low contrast, and color distortion, which make depth prediction inaccurate. Previous studies have estimated depth in such scenes using depth maps of standard scenes as prior information, but these priors are of low accuracy. To address this, a monocular depth estimation model based on a multi-scale attention mechanism, TalentDepth, is proposed for complex weather scenes. First, a multi-scale attention mechanism is fused into the encoder, reducing computational cost while preserving per-channel information and improving the efficiency and capability of feature extraction. Second, to handle unclear image depth, a Depth Region Refinement (DSR) module based on geometric consistency filters out inaccurate pixels to improve the reliability of depth information. Finally, complex samples generated by an image-translation model are fed in, and standard losses computed on the corresponding original images guide the model's self-supervised training. On the nuScenes, KITTI, and KITTI-C datasets, the proposed model improves both error and accuracy over the baseline.
Keywords: monocular depth estimation, self-supervised learning, multi-scale attention, knowledge distillation, deep learning
EDepth: A Lightweight Depth Estimation Method for Low-Cost Marine Robots
14
Authors: 陈东烁, 柴春来, 叶航, 张思赟. 《计算机应用》 (Journal of Computer Applications), PKU Core, 2025, Supplement S1, pp. 106-113 (8 pages)
To address the low accuracy, poor robustness, slow runtime, and deployment difficulty of traditional monocular depth estimation methods in marine environments, a lightweight depth estimation method for marine robots, named EDepth (EfficientDepth), is proposed to improve the three-dimensional (3D) perception of low-cost marine robots. First, exploiting the underwater light-attenuation prior, the input is mapped by a spatial transformation from the original RGB (Red-Green-Blue) image space to an RBI (Red-Blue-Intensity) input domain to improve depth-estimation accuracy. Second, the efficient EfficientFormerV2 is used as the feature-extraction module, combined with a MiniViT (Mini Vision Transformer) attention mechanism and a light-attenuation module to extract and process depth information effectively; through adaptive binning, the MiniViT module dynamically adjusts the depth intervals to improve precision. Finally, the network structure is optimized to achieve efficient computation without sacrificing performance. Experiments on the RGB-D (Red-Green-Blue Depth) dataset USOD10K show that EDepth's depth estimation significantly outperforms traditional methods: EDepth reaches a mean absolute relative error (Abs Rel) of 0.587 versus 0.519 for DenseDepth, and although DenseDepth performs better on some metrics, EDepth needs only 4.61 million parameters against DenseDepth's 44.61 million (an 89.67% reduction) and 23.56 MB of memory instead of 171.44 MB, while reaching 14.11 frames per second (FPS) on a single CPU, clearly better than DenseDepth's 2.45. EDepth thus strikes a good balance between depth-estimation performance and computational efficiency.
Keywords: 3D perception, adaptive binning, computational efficiency, EfficientFormerV2, marine robot, monocular depth estimation
LpDepth: Self-Supervised Monocular Depth Estimation Based on the Laplacian Pyramid
15
Authors: 曹明伟, 邢景杰, 程宜风, 赵海锋. 《计算机科学》 (Computer Science), PKU Core, 2025, No. 3, pp. 33-40 (8 pages)
Self-supervised monocular depth estimation has attracted wide attention from researchers at home and abroad. Existing deep-learning-based methods mainly adopt an encoder-decoder structure, but the downsampling applied to the input image during encoding loses part of the image information, especially boundary information, which degrades depth-map accuracy. To address this, a self-supervised monocular depth estimation method based on the Laplacian pyramid (LpDepth) is proposed. Its core ideas are: first, Laplacian residual maps enrich the encoded features, compensating for the feature information lost during downsampling; second, max-pooling layers used during downsampling highlight and amplify feature information, making it easier for the encoder to extract the features the model needs; finally, residual modules mitigate overfitting and improve the decoder's utilization of features. The method is evaluated on the KITTI and Make3D datasets and compared against existing classic methods; experimental results demonstrate its effectiveness.
Keywords: monocular depth estimation, Laplacian pyramid, residual network, depth map
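The LpDepth entry uses Laplacian residual maps to re-inject the detail lost by downsampling. A minimal 1D sketch of the underlying pyramid idea (illustrative only, not the paper's implementation): each level stores the residual between a signal and its downsample-then-upsample approximation, so the boundary detail lives in the residuals and the pyramid can be inverted exactly:

```python
def build_laplacian_pyramid(signal, levels):
    """1D Laplacian pyramid: each level keeps the residual between the
    signal and its downsample-then-upsample approximation, i.e. the
    detail (edge) information that downsampling throws away.
    Signal length must be divisible by 2**levels.
    """
    pyramid, cur = [], signal
    for _ in range(levels):
        down = [(cur[i] + cur[i + 1]) / 2.0 for i in range(0, len(cur), 2)]
        up = [v for v in down for _ in (0, 1)]        # nearest-neighbour upsample
        pyramid.append([c - u for c, u in zip(cur, up)])  # residual level
        cur = down
    pyramid.append(cur)                               # coarsest approximation
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid: upsample the coarse level, add residuals back."""
    cur = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        up = [v for v in cur for _ in (0, 1)]
        cur = [u + r for u, r in zip(up, residual)]
    return cur

sig = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
pyr = build_laplacian_pyramid(sig, 2)
rec = reconstruct(pyr)
```

Feeding such residuals to the encoder is one way to hand it the high-frequency boundary content that plain downsampling discards.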
DepthMamba: Monocular Depth Estimation with a Multi-Scale VisionMamba Architecture
16
Authors: 徐志斌, 张孙杰. 《计算机应用研究》 (Application Research of Computers), PKU Core, 2025, No. 3, pp. 944-948 (5 pages)
In monocular depth estimation, CNN-based and Transformer-based models have been widely studied, but CNNs extract global features poorly while Transformers carry quadratic computational complexity. To overcome these limitations, an end-to-end monocular depth estimation model named DepthMamba is proposed, which captures global information efficiently while reducing the computational burden. Specifically, a Visual State Space (VSS) module is introduced to build the encoder-decoder architecture, improving the model's ability to extract multi-scale and global information. In addition, an MLPBins depth-prediction module is designed to make the depth maps smoother and cleaner. Comprehensive experiments on the indoor NYU-Depth V2 and outdoor KITTI datasets show that, compared with the vision-Transformer-based Depthformer, the network has 27.75% fewer parameters and reduces RMSE by 6.09% and 2.63% respectively, verifying the method's efficiency and superiority.
Keywords: monocular depth estimation, Vmamba, bins depth prediction, state space model
Monocular Vision-based Two-stage Iterative Algorithm for Relative Position and Attitude Estimation of Docking Spacecraft (cited 7 times)
17
Authors: 张世杰, 刘峰华, 曹喜滨, 贺亮. 《Chinese Journal of Aeronautics》 SCIE EI CAS CSCD, 2010, No. 2, pp. 204-210 (7 pages)
Visual sensors measure the relative state of a chaser spacecraft with respect to the target spacecraft during close-range rendezvous phases. This article proposes a two-stage iterative algorithm based on an inverse projection ray approach to estimate relative position and attitude from feature points and monocular vision. It consists of two stages: absolute orientation and depth recovery. In the first stage, Umeyama's algorithm fits the three-dimensional (3D) model set and estimates the 3D point set; in the second stage, the depths of the observed feature points are estimated. This procedure is repeated until the result converges. The effectiveness and convergence of the proposed algorithm are verified through theoretical analysis and mathematical simulation.
Keywords: spacecraft, relative position and attitude, monocular vision, depth recovery, absolute orientation
Unsupervised Monocular Image Depth Prediction for Soccer Scenes Based on Improved FeatDepth (cited 1 time)
18
Authors: 傅荟璇, 徐权文, 王宇超. 《实验技术与管理》 (Experimental Technology and Management), CAS, PKU Core, 2024, No. 10, pp. 74-84 (11 pages)
To improve the accuracy of image depth prediction while reducing cost, and to apply depth estimation to soccer scenes, an unsupervised monocular image depth prediction method based on an improved FeatDepth is proposed. First, an attention mechanism is introduced into the original FeatDepth so the model focuses on effective feature information: the GAM global attention module is embedded into FeatDepth's PoseNet and DepthNet networks, adding extra contextual information and improving depth-prediction performance at essentially no additional computational cost. Next, to obtain better depth predictions in low-texture regions and at fine details, the final loss function combines a single-view reconstruction loss with a cross-view reconstruction loss. A dataset was built from the person-rich portions of the KITTI dataset for simulation experiments; the results show that the improved FeatDepth model not only improves accuracy but also predicts depth better in low-texture regions and at details. Finally, comparing inference on soccer scenes shows that the improved model predicts depth better in low-texture regions (the ball, the goal, etc.) and at details (limbs, etc.), achieving the goal of applying an unsupervised monocular depth estimation model to soccer scenes.
Keywords: soccer scenes, unsupervised monocular depth estimation, FeatDepth, attention mechanism, GAM, image reconstruction
Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
19
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. 《Earthquake Engineering and Engineering Vibration》 SCIE EI CSCD, 2023, No. 1, pp. 55-68 (14 pages)
Recent advances in computer vision and deep learning have shown that fusing depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, depth sensing also presents practical challenges: depth sensors impose an additional payload burden on robotic inspection platforms, limiting operation time and increasing inspection cost, and some lidar-based depth sensors perform poorly outdoors due to sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed based on recent advances in vision-based multi-modal sensing, namely modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At deployment, depth data becomes expendable, since it can be simulated from the corresponding RGB frames, making it possible to reap the benefits of depth fusion without any depth perception per se. The study explores two depth-encoding techniques and three fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. The surrogate techniques increased the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study contributes to enhancing the resilience of critical civil infrastructure.
Keywords: multimodal data fusion, depth sensing, vision-based inspection, UAV-assisted inspection, damage segmentation, post-disaster reconnaissance, modality hallucination, monocular depth estimation
Plant Height Measurement of Seedling-Stage Maize Based on Shuffle-ZoeDepth Monocular Depth Estimation (cited 4 times)
20
Authors: 赵永杰, 蒲六如, 宋磊, 刘佳辉, 宋怀波. 《农业机械学报》 (Transactions of the Chinese Society for Agricultural Machinery), EI CAS CSCD, PKU Core, 2024, No. 5, pp. 235-243, 253 (10 pages)
Plant height is an important phenotypic indicator for identifying maize germplasm traits and crop vigor, and maize expresses its genetic characteristics clearly at the seedling stage, so accurately measuring seedling plant height matters for both trait identification and field management. Traditional plant-height measurement relies on manual work, which is time-consuming, laborious, and subject to subjective error. To address this, an improved ZoeDepth monocular depth estimation model that fuses hybrid attention information is proposed. The improved model adds a Shuffle Attention module to each of the four stages of the decoder, so the decoder focuses on the effective information in low-resolution feature maps, strengthening key-information extraction and producing more accurate depth maps. Validated on the NYU-V2 depth dataset, the improved Shuffle-ZoeDepth model achieves an absolute relative error of 0.083, an RMSE of 0.301 m, and a log RMSE of 0.036, with threshold accuracies of 93.9%, 99.1%, and 99.8%, all better than ZoeDepth. Combining Shuffle-ZoeDepth with a plant-height measurement model, seedling maize height was then measured from images captured at different distances. For plants in the 15-25 cm, 25-35 cm, and 35-45 cm ranges, the mean absolute errors were 1.41 cm, 2.21 cm, and 2.08 cm, and the mean percentage errors 8.41%, 7.54%, and 4.98%, respectively. The results show that the method can accurately measure seedling-stage maize plant height in complex outdoor environments using only a single RGB camera.
Keywords: seedling-stage maize, plant height, monocular depth estimation, measurement method, hybrid attention mechanism