Journal Articles — 1 result found
Multimodal Pretrained Knowledge for Real-world Object Navigation
Authors: Hui Yuan, Yan Huang, Naigong Yu, Dongbo Zhang, Zetao Du, Ziqi Liu, Kun Zhang. 《Machine Intelligence Research》, 2025, Issue 4, pp. 713–729 (17 pages)
Most visual-language navigation (VLN) research focuses on simulated environments, but applying these methods to real-world scenarios is challenging because of misalignments between vision and language in complex environments, leading to path deviations. To address this, we propose a novel vision-and-language object navigation strategy that uses multimodal pretrained knowledge as a cross-modal bridge to link semantic concepts in both images and text. This improves navigation supervision at key-points and enhances robustness. Specifically, we 1) randomly generate key-points within a specific density range and optimize them on the basis of challenging locations; 2) use pretrained multimodal knowledge to efficiently retrieve target objects; 3) combine depth information with simultaneous localization and mapping (SLAM) map data to predict optimal positions and orientations for accurate navigation; and 4) implement the method on a physical robot, successfully conducting navigation tests. Our approach achieves a maximum success rate of 66.7%, outperforming existing VLN methods in real-world environments.
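Step 2 of the abstract (retrieving target objects with pretrained multimodal knowledge) typically relies on CLIP-style joint embeddings, where a text query and candidate image crops are scored by cosine similarity. The sketch below is a minimal, hypothetical illustration of that matching step only; the embeddings are toy vectors standing in for real model outputs, and the function names are our own, not from the paper.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve_target(text_emb, crop_embs):
    """Return the index of the image crop whose embedding
    best matches the text query (highest cosine similarity)."""
    scores = [cosine(text_emb, e) for e in crop_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: a text query embedding vs. three candidate crop embeddings.
# In the real pipeline these would come from a pretrained vision-language model.
query = [0.9, 0.1, 0.2]
crops = [
    [0.1, 0.9, 0.0],    # unrelated object
    [0.85, 0.15, 0.25], # near-parallel to the query: the target
    [0.0, 0.2, 0.95],   # unrelated object
]
print(retrieve_target(query, crops))  # → 1
```

In a full navigation stack, the selected crop's depth pixels would then be back-projected into the SLAM map to produce the goal position and orientation described in step 3.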
Keywords: visual-and-language object navigation; key-points; multimodal pretrained knowledge; optimal positions and orientations; physical robot