Distributed Knowledge Encoding for Scene Understanding and Reasoning in Robotics
Abstract  The ability to understand and reason about complex scenes is a key indicator of robot intelligence. However, existing methods focus mainly on question-answering tasks in static environments and struggle with the multi-step reasoning required in dynamic scenes. This paper proposes a robot scene understanding and reasoning method based on distributed knowledge encoding. A scene knowledge graph with a two-layer concept-attribute structure is constructed to enhance the robot's scene understanding. Entity concepts are represented as attribute sets through distributed embedding, which makes the semantic representation compact and computable. In addition, a classifier is designed to infer, from the initial scene and the target scene, the sequence of robot operations required to achieve the scene transformation. The proposed distributed knowledge encoding method exhibits good interpretability: it decomposes a multi-step scene transformation into a sequence of executable single-step operations, so that reasoning complexity grows linearly with the number of transformation steps. Experiments on the public Trance dataset show that, compared with GPT-4o, the proposed method achieves better sensitivity in recognizing scene transformations and better multi-step reasoning performance. The feasibility and robustness of the method in real-world scenarios are further validated on a real robot platform.
Authors  WANG Hai-tao (王海涛); ZHANG Shao-lin (张少林); JIANG Tian-yu (蒋天雨); GE Yue-guang (葛悦光); CUI Shao-wei (崔少伟); WANG Shuo (王硕) (Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China)
Source  Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2026, No. 9, pp. 3634-3641 (8 pages)
Funding  National Key Research and Development Program of China (2024YFB3411104); National Natural Science Foundation of China (62273342, U24A20281).
Keywords  scene understanding; knowledge graph; distributed knowledge encoding; action sequence prediction
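
The abstract describes decomposing a multi-step scene transformation into single-step operations, with cost growing linearly in the number of steps. The following is a minimal sketch of that idea only, not the authors' method: it replaces the learned classifier and distributed embeddings with a plain attribute comparison. The scene schema, the attribute names (`color`, `size`, `position`), and the functions `infer_steps` and `apply_steps` are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: a scene maps an object id to its attribute dictionary.
Scene = Dict[str, Dict[str, str]]
# One single-step operation: (object id, attribute, new value).
Step = Tuple[str, str, str]


def infer_steps(initial: Scene, target: Scene) -> List[Step]:
    """Return one single-step operation per attribute that differs.

    A rule-based stand-in for the paper's classifier: the work is one pass
    over the attributes, and the output length equals the number of changes.
    """
    steps: List[Step] = []
    for obj_id in sorted(initial):
        before = initial[obj_id]
        after = target.get(obj_id, before)
        for attr in sorted(before):
            if after.get(attr, before[attr]) != before[attr]:
                steps.append((obj_id, attr, after[attr]))
    return steps


def apply_steps(scene: Scene, steps: List[Step]) -> Scene:
    """Apply single-step operations in order and return a new scene."""
    result = {obj_id: dict(attrs) for obj_id, attrs in scene.items()}
    for obj_id, attr, value in steps:
        result[obj_id][attr] = value
    return result


# Illustrative data only; these objects and values are not from the paper.
initial = {
    "obj0": {"color": "red", "size": "small", "position": "left"},
    "obj1": {"color": "blue", "size": "large", "position": "right"},
}
target = {
    "obj0": {"color": "green", "size": "small", "position": "left"},
    "obj1": {"color": "blue", "size": "large", "position": "center"},
}

steps = infer_steps(initial, target)
print(steps)
```

Replaying the inferred steps with `apply_steps(initial, steps)` reproduces `target`, which is the sense in which each step is independently executable. Representing entities as explicit attribute sets is also what makes each inferred step readable by a person.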