The autonomous exploration and mapping of an unknown environment is useful in a wide range of applications and thus holds great significance. Existing methods mostly use range sensors to generate two-dimensional (2D) grid maps. Red/green/blue-depth (RGB-D) sensors provide both color and depth information on the environment, thereby enabling the generation of a three-dimensional (3D) point cloud map that is intuitive for human perception. In this paper, we present a systematic approach with dual RGB-D sensors to achieve the autonomous exploration and mapping of an unknown indoor environment. With the synchronized and processed RGB-D data, location points were generated, and a 3D point cloud map and a 2D grid map were incrementally built. Next, the exploration was modeled as a partially observable Markov decision process. Partial map simulation and global frontier search methods were combined for autonomous exploration, and dynamic action constraints were utilized in motion control. In this way, local optima can be avoided and exploration efficacy can be ensured. Experiments with single connected and multi-branched regions demonstrated the high robustness, efficiency, and superiority of the developed system and methods.
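The abstract does not spell out how frontiers are located on the 2D grid map, so the following is only a minimal sketch of the general idea of frontier search: a frontier cell is taken to be a free cell adjacent to at least one unknown cell. The cell labels (`FREE`, `OCCUPIED`, `UNKNOWN`), the function names, and the Manhattan-distance selection rule are all assumptions made for illustration; they are not the paper's method, which additionally combines partial map simulation and a partially observable Markov decision process.

```python
from typing import List, Optional, Tuple

# Hypothetical cell labels for an occupancy grid (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1

Cell = Tuple[int, int]


def find_frontiers(grid: List[List[int]]) -> List[Cell]:
    """Return free cells that touch at least one unknown cell (4-connectivity)."""
    rows, cols = len(grid), len(grid[0])
    frontiers: List[Cell] = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbor is enough
    return frontiers


def nearest_frontier(grid: List[List[int]], robot: Cell) -> Optional[Cell]:
    """Pick the frontier closest to the robot by Manhattan distance.

    Manhattan distance is a stand-in for a real path cost; it ignores obstacles.
    Returns None when the map has no frontier left, i.e. exploration is done.
    """
    candidates = find_frontiers(grid)
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda cell: abs(cell[0] - robot[0]) + abs(cell[1] - robot[1]),
    )
```

Searching the whole map for frontiers, rather than only the robot's neighborhood, is what allows an explorer to leave a fully mapped dead end, which is one plausible reading of how a global search helps avoid local optima.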
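To address the ineffective fusion of color and depth information and the insufficient extraction of multi-scale contextual information in RGB-D (Red Green Blue Depth) semantic segmentation, this paper proposes an RGB-D semantic segmentation method based on a dual-stream aggregation Transformer. Transformers extract multi-level features from the full-color image and the depth image. A channel-attention cross-fusion module and a depth-enhanced RGB operation compensate for the modality gap between the features at each level, completing the fusion of the two modalities. A multi-level aggregation decoder module integrates multi-level, multi-scale contextual features, which reduces information loss during propagation and yields more accurate and more complete semantic segmentation. Experimental results show that on the NYU-Dv2 dataset the proposed method achieves a mean Intersection over Union (mIoU) of 52.9%, a pixel accuracy of 78.0%, and a mean pixel accuracy of 66.0%. On the Cityscapes dataset, the proposed method reaches an mIoU of 79.8% with low-resolution input images.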
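For readers unfamiliar with the three reported metrics, the sketch below shows how mIoU, pixel accuracy, and mean pixel accuracy are commonly derived from a confusion matrix. It uses the standard definitions; the function names and the flat-list input format are assumptions for illustration, and the paper's own evaluation code may handle details such as ignored labels differently.

```python
from typing import Dict, List, Sequence


def confusion_matrix(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count pixels: cm[t][p] is the number of pixels of true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute pixel accuracy, mean pixel accuracy, and mIoU.

    Assumes at least one labeled pixel. Classes that never occur in the
    ground truth are skipped in the per-class averages.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious: List[float] = []
    accs: List[float] = []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                           # pixels whose true class is i
        predicted = sum(cm[j][i] for j in range(n))  # pixels predicted as i
        union = gt + predicted - tp
        if gt > 0:
            accs.append(tp / gt)
            ious.append(tp / union)
    return {
        "pixel_acc": correct / total,
        "mean_pixel_acc": sum(accs) / len(accs),
        "miou": sum(ious) / len(ious),
    }
```

With flattened label maps `target = [0, 0, 1, 1]` and `pred = [0, 1, 1, 1]`, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mIoU is 7/12, while the pixel accuracy is 3/4.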
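Objective: Person re-identification is the task of matching pedestrians across images or videos captured by one or more cameras, and it is widely used in image retrieval, intelligent security, and related fields. Depending on the camera type and viewing angle, person re-identification algorithms fall mainly into two groups: those based on side-view color cameras and those based on top-view depth cameras. In side-view color camera scenes, most of the appearance information of a pedestrian's body is visible, whereas in top-view depth camera scenes only the structural information of the head and shoulders is visible. Most existing algorithms target side-view color camera scenes, and only a few can be applied directly to top-view depth camera scenes, especially low-resolution ones such as video captured by time of flight (TOF) cameras mounted on buses. For top-view depth camera scenes, this paper therefore proposes a person re-identification algorithm based on top-view depth head-shoulder sequences, with the aim of improving re-identification accuracy in low-resolution scenes. Method: Head region detection and Kalman filter tracking are applied to the top-view depth head-shoulder sequence to obtain each pedestrian's head image sequence. A head depth energy map group (HeDEMaG) is constructed, from which depth features, area features, projection features, Fourier descriptors, and histogram of oriented gradient (HOG) features are extracted. The similarity between pedestrians is computed for each feature of the head depth energy map group, and the per-feature similarities are then fused by a weighted sum whose weight coefficients are obtained through model learning, giving a total similarity score. The pedestrian label with the highest similarity is taken as the recognition result, which completes the re-identification. Results: The algorithm was tested on the public indoor single-person TVPR (top view person re-identification) dataset, the self-built indoor multi-person TDPI-L (top-view depth based person identification for laboratory scenarios) dataset, and the real-world bus TDPI-B (top-view depth based person identification for bus scenarios) dataset. Five metrics were used to measure performance: rank-1 matching rate, rank-5 matching rate, macro-F1, the cumulative match characteristic (CMC) curve, and average running time. Rank-1, rank-5, and macro-F1 exceed 61%, 68%, and 67%, respectively, an improvement of at least 11% over typical algorithms. Conclusion: This paper constructs a head depth energy map group that expresses the structural and behavioral characteristics of pedestrians, achieving a multi-feature representation suited to low-resolution pedestrians, and proposes similarity fusion based on weight learning, which improves recognition accuracy. Good results were obtained on the indoor single-person, indoor multi-person, and real-world bus datasets.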
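The weighted fusion step described in the method can be illustrated with a small sketch. The abstract only states that per-feature similarities are combined with learned weights and that the label with the highest total score is selected; the feature keys, the weight values, and the function names below are hypothetical, and the learning of the weights is not shown.

```python
from typing import Dict, Tuple


def fuse_similarity(sims: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-feature similarities (every weighted feature must be present)."""
    return sum(weights[name] * sims[name] for name in weights)


def identify(
    gallery_sims: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> Tuple[str, Dict[str, float]]:
    """Return the gallery label with the highest fused score, plus all scores.

    gallery_sims maps each candidate label to the per-feature similarities
    between that candidate and the query pedestrian.
    """
    scores = {
        label: fuse_similarity(sims, weights)
        for label, sims in gallery_sims.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical weights; in the paper these come from model learning.
weights = {"depth": 0.5, "area": 0.25, "hog": 0.25}
gallery = {
    "person_A": {"depth": 0.5, "area": 1.0, "hog": 0.5},
    "person_B": {"depth": 1.0, "area": 0.5, "hog": 0.5},
}
label, scores = identify(gallery, weights)
```

Here `person_B` is selected because its fused score (0.75) exceeds that of `person_A` (0.625), even though `person_A` has the higher area similarity: the larger weight on the depth feature dominates.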
Funding: This work was financially supported by the National Natural Science Foundation of China (61720106012 and 61403215), the Foundation of State Key Laboratory of Robotics (2006-003), and the Fundamental Research Funds for the Central Universities.