Stable local feature detection is a fundamental component of many stereo vision problems such as 3-D reconstruction, object localization, and object tracking. A robust method for extracting scale-invariant feature points is presented. First, Harris corners are extracted in a three-level pyramid. Then, the points detected at the highest level of the pyramid are propagated down to the lower levels by the pyramid-based scale-invariant (PBSI) method. Corners detected repeatedly across levels are kept as the final feature points. Finally, the characteristic scale of each point is obtained with a maximum-entropy criterion. Experimental results show that the algorithm has low computational cost, strong noise robustness, and excellent performance in the presence of significant scale changes.
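The pipeline above (pyramid Harris detection, cross-level propagation, repeated-detection filtering) can be sketched in NumPy. This is a simplified illustration, not the authors' PBSI implementation: the box filter standing in for a Gaussian window, the threshold ratio, and the matching tolerance are all assumed values.

```python
import numpy as np

def box_blur(a, r=2):
    """Separable box filter of width 2r+1 (a cheap stand-in for a Gaussian window)."""
    k = np.ones(2 * r + 1) / (2 * r + 1)
    a = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, a)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, a)

def harris(img, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 from the structure tensor M."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx, Syy, Sxy = box_blur(Ix * Ix), box_blur(Iy * Iy), box_blur(Ix * Iy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2

def pyramid_corners(img, levels=3, thresh_ratio=0.1):
    """Keep finest-level Harris corners that reappear at every coarser pyramid level."""
    imgs = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        imgs.append(box_blur(imgs[-1], r=1)[::2, ::2])   # blur, then 2x downsample
    pts = []
    for im in imgs:
        R = harris(im)
        ys, xs = np.where(R > thresh_ratio * R.max())
        pts.append(set(zip(ys.tolist(), xs.tolist())))
    stable = pts[0]
    for lvl in range(1, levels):
        s = 2 ** lvl   # scale factor mapping level-lvl coordinates back to the finest level
        stable = {p for p in stable
                  if any(abs(p[0] - s * q[0]) <= s and abs(p[1] - s * q[1]) <= s
                         for q in pts[lvl])}
    return stable
```

On a synthetic bright square the surviving points cluster near the square's true corners, while edge pixels (negative R) and flat regions (near-zero R) are rejected before propagation.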
We derive conditions for the existence and uniqueness of the solution of the two-scale integral equation, together with the form of that solution, following the method of construction of the dyadic scaling function. We then give the construction of the dyadic wavelet and a necessary and sufficient condition for it. As an application, we develop a pyramid algorithm for the dyadic wavelet decomposition.
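A pyramid-style dyadic wavelet decomposition can be illustrated with the classic à trous scheme, in which each detail plane is the difference of successive smoothings and summing all planes reconstructs the signal exactly. This is a standard construction, not the specific wavelet derived in the paper; the B3-spline kernel and circular boundary handling are assumptions.

```python
import numpy as np

def atrous_smooth(c, j):
    """Convolve with the B3-spline kernel [1,4,6,4,1]/16 dilated by 2**j (zero 'holes'),
    using circular (wrap) boundary handling."""
    base = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    step = 2 ** j
    k = np.zeros((len(base) - 1) * step + 1)
    k[::step] = base
    half = len(k) // 2
    padded = np.pad(c, half, mode="wrap")
    return np.convolve(padded, k, mode="same")[half:half + len(c)]

def dyadic_pyramid(signal, levels=3):
    """À trous dyadic wavelet decomposition: detail planes plus a final smooth plane.
    Reconstruction is simply the sum of all returned planes."""
    c = np.asarray(signal, dtype=float)
    planes = []
    for j in range(levels):
        c_next = atrous_smooth(c, j)
        planes.append(c - c_next)   # wavelet (detail) plane at scale 2**j
        c = c_next
    planes.append(c)                # residual smooth plane
    return planes
```

Because the planes telescope (each detail is a difference of consecutive smoothings), perfect reconstruction by summation holds regardless of the kernel chosen.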
Objective: Image stitching integrates visible-light data captured from different viewpoints into a wide-angle composite. Adverse weather degrades the captured visible-light data and leads to poor stitching results. Infrared sensors image by thermal radiation, so they can highlight targets even under unfavorable conditions and overcome environmental and human interference. Method: Exploiting the imaging complementarity of infrared and visible-light sensors, this paper proposes an image stitching algorithm based on feature fusion of multimodal (infrared and visible-light) data. First, offsets are estimated coarse-to-fine from the accurate structural features of the infrared data and the rich texture details of the visible-light data, and the warping matrix is obtained by a non-parametric direct linear transform. The stitched infrared and visible-light data are then fused to enrich the scene-perception information. Results: The test data comprise a real dataset of 530 stitchable multimodal image pairs and a synthetic dataset of 200 pairs. Three recent fusion methods, RFN (residual fusion network), ReCoNet (recurrent correction network), and DATFuse (dual attention transformer), were combined with seven stitching methods, APAP (as projective as possible), SPW (single-perspective warps), WPIS (wide parallax image stitching), SLAS (seam-guided local alignment and stitching), VFIS (view-free image stitching), RSFI (reconstructing stitched features to images), and UDIS++ (unsupervised deep image stitching), giving 21 fusion-stitching strategies for qualitative and quantitative comparison. In stitching performance, the proposed method achieves accurate cross-view scene alignment, reducing the average corner error by 53% and avoiding ghosting. In integrating complementary multimodal information, it adaptively balances the structural information of the infrared images with the rich texture details of the visible-light images, improving information entropy by 24.6% over the DATFuse-UDIS++ strategy. Conclusion: Building on the complementary strengths of infrared and visible-light imaging, the proposed method generates wide-view scenes more accurately through multi-scale recursive estimation and is more robust than conventional visible-light image stitching.
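The non-parametric direct linear transform used in the Method section to obtain the warping matrix is the standard DLT for homography estimation, which can be sketched as follows. The point sets in the test, and the omission of Hartley coordinate normalization (advisable for noisy correspondences), are illustrative choices, not details from the paper.

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H with dst ~ H @ src (homogeneous coordinates) by the direct linear
    transform: stack two linear equations per correspondence, then take the
    right-singular vector of the smallest singular value as the flattened H."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the projective scale so H[2, 2] == 1
```

Four correspondences determine a homography exactly; with more, the SVD gives the least-squares solution, which is why stitching pipelines feed many matched features into this step.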
Objective: Estimating depth from a single image has become a research focus in computer vision. Existing methods often regress depth by increasing network complexity, which raises training cost and time complexity. This paper therefore proposes a multi-level perception conditional random field model for monocular depth estimation. Method: An adaptive hybrid pyramid feature-fusion strategy captures short- and long-range dependencies between different image positions, effectively aggregating global and local context for efficient information transfer. A conditional random field decoding mechanism is introduced to capture spatial dependencies between pixels in fine detail. A dynamically scaled attention mechanism strengthens the perception of dependencies between image regions, and a bias learning unit keeps the network away from extreme values, ensuring model stability. To handle interaction between feature modalities, a hierarchy-aware adapter expands the feature-map dimensions, enhancing spatial and channel interaction and improving the model's feature learning. Results: Ablation experiments on the NYU Depth v2 (New York University depth dataset version 2) dataset show that the network significantly improves the performance metrics: compared with state-of-the-art methods, the absolute relative error (Abs Rel) drops below 0.1, a 7.4% reduction, and the root mean square error (RMSE) drops by 5.4%. To verify practicality in real road environments, comparison experiments on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago) dataset show these metrics surpass mainstream methods, with RMSE reduced by 3.1% and threshold accuracies (δ < 1.25², δ < 1.25³) approaching 100%; generalization is further verified on the MatterPort3D dataset. Visualization results show that the method estimates depth in difficult regions better under complex conditions. Conclusion: With a multi-level feature extractor and a hybrid pyramid feature-fusion strategy, the method optimizes information transfer between encoder and decoder and obtains pixel-level output through fully connected decoding, effectively improving monocular depth-estimation accuracy.
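The metrics reported in the Results section (Abs Rel, RMSE, and the δ threshold accuracies) have standard definitions in the monocular-depth literature, sketched below; the dictionary layout and array inputs are illustrative, not the paper's evaluation code.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular-depth metrics. `ratio` is max(pred/gt, gt/pred), so the
    threshold accuracies d1..d3 are the fraction of pixels within 1.25**k of truth."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),        # absolute relative error
        "rmse":    float(np.sqrt(np.mean((pred - gt) ** 2))),     # root mean square error
        "d1":      float(np.mean(ratio < 1.25)),
        "d2":      float(np.mean(ratio < 1.25 ** 2)),
        "d3":      float(np.mean(ratio < 1.25 ** 3)),
    }
```

For example, a prediction uniformly 30% above ground truth fails δ < 1.25 everywhere but passes δ < 1.25² everywhere, which is why papers report the three thresholds separately.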
To address the low detection accuracy, false detections, and missed detections caused by small or occluded vehicle and pedestrian targets in road traffic scenes, a road object detection algorithm, RO-YOLOv9, is proposed. A small-object detection layer is added to strengthen feature learning for small targets. A bidirectional and adaptive scale fusion feature pyramid network (BiASF-FPN) is designed to optimize multi-scale feature fusion so that the algorithm captures detailed information from small- to large-scale targets. An OR-RepN4 module is proposed that simplifies a complex structure through a re-parameterization strategy and raises inference speed. A Shape-NWD (shape neighborhood weighted decomposition) loss function focuses on bounding-box shape and size and uses a normalized Gaussian Wasserstein distance for smooth regression, achieving cross-scale invariance and reducing detection errors for small and occluded targets. Experiments on the optimized SODA10M and BDD100K datasets show that RO-YOLOv9 reaches mAP@0.5 (mean average precision) of 68.1% and 56.8%, respectively, 5.6 and 4.4 percentage points above YOLOv9, with detection rates of 55.3 frames/s and 54.2 frames/s, balancing detection accuracy and speed.
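The normalized Gaussian Wasserstein distance underlying the Shape-NWD loss models each bounding box as a 2-D Gaussian and compares the Gaussians in closed form. The sketch below shows only the base NWD similarity, not the paper's Shape-NWD variant; the constant `c` is dataset-dependent, and the value here is an assumption for illustration.

```python
import numpy as np

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance similarity between two boxes given
    as (cx, cy, w, h). Each box is modelled as N([cx, cy], diag(w^2/4, h^2/4));
    the squared 2-Wasserstein distance between such Gaussians is the squared
    Euclidean distance between (cx, cy, w/2, h/2) vectors."""
    (xa, ya, wa, ha), (xb, yb, wb, hb) = box_a, box_b
    w2_sq = ((xa - xb) ** 2 + (ya - yb) ** 2
             + ((wa - wb) / 2.0) ** 2 + ((ha - hb) / 2.0) ** 2)
    return float(np.exp(-np.sqrt(w2_sq) / c))   # map distance to a (0, 1] similarity
```

Unlike IoU, this similarity stays smooth and nonzero for non-overlapping boxes, which is what makes it attractive for regressing tiny or occluded targets.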
Funding: supported by the National Science Foundation Project (60475024) and the National High Technology Research and Development Program of China (2006AA09Z203).