Abstract
A recurrent visual attention network is proposed to search for and recognize an object simultaneously. The network automatically selects a sequence of local observations from an image and, by fusing detailed local appearance with coarse contextual visual information, localizes and recognizes visual objects with high accuracy. It searches for objects more efficiently than traditional methods based on sliding windows or convolution over the whole image. In addition, a hybrid loss function is proposed to learn the parameters of the multi-task network end-to-end. In particular, a strategy combining stochasticity with object awareness is introduced into the fixation-sequence loss, which helps mine richer context and drives the fixation point toward the object quickly. A real-world dataset is built to evaluate the model on searching for and recognizing objects of interest, including small ones. Experiments show that, with only a few fixation shifts, the method predicts an accurate bounding box for a visual object in an image and achieves relatively high search speed on large images. The source code will be released for verification and comparative analysis.
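The idea of one local observation that pairs detailed appearance with coarse context can be illustrated with a small sketch. This is not the authors' implementation: the patch size, the context scale, the zero padding, the average-pool downsampling, and the names `crop`, `average_pool`, and `glimpse` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: patch sizes, zero padding, average pooling and
# all function names are assumptions, not details taken from the paper.

def crop(image, cy, cx, size):
    """Return a size x size window around (cy, cx), zero-padded at the borders."""
    h, w = len(image), len(image[0])
    half = size // 2
    return [
        [image[y][x] if 0 <= y < h and 0 <= x < w else 0.0
         for x in range(cx - half, cx - half + size)]
        for y in range(cy - half, cy - half + size)
    ]

def average_pool(patch, factor):
    """Downsample a square patch by averaging non-overlapping factor x factor blocks."""
    out_size = len(patch) // factor
    return [
        [sum(patch[y * factor + dy][x * factor + dx]
             for dy in range(factor) for dx in range(factor)) / (factor * factor)
         for x in range(out_size)]
        for y in range(out_size)
    ]

def glimpse(image, cy, cx, size=4, context_scale=2):
    """Return (fine, coarse) for one fixation point.

    fine   : full-resolution size x size patch (local detail)
    coarse : a window context_scale times wider, pooled down to size x size
             (rough context)
    """
    fine = crop(image, cy, cx, size)
    wide = crop(image, cy, cx, size * context_scale)
    coarse = average_pool(wide, context_scale)
    return fine, coarse

# Usage: a 16 x 16 ramp image with the fixation at pixel (8, 8).
image = [[float(y * 16 + x) for x in range(16)] for y in range(16)]
fine, coarse = glimpse(image, 8, 8)
```

In a recurrent attention model, a pair like `(fine, coarse)` would typically be encoded and fed to the recurrent state, which then proposes the next fixation point; that part is omitted here because the abstract does not specify it.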
Authors
Lü Jie (吕杰)
Luo Fangying (罗芳颖)
Yuan Zejian (袁泽剑)
School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049
Source
Journal of Mechanical Engineering (《机械工程学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2019, Issue 11, pp. 123-130 (8 pages)
Funding
National Natural Science Foundation of China (91648121, 61573280)
National Key Research Program of China (2016YFB001001)
Keywords
attention model
reinforcement learning
object detection
fixation strategy