
Research on lightweight action recognition method based on key frame

Cited by: 10
Abstract: Two-stream convolutional neural networks typically extract a video's appearance and motion information from stacked RGB frames and optical flow images, respectively, which leads to information redundancy and high computational complexity. To address this, an action recognition algorithm combining optical flow images, difference images, and a parallel convolutional neural network is proposed on the basis of the temporal segment network (TSN). First, by analyzing the motion blur present in action videos, a key-frame selection algorithm based on an image feature quantity is designed. An improved temporal segment network containing an appearance information stream and a motion information stream is then constructed: the RGB images of key frames and the optical flow and difference images of non-key frames are fed in parallel into the feature extraction network to compute classification scores. Finally, the action category scores of key frames and non-key frames are averaged and passed through a SoftMax layer to obtain the video-level category probabilities. To further reduce the parameter count and computational complexity of the algorithm, a lightweight convolutional neural network is designed as the feature extraction network. The proposed algorithm achieves a recognition accuracy of 94.7% on the UCF101 dataset and 69.3% on the HMDB51 dataset, and its inference speed is 45.3% faster than that of the temporal segment network. The experimental results indicate that the algorithm efficiently exploits the appearance and motion information of videos and attains high action recognition accuracy.
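The key-frame selection step described in the abstract can be sketched as follows. The paper's exact image-feature-quantity criterion is not given here, so this sketch uses the variance of a Laplacian response as a common proxy for motion blur (sharp frames score high, blurred frames low); the per-segment sampling mirrors TSN-style sparse sampling, and all function names are illustrative assumptions.

```python
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian response.
    Motion-blurred frames have weak edges, hence a low score."""
    lap = (-4.0 * frame
           + np.roll(frame, 1, axis=0) + np.roll(frame, -1, axis=0)
           + np.roll(frame, 1, axis=1) + np.roll(frame, -1, axis=1))
    return float(lap.var())

def select_key_frames(frames, num_segments=3):
    """Split the video into equal temporal segments and keep the
    sharpest (least blurred) frame index from each segment."""
    segments = np.array_split(np.arange(len(frames)), num_segments)
    return [int(seg[np.argmax([sharpness(frames[i]) for i in seg])])
            for seg in segments]

def difference_image(prev: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Frame difference fed to the motion stream alongside optical
    flow for the non-key frames."""
    return np.abs(curr.astype(np.float32) - prev.astype(np.float32))
```

In this sketch a blurred frame (flat intensity) yields a Laplacian variance of zero, so any textured frame in the same segment is preferred as the key frame.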
Authors: Zhou Yuxin; Bai Hongyang; Li Wei; Guo Hongwei; Xu Xiaokang (College of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; 96037 Troop, People's Liberation Army of China, Baoji 721000, China)
Source: Chinese Journal of Scientific Instrument (《仪器仪表学报》), indexed in EI, CAS, CSCD, Peking University Core, 2020, No. 7, pp. 196-204 (9 pages)
Funding: National Natural Science Foundation of China (Grant No. 61603189)
Keywords: convolutional neural network; action recognition; key frame; lightweight
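The average-fusion step described in the abstract, in which key-frame and non-key-frame class scores are merged before the SoftMax layer, can be illustrated with a minimal sketch (array shapes and function names are assumptions, not the paper's code):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable SoftMax over class scores."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_scores(key_scores: np.ndarray, nonkey_scores: np.ndarray) -> np.ndarray:
    """Average the per-sample class scores from the appearance stream
    (key-frame RGB) and the motion stream (non-key-frame optical flow
    and difference images), then convert to class probabilities."""
    avg = np.vstack([key_scores, nonkey_scores]).mean(axis=0)
    return softmax(avg)
```

Averaging before the SoftMax (rather than averaging probabilities) is the standard TSN-style consensus, which keeps the two streams on a common score scale.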
