Abstract
Recognizing human actions from video sequences is a challenging problem in computer vision. This paper adopts the bag-of-words paradigm from text analysis and represents each video clip as a document. Spatio-temporal interest points, the points whose local neighborhood varies most in both the space and time domains, are detected in the video, and the visual features extracted at these points serve as the words of the document. Topic models are then applied to the video documents to analyze their latent topics, and the actions in each video are recognized from its topic distribution by a discriminative rule. The proposed system is tested on public vision data sets, both simple and complex. Experimental results show that its performance is comparable to or better than that of published state-of-the-art methods.
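The bag-of-words step described in the abstract can be illustrated with a minimal sketch. It assumes that descriptors have already been extracted at the interest points and that a codebook of visual words is available (in practice it would typically be learned by clustering training descriptors); the function names, the toy codebook, and the toy descriptors are all hypothetical and are not taken from the paper. The topic-model and classification stages are not shown.

```python
from collections import Counter
from math import dist


def quantize(descriptor, codebook):
    """Return the index of the codebook entry nearest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def video_to_document(descriptors, codebook):
    """Turn a video's interest-point descriptors into a visual-word count vector."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy data (assumed for illustration): three visual words, four descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]

histogram = video_to_document(descriptors, codebook)
print(histogram)
```

A count vector of this kind plays the role of a document's word counts, which is the form of input a topic model would consume.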
Source
《计算机与现代化》
2013, No. 4, pp. 1-4 (4 pages)
Computer and Modernization
Funding
Supported by the National Natural Science Foundation of China (61272251)