Key frame extraction based on sparse coding can reduce the redundancy of continuous frames and concisely express the entire video.However,how to develop a key frame extraction algorithm that can automatically extract ...Key frame extraction based on sparse coding can reduce the redundancy of continuous frames and concisely express the entire video.However,how to develop a key frame extraction algorithm that can automatically extract a few frames with a low reconstruction error remains a challenge.In this paper,we propose a novel model of structured sparse-codingbased key frame extraction,wherein a nonconvex group log-regularizer is used with strong sparsity and a low reconstruction error.To automatically extract key frames,a decomposition scheme is designed to separate the sparse coefficient matrix by rows.The rows enforced by the nonconvex group log-regularizer become zero or nonzero,leading to the learning of the structured sparse coefficient matrix.To solve the nonconvex problems due to the log-regularizer,the difference of convex algorithm(DCA)is employed to decompose the log-regularizer into the difference of two convex functions related to the l1 norm,which can be directly obtained through the proximal operator.Therefore,an efficient structured sparse coding algorithm with the group log-regularizer for key frame extraction is developed,which can automatically extract a few frames directly from the video to represent the entire video with a low reconstruction error.Experimental results demonstrate that the proposed algorithm can extract more accurate key frames from most Sum Me videos compared to the stateof-the-art methods.Furthermore,the proposed algorithm can obtain a higher compression with a nearly 18% increase compared to sparse modeling representation selection(SMRS)and an 8% increase compared to SC-det on the VSUMM dataset.展开更多
基金supported in part by the National Natural Science Foundation of China(61903090,61727810,62073086,62076077,61803096,U191140003)the Guangzhou Science and Technology Program Project(202002030289)Japan Society for the Promotion of Science(JSPS)KAKENHI(18K18083)。
文摘Key frame extraction based on sparse coding can reduce the redundancy of continuous frames and concisely express the entire video.However,how to develop a key frame extraction algorithm that can automatically extract a few frames with a low reconstruction error remains a challenge.In this paper,we propose a novel model of structured sparse-codingbased key frame extraction,wherein a nonconvex group log-regularizer is used with strong sparsity and a low reconstruction error.To automatically extract key frames,a decomposition scheme is designed to separate the sparse coefficient matrix by rows.The rows enforced by the nonconvex group log-regularizer become zero or nonzero,leading to the learning of the structured sparse coefficient matrix.To solve the nonconvex problems due to the log-regularizer,the difference of convex algorithm(DCA)is employed to decompose the log-regularizer into the difference of two convex functions related to the l1 norm,which can be directly obtained through the proximal operator.Therefore,an efficient structured sparse coding algorithm with the group log-regularizer for key frame extraction is developed,which can automatically extract a few frames directly from the video to represent the entire video with a low reconstruction error.Experimental results demonstrate that the proposed algorithm can extract more accurate key frames from most Sum Me videos compared to the stateof-the-art methods.Furthermore,the proposed algorithm can obtain a higher compression with a nearly 18% increase compared to sparse modeling representation selection(SMRS)and an 8% increase compared to SC-det on the VSUMM dataset.