Video prediction is the problem of generating future frames by exploiting the spatiotemporal correlation from the past frame sequence.It is one of the crucial issues in computer vision and has many real-world applicat...Video prediction is the problem of generating future frames by exploiting the spatiotemporal correlation from the past frame sequence.It is one of the crucial issues in computer vision and has many real-world applications,mainly focused on predicting future scenarios to avoid undesirable outcomes.However,modeling future image content and object is challenging due to the dynamic evolution and complexity of the scene,such as occlusions,camera movements,delay and illumination.Direct frame synthesis or optical-flow estimation are common approaches used by researchers.However,researchers mainly focused on video prediction using one of the approaches.Both methods have limitations,such as direct frame synthesis,usually face blurry prediction due to complex pixel distributions in the scene,and optical-flow estimation,usually produce artifacts due to large object displacements or obstructions in the clip.In this paper,we constructed a deep neural network Frame Prediction Network(FPNet-OF)with multiplebranch inputs(optical flow and original frame)to predict the future video frame by adaptively fusing the future object-motion with the future frame generator.The key idea is to jointly optimize direct RGB frame synthesis and dense optical flow estimation to generate a superior video prediction network.Using various real-world datasets,we experimentally verify that our proposed framework can produce high-level video frame compared to other state-ofthe-art framework.展开更多
A two-stage automatic key frame selection method is proposed to enhance stitching speed and quality for UAV aerial videos. In the first stage, to reduce redundancy, the overlapping rate of the UAV aerial video sequenc...A two-stage automatic key frame selection method is proposed to enhance stitching speed and quality for UAV aerial videos. In the first stage, to reduce redundancy, the overlapping rate of the UAV aerial video sequence within the sampling period is calculated. Lagrange interpolation is used to fit the overlapping rate curve of the sequence. An empirical threshold for the overlapping rate is then applied to filter candidate key frames from the sequence. In the second stage, the principle of minimizing remapping spots is used to dynamically adjust and determine the final key frame close to the candidate key frames. Comparative experiments show that the proposed method significantly improves stitching speed and accuracy by more than 40%.展开更多
A popular and challenging task in video research,frame interpolation aims to increase the frame rate of video.Most existing methods employ a fixed motion model,e.g.,linear,quadratic,or cubic,to estimate the intermedia...A popular and challenging task in video research,frame interpolation aims to increase the frame rate of video.Most existing methods employ a fixed motion model,e.g.,linear,quadratic,or cubic,to estimate the intermediate warping field.However,such fixed motion models cannot well represent the complicated non-linear motions in the real world or rendered animations.Instead,we present an adaptive flow prediction module to better approximate the complex motions in video.Furthermore,interpolating just one intermediate frame between consecutive input frames may be insufficient for complicated non-linear motions.To enable multi-frame interpolation,we introduce the time as a control variable when interpolating frames between original ones in our generic adaptive flow prediction module.Qualitative and quantitative experimental results show that our method can produce high-quality results and outperforms the existing stateof-the-art methods on popular public datasets.展开更多
We propose a Rate-Distortion (RD) optimized strategy for frame-dropping and scheduling of multi-user conversa- tional and streaming videos. We consider a scenario where conversational and streaming videos share the fo...We propose a Rate-Distortion (RD) optimized strategy for frame-dropping and scheduling of multi-user conversa- tional and streaming videos. We consider a scenario where conversational and streaming videos share the forwarding resources at a network node. Two buffers are setup on the node to temporarily store the packets for these two types of video applications. For streaming video, a big buffer is used as the associated delay constraint of the application is moderate and a very small buffer is used for conversational video to ensure that the forwarding delay of every packet is limited. A scheduler is located behind these two buffers that dynamically assigns transmission slots on the outgoing link to the two buffers. Rate-distortion side information is used to perform RD-optimized frame dropping in case of node overload. Sharing the data rate on the outgoing link between the con- versational and the streaming videos is done either based on the fullness of the two associated buffers or on the mean incoming rates of the respective videos. Simulation results showed that our proposed RD-optimized frame dropping and scheduling ap- proach provides significant improvements in performance over the popular priority-based random dropping (PRD) technique.展开更多
Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the correlation coefficients of gray values is consistent in an original video, while ...Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the correlation coefficients of gray values is consistent in an original video, while in forgeries the consistency will be destroyed. We first extract the consistency of correlation coefficients of gray values (CCCoGV for short) after normalization and quantization as distinguishing feature to identify interframe forgeries. Then we test the CCCoGV in a large database with the help of SVM (Support Vector Machine). Experimental results show that the proposed method is efficient in classifying original videos and forgeries. Furthermore, the proposed method performs also pretty well in classifying frame insertion and frame deletion forgeries.展开更多
现有的基于帧预测的方法精度低的原因大多是不紧凑的正常分布学习,导致了异常事件的强泛化。对此,提出了动态运动约束下的视频异常检测(dynamic motion constraints for video anomaly detection,DMC-VAD)框架。提出了时空约束模块,使...现有的基于帧预测的方法精度低的原因大多是不紧凑的正常分布学习,导致了异常事件的强泛化。对此,提出了动态运动约束下的视频异常检测(dynamic motion constraints for video anomaly detection,DMC-VAD)框架。提出了时空约束模块,使得模型能够基于光流显著目标和运动信息,对正常运动施加时空约束。设计了时空对齐融合模块,用校正对齐模块抑制光流中低质量的特征响应,增强正常帧的全局外观与运动上下文约束。这些动态约束能够使模型学习到更紧凑的正常分布,异常帧因为偏离这些约束从而产生高误差。引入光流重构以更好地训练运动编码器,融合光流平均误差和视频帧峰值信噪比作为异常分数。在UCSD Ped2和Shanghaitech上获得了最优的受试者操作特性(receiver operating characteristic,ROC)曲线下面积(area under curve,AUC):99.42%(+1.19%),74.50%(+0.28%)。在CUHK Avenue上获得了与AMSTE相当的结果(88.07%),但提出模型的参数量仅是AMSTE的16%。展开更多
This paper presents a novel technique for embedding a digital watermark into video frames based on motion vectors and discrete wavelet transform (DWT). In the proposed scheme, the binary image watermark is divided int...This paper presents a novel technique for embedding a digital watermark into video frames based on motion vectors and discrete wavelet transform (DWT). In the proposed scheme, the binary image watermark is divided into blocks and each watermark block is embedded several times in each selected video frame at different locations. The block-based motion estimation algorithm is used to select the video frame blocks having the greatest motion vectors magnitude. The DWT is applied to the selected frame blocks, and then, the watermark block is hidden into these blocks by modifying the coefficients of the Horizontal sub-bands (HL). Adding the watermark at different locations in the same video frame makes the scheme more robust against different types of attacks. The method was tested on different types of videos. The average peak signal to noise ratio (PSNR) and the normalized correlation (NC) are used to measure the performance of the proposed method. Experimental results show that the proposed algorithm does not affect the visual quality of video frames and the scheme is robust against a variety of attacks.展开更多
基金supported by Incheon NationalUniversity Research Grant in 2017.
文摘Video prediction is the problem of generating future frames by exploiting the spatiotemporal correlation from the past frame sequence.It is one of the crucial issues in computer vision and has many real-world applications,mainly focused on predicting future scenarios to avoid undesirable outcomes.However,modeling future image content and object is challenging due to the dynamic evolution and complexity of the scene,such as occlusions,camera movements,delay and illumination.Direct frame synthesis or optical-flow estimation are common approaches used by researchers.However,researchers mainly focused on video prediction using one of the approaches.Both methods have limitations,such as direct frame synthesis,usually face blurry prediction due to complex pixel distributions in the scene,and optical-flow estimation,usually produce artifacts due to large object displacements or obstructions in the clip.In this paper,we constructed a deep neural network Frame Prediction Network(FPNet-OF)with multiplebranch inputs(optical flow and original frame)to predict the future video frame by adaptively fusing the future object-motion with the future frame generator.The key idea is to jointly optimize direct RGB frame synthesis and dense optical flow estimation to generate a superior video prediction network.Using various real-world datasets,we experimentally verify that our proposed framework can produce high-level video frame compared to other state-ofthe-art framework.
文摘A two-stage automatic key frame selection method is proposed to enhance stitching speed and quality for UAV aerial videos. In the first stage, to reduce redundancy, the overlapping rate of the UAV aerial video sequence within the sampling period is calculated. Lagrange interpolation is used to fit the overlapping rate curve of the sequence. An empirical threshold for the overlapping rate is then applied to filter candidate key frames from the sequence. In the second stage, the principle of minimizing remapping spots is used to dynamically adjust and determine the final key frame close to the candidate key frames. Comparative experiments show that the proposed method significantly improves stitching speed and accuracy by more than 40%.
基金supported by the Research Grants Council of the Hong Kong Special Administrative Region,under RGC General Research Fund(Project No.CUHK 14201017)Shenzhen Science and Technology Program(No.JCYJ20180507182410327)the Science and Technology Plan Project of Guangzhou(No.201704020141)。
文摘A popular and challenging task in video research,frame interpolation aims to increase the frame rate of video.Most existing methods employ a fixed motion model,e.g.,linear,quadratic,or cubic,to estimate the intermediate warping field.However,such fixed motion models cannot well represent the complicated non-linear motions in the real world or rendered animations.Instead,we present an adaptive flow prediction module to better approximate the complex motions in video.Furthermore,interpolating just one intermediate frame between consecutive input frames may be insufficient for complicated non-linear motions.To enable multi-frame interpolation,we introduce the time as a control variable when interpolating frames between original ones in our generic adaptive flow prediction module.Qualitative and quantitative experimental results show that our method can produce high-quality results and outperforms the existing stateof-the-art methods on popular public datasets.
基金Project (No. STE1093/1-1) supported by the German ResearchFoundation, Germany
文摘We propose a Rate-Distortion (RD) optimized strategy for frame-dropping and scheduling of multi-user conversa- tional and streaming videos. We consider a scenario where conversational and streaming videos share the forwarding resources at a network node. Two buffers are setup on the node to temporarily store the packets for these two types of video applications. For streaming video, a big buffer is used as the associated delay constraint of the application is moderate and a very small buffer is used for conversational video to ensure that the forwarding delay of every packet is limited. A scheduler is located behind these two buffers that dynamically assigns transmission slots on the outgoing link to the two buffers. Rate-distortion side information is used to perform RD-optimized frame dropping in case of node overload. Sharing the data rate on the outgoing link between the con- versational and the streaming videos is done either based on the fullness of the two associated buffers or on the mean incoming rates of the respective videos. Simulation results showed that our proposed RD-optimized frame dropping and scheduling ap- proach provides significant improvements in performance over the popular priority-based random dropping (PRD) technique.
文摘Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the correlation coefficients of gray values is consistent in an original video, while in forgeries the consistency will be destroyed. We first extract the consistency of correlation coefficients of gray values (CCCoGV for short) after normalization and quantization as distinguishing feature to identify interframe forgeries. Then we test the CCCoGV in a large database with the help of SVM (Support Vector Machine). Experimental results show that the proposed method is efficient in classifying original videos and forgeries. Furthermore, the proposed method performs also pretty well in classifying frame insertion and frame deletion forgeries.
文摘现有的基于帧预测的方法精度低的原因大多是不紧凑的正常分布学习,导致了异常事件的强泛化。对此,提出了动态运动约束下的视频异常检测(dynamic motion constraints for video anomaly detection,DMC-VAD)框架。提出了时空约束模块,使得模型能够基于光流显著目标和运动信息,对正常运动施加时空约束。设计了时空对齐融合模块,用校正对齐模块抑制光流中低质量的特征响应,增强正常帧的全局外观与运动上下文约束。这些动态约束能够使模型学习到更紧凑的正常分布,异常帧因为偏离这些约束从而产生高误差。引入光流重构以更好地训练运动编码器,融合光流平均误差和视频帧峰值信噪比作为异常分数。在UCSD Ped2和Shanghaitech上获得了最优的受试者操作特性(receiver operating characteristic,ROC)曲线下面积(area under curve,AUC):99.42%(+1.19%),74.50%(+0.28%)。在CUHK Avenue上获得了与AMSTE相当的结果(88.07%),但提出模型的参数量仅是AMSTE的16%。
文摘This paper presents a novel technique for embedding a digital watermark into video frames based on motion vectors and discrete wavelet transform (DWT). In the proposed scheme, the binary image watermark is divided into blocks and each watermark block is embedded several times in each selected video frame at different locations. The block-based motion estimation algorithm is used to select the video frame blocks having the greatest motion vectors magnitude. The DWT is applied to the selected frame blocks, and then, the watermark block is hidden into these blocks by modifying the coefficients of the Horizontal sub-bands (HL). Adding the watermark at different locations in the same video frame makes the scheme more robust against different types of attacks. The method was tested on different types of videos. The average peak signal to noise ratio (PSNR) and the normalized correlation (NC) are used to measure the performance of the proposed method. Experimental results show that the proposed algorithm does not affect the visual quality of video frames and the scheme is robust against a variety of attacks.