In recent years, with the wide application of computer vision in intelligent surveillance, autonomous driving, and other fields, an increasing amount of video is not only watched by humans but also analyzed automatically by machine-vision algorithms. Efficiently storing and transmitting such video for machine vision has become a new challenge. However, existing video coding standards, such as the latest Versatile Video Coding (VVC/H.266), are optimized mainly for the human visual system and do not adequately account for the impact of compression on machine-vision task performance. To address this problem, this paper takes multi-object tracking as a representative machine-vision video task and proposes a machine-vision-oriented VVC intra coding algorithm. First, a neural-network interpretability method, Gradient-weighted Class Activation Mapping (GradCAM++), performs saliency analysis on the video content to locate the regions the machine-vision task attends to, represented as a saliency map. Then, to emphasize the key edge contours in the frame, edge detection is introduced and its result is fused with the saliency analysis to obtain the final machine-vision saliency map. Finally, the fused machine-vision saliency map is used to improve the VVC mode-selection process, optimizing the block-partitioning and intra-prediction mode decisions. Machine-vision distortion replaces the original signal distortion in the rate-distortion optimization formula, so that the encoder preserves as much task-relevant information as possible during compression. Experimental results show that, compared with the VVC anchor, the proposed method saves 12.7% of the bitrate while maintaining the same machine-vision detection accuracy.
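The fusion and distortion-weighting steps above can be sketched in Python. This is an illustrative sketch, not the paper's implementation: the Sobel edge detector, the max-based fusion with weight `alpha`, and the saliency-weighted SSD form are all assumptions, and the GradCAM++ map is taken as a given input.

```python
import numpy as np

def sobel_edges(gray):
    """Normalized gradient-magnitude edge map via a Sobel filter
    (a stand-in for the paper's unspecified edge detector)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    pad = np.pad(gray.astype(np.float64), 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    mag = np.hypot(gx, gy)
    return mag / mag.max() if mag.max() > 0 else mag

def fuse_saliency(grad_cam, edge_map, alpha=0.7):
    """Pixel-wise fusion of the GradCAM++ saliency map with the edge map.
    The max-based fusion and the weight alpha are assumptions."""
    return np.maximum(grad_cam, alpha * edge_map)

def machine_vision_distortion(orig, recon, saliency):
    """Saliency-weighted SSD used in place of plain SSD in the RD cost
    J = D_mv + lambda * R (illustrative form only)."""
    diff = orig.astype(np.float64) - recon.astype(np.float64)
    return float((saliency * diff * diff).sum())
```

With a uniform saliency map the weighted distortion reduces to the usual SSD; regions the task ignores contribute less to the cost, so the encoder can spend fewer bits there.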
The Joint Video Experts Team (JVET) has announced the latest generation of the Versatile Video Coding (VVC, H.266) standard. The in-loop filter in VVC inherits the De-Blocking Filter (DBF) and Sample Adaptive Offset (SAO) of High Efficiency Video Coding (HEVC, H.265), and adds the Adaptive Loop Filter (ALF) to minimize the error between the original sample and the decoded sample. However, for chaotic moving video encoded at low bitrates, serious blocking artifacts remain after in-loop filtering due to the severe quantization distortion of texture details. To tackle this problem, this paper proposes a Convolutional Neural Network (CNN)-based VVC in-loop filter for encoding chaotic moving video at low bitrates. First, a blur-aware attention network is designed to perceive the blurring effect and restore texture details. Then, a deep in-loop filtering method based on the blur-aware network is proposed to replace the VVC in-loop filter. Experimental results show that the proposed method saves 8.3% of bit consumption on average at similar subjective quality. Meanwhile, at comparable bitrates, the proposed method reconstructs more texture information, significantly reducing blocking artifacts and improving visual quality.
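A minimal numpy sketch of the blur-aware gating idea: estimate where the decoded frame looks blurred and apply a restoration residual more strongly there. The variance-based blur proxy and the fixed sigmoid gate are assumptions; in the paper the attention weights are learned by the network.

```python
import numpy as np

def local_variance(img, k=3):
    """Local variance as a crude sharpness proxy; low variance ~ blurred."""
    pad = k // 2
    p = np.pad(img.astype(np.float64), pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = p[i:i + k, j:j + k].var()
    return out

def blur_aware_filter(decoded, residual):
    """Apply a (network-predicted) restoration residual more strongly where
    the frame looks blurred. The sigmoid gate is an assumption."""
    v = local_variance(decoded)
    v = v / (v.max() + 1e-12)
    gate = 1.0 / (1.0 + np.exp(8.0 * (v - 0.5)))  # high gate where variance is low
    return np.clip(decoded + gate * residual, 0.0, 255.0)
```

The gate approaches 1 in flat (likely over-smoothed) regions and falls toward 0 in already-sharp regions, which is the qualitative behavior a blur-aware attention map would provide.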
To enhance video quality after encoding and decoding in video compression, a video quality enhancement framework based on local and non-local priors is proposed in this paper. Low-level features are first extracted through a single convolution layer and then processed by several conv-tran blocks (CTB) to extract high-level features, which are ultimately transformed into a residual image. The final reconstructed video frame is obtained by performing an element-wise addition of the residual image and the original lossy video frame. Experiments show that the proposed Conv-Tran Network (CTN) model effectively recovers the quality loss caused by Versatile Video Coding (VVC) and further improves VVC's performance.
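The pipeline shape described above (shallow convolution, feature processing, residual image, skip connection) can be illustrated with plain numpy. The kernels here are placeholder inputs and the CTB stack is elided; this shows only the data flow, not the CTN model itself.

```python
import numpy as np

def conv2d(x, kernel):
    """'Same'-padded single-channel 2D convolution, standing in for the
    framework's convolution layers."""
    kh, kw = kernel.shape
    pad = kh // 2
    p = np.pad(x, pad, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (p[i:i + kh, j:j + kw] * kernel).sum()
    return out

def ctn_enhance(lossy, feat_kernel, recon_kernel):
    """CTN pipeline shape: shallow features -> (CTB stack, elided) ->
    residual image -> element-wise add with the lossy frame."""
    feats = np.maximum(conv2d(lossy, feat_kernel), 0.0)   # low-level features
    residual = conv2d(feats, recon_kernel)                # residual image
    return np.clip(lossy + residual, 0.0, 255.0)          # skip connection
```

The skip connection means the network only has to learn the (typically small) correction, not the whole frame, which is why a zero residual leaves the lossy frame unchanged.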
Building on the previous-generation H.265/HEVC (High Efficiency Video Coding) standard, the new-generation international video coding standard H.266/VVC (Versatile Video Coding) adopts a recursive quadtree plus multi-type tree partitioning structure, which adapts better to image content and improves coding efficiency. This paper focuses on the evolution of picture-partitioning techniques across the H.264/AVC, H.265/HEVC, and H.266/VVC video coding standards, analyzes the differences among the partitioning techniques of these standards, and provides a technical reference for the future evolution of China's video coding technology.
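The content-adaptive idea behind recursive partitioning can be illustrated with a toy quadtree that keeps splitting a block while its local variance stays high. A variance threshold stands in for the encoder's actual rate-distortion-based split decision, and the multi-type-tree binary/ternary splits are omitted.

```python
import numpy as np

def quadtree_split(block, min_size=8, var_thresh=100.0):
    """Recursive quadtree partitioning driven by a variance threshold (a toy
    stand-in for the encoder's RD-based split decision). Returns a list of
    (y, x, size) leaf blocks; offsets are relative to the input block,
    which is assumed square with power-of-two side length."""
    def rec(y, x, size, out):
        sub = block[y:y + size, x:x + size]
        if size <= min_size or sub.var() <= var_thresh:
            out.append((y, x, size))
            return
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                rec(y + dy, x + dx, half, out)

    leaves = []
    rec(0, 0, block.shape[0], leaves)
    return leaves
```

Flat regions stay as one large block while textured regions are subdivided, which is the adaptation-to-content behavior the partitioning structures are designed for.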
Compared with the previous-generation High Efficiency Video Coding (HEVC) standard, the new-generation Versatile Video Coding (VVC) standard adopts a larger number of spatio-temporal prediction modes, which introduces stronger inter-frame correlation between adjacent coded frames. To exploit this, a rate-control algorithm for the VVC encoder is proposed based on deep reinforcement learning. First, suitable model inputs are selected, including inter-frame correlation, hierarchical coding-structure, and video-content information. Next, this information is combined with a long short-term memory (LSTM) neural network and reinforcement learning to build a deep-reinforcement-learning model that predicts the quantization parameters of inter-coded frames, optimizing the rate-control process of the VVC encoder. Finally, to verify its performance, the proposed algorithm is implemented on the VTM 5.1 platform and compared with the unmodified VVC encoder. Test results show that, at the same bitrate, the proposed algorithm achieves an average BD-BR saving of 1.81% and a BD-PSNR gain of 0.14 dB over the unmodified VVC encoder.
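A toy sketch of the state assembly the abstract describes (inter-frame correlation, hierarchy layer, content and budget features). The feature set, the placeholder policy, and all constants are assumptions, and the LSTM/reinforcement-learning components are elided; a real agent would be trained rather than hand-coded.

```python
import numpy as np

def rc_state(prev_frame, cur_frame, layer_id, bits_left, frames_left):
    """Assemble a toy state vector for a rate-control agent: inter-frame
    correlation, hierarchy layer, temporal activity, and per-frame budget."""
    p = prev_frame.astype(np.float64).ravel()
    c = cur_frame.astype(np.float64).ravel()
    corr = float(np.corrcoef(p, c)[0, 1]) if p.std() > 0 and c.std() > 0 else 0.0
    complexity = float(np.abs(c - p).mean())  # simple temporal activity
    return np.array([corr, float(layer_id), complexity,
                     bits_left / max(frames_left, 1)])

def greedy_qp(state, base_qp=32):
    """Placeholder policy: nudge QP up with hierarchy layer and down when the
    remaining budget is generous relative to content activity."""
    corr, layer, complexity, budget = state
    qp = base_qp + int(layer) - (1 if budget > complexity else 0)
    return int(np.clip(qp, 0, 63))
```

The point of the sketch is only the interface: a per-frame state vector in, a QP decision out, with the learned LSTM policy replacing `greedy_qp` in the actual method.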
Existing convolutional neural network (CNN)-based in-loop filters tend to apply multiple networks for different quantization parameters (QPs), consuming substantial training resources and increasing the memory burden. To address this problem, a QP-adaptive CNN-based in-loop filter is proposed. First, a lightweight classification network is designed to classify coding tree units (CTUs) into three categories (hard, medium, and easy) according to filtering difficulty. Then, three CNN-based filtering networks incorporating a feature-information enhancement-fusion module are constructed to meet the filtering needs of the three CTU classes under different QPs. The proposed in-loop filter is integrated into VTM 6.0, the reference software of the Versatile Video Coding (VVC) standard H.266/VVC, replacing the original deblocking filter (DBF), sample adaptive offset (SAO) filter, and adaptive loop filter. Experimental results show that the method achieves an average Bjøntegaard delta bit rate (BD-BR) reduction of 3.14%, significantly improving compression efficiency and reducing compression artifacts compared with other CNN-based in-loop filters.
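The routing idea can be sketched as follows: bucket CTUs by a difficulty proxy so each class can be sent to its own filtering network. Local variance and the two thresholds here stand in for the paper's lightweight classification network, whose criterion is learned.

```python
import numpy as np

def classify_ctus(frame, ctu=64, t_easy=50.0, t_hard=500.0):
    """Toy stand-in for the lightweight classification network: bucket each
    CTU into easy / medium / hard by local variance (thresholds assumed),
    so a QP-adaptive filter can route each class to its own network."""
    h, w = frame.shape
    labels = {}
    for y in range(0, h, ctu):
        for x in range(0, w, ctu):
            v = frame[y:y + ctu, x:x + ctu].var()
            if v < t_easy:
                labels[(y, x)] = "easy"
            elif v > t_hard:
                labels[(y, x)] = "hard"
            else:
                labels[(y, x)] = "medium"
    return labels
```

With three shared networks selected per CTU class, only one classifier plus three filters need to be stored, instead of one filter per QP, which is the memory saving the abstract describes.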
Funding: supported by the National Natural Science Foundation of China under grants U20A20157, 61771082, 62271096, and 61871062; the General Program of the Chongqing Natural Science Foundation under grant cstc2021jcyj-msxmX0032; the Natural Science Foundation of Chongqing, China (cstc2020jcyj-zdxmX0024); the Science and Technology Research Program of the Chongqing Municipal Education Commission under grant KJQN202300632; and the University Innovation Research Group of Chongqing (CXQT20017).
Funding: supported by the Key R&D Program of China under Grant No. 2022YFC3301800; the Sichuan Local Technological Development Program under Grant No. 24YRGZN0010; and the ZTE Industry-University-Institute Cooperation Funds under Grant No. HC-CN-03-2019-12.