Funding: supported by the National Natural Science Foundation of China under grants U20A20157, 61771082, 62271096 and 61871062; the General Program of the Chongqing Natural Science Foundation under grant cstc2021jcyj-msxmX0032; the Natural Science Foundation of Chongqing, China (cstc2020jcyj-zdxmX0024); the Science and Technology Research Program of the Chongqing Municipal Education Commission under grant KJQN202300632; and the University Innovation Research Group of Chongqing (CXQT20017).
Abstract: The Joint Video Experts Team (JVET) has announced the latest generation of the Versatile Video Coding (VVC, H.266) standard. The in-loop filter in VVC inherits the De-Blocking Filter (DBF) and Sample Adaptive Offset (SAO) of High Efficiency Video Coding (HEVC, H.265), and adds the Adaptive Loop Filter (ALF) to minimize the error between the original and decoded samples. However, for chaotic moving video encoded at low bitrates, serious blocking artifacts remain after in-loop filtering owing to severe quantization distortion of texture details. To tackle this problem, this paper proposes a Convolutional Neural Network (CNN) based VVC in-loop filter for chaotic moving video encoded at low bitrates. First, a blur-aware attention network is designed to perceive the blurring effect and restore texture details. Then, a deep in-loop filtering method based on the blur-aware network is proposed to replace the VVC in-loop filter. Experimental results show that the proposed method saves 8.3% of bits on average at similar subjective quality. Moreover, at comparable bitrates, the proposed method reconstructs more texture information, significantly reducing blocking artifacts and improving visual quality.
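The blocking artifacts this abstract targets show up as luma discontinuities aligned with the coding grid. A minimal sketch of how such artifacts can be quantified: compare the average sample gradient on grid-boundary columns against non-boundary columns. The function name, the 8-sample grid and the ratio form are illustrative assumptions, not part of VVC or the paper's method.

```python
# Hypothetical blockiness score: ratio of the average luma discontinuity on
# 8x8 coding-grid boundary columns to the discontinuity elsewhere.
# Grid size and metric form are illustrative assumptions.

def blockiness(frame, grid=8):
    """frame: 2-D list of luma samples; returns boundary/interior gradient ratio."""
    boundary, interior = [], []
    for row in frame:
        for x in range(1, len(row)):
            d = abs(row[x] - row[x - 1])
            (boundary if x % grid == 0 else interior).append(d)
    # A ratio well above 1 suggests visible blocking along the coding grid.
    return (sum(boundary) / len(boundary)) / max(sum(interior) / len(interior), 1e-9)
```

On a frame whose only discontinuities sit exactly on the grid, the ratio is large; on a smooth ramp it stays near 1.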
Funding: supported by the Key R&D Program of China under Grant No. 2022YFC3301800; the Sichuan Local Technological Development Program under Grant No. 24YRGZN0010; and the ZTE Industry-University-Institute Cooperation Funds under Grant No. HC-CN-03-2019-12.
Abstract: To enhance video quality after encoding and decoding in video compression, this paper proposes a video quality enhancement framework based on local and non-local priors. Low-level features are first extracted through a single convolution layer and then processed by several conv-tran blocks (CTBs) to extract high-level features, which are ultimately transformed into a residual image. The final reconstructed video frame is obtained by element-wise addition of the residual image and the original lossy video frame. Experiments show that the proposed Conv-Tran Network (CTN) model effectively recovers the quality loss caused by Versatile Video Coding (VVC) and further improves VVC's performance.
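The data flow above — predict a residual image, then add it element-wise to the lossy frame — can be sketched without the network itself. The 3x3 box blur below is only a stand-in for the CTN residual predictor, used to make the residual-addition layout concrete; it is not the paper's model, and all names are illustrative.

```python
# Residual enhancement layout: reconstructed = lossy frame + predicted residual.
# box_blur is a placeholder for the CTN feature extractor / residual predictor.

def box_blur(frame):
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [frame[cy][cx]
                    for cy in range(max(0, y - 1), min(h, y + 2))
                    for cx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def enhance(lossy_frame):
    """Element-wise addition of a predicted residual, clipped to the sample range."""
    blurred = box_blur(lossy_frame)
    residual = [[b - f for f, b in zip(fr, br)]          # stand-in residual image
                for fr, br in zip(lossy_frame, blurred)]
    return [[min(255.0, max(0.0, f + r)) for f, r in zip(fr, rr)]
            for fr, rr in zip(lossy_frame, residual)]
```

A constant frame yields a zero residual, so the output equals the input — the identity behaviour one expects from residual learning on already-clean content.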
Abstract: The new-generation international video coding standard H.266/VVC (Versatile Video Coding) builds on the previous-generation H.265/HEVC (High Efficiency Video Coding) by adopting a recursive partitioning structure of quadtree plus multi-type tree, which adapts better to picture content and improves coding efficiency. This paper focuses on the evolution of picture partitioning techniques across the H.264/AVC, H.265/HEVC and H.266/VVC video coding standards, analyzes the differences in partitioning among these standards, and provides a technical reference for the future evolution of China's video coding technology.
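The quadtree-plus-multi-type-tree structure mentioned above admits five split types per coding unit. A small sketch of the child-block geometry each split produces, following the standard's definitions (quadtree, binary horizontal/vertical, ternary horizontal/vertical); the minimum-size guard is a simplified assumption rather than the full VVC constraint set.

```python
# Child CU sizes for the VVC split types; the min_size check is simplified.

def split_children(w, h, mode, min_size=4):
    """Return (width, height) of each child CU, or [] if the split is illegal."""
    if mode == "QT":                       # quadtree: four equal quadrants
        kids = [(w // 2, h // 2)] * 4
    elif mode == "BT_H":                   # binary horizontal: two halves stacked
        kids = [(w, h // 2)] * 2
    elif mode == "BT_V":                   # binary vertical: two halves side by side
        kids = [(w // 2, h)] * 2
    elif mode == "TT_H":                   # ternary horizontal: 1/4, 1/2, 1/4 of height
        kids = [(w, h // 4), (w, h // 2), (w, h // 4)]
    elif mode == "TT_V":                   # ternary vertical: 1/4, 1/2, 1/4 of width
        kids = [(w // 4, h), (w // 2, h), (w // 4, h)]
    else:
        raise ValueError(mode)
    return kids if all(cw >= min_size and ch >= min_size for cw, ch in kids) else []
```

Applying all five modes recursively is what lets VVC fit rectangular partitions to picture content far more flexibly than HEVC's quadtree alone.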
Abstract: In recent years, with the wide application of computer vision in intelligent surveillance, autonomous driving and other fields, a growing share of video is not only watched by humans but also analyzed automatically by machine vision algorithms. Efficiently storing and transmitting such video for machine vision has become a new challenge. However, existing video coding standards, including the latest Versatile Video Coding (VVC/H.266), are optimized mainly for the human visual system and do not sufficiently account for the impact of compression on machine vision tasks. To address this, taking multi-object tracking as a representative machine vision task, this paper proposes a machine-vision-oriented VVC intra coding algorithm. First, a neural network interpretability method, Gradient-weighted Class Activation Mapping (GradCAM++), performs saliency analysis on the video content to locate the regions the machine vision task attends to, represented as a saliency map. Next, to emphasize key edge and contour information in the picture, edge detection is introduced and its result is fused with the saliency analysis to obtain the final machine-vision saliency map. Finally, the fused saliency map is used to improve the VVC mode selection process, optimizing the block partitioning and intra prediction mode decisions. A machine-vision distortion replaces the original signal distortion in the rate-distortion optimization formula, so that the encoder preserves as much task-relevant information as possible during compression. Experimental results show that, compared with the VVC baseline, the proposed method saves 12.7% of the bitrate while maintaining the same machine vision detection accuracy.
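The modified rate-distortion optimization above can be sketched directly: the usual signal distortion term in J = D + λR is replaced by a machine-vision distortion in which each sample's squared error is weighted by the fused saliency map. The cost form follows the abstract; the per-sample weighting scheme itself is an illustrative assumption, not the paper's exact formula.

```python
# Saliency-weighted RD cost: J = D_mv + lambda * R, where D_mv weights each
# squared sample error by the fused machine-vision saliency map.
# The weighting scheme is an illustrative assumption.

def rd_cost(orig, recon, saliency, rate, lam):
    """orig/recon/saliency: equally sized 2-D lists; returns the weighted RD cost."""
    d_mv = sum(s * (o - r) ** 2
               for orow, rrow, srow in zip(orig, recon, saliency)
               for o, r, s in zip(orow, rrow, srow))
    return d_mv + lam * rate
```

With this cost, errors in low-saliency background contribute little, so the encoder's mode decision naturally shifts bits toward task-relevant regions.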
Abstract: In recent years, research on and applications of machine-vision-oriented video have grown rapidly, posing great challenges for the storage and transmission of such video. Video coding standards such as Versatile Video Coding (VVC) achieve efficient full-resolution compression and reconstruction, but for machine vision tasks this compression is redundant. This paper therefore proposes a video coding method for machine tasks that integrates saliency detection into the VVC encoding process. The instance segmentation network Mask Region-based Convolutional Neural Network (Mask R-CNN) produces binary masks of the objects in a frame; these masks determine the regions of interest and guide the quantization parameter offsets of the Coding Tree Units (CTUs) during VVC encoding. Experiments show that, compared with the VVC baseline, the proposed method saves bitrate at similar detection accuracy.
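The CTU-level QP control described above can be sketched as follows: the Mask R-CNN binary mask marks region-of-interest samples, and CTUs with little ROI coverage receive a positive QP offset (coarser quantization, fewer bits). The coverage threshold, offset value and function names are illustrative assumptions, not values from the paper.

```python
# ROI-driven QP offset per CTU: background CTUs get base_qp + offset.
# Threshold and offset values are illustrative assumptions.

def ctu_qp(mask, x0, y0, ctu=128, base_qp=32, offset=6, roi_thresh=0.05):
    """mask: 2-D 0/1 list; returns the QP for the CTU whose top-left is (x0, y0)."""
    rows = mask[y0:y0 + ctu]
    samples = [v for row in rows for v in row[x0:x0 + ctu]]
    roi_ratio = sum(samples) / len(samples)
    # ROI CTUs keep the base QP; background CTUs are quantized more coarsely.
    return base_qp if roi_ratio >= roi_thresh else base_qp + offset
```

Because detection networks are largely insensitive to background fidelity, raising the QP only outside the masked objects trades invisible-to-the-task quality for bitrate.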
Abstract: To address the high complexity of inter-frame affine motion estimation in H.266/Versatile Video Coding (VVC), a fast affine motion estimation algorithm based on reconstructed prior information is proposed. Exploiting the mutual exclusivity between the inter Skip mode and the Affine motion estimation mode, the algorithm skips redundant affine motion estimation according to the reconstruction information of the parent Coding Unit (CU), the sub-CUs at the current level, and neighbouring CUs, thereby reducing the complexity of affine motion estimation. Experimental results show that, without noticeably affecting bitrate or video quality, the algorithm reduces overall encoding time by 10.17% and affine motion estimation time by 44.2% relative to the VVC reference, effectively lowering the complexity of affine motion estimation.
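The early-termination idea above rests on one observation: blocks that settle on Skip mode almost never benefit from an affine search. A minimal sketch of such a gate, consulting the already-reconstructed parent CU and spatial neighbours; the exact voting rule here is an illustrative assumption, not the paper's precise condition.

```python
# Early termination for affine motion estimation: bypass the affine search
# when reconstructed prior information points to Skip. The rule is a sketch.

def try_affine(parent_mode, neighbour_modes):
    """Return False (skip affine ME) when the parent CU and all neighbours chose Skip."""
    if parent_mode == "SKIP" and all(m == "SKIP" for m in neighbour_modes):
        return False               # strong evidence the region is static/translational
    return True                    # otherwise run the full affine search
```

Skipping the search in these cases removes the most expensive part of inter mode decision for static content while leaving genuinely affine regions untouched, which matches the reported large drop in affine ME time with negligible rate-quality impact.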
Abstract: With the development of virtual reality technology, 360-degree video has become increasingly popular. Such video must be converted to a 2-D image-plane format before being encoded with a standard encoder. To improve coding efficiency, experts developed the new-generation video coding standard H.266/VVC (Versatile Video Coding); however, the diversity of VVC partition modes makes encoding high-resolution 360-degree video very time-consuming. To address this problem, an early CU partition decision algorithm is designed. Statistical experiments on equirectangular projection (ERP) video show that horizontal partitions are more likely than vertical ones for this content. An empirical variogram is used to measure the difference in texture direction, and the partition is chosen according to the degree of difference between the horizontal and vertical directions of a coding unit. Experimental results show that, in all-intra mode, the algorithm saves 35.42% of encoding time with only a 0.70% BD-rate increase compared with the VVC test model VTM4.0.
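The texture-direction test above can be sketched with a first-lag empirical variogram computed along each axis of a CU: whichever direction shows more sample variation indicates how the block should be split. The function names and the anisotropy threshold are illustrative assumptions, not the paper's calibrated values.

```python
# Empirical variogram along one axis, and a partition-direction decision
# based on the horizontal/vertical anisotropy. Threshold is an assumption.

def variogram(block, dx, dy):
    """Half the mean squared difference between samples one lag apart."""
    h, w = len(block), len(block[0])
    diffs = [(block[y][x] - block[y + dy][x + dx]) ** 2
             for y in range(h - dy) for x in range(w - dx)]
    return 0.5 * sum(diffs) / len(diffs)

def preferred_split(block, ratio=1.5):
    """Pick a partition direction from the variogram anisotropy."""
    g_h = variogram(block, 1, 0)   # variation across columns
    g_v = variogram(block, 0, 1)   # variation across rows
    if g_v > ratio * g_h:
        return "HORIZONTAL"        # texture changes top-to-bottom: split into rows
    if g_h > ratio * g_v:
        return "VERTICAL"
    return "BOTH"                  # no clear direction: evaluate both partitions
```

Horizontally striped content (common in ERP projections, where distortion stretches texture along rows) yields a large vertical variogram and a horizontal split, pruning the vertical candidates and saving search time.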