Abstract: High Efficiency Video Coding (HEVC) is one of the most advanced techniques used in today's growing real-time multimedia applications. However, these applications require large transmission bandwidth, and the bandwidth needed varies with the video sequence and format. This paper proposes an adaptive information-based variable quantization matrix (AIVQM) developed for video formats with varying energy levels. The quantization method is adapted to the video sequence using statistical analysis, improving bit budget and quality and reducing complexity. Further, to give precise control over bit rate and quality, a multi-constraint prune algorithm is proposed in the second stage of the AIVQM technique to pre-calculate K paths. This allows the scheme to self-adapt and automatically choose one of the K paths as the available bandwidth changes dynamically. After extensive testing of the proposed algorithm in a multi-constraint environment with multiple paths, and evaluation based on peak signal-to-noise ratio (PSNR), bit budget, and time complexity for different videos, a noticeable improvement in rate-distortion (RD) performance is achieved. With the proposed AIVQM technique, more feasible and efficient video sequences are achieved with less loss in PSNR than the variable quantization method (VQM) algorithm, with a rise of approximately 10%–20% depending on the video sequence and format.
Abstract: Distributed video coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. Uplink-friendly DVC, which offers low-complexity, low-power, and low-cost video encoding, has attracted growing research interest. This paper presents a new method based on multiple view geometry for generating spatial side information in uncalibrated video sensor networks. The trifocal tensor encapsulates all the geometric relations among three views that are independent of scene structure; it can be computed from image correspondences alone, without knowledge of motion or calibration. Simulation results show that trifocal tensor-based spatial side information improves rate-distortion performance over motion-compensated interpolation side information by up to around 2 dB. A fusion step then merges the temporal and spatial side information to improve the quality of the final side information. Simulation results show that fusion yields a rate-distortion gain of about 0.4 dB.
Abstract: In this paper, we present a method that uses video codec technology to compress ECG signals. The method exploits both intra-beat and inter-beat correlations of ECG signals to achieve high compression ratios (CR) and a low percent root-mean-square difference (PRD). ECG signals have intra-beat and inter-beat redundancies, much as video signals have intra-frame and inter-frame correlation, so video codec technology can be used for ECG compression. This requires some preprocessing: the ECG signal is first segmented and normalized into a sequence of beat cycles of equal length, and these beat cycles are then treated as picture frames and compressed with a video codec. We evaluated the algorithm on records from the MIT-BIH arrhythmia database. Results show that, besides compressing efficiently, the algorithm offers adjustable resolution, random access, and flexibility in handling irregular periods and false QRS detections.
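The preprocessing step described above (segment, then normalize beats to a common length) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `beats_to_frames`, the use of R-peak indices as segment boundaries, linear resampling, and 8-bit amplitude scaling are all assumptions. How the equal-length beats are arranged into picture frames is not stated in the abstract, so the sketch simply returns one beat per row.

```python
import numpy as np

def beats_to_frames(ecg, r_peaks, beat_len=256):
    """Cut an ECG trace at R-peak indices and resample every beat to beat_len samples.

    Returns a (num_beats, beat_len) uint8 array plus the original beat lengths,
    which a decoder would need to restore each beat's true period.
    """
    rows, periods = [], []
    dst = np.linspace(0.0, 1.0, num=beat_len)
    for start, end in zip(r_peaks[:-1], r_peaks[1:]):
        beat = np.asarray(ecg[start:end], dtype=float)
        src = np.linspace(0.0, 1.0, num=len(beat))
        rows.append(np.interp(dst, src, beat))   # period normalization (assumed linear)
        periods.append(end - start)
    rows = np.stack(rows)
    lo, hi = rows.min(), rows.max()
    scale = 255.0 / (hi - lo) if hi > lo else 0.0  # amplitude normalization to 8 bits
    return np.round((rows - lo) * scale).astype(np.uint8), periods
```

The resulting 2-D array can be handed to any image or video encoder; the stored periods are side information that has to be transmitted alongside it.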
Funding: Supported in part by the National Science Foundation of China under Grant No. 62031013; the PCL-CMCC Foundation for Science and Innovation under Grant No. 2024ZY1C0040; the New Cornerstone Science Foundation for the Xplorer Prize; and the High-performance Computing Platform of Peking University.
Abstract: In-loop filters have been comprehensively explored during the development of video coding standards because of their remarkable noise-reduction capabilities. In the early stages of video coding, in-loop filters such as the deblocking filter, sample adaptive offset, and adaptive loop filter were applied separately to each component. Recently, cross-component filters have been studied to improve chroma fidelity by exploiting correlations between the luma and chroma channels. This paper introduces the cross-component filters used in state-of-the-art video coding standards, including the cross-component adaptive loop filter and cross-component sample adaptive offset. Cross-component filters aim to reduce compression artifacts based on the correlation between different components and to provide more accurate reconstructed pixel values. We present their origin, development, and status in current video coding standards. Finally, we discuss the further evolution of cross-component filters.
Funding: The National Nature Science Foundation of China (No. 90104013) and the 863 Project (Nos. 2002AA119010, 2001AA121061, and 2002AA123041).
Abstract: This letter proposes a rate control algorithm for the H.264 video encoder based on block activity and buffer state. Experimental results indicate that it performs well, providing a much more accurate bit rate and better coding efficiency than H.264. The computational complexity of the algorithm is reduced by adopting a novel block activity description method that uses the sum of absolute differences (SAD) of the 16×16 mode, and its robustness is enhanced by introducing a feedback circuit at the frame layer.
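As a rough illustration of a SAD-based block activity measure, the sketch below computes one SAD value per 16×16 macroblock between a frame and its prediction. The function name `macroblock_sad` and the use of a plain co-located reference frame as the prediction are assumptions made for this example; the letter's actual activity description and its coupling to the buffer state are not given in the abstract.

```python
import numpy as np

def macroblock_sad(cur, ref, mb=16):
    """Return a (rows, cols) map of SAD values, one per mb x mb macroblock.

    cur and ref are 2-D luma arrays of equal shape; any partial macroblocks
    at the right and bottom edges are ignored.
    """
    h, w = cur.shape
    h, w = h - h % mb, w - w % mb
    diff = np.abs(cur[:h, :w].astype(np.int32) - ref[:h, :w].astype(np.int32))
    # Split into macroblock tiles and sum the absolute differences inside each tile.
    return diff.reshape(h // mb, mb, w // mb, mb).sum(axis=(1, 3))
```

A rate controller could use such a map to give more bits to high-activity macroblocks, although the exact allocation rule would have to come from the letter itself.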
Abstract: With the development of general-purpose processors (GPPs) and video signal processing algorithms, it has become possible to implement a software-based real-time video encoder on a GPP. Its low cost and ease of upgrading have drawn developers to move video encoding from specialized hardware to more flexible software. In this paper, the encoding structure is first set up to support complexity scalability. Then a number of high-performance algorithms are applied to the key time-consuming modules of the coding process. Finally, at the programming level, processor characteristics are taken into account to improve data access efficiency and processing parallelism, and other programming methods such as lookup tables are adopted to reduce computational complexity. Simulation results show that these ideas not only improve the overall performance of video coding but also provide great flexibility in complexity regulation.
Abstract: Stereo video is widely used because it provides depth information. However, it is difficult to store and transmit because of its huge data volume, so a highly efficient channel coding algorithm and a suitable transmission strategy are needed to transmit video over bandwidth-limited channels. In this paper, unequal error protection (UEP) based on low-density parity-check (LDPC) codes is used to transmit stereo video over a bandwidth-limited wireless channel. LDPC codes with different correction levels are applied according to the importance of each video stream to reconstruction at the receiver. Simulation results show that the proposed transmission scheme increases the PSNR of the reconstructed image and improves subjective quality.
Abstract: Popular video coding standards such as H.264 and MPEG, which work on the principle of motion-compensated predictive coding, demand substantial computational resources at the encoder, increasing its complexity. Such bulky encoders are not suitable for applications such as low-power wireless surveillance, multimedia sensor networks, wireless PC cameras, and mobile camera phones. This paper examines a new video coding scheme based on the principle of distributed source coding. The scheme supports a low-complexity encoder while trying to achieve the rate-distortion performance of conventional video codecs. The current implementation uses LDPC codes for syndrome coding.
Abstract: Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The techniques used to improve video streaming performance, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the need for low-latency, high-quality video transmission in cloud-based live streaming is underlined. Ways of improving user experience and live streaming service performance with cutting-edge technology such as artificial intelligence and machine learning are discussed. In addition, the legal and regulatory implications of cloud-based live streaming are addressed, including network neutrality, data privacy, and content moderation. The future of cloud computing for live streaming is then examined, with a look at the most likely new developments in trends and technology. The findings have major policy-relevant implications for technology vendors, live streaming service providers, and regulators. Suggestions are provided on how stakeholders should address these concerns and take advantage of the potential of this rapidly evolving sector, along with insights into the key challenges and opportunities associated with cloud-based live streaming.
Funding: Supported by the National Natural Science Foundation of China under grants U20A20157, 61771082, 62271096, and 61871062; the General Program of Chongqing Natural Science Foundation under grant cstc2021jcyj-msxm X0032; the Natural Science Foundation of Chongqing, China (cstc2020jcyj-zdxm X0024); the Science and Technology Research Program of Chongqing Municipal Education Commission under grant KJQN202300632; and the University Innovation Research Group of Chongqing (CXQT20017).
Abstract: The Joint Video Experts Team (JVET) has announced the latest-generation Versatile Video Coding (VVC, H.266) standard. The in-loop filter in VVC inherits the deblocking filter (DBF) and sample adaptive offset (SAO) of High Efficiency Video Coding (HEVC, H.265) and adds the adaptive loop filter (ALF) to minimize the error between the original and decoded samples. However, when video with chaotic motion is encoded at low bit rates, serious blocking artifacts remain after in-loop filtering because of the severe quantization distortion of texture details. To tackle this problem, this paper proposes a convolutional neural network (CNN) based VVC in-loop filter for encoding chaotic-motion video at low bit rates. First, a blur-aware attention network is designed to perceive the blurring effect and restore texture details. Then, a deep in-loop filtering method based on the blur-aware network is proposed to replace the VVC in-loop filter. Experimental results show that the proposed method saves 8.3% of bit consumption on average at similar subjective quality. At a similar bit rate, the proposed method reconstructs more texture information, significantly reducing blocking artifacts and improving visual quality.
Funding: Supported by the Key R&D Program of China under Grant No. 2022YFC3301800; the Sichuan Local Technological Development Program under Grant No. 24YRGZN0010; and the ZTE Industry-University-Institute Cooperation Funds under Grant No. HC-CN-03-2019-12.
Abstract: To enhance video quality after encoding and decoding in video compression, this paper proposes a video quality enhancement framework based on local and non-local priors. Low-level features are first extracted by a single convolution layer and then processed by several conv-tran blocks (CTBs) to extract high-level features, which are ultimately transformed into a residual image. The final reconstructed video frame is obtained by element-wise addition of the residual image and the original lossy video frame. Experiments show that the proposed Conv-Tran Network (CTN) model effectively recovers the quality loss caused by Versatile Video Coding (VVC) and further improves VVC's performance.
Funding: Supported by ZTE Industry-University-Institute Cooperation Funds.
Abstract: To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization. This paper presents recent advances in VCM standards and gives a comprehensive introduction to the use cases, requirements, evaluation frameworks, and corresponding metrics of the VCM standard. The existing methods are then presented, introducing the existing proposals by category and the research progress reported at the latest VCM conference. Finally, we give conclusions.
Abstract: Video games have been around for several decades and have advanced considerably since their beginnings. They started as virtual games marketed to children, creating virtual realities across a variety of genres, including sports games (such as tennis, football, and baseball), war games, fantasy, and puzzles. The earliest games came from the sports genre, whereas multiplayer online shooting games are popular today. The purpose of this paper is to investigate the different types of tools available for cheating in the virtual world, which give players an undue advantage over others in a competition. As technology has advanced, video game development has expanded. Developers have written long stretches of code to give video games a new look, and as games have progressed, their code, bugs, bots, and errors have changed over the years. The code of video games has branched out from that of the original games, bringing many benefits to this virtual world while also creating more problems, such as bots. The analysis shows that the tools available for cheating put ordinary gamers at a disadvantage in what should be a fair contest.
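Abstract: High Efficiency Video Coding (HEVC) reduces the bit rate by about 50% compared with the previous-generation standard, H.264. However, the 35 prediction modes that HEVC introduces to improve intra prediction accuracy greatly increase the computational load, posing challenges for both software and hardware implementations. To address this problem, a fast algorithm built on HEVC is proposed that determines the intra prediction mode from the texture direction of the picture combined with the correlation between prediction modes. Experimental results show that, compared with the HEVC reference software HM16.20, the algorithm saves more than 46% of encoding time with a BD-Rate loss of only 5.79%. It significantly reduces the complexity of intra prediction mode decision and makes the algorithm easier to deploy on edge devices with limited hardware resources, such as embedded systems.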
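Abstract: Existing in-loop filters based on convolutional neural networks (CNNs) tend to apply a separate network to each quantization parameter (QP), which consumes substantial resources for model training and increases the memory burden. To address this problem, a CNN-based QP-adaptive in-loop filter is proposed. First, a lightweight classification network is designed that divides coding tree units (CTUs) into three classes (hard, medium, and easy) according to how difficult they are to filter. Then, three CNN-based filtering networks incorporating a feature information enhancement and fusion module are built to meet the filtering needs of the three CTU classes under different QPs. The proposed in-loop filter is integrated into VTM 6.0, the test software of the Versatile Video Coding standard (H.266/VVC), replacing the original deblocking filter (DBF), sample adaptive offset (SAO) filter, and adaptive loop filter. Experimental results show that the method reduces the Bjøntegaard delta bit rate (BD-BR) by 3.14% on average and, compared with other CNN-based in-loop filters, significantly improves compression efficiency and reduces compression artifacts.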
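Abstract: To address the low accuracy of methods for detecting double compression with identical coding parameters under the Versatile Video Coding (VVC) standard, a detection method based on coding unit (CU) size, partition mode, and prediction mode is proposed. The video under test is encoded and decoded multiple times, and the basic bitstream features in the VVC stream that are closely related to the number of compression passes are analyzed and identified. High-level bitstream features built from CU size, partition mode, and prediction mode are then fed into a support vector machine to perform double compression detection. Experimental results show that the proposed method substantially improves double compression detection accuracy over the methods in the compared literature, reaching an average accuracy of 95.82%.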
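Abstract: In recent years, as computer vision has been widely applied in fields such as intelligent surveillance and autonomous driving, more and more video is not only watched by humans but also analyzed automatically by machine vision algorithms. Storing and transmitting such video efficiently for machine vision has become a new challenge. However, existing video coding standards, such as the latest Versatile Video Coding (VVC/H.266), are optimized mainly for the characteristics of human vision and do not fully consider how compression affects the performance of machine vision tasks. To solve this problem, this paper takes multi-object tracking as a typical machine vision video processing task and proposes a VVC intra coding algorithm oriented toward machine vision. First, a neural network interpretability method, gradient-weighted class activation mapping (GradCAM++), is used to perform saliency analysis on the video content, locating the regions that the machine vision task attends to and representing them as a saliency map. Next, to highlight the key edge and contour information in the picture, edge detection is introduced and its result is fused with the saliency analysis result to obtain the final machine vision saliency map. Finally, the fused machine vision saliency map is used to improve the VVC mode selection process, optimizing the mode decisions for block partitioning and intra prediction in VVC. Machine vision distortion is introduced in place of the original signal distortion to adjust the rate-distortion optimization formula, so that the encoder preserves as much information relevant to the vision task as possible during compression. Experimental results show that, compared with the VVC baseline, the proposed method saves 12.7% of the bit rate while maintaining the same machine vision detection accuracy.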
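For reference, the adjusted mode decision can be written in the usual Lagrangian form. The notation below (cost J, distortion D, rate R, multiplier λ, per-pixel weight w_i) is assumed for illustration and is not taken from the paper. In particular, the saliency-weighted squared error is only one plausible way to define a machine vision distortion from the fused saliency map.

```latex
% Conventional rate-distortion cost for a candidate mode m
J(m) = D_{\text{signal}}(m) + \lambda R(m)

% Machine vision variant (assumed form): distortion weighted by the fused saliency map
J_{\text{mv}}(m) = D_{\text{mv}}(m) + \lambda R(m), \qquad
D_{\text{mv}}(m) = \sum_{i} w_i \,\bigl(x_i - \hat{x}_i(m)\bigr)^2
```

Here x_i is an original sample, \hat{x}_i(m) its reconstruction under mode m, and w_i the saliency weight at that position, so errors in regions the vision task attends to cost more than errors elsewhere.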
Funding: The EU Seventh Framework Programme FP7-PEOPLE-IRSES (No. 247083).
Abstract: To reduce both computational complexity and coding time, an improved algorithm for the early detection of all-zero blocks (AZBs) in H.264/AVC is proposed. Previous AZB detection algorithms are reviewed. Three types of transformed frequency-domain coefficients that are quantized to zero are analyzed. Based on the three types of frequency-domain scaling factors, the corresponding spatial coefficients are derived. The Schwarz inequality is then applied to derive three thresholds based on the spatial coefficients. Another threshold is set on the basis of the probability distribution of zero coefficients in a block. As a result, an adaptive AZB detection algorithm is proposed based on the minimum of the former three thresholds and the zero-block distribution threshold. Simulation results show that, compared with existing AZB detection algorithms, the proposed algorithm achieves a 5% higher AZB detection ratio and a 4% to 10% computation saving with only 0.1 dB of video quality degradation.
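The idea of early AZB detection can be illustrated with a simpler sufficient condition than the one derived in the paper: bound each transform coefficient by a multiple of the block's SAD and skip the transform when no coefficient can survive quantization. The sketch below is not the paper's Schwarz-inequality thresholds. The multiplier table and rounding offsets are the values commonly cited for the H.264 4×4 quantizer and should be treated as assumptions here, as should all function names.

```python
import numpy as np

# H.264 4x4 forward core transform matrix.
CF = np.array([[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]], dtype=np.int64)

# Quantizer multipliers per QP % 6 (assumed values). Columns: both indices even,
# both indices odd, mixed positions.
MF_TABLE = np.array([[13107, 5243, 8066], [11916, 4660, 7490], [10082, 4194, 6554],
                     [9362, 3647, 5825], [8192, 3355, 5243], [7282, 2893, 4559]], dtype=np.int64)

def _params(qp, intra):
    qbits = 15 + qp // 6
    f = (1 << qbits) // (3 if intra else 6)  # rounding offset
    return qbits, f

def quantize(residual, qp, intra=False):
    """Transform and quantize one 4x4 residual block (reference path)."""
    qbits, f = _params(qp, intra)
    a, b, c = MF_TABLE[qp % 6]
    mf = np.full((4, 4), c, dtype=np.int64)
    mf[0::2, 0::2] = a
    mf[1::2, 1::2] = b
    w = CF @ residual @ CF.T
    return np.sign(w) * ((np.abs(w) * mf + f) >> qbits)

def sad_threshold(qp, intra=False):
    """Largest SAD bound under which every coefficient is guaranteed to quantize to zero."""
    qbits, f = _params(qp, intra)
    a, b, c = MF_TABLE[qp % 6]
    # |W(u,v)| <= k * SAD, with k = 1, 4, 2 for the three position classes.
    return min(((1 << qbits) - f) / (k * mf) for k, mf in ((1, a), (4, b), (2, c)))

def is_all_zero_block(residual, qp, intra=False):
    """Early test: True means the block can skip transform and quantization."""
    return bool(np.abs(residual).sum() < sad_threshold(qp, intra))
```

The test is conservative: a block it flags always quantizes to zero, but some all-zero blocks are missed. Tighter thresholds such as those in the paper aim to close that gap.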