Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62025101, 62088102, 62101007, and 61931014) and the Young Elite Scientist Sponsorship Program by the Beijing Association of Science and Technology (Grant No. BYSS2022019).
Abstract: Intelligent video coding (IVC), which dates back to the late 1980s with the concept of encoding videos with knowledge and semantics, includes compact representation models and methods for visual content that enable structural, detailed descriptions of visual information at different granularity levels (i.e., block, mesh, region, and object) and in different areas. It aims to support and facilitate a wide range of applications, such as visual media coding, content broadcasting, and ubiquitous multimedia computing. We present a high-level overview of IVC technology from model-based coding (MBC) to learning-based coding (LBC). MBC mainly adopts a manually designed coding scheme to explicitly decompose the videos to be coded into blocks or semantic components. Thanks to emerging deep learning technologies such as neural networks and generative models, LBC has become a rising topic in the coding area. In this paper, we first review the classical MBC approaches, followed by the LBC approaches for image and video data. We also give an overview of, and discuss, our recent attempts at neural coding approaches, which may inspire both academic research and industrial implementation. Some critical yet less studied issues are discussed at the end of this paper.