Funding: This paper is supported by the National Natural Science Foundation of China (61672064), the Basic Research Program of Qinghai Province (No. 2020-ZJ-709), and the project for the Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Abstract: This paper presents an effective machine learning-based depth selection algorithm for the Coding Tree Unit (CTU) in High Efficiency Video Coding (HEVC). Existing machine learning methods are limited in two respects: handling the initial depth decision of the Coding Unit (CU), and selecting a proper set of input features for the depth selection model. We first propose a new classification approach for predicting the initial division depth. In particular, we study the correlation among texture complexity, quantization parameters (QPs), and the depth decision of the CUs in order to predict the initial partition depth of the current CU. Second, we determine the input features of the classifier by analysing the correlation between the depth decision of the CUs, picture distortion, and bit-rate. Using these relationships, we also study a decision method for the final partition depth of the current CU that takes bit-rate and picture distortion as input. Finally, we formulate the depth division of the CUs as a binary classification problem and use a nearest neighbor classifier to perform the classification. The proposed method can significantly improve the efficiency of inter-frame coding by avoiding the cost of traversing every division depth. Experimental results show that it reduces the encoding time by 34.56% compared to HM-16.9 while keeping the partition depth of the CUs correct.
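The binary split decision described in this abstract can be illustrated with a minimal nearest-neighbor sketch. It is not the authors' implementation: the training samples, the normalization of bit-rate and distortion to the range [0, 1], and the names `nearest_neighbor_predict` and `should_split` are all assumptions made for illustration.

```python
import math

def nearest_neighbor_predict(train, query):
    """Return the label of the training sample closest to `query` (1-NN)."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical samples: (normalized bit-rate, normalized distortion) -> label,
# where 1 means "split the CU further" and 0 means "stop at this depth".
# These values are invented for illustration, not taken from the paper.
TRAIN = [
    ((0.10, 0.05), 0),
    ((0.15, 0.10), 0),
    ((0.20, 0.08), 0),
    ((0.80, 0.70), 1),
    ((0.75, 0.85), 1),
    ((0.90, 0.60), 1),
]

def should_split(bit_rate, distortion):
    """Binary depth decision for one CU from its two input features."""
    return nearest_neighbor_predict(TRAIN, (bit_rate, distortion)) == 1
```

With a classifier of this kind, an encoder would evaluate only the predicted depth instead of traversing every candidate depth, which is where the reported time saving would come from.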
Funding: This paper is supported by the following funds: the National Key Research and Development Program of China (2018YFF01010100), the Beijing Natural Science Foundation (4212001), the Basic Research Program of Qinghai Province under Grant No. 2021-ZJ-704, and the Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Abstract: Segmentation of sea-land remote sensing images is of great importance for downstream applications, including shoreline extraction, monitoring of the near-shore marine environment, and near-shore target recognition. To reduce the number of parameters and improve segmentation accuracy, we propose a new Squeeze-Depth-Wise UNet (SDW-UNet) deep learning model for sea-land remote sensing image segmentation. The proposed SDW-UNet model uses squeeze-excitation and depth-wise separable convolution to construct new convolution modules, which enhance the model's capacity to combine multiple channels and reduce the number of model parameters. We further explore the effect of positional encoding, a technique from the natural language processing (NLP) domain, on the sea-land segmentation task. We have conducted extensive experiments to compare the proposed network with mainstream segmentation networks in terms of accuracy, number of parameters, and prediction time. The test results on remote sensing data sets of Guam, Okinawa, Taiwan China, San Diego, and Diego Garcia demonstrate that SDW-UNet recognizes different types of sea-land areas effectively, with fewer parameters, lower prediction time, and better performance than other mainstream segmentation models. We also show that positional encoding can further improve segmentation accuracy.
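The parameter saving from depth-wise separable convolution can be checked with a short calculation. The sketch below uses the standard weight-count formulas with biases ignored; the layer sizes are hypothetical and are not taken from SDW-UNet.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes the channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)         # 73728
separable = depthwise_separable_params(3, 64, 128)  # 8768
ratio = separable / standard                        # equals 1/c_out + 1/k**2
```

For this example layer the separable form needs roughly 12% of the weights of the standard convolution.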
Funding: This paper is supported by the following funds: the National Key Research and Development Program of China (2018YFF01010100), the Basic Research Program of Qinghai Province under Grant No. 2021-ZJ-704, the Beijing Natural Science Foundation (4212001), and the Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Abstract: Versatile Video Coding (H.266/VVC), newly released by the Joint Video Experts Team (JVET), introduces the quad-tree plus multi-type tree (QTMT) partition structure, which extends the quad-tree (QT) partition structure of High Efficiency Video Coding (H.265/HEVC). The more complicated coding unit (CU) partitioning process in H.266/VVC significantly improves video compression efficiency but greatly increases computational complexity compared with H.265/HEVC. This very high encoding complexity has obstructed real-time applications. To solve this problem, this paper proposes a CU partition algorithm based on a convolutional neural network (CNN) to speed up the H.266/VVC CU partition process. First, 64×64 CUs are classified as smooth-texture, mildly complex-texture, or complex-texture CUs according to their texture characteristics. Second, a CU texture complexity classification convolutional neural network (CUTCC-CNN) is proposed to classify the CUs. Finally, according to the classification results, the encoder is guided to skip different rate-distortion optimization (RDO) search processes, and the optimal CU partition is determined. Experimental results show that the proposed method reduces the average coding time by 32.2% with only 0.55% BD-BR loss compared with VTM 10.2.