Funding: Supported by the National Natural Science Foundation of China (No. 61834005, 61772417, 61802304, 61602377, 61874087, 61634004) and the Shaanxi International Science and Technology Cooperation Program (No. 2018KW-006).
Abstract: To reduce the computational complexity and storage cost of the wedge segmentation algorithm, a simplified wedgelet matching scheme is proposed. It exploits the correlation between the wedgelet separation line of the depth map and the intra-prediction direction in 3D high-efficiency video coding (3D-HEVC). Based on the difference between wedge segmentation across adjacent edges and across opposite edges, a set containing only 104×4 wedgelet templates is given. By expanding the wedgelet of a minimum unit, a simple method for obtaining the separation line for depth blocks of different sizes is put forward. Furthermore, an efficient parallel scheme for the improved wedge segmentation mode prediction is introduced on the array processor (DPR-CODEC) developed by the project team. With this scheme, the prediction unit (PU) size can be switched flexibly among 4×4, 8×8, 16×16, and 32×32, which better matches the requirements of the HEVC standard. Verified with test sequences in HTM16.1 and on a Xilinx Virtex-6 field programmable gate array (FPGA), respectively, the experimental results show that the proposed methods save 99.2% of the storage space and 63.94% of the encoding time, and that the serial/parallel speedup ratio of each template reaches 1.84 on average. The design takes coding performance as well as storage and resource consumption into account.
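The idea of deriving larger partitions from a minimum-unit wedgelet can be illustrated with a small sketch. It assumes the expansion is plain sample replication, which the abstract does not confirm; the function names `wedgelet_pattern` and `expand_pattern` and the example line endpoints are hypothetical and are not taken from the paper or from HTM.

```python
def wedgelet_pattern(size, start, end):
    """Binary partition of a size x size block by the line start -> end.

    start and end are (x, y) sample positions; the endpoints used below
    are illustrative assumptions, not templates from the paper.
    """
    (x0, y0), (x1, y1) = start, end
    pattern = []
    for y in range(size):
        row = []
        for x in range(size):
            # The sign of the 2D cross product tells which side of the
            # separation line the sample (x, y) lies on.
            side = (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)
            row.append(1 if side > 0 else 0)
        pattern.append(row)
    return pattern


def expand_pattern(pattern, factor):
    """Expand a pattern by replicating each sample factor x factor times.

    Assumption: replication stands in for the paper's expansion method.
    """
    expanded = []
    for row in pattern:
        wide = [value for value in row for _ in range(factor)]
        expanded.extend(list(wide) for _ in range(factor))
    return expanded


# A 4x4 minimum unit split along its main diagonal, expanded to 8x8.
base = wedgelet_pattern(4, (0, 0), (3, 3))
block_8x8 = expand_pattern(base, 2)
```

Under this assumption only the minimum-unit templates need to be stored, and the 8×8, 16×16, and 32×32 partitions are derived on demand with factors 2, 4, and 8.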
Funding: Supported by the National Natural Science Foundation of China (No. 61834005, 61772417, 61802304, 61602377, 61874087, 61634004) and the Shaanxi Province Key R&D Plan (No. 2021GY-029, 2021KW-16).
Abstract: Because of the flexible quadtree partition structure of coding tree units (CTUs), the deblocking filter (DBF) in high efficiency video coding (HEVC) consumes a large amount of resources when implemented in hardware, and flexible switching between coding blocks of different sizes is difficult to achieve. To address this problem, a reconfigurable implementation of the DBF is proposed. Based on the dynamic programmable reconfigurable video array processor (DPRAP) with a context-switch reconfiguration mechanism, flexible runtime switching between two coding block sizes is realized. The experimental results show that the maximum operating frequency reaches 151.4 MHz. Compared with a dedicated hardware architecture, resource consumption is reduced by 28.1% while dynamic switching between the algorithms for the two coding block sizes is supported. Compared with the results of HM16.0, tested on a complete I-frame, the average peak signal-to-noise ratio (PSNR) of the proposed reconfigurable implementation increases by 3.0508 dB, indicating a moderate improvement in coding quality.
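For readers unfamiliar with the quality metric quoted above, the following is a minimal sketch of how PSNR is conventionally computed from the mean squared error between a reference and a reconstructed frame. It assumes 8-bit samples (peak value 255) supplied as flat sequences; the function name `psnr` is illustrative, and this is not the measurement code used with HM16.0.

```python
import math


def psnr(reference, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumes 8-bit samples by default (max_value=255).
    """
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.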