Existing depth video coding algorithms are generally based on in-loop depth filters, whose performance is unstable and easily affected by outliers. In this paper, we design a joint weighted sparse representation-based median filter as the in-loop filter in a depth video codec. It constructs a depth candidate set containing relevant neighboring depth pixels, based on sparse coding weighted by depth and intensity similarity; a median operation is then performed on this set to select a neighboring depth pixel as the filtering result. Experimental results indicate that the depth bitrate is reduced by about 9% compared with the anchor method, confirming that the proposed method is more effective in reducing the depth bitrate required for a given synthesis quality level.
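The candidate-set-plus-median idea can be illustrated with a minimal sketch. It assumes Gaussian similarity weights as a stand-in for the sparse-coding weights mentioned in the abstract, whose exact form is not given there; the function name `joint_weighted_median` and the parameters `radius`, `sigma_d`, and `sigma_i` are hypothetical.

```python
import math

def joint_weighted_median(depth, intensity, y, x, radius=1,
                          sigma_d=10.0, sigma_i=10.0):
    """Weighted median of the depth values around pixel (y, x).

    Assumption: weights are Gaussian similarities in depth and intensity
    relative to the center pixel, NOT the paper's sparse-coding weights.
    """
    h, w = len(depth), len(depth[0])
    candidates = []  # (depth value, weight) pairs forming the candidate set
    for ny in range(max(0, y - radius), min(h, y + radius + 1)):
        for nx in range(max(0, x - radius), min(w, x + radius + 1)):
            dd = depth[ny][nx] - depth[y][x]
            di = intensity[ny][nx] - intensity[y][x]
            weight = math.exp(-dd * dd / (2 * sigma_d ** 2)
                              - di * di / (2 * sigma_i ** 2))
            candidates.append((depth[ny][nx], weight))
    # Weighted median: walk the sorted values until half the weight is covered.
    candidates.sort(key=lambda c: c[0])
    half = sum(wt for _, wt in candidates) / 2.0
    acc = 0.0
    for value, wt in candidates:
        acc += wt
        if acc >= half:
            return value
    return candidates[-1][0]
```

Because the output is always one of the existing neighboring depth values, the filter does not create intermediate depth levels across an object boundary, which is the property the abstract relies on.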
In order to repair the dark holes in Kinect depth video, we propose a tensor-based depth hole-filling method. First, we process the original depth video with a weighted moving average system. Then, we reconstruct the low-rank tensors and sparse tensors of the video using the tensor recovery method, through which the rough motion saliency can be initially separated from the background. Finally, we construct a fourth-order tensor for the moving target part by grouping similar patches, so that video denoising and hole filling can be formulated as a low-rank completion problem. In the proposed algorithm, the tensor model is used to preserve the spatial structure of the video modality, and block processing is employed to overcome the information loss of traditional frame-based video processing. Experimental results show that our method significantly improves the quality of depth video and is strongly robust.
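The first step, the weighted moving average, might look like the following sketch. The abstract does not specify the weights or how holes are encoded, so two assumptions are made here: holes are stored as the value 0, and hole pixels are simply left out of the average. The function name `temporal_weighted_average` is hypothetical, and the tensor recovery steps are not covered.

```python
def temporal_weighted_average(frames, weights, hole_value=0):
    """Per-pixel weighted average over a window of frames.

    Assumption: a pixel equal to hole_value is a hole and is skipped, so a
    hole in one frame can be filled from the same pixel in other frames.
    """
    h, w = len(frames[0]), len(frames[0][0])
    out = [[hole_value] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for frame, wt in zip(frames, weights):
                v = frame[y][x]
                if v != hole_value:
                    num += wt * v
                    den += wt
            if den > 0:  # stays a hole if no frame has a valid sample
                out[y][x] = num / den
    return out
```

A pixel that is a hole in every frame of the window remains a hole, which is where a completion step such as the low-rank formulation described above would still be needed.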
Depth maps play a crucial role in practical applications such as computer vision, augmented reality, and autonomous driving. Obtaining clear and accurate depth information in video depth estimation is a significant challenge in computer vision: existing monocular video depth estimation models tend to produce blurred or inaccurate depth in regions with object edges and low texture. To address this issue, we propose a monocular depth estimation model architecture guided by semantic segmentation masks, which introduces semantic information into the model to correct the ambiguous depth regions. Experimental results show that our method improves the accuracy of edge depth, demonstrating the effectiveness of our approach.
In this paper, we propose a new algorithm for temporally consistent depth map estimation to generate three-dimensional video. The proposed algorithm adaptively computes the matching cost using a temporal weighting function, which is obtained by block-based moving object detection and motion estimation with variable block sizes. Experimental results show that the proposed algorithm improves the temporal consistency of the depth video and reduces both the flickering artefact in the synthesized view and the number of coding bits for depth video coding by about 38%.
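A temporally weighted matching cost can be sketched as follows. The abstract does not give the form of the weighting function, so this example assumes a binary weight (1 for a static block, 0 for a moving block) and a simple absolute-difference data cost on one scanline; `best_disparity`, `lam`, and `is_static` are hypothetical names.

```python
def best_disparity(left, right, x, prev_disp, is_static, max_disp=4, lam=2.0):
    """Winner-take-all disparity for pixel x on one scanline.

    Assumption: the temporal weight is 1.0 for static blocks and 0.0 for
    moving blocks; the paper derives its weight from motion analysis.
    """
    temporal_weight = 1.0 if is_static else 0.0
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d < 0:
            break  # candidate falls outside the right scanline
        data_cost = abs(left[x] - right[x - d])
        # Penalize deviation from the previous frame's disparity.
        temporal_cost = lam * temporal_weight * abs(d - prev_disp)
        cost = data_cost + temporal_cost
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

For a static block, the temporal term breaks ties between equally good matches in favor of the previous frame's disparity, which is what suppresses frame-to-frame flicker; for a moving block the term vanishes and the data cost alone decides.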
In this paper, an approach is proposed for the problem of consistency in depth map estimation from binocular stereo video sequences. The method enforces temporal and spatial consistency to eliminate flickering artifacts and smoothing inaccuracy in depth recovery. To this end, an improved global stereo matching based on graph cut and energy optimization is implemented. In the temporal domain, a penalty function with a coherence factor is introduced for temporal consistency, and the factor is determined by the Lucas-Kanade optical flow weighted histogram similarity constraint (LKWHSC). In the spatial domain, the joint bilateral truncated absolute difference (JBTAD) is proposed for segmentation smoothing. The method smooths naturally and uniformly in low-gradient regions, while avoiding over-smoothing and keeping edges sharp at high-gradient discontinuities, thereby realizing spatial consistency. Experimental results show that the algorithm obtains depth maps with better spatial and temporal consistency than existing algorithms.
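The abstract names JBTAD without defining it. One plausible reading of the name, offered only as an assumption, is a truncated absolute disparity difference scaled by a joint bilateral weight (spatial distance times color similarity). The sketch below follows that reading; `jbtad_penalty`, `trunc`, `sigma_s`, and `sigma_c` are hypothetical.

```python
import math

def jbtad_penalty(d_p, d_q, color_p, color_q, dist,
                  trunc=3.0, sigma_s=2.0, sigma_c=10.0):
    """Smoothness penalty between neighboring pixels p and q.

    Assumption: bilateral weight * truncated absolute disparity difference.
    This is an interpretation of the name, not the paper's definition.
    """
    spatial = math.exp(-dist ** 2 / (2 * sigma_s ** 2))
    photometric = math.exp(-(color_p - color_q) ** 2 / (2 * sigma_c ** 2))
    # Truncation caps the cost of large jumps so true edges stay affordable.
    return spatial * photometric * min(abs(d_p - d_q), trunc)
```

Under this reading, neighbors with similar color are pushed toward similar disparity (smoothing in low-gradient regions), while a strong color difference drives the weight toward zero so that a disparity discontinuity at an edge is barely penalized.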
The depth information of a scene indicates the distance between the object and the camera, and depth extraction is a key technology in 3D video systems. The emergence of Kinect makes capturing high-resolution depth maps possible. However, the depth map captured by Kinect cannot be used directly because of holes and noise, which need to be repaired. We propose a texture-combined inpainting algorithm in this paper. First, the foreground is segmented using the color characteristics of the texture image, in order to repair the foreground of the depth map. Second, region growing is used to determine the match region of each hole in the depth map, and the match region is positioned accurately according to the texture information. Then the match region is weighted to fill the hole. Finally, a Gaussian filter is used to remove the noise in the depth map. Experimental results show that the proposed method effectively repairs the holes in the original depth map and yields an accurate and smooth depth map, which can be used to render a virtual image of good quality.
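The weighted filling step can be sketched in simplified form. Instead of the region-growing match region described in the abstract, this example assumes a fixed square window and weights each valid neighbor by its color similarity to the hole pixel; holes are assumed to be stored as 0, and `fill_hole` and `sigma_c` are hypothetical names.

```python
import math

def fill_hole(depth, color, y, x, radius=1, sigma_c=10.0, hole_value=0):
    """Estimate the depth of hole pixel (y, x) from its valid neighbors.

    Assumption: a fixed window replaces the paper's region-growing match
    region, and weights are Gaussian in color difference.
    """
    h, w = len(depth), len(depth[0])
    num = den = 0.0
    for ny in range(max(0, y - radius), min(h, y + radius + 1)):
        for nx in range(max(0, x - radius), min(w, x + radius + 1)):
            if depth[ny][nx] == hole_value:
                continue  # other holes carry no depth information
            dc = color[ny][nx] - color[y][x]
            wt = math.exp(-dc * dc / (2 * sigma_c ** 2))
            num += wt * depth[ny][nx]
            den += wt
    return num / den if den > 0 else hole_value
```

With a hole next to an object boundary, neighbors that share the hole's color dominate the estimate, so the filled value follows the surface the pixel visually belongs to rather than a blend of foreground and background depth.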
Depth maps are used for virtual view synthesis in free-viewpoint television (FTV) systems. When depth maps are derived using existing depth estimation methods, depth distortions cause undesirable artifacts in the synthesized views. To solve this problem, a depth-map-based 3D video quality model (D-3DV) for virtual view synthesis and depth map coding in FTV applications is proposed. First, the relationships between distortions in the coded depth map and the rendered view are derived. Then, a precise 3DV quality model based on depth characteristics is developed for the synthesized virtual views. Finally, based on the D-3DV model, multilateral filtering is applied as a pre-processing filter to reduce rendering artifacts. Experimental results, evaluated by objective and subjective methods, indicate that the proposed D-3DV model can reduce the bit rate of depth coding and achieve better rendering quality.
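To make the link between depth distortion and rendering distortion concrete, the sketch below uses a commonly used relationship for 8-bit depth maps and parallel cameras, which is not necessarily the one derived in the paper: with 1/Z = (v/255)(1/Z_near - 1/Z_far) + 1/Z_far and disparity d = f·B/Z, an error in the depth level v shifts the rendered pixel proportionally. All parameter names are hypothetical.

```python
def disparity_error(depth_error, focal_px, baseline, z_near, z_far):
    """Horizontal position error (in pixels) caused by a depth-level error.

    Assumptions: 8-bit depth quantized uniformly in 1/Z between z_near and
    z_far, and a parallel (rectified) camera arrangement. baseline, z_near,
    and z_far share one length unit; focal_px is in pixels.
    """
    return (focal_px * baseline * (1.0 / z_near - 1.0 / z_far)
            * depth_error / 255.0)
```

The relationship is linear in the depth error, and it scales with the baseline and with the scene's depth range, so the same coding error in the depth map is more visible for wide baselines and for scenes that come close to the camera.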
To deliver three-dimensional (3D) videos through current two-dimensional (2D) broadcasting systems, frame-compatible packing formats that include one texture frame and one depth map at various down-sampling ratios have been proposed as the simplest and most effective solution. In this paper, we introduce two depth enhancement algorithms to further improve the quality of the compatible centralized texture-depth packing (CTDP) formats for delivering 3D video services. To compensate for the loss caused by the YCbCr 4:4:4 to 4:2:0 conversion of colored depth, two efficient depth reconstruction processes based on texture and depth consistency are proposed. Experimental results show that the proposed enhanced CTDP depacking process outperforms the 2DDP format and the original CTDP depacking procedure in synthesizing virtual views. With the proposed depth reconstruction processes, more accurate reconstructed depth maps and better synthesis quality can be achieved. Until 3D broadcasting systems that adopt truly depth- and texture-dependent coding procedures become available, we believe that the proposed CTDP formats with depth enhancement can help deliver 3D videos over current 2D broadcasting systems simply and efficiently.
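The loss that the depth reconstruction has to compensate for can be illustrated with a small sketch of chroma subsampling. It assumes 2x2 averaging for the 4:4:4 to 4:2:0 step and nearest-neighbor upsampling on the way back, both common choices that the abstract does not confirm; the function names are hypothetical.

```python
def downsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumption: plane width and height are even.
    """
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]

def upsample_nearest(plane):
    """Restore full resolution by repeating each sample in a 2x2 block."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out
```

When depth samples are carried in the chroma planes, a flat 2x2 block survives this round trip unchanged, but a block that straddles a depth edge comes back as a single averaged value, which is the kind of error a texture-guided reconstruction would aim to undo.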
Funding (in-loop depth filter paper): Supported by the National Natural Science Foundation of China (61462048).
Funding (temporally consistent depth map estimation paper): Supported by a National Research Foundation of Korea grant funded by the Korea Ministry of Science and Technology under Grant No. 2012-0009228.
Funding (binocular stereo depth consistency paper): Supported by the Science and Technology Innovation Project of the Ministry of Culture of China (No. 2014KJCXXM08), the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (No. 2012BAH37F02), and the National High Technology Research and Development Program (863) of China (No. 2011AA01A107).
Funding (Kinect texture-combined inpainting paper): Supported by the Key Project of the National Natural Science Foundation of China (Nos. 60832003 and 61172096), the Major Project of the Shanghai Science and Technology Committee (No. 10510500500), and the Major Innovation Project of the Shanghai Municipal Education Commission.
Funding (D-3DV quality model paper): Supported by the National Natural Science Foundation of China (Grant No. 60832003), the Key Laboratory of Advanced Display and System Application (Shanghai University), Ministry of Education, China (Grant No. P200902), and the Key Project of the Science and Technology Commission of Shanghai Municipality (Grant No. 10510500500).