Funding: Supported by the National Natural Science Foundation of China (No. 61379065) and the Natural Science Foundation of Hebei Province (No. F2014203119).
Abstract: Occlusion has long been one of the challenging problems in computer vision, and the occlusion of visual objects arises in many areas of vision research. Once occlusion occurs in a vision system, it degrades object recognition, tracking, observation, and operation, so detecting occlusion autonomously should be one of the abilities of an intelligent vision system. Research on occlusion detection methods for visual objects has attracted increasing attention from scholars. First, the definition and classification of the occlusion problem are presented. Then, the characteristics and deficiencies of occlusion detection methods based on intensity images and on depth images are analyzed in turn, and the existing occlusion detection methods are compared. Finally, the problems of existing occlusion detection methods and possible research directions are pointed out.
Abstract: A new method is proposed for synthesizing intermediate views from a pair of stereoscopic images. To synthesize high-quality intermediate views, block matching is combined with a simplified multi-window technique and dynamic programming for disparity estimation. Occlusion detection is then performed to locate occluded regions, and their disparities are compensated. After the left-to-right and right-to-left disparities are projected onto the intermediate image, the intermediate view is synthesized with the occluded regions taken into account. Experimental results show that the proposed synthesis method obtains intermediate views of higher quality.
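The disparity estimation step can be illustrated with a minimal block-matching sketch on a single scanline. This is not the authors' method: it uses a plain sum of absolute differences (SAD) with winner-takes-all selection and omits the multi-window technique and dynamic programming named in the abstract. The function names, the window half-width `half`, and the border clamping are assumptions made for illustration.

```python
def sad(left, right, x, d, half):
    """SAD between the window centred at x in the left scanline and the
    window centred at x - d in the right scanline (indices clamped at borders)."""
    total = 0
    for k in range(-half, half + 1):
        xl = min(max(x + k, 0), len(left) - 1)
        xr = min(max(x + k - d, 0), len(right) - 1)
        total += abs(left[xl] - right[xr])
    return total


def block_match_scanline(left, right, max_disp, half=1):
    """For each left pixel, pick the disparity in [0, max_disp] with the lowest SAD.
    Ties resolve to the smallest disparity, so textureless regions are unreliable."""
    disparities = []
    for x in range(len(left)):
        costs = [sad(left, right, x, d, half) for d in range(max_disp + 1)]
        disparities.append(costs.index(min(costs)))
    return disparities
```

Calling `block_match_scanline(left, right, max_disp=3)` on a textured scanline whose right view is the left view shifted by two pixels recovers a disparity of 2 on the textured pixels; flat regions remain ambiguous, which is one reason practical methods add smoothness constraints such as dynamic programming.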
Abstract: Creating a 3D rendering model requires predicting an accurate depth map for the input images. The proposed approach predicts the depth map with a modified semi-global block matching algorithm that uses a variable window size and a gradient assessment of objects. 3D modeling and view synthesis algorithms can effectively handle the resulting disparity maps. This work uses a consistency check to identify occluded pixels and obtain an accurate depth map. The disparity maps predicted by semi-global block matching are evaluated on the Middlebury stereo benchmark dataset. The proposed method improves depth map quality within a reasonable processing time and outperforms other existing depth map prediction algorithms. The experimental results show that the proposed depth map prediction can identify inter-object boundaries even in the presence of occlusion, with lower detection error and runtime. We observed that the Middlebury stereo dataset has very few images with occluded objects, which made it difficult to demonstrate a gain. For this reason, we created our own dataset with occlusion using the structured lighting technique. The proposed regularization term, used as an optimization process in the graph cut algorithm, handles occlusion for different smoothing coefficients. The experimental results show that our dataset outperforms the Tsukuba dataset in terms of the percentage of occluded pixels.
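A consistency check of this kind is commonly implemented as a left-right comparison of two disparity maps. The sketch below shows that generic form on one scanline, under stated assumptions: the abstract does not give the exact check used, and the function name, the tolerance `tol`, and the choice to flag pixels that map outside the image are illustrative.

```python
def left_right_check(disp_left, disp_right, tol=1):
    """Flag left-view pixels whose disparity is not confirmed by the right view.

    disp_left[x]  : disparity of left pixel x, which maps to right pixel x - d
    disp_right[x] : disparity of right pixel x
    Returns a list of booleans; True marks an inconsistent pixel
    (occluded, mismatched, or mapped outside the right view).
    """
    width = len(disp_left)
    flagged = []
    for x in range(width):
        xr = x - disp_left[x]
        if xr < 0 or xr >= width:
            flagged.append(True)  # no counterpart in the right view
            continue
        flagged.append(abs(disp_left[x] - disp_right[xr]) > tol)
    return flagged
```

Pixels flagged here would typically be filled from neighbouring consistent disparities before the map is used for rendering.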
Abstract: High-resolution sub-meter satellite data play an increasingly crucial role in the 3D real-scene China construction initiative. Current research on 3D reconstruction using high-resolution satellite data primarily focuses on two approaches: multi-stereo fusion and multi-view matching. While algorithms based on these two methodologies for multi-view image 3D reconstruction have reached relative maturity, no systematic comparison has been conducted specifically on satellite data to evaluate the relative merits of multi-stereo fusion versus multi-view matching. This paper conducts a comparative analysis of the practical accuracy of both approaches using high-resolution satellite datasets from diverse geographical regions. To ensure a fair accuracy comparison, both methodologies employ non-local dense matching for cost optimization. Results demonstrate that the multi-stereo fusion method outperforms multi-view matching in all evaluation metrics, with approximately 1.2% higher average matching accuracy and 10.7% better elevation precision on the experimental datasets. Therefore, for 3D modeling applications using satellite data, we recommend adopting the multi-stereo fusion approach for digital surface model (DSM) product generation.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62176098 and 61703049) and the Natural Science Foundation of Hubei Province of China (No. 2019CFA022).
Abstract: Occlusion relationship reasoning aims to locate where an object occludes others and to estimate the depth order of these objects in three-dimensional (3D) space from a two-dimensional (2D) image. The former sub-task demands both the accurate location and the semantic indication of the objects, while the latter sub-task needs the depth order among the objects. Although several insightful studies have been proposed, a key characteristic of occlusion relationship reasoning, i.e., the specialty and complementarity between occlusion boundary detection and occlusion orientation estimation, is rarely discussed. To verify this claim, in this paper we integrate these properties into a unified end-to-end trainable network, namely the feature separation and interaction network (FSINet). It contains a shared encoder-decoder structure to learn the complementary property between the two sub-tasks, and two separate paths to learn the specialized properties of the two sub-tasks. Concretely, the occlusion boundary path contains an image-level cue extractor to capture rich location information of the boundary, a detail-perceived semantic feature extractor, and a contextual correlation extractor to acquire refined semantic features of objects. In addition, a dual-flow cross detector has been customized to alleviate false-positive and false-negative boundaries. For the occlusion orientation estimation path, a scene context learner has been designed to capture the depth order cue around the boundary. In addition, two stripe convolutions are built to judge the depth order between objects. The shared decoder supplies the feature interaction, which plays a key role in exploiting the complementarity of the two paths. Extensive experimental results on the PIOD and BSDS ownership datasets reveal the superior performance of FSINet over state-of-the-art alternatives. Additionally, abundant ablation studies are offered to demonstrate the effectiveness of our design.
Funding: Supported by the National Natural Science Foundation of China (No. 61571390) and the Fundamental Research Funds for the Central Universities, China (No. 2016QNA5004).
Abstract: Robust object tracking has been an important and challenging research area in computer vision for decades. With the increasing popularity of affordable depth sensors, range data is widely used in visual tracking for its robustness to varying illumination and occlusions. In this paper, a novel tracker based on RGBD data and sparse learning is proposed. The range data is integrated into the sparse learning framework in three respects. First, an extra depth view is added to the color-image-based visual features as an independent view for robust appearance modeling. Then, a special occlusion template set is designed to supplement the existing dictionary for handling various occlusion conditions. Finally, a depth-based occlusion detection method is proposed to efficiently determine an accurate time for the template update. Extensive experiments on both the KITTI and Princeton data sets demonstrate that the proposed tracker outperforms state-of-the-art tracking algorithms, including both sparse learning and RGBD based methods.
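The idea of using depth to decide when a template update is safe can be sketched as a simple gate. This is a hypothetical illustration, not the paper's detector: the rule (fraction of valid depth samples lying well in front of the target), the names `is_occluded` and `update_target_depth`, and the values of `margin`, `max_front_ratio`, and `rate` are all assumptions, and a running target depth stands in for the appearance templates of a real tracker.

```python
from statistics import median


def is_occluded(depth_patch, target_depth, margin=0.3, max_front_ratio=0.35):
    """Return True if too many valid depth samples lie in front of the target.

    depth_patch  : flat list of depths (metres) inside the bounding box; 0 = missing
    target_depth : current estimate of the target's depth (metres)
    """
    valid = [d for d in depth_patch if d > 0]
    if not valid:
        return True  # no depth evidence: be conservative and skip the update
    in_front = sum(1 for d in valid if d < target_depth - margin)
    return in_front / len(valid) > max_front_ratio


def update_target_depth(target_depth, depth_patch, rate=0.1):
    """Blend the target depth toward the patch median only when the frame
    is judged unoccluded. Returns (new_depth, updated_flag)."""
    if is_occluded(depth_patch, target_depth):
        return target_depth, False
    valid = [d for d in depth_patch if d > 0]
    return (1 - rate) * target_depth + rate * median(valid), True
```

Skipping the update while an occluder is in front of the target keeps the occluder's appearance from contaminating the model, which is the motivation the abstract gives for timing the template update.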