Funding: National Natural Science Foundation of China (Grant Nos. 61521002 and 62132012).
Abstract: Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. Attention mechanisms have achieved great success in many visual tasks, including image classification, object detection, semantic segmentation, video understanding, image generation, 3D vision, multimodal tasks, and self-supervised learning. In this survey, we provide a comprehensive review of various attention mechanisms in computer vision and categorize them according to approach, such as channel attention, spatial attention, temporal attention, and branch attention; a related repository, https://github.com/MenghaoGuo/Awesome-Vision-Attentions, is dedicated to collecting related work. We also suggest future directions for attention mechanism research.
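The description of attention as a dynamic weight adjustment driven by the input features can be illustrated with a minimal channel-attention sketch. This is not the survey's own code: the function name `channel_attention`, the single linear layer, and the `weights` and `bias` values are hypothetical choices made for illustration, and practical modules typically use learned multi-layer gating on tensors rather than nested lists.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, weights, bias):
    """Reweight channels by gates computed from the input itself.

    features: list of C channels, each an H x W list of floats.
    weights:  C x C matrix (hypothetical stand-in for learned parameters).
    bias:     length-C vector (hypothetical stand-in for learned parameters).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Excitation: one linear layer plus a sigmoid yields gates in (0, 1).
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, bias)
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(features, gates)
    ]
    return scaled, gates


# Toy input: channel 0 is strongly activated, channel 1 is silent.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
weights = [[2.0, 0.0], [0.0, 2.0]]
bias = [0.0, 0.0]

scaled, gates = channel_attention(features, weights, bias)
print(gates)
```

Because the gates are computed from the pooled input rather than fixed in advance, a different image produces different channel weights, which is the sense in which the weighting is dynamic.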
Abstract: Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. While many models have been proposed and several applications have emerged, a deep understanding of achievements and issues remains lacking. We aim to provide a comprehensive review of recent progress in salient object detection and situate this field among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 228 publications, we survey i) roots, key concepts, and tasks, ii) core techniques and main modeling trends, and iii) datasets and evaluation metrics for salient object detection. We also discuss open problems such as evaluation metrics and dataset bias in model performance, and suggest future research directions.
Funding: Supported by Tianjin NSF (Nos. 18JCYBJC41300 and 18ZXZNGX00110), the National Natural Science Foundation of China (No. 61972216), and the Open Project Program of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. VRLAB2019B04).
Abstract: Humans have the ability to perceive kinetic depth effects, i.e., to perceive 3D shapes from 2D projections of rotating 3D objects. This process is based on a variety of visual cues such as lighting and shading effects. However, when such cues are weak or missing, perception can become faulty, as demonstrated by the famous silhouette illusion example of the spinning dancer. Inspired by this, we establish objective and subjective evaluation models of rotated 3D objects by taking their projected 2D images as input. We investigate five different cues: ambient luminance, shading, rotation speed, perspective, and color difference between the objects and background. In the objective evaluation model, we first apply 3D reconstruction algorithms to obtain an objective reconstruction quality metric, and then use quadratic stepwise regression analysis to determine weights of depth cues to represent the reconstruction quality. In the subjective evaluation model, we use a comprehensive user study to reveal correlations with reaction time and accuracy, rotation speed, and perspective. The two evaluation models are generally consistent, and potentially of benefit to inter-disciplinary research into visual perception and 3D reconstruction.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61972216 and 62111530097) and the NSF of Tianjin City (Grant Nos. 18JCYBJC41300 and 18ZXZNGX00110).
Abstract: Existing color editing algorithms enable users to edit the colors in an image according to their own aesthetics. Unlike artists who have an accurate grasp of color, ordinary users are inexperienced in color selection and matching, and allowing non-professional users to edit colors arbitrarily may lead to unrealistic editing results. To address this issue, we introduce a palette-based approach for realistic object-level image recoloring. Our data-driven approach consists of an offline learning part that learns the color distributions for different objects in the real world, and an online recoloring part that first recognizes the object category, and then recommends appropriate realistic candidate colors learned in the offline step for that category. We also provide an intuitive user interface for efficient color manipulation. After color selection, image matting is performed to ensure smoothness of the object boundary. Comprehensive evaluation on various color editing examples demonstrates that our approach outperforms existing state-of-the-art color editing algorithms.
Abstract: Interactive image segmentation (IIS) is an important technique for obtaining pixel-level annotations. In many cases, target objects share similar semantics. However, IIS methods neglect this connection and in particular the cues provided by representations of previously segmented objects, previous user interaction, and previous prediction masks, which can all provide suitable priors for the current annotation. In this paper, we formulate a sequential interactive image segmentation (SIIS) task for minimizing user interaction when segmenting sequences of related images, and we provide a practical approach to this task using two pertinent designs. The first is a novel interaction mode. When annotating a new sample, our method can automatically propose an initial click based on previous annotation. This dramatically reduces the interaction burden on the user. The second is an online optimization strategy, with the goal of providing semantic information when annotating specific targets, optimizing the model with dense supervision from previously labeled samples. Experiments demonstrate the effectiveness of regarding SIIS as a particular task, and of our methods for addressing it.