Funding: supported by the National Natural Science Foundation of China (Grant No. 60872117) and the Leading Academic Discipline Project of Shanghai Municipal Education Commission (Grant No. J50104).
Abstract: Face recognition provides a natural visual interface for human-computer interaction (HCI) applications. Recognition is hindered, however, by variations in the appearance of face images caused by changes in lighting, expression, viewpoint, and aging, and by occlusion. Although various algorithms have been presented, face recognition remains a very challenging topic. This paper proposes a novel approach to real-time face recognition for HCI. In view of the limits of popular approaches to foreground segmentation, a background subtraction method based on the wavelet multi-scale transform is developed to extract foreground objects. The optimal threshold is determined automatically, with no complex supervised training or manual experimental calibration. A robust real-time face recognition algorithm is presented that combines projection matrices computed without iteration with kernel Fisher discriminant analysis (KFDA) to overcome some of the difficulties of real-world face recognition. Experimental comparison with other algorithms demonstrates the superior performance of the proposed algorithm, which can also be applied to video image sequences in natural HCI.
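As a rough illustration of background subtraction on a wavelet approximation band with an automatically chosen threshold, the sketch below may help. It is not the paper's method: the abstract does not say which wavelet or threshold rule is used, so a one-level Haar approximation (scaled to a 2x2 block average) and Otsu's method are swapped in as stand-ins, and all function names are hypothetical.

```python
def haar_approx(image):
    """One-level Haar approximation band, scaled to the mean of each 2x2 block."""
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, len(image[0]) - 1, 2)]
            for y in range(0, len(image) - 1, 2)]


def otsu_threshold(values, levels=256):
    """Otsu's method (a stand-in): pick the level maximising between-class variance."""
    hist = [0] * levels
    for v in values:
        hist[min(levels - 1, int(v))] += 1
    total = len(values)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def foreground_mask(frame, background):
    """Threshold the absolute difference of the two approximation bands."""
    a, b = haar_approx(frame), haar_approx(background)
    diff = [[abs(p - q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
    t = otsu_threshold([v for row in diff for v in row])
    return [[v > t for v in row] for row in diff]
```

Working on the approximation band halves the resolution, which suppresses pixel-level noise before differencing; a multi-scale scheme would repeat the decomposition at several levels.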
Funding: the National Natural Science Foundation of China (No. 61702323).
Abstract: Moving object segmentation (MOS) is one of the essential functions of the vision system of all robots, including medical robots. Deep learning-based MOS methods, especially deep end-to-end MOS methods, are actively investigated in this field. Foreground segmentation networks (FgSegNets) are representative deep end-to-end MOS methods proposed recently. This study explores a new mechanism to improve the spatial feature learning capability of FgSegNets with relatively few additional parameters. Specifically, we propose an enhanced attention (EA) module, a parallel connection of an attention module and a lightweight enhancement module, with sequential attention and residual attention as special cases. We also propose integrating EA with FgSegNet_v2 by taking the lightweight convolutional block attention module as the attention module and plugging the EA module in after the two max-pooling layers of the encoder. The resulting model is named FgSegNet_v2 EA. The ablation study verifies the effectiveness of the proposed EA module and integration strategy. Results on the CDnet2014 dataset, which depicts human activities and vehicles captured in different scenes, show that FgSegNet_v2 EA outperforms FgSegNet_v2 by 0.08% and 14.5% under scene-dependent and scene-independent evaluation, respectively, which indicates the positive effect of EA on the spatial feature learning capability of FgSegNet_v2.
Funding: supported by the Fundamental Research Funds for the Central Universities (2024300443) and the National Natural Science Foundation of China (NSFC) Young Scientists Fund (62405131).
Abstract: This article proposes Infrared NeRF, a three-dimensional light field reconstruction method based on the neural radiance field (NeRF) for low-resolution thermal infrared scenes. Based on the characteristics of low-resolution thermal infrared imaging, various optimizations are carried out to improve the speed and accuracy of thermal infrared 3D reconstruction. Firstly, inspired by Boltzmann's law of thermal radiation, distance is incorporated into the NeRF model for the first time, resulting in nonlinear propagation along a single ray and a more accurate description of the physical property that infrared radiation intensity decreases with increasing distance. Secondly, to improve inference speed, and based on the high- and low-frequency distribution of foreground and background in infrared images, a multi-ray non-uniform light synthesis strategy is proposed that makes the model pay more attention to foreground objects in the scene, reduces the distribution of light in the background, and significantly reduces training time without reducing accuracy. In addition, compared to visible light scenes, infrared images have only a single channel, so fewer network parameters are required. Experiments using the same training data and data filtering method showed that, compared to the original NeRF, the improved network achieved average improvements of 13.8% and 4.62% in PSNR and SSIM, respectively, and an average decrease of 46% in LPIPS. Thanks to the optimization of network layers and data filtering methods, training takes only about 25% of the original method's time to converge. Finally, for scenes with weak backgrounds, this article improves the inference speed of the model by 4-6 times compared to the original NeRF by limiting the query interval of the model.
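For readers unfamiliar with the PSNR figure quoted above, the following minimal sketch shows the standard definition, PSNR = 10 log10(peak^2 / MSE), together with a helper for the kind of relative percentage change the abstract reports. The function names and the flat pixel lists are illustrative assumptions, and the abstract does not state its exact averaging protocol.

```python
import math


def psnr(reference, rendered, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    # Identical images have zero error, so PSNR is unbounded.
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def relative_change(baseline, improved):
    """Percentage change of a metric relative to its baseline value."""
    return 100.0 * (improved - baseline) / baseline
```

Because PSNR is logarithmic, a percentage gain in PSNR corresponds to a much larger reduction in mean squared error than the same percentage would suggest on a linear metric.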
Funding: supported by the National Natural Science Foundation of China (No. 61401413).
Abstract: To address two problems of underwater imaging, fog effect and color cast, an Improved Segmentation Dark Channel Prior (ISDCP) defogging method is proposed to remove the fog effects caused by the physical properties of water. Because light is heavily refracted during underwater imaging, fog effects lead to image blurring. Color cast is closely related to the different degrees of attenuation that light of different wavelengths undergoes while traveling in water. The proposed method integrates ISDCP and quantitative histogram stretching into the image enhancement procedure. Firstly, a threshold is set during the refinement of the transmission maps to identify the original mismatches and to apply differentiated defogging accordingly. Secondly, a method of judging the propagation distance of light is adopted to obtain the degree of energy attenuation during underwater propagation. Finally, the image histogram is stretched quantitatively in each of the red, green, and blue channels according to the degree of attenuation in that channel. The proposed ISDCP method reduces computational complexity and improves defogging efficiency to meet real-time requirements. Qualitative and quantitative comparisons on several different underwater scenes show that the proposed method significantly improves visibility compared with previous methods.
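The two building blocks named in this abstract can be sketched in a few lines. The code below computes the classic dark channel (per-pixel minimum over RGB followed by a minimum over a square patch) and applies a plain linear stretch to each channel independently. It is only an illustration: the segmentation, transmission refinement, and attenuation-driven "quantitative" stretch of ISDCP are not described in enough detail to reproduce, and the function names and `radius` parameter are assumptions.

```python
def dark_channel(image, radius=1):
    """Dark channel: minimum over RGB per pixel, then minimum over a square patch."""
    height, width = len(image), len(image[0])
    per_pixel_min = [[min(pixel) for pixel in row] for row in image]
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            dark[y][x] = min(per_pixel_min[j][i] for j in rows for i in cols)
    return dark


def stretch_channels(image):
    """Linearly stretch each RGB channel independently to the full [0, 1] range."""
    flat = [pixel for row in image for pixel in row]
    lows = [min(p[c] for p in flat) for c in range(3)]
    highs = [max(p[c] for p in flat) for c in range(3)]

    def stretch(value, c):
        span = highs[c] - lows[c]
        # A constant channel has no range to stretch; map it to zero.
        return 0.0 if span == 0 else (value - lows[c]) / span

    return [[tuple(stretch(p[c], c) for c in range(3)) for p in row]
            for row in image]
```

Here an image is a list of rows of `(r, g, b)` tuples with values in [0, 1]. Stretching each channel separately is what lets a strongly attenuated channel recover more range than the others, which is the intuition behind correcting a color cast.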
Funding: National Natural Science Foundation of China (grant number: 61475071); Funding for Outstanding Doctoral Dissertation in NUAA (grant number: BCXJ14-13).
Abstract: This paper presents a novel method of spot addressing and foreground segmentation for microarray images. A spot addressing method based on the particle swarm optimization (PSO) algorithm is proposed to further search for the center coordinates and radius of each spot, whose region is first determined by the projection method. A foreground segmentation method is then put forward that segments the spot foreground based on the center coordinates and radius found. Spot addressing and segmentation experiments on synthetic and real microarray images show that the proposed method is effective and feasible for the foreground segmentation of microarray images.
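To make the PSO search for a spot's center and radius concrete, here is a minimal sketch using a basic global-best PSO on a synthetic binary spot. The abstract does not give the paper's fitness function or parameters, so the pixel-mismatch cost, the swarm settings, and every function name below are assumptions; the first particle is placed at the region center to stand in for the estimate a projection step would supply.

```python
import random


def make_spot(size, cx, cy, r):
    """Synthetic binary spot image: 1 inside the disk, 0 outside (illustrative)."""
    return [[1 if (x - cx) ** 2 + (y - cy) ** 2 <= r * r else 0
             for x in range(size)] for y in range(size)]


def mismatch(image, circle):
    """Hypothetical fitness: count pixels whose label disagrees with the circle."""
    cx, cy, r = circle
    return sum(
        ((x - cx) ** 2 + (y - cy) ** 2 <= r * r) != bool(value)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def pso_fit_circle(image, n_particles=20, n_iters=60, seed=0):
    """Search (cx, cy, r) inside the spot region with a global-best PSO."""
    rng = random.Random(seed)
    size = len(image)
    lower, upper = [0.0, 0.0, 1.0], [size - 1.0, size - 1.0, size / 2.0]
    # First particle: region centre (stand-in for the projection estimate).
    positions = [[size / 2.0, size / 2.0, size / 4.0]]
    positions += [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                  for _ in range(n_particles - 1)]
    velocities = [[0.0] * 3 for _ in positions]
    best_pos = [p[:] for p in positions]
    best_cost = [mismatch(image, p) for p in positions]
    g = min(range(len(positions)), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    for _ in range(n_iters):
        for i, pos in enumerate(positions):
            for d in range(3):
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * rng.random() * (best_pos[i][d] - pos[d])
                                    + c2 * rng.random() * (g_pos[d] - pos[d]))
                # Clamp each coordinate so particles stay inside the region.
                pos[d] = min(max(pos[d] + velocities[i][d], lower[d]), upper[d])
            cost = mismatch(image, pos)
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[:], cost
                if cost < g_cost:
                    g_pos, g_cost = pos[:], cost
    return g_pos, g_cost
```

A call such as `pso_fit_circle(make_spot(21, 12, 9, 5))` returns the best `[cx, cy, r]` found and its mismatch count. Once the circle is known, foreground segmentation reduces to labelling the pixels inside it, which mirrors the two-stage structure described in the abstract.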