Funding: Supported by the Henan Provincial Science and Technology Research Project under Grants 232102211006, 232102210044, 232102211017, 232102210055, and 222102210214; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; the Undergraduate Universities Smart Teaching Special Research Project of Henan Province under Grant Jiao Gao [2021] No. 489-29; and the Doctor Natural Science Foundation of Zhengzhou University of Light Industry under Grants 2021BSJJ025 and 2022BSJJZK13.
Abstract: Multispectral pedestrian detection technology leverages infrared images to provide reliable complementary information for visible-light images, offering significant advantages in low-light and background-occlusion scenarios. However, maintaining detection speed while continuously improving cross-modal feature extraction and fusion remains challenging. We devised a deep learning network for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and improve detection efficiency. The model employs a spatial attention mechanism to reweight the input visible-light and infrared image data, strengthening its focus on different spatial positions, and shares the weighted features across modalities, thereby reducing interference among multi-modal features. Lightweight modules built on depthwise separable convolution are then incorporated to reduce the parameter count and computational load through depthwise (channel-wise) and pointwise convolutions. The proposed network was experimentally validated on the publicly available KAIST dataset and compared with existing methods. The results show that our approach achieves favorable performance in various complex environments, confirming the effectiveness of the proposed multispectral pedestrian detection technology.
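To make the two mechanisms above concrete, here is a minimal PyTorch sketch of a spatial attention module and a depthwise separable convolution block. The module names, kernel sizes, and channel widths are illustrative assumptions, not the authors' released implementation.

```python
# Minimal PyTorch sketch of the two mechanisms described above.
# SpatialAttention and DepthwiseSeparableConv are illustrative
# reconstructions, not the paper's actual code.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Reweights a feature map by a per-position attention mask."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool across channels to summarize each spatial position.
        avg_pool = x.mean(dim=1, keepdim=True)
        max_pool = x.amax(dim=1, keepdim=True)
        mask = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * mask  # emphasize informative spatial positions

class DepthwiseSeparableConv(nn.Module):
    """Depthwise (channel-wise) conv followed by a pointwise 1x1 conv."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Reweight each modality, then fuse the weighted features cheaply.
att = SpatialAttention()
fuse = DepthwiseSeparableConv(512, 256)
vis = torch.randn(1, 256, 40, 40)   # visible-light feature map
ir = torch.randn(1, 256, 40, 40)    # infrared feature map
fused = fuse(torch.cat([att(vis), att(ir)], dim=1))
```

A depthwise separable layer of this form replaces one dense k×k convolution with a per-channel k×k convolution plus a 1×1 mixing convolution, which is where the parameter and FLOP savings the abstract mentions come from.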
Funding: Supported by the National Natural Science Foundation of China [grant number 41971307]; the Fundamental Research Funds for the Central Universities [grant numbers 2042022kf1200, 2042023kf0217]; the Wuhan University Specific Fund for Major School-level Internationalization Initiatives; and LIESMARS Special Research Funding.
Abstract: Precise classification of Light Detection and Ranging (LiDAR) point clouds is a fundamental process in applications such as land cover mapping, forestry management, and autonomous driving. Due to the lack of spectral information, existing research on single-wavelength LiDAR classification is limited. Spectral information from images could address this limitation, but such data fusion suffers from varying illumination conditions and registration problems. A novel multispectral LiDAR obtains spatial and spectral information jointly as a brand-new data type, the multispectral point cloud, thereby improving classification performance. However, the spatial and spectral information of multispectral LiDAR has been processed separately in previous studies, which may limit classification performance. To explore the potential of this new data type, we propose a spatial-spectral classification framework for multispectral LiDAR that includes four steps: (1) neighborhood selection, (2) feature extraction and selection, (3) classification, and (4) label smoothing. The framework contains three novel contributions. (1) We improved the popular eigenentropy-based neighborhood selection with spectral angle matching to extract a more precise neighborhood. (2) We evaluated the importance of geometric and spectral features to compare their contributions and selected the most important features to reduce feature redundancy. (3) We conducted spatial label smoothing with a conditional random field, accounting for the spatial and spectral information of neighboring points. The proposed method was demonstrated on a multispectral LiDAR with three channels: 466 nm (blue), 527 nm (green), and 628 nm (red). Experimental results demonstrate the effectiveness of the proposed spatial-spectral classification framework. Moreover, this research takes advantage of the complementarity of spatial and spectral information, which benefits more precise neighborhood selection, more effective features, and satisfactory refinement of the classification results. This study could serve as an inspiration for future efficient spatial-spectral processing of multispectral point clouds.
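As an illustration of the first contribution, the NumPy sketch below combines eigenentropy-based neighborhood selection with spectral angle matching. The function names, the range of candidate k values, and the angle threshold are illustrative assumptions; the abstract does not give these specifics.

```python
# Minimal NumPy sketch of the neighborhood-refinement idea: choose the
# neighborhood size k that minimizes eigenentropy, then drop neighbors
# whose spectral angle to the query point is too large. The k range and
# angle_thresh values are hypothetical, not the paper's settings.
import numpy as np

def eigenentropy(coords):
    """Shannon entropy of the normalized covariance eigenvalues of 3D points."""
    cov = np.cov(coords.T)
    lam = np.clip(np.linalg.eigvalsh(cov), 1e-12, None)
    e = lam / lam.sum()
    return -(e * np.log(e)).sum()

def spectral_angle(a, b):
    """Angle (radians) between two spectral vectors, e.g. (blue, green, red)."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def select_neighborhood(xyz, spectra, query, k_range=range(10, 101, 10),
                        angle_thresh=0.2):
    """Eigenentropy-optimal k neighbors, refined by spectral angle matching."""
    dists = np.linalg.norm(xyz - xyz[query], axis=1)
    order = np.argsort(dists)
    # 1) pick the k whose k nearest neighbors minimize eigenentropy
    best_k = min(k_range, key=lambda k: eigenentropy(xyz[order[:k]]))
    neighbors = order[:best_k]
    # 2) keep only spectrally consistent neighbors
    angles = np.array([spectral_angle(spectra[i], spectra[query])
                       for i in neighbors])
    return neighbors[angles <= angle_thresh]
```

The intuition is that pure eigenentropy selection uses geometry alone, so a neighborhood can straddle two materials (e.g., road and grass); the spectral angle filter removes points that are geometrically close but spectrally inconsistent with the query point.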
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022ZD0160400) and the National Natural Science Foundation of China (Grant No. 62106152).
Abstract: With complementary multi-modal information (i.e., visible and thermal), multispectral pedestrian detection is essential for around-the-clock applications such as autonomous driving, video surveillance, and vicinagearth security. Despite these broad applications, the requirements for expensive thermal devices and multi-sensor alignment limit its use in real-world settings. In this paper, we propose a pseudo-multispectral pedestrian detection (Pseudo MPD) method, which employs the gray image converted from the RGB image in place of a real thermal image and learns pseudo-thermal features through deep thermal feature guidance (TFG). To achieve this goal, we first introduce an image base-detail decomposition (IBD) module to decompose image information into base and detail parts. We then design a base-detail hierarchical feature fusion (BHFF) module to deeply exploit the information in these two parts and employ a TFG module to guide pseudo-thermal base and detail feature learning. As a result, our method does not require a real thermal image during inference. Comprehensive experiments on two public multispectral pedestrian datasets demonstrate the effectiveness of the proposed method.
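To illustrate the pseudo-thermal input and the base-detail split, the following NumPy/SciPy sketch converts RGB to gray and decomposes it with a fixed Gaussian low-pass filter. This is only a conceptual stand-in under stated assumptions: the paper's IBD module is a learned network component, and the filter choice and sigma here are hypothetical.

```python
# Conceptual sketch of the pseudo-thermal input and base-detail idea:
# gray stands in for thermal, a smoothed copy is the "base" part, and
# the residual is the "detail" part. The Gaussian filter and sigma are
# assumptions; the paper's IBD module is learned, not a fixed filter.
import numpy as np
from scipy.ndimage import gaussian_filter

def rgb_to_gray(rgb):
    """ITU-R BT.601 luma: the gray image that replaces the thermal image."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def base_detail_decompose(gray, sigma=5.0):
    """Split an image into a low-frequency base and a high-frequency detail."""
    base = gaussian_filter(gray, sigma=sigma)   # smooth, large-scale structure
    detail = gray - base                        # edges and fine texture
    return base, detail

rgb = np.random.rand(480, 640, 3)   # stand-in RGB frame
gray = rgb_to_gray(rgb)
base, detail = base_detail_decompose(gray)
assert np.allclose(base + detail, gray)  # the decomposition is exact
```

Because the decomposition is exact (base + detail recovers the gray image), downstream modules such as BHFF can process the two parts on separate paths without losing information, which matches the fusion-then-guidance pipeline the abstract describes.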