Multispectral pedestrian detection technology leverages infrared images to provide reliable information for visible light images, demonstrating significant advantages in low-light conditions and background occlusion scenarios. However, while cross-modal feature extraction and fusion continue to improve, maintaining the model's detection speed remains a challenge. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model's detection efficiency. The model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing its focus on different spatial positions and sharing the weighted feature data across modalities, thereby reducing the interference between multi-modal features. Lightweight modules with depthwise separable convolution are then incorporated to reduce the model's parameter count and computational load through channel-wise and point-wise convolutions. The proposed network model was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the proposed multispectral pedestrian detection technology.
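As a rough illustration of why depthwise separable convolution reduces the parameter count, the sketch below compares the weight counts of a standard convolution and its depthwise-plus-pointwise replacement. The layer sizes (256 channels, 3 x 3 kernel) and the function names are illustrative assumptions, not values taken from the paper, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 256 input channels, 256 output channels, 3 x 3 kernel.
std = standard_conv_params(256, 256, 3)
sep = depthwise_separable_params(256, 256, 3)
print(std, sep, round(sep / std, 3))
```

For these assumed sizes the separable layer needs roughly 11.5% of the weights, which matches the general ratio 1/c_out + 1/k^2.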
This study explores the challenges posed by pedestrian detection and occlusion in AR applications, employing a novel approach that utilizes RGB-D-based skeleton reconstruction to reduce the training overhead of classical pedestrian detection algorithms. It also addresses occlusion in pedestrian detection by using Azure Kinect for body tracking and integrating a robust occlusion management algorithm, significantly enhancing detection efficiency. In experiments, an average latency of 204 milliseconds was measured, and the detection accuracy reached 97%. The approach has also been applied to create a simple augmented reality game, demonstrating the practical application of the algorithm.
Real-time pedestrian detection is an important task for unmanned driving systems and video surveillance. Existing pedestrian detection methods often run slowly and lose accuracy on small and densely distributed pedestrians. The proposed YOLOv2 ("You Only Look Once Version 2")-based pedestrian detection algorithm (referred to as YOLOv2PD) is intended to be more suitable for detecting small and densely distributed pedestrians in real-time complex road scenes. YOLOv2PD adopts a Multi-layer Feature Fusion (MLFF) strategy, which improves the model's feature extraction ability. In addition, one repeated convolution layer is removed from the final layer, which reduces the computational complexity without losing detection accuracy. Before training, the algorithm applies K-means clustering to the Pascal VOC-2007+2012 pedestrian dataset to find the optimal anchor boxes. Both the network structure and the loss function are improved to make the model more accurate and faster when detecting small pedestrians. Experimental results show that, at 544×544 image resolution, the proposed model achieves 80.7% average precision (AP), which is 2.1% higher than the YOLOv2 model on the Pascal VOC-2007+2012 pedestrian dataset. Evaluated on the INRIA and Caltech pedestrian test datasets, YOLOv2PD also achieves a good trade-off between detection accuracy and real-time speed and achieves state-of-the-art detection results.
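The anchor-box step can be sketched as follows. The abstract states only that K-means clustering is applied to the training boxes; the 1 - IoU distance used here is the one commonly associated with YOLOv2 and is an assumption about this paper, as are the function names and the deterministic initialisation.

```python
def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50):
    """Cluster (w, h) pairs into k anchors, using 1 - IoU as the distance."""
    # Deterministic start: k boxes spread evenly over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break
        anchors = new_anchors
    return anchors


# Hypothetical box sizes: three small and three large pedestrians.
sample = [(10, 20), (12, 22), (11, 21), (50, 100), (52, 98), (48, 102)]
print(kmeans_anchors(sample, k=2))
```

Clustering on IoU rather than Euclidean distance keeps large boxes from dominating the result, which matters when small pedestrians are the target.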
Focusing on data imbalance and intraclass variation, an improved pedestrian detection method with a cascade of complex peer AdaBoost classifiers is proposed. The series of AdaBoost classifiers is learned greedily, along with negative example mining. The complexity of the classifiers in the cascade is not limited, so more negative examples are used for training. Furthermore, the cascade becomes an ensemble of strong peer classifiers, which treats intraclass variation. To locally train the AdaBoost classifiers with a high detection rate, a refining strategy is used to discard the hardest negative training examples rather than decreasing their thresholds. Using the aggregate channel feature (ACF), the method achieves miss rates of 35% and 14% on the Caltech pedestrian benchmark and the INRIA pedestrian dataset, respectively, which are lower than those of a cascade of increasingly complex AdaBoost classifiers, i.e., 44% and 17%, respectively. Using deep features extracted by the region proposal network (RPN), the method achieves a miss rate of 10.06% on the Caltech pedestrian benchmark, which is also lower than the 10.53% of the increasingly complex cascade. This study shows that the proposed method can use more negative examples to train the pedestrian detector. It outperforms the existing cascade of increasingly complex classifiers.
Pedestrian detection and tracking are vital elements of today's surveillance systems, which make daily life safer. Human detection and visualization have therefore become essential in the field of computer vision. Developing a surveillance system with multiple object recognition and tracking, especially in low light and at night, is still challenging. We therefore propose a novel system based on machine learning and image processing to provide efficient pedestrian detection and tracking at night. In particular, the system tackles a two-fold problem: detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. A random forest classifier is adopted for image segmentation to identify pedestrians in an image. The detection result is then passed to a particle filter for pedestrian tracking. In extensive experiments, our system shows 93% segmentation accuracy using the random forest algorithm, with high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can identify whether the detected object is a human. Our system provided the best results compared to state-of-the-art systems, which demonstrates the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection and tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.
A real-time pedestrian detection and tracking system using a single video camera was developed to monitor pedestrians. The system contained six modules: video flow capture, pre-processing, movement detection, shadow removal, tracking, and object classification. The Gaussian mixture model was used to extract the moving object from an image sequence segmented by the mean-shift technique in the pre-processing module. Shadow removal was used to alleviate the negative impact of shadows on the detected objects. A model-free method was adopted to identify pedestrians. The maximum and minimum integration methods were developed to integrate multiple cues into the mean-shift algorithm and the initial tracking iteration with the competent integrated probability distribution map for object tracking. A simple but effective algorithm was proposed to handle full occlusion cases. The system was tested using real traffic videos from different sites. The test results confirm that the system is reliable and has an overall accuracy of over 85%.
This study proposes a motion-cue-based pedestrian detection method with two-frame-filtering (Tff) for video surveillance. The novel motion cue is derived from the gray value variation between two frames. Tff processing then filters the gradient magnitude image by the variation map. Summations of the Tff gradient magnitudes in cells are used to train a pre-detector that excludes most of the background regions. A histogram of Tff oriented gradient (HTffOG) feature is proposed for pedestrian detection. Experimental results show that this method is effective and suitable for real-time surveillance applications.
Pedestrian detection is a critical challenge in the field of general object detection, whose performance has advanced with the development of deep learning. However, considerable improvement is still required for pedestrian detection, considering the differences in pedestrian clothing, action, and posture. In driver assistance systems, it is necessary to further improve intelligent pedestrian detection. We present a method based on the combination of SSD and GAN to improve the performance of pedestrian detection. First, we assess the impact of different SSD-based methods for detecting pedestrians and optimize the detection for pedestrian characteristics. Second, we propose a novel data synthesis network architecture, PS-GAN, to generate diverse pedestrian data for verifying the effect of massive training data on the SSD detector. Experimental results show that the proposed methods can improve the performance of pedestrian detection to some extent. Finally, we use the pedestrian detector to simulate a specific application in motor vehicle assisted driving, in which the detector focuses on specific pedestrians according to the velocity of the vehicle. The results establish the validity of the approach.
Stable and reliable perception capability is the basis for the safety of autonomous driving, and pedestrian detection is one of the key tasks for vehicle-mounted sensors perceiving the environment. To make full use of the complementarity of vehicle cameras and lidars, we improve on the EPNet algorithm and propose a pedestrian detection method based on pre-fusion of point cloud and image data. A bidirectional cascaded feature fusion module is used to achieve more information exchange between image and point cloud data and to obtain more comprehensive fusion features; a consistency loss function is designed to enhance the correlation between location confidence and category confidence and to improve model detection accuracy. Validated on KITTI and other datasets, pedestrian detection reaches 84% mAP, 4.49% higher than EPNet on difficult pedestrian samples. Compared with a single visual sensor, the proposed method detects objects affected by shadow or at longer distance more effectively. Finally, the model is accelerated with a TensorRT custom plug-in, and CUDA is used to improve the efficiency of multimodal data pre-processing and post-processing. Deployed on the Nvidia Jetson Orin edge computing device, the model runs at 10 frames per second, and the inference speed is increased by about 60%, laying the foundation for engineering application of the algorithm.
To address the dual challenges of excessive energy consumption and operational inefficiency inherent in the reliance of current agricultural machinery on direct supervision, this study developed an enhanced YOLOv8n-SS pedestrian detection algorithm through architectural modifications to the baseline YOLOv8n framework. The proposed method had superior performance in dense agricultural contexts while improving detection capabilities for pedestrian distribution patterns under complex farmland conditions, including variable lighting and mechanical occlusions. The main innovations were: (1) integration of spatial pyramid dilated (SPD) operations with conventional convolution layers to construct SPD-Conv modules, which effectively mitigated feature information loss while enhancing small-target detection accuracy; (2) incorporation of selective kernel attention mechanisms to enable context-aware feature selection and adaptive feature extraction. Experimental validation revealed significant performance improvements over the original YOLOv8n model. The enhanced architecture achieved 7.2% and 9.2% increases in mAP@0.5 and mAP@0.5:0.95, respectively, for dense pedestrian detection, with corresponding improvements of 7.6% and 8.7% observed in actual farmland working environments. The proposed method provides a computationally efficient and robust intelligent monitoring solution for agricultural mechanization, facilitating the transition from conventional agricultural practices toward sustainable, low-carbon production paradigms through algorithmic optimization.
Pedestrian detection has been a hot spot in computer vision over the past decades due to its wide spectrum of promising applications, and the major challenge is the false positives that occur during detection. The emergence of various Convolutional Neural Network-based detection strategies has substantially enhanced pedestrian detection accuracy but still does not solve this problem well. This paper analyzes the framework of two-stage CNN detection methods in depth and finds that false positives in the detection results arise because the training strategy misclassifies some false proposals, which weakens the classification capability of the following subnetwork so that it can hardly suppress false ones. To solve this problem, this paper proposes a pedestrian-sensitive training algorithm that helps two-stage CNN detection methods learn to distinguish pedestrian from non-pedestrian samples and suppress false positives in the final detection results. The core of the proposed algorithm is a redesigned training proposal generation scheme for two-stage CNN detection methods, which avoids a certain number of false proposals that mislead the training process. With the help of the proposed algorithm, the detection accuracy of MetroNext, a smaller and more accurate metro passenger detector, is further improved, which further decreases false positives in its metro passenger detection results. On various challenging benchmark datasets, experimental results demonstrate that the proposed algorithm is effective in improving pedestrian detection accuracy by removing false positives. Compared with existing state-of-the-art detection networks, PSTNet demonstrates better overall performance in accuracy, total number of parameters, and inference time; thus, it can become a practical solution for detecting pedestrians on various hardware platforms, especially mobile and edge devices.
In pedestrian detection scenarios, detection algorithms often miss occluded and distant, blurred pedestrians and struggle to balance detection accuracy and speed. In this paper, we propose a modified YOLOv5 model for pedestrian detection. First, the backbone network uses the SPD-GCONV module, constructed by combining the SPD (Space-to-Depth) module with Ghost convolution, for down-sampling to reduce the loss of fine-grained feature information. Second, the multi-scale detection ability of the model is enhanced by adding a small-size detection layer. Then, the original CIoU loss function is replaced by the α-EIoU loss function to improve the accuracy of pedestrian target localization. In experiments on the WiderPerson dataset, the average detection accuracy is improved by 2% compared with other pedestrian detection algorithms while maintaining detection speed. Experimental results show that the improved algorithm can significantly improve detection performance.
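The space-to-depth rearrangement behind SPD-style down-sampling can be sketched in plain Python. This is a minimal illustration on nested lists, not the paper's SPD-GCONV module: the function name, the block size of 2, and the channel ordering within each output cell are assumptions, and the Ghost convolution that would follow is omitted.

```python
def space_to_depth(feature, block=2):
    """Rearrange an H x W x C nested list into (H/block) x (W/block) x (C*block*block).

    Every value is kept: spatial resolution is traded for channel depth,
    unlike strided convolution or pooling, which discard positions.
    """
    h, w = len(feature), len(feature[0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, h, block):
        row = []
        for j in range(0, w, block):
            cell = []
            # Stack the channels of each pixel in the block x block patch.
            for di in range(block):
                for dj in range(block):
                    cell.extend(feature[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out


# A 2 x 2 single-channel map becomes a 1 x 1 map with 4 channels.
print(space_to_depth([[[1], [2]], [[3], [4]]]))
```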
With complementary multi-modal information (i.e., visible and thermal), multispectral pedestrian detection is essential for around-the-clock applications, such as autonomous driving, video surveillance, and vicinagearth security. Despite its broad applications, the requirements for expensive thermal devices and multi-sensor alignment limit its use in real-world applications. In this paper, we propose a pseudo-multispectral pedestrian detection (called Pseudo MPD) method, which employs the gray image converted from the RGB image to replace the real thermal image, and learns the pseudo-thermal feature through deep thermal feature guidance (TFG). To achieve this goal, we first introduce an image base-detail decomposition (IBD) module to decompose image information into base and detail parts. Afterwards, we design a base-detail hierarchical feature fusion (BHFF) module to deeply exploit the information between these two parts, and employ a TFG module to guide pseudo-thermal base and detail feature learning. As a result, our proposed method does not require the real thermal image during inference. Comprehensive experiments are performed on two public multispectral pedestrian datasets. The experimental results demonstrate the effectiveness of our proposed method.
It remains a challenging task to detect pedestrians in crowds, and more effort is needed to understand why detectors fail. When we perform an error analysis based on the traditional evaluation strategy, we find that it produces many misleading false positives, which in fact cover occluded pedestrians. The reason is that we usually have two kinds of annotations in the dataset: regular pedestrians (detection targets) labeled by full-body boxes and ignored pedestrians (NOT detection targets) labeled by visible boxes. Ignored pedestrians are labeled as an additional category termed the "ignore region". Nevertheless, our detectors always predict a full-body box for each pedestrian. This gap results in the following case: when a detector successfully predicts a full-body box for an ignored pedestrian, a false positive is triggered due to the low overlap between the predicted full-body box and the labeled visible box for the ignored pedestrian. This becomes even more harmful as the detector improves and becomes more capable of locating occluded pedestrians. To alleviate this issue, we devise a new pedestrian detection pipeline, which considers the additional visible box at both the detection and evaluation stages. During detection, we predict an extra visible box apart from the full-body box for every instance; during evaluation, we employ visible boxes instead of full-body boxes to match the "ignore region". We apply the new pipeline to dozens of detection methods and validate its effectiveness in reducing the over-reporting of false positives and providing more reliable evaluation results.
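The evaluation gap described above can be reproduced with a small numeric sketch. It assumes IoU as the overlap measure and a 0.5 threshold; the actual benchmarks may use a different overlap criterion for ignore regions, and the box coordinates and function names are hypothetical. Matching against regular (non-ignored) ground truth is left out for brevity.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def counts_as_false_positive(pred_box, ignore_boxes, thr=0.5):
    """True if an unmatched prediction overlaps no ignore region enough to be discarded."""
    return all(iou(pred_box, ig) < thr for ig in ignore_boxes)


# Hypothetical ignored pedestrian: only the upper body is visible and labeled.
ignored_visible = (0, 0, 10, 10)
pred_full_body = (0, 0, 10, 30)   # detector's full-body prediction
pred_visible = (0, 0, 10, 11)     # detector's extra visible-box prediction

# Traditional evaluation: full-body box vs. labeled visible box.
print(counts_as_false_positive(pred_full_body, [ignored_visible]))
# Visible-box evaluation: predicted visible box vs. labeled visible box.
print(counts_as_false_positive(pred_visible, [ignored_visible]))
```

With these numbers the full-body prediction overlaps the label at IoU 1/3 and is reported as a false positive, while the visible-box prediction overlaps at about 0.91 and is correctly discarded.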
Detection of pedestrians in images and video sequences is important for many applications but is very challenging due to the various silhouettes of pedestrians and partial occlusions. This paper describes a two-stage robust pedestrian detection approach. The first stage uses a full body detector applied to a single image to generate pedestrian candidates. In the second stage, each pedestrian candidate is verified with a detector ensemble consisting of part detectors. The full body detector is trained based on improved shapelet features, while the part detectors make use of Haar-like wavelets as features. All the detectors are trained by a boosting method. The responses of the part detectors are then combined using a detector ensemble. The verification process is formulated as a combinatorial optimization problem with a genetic algorithm for optimization. Then, the detection results are regarded as equivalence classes so that multiple detections of the same pedestrian are quickly merged together. Tests show that this approach has a detection rate of over 95% for 0.1% FPPW on the INRIA dataset, which is significantly better than that of the original shapelet feature based approach and the existing detector ensemble approach. This approach can robustly detect pedestrians in different situations.
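One way to treat detections as equivalence classes is a union-find structure in which two boxes are equivalent when they overlap strongly. The sketch below is an assumption about how such merging could work, not the paper's procedure: the IoU criterion, the 0.5 threshold, and the coordinate averaging are all illustrative choices.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def merge_detections(boxes, thr=0.5):
    """Group boxes into equivalence classes and return one averaged box per class."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the class representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if iou(boxes[i], boxes[j]) >= thr:
                parent[find(i)] = find(j)  # union the two classes

    groups = {}
    for i, b in enumerate(boxes):
        groups.setdefault(find(i), []).append(b)
    # Average each coordinate over the members of a class.
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]


# Two overlapping detections of one pedestrian and one separate detection.
print(merge_detections([(0, 0, 10, 20), (1, 0, 11, 20), (100, 100, 110, 120)]))
```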
Pedestrian detection is a critical problem in the field of computer vision. Although most existing algorithms are able to detect pedestrians well in controlled environments, it is often difficult to achieve accurate pedestrian detection from video sequences alone, especially in pedestrian-intensive scenes wherein pedestrians may occlude one another and thus be detected incompletely. To surmount these difficulties, this paper presents a pedestrian detection algorithm based on video sequences and laser point clouds. First, the laser point cloud is interpreted and classified to separate pedestrian data and vehicle data. Then the video image data and the laser point cloud data are fused by calibration. The region of interest after fusion is determined using the feature information contained in the video image and the three-dimensional information of the laser point cloud to remove false pedestrian detections and thus achieve pedestrian detection in intensive scenes. Experimental verification and analysis on video sequences demonstrate that fusing the two data sources improves the performance of pedestrian detection and yields better detection results.
In recent years, pedestrian detection has been a hot research topic in computer vision and artificial intelligence, and it is widely used in security and pedestrian analysis. However, because traditional pedestrian detection technology requires a large amount of computation, the speed of many pedestrian recognition systems is very limited. In some restricted areas, such as hazardous construction areas, real-time detection of pedestrians and cross-border behavior is required. To detect more conveniently and efficiently whether pedestrians are present in a restricted area and whether they cross its border, this paper proposes a pedestrian cross-border detection method based on HOG (Histogram of Oriented Gradient) and SVM (Support Vector Machine). The method extracts the moving target through GMM (Gaussian Mixture Model) background modeling and then extracts the features of the moving target with HOG. Finally, it uses a trained SVM to distinguish pedestrians from non-pedestrians, completes the detection of pedestrians, and labels the targets. The test results show that restricting HOG feature extraction to the candidate area greatly reduces the amount of computation and the feature extraction time and eliminates background interference, thereby improving detection efficiency, so the method can be applied in settings with real-time requirements.
Purpose: Conventional pedestrian detection algorithms lack scale sensitivity. The purpose of this paper is to propose a novel self-adaptive scale pedestrian detection algorithm, based on the deep residual network (DRN), to address this shortcoming. Design/methodology/approach: First, the "Edge boxes" algorithm is introduced to extract regions of interest from pedestrian images. Then, the extracted bounding boxes are fed to two different DRNs: a large-scale DRN and a small-scale DRN. The height of the bounding boxes is used to classify the results of pedestrians and to regress the bounding boxes to the entity of the pedestrian. Finally, a weighted self-adaptive scale function, which combines the large-scale and small-scale results, is designed for the final pedestrian detection. Findings: To validate the effectiveness and feasibility of the proposed algorithm, comparison experiments were carried out on the common pedestrian detection datasets Caltech, INRIA, ETH and KITTI. Experimental results show that the proposed algorithm adapts to the various scales of pedestrians. For hard-to-detect small-scale pedestrians, the proposed algorithm improves the accuracy and robustness of detection. Originality/value: By applying different models to deal with different scales of pedestrians, the proposed algorithm with the weighted calculation function improves the accuracy and robustness for different scales of pedestrians.
Purpose: The purpose of the study is to address the problems of low accuracy and missed detection of occluded pedestrians and small target pedestrians when using the YOLOX general object detection algorithm for pedestrian detection. This study proposes a multi-level fine-grained YOLOX pedestrian detection algorithm. Design/methodology/approach: First, to address the problem that the original YOLOX algorithm obtains a single perceptual field for the feature map before feature fusion, this study improves the PAFPN structure by adding the ResCoT module to increase the diversity of the perceptual field of the feature map and divides the pedestrian multi-scale features into finer granularity. Second, for the CSPLayer of the PAFPN, a weight gain-based normalization-based attention module (NAM) is proposed to make the model pay more attention to context information when extracting pedestrian features and highlight the salient features of pedestrians. Finally, the authors experimentally determined the optimal values for the confidence loss function. Findings: The experimental results show that, compared with the original YOLOX algorithm, the AP of the improved algorithm increased by 2.90%, the Recall increased by 3.57%, and F1 increased by 2% on the pedestrian dataset. Research limitations/implications: The multi-level fine-grained YOLOX pedestrian detection algorithm can effectively improve the detection of occluded pedestrians and small target pedestrians. Originality/value: The authors introduce a multi-level fine-grained ResCoT module and a weight gain-based NAM attention module.
Early detection of vulnerable road users is a crucial requirement for autonomous vehicles to meet and exceed the object detection capabilities of human drivers. One of the most complex outstanding challenges is partial occlusion, where a target object is only partially available to the sensor due to obstruction by another foreground object. A number of leading pedestrian detection benchmarks provide annotation for partial occlusion; however, the benchmarks vary greatly in their definition of the occurrence and severity of occlusion. Research demonstrates that a high degree of subjectivity is used to classify occlusion level in these cases, and occlusion is typically categorized into 2–3 broad categories such as "partially" and "heavily" occluded. In addition, many pedestrian instances are affected by multiple inhibiting factors that contribute to non-detection, such as object scale, distance from camera, lighting variations and adverse weather. This can lead to inaccurate or inconsistent reporting of detection performance for partially occluded pedestrians depending on which benchmark is used. This research introduces a novel, objective benchmark for partially occluded pedestrian detection to facilitate the objective characterization of pedestrian detection models. Characterization is carried out on seven popular pedestrian detection models for a range of occlusion levels from 0% to 99% to demonstrate the impact of progressive levels of partial occlusion on pedestrian detectability. Results show that the proposed benchmark provides more objective, fine-grained analysis of pedestrian detection algorithms than the current state of the art.
Funding: Supported by the Henan Provincial Science and Technology Research Project under Grants 232102211006, 232102210044, 232102211017, 232102210055 and 222102210214; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; the Undergraduate Universities Smart Teaching Special Research Project of Henan Province under Grant Jiao Gao[2021] No.489-29; and the Doctor Natural Science Foundation of Zhengzhou University of Light Industry under Grants 2021BSJJ025 and 2022BSJJZK13.
Abstract: Multispectral pedestrian detection technology leverages infrared images to provide reliable information for visible light images, demonstrating significant advantages in low-light conditions and background occlusion scenarios. However, maintaining the model’s detection speed while continuing to improve cross-modal feature extraction and fusion remains challenging. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model’s detection efficiency. This model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing the model’s focus on different spatial positions and sharing the weighted feature data across different modalities, thereby reducing the interference of multi-modal features. Subsequently, lightweight modules with depthwise separable convolution are incorporated to reduce the model’s parameter count and computational load through channel-wise and point-wise convolutions. The proposed network model was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the multispectral pedestrian detection technology proposed in this paper.
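To make the parameter saving from depthwise separable convolution concrete, the following minimal sketch counts the weights of a standard convolution and of its depthwise-plus-pointwise replacement. The layer sizes (256 input and output channels, 3×3 kernel) are illustrative assumptions, not values reported in the abstract, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise (channel-wise) k x k convolution
    followed by a pointwise 1 x 1 convolution (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std                 # equals 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these sizes the separable layer needs roughly 11.5% of the weights of the standard layer, which matches the general ratio 1/c_out + 1/k².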
Abstract: This study explores the challenges posed by pedestrian detection and occlusion in AR applications, employing a novel approach that utilizes RGB-D-based skeleton reconstruction to reduce the overhead of classical pedestrian detection algorithms during training. Furthermore, it is dedicated to addressing occlusion issues in pedestrian detection by using Azure Kinect for body tracking and integrating a robust occlusion management algorithm, significantly enhancing detection efficiency. In experiments, an average latency of 204 milliseconds was measured, and the detection accuracy reached 97%. Additionally, this approach has been successfully applied in creating a simple yet captivating augmented reality game, demonstrating the practical application of the algorithm.
Funding: The authors are grateful to the Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia, for funding this work through the Vice Deanship of Scientific Research Chairs: Research Chair of Pervasive and Mobile Computing.
Abstract: Real-time pedestrian detection is an important task for unmanned driving systems and video surveillance. Existing pedestrian detection methods often run slowly and lose accuracy on smaller and densely distributed pedestrians. Therefore, the proposed YOLOv2 (“YOU ONLY LOOK ONCE Version 2”)-based pedestrian detection algorithm (referred to as YOLOv2PD) is intended to be more suitable for detecting smaller and densely distributed pedestrians in real-time complex road scenes. The proposed YOLOv2PD algorithm adopts a Multi-layer Feature Fusion (MLFF) strategy, which helps to improve the model’s feature extraction ability. In addition, one repeated convolution layer is removed from the final layer, which in turn reduces the computational complexity without losing any detection accuracy. The proposed algorithm applies the K-means clustering method on the Pascal VOC-2007+2012 pedestrian dataset before training to find the optimal anchor boxes. Both the network structure and the loss function are improved to make the model more accurate and faster while detecting smaller pedestrians. Experimental results show that, at 544×544 image resolution, the proposed model achieves 80.7% average precision (AP), which is 2.1% higher than the YOLOv2 model on the Pascal VOC-2007+2012 pedestrian dataset. Besides, based on the experimental results, the proposed model YOLOv2PD achieves a good trade-off between detection accuracy and real-time speed when evaluated on the INRIA and Caltech test pedestrian datasets and achieves state-of-the-art detection results.
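The anchor-box step can be sketched as K-means over ground-truth box sizes. The abstract does not state the distance measure; this sketch assumes the 1 − IoU distance used in the original YOLOv2 work, and the function names and the six toy boxes are hypothetical.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs; the nearest centroid is the one with highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Minimising 1 - IoU is the same as maximising IoU.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                w = sum(b[0] for b in cluster) / len(cluster)
                h = sum(b[1] for b in cluster) / len(cluster)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


# Toy box sizes in pixels (illustrative, not taken from any dataset).
boxes = [(20, 50), (22, 55), (18, 48), (60, 150), (64, 160), (58, 145)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

On this toy data the two anchors settle on the mean size of the small boxes and the mean size of the large boxes; on a real dataset the result depends on the initialisation and on k.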
Funding: Project (2018AAA0102102) supported by the National Science and Technology Major Project, China; Project (2017WK2074) supported by the Planned Science and Technology Project of Hunan Province, China; Project (B18059) supported by the National 111 Project, China; Project (61702559) supported by the National Natural Science Foundation of China.
Abstract: Focusing on data imbalance and intraclass variation, an improved pedestrian detection method with a cascade of complex peer AdaBoost classifiers is proposed. The series of AdaBoost classifiers is learned greedily, along with negative example mining. The complexity of the classifiers in the cascade is not limited, so more negative examples are used for training. Furthermore, the cascade becomes an ensemble of strong peer classifiers, which handles intraclass variation. To locally train the AdaBoost classifiers with a high detection rate, a refining strategy is used to discard the hardest negative training examples rather than decreasing their thresholds. Using the aggregate channel feature (ACF), the method achieves miss rates of 35% and 14% on the Caltech pedestrian benchmark and the INRIA pedestrian dataset, respectively, which are lower than those of the cascade of increasingly complex AdaBoost classifiers, i.e., 44% and 17%, respectively. Using deep features extracted by the region proposal network (RPN), the method achieves a miss rate of 10.06% on the Caltech pedestrian benchmark, which is also lower than the 10.53% of the increasingly complex cascade. This study shows that the proposed method can use more negative examples to train the pedestrian detector. It outperforms the existing cascade of increasingly complex classifiers.
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). This work was also partially supported by the Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia, and by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Pedestrian detection and tracking are vital elements of today’s surveillance systems, which make daily life safer. Human detection and visualization have therefore become essential topics in the field of computer vision. However, developing a surveillance system with multiple object recognition and tracking, especially in low light and at night, is still challenging. Therefore, we propose a novel system based on machine learning and image processing to provide an efficient surveillance system for pedestrian detection and tracking at night. In particular, we propose a system that tackles a two-fold problem by detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. Moreover, a random forest classifier is adopted for image segmentation to identify pedestrians in an image. The detection result is then passed to a particle filter to solve pedestrian tracking. In extensive experiments, our system shows 93% segmentation accuracy using a random forest algorithm, with high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can verify that a detected object is a human. Our system provided the best results compared to the state-of-the-art systems, which supports the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection/tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.
Funding: Project (50778015) supported by the National Natural Science Foundation of China; Project (2012CB725403) supported by the Major State Basic Research Development Program of China.
Abstract: A real-time pedestrian detection and tracking system using a single video camera was developed to monitor pedestrians. This system contained six modules: video flow capture, pre-processing, movement detection, shadow removal, tracking, and object classification. The Gaussian mixture model was utilized to extract the moving object from an image sequence segmented by the mean-shift technique in the pre-processing module. Shadow removal was used to alleviate the negative impact of shadows on the detected objects. A model-free method was adopted to identify pedestrians. The maximum and minimum integration methods were developed to integrate multiple cues into the mean-shift algorithm and the initial tracking iteration with the competent integrated probability distribution map for object tracking. A simple but effective algorithm was proposed to handle full occlusion cases. The system was tested using real traffic videos from different sites. The results of the test confirm that the system is reliable and has an overall accuracy of over 85%.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z164) and the National Natural Science Foundation of China (No. 61273258).
Abstract: This study proposes a motion cue based pedestrian detection method with two-frame-filtering (Tff) for video surveillance. The novel motion cue is derived from the gray value variation between two frames. Tff processing then filters the gradient magnitude image by the variation map. Summations of the Tff gradient magnitudes in cells are applied to train a pre-detector to exclude most of the background regions. A histogram of Tff oriented gradient (HTffOG) feature is proposed for pedestrian detection. Experimental results show that this method is effective and suitable for real-time surveillance applications.
Abstract: Pedestrian detection is a critical challenge in the field of general object detection. The performance of object detection has advanced with the development of deep learning. However, considerable improvement is still required for pedestrian detection, considering the differences in pedestrians’ clothing, actions, and postures. In driver assistance systems, it is necessary to further improve intelligent pedestrian detection ability. We present a method based on the combination of SSD and GAN to improve the performance of pedestrian detection. Firstly, we assess the impact of different SSD-based pedestrian detection methods and optimize the detection for pedestrian characteristics. Secondly, we propose a novel network architecture, namely the data synthesis PS-GAN, to generate diverse pedestrian data for verifying the effectiveness of massive training data for the SSD detector. Experimental results show that the proposed methods can improve the performance of pedestrian detection to some extent. Finally, we use the pedestrian detector to simulate a specific application in assisted driving of motor vehicles, in which the detector focuses on specific pedestrians according to the velocity of the vehicle. The results establish the validity of the approach.
Funding: Partially supported by the Hunan Association for Science and Technology Talent Support Project (Grant No. 2022TJN14) and the Postgraduate Scientific Research Innovation Project of Central South University (Grant No. 2023XQLHO67).
Abstract: Stable and reliable perception capability is the basis for the safety of autonomous driving, and pedestrian detection is one of the key tasks for vehicle-mounted sensors to perceive the environment. To make full use of the complementarity of vehicle cameras and lidars, we improve on the EPNet algorithm and propose a pedestrian detection method based on pre-fusion of point cloud and image data. A bidirectional cascaded feature fusion module is used to achieve more information exchange between image and point cloud data and to obtain more comprehensive fusion features; a consistency loss function is designed to enhance the correlation between location confidence and category confidence and to improve model detection accuracy. Validated on KITTI and other datasets, the pedestrian detection result reaches 84% mAP, 4.49% higher than EPNet on difficult pedestrian samples. Compared with a single visual sensor, the proposed method has a better detection effect on objects affected by shadow or at longer distances. Finally, the model is accelerated with a TensorRT custom plug-in and uses CUDA to improve the efficiency of multimodal data pre-processing and post-processing. Deployed on the Nvidia Jetson Orin edge computing device, the model runs at 10 frames per second, and the inference speed is increased by about 60%, laying the foundation for engineering applications of the algorithm.
Funding: Supported by the General Program of the Natural Science Foundation of Hunan Province of China (2021JJ30359).
Abstract: To address the dual challenges of excessive energy consumption and operational inefficiency inherent in the reliance of current agricultural machinery on direct supervision, this study developed an enhanced YOLOv8n-SS pedestrian detection algorithm through architectural modifications to the baseline YOLOv8n framework. The proposed method had superior performance in dense agricultural contexts while improving detection capabilities for pedestrian distribution patterns under complex farmland conditions, including variable lighting and mechanical occlusions. The main innovations were: (1) integration of spatial pyramid dilated (SPD) operations with conventional convolution layers to construct SPD-Conv modules, which effectively mitigated feature information loss while enhancing small-target detection accuracy; (2) incorporation of selective kernel attention mechanisms to enable context-aware feature selection and adaptive feature extraction. Experimental validation revealed significant performance improvements over the original YOLOv8n model. The enhanced architecture achieved 7.2% and 9.2% increases in the mAP0.5 and mAP0.5:0.95 metrics, respectively, for dense pedestrian detection, with corresponding improvements of 7.6% and 8.7% observed in actual farmland working environments. The proposed method ultimately provides a computationally efficient and robust intelligent monitoring solution for agricultural mechanization, facilitating the transition from conventional agricultural practices toward sustainable, low-carbon production paradigms through algorithmic optimization.
Abstract: Pedestrian detection has been a hot spot in computer vision over the past decades due to its wide spectrum of promising applications, and a major challenge is the false positives that occur during pedestrian detection. The emergence of various Convolutional Neural Network-based detection strategies substantially enhances pedestrian detection accuracy but still does not solve this problem well. This paper analyzes the detection framework of two-stage CNN detection methods in depth and finds that false positives in detection results arise because the training strategy misclassifies some false proposals, thus weakening the classification capability of the following subnetwork, which then hardly suppresses false ones. To solve this problem, this paper proposes a pedestrian-sensitive training algorithm to help two-stage CNN detection methods effectively learn to distinguish pedestrian and non-pedestrian samples and suppress the false positives in the final detection results. The core of the proposed algorithm is to redesign the training proposal generating scheme for two-stage CNN detection methods, which can avoid a certain number of false proposals that mislead the training process. With the help of the proposed algorithm, the detection accuracy of MetroNext, a smaller and more accurate metro passenger detector, is further improved, which further decreases false positives in its metro passenger detection results. On various challenging benchmark datasets, experimental results demonstrate that the proposed algorithm is effective in improving pedestrian detection accuracy by removing false positives. Compared with the existing state-of-the-art detection networks, PSTNet demonstrates better overall prediction performance in accuracy, total number of parameters, and inference time; thus, it can become a practical solution for detecting pedestrians on various hardware platforms, especially mobile and edge devices.
Abstract: In pedestrian detection scenarios, detection algorithms often miss occluded and distant, blurry pedestrians, and struggle to balance detection accuracy and speed. In this paper, we propose a modified YOLOv5 model for pedestrian detection. Firstly, the backbone network uses the SPD-GCONV module, constructed by combining the SPD (Space-to-Depth) module and Ghost convolution, for down-sampling to reduce the loss of fine-grained feature information. Secondly, the multi-scale detection ability of the model is enhanced by adding a small-size detection layer. Then, the original CIoU loss function is replaced by the α-EIoU loss function to improve the accuracy of pedestrian target localization. According to experiments on the WiderPerson data set, the average detection accuracy is improved by 2% compared with other pedestrian detection algorithms while maintaining the detection speed. Experimental results show that the improved algorithm can significantly improve the detection performance.
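The space-to-depth rearrangement behind SPD-style down-sampling can be sketched on a single-channel map. The sketch below is a plain-Python illustration, not the paper's SPD-GCONV module; the function name and the order in which the sub-maps are stacked are assumptions.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    No value is discarded: each sub-map keeps every `scale`-th pixel starting
    from a different offset, so spatial resolution is traded for channels.
    """
    h, w = len(feature_map), len(feature_map[0])
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):          # row offset of the sub-map
        for dx in range(scale):      # column offset of the sub-map
            sub = [row[dx::scale] for row in feature_map[dy::scale]]
            channels.append(sub)
    return channels


# A 4 x 4 toy map becomes four 2 x 2 sub-maps.
fm = [[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]
out = space_to_depth(fm)
for sub in out:
    print(sub)
```

Unlike strided convolution or pooling, every input value survives the down-sampling step, which is why the abstract associates it with reduced loss of fine-grained information.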
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022ZD0160400) and the National Natural Science Foundation of China (Grant No. 62106152).
Abstract: With complementary multi-modal information (i.e., visible and thermal), multispectral pedestrian detection is essential for around-the-clock applications, such as autonomous driving, video surveillance, and vicinagearth security. Despite its broad applications, the requirements for an expensive thermal device and multi-sensor alignment limit its use in real-world applications. In this paper, we propose a pseudo-multispectral pedestrian detection (called Pseudo MPD) method, which employs the gray image converted from the RGB image to replace the real thermal image, and learns the pseudo-thermal feature through deep thermal feature guidance (TFG). To achieve this goal, we first introduce an image base-detail decomposition (IBD) module to decompose image information into base and detail parts. Afterwards, we design a base-detail hierarchical feature fusion (BHFF) module to deeply exploit the information between these two parts, and employ a TFG module to guide pseudo-thermal base and detail feature learning. As a result, our proposed method does not require the real thermal image during inference. Comprehensive experiments are performed on two public multispectral pedestrian datasets. The experimental results demonstrate the effectiveness of our proposed method.
Funding: Partially supported by the National Natural Science Foundation of China (Grant No. 62322602), the Natural Science Foundation of Jiangsu Province, China (Grant No. BK20230033), the National Natural Science Foundation of China (Grant No. 62172225), and the CAAI-Huawei MindSpore Open Fund.
Abstract: It remains a challenging task to detect pedestrians in crowds, and more effort is needed to understand why detectors fail. When we perform an error analysis based on the traditional evaluation strategy, we find that it produces many misleading false positives, which in fact cover occluded pedestrians. The reason for this is that we usually have two kinds of annotations in the dataset: regular pedestrians (detection targets) labeled by full-body boxes and ignored pedestrians (NOT detection targets) labeled by visible boxes. Ignored pedestrians are labeled as an additional category termed the “ignore region”. Nevertheless, our detectors always predict a full-body box for each pedestrian. This gap results in the following case: when a detector successfully predicts a full-body box for an ignored pedestrian, a false positive is triggered due to the low overlap between the predicted full-body box and the labeled visible box for the ignored pedestrian. This becomes even more harmful as the detector improves and becomes more capable of locating occluded pedestrians. To alleviate this issue, we devise a new pedestrian detection pipeline, which considers the additional visible box at both the detection and evaluation stages. During detection, we predict an extra visible box apart from the full-body box for every instance; during evaluation, we employ visible boxes instead of full-body boxes to match the “ignore region”. We apply the new pipeline to dozens of detection methods and validate the effectiveness of our pipeline in reducing the over-reporting of false positives and providing more reliable evaluation results.
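The mismatch described above can be reproduced with a small overlap computation. This is a minimal sketch under stated assumptions: the box coordinates are invented, and the IoU measure with a 0.5 threshold is a common convention rather than the matching rule confirmed by the abstract.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Hypothetical example: an ignored pedestrian whose lower body is occluded.
ignore_visible = (100, 50, 140, 110)   # labeled visible box ("ignore region")
pred_full = (95, 45, 145, 215)         # detector's full-body prediction
pred_visible = (98, 48, 142, 112)      # detector's extra visible-box prediction

THRESH = 0.5                           # assumed matching threshold
iou_full = iou(pred_full, ignore_visible)
iou_vis = iou(pred_visible, ignore_visible)

# Matching with the full-body box fails, so the detection counts as a false positive.
print(f"full-body vs ignore region: {iou_full:.3f} (matched: {iou_full >= THRESH})")
# Matching with the visible box succeeds, so the detection is ignored instead.
print(f"visible   vs ignore region: {iou_vis:.3f} (matched: {iou_vis >= THRESH})")
```

With these numbers the full-body prediction overlaps the ignore region by about 0.28 and would be reported as a false positive, while the visible-box prediction overlaps by about 0.85 and would be matched to the ignore region.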
Funding: Supported by the National Natural Science Foundation of China (Nos. 60621062, 60775040, and 90820305).
Abstract: Detection of pedestrians in images and video sequences is important for many applications but is very challenging due to the various silhouettes of pedestrians and partial occlusions. This paper describes a two-stage robust pedestrian detection approach. The first stage uses a full body detector applied to a single image to generate pedestrian candidates. In the second stage, each pedestrian candidate is verified with a detector ensemble consisting of part detectors. The full body detector is trained based on improved shapelet features, while the part detectors make use of Haar-like wavelets as features. All the detectors are trained by a boosting method. The responses of the part detectors are then combined using a detector ensemble. The verification process is formulated as a combinatorial optimization problem with a genetic algorithm for optimization. Then, the detection results are regarded as equivalent classes so that multiple detections of the same pedestrian are quickly merged together. Tests show that this approach has a detection rate of over 95% for 0.1% FPPW on the INRIA dataset, which is significantly better than that of the original shapelet feature based approach and the existing detector ensemble approach. This approach can robustly detect pedestrians in different situations.
Abstract: Pedestrian detection is a critical problem in the field of computer vision. Although most existing algorithms are able to detect pedestrians well in controlled environments, it is often difficult to achieve accurate pedestrian detection from video sequences alone, especially in pedestrian-intensive scenes wherein pedestrians may cause mutual occlusion and thus incomplete detection. To surmount these difficulties, this paper presents a pedestrian detection algorithm based on video sequences and laser point cloud. First, the laser point cloud is interpreted and classified to separate pedestrian data and vehicle data. Then a fusion of video image data and laser point cloud data is achieved by calibration. The region of interest after fusion is determined using feature information contained in the video image and three-dimensional information of the laser point cloud to remove false detections of pedestrians and thus to achieve pedestrian detection in intensive scenes. Experimental verification and analysis on video sequences demonstrate that fusing the two kinds of data improves the performance of pedestrian detection and yields better detection results.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61702347, 61972267, 61772225) and the Natural Science Foundation of Hebei Province (Grant Nos. F2017210161, F2018210148).
Abstract: In recent years, pedestrian detection has been a hot research topic in the field of computer vision and artificial intelligence, and it is widely used in security and pedestrian analysis. However, due to the large amount of calculation in traditional pedestrian detection technology, the speed of many pedestrian recognition systems is very limited. In some restricted areas, such as hazardous construction areas, real-time detection of pedestrians and cross-border behaviors is required. To detect pedestrians and cross-border behavior in the restricted area more conveniently and efficiently, this paper proposes a pedestrian cross-border detection method based on HOG (Histogram of Oriented Gradient) and SVM (Support Vector Machine). This method extracts the moving target through GMM (Gaussian Mixture Model) background modeling and then extracts the characteristics of the moving target through gradient HOG. Finally, it uses SVM training to distinguish pedestrians from non-pedestrians, completes the detection of pedestrians, and labels the targets. The test results show that restricting HOG feature extraction to the candidate area greatly reduces the amount of calculation and the time of feature extraction and eliminates background interference, thereby improving the efficiency of detection, so the method can be applied to occasions with real-time requirements.
Abstract: Purpose – The conventional pedestrian detection algorithms lack scale sensitivity. The purpose of this paper is to propose a novel self-adaptive scale pedestrian detection algorithm, based on a deep residual network (DRN), to address this shortcoming. Design/methodology/approach – First, the “Edge boxes” algorithm is introduced to extract regions of interest from pedestrian images. Then, the extracted bounding boxes are fed to two different DRNs: one is a large-scale DRN and the other is a small-scale DRN. The height of the bounding boxes is used to classify the results of pedestrians and to regress the bounding boxes to the entity of the pedestrian. Finally, a weighted self-adaptive scale function, which combines the large-scale results and small-scale results, is designed for the final pedestrian detection. Findings – To validate the effectiveness and feasibility of the proposed algorithm, comparison experiments were conducted on the common pedestrian detection data sets Caltech, INRIA, ETH and KITTI. Experimental results show that the proposed algorithm adapts to the various scales of pedestrians. For hard-to-detect small-scale pedestrians, the proposed algorithm improves the accuracy and robustness of detection. Originality/value – By applying different models to deal with different scales of pedestrians, the proposed algorithm with the weighted calculation function improves the accuracy and robustness for different scales of pedestrians.
Funding: This work was supported by the National Ethnic Affairs Commission of the People’s Republic of China (Training Program for Young and Middle-aged Talents) (No: MZR20007), the Hubei Provincial Science and Technology Major Project of China (No: 2020AEA011), the Wuhan Science and Technology Plan Applied Basic Frontier Project (No: 2020020601012267), and the Fundamental Research Funds for the Central Universities, South-Central MinZu University (No: CZQ21026).
Abstract: Purpose - The purpose of the study is to address the problems of low accuracy and missed detection of occluded pedestrians and small target pedestrians when using the YOLOX general object detection algorithm for pedestrian detection. This study proposes a multi-level fine-grained YOLOX pedestrian detection algorithm. Design/methodology/approach - First, to address the problem of the original YOLOX algorithm in obtaining a single perceptual field for the feature map before feature fusion, this study improves the PAFPN structure by adding the ResCoT module to increase the diversity of the perceptual field of the feature map and divides the pedestrian multi-scale features into finer granularity. Second, for the CSPLayer of the PAFPN, a weight gain-based normalization-based attention module (NAM) is proposed to make the model pay more attention to the context information when extracting pedestrian features and highlight the salient features of pedestrians. Finally, the authors experimentally determined the optimal values for the confidence loss function. Findings - The experimental results show that, compared with the original YOLOX algorithm, the AP of the improved algorithm increased by 2.90%, the Recall increased by 3.57%, and F1 increased by 2% on the pedestrian dataset. Research limitations/implications - The multi-level fine-grained YOLOX pedestrian detection algorithm can effectively improve the detection of occluded pedestrians and small target pedestrians. Originality/value - The authors introduce a multi-level fine-grained ResCoT module and a weight gain-based NAM attention module.
Abstract: Early detection of vulnerable road users is a crucial requirement for autonomous vehicles to meet and exceed the object detection capabilities of human drivers. One of the most complex outstanding challenges is that of partial occlusion, where a target object is only partially available to the sensor due to obstruction by another foreground object. A number of leading pedestrian detection benchmarks provide annotation for partial occlusion; however, each benchmark varies greatly in its definition of the occurrence and severity of occlusion. Research demonstrates that a high degree of subjectivity is used to classify occlusion level in these cases, and occlusion is typically categorized into 2–3 broad categories such as “partially” and “heavily” occluded. In addition, many pedestrian instances are impacted by multiple inhibiting factors which contribute to non-detection, such as object scale, distance from camera, lighting variations and adverse weather. This can lead to inaccurate or inconsistent reporting of detection performance for partially occluded pedestrians depending on which benchmark is used. This research introduces a novel, objective benchmark for partially occluded pedestrian detection to facilitate the objective characterization of pedestrian detection models. Characterization is carried out on seven popular pedestrian detection models for a range of occlusion levels from 0%–99% to demonstrate the impact of progressive levels of partial occlusion on pedestrian detectability. Results show that the proposed benchmark provides more objective, fine-grained analysis of pedestrian detection algorithms than the current state of the art.