Existing vehicle detection algorithms for smart factories struggle to detect partially occluded vehicles, are vulnerable to background interference, lack a global view, and excessively suppress real targets, all of which degrade accuracy. To address these problems and to facilitate the subsequent positioning of vehicles in the factory, this paper proposes an improved YOLOv8 algorithm. Firstly, the RFCAConv module is incorporated into the original YOLOv8 backbone. It attends to the different features within the receptive field and prioritizes the receptive field's spatial features, capturing more vehicle feature information and making partially occluded vehicles easier to detect. Secondly, the SFE module is added to the YOLOv8 neck, which improves the saliency of the target during inference and reduces the influence of background interference on vehicle detection. Finally, the head of the RT-DETR algorithm replaces the head of the original YOLOv8 algorithm, which avoids excessive suppression of real targets while incorporating context information. The experimental results show that, compared with the original YOLOv8 algorithm, the improved algorithm raises detection accuracy by 4.6% on a self-made smart factory dataset, and its detection speed meets the real-time requirements of smart factory vehicle detection and subsequent vehicle positioning.
Vehicle detection is a crucial aspect of intelligent transportation systems (ITS) and autonomous driving technologies. The complexity and diversity of real-world road environments, coupled with traffic congestion, pose significant challenges to the accuracy and real-time performance of vehicle detection models. To address these challenges, this paper introduces a fast and accurate vehicle detection algorithm named BES-Net. Firstly, the BoTNet module is integrated into the backbone network to strengthen the model's ability to capture long-range dependencies, cope with the complexity and diversity of road environments, and accelerate the detection speed of BES-Net. Secondly, to accommodate the varying sizes of target vehicles, the efficient multi-scale attention (EMA) mechanism is added to enrich feature information and enhance the model's expressive power by combining features from multiple scales. Finally, the Slide loss function is used to give higher weight to difficult samples, thereby improving the detection accuracy of small targets. The experimental results demonstrate that BES-Net outperforms conventional object detection models, with mAP50 (mean Average Precision), precision, and recall reaching 73.2%, 74.8%, and 73.1%, respectively. These metrics represent improvements of 8.5%, 7.2%, and 11.7% over the baseline network and underscore the robustness of BES-Net in vehicle detection tasks. In addition, BES-Net has been deployed on NVIDIA Jetson Orin NX equipment, which demonstrates the practicality of the model and provides a foundation for its integration into autonomous driving and ITS applications.
Traditional automated guided vehicles (AGVs) primarily rely on scheduling systems to manage warehouse locations and execute picking or placing tasks on fixed-height pallets. However, these conventional systems are ill-suited for scenarios involving variable heights, such as vehicle loading and unloading or the complex stacking of soft packages. To address the challenges of AGV end-effector operations in non-fixed-height scenarios, this paper proposes a solution based on low-cost depth camera sensors. By capturing image and depth data and integrating deep learning, image processing, and spatial attitude calculation techniques, the method determines the position of the end-effector center point relative to the upper plane of the fork. The approach resolves a key issue in AGV operations within intelligent logistics scenarios that lack fixed heights. The proposed algorithm is deployed on a domestic embedded, low-cost ARM chip controller, and extensive experiments are conducted on a real AGV with multiple stacked vehicles and non-standard vehicles. The experimental results demonstrate that, for diverse vehicles with different heights, the measurement error can be kept within ±10 mm, satisfying the requirements for high-precision measurement. The height measurement method developed in the paper enhances the AGV's adaptability in non-fixed-height scenarios and broadens its application potential across various industries.
Accurate vehicle detection is essential for autonomous driving, traffic monitoring, and intelligent transportation systems. This paper presents an enhanced YOLOv8n model that incorporates the Ghost Module, the Convolutional Block Attention Module (CBAM), and Deformable Convolutional Networks v2 (DCNv2). The Ghost Module streamlines feature generation to reduce redundancy, CBAM applies channel and spatial attention to improve feature focus, and DCNv2 enables adaptability to geometric variations in vehicle shapes. These components work together to improve both accuracy and computational efficiency. Evaluated on the KITTI dataset, the proposed model achieves 95.4% mAP@0.5 (an 8.97% gain over standard YOLOv8n), along with 96.2% precision, 93.7% recall, and a 94.93% F1-score. Comparative analysis with seven state-of-the-art detectors demonstrates consistent superiority in key performance metrics. An ablation study is also conducted to quantify the individual and combined contributions of the Ghost Module, CBAM, and DCNv2, highlighting their effectiveness in improving detection performance. By addressing feature redundancy, attention refinement, and spatial adaptability, the proposed model offers a robust and scalable solution for vehicle detection across diverse traffic scenarios.
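The F1-score reported above is the harmonic mean of precision and recall, so it can be checked against the other two figures. The short sketch below does that arithmetic; the function name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(96.2, 93.7)
print(round(f1, 2))  # 94.93, matching the reported F1-score
```

The reported precision, recall, and F1 values are therefore mutually consistent.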
Infrared target detection from drones commonly suffers from high missed detection rates, low signal-to-noise ratios, and blurred edge features for small targets. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a dynamic multi-scale feature fusion and adaptive weighting approach is used to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the expression and transmission of shallow features of small targets, thereby reducing the loss of detailed information. Then, an Edge Enhancement (EE) module improves the extraction of infrared small-target edge features through low-frequency suppression and high-frequency enhancement. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall compared with YOLOv11n, at a computational cost of only 9.1 GFLOPS. In comparison experiments, the model achieves the best balance between detection accuracy and model size, meeting the lightweight deployment requirements of drone-based systems. The method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
Differentiating between regular and abnormal noises in machine-generated sounds is a crucial but difficult problem. Accurate audio signal classification requires suitable and efficient techniques, particularly machine learning approaches for automated classification. Because the representative characteristics of audio data are dynamic and diverse, high classification accuracy is hard to achieve and requires further research. This study proposes an ensemble model based on the LeNet and hierarchical attention mechanism (HAM) models with MFCC features to enhance the models' capacity to handle bias. Additionally, CNN, bidirectional LSTM (BiLSTM), CRNN, LSTM, capsule network model (CNM), attention mechanism (AM), gated recurrent unit (GRU), ResNet, EfficientNet, and HAM models are implemented for performance comparison. Experiments on the DCASE2020 dataset show that the proposed approach outperforms the others, achieving 99.13% accuracy and 99.56% k-fold cross-validation accuracy. Comparison with state-of-the-art studies further validates this performance. The findings highlight the potential of the proposed approach for accurate fault detection in vehicles using acoustic data.
To reduce vehicle crashes, a new rear-view vehicle detection system based on monocular vision is designed. First, a small and flexible hardware platform based on a DM642 digital signal processor (DSP) micro-controller is built. Then, a two-step vehicle detection algorithm is proposed. In the first step, a fast vehicle edge and symmetry fusion algorithm is used with a low threshold, so that possible vehicles have a nearly 100% detection rate (TP) at the cost of a high false detection rate (FP) for non-vehicles, i.e., all possible vehicles are retained. In the second step, a probabilistic neural network (PNN) classifier based on multi-scale, multi-orientation Gabor features is trained to classify the possible vehicles and eliminate the falsely detected ones from the candidates generated in the first step. Experimental results demonstrate that the proposed system maintains a high detection rate and a low false detection rate under different road, weather, and lighting conditions.
A method that extracts traffic information from MPEG-2 compressed video is proposed. Based on the features of vehicle motion, the motion vectors of macro-blocks are used to detect moving vehicles in the daytime, and a filter algorithm for removing motion-vector noise is given. Because headlights are brighter than the background in night images, the discrete cosine transform (DCT) coefficients of image blocks are used to detect vehicle headlights at night, and an algorithm for calculating the DCT coefficients of P-frames is introduced. To prevent moving objects outside the expressway and video shot changes from disturbing the detection, a driveway location method and a video-shot-change detection algorithm are suggested. The detection rate of this method is 97.4% in the daytime and 95.4% at night. The results show that this vehicle detection method is effective.
An efficient vehicle detection approach is proposed for traffic surveillance images, based on information fusion of the vehicle's symmetrical contour and license plate position. The vertical symmetry axis of the vehicle contour in an image is first detected, and then the vertical and horizontal symmetry axes of the license plate are detected using the symmetry axis of the vehicle contour as a reference. The vehicle location in an image is determined using the license plate symmetry axes and the vertical and horizontal projection maps of the vehicle edge image. A dataset of 450 images (15 classes of vehicles) is used to test the proposed method. The experimental results indicate that, compared with the vehicle contour-based, license plate location-based, vehicle texture-based, and Gabor feature-based methods, the proposed method performs best, with a detection accuracy of 90.7% and an elapsed time of 125 ms.
Traditional vehicle detection algorithms generate vehicle candidates by traverse search and verify them with classifiers trained on hand-crafted features. These methods generally have long processing times and low vehicle detection performance. To address this issue, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small vehicle candidate area. The vehicle candidate sub-images are then loaded into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. The experimental results demonstrate that the proposed method achieves a 94.81% correct detection rate and a 0.78% false detection rate on existing datasets and on real road pictures captured by the authors' group, outperforming existing state-of-the-art algorithms. More importantly, the deep sparse convolution network generates highly discriminative multi-scale features, which have broad application prospects in target recognition for intelligent vehicles.
To address the high complexity, poor real-time performance, and low detection rates for small target vehicles of existing vehicle detection algorithms, this paper proposes a real-time lightweight architecture based on You Only Look Once (YOLO) v5m. Firstly, a lightweight upsampling operator called Content-Aware Reassembly of Features (CARAFE) is introduced in the feature fusion layer of the network to maximize the extraction of deep-level features for small target vehicles, reducing the missed detection rate and false detection rate. Secondly, a new prediction layer for tiny targets is added, and the feature fusion network is redesigned to enhance the detection capability for small targets. Finally, L1 regularization is applied when training the improved network, followed by pruning and fine-tuning to remove redundant channels, which reduces computation and parameter count and improves the detection efficiency of the network. Training is conducted on the VisDrone2019-DET dataset. The experimental results show that the proposed algorithm reduces parameters and computation by 63.8% and 65.8%, respectively. The average detection accuracy improves by 5.15%, and the detection speed reaches 47 images per second, satisfying real-time requirements. Compared with existing approaches, including YOLOv5m and classical vehicle detection algorithms, the method achieves higher accuracy and faster speed for real-time detection of small target vehicles in edge computing.
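The abstract above does not say which weights receive the L1 penalty or how channels are ranked for pruning. A common choice, assumed here purely for illustration, is to penalize the per-channel batch-normalization scale factors and then drop the channels with the smallest magnitudes. The sketch below shows that idea on plain Python lists; `l1_penalty`, `select_channels`, `lam`, and `prune_ratio` are hypothetical names, not the paper's.

```python
def l1_penalty(gammas, lam):
    """L1 sparsity term added to the training loss: lam * sum(|gamma|)."""
    return lam * sum(abs(g) for g in gammas)


def select_channels(gammas, prune_ratio):
    """Return the indices of channels to keep after pruning.

    Channels are ranked by |gamma|; the smallest `prune_ratio` fraction
    is removed. This ranking rule is an assumption for illustration.
    """
    n_prune = int(len(gammas) * prune_ratio)
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]))
    pruned = set(order[:n_prune])
    return [i for i in range(len(gammas)) if i not in pruned]


# Hypothetical scale factors for a four-channel layer after sparse training.
gammas = [0.9, 0.01, 0.5, 0.02]
print(select_channels(gammas, prune_ratio=0.5))  # [0, 2]
```

After pruning, the slimmer network is normally fine-tuned to recover accuracy, which matches the prune-then-fine-tune order described in the abstract.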
A patch-based method for detecting vehicle logos using prior knowledge is proposed. By representing the coarse region of the logo with a weight matrix of patch intensity and position, the proposed method is robust to poor and complex environmental conditions. The bounding box of the logo is extracted by a thresholding approach. Experimental results show that 93.58% location accuracy is achieved on 1100 images under various environmental conditions, indicating that the proposed method is effective and suitable for locating vehicle logos in practical applications.
Autonomous driving technology has achieved a great deal with deep learning, and vehicle detection and classification has become one of the critical technologies of autonomous driving systems. Vehicle instance segmentation performs instance-level semantic parsing of vehicle information, which is more accurate and reliable than object detection. However, existing instance segmentation algorithms still suffer from poor mask prediction accuracy and low detection speed. Therefore, this paper proposes a real-time instance segmentation model named FIR-YOLACT, which fuses the ICIoU (Improved Complete Intersection over Union) loss and Res2Net into the YOLACT algorithm. Specifically, the ICIoU function addresses the degradation problem of the original CIoU loss function and improves training convergence speed and detection accuracy. The Res2Net module fused with ECA (Efficient Channel Attention)-Net is added to the model's backbone network, which improves multi-scale detection capability and mask prediction accuracy. Furthermore, the Cluster NMS (Non-Maximum Suppression) algorithm is introduced in the model's bounding box regression to improve the detection of similar occluded objects. The experimental results demonstrate the superiority of FIR-YOLACT over the baseline methods and the effectiveness of all components. The processing speed reaches 28 FPS, which meets the demands of real-time vehicle instance segmentation.
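For context on the CIoU loss that ICIoU modifies, the sketch below implements the standard CIoU loss for two axis-aligned boxes given as `(x1, y1, x2, y2)`. It is the original CIoU formulation, not the paper's ICIoU, whose definition is not given in the abstract. The function name and the `eps` guard are choices made here.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Standard CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared distance between box centers (rho^2).
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box (c^2).
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps))
                              - math.atan(wa / (ha + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


# Two 2x2 squares offset by one unit: IoU = 1/7, rho^2 = 2, c^2 = 18, v = 0.
print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

In this example both boxes have the same aspect ratio, so `v` is zero and the aspect-ratio term contributes nothing. This vanishing of the aspect-ratio term is one commonly cited limitation of CIoU; whether it is the degradation the paper refers to is not stated in the abstract.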
The rapid development of edge computing has driven an increasing number of deep learning applications, such as pedestrian and vehicle detection, to be deployed at the edge of the network to provide efficient intelligent services to mobile users. However, as accuracy requirements continue to rise, the components of deep learning models for pedestrian and vehicle detection, such as YOLOv4, become more sophisticated and the computing resources required for model training increase dramatically. This makes it challenging to deploy such models effectively on resource-constrained edge devices while maintaining high accuracy. To address this challenge, this paper proposes a cloud-edge collaboration-based pedestrian and vehicle detection framework, which trains models using the abundant computing resources in the cloud and then deploys the trained models on edge devices, thus reducing the computing resources needed for training on edge devices. Furthermore, to reduce the size of the model deployed on edge devices, an automatic pruning method that combines the convolution layer and the BN layer is proposed to compress the pedestrian and vehicle detection model. Experimental results show that the proposed framework can deploy the pruned model on a real edge device, Jetson TX2, with 6.72 times higher FPS. Meanwhile, channel pruning reduces the model's volume and number of parameters to 96.77%, and the computing amount to 81.37%.
In this paper, we provide a new approach for intelligent traffic transportation in intelligent vehicular networks, which aims to collect vehicles' locations, trajectories, and other key driving parameters to meet the time-critical requirements of autonomous driving. The key to our method is a multi-vehicle tracking framework for the traffic monitoring scenario. The proposed framework is composed of three modules: multi-vehicle detection, multi-vehicle association, and miss-detected vehicle tracking. For the first module, we integrate a self-attention mechanism into a detector based on key point estimation to improve detection. For the second module, we use multi-dimensional information to improve robustness, including vehicle re-identification (Re-ID) features, historical trajectory information, and spatial position information. For the third module, we re-track occluded vehicles that were missed by the first detection module. In addition, we use asymmetric convolution and depth-wise separable convolution to reduce the model's parameters and speed it up. Extensive experimental results show the effectiveness of the proposed multi-vehicle tracking framework.
Vision-based vehicle detection in adverse weather conditions such as fog, haze, and mist is a challenging research area in the fields of autonomous vehicles, collision avoidance, and Internet of Things (IoT)-enabled edge/fog computing traffic surveillance and monitoring systems. Efficient and cost-effective vehicle detection at high accuracy and speed in foggy weather is essential to avoiding road traffic collisions in real time. To evaluate vision-based vehicle detection performance in foggy weather conditions, the state-of-the-art Vehicle Detection in Adverse Weather Nature (DAWN) and Foggy Driving (FD) datasets are self-annotated using the YOLO LABEL tool and customized to four vehicle detection classes: cars, buses, motorcycles, and trucks. The state-of-the-art single-stage deep learning algorithms YOLO-V5 and YOLO-V8 are considered for the task of vehicle detection. Furthermore, YOLO-V5s is enhanced by introducing the attention modules Convolutional Block Attention Module (CBAM), Normalized-based Attention Module (NAM), and Simple Attention Module (SimAM) after the SPPF module, and YOLO-V5l is enhanced with BiFPN. Their vehicle detection accuracy and running speed are validated on cloud (Google Colab) and edge (local) systems. On the DAWN dataset, the mAP50 score of YOLO-V5n is 72.60%, YOLO-V5s is 75.20%, YOLO-V5m is 73.40%, and YOLO-V5l is 77.30%; YOLO-V8n is 60.20%, YOLO-V8s is 73.50%, YOLO-V8m is 73.80%, and YOLO-V8l is 72.60%. On the FD dataset, the mAP50 score of YOLO-V5n is 43.90%, YOLO-V5s is 40.10%, YOLO-V5m is 49.70%, and YOLO-V5l is 57.30%; YOLO-V8n is 41.60%, YOLO-V8s is 46.90%, YOLO-V8m is 42.90%, and YOLO-V8l is 44.80%. On the DAWN dataset, the vehicle detection speed of YOLO-V5n is 59 frames per second (FPS), YOLO-V5s is 47 FPS, YOLO-V5m is 38 FPS, and YOLO-V5l is 30 FPS; YOLO-V8n is 185 FPS, YOLO-V8s is 109 FPS, YOLO-V8m is 72 FPS, and YOLO-V8l is 63 FPS. On the FD dataset, the vehicle detection speed of YOLO-V5n is 26 FPS, YOLO-V5s is 24 FPS, YOLO-V5m is 22 FPS, and YOLO-V5l is 17 FPS; YOLO-V8n is 313 FPS, YOLO-V8s is 182 FPS, YOLO-V8m is 99 FPS, and YOLO-V8l is 60 FPS. YOLO-V5s, the YOLO-V5s variants, YOLO-V5l_BiFPN, and the YOLO-V8 algorithms are efficient and cost-effective solutions for real-time vision-based vehicle detection in foggy weather.
Analyzing a vehicle's abnormal behavior in surveillance videos is challenging, mainly because of the wide variety of anomaly cases and the complexity of surveillance videos. In this study, a novel intelligent vehicle behavior analysis framework based on a digital twin is proposed. First, vehicles are detected using deep learning, and Kalman filtering and feature matching are used to track them. Subsequently, each tracked vehicle is mapped to a digital-twin virtual scene developed in the Unity game engine, and its behavior is tested against the customized detection conditions set up in the scene. The stored behavior data can be used to reconstruct the scene in Unity for a secondary analysis. The experimental results on real videos from traffic cameras show that the detection rate of the proposed framework is close to that of state-of-the-art abnormal event detection systems. In addition, the implementation and analysis process shows the usability, generalization, and effectiveness of the proposed framework.
Vehicle detection plays a crucial role in autonomous driving technology. However, directly applying deep learning-based object detection algorithms to complex road scene images often leads to subpar performance and slow inference in vehicle detection. Balancing accuracy and detection speed is crucial for real-time object detection in real-world road scenes. This paper proposes a high-precision and fast vehicle detector called the feature-guided bidirectional pyramid network (FBPN). Firstly, to tackle challenges such as vehicle occlusion and significant background interference, the efficient feature filtering module (EFFM) is introduced into the deep network, which amplifies the disparities between the features of the vehicle and the background. Secondly, the proposed global attention localization module (GALM) in the model neck perceives the detailed position information of the target, improving both the accuracy and inference speed of the model. Finally, the detection accuracy of small-scale vehicles is further enhanced by a four-layer feature pyramid structure. Experimental results show that FBPN achieves an average precision of 60.8% and 97.8% on the BDD100K and KITTI datasets, respectively, with inference speeds of 344.83 frames/s and 357.14 frames/s. FBPN strikes a balance between detection accuracy and inference speed and outperforms several state-of-the-art methods.
This paper proposes a night-time vehicle detection method using a variable Haar-like feature. The specific features of the vehicle ahead cannot be obtained from night-time road images because of light reflection and ambient light, and it is also difficult to define the optimal brightness and color of the rear lamps for all road conditions. In comparison, the difference between the vehicle region and the road surface is more robust to the road illumination environment. Thus, we select vehicle candidates by analysing this difference and verify the candidates using their brightness and complexity to detect vehicles correctly. The brightness difference feature is detected using a horizontal Haar-like mask whose size varies with the expected vehicle size at each image location, and regions where a rapid change occurs are selected as candidates. The proposed method is evaluated by testing under various real road conditions.
Detecting moving vehicles in jittering traffic scenes is very difficult because of the complex environment. Neither the color features of pixels alone nor the texture features of the image alone can establish a suitable background model for moving vehicles. To solve this problem, a Gaussian pyramid layered algorithm is proposed that combines the advantages of the Codebook algorithm and the local binary patterns (LBP) algorithm. Firstly, the image pyramid is built to eliminate the noise generated by camera shake. Then, the Codebook model and the LBP model are constructed on the low-resolution level and the high-resolution level of the Gaussian pyramid, respectively. Finally, the detection results are obtained through a set of operations based on the spatial relations of pixels. The experimental results show that this algorithm not only eliminates the noise effectively but also saves computation time while achieving high detection sensitivity and high detection accuracy.
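As background for the LBP texture feature mentioned above, the sketch below computes the basic 3x3 local binary pattern code for one pixel of a grayscale image stored as a list of rows. It shows the textbook operator only. The neighbor ordering and the function name `lbp_code` are choices made here, and the paper's pyramid-level background model is not reproduced.

```python
def lbp_code(img, r, c):
    """Basic 3x3 LBP code of the interior pixel at row r, column c.

    Each of the 8 neighbors contributes one bit: 1 if the neighbor is at
    least as bright as the center, else 0. Neighbors are visited clockwise
    starting from the top-left (an ordering chosen for this sketch).
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(lbp_code(flat, 1, 1))  # 255: every neighbor is >= the center
print(lbp_code(peak, 1, 1))  # 0: every neighbor is darker than the center
```

Because the code depends only on the sign of local differences, it is unchanged when a constant is added to all pixel values, which is one reason LBP is often paired with a color-based model such as Codebook.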
Funding (improved YOLOv8 for smart factory vehicle detection): Changzhou Science and Technology Project (No. CZ20230025) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. XSJCX23_36).
Funding (BES-Net vehicle detection): National Natural Science Foundation of China (No. 61961011) and the Innovation Project of Guangxi Graduate Education (No. YCSW2025411).
Funding: Supported by the Key Research and Development Program of Anhui Province (No. 201904a05020035), the Postdoctoral Research Initiative of Anhui Province (No. 2024B804), and the Hefei City Key Technology Research and Development ‘Ranking’ (No. 2023SGJ017).
Abstract: Traditional automated guided vehicles (AGVs) primarily rely on scheduling systems to manage warehouse locations and execute picking or placing tasks on fixed-height pallets. However, these conventional systems are ill-suited for scenarios involving variable heights, such as vehicle loading and unloading or the complex stacking of soft packages. To address the challenges of AGV end-effector operations in non-fixed-height scenarios, this paper proposes a solution based on low-cost depth camera sensors. By capturing image and depth data and integrating deep learning, image processing, and spatial attitude calculation techniques, the method accurately determines the position of the end-effector center point relative to the upper plane of the fork. The approach resolves a key issue in AGV operations within intelligent logistics scenarios that lack fixed heights. The proposed algorithm is deployed on a domestic embedded, low-cost ARM chip controller, and extensive experiments are conducted on a real AGV with multiple stacked vehicles and non-standard vehicles. The experimental results demonstrate that, for diverse vehicles with different heights, the measurement error can be kept within ±10 mm, satisfying the requirements for high-precision measurement. The height measurement method developed in the paper enhances the AGV’s adaptability in non-fixed-height scenarios and broadens its application potential across various industries.
Abstract: Accurate vehicle detection is essential for autonomous driving, traffic monitoring, and intelligent transportation systems. This paper presents an enhanced YOLOv8n model that incorporates the Ghost Module, Convolutional Block Attention Module (CBAM), and Deformable Convolutional Networks v2 (DCNv2). The Ghost Module streamlines feature generation to reduce redundancy, CBAM applies channel and spatial attention to improve feature focus, and DCNv2 enables adaptability to geometric variations in vehicle shapes. These components work together to improve both accuracy and computational efficiency. Evaluated on the KITTI dataset, the proposed model achieves 95.4% mAP@0.5 (an 8.97% gain over standard YOLOv8n), along with 96.2% precision, 93.7% recall, and a 94.93% F1-score. Comparative analysis with seven state-of-the-art detectors demonstrates consistent superiority in key performance metrics. An ablation study is also conducted to quantify the individual and combined contributions of the Ghost Module, CBAM, and DCNv2, highlighting their effectiveness in improving detection performance. By addressing feature redundancy, attention refinement, and spatial adaptability, the proposed model offers a robust and scalable solution for vehicle detection across diverse traffic scenarios.
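The F1-score quoted in the abstract above can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does only that arithmetic; the function name `f1_score` is chosen here for illustration and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract (96.2% and 93.7%).
reported_f1 = f1_score(0.962, 0.937)
print(f"F1 = {reported_f1:.2%}")  # prints "F1 = 94.93%"
```

The result agrees with the 94.93% F1-score stated in the abstract, so the three reported figures are mutually consistent.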
Abstract: Target detection in drone-based infrared imagery commonly suffers from high missed-detection rates, low signal-to-noise ratios, and blurred edge features for small targets. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a Dynamic Multi-Scale Feature Fusion and Adaptive Weighting approach is employed to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the feature expression and transmission capability of shallow small targets, thereby reducing the loss of detailed information. Then, combined with an Edge Enhancement (EE) module, the model improves the extraction of infrared small-target edge features through low-frequency suppression and high-frequency enhancement strategies. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall rate compared to YOLOv11n, with a computational cost of only 9.1 GFLOPs. In comparison experiments, the model achieved the best balance between detection accuracy and model size, meeting the lightweight deployment requirements for drone-based systems. This method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R746), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Differentiating between regular and abnormal noises in machine-generated sounds is a crucial but difficult problem. Accurate audio signal classification requires suitable and efficient techniques, particularly machine learning approaches for automated classification. Because the representative characteristics of audio data are dynamic and diverse, high classification accuracy is hard to achieve and requires further research. This study proposes an ensemble model based on the LeNet and hierarchical attention mechanism (HAM) models with MFCC features to enhance the models’ capacity to handle bias. Additionally, CNN, bidirectional LSTM (BiLSTM), CRNN, LSTM, capsule network model (CNM), attention mechanism (AM), gated recurrent unit (GRU), ResNet, EfficientNet, and HAM models are implemented for performance comparison. Experiments on the DCASE2020 dataset show that the proposed approach outperforms the others, achieving 99.13% accuracy and 99.56% k-fold cross-validation accuracy. Comparison with state-of-the-art studies further validates this performance. The study’s findings highlight the potential of the proposed approach for accurate fault detection in vehicles, particularly using acoustic data.
Funding: The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2009BAG13A04), the Jiangsu Transportation Science Research Program (No. 08X09), and the Program of Suzhou Science and Technology (No. SG201076).
Abstract: In order to decrease vehicle crashes, a new rear-view vehicle detection system based on monocular vision is designed. First, a small and flexible hardware platform based on a DM642 digital signal processor (DSP) micro-controller is built. Then, a two-step vehicle detection algorithm is proposed. In the first step, a fast vehicle edge and symmetry fusion algorithm is used with a low threshold, so that the detection rate (TP) for possible vehicles is nearly 100% at the cost of a high false detection rate (FP) for non-vehicles, i.e., all possible vehicles are retained as candidates. In the second step, a probabilistic neural network (PNN) classifier trained on multi-scale and orientation Gabor features classifies the candidates and eliminates the falsely detected vehicles generated in the first step. Experimental results demonstrate that the proposed system maintains a high detection rate and a low false detection rate under different road, weather, and lighting conditions.
Funding: The Cultivation Fund of the Key Scientific and Technical Innovation Project of Higher Education of the Ministry of Education (No. 705020) and the Natural Science Foundation of Jiangsu Province (No. BK2004077).
Abstract: A method for extracting traffic information from MPEG-2 compressed video is proposed. Based on the features of vehicle motion, the motion vectors of macro-blocks are used to detect moving vehicles in the daytime, and a filter algorithm for removing noise from the motion vectors is given. Because headlights are brighter than the background in night images, the discrete cosine transform (DCT) coefficients of image blocks are used to detect vehicle headlights at night, and an algorithm for calculating the DCT coefficients of P-frames is introduced. To prevent moving objects outside the expressway and video shot changes from disturbing the detection, a driveway location method and a video-shot-change detection algorithm are suggested. With this method, the detection rate is 97.4% in the daytime and 95.4% at night, which shows that the vehicle detection method is effective.
Funding: The National Natural Science Foundation of China (No. 40804015, 61101163).
Abstract: An efficient vehicle detection approach is proposed for traffic surveillance images, based on information fusion of the vehicle's symmetrical contour and the license plate position. The vertical symmetry axis of the vehicle contour in an image is first detected, and then the vertical and horizontal symmetry axes of the license plate are detected using the symmetry axis of the vehicle contour as a reference. The vehicle location in an image is determined using the license plate symmetry axes and the vertical and horizontal projection maps of the vehicle edge image. A dataset consisting of 450 images (15 classes of vehicles) is used to test the proposed method. The experimental results indicate that, compared with the vehicle contour-based, license plate location-based, vehicle texture-based, and Gabor feature-based methods, the proposed method performs best, with a detection accuracy of 90.7% and an elapsed time of 125 ms.
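As an illustration of the first step described above, the following is a minimal sketch of how a vertical symmetry axis might be located in a binary edge map. The scoring rule (counting mirrored edge-pixel pairs around each candidate column) and the function name `vertical_symmetry_axis` are assumptions made for this example; the abstract does not specify the authors' actual procedure.

```python
def vertical_symmetry_axis(edges):
    """Return the column index whose left/right neighbourhoods mirror best.

    `edges` is a list of rows, each a list of 0/1 edge flags.
    A candidate axis scores one point for every pair of edge pixels
    lying at the same distance to its left and right in the same row.
    """
    width = len(edges[0])
    best_col, best_score = None, -1
    for axis in range(1, width - 1):
        half = min(axis, width - 1 - axis)  # mirror range that stays inside the image
        score = sum(
            1
            for row in edges
            for d in range(1, half + 1)
            if row[axis - d] and row[axis + d]
        )
        if score > best_score:
            best_col, best_score = axis, score
    return best_col, best_score


# Toy edge map: both rows are symmetric about column 3.
toy_edges = [
    [0, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1],
]
axis, score = vertical_symmetry_axis(toy_edges)
print(axis, score)
```

This simple score favours axes near the image centre, because they have more mirrored pairs available; a practical implementation would likely normalise for that.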
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U1564201, 61573171, 61403172, 51305167); the China Postdoctoral Science Foundation (Grant Nos. 2015T80511, 2014M561592); the Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20140555); the Six Talent Peaks Project of Jiangsu Province, China (Grant Nos. 2015-JXQC-012, 2014-DZXX-040); the Jiangsu Postdoctoral Science Foundation, China (Grant No. 1402097C); and the Jiangsu University Scientific Research Foundation for Senior Professionals, China (Grant No. 14JDG028).
Abstract: Traditional vehicle detection algorithms generate vehicle candidates by traverse search and verify them with classifiers trained on hand-crafted features. These methods generally have long processing times and low detection performance. To address this issue, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small vehicle candidate area. The vehicle candidate sub-images are then fed into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. The experimental results demonstrate that the proposed method achieves a 94.81% correct detection rate and a 0.78% false detection rate on existing datasets and on real road pictures captured by the authors' group, outperforming existing state-of-the-art algorithms. More importantly, the deep sparse convolution network generates highly discriminative multi-scale features, which have broad application prospects for target recognition in the field of intelligent vehicles.
Funding: Funded by the General Project of the Key Research and Development Plan of Shaanxi Province (No. 2022NY-087).
Abstract: To address the challenges of high complexity, poor real-time performance, and low detection rates for small target vehicles in existing vehicle object detection algorithms, this paper proposes a real-time lightweight architecture based on You Only Look Once (YOLO) v5m. Firstly, a lightweight upsampling operator called Content-Aware Reassembly of Features (CARAFE) is introduced in the feature fusion layer of the network to maximize the extraction of deep-level features for small target vehicles, reducing the missed detection rate and false detection rate. Secondly, a new prediction layer for tiny targets is added, and the feature fusion network is redesigned to enhance the detection capability for small targets. Finally, this paper applies L1 regularization to train the improved network, followed by pruning and fine-tuning operations to remove redundant channels, reducing computational and parameter complexity and enhancing the detection efficiency of the network. Training is conducted on the VisDrone2019-DET dataset. The experimental results show that the proposed algorithm reduces parameters and computation by 63.8% and 65.8%, respectively. The average detection accuracy improves by 5.15%, and the detection speed reaches 47 images per second, satisfying real-time requirements. Compared with existing approaches, including YOLOv5m and classical vehicle detection algorithms, the method achieves higher accuracy and faster speed for real-time detection of small target vehicles in edge computing.
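The final step in the abstract above (L1-regularised training followed by channel pruning) can be illustrated with a small sketch. It assumes, as one common approach, that the L1 penalty is applied to per-channel scale factors (for example, batch-normalisation weights) and that the channels with the smallest absolute scale are removed; the abstract does not state which parameters the authors regularise or how they rank channels, so the names `l1_penalty`, `select_channels`, and `prune_ratio` are hypothetical.

```python
def l1_penalty(scales, lam):
    """L1 sparsity term added to the training loss: lam * sum(|scale|)."""
    return lam * sum(abs(s) for s in scales)


def select_channels(scales, prune_ratio):
    """Indices of the channels kept after dropping the smallest-|scale| ones."""
    n_prune = int(len(scales) * prune_ratio)
    # Rank channel indices from least to most important.
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    return sorted(order[n_prune:])


# Hypothetical per-channel scale factors after sparsity training.
scales = [0.91, 0.02, -0.47, 0.003, 0.65, -0.01, 0.30, 0.08]
penalty = l1_penalty(scales, lam=0.1)
kept = select_channels(scales, prune_ratio=0.5)
print(kept)  # prints [0, 2, 4, 6]
```

After pruning, the reduced network is normally fine-tuned to recover accuracy, which corresponds to the fine-tuning operation mentioned in the abstract.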
Abstract: A patch-based method for detecting vehicle logos using prior knowledge is proposed. By representing the coarse region of the logo with a weight matrix of patch intensity and position, the proposed method is robust to poor and complex environmental conditions. The bounding box of the logo is extracted by a thresholding approach. Experimental results show that 93.58% location accuracy is achieved on 1100 images under various environmental conditions, indicating that the proposed method is effective and suitable for locating vehicle logos in practical applications.
Funding: Supported by the Natural Science Foundation of Guizhou Province (Grant Number: 20161054); the Joint Natural Science Foundation of Guizhou Province (Grant Number: LH20177226); the 2017 Special Project of New Academic Talent Training and Innovation Exploration of Guizhou University (Grant Number: 20175788); and the National Natural Science Foundation of China (Grant No. 12205062).
Abstract: Autonomous driving technology has made many outstanding achievements with deep learning, and vehicle detection and classification has become one of the critical technologies of autonomous driving systems. Vehicle instance segmentation performs instance-level semantic parsing of vehicle information, which is more accurate and reliable than object detection. However, existing instance segmentation algorithms still suffer from poor mask prediction accuracy and low detection speed. Therefore, this paper proposes an advanced real-time instance segmentation model named FIR-YOLACT, which fuses the ICIoU (Improved Complete Intersection over Union) and Res2Net into the YOLACT algorithm. Specifically, the ICIoU function can effectively solve the degradation problem of the original CIoU loss function and improve training convergence speed and detection accuracy. The Res2Net module fused with the ECA (Efficient Channel Attention) Net is added to the model’s backbone network, which improves the multi-scale detection capability and mask prediction accuracy. Furthermore, the Cluster NMS (Non-Maximum Suppression) algorithm is introduced in the model’s bounding box regression to enhance the performance of detecting similarly occluded objects. The experimental results demonstrate the superiority of FIR-YOLACT over the baseline methods and the effectiveness of all components. The processing speed reaches 28 FPS, which meets the demands of real-time vehicle instance segmentation.
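For context on the loss function mentioned above, the sketch below implements the standard CIoU loss, not the paper's ICIoU, whose definition is not given in the abstract. CIoU extends IoU with a centre-distance term and an aspect-ratio term `v`. One commonly noted limitation is that `v` vanishes whenever the predicted and ground-truth boxes share the same aspect ratio; whether this is the degradation the authors address is an assumption here.

```python
import math


def ciou_loss(box, gt):
    """Standard CIoU loss for two boxes given as (x1, y1, x2, y2), x2 > x1, y2 > y1."""
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]

    # Intersection over union.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    iou = inter / (w * h + wg * hg - inter)

    # Squared distance between box centres.
    rho2 = ((box[0] + box[2]) - (gt[0] + gt[2])) ** 2 / 4 \
         + ((box[1] + box[3]) - (gt[1] + gt[3])) ** 2 / 4

    # Squared diagonal of the smallest box enclosing both.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - (iou - rho2 / c2 - alpha * v)


# Same-shape boxes shifted by one unit: the aspect-ratio term is zero.
shifted = ciou_loss((1, 0, 3, 2), (0, 0, 2, 2))
print(round(shifted, 4))  # prints 0.7436
```

In the shifted example only the IoU and centre-distance terms contribute, which shows the aspect-ratio term offering no gradient signal when the two boxes have matching proportions.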
Funding: Supported by the Key-Area Research and Development Program of Guangdong Province (2021B0101420002); the Major Key Project of PCL (PCL2021A09); the National Natural Science Foundation of China (62072187); the Guangdong Major Project of Basic and Applied Basic Research (2019B030302002); the Guangdong Marine Economic Development Special Fund Project (GDNRC[2022]17); and Guangzhou Development Zone Science and Technology (2021GH10, 2020GH10).
Abstract: The rapid development of edge computing has driven an increasing number of deep learning applications, such as pedestrian and vehicle detection, to be deployed at the edge of the network to provide efficient intelligent services to mobile users. However, as accuracy requirements continue to increase, the components of deep learning models for pedestrian and vehicle detection, such as YOLOv4, become more sophisticated, and the computing resources required for model training increase dramatically. This makes it challenging to deploy such models effectively on resource-constrained edge devices while maintaining high accuracy. To address this challenge, a cloud-edge collaboration-based pedestrian and vehicle detection framework is proposed in this paper. It trains models sufficiently using the abundant computing resources in the cloud and then deploys the well-trained models on edge devices, thus reducing the computing resources required for model training on edge devices. Furthermore, to reduce the size of the model deployed on edge devices, an automatic pruning method that combines the convolution layer and the BN layer is proposed to compress the pedestrian and vehicle detection model. Experimental results show that the proposed framework is able to deploy the pruned model on a real edge device, the Jetson TX2, with 6.72 times higher FPS. Meanwhile, channel pruning reduces the volume and the number of parameters of the model to 96.77%, and the amount of computation is reduced to 81.37%.
Funding: This work was supported in part by the Beijing Natural Science Foundation (L191004); the National Natural Science Foundation of China under No. 61720106007 and No. 61872047; the Beijing Nova Program under No. Z201100006820124; the Funds for Creative Research Groups of China under No. 61921003; and the 111 Project (B18008).
Abstract: In this paper, we provide a new approach for intelligent traffic transportation in intelligent vehicular networks, which aims at collecting vehicles’ locations, trajectories, and other key driving parameters to meet the time-critical requirements of autonomous driving. The key to our method is a multi-vehicle tracking framework for the traffic monitoring scenario. The proposed framework is composed of three modules: multi-vehicle detection, multi-vehicle association, and miss-detected vehicle tracking. For the first module, we integrate a self-attention mechanism into a detector based on key point estimation to improve detection. For the second module, we use multi-dimensional information to improve robustness, including vehicle re-identification (Re-ID) features, historical trajectory information, and spatial position information. For the third module, we re-track the occluded vehicles missed by the first detection module. In addition, we use asymmetric convolution and depth-wise separable convolution to reduce the model’s parameters and speed it up. Extensive experimental results show the effectiveness of the proposed multi-vehicle tracking framework.
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-RG23129).
Abstract: Vision-based vehicle detection in adverse weather conditions such as fog, haze, and mist is a challenging research area in the fields of autonomous vehicles, collision avoidance, and Internet of Things (IoT)-enabled edge/fog computing traffic surveillance and monitoring systems. Efficient and cost-effective vehicle detection at high accuracy and speed in foggy weather is essential to avoiding road traffic collisions in real time. To evaluate vision-based vehicle detection performance in foggy weather conditions, the state-of-the-art Vehicle Detection in Adverse Weather Nature (DAWN) and Foggy Driving (FD) datasets are self-annotated using the YOLO LABEL tool and customized to four vehicle detection classes: cars, buses, motorcycles, and trucks. The state-of-the-art single-stage deep learning algorithms YOLO-V5 and YOLO-V8 are considered for the task of vehicle detection. Furthermore, YOLO-V5s is enhanced by introducing the attention modules Convolutional Block Attention Module (CBAM), Normalization-based Attention Module (NAM), and Simple Attention Module (SimAM) after the SPPF module, and YOLO-V5l is enhanced with BiFPN. Their vehicle detection accuracy and running speed are validated on cloud (Google Colab) and edge (local) systems. On the DAWN dataset, the mAP50 score of YOLO-V5n is 72.60%, YOLO-V5s is 75.20%, YOLO-V5m is 73.40%, and YOLO-V5l is 77.30%; and YOLO-V8n is 60.20%, YOLO-V8s is 73.50%, YOLO-V8m is 73.80%, and YOLO-V8l is 72.60%. On the FD dataset, the mAP50 score of YOLO-V5n is 43.90%, YOLO-V5s is 40.10%, YOLO-V5m is 49.70%, and YOLO-V5l is 57.30%; and YOLO-V8n is 41.60%, YOLO-V8s is 46.90%, YOLO-V8m is 42.90%, and YOLO-V8l is 44.80%. On the DAWN dataset, the vehicle detection speed of YOLO-V5n is 59 frames per second (FPS), YOLO-V5s is 47 FPS, YOLO-V5m is 38 FPS, and YOLO-V5l is 30 FPS; and YOLO-V8n is 185 FPS, YOLO-V8s is 109 FPS, YOLO-V8m is 72 FPS, and YOLO-V8l is 63 FPS. On the FD dataset, the vehicle detection speed of YOLO-V5n is 26 FPS, YOLO-V5s is 24 FPS, YOLO-V5m is 22 FPS, and YOLO-V5l is 17 FPS; and YOLO-V8n is 313 FPS, YOLO-V8s is 182 FPS, YOLO-V8m is 99 FPS, and YOLO-V8l is 60 FPS. The YOLO-V5s, YOLO-V5s variants, YOLO-V5l_BiFPN, and YOLO-V8 algorithms are efficient and cost-effective solutions for real-time vision-based vehicle detection in foggy weather.
Abstract: Analyzing a vehicle’s abnormal behavior in surveillance videos is challenging, mainly due to the wide variety of anomaly cases and the complexity of surveillance videos. In this study, a novel intelligent vehicle behavior analysis framework based on a digital twin is proposed. First, vehicles are detected using deep learning, and Kalman filtering and feature matching are used to track them. Subsequently, each tracked vehicle is mapped to a digital-twin virtual scene developed in the Unity game engine, and its behavior is tested against the customized detection conditions set up in the scene. The stored behavior data can be used to reconstruct the scene in Unity for a secondary analysis. The experimental results using real videos from traffic cameras show that the detection rate of the proposed framework is close to that of state-of-the-art abnormal event detection systems. In addition, the implementation and analysis process show the usability, generalization, and effectiveness of the proposed framework.
Funding: Funded by the Ministry of Science and Technology of the People’s Republic of China (Grant Number 2022YFC3800502) and the Chongqing Science and Technology Commission (Grant Numbers cstc2020jscx-dxwtBX0019, CSTB2022TIAD-KPX0118, cstc2020jscx-cylhX0005, and cstc2021jscx-gksbX0058).
Abstract: Vehicle detection plays a crucial role in the field of autonomous driving technology. However, directly applying deep learning-based object detection algorithms to complex road scene images often leads to subpar performance and slow inference speeds in vehicle detection. Achieving a balance between accuracy and detection speed is crucial for real-time object detection in real-world road scenes. This paper proposes a high-precision and fast vehicle detector called the feature-guided bidirectional pyramid network (FBPN). Firstly, to tackle challenges such as vehicle occlusion and significant background interference, the efficient feature filtering module (EFFM) is introduced into the deep network, which amplifies the disparities between the features of the vehicle and the background. Secondly, the proposed global attention localization module (GALM) in the model neck effectively perceives the detailed position information of the target, improving both the accuracy and inference speed of the model. Finally, the detection accuracy of small-scale vehicles is further enhanced through a four-layer feature pyramid structure. Experimental results show that FBPN achieves an average precision of 60.8% and 97.8% on the BDD100K and KITTI datasets, respectively, with inference speeds reaching 344.83 frames/s and 357.14 frames/s. FBPN demonstrates its effectiveness by striking a balance between detection accuracy and inference speed, outperforming several state-of-the-art methods.
Funding: Supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2011-C1090-1121-0010), and by the Brain Korea 21 Project in 2011.
Abstract: This paper proposes a night-time vehicle detection method using a variable Haar-like feature. The specific features of the vehicle ahead cannot be obtained from night-time road images because of light reflection and ambient light, and it is also difficult to define the optimal brightness and color of the rear lamps for varying road conditions. By comparison, the difference between the vehicle region and the road surface is more robust to road illumination. We therefore select vehicle candidates by analysing this difference, and verify the candidates using their brightness and complexity to detect vehicles correctly. The brightness-difference feature is detected using a horizontal Haar-like mask whose size varies with the vehicle size expected at each image location, and regions showing a rapid change are selected as candidates. The proposed method is evaluated on a variety of real road conditions.
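A two-rectangle Haar-like feature of the kind described above is usually evaluated with an integral image, so that the cost does not depend on the mask size. The sketch below assumes the horizontal mask compares the mean brightness of the lower half of a window with that of the upper half; the exact mask layout and the rule for scaling it with vehicle size are not given in the abstract, so the function names and window parameters are illustrative only.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table look-ups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def horizontal_haar(ii, top, left, height, width):
    """Mean brightness of the lower half minus the upper half of the window."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return (lower - upper) / (half * width)


# Toy image: dark band (10) above a brighter band (50).
toy = [[10] * 4, [10] * 4, [50] * 4, [50] * 4]
ii = integral_image(toy)
response = horizontal_haar(ii, top=0, left=0, height=4, width=4)
print(response)  # prints 40.0
```

Making the mask "variable" then amounts to choosing `height` and `width` per image row, for example larger windows lower in the frame where vehicles appear bigger.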
Funding: Project (61172047) supported by the National Natural Science Foundation of China.
Abstract: Detecting moving vehicles in jittering traffic scenes is very difficult because of the complex environment. Neither the color features of pixels alone nor the texture features of the image alone can establish a suitable background model for moving vehicles. To solve this problem, a Gaussian pyramid layered algorithm is proposed that combines the advantages of the Codebook algorithm and the local binary patterns (LBP) algorithm. Firstly, an image pyramid is established to eliminate the noise generated by camera shake. Then, a codebook model and an LBP model are constructed on the low-resolution level and the high-resolution level of the Gaussian pyramid, respectively. Finally, the detection results are obtained through a set of operations based on the spatial relations of pixels. The experimental results show that the algorithm not only eliminates noise effectively but also saves computation time while achieving high detection sensitivity and accuracy.
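To make the texture component above concrete, the following is a minimal sketch of the basic 3x3 local binary pattern operator. It shows only the generic LBP code for a single pixel; the bit ordering is one of several conventions, and the way the paper combines LBP with the codebook model across pyramid levels is not reproduced here.

```python
# Neighbour offsets, clockwise from the top-left pixel; bit 0 is top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): a bit is set where neighbour >= centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 8, 3],
]
code = lbp_code(patch, 1, 1)
print(code)  # prints 42

# A uniform brightness shift leaves the code unchanged.
brighter = [[v + 10 for v in row] for row in patch]
print(lbp_code(brighter, 1, 1) == code)  # prints True
```

Because the code depends only on the ordering of neighbours relative to the centre, it is unaffected by uniform illumination changes, which is the property that makes LBP a useful complement to color-based background models.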