Abstract
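Infrared images carry less color and texture information than visible-light images, which lowers detection accuracy, and night-time infrared pedestrian detection models have many parameters and depend on high-performance GPU resources, which slows detection. To address these problems, a lightweight real-time detection model with multiple detection layers is proposed that embeds a fine-scale layer for pedestrian targets. First, to obtain more precise infrared pedestrian location features, a 64×64 fine-scale detection layer is designed on the original Yolov4-tiny structure and a residual structure is added to deepen the backbone network, so as to fuse the location features of infrared pedestrians. Second, because the aspect ratio of infrared pedestrian targets is relatively fixed, K-means++ clustering is applied to derive prior-box preset parameters suited to infrared pedestrian detection. Finally, to reduce model parameters, the model is made lightweight through batch normalization layer channel pruning, and a knowledge distillation algorithm is used to complete the fine adjustment of TIPRD. Experimental results show that the lightweight real-time infrared pedestrian detection model reaches a detection speed of 88.7 frames/s and an average detection precision of 89.2%, with a model size of 4 MB. Compared with Yolov4-tiny, the average detection precision is improved by 8.6% and the model size is reduced by 19.5 MB; compared with Yolov4, the model size is reduced by 264 MB. Deploying the model on the Jetson Nano mobile development platform verifies its effectiveness in practical engineering applications, which is significant for developing vehicle driver-assistance systems that reduce the night-time traffic accident rate.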
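The prior-box clustering step can be illustrated with a minimal sketch. It assumes the common YOLO convention of clustering (width, height) pairs with the distance 1 - IoU; the abstract does not state the distance metric, and the function names `iou_wh` and `kmeanspp_anchors` as well as the sample box sizes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) boxes into k anchors; distance = 1 - IoU (assumed)."""
    rng = random.Random(seed)
    # K-means++ seeding: the first centre is drawn uniformly, each later
    # centre with probability proportional to the squared distance to the
    # nearest centre chosen so far.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        if sum(d2) == 0:  # every box coincides with an existing centre
            centres.append(rng.choice(boxes))
            continue
        centres.append(rng.choices(boxes, weights=d2, k=1)[0])
    # Lloyd iterations: assign each box to the centre with the highest IoU,
    # then move each centre to the mean width and height of its members.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            clusters[best].append(b)
        updated = []
        for centre, members in zip(centres, clusters):
            if members:
                n = len(members)
                updated.append((sum(m[0] for m in members) / n,
                                sum(m[1] for m in members) / n))
            else:
                updated.append(centre)  # keep the centre of an empty cluster
        if updated == centres:
            break
        centres = updated
    return sorted(centres, key=lambda c: c[0] * c[1])


# Hypothetical pedestrian boxes in pixels: height is roughly 2.5 x width.
boxes = [(10, 25), (11, 27), (12, 30), (9, 22),
         (40, 100), (42, 105), (38, 95), (44, 110)]
anchors = kmeanspp_anchors(boxes, k=2)
```

Because every input box is roughly 2.5 times as tall as it is wide, each returned anchor keeps a similar ratio, which is the property the clustering exploits for pedestrian targets.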
The growing number of cars makes accidents more frequent. Because drivers have poor visibility at night, the accident rate is higher than during the day. Various assisted-driving technologies are therefore widely used to reduce traffic accidents at night, and among them infrared cameras have unique advantages. On the one hand, the visible-light imaging of ordinary cameras is easily affected by interfering light sources, and the low-quality images obtained at night under insufficient light make pedestrian detection extremely difficult. Infrared cameras, which image the thermal radiation and reflection of objects, achieve unobstructed night vision without being affected by interfering light sources. On the other hand, the falling cost of infrared imaging equipment is making its application scenarios increasingly common. For the night driving environment with its high accident rate, a night-time infrared pedestrian detection model is proposed that detects pedestrians on the road in real time. This research has important value and broad market prospects in vehicle assisted driving, providing greater safety for vehicles and pedestrians.
Infrared images carry insufficient color and texture information, so detection accuracy is lower than with visible-light images; in addition, detection models have many parameters and depend on high-performance GPU resources, which makes detection slow. To address these problems, TIPRD, a lightweight real-time detection model with multiple detection layers that embeds a fine-scale layer for pedestrian targets, is proposed. First, to obtain more accurate infrared pedestrian location features, a 64×64 fine-scale detection layer is embedded in the original Yolov4-tiny structure to form a multi-detection-layer structure, and a CSP module is added to deepen the backbone network and fuse the location features of infrared pedestrians. Second, because the aspect ratio of infrared pedestrian targets is relatively fixed, K-means++ clustering is used to derive prior-box preset parameters suited to infrared pedestrian targets, improving the match between the prior boxes and the targets. Finally, to reduce the model parameters, the model is compressed by BN layer channel pruning; the model before pruning serves as the teacher model and the pruned model as the student model, and knowledge distillation is used instead of fine-tuning to complete the fine adjustment of TIPRD. This greatly reduces the model parameters and further lightens the model while preserving detection accuracy.
Experiments based on the Yolov4-tiny network model show that three strategies (embedding the 64×64 fine-scale detection layer, adding a CSP module, and clustering the prior boxes) improve the detection accuracy for infrared pedestrian targets by 8.6%. The added parameters, however, increase the model size by 1.4 MB and lower the FPS by 11.4 frames/s. The resulting model, Yolo-pedestrian, is therefore channel-pruned to make it lightweight. Pruning the BN layer channels reduces detection accuracy to varying degrees, so this paper uses knowledge distillation instead of fine-tuning to recover the accuracy of the pruned model. At a pruning rate of 0.8, the model size is compressed by 20.9 MB and the FPS is increased by 8.4 frames/s. With knowledge distillation, the pruned model maintains its original accuracy. While approximating the accuracy of the Yolov4 network model, the TIPRD model is only 1.5% of the size of the Yolov4 model, far smaller than other detection models of the same type. Compared with the Yolo-pedestrian model before pruning, the FPS is improved by 9.4 frames/s. The TIPRD model also reaches a detection speed of 88.7 frames/s, which meets the requirements of real-time detection.
For assisted-driving systems with limited computing resources, a lightweight, high-accuracy model, TIPRD, is proposed, which offers a reference for applying infrared pedestrian detection in night-time assisted-driving systems deployed on mobile terminals. First, the structure is improved on the basis of the Yolov4-tiny network: the CSP structure is repeated on the original network to strengthen its feature extraction ability, a detection layer of size 64×64 is added, and a feature fusion path between the new detection layer and the backbone network fuses the location features of infrared pedestrians and enriches the semantic information of the feature maps. Because the aspect ratio of pedestrian targets is relatively fixed, the K-means++ clustering algorithm is used to derive prior-box preset parameters suited to infrared pedestrian detection, which improves the match between the prior boxes and the pedestrian targets. Together these changes improve model accuracy by 8.6 percentage points, verifying the effectiveness of the improvements to the Yolov4-tiny algorithm. Second, on the basis of the improved pedestrian detection model, BN layer channel pruning is used for compression and knowledge distillation is applied to complete the fine adjustment of the model, achieving deep compression while maintaining accuracy. The detection speed reaches 88.7 frames/s, 8.4 frames/s higher than before pruning, which meets the requirements of real-time detection. Finally, the TIPRD night-time infrared pedestrian detection model is deployed on the Jetson Nano (2GB) mobile development platform, where the FPS is increased by 1.7 frames/s; this further verifies the feasibility of running the model on a mobile terminal and shows good engineering application value.
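The channel selection behind BN layer pruning can be sketched as follows. This is a minimal illustration that assumes the widely used rule of ranking channels by the absolute value of their BN scale factor against one global threshold; the abstract states only that BN layer channels are pruned at a rate of 0.8, so the selection rule, the function name `bn_channel_masks`, the layer names, and the toy scale factors are all assumptions.

```python
def bn_channel_masks(gammas_per_layer, prune_rate):
    """Return (threshold, masks) for global BN-scale-factor pruning.

    gammas_per_layer maps a layer name to its list of BN scale factors.
    A channel is kept when |gamma| is strictly above the global threshold.
    """
    all_abs = sorted(abs(g) for gs in gammas_per_layer.values() for g in gs)
    n_prune = int(len(all_abs) * prune_rate)
    # The threshold is the largest |gamma| among the channels to be pruned.
    threshold = all_abs[n_prune - 1] if n_prune > 0 else float("-inf")
    masks = {}
    for name, gs in gammas_per_layer.items():
        keep = [abs(g) > threshold for g in gs]
        if not any(keep):
            # Keep the strongest channel so that no layer disappears entirely.
            strongest = max(range(len(gs)), key=lambda i: abs(gs[i]))
            keep[strongest] = True
        masks[name] = keep
    return threshold, masks


# Toy scale factors for two hypothetical layers (10 channels in total).
gammas = {
    "conv1": [0.9, 0.01, 0.02, 0.5, 0.03],
    "conv2": [0.04, 0.05, 0.8, 0.06, 0.07],
}
threshold, masks = bn_channel_masks(gammas, prune_rate=0.8)
kept = sum(sum(m) for m in masks.values())  # 2 of 10 channels remain
```

In the teacher-student setup the abstract describes, the unpruned network would then supervise the network rebuilt from the kept channels; that distillation step is not shown here.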
Authors
张印辉
张朋程
何自芬
王森
ZHANG Yinhui;ZHANG Pengcheng;HE Zifen;WANG Sen(Mechanical and Electrical Engineering,Kunming University of Science and Technology,Kunming 650500,China)
Source
《光子学报》
EI
CAS
CSCD
PKU Core Journals
2022, No. 9, pp. 258-268 (11 pages)
Acta Photonica Sinica
Funding
National Natural Science Foundation of China (Nos. 62061022, 62171206, 52065035, 61761024).
Keywords
Infrared detection
Deep learning
Multiple detection layers
Model pruning
Knowledge distillation