Funding: Supported by a grant from the R&D Program "Development of Rail-Specific Digital Resource Technology Based on an AI-Enabled Rail Support Platform" (grant number PK2401C1) of the Korea Railroad Research Institute.
Abstract: Fire detection has been an important problem in computer vision for over half a century. The development of early fire detection strategies is pivotal to realizing safe, smart, and habitable cities in the future. However, the development of optimal fire and smoke detection models is hindered by limitations such as the scarcity of publicly available datasets, lack of diversity, and class imbalance. In this work, we explore possible ways to overcome these dataset-related challenges. We study the impact of a class-balanced dataset on the fire detection capability of state-of-the-art (SOTA) vision-based models and propose the use of generative models for data augmentation as a future work direction. First, a comparative analysis of two prominent object detection architectures, You Only Look Once version 7 (YOLOv7) and YOLOv8, is carried out on a balanced dataset, with both models evaluated on metrics including precision, recall, and mean Average Precision (mAP). The results are compared with those of other recent fire detection models, highlighting the superior performance and efficiency of the proposed YOLOv8 architecture trained on our balanced dataset. Next, a fractal dimension analysis gives deeper insight into the repetition of patterns in fire, and the effectiveness of the results is demonstrated by a windowing-based inference approach. The proposed Slicing-Aided Hyper Inference (SAHI) improves the fire and smoke detection capability of YOLOv8 for real-life applications, with significantly improved mAP at a strict confidence threshold. YOLOv8 with SAHI inference gives an mAP50-95 improvement of more than 25% compared to the base YOLOv8 model. The study also provides insights into future work by exploring the potential of generative models such as the deep convolutional generative adversarial network (DCGAN) and diffusion models such as Stable Diffusion for data augmentation.
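The windowing-based inference mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation and does not use the SAHI library; it only shows the general idea of cutting an image into overlapping windows and shifting per-window boxes back into full-image coordinates. The function names (`slice_windows`, `to_full_image`) and the default slice size and overlap ratio are assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def _starts(length: int, size: int, stride: int) -> List[int]:
    """Start offsets so that windows of `size` cover the range [0, length)."""
    if length <= size:
        return [0]
    starts = list(range(0, length - size, stride))
    starts.append(length - size)  # last window sits flush with the border
    return starts


def slice_windows(width: int, height: int,
                  slice_size: int = 640, overlap: float = 0.2) -> List[Box]:
    """Return overlapping square windows covering a width x height image.

    `slice_size` and `overlap` are illustrative defaults, not values
    taken from the paper.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(slice_size * (1.0 - overlap)))
    return [
        (x, y, min(x + slice_size, width), min(y + slice_size, height))
        for y in _starts(height, slice_size, stride)
        for x in _starts(width, slice_size, stride)
    ]


def to_full_image(box: Box, window: Box) -> Box:
    """Shift a box predicted in window coordinates into image coordinates."""
    x0, y0 = window[0], window[1]
    return (box[0] + x0, box[1] + y0, box[2] + x0, box[3] + y0)


if __name__ == "__main__":
    windows = slice_windows(1000, 700, slice_size=400, overlap=0.25)
    print(len(windows), windows[0], windows[-1])
```

A detector would be run on each window separately, so small flames or smoke plumes occupy a larger share of the network input than they would in the downscaled full frame. Predictions from overlapping windows still need to be de-duplicated afterwards, for example with non-maximum suppression.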
Funding: Supported in part by the National Natural Science Foundation of China (No. 62471034) and the Hebei Natural Science Foundation (No. F2023105001).
Abstract: In the field of remote sensing, the rapid and accurate acquisition of the category and location of airplanes has emerged as a prominent research topic. However, blurred remote sensing imagery and complex environmental interference degrade airplane detection. In addition, the inconsistent size of remote sensing images and the low accuracy of small target detection are crucial challenges that need to be addressed. To tackle these issues, we propose a novel network, SDaDCS (SAHI-data augmentation-dilation-channel and spatial attention), which builds on the YOLOX model and combines the slicing aided hyper inference (SAHI) framework, a new data augmentation technique, and a dilation-channel and spatial (DCS) attention mechanism. First, we create a remote sensing dataset for airplane targets and introduce a new data augmentation technique based on Rotate-Mixup and mixed data augmentation to enhance data diversity. The DCS attention mechanism, which comprises a dilated convolution block, channel attention, and spatial attention, is designed to strengthen the feature extraction and discrimination ability of the network. To address the difficulty of detecting small targets, we integrate the YOLOX model with the SAHI framework. Experimental results show that, compared to the original YOLOX model, the proposed SDaDCS remote sensing target detection algorithm improves overall accuracy by 13.6%, which validates the effectiveness of the proposed algorithm.
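When a detector is combined with sliced inference, the same airplane can be detected in several overlapping slices, so the per-slice predictions have to be merged once they are mapped back to full-image coordinates. The abstract does not say which merging strategy SDaDCS uses; the sketch below uses plain greedy non-maximum suppression as a stand-in, and the names `iou` and `greedy_nms` as well as the 0.5 threshold are assumptions for illustration only.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max, score), already in full-image coordinates
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets: List[Detection],
               iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-scoring box of every group of overlapping boxes.

    The threshold is an illustrative default, not a value from the paper.
    """
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        # Keep a box only if it does not overlap a stronger kept box.
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    detections = [
        (100, 100, 200, 200, 0.9),  # airplane seen in slice A
        (105, 105, 205, 205, 0.8),  # same airplane seen in slice B
        (400, 400, 450, 450, 0.7),  # a different airplane
    ]
    for det in greedy_nms(detections):
        print(det)
```

In this toy input the two heavily overlapping boxes collapse to the higher-scoring one, while the distant box is kept unchanged.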