Forest fires pose a serious threat to ecological balance, air quality, and the safety of both humans and wildlife. This paper presents an improved model based on You Only Look Once version 5 (YOLOv5), named YOLO Lightweight Fire Detector (YOLO-LFD), to address the limitations of traditional sensor-based fire detection methods in terms of real-time performance and accuracy. The proposed model is designed to enhance inference speed while maintaining high detection accuracy on resource-constrained devices such as drones and embedded systems. Firstly, we introduce Depthwise Separable Convolutions (DSConv) to reduce the complexity of the feature extraction network. Secondly, we design and implement the Lightweight Faster Implementation of Cross Stage Partial (CSP) Bottleneck with 2 Convolutions (C2f-Light) and the CSP Structure with 3 Compact Inverted Blocks (C3CIB) modules to replace the traditional C3 modules. This optimization enhances deep feature extraction and semantic information processing, thereby significantly increasing inference speed. To enhance the detection capability for small fires, the model employs a Normalized Wasserstein Distance (NWD) loss function, which effectively reduces the missed detection rate and improves the accuracy of detecting small fire sources. Experimental results demonstrate that, compared to the baseline YOLOv5s model, the YOLO-LFD model not only increases inference speed by 19.3% but also significantly improves the detection accuracy for small fire targets, with only a 1.6% reduction in overall mean average precision (mAP)@0.5. Through these innovative improvements to YOLOv5s, the YOLO-LFD model achieves a balance between speed and accuracy, making it particularly suitable for real-time detection tasks on mobile and embedded devices.
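The abstract does not spell out how the NWD loss is computed. As a rough sketch, assuming the commonly used formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and a dataset-dependent constant `c` normalizes the distance, the similarity and loss could look like the following. The function names and the value `c=12.8` are illustrative assumptions, not taken from the YOLO-LFD code.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian N((cx, cy), diag(w^2/4, h^2/4)).
    `c` is an assumed, dataset-dependent normalizing constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss that shrinks toward 0 as the predicted box approaches the target."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, which is zero for any pair of non-overlapping boxes, this similarity changes smoothly with the distance between boxes, which is why it is attractive for very small targets.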
Visible and infrared (RGB-IR) fusion object detection plays an important role in security, disaster relief, etc. In recent years, deep-learning-based RGB-IR fusion detection methods have been developing rapidly, but still struggle to deal with the complex and changing scenarios captured by drones, mainly due to two reasons: (A) RGB-IR fusion detectors are susceptible to inferior inputs that degrade performance and stability. (B) RGB-IR fusion detectors are susceptible to redundant features that reduce accuracy and efficiency. In this paper, an innovative RGB-IR fusion detection framework based on global-local feature optimization, named GLFDet, is proposed to improve the detection performance and efficiency of drone-captured objects. The key components of GLFDet include a Global Feature Optimization (GFO) module, a Local Feature Optimization (LFO) module, and a Channel Separation Fusion (CSF) module. Specifically, GFO calculates the information content of the input image from the frequency domain and optimizes the features holistically. Then, LFO dynamically selects high-value features and filters out low-value features before fusion, which significantly improves the efficiency of fusion. Finally, CSF fuses the RGB and IR features across the corresponding channels, which avoids the rearrangement of the channel relationships and enhances the model stability. Extensive experimental results show that the proposed method achieves the best performance on three popular RGB-IR datasets: Drone Vehicle, VEDAI, and LLVIP. In addition, GLFDet is more lightweight than other comparable models, making it more appealing to edge devices such as drones. The code is available at https://github.com/lao chen330/GLFDet.
The original monitoring data from aero-engines possess characteristics such as high dimensionality, strong noise, and imbalance, which present substantial challenges to traditional anomaly detection methods. In response, this paper proposes a method based on Fuzzy Fusion of variables and Discriminant mapping of features for Clustering (FFD-Clustering) to detect anomalies in original monitoring data from the Aircraft Communication Addressing and Reporting System (ACARS). Firstly, associated variables are fuzzily grouped to extract the underlying distribution characteristics and trends from the data. Secondly, a multi-layer contrastive denoising-based feature Fusion Encoding Network (FEN) is designed for each variable group, which can construct representative features for each variable group by eliminating strong noise and complex interrelations between variables. Thirdly, a feature Discriminative Mapping Network (DMN) based on reconstruction difference re-clustering is designed, which can distinguish dissimilar feature vectors when mapping representative features to a unified feature space. Finally, K-means clustering is used to detect the abnormal feature vectors in the unified feature space. Additionally, the algorithm is capable of reconstructing identified abnormal vectors, thereby locating the abnormal variable groups. The performance of this algorithm was tested on two public datasets and real original monitoring data from four aero-engines' ACARS, demonstrating its superiority and application potential in aero-engine anomaly detection.
Fire detection has been an important problem in computer vision for over half a century. The development of early fire detection strategies is pivotal to the realization of safe, smart, and habitable cities in the future. However, the development of optimal fire and smoke detection models is hindered by the limitations of publicly available datasets, such as lack of diversity and class imbalance. In this work, we explore possible ways to overcome these dataset-related challenges. We study the impact of a class-balanced dataset on the fire detection capability of state-of-the-art (SOTA) vision-based models and propose the use of generative models for data augmentation as a future work direction. First, a comparative analysis of two prominent object detection architectures, You Only Look Once version 7 (YOLOv7) and YOLOv8, has been carried out using a balanced dataset, where both models have been evaluated across various metrics including precision, recall, and mean Average Precision (mAP). The results are compared to other recent fire detection models, highlighting the superior performance and efficiency of the proposed YOLOv8 architecture as trained on our balanced dataset. Next, a fractal dimension analysis gives a deeper insight into the repetition of patterns in fire, and the effectiveness of the results has been demonstrated by a windowing-based inference approach. The proposed Slicing-Aided Hyper Inference (SAHI) improves the fire and smoke detection capability of YOLOv8 for real-life applications, with a significantly improved mAP performance over a strict confidence threshold. YOLOv8 with SAHI inference gives a mAP@50-95 improvement of more than 25% compared to the base YOLOv8 model. The study also provides insights into future work directions by exploring the potential of generative models such as the deep convolutional generative adversarial network (DCGAN) and diffusion models such as Stable Diffusion for data augmentation.
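A windowing-based (sliced) inference pass can be sketched as follows. This is a minimal illustration of the slicing idea only, not the SAHI library's API or the authors' code; `slice_windows`, `shift_box`, the 512-pixel slice size, and the 0.2 overlap ratio are assumptions chosen for the example.

```python
def slice_windows(img_w, img_h, slice_w=512, slice_h=512, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that tile the image with overlap."""
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    windows = []
    y = 0
    while True:
        # Clamp the last row so the window stays inside the image.
        y0 = min(y, max(img_h - slice_h, 0))
        x = 0
        while True:
            # Clamp the last column in the same way.
            x0 = min(x, max(img_w - slice_w, 0))
            windows.append(
                (x0, y0, min(x0 + slice_w, img_w), min(y0 + slice_h, img_h))
            )
            if x0 + slice_w >= img_w:
                break
            x += step_x
        if y0 + slice_h >= img_h:
            break
        y += step_y
    return windows


def shift_box(box, window):
    """Map a box predicted inside a slice back to full-image coordinates."""
    x0, y0, _, _ = window
    bx0, by0, bx1, by1 = box
    return (bx0 + x0, by0 + y0, bx1 + x0, by1 + y0)
```

In a full pipeline the detector would run on each window, the resulting boxes would be shifted with `shift_box`, and overlapping detections from neighboring windows would then be merged, for example with non-maximum suppression.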
Glugea plecoglossi, a microsporidian of the genus Glugea, causes a notorious disease in Plecoglossus altivelis in East Asia, resulting in heavy economic losses. At present, the main diagnostic methods for this disease include microscopy examination, quantitative real-time PCR, and loop-mediated isothermal amplification-lateral flow dipstick (LAMP-LFD). In this study, a recombinase polymerase amplification-lateral flow dipstick (RPA-LFD) method targeting the beta-tubulin gene was developed to detect G. plecoglossi. Three sets of primers and probes were designed and screened, after which the initial reaction system was established. The RPA-LFD method for G. plecoglossi could complete nucleic acid amplification at 39 °C in 10 min, after which the amplification product was dropped on the LFD strip, and the results could then be observed within 5 min. A specificity assay revealed no cross-reactivity with protozoa other than G. plecoglossi. A sensitivity assay revealed that the detection limit was 9.38×10^(-6) ng/μL, which was more sensitive than that of conventional PCR. Compared with conventional detection methods, the novel RPA-LFD method has the advantages of simple operation, short operation time, high sensitivity, and high specificity for G. plecoglossi detection, indicating its potential use in rapid field detection of G. plecoglossi.
Single-signal detection in orthogonal frequency-division multiplexing (OFDM) systems presents a challenge due to the time-varying nature of wireless channels. Conventional methods have limitations, particularly in multi-input multi-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems; this paper addresses the problem by exploring advanced deep learning approaches for combined channel estimation and signal detection. Specifically, we propose two hybrid architectures that integrate a convolutional neural network (CNN) with a recurrent neural network (RNN), namely CNN-long short-term memory (CNN-LSTM) and CNN-bidirectional-LSTM (CNN-Bi-LSTM), designed to enhance signal detection performance in MIMO-OFDM systems. The proposed CNN-LSTM and CNN-Bi-LSTM architectures are evaluated and compared with both traditional methods and standalone deep learning models. Training was conducted offline using a dataset generated from a 2×2 MIMO-OFDM system with a 3GPP 5G channel model. The trained models are evaluated using accuracy, loss, and computational time, and further analysis of signal detection performance is based on bit error rate, optimal cyclic prefix length, and optimal pilot subcarrier configurations under various noise conditions and channel uncertainty scenarios. The results demonstrate that the proposed CNN-based architectures, particularly the CNN-Bi-LSTM trained model, significantly reduce the need for pilot and cyclic prefix symbols while delivering superior performance, especially at SNRs. Both hybrid deep learning architectures (CNN-LSTM and CNN-Bi-LSTM) demonstrated greater robustness and adaptability under dynamic channel conditions, outperforming conventional methods and benchmark deep learning architectures. These results indicate the effectiveness of CNN-based feature extractors in learning generalized spatial patterns, positioning these hybrid models as highly efficient and reliable solutions for MIMO-OFDM signal detection in 5G and future wireless communication systems.
Wheat fungal infections pose a danger to grain quality and crop productivity. Thus, prompt and precise diagnosis is essential for efficient crop management. This study used the publicly available WFD2020 image dataset to investigate how deep learning models could be used to detect powdery mildew, leaf rust, and yellow rust, three common fungal diseases in Punjab, India. We tuned several hyperparameters to test TensorFlow-based models, such as SSD and Faster R-CNN with ResNet50, ResNet101, and ResNet152 as backbones. Among these models, Faster R-CNN with ResNet50 achieved a mean average precision (mAP) of 0.68. We then used the PyTorch-based YOLOv8 model, which significantly outperformed the previous methods with an mAP of 0.99. YOLOv8 proved to be a beneficial approach for the early-stage diagnosis of fungal diseases, especially when it comes to precisely identifying diseased areas and various object sizes in images. Problems such as class imbalance and possible model overfitting persisted despite these developments. The results show that YOLOv8 is a good automated disease diagnosis tool that helps farmers quickly find and treat fungal infections using image-based systems.
Small object detection has been a focus of attention since the emergence of deep learning-based object detection. Although classical object detection frameworks have made significant contributions to the development of object detection, there are still many issues to be resolved in detecting small objects due to the inherent complexity and diversity of real-world visual scenes. In particular, the YOLO (You Only Look Once) series of detection models, renowned for their real-time performance, have undergone numerous adaptations aimed at improving the detection of small targets. In this survey, we summarize the state-of-the-art YOLO-based small object detection methods. This review presents a systematic categorization of YOLO-based approaches for small-object detection, organized into four methodological avenues, namely attention-based feature enhancement, detection-head optimization, loss function, and multi-scale feature fusion strategies. We then examine the principal challenges addressed by each category. Finally, we analyze the performance of these methods on public benchmarks and, by comparing current approaches, identify limitations and outline directions for future research.
The rapid proliferation of Internet of Things (IoT) devices in critical healthcare infrastructure has introduced significant security and privacy challenges that demand innovative, distributed architectural solutions. This paper proposes FE-ACS (Fog-Edge Adaptive Cybersecurity System), a novel hierarchical security framework that intelligently distributes AI-powered anomaly detection algorithms across edge, fog, and cloud layers to optimize security efficacy, latency, and privacy. Our comprehensive evaluation demonstrates that FE-ACS achieves superior detection performance with an AUC-ROC of 0.985 and an F1-score of 0.923, while maintaining significantly lower end-to-end latency (18.7 ms) compared to cloud-centric (152.3 ms) and fog-only (34.5 ms) architectures. The system exhibits exceptional scalability, supporting up to 38,000 devices with logarithmic performance degradation, a 67× improvement over conventional cloud-based approaches. By incorporating differential privacy mechanisms with balanced privacy-utility tradeoffs (ε=1.0–1.5), FE-ACS maintains 90%–93% detection accuracy while ensuring strong privacy guarantees for sensitive healthcare data. Computational efficiency analysis reveals that our architecture achieves a detection rate of 12,400 events per second with only 12.3 mJ energy consumption per inference. In healthcare risk assessment, FE-ACS demonstrates robust operational viability with low patient safety risk (14.7%) and high system reliability (94.0%). The proposed framework represents a significant advancement in distributed security architectures, offering a scalable, privacy-preserving, and real-time solution for protecting healthcare IoT ecosystems against evolving cyber threats.
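The abstract reports a privacy budget of ε=1.0–1.5 but does not say which differential privacy mechanism FE-ACS uses. Purely as an illustration of how ε controls the noise level, the sketch below uses the standard Laplace mechanism, in which noise of scale sensitivity/ε is added to a numeric value; the function names and the sensitivity value are assumptions made for this example.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    # Two i.i.d. exponentials with mean `scale`; their difference is Laplace.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value, sensitivity, epsilon, rng=None):
    """Return `value` perturbed with Laplace noise calibrated to epsilon."""
    if rng is None:
        rng = random.Random()
    return value + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


# Assumed example: a sensor reading with sensitivity 1.0 (hypothetical).
noisy_reading = privatize(72.0, sensitivity=1.0, epsilon=1.0)
```

A smaller ε gives a larger noise scale and therefore stronger privacy at the cost of utility, which is the tradeoff the reported ε range balances.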
Breast cancer screening programs rely heavily on mammography for early detection; however, diagnostic performance is strongly affected by inter-reader variability, breast density, and the limitations of conventional computer-aided detection systems. Recent advances in deep learning have enabled more robust and scalable solutions for large-scale screening, yet a systematic comparison of modern object detection architectures on nationally representative datasets remains limited. This study presents a comprehensive quantitative comparison of prominent deep learning–based object detection architectures for Artificial Intelligence-assisted mammography analysis using the MammosighTR dataset, developed within the Turkish National Breast Cancer Screening Program. The dataset comprises 12,740 patient cases collected between 2016 and 2022, annotated with BI-RADS categories, breast density levels, and lesion localization labels. A total of 31 models were evaluated, including One-Stage, Two-Stage, and Transformer-based architectures, under a unified experimental framework at both patient and breast levels. The results demonstrate that Two-Stage architectures consistently outperform One-Stage models, achieving approximately 2%–4% higher Macro F1-Scores and more balanced precision–recall trade-offs, with Double-Head R-CNN and Dynamic R-CNN yielding the highest overall performance (Macro F1 ≈ 0.84–0.86). This advantage is primarily attributed to the region proposal mechanism and improved class balance inherent to Two-Stage designs. One-Stage detectors exhibited higher sensitivity and faster inference, reaching Recall values above 0.88, but experienced minor reductions in Precision and overall accuracy (≈ 1%–2%) compared with Two-Stage models. Among Transformer-based architectures, Deformable DEtection TRansformer demonstrated strong robustness and consistency across datasets, achieving Macro F1-Scores comparable to CNN-based detectors (≈ 0.83–0.85) while exhibiting minimal performance degradation under distributional shifts. Breast density–based analysis revealed increased misclassification rates in medium-density categories (types B and C), whereas Transformer-based architectures maintained more stable performance in high-density type D tissue. These findings quantitatively confirm that both architectural design and tissue characteristics play a decisive role in diagnostic accuracy. Overall, the study provides a reproducible benchmark and highlights the potential of hybrid approaches that combine the accuracy of Two-Stage detectors with the contextual modeling capability of Transformer architectures for clinically reliable breast cancer screening systems.
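The Macro F1-Score used to rank the architectures averages the per-class F1 values with equal weight, so rare classes count as much as common ones. A minimal sketch of that computation follows; the class names and counts are made-up placeholders, not values from the MammosighTR study.

```python
def f1_score(tp, fp, fn):
    """F1 for one class from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    # Define F1 as 0 when the class has no predictions and no ground truth.
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 scores.

    `per_class_counts` maps a class name to a (tp, fp, fn) tuple.
    """
    scores = [f1_score(*counts) for counts in per_class_counts.values()]
    return sum(scores) / len(scores)


# Hypothetical counts for two classes, for illustration only.
example_counts = {"class_a": (8, 2, 2), "class_b": (5, 5, 0)}
example_macro_f1 = macro_f1(example_counts)
```

Because each class contributes equally, a model that does well only on the majority class scores lower on Macro F1 than on plain accuracy.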
Traffic sign detection is an important part of autonomous driving, and its recognition accuracy and speed are directly related to road traffic safety. Although convolutional neural networks (CNNs) have made certain breakthroughs in this field, in the face of complex scenes, such as image blur and target occlusion, traffic sign detection continues to exhibit limited accuracy, accompanied by false positives and missed detections. To address the above problems, a traffic sign detection algorithm, You Only Look Once-based Skip Dynamic Way (YOLO-SDW), based on You Only Look Once version 8 small (YOLOv8s), is proposed. Firstly, a Skip Connection Reconstruction (SCR) module is introduced to efficiently integrate fine-grained feature information and enhance the detection accuracy of the algorithm in complex scenes. Secondly, a C2f module based on Dynamic Snake Convolution (C2f-DySnake) is proposed to dynamically adjust the receptive field information, improve the algorithm’s feature extraction ability for blurred or occluded targets, and reduce the occurrence of false detections and missed detections. Finally, the Wise Powerful IoU v2 (WPIoUv2) loss function is proposed to further improve the detection accuracy of the algorithm. Experimental results show that the average precision mAP@0.5 of YOLO-SDW on the TT100K dataset is 89.2%, and mAP@0.5:0.95 is 68.5%, which is 4% and 3.3% higher than the YOLOv8s baseline, respectively. YOLO-SDW ensures real-time performance while having higher accuracy.
The continuous decrease in global fishery resources has increased the importance of precise and efficient underwater fish monitoring technology. This study proposes an improved underwater target detection framework based on YOLOv8, with the aim of enhancing detection accuracy and the ability to recognize multi-scale targets in blurry and complex underwater environments. A streamlined Vision Transformer (ViT) model is used as the feature extraction backbone, which retains global self-attention feature extraction and accelerates training. In addition, a detection head named Dynamic Head (DyHead) is introduced, which enhances the efficiency of processing various target sizes through multi-scale feature fusion and adaptive attention modules. Furthermore, a dynamic loss function adjustment method called SlideLoss is employed. This method utilizes sliding window technology to adaptively adjust parameters, which optimizes the detection of challenging targets. The experimental results on the RUOD dataset show that the proposed improved model not only significantly enhances the accuracy of target detection but also increases the efficiency of target detection.
Simultaneous identification and quantitative detection of phenylenediamine (PDA) isomers, including o-phenylenediamine (OPD), m-phenylenediamine (MPD), and p-phenylenediamine (PPD), are essential for environmental risk assessment and human health protection. However, current visual detection methods can only distinguish individual PDA isomers and fail to identify binary or ternary mixtures. Herein, a highly active and ultrastable peroxidase (POD)-like CoPt graphitic nanozyme was used for naked-eye identification and colorimetric/fluorescent (FL) dual-mode quantitative detection of PDA isomers. The CoPt@G nanozyme effectively catalyzed the oxidation of OPD, MPD, PPD, OPD+PPD, OPD+MPD, MPD+PPD, and OPD+MPD+PPD into yellow, colorless, lilac, yellow, yellow, wine red, and reddish-brown products, respectively, in the presence of H₂O₂. Thus, MPD, PPD, MPD+PPD, and OPD+MPD+PPD were easily identified based on the distinct colors of their oxidation products, and OPD, OPD+PPD, and OPD+MPD could be further identified by the further addition of MPD or PPD. Subsequently, CoPt@G/H₂O₂-, 3,3′,5,5′-tetramethylbenzidine (TMB)/CoPt@G/H₂O₂-, and MPD/CoPt@G/H₂O₂-enabled colorimetric/FL dual-mode platforms for the quantitative detection of OPD, MPD, and PPD were proposed. The experimental results illustrated that the constructed sensing platforms exhibit satisfactory sensitivity, comparable to that reported in previous studies. Finally, the evaluation of PDAs in water samples was realized, yielding satisfactory recoveries. This work expands the application prospects of nanozymes in assessing environmental risks and protecting human safety.
Funding (YOLO-LFD forest fire detection paper): Supported by the National Natural Science Foundation of China (Grant Nos. 62101275 and 62101274).
Funding (GLFDet RGB-IR fusion detection paper): Supported by the National Natural Science Foundation of China (No. 62276204); the Fundamental Research Funds for the Central Universities, China (No. YJSJ24011); the Natural Science Basic Research Program of Shaanxi, China (Nos. 2022JM-340 and 2023-JC-QN-0710); and the China Postdoctoral Science Foundation (Nos. 2020T130494 and 2018M633470).
Funding (FFD-Clustering aero-engine anomaly detection paper): Co-supported by the National Science and Technology Major Project, China (No. J2019-I-0001-0001) and the National Natural Science Foundation of China (No. 52105545).
Funding (YOLOv7/YOLOv8 fire and smoke detection paper): Supported by a grant from the R&D Program "Development of Rail-Specific Digital Resource Technology Based on an AI-Enabled Rail Support Platform" (grant number PK2401C1) of the Korea Railroad Research Institute.
Funding: Supported by the Visiting and Training Foundation of Teachers in Ordinary Undergraduate Universities of Shandong Province; the Qingdao Agricultural University Doctoral Start-Up Fund (No. 6631122030); the Advanced Talents Foundation of QAU (No. 6651118016); the Fish Innovation Team of Shandong Agriculture Research System (No. SDAIT12-06); the Shandong Engineering Research Center for Prevention and Control of Aquatic Animal Disease; the "First Class Fishery Discipline" Program [(2020)3] of Shandong Province; and the Key R&D Program (Soft Science Project) of Shandong Province, China (No. 2023 RKY 06004).
Abstract: Glugea plecoglossi, a microsporidian of the genus Glugea, can cause a notorious disease in Plecoglossus altivelis in East Asia, resulting in heavy economic losses. At present, the main diagnostic methods for this disease include microscopic examination, quantitative real-time PCR, and loop-mediated isothermal amplification-lateral flow dipstick (LAMP-LFD). In this study, a recombinase polymerase amplification-lateral flow dipstick (RPA-LFD) method targeting the beta-tubulin gene was developed to detect G. plecoglossi. Three sets of primers and probes were designed and screened, after which the initial reaction system was established. The RPA-LFD method for G. plecoglossi completed nucleic acid amplification at 39 °C in 10 min, after which the amplification product was dropped onto the LFD strip and the results could be observed within 5 min. A specificity assay revealed no cross-reactivity with other protozoa. A sensitivity assay revealed a detection limit of 9.38×10^(-6) ng/μL, which is more sensitive than conventional PCR. Compared with conventional detection methods, the novel RPA-LFD method offers simple operation, short operation time, high sensitivity, and high specificity for G. plecoglossi detection, indicating its potential for rapid field detection of G. plecoglossi.
Funding: Supported by the IITP (Institute of Information & Communications Technology Planning & Evaluation)-ICAN (ICT Challenge and Advanced Network of HRD) grant funded by the Korea government (Ministry of Science and ICT) (IITP-2025-RS-2022-00156299).
Abstract: Single-signal detection in orthogonal frequency-division multiplexing (OFDM) systems presents a challenge due to the time-varying nature of wireless channels. Conventional methods have limitations, particularly in multi-input multi-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems; this paper addresses the problem by exploring advanced deep learning approaches for combined channel estimation and signal detection. Specifically, we propose two hybrid architectures that integrate a convolutional neural network (CNN) with a recurrent neural network (RNN), namely CNN-long short-term memory (CNN-LSTM) and CNN-bidirectional-LSTM (CNN-Bi-LSTM), designed to enhance signal detection performance in MIMO-OFDM systems. The proposed CNN-LSTM and CNN-Bi-LSTM architectures are evaluated and compared with both traditional methods and standalone deep learning models. Training was conducted offline using a dataset generated from a 2×2 MIMO-OFDM system with a 3GPP 5G channel model. The trained models are evaluated using accuracy, loss, and computational time, and further analysis of signal detection performance is based on bit error rate, optimal cyclic prefix length, and optimal pilot subcarrier configurations under various noise conditions and channel uncertainty scenarios. The results demonstrate that the proposed CNN-based architectures, particularly the CNN-Bi-LSTM trained model, significantly reduce the need for pilot and cyclic prefix symbols while delivering superior performance, especially at SNRs. Both hybrid deep learning architectures (CNN-LSTM and CNN-Bi-LSTM) demonstrated greater robustness and adaptability under dynamic channel conditions, outperforming conventional methods and benchmark deep learning architectures. These results indicate the effectiveness of CNN-based feature extractors in learning generalized spatial patterns, positioning these hybrid models as efficient and reliable solutions for MIMO-OFDM signal detection in 5G and future wireless communication systems.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R432), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Wheat fungal infections threaten grain quality and crop productivity, so prompt and precise diagnosis is essential for efficient crop management. This study used the publicly available WFD2020 image dataset to investigate how deep learning models can detect powdery mildew, leaf rust, and yellow rust, three common fungal diseases in Punjab, India. We tuned several hyperparameters to test TensorFlow-based models, such as SSD and Faster R-CNN with ResNet50, ResNet101, and ResNet152 as backbones. Among these models, Faster R-CNN with ResNet50 achieved a mean average precision (mAP) of 0.68. We then used the PyTorch-based YOLOv8 model, which significantly outperformed the previous methods with a mAP of 0.99. YOLOv8 proved to be a beneficial approach for the early-stage diagnosis of fungal diseases, especially for precisely identifying diseased areas and objects of various sizes in images. Problems such as class imbalance and possible model overfitting persisted despite these developments. The results show that YOLOv8 is a good automated disease diagnosis tool that helps farmers quickly find and treat fungal infections using image-based systems.
Funding: Supported in part by the Chongqing Research Program of Basic Research and Frontier Technology under Grant CSTB2025NSCQ-GPX1309.
Abstract: Small object detection has been a focus of attention since the emergence of deep learning-based object detection. Although classical object detection frameworks have made significant contributions to the field, many issues remain in detecting small objects due to the inherent complexity and diversity of real-world visual scenes. In particular, the YOLO (You Only Look Once) series of detection models, renowned for their real-time performance, have undergone numerous adaptations aimed at improving the detection of small targets. In this survey, we summarize the state-of-the-art YOLO-based small object detection methods. The review presents a systematic categorization of YOLO-based approaches for small-object detection, organized into four methodological avenues: attention-based feature enhancement, detection-head optimization, loss function design, and multi-scale feature fusion strategies. We then examine the principal challenges addressed by each category. Finally, we analyze the performance of these methods on public benchmarks and, by comparing current approaches, identify limitations and outline directions for future research.
Funding: Supported by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2025-02-01276).
Abstract: The rapid proliferation of Internet of Things (IoT) devices in critical healthcare infrastructure has introduced significant security and privacy challenges that demand innovative, distributed architectural solutions. This paper proposes FE-ACS (Fog-Edge Adaptive Cybersecurity System), a novel hierarchical security framework that intelligently distributes AI-powered anomaly detection algorithms across edge, fog, and cloud layers to optimize security efficacy, latency, and privacy. Our comprehensive evaluation demonstrates that FE-ACS achieves superior detection performance with an AUC-ROC of 0.985 and an F1-score of 0.923, while maintaining significantly lower end-to-end latency (18.7 ms) compared to cloud-centric (152.3 ms) and fog-only (34.5 ms) architectures. The system exhibits exceptional scalability, supporting up to 38,000 devices with logarithmic performance degradation, a 67× improvement over conventional cloud-based approaches. By incorporating differential privacy mechanisms with balanced privacy-utility tradeoffs (ε = 1.0–1.5), FE-ACS maintains 90%–93% detection accuracy while ensuring strong privacy guarantees for sensitive healthcare data. Computational efficiency analysis reveals that our architecture achieves a detection rate of 12,400 events per second with only 12.3 mJ energy consumption per inference. In healthcare risk assessment, FE-ACS demonstrates robust operational viability with low patient safety risk (14.7%) and high system reliability (94.0%). The proposed framework represents a significant advancement in distributed security architectures, offering a scalable, privacy-preserving, and real-time solution for protecting healthcare IoT ecosystems against evolving cyber threats.
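The abstract reports a privacy budget of ε between 1.0 and 1.5 but does not say which differential privacy mechanism FE-ACS uses. As a hedged illustration of how ε controls the privacy-utility tradeoff, the sketch below uses the standard Laplace mechanism, which is an assumption here and not the paper's stated method. The function names and the sensitivity value are hypothetical.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize(value, sensitivity, epsilon, rng):
    """Release a numeric value with epsilon-differential privacy (sketch)."""
    return value + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical query: a count of anomalous events with sensitivity 1.
    for eps in (1.0, 1.5):
        print(eps, laplace_scale(1.0, eps), privatize(120.0, 1.0, eps, rng))
```

A larger ε gives a smaller noise scale, so the released value is more accurate but the privacy guarantee is weaker. That is the tradeoff the quoted range of ε values balances.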
Abstract: Breast cancer screening programs rely heavily on mammography for early detection; however, diagnostic performance is strongly affected by inter-reader variability, breast density, and the limitations of conventional computer-aided detection systems. Recent advances in deep learning have enabled more robust and scalable solutions for large-scale screening, yet a systematic comparison of modern object detection architectures on nationally representative datasets remains limited. This study presents a comprehensive quantitative comparison of prominent deep learning-based object detection architectures for Artificial Intelligence-assisted mammography analysis using the MammosighTR dataset, developed within the Turkish National Breast Cancer Screening Program. The dataset comprises 12,740 patient cases collected between 2016 and 2022, annotated with BI-RADS categories, breast density levels, and lesion localization labels. A total of 31 models were evaluated, including One-Stage, Two-Stage, and Transformer-based architectures, under a unified experimental framework at both patient and breast levels. The results demonstrate that Two-Stage architectures consistently outperform One-Stage models, achieving approximately 2%–4% higher Macro F1-Scores and more balanced precision-recall trade-offs, with Double-Head R-CNN and Dynamic R-CNN yielding the highest overall performance (Macro F1 ≈ 0.84–0.86). This advantage is primarily attributed to the region proposal mechanism and improved class balance inherent to Two-Stage designs. One-Stage detectors exhibited higher sensitivity and faster inference, reaching Recall values above 0.88, but experienced minor reductions in Precision and overall accuracy (≈ 1%–2%) compared with Two-Stage models. Among Transformer-based architectures, Deformable DEtection TRansformer demonstrated strong robustness and consistency across datasets, achieving Macro F1-Scores comparable to CNN-based detectors (≈ 0.83–0.85) while exhibiting minimal performance degradation under distributional shifts. Breast density-based analysis revealed increased misclassification rates in medium-density categories (types B and C), whereas Transformer-based architectures maintained more stable performance in high-density type D tissue. These findings quantitatively confirm that both architectural design and tissue characteristics play a decisive role in diagnostic accuracy. Overall, the study provides a reproducible benchmark and highlights the potential of hybrid approaches that combine the accuracy of Two-Stage detectors with the contextual modeling capability of Transformer architectures for clinically reliable breast cancer screening systems.
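The Macro F1-Score used to compare the models above is the unweighted mean of per-class F1 values, so rare classes count as much as common ones. The following is a minimal sketch of that computation. The integer labels in the example are placeholders, not BI-RADS data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder labels for three hypothetical classes.
    truth = [0, 0, 1, 1, 2, 2]
    preds = [0, 0, 1, 2, 2, 2]
    print(round(macro_f1(truth, preds), 4))
```

In this toy example the per-class F1 values are 1, 2/3, and 4/5, so the macro average is 37/45. Because each class contributes equally, a model that neglects a minority category is penalized even if overall accuracy stays high.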
Funding: Funded by the Key Research and Development Program of Henan Province (No. 251111211200) and the National Natural Science Foundation of China (Grant No. U2004163).
Abstract: Traffic sign detection is an important part of autonomous driving, and its recognition accuracy and speed are directly related to road traffic safety. Although convolutional neural networks (CNNs) have made breakthroughs in this field, in complex scenes with image blur or target occlusion, traffic sign detection continues to exhibit limited accuracy, accompanied by false positives and missed detections. To address these problems, a traffic sign detection algorithm, You Only Look Once-based Skip Dynamic Way (YOLO-SDW), built on You Only Look Once version 8 small (YOLOv8s), is proposed. Firstly, a Skip Connection Reconstruction (SCR) module is introduced to efficiently integrate fine-grained feature information and enhance the detection accuracy of the algorithm in complex scenes. Secondly, a C2f module based on Dynamic Snake Convolution (C2f-DySnake) is proposed to dynamically adjust the receptive field, improve the algorithm's feature extraction ability for blurred or occluded targets, and reduce false detections and missed detections. Finally, the Wise Powerful IoU v2 (WPIoUv2) loss function is proposed to further improve the detection accuracy of the algorithm. Experimental results show that the mAP@0.5 of YOLO-SDW on the TT100K dataset is 89.2% and its mAP@0.5:0.95 is 68.5%, which are 4% and 3.3% higher than the YOLOv8s baseline, respectively. YOLO-SDW thus achieves higher accuracy while maintaining real-time performance.
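Metrics such as mAP@0.5 rest on the intersection over union (IoU) between a predicted box and a ground-truth box: a prediction counts as a match only when its IoU reaches the threshold. The sketch below shows plain IoU for axis-aligned boxes in (x1, y1, x2, y2) form. It is not the WPIoUv2 loss from the abstract, and the helper names are illustrative.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    inter_box = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter = box_area(inter_box)
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
    print(is_match((0, 0, 10, 10), (1, 1, 10, 10)))
```

mAP@0.5:0.95 repeats the matching at thresholds from 0.5 to 0.95 and averages the resulting precision values, which is why it is always the stricter of the two figures.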
Funding: Supported by the National Natural Science Foundation of China (No. 52106080); the Jilin City Science and Technology Innovation Development Plan Project (No. 20240302014); the Jilin Provincial Department of Education Science and Technology Research Project (No. JJKH20230135K); the Jilin Province Science and Technology Development Plan Project (No. YDZJ202401640ZYTS); and the Northeast Electric Power University Teaching Reform Research Project (No. J2427).
Abstract: The continuous decrease in global fishery resources has increased the importance of precise and efficient underwater fish monitoring technology. This study proposes an improved underwater target detection framework based on YOLOv8, with the aim of enhancing detection accuracy and the ability to recognize multi-scale targets in blurry and complex underwater environments. A streamlined Vision Transformer (ViT) model is used as the feature extraction backbone, which retains global self-attention feature extraction and improves training efficiency. In addition, a detection head named Dynamic Head (DyHead) is introduced, which enhances the efficiency of processing various target sizes through multi-scale feature fusion and adaptive attention modules. Furthermore, a dynamic loss function adjustment method called SlideLoss is employed. This method uses a sliding window technique to adaptively adjust parameters, which optimizes the detection of challenging targets. The experimental results on the RUOD dataset show that the improved model significantly enhances both the accuracy and the efficiency of target detection.
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFC2403500); the National Natural Science Foundation of China (No. 22225401); the Science and Technology Innovation Program of Hunan Province (No. 2020RC4017); and the Guizhou Provincial Science and Technology Projects (No. ZK[2023]293).
Abstract: Simultaneous identification and quantitative detection of phenylenediamine (PDA) isomers, including o-phenylenediamine (OPD), m-phenylenediamine (MPD), and p-phenylenediamine (PPD), are essential for environmental risk assessment and human health protection. However, current visual detection methods can only distinguish individual PDA isomers and fail to identify binary or ternary mixtures. Herein, a highly active and ultrastable peroxidase (POD)-like CoPt graphitic nanozyme (CoPt@G) was used for naked-eye identification and colorimetric/fluorescent (FL) dual-mode quantitative detection of PDA isomers. In the presence of H₂O₂, the CoPt@G nanozyme effectively catalyzed the oxidation of OPD, MPD, PPD, OPD+PPD, OPD+MPD, MPD+PPD, and OPD+MPD+PPD into yellow, colorless, lilac, yellow, yellow, wine-red, and reddish-brown products, respectively. Thus, MPD, PPD, MPD+PPD, and OPD+MPD+PPD were easily identified by the distinct colors of their oxidation products, and OPD, OPD+PPD, and OPD+MPD could be further distinguished by the addition of MPD or PPD. Subsequently, CoPt@G/H₂O₂-, 3,3′,5,5′-tetramethylbenzidine (TMB)/CoPt@G/H₂O₂-, and MPD/CoPt@G/H₂O₂-enabled colorimetric/FL dual-mode platforms for the quantitative detection of OPD, MPD, and PPD were proposed. The experimental results showed that the constructed sensing platforms exhibit satisfactory sensitivity, comparable to that reported in previous studies. Finally, PDAs were evaluated in water samples, yielding satisfactory recoveries. This work expands the application prospects of nanozymes in assessing environmental risks and protecting human health.