The continuous decline in global fishery resources has increased the importance of precise and efficient underwater fish monitoring technology. This study proposes an improved underwater target detection framework based on YOLOv8, with the aim of enhancing detection accuracy and the ability to recognize multi-scale targets in blurry and complex underwater environments. A streamlined Vision Transformer (ViT) model is used as the feature extraction backbone, which retains global self-attention feature extraction and improves training efficiency. In addition, a detection head named Dynamic Head (DyHead) is introduced, which enhances the efficiency of processing various target sizes through multi-scale feature fusion and adaptive attention modules. Furthermore, a dynamic loss function adjustment method called SlideLoss is employed. This method utilizes sliding window technology to adaptively adjust parameters, which optimizes the detection of challenging targets. Experimental results on the RUOD dataset show that the proposed model significantly enhances the accuracy of target detection and also increases its efficiency.
Within the domain of Intelligent Group Systems (IGSs), this paper develops a resource-aware multitarget Constant False Alarm Rate (CFAR) detection framework for multisite MIMO radar systems. It underscores the necessity of systematically managing finite transmit and receive antennas and transmit power to enhance detection performance. To tackle the multidimensional resource optimization challenge, we introduce a Cooperative Transmit-Receive Antenna Selection and Power Allocation (CTRSPA) strategy. It employs a perception-action cycle that incorporates uncertain external support information to optimize worst-case detection performance with multiple targets. First, we derive a closed-form expression that incorporates uncertainty for the noncoherent integration square-law detection probability using the Neyman-Pearson criterion. Subsequently, a joint optimization model for antenna selection and power allocation in CFAR detection is formulated, incorporating practical radar resource constraints. Mathematically, this represents an NP-hard problem involving coupled continuous and Boolean variables. We propose a three-stage method (Reformulation, Node Picker, and Convex Power Allocation) that capitalizes on the independent convexity of the optimization model for each variable, ensuring a near-optimal result. Simulations confirm the approach's effectiveness, efficiency, and timeliness, particularly for large-scale radar networks, and reveal the impact of threat levels, system layout, and detection parameters on resource allocation.
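The abstract above does not give its detector in code form. As a minimal illustration of what "constant false alarm rate" means in practice, the sketch below implements the textbook cell-averaging CFAR rule for a square-law detector in exponentially distributed noise power. It is not the paper's CTRSPA method; the function names and the `guard`, `ref`, and `pfa` defaults are hypothetical choices made for this example.

```python
def ca_cfar_scale(num_ref_cells: int, pfa: float) -> float:
    """Threshold multiplier alpha for cell-averaging CFAR.

    Assumes a square-law detector and exponentially distributed noise power,
    for which Pfa = (1 + alpha / N) ** (-N) with N reference cells.
    """
    if num_ref_cells < 1 or not 0.0 < pfa < 1.0:
        raise ValueError("need num_ref_cells >= 1 and 0 < pfa < 1")
    return num_ref_cells * (pfa ** (-1.0 / num_ref_cells) - 1.0)


def ca_cfar_detect(power, guard=2, ref=8, pfa=1e-3):
    """Return indices of cells whose power exceeds the adaptive threshold.

    `guard` and `ref` are the number of guard and reference cells on EACH
    side of the cell under test (illustrative defaults, not from the paper).
    """
    alpha = ca_cfar_scale(2 * ref, pfa)
    hits = []
    for i in range(guard + ref, len(power) - guard - ref):
        left = power[i - guard - ref : i - guard]
        right = power[i + guard + 1 : i + guard + ref + 1]
        noise = (sum(left) + sum(right)) / (2 * ref)  # local noise estimate
        if power[i] > alpha * noise:
            hits.append(i)
    return hits


# Flat noise floor with one strong return in cell 20.
profile = [1.0] * 40
profile[20] = 50.0
print(ca_cfar_detect(profile))
```

Because the threshold scales with the local noise estimate, the false alarm probability stays fixed as the noise level changes, which is the property the detection framework builds on.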
The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles (UAVs) has emerged as a prominent research focus. Due to the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, a You Only Look Once v8 (YOLOv8) model is modified in four aspects: the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening. The resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection indexes of the proposed model are significantly improved without increasing the number of model parameters and with only limited growth in computation. Moreover, the model achieves the best performance among the compared detection models, demonstrating its advancement within this category of tasks.
To address the scale adaptation of autonomous driving target detection algorithms in low-illumination environments and their shortcomings in handling target occlusion, this paper proposes the YOLO-LKSDS autonomous driving detection model. Firstly, the Contrast-Limited Adaptive Histogram Equalisation (CLAHE) image enhancement algorithm is improved to increase the image contrast and enhance the detailed features of the target. Then, on the basis of the YOLOv5 model, the K-means++ clustering algorithm is introduced to obtain suitable anchor boxes, and SPPELAN spatial pyramid pooling is improved to enhance the accuracy and robustness of the model for multi-scale target detection. Finally, an improved SEAM (Separated and Enhancement Attention Module) attention mechanism is combined with the DIoU-NMS algorithm to optimize the model’s performance when dealing with occlusion and dense scenes. Compared with the original model, the improved YOLO-LKSDS model achieves a 13.3% improvement in accuracy, a 1.7% improvement in mAP, and 240,000 fewer parameters on the BDD100K dataset. To validate the generalization of the improved algorithm, we selected the KITTI dataset for experimentation, which shows that, relative to YOLOv5, accuracy improves by 21.1%, recall by 36.6%, and mAP50 by 29.5%. The deployment of the algorithm is verified on an edge computing platform, where the average detection speed reaches 24.4 FPS while power consumption remains below 9 W, demonstrating high real-time capability and energy efficiency.
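The DIoU-NMS step mentioned above can be sketched as follows. This is a minimal pure-Python illustration of the general technique (IoU penalised by the normalised squared distance between box centres, used as the suppression criterion), not the paper's implementation; the `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions made for the example.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4.0 \
        + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4.0
    # Squared diagonal of the smallest box enclosing both.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw ** 2 + ch ** 2
    diou = iou - d2 / c2 if c2 > 0 else iou
    return iou, diou


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box exceeds
    `threshold` (an illustrative value). Returns kept indices, best first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_and_diou(boxes[i], boxes[k])[1] <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))
```

Compared with plain IoU-based NMS, the centre-distance penalty makes two heavily overlapping boxes with clearly separated centres less likely to suppress each other, which is the reason the criterion is attractive for occluded and dense scenes.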
To solve the false detection and missed detection problems in steel surface defect detection, which are caused by the variety of defect types and sizes, the similarity between defects and background features, and the similarity between different defects, this paper proposes a lightweight detection model named multiscale edge and squeeze-and-excitation attention detection network (MSESE), which is built upon You Only Look Once version 11 nano (YOLOv11n). To address the difficulty of locating defect edges, we first propose an edge enhancement module (EEM), apply it to the process of multiscale feature extraction, and then propose a multiscale edge enhancement module (MSEEM). By obtaining defect features at different scales and enhancing their edge contours, the module uses a dual-domain selection mechanism to focus on the important areas in the image, ensuring that the feature maps carry richer information and clearer contour features. By fusing the squeeze-and-excitation attention mechanism with the EEM, we obtain a lighter module that can enhance the representation of edge features, which is named the edge enhancement module with squeeze-and-excitation attention (EEMSE). This module is subsequently integrated into the detection head. The enhanced detection head achieves improved edge feature enhancement with reduced computational overhead, while effectively adjusting channel-wise importance and further refining feature representation. Experiments on the NEU-DET dataset show that, compared with the original YOLOv11n, the improved model achieves improvements of 4.1% and 2.2% in terms of mAP@0.5 and mAP@0.5:0.95, respectively, and the GFLOPs value decreases from 6.4 to 6.2. Furthermore, when compared to the current mainstream models Mamba-YOLOT and RTDETR-R34, our method achieves superior performance with 6.5% and 8.9% higher mAP@0.5, respectively, while maintaining a more compact parameter footprint. These results collectively validate the effectiveness and efficiency of the proposed approach.
To address the challenge of real-time detection of unauthorized drone intrusions in complex low-altitude urban environments such as parks and airports, this paper proposes an enhanced MBS-YOLO (Multi-Branch Small Target Detection YOLO) model for anti-drone object detection, based on the YOLOv8 architecture. To overcome the limitations of existing methods in detecting small objects within complex backgrounds, we design a C2f-Pu module with excellent feature extraction capability and a more compact parameter set, aiming to reduce the model’s computational complexity. To improve multi-scale feature fusion, we construct a Multi-Branch Feature Pyramid Network (MB-FPN) that employs a cross-level feature fusion strategy to enhance the model’s representation of small objects. Additionally, a shared detail-enhanced detection head is introduced to address the large size variations of Unmanned Aerial Vehicle (UAV) targets, thereby improving detection performance across different scales. Experimental results demonstrate that the proposed model achieves consistent improvements across multiple benchmarks. On the Det-Fly dataset, it improves precision by 3%, recall by 5.6%, and mAP50 by 4.5% compared with the baseline, while reducing parameters by 21.2%. Cross-validation on the VisDrone dataset further validates its robustness, yielding additional gains of 3.2% in precision, 6.1% in recall, and 4.8% in mAP50 over the original YOLOv8. These findings confirm the effectiveness of the proposed algorithm in enhancing UAV detection performance under complex scenarios.
A measurement system for the scattering characteristics of warhead fragments based on high-speed imaging offers advantages such as simple deployment, flexible maneuverability, and high spatiotemporal resolution, enabling the acquisition of data covering the full fragment scattering process. However, mismatches between camera frame rates and target velocities can lead to long motion blur tails for high-speed fragment targets, resulting in low signal-to-noise ratios and rendering conventional detection algorithms ineffective in testing environments with strong dynamic interference. In this study, we propose a detection framework centered on the separation and suppression of strong dynamic interference signals. We introduce a Gaussian mixture model constrained under a joint spatial-temporal-transform domain Dirichlet process, combined with total variation regularization, to achieve disturbance signal suppression. Experimental results demonstrate that the proposed disturbance suppression method can be integrated with certain conventional moving target detection tasks, enabling adaptation to real-world data to a certain extent. Moreover, we provide a specific implementation of this process, which achieves a detection rate close to 100% with a false alarm rate of approximately 0% on multiple sets of real field test data. This research advances the field of damage parameter testing.
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. Firstly, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature map and improving the target detection ability of the model. Secondly, a pyramid pooling module named multi atrous spatial pyramid pooling (MASPP) is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, so as to improve the model's ability to handle multi-scale objects. The experimental results show that the detection accuracy of the improved YOLOv8 model on the DIOR dataset is 92% and its mean average precision (mAP) is 87.9%, which are 3.5% and 1.7% higher, respectively, than those of the original model. This demonstrates that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has been improved.
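To make the role of atrous (dilated) convolution in a pyramid pooling module concrete, the short sketch below computes how the dilation rate enlarges the effective kernel without adding weights, and which padding keeps the feature map size unchanged. The formulas are the standard ones for dilated convolution; the dilation rates 1, 6, 12, and 18 and the 64-pixel input are illustrative assumptions, not values taken from the MASPP design.

```python
def effective_kernel(kernel: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def output_size(size: int, kernel: int, dilation: int,
                stride: int = 1, padding: int = 0) -> int:
    """Output length of a 1-D (or per-axis 2-D) dilated convolution."""
    return (size + 2 * padding - effective_kernel(kernel, dilation)) // stride + 1


def same_padding(kernel: int, dilation: int) -> int:
    """Padding that preserves the input size for stride 1 and an odd kernel."""
    return dilation * (kernel - 1) // 2


# Hypothetical parallel branches of a pyramid pooling module, all 3x3 kernels.
for rate in (1, 6, 12, 18):
    pad = same_padding(3, rate)
    print(rate, effective_kernel(3, rate), output_size(64, 3, rate, padding=pad))
```

Each branch keeps the same nine weights per channel while covering a progressively larger neighbourhood, which is how parallel atrous branches capture context at several scales at a fixed resolution.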
Unmanned aerial vehicle (UAV) imagery poses significant challenges for object detection due to extreme scale variations, high-density small targets (68% in the VisDrone dataset), and complex backgrounds. While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion, their rigid architectures struggle with multi-scale adaptability, as exemplified by YOLOv8n’s 36.4% mAP and 13.9% small-object AP on VisDrone2019. This paper presents YOLO-LE, a lightweight framework addressing these limitations through three novel designs: (1) We introduce the C2f-Dy and LDown modules to enhance the backbone’s sensitivity to small-object features while reducing backbone parameters, thereby improving model efficiency. (2) An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps, optimizing the neck structure, reducing neck complexity, and enhancing overall model performance. (3) We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy. Experimental results demonstrate that YOLO-LE achieves 39.9% mAP@0.5 on VisDrone2019, representing a 9.6% improvement over YOLOv8n, while maintaining 8.5 GFLOPs computational efficiency. This provides an efficient solution for UAV object detection in complex scenarios.
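The 9.6% figure above is consistent with a relative gain rather than a difference in percentage points. The short check below reproduces it, assuming that the 36.4% baseline and the 39.9% result refer to the same metric; the helper name is hypothetical.

```python
def improvement(baseline: float, new: float):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - baseline
    relative = absolute / baseline * 100.0
    return absolute, relative


abs_gain, rel_gain = improvement(36.4, 39.9)
print(round(abs_gain, 1), round(rel_gain, 1))
```

Under that assumption the absolute gain is 3.5 percentage points, and dividing by the 36.4% baseline gives the relative 9.6% quoted in the abstract.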
Underwater target detection is extensively applied in domains such as underwater search and rescue, environmental monitoring, and marine resource surveys. It is crucial in enabling autonomous underwater robot operations and promoting ocean exploration. Nevertheless, low imaging quality, harsh underwater environments, and obscured objects considerably increase the difficulty of detecting underwater targets, making it difficult for current detection methods to achieve optimal performance. To enhance underwater object perception and improve target detection precision, we propose a lightweight underwater target detection method using You Only Look Once (YOLO) v8 with multi-scale cross-channel attention (MSCCA), named YOLOv8-UOD. In the proposed multi-scale cross-channel attention module, multi-scale attention (MSA) augments the variety of attentional perception by extracting information from inherently diverse receptive fields. The cross-channel strategy utilizes RepVGG-based channel shuffling (RCS) and one-shot aggregation (OSA) to rearrange feature map channels according to specific rules. It aggregates all features only once in the final feature mapping, resulting in the extraction of more comprehensive and valuable feature information. The experimental results show that the proposed YOLOv8-UOD achieves a mAP50 of 95.67% and FLOPs of 23.8 G on the Underwater Robot Picking Contest 2017 (URPC2017) dataset, outperforming other methods in terms of detection precision and computational cost-efficiency.
To address the low contrast between background and target and the insufficient noise suppression that infrared small target detection faces under complex cloud backgrounds, an infrared small target detection method based on the tensor nuclear norm and direction residual weighting is proposed. The infrared image is first converted into an infrared patch tensor model. Then, exploiting the low-rank nature of the background tensor and the difference in contrast between the background and the target in different directions, we design a double-neighborhood local contrast method based on direction residual weighting (DNLCDRW) and combine it with the partial sum of tensor nuclear norm (PSTNN) to achieve effective background suppression and recovery of infrared small targets. Experiments show that the algorithm is effective in suppressing the background and improving the detection ability of the target.
When a fire breaks out in a high-rise building, occlusion by smoke and obstacles results in a dearth of crucial information concerning people in distress, thereby creating a challenge in their detection. Given the restricted sensing range of a single unmanned aerial vehicle (UAV) camera, enhancing the target recognition rate becomes challenging without target information. To tackle this issue, this paper proposes a multi-agent autonomous collaborative detection method for multiple targets in complex fire environments. The objective is to fuse multi-angle visual information, effectively increasing the target’s information dimension and ultimately addressing the problem of low target recognition rate caused by the lack of target information. The method steps are as follows: first, you only look once version 5 (YOLOv5) is used to detect the targets in the image; second, the detected targets are tracked to monitor their movements and trajectories; third, a person re-identification (ReID) model is employed to extract the appearance features of the targets; finally, by fusing the visual information from multi-angle cameras, the method achieves multi-agent autonomous collaborative detection. The experimental results show that the method effectively combines the visual information from multi-angle cameras, resulting in improved detection efficiency for people in distress.
In this paper, a reasoning enhancement method based on RGCN (Relational Graph Convolutional Network) is proposed to improve the detection capability of UAVs (Unmanned Aerial Vehicles) for fast-moving military targets in urban battlefield environments. By combining military images with the publicly available VisDrone2019 dataset, a new dataset called VisMilitary was built, and multiple YOLO (You Only Look Once) models were tested on it. Due to the low-confidence problem caused by blurred targets, the performance of traditional YOLO models on real battlefield images decreases significantly. Therefore, we propose an improved RGCN inference model, which improves performance in complex environments by optimizing the data processing and the graph network architecture. Experimental results show that the proposed method achieves an improvement of 0.4% to 1.7% in mAP@0.50, which demonstrates the effectiveness of the model in military target detection. This research provides a new technical path for UAV target detection in urban battlefields and offers useful insights for the application of deep learning in the military field.
Infrared small-target detection has important applications in many fields due to its high penetration capability and detection distance. This study introduces a detector called “YOLO-SDLUWD”, which is based on the YOLOv7 network, for small target detection in complex infrared backgrounds. “SDLUWD” refers to the combination of the Spatial Depth layer followed by Convolutional layer structure (SD-Conv), a Linear Up-sampling fusion Path Aggregation Feature Pyramid Network (LU-PAFPN), and a training strategy based on the normalized Gaussian Wasserstein Distance loss (WD-loss) function. “YOLO-SDLUWD” aims to reduce the loss of detection accuracy that occurs when the max-pooling downsampling layer in the backbone network discards important feature information, to support the interaction and fusion of high-dimensional and low-dimensional feature information, and to overcome the false alarm predictions induced by noise in small target images. The detector achieved a mAP@0.5 of 90.4% and a mAP@0.5:0.95 of 48.5% on IRIS-AG, an increase of 9%-11% over YOLOv7-tiny, outperforming other state-of-the-art target detectors in terms of accuracy and speed.
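The normalized Gaussian Wasserstein distance behind the WD-loss can be sketched as below. This follows the commonly published formulation, in which each box `(cx, cy, w, h)` is modelled as a 2-D Gaussian and the second-order Wasserstein distance between two such Gaussians reduces to a Euclidean distance between the vectors `(cx, cy, w/2, h/2)`. It is an illustration of the general technique rather than the detector's training code; the normalising constant `c` is dataset-dependent, and the default used here is an arbitrary placeholder.

```python
import math


def nwd(a, b, c: float = 12.8) -> float:
    """Normalized Gaussian Wasserstein distance between boxes (cx, cy, w, h).

    `c` is a dataset-dependent normalising constant; 12.8 is only a
    placeholder for this sketch. Returns a similarity in (0, 1].
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2.0) ** 2
        + ((a[3] - b[3]) / 2.0) ** 2
    )
    return math.exp(-w2 / c)


def nwd_loss(a, b, c: float = 12.8) -> float:
    """Loss form: 0 for identical boxes, approaching 1 as boxes move apart."""
    return 1.0 - nwd(a, b, c)


# Two 4x4 boxes that do not overlap at all (IoU would be 0).
pred, target = (10.0, 10.0, 4.0, 4.0), (13.0, 14.0, 4.0, 4.0)
print(nwd(pred, target, c=5.0), nwd_loss(pred, target, c=5.0))
```

Unlike IoU, the measure stays smooth and non-zero when two tiny boxes do not overlap, so a small localisation error on a target only a few pixels wide still yields a useful gradient.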
In drone-based target detection under infrared conditions, high missed detection rates, low signal-to-noise ratios, and blurred edge features for small targets are common problems. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a Dynamic Multi-Scale Feature Fusion and Adaptive Weighting approach is employed to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the feature expression and transmission capability for shallow small targets, thereby reducing the loss of detailed information. Then, combined with an Edge Enhancement (EE) module, the model improves the extraction of infrared small target edge features through low-frequency suppression and high-frequency enhancement strategies. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall rate compared to YOLOv11n, with a computational cost of only 9.1 GFLOPs. In comparison experiments, the model achieved the best balance between detection accuracy and model size, meeting the lightweight deployment requirements of drone-based systems. This method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
To address the issues of unknown target size, blurred edges, background interference, and low contrast in infrared small target detection, this paper proposes a method based on density peaks searching and weighted multi-feature local difference. Firstly, an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference, thereby increasing the probability of capturing real targets in the density peak search. Secondly, a triple-layer window is used to extract features from the area surrounding candidate targets, addressing the uncertainty of small target sizes. By calculating multi-feature local differences between the triple-layer windows, the problems of blurred target edges and low contrast are resolved. To balance the contribution of different features, intra-class distance is used to calculate weights, achieving weighted fusion of multi-feature local differences to obtain the weighted multi-feature local differences of candidate targets. The real targets are then extracted using the interquartile range. Experiments on datasets such as SIRST and IRSTD-1K show that the proposed method is suitable for various types of complex scenes and demonstrates good robustness and detection performance.
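The final step above, extracting real targets with the interquartile range, can be illustrated with a generic upper-fence outlier rule. The abstract does not state the exact rule, so the sketch assumes the common `Q3 + k * IQR` fence with `k = 1.5`; the function name and the example scores are hypothetical.

```python
import statistics


def iqr_outliers(scores, k: float = 1.5):
    """Return indices of candidates whose score lies above Q3 + k * IQR.

    `scores` stands for the weighted multi-feature local difference of each
    candidate; `k = 1.5` is the conventional fence, assumed for this sketch.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4)
    threshold = q3 + k * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > threshold]


# Eight clutter-like candidates and one candidate with a much larger response.
candidate_scores = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_outliers(candidate_scores))
```

Because the fence is derived from the candidates' own score distribution, no fixed global threshold has to be tuned for each scene.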
To address low detection accuracy in near-coastal vessel target detection under complex conditions, a novel near-coastal vessel detection model based on an improved YOLOv7 architecture is proposed in this paper. The Coordinate Attention mechanism is used to improve channel attention weights and enhance the network’s ability to extract small target features. In the enhanced feature extraction network, the lightweight convolution algorithm Grouped Spatial Convolution is used to replace MPConv to reduce model calculation costs. EIoU Loss is used to replace the bounding box regression loss function in YOLOv7 to reduce the probability of missed and false detections. The performance of the improved model was verified using an enhanced dataset obtained through rainy and foggy weather simulation. Experiments were conducted on the datasets before and after the enhancement. The improved model achieved a mean average precision (mAP) of 97.45% on the original dataset, and the number of parameters was reduced by 2%. On the enhanced dataset, the mAP of the improved model reached 88.08%. Compared with seven target detection models, namely Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, YOLOv7, YOLOv8-n, and YOLOv8-s, the improved model effectively reduces the missed and false detection rates and improves target detection accuracy. The improved model not only accurately detects vessels in complex weather environments but also outperforms the other methods on the original and enhanced SeaShip datasets. This finding shows that the improved model can achieve near-coastal vessel target detection in multiple environments, laying the foundation for vessel path planning and automatic obstacle avoidance.
Developing an accurate and visual sensing strategy for trace levels of fluoroquinolone residues, which pose a threat to food safety and human health, is highly desired but remains challenging. Herein, a target self-calibration ratiometric fluorescent sensing platform has been designed for sensitive visual detection of levofloxacin (LEV) based on a fluorescent europium metal-organic framework (Eu-MOF) probe. Specifically, the Eu-MOF was facilely synthesized by directly mixing Eu³⁺ with the 1,10-phenanthroline-2,9-dicarboxylic acid (PDA) ligand at room temperature, and it exhibited stable red fluorescence at 612 nm. Upon the addition of the target LEV, significant quenching of the Eu³⁺ fluorescence was observed owing to the inner filter effect between the Eu-MOF and LEV. Meanwhile, the intrinsic fluorescence of LEV at 462 nm was gradually enhanced, thereby realizing self-calibrated ratiometric fluorescence responses to LEV. Through this strategy, LEV can be detected down to 27 nmol/L. Furthermore, an Eu-MOF-based test paper integrated with smartphone-assisted RGB color analysis was used for the quantitative monitoring of LEV through multi-color changes from red to blue, thus achieving portable, convenient, and visual detection of LEV in honey and milk samples. Therefore, the developed strategy could provide a useful tool for supporting practical on-site testing of food samples.
Underwater target detection in forward-looking sonar (FLS) images is a challenging but promising endeavor. Existing neural-network-based methods have made notable progress, but there remains room for improvement because they overlook the unique characteristics of underwater environments. Considering the low imaging resolution, complex background environment, and large variations in target appearance in underwater sonar images, this paper designs a sonar image target detection network based on progressive sensitivity capture, named ProNet. It progressively captures the sensitive regions in the current image where potential effective targets may exist. Guided by this basic idea, the primary technical innovation of this paper is the introduction of a foundational module structure for constructing a sonar target detection backbone network. This structure employs a multi-subspace mixed convolution module that initially maps sonar images into different subspaces and extracts local contextual features using varying convolutional receptive fields within these heterogeneous subspaces. Subsequently, a scale-aware aggregation module effectively aggregates the heterogeneous features extracted from different subspaces. Finally, a multi-scale attention structure further enhances the relational perception of the aggregated features. We evaluated ProNet on three FLS datasets of varying scenes, and experimental results indicate that ProNet outperforms the current state-of-the-art sonar image detectors and general target detectors.
Infrared images typically exhibit diverse backgrounds, each potentially containing noise and target-like interference elements. In complex backgrounds, infrared small targets are prone to be submerged by background noise due to their low pixel proportion and limited available features, leading to detection failure. To address this problem, this paper proposes an Attention Shift-Invariant Cross-Evolutionary Feature Fusion Network (ASCFNet) tailored for the detection of infrared weak and small targets. The network architecture first designs a Multidimensional Lightweight Pixel-level Attention Module (MLPA), which alleviates the issue of small-target feature suppression during deep network propagation by combining channel reshaping, multi-scale parallel subnet architectures, and local cross-channel interactions. Then, a Multidimensional Shift-Invariant Recall Module (MSIR) is designed to ensure the network remains unaffected by minor input perturbations when processing infrared images, by focusing on the model’s shift invariance. Subsequently, a Cross-Evolutionary Feature Fusion structure (CEFF) is designed to allow flexible and efficient integration of multidimensional feature information from different network hierarchies, thereby achieving complementarity and enhancement among features. Experimental results on three public datasets, SIRST, NUDT-SIRST, and IRST640, demonstrate that our proposed network outperforms advanced algorithms in the field. Specifically, on the NUDT-SIRST dataset, the mAP50, mAP50-95, and metrics reached 99.26%, 85.22%, and 99.31%, respectively. Visual evaluations of detection results in diverse scenarios indicate that our algorithm exhibits an increased detection rate and reduced false alarm rate. Our method balances accuracy and real-time performance, and achieves efficient and stable detection of infrared weak and small targets.
Funding: Supported by the National Natural Science Foundation of China (No. 52106080), the Jilin City Science and Technology Innovation Development Plan Project (No. 20240302014), the Jilin Provincial Department of Education Science and Technology Research Project (No. JJKH20230135K), the Jilin Province Science and Technology Development Plan Project (No. YDZJ202401640ZYTS), and the Northeast Electric Power University Teaching Reform Research Project (No. J2427).
Abstract: The continuous decline in global fishery resources has increased the importance of precise and efficient underwater fish monitoring technology. This study proposes an improved underwater target detection framework based on YOLOv8, with the aim of enhancing detection accuracy and the ability to recognize multi-scale targets in blurry and complex underwater environments. A streamlined Vision Transformer (ViT) model is used as the feature extraction backbone, which retains global self-attention feature extraction and improves training efficiency. In addition, a detection head named Dynamic Head (DyHead) is introduced, which processes targets of various sizes more efficiently through multi-scale feature fusion and adaptive attention modules. Furthermore, a dynamic loss function adjustment method called SlideLoss is employed. This method uses a sliding window technique to adaptively adjust parameters, which improves the detection of challenging targets. Experimental results on the RUOD dataset show that the improved model significantly enhances both the accuracy and the efficiency of target detection.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62071482 and 62471348); the Shaanxi Association of Science and Technology Youth Talent Support Program Project, China (No. 20230137); the Innovative Talents Cultivate Program for Technology Innovation Team of Shaanxi Province, China (No. 2024RS-CXTD-08); and the Youth Innovation Team of Shaanxi Universities, China.
Abstract: Within the domain of Intelligent Group Systems (IGSs), this paper develops a resource-aware multitarget Constant False Alarm Rate (CFAR) detection framework for multisite MIMO radar systems. It underscores the need to manage finite transmit and receive antennas and transmit power systematically to enhance detection performance. To tackle the multidimensional resource optimization challenge, we introduce a Cooperative Transmit-Receive Antenna Selection and Power Allocation (CTRSPA) strategy. It employs a perception-action cycle that incorporates uncertain external support information to optimize worst-case detection performance with multiple targets. First, using the Neyman-Pearson criterion, we derive a closed-form expression, incorporating uncertainty, for the detection probability of noncoherent-integration square-law detection. Subsequently, a joint optimization model for antenna selection and power allocation in CFAR detection is formulated, incorporating practical radar resource constraints. Mathematically, this is an NP-hard problem involving coupled continuous and Boolean variables. We propose a three-stage method (Reformulation, Node Picker, and Convex Power Allocation) that capitalizes on the independent convexity of the optimization model in each variable, ensuring a near-optimal result. Simulations confirm the approach's effectiveness, efficiency, and timeliness, particularly for large-scale radar networks, and reveal the impact of threat levels, system layout, and detection parameters on resource allocation.
Abstract: The application of deep learning to target detection in aerial images captured by Unmanned Aerial Vehicles (UAVs) has emerged as a prominent research focus. Due to the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, the You Only Look Once v8 (YOLOv8) model is modified in four respects: the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening. The resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection metrics of the proposed model improve significantly without increasing the number of model parameters and with only limited growth in computation. Moreover, the model performs best among the detection models compared, demonstrating its advancement within this category of tasks.
Funding: Supported by the Key R&D Program of Shaanxi Province (No. 2025CYYBXM-078).
Abstract: To address the limited scale adaptation of autonomous driving target detection algorithms in low-illumination environments and their shortcomings in handling target occlusion, this paper proposes the YOLO-LKSDS autonomous driving detection model. First, the Contrast-Limited Adaptive Histogram Equalisation (CLAHE) image enhancement algorithm is improved to increase image contrast and enhance the detailed features of the target. Then, on the basis of the YOLOv5 model, the K-means++ clustering algorithm is introduced to obtain suitable anchor boxes, and SPPELAN spatial pyramid pooling is improved to enhance the accuracy and robustness of the model for multi-scale target detection. Finally, an improved SEAM (Separated and Enhancement Attention Module) attention mechanism is combined with the DIoU-NMS algorithm to optimize the model's performance in occluded and dense scenes. Compared with the original model, the improved YOLO-LKSDS model achieves a 13.3% improvement in accuracy, a 1.7% improvement in mAP, and 240,000 fewer parameters on the BDD100K dataset. To validate the generalization of the improved algorithm, experiments were also run on the KITTI dataset, where accuracy improves by 21.1%, recall by 36.6%, and mAP50 by 29.5% over YOLOv5. Deployment of the algorithm is verified on an edge computing platform, where the average detection speed reaches 24.4 FPS while power consumption remains below 9 W, demonstrating high real-time capability and energy efficiency.
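The DIoU-NMS step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions made for illustration. The idea is that a lower-scoring box is suppressed only when its IoU with the kept box, minus a penalty for centre distance, exceeds the threshold, so overlapping boxes with well-separated centres (typical of occluded objects) are more likely to survive.

```python
def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2) with positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared distance between the two box centres
    d2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    # Squared diagonal of the smallest box enclosing both
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first (threshold is illustrative)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop candidates whose DIoU with the kept box exceeds the threshold
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))
```

In this toy input the second box nearly coincides with the first and is suppressed, while the distant third box is kept.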
Funding: Funded by the Ministry of Education Humanities and Social Science Research Project (grant number 23YJAZH034); the Postgraduate Research and Practice Innovation Program of Jiangsu Province (grant number SJCX25_17); and the National Computer Basic Education Research Project in Higher Education Institutions (grant numbers 2024-AFCEC-056 and 2024-AFCEC-057).
Abstract: In steel surface defect detection, false and missed detections arise from the varied types and sizes of defects, the similarity between defects and background features, and the similarity between different defects. To address these problems, this paper proposes a lightweight detection model named multiscale edge and squeeze-and-excitation attention detection network (MSESE), which is built upon You Only Look Once version 11 nano (YOLOv11n). To address the difficulty of locating defect edges, we first propose an edge enhancement module (EEM), apply it to the process of multiscale feature extraction, and then propose a multiscale edge enhancement module (MSEEM). By obtaining defect features at different scales and enhancing their edge contours, the module uses a dual-domain selection mechanism to focus on the important areas in the image, so that the feature maps carry richer information and clearer contour features. By fusing the squeeze-and-excitation attention mechanism with the EEM, we obtain a lighter module that enhances the representation of edge features, named the edge enhancement module with squeeze-and-excitation attention (EEMSE). This module is then integrated into the detection head. The enhanced detection head achieves improved edge feature enhancement with reduced computational overhead, while adjusting channel-wise importance and further refining feature representation. Experiments on the NEU-DET dataset show that, compared with the original YOLOv11n, the improved model achieves improvements of 4.1% and 2.2% in mAP@0.5 and mAP@0.5:0.95, respectively, and the GFLOPs value decreases from 6.4 to 6.2. Furthermore, compared with the current mainstream models Mamba-YOLOT and RTDETR-R34, our method achieves 6.5% and 8.9% higher mAP@0.5, respectively, while maintaining a more compact parameter footprint. These results collectively validate the effectiveness and efficiency of the proposed approach.
Funding: Supported by the Key R&D Program of Xianyang City, Shaanxi Province (L2024-ZDYF-ZDYF-GY-0043).
Abstract: To address the challenge of real-time detection of unauthorized drone intrusions in complex low-altitude urban environments such as parks and airports, this paper proposes an enhanced MBS-YOLO (Multi-Branch Small Target Detection YOLO) model for anti-drone object detection, based on the YOLOv8 architecture. To overcome the limitations of existing methods in detecting small objects within complex backgrounds, we design a C2f-Pu module with strong feature extraction capability and a more compact parameter set, aiming to reduce the model's computational complexity. To improve multi-scale feature fusion, we construct a Multi-Branch Feature Pyramid Network (MB-FPN) that employs a cross-level feature fusion strategy to enhance the model's representation of small objects. Additionally, a shared detail-enhanced detection head is introduced to address the large size variations of Unmanned Aerial Vehicle (UAV) targets, thereby improving detection performance across different scales. Experimental results demonstrate that the proposed model achieves consistent improvements across multiple benchmarks. On the Det-Fly dataset, it improves precision by 3%, recall by 5.6%, and mAP50 by 4.5% compared with the baseline, while reducing parameters by 21.2%. Cross-validation on the VisDrone dataset further validates its robustness, yielding gains of 3.2% in precision, 6.1% in recall, and 4.8% in mAP50 over the original YOLOv8. These findings confirm the effectiveness of the proposed algorithm in enhancing UAV detection performance in complex scenarios.
Abstract: A measurement system for the scattering characteristics of warhead fragments based on high-speed imaging offers advantages such as simple deployment, flexible maneuverability, and high spatiotemporal resolution, enabling the acquisition of data over the full fragment scattering process. However, mismatches between camera frame rates and target velocities can produce long motion-blur tails on high-speed fragment targets, resulting in low signal-to-noise ratios and rendering conventional detection algorithms ineffective in test environments with strong dynamic interference. In this study, we propose a detection framework centered on the separation and suppression of strong dynamic interference signals. We introduce a Gaussian mixture model constrained by a Dirichlet process in a joint spatial-temporal-transform domain, combined with total variation regularization, to suppress the disturbance signal. Experimental results demonstrate that the proposed disturbance suppression method can be integrated with certain conventional moving-target detection tasks, enabling adaptation to real-world data to a certain extent. Moreover, we provide a specific implementation of this process, which achieves a detection rate close to 100% with a false alarm rate of approximately 0% on multiple sets of real target field test data. This research advances the field of damage parameter testing.
Funding: Supported by the National Natural Science Foundation of China (No. 62241109) and the Tianjin Science and Technology Commissioner Project (No. 20YDTPJC01110).
Abstract: An improved model based on You Only Look Once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature map and improving the target detection ability of the model. Second, a pyramid pooling module called multi atrous spatial pyramid pooling (MASPP) is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, improving the model's handling of multi-scale objects. Experimental results show that on the DIOR dataset the improved YOLOv8 model reaches a detection accuracy of 92% and a mean average precision (mAP) of 87.9%, which are 3.5% and 1.7% higher, respectively, than those of the original model. This shows that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has improved.
Abstract: Unmanned aerial vehicle (UAV) imagery poses significant challenges for object detection due to extreme scale variations, high-density small targets (68% in the VisDrone dataset), and complex backgrounds. While YOLO-series models achieve speed-accuracy trade-offs via fixed convolution kernels and manual feature fusion, their rigid architectures struggle with multi-scale adaptability, as exemplified by YOLOv8n's 36.4% mAP and 13.9% small-object AP on VisDrone2019. This paper presents YOLO-LE, a lightweight framework addressing these limitations through three novel designs: (1) We introduce the C2f-Dy and LDown modules to enhance the backbone's sensitivity to small-object features while reducing backbone parameters, thereby improving model efficiency. (2) An adaptive feature fusion module is designed to dynamically integrate multi-scale feature maps, optimizing the neck structure, reducing neck complexity, and enhancing overall model performance. (3) We replace the original loss function with a distributed focal loss and incorporate a lightweight self-attention mechanism to improve small-object recognition and bounding box regression accuracy. Experimental results demonstrate that YOLO-LE achieves 39.9% mAP@0.5 on VisDrone2019, representing a 9.6% improvement over YOLOv8n, while maintaining a computational cost of 8.5 GFLOPs. This provides an efficient solution for UAV object detection in complex scenarios.
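As a quick check on the figures in this abstract, and assuming the 9.6% gain is measured relative to the quoted YOLOv8n baseline of 36.4% (the abstract does not state this explicitly), the arithmetic works out as:

```latex
\[
\frac{39.9 - 36.4}{36.4} = \frac{3.5}{36.4} \approx 0.096 = 9.6\%
\]
```

Under that reading, the absolute gain is 3.5 percentage points of mAP.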
Funding: Supported in part by the National Natural Science Foundation of China (Grants 62402085, 61972062, and 62306060), the Liaoning Doctoral Research Start-Up Fund (2023-BS-078), the Dalian Youth Science and Technology Star Project (2023RQ023), and the Liaoning Basic Research Project (2023JH2/101300191).
Abstract: Underwater target detection is extensively applied in domains such as underwater search and rescue, environmental monitoring, and marine resource surveys. It is crucial in enabling autonomous underwater robot operations and promoting ocean exploration. Nevertheless, low imaging quality, harsh underwater environments, and obscured objects considerably increase the difficulty of detecting underwater targets, making it hard for current detection methods to achieve optimal performance. To enhance underwater object perception and improve target detection precision, we propose a lightweight underwater target detection method using You Only Look Once (YOLO) v8 with multi-scale cross-channel attention (MSCCA), named YOLOv8-UOD. In the proposed multi-scale cross-channel attention module, multi-scale attention (MSA) augments the variety of attentional perception by extracting information from innately diverse sensory fields. The cross-channel strategy uses RepVGG-based channel shuffling (RCS) and one-shot aggregation (OSA) to rearrange feature map channels according to specific rules. It aggregates all features only once in the final feature mapping, resulting in the extraction of more comprehensive and valuable feature information. Experimental results show that the proposed YOLOv8-UOD achieves a mAP50 of 95.67% and 23.8 GFLOPs on the Underwater Robot Picking Contest 2017 (URPC2017) dataset, outperforming other methods in terms of detection precision and computational cost-efficiency.
Funding: Supported by the Key Laboratory Fund for Equipment Pre-Research (6142207210202).
Abstract: Infrared small target detection against complex cloud backgrounds suffers from low contrast between the background and the target and from insufficient noise suppression. To address this, an infrared small target detection method based on the tensor nuclear norm and direction residual weighting is proposed. The infrared image is first converted into an infrared patch tensor model. Then, exploiting the low-rank nature of the background tensor and the difference in contrast between the background and the target in different directions, we design a double-neighborhood local contrast method based on direction residual weighting (DNLCDRW), combined with the partial sum of tensor nuclear norm (PSTNN), to achieve effective background suppression and recovery of infrared small targets. Experiments show that the algorithm is effective in suppressing the background and improving target detection.
Abstract: When a fire breaks out in a high-rise building, occlusion by smoke and obstacles results in a dearth of crucial information about people in distress, which makes them hard to detect. Given the restricted sensing range of a single unmanned aerial vehicle (UAV) camera, enhancing the target recognition rate becomes challenging without target information. To tackle this issue, this paper proposes a multi-agent autonomous collaborative detection method for multiple targets in complex fire environments. The objective is to fuse multi-angle visual information, increasing the target's information dimension and ultimately addressing the low target recognition rate caused by the lack of target information. The method proceeds as follows: first, You Only Look Once version 5 (YOLOv5) is used to detect the target in the image; second, the detected targets are tracked to monitor their movements and trajectories; third, a person re-identification (ReID) model is employed to extract the appearance features of targets; finally, by fusing the visual information from multi-angle cameras, the method achieves multi-agent autonomous collaborative detection. Experimental results show that the method effectively combines the visual information from multi-angle cameras, resulting in improved detection efficiency for people in distress.
Funding: Supported by the National Natural Science Foundation of China (61806024, 62206257); the Jilin Province Science and Technology Development Plan Key Research and Development Project (20210204050YY); the Wuxi University Research Start-up Fund for Introduced Talents (2023r004, 2023r006); the Jiangsu Engineering Research Center of Hyperconvergence Application and Security of IoT Devices; the Jiangsu Foreign Expert Workshop; and the Wuxi City Internet of Vehicles Key Laboratory.
Abstract: In this paper, a reasoning enhancement method based on RGCN (Relational Graph Convolutional Network) is proposed to improve the ability of UAVs (Unmanned Aerial Vehicles) to detect fast-moving military targets in urban battlefield environments. By combining military images with the publicly available VisDrone2019 dataset, a new dataset called VisMilitary was built, and multiple YOLO (You Only Look Once) models were tested on it. Because fuzzy targets lead to low-confidence predictions, the performance of traditional YOLO models on real battlefield images decreases significantly. Therefore, we propose an improved RGCN inference model, which improves performance in complex environments by optimizing the data processing and the graph network architecture. Experimental results show that the proposed method achieves an improvement of 0.4% to 1.7% in mAP@0.50, which demonstrates the effectiveness of the model in military target detection. This research provides a new technical path for UAV target detection on urban battlefields and offers useful insight for the application of deep learning in the military field.
Funding: Supported by the National Key R&D Program “Development and Application Verification of Underwater Intelligent Defect Detection Robot System for Large Hydropower Station Dams” (Project No. 2022YFB4703400), sub-topic 4 “Research on Intelligent Identification and Diagnosis of Dam Defects and Fine Inspection Equipment and Technology of Hydropower Stations” (Project No. 2022YFB4703404); supported in part by the National Natural Science Foundation of China under Grant 62371181; and in part by the Changzhou Science and Technology International Cooperation Program under Grant CZ20230029.
Abstract: Infrared small-target detection has important applications in many fields due to its high penetration capability and detection distance. This study introduces a detector called “YOLO-SDLUWD”, based on the YOLOv7 network, for small target detection in complex infrared backgrounds. “SDLUWD” refers to the combination of a Spatial Depth layer followed by Convolutional layer structure (SD-Conv), a Linear Up-sampling fusion Path Aggregation Feature Pyramid Network (LU-PAFPN), and a training strategy based on the normalized Gaussian Wasserstein Distance loss (WD-loss) function. “YOLO-SDLUWD” aims to mitigate the drop in detection accuracy that occurs when the max-pooling downsampling layer in the backbone network loses important feature information, to support the interaction and fusion of high-dimensional and low-dimensional feature information, and to overcome the false alarm predictions induced by noise in small target images. The detector achieved a mAP@0.5 of 90.4% and a mAP@0.5:0.95 of 48.5% on IRIS-AG, an increase of 9%-11% over YOLOv7-tiny, outperforming other state-of-the-art target detectors in terms of accuracy and speed.
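The normalized Gaussian Wasserstein distance behind the WD-loss can be sketched as follows. This follows the commonly used formulation in which each box `(cx, cy, w, h)` is modelled as a 2D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`; it is not taken from this paper. The function names are hypothetical, and the constant `c` is a dataset-dependent normaliser whose default here is only a placeholder.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between (cx, cy, w, h) boxes.

    c is a dataset-dependent normalising constant; 12.8 is a placeholder.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    # Map the distance into (0, 1]; identical boxes give exactly 1
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(box_a, box_b, c)


print(nwd((10, 10, 4, 4), (10, 10, 4, 4)))       # identical boxes
print(nwd_loss((10, 10, 4, 4), (13, 14, 4, 4)))  # centres 5 px apart
```

Unlike IoU, this similarity stays non-zero and smooth for tiny boxes that do not overlap at all, which is the usual motivation for using it on small targets.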
Abstract: Infrared target detection for drones commonly suffers from high missed detection rates, low signal-to-noise ratios, and blurred edge features for small targets. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a Dynamic Multi-Scale Feature Fusion and Adaptive Weighting approach is employed to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the feature expression and transmission capability for shallow small targets, thereby reducing the loss of detailed information. Then, combined with an Edge Enhancement (EE) module, the model improves the extraction of infrared small target edge features through low-frequency suppression and high-frequency enhancement strategies. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall compared with YOLOv11n, with a computational cost of only 9.1 GFLOPs. In comparison experiments, the model achieved the best balance between detection accuracy and model size, meeting the lightweight deployment requirements of drone-based systems. This method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
Funding: Supported by the National Natural Science Foundation of China (No. 52205548).
Abstract: To address the issues of unknown target size, blurred edges, background interference, and low contrast in infrared small target detection, this paper proposes a method based on density peak searching and weighted multi-feature local difference. First, an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference, thereby increasing the probability of capturing real targets in the density peak search. Second, a triple-layer window is used to extract features from the area surrounding candidate targets, addressing the uncertainty of small target sizes. By calculating multi-feature local differences between the triple-layer windows, the problems of blurred target edges and low contrast are resolved. To balance the contribution of different features, intra-class distance is used to calculate weights, and the multi-feature local differences are fused by weighting to obtain the weighted multi-feature local difference of each candidate target. The real targets are then extracted using the interquartile range. Experiments on datasets such as SIRST and IRSTD-1K show that the proposed method is suitable for various types of complex scenes and demonstrates good robustness and detection performance.
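The final extraction step can be illustrated with a minimal sketch. The abstract says only that real targets are extracted using the interquartile range, so the specific rule below (keep candidates whose score exceeds Q3 + k·IQR, with k = 1.5) is an assumption, as are the function name and the toy scores.

```python
import statistics


def iqr_targets(scores, k=1.5):
    """Return (indices of candidates above the upper IQR fence, the fence).

    The Q3 + k * IQR rule and k = 1.5 are assumptions, not the paper's values.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    # A candidate counts as a target only if it stands out from the bulk
    return [i for i, s in enumerate(scores) if s > fence], fence


# Toy weighted local-difference scores: one candidate stands out
scores = [1, 1, 1, 1, 1, 1, 1, 10]
indices, fence = iqr_targets(scores)
print(indices, fence)
```

Because the fence is derived from the candidates' own score distribution, no fixed global threshold has to be tuned per image.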
Abstract: To address low detection accuracy in near-coastal vessel target detection under complex conditions, a novel near-coastal vessel detection model based on an improved YOLOv7 architecture is proposed in this paper. The Coordinate Attention mechanism is used to improve channel attention weights and enhance the network's ability to extract small target features. In the enhanced feature extraction network, the lightweight convolution algorithm Grouped Spatial Convolution replaces MPConv to reduce the model's computational cost. EIoU Loss replaces the bounding box regression loss function in YOLOv7 to reduce the probability of missed and false detections. The performance of the improved model was verified using an enhanced dataset obtained through rainy and foggy weather simulation, and experiments were conducted on the datasets before and after the enhancement. The improved model achieved a mean average precision (mAP) of 97.45% on the original dataset, and the number of parameters was reduced by 2%. On the enhanced dataset, the mAP of the improved model reached 88.08%. Compared with seven target detection models, namely Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, YOLOv7, YOLOv8-n, and YOLOv8-s, the improved model reduces the missed and false detection rates and improves target detection accuracy. The improved model not only accurately detects vessels in complex weather environments but also outperforms other methods on the original and enhanced SeaShip datasets. This finding shows that the improved model can achieve near-coastal vessel target detection in multiple environments, laying the foundation for vessel path planning and automatic obstacle avoidance.
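The EIoU loss named in this abstract can be sketched for a single pair of boxes as follows. This is the standard published form (IoU term plus centre-distance, width, and height penalties, each normalised by the smallest enclosing box), not code from this paper; the function name, the `(x1, y1, x2, y2)` layout, and the `eps` guard are assumptions.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    # Overlap term
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)
    # Width and height of the smallest enclosing box
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    # Squared distance between the box centres
    centre = ((px1 + px2 - tx1 - tx2) / 2) ** 2 + ((py1 + py2 - ty1 - ty2) / 2) ** 2
    return (1.0 - iou
            + centre / (cw ** 2 + ch ** 2 + eps)   # centre-distance penalty
            + (pw - tw) ** 2 / (cw ** 2 + eps)     # width penalty
            + (ph - th) ** 2 / (ch ** 2 + eps))    # height penalty


print(eiou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # near zero for a perfect match
print(eiou_loss((0, 0, 2, 2), (4, 4, 6, 6)))      # above 1 for disjoint boxes
```

Penalising width and height differences directly, rather than through an aspect-ratio term, keeps the gradient informative even when predicted and target boxes share the same aspect ratio but differ in size.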
Funding: Supported by the National Natural Science Foundation of China (Nos. 32260247 and 22064010) and the Natural Science Foundation of Jiangxi Province (Nos. 20232BAB215071 and 20224BAB213009).
Abstract: Developing an accurate and visual sensing strategy for trace levels of fluoroquinolone residues, which pose a threat to food safety and human health, is highly desirable but remains challenging. Herein, a target self-calibration ratiometric fluorescent sensing platform has been designed for sensitive visual detection of levofloxacin (LEV) based on a fluorescent europium metal-organic framework (Eu-MOF) probe. Specifically, the Eu-MOF was readily synthesized by directly mixing Eu³⁺ with the 1,10-phenanthroline-2,9-dicarboxylic acid (PDA) ligand at room temperature, and it exhibited stable red fluorescence at 612 nm. Upon addition of the target LEV, significant quenching of the Eu³⁺ fluorescence was observed owing to the inner filter effect between the Eu-MOF and LEV, while the intrinsic fluorescence of LEV at 462 nm was gradually enhanced, thereby realizing a self-calibrating ratiometric fluorescence response to LEV. Through this strategy, LEV can be detected down to 27 nmol/L. Furthermore, an Eu-MOF-based test paper integrated with smartphone-assisted RGB color analysis was developed for the quantitative monitoring of LEV through multi-color changes from red to blue, thus achieving portable, convenient, and visual detection of LEV in honey and milk samples. The developed strategy could therefore provide a useful tool for practical on-site testing of food samples.
Funding: Supported in part by the Youth Innovation Promotion Association, Chinese Academy of Sciences, under Grant 2022022; in part by the South China Sea Nova project of Hainan Province under Grant NHXXRCXM202340; and in part by the Scientific Research Foundation Project of Hainan Acoustics Laboratory under Grant ZKNZ2024001.
Abstract: Underwater target detection in forward-looking sonar (FLS) images is a challenging but promising endeavor. Existing neural-network-based methods have made notable progress, but there remains room for improvement because they overlook the unique characteristics of underwater environments. Considering the low imaging resolution, complex backgrounds, and large variations in target appearance in underwater sonar images, this paper designs a sonar image target detection network based on progressive sensitivity capture, named ProNet. It progressively captures the sensitive regions in the current image where potential effective targets may exist. Guided by this basic idea, the primary technical innovation of this paper is a foundational module structure for constructing a sonar target detection backbone network. This structure employs a multi-subspace mixed convolution module that first maps sonar images into different subspaces and extracts local contextual features using varying convolutional receptive fields within these heterogeneous subspaces. Subsequently, a scale-aware aggregation module aggregates the heterogeneous features extracted from the different subspaces. Finally, a multi-scale attention structure further enhances the relational perception of the aggregated features. We evaluated ProNet on three FLS datasets of varying scenes, and experimental results indicate that ProNet outperforms the current state-of-the-art sonar image and general target detectors.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62271302 and the Shanghai Municipal Natural Science Foundation under Grant 20ZR1423500.
Abstract: Infrared images typically exhibit diverse backgrounds, each potentially containing noise and target-like interference elements. In complex backgrounds, infrared small targets are prone to being submerged by background noise due to their low pixel proportion and limited available features, leading to detection failure. To address this problem, this paper proposes an Attention Shift-Invariant Cross-Evolutionary Feature Fusion Network (ASCFNet) tailored for the detection of infrared weak and small targets. The network first introduces a Multidimensional Lightweight Pixel-level Attention Module (MLPA), which alleviates small-target feature suppression during deep network propagation by combining channel reshaping, multi-scale parallel subnet architectures, and local cross-channel interactions. Then, a Multidimensional Shift-Invariant Recall Module (MSIR) is designed to keep the network unaffected by minor input perturbations when processing infrared images, by focusing on the model's shift invariance. Subsequently, a Cross-Evolutionary Feature Fusion structure (CEFF) is designed to allow flexible and efficient integration of multidimensional feature information from different network hierarchies, thereby achieving complementarity and enhancement among features. Experimental results on three public datasets, SIRST, NUDT-SIRST, and IRST640, demonstrate that the proposed network outperforms advanced algorithms in the field. Specifically, on the NUDT-SIRST dataset, the mAP50, mAP50-95, and metrics reached 99.26%, 85.22%, and 99.31%, respectively. Visual evaluations of detection results in diverse scenarios indicate that our algorithm exhibits an increased detection rate and a reduced false alarm rate. Our method balances accuracy and real-time performance, and achieves efficient and stable detection of infrared weak and small targets.