Objective: To explore the clinical correlation between the detection of irregular red blood cell blood group antibodies and hemolytic disease of the newborn. Methods: Newborns who were examined and diagnosed with hemolytic disease at our hospital from October 2024 to October 2025 were selected as the research subjects. Based on the severity of their hemolytic disease, the infants were divided into a severe group and a mild group. All infants were tested for irregular red blood cell blood group antibodies. General information, blood types, and irregular antibody test results of the two groups were recorded. Univariate analysis was conducted, and variables that were statistically significant in the univariate analysis were included in a multivariate logistic regression analysis to explore the clinical correlation between the detection of irregular red blood cell blood group antibodies and hemolytic disease of the newborn. Results: In the univariate analysis, IgG1 and IgG3 subclass antibodies, as well as ABO blood group incompatibility, were statistically significant (p<0.05). When these factors were included in the multivariate logistic regression analysis, IgG1 (OR=2.461, 95% CI: 1.859-2.709), IgG3 (OR=2.509, 95% CI: 1.918-2.893), and ABO blood group incompatibility (OR=2.998, 95% CI: 2.149-3.493) were all positively correlated with hemolytic disease of the newborn. Conclusion: Higher IgG1 and IgG3 levels and ABO blood group incompatibility are associated with a higher incidence of hemolytic disease of the newborn, warranting clinical attention.
This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After obtaining the region of interest (ROI) in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3 °C.
Visible and near-infrared photodetectors are widely used in intelligent driving, health monitoring, and other fields. However, the application of photodetectors in the near-infrared region is significantly impacted by high dark current, which can greatly reduce their performance and sensitivity, thereby limiting their effectiveness in certain applications. In this work, the introduction of a C60 back interface layer successfully mitigated back interface reactions to decrease the thickness of the Mo(S,Se)₂ layer, tailoring the back-contact barrier and preventing reverse charge injection, resulting in a kesterite photodetector with an ultralow dark current density of 5.2×10⁻⁹ mA/cm² and ultra-weak-light detection at levels as low as 25 pW/cm². In addition, under self-powered operation, it demonstrates outstanding performance, achieving a peak responsivity of 0.68 A/W, a wide response range spanning from 300 to 1600 nm, and a detectivity of 5.27×10¹⁴ Jones. It also offers exceptionally rapid response times, with rise and decay times of 70 and 650 ns, respectively. This research offers important insights for developing high-performance self-powered near-infrared photodetectors that have high responsivity, rapid response times, and ultralow dark current.
As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective covering infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
Target detection in drone-based infrared imagery commonly suffers from high missed detection rates, low signal-to-noise ratio, and blurred edge features for small targets. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a Dynamic Multi-Scale Feature Fusion and Adaptive Weighting approach is employed to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the feature expression and transmission capability of shallow small targets, thereby reducing the loss of detailed information. Then, combined with an Edge Enhancement (EE) module, the model improves the extraction of infrared small target edge features through low-frequency suppression and high-frequency enhancement strategies. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall rate compared to YOLOv11n, with a computational cost of only 9.1 GFLOPs. In comparison experiments, the model achieved the best balance between detection accuracy and model size, meeting the lightweight deployment requirements for drone-based systems. This method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
The Global Positioning System (GPS) plays an indispensable role in the control of Unmanned Aerial Vehicles (UAVs). However, civilian GPS signals, transmitted over the air without any encryption, are vulnerable to spoofing attacks, which can steer the UAV onto deviated positions or trajectories. To counter GPS spoofing on UAV systems and to detect position/trajectory anomalies in real time, a motion state vector based stacked long short-term memory (LSTM) trajectory prediction scheme is first proposed, leveraging the temporal and spatial features of UAV kinematics. Based on the predicted results, an ensemble voting-based trajectory anomaly detection scheme is proposed to detect position anomalies in real time using the information in motion state sequences. The proposed prediction-based trajectory anomaly detection scheme outperforms existing offline detection schemes designed for fixed trajectories. Software In The Loop (SITL) based online prediction and online anomaly detection are demonstrated with random 3D flight trajectories. Results show that the coefficient of determination (R²) and Root Mean Square Error (RMSE) of the prediction scheme reach 0.996 and 3.467, respectively. The accuracy, recall, and F1-score of the proposed anomaly detection scheme reach 0.984, 0.988, and 0.983, respectively, outperforming deep ensemble learning, LSTM-based classifier, machine learning classifier, and GA-XGBoost based schemes. Moreover, compared with the LSTM-based classifier, the average detection duration (from the moment an attack starts to the moment it is detected) and distance of the proposed scheme are reduced by 24.4% and 19.5%, respectively.
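The two prediction metrics quoted in the abstract above, R² and RMSE, can be computed as in the following minimal sketch. The function names and the toy numbers are illustrative assumptions; they are not taken from the paper, which does not describe its evaluation code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Toy example (hypothetical values, one coordinate of a trajectory).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

An R² close to 1 means the predictions explain almost all of the variance in the observed values, while RMSE is expressed in the same unit as the predicted quantity.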
Infrared (IR) spectroscopy, a technique within the realm of molecular vibrational spectroscopy, furnishes distinctive chemical signatures pivotal for both structural analysis and compound identification. A notable challenge emerges from the mismatch between the mid-IR light wavelength range and molecular dimensions, culminating in a constrained absorption cross-section and diminished vibrational absorption coefficients (Supplementary data).
To address the low contrast between background and target and the insufficient noise suppression that infrared small target detection faces under complex cloud backgrounds, an infrared small target detection method based on the tensor nuclear norm and direction residual weighting is proposed. After converting the infrared image into an infrared patch tensor model, and starting from the low-rank nature of the background tensor, we exploit the difference in contrast between the background and the target in different directions to design a double-neighborhood local contrast method based on direction residual weighting (DNLCDRW), combined with the partial sum of tensor nuclear norm (PSTNN), to achieve effective background suppression and recovery of infrared small targets. Experiments show that the algorithm is effective in suppressing the background and improving target detection.
The rapid growth of online communities has brought an increase in cyber threats, including cyberbullying, hate speech, misinformation, and online harassment, making content moderation a pressing necessity. Traditional single-modal AI-based detection systems, which analyze text, images, or videos in isolation, have proved ineffective at capturing multi-modal threats, in which malicious actors spread harmful content across multiple formats. To address these challenges, we propose a multi-modal deep learning framework that integrates Natural Language Processing (NLP), Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to identify and mitigate online threats effectively. Our proposed model combines BERT for text classification, ResNet50 for image processing, and a hybrid LSTM-3D CNN network for video content analysis. We constructed a large-scale dataset comprising 500,000 textual posts, 200,000 offensive images, and 50,000 annotated videos from multiple platforms, including Twitter, Reddit, YouTube, and online gaming forums. The system was rigorously evaluated using standard machine learning metrics, including accuracy, precision, recall, F1-score, and ROC-AUC curves. Experimental results demonstrate that our multi-modal approach significantly outperforms single-modal AI classifiers, achieving an accuracy of 92.3%, precision of 91.2%, recall of 90.1%, and an AUC score of 0.95. The findings validate the necessity of integrating multi-modal AI for real-time, high-accuracy online threat detection and moderation. Future work will focus on improving adversarial robustness, enhancing scalability for real-world deployment, and addressing ethical concerns associated with AI-driven content moderation.
With the advancement of deep learning in the automotive domain, more and more researchers are focusing on autonomous driving. Among autonomous driving tasks, free space detection is particularly crucial. Currently, many model-based approaches have achieved autonomous driving on well-structured roads, but these efforts primarily focus on urban road environments. In contrast, there are fewer deep learning methods specifically designed for off-road traversable area detection, and their effectiveness is not yet satisfactory. This is because detecting traversable areas in complex outdoor environments poses significant challenges, and current methods often rely on single-image inputs, which do not align with contemporary multimodal approaches. Therefore, in this study, we propose a CFH-Net model for off-road traversable area detection. This model employs a Transformer architecture to enhance its capability of capturing global information. For multimodal feature extraction and fusion, we integrate the CM-FRM module for feature extraction and introduce the novel FFX module for feature fusion, thereby improving the perception capability of autonomous vehicles on unstructured roads. To address upsampling, we propose a new convolution precorrection method to reduce model parameters and computational complexity while enhancing the model's ability to capture complex features. Finally, we conducted experiments on the ORFD off-road dataset and achieved outstanding results.
Infrared small-target detection has important applications in many fields due to its high penetration capability and detection distance. This study introduces a detector called "YOLO-SDLUWD", which is based on the YOLOv7 network, for small target detection in complex infrared backgrounds. "SDLUWD" refers to the combination of a Spatial Depth layer followed by a Convolutional layer structure (SD-Conv), a Linear Up-sampling fusion Path Aggregation Feature Pyramid Network (LU-PAFPN), and a training strategy based on the normalized Gaussian Wasserstein Distance loss (WD-loss) function. "YOLO-SDLUWD" aims to mitigate the drop in detection accuracy that occurs when the max-pooling downsampling layer in the backbone network loses important feature information, to support the interaction and fusion of high-dimensional and low-dimensional feature information, and to overcome the false alarm predictions induced by noise in small target images. The detector achieved a mAP@0.5 of 90.4% and a mAP@0.5:0.95 of 48.5% on IRIS-AG, an increase of 9%-11% over YOLOv7-tiny, outperforming other state-of-the-art target detectors in terms of accuracy and speed.
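The normalized Gaussian Wasserstein distance named in the abstract above can be illustrated with a short sketch. It follows the commonly used formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The abstract does not state the exact variant used in YOLO-SDLUWD, so the normalizing constant `c` and the `1 - NWD` loss form below are assumptions.

```python
import math


def nwd(box_a, box_b, c=10.0):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Boxes are (cx, cy, w, h). `c` is a dataset-dependent normalizing
    constant; the default here is an illustrative placeholder only.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(box_a, box_b, c=10.0):
    """Assumed loss form: 1 - NWD, zero for identical boxes."""
    return 1.0 - nwd(box_a, box_b, c)


# Two 4x4 boxes whose centers are 5 pixels apart (hypothetical values).
print(nwd((10, 10, 4, 4), (13, 14, 4, 4)))
```

Unlike IoU, this similarity stays smooth and non-zero when two tiny boxes do not overlap at all, which is the usual motivation for applying it to small targets.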
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract low-resolution image feature information. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50% to 95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
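The MPDIoU term mentioned in the abstract above can be sketched as follows. This uses the commonly cited definition, in which the IoU is penalized by the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. Whether F-YOLOv8 uses exactly this form is an assumption, and the function names are illustrative.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two boxes given as (x1, y1, x2, y2) corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Assumed regression loss: 1 - MPDIoU."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


# Two 2x2 boxes offset by one pixel in a 10x10 image (hypothetical values).
print(mpdiou((0, 0, 2, 2), (1, 1, 3, 3), 10, 10))
```

Because the corner-distance penalty is non-zero even when the boxes do not overlap, the loss still provides a gradient toward the target in cases where plain IoU would be flat at zero.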
BACKGROUND Colorectal cancer has a high incidence and mortality rate, and the effectiveness of routine colonoscopy largely depends on the endoscopist's expertise. In recent years, computer-aided detection (CADe) systems have been increasingly integrated into colonoscopy to improve detection accuracy. However, while most studies have focused on adenoma detection rate (ADR) as the primary outcome, the more sensitive adenoma miss rate (AMR) has been less frequently analyzed. AIM To evaluate the effectiveness of CADe in colonoscopy and assess the advantages of AMR over ADR. METHODS A comprehensive literature search was conducted in PubMed, Embase, and the Cochrane Central Register of Controlled Trials using predefined search strategies to identify relevant studies published up to August 2, 2024. Statistical analyses were performed to compare outcomes between groups, and potential publication bias was assessed using funnel plots. The quality of the included studies was evaluated using the Cochrane Risk of Bias tool and the Grading of Recommendations, Assessment, Development, and Evaluation approach. RESULTS Five studies comprising 1624 patients met the inclusion criteria. AMR was significantly lower in the CADe-assisted group than in the routine colonoscopy group (147/927, 15.9% vs 345/960, 35.9%; P<0.01). However, CADe did not provide a significant advantage in detecting advanced adenomas or lesions measuring 6-9 mm or ≥10 mm. The polyp miss rate (PMR) was also lower in the CADe-assisted group [odds ratio (OR), 0.35; 95% confidence interval (CI): 0.23-0.52; P<0.01]. While the overall ADR did not differ significantly between groups, the ADR during the first-pass examination was higher in the CADe-assisted group (OR, 1.37; 95% CI: 1.10-1.69; P=0.004). The level of evidence for the included randomized controlled trials was graded as moderate. CONCLUSION CADe can significantly reduce AMR and PMR while improving ADR during initial detection, demonstrating its potential to enhance colonoscopy performance. These findings highlight the value of CADe in improving the detection of colorectal neoplasms, particularly small and histologically distinct adenomas.
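The adenoma miss rates reported in the abstract above follow directly from the stated counts, as the short sketch below shows. It also derives a crude, unpooled odds ratio from the same counts. That figure is an illustration of the arithmetic only: the abstract reports no odds ratio for AMR, and a meta-analysis would pool study-level estimates rather than raw totals. The function names are illustrative.

```python
def miss_rate(missed, total):
    """Fraction of lesions that were missed."""
    return missed / total


def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from raw counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts taken from the abstract: CADe-assisted vs routine colonoscopy.
cade_missed, cade_total = 147, 927
routine_missed, routine_total = 345, 960

print(f"CADe AMR:    {100 * miss_rate(cade_missed, cade_total):.1f}%")
print(f"Routine AMR: {100 * miss_rate(routine_missed, routine_total):.1f}%")
print(
    "Crude OR:    "
    f"{crude_odds_ratio(cade_missed, cade_total, routine_missed, routine_total):.2f}"
)
```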
In recent years, there has been a concerted effort to improve anomaly detection techniques, particularly in the context of high-dimensional, distributed clinical data. Analysing patient data within clinical settings reveals a pronounced focus on refining diagnostic accuracy, personalising treatment plans, and optimising resource allocation to enhance clinical outcomes. Nonetheless, this domain faces unique challenges, such as irregular data collection, inconsistent data quality, and patient-specific structural variations. This paper proposes a novel hybrid approach that integrates heuristic and stochastic methods for anomaly detection in patient clinical data to address these challenges. The strategy combines HPO-based optimal Density-Based Spatial Clustering of Applications with Noise for clustering patient exercise data, facilitating efficient anomaly identification. Subsequently, a stochastic method based on the Interquartile Range filters unreliable data points, ensuring that medical tools and professionals receive only the most pertinent and accurate information. The primary objective of this study is to equip healthcare professionals and researchers with a robust tool for managing extensive, high-dimensional clinical datasets, enabling effective isolation and removal of aberrant data points. Furthermore, a sophisticated regression model has been developed using Automated Machine Learning (AutoML) to assess the impact of the ensemble abnormal pattern detection approach. Various statistical error estimation techniques validate the efficacy of the hybrid approach alongside AutoML. Experimental results show that implementing this innovative hybrid model on patient rehabilitation data leads to a notable enhancement in AutoML performance, with an average improvement of 0.041 in the R² score, surpassing the effectiveness of traditional regression models.
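The Interquartile Range filtering step described in the abstract above can be illustrated with a minimal sketch. It applies the conventional rule of discarding points outside [Q1 - k·IQR, Q3 + k·IQR] with k = 1.5. The abstract does not give the multiplier or the quantile method the authors used, so both are assumptions, as are the function name and the sample values.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    `k=1.5` is the conventional choice and is an assumption here.
    Requires at least two data points.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical measurements with one obviously aberrant reading.
readings = [1, 2, 3, 4, 5, 100]
print(iqr_filter(readings))
```

In a pipeline like the one the abstract outlines, such a filter would run on each feature (or on each cluster produced by the clustering step) before the cleaned data is passed to the regression model.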
Two-dimensional perovskite ferroelectrics, which strongly couple ferroelectricity with semiconducting properties, are promising candidates for optoelectronic applications. However, it is still a great challenge to fabricate self-powered broadband photodetectors with a low detection limit. Herein, we successfully realized self-powered broadband photodetection with a low detection limit by using a trilayered perovskite ferroelectric (BA)₂EA₂Pb₃I₁₀ (1, BA = n-butylamine, EA = ethylamine). Owing to its large spontaneous polarization (5.6 μC/cm²), 1 exhibits an open-circuit voltage of 0.25 V, which provides the driving force to separate carriers. Combined with its low dark current (~10⁻¹⁴ A) and narrow bandgap (Eg = 1.86 eV), 1 demonstrates great potential for detecting weak broadband light. Thus, a prominent photodetection performance with a high on/off ratio (~10⁵), outstanding responsivity (>10 mA/W), and promising detectivity (>10¹¹ Jones), as well as a low detection limit (~nW/cm²) over the wide wavelength range from 377 nm to 637 nm, was realized based on a single crystal of 1. This work demonstrates the great potential of 2D perovskite ferroelectrics for self-powered broadband photodetectors.
Roads inevitably develop defects during use, which not only seriously affect their service life but also pose a hidden danger to traffic safety. Existing algorithms for detecting road defects are unsatisfactory in terms of accuracy and generalization, so this paper proposes an algorithm based on YOLOv11. First, the method embeds wavelet transform convolution (WTConv) into the backbone's C3k2 module to enhance low-frequency feature extraction while avoiding parameter bloat. Second, a novel multi-scale fusion diffusion network (MFDN) architecture is designed for the neck to strengthen cross-scale feature interactions, boosting detection precision. In terms of model optimization, the traditional downsampling method is discarded, and the Adown (adaptive downsampling) technique is adopted, which reduces the parameter count while effectively mitigating information loss during downsampling. Finally, we propose Wise-PIDIoU by combining WiseIoU and MPDIoU to minimize the negative impact of low-quality anchor boxes and enhance the detection capability of the model. The experimental results indicate that the proposed algorithm achieves an average detection accuracy of 86.5% for mAP@50 on the RDD2022 dataset, which is 2% higher than the original algorithm, while the amount of computation remains basically unchanged. The number of parameters is reduced by 17%, and the F1 score is improved by 3%, showing better detection performance than other algorithms when facing different types of defects. The excellent performance on embedded devices shows that the algorithm also has favorable application prospects in practical inspection.
Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Missile-borne short-range infrared detection (SIRD) technology is commonly used in military ground target detection. In complex battlefield environments, achieving precise strikes on ground targets is a challenging task. However, real battlefield data is limited, and equivalent experiments are costly. Currently, there is a lack of comprehensive physical modeling and numerical simulation methods for SIRD. To this end, this study proposes a SIRD simulation framework incorporating full-link physical response, which is integrated through the radiative transfer layer, the sensor response layer, and the model-driven layer. In the radiative transfer layer, a coupled dynamic detection model is established to describe the external optical channel response of the SIRD system by combining the infrared radiation model and the geometric measurement model. In the sensor response layer, considering photoelectric conversion and signal processing, the internal signal response model of the SIRD system is established by a hybrid mode of parametric modeling and analog circuit analysis. In the model-driven layer, a co-simulation application based on a three-dimensional virtual environment is proposed to drive the full-link physical model, and a parallel ray tracing method is employed for real-time synchronous simulation. The proposed simulation framework can provide pixel-level signal output and is verified against measured data. The evaluation results of the root mean square error (RMSE) and the Pearson correlation coefficient (PCC) show that the simulated data and the measured data achieve good consistency, and the evaluation results of the waveform eigenvalues indicate that the simulated signals exhibit low errors compared to the measured signals. The proposed simulation framework has the potential to acquire large sample datasets of SIRD under various complex battlefield environments and can provide an effective data source for SIRD application research.
Infrared images typically exhibit diverse backgrounds, each potentially containing noise and target-like interference elements. In complex backgrounds, infrared small targets are prone to be submerged by background noise due to their low pixel proportion and limited available features, leading to detection failure. To address this problem, this paper proposes an Attention Shift-Invariant Cross-Evolutionary Feature Fusion Network (ASCFNet) tailored for the detection of infrared weak and small targets. The network architecture first designs a Multidimensional Lightweight Pixel-level Attention Module (MLPA), which alleviates the issue of small-target feature suppression during deep network propagation by combining channel reshaping, multi-scale parallel subnet architectures, and local cross-channel interactions. Then, a Multidimensional Shift-Invariant Recall Module (MSIR) is designed to ensure the network remains unaffected by minor input perturbations when processing infrared images, by focusing on the model's shift invariance. Subsequently, a Cross-Evolutionary Feature Fusion structure (CEFF) is designed to allow flexible and efficient integration of multidimensional feature information from different network hierarchies, thereby achieving complementarity and enhancement among features. Experimental results on three public datasets, SIRST, NUDT-SIRST, and IRST640, demonstrate that our proposed network outperforms advanced algorithms in the field. Specifically, on the NUDT-SIRST dataset, the mAP50, mAP50-95, and metrics reached 99.26%, 85.22%, and 99.31%, respectively. Visual evaluations of detection results in diverse scenarios indicate that our algorithm exhibits an increased detection rate and reduced false alarm rate. Our method balances accuracy and real-time performance, and achieves efficient and stable detection of infrared weak and small targets.
People with visual impairments face substantial navigation difficulties in residential and unfamiliar indoor spaces.Neither canes nor verbal navigation systems possess adequate features to deliver real-time spatial aw...People with visual impairments face substantial navigation difficulties in residential and unfamiliar indoor spaces.Neither canes nor verbal navigation systems possess adequate features to deliver real-time spatial awareness to users.This research work represents a feasibility study for the wearable IoT-based indoor object detection assistant system architecture that employs a real-time indoor object detection approach to help visually impaired users recognize indoor objects.The system architecture includes four main layers:Wearable Internet of Things(IoT),Network,Cloud,and Indoor Object Detection Layers.The wearable hardware prototype is assembled using a Raspberry Pi 4,while the indoor object detection approach exploits YOLOv11.YOLOv11 represents the cutting edge of deep learning models optimized for both speed and accuracy in recognizing objects and powers the research prototype.In this work,we used a prototype implementation,comparative experiments,and two datasets compiled from Furniture Detection(i.e.,from Roboflow Universe)and Kaggle,which comprises 3000 images evenly distributed across three object categories,including bed,sofa,and table.In the evaluation process,the Raspberry Pi is only used for a feasibility demonstration of real-time inference performance(e.g.,latency and memory consumption)on embedded hardware.We also evaluated YOLOv11 by comparing its performance with other current methodologies,which involved a Convolutional Neural Network(CNN)(MobileNet-Single Shot MultiBox Detector(SSD))model together with the RTDETR Vision Transformer.The experimental results show that YOLOv11 stands out by reaching an average of 99.07%,98.51%,97.96%,and 98.22%for the accuracy,precision,recall,and F1-score,respectively.This feasibility study 
highlights the effectiveness of Raspberry Pi 4 and YOLOv11 in real-time indoor object detection,paving the way for structured user studies with visually impaired people in the future to evaluate their real-world use and impact.展开更多
Abstract: Objective: To explore the clinical correlation between the detection of irregular antibodies in red blood cell blood groups and hemolytic disease of the newborn. Methods: Newborns who were examined and diagnosed with hemolytic disease at our hospital from October 2024 to October 2025 were selected as the research subjects. Based on the severity of their hemolytic disease, the infants were divided into a severe group and a mild group. All infants were tested for irregular antibodies in their red blood cell blood groups. General information, blood types, and irregular antibody test results of the two groups were recorded. Univariate analysis was conducted, and variables that were statistically significant in the univariate analysis were included in a multivariate logistic regression analysis to explore the clinical correlation between the detection of irregular antibodies in red blood cell blood groups and hemolytic disease of the newborn. Results: Univariate analysis showed that IgG1 and IgG3 subclass antibodies, as well as ABO blood group incompatibility, were statistically significant (p<0.05). When these factors were included in a multivariate logistic regression analysis, IgG1 (OR=2.461, 95% CI: 1.859-2.709), IgG3 (OR=2.509, 95% CI: 1.918-2.893), and ABO blood group incompatibility (OR=2.998, 95% CI: 2.149-3.493) all exhibited a positive correlation with hemolytic disease of the newborn. Conclusion: Elevated IgG1 and IgG3 levels and ABO blood group incompatibility are associated with a higher incidence of hemolytic disease of the newborn, warranting clinical attention.
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the Natural Science Foundation of Jiangsu Province (BK20241224).
Abstract: This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After the region of interest (ROI) is obtained in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3℃.
Funding: Supported by the National Natural Science Foundation of China (No. 52472225) and the Science and Technology Plan Project of Shenzhen (No. 20220808165025003), China.
Abstract: Visible and near-infrared photodetectors are widely used in intelligent driving, health monitoring, and other fields. However, the application of photodetectors in the near-infrared region is significantly impacted by high dark current, which can greatly reduce their performance and sensitivity, thereby limiting their effectiveness in certain applications. In this work, the introduction of a C60 back interface layer successfully mitigated back interface reactions and decreased the thickness of the Mo(S,Se)_(2) layer, tailoring the back-contact barrier and preventing reverse charge injection. The result is a kesterite photodetector with an ultralow dark current density of 5.2×10^(-9) mA/cm^(2) and ultra-weak-light detection at levels as low as 25 pW/cm^(2). In addition, under self-powered operation, it demonstrates outstanding performance, achieving a peak responsivity of 0.68 A/W, a wide response range spanning from 300 to 1600 nm, and an impressive detectivity of 5.27×10^(14) Jones. It also offers exceptionally rapid response times, with rise and decay times of 70 and 650 ns, respectively. This research offers important insights for developing high-performance self-powered near-infrared photodetectors that have high responsivity, rapid response times, and ultralow dark current.
Funding: Supported by the Science and Technology Project of State Grid Corporation of China (52094024003D).
Abstract: As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective that covers infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
Abstract: Target detection under infrared conditions for drones commonly suffers from high missed detection rates, low signal-to-noise ratio, and blurred edge features for small targets. To address these challenges, this paper proposes an improved detection algorithm based on YOLOv11n. First, a Dynamic Multi-Scale Feature Fusion and Adaptive Weighting approach is employed to design an Adaptive Focused Diffusion Pyramid Network (AFDPN), which enhances the feature expression and transmission capability of shallow small targets, thereby reducing the loss of detailed information. Then, combined with an Edge Enhancement (EE) module, the model improves the extraction of infrared small target edge features through low-frequency suppression and high-frequency enhancement strategies. Experimental results on the publicly available HIT-UAV dataset show that the improved model achieves a 3.8% increase in average detection accuracy and a 3.0% improvement in recall rate compared to YOLOv11n, with a computational cost of only 9.1 GFLOPs. In comparison experiments, the model achieved the best balance between detection accuracy and model size, meeting the lightweight deployment requirements for drone-based systems. This method provides a high-precision, lightweight solution for small target detection in drone-based infrared imagery.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62271076) and in part by the Fundamental Research Funds for the Central Universities, China (No. 2242022k60006).
Abstract: The Global Positioning System (GPS) plays an indispensable role in the control of Unmanned Aerial Vehicles (UAVs). However, civilian GPS signals, transmitted over the air without any encryption, are vulnerable to spoofing attacks, which can steer the UAV onto deviated positions or trajectories. To counter GPS spoofing on UAV systems and to detect position/trajectory anomalies in real time, a motion state vector based stacked long short-term memory trajectory prediction scheme is first proposed, leveraging the temporal and spatial features of UAV kinematics. Based on the predicted results, an ensemble voting-based trajectory anomaly detection scheme is proposed to detect position anomalies in real time using the information in motion state sequences. The proposed prediction-based trajectory anomaly detection scheme outperforms the existing offline detection schemes designed for fixed trajectories. Software-in-the-loop (SITL) based online prediction and online anomaly detection are demonstrated with random 3D flight trajectories. Results show that the coefficient of determination (R^(2)) and Root Mean Square Error (RMSE) of the prediction scheme can reach 0.996 and 3.467, respectively. The accuracy, recall, and F1-score of the proposed anomaly detection scheme can reach 0.984, 0.988, and 0.983, respectively, which outperform deep ensemble learning, LSTM-based classifier, machine learning classifier, and GA-XGBoost based schemes. Moreover, results show that, compared with the LSTM-based classifier, the average duration (from the moment an attack starts to the moment it is detected) and distance of the proposed scheme are reduced by 24.4% and 19.5%, respectively.
Funding: Supported by the National Natural Science Foundation of China (Grant No.: 32301161), the Natural Scientific Foundation of Hunan Province, China (Grant No.: 2023JJ60052), the Scientific Research Project of Hunan Provincial Health Commission, China (Grant No.: 202112062218, 20190161), the Scientific Research Project of Hunan Provincial Department of Education, China (Grant No.: 22B0455), the Clinical "4310" Project of the University of South China, China (Grant No.: 20224310NHYCG02), and the Doctoral Scientific Research Foundation of University of South China, China (Grant No.: 200XQD042).
Abstract: Infrared (IR) spectroscopy, a technique within the realm of molecular vibrational spectroscopy, furnishes distinctive chemical signatures pivotal for both structural analysis and compound identification. A notable challenge arises from the mismatch between the mid-IR wavelength range and molecular dimensions, which results in a small absorption cross-section and low vibrational absorption coefficients (Supplementary data).
Funding: Supported by the Key Laboratory Fund for Equipment Pre-Research (6142207210202).
Abstract: Infrared small target detection under complex cloud backgrounds suffers from low contrast between the background and the target and from insufficient noise suppression. To address this problem, an infrared small target detection method based on the tensor nuclear norm and direction residual weighting is proposed. The infrared image is first converted into an infrared patch tensor model. Then, starting from the low-rank nature of the background tensor and exploiting the difference in contrast between the background and the target in different directions, a double-neighborhood local contrast method based on direction residual weighting (DNLCDRW) is designed and combined with the partial sum of tensor nuclear norm (PSTNN) to achieve effective background suppression and recovery of infrared small targets. Experiments show that the algorithm is effective in suppressing the background and improving the detection ability of the target.
Abstract: The rapid growth of online communities has brought an increase in cyber threats, including cyberbullying, hate speech, misinformation, and online harassment, making content moderation a pressing necessity. Traditional single-modal AI-based detection systems, which analyze text, images, or videos in isolation, have proven ineffective at capturing multi-modal threats, in which malicious actors spread harmful content across multiple formats. To address these challenges, we propose a multi-modal deep learning framework that integrates Natural Language Processing (NLP), Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to identify and mitigate online threats effectively. Our proposed model combines BERT for text classification, ResNet50 for image processing, and a hybrid LSTM-3D CNN network for video content analysis. We constructed a large-scale dataset comprising 500,000 textual posts, 200,000 offensive images, and 50,000 annotated videos from multiple platforms, including Twitter, Reddit, YouTube, and online gaming forums. The system was rigorously evaluated using standard machine learning metrics, including accuracy, precision, recall, F1-score, and ROC-AUC curves. Experimental results demonstrate that our multi-modal approach significantly outperforms single-modal AI classifiers, achieving an accuracy of 92.3%, precision of 91.2%, recall of 90.1%, and an AUC score of 0.95. The findings validate the necessity of integrating multi-modal AI for real-time, high-accuracy online threat detection and moderation. Future work will focus on improving adversarial robustness, enhancing scalability for real-world deployment, and addressing ethical concerns associated with AI-driven content moderation.
Abstract: With the advancement of deep learning in the automotive domain, more and more researchers are focusing on autonomous driving. Among the tasks involved, free space detection is particularly crucial. Currently, many model-based approaches have achieved autonomous driving on well-structured urban roads, but these efforts primarily focus on urban road environments. In contrast, there are fewer deep learning methods specifically designed for off-road traversable area detection, and their effectiveness is not yet satisfactory. This is because detecting traversable areas in complex outdoor environments poses significant challenges, and current methods often rely on single-image inputs, which do not align with contemporary multimodal approaches. Therefore, in this study, we propose a CFH-Net model for off-road traversable area detection. This model employs a Transformer architecture to enhance its capability of capturing global information. For multimodal feature extraction and fusion, we integrate the CM-FRM module for feature extraction and introduce the novel FFX module for feature fusion, thereby improving the perception capability of autonomous vehicles on unstructured roads. To address upsampling, we propose a new convolution precorrection method to reduce model parameters and computational complexity while enhancing the model's ability to capture complex features. Finally, we conducted experiments on the ORFD off-road dataset and achieved outstanding results.
Funding: Supported by the National Key R&D Program "Development and Application Verification of Underwater Intelligent Defect Detection Robot System for Large Hydropower Station Dams" (Project No. 2022YFB4703400), sub-topic 4 "Research on Intelligent Identification and Diagnosis of Dam Defects and Fine Inspection Equipment and Technology of Hydropower Stations" (Project No. 2022YFB4703404); supported in part by the National Natural Science Foundation of China under Grant 62371181 and in part by the Changzhou Science and Technology International Cooperation Program under Grant CZ20230029.
Abstract: Infrared small-target detection has important applications in many fields due to its high penetration capability and detection distance. This study introduces a detector called "YOLO-SDLUWD", which is based on the YOLOv7 network, for small target detection in complex infrared backgrounds. "SDLUWD" refers to the combination of a Spatial Depth layer followed by Convolutional layer structure (SD-Conv), a Linear Up-sampling fusion Path Aggregation Feature Pyramid Network (LU-PAFPN), and a training strategy based on the normalized Gaussian Wasserstein Distance loss (WD-loss) function. "YOLO-SDLUWD" aims to reduce the loss of detection accuracy that occurs when the max-pooling downsampling layer in the backbone network discards important feature information, to support the interaction and fusion of high-dimensional and low-dimensional feature information, and to overcome the false alarm predictions induced by noise in small target images. The detector achieved a mAP@0.5 of 90.4% and a mAP@0.5:0.95 of 48.5% on IRIS-AG, an increase of 9%-11% over YOLOv7-tiny, outperforming other state-of-the-art target detectors in terms of accuracy and speed.
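The abstract above names a normalized Gaussian Wasserstein Distance loss but does not spell it out. A minimal sketch of the commonly published form of that measure follows, assuming boxes are given as (cx, cy, w, h) and modeled as 2D Gaussians; the function names, the constant `c`, and the `1 - NWD` loss form are assumptions for illustration and are not taken from the YOLO-SDLUWD paper itself.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Boxes are (cx, cy, w, h). Each box is modeled as a 2D Gaussian with
    mean (cx, cy) and standard deviations (w / 2, h / 2). The constant c
    is dataset-dependent; 12.8 is only an illustrative value (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def wd_loss(box_a, box_b, c=12.8):
    """Loss form assumed here: 1 - NWD, so identical boxes give zero loss."""
    return 1.0 - nwd(box_a, box_b, c)


if __name__ == "__main__":
    pred = (104.0, 100.0, 6.0, 6.0)
    target = (100.0, 100.0, 6.0, 6.0)
    print(f"NWD = {nwd(pred, target):.4f}, loss = {wd_loss(pred, target):.4f}")
```

Unlike IoU, this similarity stays non-zero and smooth for tiny boxes that do not overlap, which is the usual motivation for using it on small targets.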
Funding: Supported by the National Natural Science Foundation of China (No. 62103298).
Abstract: To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values and better extracts feature information from low-resolution images. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, a loss function based on intersection over union with minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
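As a rough illustration of the MPDIoU bounding-box measure mentioned in the abstract above, the sketch below follows the commonly published definition: IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared image diagonal. The function names and the `1 - MPDIoU` loss form are assumptions for illustration; the abstract does not give the exact formulation used in F-YOLOv8.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two boxes given as (x1, y1, x2, y2) in pixels."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss form assumed here: 1 - MPDIoU."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


if __name__ == "__main__":
    pred = (20.0, 20.0, 60.0, 60.0)
    target = (10.0, 10.0, 50.0, 50.0)
    print(f"MPDIoU = {mpdiou(pred, target, 640, 512):.4f}")
```

Because the corner-distance terms stay informative even when two boxes share the same IoU, the measure can still separate candidate boxes that plain IoU would rank equally.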
Abstract: BACKGROUND: Colorectal cancer has a high incidence and mortality rate, and the effectiveness of routine colonoscopy largely depends on the endoscopist's expertise. In recent years, computer-aided detection (CADe) systems have been increasingly integrated into colonoscopy to improve detection accuracy. However, while most studies have focused on adenoma detection rate (ADR) as the primary outcome, the more sensitive adenoma miss rate (AMR) has been less frequently analyzed. AIM: To evaluate the effectiveness of CADe in colonoscopy and assess the advantages of AMR over ADR. METHODS: A comprehensive literature search was conducted in PubMed, Embase, and the Cochrane Central Register of Controlled Trials using predefined search strategies to identify relevant studies published up to August 2, 2024. Statistical analyses were performed to compare outcomes between groups, and potential publication bias was assessed using funnel plots. The quality of the included studies was evaluated using the Cochrane Risk of Bias tool and the Grading of Recommendations, Assessment, Development, and Evaluation approach. RESULTS: Five studies comprising 1624 patients met the inclusion criteria. AMR was significantly lower in the CADe-assisted group than in the routine colonoscopy group (147/927, 15.9% vs 345/960, 35.9%; P<0.01). However, CADe did not provide a significant advantage in detecting advanced adenomas or lesions measuring 6-9 mm or ≥10 mm. The polyp miss rate (PMR) was also lower in the CADe-assisted group [odds ratio (OR), 0.35; 95% confidence interval (CI): 0.23-0.52; P<0.01]. While the overall ADR did not differ significantly between groups, the ADR during the first-pass examination was higher in the CADe-assisted group (OR, 1.37; 95% CI: 1.10-1.69; P=0.004). The level of evidence for the included randomized controlled trials was graded as moderate. CONCLUSION: CADe can significantly reduce AMR and PMR while improving ADR during initial detection, demonstrating its potential to enhance colonoscopy performance. These findings highlight the value of CADe in improving the detection of colorectal neoplasms, particularly small and histologically distinct adenomas.
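The adenoma miss rates quoted in the abstract above can be reproduced directly from the reported counts. The sketch below does that arithmetic and also computes a crude, unadjusted odds ratio from the same pooled counts. The helper names are hypothetical, and the crude ratio is only an illustration: it is not the meta-analytic estimate, and the OR of 0.35 in the abstract refers to the polyp miss rate rather than to these adenoma counts.

```python
def miss_rate(missed, total):
    """Fraction of lesions that were missed."""
    return missed / total


def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from pooled counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts as reported in the abstract: missed adenomas / total adenomas.
cade_amr = miss_rate(147, 927)      # CADe-assisted group
routine_amr = miss_rate(345, 960)   # routine colonoscopy group
pooled_or = crude_odds_ratio(147, 927, 345, 960)

if __name__ == "__main__":
    print(f"CADe AMR:    {cade_amr:.1%}")
    print(f"Routine AMR: {routine_amr:.1%}")
    print(f"Crude pooled OR (illustrative only): {pooled_or:.2f}")
```

A proper meta-analysis would weight each study separately (for example with a fixed- or random-effects model) instead of pooling raw counts as done here.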
Abstract: In recent years, there has been a concerted effort to improve anomaly detection techniques, particularly in the context of high-dimensional, distributed clinical data. Analysing patient data within clinical settings reveals a pronounced focus on refining diagnostic accuracy, personalising treatment plans, and optimising resource allocation to enhance clinical outcomes. Nonetheless, this domain faces unique challenges, such as irregular data collection, inconsistent data quality, and patient-specific structural variations. This paper proposes a novel hybrid approach that integrates heuristic and stochastic methods for anomaly detection in patient clinical data to address these challenges. The strategy combines HPO-based optimal Density-Based Spatial Clustering of Applications with Noise for clustering patient exercise data, facilitating efficient anomaly identification. Subsequently, a stochastic method based on the Interquartile Range filters unreliable data points, ensuring that medical tools and professionals receive only the most pertinent and accurate information. The primary objective of this study is to equip healthcare professionals and researchers with a robust tool for managing extensive, high-dimensional clinical datasets, enabling effective isolation and removal of aberrant data points. Furthermore, a sophisticated regression model has been developed using Automated Machine Learning (AutoML) to assess the impact of the ensemble abnormal pattern detection approach. Various statistical error estimation techniques validate the efficacy of the hybrid approach alongside AutoML. Experimental results show that implementing this innovative hybrid model on patient rehabilitation data leads to a notable enhancement in AutoML performance, with an average improvement of 0.041 in the R^(2) score, surpassing the effectiveness of traditional regression models.
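The Interquartile Range step described in the abstract above can be illustrated with the textbook IQR rule. This is a minimal sketch, assuming the conventional 1.5 × IQR fences and inclusive quartiles; the function name, the multiplier, and the sample readings are all hypothetical, and the paper's actual filter may differ.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Split values into (kept, rejected) using the IQR fence rule.

    A value is rejected when it lies below Q1 - k * IQR or above
    Q3 + k * IQR. The multiplier k = 1.5 is the conventional default
    and is an assumption here.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    rejected = [v for v in values if v < low or v > high]
    return kept, rejected


if __name__ == "__main__":
    # Hypothetical exercise readings with one implausible spike.
    readings = [10, 12, 11, 13, 12, 11, 95]
    kept, rejected = iqr_filter(readings)
    print("kept:", kept)
    print("rejected:", rejected)
```

In a pipeline like the one the abstract describes, a filter of this kind would run after clustering, so that the fences are computed within each cluster and not over the whole dataset.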
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 22435005, 22193042, 21921001, 22305105, 52202194, 22201284), the Natural Science Foundation of Jiangxi Province (No. 20224BAB213003), the Natural Science Foundation of Fujian Province (No. 2023J05076), and the Jiangxi Provincial Education Department Science and Technology Research Foundation (No. GJJ2200384).
Abstract: Two-dimensional perovskite ferroelectrics, which strongly couple ferroelectricity with semiconducting properties, are promising candidates for optoelectronic applications. However, it is still a great challenge to fabricate self-powered broadband photodetectors with a low detection limit. Herein, we successfully realized self-powered broadband photodetection with a low detection limit by using a trilayered perovskite ferroelectric, (BA)_(2)EA_(2)Pb_(3)I_(10) (1, BA=n-butylamine, EA=ethylamine). Owing to its large spontaneous polarization (5.6 μC/cm^(2)), 1 exhibits an open-circuit voltage of 0.25 V, which provides the driving force to separate carriers. Combined with its low dark current (~10^(-14) A) and narrow bandgap (Eg=1.86 eV), 1 demonstrates great potential for detecting broadband weak light. Thus, a prominent photodetection performance with a high on/off ratio (~10^(5)), outstanding responsivity (>10 mA/W), and promising detectivity (>10^(11) Jones), as well as a low detection limit (~nW/cm^(2)) across a wide wavelength range from 377 nm to 637 nm, was realized based on a single crystal of 1. This work demonstrates the great potential of 2D perovskite ferroelectrics for self-powered broadband photodetectors.
Abstract: Roads inevitably develop defects during use, which not only seriously affect their service life but also pose a hidden danger to traffic safety. Existing algorithms for detecting road defects are unsatisfactory in terms of accuracy and generalization, so this paper proposes an algorithm based on YOLOv11. First, the method embeds wavelet transform convolution (WTConv) into the backbone's C3k2 module to enhance low-frequency feature extraction while avoiding parameter bloat. Second, a novel multi-scale fusion diffusion network (MFDN) architecture is designed for the neck to strengthen cross-scale feature interactions, boosting detection precision. In terms of model optimization, the traditional downsampling method is discarded and the Adown (adaptive downsampling) technique is adopted, which reduces the number of parameters while effectively mitigating information loss during downsampling. Finally, we propose Wise-PIDIoU by combining WiseIoU and MPDIoU to minimize the negative impact of low-quality anchor boxes and enhance the detection capability of the model. The experimental results indicate that the proposed algorithm achieves an average detection accuracy of 86.5% for mAP@50 on the RDD2022 dataset, which is 2% higher than the original algorithm, while the amount of computation remains basically unchanged. The number of parameters is reduced by 17%, and the F1 score is improved by 3%, showing better detection performance than other algorithms across different types of defects. The excellent performance on embedded devices shows that the algorithm also has favorable application prospects in practical inspection.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62302086), the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070), and the Fundamental Research Funds for the Central Universities (Grant No. N2317005).
Abstract: Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Funding: Supported by the Foundation of Equipment Preresearch Area (Grant No. 80919010303).
Abstract: Missile-borne short-range infrared detection (SIRD) technology is commonly used in military ground target detection. In complex battlefield environments, achieving a precise strike on a ground target is a challenging task. However, real battlefield data is limited, and equivalent experiments are costly. Currently, there is a lack of comprehensive physical modeling and numerical simulation methods for SIRD. To this end, this study proposes a SIRD simulation framework incorporating full-link physical response, which is integrated through the radiative transfer layer, the sensor response layer, and the model-driven layer. In the radiative transfer layer, a coupled dynamic detection model is established to describe the external optical channel response of the SIRD system by combining the infrared radiation model and the geometric measurement model. In the sensor response layer, considering photoelectric conversion and signal processing, the internal signal response model of the SIRD system is established by a hybrid mode of parametric modeling and analog circuit analysis. In the model-driven layer, a co-simulation application based on a three-dimensional virtual environment is proposed to drive the full-link physical model, and a parallel ray tracing method is employed for real-time synchronous simulation. The proposed simulation framework can provide pixel-level signal output and is verified against measured data. The root mean square error (RMSE) and the Pearson correlation coefficient (PCC) show that the simulated data and the measured data achieve good consistency, and the evaluation of the waveform eigenvalues indicates that the simulated signals exhibit low errors compared to the measured signals. The proposed simulation framework has the potential to acquire large sample datasets of SIRD under various complex battlefield environments and can provide an effective data source for SIRD application research.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62271302 and in part by the Shanghai Municipal Natural Science Foundation under Grant 20ZR1423500.
Abstract: Infrared images typically exhibit diverse backgrounds, each potentially containing noise and target-like interference elements. In complex backgrounds, infrared small targets are prone to being submerged by background noise due to their low pixel proportion and limited available features, leading to detection failure. To address this problem, this paper proposes an Attention Shift-Invariant Cross-Evolutionary Feature Fusion Network (ASCFNet) tailored for the detection of infrared weak and small targets. The network first introduces a Multidimensional Lightweight Pixel-level Attention Module (MLPA), which alleviates the issue of small-target feature suppression during deep network propagation by combining channel reshaping, multi-scale parallel subnet architectures, and local cross-channel interactions. Then, a Multidimensional Shift-Invariant Recall Module (MSIR) is designed to keep the network unaffected by minor input perturbations when processing infrared images by focusing on the model's shift invariance. Subsequently, a Cross-Evolutionary Feature Fusion structure (CEFF) is designed to allow flexible and efficient integration of multidimensional feature information from different network hierarchies, thereby achieving complementarity and enhancement among features. Experimental results on three public datasets, SIRST, NUDT-SIRST, and IRST640, demonstrate that our proposed network outperforms advanced algorithms in the field. Specifically, on the NUDT-SIRST dataset, the mAP50, mAP50-95, and metrics reached 99.26%, 85.22%, and 99.31%, respectively. Visual evaluations of detection results in diverse scenarios indicate that our algorithm exhibits an increased detection rate and reduced false alarm rate. Our method balances accuracy and real-time performance, and achieves efficient and stable detection of infrared weak and small targets.
Funding: Funded by the King Salman Center for Disability Research through Research Group No. KSRG-2024-140.
Abstract: People with visual impairments face substantial navigation difficulties in residential and unfamiliar indoor spaces. Neither canes nor verbal navigation systems provide adequate real-time spatial awareness. This work presents a feasibility study of a wearable IoT-based indoor object detection assistant architecture that uses real-time indoor object detection to help visually impaired users recognize indoor objects. The system architecture includes four main layers: Wearable Internet of Things (IoT), Network, Cloud, and Indoor Object Detection. The wearable hardware prototype is assembled using a Raspberry Pi 4, while the indoor object detection approach uses YOLOv11, a recent deep learning model optimized for both speed and accuracy in recognizing objects, which powers the research prototype. The study used a prototype implementation, comparative experiments, and two datasets compiled from Furniture Detection (i.e., from Roboflow Universe) and Kaggle, which together comprise 3000 images evenly distributed across three object categories: bed, sofa, and table. In the evaluation, the Raspberry Pi is used only for a feasibility demonstration of real-time inference performance (e.g., latency and memory consumption) on embedded hardware. YOLOv11 was also compared with other current methods, namely a Convolutional Neural Network (CNN) model (MobileNet-Single Shot MultiBox Detector (SSD)) and the RTDETR Vision Transformer. The experimental results show that YOLOv11 stands out, reaching averages of 99.07%, 98.51%, 97.96%, and 98.22% for accuracy, precision, recall, and F1-score, respectively. This feasibility study highlights the effectiveness of Raspberry Pi 4 and YOLOv11 in real-time indoor object detection, paving the way for future structured user studies with visually impaired people to evaluate real-world use and impact.
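As a reading aid for the reported precision, recall, and F1-score, the following minimal Python sketch shows the standard definition of F1 as the harmonic mean of precision and recall. The function name `f1_score` is hypothetical and is not taken from the paper. Applying it to the averaged precision and recall gives a value close to, but not identical with, the reported 98.22%; the small gap would be consistent with the authors averaging per-class F1-scores, though that is an assumption, not something the abstract states.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Both inputs must use the same scale (here, percentages).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


# Averaged precision and recall as reported in the abstract (percent).
f1 = f1_score(98.51, 97.96)
print(f"F1 from averaged precision and recall: {f1:.2f}%")
```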