Adding colors to monochrome thermal infrared images can help observers understand the scenery better. A nonlinear color estimation method for single-band thermal infrared imagery based on kernel principal component analysis (KPCA) and sparse representation is proposed. Nonlinear features of the infrared image are extracted using KPCA. The relationship between image features and chromatic values is learned using sparse representation, yielding a color estimation model. Thermal infrared images can then be rendered automatically using this model. The experimental results show that the proposed method can render infrared images with an accurate color appearance. The proposed idea can also be applied to other color estimation problems.
Lead sulfide (PbS) colloidal quantum dot (CQD) photodiodes integrated with silicon-based readout integrated circuits (ROICs) offer a promising solution for next-generation short-wave infrared (SWIR) imaging technology. Despite their potential, large-size CQD photodiodes pose a challenge due to high dark currents resulting from surface states on nonpassivated (100) facets and trap states generated by CQD fusion. In this work, we present a novel approach to address this issue by introducing double-ended ligands that supplementally passivate the (100) facets of halide-capped large-size CQDs, leading to suppressed band-tail states and reduced defect concentration. Our results demonstrate that the dark current density is suppressed by about an order of magnitude to 9.6 nA cm^(-2) at -10 mV, which is among the lowest reported for PbS CQD photodiodes. Furthermore, the photodiodes yield an external quantum efficiency of 50.8% (which corresponds to a responsivity of 0.532 A W^(-1)) and a specific detectivity of 2.5×10^(12) Jones at 1300 nm. By integrating CQD photodiodes with CMOS ROICs, the CQD imager provides high-resolution (640×512) SWIR imaging for infrared penetration and material discrimination.
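The correspondence between external quantum efficiency and responsivity quoted in the PbS CQD abstract follows from the standard relation R = EQE · q · λ / (h · c). The snippet below is a minimal check of that arithmetic; the function name `responsivity_from_eqe` is introduced here for illustration only and does not come from the paper.

```python
# Exact SI constants (2019 redefinition)
PLANCK_H = 6.62607015e-34        # Planck constant, J*s
LIGHT_C = 299_792_458.0          # speed of light, m/s
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C


def responsivity_from_eqe(eqe: float, wavelength_nm: float) -> float:
    """Responsivity in A/W for a fractional EQE (0..1) at a wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    photon_energy_j = PLANCK_H * LIGHT_C / wavelength_m
    # At EQE = 1, one electron is collected per incident photon
    return eqe * ELEMENTARY_Q / photon_energy_j


# Values taken from the abstract: EQE = 50.8% at 1300 nm
responsivity = responsivity_from_eqe(0.508, 1300.0)
print(f"{responsivity:.4f} A/W")
```

With the values from the abstract this evaluates to roughly 0.533 A/W, which agrees with the quoted 0.532 A W^(-1) to within rounding of the EQE.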
Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: a median filtering algorithm denoises the original infrared image, and an image segmentation algorithm then divides the image.
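The two steps named in the charging-device abstract (median filtering, then segmentation) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which segmentation algorithm is used, so a fixed temperature threshold stands in for it, and the synthetic frame, the 3×3 window, and the function names are all hypothetical.

```python
import numpy as np


def median_filter_3x3(image: np.ndarray) -> np.ndarray:
    """3x3 median filter; borders are handled by replicating edge pixels."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted views so the median runs over each 3x3 window
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    return np.median(windows, axis=0)


def hot_region_mask(image: np.ndarray, threshold: float) -> np.ndarray:
    """Assumed segmentation step: pixels hotter than `threshold` are flagged."""
    return image > threshold


# Synthetic temperature map in degrees Celsius (made-up values)
frame = np.full((10, 10), 30.0)
frame[3:7, 3:7] = 80.0   # a 4x4 overheated component
frame[8, 8] = 200.0      # single-pixel impulse noise

denoised = median_filter_3x3(frame)
mask = hot_region_mask(denoised, threshold=60.0)
print(denoised[8, 8], int(mask.sum()))
```

The isolated noise pixel is removed by the filter while the hot component survives, although a 3×3 median also rounds off the corners of the hot block; a real pipeline would choose the window size with that trade-off in mind.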
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
In response to the scarcity of infrared aircraft samples and the tendency of traditional deep learning to overfit, a few-shot infrared aircraft classification method based on cross-correlation networks is proposed. The method combines two core modules: a simple parameter-free self-attention and a cross-attention. By analyzing the self-correlation and cross-correlation between support images and query images, it achieves effective classification of infrared aircraft under few-shot conditions. The proposed cross-correlation network integrates these two modules and is trained end to end. The simple parameter-free self-attention extracts the internal structure of each image, while the cross-attention computes the cross-correlation between images, further extracting and fusing features across images. Compared with existing few-shot infrared target classification models, this model focuses on the geometric structure and thermal texture information of infrared images by modeling the semantic relevance between the features of the support set and the query set, thus better attending to the target objects. Experimental results show that this method outperforms existing infrared aircraft classification methods in various classification tasks, with the highest classification accuracy improvement exceeding 3%. In addition, ablation and comparative experiments confirm the effectiveness of the method.
1. Introduction. Infrared Imaging Missiles (IRIMs) are advanced weapons utilizing infrared technology for target detection and tracking. Their sensors capture thermal signatures and convert them into electronic images, enabling precise target identification and tracking. To a certain extent, the all-weather adaptability of IRIMs enables their effective operation across diverse environmental conditions, providing high targeting accuracy and cost efficiency.
As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective covering infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
Infrared imaging technology has been widely adopted in various fields, such as military reconnaissance, medical diagnosis, and security monitoring, due to its excellent ability to penetrate smoke and fog. However, the prevalent low resolution of infrared images severely limits the accurate interpretation of their contents. In addition, deploying super-resolution models on resource-constrained devices faces significant challenges. To address these issues, this study proposes a lightweight super-resolution network for infrared images based on an adaptive attention mechanism. The network's dynamic weighting module automatically adjusts the weights of the attention and non-attention branch outputs based on the network's characteristics at different levels. The attention branch is further subdivided into pixel attention and brightness-texture attention, which are specialized for extracting the most informative features in infrared images. Meanwhile, the non-attention branch extracts the features that would otherwise be neglected, making the feature set more comprehensive. Ablation experiments verify the effectiveness of the proposed module. Finally, qualitative and quantitative results on two datasets, FLIR and Thermal101, demonstrate that the model can effectively recover high-frequency details of infrared images and significantly improve image resolution. In detail, compared with the second-best method, the number of parameters is reduced by 30% and the model performance is improved. When the scale factor is 2, the peak signal-to-noise ratio on the FLIR and Thermal101 test datasets is improved by 0.09 and 0.15 dB, respectively. When the scale factor is 4, it is improved by 0.05 and 0.09 dB, respectively. In addition, due to the lightweight design of the network structure, it has a low computational cost. It is suitable for deployment on edge devices, thus effectively enhancing the sensing performance of infrared imaging devices.
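The super-resolution abstract reports its gains as peak signal-to-noise ratio (PSNR) differences in dB. For readers who want to see what that metric measures, here is a minimal sketch of the usual definition, PSNR = 10 · log10(peak² / MSE). The 8-bit peak value of 255 and the toy arrays are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
import numpy as np


def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))


# Toy example: every pixel of the estimate is off by 5 grey levels
reference = np.zeros((4, 4), dtype=np.uint8)
estimate = np.full((4, 4), 5, dtype=np.uint8)
value = psnr(reference, estimate)
print(f"{value:.2f} dB")
```

Because the scale is logarithmic, a gain of 0.1 dB corresponds to roughly a 2% reduction in mean squared error.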
Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. In such cases, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The resulting intermediate features at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used to enhance both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, in extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while being only 1.5 ms/it slower in computation speed than the fastest method.
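Three of the no-reference metrics named in the LLE-Fuse abstract (SD, EN, SF) are simple enough to compute directly. The sketch below uses commonly seen definitions: Shannon entropy of the grey-level histogram, and spatial frequency as the root of the summed squared row and column frequencies. Normalisation conventions for SF differ between papers, so the exact formula here is an assumption rather than the authors' evaluation code.

```python
import numpy as np


def standard_deviation(img: np.ndarray) -> float:
    """Global standard deviation of pixel intensities."""
    return float(np.std(img.astype(np.float64)))


def entropy(img: np.ndarray, levels: int = 256) -> float:
    """Shannon entropy (bits) of the grey-level histogram of an integer image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins, since 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))


def spatial_frequency(img: np.ndarray) -> float:
    """sqrt(RF^2 + CF^2) from horizontal and vertical first differences."""
    f = img.astype(np.float64)
    rf = np.sqrt(np.mean((f[:, 1:] - f[:, :-1]) ** 2))  # row frequency
    cf = np.sqrt(np.mean((f[1:, :] - f[:-1, :]) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))


# Toy image: left half black, right half white
toy = np.zeros((4, 4), dtype=np.uint8)
toy[:, 2:] = 255
sd, en, sf = standard_deviation(toy), entropy(toy), spatial_frequency(toy)
print(sd, en, sf)
```

Higher values of all three are usually read as more contrast, more information, and more detail, which is why fusion papers report them together.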
0 INTRODUCTION. Geohazards in mountainous regions pose significant risks to the construction and safe operation of transportation, water conservancy, and other critical infrastructure projects. Engineering geological investigations are crucial for disaster prevention and mitigation.
As technologies related to power equipment fault diagnosis and infrared temperature measurement continue to advance, the classification and identification of infrared temperature measurement images have become crucial for effective intelligent fault diagnosis of electrical equipment. To address insufficient feature fusion in real-time detection and the low detection accuracy of existing networks for substation fault diagnosis, we introduce a method named Gather and Distribution Mechanism-You Only Look Once (GD-YOLO). First, a partial convolution group is designed based on different convolution kernels. We combine the partial convolution group with deep convolution to propose a new Grouped Channel-wise Spatial Convolution (GCSConv) that compensates for the information loss caused by spatial channel convolution. Second, the Gather and Distribute Mechanism, which addresses the fusion of features with different dimensions, is implemented by aligning and sharing information through aggregation and distribution. Third, considering the limitations of current bounding box regression and the imbalance between complex and simple samples, Maximum Possible Distance Intersection over Union (MPDIoU) and Adaptive SlideLoss are incorporated into the loss function, allowing samples near the Intersection over Union (IoU) threshold to receive more attention through the dynamic variation of the mean IoU. The GD-YOLO algorithm surpasses YOLOv5, YOLOv7, and YOLOv8 in infrared image detection for electrical equipment, achieving a mean Average Precision (mAP) of 88.9%, with accuracy improvements of 3.7%, 4.3%, and 3.1%, respectively. Additionally, the model delivers a frame rate of 48 FPS, which meets the accuracy and speed requirements for detecting infrared images of power equipment.
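The MPDIoU term mentioned in the GD-YOLO abstract can be illustrated with a small standalone function. The sketch assumes the commonly cited formulation, in which the IoU is penalised by the squared distances between the two boxes' top-left corners and bottom-right corners, each normalised by the squared image diagonal. Whether GD-YOLO uses exactly this form is not stated in the abstract, and the box format `(x1, y1, x2, y2)` is chosen here for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(box_a, box_b, img_w, img_h):
    """IoU minus normalised squared corner distances (assumed formulation)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return iou(box_a, box_b) - d1_sq / norm - d2_sq / norm


pred = (0.0, 0.0, 10.0, 10.0)
target = (5.0, 5.0, 15.0, 15.0)
score = mpdiou(pred, target, img_w=100, img_h=100)
print(round(score, 6))
```

A regression loss would typically be `1 - mpdiou(...)`, so perfectly aligned boxes give zero loss and misaligned corners are penalised even when the overlap is unchanged.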
Rockfalls are among the frequent hazards in underground mines worldwide, requiring effective methods for detecting unstable rock blocks to ensure the safety of miners and equipment. This study proposes a novel approach for identifying potential rockfall zones using infrared thermal imaging and image segmentation techniques. Infrared images of rock blocks were captured at the Draa Sfar deep underground mine in Morocco using the FLUKE TI401 PRO thermal camera. Two segmentation methods were applied to locate the potentially unstable areas: classical thresholding and the K-means clustering model. The results show that while thresholding allows a binary distinction between stable and unstable areas, K-means clustering is more accurate, especially when using multiple clusters to show different risk levels. The close match between the clustering masks of unstable blocks and their corresponding visible light images further validated this. The findings confirm that thermal image segmentation can serve as an alternative method for predicting rockfalls and monitoring geotechnical issues in underground mines. Underground operators worldwide can apply this approach to monitor rock mass stability. However, further research is recommended to enhance these results, particularly through deep learning-based segmentation and object detection models.
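The contrast drawn in the rockfall abstract between binary thresholding and multi-cluster K-means can be sketched on a toy temperature map. Everything concrete below is an assumption made for illustration: the temperatures, the threshold of 26, the choice of three clusters, and the deterministic initialisation. The study's own implementation is not described in the abstract.

```python
import numpy as np


def kmeans_1d(values: np.ndarray, k: int, iterations: int = 50):
    """Plain Lloyd's k-means on scalar values, with evenly spaced initial centers."""
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iterations):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        updated = centers.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the old center if a cluster is empty
                updated[j] = members.mean()
        if np.allclose(updated, centers):
            break
        centers = updated
    # Final assignment against the converged centers
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


def segment_thermal(image: np.ndarray, k: int = 3):
    """Label each pixel with a cluster index ordered from coldest (0) to hottest."""
    labels, centers = kmeans_1d(image.ravel().astype(np.float64), k)
    order = np.argsort(centers)
    rank = np.empty(k, dtype=int)
    rank[order] = np.arange(k)
    return rank[labels].reshape(image.shape), centers[order]


# Made-up temperatures in degrees Celsius
thermal = np.array([
    [20.0, 20.2, 24.0, 24.2, 28.0, 28.2],
    [20.2, 20.0, 24.2, 24.0, 28.2, 28.0],
])
binary_mask = thermal > 26.0                  # thresholding: two classes only
levels, centers = segment_thermal(thermal, k=3)
print(binary_mask.sum(), levels[0].tolist(), centers)
```

The threshold yields a single stable/unstable split, while the clustering separates an intermediate band that could be mapped to a graded risk level; how cluster temperature relates to instability is a site-specific question the sketch does not answer.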
A pancreas surgeon's constant goal is to do "less damage, more radical". Currently, a small number of highly trained surgeons opt for single-incision laparoscopic pancreaticoduodenectomy (SILPD) or single-incision plus one-port LPD (SILPD+1) to minimize post-operative pain, improve convalescence, and provide a more pleasing cosmetic outcome [1,2]. Additionally, some skilled surgeons have claimed that laparoscopic duodenum-preserving complete pancreatic head resections (LDPPHR) result in less trauma and enhanced quality of life [3,4]. However, LDPPHR is still challenging because of its lengthy learning curve and "sword-fighting" effect. Additionally, there has not been any global reporting on the suitability of single-incision plus one-port DPPHR with pancreaticogastrostomy (SILDPPHR-T+1) in place of SILPD+1. This study aimed to illustrate the SILDPPHR-T+1 procedure specifics for a patient with pancreatic head intraductal papillary mucinous neoplasm (IPMN) (main pancreatic duct type) (MD-IPMN).
Infrared and visible image fusion technology integrates the thermal radiation information of infrared images with the texture details of visible images to generate more informative fused images. However, existing methods often fail to distinguish salient objects from background regions, leading to detail suppression in salient regions due to global fusion strategies. This study presents a mask-guided latent low-rank representation fusion method to address this issue. First, the GrabCut algorithm is employed to extract a saliency mask, distinguishing salient regions from background regions. Then, latent low-rank representation (LatLRR) is applied to extract deep image features, enhancing key information extraction. In the fusion stage, a weighted fusion strategy strengthens infrared thermal information and visible texture details in salient regions, while an average fusion strategy improves background smoothness and stability. Experimental results on the TNO dataset demonstrate that the proposed method achieves superior performance in SPI, MI, Qabf, PSNR, and EN metrics, effectively preserving salient target details while maintaining balanced background information. Compared to state-of-the-art fusion methods, our approach achieves more stable and visually consistent fusion results. The fusion code is available on GitHub at: https://github.com/joyzhen1/Image (accessed on 15 January 2025).
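The fusion rule described in the mask-guided abstract (weighted fusion inside salient regions, averaging in the background) reduces to a masked blend. The sketch below applies it directly to pixel values as a simplification; in the paper the rule operates on LatLRR components with a GrabCut mask, and the weight `ir_weight=0.7` is a hypothetical value, not one reported by the authors.

```python
import numpy as np


def mask_guided_fusion(ir, vis, salient_mask, ir_weight=0.7):
    """Weighted blend where the mask is True, plain average elsewhere."""
    ir = ir.astype(np.float64)
    vis = vis.astype(np.float64)
    weighted = ir_weight * ir + (1.0 - ir_weight) * vis  # salient regions
    averaged = 0.5 * (ir + vis)                          # background
    return np.where(salient_mask, weighted, averaged)


# Toy inputs: uniform infrared and visible patches, mask covers the left column
ir = np.full((2, 2), 200.0)
vis = np.full((2, 2), 100.0)
salient = np.array([[True, False], [True, False]])
fused = mask_guided_fusion(ir, vis, salient)
print(fused)
```

Keeping the two rules separate is what lets the salient target retain its thermal contrast without the background inheriting the same bias toward the infrared input.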
The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to handle modal disparities effectively, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce Prompt Fusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. First, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high- and low-frequency components of each modality, thereby improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent. Our approach advances the state of the art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.
Mining-induced ground fissures are a common form of mining damage in shallowly buried coal seams in the western mining area of China. To evaluate the surface mining damage of the 12203 working face of the Huojitu Colliery in the Shendong mining area, low-altitude infrared aerial surveys were conducted over the static fissure area (O-A1) and the dynamic fissure area (O-A2) of the working face. The temperature evolution patterns of fissures, sand, and plants in the infrared images were analysed. The relationship between overburden fractures and surface fissure temperature was revealed, and the influence range and temperature self-healing period of the surface affected by underground mining were determined. The results indicated that underground mining could lead to a decrease in the ground temperature above the working face. The surface temperature evolution can be divided into three zones: a temperature stabilization zone before mining, a temperature cooling zone during mining, and a temperature recovery zone after mining. The temperature of sand and plants above the working face exhibited quadratic curve changes in O-A1 and O-A2, respectively. The length of the temperature reduction zone affected by mining is 40 m in O-A2 and 46.8 m in O-A1. The temperature recovery periods of ground fissures in O-A1 and O-A2 were 4.0 and 4.6 d, respectively. These findings could provide a basis for evaluating mining ground damage.
In recent years, face recognition has received substantial attention but remains very challenging in real applications. Despite the variety of approaches and tools studied, face recognition is not accurate or robust enough to be used in uncontrolled environments. Infrared (IR) imagery of human faces offers a promising alternative to visible imagery; however, IR has its own limitations. In this paper, a scheme to fuse information from the two modalities is proposed. The scheme is based on eigenfaces and a probabilistic neural network (PNN), using a fuzzy integral to fuse the objective evidence supplied by each modality. Recognition rate is used to evaluate the fusion scheme. Experimental results show that the scheme improves recognition performance substantially.
Road traffic safety can decrease when drivers drive in a low-visibility environment. Applying visual perception technology to detect vehicles and pedestrians in infrared images is an effective means of reducing the risk of accidents. To tackle the low recognition accuracy and the substantial computational burden of current infrared pedestrian-vehicle detection methods, an infrared pedestrian-vehicle detection method based on an enhanced version of You Only Look Once version 5 (YOLOv5) is proposed. First, a head specifically designed for detecting small targets is integrated into the model to make full use of shallow feature information and enhance the accuracy of small-target detection. Second, the Focal Generalized Intersection over Union (GIoU) is employed as an alternative to the original loss function to address issues related to target overlap and category imbalance. Third, the distribution shift convolution optimization feature extraction operator is used to alleviate the computational burden of the model without significantly compromising detection accuracy. The test results show that the mean average precision (mAP) of the improved algorithm reaches 90.1%, while its Giga Floating Point Operations (GFLOPs) are only 9.1. The improved algorithm outperforms algorithms with similar GFLOPs, such as YOLOv6n (11.9), YOLOv8n (8.7), YOLOv7t (13.2), and YOLOv5s (16.0): its mAP is 4.4%, 3%, 3.5%, and 1.7% higher than theirs, respectively, showing that it achieves higher accuracy in target detection tasks under similar computational resource overhead. On the other hand, compared with larger algorithms such as YOLOv8l (91.1%), YOLOv6l (89.5%), YOLOv7 (90.8%), and YOLOv3 (90.1%), the improved algorithm needs only 5.5%, 2.3%, 8.6%, and 2.3%, respectively, of the GFLOPs. The improved algorithm thus balances accuracy and computational efficiency well, making it promising for practical use in resource-limited scenarios.
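The Generalized IoU underlying the loss in the pedestrian-vehicle abstract extends IoU with a penalty based on the smallest box enclosing both inputs: GIoU = IoU − (|C| − |A ∪ B|) / |C|. The sketch below covers only this plain GIoU term; the focal weighting that the abstract calls "Focal GIoU" is not specified there and is left out. The `(x1, y1, x2, y2)` box format is an assumption.

```python
def giou(box_a, box_b):
    """Generalized IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box that encloses both inputs
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0 or enclose <= 0:
        return 0.0
    return inter / union - (enclose - union) / enclose


overlapping = giou((0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 15.0, 15.0))
disjoint = giou((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
print(round(overlapping, 6), round(disjoint, 6))
```

Unlike plain IoU, the value keeps changing for non-overlapping boxes (it goes negative as they move apart), so a loss of `1 - giou(...)` still provides a gradient when a prediction misses its target entirely.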
Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between the detection rate and the accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed based on the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
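The Normalized Wasserstein Distance that replaces IoU in the small-target abstract models each box as a 2-D Gaussian and maps the Wasserstein distance between the two Gaussians into (0, 1] with an exponential. The sketch below follows the commonly cited closed form for axis-aligned boxes; the normalising constant `c` is dataset-dependent, and the default of 12.8 used here is only a placeholder assumption, as is the `(cx, cy, w, h)` box format.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance for boxes given as (cx, cy, w, h)."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the Gaussians fitted to the boxes
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


# Two 4x4 targets whose centers are 5 pixels apart: they do not overlap,
# so their IoU would be 0, yet NWD still returns a graded similarity.
similarity = nwd((10.0, 10.0, 4.0, 4.0), (13.0, 14.0, 4.0, 4.0))
print(round(similarity, 4))
```

This smooth behaviour is the usual argument for NWD on tiny targets, where a shift of a few pixels can drive IoU to zero.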
A novel image fusion network framework with an autonomous encoder and decoder is proposed to improve the quality of infrared and visible light image fusion and thereby the visual impression of fused images. The network comprises an encoder module, a fusion layer, a decoder module, and an edge enhancement module. The encoder module utilizes an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformer to achieve deep-level co-extraction of local and global features from the original image. An edge enhancement module (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy is introduced to enhance the adaptive representation of information in various regions of the source image, thereby enhancing the contrast of the fused image. The features extracted by the encoder and the EEM are combined in the fusion layer, and the decoder then produces the fused image. Three datasets were chosen to test the algorithm proposed in this paper. The experimental results demonstrate that the network effectively preserves background and detail information in both infrared and visible images, yielding superior outcomes in subjective and objective evaluations.
Funding: National Natural Science Foundation of China (No. 61072090); the Fundamental Research Funds for the Central Universities, China; Shanghai Pujiang Program, China (No. 12PJ1402200); China Postdoctoral Science Foundation Funded Project (No. 2012M511058); Shanghai Postdoctoral Sustentation Fund, China (No. 12R21412500).
Abstract: Adding colors to monochrome thermal infrared images can help observers understand the scenery better. A nonlinear color estimation method for single-band thermal infrared imagery based on kernel principal component analysis (KPCA) and sparse representation is proposed. Nonlinear features of the infrared image are extracted using KPCA. The relationship between image features and chromatic values is learned using sparse representation, yielding a color estimation model. Thermal infrared images can then be rendered automatically using this model. The experimental results show that the proposed method can render infrared images with an accurate color appearance. The proposed idea can also be applied to other color estimation problems.
Funding: National Natural Science Foundation of China, Grant/Award Numbers: U22A2083, 62204091, 62374068; National Key Research and Development Program of China, Grant/Award Number: 2021YFA0715502; Key R&D Program of Hubei Province, Grant/Award Number: 2021BAA014; Innovation Project of Optics Valley Laboratory, Grant/Award Numbers: OVL2021BG009, OVL2023ZD002; Exploration Project of Natural Science Foundation of Zhejiang Province, Grant/Award Number: LY23F040005; Fund for Innovative Research Groups of the Natural Science Foundation of Hubei Province, Grant/Award Number: 2020CFA034; Fund from Science, Technology and Innovation Commission of Shenzhen Municipality, Grant/Award Numbers: GJHZ20210705142540010, GJHZ20220913143403007; China Postdoctoral Science Foundation, Grant/Award Numbers: 2021M691118, 2022M711237, 2022M721243, 2023T160244.
Abstract: Lead sulfide (PbS) colloidal quantum dot (CQD) photodiodes integrated with silicon-based readout integrated circuits (ROICs) offer a promising solution for next-generation short-wave infrared (SWIR) imaging technology. Despite their potential, large-size CQD photodiodes pose a challenge due to high dark currents resulting from surface states on nonpassivated (100) facets and trap states generated by CQD fusion. In this work, we present a novel approach to address this issue by introducing double-ended ligands that supplementally passivate the (100) facets of halide-capped large-size CQDs, leading to suppressed band-tail states and reduced defect concentration. Our results demonstrate that the dark current density is suppressed by about an order of magnitude to 9.6 nA cm^(-2) at -10 mV, which is among the lowest reported for PbS CQD photodiodes. Furthermore, the photodiodes yield an external quantum efficiency of 50.8% (which corresponds to a responsivity of 0.532 A W^(-1)) and a specific detectivity of 2.5×10^(12) Jones at 1300 nm. By integrating CQD photodiodes with CMOS ROICs, the CQD imager provides high-resolution (640×512) SWIR imaging for infrared penetration and material discrimination.
Funding: Supported by the National Key Research and Development Project of China (No. 2023YFB3709605), the National Natural Science Foundation of China (No. 62073193), and the National College Student Innovation Training Program (No. 202310422122).
Abstract: Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: the original infrared image is denoised with a median filtering algorithm, and an image segmentation algorithm is then applied to divide the image.
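The two steps named in this abstract can be illustrated with a minimal sketch. It assumes a 3×3 median window with replicated borders and a fixed temperature threshold as the segmentation rule; the abstract does not specify either, and the function names `median_filter` and `segment_hot` as well as the sample frame are hypothetical.

```python
from statistics import median


def median_filter(img, k=3):
    """Apply a k x k median filter to a 2D list, replicating edge pixels."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clamping indices at the image border.
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


def segment_hot(img, threshold):
    """Binary segmentation: 1 where the value reaches the threshold (assumed rule)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


# Hypothetical 5 x 5 temperature frame: one noisy pixel (90) and a hot corner region.
frame = [
    [25, 25, 26, 25, 25],
    [25, 90, 26, 25, 25],
    [25, 25, 26, 70, 71],
    [25, 25, 26, 72, 70],
    [25, 25, 26, 71, 73],
]
denoised = median_filter(frame)
mask = segment_hot(denoised, threshold=60)
```

The isolated spike is removed before segmentation, so only the contiguous hot region survives in `mask`. A median filter also erodes thin edges of small regions slightly, which is a trade-off of this kind of denoising.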
Funding: Supported by the Henan Province Key Research and Development Project (231111211300), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Province Key Research and Development Project (231111110100), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the National Pre-research Program during the 14th Five-Year Plan (514010405).
Abstract: In response to the scarcity of infrared aircraft samples and the tendency of traditional deep learning to overfit, a few-shot infrared aircraft classification method based on cross-correlation networks is proposed. This method combines two core modules: a simple parameter-free self-attention and a cross-attention. By analyzing the self-correlation and cross-correlation between support images and query images, it achieves effective classification of infrared aircraft under few-shot conditions. The proposed cross-correlation network integrates these two modules and is trained in an end-to-end manner. The simple parameter-free self-attention is responsible for extracting the internal structure of the image, while the cross-attention calculates the cross-correlation between images, further extracting and fusing the features between images. Compared with existing few-shot infrared target classification models, this model focuses on the geometric structure and thermal texture information of infrared images by modeling the semantic relevance between the features of the support set and query set, thus better attending to the target objects. Experimental results show that this method outperforms existing infrared aircraft classification methods in various classification tasks, with the highest classification accuracy improvement exceeding 3%. In addition, ablation experiments and comparative experiments also prove the effectiveness of the method.
Funding: Co-supported by the China Postdoctoral Science Foundation (No. 2024M754304) and the Hunan Provincial Natural Science Foundation of China (No. 2025JJ60072).
Abstract: 1. Introduction. Infrared Imaging Missiles (IRIMs) are advanced weapons utilizing infrared technology for target detection and tracking. Their sensors capture thermal signatures and convert them into electronic images, enabling precise target identification and tracking. To a certain extent, the all-weather adaptability of IRIMs enables their effective operation across diverse environmental conditions, providing high targeting accuracy and cost efficiency.
Funding: Supported by the Science and Technology Project of State Grid Corporation of China (52094024003D).
Abstract: As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective covering infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
Funding: Funded in part by the Henan Province Key R&D Program Project, "Research and Application Demonstration of Class II Superlattice Medium Wave High Temperature Infrared Detector Technology", under Grant No. 231111210400.
Abstract: Infrared imaging technology has been widely adopted in various fields, such as military reconnaissance, medical diagnosis, and security monitoring, due to its excellent ability to penetrate smoke and fog. However, the prevalent low resolution of infrared images severely limits the accurate interpretation of their contents. In addition, deploying super-resolution models on resource-constrained devices faces significant challenges. To address these issues, this study proposes a lightweight super-resolution network for infrared images based on an adaptive attention mechanism. The network's dynamic weighting module automatically adjusts the weights of the attention and non-attention branch outputs based on the network's characteristics at different levels. The attention branch is further subdivided into pixel attention and brightness-texture attention, which are specialized for extracting the most informative features in infrared images. Meanwhile, the non-attention branch supplements the extraction of neglected features to make the feature set more comprehensive. Ablation experiments verify the effectiveness of the proposed module. Finally, in experiments on two datasets, FLIR and Thermal101, qualitative and quantitative results demonstrate that the model can effectively recover high-frequency details of infrared images and significantly improve image resolution. In detail, compared with the suboptimal method, the number of parameters is reduced by 30% and the model performance is improved. When the scale factor is 2, the peak signal-to-noise ratio on the test datasets FLIR and Thermal101 is improved by 0.09 and 0.15 dB, respectively. When the scale factor is 4, it is improved by 0.05 and 0.09 dB, respectively. In addition, due to the lightweight design of the network structure, it has a low computational cost. It is suitable for deployment on edge devices, thus effectively enhancing the sensing performance of infrared imaging devices.
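The peak signal-to-noise ratio quoted in this abstract is conventionally defined as 10·log10(MAX² / MSE). A minimal sketch of that standard definition follows, assuming 8-bit images stored as nested lists; the function name `psnr` and the sample values are illustrative and not taken from the paper.

```python
import math


def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, t in zip(ref_row, res_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


# Hypothetical 2 x 2 ground-truth patch and a super-resolved estimate of it.
truth = [[100, 100], [100, 100]]
estimate = [[101, 99], [100, 100]]
score = psnr(truth, estimate)
```

Because the scale is logarithmic, gains of a few hundredths of a dB, such as those reported above, correspond to small reductions in mean squared error.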
Funding: This research was sponsored by the Xinjiang Uygur Autonomous Region Tianshan Talent Programme Project (2023TCLJ02) and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C349).
Abstract: Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. In such cases, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The resulting intermediate features at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used for image enhancement of both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, in extensive experimental comparisons, the method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while being only 1.5 ms/it slower in computation speed than the fastest method.
Funding: Financially supported by the National Key R&D Program of China (No. 2022YFC3080200).
Abstract: 0 Introduction. Geohazards in mountainous regions pose significant risks to the construction and safe operation of transportation, water conservancy, and other critical infrastructure projects. Engineering geological investigations are crucial for disaster prevention and mitigation.
Funding: Science and Technology Department of Jilin Province (No. 20200403075SF); Education Department of Jilin Province (No. JJKH20240148KJ).
Abstract: As technologies related to power equipment fault diagnosis and infrared temperature measurement continue to advance, the classification and identification of infrared temperature measurement images have become crucial to effective intelligent fault diagnosis of various electrical equipment. In response to the increasing demand for sufficient feature fusion in current real-time detection and the low detection accuracy of existing networks for substation fault diagnosis, we introduce an innovative method known as Gather and Distribution Mechanism-You Only Look Once (GD-YOLO). Firstly, a partial convolution group is designed based on different convolution kernels. We combine the partial convolution group with deep convolution to propose a new Grouped Channel-wise Spatial Convolution (GCSConv) that compensates for the information loss caused by spatial channel convolution. Secondly, the Gather and Distribute Mechanism, which addresses the fusion problem of different dimensional features, is implemented by aligning and sharing information through aggregation and distribution mechanisms. Thirdly, considering the limitations in current bounding box regression and the imbalance between complex and simple samples, Maximum Possible Distance Intersection over Union (MPDIoU) and Adaptive SlideLoss are incorporated into the loss function, allowing samples near the Intersection over Union (IoU) to receive more attention through the dynamic variation of the mean Intersection over Union. The GD-YOLO algorithm surpasses YOLOv5, YOLOv7, and YOLOv8 in infrared image detection for electrical equipment, achieving a mean Average Precision (mAP) of 88.9%, with accuracy improvements of 3.7%, 4.3%, and 3.1%, respectively. Additionally, the model delivers a frame rate of 48 FPS, which meets the precision and speed requirements for detecting infrared images of power equipment.
Funding: Supported by the Moroccan Ministry of Higher Education, Scientific Research, and Innovation; the Moroccan Digital Development Agency (DDA); the National Center for Scientific and Technical Research of Morocco (CNRST) through the Al-Khawarizmi project; the MANAGEM group; and MASCIR, all supporting this project.
Abstract: Rockfalls are among the frequent hazards in underground mines worldwide, requiring effective methods for detecting unstable rock blocks to ensure the safety of miners and equipment. This study proposes a novel approach for identifying potential rockfall zones using infrared thermal imaging and image segmentation techniques. Infrared images of rock blocks were captured at the Draa Sfar deep underground mine in Morocco using the FLUKE TI401 PRO thermal camera. Two segmentation methods were applied to locate the potentially unstable areas: classical thresholding and the K-means clustering model. The results show that while thresholding allows a binary distinction between stable and unstable areas, K-means clustering is more accurate, especially when using multiple clusters to show different risk levels. The close match between the clustering masks of unstable blocks and their corresponding visible light images further validated this. The findings confirm that thermal image segmentation can serve as an alternative method for predicting rockfalls and monitoring geotechnical issues in underground mines. Underground operators worldwide can apply this approach to monitor rock mass stability. However, further research is recommended to enhance these results, particularly through deep learning-based segmentation and object detection models.
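The contrast between the two segmentation methods in this abstract can be sketched on a toy thermal grid. The sketch assumes that cooler pixels mark candidate unstable zones and that clustering runs on raw temperatures only; neither detail is stated in the abstract. The names `threshold_mask`, `kmeans_1d`, and `cluster_labels` and the sample data are hypothetical.

```python
def threshold_mask(img, t):
    """Binary mask: 1 where the temperature is below t (assumed risk rule)."""
    return [[1 if v < t else 0 for v in row] for row in img]


def kmeans_1d(values, k, iters=50):
    """Plain 1D K-means with evenly spread initial centers; returns sorted centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Empty clusters keep their previous center.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


def cluster_labels(img, centers):
    """Label each pixel with the index of its nearest center (0 = coldest)."""
    return [
        [min(range(len(centers)), key=lambda i: abs(v - centers[i])) for v in row]
        for row in img
    ]


# Hypothetical 4 x 4 thermal grid in degrees Celsius with two cooler zones.
thermal = [
    [30.0, 30.2, 29.9, 30.1],
    [30.1, 27.5, 27.4, 30.0],
    [29.8, 27.6, 24.9, 25.1],
    [30.0, 27.5, 25.0, 25.2],
]
binary = threshold_mask(thermal, t=28.0)
flat = [v for row in thermal for v in row]
centers = kmeans_1d(flat, k=3)
levels = cluster_labels(thermal, centers)
```

The threshold merges both cooler zones into a single class, while three clusters separate them into distinct levels, which mirrors the multi-level risk reading described above.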
Funding: Supported by grants from the National Natural Science Foundation of China (81302161 and 82003103) and the Science and Technology Department of Sichuan Province (2021YFS0375 and 2020YJ0450).
Abstract: A pancreas surgeon's constant goal is to do "less damage, more radical". Currently, a small number of highly trained surgeons opt for single-incision laparoscopic pancreaticoduodenectomy (SILPD) or single-incision plus one-port LPD (SILPD+1) to minimize post-operative pain, improve convalescence, and provide a more pleasing cosmetic outcome [1,2]. Additionally, some skilled surgeons have claimed that laparoscopic duodenum-preserving complete pancreatic head resections (LDPPHR) result in less trauma and enhanced quality of life [3,4]. However, LDPPHR is still challenging because of its lengthy learning curve and "sword-fighting" impact. Additionally, there has not been any global reporting on the suitability of single-incision plus one-port DPPHR with pancreaticogastrostomy (SILDPPHR-T+1) in place of SILPD+1. This study aimed to illustrate the SILDPPHR-T+1 procedure specifics for a patient with pancreatic head intraductal papillary mucinous neoplasm (IPMN) (main pancreatic duct type) (MD-IPMN).
Funding: Supported by Universiti Teknologi MARA through UiTM MyRA Research Grant, 600-RMC 5/3/GPM (053/2022).
Abstract: Infrared and visible image fusion technology integrates the thermal radiation information of infrared images with the texture details of visible images to generate more informative fused images. However, existing methods often fail to distinguish salient objects from background regions, leading to detail suppression in salient regions due to global fusion strategies. This study presents a mask-guided latent low-rank representation fusion method to address this issue. First, the GrabCut algorithm is employed to extract a saliency mask, distinguishing salient regions from background regions. Then, latent low-rank representation (LatLRR) is applied to extract deep image features, enhancing key information extraction. In the fusion stage, a weighted fusion strategy strengthens infrared thermal information and visible texture details in salient regions, while an average fusion strategy improves background smoothness and stability. Experimental results on the TNO dataset demonstrate that the proposed method achieves superior performance in SPI, MI, Qabf, PSNR, and EN metrics, effectively preserving salient target details while maintaining balanced background information. Compared to state-of-the-art fusion methods, the approach achieves more stable and visually consistent fusion results. The fusion code is available on GitHub at: https://github.com/joyzhen1/Image (accessed on 15 January 2025).
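The region-dependent fusion rule in this abstract (weighted fusion inside the saliency mask, average fusion elsewhere) can be sketched at pixel level. This is only an illustration: the paper applies its strategy to LatLRR components and obtains the mask from GrabCut, whereas here the mask is given directly, the fusion acts on raw pixels, and the weight `w_ir=0.6` and the name `mask_guided_fuse` are assumptions.

```python
def mask_guided_fuse(ir, vis, mask, w_ir=0.6):
    """Fuse two 2D images: weighted sum where mask is 1, plain average elsewhere.

    w_ir is an assumed weight for the infrared image inside salient regions.
    """
    fused = []
    for ir_row, vis_row, mask_row in zip(ir, vis, mask):
        row = []
        for i, v, m in zip(ir_row, vis_row, mask_row):
            if m:
                row.append(w_ir * i + (1.0 - w_ir) * v)  # salient region
            else:
                row.append(0.5 * (i + v))  # background region
        fused.append(row)
    return fused


# Hypothetical 1 x 2 example: first pixel salient, second pixel background.
ir_img = [[200, 50]]
vis_img = [[100, 150]]
saliency = [[1, 0]]
result = mask_guided_fuse(ir_img, vis_img, saliency)
```

Keeping the background on a plain average avoids amplifying noise there, while the salient region is free to favour whichever modality carries the target.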
Funding: Partially supported by the China Postdoctoral Science Foundation (2023M730741) and the National Natural Science Foundation of China (U22B2052, 52102432, 52202452, 62372080, 62302078).
Abstract: The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to effectively handle modal disparities, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce PromptFusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. Firstly, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of different modalities, thereby improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent. Our approach advances the state-of-the-art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.
Funding: Supported by the National Natural Science Foundation of China (52225402, U1910206).
Abstract: Mining-induced ground fissures are common problems associated with mining damage in shallowly buried coal seams in the western mining area of China. To evaluate the surface mining damage of the 12203 working face of the Huojitu Colliery in the Shendong mining area, low-altitude infrared aerial surveys were conducted on the ground at the static fissure area (O-A1) and the dynamic fissure area (O-A2) of the working face. The temperature evolution patterns of fissures, sand, and plants in the infrared images were analysed. The relationship between overburden fractures and surface fissure temperature was revealed, and the influence range and temperature self-healing period of the surface affected by underground mining were determined. The results indicated that underground mining could lead to a decrease in the ground temperature above the working face. The surface temperature evolution can be divided into three zones: a temperature stabilization zone before mining, a temperature cooling zone during mining, and a temperature recovery zone after mining. The temperature of sand and plants above the working face exhibited quadratic curve changes in O-A1 and O-A2, respectively. The length of the temperature reduction zone affected by mining is 40 m in O-A2 and 46.8 m in O-A1. The temperature recovery periods of ground fissures in O-A1 and O-A2 were 4.0 and 4.6 d, respectively. These findings could provide a basis for evaluating mining ground damage.
Abstract: In recent years, face recognition has received substantial attention but remains very challenging in real applications. Despite the variety of approaches and tools studied, face recognition is not accurate or robust enough to be used in uncontrolled environments. Infrared (IR) imagery of human faces offers a promising alternative to visible imagery; however, IR has its own limitations. In this paper, a scheme to fuse information from the two modalities is proposed. The scheme is based on eigenfaces and a probabilistic neural network (PNN), using a fuzzy integral to fuse the objective evidence supplied by each modality. Recognition rate is used to evaluate the fusion scheme. Experimental results show that the scheme improves recognition performance substantially.
Abstract: Road traffic safety can decrease when drivers drive in a low-visibility environment. Applying visual perception technology to detect vehicles and pedestrians in infrared images is an effective means of reducing the risk of accidents. To tackle the low recognition accuracy and the substantial computational burden of current infrared pedestrian-vehicle detection methods, an infrared pedestrian-vehicle detection method based on an enhanced version of You Only Look Once version 5 (YOLOv5) is proposed. First, a head specifically designed for detecting small targets is integrated into the model to make full use of shallow feature information and enhance the accuracy of small-target detection. Second, the Focal Generalized Intersection over Union (GIoU) is employed as an alternative to the original loss function to address issues related to target overlap and category imbalance. Third, the distribution shift convolution optimization feature extraction operator is used to alleviate the computational burden of the model without significantly compromising detection accuracy. The test results of the improved algorithm show that its mean average precision (mAP) reaches 90.1%, while its Giga Floating Point Operations (GFLOPs) count is only 9.1. The improved algorithm outperformed other algorithms with similar GFLOPs, such as YOLOv6n (11.9), YOLOv8n (8.7), YOLOv7t (13.2), and YOLOv5s (16.0). Its mAP is 4.4%, 3%, 3.5%, and 1.7% higher than those of these algorithms, respectively, showing that the improved algorithm achieves higher accuracy in target detection tasks under similar computational resource overhead. On the other hand, compared with other algorithms such as YOLOv8l (91.1%), YOLOv6l (89.5%), YOLOv7 (90.8%), and YOLOv3 (90.1%), the improved algorithm needs only 5.5%, 2.3%, 8.6%, and 2.3%, respectively, of the GFLOPs. The improved algorithm shows significant advances in balancing accuracy and computational efficiency, making it promising for practical use in resource-limited scenarios.
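The Generalized IoU term that this abstract builds on can be sketched from its standard definition, GIoU = IoU − (|C| − |A ∪ B|) / |C|, where C is the smallest box enclosing both boxes. The sketch below covers plain GIoU only; the focal weighting used in the paper is not specified in the abstract and is not reproduced here. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive area, and the name `giou` is illustrative.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Area of the smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose


same = giou((0, 0, 2, 2), (0, 0, 2, 2))
apart = giou((0, 0, 1, 1), (2, 0, 3, 1))
partial = giou((0, 0, 2, 2), (1, 1, 3, 3))
```

Unlike plain IoU, which is zero for every pair of disjoint boxes, GIoU keeps decreasing as the boxes move apart, so a loss of the form `1 - giou(...)` still provides a gradient signal when the prediction misses the target entirely.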
Abstract: Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between the detection rate and the accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed based on the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
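The Normalized Wasserstein Distance mentioned in this abstract is commonly defined by modelling each box `(cx, cy, w, h)` as a 2D Gaussian and mapping the second-order Wasserstein distance between the two Gaussians through an exponential. A minimal sketch of that common formulation follows; the normalizing constant `c=12.8` is an assumed, dataset-dependent value, and the abstract does not state which value or variant the authors use.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between boxes given as (cx, cy, w, h).

    Each box is modelled as a Gaussian with mean (cx, cy) and standard
    deviations (w / 2, h / 2); c is an assumed normalizing constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


# Two hypothetical 4 x 4 pixel targets that do not overlap (their IoU is 0).
near = nwd((10, 10, 4, 4), (16, 10, 4, 4))
far = nwd((10, 10, 4, 4), (40, 10, 4, 4))
identical = nwd((10, 10, 4, 4), (10, 10, 4, 4))
```

For tiny targets a shift of a few pixels drives IoU straight to zero, whereas this measure degrades smoothly with distance, which is the usual motivation for the substitution.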
Abstract: A novel image fusion network framework with an autonomous encoder and decoder is proposed to improve the visual impression of fused images by raising the quality of infrared and visible light image fusion. The network comprises an encoder module, a fusion layer, a decoder module, and an edge enhancement module. The encoder module utilizes an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformer to achieve deep-level co-extraction of local and global features from the original image. An edge enhancement module (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy is introduced to enhance the adaptive representation of information in various regions of the source image, thereby enhancing the contrast of the fused image. The encoder and the EEM extract features, which are then combined in the fusion layer to create a fused image using the decoder. Three datasets were chosen to test the algorithm proposed in this paper. The experimental results demonstrate that the network effectively preserves background and detail information in both infrared and visible images, yielding superior outcomes in subjective and objective evaluations.