Adding color to monochrome thermal infrared images can help observers understand a scene better. A nonlinear color estimation method for single-band thermal infrared imagery, based on kernel principal component analysis (KPCA) and sparse representation, is proposed. Nonlinear features of the infrared image are extracted using KPCA. The relationship between image features and chromatic values is learned using sparse representation, yielding a color estimation model. Thermal infrared images can then be rendered automatically using this model. The experimental results show that the proposed method can render infrared images with an accurate color appearance. The proposed idea can also be applied to other color estimation problems.
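The KPCA feature-extraction step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the `gamma` value, and the random stand-in for image patch vectors are all assumptions, and the sparse-representation stage is not shown.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Pairwise RBF kernel matrix (kernel choice is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma=1.0):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, vec = eigvals[idx], eigvecs[:, idx]
    # Projection of training sample i on component k is sqrt(lambda_k) * v_k[i].
    return vec * np.sqrt(np.maximum(lam, 0.0))

# Hypothetical data: 40 flattened 5x5 infrared patches.
rng = np.random.default_rng(0)
patches = rng.random((40, 25))
features = kernel_pca(patches, n_components=3, gamma=0.5)
```

Each row of `features` is a nonlinear descriptor of one patch; in the described method such descriptors would then be related to chromatic values by a learned sparse model.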
The fast crystallization and facile oxidation of Sn^(2+) in tin-lead (Sn-Pb) perovskites are the biggest challenges for their application in high-performance near-infrared (NIR) photodetectors and imagers. Here, we introduce a multifunctional diphenyl sulfoxide (DPSO) molecule into the perovskite precursor ink to address these issues, revealing its strong binding interactions with the precursor species. The regulated perovskite film exhibits a dense morphology, reduced defect density and prolonged carrier diffusion length. The manufactured self-powered photodetector realizes a spectral response of 300-1100 nm, a dark current density of 4.7×10^(−8) mA cm^(−2), a peak responsivity of 0.49 A W^(−1) and a specific detectivity of 1.20×10^(12) Jones in the NIR region (780-1100 nm), a −3 dB bandwidth of 11.4 MHz, a linear dynamic range of 174 dB, and ultrafast rise/fall times of 14.2/17.1 ns. We demonstrate a 64×64 NIR imager with an impressive spatial resolution of 1.32 lp mm^(−1) by monolithically integrating the photodetector with a commercial thin-film transistor readout circuit.
Lead sulfide (PbS) colloidal quantum dot (CQD) photodiodes integrated with silicon-based readout integrated circuits (ROICs) offer a promising solution for next-generation short-wave infrared (SWIR) imaging technology. Despite their potential, large-size CQD photodiodes pose a challenge due to high dark currents resulting from surface states on nonpassivated (100) facets and trap states generated by CQD fusion. In this work, we present a novel approach to address this issue by introducing double-ended ligands that supplementally passivate the (100) facets of halide-capped large-size CQDs, leading to suppressed band-tail states and reduced defect concentration. Our results demonstrate that the dark current density is suppressed by about an order of magnitude to 9.6 nA cm^(-2) at -10 mV, which is among the lowest reported for PbS CQD photodiodes. Furthermore, the performance of the photodiodes is exemplary, yielding an external quantum efficiency of 50.8% (which corresponds to a responsivity of 0.532 A W^(-1)) and a specific detectivity of 2.5×10^(12) Jones at 1300 nm. By integrating CQD photodiodes with CMOS ROICs, the CQD imager provides high-resolution (640×512) SWIR imaging for infrared penetration and material discrimination.
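The correspondence between the quoted external quantum efficiency and responsivity can be checked with the standard photodiode relation. The worked step below assumes the wavelength is the 1300 nm stated in the abstract and uses hc/q ≈ 1239.84 eV nm.

```latex
R = \mathrm{EQE}\cdot\frac{q\lambda}{hc}
  \approx \mathrm{EQE}\cdot\frac{\lambda\,[\mathrm{nm}]}{1239.84}\ \mathrm{A\,W^{-1}},
\qquad
R(1300\ \mathrm{nm}) \approx 0.508\times\frac{1300}{1239.84}\approx 0.533\ \mathrm{A\,W^{-1}}
```

This agrees with the reported 0.532 A W^(-1) to within rounding of the efficiency value.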
In recent years, face recognition has received substantial attention, but it remains very challenging in real applications. Despite the variety of approaches and tools studied, face recognition is not accurate or robust enough to be used in uncontrolled environments. Infrared (IR) imagery of human faces offers a promising alternative to visible imagery; however, IR has its own limitations. In this paper, a scheme to fuse information from the two modalities is proposed. The scheme is based on eigenfaces and a probabilistic neural network (PNN), using a fuzzy integral to fuse the objective evidence supplied by each modality. Recognition rate is used to evaluate the fusion scheme. Experimental results show that the scheme improves recognition performance substantially.
To address two challenging problems in infrared target tracking, target appearance changes and unpredictable abrupt motions, a novel particle-filtering-based tracking algorithm is introduced. In this method, a novel saliency model is proposed to distinguish the salient target from the background, and an eigenspace model is invoked to adapt to target appearance changes. To account for abrupt motions efficiently, a two-step sampling method is proposed to combine the two observation models. The proposed tracking method is demonstrated on two real infrared image sequences, which include changes of luminance and size and drastic abrupt motions of the target.
Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: a median filtering algorithm denoises the original infrared image, and an image segmentation algorithm then divides the image.
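The two processing steps named in this abstract, median filtering followed by segmentation, can be sketched as below. The abstract does not say which segmentation algorithm is used, so the fixed threshold here is a stand-in assumption, and the synthetic frame and its temperature values are hypothetical.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter with edge replication."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows, axis=0), axis=0)

def hot_region_mask(img, threshold):
    """Fixed-threshold segmentation (assumed; the source does not specify one)."""
    return img > threshold

# Synthetic frame: 30 degC background, a 4x4 hot spot at 80 degC, one impulse-noise pixel.
frame = np.full((12, 12), 30.0)
frame[4:8, 4:8] = 80.0
frame[1, 10] = 200.0

denoised = median_filter3(frame)
mask = hot_region_mask(denoised, threshold=60.0)
```

The isolated 200-degree pixel is removed by the median filter, while the hot spot survives with its four corner pixels rounded off, which is the usual side effect of a 3x3 median window.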
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which are used to extract low-frequency and high-frequency information from the image. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
In response to the scarcity of infrared aircraft samples and the tendency of traditional deep learning to overfit, a few-shot infrared aircraft classification method based on cross-correlation networks is proposed. This method combines two core modules: a simple parameter-free self-attention module and a cross-attention module. By analyzing the self-correlation and cross-correlation between support images and query images, it achieves effective classification of infrared aircraft under few-shot conditions. The proposed cross-correlation network integrates these two modules and is trained in an end-to-end manner. The simple parameter-free self-attention is responsible for extracting the internal structure of the image, while the cross-attention calculates the cross-correlation between images, further extracting and fusing the features between images. Compared with existing few-shot infrared target classification models, this model focuses on the geometric structure and thermal texture information of infrared images by modeling the semantic relevance between the features of the support set and the query set, thus better attending to the target objects. Experimental results show that this method outperforms existing infrared aircraft classification methods in various classification tasks, with the highest classification accuracy improvement exceeding 3%. In addition, ablation experiments and comparative experiments also prove the effectiveness of the method.
1. Introduction. Infrared Imaging Missiles (IRIMs) are advanced weapons utilizing infrared technology for target detection and tracking. Their sensors capture thermal signatures and convert them into electronic images, enabling precise target identification and tracking. To a certain extent, the all-weather adaptability of IRIMs enables their effective operation across diverse environmental conditions, providing high targeting accuracy and cost efficiency.
As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective: infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
The purpose of infrared and visible image fusion is to create a single image containing the texture details and significant object information of the source images, particularly in challenging environments. However, existing image fusion algorithms are generally suited to normal scenes. In a hazy scene, much of the texture information in the visible image is hidden, so the results of existing methods are dominated by infrared information, resulting in a lack of texture details and poor visual quality. To address these difficulties, we propose a haze-free infrared and visible fusion method, termed HaIVFusion, which can eliminate the influence of haze and obtain richer texture information in the fused image. Specifically, we first design a scene information restoration network (SIRNet) to mine the masked texture information in visible images. Then, a denoising fusion network (DFNet) is designed to integrate the features extracted from infrared and visible images and remove the influence of residual noise as much as possible. In addition, we use a color consistency loss to reduce the color distortion resulting from haze. Furthermore, we publish a dataset of hazy scenes for infrared and visible image fusion to promote research in extreme scenes. Extensive experiments show that HaIVFusion produces fused images with increased texture details and higher contrast in hazy scenes, and achieves better quantitative results than state-of-the-art image fusion methods, even when those methods are combined with state-of-the-art dehazing methods.
Infrared imaging technology has been widely adopted in various fields, such as military reconnaissance, medical diagnosis, and security monitoring, due to its excellent ability to penetrate smoke and fog. However, the prevalent low resolution of infrared images severely limits the accurate interpretation of their contents. In addition, deploying super-resolution models on resource-constrained devices faces significant challenges. To address these issues, this study proposes a lightweight super-resolution network for infrared images based on an adaptive attention mechanism. The network's dynamic weighting module automatically adjusts the weights of the attention and non-attention branch outputs based on the network's characteristics at different levels. The attention branch is further subdivided into pixel attention and brightness-texture attention, which are specialized for extracting the most informative features in infrared images. Meanwhile, the non-attention branch supplements the extraction of neglected features to make the features more comprehensive. Ablation experiments verify the effectiveness of the proposed module. Finally, in experiments on two datasets, FLIR and Thermal101, qualitative and quantitative results demonstrate that the model can effectively recover high-frequency details of infrared images and significantly improve image resolution. In detail, compared with the suboptimal method, the number of parameters is reduced by 30% and the model performance is improved. When the scale factor is 2, the peak signal-to-noise ratio on the test datasets FLIR and Thermal101 is improved by 0.09 and 0.15 dB, respectively. When the scale factor is 4, it is improved by 0.05 and 0.09 dB, respectively. In addition, due to the lightweight design of the network structure, it has a low computational cost. It is suitable for deployment on edge devices, thus effectively enhancing the sensing performance of infrared imaging devices.
Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. At this time, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method is based on an improvement of the MobileOne Block, using the Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The intermediate features obtained at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used for image enhancement of both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, after extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while only being 1.5 ms/it slower in computation speed than the fastest method.
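Three of the evaluation metrics named in this abstract (SD, EN, SF) have simple closed forms. The sketch below uses the definitions commonly seen in the fusion literature, which may differ in normalization from the ones used in the paper; VIF is omitted because it requires a much longer implementation, and the checkerboard input is a hypothetical test image.

```python
import numpy as np

def standard_deviation(img):
    """SD: spread of pixel intensities around their mean."""
    return float(np.std(img.astype(float)))

def entropy(img, levels=256):
    """EN: Shannon entropy (bits) of the gray-level histogram of an 8-bit image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def spatial_frequency(img):
    """SF: root of the mean squared horizontal plus vertical first differences."""
    f = img.astype(float)
    rf2 = np.mean((f[:, 1:] - f[:, :-1]) ** 2)  # row frequency squared
    cf2 = np.mean((f[1:, :] - f[:-1, :]) ** 2)  # column frequency squared
    return float(np.sqrt(rf2 + cf2))

# Hypothetical 2x2 checkerboard, the highest-contrast 8-bit pattern.
checker = np.array([[0, 255], [255, 0]], dtype=np.uint8)
sd, en, sf = standard_deviation(checker), entropy(checker), spatial_frequency(checker)
```

Higher values of all three are usually read as more contrast, more information content, and more fine detail in the fused image.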
In the field of infrared small target detection (ISTD), the ability to detect targets in dim environments is critical, as it improves the performance of target recognition in nighttime and harsh weather conditions. The blurry contour, small size and sparse distribution of infrared small targets increase the difficulty of identifying such targets in cluttered backgrounds. Existing methodologies fall short of satisfying the requisites for the detection and categorisation of infrared small targets. To address these challenges and to enhance the precision of small object detection and classification, this paper introduces an innovative approach called location refinement and adjacent feature fusion YOLO (LA-YOLO), which enhances feature extraction by integrating a multi-head self-attention mechanism (MSA). We have improved the feature fusion method to merge adjacent features, to enhance information utilisation in the path aggregation network (PAN). Lastly, we introduce supervision on the target centre points in the detection network. Empirical results on publicly available datasets demonstrate that LA-YOLO achieves an impressive average precision (AP) of 92.46% on IST-A and a mean average precision (mAP) of 84.82% on FLIR. The results surpass those of contemporary state-of-the-art detectors, striking a balance between precision and speed. LA-YOLO emerges as a viable and efficacious solution for ISTD, making a substantial contribution to the progression of infrared imagery analysis. The code is available at https://github.com/liusjo/LA-YOLO.
0 INTRODUCTION. Geohazards in mountainous regions pose significant risks to the construction and safe operation of transportation, water conservancy, and other critical infrastructure projects. Engineering geological investigations are crucial for disaster prevention and mitigation.
As technologies related to power equipment fault diagnosis and infrared temperature measurement continue to advance, the classification and identification of infrared temperature measurement images have become crucial for effective intelligent fault diagnosis of various electrical equipment. In response to the growing demand for adequate feature fusion in real-time detection and the low detection accuracy of existing networks for substation fault diagnosis, we introduce an innovative method known as Gather and Distribution Mechanism-You Only Look Once (GD-YOLO). Firstly, a partial convolution group is designed based on different convolution kernels. We combine the partial convolution group with deep convolution to propose a new Grouped Channel-wise Spatial Convolution (GCSConv) that compensates for the information loss caused by spatial channel convolution. Secondly, the Gather and Distribute Mechanism, which addresses the fusion problem of features of different dimensions, is implemented by aligning and sharing information through aggregation and distribution mechanisms. Thirdly, considering the limitations of current bounding box regression and the imbalance between complex and simple samples, Maximum Possible Distance Intersection over Union (MPDIoU) and Adaptive SlideLoss are incorporated into the loss function, allowing samples near the Intersection over Union (IoU) to receive more attention through the dynamic variation of the mean Intersection over Union. The GD-YOLO algorithm surpasses YOLOv5, YOLOv7, and YOLOv8 in infrared image detection for electrical equipment, achieving a mean Average Precision (mAP) of 88.9%, with accuracy improvements of 3.7%, 4.3%, and 3.1%, respectively. Additionally, the model delivers a frame rate of 48 FPS, which meets the precision and speed criteria necessary for the detection of infrared images of power equipment.
Rockfalls are among the frequent hazards in underground mines worldwide, requiring effective methods for detecting unstable rock blocks to ensure the safety of miners and equipment. This study proposes a novel approach for identifying potential rockfall zones using infrared thermal imaging and image segmentation techniques. Infrared images of rock blocks were captured at the Draa Sfar deep underground mine in Morocco using the FLUKE TI401 PRO thermal camera. Two segmentation methods were applied to locate the potentially unstable areas: classical thresholding and the K-means clustering model. The results show that while thresholding allows a binary distinction between stable and unstable areas, K-means clustering is more accurate, especially when using multiple clusters to show different risk levels. The close match between the clustering masks of unstable blocks and their corresponding visible light images further validated this. The findings confirm that thermal image segmentation can serve as an alternative method for predicting rockfalls and monitoring geotechnical issues in underground mines. Underground operators worldwide can apply this approach to monitor rock mass stability. However, further research is recommended to enhance these results, particularly through deep learning-based segmentation and object detection models.
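The contrast between the two segmentation methods compared in this abstract can be illustrated on a synthetic thermal image. This is a sketch under stated assumptions: the temperature bands, the threshold of 22, and the evenly spaced cluster initialization are all hypothetical, and the abstract does not state which temperature range indicates instability.

```python
import numpy as np

def threshold_mask(img, t):
    """Classical thresholding: a single binary split of the image."""
    return img > t

def kmeans_1d(values, k, iters=50):
    """K-means on pixel intensities with deterministic, evenly spaced initial centers."""
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic thermal image with three temperature bands (values are hypothetical).
img = np.zeros((6, 9))
img[:, :3] = 20.0
img[:, 3:6] = 25.0
img[:, 6:] = 32.0

binary = threshold_mask(img, 22.0)
labels, centers = kmeans_1d(img.ravel(), k=3)
label_img = labels.reshape(img.shape)
```

Thresholding merges the two warmer bands into one class, whereas three clusters keep them apart, which is the sense in which multiple clusters can express graded risk levels.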
A pancreas surgeon's constant goal is to do "less damage, more radical". Currently, a small number of highly trained surgeons opt for single-incision laparoscopic pancreaticoduodenectomy (SILPD) or single-incision plus one-port LPD (SILPD+1) to minimize post-operative pain, improve convalescence, and provide a more pleasing cosmetic outcome [1,2]. Additionally, some skilled surgeons have claimed that laparoscopic duodenum-preserving complete pancreatic head resections (LDPPHR) result in less trauma and enhanced quality of life [3,4]. However, LDPPHR is still challenging because of its lengthy learning curve and "sword-fighting" impact. Additionally, there has not been any global reporting on the suitability of single-incision plus one-port DPPHR with pancreaticogastrostomy (SILDPPHR-T+1) in place of SILPD+1. This study aimed to illustrate the SILDPPHR-T+1 procedure specifics for a patient with pancreatic head intraductal papillary mucinous neoplasm (IPMN) (main pancreatic duct type) (MD-IPMN).
Infrared and visible image fusion technology integrates the thermal radiation information of infrared images with the texture details of visible images to generate more informative fused images. However, existing methods often fail to distinguish salient objects from background regions, leading to detail suppression in salient regions due to global fusion strategies. This study presents a mask-guided latent low-rank representation fusion method to address this issue. First, the GrabCut algorithm is employed to extract a saliency mask, distinguishing salient regions from background regions. Then, latent low-rank representation (LatLRR) is applied to extract deep image features, enhancing key information extraction. In the fusion stage, a weighted fusion strategy strengthens infrared thermal information and visible texture details in salient regions, while an average fusion strategy improves background smoothness and stability. Experimental results on the TNO dataset demonstrate that the proposed method achieves superior performance in SPI, MI, Qabf, PSNR, and EN metrics, effectively preserving salient target details while maintaining balanced background information. Compared to state-of-the-art fusion methods, our approach achieves more stable and visually consistent fusion results. The fusion code is available on GitHub at: https://github.com/joyzhen1/Image (accessed on 15 January 2025).
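The region-dependent fusion rule described in this abstract (weighted fusion inside the saliency mask, averaging outside it) can be sketched as below. This simplification works directly on pixel values rather than on LatLRR components, takes the mask as given instead of computing it with GrabCut, and the weight `w_ir` is a hypothetical parameter, so it illustrates the idea rather than the published method.

```python
import numpy as np

def mask_guided_fuse(ir, vis, mask, w_ir=0.6):
    """Weighted fusion where mask is True, plain averaging elsewhere.

    w_ir is an assumed weight for the infrared image in salient regions.
    """
    ir = ir.astype(float)
    vis = vis.astype(float)
    salient = w_ir * ir + (1.0 - w_ir) * vis
    background = 0.5 * (ir + vis)
    return np.where(mask, salient, background)

# Hypothetical inputs: uniform images and a mask covering the left half.
ir = np.full((4, 4), 100.0)
vis = np.full((4, 4), 200.0)
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True

fused = mask_guided_fuse(ir, vis, mask)
```

With these inputs the salient half leans toward the infrared value and the background half sits at the midpoint of the two sources.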
The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to effectively handle modal disparities, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce Prompt Fusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. Firstly, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of different modalities, thereby improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent. Our approach advances the state-of-the-art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.
Funding (KPCA and sparse-representation color estimation study): National Natural Science Foundation of China (No. 61072090); the Fundamental Research Funds for the Central Universities, China; Shanghai Pujiang Program, China (No. 12PJ1402200); China Postdoctoral Science Foundation Funded Project (No. 2012M511058); Shanghai Postdoctoral Sustentation Fund, China (No. 12R21412500).
Funding (Sn-Pb perovskite NIR photodetector study): financially supported by the National Key Research and Development Program of China (2022YFA1404201); the National Natural Science Foundation of China (62205187, U23A20380, U22A2091, 62222509, 62127817, 62075120, 62075122, and 62105193); the Changjiang Scholars and Innovative Research Team in University of Ministry of Education of China (IRT_17R70); the Fund for Shanxi "1331 Project" Key Subjects Construction; the Fundamental Research Program of Shanxi Province (202103021223032, 202203021222107).
Funding (PbS CQD SWIR imager study): National Natural Science Foundation of China, Grant/Award Numbers: U22A2083, 62204091, 62374068; National Key Research and Development Program of China, Grant/Award Number: 2021YFA0715502; Key R&D Program of Hubei Province, Grant/Award Number: 2021BAA014; Innovation Project of Optics Valley Laboratory, Grant/Award Numbers: OVL2021BG009, OVL2023ZD002; Exploration Project of Natural Science Foundation of Zhejiang Province, Grant/Award Number: LY23F040005; Fund for Innovative Research Groups of the Natural Science Foundation of Hubei Province, Grant/Award Number: 2020CFA034; Fund from Science, Technology and Innovation Commission of Shenzhen Municipality, Grant/Award Numbers: GJHZ20210705142540010, GJHZ20220913143403007; China Postdoctoral Science Foundation, Grant/Award Numbers: 2021M691118, 2022M711237, 2022M721243, 2023T160244.
Abstract: In recent years face recognition has received substantial attention, but it remains very challenging in real applications. Despite the variety of approaches and tools studied, face recognition is not accurate or robust enough to be used in uncontrolled environments. Infrared (IR) imagery of human faces offers a promising alternative to visible imagery; however, IR has its own limitations. In this paper, a scheme to fuse information from the two modalities is proposed. The scheme is based on eigenfaces and a probabilistic neural network (PNN), using a fuzzy integral to fuse the objective evidence supplied by each modality. Recognition rate is used to evaluate the fusion scheme. Experimental results show that the scheme improves recognition performance substantially.
Funding: Supported by the National "863" Project of China (No. 2007AA01Z164) and the National Natural Science Foundation of China (Nos. 60675023 and 60602012).
Abstract: To address two challenging problems in infrared target tracking, target appearance changes and unpredictable abrupt motions, a novel particle-filtering-based tracking algorithm is introduced. In this method, a novel saliency model is proposed to distinguish the salient target from the background, and the eigenspace model is invoked to adapt to target appearance changes. To account for abrupt motions efficiently, a two-step sampling method is proposed to combine the two observation models. The proposed tracking method is demonstrated on two real infrared image sequences, which include changes in the luminance and size of the target as well as drastic abrupt motions.
Funding: Supported by the National Key Research and Development Project of China (No. 2023YFB3709605), the National Natural Science Foundation of China (No. 62073193), and the National College Student Innovation Training Program (No. 202310422122).
Abstract: Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: the median filtering algorithm is used to denoise the original infrared image, and an image segmentation algorithm is then applied to divide the image.
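The two stages named in this abstract (median-filter denoising followed by segmentation) can be sketched in a few lines. The abstract does not say which segmentation algorithm is used, so the fixed threshold below is purely an assumption for illustration, as are the function names `median_filter_3x3` and `hot_mask`, the toy frame, and the threshold of 60.

```python
from statistics import median


def median_filter_3x3(img):
    """Denoise a 2-D list of intensities with a 3x3 median filter.

    Border pixels use whatever part of the window falls inside the image.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


def hot_mask(img, threshold):
    """Assumed segmentation step: mark pixels at or above a fixed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


# Toy frame: a uniform background, one isolated noisy pixel, and a hot patch.
frame = [
    [30, 30, 30, 30, 30],
    [30, 95, 30, 30, 30],  # 95 is an isolated noise spike
    [30, 30, 30, 80, 82],
    [30, 30, 30, 81, 83],
    [30, 30, 30, 80, 84],
]

denoised = median_filter_3x3(frame)
mask = hot_mask(denoised, threshold=60)  # threshold chosen only for this toy data
for row in mask:
    print(row)
```

Filtering before thresholding keeps the isolated spike from being flagged as a hot spot while the contiguous hot patch survives, which is the point of the denoise-then-segment order.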
Funding: Supported by the Henan Province Key Research and Development Project (231111211300), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Province Key Research and Development Project (231111110100), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which are used to extract low-frequency and high-frequency information from the image. This extraction may lead to some information not being captured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the National Pre-research Program during the 14th Five-Year Plan (514010405).
Abstract: In response to the scarcity of infrared aircraft samples and the tendency of traditional deep learning to overfit, a few-shot infrared aircraft classification method based on cross-correlation networks is proposed. This method combines two core modules: a simple parameter-free self-attention and a cross-attention. By analyzing the self-correlation and cross-correlation between support images and query images, it achieves effective classification of infrared aircraft under few-shot conditions. The proposed cross-correlation network integrates these two modules and is trained in an end-to-end manner. The simple parameter-free self-attention is responsible for extracting the internal structure of the image, while the cross-attention calculates the cross-correlation between images, further extracting and fusing the features between images. Compared with existing few-shot infrared target classification models, this model focuses on the geometric structure and thermal texture information of infrared images by modeling the semantic relevance between the features of the support set and query set, thus better attending to the target objects. Experimental results show that this method outperforms existing infrared aircraft classification methods in various classification tasks, with the highest classification accuracy improvement exceeding 3%. In addition, ablation experiments and comparative experiments also demonstrate the effectiveness of the method.
Funding: Co-supported by the China Postdoctoral Science Foundation (No. 2024M754304) and the Hunan Provincial Natural Science Foundation of China (No. 2025JJ60072).
Abstract: 1. Introduction Infrared Imaging Missiles (IRIMs) are advanced weapons utilizing infrared technology for target detection and tracking. Their sensors capture thermal signatures and convert them into electronic images, enabling precise target identification and tracking. To a certain extent, the all-weather adaptability of IRIMs enables their effective operation across diverse environmental conditions, providing high targeting accuracy and cost efficiency.
Funding: Supported by the Science and Technology Project of State Grid Corporation of China (52094024003D).
Abstract: As modern power systems grow in complexity, accurate and efficient fault detection has become increasingly important. While many existing reviews focus on a single modality, this paper presents a comprehensive survey from a dual-modality perspective covering infrared imaging and voiceprint analysis, two complementary, non-contact techniques that capture different fault characteristics. Infrared imaging excels at detecting thermal anomalies, while voiceprint signals provide insight into mechanical vibrations and internal discharge phenomena. We review both traditional signal processing and deep learning-based approaches for each modality, categorized by key processing stages such as feature extraction and classification. The paper highlights how these modalities address distinct fault types and how they may be fused to improve robustness and accuracy. Representative datasets are summarized, and practical challenges such as noise interference, limited fault samples, and deployment constraints are discussed. By offering a cross-modal, comparative analysis, this work aims to bridge fragmented research and guide future development in intelligent fault detection systems. The review concludes with research trends including multimodal fusion, lightweight models, and self-supervised learning.
Funding: Supported by the Natural Science Foundation of Shandong Province, China (ZR2022MF237), the National Natural Science Foundation of China Youth Fund (62406155), and the Major Innovation Project (2023JBZ02) of Qilu University of Technology (Shandong Academy of Sciences).
Abstract: The purpose of infrared and visible image fusion is to create a single image containing the texture details and significant object information of the source images, particularly in challenging environments. However, existing image fusion algorithms are generally suitable only for normal scenes. In hazy scenes, much of the texture information in the visible image is hidden, so the results of existing methods are dominated by infrared information, resulting in a lack of texture details and poor visual quality. To address these difficulties, we propose a haze-free infrared and visible fusion method, termed HaIVFusion, which can eliminate the influence of haze and obtain richer texture information in the fused image. Specifically, we first design a scene information restoration network (SIRNet) to mine the masked texture information in visible images. Then, a denoising fusion network (DFNet) is designed to integrate the features extracted from infrared and visible images and remove the influence of residual noise as much as possible. In addition, we use a color consistency loss to reduce the color distortion resulting from haze. Furthermore, we publish a dataset of hazy scenes for infrared and visible image fusion to promote research in extreme scenes. Extensive experiments show that HaIVFusion produces fused images with increased texture details and higher contrast in hazy scenes, and achieves better quantitative results than state-of-the-art image fusion methods, even when those methods are combined with state-of-the-art dehazing methods.
Funding: Funded in part by the Henan Province Key R&D Program Project, "Research and Application Demonstration of Class Ⅱ Superlattice Medium Wave High Temperature Infrared Detector Technology", under Grant No. 231111210400.
Abstract: Infrared imaging technology has been widely adopted in various fields, such as military reconnaissance, medical diagnosis, and security monitoring, due to its excellent ability to penetrate smoke and fog. However, the prevalent low resolution of infrared images severely limits the accurate interpretation of their contents. In addition, deploying super-resolution models on resource-constrained devices faces significant challenges. To address these issues, this study proposes a lightweight super-resolution network for infrared images based on an adaptive attention mechanism. The network's dynamic weighting module automatically adjusts the weights of the attention and non-attention branch outputs based on the network's characteristics at different levels. The attention branch is further subdivided into pixel attention and brightness-texture attention, which are specialized for extracting the most informative features in infrared images. Meanwhile, the non-attention branch supplements the extraction of neglected features to make the feature set more comprehensive. Through ablation experiments, we verify the effectiveness of the proposed module. Finally, experiments on two datasets, FLIR and Thermal101, show qualitatively and quantitatively that the model can effectively recover high-frequency details of infrared images and significantly improve image resolution. In detail, compared with the suboptimal method, we have reduced the number of parameters by 30% and improved the model performance. When the scale factor is 2, the peak signal-to-noise ratio on the test datasets FLIR and Thermal101 is improved by 0.09 and 0.15 dB, respectively. When the scale factor is 4, it is improved by 0.05 and 0.09 dB, respectively. In addition, due to the lightweight design of the network structure, it has a low computational cost. It is suitable for deployment on edge devices, thus effectively enhancing the sensing performance of infrared imaging devices.
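For readers weighing the dB figures quoted above, peak signal-to-noise ratio is conventionally defined as PSNR = 10 · log10(MAX² / MSE), so the reported gains are on a logarithmic scale. The snippet below is a minimal sketch of that definition; the function name `psnr`, the 8-bit peak value of 255, and the toy images are assumptions, and the paper's own evaluation protocol (for example colour channel handling or border cropping) is not specified in the abstract.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    `peak` is the maximum possible pixel value (255 assumed for 8-bit data).
    """
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


# Toy example: a 2x2 reference and a reconstruction that is off by one
# grey level in two pixels.
reference = [[100, 100], [100, 100]]
estimate = [[101, 99], [100, 100]]
print(f"PSNR = {psnr(reference, estimate):.2f} dB")
```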
Funding: This research was sponsored by the Xinjiang Uygur Autonomous Region Tianshan Talent Programme Project (2023TCLJ02) and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C349).
Abstract: Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. In such cases, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method is based on an improvement of the MobileOne Block, using the Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The intermediate features obtained at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used for image enhancement of both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, after extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while only being 1.5 ms/it slower in computation speed than the fastest method.
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (No. 2025A1515011617, 2022A1515110570), the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK259909299001-006), the Innovation Teams of Youth Innovation in Science and Technology of High Education Institutions of Shandong Province (No. 2021KJ088), the Anhui Provincial Joint Construction Key Laboratory of Intelligent Education Equipment and Technology (No. IEET202401), and the Aeronautical Science Foundation of China (No. 2022Z0710T5001).
Abstract: In the field of infrared small target detection (ISTD), the ability to detect targets in dim environments is critical, as it improves the performance of target recognition in nighttime and harsh weather conditions. The blurry contour, small size and sparse distribution of infrared small targets increase the difficulty of identifying such targets in cluttered backgrounds. Existing methodologies fall short of satisfying the requisites for the detection and categorisation of infrared small targets. To address these challenges and to enhance the precision of small object detection and classification, this paper introduces an innovative approach called location refinement and adjacent feature fusion YOLO (LA-YOLO), which enhances feature extraction by integrating a multi-head self-attention mechanism (MSA). We have improved the feature fusion method to merge adjacent features, to enhance information utilisation in the path aggregation network (PAN). Lastly, we introduce supervision on the target centre points in the detection network. Empirical results on publicly available datasets demonstrate that LA-YOLO achieves an average precision (AP) of 92.46% on IST-A and a mean average precision (mAP) of 84.82% on FLIR. The results surpass those of contemporary state-of-the-art detectors, striking a balance between precision and speed. LA-YOLO emerges as a viable and efficacious solution for ISTD, making a substantial contribution to the progression of infrared imagery analysis. The code is available at https://github.com/liusjo/LA-YOLO.
Funding: Financially supported by the National Key R&D Program of China (No. 2022YFC3080200).
Abstract: 0 INTRODUCTION Geohazards in mountainous regions pose significant risks to the construction and safe operation of transportation, water conservancy, and other critical infrastructure projects. Engineering geological investigations are crucial for disaster prevention and mitigation.
Funding: Science and Technology Department of Jilin Province (No. 20200403075SF); Education Department of Jilin Province (No. JJKH20240148KJ).
Abstract: As technologies related to power equipment fault diagnosis and infrared temperature measurement continue to advance, the classification and identification of infrared temperature measurement images have become crucial for effective intelligent fault diagnosis of various electrical equipment. In response to the increasing demand for sufficient feature fusion in current real-time detection and the low detection accuracy of existing networks for substation fault diagnosis, we introduce an innovative method known as Gather and Distribution Mechanism-You Only Look Once (GD-YOLO). Firstly, a partial convolution group is designed based on different convolution kernels. We combine the partial convolution group with deep convolution to propose a new Grouped Channel-wise Spatial Convolution (GCSConv) that compensates for the information loss caused by spatial channel convolution. Secondly, the Gather and Distribute Mechanism, which addresses the fusion problem of different dimensional features, is implemented by aligning and sharing information through aggregation and distribution mechanisms. Thirdly, considering the limitations in current bounding box regression and the imbalance between complex and simple samples, Maximum Possible Distance Intersection over Union (MPDIoU) and Adaptive SlideLoss are incorporated into the loss function, allowing samples near the Intersection over Union (IoU) to receive more attention through the dynamic variation of the mean Intersection over Union. The GD-YOLO algorithm surpasses YOLOv5, YOLOv7, and YOLOv8 in infrared image detection for electrical equipment, achieving a mean Average Precision (mAP) of 88.9%, with accuracy improvements of 3.7%, 4.3%, and 3.1%, respectively. Additionally, the model delivers a frame rate of 48 FPS, which meets the precision and speed criteria necessary for the detection of infrared images in power equipment.
Funding: Supported by the Moroccan Ministry of Higher Education, Scientific Research, and Innovation; the Moroccan Digital Development Agency (DDA); the National Center for Scientific and Technical Research of Morocco (CNRST) through the Al-Khawarizmi project; the MANAGEM group; and MASCIR, in support of this project.
Abstract: Rockfalls are among the frequent hazards in underground mines worldwide, requiring effective methods for detecting unstable rock blocks to ensure the safety of miners and equipment. This study proposes a novel approach for identifying potential rockfall zones using infrared thermal imaging and image segmentation techniques. Infrared images of rock blocks were captured at the Draa Sfar deep underground mine in Morocco using the FLUKE TI401 PRO thermal camera. Two segmentation methods were applied to locate the potentially unstable areas: classical thresholding and the K-means clustering model. The results show that while thresholding allows a binary distinction between stable and unstable areas, K-means clustering is more accurate, especially when using multiple clusters to show different risk levels. The close match between the clustering masks of unstable blocks and their corresponding visible light images further validated this. The findings confirm that thermal image segmentation can serve as an alternative method for predicting rockfalls and monitoring geotechnical issues in underground mines. Underground operators worldwide can apply this approach to monitor rock mass stability. However, further research is recommended to enhance these results, particularly through deep learning-based segmentation and object detection models.
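The contrast drawn here between a binary threshold and multi-cluster K-means can be illustrated on a handful of temperature readings. This is a toy sketch only: the readings, the threshold of 19.0, the choice of three clusters, and the helper names `threshold_labels` and `kmeans_1d` are all assumptions, and the abstract does not state which temperature range corresponds to unstable rock.

```python
def threshold_labels(values, t):
    """Binary segmentation: 0 below the threshold, 1 at or above it."""
    return [0 if v < t else 1 for v in values]


def kmeans_1d(values, k, iters=50):
    """Plain 1-D K-means with deterministic, evenly spaced initial centroids.

    Returns (labels, centroids); label 0 is the coldest cluster.
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical surface temperatures (degrees C) sampled along a rock face.
temps = [18.0, 18.2, 18.1, 19.5, 19.6, 21.0, 21.2, 21.1]

binary = threshold_labels(temps, 19.0)
labels, centroids = kmeans_1d(temps, k=3)
print("threshold:", binary)
print("k-means:  ", labels)
```

With three clusters the intermediate readings form their own group instead of being forced onto one side of a single cut, which is the kind of graded risk level the abstract attributes to K-means.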
Funding: Supported by grants from the National Natural Science Foundation of China (81302161 and 82003103) and the Science and Technology Department of Sichuan Province (2021YFS0375 and 2020YJ0450).
Abstract: A pancreas surgeon's constant goal is to do "less damage, more radical". Currently, a small number of highly trained surgeons opt for single-incision laparoscopic pancreaticoduodenectomy (SILPD) or single-incision plus one-port LPD (SILPD+1) to minimize post-operative pain, improve convalescence, and provide a more pleasing cosmetic outcome [1,2]. Additionally, some skilled surgeons have claimed that laparoscopic duodenum-preserving complete pancreatic head resections (LDPPHR) result in less trauma and enhanced quality of life [3,4]. However, LDPPHR is still challenging because of its lengthy learning curve and "sword-fighting" impact. Additionally, there has not been any global reporting on the suitability of single-incision plus one-port DPPHR with pancreaticogastrostomy (SILDPPHR-T+1) in place of SILPD+1. This study aimed to illustrate the SILDPPHR-T+1 procedure specifics for a patient with pancreatic head intraductal papillary mucinous neoplasm (IPMN) (main pancreatic duct type) (MD-IPMN).
Funding: Supported by Universiti Teknologi MARA through the UiTM MyRA Research Grant, 600-RMC 5/3/GPM (053/2022).
Abstract: Infrared and visible image fusion technology integrates the thermal radiation information of infrared images with the texture details of visible images to generate more informative fused images. However, existing methods often fail to distinguish salient objects from background regions, leading to detail suppression in salient regions due to global fusion strategies. This study presents a mask-guided latent low-rank representation fusion method to address this issue. First, the GrabCut algorithm is employed to extract a saliency mask, distinguishing salient regions from background regions. Then, latent low-rank representation (LatLRR) is applied to extract deep image features, enhancing key information extraction. In the fusion stage, a weighted fusion strategy strengthens infrared thermal information and visible texture details in salient regions, while an average fusion strategy improves background smoothness and stability. Experimental results on the TNO dataset demonstrate that the proposed method achieves superior performance in SPI, MI, Qabf, PSNR, and EN metrics, effectively preserving salient target details while maintaining balanced background information. Compared to state-of-the-art fusion methods, our approach achieves more stable and visually consistent fusion results. The fusion code is available on GitHub at: https://github.com/joyzhen1/Image (accessed on 15 January 2025).
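The region-dependent rule in the fusion stage (weighted in salient regions, averaged in the background) can be sketched as follows. This is a simplification under stated assumptions: it blends raw pixel values rather than the LatLRR components the method actually operates on, the mask is supplied by hand instead of coming from GrabCut, and the name `mask_guided_fuse` and the weight `w_ir=0.7` are hypothetical.

```python
def mask_guided_fuse(ir, vis, mask, w_ir=0.7):
    """Fuse two equally sized 2-D images under a binary saliency mask.

    Salient pixels (mask == 1): weighted sum favouring the infrared value.
    Background pixels (mask == 0): plain average of both sources.
    The weight w_ir is an illustrative assumption, not a published value.
    """
    fused = []
    for ir_row, vis_row, mask_row in zip(ir, vis, mask):
        row = []
        for a, b, m in zip(ir_row, vis_row, mask_row):
            if m:
                row.append(w_ir * a + (1.0 - w_ir) * b)
            else:
                row.append(0.5 * (a + b))
        fused.append(row)
    return fused


# Toy 2x2 inputs: one salient (hot) pixel in the top-left corner.
ir = [[200, 40], [40, 40]]
vis = [[100, 60], [60, 80]]
mask = [[1, 0], [0, 0]]

fused = mask_guided_fuse(ir, vis, mask)
for row in fused:
    print(row)
```

Keeping the two rules separate means the background is never pulled toward the infrared source by a global weight, which is the detail-suppression problem the abstract attributes to global fusion strategies.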
Funding: Partially supported by the China Postdoctoral Science Foundation (2023M730741) and the National Natural Science Foundation of China (U22B2052, 52102432, 52202452, 62372080, 62302078).
Abstract: The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to effectively handle modal disparities, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce Prompt Fusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. Firstly, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of different modalities, thereby improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient descent. Our approach advances the state-of-the-art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.