In recent years, guided image fusion algorithms have become increasingly popular. However, current algorithms cannot eliminate halo artifacts. We propose an image fusion algorithm based on a fast weighted guided filter. First, the source images are decomposed into a series of high- and low-frequency components. Second, three visual features of the source image are extracted to construct a decision-map model. Third, a fast weighted guided filter is proposed to refine the result of the previous step and to reduce time complexity by exploiting the correlation among neighboring pixels. Finally, the refined result is combined with the weight map to produce the fused image. The proposed algorithm is applied to multi-focus, visible-infrared, and multi-modal images, and the results show that it effectively suppresses halo artifacts in the fused images with higher efficiency, outperforming traditional methods in both subjective visual quality and objective evaluation.
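The smoothing step this abstract builds on can be illustrated with the standard (unweighted) guided filter of He et al.; the paper's fast weighted variant adds per-window weighting and acceleration tricks that are omitted here. The helper name `box_filter` and the parameter defaults are illustrative assumptions, not the authors' code.

```python
import numpy as np

def box_filter(img, r):
    """Mean over (2r+1)x(2r+1) windows via a summed-area table (edge padding)."""
    h, w = img.shape
    pad = np.pad(img, r, mode='edge')
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))                # prepend a zero row/column
    s = (c[2*r+1:2*r+1+h, 2*r+1:2*r+1+w] - c[:h, 2*r+1:2*r+1+w]
         - c[2*r+1:2*r+1+h, :w] + c[:h, :w])
    return s / float((2 * r + 1) ** 2)

def guided_filter(I, p, r=4, eps=1e-3):
    """Standard guided filter: smooth p while following the edges of guide I."""
    mean_I, mean_p = box_filter(I, r), box_filter(p, r)
    cov_Ip = box_filter(I * p, r) - mean_I * mean_p
    var_I = box_filter(I * I, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)                     # per-window linear coefficients
    b = mean_p - a * mean_I
    return box_filter(a, r) * I + box_filter(b, r)
```

In fusion pipelines of this kind, `p` is typically a rough per-pixel weight map and `I` the source image, so the refined weights align with object boundaries instead of producing halos.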
The evaluation system for multi-source image fusion is an important and necessary part of image fusion research. Both qualitative and quantitative evaluation indexes were studied. A series of new concepts, such as the independent single evaluation index, the union single evaluation index, and the synthetic evaluation index, were proposed, and based on these concepts a synthetic evaluation system for digital image fusion was formed. Experiments with the wavelet fusion method, applied to fuse multi-spectral and panchromatic remote sensing images, IR and visible images, CT and MRI images, and multi-focus images, show that it is an objective, uniform, and effective quantitative method for image fusion evaluation.
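Two widely used single evaluation indexes of the kind discussed here, the information entropy of the fused image and the mutual information between a source image and the fused image, can be computed as follows. This is a generic sketch with an assumed binning scheme; the paper's exact index definitions may differ.

```python
import numpy as np

def entropy(img, bins=64):
    """Shannon entropy (bits) of an image's intensity histogram, values in [0,1]."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a, b, bins=64):
    """Mutual information (bits) between two same-sized images in [0,1],
    estimated from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0.0, 1.0], [0.0, 1.0]])
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum())
```

A "union" index in the paper's sense could then combine, e.g., the mutual information of the fused image with each source; with identical images, MI reduces exactly to the entropy, which is a handy sanity check.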
Image fusion based on sparse representation (SR) has become the primary research direction within transform-domain methods. However, SR-based image fusion algorithms suffer from high computational complexity and neglect the local features of an image, resulting in limited detail retention and high sensitivity to registration misalignment. To overcome these shortcomings, and the noise present in the images during fusion, this paper proposes a new signal decomposition model: a multi-source image fusion algorithm based on gradient-regularized convolutional sparse representation (CSR). The main innovation of this work is using a sparse optimization function to perform a two-scale decomposition of the source images into high-frequency and low-frequency components. Sparse coefficients are obtained with the gradient-regularized CSR model, and taking the maximum sparse coefficient yields the optimal high-frequency component of the fused image. The best low-frequency component is obtained with an extreme-value or averaging fusion strategy, and the final fused image is obtained by adding the two optimal components. Experimental results demonstrate that this method greatly improves detail preservation and reduces registration sensitivity.
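The two-scale decomposition and max-coefficient fusion rule described above can be sketched in simplified form. Here a plain mean filter stands in for the paper's sparse optimization function, and the max-absolute rule is applied directly to pixel-wise detail values rather than to CSR coefficients, so this is an illustration of the fusion scheme, not the gradient-regularized model itself.

```python
import numpy as np

def two_scale_decompose(img, r=3):
    """Split an image into a low-frequency base and a high-frequency detail layer
    (mean filter used as a stand-in for the paper's sparse optimization)."""
    k = 2 * r + 1
    pad = np.pad(img, r, mode='edge')
    base = np.mean([pad[i:i + img.shape[0], j:j + img.shape[1]]
                    for i in range(k) for j in range(k)], axis=0)
    return base, img - base

def fuse(imgs, r=3):
    """Max-absolute rule for the detail layers, averaging for the base layers."""
    bases, details = zip(*(two_scale_decompose(im, r) for im in imgs))
    details = np.stack(details)
    idx = np.abs(details).argmax(axis=0)           # winning source per pixel
    fused_detail = np.take_along_axis(details, idx[None], axis=0)[0]
    return np.mean(bases, axis=0) + fused_detail
```

Because the two layers sum back to the source, fusing an image with itself returns the image unchanged, which makes the decomposition easy to verify.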
The spatial offset of a bridge has a significant impact on the safety, comfort, and durability of high-speed railway (HSR) operations, so it is crucial to rapidly and effectively detect the spatial offset of operational HSR bridges. Drive-by monitoring of bridge uneven settlement shows significant potential owing to its practicality, cost-effectiveness, and efficiency. However, existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source, low detection accuracy, and an inability to identify lateral deformations. This paper proposes a novel drive-by inspection method for the spatial offset of HSR bridges based on multi-source data fusion from a comprehensive inspection train. First, dung beetle optimizer-variational mode decomposition was employed to adaptively decompose the non-stationary dynamic signals and uncover the hidden temporal relationships in the data. Subsequently, a long short-term memory neural network was developed to fuse features from the multi-source signals and accurately predict the spatial settlement of HSR bridges. A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model, and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated. Finally, the reliability of the proposed drive-by inspection method was further validated against actual measurement data from a comprehensive inspection train. The findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridges, helping to ensure their long-term operational safety.
Background: Medical imaging advances are constrained by fundamental trade-offs between acquisition speed, radiation dose, and image quality, forcing clinicians to work with noisy, incomplete data. Existing reconstruction methods either compromise accuracy with iterative algorithms or suffer from limited generalizability with task-specific deep learning approaches. Methods: We present LDM-PIR, a lightweight physics-conditioned diffusion multi-model for medical image reconstruction that addresses key challenges in magnetic resonance imaging (MRI), CT, and low-photon imaging. Unlike traditional iterative methods, which are computationally expensive, or task-specific deep learning approaches, which lack generalizability, LDM-PIR integrates three innovations: a physics-conditioned diffusion framework that embeds acquisition operators (Fourier/Radon transforms) and noise models directly into the reconstruction process; a multi-model architecture that unifies denoising, inpainting, and super-resolution via shared weight conditioning; and a lightweight design (2.1M parameters) enabling rapid inference (0.8 s/image on GPU). Through self-supervised fine-tuning with measurement-consistency losses, the model adapts to new imaging modalities using fewer annotated samples. Results: LDM-PIR achieves state-of-the-art performance on fastMRI (peak signal-to-noise ratio (PSNR): 34.04 for single-coil, 31.50 for multi-coil) and on the Lung Image Database Consortium and Image Database Resource Initiative dataset (28.83 PSNR under Poisson noise). Clinical evaluations demonstrate superior preservation of anatomical structures, with SSIM improvements of 8.8% for single-coil and 4.36% for multi-coil MRI over uDPIR. Conclusion: LDM-PIR offers a flexible, efficient, and scalable solution for medical image reconstruction, addressing the challenges of noise, undersampling, and modality generalization. The model's lightweight design allows rapid inference, while its self-supervised fine-tuning minimizes reliance on large annotated datasets, making it suitable for real-world clinical applications.
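The idea of embedding the acquisition operator into reconstruction can be illustrated, for Fourier (MRI-style) sampling, by a hard measurement-consistency step: the sampled k-space entries of the current estimate are overwritten by the measured values. This is a generic textbook sketch of measurement consistency, not LDM-PIR's actual conditioning mechanism; the function name and arguments are assumptions.

```python
import numpy as np

def data_consistency(x, y, mask):
    """Project estimate x onto the set consistent with measurements y:
    x       -- current image estimate (2D real array)
    y       -- measured k-space (2D complex array, valid where mask is True)
    mask    -- boolean undersampling pattern over k-space"""
    k = np.fft.fft2(x)
    k = np.where(mask, y, k)          # keep measured frequencies exactly
    return np.real(np.fft.ifft2(k))
```

In iterative or diffusion-based schemes, a step like this (or a soft, weighted version of it) is interleaved with the learned prior update so the output never drifts away from what was actually acquired.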
Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method combining an improved multi-scale, multi-direction Harris algorithm with a novel compound feature. Multi-scale circular Gaussian combined invariant moments and a multi-direction gray-level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that our method obtains stable, evenly distributed key points as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
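The Harris detector underlying the key-point extraction scores each pixel via the structure tensor of local image gradients. A single-scale sketch follows; the paper's multi-scale and multi-direction extensions, and the Gaussian windowing usually used in practice, are simplified to a 3x3 mean window here.

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner score R = det(M) - k*trace(M)^2, where M is the
    locally averaged structure tensor of the image gradients."""
    Iy, Ix = np.gradient(img.astype(float))        # gradients along rows, cols

    def smooth(a):                                 # 3x3 mean window
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    Sxx, Syy, Sxy = smooth(Ix * Ix), smooth(Iy * Iy), smooth(Ix * Iy)
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2
```

Key points are then taken as local maxima of `R` above a threshold: flat regions score near zero, edges negative, and corners positive.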
Remote sensing image fusion is an effective way to use the large volume of data from multi-source images. This paper introduces a new method of remote sensing image fusion based on the support vector machine (SVM), using high-spatial-resolution SPIN-2 data and SPOT-4 multi-spectral remote sensing data. First, the method is established by building an SVM-based model of remote sensing image fusion. Then, classification fusion is tested with the SPIN-2 and SPOT-4 data. Finally, the fusion result is evaluated in two ways. 1) By subjective assessment, the spatial resolution of the fused image is improved compared with SPOT-4, and the texture of the fused image is clearly distinctive. 2) By quantitative analysis, the classification fusion performs better. On the whole, the results show that the accuracy of SVM-based image fusion is high, and the SVM algorithm can be recommended for application in remote sensing image fusion processes.
We present a novel sea-ice classification framework based on locality-preserving fusion of multi-source image information. The locality preservation is two-fold, i.e., local characterization in both the spatial and feature domains. We commence by simultaneously learning a projection matrix, which preserves spatial localities, and a similarity matrix, which encodes feature similarities. We map the pixels of the multi-source images through the projection matrix to a set of fusion vectors that preserve the spatial localities of the image. On the other hand, by applying the Laplacian eigen-decomposition to the similarity matrix, we obtain another set of fusion vectors that preserve the local feature similarities. We concatenate the fusion vectors for both spatial and feature locality preservation to obtain the fusion image. Finally, we classify the fusion image pixels with a novel sliding ensemble strategy, which enhances locality preservation in classification. Our locality-preserving fusion framework is effective in classifying multi-source sea-ice images (e.g., multi-spectral and synthetic aperture radar (SAR) images) because it not only comprehensively captures spatial neighboring relationships but also intrinsically characterizes the feature associations between different types of sea ice. Experimental evaluations validate the effectiveness of our framework.
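The Laplacian eigen-decomposition step, which maps a feature-similarity matrix to fusion vectors, amounts to spectral embedding of the similarity graph. A minimal sketch follows; the function name and the choice of the symmetrically normalized Laplacian are assumptions, since the abstract does not specify the normalization.

```python
import numpy as np

def spectral_embedding(W, dim=2):
    """Embed samples via the eigenvectors of the normalized graph Laplacian
    L = I - D^{-1/2} W D^{-1/2}, skipping the trivial 0-eigenvalue vector.
    W must be a symmetric similarity matrix with positive row sums."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]                      # smallest non-trivial eigenvectors
```

Pixels with similar features end up close in the embedded coordinates, which is exactly the "feature locality preservation" the fusion vectors are meant to provide.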
Geological data are constructed in vector format in a geographical information system (GIS), while other data such as remote sensing images, geographical data, and geochemical data are saved in raster format. This paper converts the vector data into 8-bit raster images programmatically, weighting each layer according to its importance to mineralization, so that the geological meaning can be communicated through the raster images. The paper also fuses the geographical and geochemical data with the programmed strata data. The results show that image fusion can express different intensities effectively and visualize structural characteristics in two dimensions. Furthermore, it can produce optimized information from multi-source data and express it more directly.
Mangroves are indispensable for protecting coastlines, maintaining biodiversity, and mitigating climate change, so improving the accuracy of mangrove identification is crucial for their ecological protection. To address the limited morphological information of synthetic aperture radar (SAR) images, which suffer heavy noise interference, and the susceptibility of optical images to weather and lighting conditions, this paper proposes a pixel-level weighted fusion method for SAR and optical images. Image fusion enhances the target features and makes mangrove monitoring more comprehensive and accurate. To address the high similarity between mangroves and other forests, this paper builds on the U-Net convolutional neural network and adds an attention mechanism in the feature extraction stage so that the model pays more attention to the mangrove vegetation areas in the image. To accelerate convergence and normalize the input, a batch normalization (BN) layer and a Dropout layer are added after each convolutional layer. Since mangroves are a minority class in the image, an improved cross-entropy loss function is introduced to improve the model's ability to recognize them. The AttU-Net model for mangrove recognition in high-similarity environments is thus constructed on the fused images. Comparison experiments show that the overall accuracy of the improved U-Net model trained on the fused images is significantly higher. Based on the fused images, the recognition results of the proposed AttU-Net model are compared with its benchmark model, U-Net, as well as the Dense-Net, Res-Net, and Seg-Net methods. The AttU-Net model captures the complex structures and textural features of mangroves in images more effectively. The average OA, F1-score, and Kappa coefficient over the four tested regions were 94.406%, 90.006%, and 84.045%, respectively, significantly higher than those of the other methods. This method can provide technical support for the monitoring and protection of mangrove ecosystems.
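Pixel-level weighted fusion of co-registered SAR and optical imagery can be sketched as a per-pixel convex combination. The gradient-magnitude weight heuristic below is an illustrative assumption (SAR contributes more where it carries structure); the paper's actual weighting scheme may differ.

```python
import numpy as np

def structure_weight(img, eps=1e-8):
    """Heuristic per-pixel weight in [0, 1] from gradient magnitude.
    This weighting rule is an assumption for illustration only."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + eps)

def weighted_fuse(sar, optical, w_sar):
    """Pixel-level weighted fusion: each output pixel is a convex
    combination of the co-registered SAR and optical pixels."""
    w = np.clip(w_sar, 0.0, 1.0)
    return w * sar + (1.0 - w) * optical
```

With `w_sar = 1` the output is pure SAR, with `w_sar = 0` pure optical, and intermediate weights blend the modalities pixel by pixel before the fused image is passed to the segmentation network.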
Roadbed disease detection is essential for maintaining road functionality. Ground penetrating radar (GPR) enables non-destructive detection without drilling. However, identification currently relies largely on manual inspection, which requires extensive experience, suffers from low efficiency, and is highly subjective. Because the results are presented as radar images, image processing methods can be applied for fast and objective identification, and deep learning-based approaches now offer a robust solution for automated roadbed disease detection. This study proposes an enhanced Faster Region-based Convolutional Neural Network (Faster R-CNN) framework integrating ResNet-50 as the backbone and a two-dimensional discrete Fourier transform (2D-DFT) spectrum for frequency-domain feature fusion. A dedicated GPR image dataset of 1650 annotated images was constructed and augmented to 6600 images via median filtering, histogram equalization, and binarization. The proposed model segments defect regions, applies binary masking, and fuses frequency-domain features to improve small-target detection against noisy backgrounds. Experimental results show that the improved Faster R-CNN achieves a mean average precision (mAP) of 0.92, a 0.22 increase over the baseline. Precision improved by 26% while recall remained stable at 87%. The model was further validated on real urban road data, demonstrating robust detection even under interference. These findings highlight the potential of combining GPR with deep learning for efficient, non-destructive roadbed health monitoring.
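A 2D-DFT feature of the kind fused here can be sketched as the centered log-magnitude spectrum of an image patch. One reason magnitude spectra are attractive as complementary features is their invariance to circular shifts of the patch; the function name is an assumption.

```python
import numpy as np

def dft_magnitude_feature(patch):
    """Centered log-magnitude spectrum of a 2D patch: a frequency-domain
    feature map that can be fused alongside spatial CNN features."""
    spec = np.fft.fftshift(np.fft.fft2(patch))     # move DC to the center
    return np.log1p(np.abs(spec))                  # compress the dynamic range
```

For a constant patch, all energy sits at the DC bin in the center; periodic clutter in GPR B-scans similarly concentrates into a few bins, which is what makes the spectrum useful for separating defects from background texture.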
The noise present in depth images obtained with RGB-D sensors stems from a combination of hardware limitations and environmental factors; given the limited capabilities of the sensors, it also degrades downstream computer vision results. Common image denoising techniques, being based on spatial and frequency filtering, tend to remove significant image details along with the noise. The framework presented in this paper is a novel denoising model that applies Boruta-driven feature selection to a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, maximizing spatial structural integrity and reducing redundancy. The LSTMAE then processes these selected features, modeling depth pixel sequences to generate robust, noise-resistant representations: the encoder compresses the input into a latent space, which the decoder reconstructs to recover the clean image. Experiments on a benchmark dataset show that the proposed technique attains a PSNR of 45 dB and an SSIM of 0.90, 10 dB higher than conventional convolutional autoencoders and 15 times higher than wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, yielding a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers an efficient and scalable system for depth image denoising, with high potential for close-range 3D applications such as robotic manipulation and gesture-based interfaces.
Benthic habitat mapping is an emerging discipline in international marine science, providing an effective tool for marine spatial planning, ecological management, and decision-making. Seabed sediment classification is one of its main components. Because remote sensing imaging quality and the limited range of acoustic measurements mean that a single data source cannot fully reflect the substrate type, we propose a high-precision seabed sediment classification method that integrates data from multiple sources. Based on WorldView-2 multi-spectral remote sensing imagery and multibeam bathymetry data, we constructed a random forest (RF) classifier with optimal feature selection. A seabed sediment classification experiment integrating optical and acoustic remote sensing data was carried out in the shallow waters of Wuzhizhou Island, Hainan, South China. Different sediment types, such as sand, seagrass, and coral reefs, were effectively identified, with an overall classification accuracy of 92%. Experimental results show that the RF classifier with feature selection optimized by fusing multi-source remote sensing data outperformed simple combinations of the data sources, improving the accuracy of seabed sediment classification. The proposed method can therefore be effectively applied to high-precision seabed sediment classification and habitat mapping around islands and reefs.
The fusion of infrared and visible images should emphasize the salient targets of the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Since this extraction may miss some information, a compensation encoder is proposed to supplement it. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
The purpose of infrared and visible image fusion is to create a single image containing the texture details and salient object information of the source images, particularly in challenging environments. However, existing image fusion algorithms are generally suited to normal scenes. In hazy scenes, much of the texture information in the visible image is hidden, so the results of existing methods are dominated by infrared information, lacking texture details and giving a poor visual effect. To address these difficulties, we propose a haze-free infrared and visible fusion method, termed HaIVFusion, which eliminates the influence of haze and recovers richer texture information in the fused image. Specifically, we first design a scene information restoration network (SIRNet) to mine the masked texture information in visible images. Then, a denoising fusion network (DFNet) is designed to integrate the features extracted from infrared and visible images and to remove the influence of residual noise as much as possible. In addition, we use a color-consistency loss to reduce the color distortion caused by haze. Furthermore, we publish a dataset of hazy scenes for infrared and visible image fusion to promote research on extreme scenes. Extensive experiments show that HaIVFusion produces fused images with richer texture details and higher contrast in hazy scenes, and achieves better quantitative results than state-of-the-art image fusion methods, even when those are combined with state-of-the-art dehazing methods.
Infrared and visible image fusion integrates feature information from two different modalities into a single fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible images makes it difficult for existing fusion methods to extract texture details from the scene, and relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The resulting intermediate features at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used to enhance both the infrared and visible images, guiding the network to learn low-light enhancement capabilities through an enhancement loss. After training, the Edge-MobileOne Block is converted by structural reparameterization into a direct-connection structure similar to MobileNetV1, effectively reducing computational resource consumption. In extensive experimental comparisons, our method improved the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF) by 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% over the best results of the compared algorithms, while being only 1.5 ms/it slower than the fastest method.
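CLAHE, as used above, extends plain histogram equalization by tiling the image and clipping each tile's histogram before equalizing. The global building block it refines can be sketched as follows (a simplified sketch for uint8 images; tiling, clipping, and bilinear blending are omitted).

```python
import numpy as np

def equalize_hist(img, levels=256):
    """Global histogram equalization of a uint8 image: map intensities through
    the normalized cumulative histogram so they spread over the full range."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                      # first occupied intensity level
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(np.uint8)
    return lut[img]
```

CLAHE applies this per tile with a clip limit on the histogram, which boosts local contrast in dark regions without over-amplifying noise, exactly the property that makes it useful as a low-light enhancement target.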
Multi-source data fusion provides the high-precision spatial situational awareness essential for analyzing granular urban social activities. This study used Shanghai's catering industry as a case, leveraging electronic reviews and consumer data sourced from third-party restaurant platforms in 2021. By applying weighted processing to two-dimensional point-of-interest (POI) data, clustering hotspots of high-dimensional restaurant data were identified. A hierarchical network of restaurant hotspots was constructed following the Central Place Theory (CPT) framework, and the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation across scales. The findings suggest the need to enhance the spatial balance of Shanghai's urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery; such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as restaurants. At a finer spatial scale, the distribution of restaurant hotspots shows a polycentric, symmetric pattern, with a developmental trend radiating outward along the city's ring roads. This trend can be attributed to restaurants establishing connections with other urban functional spaces, leading to the reconfiguration of urban space, the expansion of restaurant-dedicated land use, and the reorganization of associated commercial activities. The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.
Visible-infrared object detection leverages the day-night stable perception capability of infrared imagery to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences between the imaging mechanisms of the two modalities make effective cross-modal fusion challenging. Furthermore, constrained by sensor physics and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in both daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
There is still a dearth of systematic study of image stitching techniques for the natural tubular structure of the intestine, and traditional stitching techniques transfer poorly to endoscopic images with deep scenes. To reconstruct the intestinal wall in two dimensions, a method is developed. Because intestinal features are subtle and usually arranged in a ring, the normalized Laplacian algorithm is used to enhance each image, which is then transformed into polar coordinates so that the new segment of the current image relative to the previous one can be extracted. An improved weighted fusion algorithm then splices the segment images in sequence. The experimental results demonstrate that the proposed approach improves image clarity and reduces noise while preserving the information content of the intestinal images. In addition, the seamless transitions between the final portions of the panoramic image demonstrate that stitching traces have been removed.
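The polar-coordinate transform that unwraps the ring-shaped intestinal wall into rows can be sketched with nearest-neighbor resampling. Interpolation and the center-estimation step are simplified here; the function name and parameters are assumptions.

```python
import numpy as np

def unwrap_polar(img, n_theta=360, n_r=None):
    """Resample a square image around its center onto a (radius, angle) grid,
    flattening ring-like structure into rows (nearest-neighbor sampling)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_r = n_r or int(min(cy, cx))
    r = np.arange(n_r).reshape(-1, 1)              # radii down the rows
    t = np.linspace(0.0, 2.0 * np.pi, n_theta,
                    endpoint=False).reshape(1, -1)  # angles across the columns
    ys = np.clip(np.round(cy + r * np.sin(t)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + r * np.cos(t)).astype(int), 0, w - 1)
    return img[ys, xs]
```

After unwrapping, the "new segment" of each frame relative to the previous one becomes a band of rows, so the splicing step reduces to stacking bands with weighted blending at their seams.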
The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to handle modal disparities effectively, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce Prompt Fusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. First, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of each modality, improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging vision-language models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization, which simplifies the intricate non-singleton, non-convex bi-level problem into a series of convergent and differentiable single-level optimization problems that can be solved effectively by gradient descent. Our approach advances the state of the art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.
Funding: Supported by the National Natural Science Foundation of China (61472324, 61671383) and the Shaanxi Key Industry Innovation Chain Project (2018ZDCXL-G-12-2, 2019ZDLGY14-02-02).
Abstract: In recent years, guided image fusion algorithms have become increasingly popular. However, current algorithms cannot eliminate halo artifacts. We propose an image fusion algorithm based on a fast weighted guided filter. First, the source images are separated into a series of high- and low-frequency components. Second, three visual features of the source image are extracted to construct a decision map model. Third, a fast weighted guided filter is proposed to optimize the result of the previous step and to reduce time complexity by exploiting the correlation among neighboring pixels. Finally, the image obtained in the previous step is combined with the weight map to realize the fusion. The proposed algorithm is applied to multi-focus, visible-infrared, and multi-modal images, and the results show that it effectively removes halo artifacts from the merged images with higher efficiency, outperforming traditional methods in both subjective visual quality and objective evaluation.
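The pipeline this abstract describes (decompose, build a decision map from saliency, refine the map with a guided filter, blend) can be sketched as follows. This is an illustrative minimal version using a plain unweighted guided filter and a simple high-frequency saliency map, not the authors' fast weighted variant.

```python
import numpy as np

def box(x, r):
    """Mean filter with window radius r (edge padding)."""
    p = np.pad(x, r, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w))
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out / (2 * r + 1) ** 2

def guided_filter(I, p, r=4, eps=1e-3):
    """Edge-preserving smoothing of p guided by image I (He et al.'s guided filter)."""
    mI, mp = box(I, r), box(p, r)
    cov = box(I * p, r) - mI * mp          # covariance of guide and input
    var = box(I * I, r) - mI * mI          # variance of guide
    a = cov / (var + eps)
    b = mp - a * mI
    return box(a, r) * I + box(b, r)       # locally linear output

def fuse(A, B, r=4):
    """Fuse two grayscale images via a guided-filter-refined decision map."""
    sA = np.abs(A - box(A, 1))             # high-frequency saliency of A
    sB = np.abs(B - box(B, 1))
    w = (sA >= sB).astype(float)           # binary decision map
    w = np.clip(guided_filter(A, w, r), 0.0, 1.0)  # align weights with edges
    return w * A + (1.0 - w) * B
```

Refining the binary map with the guided filter is what suppresses halos: the weights follow image edges instead of cutting across them.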
Funding: National Natural Science Foundation of China (No. 60375008), Shanghai EXPO Special Project (No. 2004BA908B07), and Shanghai NRC International Cooperation Project (No. 05SN07118).
Abstract: The study of evaluation systems for multi-source image fusion is an important and necessary part of image fusion research. Both qualitative and quantitative evaluation indexes were studied. A series of new concepts, such as the independent single evaluation index, the union single evaluation index, and the synthetic evaluation index, were proposed. Based on these concepts, a synthetic evaluation system for digital image fusion was formed. Experiments with a wavelet fusion method, applied to fuse multi-spectral and panchromatic remote sensing images, IR and visible images, CT and MRI images, and multi-focus images, show that the system is an objective, uniform, and effective quantitative method for image fusion evaluation.
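As a concrete example of a single evaluation index of the kind discussed above, the Shannon entropy of a fused image is commonly used to quantify its information content. A minimal sketch (not the paper's full synthetic index system):

```python
import numpy as np

def image_entropy(img):
    """Shannon entropy (bits/pixel) of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins: 0*log(0) := 0
    return float(-(p * np.log2(p)).sum())
```

A constant image scores 0 bits; an image split evenly between two gray levels scores exactly 1 bit.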
Funding: The National Natural Science Foundation of China (61671383) and the Shaanxi Key Industry Innovation Chain Project (2018ZDCXL-G-12-2, 2019ZDLGY14-02-02, 2019ZDLGY14-02-03).
Abstract: Image fusion based on sparse representation (SR) has become a primary research direction among transform-domain methods. However, SR-based image fusion algorithms have high computational complexity and neglect the local features of an image, resulting in limited detail retention and high sensitivity to registration misalignment. To overcome these shortcomings, as well as the noise introduced during fusion, this paper proposes a new signal decomposition model: a multi-source image fusion algorithm based on gradient-regularized convolutional sparse representation (CSR). The main innovation of this work is using a sparse optimization function to perform a two-scale decomposition of the source images into high-frequency and low-frequency components. The sparse coefficients are obtained by the gradient-regularized CSR model, and the maximum sparse coefficient is selected to obtain the optimal high-frequency component of the fused image. The best low-frequency component is obtained with an extreme-value or averaging fusion strategy. The final fused image is obtained by adding the two optimal components. Experimental results demonstrate that this method greatly improves detail preservation and reduces sensitivity to registration misalignment.
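The two-scale split and the max/average fusion rules described above can be sketched as follows. This toy version uses a plain mean filter for the decomposition rather than the paper's sparse optimization function, so it only illustrates the fusion rules, not the CSR model itself.

```python
import numpy as np

def mean_filter(x, r=3):
    """Crude low-pass: sliding mean with edge padding."""
    p = np.pad(x, r, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w))
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out / (2 * r + 1) ** 2

def two_scale_fuse(A, B):
    """Two-scale fusion: max-abs rule for detail, averaging rule for base."""
    lowA, lowB = mean_filter(A), mean_filter(B)
    highA, highB = A - lowA, B - lowB
    high = np.where(np.abs(highA) >= np.abs(highB), highA, highB)  # max-abs rule
    low = 0.5 * (lowA + lowB)                                      # averaging rule
    return low + high
```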
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 52178100).
Abstract: The spatial offset of bridges has a significant impact on the safety, comfort, and durability of high-speed railway (HSR) operations, so it is crucial to detect the spatial offset of operational HSR bridges rapidly and effectively. Drive-by monitoring of bridge uneven settlement shows significant potential owing to its practicality, cost-effectiveness, and efficiency. However, existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source, low detection accuracy, and an inability to identify lateral deformations. This paper proposes a novel drive-by inspection method for the spatial offset of HSR bridges based on multi-source data fusion from a comprehensive inspection train. First, dung beetle optimizer-variational mode decomposition was employed to adaptively decompose non-stationary dynamic signals and expose the hidden temporal relationships in the data. Subsequently, a long short-term memory neural network was developed to fuse features from multi-source signals and accurately predict the spatial settlement of HSR bridges. A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model, and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated. Finally, the reliability of the proposed method was further validated by analyzing measurement data obtained from a comprehensive inspection train. The findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridges, supporting their long-term operational safety.
Abstract: Background: Medical imaging advancements are constrained by fundamental trade-offs between acquisition speed, radiation dose, and image quality, forcing clinicians to work with noisy, incomplete data. Existing reconstruction methods either compromise on accuracy with iterative algorithms or suffer from limited generalizability with task-specific deep learning approaches. Methods: We present LDM-PIR, a lightweight physics-conditioned diffusion multi-model for medical image reconstruction that addresses key challenges in magnetic resonance imaging (MRI), CT, and low-photon imaging. Unlike traditional iterative methods, which are computationally expensive, or task-specific deep learning approaches, which lack generalizability, LDM-PIR integrates three innovations: a physics-conditioned diffusion framework that embeds acquisition operators (Fourier/Radon transforms) and noise models directly into the reconstruction process; a multi-model architecture that unifies denoising, inpainting, and super-resolution via shared weight conditioning; and a lightweight design (2.1M parameters) enabling rapid inference (0.8 s/image on GPU). Through self-supervised fine-tuning with measurement consistency losses, the model adapts to new imaging modalities using fewer annotated samples. Results: LDM-PIR achieves state-of-the-art performance on fastMRI (peak signal-to-noise ratio (PSNR): 34.04 for single-coil, 31.50 for multi-coil) and on the Lung Image Database Consortium and Image Database Resource Initiative dataset (28.83 PSNR under Poisson noise). Clinical evaluations demonstrate superior preservation of anatomical structures, with SSIM improvements of 8.8% for single-coil and 4.36% for multi-coil MRI over uDPIR. Conclusion: LDM-PIR offers a flexible, efficient, and scalable solution for medical image reconstruction, addressing the challenges of noise, undersampling, and modality generalization. Its lightweight design allows rapid inference, while its self-supervised fine-tuning capability minimizes reliance on large annotated datasets, making it suitable for real-world clinical applications.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61462046 and 61762052), the Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172), the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015), the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633), and the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381).
Abstract: Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that combines an improved multi-scale, multi-direction Harris algorithm with a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray-level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experiments show that our method yields stable, evenly distributed key points as well as robust and accurate correspondence matches, making it a promising scheme for multi-source remote sensing image registration.
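The key-point detector in this abstract builds on the Harris corner response. The core single-scale response (without the paper's multi-scale, multi-direction extensions) can be sketched as:

```python
import numpy as np

def harris_response(img, r=2, k=0.05):
    """Single-scale Harris corner response R = det(M) - k*trace(M)^2,
    where M is the structure tensor summed over a (2r+1)^2 window."""
    Iy, Ix = np.gradient(img.astype(float))

    def win_sum(x):  # sum over the local window, edge padding
        p = np.pad(x, r, mode="edge")
        h, w = x.shape
        s = np.zeros((h, w))
        for dy in range(2 * r + 1):
            for dx in range(2 * r + 1):
                s += p[dy:dy + h, dx:dx + w]
        return s

    Sxx, Syy, Sxy = win_sum(Ix * Ix), win_sum(Iy * Iy), win_sum(Ix * Iy)
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2
```

Flat regions score near zero, edges score negative, and corners score positive, which is what makes thresholding the response a corner detector.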
Abstract: Remote sensing image fusion is an effective way to use the large volume of data from multi-source images. This paper introduces a new method of remote sensing image fusion based on the support vector machine (SVM), using high-spatial-resolution SPIN-2 data and multi-spectral SPOT-4 remote sensing data. First, the method is established by building a model of remote sensing image fusion based on the SVM. Classification fusion is then tested on the SPIN-2 and SPOT-4 data. Finally, the fusion result is evaluated in two ways: 1) by subjective assessment, the spatial resolution of the fused image is improved compared to SPOT-4, and the texture of the fused image is clearly distinctive; 2) by quantitative analysis, the classification fusion performs better. As a whole, the results show that the accuracy of image fusion based on the SVM is high, and the SVM algorithm can be recommended for application in remote sensing image fusion processes.
Funding: The National Natural Science Foundation of China under contract No. 61671481, the Qingdao Applied Fundamental Research under contract No. 16-5-1-11-jch, and the Fundamental Research Funds for Central Universities under contract No. 18CX05014A.
Abstract: We present a novel sea-ice classification framework based on locality-preserving fusion of multi-source image information. The locality preservation is two-fold, i.e., local characterization in both the spatial and feature domains. We commence by simultaneously learning a projection matrix, which preserves spatial localities, and a similarity matrix, which encodes feature similarities. We map the pixels of the multi-source images by the projection matrix to a set of fusion vectors that preserve the spatial localities of the image. On the other hand, by applying the Laplacian eigen-decomposition to the similarity matrix, we obtain another set of fusion vectors that preserve the local feature similarities. We concatenate the fusion vectors for both spatial and feature locality preservation to obtain the fusion image. Finally, we classify the fusion image pixels with a novel sliding ensemble strategy, which enhances locality preservation in classification. Our framework is effective in classifying multi-source sea-ice images (e.g., multi-spectral and synthetic aperture radar (SAR) images) because it not only comprehensively captures spatial neighboring relationships but also intrinsically characterizes the feature associations between different types of sea ice. Experimental evaluations validate the effectiveness of our framework.
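The Laplacian eigen-decomposition step above is the standard spectral-embedding construction: eigenvectors of the graph Laplacian of a similarity matrix give coordinates that preserve local feature similarities. A generic sketch (the paper's learned similarity matrix is replaced here by a simple Gaussian kernel):

```python
import numpy as np

def spectral_embedding(X, k=2, sigma=1.0):
    """Embed the rows of X via the k smallest non-trivial eigenvectors
    of the unnormalized graph Laplacian of a Gaussian similarity matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.exp(-d2 / (2 * sigma ** 2))                   # similarity matrix
    L = np.diag(W.sum(1)) - W                            # Laplacian L = D - W
    vals, vecs = np.linalg.eigh(L)                       # eigenvalues ascending
    return vals, vecs[:, 1:k + 1]                        # skip the constant eigenvector
```

The smallest eigenvalue of a graph Laplacian is always 0 (with the constant eigenvector), so the embedding starts from the second eigenvector.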
Abstract: Geological data are constructed in vector format in a geographical information system (GIS), while other data such as remote sensing images, geographical data, and geochemical data are stored in raster format. This paper converts the vector data into 8-bit images according to the importance of each layer to mineralization by programming, so that the geological meaning can be communicated through the raster images. The paper also fuses geographical and geochemical data with the programmed strata data. The result shows that image fusion can express different intensities effectively and visualize structural characters in two dimensions. Furthermore, it can produce optimized information from multi-source data and express it more directly.
Funding: The Key R&D Project of Hainan Province under contract No. ZDYF2023SHFZ097 and the National Natural Science Foundation of China under contract No. 42376180.
Abstract: Mangroves are indispensable to coastlines, maintaining biodiversity, and mitigating climate change, so improving the accuracy of mangrove identification is crucial for their ecological protection. To address the limited morphological information of synthetic aperture radar (SAR) images, which suffer heavy noise interference, and the susceptibility of optical images to weather and lighting conditions, this paper proposes a pixel-level weighted fusion method for SAR and optical images. Image fusion enhances the target features and makes mangrove monitoring more comprehensive and accurate. To address the high similarity between mangrove forests and other forests, this paper builds on the U-Net convolutional neural network and adds an attention mechanism in the feature extraction stage so that the model pays more attention to mangrove vegetation areas in the image. To accelerate convergence and normalize the input, a batch normalization (BN) layer and a Dropout layer are added after each convolutional layer. Since mangroves are a minority class in the image, an improved cross-entropy loss function is introduced to improve the model's ability to recognize them. The AttU-Net model for mangrove recognition in high-similarity environments is thus constructed on the fused images. Comparison experiments show that the overall accuracy of the improved U-Net model trained on the fused images is significantly higher. Based on the fused images, the recognition results of the proposed AttU-Net model are compared with its benchmark model, U-Net, and with the Dense-Net, Res-Net, and Seg-Net methods. The AttU-Net model captures mangroves' complex structures and textural features more effectively. The average OA, F1-score, and Kappa coefficient in the four tested regions were 94.406%, 90.006%, and 84.045%, significantly higher than those of the other methods. This method can provide technical support for the monitoring and protection of mangrove ecosystems.
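The class-weighting idea behind handling mangroves as a minority class can be illustrated with a standard weighted cross-entropy (the paper's exact improved loss is not specified here; this is the common baseline form):

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean cross-entropy with per-class weights.
    probs: (N, C) predicted class probabilities; labels: (N,) integer class ids.
    Up-weighting a minority class makes its misclassifications cost more."""
    n = len(labels)
    p_true = probs[np.arange(n), labels]          # probability of the true class
    w = np.asarray(class_weights, dtype=float)[labels]
    return float(-(w * np.log(p_true + 1e-12)).mean())
```

With all weights equal to 1 this reduces to ordinary cross-entropy; raising the minority-class weight increases the loss whenever that class is predicted poorly.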
Funding: Supported by the Second Batch of Key Textbook Construction Projects of the "14th Five-Year Plan" of Zhejiang Vocational Colleges (SZDJC-2412).
Abstract: Roadbed disease detection is essential for maintaining road functionality. Ground-penetrating radar (GPR) enables non-destructive detection without drilling. However, current identification often relies on manual inspection, which requires extensive experience, suffers from low efficiency, and is highly subjective. As the results are presented as radar images, image processing methods can be applied for fast and objective identification, and deep learning-based approaches now offer a robust solution for automated roadbed disease detection. This study proposes an enhanced Faster Region-based Convolutional Neural Network (Faster R-CNN) framework integrating ResNet-50 as the backbone and two-dimensional discrete Fourier spectrum transformation (2D-DFT) for frequency-domain feature fusion. A dedicated GPR image dataset comprising 1650 annotated images was constructed and augmented to 6600 images via median filtering, histogram equalization, and binarization. The proposed model segments defect regions, applies binary masking, and fuses frequency-domain features to improve small-target detection against noisy backgrounds. Experimental results show that the improved Faster R-CNN achieves a mean Average Precision (mAP) of 0.92, a 0.22 increase over the baseline. Precision improved by 26% while recall remained stable at 87%. The model was further validated on real urban road data, demonstrating robust detection even under interference. These findings highlight the potential of combining GPR with deep learning for efficient, non-destructive roadbed health monitoring.
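The 2D-DFT feature map used for frequency-domain fusion is, in its basic form, the centered log-magnitude spectrum of the image. A minimal sketch:

```python
import numpy as np

def dft_log_magnitude(img):
    """Centered log-magnitude 2D-DFT spectrum of a grayscale image,
    the kind of frequency-domain feature map fused with spatial features."""
    F = np.fft.fftshift(np.fft.fft2(img.astype(float)))  # DC moved to the center
    return np.log1p(np.abs(F))                           # compress dynamic range
```

After `fftshift`, the center bin holds the DC component, whose magnitude equals the sum of all pixel values.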
Abstract: The noise present in depth images obtained with RGB-D sensors is a combination of hardware limitations and environmental factors; the limited capabilities of the sensors also degrade computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image detail along with the noise. The framework presented in this paper is a novel denoising model that uses Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, maximizing spatial structural integrity while reducing redundancy. An LSTMAE then processes these selected features, modeling depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input data into a latent space, which the decoder then expands to retrieve the clean image. Experiments on a benchmark dataset show that the proposed technique attains a PSNR of 45 dB and an SSIM of 0.90, 10 dB higher than conventional convolutional autoencoders and 15 times higher than wavelet-based models. Moreover, the feature selection step decreases input dimensionality by 40%, yielding a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers an efficient and scalable system for depth image denoising, with high potential for close-range 3D applications such as robotic manipulation and gesture-based interfaces.
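The PSNR figures quoted in this and several other abstracts follow the standard definition, which is short enough to state exactly:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape.
    Returns +inf for identical images (zero mean squared error)."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

Each 10 dB corresponds to a tenfold reduction in mean squared error, which is why a 10 dB gap between methods is a large difference.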
Funding: Supported by the National Natural Science Foundation of China (Nos. 42376185, 41876111) and the Shandong Provincial Natural Science Foundation (No. ZR2023MD073).
Abstract: Benthic habitat mapping is an emerging discipline in the international marine field, providing an effective tool for marine spatial planning, marine ecological management, and decision-making. Seabed sediment classification is one of its main components. In response to the impact of remote sensing imaging quality and the limitations of acoustic measurement range, where a single data source does not fully reflect the substrate type, we propose a high-precision seabed sediment classification method that integrates data from multiple sources. Based on WorldView-2 multi-spectral remote sensing imagery and multibeam bathymetry data, we constructed a random forest (RF) classifier with optimal feature selection. A seabed sediment classification experiment integrating optical and acoustic remote sensing data was carried out in the shallow water around Wuzhizhou Island, Hainan, South China. Different sediment types, such as sand, seagrass, and coral reefs, were effectively identified, with an overall classification accuracy of 92%. The experimental results show that the RF classifier with feature selection over fused multi-source remote sensing data outperformed simple combinations of data sources, improving the accuracy of seabed sediment classification. The proposed method can therefore be effectively applied to high-precision seabed sediment classification and habitat mapping around islands and reefs.
Funding: Supported by the Henan Province Key Research and Development Projects (231111211300, 231111110100), the Central Government of Henan Province Funds Guiding Local Science and Technology Development (Z20231811005), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may miss some information, a compensation encoder is proposed to supplement it. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the Natural Science Foundation of Shandong Province, China (ZR2022MF237), the National Natural Science Foundation of China Youth Fund (62406155), and the Major Innovation Project (2023JBZ02) of Qilu University of Technology (Shandong Academy of Sciences).
Abstract: The purpose of infrared and visible image fusion is to create a single image containing the texture details and salient object information of the source images, particularly in challenging environments. However, existing image fusion algorithms are generally suited to normal scenes. In hazy scenes, much of the texture information in the visible image is hidden, and the results of existing methods are dominated by infrared information, leading to a lack of texture detail and poor visual quality. To address these difficulties, we propose a haze-free infrared and visible fusion method, termed HaIVFusion, which can eliminate the influence of haze and recover richer texture information in the fused image. Specifically, we first design a scene information restoration network (SIRNet) to mine the masked texture information in visible images. Then, a denoising fusion network (DFNet) is designed to integrate the features extracted from infrared and visible images and remove residual noise as much as possible. In addition, we use a color consistency loss to reduce the color distortion caused by haze. Furthermore, we publish a dataset of hazy scenes for infrared and visible image fusion to promote research on extreme scenes. Extensive experiments show that HaIVFusion produces fused images with richer texture details and higher contrast in hazy scenes, and achieves better quantitative results than state-of-the-art image fusion methods, even when those are combined with state-of-the-art dehazing methods.
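One common way to formulate a color-consistency term is to penalize the angular deviation between per-pixel RGB vectors, which is insensitive to brightness changes but sensitive to hue shifts. This is a hedged sketch of that general idea, not necessarily HaIVFusion's exact loss:

```python
import numpy as np

def color_angle_loss(pred, ref, eps=1e-8):
    """Mean (1 - cosine similarity) between per-pixel RGB vectors
    of two HxWx3 images: 0 for identical hues, up to 2 for opposite ones."""
    p = pred.reshape(-1, 3).astype(float)
    r = ref.reshape(-1, 3).astype(float)
    cos = (p * r).sum(1) / (np.linalg.norm(p, axis=1) * np.linalg.norm(r, axis=1) + eps)
    return float((1.0 - cos).mean())
```

Because only the direction of each RGB vector matters, uniformly brightening an image leaves the loss near zero, while a hue shift (e.g. the color cast haze introduces) raises it.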
Funding: Sponsored by the Xinjiang Uygur Autonomous Region Tianshan Talent Programme Project (2023TCLJ02) and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C349).
Abstract: Infrared and visible image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible images makes it difficult for existing fusion methods to extract texture details from the scene, and relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The resulting intermediate features at different scales are fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used to enhance both infrared and visible images, guiding the network to learn low-light enhancement through an enhancement loss. Upon completion of training, the Edge-MobileOne Block is optimized into a direct-connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. In extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% over the best results of the compared algorithms on the evaluation metrics, including Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), while being only 1.5 ms/it slower than the fastest method.
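CLAHE builds on global histogram equalization, adding histogram clipping and tile-wise processing. The underlying global equalization step is small enough to sketch (this is the plain global form, not CLAHE itself):

```python
import numpy as np

def hist_equalize(img):
    """Global histogram equalization of a non-constant 8-bit grayscale image:
    map each gray level through the normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf_min = cdf[cdf > 0][0]                       # first occupied bin
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]                                  # apply lookup table
```

CLAHE differs by computing such lookup tables per image tile, clipping each tile's histogram to limit contrast amplification, and bilinearly interpolating between tiles.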
Funding: Under the auspices of the Key Program of the National Natural Science Foundation of China (No. 42030409).
Abstract: Multi-source data fusion provides the high-precision spatial situational awareness essential for analyzing granular urban social activities. This study used Shanghai's catering industry as a case study, leveraging electronic reviews and consumer data collected in 2021 from third-party restaurant platforms. By weighting two-dimensional point-of-interest (POI) data, clustering hotspots of high-dimensional restaurant data were identified. A hierarchical network of restaurant hotspots was constructed following the Central Place Theory (CPT) framework, while the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation in multi-scale processes. The findings suggest the need to enhance the spatial balance of Shanghai's urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery. Such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as the restaurant industry. At a finer spatial scale, the distribution of restaurant hotspots demonstrates a polycentric, symmetric spatial pattern, with a developmental trend radiating outward along the city's ring roads. This trend can be attributed to restaurants establishing connections with other urban functional spaces, leading to the reconfiguration of urban space, the expansion of restaurant-dedicated land use, and the reorganization of associated commercial activities. The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62302086), the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070), and the Fundamental Research Funds for the Central Universities (Grant No. N2317005).
Abstract: Visible-infrared object detection leverages the day-night stable object perception of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of the two modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion through a carefully designed multi-scale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation of infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships among cross-modal features at a global scale, enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in both daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
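The spatial-attention half of a CSFI-style module can be illustrated as a per-pixel softmax over each modality's activity map, which decides at every location how much to trust the RGB versus the IR features. This is an illustrative sketch, not the paper's actual module:

```python
import numpy as np

def spatial_attention_fuse(f_rgb, f_ir):
    """Fuse two (C, H, W) feature maps with a per-pixel attention weight
    derived from each modality's mean absolute activation."""
    s_rgb = np.abs(f_rgb).mean(0)                  # (H, W) activity map, RGB branch
    s_ir = np.abs(f_ir).mean(0)                    # (H, W) activity map, IR branch
    m = np.maximum(s_rgb, s_ir)                    # subtract max to stabilize exp
    e_rgb, e_ir = np.exp(s_rgb - m), np.exp(s_ir - m)
    w = e_rgb / (e_rgb + e_ir)                     # RGB attention weight in (0, 1)
    return w[None] * f_rgb + (1.0 - w)[None] * f_ir
```

Locations where the IR branch responds more strongly (e.g. warm targets at night) receive a lower RGB weight, so the fused feature leans on the modality that is more informative there.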
Funding: The Special Research Fund of the Natural Science Foundation of Chongqing (No. cstc2019jcyjmsxm1351) and the Science and Technology Research Project of Chongqing Education Commission (No. KJQN2020006300).
Abstract: There is still a dearth of systematic study on image stitching techniques for the natural tubular structure of the intestine, and traditional stitching techniques apply poorly to endoscopic images with deep scenes. A method is therefore developed to reconstruct the intestinal wall in two dimensions. The normalized Laplacian algorithm is used to enhance each image, which is then transformed into polar coordinates, exploiting the fact that intestinal features are subtle and usually arranged in a circle, in order to extract the new segments of the current image relative to the previous one. An improved weighted fusion algorithm is then used to splice the segment images sequentially. Experimental results demonstrate that the suggested approach improves image clarity and minimizes noise while maintaining the information content of intestinal images. In addition, the seamless transition between the final portions of the panoramic image demonstrates that stitching traces have been removed.
Funding: Partially supported by the China Postdoctoral Science Foundation (2023M730741) and the National Natural Science Foundation of China (U22B2052, 52102432, 52202452, 62372080, 62302078).
Abstract: The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to handle modal disparities effectively, resulting in visual degradation of the details and prominent targets of the fused images. To address these challenges, we introduce PromptFusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. First, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high-/low-frequency components of each modality, improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging vision-language models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization, which simplifies the intricate non-singleton, non-convex bi-level problem into a series of convergent and differentiable single-level optimization problems that can be solved effectively by gradient descent. Our approach advances the state of the art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://github.com/hey-it-s-me/PromptFusion.