Images taken in dim environments frequently exhibit issues like insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce M2ATNet, a single-stage framework built on a multi-scale multi-attention and Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), applied separately to the luminance and color channels to strengthen structural and texture modeling. Second, to resolve the misalignment of the luminance and color channels, we introduce a multi-channel feature fusion Transformer (CFFT) module, which recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model toward more stable and efficient learning, we combine multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles of the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
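The abstract above leaves the exact mix of losses unspecified. As an illustration only, a hybrid loss for enhancement tasks often weights a pixel-fidelity term, a gradient-consistency term, and a total-variation smoothness prior; the terms and weights below are assumptions for this sketch, not M2ATNet's actual loss.

```python
# Hedged sketch of a hybrid loss (assumed terms and weights, not the paper's).
import numpy as np

def hybrid_loss(pred, target, w_l1=1.0, w_grad=0.5, w_tv=0.01):
    """Weighted sum of L1 fidelity, gradient consistency, and TV smoothness."""
    def gx(z):
        return np.diff(z, axis=1)
    def gy(z):
        return np.diff(z, axis=0)
    l1 = np.abs(pred - target).mean()                       # pixel fidelity
    grad = (np.abs(gx(pred) - gx(target)).mean()
            + np.abs(gy(pred) - gy(target)).mean())         # edge consistency
    tv = np.abs(gx(pred)).mean() + np.abs(gy(pred)).mean()  # smoothness prior
    return w_l1 * l1 + w_grad * grad + w_tv * tv

rng = np.random.default_rng(9)
target = rng.random((16, 16))
loss_same = hybrid_loss(target, target)               # only the TV prior remains
loss_off = hybrid_loss(np.clip(target + 0.1, 0, 1), target)
```

A brighter-shifted prediction costs more than a perfect one, as expected of a fidelity-dominated mix.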
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they easily lose contours and textures in segmentation results. The transformer model, in contrast, can effectively capture long-range dependencies in the image, and combining the CNN and the transformer extracts both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features, reducing the information loss caused by downsampling. In the second component, we add residual blocks to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and to alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module that adopts adaptive average pooling and position encoding to enhance contextual features, after which multi-head attention further enriches the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with respect to preserving contours and textures.
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may miss some information, a compensation encoder is proposed to supplement it. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while agreeing more closely with human visual perception.
This paper aims to develop a nonrigid registration method for preoperative and intraoperative thoracoabdominal CT images in computer-assisted interventional surgeries, for accurate tumor localization and enhanced tissue visualization. However, fine-structure registration of complex thoracoabdominal organs and large-deformation registration caused by respiratory motion are challenging. To deal with this problem, we propose a 3D multi-scale attention VoxelMorph (MA-VoxelMorph) registration network. To alleviate the large-deformation problem, a multi-scale axial attention mechanism uses residual dilated pyramid pooling for multi-scale feature extraction and position-aware axial attention to capture long-distance dependencies between pixels. To further improve large-deformation and fine-structure registration, a multi-scale context channel attention mechanism exploits content information from adjacent encoding layers. Our method was evaluated on four public lung datasets (DIR-Lab, Creatis, Learn2Reg, OASIS) and a local dataset. Results show that the proposed method achieved better registration performance than current state-of-the-art methods, especially in handling large deformations and fine structures. It is also fast for 3D image registration, taking about 1.5 s, quicker than most methods. Qualitative and quantitative assessments indicate that the proposed MA-VoxelMorph has the potential to realize precise and fast tumor localization in clinical interventional surgeries.
This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-score improvement of +4.84% over the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, it delivers an F1-score improvement of +5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
The application of image super-resolution (SR) has brought significant assistance to the medical field, helping doctors make more precise diagnoses. However, relying solely on a convolutional neural network (CNN) for image SR can lead to issues such as blurred details and excessive smoothness. To address these limitations, we propose an algorithm based on the generative adversarial network (GAN) framework. In the generator network, three differently sized convolutions connected by a residual dense structure extract detailed features, and an attention mechanism combining channel and spatial information concentrates computation on crucial areas. In the discriminator network, InstanceNorm normalizes tensors, which speeds up training while retaining feature information. The experimental results demonstrate that our algorithm achieves a higher peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) than other methods, resulting in improved visual quality.
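PSNR, one of the two metrics this abstract reports, has a simple closed form. A minimal numpy version is shown below (SSIM requires windowed local statistics and is omitted):

```python
# Minimal PSNR for images in [0, peak]; higher is better.
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 0.1          # uniform error of 0.1 -> MSE = 0.01
value = psnr(ref, noisy)   # 10 * log10(1 / 0.01) = 20 dB
```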
Underwater imaging poses a challenge due to degradation by absorption and scattering during light propagation, as well as poor lighting conditions in the water medium. Although image filtering techniques are used to improve image quality effectively, distortion of image details and bias in color correction still appear in output images because of the complexity of image texture distribution. This paper proposes a new underwater image enhancement method based on image structural decomposition. By introducing a curvature factor into the Mumford-Shah-G decomposition algorithm, image details and structure components are better preserved without the gradient effect. Histogram equalization and Retinex algorithms are then applied to the decomposed structure component for global image enhancement and non-uniform brightness correction of gray-level and color images, and the optical absorption spectrum in the water medium is incorporated to improve color correction. Finally, the enhanced structure and the preserved detail component are recomposed to generate the output. Experiments with real underwater images verify the improvement achieved by the proposed method in image contrast, brightness, and color fidelity.
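As a rough illustration of the pipeline's shape only (not the paper's curvature-modified Mumford-Shah-G decomposition), a Gaussian low-pass split can stand in for the structure/detail separation, with histogram equalization applied to the structure layer alone so fine detail passes through untouched:

```python
# Hedged sketch: Gaussian split as a stand-in for structural decomposition;
# only the structure layer is equalized, the detail layer is preserved.
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance_structure(img, sigma=3.0):
    """img: 2-D float array in [0, 1]; returns enhanced image in [0, 1]."""
    structure = gaussian_filter(img, sigma)        # low-frequency layer
    detail = img - structure                       # preserved detail layer
    # Histogram-equalize the structure layer via a 256-bin CDF mapping.
    hist, bins = np.histogram(structure, bins=256, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1)
    eq = np.interp(structure, bins[:-1], cdf)
    return np.clip(eq + detail, 0.0, 1.0)

rng = np.random.default_rng(0)
dark = rng.uniform(0.0, 0.3, size=(64, 64))        # low-contrast test patch
out = enhance_structure(dark)
```

Equalizing only the smooth layer stretches global contrast without amplifying the detail/noise layer, which mirrors the motivation for decomposing before enhancing.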
When used to separate multi-component non-stationary signals, the adaptive time-varying filter (ATF) based on multi-scale chirplet sparse signal decomposition (MCSSD) generates phase shift and signal distortion. To overcome this drawback, a zero-phase filter is introduced into the ATF, and a fault diagnosis method for speed-changing gearboxes is proposed. First, the gear meshing frequency of each gearbox is estimated by chirplet path pursuit. Then, according to the estimated gear meshing frequencies, an adaptive zero-phase time-varying filter (AZPTF) is designed to filter the original signal. Finally, the basis for fault diagnosis is acquired by envelope order analysis of the filtered signal. A signal consisting of two time-varying amplitude-modulation and frequency-modulation (AM-FM) signals is analyzed by the ATF and the AZPTF, both based on MCSSD. The simulation results show that the variances between the original signals and the signals filtered by the AZPTF are 13.67 and 41.14, far less than the variances (323.45 and 482.86) obtained with the ATF. Experiments on the vibration signals of gearboxes indicate that the vibration signals of two speed-changing gearboxes installed on one foundation bed can be separated effectively by the AZPTF. Based on the demodulation information of the vibration signal of each gearbox, fault diagnosis can be implemented. Both simulation and experiment show that the proposed filter can extract a mono-component time-varying AM-FM signal from a multi-component time-varying AM-FM signal without distortion.
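The zero-phase idea the AZPTF relies on can be seen in isolation with a fixed (non-time-varying) filter: running it forward and then backward, as SciPy's filtfilt does, cancels the phase shift that one-pass filtering introduces. This toy is not the AZPTF itself, just the zero-phase principle:

```python
# Hedged sketch: forward-backward (zero-phase) vs one-pass filtering.
import numpy as np
from scipy.signal import butter, filtfilt, lfilter

fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 5 * t)                   # 5 Hz component to keep
x_noisy = x + 0.5 * np.sin(2 * np.pi * 120 * t) # 120 Hz interference

b, a = butter(4, 30 / (fs / 2))                 # 30 Hz low-pass
y_zero_phase = filtfilt(b, a, x_noisy)          # forward-backward: no lag
y_one_pass = lfilter(b, a, x_noisy)             # causal filtering: lagged

# Error against the clean 5 Hz component (interior samples only):
err_zero = np.max(np.abs(y_zero_phase[100:-100] - x[100:-100]))
err_lag = np.max(np.abs(y_one_pass[100:-100] - x[100:-100]))
```

The one-pass output is delayed by the filter's group delay, which is exactly the distortion a zero-phase design avoids.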
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) in this paper to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend MSFCN to three dimensions using three-dimensional (3D) CNN, capable of harnessing each land cover category's time-series interactions from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Range-azimuth imaging of ground targets via frequency-modulated continuous-wave (FMCW) radar is crucial for effective target detection. However, when the pitch of the moving array constructed during motion exceeds the physical array aperture, azimuth ambiguity occurs, making range-azimuth imaging on a moving platform challenging. To address this issue, we theoretically analyze azimuth ambiguity generation in sparse motion arrays and propose a dual-aperture adaptive processing (DAAP) method for suppressing azimuth ambiguity. This method combines spatial multiple-input multiple-output (MIMO) arrays with sparse motion arrays to achieve high-resolution range-azimuth imaging. In addition, an adaptive QR-decomposition denoising method for sparse array signals, based on iterative low-rank matrix approximation (LRMA) and regularized QR, is proposed to preprocess sparse motion array signals. Simulations and experiments show that, on a two-transmitter four-receiver array, the signal-to-noise ratio (SNR) of the sparse motion array signal after noise suppression via adaptive QR decomposition can exceed 0 dB, and the azimuth ambiguity signal ratio (AASR) can be reduced to below −20 dB.
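The LRMA building block can be sketched independently of the regularized-QR step: by the Eckart-Young theorem, the truncated SVD gives the best low-rank approximation in the Frobenius norm, which is what LRMA-style denoising iterates on. Below, a single truncation on a synthetic rank-2 matrix standing in for an array-signal snapshot (the rank and noise level are assumptions):

```python
# Hedged sketch: truncated-SVD low-rank approximation as the LRMA core.
import numpy as np

def low_rank_denoise(M, rank):
    """Best rank-r approximation of M in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(1)
clean = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 64))  # rank 2
noisy = clean + 0.1 * rng.standard_normal((64, 64))
denoised = low_rank_denoise(noisy, rank=2)

err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(denoised - clean)
```

Full-rank noise spreads across all singular directions, so truncating to the signal rank removes most of it while keeping the signal subspace.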
A nonlinear data analysis algorithm, empirical data decomposition (EDD), is proposed, which can perform adaptive analysis of observed data. The analysis filter, which is not a linear constant-coefficient filter, is determined automatically by the observed data and can implement multi-resolution analysis like the wavelet transform. The algorithm is suitable for analyzing non-stationary data and can effectively decorrelate observed data. Through a discussion of the applications of EDD in image compression, the paper then presents a two-dimensional data decomposition framework and modifies the contexts used by Embedded Block Coding with Optimized Truncation (EBCOT). Simulation results show that EDD is more suitable for non-stationary image data compression.
High-resolution image fusion is a significant focus in the field of image processing. A new image fusion model is presented based on the characteristic level of empirical mode decomposition (EMD). The intensity-hue-saturation (IHS) transform of the multi-spectral image first gives the intensity image. Thereafter, 2D EMD, a row-column extension of the 1D EMD model, is used to decompose the detailed-scale and coarse-scale images from the high-resolution band image and the intensity image. Finally, a fused intensity image is obtained by reconstruction from the high frequency of the high-resolution image and the low frequency of the intensity image, and the IHS inverse transform yields the fused image. After presenting the EMD principle, a multi-scale decomposition and reconstruction algorithm for 2D EMD is defined and a fusion scheme is advanced based on EMD. The panchromatic band and multi-spectral bands 3, 2, 1 of QuickBird are used to assess the quality of the fusion algorithm. After selecting the appropriate intrinsic mode functions (IMFs) for the merger, on the basis of EMD analysis of specific row (column) pixel gray-value series, the fusion scheme gives a fused image, which is compared with commonly used fusion algorithms (wavelet, IHS, Brovey). The objectives of image fusion include enhancing the visibility of the image and improving the spatial resolution and spectral information of the original images. To assess the quality of an image after fusion, information entropy and standard deviation are applied to assess the spatial details of the fused images, while the correlation coefficient, bias index, and warping degree measure the spectral distortion between the original image and the fused image. For the proposed fusion algorithm, better results are obtained when the EMD algorithm is used to perform the fusion experiments.
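Stripping out the EMD step, the IHS skeleton this fusion rests on can be sketched with a simple linear intensity model (I = (R+G+B)/3 is an assumption; the paper's exact IHS variant is not reproduced here), substituting the pan band for the intensity and keeping the chroma residual:

```python
# Hedged sketch: classic IHS component substitution, without the EMD merge.
import numpy as np

def rgb_to_ihs(rgb):
    """Simple linear intensity I = (R+G+B)/3; returns (I, chroma residual)."""
    i = rgb.mean(axis=-1, keepdims=True)
    return i, rgb - i                      # chroma residual carries hue/sat

def ihs_fuse(rgb_low, pan_high):
    """Replace the intensity of a multispectral image with the pan band."""
    _, chroma = rgb_to_ihs(rgb_low)
    fused = pan_high[..., None] + chroma   # new intensity, old chroma
    return np.clip(fused, 0.0, 1.0)

rng = np.random.default_rng(2)
ms = rng.uniform(0.2, 0.8, size=(32, 32, 3))   # upsampled multispectral patch
pan = np.clip(ms.mean(axis=-1) + 0.05 * rng.standard_normal((32, 32)), 0, 1)
out = ihs_fuse(ms, pan)
```

The fused image's per-pixel intensity tracks the pan band while the channel differences (hue/saturation) come from the multispectral input, which is the spectral-preservation argument behind IHS-family methods.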
Removing rain from a single image is a challenging task due to the absence of temporal information. Considering that a rainy image can be decomposed into low-frequency (LF) and high-frequency (HF) components, where the coarse-scale information is retained in the LF component and the rain streaks and texture correspond to the HF component, we propose a single-image rain removal algorithm using image decomposition and a dense network. We design two task-driven sub-networks to estimate the LF and non-rain HF components of a rainy image. The high-frequency estimation sub-network employs a densely connected network structure, while the low-frequency sub-network uses a simple convolutional neural network (CNN). We add total variation (TV) regularization and LF-channel fidelity terms to the loss function to optimize the two sub-networks jointly. The method then obtains the de-rained output by combining the estimated LF and non-rain HF components. Extensive experiments on synthetic and real-world rainy images demonstrate that our method removes rain streaks while preserving non-rain details, and achieves superior de-raining performance both perceptually and quantitatively.
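The LF/HF decomposition the two sub-networks target can be illustrated with a plain Gaussian low-pass split (a stand-in only; the paper learns the components rather than computing them this way). The split recomposes exactly, which is what makes combining estimated LF and non-rain HF components well defined:

```python
# Hedged sketch: Gaussian LF/HF split of a rainy image; LF + HF == image.
import numpy as np
from scipy.ndimage import gaussian_filter

def lf_hf_split(img, sigma=2.0):
    lf = gaussian_filter(img, sigma)   # coarse-scale content
    hf = img - lf                      # rain streaks and fine texture
    return lf, hf

rng = np.random.default_rng(3)
rainy = rng.uniform(size=(48, 48))
lf, hf = lf_hf_split(rainy)
recon = lf + hf                        # exact by construction
```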
Total variation (TV) is widely applied in image processing. TV assumes that an image consists of piecewise constants; however, it suffers from the so-called staircase effect. In order to reduce the staircase effect and preserve edges when image textures are extracted, a new image decomposition model is proposed in this paper. The proposed model is based on the total generalized variation method, which involves and balances higher-order structure. We also derive a numerical algorithm based on a primal-dual formulation that can be implemented effectively. Numerical experiments show that the proposed method achieves a better trade-off between noise removal and texture extraction, while efficiently avoiding the staircase effect.
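For context, the first-order TV model that TGV improves on can be minimized with plain gradient descent on a smoothed TV energy E(u) = ||u − f||² + λ·TV_ε(u). This is a pedagogical sketch of the baseline, not the paper's primal-dual TGV algorithm, and the step size and weights are assumptions:

```python
# Hedged sketch: gradient descent on a smoothed first-order TV energy.
import numpy as np

def tv_denoise(f, lam=0.2, eps=1e-3, step=0.1, n_iter=200):
    u = f.copy()
    for _ in range(n_iter):
        ux = np.diff(u, axis=1, append=u[:, -1:])    # forward differences
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + eps)           # smoothed gradient norm
        px, py = ux / mag, uy / mag                  # unit-ish gradient field
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u -= step * (2.0 * (u - f) - lam * div)      # descend E(u)
    return u

rng = np.random.default_rng(4)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0                                  # piecewise-constant edge
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
denoised = tv_denoise(noisy)

err_before = np.abs(noisy - clean).mean()
err_after = np.abs(denoised - clean).mean()
```

On a true step edge TV does well; the staircase effect the abstract targets appears when the same prior is applied to smooth ramps, which TV flattens into terraces.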
In order to avoid the staircase effect and preserve small-scale texture information lost by the classical total variation regularization, a new minimization energy functional model for image decomposition is proposed. First, an adaptive regularization based on local image features is introduced to replace total variation regularization. The oscillatory component containing texture and/or noise is modeled in the generalized function space div(BMO). The existence and uniqueness of the minimizer of the proposed model are then proved. Finally, the gradient descent flow of the Euler-Lagrange equations for the new model is implemented numerically using a finite difference method. Experiments show that the proposed model is very robust to noise, the staircase effect is avoided efficiently, and edges and textures are preserved well.
Because the data acquired by most optical Earth observation satellites, such as IKONOS, QuickBird-2, and GF-1, consist of a panchromatic image with high spatial resolution and multiple multispectral images with low spatial resolution, many image fusion techniques have been developed to produce high-resolution multispectral images. Since the panchromatic and multispectral images contain the same spatial information at different accuracies, least-squares theory can be used to estimate the optimal spatial information. Compared with previous spatial-detail injection modes, this mode is more accurate and robust. In this paper, an image fusion method using Bidimensional Empirical Mode Decomposition (BEMD) and least-squares theory is proposed to merge multispectral images with a panchromatic image. After the multispectral images are transformed from RGB space into IHS space, the I component and the panchromatic image are decomposed by BEMD; least-squares theory is then used to estimate and inject the optimal spatial information, and the fusion is completed through inverse BEMD and the inverse intensity-hue-saturation transform. Two datasets, GF-1 images and QuickBird-2 images, are used to evaluate the proposed fusion method. The fused images were evaluated visually and statistically, and the evaluation results show that the method proposed in this paper achieves the best performance compared with the conventional methods.
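The least-squares step can be sketched without BEMD: estimate the gain that best maps the pan-band details onto the intensity-band details, then inject the pan details with that gain. The 1-D detail signals and noise levels below are illustrative assumptions, not the paper's data model:

```python
# Hedged sketch: least-squares estimation of a detail-injection gain.
import numpy as np

rng = np.random.default_rng(5)
truth_detail = rng.standard_normal(1024)                        # true spatial detail
pan_detail = truth_detail + 0.05 * rng.standard_normal(1024)    # accurate, low noise
i_detail = 0.6 * truth_detail + 0.3 * rng.standard_normal(1024) # blurred and noisy

# Solve min_g || i_detail - g * pan_detail ||^2 with lstsq:
g, *_ = np.linalg.lstsq(pan_detail[:, None], i_detail, rcond=None)
injected = g[0] * pan_detail    # pan detail rescaled into the intensity band
```

Because the pan details are the cleaner measurement, injecting g·pan_detail recovers the intensity-band detail with far less noise than the intensity band's own estimate, which is the accuracy argument the abstract makes.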
Infrared-visible image fusion plays an important role in multi-source data fusion, with the advantage of integrating useful information from multi-source sensors. However, challenges remain in target enhancement and visual improvement. To deal with these problems, a sub-regional infrared-visible image fusion method (SRF) is proposed. First, morphology and threshold segmentation are applied to extract targets of interest from infrared images. Second, the infrared background is reconstructed based on the extracted targets and the visible image. Finally, the target and background regions are fused using a multi-scale transform. Experimental results obtained on public data for comparison and evaluation demonstrate that the proposed SRF has potential benefits over other methods.
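The front end of SRF (threshold segmentation plus morphology) can be sketched with SciPy's binary opening, which keeps blob-like hot targets and discards isolated speckle; the reconstruction and multi-scale fusion stages are not shown, and the threshold below is an assumption:

```python
# Hedged sketch: threshold + binary opening to extract infrared targets.
import numpy as np
from scipy.ndimage import binary_opening

rng = np.random.default_rng(7)
ir = 0.1 * rng.random((64, 64))          # cool background
ir[20:30, 20:30] = 0.9                   # hot 10x10 target blob
ir[5, 5] = 0.95                          # single-pixel speckle (false alarm)

mask = ir > 0.5                                      # intensity threshold
target_mask = binary_opening(mask, np.ones((3, 3)))  # erode then dilate
```

Opening with a 3x3 structuring element removes any component thinner than the element, so the lone hot pixel disappears while the solid blob survives intact.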
A super-resolution reconstruction approach for radar images using an adaptive-threshold singular value decomposition (SVD) technique is presented, and its performance is analyzed, compared, and assessed in detail. First, the radar imaging model and the super-resolution reconstruction mechanism are outlined. Then, the adaptive-threshold SVD super-resolution algorithm and its two key aspects, namely the method for determining the point-spread-function (PSF) matrix T and the scheme for selecting the singular-value threshold, are presented. Finally, the super-resolution algorithm is demonstrated successfully on measured synthetic-aperture radar (SAR) images, and a Monte Carlo assessment is carried out to evaluate its performance using the input/output signal-to-noise ratio (SNR). Five versions of the SVD algorithm are tested: 1) using all singular values; 2) using the top 80% of singular values; 3) using the top 50%; 4) using the top 20%; and 5) using singular values s such that s^2 ≥ s_max^2/(r_in·SNR). The experimental results indicate that when the singular-value threshold is set to s_max/(r_in·SNR)^(1/2), the super-resolution algorithm provides a good compromise between too much noise and too much bias and gives good reconstruction results.
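The thresholded-SVD core (without the PSF deconvolution) can be sketched directly: keep only singular values above an SNR-driven threshold and rebuild the image. The s_max/√SNR rule below is a simplified stand-in for the paper's s_max/(r_in·SNR)^(1/2):

```python
# Hedged sketch: adaptive-threshold SVD reconstruction of a noisy image.
import numpy as np

def svd_threshold_reconstruct(M, snr):
    """Zero out singular values below s_max/sqrt(snr), then rebuild."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    keep = s >= s.max() / np.sqrt(snr)           # adaptive threshold
    s_kept = np.where(keep, s, 0.0)
    return (U * s_kept) @ Vt, int(keep.sum())

rng = np.random.default_rng(6)
scene = np.outer(np.hanning(40), np.hanning(40))   # smooth rank-1 "target"
noisy = scene + 0.01 * rng.standard_normal((40, 40))
recon, n_kept = svd_threshold_reconstruct(noisy, snr=100.0)
```

Small singular values carry mostly noise, so discarding them trades a little bias for a large noise reduction, which is the compromise the abstract describes.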
New models for image decomposition are proposed that separate an image into a cartoon, consisting only of geometric objects, and an oscillatory component, consisting of textures or noise. The proposed models are given in a variational formulation with adaptive regularization norms for both the cartoon and texture parts. The adaptive behavior preserves key features such as object boundaries and textures while avoiding staircasing in what should be smooth regions. The decomposition is computed by minimizing a convex functional that depends on the two variables u and v, alternately in each variable. Experimental results and comparisons validating the proposed models are presented.
We propose a layered image inpainting scheme based on image decomposition. The damaged image is first decomposed into three layers: cartoon, edge, and texture. The cartoon and edge layers are repaired using an adaptive offset operator that can fill in damaged image blocks while preserving the sharpness of edges. The missing information in the texture layer is generated with a texture synthesis method. By using the discrete cosine transform (DCT) in image decomposition and trading off resolution against computational complexity in texture synthesis, the processing time is kept at a reasonable level.
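The DCT route mentioned for the decomposition can be sketched as a frequency split: retain a low-frequency DCT block as the "cartoon" layer and treat the residual as the texture layer. This is a two-layer simplification with an assumed cutoff; the scheme's actual three-layer split differs:

```python
# Hedged sketch: DCT low-pass cartoon layer + residual texture layer.
import numpy as np
from scipy.fft import dctn, idctn

def dct_layers(img, cutoff=8):
    coeffs = dctn(img, norm="ortho")
    low = np.zeros_like(coeffs)
    low[:cutoff, :cutoff] = coeffs[:cutoff, :cutoff]   # low-frequency block
    cartoon = idctn(low, norm="ortho")
    return cartoon, img - cartoon                      # texture = residual

rng = np.random.default_rng(8)
ramp = np.outer(np.linspace(0, 1, 32), np.ones(32))    # smooth cartoon content
img = ramp + 0.05 * rng.standard_normal((32, 32))      # plus fine texture/noise
cartoon, texture = dct_layers(img)
```

Since the two layers sum back to the input exactly, each can be repaired independently and recombined, which is the premise of the layered inpainting scheme.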
Funding: funded by the National Natural Science Foundation of China, grant numbers 52374156 and 62476005.
Funding: supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Funding: Supported by the Henan Province Key Research and Development Projects (231111211300 and 231111110100), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may miss some information, a compensation encoder is proposed to supplement it. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported in part by the National Natural Science Foundation of China (62301374), the Hubei Provincial Natural Science Foundation of China (2022CFB804), the Hubei Provincial Education Research Project (B2022057), the Youths Science Foundation of Wuhan Institute of Technology (K202240), and the 15th Graduate Education Innovation Fund of Wuhan Institute of Technology (CX2023295).
Abstract: This paper aims to develop a nonrigid registration method for preoperative and intraoperative thoracoabdominal CT images in computer-assisted interventional surgeries, for accurate tumor localization and enhanced tissue visualization. However, fine-structure registration of complex thoracoabdominal organs and large-deformation registration caused by respiratory motion are challenging. To deal with this problem, we propose a 3D multi-scale attention VoxelMorph (MA-VoxelMorph) registration network. To alleviate the large-deformation problem, a multi-scale axial attention mechanism is utilized, using residual dilated pyramid pooling for multi-scale feature extraction and position-aware axial attention to capture long-distance dependencies between pixels. To further improve large-deformation and fine-structure registration results, a multi-scale context channel attention mechanism is employed, utilizing content information via adjacent encoding layers. Our method was evaluated on four public lung datasets (DIR-Lab, Creatis, Learn2Reg, OASIS) and a local dataset. Results proved that the proposed method achieved better registration performance than current state-of-the-art methods, especially in handling the registration of large deformations and fine structures. It also proved fast in 3D image registration, taking about 1.5 s, faster than most methods. Qualitative and quantitative assessments showed that the proposed MA-VoxelMorph has the potential to realize precise and fast tumor localization in clinical interventional surgeries.
Funding: Funded by the Deanship of Research and Graduate Studies at King Khalid University through small group research under grant number RGP1/278/45.
Abstract: This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-score improvement of +5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
Abstract: The application of image super-resolution (SR) has brought significant assistance to the medical field, aiding doctors in making more precise diagnoses. However, relying solely on a convolutional neural network (CNN) for image SR may lead to issues such as blurry details and excessive smoothness. To address these limitations, we proposed an algorithm based on the generative adversarial network (GAN) framework. In the generator network, three different sizes of convolutions connected by a residual dense structure were used to extract detailed features, and an attention mechanism combining dual channel and spatial information was applied to concentrate the computing power on crucial areas. In the discriminator network, using InstanceNorm to normalize tensors sped up the training process while retaining feature information. The experimental results demonstrate that our algorithm achieves higher peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) than other methods, resulting in improved visual quality.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60772058 and 61271406).
Abstract: Underwater imaging poses a challenge due to the degradation caused by absorption and scattering during light propagation, as well as poor lighting conditions in the water medium. Although image filtering techniques are utilized to improve image quality effectively, problems of distorted image details and biased color correction still exist in output images due to the complexity of image texture distribution. This paper proposes a new underwater image enhancement method based on image structural decomposition. By introducing a curvature factor into the Mumford-Shah-G decomposition algorithm, image details and structure components are better preserved without the gradient effect. Histogram equalization and Retinex algorithms are then applied to the decomposed structure component for global image enhancement and non-uniform brightness correction of gray-level and color images, and the optical absorption spectrum in the water medium is incorporated to improve the color correction. Finally, the enhanced structure and preserved detail components are recomposed to generate the output. Experiments with real underwater images verify the improvement achieved by the proposed method in image contrast, brightness and color fidelity.
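Of the global-enhancement steps named above, histogram equalization is the simplest to illustrate. The sketch below is a generic 8-bit grayscale equalization, not the paper's pipeline (which applies it to the decomposed structure component and adds Retinex and spectral color correction):

```python
import numpy as np

def histogram_equalize(img):
    """Global histogram equalization of an 8-bit grayscale image:
    map each gray level through the normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                        # normalize CDF to [0, 1]
    lut = np.round(255.0 * cdf).astype(np.uint8)
    return lut[img]                       # apply lookup table
```

Because the mapping is the image's own CDF, frequent gray levels are spread apart, which is what raises the global contrast of the structure component.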
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71271078), the National Hi-tech Research and Development Program of China (863 Program, Grant No. 2009AA04Z414), and the Integration of Industry, Education and Research of Guangdong Province and Ministry of Education of China (Grant No. 2009B090300312).
Abstract: When used for separating multi-component non-stationary signals, the adaptive time-varying filter (ATF) based on multi-scale chirplet sparse signal decomposition (MCSSD) generates phase shift and signal distortion. To overcome this drawback, the zero-phase filter is introduced into the above filter, and a fault diagnosis method for speed-changing gearboxes is proposed. Firstly, the gear meshing frequency of each gearbox is estimated by chirplet path pursuit. Then, according to the estimated gear meshing frequencies, an adaptive zero-phase time-varying filter (AZPTF) is designed to filter the original signal. Finally, the basis for fault diagnosis is acquired by envelope order analysis of the filtered signal. A signal consisting of two time-varying amplitude-modulation and frequency-modulation (AM-FM) signals is analyzed by ATF and AZPTF, both based on MCSSD. The simulation results show that the variances between the original signals and the signals filtered by AZPTF based on MCSSD are 13.67 and 41.14, far less than the variances (323.45 and 482.86) between the original signals and the signals filtered by ATF based on MCSSD. The experimental results on the vibration signals of gearboxes indicate that the vibration signals of two speed-changing gearboxes installed on one foundation bed can be separated effectively by AZPTF. Based on the demodulation information of the vibration signal of each gearbox, fault diagnosis can be implemented. Both simulation and experimental examples prove that the proposed filter can extract a mono-component time-varying AM-FM signal from a multi-component time-varying AM-FM signal without distortion.
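The key idea of a zero-phase filter, running a causal filter forward and then backward so the phase lags cancel, can be sketched with a fixed first-order low-pass standing in for the paper's time-varying AZPTF (the time-varying design driven by estimated meshing frequencies is not reproduced here):

```python
import numpy as np

def causal_lowpass(x, a=0.7):
    """First-order recursive low-pass: y[n] = a*y[n-1] + (1-a)*x[n].
    A causal filter like this delays (phase-shifts) the signal."""
    y = np.empty(len(x))
    acc = x[0]
    for n, v in enumerate(x):
        acc = a * acc + (1.0 - a) * v
        y[n] = acc
    return y

def zero_phase_lowpass(x, a=0.7):
    """Run the causal filter forward, then backward over the reversed
    output: the two phase responses cancel, leaving zero net phase."""
    forward = causal_lowpass(x, a)
    return causal_lowpass(forward[::-1], a)[::-1]
```

A symmetric pulse filtered causally comes out with its peak delayed; the forward-backward version leaves the peak where it was, which is exactly the property the AZPTF exploits to avoid signal distortion.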
Funding: Supported by the National Natural Science Foundation of China (Grant No. 41671452).
Abstract: Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, in this paper we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend our MSFCN to three dimensions using a three-dimensional (3D) CNN, capable of harnessing each land cover category's time-series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Funding: Supported by the National Natural Science Foundation of China under Grant 62301051.
Abstract: Range-azimuth imaging of ground targets via frequency-modulated continuous wave (FMCW) radar is crucial for effective target detection. However, when the pitch of the moving array constructed during motion exceeds the physical array aperture, azimuth ambiguity occurs, making range-azimuth imaging on a moving platform challenging. To address this issue, we theoretically analyze azimuth ambiguity generation in sparse motion arrays and propose a dual-aperture adaptive processing (DAAP) method for suppressing azimuth ambiguity. This method combines spatial multiple-input multiple-output (MIMO) arrays with sparse motion arrays to achieve high-resolution range-azimuth imaging. In addition, an adaptive QR-decomposition denoising method for sparse array signals, based on iterative low-rank matrix approximation (LRMA) and regularized QR, is proposed to preprocess sparse motion array signals. Simulations and experiments show that on a two-transmitter-four-receiver array, the signal-to-noise ratio (SNR) of the sparse motion array signal after noise suppression via adaptive QR decomposition can exceed 0 dB, and the azimuth ambiguity signal ratio (AASR) can be reduced to below -20 dB.
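The LRMA ingredient of the denoising scheme can be illustrated by a single best rank-r approximation via the SVD (Eckart-Young); the iterative scheme with regularized QR described in the abstract is not reproduced here:

```python
import numpy as np

def low_rank_approx(X, r):
    """Best rank-r approximation of X in the Frobenius norm
    (Eckart-Young theorem): keep the top r singular triplets.

    One LRMA step shown for illustration only; the paper iterates
    LRMA together with a regularized QR decomposition.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # scale the leading r left singular vectors by their singular values
    return (U[:, :r] * s[:r]) @ Vt[:r]
```

Because uncorrelated noise spreads energy across all singular values while the signal concentrates in a few, truncating to low rank suppresses noise in the array snapshot matrix.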
Funding: This project was supported by the National Natural Science Foundation of China (60532060), the Hainan Education Bureau Research Project (Hjkj200602), and the Hainan Natural Science Foundation (80551).
Abstract: A nonlinear data analysis algorithm, namely empirical data decomposition (EDD), is proposed, which can perform adaptive analysis of observed data. The analysis filter, which is not a linear constant-coefficient filter, is automatically determined by the observed data, and is able to implement multi-resolution analysis like the wavelet transform. The algorithm is suitable for analyzing non-stationary data and can effectively remove the correlation of observed data. Then, through discussing the applications of EDD in image compression, the paper presents a two-dimensional data decomposition framework and makes some modifications to the contexts used by Embedded Block Coding with Optimized Truncation (EBCOT). Simulation results show that EDD is more suitable for non-stationary image data compression.
Abstract: High-resolution image fusion is a significant focus in the field of image processing. A new image fusion model is presented based on the characteristic level of empirical mode decomposition (EMD). The intensity-hue-saturation (IHS) transform of the multi-spectral image first gives the intensity image. Thereafter, a 2D EMD, built as a row-column extension of the 1D EMD model, is used to decompose the detailed-scale image and coarse-scale image from the high-resolution band image and the intensity image. Finally, a fused intensity image is obtained by reconstruction with the high frequency of the high-resolution image and the low frequency of the intensity image, and the inverse IHS transform yields the fused image. After presenting the EMD principle, a multi-scale decomposition and reconstruction algorithm of 2D EMD is defined and a fusion scheme based on EMD is advanced. The panchromatic band and multi-spectral bands 3, 2, 1 of QuickBird are used to assess the quality of the fusion algorithm. After selecting the appropriate intrinsic mode functions (IMFs) for the merger on the basis of EMD analysis of specific row (column) pixel gray-value series, the fusion scheme gives a fused image, which is compared with commonly used fusion algorithms (wavelet, IHS, Brovey). The objectives of image fusion include enhancing the visibility of the image and improving the spatial resolution and the spectral information of the original images. To assess the quality of an image after fusion, information entropy and standard deviation are applied to assess the spatial details of the fused images, and the correlation coefficient, bias index and warping degree measure the distortion between the original image and the fused image in terms of spectral information. For the proposed fusion algorithm, better results are obtained when the EMD algorithm is used to perform the fusion experiment.
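The IHS step above can be sketched with one common linear IHS-style transform pair (the abstract does not specify which IHS formulation is used, so this is an assumption): the intensity channel is the band mean, and the transform is exactly invertible.

```python
import numpy as np

def rgb_to_ihs(rgb):
    """Linear IHS-style forward transform: intensity plus two
    chromatic axes. One common variant, shown for illustration."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i = (r + g + b) / 3.0
    v1 = (-r - g + 2.0 * b) / np.sqrt(6.0)
    v2 = (r - g) / np.sqrt(2.0)
    return np.stack([i, v1, v2], axis=-1)

def ihs_to_rgb(ihs):
    """Exact inverse of rgb_to_ihs."""
    i, v1, v2 = ihs[..., 0], ihs[..., 1], ihs[..., 2]
    r = i - v1 / np.sqrt(6.0) + v2 / np.sqrt(2.0)
    g = i - v1 / np.sqrt(6.0) - v2 / np.sqrt(2.0)
    b = i + 2.0 * v1 / np.sqrt(6.0)
    return np.stack([r, g, b], axis=-1)
```

In the fusion scheme, only the intensity channel is replaced by the EMD-fused intensity before inverting, so hue and saturation of the multi-spectral input are preserved.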
基金supported by the National Natural Science Foundation of China(61471313)the Natural Science Foundation of Hebei Province(F2019203318)
Abstract: Removing rain from a single image is a challenging task due to the absence of temporal information. Considering that a rainy image can be decomposed into low-frequency (LF) and high-frequency (HF) components, where the coarse-scale information is retained in the LF component and the rain streaks and texture correspond to the HF component, we propose a single-image rain removal algorithm using image decomposition and a dense network. We design two task-driven sub-networks to estimate the LF and non-rain HF components of a rainy image. The high-frequency estimation sub-network employs a densely connected network structure, while the low-frequency sub-network uses a simple convolutional neural network (CNN). We add total variation (TV) regularization and LF-channel fidelity terms to the loss function to optimize the two sub-networks jointly. The method then obtains the de-rained output by combining the estimated LF and non-rain HF components. Extensive experiments on synthetic and real-world rainy images demonstrate that our method removes rain streaks while preserving non-rain details, and achieves superior de-raining performance both perceptually and quantitatively.
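The premise that an image splits exactly into LF and HF components can be illustrated with a fixed box blur standing in for the learned sub-networks (the paper estimates both components with CNNs; this is only a sketch of the decomposition itself):

```python
import numpy as np

def box_blur(img, k=3):
    """k x k box blur with edge padding (plain NumPy)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def lf_hf_split(img, k=3):
    """Split an image into low- and high-frequency components.
    By construction the split is exactly invertible: lf + hf == img."""
    lf = box_blur(img, k)
    return lf, img - lf
```

In the paper's setting, the rain streaks live in the HF part, so de-raining amounts to estimating a *non-rain* HF component and adding it back to the LF estimate.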
Funding: Supported by the National Natural Science Foundation of China (61271294 and 61301229) and the Doctoral Research Fund of Henan University of Science and Technology (09001708 and 09001751).
Abstract: Total variation (TV) is widely applied in image processing. The assumption of TV is that an image consists of piecewise constants; however, it suffers from the so-called staircase effect. In order to reduce the staircase effect and preserve the edges when textures of an image are extracted, a new image decomposition model is proposed in this paper. The proposed model is based on the total generalized variation method, which involves and balances higher orders of the structure. We also derive a numerical algorithm based on a primal-dual formulation that can be effectively implemented. Numerical experiments show that the proposed method can achieve a better trade-off between noise removal and texture extraction, while avoiding the staircase effect efficiently.
Funding: Supported by the Science and Technology Foundation Program of Chongqing Municipal Education Committee (KJ091208).
Abstract: In order to avoid the staircasing effect and preserve small-scale texture information for classical total variation regularization, a new minimization energy functional model for image decomposition is proposed. Firstly, an adaptive regularization based on the local features of images is introduced to substitute for total variation regularization. The oscillatory component containing texture and/or noise is modeled in the generalized function space div(BMO). Then, the existence and uniqueness of the minimizer for the proposed model are proved. Finally, the gradient descent flow of the Euler-Lagrange equations for the new model is numerically implemented using a finite difference method. Experiments show that the proposed model is very robust to noise, the staircasing effect is avoided efficiently, and edges and textures are well retained.
Abstract: The data acquired by most optical earth observation satellites, such as IKONOS, QuickBird-2 and GF-1, consist of a panchromatic image with high spatial resolution and multiple multispectral images with low spatial resolution, and many image fusion techniques have been developed to produce a high-resolution multispectral image. Considering that the panchromatic image and multispectral images contain the same spatial information with different accuracy, the least-squares theory can estimate the optimal spatial information. Compared with previous spatial-detail injection modes, this mode is more accurate and robust. In this paper, an image fusion method using Bidimensional Empirical Mode Decomposition (BEMD) and the least-squares theory is proposed to merge multispectral images and a panchromatic image. After the multispectral images are transformed from RGB space into IHS space, the I component and the panchromatic image are decomposed by BEMD; the least-squares theory is then used to estimate the optimal spatial information and inject it, and fusion is finally completed through the inverse BEMD and the inverse intensity-hue-saturation transform. Two data sets are used to evaluate the proposed fusion method: GF-1 images and QuickBird-2 images. The fused images were evaluated visually and statistically. The evaluation results show that the method proposed in this paper achieves the best performance compared with the conventional methods.
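The least-squares idea, two bands sharing spatial content with different accuracy, can be sketched as a simple linear fit of one band to the other. This is only an illustration of the estimation step, not the paper's full BEMD-based injection scheme:

```python
import numpy as np

def ls_match(pan, target):
    """Least-squares radiometric matching: fit target ~ a*pan + b.

    A simplified stand-in for estimating 'optimal spatial
    information' by least squares from two bands that share spatial
    structure with different radiometry.
    """
    # design matrix: [pan, 1] for the affine model a*pan + b
    A = np.stack([pan.ravel(), np.ones(pan.size)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, target.ravel(), rcond=None)
    return a * pan + b
```

The matched panchromatic band then carries its high-resolution spatial detail at the radiometric level of the intensity component, which is what makes the subsequent injection accurate.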
Funding: Supported by the China Postdoctoral Science Foundation Funded Project (No. 2021M690385) and the National Natural Science Foundation of China (No. 62101045).
Abstract: Infrared-visible image fusion plays an important role in multi-source data fusion, which has the advantage of integrating useful information from multi-source sensors. However, there are still challenges in target enhancement and visual improvement. To deal with these problems, a sub-regional infrared-visible image fusion method (SRF) is proposed. First, morphology and threshold segmentation are applied to extract targets of interest in infrared images. Second, the infrared background is reconstructed based on the extracted targets and the visible image. Finally, target and background regions are fused using a multi-scale transform. Experimental results are obtained using public data for comparison and evaluation, which demonstrate that the proposed SRF has potential benefits over other methods.
Funding: Project 2008041001 supported by the Academician Foundation of China; Project N0601-041 supported by the General Armament Department Science Foundation of China.
Abstract: A super-resolution reconstruction approach for radar images using an adaptive-threshold singular value decomposition (SVD) technique was presented, and its performance was analyzed, compared and assessed in detail. First, the radar imaging model and super-resolution reconstruction mechanism were outlined. Then, the adaptive-threshold SVD super-resolution algorithm, and its two key aspects, namely the determination method of the point spread function (PSF) matrix T and the selection scheme of the singular value threshold, were presented. Finally, the super-resolution algorithm was demonstrated successfully using measured synthetic-aperture radar (SAR) images, and a Monte Carlo assessment was carried out to evaluate the performance of the algorithm by using the input/output signal-to-noise ratio (SNR). Five versions of the SVD algorithm were tested, namely 1) using all singular values, 2) using the top 80% of singular values, 3) using the top 50% of singular values, 4) using the top 20% of singular values and 5) using singular values s such that s^2 >= max(s^2)/SNR_in. The experimental results indicate that when the singular value threshold is set as s_max/(SNR_in)^(1/2), the super-resolution algorithm provides a good compromise between too much noise and too much bias and has good reconstruction results.
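The adaptive-threshold variant, keeping singular values s with s >= s_max/(SNR_in)^(1/2), amounts to a truncated SVD pseudo-inverse. A minimal sketch, assuming a known square PSF matrix T (the Monte Carlo assessment and SAR-specific modelling are not reproduced):

```python
import numpy as np

def svd_truncated_inverse(T, y, snr_in):
    """Invert y = T @ x, dropping singular components whose singular
    value falls below s_max / sqrt(SNR_in); those components are
    dominated by noise, so dropping them trades a little bias for a
    large reduction in noise amplification."""
    U, s, Vt = np.linalg.svd(T)
    thresh = s[0] / np.sqrt(snr_in)      # s[0] is s_max
    s_inv = np.zeros_like(s)
    keep = s >= thresh
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv * (U.T @ y))
```

As SNR_in grows the threshold falls, so more singular values are retained and the estimate approaches the full pseudo-inverse.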
Abstract: New models for image decomposition are proposed which separate an image into a cartoon, consisting only of geometric objects, and an oscillatory component, consisting of textures or noise. The proposed models are given in a variational formulation with adaptive regularization norms for both the cartoon and texture parts. The adaptive behavior preserves key features such as object boundaries and textures while avoiding staircasing in what should be smooth regions. The decomposition is computed by minimizing a convex functional which depends on the two variables u and v, alternately in each variable. Experimental results and comparisons that validate the proposed models are presented.
Funding: Project supported by the Shanghai Leading Academic Discipline Project (Grant No. T0102).
Abstract: We propose a layered image inpainting scheme based on image decomposition. The damaged image is first decomposed into three layers: cartoon, edge, and texture. The cartoon and edge layers are repaired using an adaptive offset operator that can fill in damaged image blocks while preserving the sharpness of edges. The missing information in the texture layer is generated with a texture synthesis method. By using the discrete cosine transform (DCT) in image decomposition and trading between resolution and computational complexity in texture synthesis, the processing time is kept at a reasonable level.