Accurate segmentation of camouflaged objects in aerial imagery is vital for improving the efficiency of UAV-based reconnaissance and rescue missions. However, camouflaged object segmentation is increasingly challenging due to advances in both camouflage materials and biological mimicry. Although multispectral-RGB technology shows promise, conventional dual-aperture multispectral-RGB imaging systems are constrained by imprecise and time-consuming registration and fusion across modalities, which limits their performance. Here, we propose the Reconstructed Multispectral-RGB Fusion Network (RMRF-Net), which reconstructs multispectral images from RGB images, enabling efficient multimodal segmentation using only an RGB camera. Specifically, RMRF-Net employs a divergent-similarity feature correction strategy to minimize reconstruction errors and includes an efficient boundary-aware decoder to enhance object contours. Notably, we establish the first real-world aerial multispectral-RGB dataset for semantic segmentation of camouflaged objects, covering 11 object categories. Experimental results demonstrate that RMRF-Net outperforms existing methods, achieving 17.38 FPS on the NVIDIA Jetson AGX Orin with only a 0.96% drop in mIoU compared to the RTX 3090, showing its practical applicability in multimodal remote sensing.
Image super-resolution reconstruction technology is widely used in medical imaging, video surveillance, and industrial quality inspection. It not only enhances image quality but also improves detail and visual perception, significantly increasing the utility of low-resolution images. In this study, an improved image super-resolution reconstruction model based on Generative Adversarial Networks (SRGAN) is proposed. The model introduces a channel and spatial attention mechanism (CSAB) in the generator, allowing it to leverage the information in the input image to enhance feature representations and capture important details. The discriminator is designed with an improved PatchGAN architecture, which more accurately captures local details and texture information of the image. With these enhanced generator and discriminator architectures and an optimized loss function design, the method demonstrates superior performance on image quality assessment metrics. Experimental results show that the model outperforms traditional methods, producing more detailed and realistic images.
As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks due to their ease of embedding and representative capacity. However, existing VQ-VAEs often perform quantization in the spatial domain, ignoring global structural information and potentially suffering from codebook collapse and information coupling. This paper proposes a frequency quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components to preserve the image's global relationships. The codebook is dynamically optimized to avoid collapse and information coupling by considering the usage frequency and dependency of code vectors. Furthermore, we introduce a post-processing module based on graph convolutional networks to further improve reconstruction quality. Experimental results on four public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of Structural Similarity Index (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Reconstruction Fréchet Inception Distance (rFID). In the experiments on the CIFAR-10 dataset, compared to the baseline method VQ-VAE, the proposed method improves these metrics by 4.9%, 36.4%, and 52.8%, respectively.
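The idea of quantizing in the frequency domain rather than the spatial domain can be illustrated with a minimal sketch. It is not the FQ-VAE method itself: the function name `quantize_in_frequency`, the fixed two-entry codebook, and the plain nearest-neighbour lookup on (real, imaginary) pairs are all assumptions made for illustration, whereas the abstract describes an adaptive quantizer with a dynamically optimized codebook.

```python
import numpy as np

def quantize_in_frequency(feature, codebook):
    """Snap each 2D-FFT coefficient to its nearest codebook entry (hypothetical helper).

    feature:  (H, W) real array, standing in for one feature channel.
    codebook: (K, 2) array of (real, imag) code vectors (illustrative only).
    """
    spectrum = np.fft.fft2(feature)
    # Treat every complex coefficient as a 2-vector (real, imag).
    coeffs = np.stack([spectrum.real.ravel(), spectrum.imag.ravel()], axis=1)
    # Squared Euclidean distance from every coefficient to every code vector.
    dists = ((coeffs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    indices = dists.argmin(axis=1)
    quantized = codebook[indices]
    q_spectrum = (quantized[:, 0] + 1j * quantized[:, 1]).reshape(spectrum.shape)
    # Back to the spatial domain; the imaginary residue is discarded.
    return np.real(np.fft.ifft2(q_spectrum)), indices

# A constant 4x4 map has a single non-zero coefficient (the DC term, equal to 16).
feature = np.ones((4, 4))
codebook = np.array([[0.0, 0.0], [16.0, 0.0]])
reconstruction, indices = quantize_in_frequency(feature, codebook)
```

Because every frequency coefficient depends on all spatial positions, each code index here carries global rather than local information, which is the property the abstract emphasizes.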
In agricultural production, crop images are commonly used for the classification and identification of various crops. However, several challenges arise, including low image clarity, elevated noise levels, and the low accuracy and poor robustness of existing classification models. To address these issues, this research proposes a crop image classification model named Lap-FEHRNet, which integrates a Laplacian Pyramid Super-Resolution Network (LapSRN) with a feature enhancement high-resolution network based on attention mechanisms (FEHRNet). To mitigate noise interference, the model incorporates LapSRN, which uses a Laplacian pyramid structure to extract multi-level feature details from low-resolution images through layer-by-layer amplification and pixel detail superposition. This gradual reconstruction enhances the high-frequency information of the image, enabling super-resolution reconstruction of low-quality images. To obtain a broader range of diverse features, the FEHRNet model is employed for both deep and shallow feature extraction, yielding features that encapsulate multi-scale information. To fuse these complementary features effectively, an attention mechanism is introduced during the feature enhancement stage. This mechanism highlights important regions within the image by assigning greater weights to salient features, resulting in a more comprehensive and effective image feature representation and, consequently, higher classification accuracy. Experimental results demonstrate that Lap-FEHRNet achieves classification accuracies of 98.8% on the crop classification dataset and 98.57% on the rice leaf disease dataset, underscoring the model's accuracy, robustness, and generalization capability.
Person image synthesis has been widely used in fashion, with extensive application scenarios. The goal of this task is to synthesise a person image in an arbitrary pose from a single source image. Prior methods generate the person image in the target pose well; however, they fail to preserve the fine style details of the source image. To address this problem, a robust style injection (RSI) model is proposed, which is a coarse-to-fine framework for synthesising the target person image. RSI develops a simple and efficient cross-attention based module that fuses the features of the source semantic styles and the target pose to obtain coarsely aligned features. Adaptive instance normalisation is employed to enhance the aligned features in conjunction with the source semantic styles. Subsequently, the source semantic styles are further injected into the positional normalisation scheme to avoid the erosion of fine style details caused by massive convolution. In the training losses, optimal transport theory in the form of energy distance is introduced to constrain the data distribution and refine the texture style details. Additionally, the authors' model is capable of editing the shape and texture of garments to the target style separately. The experiments demonstrate that the authors' RSI achieves better performance than state-of-the-art methods.
Deep learning (DL)-based image reconstruction methods have garnered increasing interest in the last few years. Numerous studies demonstrate that DL-based reconstruction methods perform well in optical tomographic imaging techniques such as bioluminescence tomography (BLT). Nevertheless, nearly every existing DL-based method uses an explicit neural representation for the reconstruction problem, which either consumes much memory or requires complicated computations. In this paper, we present a neural field (NF)-based image reconstruction scheme for BLT that uses an implicit neural representation. The proposed NF-based method establishes a mapping between the coordinates of an arbitrary spatial point and the source value at that point with a relatively lightweight multilayer perceptron, which is computationally efficient. Another simple neural network, composed of two fully connected layers and a 1D convolutional layer, is used to generate the neural features. Results of simulations and experiments show that the proposed NF-based method performs similarly to the photon density complement network and the two-stage network while requiring fewer floating-point operations and fewer model parameters.
Epilepsy is a chronic neurological disorder that affects the function of the brain in people of all ages. It manifests in the electroencephalogram (EEG) signal, which records the electrical activity of the brain. Various image processing, signal processing, and machine-learning based techniques are employed to analyze epilepsy using spatial and temporal features. The nervous system that generates the EEG signal is considered nonlinear, and EEG signals exhibit chaotic behavior. To capture these nonlinear dynamics, we use the reconstructed phase space (RPS) representation of the signal. Earlier studies have primarily addressed seizure detection as a binary classification problem (normal vs. ictal) and rarely as a ternary one (normal vs. interictal vs. ictal). We employ transfer learning on a pre-trained deep neural network model and retrain it using RPS images of the EEG signal. The classification accuracy of the model is (98.5±1.5)% for the binary classes and (95±2)% for the ternary classes. The convolutional neural network (CNN) model performs better than existing statistical approaches on all performance indicators, including accuracy, sensitivity, and specificity. These results show the promise of employing RPS images with a CNN for predicting epileptic seizures.
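A reconstructed phase space is commonly built by time-delay embedding, and the sketch below shows that construction followed by rasterisation into an image. The abstract does not state the embedding dimension, delay, or image size, so `dim=2`, `delay=4`, `bins=32`, the function names, and the synthetic two-tone signal standing in for EEG are all assumptions.

```python
import numpy as np

def reconstruct_phase_space(signal, dim=2, delay=1):
    """Time-delay embedding: row i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for the requested dim and delay")
    return np.stack([x[k * delay : k * delay + n_points] for k in range(dim)], axis=1)

def rps_image(points, bins=32):
    """Rasterise a 2-D trajectory into an occupancy image scaled to [0, 1]."""
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins)
    peak = hist.max()
    return hist / peak if peak > 0 else hist

# Synthetic stand-in for one EEG segment (not real data).
t = np.linspace(0.0, 1.0, 512, endpoint=False)
eeg_like = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 23 * t)

trajectory = reconstruct_phase_space(eeg_like, dim=2, delay=4)
image = rps_image(trajectory, bins=32)
small = reconstruct_phase_space([1, 2, 3, 4, 5], dim=3, delay=1)
```

An image produced this way could then be resized and fed to a pre-trained CNN for fine-tuning, which is the role RPS images play in the abstract.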
Structured illumination microscopy (SIM) achieves super-resolution (SR) by modulating the high-frequency information of the sample into the passband of the optical system, followed by image reconstruction. The traditional Wiener-filtering-based reconstruction algorithm operates in the Fourier domain and requires prior knowledge of the sinusoidal illumination patterns, which makes a time-consuming parameter estimation step on the raw datasets necessary. Moreover, parameter estimation is sensitive to noise and to aberration-induced pattern distortion, which leads to reconstruction artifacts. Here, we propose a spatial-domain image reconstruction method that does not require parameter estimation but instead calculates the patterns from the raw datasets. A reconstructed image is obtained simply by calculating the spatial covariance of the differential calculated patterns and the differential filtered datasets (notch filtering is applied to the raw datasets to attenuate and compensate the optical transfer function (OTF)). Experiments on raw datasets from nonbiological, biological, and simulated samples demonstrate that our method has SR capability, high reconstruction speed, and high robustness to aberration and noise.
Photoacoustic imaging (PAI) is an emerging noninvasive imaging method based on the photoacoustic effect that provides valuable assistance for medical diagnosis. It offers large imaging depth and high contrast. However, limited by equipment cost and reconstruction time requirements, existing PAI systems with annular array transducers struggle to balance image quality and imaging speed. In this paper, a triple-path feature transform network (TFT-Net) for ring-array photoacoustic tomography is proposed to enhance imaging quality from limited-view and sparse measurement data. Specifically, the network combines the raw photoacoustic pressure signals and conventional linear reconstruction images as input data, and takes the photoacoustic physical model as prior information to guide the reconstruction process. In addition, to strengthen signal feature extraction, residual blocks and squeeze-and-excitation blocks are introduced into TFT-Net. For more efficient reconstruction, the final output of the photoacoustic signals uses a 'filter-then-upsample' operation with a pixel-shuffle multiplexer and a maxout module. Experimental results on simulated and in-vivo data demonstrate that TFT-Net can restore target boundaries clearly, reduce background noise, and achieve fast, high-quality photoacoustic image reconstruction from limited-view, sparsely sampled data.
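The two building blocks named at the output stage, maxout and pixel shuffle, are standard operations that can be sketched in a few lines. The sketch below assumes channel-first `(C, H, W)` arrays and the usual pixel-shuffle channel ordering; how TFT-Net actually wires these blocks together is not stated in the abstract, so applying maxout first and pixel shuffle second is an illustrative assumption.

```python
import numpy as np

def maxout(x, pieces):
    """Maxout over groups of `pieces` consecutive channels: (C, H, W) -> (C/pieces, H, W)."""
    c, h, w = x.shape
    if c % pieces:
        raise ValueError("channel count must be divisible by pieces")
    return x.reshape(c // pieces, pieces, h, w).max(axis=1)

def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r): channels become sub-pixel positions."""
    c, h, w = x.shape
    if c % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    out_c = c // (r * r)
    # Axes after reshape: (out_c, i, j, h, w); reorder to (out_c, h, i, w, j).
    return x.reshape(out_c, r, r, h, w).transpose(0, 3, 1, 4, 2).reshape(out_c, h * r, w * r)

# Eight 1x1 feature maps holding the values 0..7.
features = np.arange(8, dtype=float).reshape(8, 1, 1)
reduced = maxout(features, pieces=2)      # keeps 1, 3, 5, 7
upsampled = pixel_shuffle(reduced, r=2)   # one 2x2 map: [[1, 3], [5, 7]]
```

Filtering at low resolution and upsampling only at the end by rearranging channels is what makes a 'filter-then-upsample' design cheaper than convolving at the full output resolution.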
In this paper, we introduce an acceleration algorithm based on the Taylor series for reconstructing target images in the spectral digital image correlation (SDIC) method. The Taylor series image reconstruction method replaces the previous direct Fourier transform (DFT) image reconstruction method, which consumes the majority of the computational time for target image reconstruction. The partial derivatives in the Taylor series are computed using the fast Fourier transform (FFT) of the entire image, following Fourier transform theory. To examine the impact of the order of the Taylor series expansion on accuracy and efficiency, we employ third- and fourth-order Taylor series image reconstruction methods and compare them with the DFT image reconstruction method through simulated experiments. The computational efficiency with the third- and fourth-order Taylor series improves by factors of 57 and 46, respectively, compared to the previous method. In terms of analysis accuracy, within a strain range of 0–0.1 and without added image noise, the accuracy of the proposed method increases with the expansion order, surpassing that of the DFT image reconstruction method at fourth order. When different levels of Gaussian noise are applied to the simulated images, the accuracy of both the third- and fourth-order Taylor series expansion methods is superior to that of the DFT reconstruction method. Finally, we present the experimental results for a silicone rubber plate specimen with bilateral cracks under uniaxial tension.
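The core idea, evaluating an image at displaced positions from a Taylor expansion whose derivatives come from one FFT of the whole image, can be sketched in one dimension. This is a simplified illustration, not the SDIC implementation: the 1-D periodic test signal, the displacement field `u`, and the function names are assumptions.

```python
import numpy as np
from math import factorial

def fft_derivatives(f, order):
    """Return [f, f', ..., f^(order)] of a periodic sampled signal using one FFT."""
    f = np.asarray(f, dtype=float)
    ik = 2j * np.pi * np.fft.fftfreq(f.size)  # i * angular frequency per sample
    spectrum = np.fft.fft(f)
    derivatives = []
    for _ in range(order + 1):
        derivatives.append(np.real(np.fft.ifft(spectrum)))
        spectrum = spectrum * ik  # differentiation is multiplication by i*omega
    return derivatives

def taylor_resample(f, u, order=3):
    """Approximate f(x + u(x)) pointwise with a Taylor expansion of the given order."""
    derivatives = fft_derivatives(f, order)
    u = np.asarray(u, dtype=float)
    return sum(d * u**m / factorial(m) for m, d in enumerate(derivatives))

n = 64
x = np.arange(n)
omega = 2 * np.pi * 3 / n
reference = np.sin(omega * x)
u = 0.2 + 0.1 * x / n                 # sub-pixel displacement that varies with position
exact = np.sin(omega * (x + u))

err3 = np.max(np.abs(taylor_resample(reference, u, order=3) - exact))
err4 = np.max(np.abs(taylor_resample(reference, u, order=4) - exact))
```

The derivatives are computed once and reused for every pixel, so a spatially varying displacement costs only a few multiply-adds per pixel, whereas a direct Fourier evaluation needs a full sum over frequencies at each displaced position.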
Deep learning can greatly advance super-resolution imaging technology in terms of imaging and reconstruction speed, imaging resolution, and imaging flux. This paper proposes a deep neural network based on a generative adversarial network (GAN). The generator employs a U-Net-based network that integrates DenseNet in the downsampling component. The proposed method has several favorable properties: the network model is trained with several different datasets of biological structures; the trained model can improve the imaging resolution of different microscopy modalities, such as confocal and wide-field imaging; and the model generalizes to improve the resolution of biological structures outside the training datasets. In addition, experimental results show that the method improves the resolution of caveolin-coated pits (CCPs) structures from 264 nm to 138 nm, a 1.91-fold improvement, and nearly doubles the resolution of DNA molecules imaged while being transported through microfluidic channels.
Underwater images often suffer from biased colours and reduced contrast because of absorption and scattering as light propagates through water. Such degraded images cannot meet the needs of underwater operations. The main problems with classic underwater image restoration and enhancement methods are that they require long calculation times and that the colour or contrast of the resulting images is often still unsatisfactory. Instead of using a complicated physical model of underwater imaging degradation, we propose a new method that processes underwater images by imitating the colour constancy mechanism of human vision using double-opponency. Firstly, the original image is converted to the LMS space. Then the signals are linearly combined, and Gaussian convolutions are performed to imitate the function of receptive fields (RFs). Next, two RFs of different sizes work together to form the double-opponency response. Finally, the underwater light is estimated to correct the colours in the image. Further contrast stretching on the luminance is optional. Experiments show that the proposed method produces clearer, higher-quality underwater images and requires significantly less computation time than typical previously published methods.
Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning, and cross attention, often struggle to produce fine texture details. They either estimate appearance flows inaccurately because they lack a global receptive field, or can only perform cross-domain alignment on high-level feature maps with small spatial dimensions because the computational complexity grows quadratically with feature size. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method named Multi-scale Cross-domain Alignment (MCA) is proposed. Firstly, MCA adopts a global context aggregation transformer, which employs pair-wise window-based cross attention, to model multi-scale interaction between the pose and appearance inputs. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to warp and fuse features for the final transformed person image. The proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of the approach.
Image reconstruction in electrical impedance tomography (EIT) is a nonlinear and ill-posed inverse problem whose results are easily affected by measurement noise, so it must be solved with regularization methods. Iterative regularization has become a research focus due to its ease of implementation. To deal with the ill-posed and ill-conditioned nature of image reconstruction, the inexact Newton-Landweber iterative method is considered and Nesterov's acceleration strategy is introduced. A Nesterov-type accelerated version of the inexact Newton-Landweber iteration is presented to determine the conductivity distribution inside an object from electrical measurements made on its surface. To further optimize the acceleration, both the steepest descent step length and the minimal error step length are exploited during the iterative image reconstruction process. Landweber iteration and its accelerated version are also implemented for comparison. All algorithms are terminated by the discrepancy principle. Finally, numerical simulations are reported to verify the acceleration efficiency of the proposed method.
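How Nesterov's extrapolation speeds up a Landweber iteration stopped by the discrepancy principle can be seen on a small linear toy problem. The sketch below is deliberately much simpler than the method in the abstract: it uses a linear diagonal operator instead of the nonlinear EIT forward model, a fixed step instead of steepest descent or minimal error step lengths, and no inner Newton loop. The function name, `tau=1.5`, `alpha=3.0`, and the test data are assumptions.

```python
import numpy as np

def landweber(A, b, noise_level, tau=1.5, accelerate=False, alpha=3.0, max_iter=5000):
    """Landweber iteration for A x = b, stopped by the discrepancy principle.

    With accelerate=True, each gradient step starts from a Nesterov-extrapolated
    point z = x_k + (k - 1) / (k + alpha - 1) * (x_k - x_{k-1}).
    Returns the iterate and the number of update steps performed.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # fixed step based on the spectral norm
    x_prev = np.zeros(A.shape[1])
    x = x_prev.copy()
    for k in range(1, max_iter + 1):
        # Discrepancy principle: stop once the residual reaches the noise level.
        if np.linalg.norm(A @ x - b) <= tau * noise_level:
            return x, k - 1
        z = x + (k - 1) / (k + alpha - 1) * (x - x_prev) if accelerate else x
        x_prev, x = x, z + step * A.T @ (b - A @ z)
    return x, max_iter

# Ill-conditioned toy operator with deterministic "noise" (illustrative data only).
singular_values = np.linspace(1.0, 0.05, 20)
A = np.diag(singular_values)
x_true = np.ones(20)
noise = 1e-3 * np.where(np.arange(20) % 2 == 0, 1.0, -1.0)
b = A @ x_true + noise
delta = np.linalg.norm(noise)

x_plain, plain_iters = landweber(A, b, delta)
x_fast, fast_iters = landweber(A, b, delta, accelerate=True)
```

On this toy problem the plain iteration is held back by the smallest singular value, while the extrapolated variant reaches the same residual level in fewer steps; comparing `plain_iters` and `fast_iters` shows the gap.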
A new method to accelerate the convergence rate of the space-alternating generalized expectation-maximization (SAGE) algorithm is proposed. The new rescaled block-iterative SAGE (RBI-SAGE) algorithm combines the RBI algorithm with the SAGE algorithm for positron emission tomography (PET) image reconstruction. In the new approach, the projection data are partitioned into disjoint blocks, and each iteration step involves only one of these blocks. SAGE updates the parameters sequentially in each block. In experiments, the RBI-SAGE algorithm and the classical SAGE algorithm are compared on PET image reconstruction. Simulation results show that RBI-SAGE performs better than SAGE in both convergence and image quality.
In high-resolution cone-beam computed tomography (CBCT) using a flat-panel detector, imperfect or defective detector elements cause ring artifacts due to the non-uniformity of their X-ray response, and these artifacts often degrade image quality. A dedicated fitting correction method for high-resolution micro-CT is presented. The method converts the X-ray response curve of each element to an average one, eliminating response inconsistency among pixels. Other aspects of the method are discussed, such as the variability of the correction factor across sampling frames and nonlinear factors over the whole spectrum. Results show that both noise and artifacts are reduced in the reconstructed images.
To improve the quality of spectral X-ray CT reconstructed images, the energy-weighted reconstructed image xbins^W and the separable paraboloidal surrogates (SPS) algorithm are proposed for prior image constrained compressed sensing (PICCS)-based spectral X-ray CT image reconstruction. PICCS-based image reconstruction uses compressed sensing theory, a prior image, and an optimization algorithm to improve the image quality of CT reconstructions. To evaluate the performance of the proposed method, three optimization algorithms and three prior images are employed and compared in terms of reconstruction accuracy and the noise characteristics of the reconstructed images in each energy bin. The simulation results show that, in general, the image xbins^W is the best prior image across the three optimization algorithms, and that the SPS algorithm offers the best performance for the simulated phantom across the three prior images. Compared with filtered back-projection (FBP), PICCS with the SPS algorithm and xbins^W as the prior image reduces noise in the reconstructed images by up to 80.46%, 82.51%, and 88.08% in each energy bin, respectively. Meanwhile, the root-mean-squared error in each energy bin is decreased by 15.02%, 18.15%, and 34.11%, and the correlation coefficient is increased by 9.98%, 11.38%, and 15.94%, respectively.
The relationship between the original image and the reconstructed image is discussed for the case in which errors are introduced into the DCT data of the original image during transmission. Based on this relationship, wrongly reconstructed blocks can be found, the blocks most likely to contain errors can be located, and the errors can be eliminated. The method requires no channel error protection, no verification bits, and no extra bandwidth. Experimental results are provided at the end.
To improve the reconstruction accuracy of magnetic resonance imaging (MRI), an accurate natural image compressed sensing (CS) reconstruction network is proposed, which combines the advantages of model-based and deep learning-based CS-MRI methods. In theory, enhancing geometric texture details in linear reconstruction is possible. First, the optimization problem is decomposed into two problems: linear approximation and geometric compensation. The linear approximation problem is handled by a data consistency module. Since this processing loses texture details, a neural network layer that explicitly combines image and frequency feature representations is proposed, named the butterfly dilated geometric distillation network. The network introduces the idea of the butterfly operation, integrates features of the image domain and the frequency domain, and avoids the loss of texture details that occurs when features are extracted in a single domain. Finally, a channel feature fusion module is designed by combining a channel attention mechanism and dilated convolution. Channel attention makes the final output feature map focus on the more important parts, improving the feature representation ability. Dilated convolution enlarges the receptive field, thereby capturing denser image feature data. The experimental results show that the peak signal-to-noise ratio of the network is 5.43 dB, 5.24 dB, and 3.89 dB higher than that of the ISTA-Net+, FISTA, and DGDN networks, respectively, on the brain dataset with a Cartesian sampling mask at a CS ratio of 10%.
Super-resolution structured illumination microscopy (SR-SIM) relies heavily on post-processing reconstruction to obtain high-quality SR images from raw data. Although many SIM reconstruction algorithms have been developed to recover fine cellular structures with high fidelity even from noisy data, whether the pixel intensities of reconstructed SR images remain proportional to the original fluorescence intensity has been less explored. The linearity between the intensity before and after reconstruction is defined as the intensity fidelity. Here, we propose a method to evaluate the intensity fidelity of reconstructed SR images at different spatial frequencies. With the proposed metric, we systematically investigate the impact of key factors on intensity fidelity in standard Wiener-SIM reconstructions with simulated data, then evaluate the intensity fidelity of SR images reconstructed by representative open-source packages. Our work provides a reference for improving the intensity fidelity of SR-SIM images.
Funding: National Natural Science Foundation of China (Grant Nos. 62005049 and 62072110); Natural Science Foundation of Fujian Province (Grant No. 2020J01451).
Funding: Supported by the Interdisciplinary Project of Dalian University (DLUXK-2023-ZD-001).
Abstract: As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks due to their ease of embedding and representative capacity. However, existing VQ-VAEs often perform quantization in the spatial domain, ignoring global structural information and potentially suffering from codebook collapse and information coupling issues. This paper proposes a frequency quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components to preserve the image's global relationships. The codebook is dynamically optimized to avoid collapse and information coupling by considering the usage frequency and dependency of code vectors. Furthermore, we introduce a post-processing module based on graph convolutional networks to further improve reconstruction quality. Experimental results on four public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of Structural Similarity Index (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Reconstruction Fréchet Inception Distance (rFID). In the experiments on the CIFAR-10 dataset, compared to the baseline method VQ-VAE, the proposed method improves the above metrics by 4.9%, 36.4%, and 52.8%, respectively.
Abstract: In agricultural production, crop images are commonly used for the classification and identification of various crops. However, several challenges arise, including low image clarity, elevated noise levels, and the low accuracy and poor robustness of existing classification models. To address these issues, this research proposes a crop image classification model named Lap-FEHRNet, which integrates a Laplacian Pyramid Super Resolution Network (LapSRN) with a feature enhancement high-resolution network based on attention mechanisms (FEHRNet). To mitigate noise interference, this research incorporates the LapSRN network, which utilizes a Laplacian pyramid structure to extract multi-level feature details from low-resolution images through a systematic layer-by-layer amplification and pixel detail superposition process. This gradual reconstruction enhances the high-frequency information of the image, enabling super-resolution reconstruction of low-quality images. To obtain a broader range of comprehensive and diverse features, this research employs the FEHRNet model for both deep and shallow feature extraction. This approach results in features that encapsulate multi-scale information and integrate both deep and shallow insights. To effectively fuse these complementary features, this research introduces an attention mechanism during the feature enhancement stage. This mechanism highlights important regions within the image, assigning greater weights to salient features and resulting in a more comprehensive and effective image feature representation. Consequently, the accuracy of image classification is significantly improved. Experimental results demonstrate that the Lap-FEHRNet model achieves classification accuracies of 98.8% on the crop classification dataset and 98.57% on the rice leaf disease dataset, underscoring the model's accuracy, robustness, and generalization capability.
Funding: National Natural Science Foundation of China, Grant/Award Number: 62176124.
Abstract: Person image synthesis has been widely used in fashion with extensive application scenarios. The core of this task is how to synthesise a person image from a single source image under arbitrary poses. Prior methods generate the person image in the target pose well; however, they fail to preserve the fine style details of the source image. To address this problem, a robust style injection (RSI) model is proposed, which is a coarse-to-fine framework to synthesise the target person image. RSI develops a simple and efficient cross-attention based module to fuse the features of both source semantic styles and target pose to obtain coarse aligned features. Adaptive instance normalisation is employed to enhance the aligned features in conjunction with source semantic styles. Subsequently, source semantic styles are further injected into the positional normalisation scheme to avoid the erosion of fine style details caused by massive convolution. In the training losses, optimal transport theory in the form of energy distance is introduced to constrain the data distribution and refine the texture style details. Additionally, the authors' model is capable of editing the shape and texture of garments to the target style separately. The experiments demonstrate that the authors' RSI achieves better performance than the state-of-the-art methods.
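The adaptive instance normalisation step mentioned above can be illustrated with a minimal sketch. It assumes the standard per-channel definition (whiten the content features, then rescale and shift them with the style statistics); the function name `adain_channel`, the `eps` value, and the toy inputs are hypothetical, and the abstract does not say how RSI wires this into its network.

```python
import statistics

def adain_channel(content, style, eps=1e-5):
    """Re-normalise one content channel to the mean and std of a style channel."""
    mu_c = statistics.fmean(content)
    sd_c = statistics.pstdev(content)
    mu_s = statistics.fmean(style)
    sd_s = statistics.pstdev(style)
    # Whiten the content values, then rescale and shift with the style statistics.
    # eps (an assumed constant) guards against division by zero on flat channels.
    return [sd_s * (v - mu_c) / (sd_c + eps) + mu_s for v in content]

# Toy flattened feature channels (illustrative values only).
content_channel = [1.0, 2.0, 3.0, 4.0]
style_channel = [10.0, 20.0, 30.0, 40.0]
stylised = adain_channel(content_channel, style_channel)
```

After the call, `stylised` keeps the relative ordering of the content values but carries the mean and (up to `eps`) the standard deviation of the style channel.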
Funding: Supported in part by the National Natural Science Foundation of China (62101278, 62001379, 62271023) and the Beijing Natural Science Foundation (7242269).
Abstract: Deep learning (DL)-based image reconstruction methods have garnered increasing interest in the last few years. Numerous studies demonstrate that DL-based reconstruction methods perform well in optical tomographic imaging techniques, such as bioluminescence tomography (BLT). Nevertheless, nearly every existing DL-based method utilizes an explicit neural representation for the reconstruction problem, which either consumes much memory space or requires various complicated computations. In this paper, we present a neural field (NF)-based image reconstruction scheme for BLT that uses an implicit neural representation. The proposed NF-based method establishes a transformation between the coordinate of an arbitrary spatial point and the source value of the point with a relatively lightweight multilayer perceptron, which has remarkable computational efficiency. Another simple neural network composed of two fully connected layers and a 1D convolutional layer is used to generate the neural features. Results of simulations and experiments show that the proposed NF-based method has similar performance to the photon density complement network and the two-stage network, while consuming fewer floating point operations with fewer model parameters.
Abstract: Epilepsy is a chronic neurological disorder that affects the function of the brain in people of all ages. It manifests in the electroencephalogram (EEG) signal, which records the electrical activity of the brain. Various image processing, signal processing, and machine-learning based techniques are employed to analyze epilepsy, using spatial and temporal features. The nervous system that generates the EEG signal is considered nonlinear, and the EEG signals exhibit chaotic behavior. In order to capture these nonlinear dynamics, we use a reconstructed phase space (RPS) representation of the signal. Earlier studies have primarily addressed seizure detection as a binary classification (normal vs. ictal) problem and rarely as a ternary classification (normal vs. interictal vs. ictal) problem. We employ transfer learning on a pre-trained deep neural network model and retrain it using RPS images of the EEG signal. The classification accuracy of the model is (98.5±1.5)% for the binary classes and (95±2)% for the ternary classes. The convolutional neural network (CNN) model performs better than the existing statistical approach on all performance indicators, such as accuracy, sensitivity, and specificity. The results of the proposed approach show the prospect of employing RPS images with a CNN for predicting epileptic seizures.
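A reconstructed phase space is commonly built by time-delay embedding, and the sketch below assumes that construction. The embedding dimension `dim`, the delay `tau`, the function name `delay_embed`, and the sine wave standing in for an EEG segment are all illustrative assumptions; the abstract does not state which values or preprocessing the authors used.

```python
import math

def delay_embed(signal, dim=2, tau=1):
    """Return the time-delay embedding of a 1-D signal.

    Each output point is (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]).
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(signal) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [tuple(signal[t + k * tau] for k in range(dim)) for t in range(n_points)]

# Toy stand-in for an EEG segment: a sampled sine wave (not real data).
samples = [math.sin(2 * math.pi * i / 20) for i in range(100)]
points = delay_embed(samples, dim=2, tau=5)
```

Plotting the two coordinates of `points` against each other would give the kind of 2-D trajectory image that could then be fed to an image classifier.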
Funding: Funded by the National Natural Science Foundation of China (62125504, 61827825, and 31901059), the Zhejiang Provincial Ten Thousand Plan for Young Top Talents (2020R52001), and the Open Project Program of Wuhan National Laboratory for Optoelectronics (2021WNLOKF007).
Abstract: Structured illumination microscopy (SIM) achieves super-resolution (SR) by modulating the high-frequency information of the sample into the passband of the optical system, followed by image reconstruction. The traditional Wiener-filtering-based reconstruction algorithm operates in the Fourier domain and requires prior knowledge of the sinusoidal illumination patterns, which makes a time-consuming parameter estimation on the raw datasets necessary. In addition, the parameter estimation is sensitive to noise and to aberration-induced pattern distortion, which leads to reconstruction artifacts. Here, we propose a spatial-domain image reconstruction method that does not require parameter estimation but instead calculates the patterns from the raw datasets. A reconstructed image is obtained simply by calculating the spatial covariance of the differential calculated patterns and the differential filtered datasets (notch filtering is applied to the raw datasets to attenuate and compensate the optical transfer function (OTF)). Experiments on reconstructing raw datasets of nonbiological, biological, and simulated samples demonstrate that our method has SR capability, high reconstruction speed, and high robustness to aberration and noise.
Funding: Supported by the National Key R&D Program of China (2022YFC2402400), the National Natural Science Foundation of China (Grant No. 62275062), and the Guangdong Provincial Key Laboratory of Biomedical Optical Imaging Technology (Grant No. 2020B121201010-4).
Abstract: Photoacoustic imaging (PAI) is an emerging noninvasive imaging method based on the photoacoustic effect, which provides necessary assistance for medical diagnosis. It offers large imaging depth and high contrast. However, limited by equipment cost and reconstruction time requirements, existing PAI systems with annular array transducers struggle to balance image quality and imaging speed. In this paper, a triple-path feature transform network (TFT-Net) for ring-array photoacoustic tomography is proposed to enhance the imaging quality from limited-view and sparse measurement data. Specifically, the network combines the raw photoacoustic pressure signals and conventional linear reconstruction images as input data, and takes the photoacoustic physical model as prior information to guide the reconstruction process. In addition, to enhance the ability to extract signal features, the residual block and the squeeze-and-excitation block are introduced into TFT-Net. For further efficient reconstruction, the final output of photoacoustic signals uses a 'filter-then-upsample' operation with a pixel-shuffle multiplexer and a max out module. Experimental results on simulated and in-vivo data demonstrate that the constructed TFT-Net can restore the target boundary clearly, reduce background noise, and realize fast and high-quality photoacoustic image reconstruction from limited-view, sparsely sampled data.
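The pixel-shuffle rearrangement named in the 'filter-then-upsample' step can be sketched as follows. This assumes the channel ordering commonly used in sub-pixel convolution layers, where `r * r` low-resolution channels are interleaved into one channel that is `r` times larger in each direction; the function name `pixel_shuffle` and the toy input are hypothetical, and the abstract gives no details of how TFT-Net configures this operation.

```python
def pixel_shuffle(channels, r):
    """Rearrange C*r*r feature maps of size HxW into C maps of size (H*r)x(W*r)."""
    if len(channels) % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = len(channels) // (r * r)
    h, w = len(channels[0]), len(channels[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = channels[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out

# Four 1x1 channels become one 2x2 channel.
low_res = [[[1]], [[2]], [[3]], [[4]]]
high_res = pixel_shuffle(low_res, 2)  # [[[1, 2], [3, 4]]]
```

Because the filtering runs on the small feature maps and the upsampling is a pure rearrangement, the expensive convolutions never touch the full-resolution grid.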
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12272145 and 11972013), the Ministry of Science and Technology of China (Grant No. 2018YFF01014200), and the Hubei Provincial Natural Science Foundation of China (Grant No. 2022CFB288).
Abstract: In this paper, we introduce an accelerating algorithm based on the Taylor series for reconstructing target images in the spectral digital image correlation method (SDIC). The Taylor series image reconstruction method is employed instead of the previous direct Fourier transform (DFT) image reconstruction method, which consumes the majority of the computational time for target image reconstruction. The partial derivatives in the Taylor series are computed using the fast Fourier transform (FFT) of the entire image, following the principles of Fourier transform theory. To examine the impact of different orders of Taylor series expansion on accuracy and efficiency, we employ third- and fourth-order Taylor series image reconstruction methods and compare them with the DFT image reconstruction method through simulated experiments. As a result of these enhancements, the computational efficiency using the third- and fourth-order Taylor series improves by factors of 57 and 46, respectively, compared to the previous method. In terms of analysis accuracy, within a strain range of 0–0.1 and without the addition of image noise, the accuracy of the proposed method increases with higher expansion orders, surpassing that of the DFT image reconstruction method when the fourth order is utilized. However, when different levels of Gaussian noise are applied to the simulated images individually, the accuracy of both the third- and fourth-order Taylor series expansion methods is superior to that of the DFT reconstruction method. Finally, we present the analyzed experimental results of a silicone rubber plate specimen with bilateral cracks under uniaxial tension.
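The idea of evaluating a shifted signal from a Taylor series whose derivatives come from a Fourier transform can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the SDIC algorithm itself: it uses a periodic 1-D signal, a uniform sub-sample shift `u`, and a naive O(n²) DFT from the standard library in place of an FFT. The names `spectral_derivative` and `taylor_shift` are hypothetical.

```python
import cmath
import math

def dft(values):
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

def spectral_derivative(samples, order):
    """Differentiate a periodic signal by multiplying its spectrum by (i*omega)**order."""
    n = len(samples)
    scaled = []
    for k, coeff in enumerate(dft(samples)):
        freq = k if k < n / 2 else k - n      # signed integer frequency
        omega = 2 * math.pi * freq / n        # radians per sample
        # The Nyquist bin gets no special treatment; fine for band-limited inputs.
        scaled.append(coeff * (1j * omega) ** order)
    return [v.real for v in idft(scaled)]

def taylor_shift(samples, u, max_order=3):
    """Approximate f(x + u) by a Taylor series truncated at max_order."""
    result = list(samples)
    for order in range(1, max_order + 1):
        deriv = spectral_derivative(samples, order)
        weight = u ** order / math.factorial(order)
        result = [r + weight * d for r, d in zip(result, deriv)]
    return result

n = 32
signal = [math.sin(2 * math.pi * j / n) for j in range(n)]
shifted = taylor_shift(signal, 0.5, max_order=4)
```

The derivatives are computed once for the whole signal and then reused for any shift, which is the source of the speed-up over evaluating a Fourier sum separately at every shifted position.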
Funding: Funded by the National Natural Science Foundation of China (Nos. 62275216 and 61775181), the Natural Science Basic Research Programme of Shaanxi Province, Major Basic Research Special Project (Nos. S2018-ZC-TD-0061 and TZ0393), and the Special Project for the Development of National Key Scientific Instruments and Equipment (No. 51927804).
Abstract: Deep learning is capable of greatly promoting the progress of super-resolution imaging technology in terms of imaging and reconstruction speed, imaging resolution, and imaging flux. This paper proposes a deep neural network based on a generative adversarial network (GAN). The generator employs a U-Net-based network, which integrates DenseNet for the downsampling component. The proposed method has several useful properties: the network model is trained with several different datasets of biological structures; the trained model can improve the imaging resolution of different microscopy imaging modalities such as confocal imaging and wide-field imaging; and the model demonstrates a generalized ability to improve the resolution of different biological structures, even those outside the training datasets. In addition, experimental results showed that the method improved the resolution of caveolin-coated pits (CCPs) structures from 264 nm to 138 nm, a 1.91-fold increase, and nearly doubled the resolution of DNA molecules imaged while being transported through microfluidic channels.
Abstract: Underwater images often have biased colours and reduced contrast because of absorption and scattering effects when light propagates in water. Such degraded images cannot meet the needs of underwater operations. The main problem with classic underwater image restoration or enhancement methods is that they require long calculation times, and often the colour or contrast of the resulting images is still unsatisfactory. Instead of using the complicated physical model of underwater imaging degradation, we propose a new method to deal with underwater images by imitating the colour constancy mechanism of human vision using double-opponency. Firstly, the original image is converted to the LMS space. Then the signals are linearly combined, and Gaussian convolutions are performed to imitate the function of receptive fields (RFs). Next, two RFs with different sizes work together to constitute the double-opponency response. Finally, the underwater light is estimated to correct the colours in the image. Further contrast stretching on the luminance is optional. Experiments show that the proposed method can obtain clarified underwater images with higher quality than before, and that it takes significantly less time than other previously published typical methods.
Funding: National Natural Science Foundation of China, Grant/Award Number: 62274142; Hangzhou Major Technology Innovation Project of Artificial Intelligence, Grant/Award Number: 2022AIZD0060.
Abstract: Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in achieving this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning, and cross attention, often encounter challenges when it comes to producing fine texture details. These approaches either suffer from limitations in accurately estimating appearance flows due to the lack of a global receptive field, or can only perform cross-domain alignment on high-level feature maps with small spatial dimensions, since the computational complexity increases quadratically with larger feature sizes. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for ensuring reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method named Multi-scale Cross-domain Alignment (MCA) is proposed. Firstly, MCA adopts a global context aggregation transformer to model multi-scale interaction between pose and appearance inputs, which employs pair-wise window-based cross attention. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to effectively conduct warping and fusing for the final transformed person image generation. The proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of the approach.
Funding: National Natural Science Foundation of China (12101204, 12261021); Heilongjiang Provincial Natural Science Foundation of China (LH2023A018); Modern Numerical Method Course for Research Program on Teaching Reform of Degree and Postgraduate Education of Heilongjiang University (2024).
Abstract: Image reconstruction in electrical impedance tomography (EIT) is a nonlinear and ill-posed inverse problem, and the imaging results are easily affected by measurement noise, so regularization methods are needed to solve it. The iterative regularization method has become a focus of research due to its ease of implementation. To deal with the ill-posed and ill-conditioned problems in image reconstruction, the inexact Newton-Landweber iterative method is considered and Nesterov's acceleration strategy is introduced. A Nesterov-type accelerated version of the inexact Newton-Landweber iteration is presented to determine the conductivity distribution inside an object from electrical measurements made on the surface. In order to further optimize the acceleration, both the steepest descent step-length and the minimal error step-length are exploited during the iterative image reconstruction process. Landweber iteration and its accelerated version are also implemented for comparison. All algorithms are terminated by the discrepancy principle. Finally, numerical simulations are reported to verify the remarkable acceleration efficiency of the proposed method.
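The ingredients named above (Landweber iteration, Nesterov-type extrapolation, and stopping by the discrepancy principle) can be sketched on a small linear problem. This is a toy illustration under stated assumptions, not the inexact Newton-Landweber method for the nonlinear EIT problem: it uses a fixed step size rather than the steepest descent or minimal error step-lengths, and the extrapolation weight `(k - 1) / (k + alpha - 1)`, the function name `landweber`, and the parameters `tau`, `alpha`, and `delta` are illustrative choices.

```python
def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def norm(vector):
    return sum(c * c for c in vector) ** 0.5

def landweber(A, y, step, delta, tau=1.1, max_iter=10000, nesterov=False, alpha=3.0):
    """Solve A x = y iteratively; stop once ||A x - y|| <= tau * delta."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    x_prev = list(x)
    for k in range(1, max_iter + 1):
        if nesterov:
            # Extrapolate from the last two iterates before the gradient step.
            w = (k - 1) / (k + alpha - 1)
            z = [xi + w * (xi - xp) for xi, xp in zip(x, x_prev)]
        else:
            z = x
        residual = [yi - ai for yi, ai in zip(y, matvec(A, z))]
        x_prev = x
        x = [zi + step * gi for zi, gi in zip(z, matvec(At, residual))]
        # Discrepancy principle: stop at the assumed noise level delta.
        if norm([yi - ai for yi, ai in zip(y, matvec(A, x))]) <= tau * delta:
            return x, k
    return x, max_iter

A = [[2.0, 0.0], [0.0, 1.0]]
y = [2.0, 1.0]                      # exact data for the solution [1, 1]
x_plain, iters_plain = landweber(A, y, step=0.2, delta=1e-6)
x_fast, iters_fast = landweber(A, y, step=0.2, delta=1e-6, nesterov=True)
```

Whether the accelerated variant needs fewer iterations depends on the conditioning of the problem and the step size, so the two iteration counts are returned for comparison rather than assumed.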
Abstract: A new method to accelerate the convergence rate of the space-alternating generalized expectation-maximization (SAGE) algorithm is proposed. The new rescaled block-iterative SAGE (RBI-SAGE) algorithm combines the RBI algorithm with the SAGE algorithm for PET image reconstruction. In the new approach, the projection data is partitioned into disjoint blocks; each iteration step involves only one of these blocks. SAGE updates the parameters sequentially in each block. In experiments, the RBI-SAGE algorithm and the classical SAGE algorithm are compared in their application to positron emission tomography (PET) image reconstruction. Simulation results show that RBI-SAGE has better performance than SAGE in both convergence and image quality.
Funding: Supported by the National Basic Research Program of China ("973" Program) (2006CB601201).
Abstract: In high-resolution cone-beam computed tomography (CBCT) using a flat-panel detector, imperfect or defective detector elements cause ring artifacts due to the non-uniformity of their X-ray response. These artifacts often degrade the image quality. A dedicated fitting correction method for high-resolution micro-CT is presented. The method converts each elementary X-ray response curve to an average one, and eliminates response inconsistency among pixels. Other factors of the method are discussed, such as the variability of the correction factor across different sampling frames and nonlinear factors over the whole spectrum. Results show that both noise and artifacts are reduced in the reconstructed images.
Funding: The National Natural Science Foundation of China (No. 51575256), the Fundamental Research Funds for the Central Universities (No. NP2015101, XZA16003), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Abstract: To improve spectral X-ray CT reconstructed image quality, the energy-weighted reconstructed image xbins^W and the separable paraboloidal surrogates (SPS) algorithm are proposed for prior image constrained compressed sensing (PICCS)-based spectral X-ray CT image reconstruction. PICCS-based image reconstruction takes advantage of compressed sensing theory, a prior image, and an optimization algorithm to improve the image quality of CT reconstructions. To evaluate the performance of the proposed method, three optimization algorithms and three prior images are employed and compared in terms of reconstruction accuracy and noise characteristics of the reconstructed images in each energy bin. The simulation results show that the image xbins^W is in general the best prior image with respect to the three optimization algorithms, and that the SPS algorithm offers the best performance for the simulated phantom with respect to the three prior images. Compared with filtered back-projection (FBP), PICCS via the SPS algorithm with xbins^W as the prior image reduces the noise in the reconstructed images by up to 80.46%, 82.51%, and 88.08% in each energy bin, respectively. Meanwhile, the root-mean-squared error in each energy bin is decreased by 15.02%, 18.15%, and 34.11%, and the correlation coefficient is increased by 9.98%, 11.38%, and 15.94%, respectively.
Abstract: This paper discusses the relationship between the original image and the reconstructed image when errors are mixed into the DCT data of the original image during transmission. Based on this relationship, wrongly reconstructed blocks can be found, the blocks most likely to contain errors can be located, and the errors can be eliminated. The method needs no channel error protection, no verification bits, and no extra bandwidth. Experimental results are provided at the end.
Funding: The National Natural Science Foundation of China (No. 61962032).
Abstract: In order to improve the reconstruction accuracy of magnetic resonance imaging (MRI), an accurate natural image compressed sensing (CS) reconstruction network is proposed, which combines the advantages of model-based and deep learning-based CS-MRI methods. In theory, enhancing geometric texture details in linear reconstruction is possible. First, the optimization problem is decomposed into two problems: linear approximation and geometric compensation. The linear approximation problem is handled by a data consistency module. Since this processing loses texture details, a neural network layer that explicitly combines image and frequency feature representations is proposed, which is named the butterfly dilated geometric distillation network. The network introduces the idea of the butterfly operation, integrates the features of the image domain and the frequency domain, and avoids the loss of texture details that occurs when extracting features in a single domain. Finally, a channel feature fusion module is designed by combining a channel attention mechanism and dilated convolution. The channel attention makes the final output feature map focus on the more important parts, thus improving the feature representation ability. The dilated convolution enlarges the receptive field, thereby obtaining denser image feature data. The experimental results show that the peak signal-to-noise ratio of the network is 5.43 dB, 5.24 dB, and 3.89 dB higher than that of ISTA-Net+, FISTA, and DGDN networks, respectively, on the brain data set with a Cartesian sampling mask at a CS ratio of 10%.
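A data consistency step of the kind mentioned above is often implemented by overwriting the estimate's k-space values with the measured ones wherever a sample was acquired. The sketch below assumes that hard-replacement form on a noiseless 1-D signal and uses a naive DFT from the standard library; the function name `data_consistency`, the mask, and the toy signal are hypothetical, and the abstract does not specify which form the authors' module takes.

```python
import cmath
import math

def dft(values):
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

def data_consistency(estimate, measured_kspace, mask):
    """Keep measured k-space samples where mask is True, the estimate's elsewhere."""
    k_est = dft(estimate)
    k_dc = [m if keep else e for e, m, keep in zip(k_est, measured_kspace, mask)]
    return idft(k_dc)

# Toy 1-D "image" and an undersampling mask that keeps three low frequencies.
truth = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0]
mask = [True, True, False, False, False, False, False, True]
measured = [k if keep else 0j for k, keep in zip(dft(truth), mask)]

estimate = [0.0] * len(truth)          # stand-in for a network's current output
corrected = data_consistency(estimate, measured, mask)
```

After the step, the corrected signal agrees exactly with the acquired measurements, and only the unsampled frequencies are left for the learned part of the network to fill in.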
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62205367 and 62141506), the Suzhou Basic Research Pilot Project (Grant Nos. SSD2023006 and SJC2021013), and the Jiangsu Provincial Key Research and Development Program (Grant No. BE2020664).
Abstract: Super-resolution structured illumination microscopy (SR-SIM) relies heavily on post-processing reconstruction to obtain high-quality SR images from raw data. Although many SIM reconstruction algorithms have been developed to recover fine cellular structures with high fidelity even from noisy data, whether the pixel intensities of reconstructed SR images are still proportional to the original fluorescence intensity has been less explored. The linearity between the intensity before and after reconstruction is defined as the intensity fidelity. Here, we propose a method to evaluate the intensity fidelity of reconstructed SR images at different spatial frequencies. With the proposed metric, we systematically investigated the impact of the key factors on the intensity fidelity in standard Wiener-SIM reconstructions with simulated data, and then evaluated the intensity fidelity of SR images reconstructed by representative open-source packages. Our work provides a reference for improving SR-SIM image intensity fidelity.