Abstract: Deep learning-based semantic communication has achieved remarkable progress with CNNs and Transformers. However, CNNs exhibit constrained performance in high-resolution image transmission, while Transformers incur high computational cost due to quadratic complexity. Recently, VMamba, a novel state space model with linear complexity and exceptional long-range dependency modeling capabilities, has shown great potential in computer vision tasks. Inspired by this, we propose MNTSCC, an efficient VMamba-based nonlinear joint source-channel coding (JSCC) model for wireless image transmission. Specifically, MNTSCC comprises a VMamba-based nonlinear transform module, an MCAM entropy model, and a JSCC module. In the encoding stage, the input image is first encoded into a latent representation via the nonlinear transform module, which is then processed by the MCAM for source distribution modeling. The JSCC module then optimizes transmission efficiency by adaptively assigning a transmission rate to the latent representation according to the estimated entropy values. The proposed MCAM enhances the channel-wise autoregressive entropy model with attention mechanisms, which enables the entropy model to capture both global and local information within latent features, yielding more accurate entropy estimation and improved rate-distortion performance. Additionally, to further enhance the robustness of the system under varying signal-to-noise ratio (SNR) conditions, we incorporate an SNR adaptive net (SAnet) into the JSCC module, which dynamically adjusts the encoding strategy by integrating SNR information with latent features, thereby improving SNR adaptability. Experimental results across datasets of diverse resolutions demonstrate that the proposed method achieves superior image transmission performance compared to existing CNN- and Transformer-based semantic communication models, while maintaining competitive computational efficiency. In particular, under an additive white Gaussian noise (AWGN) channel with SNR = 10 dB and a channel bandwidth ratio (CBR) of 1/16, MNTSCC consistently outperforms NTSCC, achieving a peak signal-to-noise ratio (PSNR) gain of 1.72 dB on the Kodak24 dataset, 0.79 dB on CLIC2022, and 2.54 dB on CIFAR-10, while reducing computational cost by 32.23%. The code is available at https://github.com/WanChen10/MNTSCC (accessed on 09 July 2025).
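The evaluation quantities named in this abstract (AWGN channel at a given SNR, CBR, PSNR) can be illustrated with a minimal sketch. It is not taken from the MNTSCC code base: the function names are hypothetical, the channel is simplified to real-valued symbols (JSCC papers usually count complex channel uses), and the CBR is assumed to follow the common definition of channel symbols divided by source dimensions (height x width x color channels).

```python
import math
import random


def awgn(symbols, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db.

    Assumption: real-valued symbols; the noise variance is derived from the
    measured average power of the transmitted symbols.
    """
    signal_power = sum(s * s for s in symbols) / len(symbols)
    noise_variance = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_variance)
    return [s + rng.gauss(0.0, sigma) for s in symbols]


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def channel_bandwidth_ratio(num_channel_symbols, height, width, channels=3):
    """CBR under the assumed definition: channel symbols per source dimension."""
    return num_channel_symbols / (height * width * channels)


if __name__ == "__main__":
    rng = random.Random(0)
    transmitted = [1.0, -1.0] * 5000
    received = awgn(transmitted, snr_db=10, rng=rng)
    noise_power = sum((r - t) ** 2 for r, t in zip(received, transmitted)) / len(transmitted)
    print("empirical SNR (dB):", 10 * math.log10(1.0 / noise_power))
    print("CBR for 192 symbols on a 32x32 RGB image:", channel_bandwidth_ratio(192, 32, 32))
```

Under these assumptions, a 32x32 RGB image sent at CBR = 1/16 would occupy 192 channel symbols.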
Funding: Supported by the National Natural Science Foundation of China (69983005).
Abstract: This paper presents a class of nonlinear adaptive wavelet transforms for lossless image compression. In the update step of the lifting scheme, different operators are chosen according to the local gradient of the original image. A nonlinear morphological predictor follows the adaptive update step and yields fewer large wavelet coefficients near edges, which reduces the coding cost. The nonlinear adaptive wavelet transforms also allow perfect reconstruction without any overhead cost. Experimental results show that the adaptively transformed images have lower entropy than in the non-adaptive case, indicating strong potential for lossless image compression.
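The update-then-predict structure described here can be sketched in one dimension. This is an illustrative simplification, not the paper's transform: the update operator is switched on a gradient computed from the odd samples only (an assumption that lets the decoder recompute the decision once it has recovered them, so no side information is needed), and a three-tap median stands in for the morphological predictor. All names and the threshold are hypothetical.

```python
def _update_term(left_odd, right_odd, threshold):
    """Adaptive update: smooth in flat regions, leave samples untouched near edges."""
    if abs(right_odd - left_odd) < threshold:
        return (left_odd + right_odd) // 4
    return 0


def _median_predict(approx, i):
    """Nonlinear (median) prediction of odd sample i from neighboring approximations."""
    last = len(approx) - 1
    window = (approx[max(i - 1, 0)], approx[i], approx[min(i + 1, last)])
    return sorted(window)[1]


def forward(signal, threshold=8):
    """Integer update-first lifting; returns (approximation, detail) coefficients."""
    assert len(signal) % 2 == 0, "sketch assumes an even-length signal"
    even, odd = list(signal[0::2]), list(signal[1::2])
    n = len(even)
    approx = [even[i] + _update_term(odd[max(i - 1, 0)], odd[i], threshold)
              for i in range(n)]
    detail = [odd[i] - _median_predict(approx, i) for i in range(n)]
    return approx, detail


def inverse(approx, detail, threshold=8):
    """Undo the predict step first, then recompute the decisions and undo the update."""
    n = len(approx)
    odd = [detail[i] + _median_predict(approx, i) for i in range(n)]
    even = [approx[i] - _update_term(odd[max(i - 1, 0)], odd[i], threshold)
            for i in range(n)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


if __name__ == "__main__":
    x = [10, 12, 11, 13, 80, 82, 81, 79]
    a, d = forward(x)
    assert inverse(a, d) == x  # lossless round trip
    print("approximation:", a, "detail:", d)
```

Because every step uses integer arithmetic and the decoder re-derives each decision from data it has already reconstructed, the round trip is exact.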
Funding: Supported by the National Natural Science Foundation of China (Nos. 61640412 and 61762052), the Natural Science Foundation of Jiangxi Province (No. 20192BAB207021), and the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ170633 and GJJ170632).
Abstract: Remote sensing image registration remains a challenging task owing to the significant influence of nonlinear differences between remote sensing images. To address this problem, this paper proposes a novel feature-based approach to remote sensing image registration. There are two key contributions: 1) an improved strategy of composite nonlinear diffusion filtering that depends on the scale factors in multi-scale space, and 2) a multi-scale pyramid space with gradually decreasing resolution. In addition, binary code strings serve as feature descriptors to improve matching efficiency. Extensive experiments on feature extraction and feature registration are performed on different categories of remote sensing image datasets. The experimental results demonstrate the superiority of the proposed scheme over other classical algorithms in terms of correct matching ratio, accuracy, and computational efficiency.
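Binary code strings are cheap to match because comparing two of them reduces to an XOR followed by a bit count (the Hamming distance). The sketch below shows nearest-neighbor matching with a cross-check under that idea. It is only an illustration: the descriptor values, the `max_distance` threshold, and the function names are hypothetical, and the abstract does not specify the paper's actual matching strategy.

```python
def hamming(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def match_descriptors(desc_a, desc_b, max_distance=2):
    """Mutual nearest-neighbor matching of binary descriptors.

    Returns (index_in_a, index_in_b, distance) for pairs that are each
    other's closest descriptor and differ in at most max_distance bits.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Closest descriptor in the second image.
        j = min(range(len(desc_b)), key=lambda k: hamming(da, desc_b[k]))
        # Cross-check: the closest descriptor back in the first image must be i.
        i_back = min(range(len(desc_a)), key=lambda k: hamming(desc_a[k], desc_b[j]))
        distance = hamming(da, desc_b[j])
        if i_back == i and distance <= max_distance:
            matches.append((i, j, distance))
    return matches


if __name__ == "__main__":
    image_a = [0b10110010, 0b01001101, 0b11110000]
    image_b = [0b01001111, 0b10110011, 0b00001111]
    print(match_descriptors(image_a, image_b))
```

The third descriptor in `image_a` has no close counterpart, so the cross-check and distance threshold leave it unmatched.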
Funding: Project supported by the Scientific Research Project of Liaoning Provincial Department of Education, China (No. JYTMS20231039) and the Liaoning Provincial Educational Science Planning Project, China (No. JG22CB252).
Abstract: Reversible data hiding in encrypted images (RDHEI) is essential for safeguarding sensitive information within the encrypted domain. In this study, we propose an intelligent pixel predictor based on a residual group block and a spatial attention module, which shows superior pixel prediction performance compared to existing predictors. Additionally, we introduce an adaptive joint coding method that leverages bit-plane characteristics and intra-block pixel correlations to maximize embedding space, outperforming single coding approaches. The image owner uses the intelligent predictor to predict the original image and then encrypts it through additive secret sharing before sending the encrypted image to the data hiders. The data hiders then encrypt the secret data and embed them in the encrypted image before transmitting the image to the receiver. The receiver can extract the secret data and recover the original image losslessly, and data extraction and image recovery are separable. The approach combines an intelligent predictor with additive secret sharing, achieving reversible data embedding and extraction while ensuring security and lossless recovery. Experimental results demonstrate that the predictor performs well and that the scheme offers substantial embedding capacity. For the Lena image, the number of prediction errors within the range [-5, 5] is as high as 242500, and our predictor achieves an embedding capacity of 4.39 bpp.
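The encryption step relies on additive secret sharing, whose core idea fits in a few lines. The sketch below assumes 8-bit pixels and arithmetic modulo 256, with each pixel split into `num_shares` random-looking values that sum back to the original. The function names and the share count are hypothetical, and the abstract does not give the paper's exact construction.

```python
import secrets

MODULUS = 256  # assumption: 8-bit grayscale pixels


def share_pixel(pixel, num_shares):
    """Split one pixel into num_shares values whose sum mod 256 equals the pixel."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_shares - 1)]
    shares.append((pixel - sum(shares)) % MODULUS)
    return shares


def share_image(pixels, num_shares):
    """Return num_shares encrypted images (flat pixel lists), one per share holder."""
    per_pixel = [share_pixel(p, num_shares) for p in pixels]
    return [[s[k] for s in per_pixel] for k in range(num_shares)]


def recover_image(share_images):
    """Add the shares position by position to recover the original pixels exactly."""
    return [sum(column) % MODULUS for column in zip(*share_images)]


if __name__ == "__main__":
    original = [0, 17, 128, 255]
    encrypted = share_image(original, num_shares=3)
    assert recover_image(encrypted) == original  # lossless recovery
    print(encrypted)
```

Any single share is uniformly random on its own, while the full set restores the image without loss, which is the property the lossless-recovery claim depends on.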