Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 62031013) and the Guangdong Province Key Construction Discipline Scientific Research Capacity Improvement Project (Grant No. 2022ZDJS117).
Abstract: Nonlinear transforms have significantly advanced learned image compression (LIC), particularly through the use of residual blocks. Such a transform enhances nonlinear expressive ability and obtains a compact feature representation by enlarging the receptive field, which indicates how the convolution process extracts features in a high-dimensional feature space. However, its functionality is restricted to the spatial dimension and network depth, which limits further improvements in network performance because information interaction and representation are insufficient. Crucially, the potential of high-dimensional feature space in the channel dimension, and the exploration of network width and resolution, remain largely untapped. In this paper, we consider nonlinear transforms from the perspective of feature space, defining high-dimensional feature spaces in different dimensions and investigating their specific effects. First, we introduce dimension-increasing and dimension-decreasing transforms in both the channel and spatial dimensions to obtain a high-dimensional feature space and achieve better feature extraction. Second, we design a channel-spatial fusion residual transform (CSR), which incorporates multi-dimensional transforms for a more effective representation. Furthermore, we simplify the proposed fusion transform to obtain a slim architecture (CSR-sm) that balances network complexity and compression performance. Finally, we build the overall network from stacked CSR transforms to achieve better compression and reconstruction. Experimental results demonstrate that the proposed method achieves superior rate-distortion performance compared with existing LIC methods and traditional codecs. Specifically, the proposed method achieves a 9.38% BD-rate reduction over VVC on the Kodak dataset.
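As a rough illustration of what a dimension-increasing and dimension-decreasing transform in the channel dimension could look like, the sketch below expands a per-pixel channel vector, applies a nonlinearity, projects back, and adds a residual connection. This is not the authors' CSR block: the function names, the ReLU nonlinearity, and the weight shapes are all assumptions made for illustration, and a real implementation would use 1x1 convolutions over full feature maps.

```python
def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu(vector):
    """Element-wise ReLU (assumed nonlinearity, not taken from the paper)."""
    return [max(0.0, v) for v in vector]


def channel_residual(x, w_up, w_down):
    """Hypothetical channel expand -> nonlinearity -> reduce, plus residual.

    x      : channel vector of one pixel, length C
    w_up   : (k*C) x C matrix, lifts x into a higher-dimensional space
    w_down : C x (k*C) matrix, projects back to C channels
    """
    hidden = relu(matvec(w_up, x))        # dimension-increasing step
    projected = matvec(w_down, hidden)    # dimension-decreasing step
    return [a + b for a, b in zip(x, projected)]  # residual connection


# Example with C = 2 channels expanded to 4.
x = [1.0, -2.0]
w_up = [[1, 0], [0, 1], [1, 1], [-1, -1]]
w_down = [[1, 0, 0, 0], [0, 0, 0, 2]]
y = channel_residual(x, w_up, w_down)
```

With an all-zero `w_down` the block reduces to the identity, which is the usual reason residual blocks are easy to stack.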
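Abstract: Existing deep learning algorithms for mural restoration suffer from insufficient global semantic consistency constraints and inadequate local feature extraction, so restored murals are prone to boundary artifacts and blurred details. To address these problems, a mural restoration method enhanced by a bidirectional autoregressive Transformer and fast Fourier convolution is proposed. First, a Transformer-based global semantic feature restoration module is designed: using a bidirectional autoregressive mechanism and masked language modeling (MLM), an improved multi-head attention global semantic mural restoration module is proposed to improve the restoration of global semantic features. Next, a global semantic enhancement module composed of gated convolutions and residual modules is constructed to strengthen the consistency constraint on global semantic features. Finally, a local detail restoration module is designed that uses large kernel attention (LKA) and fast Fourier convolution to improve the capture of detail features while reducing the loss of local detail information, which improves the consistency between the local and global features of the restored mural. Digital restoration experiments on Dunhuang murals show that the proposed algorithm achieves better restoration performance, and its objective evaluation metrics are all better than those of the compared algorithms.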
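Abstract: Existing super-resolution reconstruction methods based on convolutional neural networks are limited by their receptive field and struggle to fully exploit the rich contextual information and self-correlation of remote sensing images, which leads to poor reconstruction results. To address this problem, this paper proposes a remote sensing image super-resolution reconstruction method based on multi-distillation and Transformer (MDT). First, multi-distillation is combined with a dual attention mechanism to progressively extract multi-scale features from the low-resolution image and reduce feature loss. Next, a convolutional modulation Transformer is constructed to extract global image information and recover more complex texture details, improving the visual quality of the reconstructed image. Finally, a global residual path is added to the upsampling process to improve the efficiency of feature propagation through the network, effectively reducing image distortion and artifacts. Experiments on the AID and UCMerced datasets show that, on the 4x super-resolution task, the proposed method reaches a peak signal-to-noise ratio of up to 29.10 dB and a structural similarity of up to 0.7807; the quality of the reconstructed images is clearly improved, and details are better preserved visually.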
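For readers unfamiliar with the peak signal-to-noise ratio figure quoted above, the metric follows the standard definition PSNR = 10 * log10(MAX^2 / MSE). The sketch below computes it over flat lists of pixel values. The function name and the default peak value of 255 (8-bit images) are assumptions for illustration; the abstract does not state the evaluation protocol, such as the color channel or border cropping used.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    max_value is the peak pixel value (255 assumed here for 8-bit images).
    """
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Example: two pixels, each off by 10 gray levels.
value_db = psnr([100, 100], [110, 90])
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB.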