Image deblurring network combining Mamba and snake-like convolution (融合Mamba与蛇形卷积的图像去模糊网络)


Abstract

Objective: Transformers have difficulty recovering image details accurately during deblurring. To address this, the paper proposes MSNet (Mamba snake convolution network), an image deblurring network that combines the Mamba model with snake convolution.

Method: First, the Mamba framework is combined with snake convolution to form the snake state-space module (SSSM). SSSM adjusts the shape and path of the convolution kernel so that it adapts dynamically to local image features and aligns the convolution direction with different blur-streak patterns. Second, a direction scan module (DSM) scans the features in multiple directions to capture long-range dependencies in the image, and a discrete state-space equation merges the structural information from these directions, which strengthens the model's ability to capture global structure. Finally, snake channel attention (SCA) uses a gated design to select and reweight blur information, so that key details are preserved while blur is removed.

Result: On the GoPro and HIDE datasets, compared with mainstream convolutional neural network (CNN) and Transformer deblurring methods, MSNet improves peak signal-to-noise ratio (PSNR) by 1.2% and 1.9% and structural similarity (SSIM) by 0.6% and 0.7%, respectively.

Conclusion: The proposed method effectively removes image blur produced in complex scenes and restores details.

Extended abstract

Objective: Image deblurring methods based on convolutional neural networks (CNNs) and Transformers have substantially improved deblurring performance. They remain constrained, however, by high computational demands and by limited ability to restore intricate image details. Under complex conditions involving motion blur or high-frequency details, existing approaches often rely on fixed convolution kernels or global self-attention. Such static designs cannot adapt to diverse types of blur, which leads to suboptimal detail recovery and inadequate reconstruction of global image structure. Transformer-based deblurring methods also frequently require extensive computational resources, which makes them hard to deploy on mobile devices or embedded systems and restricts their adoption in real-world applications. To address these challenges, this study proposes a novel image deblurring method termed MSNet. MSNet integrates the efficient state-space modeling of the Mamba framework with snake convolution so that the two techniques complement each other. The approach aims to reduce computational overhead while recovering fine image details and structural information with high fidelity. With this adaptability and efficiency, MSNet is better suited to practical applications and to complex deblurring tasks across diverse scenarios.

Method: MSNet integrates three key modules: the snake state-space module (SSSM), the direction scan module (DSM), and the snake channel attention module (SCA). Each module serves a specific purpose, and together they handle local detail recovery and global structure restoration. The SSSM combines the Mamba framework with snake convolution to improve the model's ability to capture subtle blur features. Unlike traditional CNN-based methods that rely on fixed convolution kernels, SSSM dynamically adjusts the shape and path of the convolution kernels, which allows them to adapt to local image features and blur-streak patterns. Snake convolution alters the convolution path to capture local blur features, while the Mamba framework uses state-space models to process long-range dependencies with linear computational complexity. In contrast to Transformer-based models, whose self-attention carries a high computational cost, Mamba captures long-range dependencies in the image more efficiently. At the same time, snake convolution sharpens the precision with which the network adapts to local image features, which is a notable advantage for complex motion blur and fine-detail blur. The DSM transforms image features into a one-dimensional sequence and scans them in multiple directions (diagonal, horizontal, and vertical) to capture long-range dependencies. This improves global structure restoration, particularly in scenes where objects move in several directions at once. The SCA module uses a gating mechanism to filter and reweight blurred information. By combining snake convolution with channel attention, it lets the model adjust the weights of different features dynamically, prioritizing key image details while suppressing irrelevant blur information. This selective focus improves detail recovery and overall deblurring performance.

Result: To validate MSNet, comparative and ablation experiments were conducted on two widely used image deblurring benchmarks, GoPro and HIDE, against several commonly used deblurring methods. On the GoPro dataset, MSNet achieved significant improvements in PSNR and SSIM over Transformer-based and CNN-based methods. It restored blurred regions more accurately, which addresses the limitations of existing methods in complex scenes and highlights the capability of MSNet to process images with intricate details and challenging blur conditions. On the HIDE dataset, MSNet also outperformed Transformer-based and CNN-based methods, achieving higher PSNR and SSIM scores and deblurring fine textual and facial details accurately. Its adaptive convolution design and multidirectional scanning gave it strong robustness and generalization, so it is well suited to complex and dynamic scenarios. MSNet was also computationally efficient: its computational cost on the GoPro dataset was 63.7 GFLOPs, significantly lower than that of MIMO-UNet and the other compared methods. This balance of deblurring performance and low computational cost makes MSNet a strong candidate for real-time deblurring in resource-constrained environments. Ablation studies further validated the contributions of the key modules. Removing the SSSM or the SCA module led to a significant drop in PSNR, with the greatest decrease occurring when both modules were removed. These findings highlight the critical role of the two modules in improving deblurring accuracy and restoring fine image details. In addition, network depth analysis showed that MSNet-28 (28 layers) achieved the best performance, with a PSNR of 33.51 dB and an SSIM of 0.97, which confirms the importance of tuning network depth and module design.

Conclusion: MSNet performs well across multiple datasets, combining deblurring accuracy and detail recovery with a good balance in computational efficiency. By incorporating the state-space model of the Mamba framework and the flexibility of snake convolution, MSNet handles long-range dependencies efficiently and adapts better to complex blur scenarios. The ablation experiments validate the importance of each module, with the SSSM and SCA modules playing key roles in detail recovery and global structure reconstruction. Overall, MSNet offers strong generalization, efficient computation, and superior detail recovery in deblurring tasks.
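The multidirectional scanning and linear-time state-space recurrence described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-channel feature map stored as nested lists, a scalar recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t with fixed parameters (Mamba itself uses learned, input-dependent parameters), and a plain average to merge the directions. The names scan_orders, ssm_scan, and multi_direction_ssm are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[int, int]
Grid = List[List[float]]


def scan_orders(height: int, width: int) -> Dict[str, List[Coord]]:
    """Return the pixel visiting order for three scan directions."""
    horizontal = [(r, col) for r in range(height) for col in range(width)]
    vertical = [(r, col) for col in range(width) for r in range(height)]
    # Diagonal scan: walk the anti-diagonals (constant r + col), top to bottom.
    diagonal = sorted(horizontal, key=lambda rc: (rc[0] + rc[1], rc[0]))
    return {"horizontal": horizontal, "vertical": vertical, "diagonal": diagonal}


def ssm_scan(xs: Sequence[float], a: float = 0.5, b: float = 1.0,
             c: float = 1.0) -> List[float]:
    """Run h_t = a*h_{t-1} + b*x_t, y_t = c*h_t over a 1-D sequence.

    One pass over the sequence, so the cost is linear in its length.
    The scalar parameters a, b, c are illustrative assumptions.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def multi_direction_ssm(grid: Grid, a: float = 0.5, b: float = 1.0,
                        c: float = 1.0) -> Grid:
    """Scan the grid in each direction, then average the results per pixel.

    Averaging is an assumed merge rule chosen for simplicity.
    """
    height, width = len(grid), len(grid[0])
    orders = scan_orders(height, width)
    out = [[0.0] * width for _ in range(height)]
    for order in orders.values():
        sequence = [grid[r][col] for r, col in order]
        outputs = ssm_scan(sequence, a, b, c)
        # Scatter each output back to the pixel it was read from.
        for (r, col), y in zip(order, outputs):
            out[r][col] += y / len(orders)
    return out


if __name__ == "__main__":
    # A single bright pixel: its influence decays along every scan path.
    impulse = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for row in multi_direction_ssm(impulse):
        print([round(v, 4) for v in row])
```

Because each direction visits the pixels in a different order, a pixel that is far from the source along one scan path may be close along another, which is one way to read the abstract's claim that combining directions helps recover global structure.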

Authors: Qiu Yunfei (邱云飞); Liu Zeyan (刘则延); Wang Maohua (王茂华) (School of Software, Liaoning Technical University, Huludao 125105, China; College of Information Engineering, Liaoning Institute of Science and Engineering, Jinzhou 121000, China; School of Computer and Cyberspace Security, Fujian Normal University, Fuzhou 350117, China)

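Affiliations: School of Software, Liaoning Technical University; College of Information Engineering, Liaoning Institute of Science and Engineering; School of Computer and Cyberspace Security, Fujian Normal University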

Source: Journal of Image and Graphics (《中国图象图形学报》), Peking University Core Journal, 2025, No. 10, pp. 3187-3198 (12 pages)

Funding: National Natural Science Foundation of China (62173171); Natural Science Foundation of Liaoning Province (2015020095).

Keywords: image deblurring; Mamba model; direction scan module (DSM); snake convolution; snake channel attention (SCA)

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]


