No-reference image quality assessment algorithm based on saliency features and cross-attention mechanism
Abstract: Image data in real business scenarios typically has rich content and complex distortions, which makes generalization a major challenge for objective Image Quality Assessment (IQA) algorithms. To address this problem, a No-Reference IQA (NR-IQA) algorithm is proposed. It consists of three sub-networks: a Feature Extraction Network (FEN), a Feature Fusion Network (FFN), and an Adaptive Prediction Network (APN). First, the global view, local patches, and saliency view of a sample are fed together into the FEN, and a Swin Transformer extracts global distortion, local distortion, and saliency features. Second, cascaded Transformer encoders fuse the global and local distortion features and mine the latent correlation patterns between them. Inspired by the human visual attention mechanism, the FFN uses the saliency features to drive the attention module, so that the module pays additional attention to visually salient regions, which improves the semantic parsing ability of the algorithm. Finally, a dynamically constructed MultiLayer Perceptron (MLP) regression network computes the predicted score. Experimental results on mainstream synthetic and real-world distortion datasets show that, compared with the DSMix (Distortion-induced Sensitivity map-guided Mixed augmentation) algorithm, the proposed algorithm improves the Spearman Rank-order Correlation Coefficient (SRCC) by 4.3% on the TID2013 dataset and the Pearson Linear Correlation Coefficient (PLCC) by 1.4% on the KonIQ dataset. The proposed algorithm also demonstrates excellent generalization ability and interpretability, handles the complex distortions found in business scenarios effectively, and makes adaptive predictions according to the individual characteristics of each sample.
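The two evaluation metrics named in the abstract, SRCC and PLCC, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names (`pearson`, `average_ranks`, `spearman`) and the sample scores are hypothetical, and the sketch omits the nonlinear logistic fitting that IQA studies often apply to predictions before computing PLCC.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: model predictions and subjective scores for five images.
predicted = [0.21, 0.40, 0.55, 0.62, 0.90]
subjective = [1.2, 2.0, 2.9, 2.7, 4.6]
print("SRCC:", spearman(predicted, subjective))
print("PLCC:", pearson(predicted, subjective))
```

SRCC depends only on the ordering of the predictions, so it measures monotonic agreement with subjective scores, while PLCC measures linear agreement; this is why IQA results are usually reported with both.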
Authors: DENG Yang; ZHAO Tao; SUN Kai; TONG Tong; GAO Qinquan (School of Physics and Information Engineering, Fuzhou University, Fuzhou Fujian 350108, China; Fuzhou Branch, China Telecom Corporation Limited, Fuzhou Fujian 350005, China; Beijing Radio and Television Station, Beijing 100022, China; Fujian Imperial Vision Technology Group Company Limited, Fuzhou Fujian 350002, China)
Source: Journal of Computer Applications (Peking University Core Journal), 2025, Issue 12, pp. 3995-4003 (9 pages)
Funding: Fujian Province Artificial Intelligence Science, Technology and Economy Integration Service Platform Project ([2022]15).
Keywords: Image Quality Assessment (IQA); Human Visual System (HVS); visual attention; Salient Object Detection (SOD); attention mechanism