Abstract
With the development of deep learning, semantic segmentation methods based on CNNs and Transformers have been widely applied in remote sensing. However, these methods still have limitations: the former lack long-range modeling capability, while the latter are constrained by computational complexity. Recently, the visual state space (VSS) model introduced by Mamba has shown that long-range relationships can be modeled effectively with linear computation. Inspired by this, the paper proposes a semantic segmentation network for remote sensing images based on CNN and visual state space to overcome the limitations of existing methods. First, an architecture consisting of a CNN branch and a VSS branch is constructed to extract multi-scale features in parallel, mining local correlations and capturing long-range contextual dependencies; VSS is also used in place of the Transformer in the decoder. Second, a co-modulation module is designed to learn spatial weights that modulate the features, adaptively fusing the semantic information of the two branches and strengthening the dependencies between them. Finally, an additional auxiliary head is used to optimize the network, with an auxiliary loss function guiding the model to focus more on key regions during training. The method is validated on the LoveDA and Vaihingen datasets, where it achieves mF1 scores of 69.61% and 90.53% and mIoU scores of 53.95% and 83.13%, respectively. The experimental results show that the proposed model outperforms other segmentation models on these two public datasets.
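The abstract reports results as mF1 and mIoU. As a reading aid, the sketch below shows how these two metrics are commonly computed from a class confusion matrix. It is not taken from the paper: the function names, the toy labels, and the choice to skip classes that never occur are assumptions made for illustration, and the paper's exact evaluation protocol (for example, which classes are included) is not stated in this record.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou_and_f1(cm):
    """Return (mIoU, mF1) averaged over classes present in truth or prediction."""
    ious, f1s = [], []
    n = len(cm)
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # truth c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp     # predicted c, truth otherwise
        if tp + fp + fn == 0:
            continue  # assumption: a class absent everywhere is skipped
        ious.append(tp / (tp + fp + fn))              # IoU = TP / (TP + FP + FN)
        f1s.append(2 * tp / (2 * tp + fp + fn))       # F1 = 2TP / (2TP + FP + FN)
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: six flattened pixel labels, three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou, mf1 = mean_iou_and_f1(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU={miou:.4f}, mF1={mf1:.4f}")
```

In practice the label lists would be the flattened ground-truth and predicted masks accumulated over the whole test set, so that the averages are taken over dataset-level counts rather than per image.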
Authors
张仕洁
张斌
赵文豪
Zhang Shijie; Zhang Bin; Zhao Wenhao (Hubei Province Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan 430205, China; School of Computer Science & Engineering, Wuhan Institute of Technology, Wuhan 430205, China)
Source
《计算机应用研究》
Peking University Core Journal (北大核心)
2025, No. 5, pp. 1583-1588 (6 pages)
Application Research of Computers
Funding
Supported by the Natural Science Foundation of Hubei Province (2022CFCO31).
Keywords
remote sensing
semantic segmentation
visual state space
CNN
feature fusion