
Pseudo-label Denoising and SAM Optimization for Large-scale Unsupervised Semantic Segmentation
Abstract: Semantic segmentation enables fine-grained understanding of complex and diverse scenes and is a key technology for efficient, intelligent operation of unmanned systems. Large-scale unsupervised semantic segmentation aims to learn segmentation from large collections of unlabeled images. However, existing approaches suffer from noisy self-learned pseudo-labels, which exhibit category confusion and poor shape representation and lead to low final segmentation accuracy. To address this, we propose a Pseudo-label Denoising and SAM Optimization (PDSO) approach for large-scale unsupervised semantic segmentation. First, we design a denoising-based feature fine-tuning module: it selects samples likely to carry clean image-level pseudo-labels from the large-scale dataset using the small-loss criterion, then fine-tunes the pre-trained backbone network on them so that the network learns more robust category representations. To further reduce category noise in the pseudo-labels, we design a clustering-based sample denoising module that discards noisy samples that interfere with clustering, based on category proportion and the distances between samples and cluster centers, thereby improving clustering performance. We also design a SAM prompt optimization module that identifies the active categories in an image from clustering distances in order to filter out noisy targets, and then supplies points and boxes to SAM as target prompts to generate the expected target masks and refine target edges in the pseudo-labels. On the test sets of the large-scale semantic segmentation datasets ImageNet-S50, ImageNet-S300, and ImageNet-S919, PDSO reaches a mean intersection over union (mIoU) of 45.0%, 26.6%, and 14.5%, respectively, significantly improving the category accuracy and edge accuracy of the segmented targets.
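Two of the steps named in the abstract, small-loss sample selection and building point and box prompts from a pseudo-label mask, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `keep_ratio` parameter, and the choice of the foreground pixel nearest the centroid as the point prompt are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def select_small_loss(losses, keep_ratio=0.5):
    """Return sorted indices of the samples with the smallest losses.

    Assumption: a fixed fraction `keep_ratio` of samples is kept; the
    abstract only says a small-loss criterion is used.
    """
    losses = np.asarray(losses, dtype=float)
    n_keep = max(1, int(round(keep_ratio * losses.size)))
    # Stable argsort so ties are broken by original sample order.
    order = np.argsort(losses, kind="stable")
    return np.sort(order[:n_keep])


def mask_to_prompts(mask):
    """Derive a box (x0, y0, x1, y1) and a point (x, y) from a binary mask.

    Assumption: the point prompt is the foreground pixel closest to the
    mask centroid, which guarantees the point lies on the target.
    Returns None when the mask has no foreground pixels.
    """
    ys, xs = np.nonzero(np.asarray(mask))
    if ys.size == 0:
        return None
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    cy, cx = ys.mean(), xs.mean()
    nearest = int(np.argmin((ys - cy) ** 2 + (xs - cx) ** 2))
    point = (int(xs[nearest]), int(ys[nearest]))
    return box, point


if __name__ == "__main__":
    # Hypothetical per-sample losses from a classifier trained on pseudo-labels.
    clean_idx = select_small_loss([0.9, 0.1, 0.5, 0.2], keep_ratio=0.5)
    print("kept sample indices:", clean_idx.tolist())

    # Hypothetical pseudo-label mask for one target.
    demo_mask = np.zeros((5, 5), dtype=np.uint8)
    demo_mask[1:4, 2:4] = 1
    print("box, point:", mask_to_prompts(demo_mask))
```

The box and point use (x, y) pixel coordinates, so they could be passed on to a promptable segmenter such as SAM after any coordinate conversion that model requires.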
Authors: YANG Wei-jing; XU Rui; GU Hao-wen; CHEN Tao; SHU Xiang-bo; YAO Ya-zhou (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China)
Source: Acta Electronica Sinica (电子学报; Peking University Core Journal), 2025, No. 3, pp. 716-727 (12 pages)
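Funding: National Natural Science Foundation of China (No. 62302217); Equipment Development Department Pre-research Project on Shared Information System Technology (No. 31511030202).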
Keywords: large-scale unsupervised semantic segmentation; image-level denoising; segment anything model; pseudo-label; clustering
