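Abstract: Objective: Industrial defect detection is a vital part of modern industrial quality control. Multimodal industrial defect detection faces two challenges: capturing defects of varying shapes and sizes that are barely perceptible in RGB images, and reducing the interference that noise in each modality's raw feature space causes in multimodal information interaction. To address these challenges, a multimodal, multi-scale defect detection method based on normalizing flows is proposed. Methods: First, a Vision Transformer and a Point Transformer extract features from blocks 1, 3, and 11 for the RGB image and 3D point cloud modalities to build a feature pyramid, which preserves the spatial information of low-level features to aid defect localization and improves the model's robustness to defects of different shapes and sizes. Second, to simplify multimodal interaction, a point feature alignment algorithm aligns the 3D point cloud features to the plane of the RGB image, and unsupervised multimodal feature fusion is achieved by constructing a contrastive learning matrix, which promotes information exchange between the modalities. In addition, a proxy task is designed to extend the information bottleneck mechanism to the unsupervised setting, reducing noise interference while preserving as much of the original information as possible and yielding a more sufficient and powerful multimodal representation. Finally, a multi-scale normalizing flow structure captures feature information at different scales and enables interaction between features of different scales. Results: The method was evaluated on the MVTec-3D AD dataset. Experiments show that Detection AUCROC (area under the curve of the receiver operating characteristic) reaches 93.3%, Segmentation AUPRO (area under the precision-recall overlap) reaches 96.1%, and Segmentation AUCROC reaches 98.8%, outperforming most existing multimodal defect detection methods. Conclusion: The method detects defects of different shapes and sizes that are barely perceptible in RGB images well. It reduces the impact of noise in the raw feature space on the multimodal representation, generalizes to some degree across defects of different shapes and sizes, and meets the defect detection requirements of modern industry well.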
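The contrastive learning matrix mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not state the loss, the similarity measure, or the temperature, so everything below is an assumption: the function names are hypothetical, cosine similarity and an InfoNCE-style cross-entropy are one common choice, and spatially aligned RGB and point cloud patch features (same index) are treated as the positive pairs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def contrastive_matrix(rgb_feats, pc_feats, temperature=0.1):
    """Pairwise cosine similarities between RGB and point cloud patch features.

    Entry [i][j] compares RGB patch i with point cloud patch j, divided by an
    assumed temperature. Aligned pairs sit on the diagonal.
    """
    rgb = [l2_normalize(v) for v in rgb_feats]
    pc = [l2_normalize(v) for v in pc_feats]
    return [
        [sum(a * b for a, b in zip(r, p)) / temperature for p in pc]
        for r in rgb
    ]


def info_nce(matrix):
    """Mean row-wise cross-entropy with the diagonal entries as positives."""
    total = 0.0
    for i, row in enumerate(matrix):
        peak = max(row)  # subtract the row maximum for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_denominator - row[i]
    return total / len(matrix)


# Toy features: three patches, two channels per modality (illustrative values).
rgb_patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pc_aligned = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
pc_shuffled = [[0.0, 3.0], [1.0, 1.0], [2.0, 0.0]]

loss_aligned = info_nce(contrastive_matrix(rgb_patches, pc_aligned))
loss_shuffled = info_nce(contrastive_matrix(rgb_patches, pc_shuffled))
print(loss_aligned < loss_shuffled)
```

In this toy setup the loss is lower when the point cloud features line up with their RGB counterparts, which is the behavior such a matrix is meant to encourage during unsupervised fusion.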
Funding: Supported in part by the Zhejiang Provincial Natural Science Foundation of China (LDT23F01012F01 and LDT23F01015F01), in part by the Fundamental Research Funds for the Provincial Universities of Zhejiang (Grant GK229909299001-008), and in part by the National Natural Science Foundation of China (62372146 and 61806061).
Abstract: Graph contrastive learning (GCL) has attracted extensive research interest due to its powerful ability to capture latent structural and semantic information of graphs in a self-supervised manner. Existing GCL methods commonly adopt predefined graph augmentations to generate two contrastive views and then design a contrastive pretext task between these views with the goal of maximizing their agreement. These methods assume that the augmented graph fully preserves the semantics of the original. However, typical data augmentation strategies in GCL, such as random edge dropping, may alter the properties of the original graph. As a result, previous GCL methods overlook graph differences, which can make it difficult to distinguish between graphs that are structurally similar but semantically different. We therefore argue that a method is needed that quantifies the dissimilarity between the original and augmented graphs in order to capture the relationships between samples more accurately. In this work, we propose a novel graph contrastive learning framework, named Accurate Difference-based Node-Level Graph Contrastive Learning (DNGCL), which helps the model distinguish similar graphs with slight differences by learning node-level differences between graphs. Specifically, we train the model to distinguish between original and augmented nodes via a node discriminator and employ cosine dissimilarity to accurately measure the difference between each pair of nodes. Furthermore, we apply multiple types of data augmentation commonly used in current GCL methods to the original graph, aiming to learn the differences between nodes under different augmentation strategies and to help the model learn richer local information. We conduct extensive experiments on six benchmark datasets, and the results show that DNGCL outperforms most state-of-the-art baselines, which strongly validates the effectiveness of our model.
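The two ingredients named in the abstract, random edge dropping and cosine dissimilarity between original and augmented nodes, can be sketched as follows. This is only an illustration under stated assumptions, not the DNGCL implementation: the mean-aggregation step stands in for whatever graph encoder the authors use, the node discriminator is omitted, and all function names are hypothetical.

```python
import math
import random


def drop_edges(edges, drop_prob, seed=0):
    """Random edge dropping: keep each edge independently with probability 1 - drop_prob."""
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() >= drop_prob]


def mean_aggregate(features, edges):
    """Stand-in encoder (an assumption): average each node with its neighbours."""
    neighbours = [[i] for i in range(len(features))]  # every node includes itself
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbours
    ]


def cosine_dissimilarity(a, b):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def node_level_differences(features, edges, drop_prob, seed=0):
    """Per-node dissimilarity between the original and edge-dropped views."""
    original = mean_aggregate(features, edges)
    augmented = mean_aggregate(features, drop_edges(edges, drop_prob, seed))
    return [cosine_dissimilarity(a, b) for a, b in zip(original, augmented)]


# Toy path graph with four nodes and two-dimensional features (illustrative values).
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
graph_edges = [(0, 1), (1, 2), (2, 3)]

unchanged = node_level_differences(node_features, graph_edges, drop_prob=0.0)
all_dropped = node_level_differences(node_features, graph_edges, drop_prob=1.0)
print(unchanged)
print(all_dropped)
```

With no edges dropped the per-node differences are zero up to floating-point error, while dropping every edge changes the neighbourhood of each node and gives a nonzero difference for node 0 in this toy graph. A per-node score of this kind is what lets a model weight augmented views by how far they actually drift from the original.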