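Abstract: Bolt images collected during transmission line inspection are characterized by low resolution and insufficient visual information. To address the difficulty that conventional image classification models have in learning semantically rich visual representations from bolt images, a transmission line bolt defect classification method based on multimodal contrastive learning is proposed. First, to inject bolt-related semantic information and prior knowledge from text into the visual representations in a cross-modal manner, a two-stage training algorithm combining multimodal contrastive pre-training and supervised fine-tuning is proposed. Second, to mitigate overfitting in multimodal contrastive pre-training, an info noise contrastive estimation loss with label smoothing (infoNCE-LS) is proposed to improve the generalization of the pre-trained visual representations. Finally, to address the mismatch between upstream and downstream tasks, three classification heads based on text prompts are designed to improve the transfer learning performance of the pre-trained visual representations in the supervised fine-tuning stage. Experimental results show that the two models built in this paper on ResNet50 and ViT achieve accuracies of 92.3% and 97.4% on the bolt defect classification dataset, improvements of 2.4% and 5.8% over the baselines, respectively. The study achieves cross-modal supplementation of semantic information from text to images and offers a new approach for research on bolt defect recognition.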
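The abstract names a label-smoothed InfoNCE loss but does not give its formula. The following is a minimal NumPy sketch of one plausible reading: a symmetric image-text contrastive loss in which the one-hot matching targets are replaced by smoothed targets. The function names, the `temperature` and `smoothing` defaults, and the uniform way the smoothing mass is spread over the batch are assumptions for illustration, not the paper's definition.

```python
import numpy as np


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def info_nce_ls(image_emb, text_emb, temperature=0.07, smoothing=0.1):
    """Symmetric InfoNCE loss with label smoothing (illustrative sketch).

    image_emb, text_emb: arrays of shape (n, d); row i of each forms a
    matched image-text pair. temperature and smoothing are assumed
    hyperparameters, not values taken from the paper.
    """
    # L2-normalize so the dot product is a cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (n, n) similarity matrix

    n = logits.shape[0]
    # Smoothed targets: 1 - smoothing + smoothing/n on the diagonal,
    # smoothing/n elsewhere, so every row and column sums to 1.
    targets = np.full((n, n), smoothing / n) + (1.0 - smoothing) * np.eye(n)

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = -(targets * log_softmax(logits, axis=1)).sum(axis=1).mean()
    loss_t2i = -(targets * log_softmax(logits, axis=0)).sum(axis=0).mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

With `smoothing=0.0` this reduces to the ordinary symmetric InfoNCE loss. A positive `smoothing` penalizes the model for driving mismatched pairs to extremely low similarity, which is the usual regularizing effect of label smoothing.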
Funding: Supported by the National Natural Science Foundation under Grant 62122033.
Abstract: Magnetic resonance imaging (MRI) inherently requires considerable time for data acquisition, and obtaining multi-contrast MRI data further prolongs this process, thereby increasing susceptibility to motion artifacts. Multi-contrast MR images share structural similarities while each carries unique contrast information. To exploit these similarities while preserving the distinctive characteristics, we propose a new method called high-dimensional subsets embedding (HDSE). The approach is built on the framework of low-rank modeling of local k-space neighborhoods with parallel imaging (P-LORAKS). Specifically, it uses the structural similarity of multi-contrast MR images to process different k-space data through two independent channels. In one channel, we separate the complementary T1-T2 k-space data and directly construct a new subset of local k-space, allowing the model to better capture structural correlations between multiple contrasts. In the other channel, we provide global under-sampled T2-weighted k-space data to further constrain image acquisition in high-dimensional space, maintaining image consistency and reducing noise amplification. The information from these two channels is fused to form high-dimensional feature objects. In addition, we embed the constructed objects into P-LORAKS in various ways to enhance reconstruction performance. Experimental results demonstrate that the aided reconstruction with local subset fusion and the high-dimensional reconstruction with adaptive global constraints improve the accuracy of image reconstruction and enhance the robustness of the model.
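The low-rank modeling of local k-space neighborhoods that the method builds on can be illustrated with a small NumPy sketch. It stacks every local k-space patch as a row of a structured matrix, concatenates the matrices of two contrasts side by side, and forms a truncated-SVD approximation. This shows only the generic LORAKS-style building block under simplifying assumptions (square patches, no coil dimension, no undersampling); the function names and the horizontal concatenation of contrasts are illustrative choices and not the HDSE construction itself.

```python
import numpy as np


def neighborhood_matrix(kspace, radius=1):
    """Stack each (2r+1) x (2r+1) k-space patch as one row.

    kspace: 2-D complex array. Square patches are an assumption made
    here for brevity.
    """
    ny, nx = kspace.shape
    rows = []
    for y in range(radius, ny - radius):
        for x in range(radius, nx - radius):
            patch = kspace[y - radius:y + radius + 1,
                           x - radius:x + radius + 1]
            rows.append(patch.ravel())
    return np.array(rows)


def multi_contrast_matrix(kspaces, radius=1):
    """Concatenate the neighborhood matrices of several contrasts
    column-wise, so shared structure shows up as shared low rank."""
    return np.hstack([neighborhood_matrix(k, radius) for k in kspaces])


def low_rank_approx(matrix, rank):
    """Best rank-`rank` approximation in the Frobenius norm (truncated SVD)."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vh[:rank]
```

In an actual reconstruction, a step like `low_rank_approx` would alternate with a data-consistency step that restores the acquired k-space samples. That loop is omitted here.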