Zero-Shot Based Spatial AI Algorithm for Up-to-Date 3D Vision Map Generations in Highly Complex Indoor Environments
1
Authors: Sehun Lee Taehoon Kim Junho Ahn. Computers, Materials & Continua, 2025, No. 8, pp. 3623-3648 (26 pages)
This paper proposes a zero-shot based spatial recognition AI algorithm that fuses and develops multidimensional vision identification technology adapted to large indoor and underground spaces. With the expansion of large shopping malls and underground urban spaces (UUS), there is an increasing need for technologies that can quickly identify complex indoor structures and changes such as relocation, remodeling, and construction, so that up-to-date indoor 3D site maps can support the safety and management of citizens. The proposed algorithm uses data collected by an unmanned robot to create an up-to-date 3D map of the indoor site and recognizes complex indoor spaces based on zero-shot learning. The research addresses two major challenges: the difficulty of detecting walls and floors due to complex patterns, and the difficulty of spatial perception due to unknown obstacles. The proposed algorithm addresses the limitations of the existing foundation model, detects floors and obstacles without expensive sensors, and improves the accuracy of spatial recognition by combining floor detection, vanishing point detection, and fusion obstacle detection algorithms. The experimental results show that the algorithm effectively detects the floor and obstacles in various indoor environments, with F1 scores of 0.96 and 0.93 in the floor detection and obstacle detection experiments, respectively.
Keywords: Spatial AI vision foundation model zero-shot learning image segmentation
Denoising graph neural network based on zero-shot learning for Gibbs phenomenon in high-order DG applications
2
Authors: Wei AN Jiawen LIU Wenxuan OUYANG Haoyu RU Xuejun LIU Hongqiang LYU. Chinese Journal of Aeronautics, 2025, No. 3, pp. 234-248 (15 pages)
With the availability of high-performance computing technology and the development of advanced numerical simulation methods, Computational Fluid Dynamics (CFD) is becoming more practical and efficient in engineering. As one of the representative high-precision algorithms, the high-order Discontinuous Galerkin Method (DGM) has attracted widespread attention from the CFD research community and has developed strongly. However, when DGM is extended to high-speed aerodynamic flow field calculations, non-physical numerical Gibbs oscillations near shock waves often significantly degrade numerical accuracy and can even cause the calculation to fail. Data-driven approaches based on machine learning can learn the characteristics of Gibbs noise, which motivates their use in high-speed DG applications. Training such models normally requires generating labeled data. This paper proposes a new method for denoising modeling of the Gibbs phenomenon that uses a zero-shot learning strategy to avoid acquiring large amounts of CFD data. The model adopts a graph convolutional network combined with a graph attention mechanism to learn the denoising paradigm from synthetic Gibbs noise data and generalize to DGM numerical simulation data. Numerical simulation results show that the proposed Gibbs denoising model can suppress the numerical oscillation near shock waves in the high-order DGM. The work automates the extension of DGM to high-speed aerodynamic flow field calculations with higher generalization and lower cost.
Keywords: Computational fluid dynamics High-order discontinuous Galerkin method Gibbs phenomenon Graph neural networks zero-shot learning
Select-and-Answer Prompting: Facilitating LLMs for Improving Zero-Shot Reasoning
3
Authors: WANG Yufang TANG Xuesong HAO Kuangrong. Journal of Donghua University (English Edition), 2025, No. 5, pp. 513-522 (10 pages)
Large language models (LLMs) have demonstrated remarkable generalization abilities across multiple tasks in natural language processing (NLP). For multi-step reasoning tasks, chain-of-thought (CoT) prompting facilitates step-by-step thinking, leading to improved performance. However, despite significant advancements in LLMs, current CoT prompting performs suboptimally on smaller-scale models that have fewer parameters. Additionally, the common paradigm of few-shot CoT prompting relies on a set of manual demonstrations, with performance contingent on the quality of these annotations and varying with task-specific requirements. To address these limitations, we propose a select-and-answer prompting method (SAP) to enhance language model performance on reasoning tasks without the need for manual demonstrations. The method comprises two primary steps: guiding the model to conduct a preliminary analysis and generate several candidate answers based on the prompt, and then allowing the model to provide a final answer derived from these candidates. The proposed prompting strategy is evaluated across two language models of different sizes and six datasets. On ChatGLM-6B, SAP consistently outperforms few-shot CoT across all datasets. For GPT-3.5, SAP achieves performance comparable to few-shot CoT and outperforms zero-shot CoT in most cases. These experimental results indicate that SAP can significantly improve the accuracy of language models in reasoning tasks.
Keywords: zero-shot learning large language model (LLM) reasoning problem chain-of-thought (CoT) prompting
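The two steps described in this abstract (generate candidate answers, then select a final answer from them) can be sketched roughly as follows. This is only an illustration: the prompt wording, the `llm` callable, and the `n_candidates` parameter are assumptions made here, not the prompts or interface used in the paper.

```python
from typing import Callable


def select_and_answer(question: str,
                      llm: Callable[[str], str],
                      n_candidates: int = 3) -> str:
    """Two-step prompting sketch: propose candidates, then pick one.

    `llm` is a hypothetical callable mapping a prompt string to a
    completion string; any chat or completion client could be wrapped.
    """
    # Step 1: ask for a brief analysis plus several candidate answers.
    analysis_prompt = (
        f"Question: {question}\n"
        f"Analyze the problem briefly, then list {n_candidates} "
        "candidate answers."
    )
    candidates = llm(analysis_prompt)

    # Step 2: feed the candidates back and ask for a single final answer.
    selection_prompt = (
        f"Question: {question}\n"
        f"Candidate answers:\n{candidates}\n"
        "Select the best candidate and state the final answer."
    )
    return llm(selection_prompt)
```

No manual demonstrations appear in either prompt, which is the property the abstract contrasts with few-shot CoT.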
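Zero-shot learning object classification algorithm based on reverse projection (Cited by: 1)
4
Authors: 冯鹏 庹红娅 乔凌峰 王洁欣 敬忠良. 《计算机应用研究》 (Application Research of Computers), CSCD, PKU Core, 2017, No. 11, pp. 3291-3294 (4 pages)
Zero-shot learning (ZSL) addresses the classification of categories that have no training samples. Traditional regression methods project visual features into the semantic space; this does not fully exploit the sample information contained in the visual features themselves, and training is computationally expensive. This paper proposes a ZSL object classification method based on reverse projection: class prototypes are projected into the visual space, the mapping function is learned by exploiting the semantic nature of visual features, and the parameters are obtained from a closed-form (analytical) solution alone. Experiments on two benchmark datasets show that the reverse projection method substantially improves classification results over traditional regression methods and other existing methods, greatly reduces training time, and generalizes better to the classification of unseen categories.
Keywords: zero-shot learning; object classification; reverse projection; analytical solution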
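As an illustration of how a reverse projection can admit a closed-form solution, the sketch below assumes a ridge-regularized least-squares objective. The abstract does not state the objective, so the loss, the regularizer $\lambda$, and all symbols here are assumptions: $X \in \mathbb{R}^{d \times n}$ holds the visual features of the $n$ training samples, $S \in \mathbb{R}^{k \times n}$ holds the class prototype of each sample's class, and $W \in \mathbb{R}^{d \times k}$ maps the semantic space to the visual space.

```latex
\begin{align}
  W^{*} &= \arg\min_{W}\; \lVert X - W S \rVert_F^{2} + \lambda \lVert W \rVert_F^{2} \\
  0 &= -2\,(X - W S)\,S^{\top} + 2\lambda W
      \;\Longrightarrow\;
      W^{*} = X S^{\top} \left( S S^{\top} + \lambda I_k \right)^{-1} \\
  \hat{c}(x) &= \arg\min_{c \in \mathcal{U}}\; \lVert x - W^{*} s_c \rVert_2
\end{align}
```

Under these assumptions, training reduces to one $k \times k$ matrix inversion, and a test feature $x$ is assigned to the unseen class $c \in \mathcal{U}$ whose projected prototype $W^{*} s_c$ lies nearest in the visual space.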
Zero-shot Fine-grained Classification by Deep Feature Learning with Semantics (Cited by: 8)
5
Authors: Ao-Xue Li Ke-Xin Zhang Li-Wei Wang. International Journal of Automation and Computing, EI, CSCD, 2019, No. 5, pp. 563-574 (12 pages)
Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification. In the first phase, feature learning, we fine-tune deep convolutional neural networks using the hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is introduced into the deep convolutional neural networks to avoid domain shift from training data to test data. In the second phase, label inference, a semantic directed graph is constructed over the attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when used by other zero-shot learning models, which shows the flexibility of our model in zero-shot fine-grained classification.
Keywords: fine-grained image classification zero-shot learning deep feature learning domain adaptation semantic graph
A Dual Discriminator Method for Generalized Zero-Shot Learning
6
Authors: Tianshu Wei Jinjie Huang. Computers, Materials & Continua, SCIE, EI, 2024, No. 4, pp. 1599-1612 (14 pages)
Zero-shot learning enables the recognition of new class samples by migrating models learned from semantic features and existing sample features to things that have never been seen before. The consistency of different types of features and domain shift are two of the critical issues in zero-shot learning. To address both, this paper proposes a new modeling structure. The traditional approach maps semantic features and visual features into the same feature space; building on this, the proposed model uses a dual discriminator approach, which further enhances the consistency between semantic and visual features. This approach can also align unseen class semantic features with training set samples, providing a portion of the information about the unseen classes. In addition, a new feature fusion method is proposed in the model. This method is equivalent to adding perturbation to the seen class features, which reduces the degree to which the classification results are biased towards the seen classes. The feature fusion method also provides part of the information about the unseen classes, improving classification accuracy in generalized zero-shot learning and reducing domain bias. The proposed method is validated and compared with other methods on four datasets, and the experimental results show that it achieves promising results.
Keywords: Generalized zero-shot learning modality consistent discriminator domain shift problem feature fusion
A Novel Siamese Network for Few/Zero-Shot Handwritten Character Recognition Tasks
7
Authors: Nagwa Elaraby Sherif Barakat Amira Rezk. Computers, Materials & Continua, SCIE, EI, 2023, No. 1, pp. 1837-1854 (18 pages)
Deep metric learning is one of the recommended methods for supporting few/zero-shot learning by deep networks. It depends on building a Siamese architecture of two homogeneous Convolutional Neural Networks (CNNs) for learning a distance function that can map input data from the input space to the feature space. Instead of determining the class of each sample, the Siamese architecture deals with the existence of only a few training samples by deciding whether the samples share the same class identity or not. The traditional Siamese architecture was built by forming two CNNs from scratch with randomly initialized weights and training them with binary cross-entropy loss. Building two CNNs from scratch is a time-consuming trial-and-error phase. In addition, training with binary cross-entropy loss sometimes leads to poor margins. In this paper, a novel Siamese network is proposed and applied to few/zero-shot Handwritten Character Recognition (HCR) tasks. The novelties of the proposed network are: 1) utilizing transfer learning and using the pre-trained AlexNet as a feature extractor in the Siamese architecture, since fine-tuning a pre-trained network is typically faster and easier than building from scratch; 2) training the Siamese architecture with contrastive loss instead of binary cross-entropy, since contrastive loss helps the network learn a nonlinear mapping function that maps the extracted features into the vector space in an optimal way. The proposed network is evaluated on the challenging Chars74K datasets in two experiments, one testing the network in few-shot learning and the other in zero-shot learning. The recognition accuracy of the proposed network reaches 85.6% and 82% in few-shot and zero-shot learning, respectively. In addition, the performance of the proposed Siamese network is compared with that of traditional Siamese CNNs. The comparison shows that the proposed network achieves higher recognition results in less time, reducing the training time from days to hours in both experiments.
Keywords: Handwritten character recognition (HCR) few-shot learning zero-shot learning deep metric learning transfer learning contrastive loss Chars74K datasets
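For readers unfamiliar with the contrastive loss mentioned in this abstract, a minimal sketch of one common formulation follows. The abstract does not give the exact form, margin, or distance used in the paper, so the squared-hinge form with a 1/2 factor, the Euclidean distance, and the default `margin=1.0` are assumptions.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a: Sequence[float],
                     emb_b: Sequence[float],
                     same_class: bool,
                     margin: float = 1.0) -> float:
    """Contrastive loss for one pair of embeddings.

    Same-class pairs are pulled together (loss grows with distance);
    different-class pairs are pushed apart until they are at least
    `margin` apart, after which they contribute no loss.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

In a Siamese setup, `emb_a` and `emb_b` would be the outputs of the two weight-sharing branches for a pair of character images. Unlike binary cross-entropy on a same/different score, the margin term imposes an explicit separation between classes in the embedding space.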
Explanatory Multi-Scale Adversarial Semantic Embedding Space Learning for Zero-Shot Recognition
8
Authors: Huiting Li. Open Journal of Applied Sciences, 2022, No. 3, pp. 317-335 (19 pages)
The goal of zero-shot recognition is to classify classes never seen during training, which requires building a bridge between seen and unseen classes through a semantic embedding space. Semantic embedding space learning therefore plays an important role in zero-shot recognition. In existing work, the semantic embedding space is mainly given by user-defined attribute vectors. However, the discriminative information contained in a user-defined attribute vector is limited. In this paper, we propose to learn an extra latent attribute space automatically to produce a more generalized and discriminative semantic embedding space. To prevent the bias problem, both the user-defined attribute vector and the latent attribute space are optimized by adversarial learning with auto-encoders. We also propose to reconstruct semantic patterns produced by explanatory graphs, which makes the semantic embedding space more sensitive to useful semantic information and less sensitive to useless information. The proposed method is evaluated on the AwA2 and CUB datasets. The results show that it achieves superior performance.
Keywords: zero-shot recognition semantic embedding space adversarial learning explanatory graph
A Survey of Zero-Shot Object Detection
9
Authors: Weipeng Cao Xuyang Yao Zhiwu Xu Ye Liu Yinghui Pan Zhong Ming. Big Data Mining and Analytics, 2025, No. 3, pp. 726-750 (25 pages)
Zero-Shot object Detection (ZSD), one of the most challenging problems in the field of object detection, aims to accurately identify new categories that are not encountered during training. Recent advancements in deep learning and increased computational power have led to significant improvements in object detection systems, achieving high recognition accuracy on benchmark datasets. However, these systems remain limited in real-world applications due to the scarcity of labeled training samples, making it difficult to detect unseen classes. To address this, researchers have explored various approaches, yielding promising progress. This article provides a comprehensive review of the current state of ZSD, distinguishing four related methods (zero-shot, open-vocabulary, open-set, and open-world approaches) based on task objectives and data usage. We highlight representative methods, discuss the technical challenges within each framework, and summarize the commonly used evaluation metrics, benchmark datasets, and experimental results. Our review aims to offer readers a clear overview of the latest developments and performance trends in ZSD.
Keywords: zero-shot object detection (ZSD) open-vocabulary object detection open-set object detection open-world object detection
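Bearing fault diagnosis method based on a generative zero-shot deep learning model
10
Authors: 刘月文 刘文淼 李永亭 齐咏生 刘慧文. 《中国农机化学报》 (Journal of Chinese Agricultural Mechanization), PKU Core, 2026, No. 1, pp. 201-209 (9 pages)
Deep-learning-based fault diagnosis models need large amounts of training data, but operating conditions in practice are harsh and complete fault data are hard to obtain, so models train poorly or cannot be trained at all. Generative zero-shot learning models are introduced for this reason, yet they have limitations of their own: the generated features may be of poor quality and differ substantially from real features, which limits model performance. To address this, a generative zero-shot learning method combining complementary attributes and a regression module (CARMGZSL) is proposed and applied to bearing fault diagnosis. First, the continuous wavelet transform converts one-dimensional fault signals into time-frequency images, and a CNN extracts fault features. Next, a semantic attribute module defines different semantic attributes for different faults; a generative adversarial module trains adversarially on the semantic attributes and fault features of seen fault classes, generating features for unseen fault classes that are fed to the discriminator and judged against real fault sample features. A regression module then reconstructs the generated sample features into semantic attributes and feeds them back to the generator, making the generated features more realistic. Finally, a similarity measure evaluates the distance between unseen-class faults and the generated unseen-class faults to complete fault recognition. Validation on the Case Western Reserve University bearing dataset shows that the method achieves zero-shot fault diagnosis of rolling bearings; compared with other classic zero-shot diagnosis algorithms, it reaches an average accuracy of 92.32% and gives better diagnostic performance.
Keywords: rolling bearing; zero-shot learning; fault diagnosis; generative adversarial network; semantic features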
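Zero-shot visual tracking model with memory-aware selection for autonomous driving
11
Authors: 李杰 汪诗敏 王长城 崔亚峰 汪俊杰 周惟嘉 胡铮 兰海 杜玲 高猛. 《浙江大学学报(工学版)》 (Journal of Zhejiang University (Engineering Science)), PKU Core, 2026, No. 1, pp. 61-70 (10 pages)
To keep tracking accurate when an autonomous vehicle encounters target deformation or partial or full occlusion, a zero-shot visual tracking model is constructed. Building on the classic Kalman filter, a motion modeling module is added at the mask prediction stage; it accounts for temporal and spatial consistency and combines motion cues to apply cyclic correction to the predicted masks. A hybrid scoring system selects the best mask from the predicted masks. For historical best masks, a memory-aware selection module is designed that builds a library of ideal mask candidates and combines historical features with informational cues to select the most suitable mask dynamically. The proposed model is evaluated against several classic visual tracking models, including HIPTrack-B384, on the LaSOT, GOT-10k, and OTB100 datasets. Compared with the best value of each metric among the baseline methods, the proposed model improves the area under the ROC curve (AUC), precision, average overlap, overlap precision at IoU thresholds of 0.50 and 0.75, and success rate by 2.87%, 2.73%, 2.84%, 3.18%, 5.46%, and 1.62%, respectively, indicating good performance across multiple metrics.
Keywords: driverless vehicle; visual tracking; motion modeling; hybrid scoring; memory-aware selection; zero-shot tracking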
Fabric Recognition Using Zero-Shot Learning (Cited by: 1)
12
Authors: Feng Wang Huaping Liu Fuchun Sun Haihong Pan. Tsinghua Science and Technology, SCIE, EI, CAS, CSCD, 2019, No. 6, pp. 645-653 (9 pages)
In this work, we use a deep learning method to tackle the Zero-Shot Learning (ZSL) problem in tactile material recognition by incorporating advanced semantic information into a training model. Our main technical contribution is an end-to-end deep learning framework for solving the tactile ZSL problem. In this framework, we use a Convolutional Neural Network (CNN) to extract the spatial features and Long Short-Term Memory (LSTM) to extract the temporal features in dynamic tactile sequences, and develop a loss function suitable for the ZSL setting. We present the results of experimental evaluations on publicly available datasets, which show the effectiveness of the proposed method.
Keywords: zero-shot learning (ZSL) fabric recognition tactile recognition deep learning
TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation
13
Authors: Zekun Jiang Dongjie Cheng Ziyuan Qin Jun Gao Qicheng Lao Abdullaev Bakhrom Ismoilovich Urazboev Gayrat Yuldashov Elyorbek Bekchanov Habibullo Defu Tang Linjing Wei Kang Li Le Zhang. Big Data Mining and Analytics, CSCD, 2024, No. 4, pp. 1199-1211 (13 pages)
This study presents a novel multimodal medical image zero-shot segmentation algorithm, the text-visual-prompt segment anything model (TV-SAM), which requires no manual annotations. TV-SAM integrates the large language model GPT-4, the vision language model GLIP, and SAM to autonomously generate descriptive text prompts and visual bounding box prompts from medical images, thereby enhancing SAM's capability for zero-shot segmentation. Comprehensive evaluations on seven public datasets encompassing eight imaging modalities demonstrate that TV-SAM can effectively segment unseen targets across various modalities without additional training. TV-SAM significantly outperforms SAM AUTO (p<0.01) and GSAM (p<0.05), closely matches the performance of SAM BBOX with gold standard bounding box prompts (p=0.07), and surpasses the state-of-the-art methods on specific datasets such as ISIC (0.853 versus 0.802) and WBC (0.968 versus 0.883). The study indicates that TV-SAM serves as an effective multimodal medical image zero-shot segmentation algorithm and highlights the significant contribution of GPT-4 to zero-shot segmentation. By integrating foundational models such as GPT-4, GLIP, and SAM, the ability to address complex problems in specialized domains can be enhanced.
Keywords: large language model vision language model segment anything model medical image segmentation zero-shot segmentation GPT-4
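Zero-shot referring image segmentation based on fine-tuning the image-text large model CLIP (Cited by: 3)
14
Authors: 刘杰 乔文昇 朱佩佩 雷印杰 王紫轩. 《计算机应用研究》 (Application Research of Computers), PKU Core, 2025, No. 4, pp. 1248-1254 (7 pages)
In recent years, large vision-language models, represented by CLIP, have shown excellent zero-shot inference ability in many downstream scenarios. Transferring CLIP to referring image segmentation, which requires pixel-level image-text understanding, is nevertheless very difficult: CLIP focuses on the overall alignment between image and text and discards the spatial position information of image pixels. This paper therefore takes CLIP as the base model and proposes PixelCLIP, a single-stage, fine-grained, multi-level zero-shot referring image segmentation model. Specifically, multi-scale image feature fusion aggregates the pixel-level image features extracted by CLIP's different visual encoders while also taking into account the global image semantic features inherent in CLIP. For text representation, the model relies on CLIP-BERT to retain object category information and additionally introduces the LLaVA large language model to inject contextual background knowledge. Finally, PixelCLIP performs pixel-level referring image segmentation through fine-grained cross-modal association matching. Extensive numerical analysis verifies the effectiveness of the method.
Keywords: zero-shot; CLIP; pixel-level; single-stage; referring image segmentation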
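Audio-visual generalized zero-shot learning method based on a multimodal fusion Transformer (Cited by: 1)
15
Authors: 杨静 李小勇 阮小利 李少波 唐向红 徐计. 《电子与信息学报》 (Journal of Electronics & Information Technology), PKU Core, 2025, No. 7, pp. 2375-2384 (10 pages)
Audio-visual zero-shot learning requires understanding the relationship between audio and visual information in order to reason about unseen classes. Despite considerable effort and significant progress in the field, existing work tends to focus on learning strong representations and neglects both the dependencies between audio and video and the mismatch between the output distribution and the target distribution. This paper therefore proposes a Transformer-based audio-visual generalized zero-shot learning method. Specifically, an attention mechanism learns the internal information of the data and strengthens the interaction between modalities to capture the semantic consistency of audio-visual data; Kullback-Leibler (KL) divergence and a cosine similarity loss are introduced to measure the difference between probability distributions and the consistency between classes. The method is tested on three benchmark datasets: VGGSound-GZSL^(cls), UCF-GZSL^(cls), and ActivityNet-GZSL^(cls). Extensive experimental results show that it achieves state-of-the-art performance on all three.
Keywords: audio-visual zero-shot learning; video classification; attention mechanism; KL divergence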
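The two loss terms named in this abstract can be illustrated with the minimal sketch below. How the paper combines them, which direction of the KL divergence it uses, and what the inputs represent are not stated in the abstract; the function names, the `eps` smoothing term, and the `1 - cosine` form of the loss are assumptions made for illustration.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete probability distributions.

    Terms with p_i == 0 contribute nothing; `eps` guards against
    division by zero when q_i == 0.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity_loss(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cos(u, v): 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm
```

A training objective in this spirit might sum the two terms with a weighting factor, using the KL term to pull the predicted distribution toward the target distribution and the cosine term to align an audio-visual embedding with its class embedding.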
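Interactive class attribute construction method for zero-shot image classification (Cited by: 1)
16
Authors: 刘真 徐景胜 颜菁 徐润森 吴向阳. 《计算机辅助设计与图形学学报》 (Journal of Computer-Aided Design & Computer Graphics), PKU Core, 2025, No. 2, pp. 243-253 (11 pages)
Zero-shot image classification addresses the setting in which training and test classes are disjoint, and human-annotated attributes are a commonly used form of auxiliary knowledge for it. To help experts design class-attribute matrices, an interactive construction method is proposed that simplifies a tedious process lacking guidance. First, a concept-based deep learning interpretability method extracts understandable attribute information from the training images. Then, multi-view coordinated interaction is used to explore and analyze the importance of the extracted attributes. The system offers both global and local modes to help users design attribute values for the test classes. Finally, a case study on the Animals with Attributes2 dataset and a user evaluation with a Likert scale verify the effectiveness and practicality of the method, which helps expert users construct class attributes efficiently and conveniently.
Keywords: zero-shot learning; zero-shot image classification; visual analytics; explainable artificial intelligence; human-machine collaboration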
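CGR-BERT-ZESHEL: zero-shot entity linking model based on Chinese features (Cited by: 1)
17
Authors: 潘建 吴志伟 李燕君. 《计算机科学》 (Computer Science), PKU Core, 2025, No. 4, pp. 262-270 (9 pages)
Current research on entity linking pays little attention to Chinese entity linking or to linking emerging and obscure entities. In addition, the traditional BERT model ignores two key aspects of Chinese, glyphs and radicals, both of which carry important syntactic and semantic information for language understanding. To address these problems, CGR-BERT-ZESHEL, a zero-shot entity linking model based on Chinese features, is proposed. The model first introduces visual image embeddings and traditional character embeddings to feed glyph features and radical features into the model, respectively, which enriches the word-vector features and mitigates the effect of out-of-vocabulary words on performance. It then obtains the entity linking result with a two-stage approach of candidate entity generation followed by candidate entity ranking. Experiments on the Hansel and CLEEK datasets show that, compared with the baseline model, CGR-BERT-ZESHEL improves Recall@100 in the candidate entity generation stage by 17.49% and 7.34%, and Accuracy in the candidate entity ranking stage by 3.02% and 3.11%; it also outperforms the other compared models on both Recall@100 and Accuracy.
Keywords: entity linking; Chinese zero-shot; BERT; candidate entity generation; candidate entity ranking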
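Multimodal zero-shot defect detection in inspection images based on dual experts (Cited by: 1)
18
Authors: 吴华 贾栋豪 张婷婷 白晓静 孙笠 蒲梦杨. 《中国图象图形学报》 (Journal of Image and Graphics), PKU Core, 2025, No. 3, pp. 672-682 (11 pages)
Objective: Defect detection in power equipment inspection images plays an important role in improving the safety of power transmission and the reliability of grid operation. Because building the corresponding training datasets is costly, traditional supervised learning methods are poorly suited to this task. Power equipment inspection images also usually contain complex and varied backgrounds, which seriously interfere with defect detection. Method: Based on a vision-language model combined with text prompts, a zero-shot defect detection model for power equipment inspection images is proposed. The model contains multiple dual-expert modules; after the vision-language model produces text features and visual features, these are processed and fused by the dual-expert modules to yield pixel-level defect detection results. A power equipment inspection image dataset with pixel-level mask annotations is also constructed to evaluate model performance comprehensively. Result: On the test set of the constructed dataset, the model is compared with SAA+ (segment any anomaly+), AnomalyGPT, WinCLIP (window-based CLIP), PaDiM (patch distribution modeling), and PatchCore. For pixel-level defect segmentation, AUROC (area under the receiver operating characteristic curve) improves by 18.1% on average and F1-max (F1 score at optimal threshold) by 26.1% on average; for image-level defect classification, AUROC improves by 20.2% on average and AP (average precision) by 10.0% on average. For each individual type of power equipment in the dataset, the model obtains the best pixel-level defect segmentation results. Ablation experiments confirm that the dual-expert module significantly improves defect detection accuracy. Conclusion: By working in a zero-shot manner, the model avoids the high cost of constructing power equipment inspection image datasets, and the proposed dual-expert module reduces interference from complex background regions in inspection images.
Keywords: zero-shot defect detection; dual experts; vision-language model; multimodal; power equipment inspection images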
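Analysis of attribute-agnostic prompt learning for improving the generalization of zero-shot industrial anomaly detection methods (Cited by: 2)
19
Authors: 刘桂雄 闫奕樸 陈贵龙 邢星奥. 《激光杂志》 (Laser Journal), PKU Core, 2025, No. 5, pp. 64-70 (7 pages)
Industrial anomaly detection is a core part of quality control in manufacturing, and attribute-agnostic prompt learning is an effective way to improve the generalization of zero-shot industrial anomaly detection. Aimed at industrial production applications, this paper examines attribute-agnostic prompt learning for zero-shot industrial anomaly detection from two angles, learnable text prompts and object-decoupled text prompts, covering their basic principles, frameworks, workflows, and application performance. It systematically analyzes and compares the characteristics of each method in application and identifies joint optimization of image and text prompts, and finer-grained description of anomaly features, as directions worth attention. The paper offers guidance and reference for researchers working on industrial anomaly detection.
Keywords: industrial anomaly detection; attribute-agnostic prompt learning; large model; zero-shot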
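Multimodal scene editing algorithm fusing CLIP and 3D Gaussians
20
Authors: 曹仰杰 王伟平 李振强 谢俊 吕润峰. 《郑州大学学报(工学版)》 (Journal of Zhengzhou University (Engineering Science)), PKU Core, 2025, No. 5, pp. 35-42 (8 pages)
To address the heavy reliance on annotated data and the high computational complexity of 3D scene editing algorithms, a multimodal scene editing algorithm fusing CLIP and 3D Gaussians (CLIP2Gaussian) is proposed. First, SAM extracts target masks from multi-view images, and a bidirectional propagation strategy enforces mask consistency across views. Second, CLIP assigns semantic labels to the extracted masks, which are mapped to 3D Gaussian points to embed semantics in the 3D scene. Finally, a differentiable rendering mechanism optimizes the 3D Gaussian parameters, and a spatial consistency regularization strategy uses clustering to strengthen the consistency and stability of semantic labels in 3D space. Experiments show that CLIP2Gaussian reaches an IoU of 61.23% on the LERF dataset and a response time of 0.57 s per text query in the semantic segmentation task, outperforming LERF in both accuracy and efficiency. Ablation experiments further verify that the algorithm edits target regions precisely while minimally disturbing the original scene.
Keywords: 3D reconstruction; zero-shot learning; scene understanding; scene editing; 3D Gaussians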