102 articles found
Steel Surface Defect Detection Using Learnable Memory Vision Transformer
1
Authors: Syed Tasnimul Karim Ayon, Farhan Md. Siraj, Jia Uddin. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 499-520 (22 pages)
This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization capabilities. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as the Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight LMViT's ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT's potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT's computational efficiency for deployment in real-time quality control systems.
Keywords: Learnable Memory Vision Transformer (LMViT); Convolutional Neural Networks (CNN); metal surface defect detection; deep learning; computer vision; image classification; learnable memory; gradient clipping; label smoothing; t-SNE visualization
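PMoE: A Parameter-Efficient Fine-Tuning Framework Introducing Mixture of Experts into P-tuning (Cited by: 4)
2
Authors: 王浩, 王珺, 胡海峰, 周飞飞, 龚锐, 张索非. 《计算机应用研究》 (PKU Core), 2025, Issue 7, pp. 1956-1963 (8 pages)
Large language models (LLMs) have improved markedly on reasoning and generation tasks, but existing open-source LLMs still lack knowledge when handling specialized-domain problems and need to be fine-tuned for specific tasks. Traditional fine-tuning methods struggle to combine low cost with efficiency in multi-task learning. To address this, a parameter-efficient fine-tuning framework named PMoE is proposed. The framework builds on P-tuning and introduces a mixture-of-experts mechanism, strengthening multi-task capability while keeping fine-tuning low-cost. PMoE constructs trainable expert modules in every layer of the Transformer to replace the prompt modules of P-tuning, and uses a routing mechanism to assign tasks dynamically according to the features of the input task. In addition, PMoE's expert modules are detachable, which allows model reuse across task scenarios and further reduces computational cost. Experimental results show that PMoE improves on P-tuning by 6.24% on a Chinese medical-domain dataset and performs well in multi-task processing and transfer learning, confirming its efficiency and broad applicability.
Keywords: large language model; parameter-efficient fine-tuning; P-tuning; mixture of experts; multi-task learning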
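The routing idea in the PMoE abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say whether routing is soft or top-k, so soft (softmax) gating is assumed here, and the names `route`, `mix_experts`, and `router_weights` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(task_features, router_weights):
    """Return one gate per expert from a linear router (assumed design).

    router_weights holds one weight row per expert.
    """
    logits = [sum(w * x for w, x in zip(row, task_features))
              for row in router_weights]
    return softmax(logits)


def mix_experts(gates, expert_outputs):
    """Gate-weighted sum of the expert output vectors."""
    dim = len(expert_outputs[0])
    return [sum(g * out[d] for g, out in zip(gates, expert_outputs))
            for d in range(dim)]


# Two experts with identical router rows receive equal gates.
gates = route([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]])
mixed = mix_experts(gates, [[2.0, 0.0], [0.0, 2.0]])
```

In a real framework the expert outputs would be per-layer prompt vectors and the router weights would be trained jointly with them; both are plain lists here to keep the sketch self-contained.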
Toward a Learnable Climate Model in the Artificial Intelligence Era (Cited by: 6)
3
Authors: Gang HUANG, Ya WANG, Yoo-Geun HAM, Bin MU, Weichen TAO, Chaoyang XIE. Advances in Atmospheric Sciences (SCIE, CAS, CSCD), 2024, Issue 7, pp. 1281-1288 (8 pages)
Artificial intelligence (AI) models have significantly impacted various areas of the atmospheric sciences, reshaping our approach to climate-related challenges. Amid this AI-driven transformation, the foundational role of physics in climate science has occasionally been overlooked. Our perspective suggests that the future of climate modeling involves a synergistic partnership between AI and physics, rather than an "either/or" scenario. Scrutinizing controversies around current physical inconsistencies in large AI models, we stress the critical need for detailed dynamic diagnostics and physical constraints. Furthermore, we provide illustrative examples to guide future assessments and constraints for AI models. Regarding AI integration with numerical models, we argue that offline AI parameterization schemes may fall short of achieving global optimality, emphasizing the importance of constructing online schemes. Additionally, we highlight the significance of fostering a community culture and propose the OCR (Open, Comparable, Reproducible) principles. Through a better community culture and a deep integration of physics and AI, we contend that developing a learnable climate model, balancing AI and physics, is an achievable goal.
Keywords: artificial intelligence; deep learning; learnable climate model
Boosting Adversarial Training with Learnable Distribution
4
Authors: Kai Chen, Jinwei Wang, James Msughter Adeke, Guangjie Liu, Yuewei Dai. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3247-3265 (19 pages)
In recent years, various adversarial defense methods have been proposed to improve the robustness of deep neural networks. Adversarial training is one of the most potent methods to defend against adversarial attacks. However, the difference in the feature space between natural and adversarial examples hinders the accuracy and robustness of the model in adversarial training. This paper proposes a learnable distribution adversarial training method, aiming to construct the same distribution for training data utilizing the Gaussian mixture model. The distribution centroid is built to classify samples and constrain the distribution of the sample features. The natural and adversarial examples are pushed to the same distribution centroid to improve the accuracy and robustness of the model. The proposed method generates adversarial examples to close the distribution gap between the natural and adversarial examples through an attack algorithm explicitly designed for adversarial training. This algorithm gradually increases the accuracy and robustness of the model by scaling perturbation. Finally, the proposed method outputs the predicted labels and the distance between the sample and the distribution centroid. The distribution characteristics of the samples can be utilized to detect adversarial cases that can potentially evade the model defense. The effectiveness of the proposed method is demonstrated through comprehensive experiments.
Keywords: adversarial training; feature space; learnable distribution; distribution centroid
LDAS&ET-AD:Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation
5
Authors: Shuyi Li, Hongchao Hu, Xiaohan Yang, Guozhen Cheng, Wenyan Liu, Wei Guo. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 2331-2359 (29 pages)
Adversarial distillation (AD) has emerged as a potential solution to tackle the challenging optimization problem of loss with hard labels in adversarial training. However, fixed sample-agnostic and student-egocentric attack strategies are unsuitable for distillation. Additionally, the reliability of guidance from static teachers diminishes as target models become more robust. This paper proposes an AD method called Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation (LDAS&ET-AD). Firstly, a mechanism for generating learnable distillation attack strategies is developed to automatically produce sample-dependent attack strategies tailored for distillation. A strategy model is introduced to produce attack strategies that enable adversarial examples (AEs) to be created in areas where the target model significantly diverges from the teachers, by competing with the target model in minimizing or maximizing the AD loss. Secondly, a teacher evolution strategy is introduced to enhance the reliability and effectiveness of knowledge in improving the generalization performance of the target model. By calculating the experimentally updated target model's validation performance on both clean samples and AEs, the impact of distillation from each training sample and AE on the target model's generalization and robustness is assessed and used as feedback to fine-tune the standard and robust teachers accordingly. Experiments evaluate the performance of LDAS&ET-AD against different adversarial attacks on the CIFAR-10 and CIFAR-100 datasets. The experimental results demonstrate that the proposed method achieves a robust precision of 45.39% and 42.63% against AutoAttack (AA) on the CIFAR-10 dataset for ResNet-18 and MobileNet-V2, respectively, an improvement of 2.31% and 3.49% over the baseline method. In comparison to state-of-the-art adversarial defense techniques, our method surpasses Introspective Adversarial Distillation, the top-performing method in terms of robustness under AA attack for the CIFAR-10 dataset, with enhancements of 1.40% and 1.43% for ResNet-18 and MobileNet-V2, respectively. These findings demonstrate the effectiveness of the proposed method in enhancing the robustness of deep neural networks (DNNs) against prevalent adversarial attacks when compared to other competing methods. In conclusion, LDAS&ET-AD provides reliable and informative soft labels to one of the most promising defense methods, AT, alleviating the limitations of untrusted teachers and unsuitable AEs in existing AD techniques. We hope this paper promotes the development of DNNs in real-world trust-sensitive fields and helps ensure a more secure and dependable future for artificial intelligence systems.
Keywords: adversarial training; adversarial distillation; learnable distillation attack strategies; teacher evolution strategy
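Application of Reflective Chain-of-Thought in Task-Level Control of Intelligent Vehicles
6
Authors: 钱鹏, 储开斌, 殷聪聪, 黄思涵. 《计算机科学与探索》 (PKU Core), 2026, Issue 1, pp. 228-237 (10 pages)
At present, large language models still suffer from problems such as "hallucination" when reasoning over complex long-horizon tasks, which poses a major challenge for robot control. Conventional chain-of-thought (CoT) techniques remain limited in integrating multimodal information and correcting errors. To address this, a large-model fine-tuning method based on reflective chain-of-thought is proposed to improve the planning ability of large language models in task-level control of an intelligent car. The study builds on the ChatGLM2-6B model, uses P-Tuning v2 fine-tuning for deep prompt optimization, and constructs three datasets with progressively stronger reasoning: a basic CoT dataset, a CoT-SC dataset targeting self-consistency, and a reflective CoT dataset with reflection and correction capability. Guiding the model through logical reasoning and error correction substantially improves the accuracy and robustness of the planning results. Experimental results show that, relative to the baseline model, the model fine-tuned with reflective CoT improves BLEU-4 by 20.91 and 26.80 percentage points on single-step and two-step task instructions, respectively, and outperforms other fine-tuning methods in logical reasoning, task planning, and multi-step instruction handling.
Keywords: chain of thought; ChatGLM2-6B model; P-Tuning v2; reflective CoT; task-level interaction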
A novel deep learning-based framework for forecasting
7
Authors: Congqi Cao, Ze Sun, Lanshu Hu, Liujie Pan, Yanning Zhang. Atmospheric and Oceanic Science Letters, 2026, Issue 1, pp. 22-26 (5 pages)
Deep learning-based methods have become alternatives to traditional numerical weather prediction systems, offering faster computation and the ability to utilize large historical datasets. However, the application of deep learning to medium-range regional weather forecasting with limited data remains a significant challenge. In this work, three key solutions are proposed: (1) motivated by the need to improve model performance in data-scarce regional forecasting scenarios, the authors apply semantic segmentation models to better capture spatiotemporal features and improve prediction accuracy; (2) recognizing the challenge of overfitting and the inability of traditional noise-based data augmentation methods to effectively enhance model robustness, a novel learnable Gaussian noise mechanism is introduced that allows the model to adaptively optimize perturbations for different locations, ensuring more effective learning; and (3) to address error accumulation in autoregressive prediction, as well as the learning difficulty and lack of intermediate data utilization in one-shot prediction, the authors propose a cascade prediction approach that resolves these problems while significantly improving forecasting performance. The method achieves a competitive result in the East China Regional AI Medium Range Weather Forecasting Competition. Ablation experiments further validate the effectiveness of each component, highlighting their contributions to prediction performance.
Keywords: weather forecasting; deep learning; semantic segmentation models; learnable Gaussian noise; cascade prediction
Asynchronous hierarchical deep reinforcement learning with learnable reward shaping for distributed multi-UCAV air combat decision
8
Authors: Yifan ZHENG, Bin XIN, Jie CHEN, Keming JIAO, Zhixin ZHAO. Science China (Technological Sciences), 2026, Issue 1, pp. 44-67 (24 pages)
The complexity of the battlefield environment, including its high dynamics and the high-dimensional state and decision spaces, poses severe challenges to unmanned combat aerial vehicles (UCAVs) in cooperative autonomous air combat decision-making. This paper focuses on many-to-many air combat maneuvering decision (MMACMD) in an environment with extremely limited communication. An asynchronous hierarchical deep reinforcement learning method with learnable reward shaping (AHDRL_LRS) is proposed. First, by introducing an asynchronous hierarchical reinforcement learning framework, the large-scale MMACMD is decomposed into smaller-scale subtasks to reduce the dimensions of the decision spaces. Second, to achieve coordinated global task allocation in the environment with extremely limited communication, the learnable reward with embedded target intention (LRETI) is proposed. Through the LRETI, the target-selection intentions generated by the high-level policy are implicitly represented as learnable parameters in the situation reward function, which is used to train the low-level flight maneuver policy. Third, to dynamically characterize the topological correlations of each unit in the UCAV swarm and enhance the transferability and scalability of the decision-making model, the flexible target intention network (FTIN) structure based on the multi-head self-attention (MHSA) model is designed to represent the high-level policy; it accepts input features with variable-length sequences. Moreover, a graph learning-based critic network is adopted in the low-level policy model to address dynamic credit assignment. Finally, the effectiveness and superiority of the proposed AHDRL_LRS are validated in simulation experiments by comparison with baseline methods under scenarios with various initializations at scales from 6-vs-6 to 20-vs-20.
Keywords: many-to-many air combat; distributed decision-making; hierarchical reinforcement learning; learnable reward shaping
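A Sequential Recommendation Model Integrating User Interest Boundaries and Learnable Filters (Cited by: 1)
9
Authors: 杨兴耀, 刘岩松, 于炯, 李梓杨, 张少东, 张君. 《东北师大学报(自然科学版)》 (PKU Core), 2025, Issue 2, pp. 82-89 (8 pages)
User interaction sequences inevitably contain various kinds of noise. Traditional models often ignore this noise when processing interaction sequences, so sequence features are extracted insufficiently, which can make it hard for the algorithm to grasp the range of a user's interests. In addition, the loss functions of existing recommendation algorithms struggle to capture user interest boundaries for balancing positive and negative samples. To address these problems, this paper proposes a sequential recommendation model that integrates user interest boundaries with learnable filters and uses a self-attention mechanism to extract features from user interaction sequences. User interest boundaries are used to capture the range of user interests, and a hybrid loss function incorporating them is adopted to alleviate the imbalance between positive and negative samples and thereby optimize the algorithm. Experiments show that the model filters noise in user sequences well, strengthens the feature extraction of the attention mechanism, alleviates the imbalance between positive and negative samples, and achieves greater generality by adjusting positive and negative sample scores.
Keywords: learnable filter; sequential recommendation; interest boundary; self-attention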
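A Diagnosis Method for Head MRI Reports Based on Large-Scale Language Models
10
Authors: 刘之洋, 张明浩, 杨东, 柴超, 张颖. 《南开大学学报(自然科学版)》 (PKU Core), 2025, Issue 5, pp. 100-109 (10 pages)
Head MRI is one of the clinical examinations of the brain and is commonly used to diagnose diseases such as stroke, but manual diagnosis from head MRI depends on a physician's extensive clinical experience and consumes considerable labor. To reduce the misdiagnosis and missed-diagnosis rates of head MRI and lighten physicians' workload, a method for automatic head MRI diagnosis by fine-tuning a large-scale language model is proposed. A head MRI report-diagnosis dataset is first preprocessed to obtain high-quality data; ChatGLM-6B is then taken as the base model and fine-tuned with the P-Tuning v2 method. Because some expressions are synonymous, accuracy is a poor measure of the model, so the average Dice coefficient is proposed for evaluation. In experiments the model reaches an average Dice coefficient of 0.8904.
Keywords: deep learning; large-scale language model; P-Tuning v2; head MRI; automatic diagnosis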
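For readers unfamiliar with the metric named in the abstract above, the Dice coefficient between two sets A and B is 2|A∩B| / (|A| + |B|). The Python sketch below averages it over prediction/reference pairs. The abstract does not say what units the authors compare, so character multisets are an assumption made here, and `dice_coefficient` and `mean_dice` are hypothetical names.

```python
from collections import Counter


def dice_coefficient(predicted: str, reference: str) -> float:
    """Dice overlap of two strings over character multisets (assumed unit)."""
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    total = sum(pred_counts.values()) + sum(ref_counts.values())
    if total == 0:
        return 1.0  # two empty strings are treated as identical
    # Counter & Counter keeps the minimum count of each shared character.
    overlap = sum((pred_counts & ref_counts).values())
    return 2.0 * overlap / total


def mean_dice(pairs) -> float:
    """Average Dice over (predicted, reference) pairs; pairs must be non-empty."""
    scores = [dice_coefficient(p, r) for p, r in pairs]
    return sum(scores) / len(scores)
```

Unlike exact-match accuracy, this score gives partial credit when a generated diagnosis shares most of its wording with the reference, which is the motivation the abstract gives for preferring it.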
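Application of an Intelligent Rehabilitation Exercise Detection System Based on 3D Pose Estimation
11
Authors: 张堃, 张鹏程, 陈孝豪, 张彬, 华亮. 《仪器仪表学报》 (PKU Core), 2025, Issue 6, pp. 181-193 (13 pages)
In rehabilitation exercise scenarios the motion input is usually a video sequence, and pseudo-3D schemes built on mainstream 2D human pose estimation methods plus depth cameras cannot measure distances to skeletal keypoints in video, which degrades the final assessment. To solve this problem, a sequence-to-sequence 3D frame-focused pose recognition method for video is proposed for rehabilitation assessment. Its aim is to extract more complete and detailed 3D coordinate information directly from the raw, noisy 2D scene and to analyze motion sequences based on that information. The method uses a four-branch stream Transformer that captures interactions between time and space over long sequences while processing the original 2D input temporally and spatially in separate branches. The four branches are fused through learnable ratio parameters, and an additional module combining a spatial encoder with an enhanced temporal decoder produces the final output. The proposed method outperforms state-of-the-art methods on the Human 3.6M dataset, with a mean per-joint position error of only 14.4 mm, the lowest 3D pose coordinate error, showing that the proposed backbone can handle more complex rehabilitation video sequence tasks; comparative experiments on real rehabilitation video sequences also confirm its effectiveness. In addition, building on this human pose estimation method, a novel multi-dimensional intelligent rehabilitation exercise assessment and analysis system was developed. It estimates motion metrics for 120 movements across the joints of the human body, has entered clinical validation, and has been tested on more than 2,000 patients with an average accuracy of 93.2%.
Keywords: sequence-to-sequence; FFPose algorithm; four-branch stream Transformer; learnable ratio parameters; contactless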
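A Generative Syntactic Study of Chinese Subject-Object Reversible Sentences with Supply/Use Meaning under the Strong Minimalist Thesis
12
Author: 马志刚. 《北京第二外国语学院学报》 (PKU Core), 2025, Issue 4, pp. 68-83 (16 pages)
The Strong Minimalist Thesis of generative grammar holds that, in the optimized system of internal language, the Merge operation together with economy principles helps delete constituents that need not be spelled out during externalization. A minimalist syntactic analysis of Chinese subject-object reversible sentences based on this biolinguistic idea shows that thematic labels such as "agent" and "patient" are not appropriate for the semantic roles of the NPs in "一锅饭吃五个人/五个人吃一锅饭" ("one pot of rice feeds five people / five people eat one pot of rice"), because these NPs are semantic constituents selected (S-selection) by a distributive quantity operator rather than by the verb's subcategorization. Since the quantity correspondence is the core meaning of this sentence type, both noun phrases must be quantity-type indefinites, as determined by the categorial selection (C-selection) requirement of the quantificational head "每" ("each"). Given that a definite phrase, whether as subject or object, changes the basic meaning of the sentence, these structures are better called distributive subject-object reversible sentences (divisible into two variants: supply-meaning object-subject sentences and use-meaning subject-object sentences). Although the covert spell-out of the head makes subject-object reversible sentences harder to comprehend, it is the morphological, syntactic, and semantic core of the structure, whereas the optional appearance of the verb follows from its status as an adjunct formed by adjunction. An analysis based on local asymmetric c-command shows that different linear orders can be traced back to one and the same structural schema, thereby meeting the three basic conditions required by the Strong Minimalist Thesis: learnability, universality, and evolvability. The new claim of this paper is that such sentences belong to the category of imperatives, and that the noun phrases in them bear the inherent Case (partitive Case) intrinsic to quantity phrases.
Keywords: Strong Minimalist Thesis; reversible sentences; learnability; universality; evolvability; Merge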
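Liver Tumor CT Image Segmentation Integrating the Fourier Transform and a Learnable Matrix
13
Authors: 邵虹, 张雨. 《沈阳大学学报(自然科学版)》, 2025, Issue 2, pp. 147-154, 161 (9 pages)
To achieve accurate automatic segmentation of liver tumors, a 3D deep U-shaped network incorporating residual attention is built to segment the liver, reducing the influence of other organs in the background on tumor segmentation. On top of the liver segmentation, a gating unit that combines the Fourier transform with a learnable matrix is proposed; it filters out useless information while handling edge regions better, improving the model's ability to discriminate tumor edge information. Results show that the algorithm performs well on liver tumor segmentation, with an overlap rate (Dice similarity coefficient, DSC) of 78.99%, 7.15% higher than the 3D U-Net model, improving liver tumor segmentation accuracy while generalizing well.
Keywords: liver tumor segmentation; residual module; attention mechanism; Fourier transform; learnable matrix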
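Chinese Named Entity Recognition Based on the Bert-BiLSTM-CRF Model (Cited by: 4)
14
Authors: 龙星全, 李佳. 《吉林大学学报(信息科学版)》, 2025, Issue 2, pp. 384-393 (10 pages)
Existing Chinese named entity recognition algorithms do not fully account for the data characteristics of the entity recognition task: the classes in Chinese sample data are imbalanced, the training data are too noisy, and the distribution of the data generated by the model varies widely from run to run. To address these problems, an improved Chinese named entity recognition model with BERT-BiLSTM-CRF (Bidirectional Encoder Representations from Transformers, Bidirectional Long Short-Term Memory, Conditional Random Field) as the baseline is proposed. P-Tuning v2 is first combined with the BERT-BiLSTM-CRF model to extract data features precisely; three loss functions, focal loss, label smoothing, and KL loss (Kullback-Leibler divergence loss), are then used as regularization terms in the loss computation. Experimental results show that the improved model achieves F1 scores of 71.13%, 96.31%, and 95.90% on the Weibo, Resume, and MSRA (Microsoft Research Asia) datasets, respectively, confirming that the proposed algorithm performs better and that, across different downstream tasks, it is easy to combine with and extend to other neural networks.
Keywords: Chinese named entity recognition; BERT-BiLSTM-CRF model; P-Tuning v2; loss function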
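The three loss terms named in the abstract above have standard textbook definitions, sketched below in plain Python for a single prediction. How the paper weights or combines them is not stated, so no combined objective is shown; the function names and default values (`epsilon=0.1`, `gamma=2.0`) are illustrative assumptions.

```python
import math


def smooth_labels(true_index, num_classes, epsilon=0.1):
    """Label smoothing: (1 - eps) on the true class plus eps / K on every class."""
    uniform = epsilon / num_classes
    return [(1.0 - epsilon) + uniform if k == true_index else uniform
            for k in range(num_classes)]


def focal_loss(probs, true_index, gamma=2.0):
    """Focal loss -(1 - p_t)^gamma * log(p_t); gamma = 0 gives cross-entropy."""
    p_t = probs[true_index]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be positive where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The focal term down-weights examples the model already classifies confidently, which is why it is commonly used against class imbalance; label smoothing softens noisy hard labels; a KL term can penalize divergence between two output distributions.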
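A Few-Shot Defect Image Generation and Data Augmentation Algorithm Based on DID-AugGAN (Cited by: 1)
15
Authors: 黄绿娥, 邓亚峰, 鄢化彪, 肖文祥. 《数据采集与处理》 (PKU Core), 2025, Issue 5, pp. 1306-1321 (16 pages)
To address the low quality, lack of realism, and poor diversity of defect images generated by generative adversarial networks (GANs) under few-shot conditions, a defect image generation algorithm, Defect Image Data Augmentation GAN (DID-AugGAN), is proposed for data augmentation of few-shot defect images. Because conventional convolution has difficulty learning non-rigid image features from a limited dataset, a learnable offset convolution is designed to improve the model's ability to learn image semantics. To avoid losing key defect features and strengthen the correlation among local features, a multi-scale coordinate attention module is designed that focuses on defect location information. To improve the network's ability to discriminate local information in the input image, the discriminator is redesigned, changing it from a conventional single feed-forward network into a UNet-like structure with symmetric encoding and decoding paths. DID-AugGAN is compared with the original algorithm on the Rail-4c rail fastener defect dataset and validated with the MobileNetV3 classification network. Experimental results show that the improved method significantly raises IS (Inception score), effectively lowers FID (Fréchet inception distance) and LPIPS (Learned perceptual image patch similarity), and also improves MobileNetV3 classification accuracy and F1 score. The algorithm stably generates high-quality defect images, effectively expands the defect data samples, and meets the needs of downstream tasks.
Keywords: few-shot learning; generative adversarial network; learnable offset convolution; multi-scale coordinate attention; UNet-like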
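Learnable Feature Extraction and Identification of Voiceprint Signals from GIS Operating Mechanisms in Noisy Backgrounds
16
Authors: 李嘉宁, 李喆, 盛戈皞. 《高电压技术》 (PKU Core), 2025, Issue 11, pp. 5516-5525 (10 pages)
The operating voiceprint signals of gas insulated switchgear (GIS) are transient and abrupt and are often corrupted by noise, which degrades voiceprint recognition algorithms. To address this, a GIS operating voiceprint recognition method based on learnable feature extraction and an improved Transformer is proposed. First, Gabor convolution with learnable parameters, a pooling layer, and single-channel compression replace the conventional Mel-frequency cepstral coefficient feature extraction, with parameters fine-tuned for GIS operating voiceprints to improve the expressive power of the spectrogram features. Second, multi-head attention and time-frequency attention mechanisms are introduced into the Transformer network to learn audio features from multiple perspectives and capture the features of transient audio signals in abrupt time-frequency changes, improving the network's generalization and noise robustness. Finally, with GIS as the test object, voiceprint signals were collected during opening and closing operations of the switching equipment under several mechanical fault conditions. Experimental results show that, compared with conventional voiceprint recognition methods, the proposed algorithm better represents the voiceprint features of transient audio signals and offers better accuracy and noise robustness. Under high signal-to-noise ratio conditions recognition accuracy is stable at 96%; at a signal-to-noise ratio of 10 dB the proposed algorithm's recognition accuracy improves by more than 9.4%, and it performs better under low signal-to-noise ratio conditions.
Keywords: GIS; mechanical fault diagnosis; voiceprint recognition; learnable feature extraction; Transformer network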
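Application of Multimodal Pre-trained Models to Information Extraction from Financial Bills (Cited by: 1)
17
Authors: 颜政锦, 叶正, 葛君. 《计算机工程与应用》 (PKU Core), 2025, Issue 9, pp. 186-193 (8 pages)
Bill information extraction in the financial domain is a complex and challenging task whose goal is to accurately extract the key information that bills contain from financial documents. Financial bills are important information carriers in commercial activity, and extracting them accurately matters for business decisions and financial analysis. However, because bill formats are not standardized, key information can be lost in practice, for example through incomplete or missing key-value pairs, which makes financial bill information extraction difficult. LayoutLMV3 is currently one of the mainstream information extraction methods; it combines natural language processing with multimodal techniques and can extract information from large-scale financial documents. Its accuracy drops, however, on documents with complex layouts, and on long texts, which contain many characters, it may fail to capture the important information. To address these challenges, P-Tuning V1 is introduced with LayoutLMV3 as the baseline model. The approach not only solves specific problems (such as key-value relations in financial bills) but also adapts to different contexts and tasks, and it can use multimodal text, image, and layout information to understand bill content more fully. P-Tuning V1 introduces trainable continuous prompt embeddings, i.e. "prompts", as part of the model input to represent the "key" information in the text data. Discrete prompts are used as part of the "value", and the two together form complete key-value pairs. Experimental results show that, compared with the LayoutLMV3-based method, the combined method achieves a significant improvement on the Finance-Receipts dataset, raising the F1 score from 95.95% to 96.69%.
Keywords: information extraction; multimodal; pre-training; LayoutLMv3; P-Tuning V1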
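A Zero-Shot Defect Detection Method Based on Spatial Semantic Guidance
18
Authors: 宋亚楠, 潘柏松, 易文超, 张彪. 《计算机集成制造系统》 (PKU Core), 2025, Issue 7, pp. 2438-2445 (8 pages)
Existing vision-language models focus too heavily on object-category semantics and neglect fine-grained perception of local spatial defect regions. To address this, a zero-shot defect detection method based on spatial semantic guidance is proposed. A spatial semantic guidance network is designed to extract the semantic distribution features of an image, which are added to the visual encoder of the vision-language model. Broadly applicable learnable text prompts are designed for the normal and defective states; a purpose-built text encoder extracts the corresponding text embeddings, and their cosine similarity with multi-stage visual features is computed to predict a heat map of defect regions. The proposed model achieves pixel-level defect detection accuracy of 88.5%, 95.3%, 97.0%, and 91.6% on the MVTec, VisA, MPDD, and BTAD datasets, respectively. The experimental results show that the proposed method has strong zero-shot defect detection performance.
Keywords: zero-shot defect detection; vision-language model; learnable prompt; semantic guidance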
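An Intelligent Turbo Decoding Algorithm Based on Adaptive Prior Enhancement
19
Authors: 李若冰, 刘丽哲, 杨朔, 李勇, 王斌. 《计算机测量与控制》, 2025, Issue 10, pp. 280-288 (9 pages)
To address the insufficient bit-error-rate performance of conventional Turbo channel decoding algorithms, an intelligent Turbo channel decoding method driven jointly by model and data is adopted. Starting from the conventional Max-Log-MAP decoding algorithm, the iterative process is deeply unfolded and restructured as a multi-layer cascaded neural network, yielding the adaptive prior-information enhancement network model APETurbo, in which the APENet neural sub-network is designed. The sub-network linearly adjusts the extrinsic information with learnable weights, further extracts nonlinear features with fully connected layers and nonlinear activation functions, and combines the linear result with the nonlinear result through a trainable mixing coefficient, forming a residual connection that estimates the prior information more precisely. A normalized probability-space mean squared error loss function is designed for optimization. Simulation results show that, over an AWGN channel and at a bit error rate of 10^(-6), the proposed algorithm improves on the Max-Log-MAP algorithm by about 0.4 dB.
Keywords: Turbo decoding; Max-Log-MAP algorithm; model- and data-driven; neural network; learnable weights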
An Equipment Question-Answering System Combining Large Language Models with Knowledge Graphs
20
Authors: 王美华, 王兴芬, 张友星. 《人工智能与机器人研究》, 2025, Issue 3, pp. 684-697 (14 pages)
In the era of big data, the volume of Internet information is growing at an astonishing rate, and people demand greater accuracy and efficiency in information acquisition. With the continuous advancement of enterprise informatization and the modernization of equipment management, effectively extracting, managing, and utilizing massive enterprise equipment information is of great significance for enhancing the application value of enterprise equipment knowledge and improving the efficiency of enterprise resource utilization. This study proposes a system that integrates the natural language processing capabilities of large language models and can intelligently understand user queries and provide precise equipment information. Fine-tuning the large language model with the P-Tuning v2 method significantly enhances its ability to recognize and extract keywords in the field of enterprise equipment. At the same time, an enterprise equipment knowledge graph serves as a local knowledge base that supplies industry-specific knowledge to the model, enabling it to use relevant information as context for the question. On this basis, prompt engineering is designed to guide the model to generate more accurate responses, and the results are evaluated. Experimental results show that, compared with directly using large language models, the knowledge graph-enhanced large language model answers more accurately in the field of enterprise equipment, providing strong support for the construction of enterprise equipment question-answering systems.
Keywords: large language model; knowledge graph; P-Tuning v2 method; enterprise equipment