111 articles found
Coevolutionary Neural Dynamics With Learnable Parameters for Nonconvex Optimisation
1
Authors: Yipiao Chen, Wenbin Du, Huichao Cao, Long Jin. 《CAAI Transactions on Intelligence Technology》, 2026, Issue 1, pp. 111-122 (12 pages)
Nonconvex optimisation plays a crucial role in science and industry. However, existing methods often encounter local optima or provide inferior solutions when solving nonconvex optimisation problems, and they lack robustness in noisy scenarios. To address these limitations, we aim to develop a robust, efficient and globally convergent solver for nonconvex optimisation. This is achieved by combining the efficient local exploitation ability of a parameter-learnt neural dynamics (PLND) model with the global search capability of the coevolutionary mechanism. We combine their characteristics to design a coevolutionary neural dynamics with learnable parameters (CNDLP) model. The gradient information is used to find the optimal solution more effectively, and neural dynamics models are robust, which ensures that the influence of noise can be effectively suppressed in the calculation process. Theoretical analyses show the global convergence and robustness of the designed CNDLP model. Numerical experiments on 9 benchmark functions and a practical engineering design example are conducted in comparison with five existing meta-heuristic algorithms. The benchmarks cover diverse problems, from classical landscapes such as the Shubert function to high-dimensional cases such as the 30-dimensional Rosenbrock function. Results confirm CNDLP's excellent performance in both solution quality and convergence speed under noise.
Keywords: coevolutionary neural dynamics with learnable parameters (CNDLP); nonconvex optimization; robustness
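A Learnable P-tuning-Based Method for Detecting and Localizing Video Object-Removal Tampering
2
Authors: 张雨亭, 袁程胜, 贾星星, 张波, 夏志华, 付章杰. 《信息安全研究》, PKU Core, 2026, Issue 1, pp. 61-67 (7 pages)
With the continued development of artificial intelligence and big data technologies, the barrier to producing forged videos has dropped significantly. Determining whether a video has been tampered with therefore helps ensure the authenticity and credibility of information. Current mainstream video tampering detection methods rely on convolutional neural networks, which have limited ability to capture temporal dependencies and lack an understanding of global temporal patterns. To address this, a video object-removal tampering detection and localization method based on learnable P-tuning is proposed. First, learnable P-tuning is used to fully exploit the prior knowledge of a pre-trained model and efficiently extract multi-view features covering the spatial domain, the temporal domain, and high-frequency information. Second, a multi-scale feature interaction module is proposed, which uses multi-scale convolution operations and a two-step decomposition strategy to accurately capture tampering traces from fine to coarse granularity. In addition, a multi-path fusion attention module is designed, which uses a cross-view interaction mechanism to significantly strengthen information sharing and fusion among the multi-view features. Experimental results show that the method outperforms existing detection methods in both temporal and spatial localization.
Keywords: video tampering detection; object removal; learnable P-tuning; multi-scale feature interaction; multi-view features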
Steel Surface Defect Detection Using Learnable Memory Vision Transformer
3
Authors: Syed Tasnimul Karim Ayon, Farhan Md.Siraj, Jia Uddin. 《Computers, Materials & Continua》, SCIE, EI, 2025, Issue 1, pp. 499-520 (22 pages)
This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely-used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization capabilities. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as the Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight the LMViT’s ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT’s potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT’s computational efficiency for deployment in real-time quality control systems.
Keywords: Learnable Memory Vision Transformer (LMViT); Convolutional Neural Networks (CNN); metal surface defect detection; deep learning; computer vision; image classification; learnable memory; gradient clipping; label smoothing; t-SNE visualization
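PMoE: A Parameter-Efficient Fine-Tuning Framework Introducing Mixture of Experts into P-tuning (cited by: 4)
4
Authors: 王浩, 王珺, 胡海峰, 周飞飞, 龚锐, 张索非. 《计算机应用研究》, PKU Core, 2025, Issue 7, pp. 1956-1963 (8 pages)
Large language models (LLMs) have improved markedly on reasoning and generation tasks, but existing open-source LLMs still lack knowledge when handling specialized-domain problems and need task-specific fine-tuning. Traditional fine-tuning methods struggle to balance low cost and efficiency in multi-task learning. To address this, a parameter-efficient fine-tuning framework named PMoE is proposed. The framework builds on P-tuning and introduces a mixture-of-experts mechanism, strengthening multi-task capability while keeping fine-tuning low-cost. PMoE builds trainable expert modules in every Transformer layer to replace the prompt modules of P-tuning, and uses a routing mechanism to dynamically assign tasks according to the features of the input task. In addition, PMoE's expert modules are detachable, which allows model reuse across task scenarios and further reduces computational cost. Experimental results show that PMoE improves performance by 6.24% over P-tuning on a Chinese medical-domain dataset and performs well in multi-task processing and transfer learning, which verifies its efficiency and broad applicability.
Keywords: large language model; parameter-efficient fine-tuning; P-tuning; mixture of experts; multi-task learning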
Toward a Learnable Climate Model in the Artificial Intelligence Era (cited by: 8)
5
Authors: Gang HUANG, Ya WANG, Yoo-Geun HAM, Bin MU, Weichen TAO, Chaoyang XIE. 《Advances in Atmospheric Sciences》, SCIE, CAS, CSCD, 2024, Issue 7, pp. 1281-1288 (8 pages)
Artificial intelligence (AI) models have significantly impacted various areas of the atmospheric sciences, reshaping our approach to climate-related challenges. Amid this AI-driven transformation, the foundational role of physics in climate science has occasionally been overlooked. Our perspective suggests that the future of climate modeling involves a synergistic partnership between AI and physics, rather than an “either/or” scenario. Scrutinizing controversies around current physical inconsistencies in large AI models, we stress the critical need for detailed dynamic diagnostics and physical constraints. Furthermore, we provide illustrative examples to guide future assessments and constraints for AI models. Regarding AI integration with numerical models, we argue that offline AI parameterization schemes may fall short of achieving global optimality, emphasizing the importance of constructing online schemes. Additionally, we highlight the significance of fostering a community culture and propose the OCR (Open, Comparable, Reproducible) principles. Through a better community culture and a deep integration of physics and AI, we contend that developing a learnable climate model, balancing AI and physics, is an achievable goal.
Keywords: artificial intelligence; deep learning; learnable climate model
Boosting Adversarial Training with Learnable Distribution
6
Authors: Kai Chen, Jinwei Wang, James Msughter Adeke, Guangjie Liu, Yuewei Dai. 《Computers, Materials & Continua》, SCIE, EI, 2024, Issue 3, pp. 3247-3265 (19 pages)
In recent years, various adversarial defense methods have been proposed to improve the robustness of deep neural networks. Adversarial training is one of the most potent methods to defend against adversarial attacks. However, the difference in the feature space between natural and adversarial examples hinders the accuracy and robustness of the model in adversarial training. This paper proposes a learnable distribution adversarial training method, aiming to construct the same distribution for training data utilizing the Gaussian mixture model. The distribution centroid is built to classify samples and constrain the distribution of the sample features. The natural and adversarial examples are pushed to the same distribution centroid to improve the accuracy and robustness of the model. The proposed method generates adversarial examples to close the distribution gap between the natural and adversarial examples through an attack algorithm explicitly designed for adversarial training. This algorithm gradually increases the accuracy and robustness of the model by scaling perturbation. Finally, the proposed method outputs the predicted labels and the distance between the sample and the distribution centroid. The distribution characteristics of the samples can be utilized to detect adversarial cases that can potentially evade the model defense. The effectiveness of the proposed method is demonstrated through comprehensive experiments.
Keywords: adversarial training; feature space; learnable distribution; distribution centroid
LDAS&ET-AD:Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation
7
Authors: Shuyi Li, Hongchao Hu, Xiaohan Yang, Guozhen Cheng, Wenyan Liu, Wei Guo. 《Computers, Materials & Continua》, SCIE, EI, 2024, Issue 5, pp. 2331-2359 (29 pages)
Adversarial distillation (AD) has emerged as a potential solution to tackle the challenging optimization problem of loss with hard labels in adversarial training. However, fixed sample-agnostic and student-egocentric attack strategies are unsuitable for distillation. Additionally, the reliability of guidance from static teachers diminishes as target models become more robust. This paper proposes an AD method called Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation (LDAS&ET-AD). Firstly, a learnable distillation attack strategies generating mechanism is developed to automatically generate sample-dependent attack strategies tailored for distillation. A strategy model is introduced to produce attack strategies that enable adversarial examples (AEs) to be created in areas where the target model significantly diverges from the teachers by competing with the target model in minimizing or maximizing the AD loss. Secondly, a teacher evolution strategy is introduced to enhance the reliability and effectiveness of knowledge in improving the generalization performance of the target model. By calculating the experimentally updated target model’s validation performance on both clean samples and AEs, the impact of distillation from each training sample and AE on the target model’s generalization and robustness abilities is assessed to serve as feedback to fine-tune standard and robust teachers accordingly. Experiments evaluate the performance of LDAS&ET-AD against different adversarial attacks on the CIFAR-10 and CIFAR-100 datasets. The experimental results demonstrate that the proposed method achieves a robust precision of 45.39% and 42.63% against AutoAttack (AA) on the CIFAR-10 dataset for ResNet-18 and MobileNet-V2, respectively, marking an improvement of 2.31% and 3.49% over the baseline method. In comparison to state-of-the-art adversarial defense techniques, our method surpasses Introspective Adversarial Distillation, the top-performing method in terms of robustness under AA attack for the CIFAR-10 dataset, with enhancements of 1.40% and 1.43% for ResNet-18 and MobileNet-V2, respectively. These findings demonstrate the effectiveness of our proposed method in enhancing the robustness of deep learning networks (DNNs) against prevalent adversarial attacks when compared to other competing methods. In conclusion, LDAS&ET-AD provides reliable and informative soft labels to one of the most promising defense methods, AT, alleviating the limitations of untrusted teachers and unsuitable AEs in existing AD techniques. We hope this paper promotes the development of DNNs in real-world trust-sensitive fields and helps ensure a more secure and dependable future for artificial intelligence systems.
Keywords: adversarial training; adversarial distillation; learnable distillation attack strategies; teacher evolution strategy
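A Group-Level Stimulus-Aware Self-Supervised Soft Contrastive Learning Framework for EEG Emotion Recognition
8
Authors: 陈景霞, 王倩, 李小池, 张鹏伟. 《生物医学工程学杂志》, PKU Core, 2026, Issue 1, pp. 34-44 (11 pages)
To reduce the dependence of traditional electroencephalogram (EEG) emotion recognition methods on labels, and to address the shortcomings of existing contrastive learning methods in modeling emotional similarity across stimulus sources, this paper proposes a group-level stimulus-aware self-supervised soft contrastive learning framework (GSCL) for EEG emotion recognition. GSCL exploits the consistency of subjects' brain activity under the same stimulus to construct the contrastive learning task, introduces a soft assignment mechanism, and adaptively adjusts the weights of negative sample pairs according to the distance between the samples in each pair, improving representation quality. In addition, a learnable shuffle-and-split data augmentation method is designed, which dynamically optimizes the data distribution through learnable shuffling parameters. Finally, on the public emotion dataset DEAP, the proposed method achieves accuracies of 94.91%, 95.29%, and 92.78% on the valence, arousal, and four-class dimensions, respectively; on the Shanghai Jiao Tong University emotion EEG dataset (SEED), the three-class accuracy reaches 95.25%. The experimental results show that the proposed method achieves higher classification accuracy and offers a new approach to self-supervised EEG emotion recognition.
Keywords: emotion recognition; EEG signals; contrastive learning; soft assignment mechanism; learnable shuffle-and-split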
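An LVMD-Based Retrospective Prototypical Network and Its Application to Oil and Gas Pipeline Leak Detection
9
Authors: 路敬祎, 吴阳, 陈波, 韦启航, 梁棋皓, 王鹏. 《压力容器》, PKU Core, 2026, Issue 1, pp. 77-86 (10 pages)
To address insufficient model generalization and low leak-detection accuracy under few-shot conditions in oil and gas pipeline leak detection, a retrospective prototypical network framework based on learnable variational mode decomposition is proposed and applied to oil and gas pipeline leak detection. Learnable convolution kernels and adaptive parameters provide end-to-end optimization of the mode decomposition and dynamic weight allocation. A two-layer Inception multi-scale convolution branch is introduced to capture multi-dimensional signals and extract more precise feature information. Retrospective prototype classification is combined with a historical prototype buffer, fusing the current and historical prototype networks to suppress oscillation and improve recognition accuracy. A joint optimization objective combining cross-entropy and a prototype separation loss is designed to improve training stability and efficiency. Experiments show that the proposed VRPN framework performs well with limited samples and across operating conditions, reaching an average accuracy of 98.50%, which is 0.67% higher than ProtoNet. The study offers a new and effective solution for oil and gas pipeline leak detection.
Keywords: pipeline leak detection; retrospective prototypical network; learnable variational mode decomposition; few-shot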
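Application of Reflective Chain-of-Thought to Task-Level Control of Intelligent Vehicles
10
Authors: 钱鹏, 储开斌, 殷聪聪, 黄思涵. 《计算机科学与探索》, PKU Core, 2026, Issue 1, pp. 228-237 (10 pages)
At present, large language models still suffer from problems such as hallucination when reasoning over complex long-horizon tasks, which poses a major challenge for robot control. Traditional chain-of-thought (CoT) techniques remain limited in integrating multimodal information and correcting errors. To address this, a large-model fine-tuning method based on reflective chain-of-thought is proposed to improve the planning ability of large language models in task-level control of an intelligent vehicle. The study builds on the ChatGLM2-6B model, uses P-Tuning v2 fine-tuning for deep prompt optimization, and constructs three datasets with progressively stronger reasoning ability: a basic CoT dataset, a CoT-SC dataset aimed at self-consistency, and a reflective CoT dataset with reflection and correction capability. Guiding the model through logical reasoning and error correction substantially improves the accuracy and robustness of the planning results. Experimental results show that, compared with the baseline model, the model fine-tuned with reflective CoT improves BLEU-4 by 20.91 and 26.80 percentage points on single-step and two-step task instructions, respectively, and outperforms other fine-tuning methods in logical reasoning, task planning, and multi-step instruction handling.
Keywords: chain of thought; ChatGLM2-6B model; P-tuning v2; reflective CoT; task-level interaction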
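Multimodal Emotion Recognition for Asymmetric Emotional Information
11
Authors: 秦培玉, 李鸿燕, 丁思森, 郑泽. 《重庆理工大学学报(自然科学)》, PKU Core, 2026, Issue 3, pp. 167-175 (9 pages)
To address insufficient modality fusion in multimodal emotion recognition, and the difficulty traditional fusion has in adapting to asymmetric emotional information across modalities, a multimodal emotion recognition model combining dual-path cross-attention with adaptive weighted fusion is proposed. A feature learning module is designed for each modality to extract emotional features from the text, video, and audio modalities. In the fusion stage, the model treats text and video as dominant and audio as auxiliary. A dual-path cross-attention and adaptive weighted fusion module is designed for the asymmetry of emotional information across modalities; it uses cross-attention to strengthen inter-modality associations and dynamically learns the contribution parameters of the two paths. Experimental results on the Chinese dataset CH-SIMS show that designing a feature learning module for each modality and modeling the asymmetry of emotional information across modalities effectively improves the fusion quality of multimodal emotional features and thereby raises emotion recognition accuracy.
Keywords: multimodal emotion recognition; asymmetric emotional information; attention mechanism; adaptive weighting; learnable graph convolutional network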
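An Analytic Federated Learning Method Based on Learnable Aggregation Weights
12
Authors: 蒋伟进, 崔新雨, 刘志华, 陈伸有, 胡佳龙. 《计算机学报》, PKU Core, 2026, Issue 1, pp. 84-108 (25 pages)
Federated learning exchanges model parameters rather than raw data between clients and a parameter server, which effectively protects data privacy. However, as the number of clients and the data scale grow, federated learning still faces rising communication overhead and task complexity. Existing methods usually aggregate models with a weight-normalization strategy based on each client's local data volume, which reduces communication overhead to some extent but does not fully account for data heterogeneity; this can lead to overfitting, slower convergence, and a heavier communication burden. This paper therefore proposes an analytic federated learning algorithm with learnable aggregation weights (Learnable Aggregation Weights and Analytic Federated Learning, LAW-AFL). First, the algorithm introduces a learnable shrinkage factor and relative weights to improve how the aggregation weights are computed, and introduces a closed-form training paradigm to guide neural network training, improving model stability and generalization under heterogeneous data. Second, by deriving an absolute aggregation rule, it further improves the efficiency and accuracy of aggregation and achieves single-epoch local training; the algorithm also uses closed-form solutions for efficient aggregation, which simplifies the training process. Experimental results show that the proposed algorithm significantly improves the accuracy and generalization of the global model across multiple datasets and models. Compared with baseline methods, accuracy improves by 10% when handling large numbers of clients and non-independent and identically distributed (Non-IID) data, the global model's accuracy rises above 90% in specific experimental settings, and the training time per round is 69.82 seconds shorter than that of FedAvg. This shows that LAW-AFL has advantages in accuracy and robustness and substantially reduces communication cost.
Keywords: federated learning; learnable aggregation weights; closed-form training paradigm; automatic analytic technique; generalization ability; communication cost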
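The contrast drawn in the abstract above, between weighting clients by data volume and weighting them with a shrinkage factor and relative weights, can be sketched as follows. This is a minimal illustration, not the LAW-AFL algorithm itself: the function names, the softmax form of the relative weights, and the scalar `shrink` factor are assumptions made for this example, and the values that would be learned are simply passed in as numbers.

```python
import math

def data_volume_weights(sample_counts):
    """Baseline: each client's weight is its share of the total data."""
    total = sum(sample_counts)
    return [n / total for n in sample_counts]

def learnable_weights(logits, shrink):
    """Assumed form: softmax relative weights scaled by a shrinkage factor."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [shrink * e / norm for e in exps]

def aggregate(client_params, weights):
    """Weighted sum of client parameter vectors (lists of floats)."""
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]

# Hypothetical toy data: three clients, two parameters each.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
baseline = aggregate(clients, data_volume_weights([100, 100, 200]))
learned = aggregate(clients, learnable_weights([0.0, 0.0, 0.0], shrink=0.9))
```

With equal logits the relative weights are uniform, so the only difference from a plain average is the shrinkage factor; in a real system both the logits and the factor would be trained rather than fixed.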
A novel deep learning-based framework for forecasting
13
Authors: Congqi Cao, Ze Sun, Lanshu Hu, Liujie Pan, Yanning Zhang. 《Atmospheric and Oceanic Science Letters》, 2026, Issue 1, pp. 22-26 (5 pages)
Deep learning-based methods have become alternatives to traditional numerical weather prediction systems, offering faster computation and the ability to utilize large historical datasets. However, the application of deep learning to medium-range regional weather forecasting with limited data remains a significant challenge. In this work, three key solutions are proposed: (1) motivated by the need to improve model performance in data-scarce regional forecasting scenarios, the authors innovatively apply semantic segmentation models to better capture spatiotemporal features and improve prediction accuracy; (2) recognizing the challenge of overfitting and the inability of traditional noise-based data augmentation methods to effectively enhance model robustness, a novel learnable Gaussian noise mechanism is introduced that allows the model to adaptively optimize perturbations for different locations, ensuring more effective learning; and (3) to address the issue of error accumulation in autoregressive prediction, as well as the challenge of learning difficulty and the lack of intermediate data utilization in one-shot prediction, the authors propose a cascade prediction approach that effectively resolves these problems while significantly improving model forecasting performance. The method achieves a competitive result in The East China Regional AI Medium Range Weather Forecasting Competition. Ablation experiments further validate the effectiveness of each component, highlighting their contributions to enhancing prediction performance.
Keywords: weather forecasting; deep learning; semantic segmentation models; learnable Gaussian noise; cascade prediction
Asynchronous hierarchical deep reinforcement learning with learnable reward shaping for distributed multi-UCAV air combat decision
14
Authors: Yifan ZHENG, Bin XIN, Jie CHEN, Keming JIAO, Zhixin ZHAO. 《Science China(Technological Sciences)》, 2026, Issue 1, pp. 44-67 (24 pages)
The complexity of the battlefield environment, including its high dynamics, along with the high-dimensional spaces of state and decision-making, has brought severe challenges to unmanned combat aerial vehicles (UCAVs) in cooperative autonomous air combat decision-making. This paper focuses on the many-to-many air combat maneuvering decision (MMACMD) in an environment with extremely limited communication. An asynchronous hierarchical deep reinforcement learning method with learnable reward shaping (AHDRL_LRS) is proposed. First, by introducing an asynchronous hierarchical reinforcement learning framework, the large-scale MMACMD is decomposed into smaller-scale subtasks to reduce the dimensions of the decision spaces. Second, to achieve coordinated global task allocation in the environment with extremely limited communication, the learnable reward with embedded target intention (LRETI) is proposed. Through the LRETI, the target selecting intentions generated by the high-level policy are implicitly represented as learnable parameters in the situation reward function, which is used to train the low-level flight maneuver policy. Third, to dynamically characterize the topological correlations of each unit in the UCAV swarm and enhance the transferability and scalability of the decision-making model, the flexible target intention network (FTIN) structure based on the multi-head self-attention (MHSA) model is designed for the representation of the high-level policy, which can accept input features with variable-length sequences. Moreover, a graph learning-based critic network is adopted in the low-level policy model to address the dynamic credit assignment. Finally, by comparing with the baseline methods under scenarios with various initializations at scales from 6-vs-6 to 20-vs-20, the effectiveness and superiority of the proposed AHDRL_LRS are validated through the results of the simulation experiment.
Keywords: many-to-many air combat; distributed decision-making; hierarchical reinforcement learning; learnable reward shaping
QingNangTCM:a parameter-efficient fine-tuning large language model for traditional Chinese medicine
15
Authors: Xuming Tong, Liyan Liu, Yanhong Yuan, Xiaozheng Ding, Huiru Jia, Xu Yang, Sio Kei Im, Mini Han Wang, Zhang Xiong, Yapeng Wang. 《Digital Chinese Medicine》, 2026, Issue 1, pp. 1-12 (12 pages)
Objective: To develop QingNangTCM, a specialized large language model (LLM) tailored for expert-level traditional Chinese medicine (TCM) question-answering and clinical reasoning, addressing the scarcity of domain-specific corpora and specialized alignment. Methods: We constructed QnTCM_Dataset, a corpus of 100000 entries, by integrating data from ShenNong_TCM_Dataset and SymMap v2.0, and synthesizing additional samples via retrieval-augmented generation (RAG) and persona-driven generation. The dataset comprehensively covers diagnostic inquiries, prescriptions, and herbal knowledge. Utilizing P-Tuning v2, we fine-tuned the GLM-4-9B-Chat backbone to develop QingNangTCM. A multidimensional evaluation framework, assessing accuracy, coverage, consistency, safety, professionalism, and fluency, was established using metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), metric for evaluation of translation with explicit ordering (METEOR), and LLM-as-a-Judge with expert review. Qualitative analysis was conducted across four simulated clinical scenarios: symptom analysis, disease treatment, herb inquiry, and failure cases. Baseline models included GLM-4-9B-Chat, DeepSeek-V2, HuatuoGPT-II (7B), and GLM-4-9B-Chat (freeze-tuning). Results: QingNangTCM achieved the highest scores in BLEU-1/2/3/4 (0.425/0.298/0.137/0.064), ROUGE-1/2 (0.368/0.157), and METEOR (0.218), demonstrating a balanced and superior normalized performance profile of 0.900 across the dimensions of accuracy, coverage, and consistency. Although its ROUGE-L score (0.299) was lower than that of HuatuoGPT-II (7B) (0.351), it significantly outperformed domain-specific models in expert-validated win rates for professionalism (86%) and safety (73%). Qualitative analysis confirmed that the model strictly adheres to the “symptom-syndrome-pathogenesis-treatment” reasoning chain, though occasional misclassifications and hallucinations persisted when dealing with rare medicinal materials and uncommon syndromes.
Keywords: large language model (LLM); traditional Chinese medicine (TCM); fine-tuning; P-tuning v2; clinical decision support
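A Sequential Recommendation Model Fusing User Interest Boundaries and Learnable Filters (cited by: 1)
16
Authors: 杨兴耀, 刘岩松, 于炯, 李梓杨, 张少东, 张君. 《东北师大学报(自然科学版)》, PKU Core, 2025, Issue 2, pp. 82-89 (8 pages)
User interaction sequences inevitably contain various kinds of noise. Traditional models often ignore this noise when processing interaction sequence information, so features are extracted insufficiently and the algorithm may struggle to capture the range of a user's interests. Moreover, the loss functions of existing recommendation algorithms have difficulty obtaining the user interest boundary needed to balance positive and negative samples. To address these problems, this paper proposes a sequential recommendation model that fuses user interest boundaries with learnable filters and uses a self-attention mechanism to extract features from user interaction sequences. The user interest boundary is used to capture the range of user interests, and a hybrid loss function incorporating the user interest boundary is adopted to alleviate the imbalance between positive and negative samples and thus optimize the algorithm. Experiments show that the model filters noise in user sequences well, strengthens feature extraction by the attention mechanism, alleviates the imbalance between positive and negative samples, and becomes more general by adjusting the scores of positive and negative samples.
Keywords: learnable filter; sequential recommendation; interest boundary; self-attention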
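Application of an Intelligent Rehabilitation Exercise Detection System Based on 3D Pose Estimation (cited by: 1)
17
Authors: 张堃, 张鹏程, 陈孝豪, 张彬, 华亮. 《仪器仪表学报》, PKU Core, 2025, Issue 6, pp. 181-193 (13 pages)
In rehabilitation exercise scenarios, the motion input is usually a video sequence. Pseudo-3D schemes built on mainstream 2D human pose estimation and depth cameras cannot measure the distance of skeletal keypoints in video, which degrades the final assessment. To solve this problem, a sequence-to-sequence 3D frame-focused pose recognition method for video is proposed for rehabilitation assessment. Its aim is to extract more comprehensive and detailed 3D coordinate information directly from the raw, noisy 2D scene and to analyze motion sequences on that basis. The method uses a four-branch stream transformer that captures interactions between time and space over long sequences while processing the raw 2D input temporally and spatially in separate branches. The information from the four branches is integrated through learnable scale parameters, and an additional module combining a spatial encoder with an enhanced temporal decoder produces the final output. The proposed method outperforms state-of-the-art methods on the Human3.6M dataset, with a mean per-joint position error of only 14.4 mm, the lowest 3D pose coordinate error, which shows that the proposed backbone can handle more complex rehabilitation exercise video sequence tasks. Comparison experiments on real rehabilitation video sequences also confirm the method's effectiveness. In addition, building on the advanced human pose estimation method, a novel multi-dimensional intelligent rehabilitation exercise assessment and analysis system was developed. It estimates motion metrics for 120 movements across the joints of the human body, has entered clinical validation, and has been tested on more than 2,000 patients with an average accuracy of 93.2%.
Keywords: sequence-to-sequence; FFPose algorithm; four-branch stream transformer; learnable scale parameters; contactless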
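A Diagnostic Method for Head MRI Reports Based on Large Language Models
18
Authors: 刘之洋, 张明浩, 杨东, 柴超, 张颖. 《南开大学学报(自然科学版)》, PKU Core, 2025, Issue 5, pp. 100-109 (10 pages)
Head MRI, one of the clinical examinations of the brain, is commonly used to diagnose diseases such as stroke, but manual diagnosis from head MRI depends on a physician's extensive clinical experience and consumes considerable labor. To reduce the misdiagnosis and missed-diagnosis rates of head MRI and lighten physicians' workload, a method for automatic head MRI diagnosis by fine-tuning a large language model is proposed. A head MRI report-diagnosis dataset is first preprocessed to obtain high-quality data; ChatGLM-6B is then used as the base model and fine-tuned with P-Tuning v2. Because synonymous phrasings make accuracy a poor measure of model quality, the mean Dice coefficient is proposed for evaluation. In experiments, the model reaches a mean Dice coefficient of 0.8904.
Keywords: deep learning; large language model; P-tuning v2; head MRI; automatic diagnosis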
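The abstract above does not say how the Dice coefficient is computed on text, so the following is only a sketch of one plausible reading: each diagnosis is treated as a set of items, and the Dice coefficient 2|A∩B| / (|A| + |B|) is averaged over reports. The function names and the sample diagnoses are hypothetical.

```python
def dice(predicted, reference):
    """Dice coefficient between two collections, compared as sets."""
    a, b = set(predicted), set(reference)
    if not a and not b:
        return 1.0  # two empty diagnoses are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))

def mean_dice(pairs):
    """Average Dice over (predicted, reference) pairs."""
    return sum(dice(p, r) for p, r in pairs) / len(pairs)

# Hypothetical example: two reports, each a list of diagnosis items.
pairs = [
    (["infarct", "atrophy"], ["infarct", "atrophy"]),  # identical sets
    (["infarct"], ["infarct", "sinusitis"]),           # one of two items found
]
score = mean_dice(pairs)
```

Unlike exact-match accuracy, this score gives partial credit when a prediction overlaps the reference without matching it exactly, which is the property the abstract relies on.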
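A Generative Syntactic Study of Chinese Subject-Object Reversible Sentences with Supply and Use Meanings from the Perspective of the Strong Minimalist Thesis
19
Authors: 马志刚. 《北京第二外国语学院学报》, PKU Core, 2025, Issue 4, pp. 68-83 (16 pages)
The Strong Minimalist Thesis of generative grammar holds that, in the optimal system of internal language, the Merge operation together with economy principles helps delete constituents that need not be spelled out during externalization. A minimalist syntactic analysis of Chinese subject-object reversible sentences based on this biolinguistic view shows that thematic labels such as "agent" and "patient" are not appropriate for the semantic roles of the NPs in “一锅饭吃五个人/五个人吃一锅饭” ("one pot of rice feeds five people / five people eat one pot of rice"), because these NPs are semantic constituents selected (S-selection) by a distributive quantity operator rather than by the verb's subcategorization. Because quantitative correspondence is the core meaning of this sentence type, both noun phrases must be quantity-type indefinites, which follows from the categorial selection (C-selection) requirement of the quantificational head “每” ("each"). Since a definite phrase, whether as subject or object, changes the basic meaning of the sentence, these structures are better called distributive subject-object reversible sentences, with two variants: supply-meaning object-subject sentences and use-meaning subject-object sentences. The covert spell-out of the head makes subject-object reversible sentences harder to perceive and understand, but the head is the morphological, syntactic, and semantic core of the structure, while the optional appearance of the verb results from its status as an adjunct formed by adjunction. An analysis based on local asymmetric c-command shows that different linear orders can be traced to a single structural schema, which satisfies the three basic conditions required by the Strong Minimalist Thesis: learnability, universality, and evolvability. The new claim of this paper is that such sentences belong to the category of imperatives, and that their noun phrases bear the inherent Case (partitive Case) intrinsic to quantity phrases.
Keywords: Strong Minimalist Thesis; reversible sentences; learnability; universality; evolvability; Merge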
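Liver Tumor CT Image Segmentation Fusing the Fourier Transform and a Learnable Matrix
20
Authors: 邵虹, 张雨. 《沈阳大学学报(自然科学版)》, 2025, Issue 2, pp. 147-154, 161 (9 pages)
To achieve accurate automatic segmentation of liver tumors, a 3D deep U-shaped network with residual attention is built to segment the liver, reducing the influence of other organs in the background on tumor segmentation. On top of the liver segmentation, a gating unit that fuses the Fourier transform with a learnable matrix is proposed; it filters out useless information while handling edge regions better, improving the model's ability to discriminate tumor edge information. The results show that the algorithm performs well in liver tumor segmentation, with a Dice similarity coefficient (DSC) of 78.99%, which is 7.15% higher than the 3D U-Net model, improving liver tumor segmentation accuracy while generalizing well.
Keywords: liver tumor segmentation; residual module; attention mechanism; Fourier transform; learnable matrix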
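One common way to combine a Fourier transform with a learnable matrix is to multiply the spectrum of a feature by learnable weights and then transform back. The sketch below shows that general idea on a 1D signal with a naive DFT from the standard library. It is an assumption about the technique, not the gating unit from the paper above; `frequency_gate` and the weight values are hypothetical, and the weights that would be learned are passed in as fixed numbers.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def frequency_gate(signal, weights):
    """Scale each frequency bin by a weight, then return to the signal domain."""
    spectrum = dft(signal)
    gated = [s * w for s, w in zip(spectrum, weights)]
    # Keep the real part; any imaginary residue is discarded.
    return [v.real for v in idft(gated)]

signal = [1.0, 2.0, 3.0, 4.0]
unchanged = frequency_gate(signal, [1, 1, 1, 1])  # all-pass weights
mean_only = frequency_gate(signal, [1, 0, 0, 0])  # keep only the zero-frequency bin
```

All-pass weights reproduce the input, and keeping only the zero-frequency bin leaves the signal mean at every position, which shows how the weights decide which frequency content passes through.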