4 articles found
1. RoBGP: A Chinese Nested Biomedical Named Entity Recognition Model Based on RoBERTa and Global Pointer (Cited by: 3)
Authors: Xiaohui Cui, Chao Song, Dongmei Li, Xiaolong Qu, Jiao Long, Yu Yang, Hanchao Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3603-3618 (16 pages)
Named Entity Recognition (NER) stands as a fundamental task within the field of biomedical text mining, aiming to extract specific types of entities such as genes, proteins, and diseases from complex biomedical texts and categorize them into predefined entity types. This process can provide basic support for the automatic construction of knowledge bases. In contrast to general texts, biomedical texts frequently contain numerous nested entities and local dependencies among these entities, presenting significant challenges to prevailing NER models. To address these issues, we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer (RoBGP). Our model initially utilizes the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors. It then incorporates a Bidirectional Long Short-Term Memory network for capturing bidirectional semantic information, effectively addressing the issue of long-distance dependencies. Furthermore, the Global Pointer model is employed to comprehensively recognize all nested entities in the text. We conduct extensive experiments on the Chinese medical dataset CMeEE and the results demonstrate the superior performance of RoBGP over several baseline models. This research confirms the effectiveness of RoBGP in Chinese biomedical NER, providing reliable technical support for biomedical information extraction and knowledge base construction.
Keywords: biomedicine; knowledge base; named entity recognition; pretrained language model; global pointer
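The abstract above says the Global Pointer model recognizes all nested entities. As a minimal sketch of how that kind of span-based decoding is commonly done (not the authors' implementation), the code below assumes each entity type has a matrix of scores indexed by start and end token, and that a span is predicted when its score exceeds a threshold of 0. The function name `decode_spans`, the labels `disease` and `body`, and all score values are hypothetical.

```python
from typing import Dict, List, Tuple

# (entity type, start token index, end token index, both inclusive)
Span = Tuple[str, int, int]


def decode_spans(
    scores: Dict[str, List[List[float]]], threshold: float = 0.0
) -> List[Span]:
    """Return every (type, start, end) span whose score exceeds the threshold.

    Each span is scored independently, so overlapping and nested spans
    can all be returned at once.
    """
    spans: List[Span] = []
    for entity_type, matrix in scores.items():
        for start, row in enumerate(matrix):
            # Only cells with start <= end describe a valid span.
            for end in range(start, len(row)):
                if row[end] > threshold:
                    spans.append((entity_type, start, end))
    return spans


# Hypothetical two-token input: "lung cancer".
tokens = ["lung", "cancer"]
NEG = -10.0  # illustrative "no entity" score
example_scores = {
    "disease": [[NEG, 5.0], [NEG, NEG]],  # span 0..1 = "lung cancer"
    "body": [[4.0, NEG], [NEG, NEG]],     # span 0..0 = "lung"
}

for entity_type, start, end in decode_spans(example_scores):
    print(entity_type, " ".join(tokens[start : end + 1]))
```

Because every (start, end) cell is decided on its own, the inner span "lung" and the outer span "lung cancer" are both kept, which a single-label-per-token tagging scheme cannot express directly.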
2. Joint Extraction of Agricultural Disease Entities and Relations by Fusing RoBERTa-WWM and a Global Pointer Network (Cited by: 4)
Authors: 王彤, 张立杰, 王铭, 吴华瑞, 朱华吉, 杨英茹, 王春山. 《河北农业大学学报》 (Journal of Hebei Agricultural University; CAS, CSCD, Peking University Core), 2024, No. 3, pp. 113-120, 129 (9 pages)
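To address polysemy, nested entities, and overlapping triples in entity and relation extraction, this paper proposes RBGPL, a joint extraction model that fuses RoBERTa-WWM with a global pointer network. The model introduces the RoBERTa-WWM pretrained model and uses contextual information to overcome polysemy across different contexts; it adopts the Global Pointer tagging scheme to handle nested entities; and its global pointer joint decoding model recasts triple extraction as quintuple extraction, which resolves overlapping triples. On a self-built agricultural disease dataset, RBGPL reaches a precision of 76.23%, a recall of 91.18%, and an F1 score of 83.04%, the best F1 among the joint extraction models compared, effectively overcoming polysemy and triple overlap. In addition, the F1 scores for Pathogeny and Crop, two entity types prone to nesting, improve by 3% and 18%, respectively, a marked alleviation of the nesting problem. The method improves entity-relation extraction performance in the Chinese agricultural disease domain and can provide technical support for building agricultural disease knowledge graphs.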
Keywords: agricultural disease; joint extraction; RoBERTa-WWM; global pointer
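The three figures reported in the abstract above can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below is only an arithmetic check on the published numbers; the function name `f1_score` is a hypothetical helper, not part of the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (76.23% and 91.18%).
f1 = f1_score(0.7623, 0.9118)
print(f"F1 = {f1:.2%}")
```

The result rounds to 83.04%, which agrees with the F1 value stated in the abstract.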
3. GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training
Authors: MA Kai, HU Xinxin, TIAN Miao, TAN Yongjian, ZHENG Shuai, TAO Liufeng, QIU Qinjun. Acta Geologica Sinica (English Edition) (SCIE, CAS, CSCD), 2024, No. 5, pp. 1404-1417 (14 pages)
As important geological data, a geological report contains rich expert and geological knowledge, but the challenge facing current research into geological knowledge extraction and mining is how to achieve an accurate understanding of geological reports guided by domain knowledge. While generic named entity recognition models and tools can be used to process geoscience reports and documents, their effectiveness is hampered by a dearth of domain-specific knowledge, which in turn leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities, with reference to the ontological system of geological domains, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT-adversarial training Bi-directional Long Short-Term Memory Global Pointer) is proposed to address the ambiguity, diversity, and nesting of geological entities. The model first uses the fine-tuned, word-granularity pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with the text features extracted by the BiLSTM (Bi-directional Long Short-Term Memory). An adversarial training algorithm then improves the robustness of the model and enhances its resistance to interference, and decoding is finally performed using a global association pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining the rich geological information.
Keywords: geological named entity recognition; geological report; adversarial training; confrontation training; global pointer; pre-training model
4. Entity Recognition for Unstructured Threat Intelligence Based on BERT+GP+KF
Author: 管伟. 《计算机与数字工程》 (Computer & Digital Engineering), 2025, No. 9, pp. 2551-2557 (7 pages)
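Cybersecurity reports are numerous, fast-changing, and structurally complex, and cybersecurity entities are polysemous, highly specialized, and ambiguously categorized. Automatically and efficiently extracting the required threat intelligence entities from unstructured text reports is therefore important for analyzing and using that intelligence. Manual annotation demands both a high level of cybersecurity expertise and extensive practical experience. This paper trains a cybersecurity-domain pretrained model from the base Bidirectional Encoder Representation from Transformers (BERT) model and crawled cybersecurity blogs, and uses it to obtain lexical and sentence-level features of the text. At the input side, the model also performs knowledge fusion with domain expertise; decoding is carried out with a Global Pointer (GP), and additional Word Expansion (WE) information is added for supervised training to achieve entity-level recognition. The paper uses the DNRTI dataset, publicly released by several domain experts, which makes cross-model comparison convenient, and extends and refines the dataset's annotations on that basis. Experiments with multiple models on this dataset show that the method improves markedly in both inference time and F1-score: the F1 value reaches 0.836, 5.5% higher than current mainstream models, and inference time is reduced by 13.6%. The method can quickly and effectively extract threat intelligence entities from security reports for building, sharing, and using threat intelligence.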
Keywords: threat intelligence; entity recognition; BERT model; Global Pointer; knowledge fusion