Abstract
General-domain named entity recognition (NER) methods struggle to identify specialized terms and other security entities in the cybersecurity domain, and their insufficient feature extraction leads to low recognition accuracy. To address these problems, this paper proposes a Bi-LSTM-CRF model fused with a Res-Inception Network (Res-Inception Bi-LSTM-CRF, RIBIC). The Res-Inception Network extracts multi-granularity features to capture richer feature information. In addition, the authors build their own cybersecurity domain dictionary and apply a dictionary-matching correction algorithm to further improve recognition accuracy. Experiments on two public threat intelligence datasets yield F1 scores of 94.09% and 83.91%, which are 15.02% and 15.72% higher than the baseline models, respectively, demonstrating the effectiveness of the proposed method for NER in the threat intelligence domain.
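The abstract does not describe how the dictionary-matching correction step works, so the following is only a minimal sketch of one plausible form of it: forward maximum matching over the token sequence, overwriting the model's BIO tags wherever a span matches a dictionary entry. The function name `correct_with_dictionary`, the `max_len` parameter, the `TOOL` label, and the sample lexicon are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Lexicon = Dict[Tuple[str, ...], str]  # lower-cased token tuple -> entity type


def correct_with_dictionary(
    tokens: List[str], tags: List[str], lexicon: Lexicon, max_len: int = 4
) -> List[str]:
    """Overwrite predicted BIO tags where a token span matches the lexicon.

    Hypothetical illustration only; the paper's actual algorithm may differ.
    """
    corrected = list(tags)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first (forward maximum matching).
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            label = lexicon.get(key)
            if label is not None:
                corrected[i] = "B-" + label
                for j in range(i + 1, i + n):
                    corrected[j] = "I-" + label
                i += n  # skip past the matched span
                matched = True
                break
        if not matched:
            i += 1
    return corrected


# Assumed sample data: the model tagged only the first token of the entity.
tokens = ["The", "Cobalt", "Strike", "beacon", "was", "observed"]
predicted = ["O", "B-TOOL", "O", "O", "O", "O"]
lexicon = {("cobalt", "strike"): "TOOL"}
result = correct_with_dictionary(tokens, predicted, lexicon)
```

In this sketch the dictionary always wins over the model's prediction inside a matched span. A real system might instead consult the dictionary only when the model's confidence is low.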
Authors
ZENG Wen-li (曾文丽)
CHEN Ji-xin (陈继鑫)
Affiliations: School of Computer Science and Engineering, Sichuan University of Science & Engineering, Yibin 644000, China; The Key Laboratory of Higher Education of Sichuan Province for Enterprise Informationalization and Internet of Things, Yibin 644000, China
Source
Computer & Telecommunication (《电脑与电信》)
2025, Issue 4, pp. 30-37 (8 pages)
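Funding
The Key Laboratory of Higher Education of Sichuan Province for Enterprise Informationalization and Internet of Things, project No. 2022WYJ03.
Sichuan University of Science & Engineering 2023 university-level teaching reform research project "Exploration and Practice of Comprehensive Cybersecurity Experimental Teaching Reform in the Context of Industry-Education Integration", project No. JG-2307.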
Keywords
Cyber Threat Intelligence
Named Entity Recognition
Res-Inception Network