Multi-label classification method integrating external semantic knowledge
Abstract Text classification is an important task in Natural Language Processing (NLP), and its multi-label variant is difficult because the label space is large. To address this problem, a multi-label classification method integrating external semantic knowledge, named HSGIN (Heterogeneous Semantic Gated Interaction Network), is proposed, using values markers in children's books as a case study. First, text features are extracted with SBERT (sentence embeddings from Siamese BERT, Bidirectional Encoder Representations from Transformers) and a Bidirectional Long Short-Term Memory (Bi-LSTM) network. Second, entities and relations in the Knowledge Graph (KG) are modeled jointly with a Heterogeneous Graph Transformer (HGT), and label features are extracted from the prior knowledge and semantic associations. Finally, an attention mechanism fuses the text features with the label features to produce a distinct feature representation for each label, and these representations are fed into a Gated Graph Neural Network (GGNN) that captures semantic dependencies and interaction patterns among labels before prediction. Experimental results show that, compared with BERT, the strongest of the comparison methods, the proposed method improves precision, recall, and F1 score by 2.66, 0.47, and 1.16 percentage points, respectively. These results verify the effectiveness of the proposed method. In addition, precise analysis of values markers in children's books helps in choosing healthy books for children.
Authors YANG Jincai; BAN Qixu; YANG Xusheng; SHEN Xianjun (School of Computer Science, Central China Normal University, Wuhan, Hubei 430079, China)
Source Journal of Computer Applications (Peking University core journal list), 2025, No. 12, pp. 3757-3763, 7 pages
Funding National Natural Science Foundation of China (61977032); National Social Science Fund of China (19BYY092).
Keywords Multi-Label Text Classification (MLTC); Knowledge Graph (KG); Heterogeneous Graph Transformer (HGT) architecture; Gated Graph Neural Network (GGNN); label correlation
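
The abstract states that text features and label features are fused by attention to give a distinct representation for each label, but it does not give the formula. The following is a minimal sketch of one common way to do this (scaled dot-product attention with one query per label), not the authors' implementation. The function name `label_attention`, the toy vectors, and the choice of dot-product scoring are all assumptions made for illustration; the SBERT, Bi-LSTM, HGT, and GGNN stages are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_feats, label_feats):
    """Fuse token features with label features (hypothetical helper).

    token_feats: T vectors of size d (stand-ins for text encoder outputs).
    label_feats: K vectors of size d (stand-ins for graph-derived label features).
    Returns (fused, weights): fused is K vectors of size d, one text
    representation per label; weights is K attention distributions over tokens.
    """
    d = len(token_feats[0])
    fused, weights = [], []
    for label in label_feats:
        # Scaled dot product between this label and every token.
        scores = [
            sum(l * h for l, h in zip(label, token)) / math.sqrt(d)
            for token in token_feats
        ]
        attn = softmax(scores)
        # Attention-weighted sum of token vectors.
        rep = [
            sum(a * token[j] for a, token in zip(attn, token_feats))
            for j in range(d)
        ]
        fused.append(rep)
        weights.append(attn)
    return fused, weights


# Toy inputs (assumed values): 3 tokens, 2 labels, dimension 2.
token_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label_feats = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = label_attention(token_feats, label_feats)
```

In this toy case the first label attends more strongly to tokens with a large first component, so each label ends up with its own view of the same text. In a pipeline like the one the abstract describes, these per-label vectors would be the node inputs to the label graph network.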