
Multi-label Classification of Digital Archives Based on ALBERT-Seq2Seq-Attention Model
Abstract: To address the lack of correlation between classification labels in existing multi-label classification methods for digital archives, a deep neural network model for archive multi-label classification, ALBERT-Seq2Seq-Attention, is proposed. The model uses the multi-layer bidirectional Transformer structure inside the ALBERT (A Lite BERT) pre-trained language model to extract text feature vectors and capture contextual semantic information. The text features extracted by pre-training are then used as the input sequence of the Seq2Seq-Attention (Sequence to Sequence with Attention) model, and a label dictionary is constructed to capture the correlations among labels. Comparative experiments on three datasets show that the model achieves an F1 score above 90% on each of them. The model not only improves the multi-label classification of archive texts but also attends to the correlations between labels.
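The paper does not publish code; the following PyTorch sketch only illustrates the pipeline the abstract describes, under stated assumptions: the HuggingFace albert-base-v2 checkpoint as the ALBERT encoder, a GRU-based decoder with additive attention, and a label dictionary with assumed <sos>/<eos> entries. All names and hyperparameters are illustrative, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): an ALBERT encoder feeding a GRU decoder
# with additive attention that emits labels one at a time, so earlier labels
# condition later ones (the label-correlation idea described in the abstract).
import torch
import torch.nn as nn
from transformers import AlbertModel


class ALBERTSeq2SeqAttention(nn.Module):
    def __init__(self, num_labels: int, hidden: int = 768, albert_name: str = "albert-base-v2"):
        super().__init__()
        self.encoder = AlbertModel.from_pretrained(albert_name)   # ALBERT feature extractor
        # Label dictionary: one embedding per label plus <sos>/<eos> markers (assumed layout).
        self.label_emb = nn.Embedding(num_labels + 2, hidden)
        self.decoder = nn.GRUCell(hidden * 2, hidden)              # input = [label embedding ; context]
        self.attn = nn.Linear(hidden * 2, 1)                       # additive attention score
        self.out = nn.Linear(hidden, num_labels + 2)

    def forward(self, input_ids, attention_mask, label_seq):
        # Contextual token vectors from ALBERT's multi-layer bidirectional Transformer.
        enc = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h = enc[:, 0]                                               # initialise decoder state from [CLS]
        logits = []
        for t in range(label_seq.size(1)):                          # teacher forcing over the label sequence
            # Score every token position against the current decoder state.
            scores = self.attn(torch.cat([enc, h.unsqueeze(1).expand_as(enc)], dim=-1)).squeeze(-1)
            scores = scores.masked_fill(attention_mask == 0, float("-inf"))
            context = (torch.softmax(scores, dim=-1).unsqueeze(-1) * enc).sum(dim=1)
            h = self.decoder(torch.cat([self.label_emb(label_seq[:, t]), context], dim=-1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                           # (batch, label_seq_len, num_labels + 2)
```

In such a setup, training would minimise cross-entropy over the generated label sequence, and inference would decode labels greedily until the <eos> marker, which is how a sequence-generation decoder can model dependencies between labels.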
Authors: WANG Shaoyang; CHENG Xinmin; WANG Runqin; CHEN Jingwen; ZHOU Yang; FEI Zhigao (School of Information Engineering, Huzhou University, Huzhou 313000, China)
Source: Journal of Huzhou University, 2024, No. 2, pp. 65-72 (8 pages)
Funding: National Natural Science Foundation of China (62277016); Huzhou University Postgraduate Research and Innovation Project (2022KYCX45)
Keywords: ALBERT; Seq2Seq; Attention; multi-label classification; digital archives