
An Automatic Keyword Recognition and Annotation Algorithm for Liquor Literature Based on Convolutional Neural Networks (基于卷积神经网络的酒文献关键词自动识别与标注算法)

Cited by: 3
Abstract: Automatic keyword recognition and annotation algorithms are valuable for the automatic analysis and machine understanding of historical liquor literature. The proposed method first uses the YOLOv7 network model to detect text boxes in liquor literature, then introduces the CBAM attention mechanism to obtain features such as text box position and size, next applies the PaddleOCR algorithm to recognize keywords, and finally applies a text repair technique to refine the results. An experimental analysis system built on this detection algorithm can efficiently process massive volumes of liquor literature data and extract liquor-related text with a 90% recognition rate. It effectively handles the blurred or incomplete printing and the diverse fonts found in liquor literature, and achieved good recognition and annotation results in the experiments.
Authors: ZHANG Tao (张桃); TONG Xu (童旭); HU Longhe (胡隆河); YANG Qiang (杨强). Affiliations: Department of Artificial Intelligence and Big Data, Yibin University, Yibin, Sichuan 644000, China; School of Information Engineering, Chengdu University of Technology, Chengdu, Sichuan 610059, China
Source: Journal of Yibin University (《宜宾学院学报》), 2024, No. 6, pp. 27-32, 91 (7 pages in total)
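Funding: Open Fund Project of the Chinese Liquor History Research Center, a Key Research Base of Philosophy and Social Sciences of Sichuan Province (ZGJS2021-03); Key R&D Project of the Sichuan Science and Technology Program (2021YFG0029).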
Keywords: deep learning; convolutional neural networks; text recognition; liquor literature
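
The last stage described in the abstract, turning recognized text boxes into keyword annotations and scoring the result, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `TextBox` layout, the `LIQUOR_KEYWORDS` lexicon, the confidence threshold, and the definition of recognition rate as the fraction of reference items recognized exactly. The abstract does not specify any of these, and the sketch does not call YOLOv7, CBAM, or PaddleOCR; it only assumes that the detection and recognition stages have already produced boxes with text.

```python
from dataclasses import dataclass

# Hypothetical keyword lexicon; the paper's actual keyword list is not
# given in the abstract.
LIQUOR_KEYWORDS = ("酒", "酿", "窖")


@dataclass
class TextBox:
    """One detected text region plus its recognized text (assumed layout)."""
    x: int
    y: int
    width: int
    height: int
    text: str
    confidence: float


def annotate_keywords(boxes, keywords=LIQUOR_KEYWORDS, min_confidence=0.5):
    """Return (box, matched_keywords) pairs for boxes that contain a keyword.

    Boxes whose recognition confidence is below min_confidence are skipped.
    """
    annotated = []
    for box in boxes:
        if box.confidence < min_confidence:
            continue
        hits = [kw for kw in keywords if kw in box.text]
        if hits:
            annotated.append((box, hits))
    return annotated


def recognition_rate(predicted, reference):
    """Fraction of reference items whose prediction matches exactly.

    This is one plausible definition, assumed here for illustration.
    """
    if not reference:
        return 0.0
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)
```

For example, passing a list of `TextBox` objects to `annotate_keywords` yields only the confident boxes that mention a lexicon entry, and `recognition_rate` compares recognized strings against a manually labeled reference list of the same order.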
