Text Digital Image OCR Accuracy Measurement Experiment and Improvement (cited by: 11)
Abstract: Using Reshelp and Burney, two British Library projects that digitize historic English newspapers, the author runs an experiment to measure the OCR accuracy of text-type digital images. The results show that overall accuracy is not high and that it falls in this order, from highest to lowest: characters, words, significant words, and significant words that begin with a capital letter. The author then divides the OCR cycle into four stages: acquiring the objects to be scanned, producing the digital images, processing the digital images, and recognizing the text. For each stage, the paper analyzes the factors that affect accuracy and discusses specific measures for improving it.
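Author: Zang Guoquan (臧国全)
Source: Documentation, Information & Knowledge (《图书情报知识》), CSSCI, Peking University Core Journal, 2010, No. 3, pp. 62-67 (6 pages)
Funding: Supported by the Henan Province Program for Science and Technology Innovation Talents in Universities (2008-551)
Keywords: OCR recognition; accuracy measurement; information resource digitization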
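
The abstract reports accuracy at four levels (characters, words, significant words, and significant words beginning with a capital letter) but this record does not give the formulas. The following is a minimal sketch of how such measures could be computed, not the author's procedure. It assumes that character accuracy is 1 minus the Levenshtein distance divided by the length of the ground-truth text, that word-level accuracy is the share of ground-truth words that survive unchanged in an alignment with the OCR output, and that a "significant word" is any word outside a small stop-word list. The function names, the `STOP_WORDS` set, and the sample sentences are all hypothetical.

```python
import difflib
import string

# Assumption: a tiny illustrative stop-word list; a real study would define its own.
STOP_WORDS = frozenset({"the", "a", "an", "of", "and", "in", "to", "on", "at", "is", "was"})


def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def char_accuracy(truth, ocr):
    """Assumed definition: 1 - edit distance / ground-truth length, floored at 0."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - levenshtein(truth, ocr) / len(truth))


def is_significant(word):
    return word.strip(string.punctuation).lower() not in STOP_WORDS


def word_accuracies(truth, ocr):
    """Share of ground-truth words reproduced exactly, per word category."""
    t_words, o_words = truth.split(), ocr.split()
    matcher = difflib.SequenceMatcher(a=t_words, b=o_words, autojunk=False)
    matched = set()
    for block in matcher.get_matching_blocks():
        matched.update(range(block.a, block.a + block.size))

    def rate(predicate):
        idx = [i for i, w in enumerate(t_words) if predicate(w)]
        if not idx:
            return None  # category is empty in the ground truth
        return sum(i in matched for i in idx) / len(idx)

    return {
        "word": rate(lambda w: True),
        "significant": rate(is_significant),
        "capitalized_significant": rate(lambda w: is_significant(w) and w[:1].isupper()),
    }


if __name__ == "__main__":
    truth = "The Times of London reported a fire"   # hypothetical ground truth
    ocr = "The Times of Loudon reportcd a fire"     # hypothetical OCR output
    print(char_accuracy(truth, ocr))
    print(word_accuracies(truth, ocr))
```

The word-level alignment here uses `difflib.SequenceMatcher`, which finds matching blocks heuristically rather than a guaranteed minimum-edit alignment. That is usually adequate for a short sketch, but a stricter study might align words with the same edit-distance routine used for characters.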
References (5)

  • 1. Schantz, Herbert F. The History of OCR, Optical Character Recognition. Recognition Technologies, 1982 (02): 78-81.
  • 2. British Library. 19th Century British Library Newspapers Database. [2009-06-10]. http://www.bl.uk/reshelp/findhelprestype/news/newspdigproj/database/index.html.
  • 3. JISC. The Burney Collection. [2009-07-11]. www.jisc-collections.ac.uk/burney.
  • 4. Eric. Free Diff Tool: SourceGear DiffMerge. [2009-06-29]. http://www.ericsink.com/entries/DiffMerge.html.
  • 5. Michael Gilleland. Levenshtein Distance. [2009-07-01]. http://www.merriampark.com/Id.htm.

