Abstract
Extracting character strings and annotation information from engineering drawings is a key part of automated drawing processing, and it is the prerequisite and foundation for higher-level tasks such as dimension understanding and image understanding. This paper proposes a method, based on knowledge of engineering drawings, for extracting pre-segmented character strings and annotation information. It focuses on parsing, locating, and extracting strings that appear in tabular form and the annotation information attached to graphic primitives. Preprocessing preserves the logical relations between strings, and between graphic primitives and their annotations. The coordinates of each string are obtained by parsing, and the region containing the string is then rotated to horizontal and cleared of stray line segments, among other operations, to achieve the best recognition result.
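The abstract names two region-level operations, making a string's region horizontal and removing stray line segments, without giving details. The following is only a minimal sketch of what such steps could look like on vector coordinates. The function names, the two-point baseline representation, and the length threshold used to discard stray segments are all assumptions made for illustration and are not taken from the paper.

```python
import math

def baseline_angle(p0, p1):
    """Angle in radians of a text baseline running from p0 to p1 (assumed input)."""
    return math.atan2(p1[1] - p0[1], p1[0] - p0[0])

def rotate_points(points, angle, origin):
    """Rotate points by -angle about origin, so a baseline at `angle` becomes horizontal."""
    c, s = math.cos(-angle), math.sin(-angle)
    ox, oy = origin
    return [((x - ox) * c - (y - oy) * s + ox,
             (x - ox) * s + (y - oy) * c + oy) for x, y in points]

def drop_long_segments(segments, max_len):
    """Keep segments no longer than max_len.

    Assumption: long segments inside a string region are stray lines
    (for example table rules) rather than character strokes.
    """
    return [seg for seg in segments if math.dist(seg[0], seg[1]) <= max_len]
```

For example, a string whose baseline runs from (0, 0) to (3, 3) has an angle of 45 degrees, and rotating its points by that angle about (0, 0) places the end point on the x-axis. A real pipeline would more likely operate on raster regions and would need a more careful criterion than segment length alone.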
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2012, Issue 7, pp. 161-164 (4 pages)
Keywords
pre-segmented character strings
annotation information extraction
Optical Character Recognition (OCR)