This paper proposes a new two-phase approach to robust text detection that integrates visual appearance with geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform (RSWT) feature is proposed to better recover the stroke width by additionally considering the crossing of two strokes and the continuity of the letter border. In the second phase, a classification scheme based on visual appearance features is used to reject false alarms while preserving the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning (DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective on multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over state-of-the-art approaches, with a performance gain of up to 4.4% in terms of F-measure.
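Both abstracts in this listing report results as an F-measure. As a reminder of what that number combines, here is a minimal sketch of the standard F-measure computed from precision and recall. It assumes precision and recall have already been obtained; the function name `f_measure`, the `beta` parameter, and the sample values are illustrative and are not taken from either paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 weights precision and recall equally (the common F1 score).
    Illustrative helper, not code from the papers listed here.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the call.
    print(f_measure(0.9, 0.6))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F-measure by raising recall alone, which is consistent with the two-phase design of raising recall first and then rejecting false alarms.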
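To address the practical problem of quickly finding printed Chinese text images among large batches of image files for OCR (Optical Character Recognition), this paper builds on the stroke width transform (SWT) algorithm, designs heuristic rules tailored to the inherent characteristics of Chinese text, and combines horizontal projection with the discrete Fourier transform to propose a technique for identifying printed Chinese text image files with skew angles between -90° and 90°. Experimental results show that, on a test set of 1606 image files, the algorithm achieves an overall F-measure of 0.95 for text image file identification, with an average identification time of 0.65 s.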
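The abstract above mentions combining horizontal projection with the discrete Fourier transform but does not say how. One plausible reading, sketched below under that assumption, is that evenly spaced text lines make the row-wise projection profile periodic, so the profile's spectrum concentrates in one frequency bin. The function names `horizontal_projection` and `line_periodicity`, and the synthetic test images, are hypothetical and are not the paper's method.

```python
import numpy as np


def horizontal_projection(binary_img) -> np.ndarray:
    """Count foreground pixels in each row (one value per image row)."""
    return np.asarray(binary_img, dtype=float).sum(axis=1)


def line_periodicity(profile: np.ndarray):
    """Return (score, period_rows) for a projection profile.

    score: fraction of non-DC spectral energy in the strongest frequency bin.
    period_rows: line spacing in rows implied by that bin (None if flat).
    """
    centered = profile - profile.mean()          # remove the DC component
    power = np.abs(np.fft.rfft(centered)) ** 2   # one-sided power spectrum
    energy = power[1:].sum()
    if energy == 0:
        return 0.0, None                         # flat profile, no structure
    k = int(np.argmax(power[1:])) + 1            # strongest non-DC bin
    return float(power[k] / energy), len(profile) / k


if __name__ == "__main__":
    rows = np.arange(64)
    # Synthetic "text page": 4 rows of ink, then 4 blank rows, repeated.
    stripes = np.repeat(((rows % 8) < 4)[:, None], 64, axis=1)
    # Synthetic non-text image: random binary noise.
    noise = np.random.default_rng(0).random((64, 64)) > 0.5

    print(line_periodicity(horizontal_projection(stripes)))
    print(line_periodicity(horizontal_projection(noise)))
```

A skewed page weakens this periodicity because ink from adjacent lines smears across rows. One way to cover a range of skew angles would be to repeat the measurement over rotated copies of the image and keep the angle with the highest score, though whether the paper does this is not stated in the abstract.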
Funding: supported by the National Natural Science Foundation of China (Nos. 61300163, 61125106 and 61300162) and the Jiangsu Key Laboratory of Big Data Analysis Technology.