Abstract
This paper classifies document images using the structured local edge pattern (SLEP) feature. Because SLEP precisely describes the spatial distribution of edge directions within a local neighborhood, a classifier trained on it discriminates strongly between document image genres. Compared with methods based on complex structural layout features of the image, or on features from an optical character recognition (OCR) system, the SLEP-based method is simpler and more effective. In the experiments, a document image database was assembled and a support vector machine (SVM) was used as the classifier to distinguish four document image types: academic papers (paper), photographs (photo), tables (table), and slides (slide). The results show that the SLEP-based method clearly outperforms the compared methods in precision, recall, and other measures, and that it still performs well on low-resolution document images.
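The abstract does not give the exact definition of SLEP, so the following is only a simplified illustration of the general idea it describes: encoding how edge directions are distributed in the neighborhood of each edge pixel. The function names, the 4-bin direction quantization, the 3x3 neighborhood, and the gradient threshold are all assumptions made for this sketch, not the authors' method.

```python
import math


def edge_direction_map(img, n_bins=4, threshold=1.0):
    """Quantize the gradient orientation of each interior pixel.

    img is a list of rows of grayscale values. Pixels whose gradient
    magnitude is below `threshold` are marked -1 (not an edge).
    All parameters are hypothetical choices for illustration.
    """
    h, w = len(img), len(img[0])
    dirs = [[-1] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # central difference in x
            gy = img[y + 1][x] - img[y - 1][x]  # central difference in y
            if math.hypot(gx, gy) < threshold:
                continue
            # Orientation in [0, pi): opposite gradients share one bin.
            angle = math.atan2(gy, gx) % math.pi
            dirs[y][x] = int(round(angle / math.pi * n_bins)) % n_bins
    return dirs


def neighborhood_edge_histogram(img, n_bins=4, threshold=1.0):
    """Histogram of (center direction, neighbor direction) pairs.

    For every edge pixel, count the direction bins of the edge pixels
    in its 3x3 neighborhood. The normalized counts form a feature
    vector of length n_bins * n_bins.
    """
    dirs = edge_direction_map(img, n_bins, threshold)
    h, w = len(dirs), len(dirs[0])
    hist = [0.0] * (n_bins * n_bins)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = dirs[y][x]
            if center < 0:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    neighbor = dirs[y + dy][x + dx]
                    if neighbor >= 0:
                        hist[center * n_bins + neighbor] += 1
    total = sum(hist)
    return [v / total for v in hist] if total else hist


# A 6x6 image with a vertical step edge between columns 2 and 3.
vertical_step = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
feature = neighborhood_edge_histogram(vertical_step)
```

In a pipeline like the one the abstract outlines, a feature vector of this kind would be computed for each document image and passed to a classifier such as an SVM, with one label per genre (paper, photo, table, slide).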
Source
Journal of Xiamen University: Natural Science
CAS
CSCD
Peking University Core Journals
2013, No. 3, pp. 349-355 (7 pages)
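Funding
National Natural Science Foundation of China (60873179, 61202143)
Specialized Research Fund for the Doctoral Program, Ministry of Education of China (20090121110032)
National Science Council, Executive Yuan, Taiwan (NSC 100-2221-E-155-086)
Natural Science Foundation of Fujian Province (2011J01367)
Shenzhen Science and Technology Research Fund (JC200903180630A, ZYB200907110169A)
Shenzhen Special Fund for the Development of Strategic Emerging Industries (JCYJ20120614164600201)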
Keywords
genre identification
image processing
structured local edge pattern
pattern classification