Abstract
This paper analyzes and compares existing italic detection methods in detail and proposes a normalized comparison method with two-level (vertical and horizontal) weighting. By amplifying the feature differences between upright and italic character images, the method quickly detects Chinese italic characters scattered throughout a document. Three test collections (character, string, and document) are used to evaluate the proposed method and to compare it experimentally with the other italic detection methods. The experimental results show that the proposed method performs comparatively well and can meet the needs of practical applications.
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journal
2007, Issue 6, pp. 855-860 (6 pages)
Pattern Recognition and Artificial Intelligence
Funding
National Natural Science Foundation of China (No. 60472066, 60602031)
Keywords
Detection of Scattered Chinese Italic Characters
Weighted Normalized Comparison
Chinese Character String Collection