A Book Cover Text Recognition Algorithm Based on TILT, DBNet and CRNN

Abstract: Automatically recognizing text on book covers is key to acquiring metadata, but book placement angle, complex cover designs, and varying lighting conditions significantly degrade recognition accuracy. To address this, the paper proposes a multi-stage cascaded framework that integrates the DBNet detection network, an improved TILT pose correction algorithm, and a CRNN sequence model into a closed-loop "detection, correction, re-detection" pipeline. First, DBNet coarsely localizes the text regions. Next, a TILT algorithm with local low-rank optimization applies a geometric correction to all text regions in a single pass. A second DBNet detection then refines the text positions, and CRNN finally recognizes the multilingual mixed text. The double detection mechanism suppresses error propagation, while the local low-rank optimization avoids the sensitivity of global correction to background interference, improving recognition robustness in both regular and tilted scenarios. Experiments show that, compared with traditional OCR and mainstream deep learning models, the method achieves better accuracy and adaptability in complex book cover scenarios, offering an effective technical path for text extraction in library digitization management.
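The control flow of the "detection, correction, re-detection" cascade described in the abstract can be sketched as follows. This is a minimal illustration only, not the author's implementation: `CoverTextPipeline`, `toy_detect`, `toy_rectify`, and `toy_recognize` are hypothetical names, the three stages are passed in as plain callables standing in for DBNet, the locally optimized TILT step, and CRNN, and the toy stand-ins exist only so the sketch runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), axis-aligned for simplicity
Image = List[List[int]]           # toy grayscale image as nested lists

Detector = Callable[[Image], List[Box]]
Rectifier = Callable[[Image, Sequence[Box]], Image]
Recognizer = Callable[[Image, Box], str]


@dataclass
class CoverTextPipeline:
    detect: Detector        # stands in for DBNet
    rectify: Rectifier      # stands in for TILT with local low-rank optimization
    recognize: Recognizer   # stands in for CRNN

    def run(self, image: Image) -> List[Tuple[Box, str]]:
        # Stage 1: coarse detection on the raw (possibly tilted) cover.
        coarse_boxes = self.detect(image)
        if not coarse_boxes:
            return []
        # Stage 2: one geometric correction driven by the detected text regions,
        # rather than by the whole image including background.
        rectified = self.rectify(image, coarse_boxes)
        # Stage 3: second detection on the rectified image to refine positions.
        fine_boxes = self.detect(rectified)
        # Stage 4: recognize each refined region.
        return [(box, self.recognize(rectified, box)) for box in fine_boxes]


# Toy stand-ins (assumptions) so the sketch is runnable end to end.
def toy_detect(image: Image) -> List[Box]:
    height = len(image)
    width = len(image[0]) if height else 0
    has_ink = any(any(row) for row in image)
    return [(0, 0, width, height)] if has_ink else []


def toy_rectify(image: Image, boxes: Sequence[Box]) -> Image:
    return [row[:] for row in image]  # identity transform placeholder


def toy_recognize(image: Image, box: Box) -> str:
    return "TEXT"  # placeholder string, not a real recognition result


pipeline = CoverTextPipeline(toy_detect, toy_rectify, toy_recognize)
results = pipeline.run([[0, 1], [1, 0]])
```

Passing the stages in as callables keeps the sketch independent of any particular OCR toolkit; real detector, rectifier, and recognizer models would replace the toy functions while the two-pass detection order stays the same.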

Author: QIN Yan (秦燕)

Affiliation: Library of Changzhi Medical College (长治医学院图书馆)

Source: Journal of Library and Information Science (《图书情报导刊》), 2025, No. 5, pp. 27-34 (8 pages)

Keywords: deep learning; optical character recognition; neural networks; library automation; bibliographic metadata management

Classification: TP391.43 [Automation and Computer Technology: Computer Application Technology]
