Modifying SwinTextSpotter for Vietnamese Scene Text Spotting

Abstract: End-to-end scene text spotting, which jointly localizes and recognizes text in natural images, has advanced significantly for Chinese and English. However, Vietnamese text spotting remains challenging due to persistent diacritic recognition failures and missed detections. To bridge this gap, we propose a diacritic-focused Vietnamese text spotting framework that mitigates background interference. Specifically, we propose the DDCM to capture fine-grained diacritical features by adapting to the structural characteristics of Vietnamese characters. During the detection phase, we propose the Global Feature Fusion Module to help the model more accurately understand the relationship between local details and global context for each region of interest. During the recognition phase, we design the Cross Channel Attention Module to capture spatial relationships while discriminating subtle diacritic variations through channel-wise recalibration. Extensive experiments demonstrate that our framework improves recognition accuracy over several state-of-the-art methods on Vietnamese scene text benchmarks. The code is available at https://github.com/mlmmwym/FCVintextSpotter.

Authors: Yimin Wen, Wenhui Huang, Ruiqi Tian, Liyu Jiang, Lianxi Wang, Vinh Loc Cu

Affiliation: School of Electronics and Information Engineering

Source: Data Intelligence, 2026, Issue 1, pp. 244-267 (24 pages)

Funding: Partially supported by the National Natural Science Foundation of China (62366011), the Natural Science Foundation of Guangxi District (2024GXNSFDA010066), and the Key R&D Program of Guangxi (AB21220023).

Keywords: Vietnamese characters; Diacritics; Scene text spotting

Classification code: TP391.41 [Automation and Computer Technology]
