Abstract
To address the high complexity and poor generalization of traditional feature extraction and classification algorithms, this paper proposes using a convolutional neural network (CNN) to extract features from images of Dongba pictograms and classify them automatically. First, the characters in the collected images are segmented manually and preprocessed by normalization, grayscale conversion, filtering and denoising, and binarization. Second, the segmented character images are grouped into 18 categories according to their shape features and labeled manually. Then, a CNN classification model is trained and evaluated on the test samples. A total of 70,000 samples were collected and divided into training, validation, and test sets in a 7:2:1 ratio. To reduce the influence of the data samples on classification accuracy, the training samples are augmented with rotation, affine, scaling, and translation transformations, and the average classification accuracy reaches 99.43%. The experimental results show that the method is accurate and fast and has high practical value.
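The data handling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed threshold of 128, the random seed, and the use of plain nested lists as images are all assumptions made for illustration, since the abstract does not state how binarization or the random split was performed. The sketch shows a 7:2:1 split of 70,000 sample indices, a simple threshold binarization, and one of the augmentations named in the abstract (translation).

```python
import random


def split_indices(n_samples, ratios=(7, 2, 1), seed=0):
    """Shuffle sample indices and split them by integer ratio parts.

    The seed and the shuffle are assumptions; the abstract only gives the ratio.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1 pixels.

    A fixed global threshold is a hypothetical choice, not the paper's method.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def translate(image, dx, dy, fill=0):
    """Shift a 2D image by (dx, dy) pixels, padding exposed pixels with fill."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


train_idx, val_idx, test_idx = split_indices(70000)
print(len(train_idx), len(val_idx), len(test_idx))  # 49000 14000 7000
```

In keeping with the abstract, an augmentation such as `translate` would be applied only to images in the training split, so that the validation and test sets remain unmodified samples.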
Authors
张桂莲
李世辉
谭贵生
张榆锋
ZHANG Gui-lian; LI Shi-hui; TAN Gui-sheng; ZHANG Yu-feng (School of Information, Lijiang Culture and Tourism College, Lijiang, Yunnan 674199, China; School of Information, Yunnan University, Kunming, Yunnan 650500, China)
Source
《计算机仿真》
2025, Issue 10, pp. 376-381 (6 pages)
Computer Simulation
Funding
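Scientific Research Fund Project of the Yunnan Provincial Department of Education (2023J1458)
Third Batch of University Young and Middle-aged Academic and Technical Reserve Talents (2024xshb06).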
Keywords
Dongba pictogram
Classification
Image processing
Convolutional neural network
Deep learning