Research and Application of a Hierarchical Clustering Algorithm in Continuous Mandarin Speech Recognition (Cited by: 2)

A Hierarchical Clustering Algorithm in Continuous Mandarin Speech Recognition


Abstract: Based on the monosyllabic structure of Mandarin speech and taking inter-syllable coarticulation into account, this paper proposes a hierarchical clustering method for triphone models. Compared with conventional decision-tree-based state tying, the method clusters rare triphone models, which makes fuller use of the training data and lessens the influence of rare triphones on state tying, thereby improving the robustness of the acoustic model. Experiments on a large-vocabulary continuous Mandarin speech recognition system show that the method substantially reduces the number of models and parameters while also improving recognition accuracy: the error rate falls by 4.93% relative to a conventional decision-tree state-tying system.
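The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of pooling rare triphones before state tying, and should not be read as the authors' algorithm. It assumes triphones written in the common `left-centre+right` notation. The count threshold `min_count`, the back-off rule (rare triphones sharing a centre phone and right context are pooled), the function names, and the sample counts are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def pool_rare_triphones(counts, min_count):
    """Map each triphone 'l-c+r' to the name of the model it will share.

    Triphones seen at least `min_count` times keep their own model.
    Rarer ones are pooled with other rare triphones that have the same
    centre phone and right context (a hypothetical back-off rule).
    """
    mapping = {}
    for tri, n in counts.items():
        if n >= min_count:
            mapping[tri] = tri
        else:
            _left, rest = tri.split("-", 1)
            mapping[tri] = "*-" + rest  # wildcard left context
    return mapping


def pooled_counts(counts, mapping):
    """Sum the training counts of all triphones that share a model."""
    totals = defaultdict(int)
    for tri, n in counts.items():
        totals[mapping[tri]] += n
    return dict(totals)


# Made-up occurrence counts, for illustration only.
counts = {"b-a+n": 120, "p-a+n": 3, "m-a+n": 2, "d-a+ng": 1}
mapping = pool_rare_triphones(counts, min_count=10)
totals = pooled_counts(counts, mapping)
```

In this toy setup the two rare triphones `p-a+n` and `m-a+n` end up sharing one pooled model, so a later state-tying stage would see fewer, better-populated models. Whether the paper pools by context in this way is not stated in the abstract.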

Authors: 徐向华 (Xu Xianghua), 朱杰 (Zhu Jie), 郭强 (Guo Qiang)

Affiliation: Department of Electronic Engineering, Shanghai Jiao Tong University

Source: 《信号处理》 (Journal of Signal Processing), CSCD, 2004, No. 5, pp. 497-500 (4 pages)

Funding: Supported by a key fund project of the Shanghai Municipal Science and Technology Commission (01JC14033)

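Keywords: state clustering (state-tying); decision tree; training data; clustering algorithm; triphone; robustness; clustering method; continuous Mandarin speech recognition; coarticulation; error rate; continuous speech recognition; model-based clustering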

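Classification: TP391.43 [Automation and Computer Technology: Computer Application Technology]; TN912.34 [Electronics and Telecommunications: Communication and Information Systems]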





