汉语连续语音识别中一种新的音节间相关识别单元 (cited by: 3)

A new acoustic modeling of inter-syllable context-dependent units for Putonghua continuous speech recognition
Abstract: Capturing coarticulation in Putonghua (Mandarin) continuous speech is important for improving automatic speech recognition performance. Based on the characteristics of Putonghua, this paper proposes a new recognition unit that refines the acoustic model by taking coarticulation between neighboring syllables into account. The contextual influences between syllables are then classified using phonetic knowledge, which allows state parameters to be shared across units, reduces model complexity, and keeps the model trainable. The main difference from traditional parameter-sharing methods is that the clustering here relies entirely on phonetic knowledge, whereas traditional methods use data-driven acoustic clustering. Recognition experiments show that the phonetically classified inter-syllable context-dependent units clearly improve recognition performance: the system's first-choice error rate is reduced by 17%.
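Authors: Li Chun (李春), Wang Zuoying (王作英)
Source: Acta Acustica (《声学学报》), 2003, No. 2, pp. 187-191 (5 pages); indexed in EI, CSCD, and the Peking University Core Journals list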
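
The knowledge-based parameter sharing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phonetic classes in `INITIAL_CLASS` and `FINAL_ENDING_CLASS`, the function names, and the unit-naming scheme are all hypothetical, and the sketch ties whole units by context class as a stand-in for the state-level sharing the abstract mentions. The point is only that contexts are grouped by a fixed phonetic table rather than by data-driven clustering.

```python
from collections import defaultdict

# Hypothetical phonetic classes, for illustration only (not taken from the paper).
# Right context: the initial of the following syllable, grouped by place of articulation.
INITIAL_CLASS = {
    "b": "labial", "p": "labial", "m": "labial", "f": "labial",
    "d": "alveolar", "t": "alveolar", "n": "alveolar", "l": "alveolar",
    "g": "velar", "k": "velar", "h": "velar",
}

# Left context: the final of the preceding syllable, grouped by how it ends.
FINAL_ENDING_CLASS = {
    "a": "open", "ia": "open", "ua": "open",
    "an": "nasal_n", "en": "nasal_n", "in": "nasal_n",
    "ang": "nasal_ng", "eng": "nasal_ng", "ing": "nasal_ng",
}


def tied_unit(prev_final, syllable, next_initial):
    """Map a syllable plus its inter-syllable context to a shared unit name.

    None means there is no neighboring syllable (silence or utterance boundary).
    Contexts missing from the tables fall into a catch-all "other" class.
    """
    left = "sil" if prev_final is None else FINAL_ENDING_CLASS.get(prev_final, "other")
    right = "sil" if next_initial is None else INITIAL_CLASS.get(next_initial, "other")
    return f"{left}-{syllable}+{right}"


def group_contexts(contexts):
    """Group (prev_final, syllable, next_initial) triples that share one unit."""
    groups = defaultdict(list)
    for ctx in contexts:
        groups[tied_unit(*ctx)].append(ctx)
    return dict(groups)


if __name__ == "__main__":
    contexts = [
        ("an", "ma", "b"),
        ("en", "ma", "p"),
        ("ang", "ma", "d"),
        (None, "ma", "t"),
    ]
    for unit, members in group_contexts(contexts).items():
        print(unit, members)
```

In this toy example the first two contexts fall into the same class pair and would share parameters, so four distinct contexts are covered by three units. Because the grouping is fixed in advance, contexts that never occur in the training data still map to a trainable unit.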


