Double Gaussian GMM Based Feature Normalization and Its Application in Speech Recognition (cited by: 4)

Abstract: An experimental analysis of the probability distributions of speech feature parameters shows that, under the influence of noise, the features usually follow a bimodal distribution. Based on this observation, this paper proposes a new feature normalization method based on a two-Gaussian Gaussian mixture model (GMM) to improve the robustness of speech recognition systems. The method uses the finer-grained two-Gaussian model to represent the cumulative distribution function (CDF) of the features and, using the estimated CDF, transforms the features so that their distributions in both training and recognition are normalized to a standard Gaussian, thereby improving recognition accuracy. Experimental results on the Aurora 2 and Aurora 3 databases show that the proposed method performs markedly better than conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN), and is roughly comparable to histogram equalization, a non-parametric feature normalization method.
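Source: Acta Automatica Sinica (《自动化学报》; indexed in EI, CSCD, Peking University Core Journals), 2006, No. 4, pp. 519-525 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (60275038) and the National High Technology Research and Development Program of China (863 Program) (2004AA114030)
Keywords: speech recognition; front-end; noise robustness; speech feature normalization; histogram equalization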
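
The normalization described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single scalar feature stream, fits the two-Gaussian mixture with plain EM, and uses hypothetical function names (`fit_two_gaussian_gmm`, `gmm_cdf`, `normalize_features`). The quartile initialization, the iteration count, and the clipping constant `eps` are likewise illustrative assumptions.

```python
import math
from statistics import NormalDist


def fit_two_gaussian_gmm(xs, iters=50):
    """Fit a 1-D two-component GMM with EM; returns (weights, means, stds)."""
    n = len(xs)
    ordered = sorted(xs)
    # Assumption: start the two means at the lower and upper quartiles.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    mu = sum(xs) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in xs) / n) or 1.0
    stds = [sd, sd]
    weights = [0.5, 0.5]
    for _ in range(iters):
        comps = [NormalDist(m, s) for m, s in zip(means, stds)]
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            p = [w * c.pdf(x) for w, c in zip(weights, comps)]
            total = sum(p) or 1e-300  # guard against underflow
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean, and standard deviation.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-8:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)  # floor to avoid collapse
    return weights, means, stds


def gmm_cdf(x, weights, means, stds):
    """CDF of the mixture: weighted sum of the component Gaussian CDFs."""
    return sum(w * NormalDist(m, s).cdf(x)
               for w, m, s in zip(weights, means, stds))


def normalize_features(xs, weights, means, stds, eps=1e-6):
    """Map each value through the GMM CDF, then the inverse standard-normal CDF."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = []
    for x in xs:
        u = gmm_cdf(x, weights, means, stds)
        u = min(max(u, eps), 1.0 - eps)  # keep u strictly inside (0, 1)
        out.append(std_normal.inv_cdf(u))
    return out
```

In this sketch the same procedure would be applied to each feature dimension separately, for both training and test data, so that both end up approximately standard Gaussian; whether the paper fits the mixture per utterance or over a larger window is not stated in the abstract.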