Double Gaussian GMM Based Feature Normalization and Its Application in Speech Recognition (cited by: 4)

Abstract: Experimental analysis of the probability distributions of feature parameters shows that, under the influence of noise, the features usually follow bimodal distributions. Based on this observation, this paper proposes a new feature normalization method based on a double-Gaussian Gaussian mixture model (GMM) to improve the robustness of speech recognition systems. The method uses a finer-grained two-Gaussian model to represent the cumulative distribution function (CDF) of each feature parameter and, using the estimated CDF, transforms the features so that their distributions in both training and recognition are normalized to a standard Gaussian, which improves recognition accuracy. Experiments on the Aurora 2 and Aurora 3 databases show that the proposed method performs markedly better than conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN), and is roughly on par with histogram equalization, a non-parametric feature normalization method.
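The transformation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the EM initialization, the iteration count, the variance floor, the clamping constant `eps`, and all function names are assumptions made for illustration, and the data are synthetic. For one feature dimension, a two-component GMM is fitted, its CDF F(x) = w1·Φ((x − μ1)/σ1) + w2·Φ((x − μ2)/σ2) is evaluated, and each value is mapped to y = Φ⁻¹(F(x)), so that the output is approximately standard Gaussian.

```python
import math
import random
from statistics import NormalDist


def fit_two_gaussians(xs, iters=100, var_floor=1e-6):
    """Fit a 1-D mixture of two Gaussians with plain EM (illustrative only)."""
    n = len(xs)
    ordered = sorted(xs)
    lower, upper = ordered[: n // 2], ordered[n // 2:]
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)
    # Assumed initialization: means from the lower/upper halves, shared variance.
    weights = [0.5, 0.5]
    means = [sum(lower) / len(lower), sum(upper) / len(upper)]
    sigmas = [math.sqrt(var_all), math.sqrt(var_all)]
    for _ in range(iters):
        comps = [NormalDist(mu, sd) for mu, sd in zip(means, sigmas)]
        # E-step: posterior probability that each sample belongs to component 0.
        resp0 = []
        for x in xs:
            p0 = weights[0] * comps[0].pdf(x)
            p1 = weights[1] * comps[1].pdf(x)
            total = p0 + p1
            resp0.append(p0 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and standard deviation per component.
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - v for v in resp0]
            nk = max(sum(r), 1e-12)
            means[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, xs)) / nk
            sigmas[k] = math.sqrt(max(var, var_floor))
            weights[k] = nk / n
    return weights, means, sigmas


def mixture_cdf(x, weights, means, sigmas):
    """CDF of the fitted mixture: weighted sum of the component Gaussian CDFs."""
    return sum(w * NormalDist(mu, sd).cdf(x)
               for w, mu, sd in zip(weights, means, sigmas))


def normalize(xs, weights, means, sigmas, eps=1e-6):
    """Map each value through the mixture CDF, then the inverse standard-normal CDF."""
    std = NormalDist()
    out = []
    for x in xs:
        u = min(max(mixture_cdf(x, weights, means, sigmas), eps), 1.0 - eps)
        out.append(std.inv_cdf(u))
    return out


# Synthetic bimodal "feature" values standing in for one cepstral coefficient.
rng = random.Random(0)
feats = ([rng.gauss(-2.0, 0.5) for _ in range(600)]
         + [rng.gauss(3.0, 1.0) for _ in range(400)])
w, m, s = fit_two_gaussians(feats)
normed = normalize(feats, w, m, s)
norm_mean = sum(normed) / len(normed)
norm_var = sum((y - norm_mean) ** 2 for y in normed) / len(normed)
print("fitted means:", m)
print("normalized mean and variance:", norm_mean, norm_var)
```

In a recognizer following the abstract's description, the same mapping would be applied to each feature dimension in both training and recognition, so that both sides see features with approximately the same standard Gaussian distribution. How the mixture is estimated in practice (per utterance, per speaker, or otherwise) is not stated in this record and is left open here.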

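Authors: Liu Bo, Dai Lirong, Wang Renhua, Du Jun, Li Jinyu

Affiliation: Department of Electronic Engineering and Information Science, University of Science and Technology of China

Source: Acta Automatica Sinica (indexed in EI, CSCD, and the Peking University Core Journals list), 2006, No. 4, pp. 519-525 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (60275038) and the National High Technology Research and Development Program of China (863 Program) (2004AA114030)

Keywords: speech recognition, front-end, noise robustness, speech feature normalization, histogram equalization

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]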

Citation network

References (11)

1 de la Torre A, Segura J C, Benitez C. Non-linear transformations of the feature space for robust speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2002. Piscataway, USA: IEEE Press, 2002. 401-404
2 Hilger F, Ney H. Quantile based histogram equalization for noise robust speech recognition. In: Proceedings of European Conference on Speech Communication and Technology 2001. Aalborg, Denmark: ISCA, 2001. 1135-1138
3 Hilger F, Molau S, Ney H. Quantile based histogram equalization for online application. In: Proceedings of International Conference of Spoken Language Processing 2002. Rundle Mall, Australia: Causal Productions, 2002. 237-240
4 Hilger F, Molau S, Ney H. Evaluation of quantile based histogram equalization with filter combination on the Aurora 3 and 4 Databases. In: Proceedings of European Conference on Speech Communication and Technology. Grenoble, France: ISCA, 2003. 341-344
5 Segura J C, Benitez M C, de la Torre A, Rubio A J. Feature extraction combining spectral noise reduction and cepstral histogram equalization for robust ASR. In: Proceedings of International Conference of Spoken Language Processing 2002. Rundle Mall, Australia: Causal Productions, 2002. 225-228
6 Segura J C, Benitez M C, de la Torre A. VTS residual noise compensation. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2002. Piscataway, USA: IEEE Press, 2002. 409-412
7 Molau S, Hilger F, Keysers D, Ney H. Enhanced histogram normalization in the acoustic feature space. In: Proceedings of International Conference of Spoken Language Processing 2002. Rundle Mall, Australia: Causal Productions, 2002. 1421-1424
8 Molau S, Hilger F, Ney H. Feature space normalization in adverse acoustic conditions. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2003. Piscataway, USA: IEEE Press, 2003. 656-659
9 Dharanipragada S, Padmanabhan M. A nonlinear unsupervised adaptation technique for speech recognition. In: Proceedings of International Conference of Spoken Language Processing 2000. Beijing, China: Institute of Acoustics, 2000. 556-559
10 Dempster A P, Laird N M, Rubin D B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 1977, 39(1): 1-38

Co-cited literature (30)

1 R. C. Gonzalez, R. E. Woods. Digital Image Processing [M], New Jersey, Prentice-Hall, 2002.
2 O. Viikki, K. Laurila. Cepstral Domain Segmental Feature Vector Normalization for Noise Robust Speech Recognition [J]. Speech Communication, 1998, 1(25): 133-147.
3 Hilger F, Molau S, Ney H. Quantile based histogram equalization for online application. Proceedings of International Conference of Spoken Language Processing, Rundle Mall, Australia, Causal Productions, 2002, 237-240.
4 Segura J C, Benitez M C, de la Torre A, Rubio A J. Feature extraction combining spectral noise reduction and cepstral histogram equalization for robust ASR [J]. Proceedings of International Conference of Spoken Language Processing 2002, Rundle Mall, Australia, Causal Productions, 2002, 225-228.
5 Segura J C, Benitez M C, de la Torre A. VTS residual noise compensation [J]. Proceedings of International Conference on Acoustics and Signal Processing 2002. Piscataway, USA, IEEE Press, 2002, 209-212.
6 J. C. Segura, C. Benitez, A. de la Torre, A. J. Rubio, J. Ramírez. Cepstral Domain Segmental Nonlinear Feature Transformations for Robust Speech Recognition [J]. IEEE Signal Processing Letters, 2004, 5(11): 517-520.
7 Young S, Evermann G, Hain T et al. The HTK Book (for HTK Version 3.2.1). 2002, http://htk.eng.cam.ac.uk.
8 H. Y. Jun. Filtering of Filter-Bank Energies for Robust Speech Recognition [J]. ETRI, 3(26), 2004, 273-276.
9 HOU Jinbiao. Design and implementation of a system of video image capture of camera based on JMF [C] // International Conference on Multi Media and Information Technology, 2008: 201-204.
10 Rivlin T J. The Chebyshev polynomials [M]. John Wiley & Sons Press, 1974.

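Citing documents (4)

1 Jiang Ying, Yu Yibiao. Robust speech recognition using feature-classified histogram equalization [J]. Signal Processing, 2011, 27(6): 896-900. Cited by: 4
2 Liu Xiaofeng, Zhang Xueying, He Yuanyuan. Application of a Chebyshev-kernel SVM in speech recognition [J]. Computer Engineering and Design, 2013, 34(5): 1783-1786.
3 He Yongjun, Fu Maoguo, Sun Guanglu. A survey of speech feature enhancement methods [J]. Journal of Harbin University of Science and Technology, 2014, 19(2): 19-25. Cited by: 3
4 Peng Junjun, Liu Yongjin. An efficient bird swarm optimization algorithm based on a double-Gaussian function [J]. Modern Electronics Technique, 2018, 41(23): 106-112. Cited by: 4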

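Second-level citing documents (11)

1 Xu Youliang, Zhang Lianhai, Zhang Wenlin, Li Yongbin. Phoneme recognition based on speaking-rate adjustment and posterior probabilities of phonological attributes [J]. Signal Processing, 2012, 28(2): 295-300. Cited by: 5
2 Zhou Azhuan, Yu Yibiao. Robust speech recognition using random mapping of the feature space [J]. Journal of Computer Applications, 2012, 32(7): 2070-2073. Cited by: 5
3 Lü Zhao, Wu Xiaopei, Zhang Chao. A survey of robust speech recognition techniques [J]. Journal of Anhui University (Natural Science Edition), 2013, 37(5): 17-24. Cited by: 4
4 Du Wenlong. Research on an MLMCC algorithm for improving the robustness of speech feature parameters [J]. Intelligent Computer and Applications, 2014, 4(4): 94-96.
5 Gai Chaoxu, Liang Longkai, He Yongjun. Speaker recognition with insufficient data [J]. Journal of Harbin University of Science and Technology, 2017, 22(3): 13-18. Cited by: 1
6 Chen Shu, Yu Haibo. Application of an improved feature extraction method in speech recognition [J]. Transducer and Microsystem Technologies, 2018, 37(5): 154-157. Cited by: 9
7 Li Dan, Jia Guimin, Cheng Fangyuan, Yang Jinfeng, Guo Xiaojing. A BiLSTM model for automatic semantic verification of readbacks in air-ground communication [J]. Signal Processing, 2019, 35(1): 57-64. Cited by: 8
8 Yan Wei, Zhang Damin, Zhang Huijuan, Xin Ziyun, Chen Zhongyun. An improved bird swarm algorithm based on hybrid decision-making [J]. Journal of Shandong University (Engineering Science), 2020, 50(2): 34-43. Cited by: 3
9 Wang Xueying, Zhang Junting, Zhao Quanming. Research on a lithium-ion battery life prediction method based on a least-squares support vector machine optimized by an improved bird swarm algorithm [J]. Electrotechnical Application, 2020, 39(7): 74-78. Cited by: 2
10 He Yitao, Li Jun, Hao Liyan. A bacterial foraging algorithm with a gravitation mechanism [J]. Journal of System Simulation, 2020, 32(9): 1724-1735. Cited by: 4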

