
Speaker Identification Based on Classify Feature Sub-space Gaussian Mixture Model and Neural Net Fusion (cited by: 3)
Abstract This paper proposes a speaker identification method based on classified feature-subspace Gaussian mixture models fused by a neural network (FS-GMM/NN). Cluster analysis of the feature vectors partitions each speaker's training speech into several classes. A Gaussian mixture model (GMM) is then trained for each class, with the number of mixture components chosen according to how many feature vectors the class contains, and a neural network (NN) fuses the outputs of the class GMMs. In text-independent speaker identification experiments on 100 male speakers, FS-GMM/NN outperforms the unclassified baseline GMM (B-GMM) in both identification performance and noise robustness. It also trains more efficiently and effectively reduces both the number of mixtures in each speaker model and the length of test speech required.
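Source Journal of Electronics & Information Technology (《电子与信息学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2004, No. 10, pp. 1607-1612 (6 pages)
Funding Supported by the National Natural Science Foundation of China (60272039) and the Natural Science Foundation of Anhui Province (01042205)
Keywords Speaker identification; Classified feature subspace; Gaussian mixture model (GMM); Neural network (NN) fusion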
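
The first two steps described in the abstract (clustering one speaker's feature vectors, then giving each class a mixture count that depends on its size) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which clustering algorithm or allocation rule the paper uses, so plain k-means and a proportional split of a fixed mixture budget are assumptions, as are the hypothetical names `assign`, `kmeans`, `allocate_mixtures`, and `total_mixtures`. Per-class GMM training and the neural-network fusion stage are not shown.

```python
import numpy as np

def assign(features, centroids):
    """Return, for each feature vector, the index of its nearest centroid."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(features, n_clusters, n_iter=20, seed=0):
    """Plain k-means (an assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=n_clusters, replace=False)
    centroids = features[start].astype(float)  # copy, safe to update in place
    for _ in range(n_iter):
        labels = assign(features, centroids)
        for k in range(n_clusters):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return centroids, assign(features, centroids)

def allocate_mixtures(labels, n_clusters, total_mixtures):
    """Split a mixture budget across classes in proportion to class size.

    The proportional rule is an assumption; every class gets at least one
    mixture, so the counts may not sum exactly to total_mixtures.
    """
    counts = np.bincount(labels, minlength=n_clusters)
    share = counts / counts.sum() * total_mixtures
    return np.maximum(1, np.round(share).astype(int))

# Synthetic 12-dimensional "feature vectors" for one speaker (made-up data).
rng = np.random.default_rng(1)
features = np.vstack([
    rng.normal(loc=center, size=(n, 12))
    for center, n in [(-6.0, 300), (0.0, 150), (6.0, 50)]
])
_, labels = kmeans(features, n_clusters=3)
print(allocate_mixtures(labels, n_clusters=3, total_mixtures=16))
```

Under this rule, a class that holds more feature vectors never receives fewer mixtures than a smaller class, which matches the abstract's idea of sizing each class model to the amount of data available to train it.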

References (6)

  • 1 Reynolds D A, Rose R C. Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. on Speech and Audio Processing, 1995, 3(1): 72-83.
  • 2 Reynolds D A. Speaker identification and verification using Gaussian mixture speaker models. Speech Communication, 1995, 17(1-2): 91-108.
  • 3 Reynolds D A. Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 2000, 10(1-3): 19-41.
  • 4 Deller J R, Proakis J G, Hansen J H L. Discrete-Time Processing of Speech Signals. New York: Macmillan Publishing Company, 1993.
  • 5 Reynolds D A. Experimental evaluation of features for robust speaker identification. IEEE Trans. on Speech and Audio Processing, 1994, 2(4): 639-643.
  • 6 Chang E, Shi Y, Zhou J, Huang C. Speech lab in a box: A Mandarin speech toolbox to jumpstart speech related research. In: EUROSPEECH, Aalborg, Denmark, 2001: 192-199.
