A Structure-Specific Heteroscedastic Linear Discriminant Analysis
Abstract: Heteroscedastic linear discriminant analysis (HLDA) is widely used in speech recognition because it decorrelates features effectively. However, when training data are insufficient or the feature dimension is high, HLDA tends to become unstable and suffers from the small-sample-size problem. Based on the matrix representation of features, this paper proposes a structure-specific HLDA. The method first compresses the matrix-form features with two-dimensional linear discriminant analysis (2DLDA) and then applies a one-dimensional HLDA. The analysis shows that a two-dimensional feature transformation is in fact a structure-constrained one-dimensional feature transformation. Compared with conventional HLDA, the structure-specific HLDA achieves a relative word error rate (WER) reduction of 12.39% on the RM database and a relative phone error rate (PER) reduction of 4.43% on the TIMIT database.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 4, pp. 94-99 (6 pages)
Funding: Supported by the National 863 Program of China (2004AA114030)
Keywords: computer application; Chinese information processing; speech recognition; feature transformation; HLDA; structure-specific
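
The abstract's claim that a two-dimensional feature transformation is a structure-constrained one-dimensional transformation can be illustrated with a small numerical check. The sketch below assumes the bilateral form Y = L^T X R that is commonly used for 2DLDA; the paper's exact formulation is not reproduced on this page, and the matrix sizes, values, and helper names (`transpose`, `matmul`, `kron`, `vec`) are hypothetical. It relies on the standard identity vec(L^T X R) = (R ⊗ L)^T vec(X), where vec stacks columns and ⊗ is the Kronecker product.

```python
# Illustrative sketch only: sizes and integer matrices are made up,
# and Y = L^T X R is an assumed form of the 2D transform.

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    # Kronecker product: block (i, j) of the result is a[i][j] * b.
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def vec(a):
    # Stack the columns of a into one column vector (column-major order).
    return [[a[i][j]] for j in range(len(a[0])) for i in range(len(a))]

# A 3 x 4 "feature matrix" X (m = 3 rows, n = 4 columns).
X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
L = [[1, 0], [2, 1], [0, 3]]          # m x p left projection (p = 2)
R = [[1, 2], [0, 1], [3, 0], [1, 1]]  # n x q right projection (q = 2)

# Two-dimensional transform applied to the matrix directly.
Y = matmul(matmul(transpose(L), X), R)

# Equivalent one-dimensional transform on the vectorized feature.
W = kron(R, L)                        # (m*n) x (p*q)
y_flat = matmul(transpose(W), vec(X))

# Free parameters: m*p + n*q for the 2D form, (m*n)*(p*q) for a full W.
params_2d = len(L) * len(L[0]) + len(R) * len(R[0])
params_1d = len(W) * len(W[0])

print(vec(Y) == y_flat)
print(params_2d, params_1d)
```

The one-dimensional projection matrix W is forced to have Kronecker structure, so it is determined by far fewer free parameters than an unconstrained matrix of the same shape. This is one plausible reading of why the constrained transform would be better conditioned when training data are scarce, which is the problem the abstract sets out to address.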