A Structure-Specific Heteroscedastic Linear Discriminant Analysis (一种结构受限的异方差线性判别分析)

Abstract: Heteroscedastic linear discriminant analysis (HLDA) is widely used in speech recognition for its strong feature de-correlation ability. However, when training data are insufficient or the feature dimension is high, HLDA tends to become unstable and to suffer from the small-sample problem. Based on the matrix representation of the features, this paper proposes a structure-specific (structure-constrained) HLDA. Two-dimensional linear discriminant analysis (2DLDA) is first used to compress the features in matrix form, and a one-dimensional HLDA is then applied. Analysis shows that the two-dimensional feature transformation is in fact a structure-constrained one-dimensional feature transformation. In experiments on the RM database, the structure-specific HLDA achieves a 12.39% relative reduction in word error rate (WER) compared with conventional HLDA; on the TIMIT database, it achieves a 4.43% relative reduction in phone error rate (PER).
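The claim that a two-dimensional transform is a structure-constrained one-dimensional transform can be illustrated with the standard Kronecker-product identity vec(L^T X R) = (R ⊗ L)^T vec(X). The abstract does not give the derivation, so the sketch below is an assumed reading of that statement rather than the authors' formulation. The matrices `X`, `L`, `R` and all function names are made up for illustration, and the subsequent HLDA step is not shown.

```python
# Pure-Python sketch (no third-party libraries).
# Assumption: the 2D transform has the two-sided form Y = L^T X R, where
# X is an m x n feature matrix, L is m x p and R is n x q.

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def kron(a, b):
    """Kronecker product a (x) b as a list of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def vec(a):
    """Stack the columns of a matrix into one flat vector (column-major)."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]

def two_sided(L, X, R):
    """2D transform: Y = L^T X R."""
    return matmul(matmul(transpose(L), X), R)

# Hypothetical toy data: a 3 x 4 feature matrix compressed to 2 x 2.
X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
L = [[1, 0],
     [2, 1],
     [0, 3]]          # 3 x 2, acts on the rows of X
R = [[1, 2],
     [0, 1],
     [3, 0],
     [1, 1]]          # 4 x 2, acts on the columns of X

Y = two_sided(L, X, R)

# Equivalent 1D transform: a 12 x 4 matrix applied to the 12-dim vector vec(X).
W = kron(R, L)
x = vec(X)
y_1d = [sum(W[k][c] * x[k] for k in range(len(x))) for c in range(len(W[0]))]

# Free parameters: two-sided (structured) form vs. an unconstrained 12 x 4 matrix.
params_structured = len(L) * len(L[0]) + len(R) * len(R[0])
params_full = len(W) * len(W[0])
```

In this toy setting the structured transform has 14 free parameters where an unconstrained 12 x 4 projection has 48. A smaller parameter count of this kind is the usual motivation for such a constraint when training data are scarce.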

Authors: CHEN Sibao (陈思宝), HU Yu (胡郁), WANG Renhua (王仁华)

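Affiliation: iFlytek Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China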

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 4, pp. 94-99 (6 pages)

Funding: National 863 Program of China (2004AA114030)

Keywords: computer application; Chinese information processing; speech recognition; feature transformation; HLDA; structure-specific

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
