
Speech emotion recognition using stacked generative and discriminative hybrid models (Cited by: 3)
Abstract: Generative models and discriminative models have complementary strengths and weaknesses in modeling within-class distributions, optimizing classification boundaries, and capturing the dynamic variation of emotion. This paper fuses the two kinds of models and performs speech emotion recognition with stacked generative/discriminative hybrid models. First, 63-dimensional utterance-level feature vectors are reduced to the 12 best features by Fisher discriminant analysis, and these feed a stacked model based on wavelet neural networks (WNN). Then, Sequential Forward Selection (SFS) picks 8 frame-level features out of the full 69, and two kinds of multidimensional Gaussian mixture model (GMM) likelihood outputs (one with the same dimension as the feature vector, one with the same dimension as the number of GMM mixtures) are proposed to build the stacked generative/discriminative hybrid models. Experimental results on the Berlin emotional speech database show that (1) the hybrid generative/discriminative models achieve significantly higher recognition rates than WNN, GMM, HMM (hidden Markov model), or SVM (support vector machine) used alone; (2) the stacked generative/discriminative hybrid models outperform the WNN-based stacked models; (3) the GMM-MAP/SVM tandem fusion model with M = 13 mixtures and D-dimensional likelihoods (MAP: maximum a posteriori) is the optimal stacked generative/discriminative hybrid model, reaching a recognition rate of up to 85.1%.
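The tandem structure described in the abstract — per-class GMM likelihood outputs passed to a discriminative classifier — can be sketched as below. This is an illustrative reconstruction on toy data, not the authors' implementation: the feature dimensionality, mixture count, class layout, and SVM kernel here are all assumptions for the sake of a runnable example.

```python
# Sketch of a stacked generative/discriminative ("tandem") model:
# one GMM per emotion class produces log-likelihood features,
# which an SVM then classifies. Toy data stands in for the
# 8-dimensional frame-level emotion features of the paper.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_class(mean, n=60, dim=8):
    # Toy stand-in for one emotion class's feature frames.
    return rng.normal(mean, 1.0, size=(n, dim))

X = np.vstack([make_class(m) for m in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 60)

# Stage 1 (generative): fit one GMM per class (diagonal covariances
# keep the fit stable on few samples; mixture count is an assumption).
gmms = {c: GaussianMixture(n_components=3, covariance_type="diag",
                           random_state=0).fit(X[y == c])
        for c in np.unique(y)}

def loglik_features(samples):
    # Map each sample to its vector of per-class log-likelihoods.
    return np.column_stack([g.score_samples(samples) for g in gmms.values()])

# Stage 2 (discriminative): an SVM classifies the likelihood vectors.
svm = SVC(kernel="rbf").fit(loglik_features(X), y)
print(svm.score(loglik_features(X), y))  # training accuracy
```

The design point is that the GMMs summarize how well each class's distribution explains a sample, and the SVM learns the decision boundary in that low-dimensional likelihood space rather than in the raw feature space.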
Source: Acta Acustica (《声学学报》), EI / CSCD / Peking University Core, 2013, No. 2: 231-240 (10 pages)
Funding: National 863 Program; National Natural Science Foundation of China

References (19)

  • 1 Elif B, Erzin E, Eroglu E C et al. Improving automatic emotion recognition from speech signals. 10th Annual Conference of the International Speech Communication Association (Brighton, United Kingdom, September 6-10, 2009), 2009: 324-327.
  • 2 Yang B, Lugger M. Emotion recognition from speech signals using new harmony features. Signal Processing, 2010; 90(5): 1415-1423.
  • 3 Kim E H, Hyun K H, Kim S H et al. Improved emotion recognition with a novel speaker-independent feature. IEEE Trans. on Mechatronics, 2009; 14(3): 317-325.
  • 4 Park J S, Kim J H, Oh Y H. Feature vector classification based speech emotion recognition for service robots. IEEE Trans. on Consumer Electronics, 2009; 55(3): 1590-1596.
  • 5 Bitouk D, Verma R, Nenkova A. Class-level spectral features for emotion recognition. Speech Communication, 2010; 52(7-8): 613-625.
  • 6 Suryannarayana C, Amitava C, Sugata M. Support vector machines employing cross-correlation for emotional speech recognition. Journal of the International Measurement Confederation, 2009; 42(4): 611-618.
  • 7 Zhang Jianping, Li Ming, Suo Hongbin, Yang Lin, Fu Qiang, Yan Yonghong. Application of long-term speech features in speaker recognition[J]. Acta Acustica (声学学报), 2010, 35(2): 267-269. (Cited by: 8)
  • 8 Zhang Jun, Wei Gang, Yu Hua. A multi-stream robust speech recognition method based on probability-weighted feature-component outputs[J]. Acta Acustica (声学学报), 2008, 33(2): 102-108. (Cited by: 2)
  • 9 Yan Binfeng, Zhu Xiaoyan, Zhang Zhijiang, Zhang Fan. A robust speech recognition method based on adjacent space[J]. Journal of Software (软件学报), 2007, 18(4): 878-883. (Cited by: 5)
  • 10 Huang Y M, Zhang C B, Xu X L. Speech emotion recognition research based on wavelet neural network for robot pet. In: 5th International Conference on Intelligent Computing, 2009: 993-1000.

Secondary references (26)

  • 1 Xie Lei, Fu Zhonghua, Jiang Dongmei, Zhao Rongchun, Werner Verhelst, Hichem Sahli, Jan Conlenis. A robust visemic-LDA-based dynamic mouth-shape feature for audio-visual speech recognition[J]. Journal of Electronics & Information Technology (电子与信息学报), 2005, 27(1): 64-68. (Cited by: 4)
  • 2 Zhang Jun, Wei Gang. A noise-adaptive multi-stream composite sub-band speech recognition method[J]. Journal of Electronics & Information Technology (电子与信息学报), 2006, 28(7): 1183-1187. (Cited by: 3)
  • 3 Zhao Rui, Wang Zuoying. Joint compensation of channel and noise in speech recognition[J]. Acta Acustica (声学学报), 2006, 31(5): 466-470. (Cited by: 11)
  • 4 Campbell W M, Sturim D E, Reynolds D A. Support vector machines using GMM supervectors for speaker verification. IEEE Signal Processing Letters, 2006; 13(5).
  • 5 Reynolds D A, Rose R C. An integrated speech-background model for robust speaker identification. ICASSP-92: II-185 - II-188.
  • 6 Pelecanos J, Sridharan S. Feature warping for robust speaker verification. In: Proc. ISCA Workshop on Speaker Recognition, 2001.
  • 7 Campbell W M, Sturim D, Reynolds D A, Solomonoff A. SVM based speaker verification using a GMM supervector kernel and NAP variability compensation. ICASSP, 2006: 97-100.
  • 8 Kenny P, Boulianne G, Ouellet P, Dumouchel P. Joint factor analysis versus eigenchannels in speaker recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2007; 15(4): 1435-1447.
  • 9 Auckenthaler R, Carey M, Lloyd-Thomas H. Score normalization for text-independent speaker verification systems. Digital Signal Processing, 2000; 10: 42-54.
  • 10 Dehak Najim, Dumouchel Pierre, Kenny Patrick. Modeling prosodic features with joint factor analysis for speaker verification. IEEE Trans. Audio, Speech and Language Processing, 2007.

Co-cited literature: 12

Similarly cited literature: 34

Citing literature: 3

Secondary citing literature: 14
