Robust speaker recognition based on level-building voice activity detection (基于分层增长语音活动检测的鲁棒性说话人识别)


Abstract: Building on the distributed speech recognition advanced front-end standard issued by the European Telecommunications Standards Institute (ETSI-DSR-AFE), this paper proposes a new front-end processing method, combining level-building segmentation with a two-stage Wiener filter, to address the poor noise robustness of distributed speaker recognition. The method clusters speech in an unsupervised manner, using a likelihood distance as the measure. To reduce the computational load, a level-building procedure segments the speech level by level, so that the boundaries between speech and silence are located accurately. Experimental results show that, when the signal-to-noise ratio (SNR) is above 0 dB, applying the method to the ETSI-DSR-AFE standard yields a relative improvement of 18.9% in the recognition rate of the speaker identification system, and a relative improvement of 60.7% over the original Mel-frequency cepstral coefficient (MFCC) system.
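The abstract gives only the outline of the segmentation step, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes a single scalar feature per frame (for example log energy), models each segment as one Gaussian, and uses a generalized log-likelihood ratio as the likelihood distance. The function names, `min_len`, `max_levels`, and `threshold` are all hypothetical choices made for illustration.

```python
import math


def _log_var(xs, floor=1e-6):
    """Log of the (floored) maximum-likelihood variance of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, floor))


def likelihood_distance(xs, t):
    """Generalized log-likelihood ratio for splitting xs at index t.

    Compares one Gaussian for the whole segment against separate Gaussians
    for xs[:t] and xs[t:]; larger values mean a more plausible boundary.
    """
    left, right = xs[:t], xs[t:]
    return 0.5 * (len(xs) * _log_var(xs)
                  - len(left) * _log_var(left)
                  - len(right) * _log_var(right))


def best_split(xs, min_len):
    """Return (gain, index) of the best split, or None if xs is too short."""
    candidates = range(min_len, len(xs) - min_len + 1)
    if not candidates:
        return None
    return max((likelihood_distance(xs, t), t) for t in candidates)


def level_building_segment(xs, max_levels=3, min_len=3, threshold=5.0):
    """Split the sequence level by level; return sorted boundary indices.

    At each level every current segment is split at most once, and only if
    the likelihood distance of its best split exceeds the threshold
    (an assumed stopping rule, not taken from the paper).
    """
    boundaries = [0, len(xs)]
    for _ in range(max_levels):
        new_points = []
        for a, b in zip(boundaries, boundaries[1:]):
            found = best_split(xs[a:b], min_len)
            if found is not None and found[0] > threshold:
                new_points.append(a + found[1])
        if not new_points:
            break  # no segment could be split further
        boundaries = sorted(boundaries + new_points)
    return boundaries
```

Splitting one level at a time means each pass only searches inside the segments found so far, which is one plausible reading of how the level-building procedure reduces computation compared with searching for all boundaries jointly. Labelling the resulting segments as speech or silence, and the two-stage Wiener filter, are outside the scope of this sketch.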

Authors: 解焱陆 (Xie Yanlu), 张劲松 (Zhang Jinsong), 刘明辉 (Liu Minghui), 黄中伟 (Huang Zhongwei)

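Affiliations: College of Information Science, Beijing Language and Culture University; Speech Laboratory, Shenzhen University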

Source: Journal of Shenzhen University (Science and Engineering) (《深圳大学学报（理工版）》); indexed in EI, CAS, and the Peking University Core Journals list; 2012, No. 4, pp. 328-334 (7 pages)

Funding: National Natural Science Foundation of China (61005020); Fundamental Research Funds for the Central Universities (10JBT01)

Keywords: speech signal processing; speaker identification; distributed speech recognition; level-building; voice activity detection; likelihood measurement

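Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; TP391.4 [Automation and Computer Technology: Computer Application Technology]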

