
Robust speaker recognition based on level-building voice activity detection
Abstract: Distributed speaker recognition built on the European Telecommunications Standards Institute (ETSI) Distributed Speech Recognition (DSR) Advanced Front-End (AFE) standard has poor robustness to noise. To address this, a new front-end processing method that combines level-building segmentation with the two-stage Wiener filter is proposed. The speech is clustered without supervision, using likelihood distance as the measure. To reduce the computational load, a level-building procedure segments the speech level by level, so that the boundaries between speech and silence are located accurately. Experiments show that, when the signal-to-noise ratio (SNR) is above 0 dB, applying the method to the ETSI-DSR-AFE standard yields a relative improvement of 18.9% in the recognition rate of the speaker identification system, and a relative improvement of 60.7% over the original Mel-frequency cepstral coefficient (MFCC) system.
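The abstract gives only a high-level description of the segmentation step, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes a one-dimensional per-frame feature (for example log-energy), models each segment with a single Gaussian, and uses the gain in log-likelihood from splitting a segment as a stand-in for the likelihood distance. The function names, the `min_gain` threshold, and the `max_levels` limit are all hypothetical choices made for illustration.

```python
import math

def gaussian_loglik(values):
    """Maximized log-likelihood of `values` under one Gaussian (variance floored)."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_split(values, min_len=3):
    """Return (gain, index) for the split that most increases the log-likelihood."""
    whole = gaussian_loglik(values)
    best = (0.0, None)
    for i in range(min_len, len(values) - min_len + 1):
        gain = gaussian_loglik(values[:i]) + gaussian_loglik(values[i:]) - whole
        if gain > best[0]:
            best = (gain, i)
    return best

def level_building_boundaries(values, max_levels=3, min_gain=5.0, min_len=3):
    """Split segments level by level; return the sorted boundary indices.

    Assumption: a segment is split only if the likelihood gain reaches
    `min_gain`, which is an arbitrary illustrative threshold.
    """
    segments = [(0, len(values))]
    for _ in range(max_levels):
        next_segments = []
        changed = False
        for start, end in segments:
            gain, idx = best_split(values[start:end], min_len)
            if idx is not None and gain >= min_gain:
                next_segments += [(start, start + idx), (start + idx, end)]
                changed = True
            else:
                next_segments.append((start, end))
        segments = next_segments
        if not changed:
            break  # no segment was worth splitting at this level
    return [start for start, _ in segments[1:]]

# Toy log-energy track: 10 silence frames, 10 speech frames, 10 silence frames.
silence = [0.0, 0.1, -0.1, 0.05, -0.05] * 2
speech = [5.0, 5.2, 4.8, 5.1, 4.9] * 2
frames = silence + speech + silence
print(level_building_boundaries(frames))  # → [10, 20]
```

Working level by level means each search for a boundary is confined to one existing segment, which is one plausible reading of how such a scheme keeps the computation small; the paper itself should be consulted for the actual distance measure and stopping rule.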
Source: Journal of Shenzhen University (Science and Engineering) (《深圳大学学报(理工版)》), indexed in EI, CAS, and the Peking University Core Journals list, 2012, No. 4, pp. 328-334 (7 pages)
Funding: National Natural Science Foundation of China (61005020); Fundamental Research Funds for the Central Universities (10JBT01)
Keywords: speech signal processing; speaker identification; distributed speech recognition; level-building; voice activity detection; likelihood measurement