Abstract
This paper builds on the Advanced Front-End standard for distributed speech recognition issued by the European Telecommunications Standards Institute (ETSI-DSR-AFE). To address the poor noise robustness of distributed speaker recognition, a new front-end processing method combining level-building with a two-stage Wiener filter is proposed. The speech is clustered without supervision, using a likelihood distance as the measure. To reduce the computational load, a level-building procedure segments the speech level by level, so that the boundaries between speech and silence are located accurately. Experiments show that, when the SNR exceeds 0 dB, applying the method to the ETSI-DSR-AFE standard yields a relative improvement of 18.9% in the recognition rate of the speaker identification system, and a relative improvement of 60.7% over the original Mel-frequency cepstral coefficient (MFCC) system.
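The abstract does not give the algorithm in detail, so the following is only a minimal sketch of level-by-level segmentation driven by a likelihood distance. It assumes a one-dimensional feature per frame (for example log energy), models each segment with a single Gaussian, and splits a segment where the gain in log-likelihood is largest. The function names, the `min_gain` threshold and the `min_len` limit are hypothetical choices for illustration and are not taken from the paper or from the ETSI standard.

```python
import math


def gaussian_cost(frames):
    """Negative log-likelihood (up to constants) of frames under one ML-fitted Gaussian."""
    n = len(frames)
    mean = sum(frames) / n
    var = sum((x - mean) ** 2 for x in frames) / n
    return 0.5 * n * math.log(var + 1e-6)  # small floor avoids log(0)


def best_split(frames, min_len=3):
    """Return (gain, index) of the split that most reduces the Gaussian cost.

    The gain is a likelihood-ratio distance between "one segment" and
    "two segments"; index is None when no split improves the cost.
    """
    whole = gaussian_cost(frames)
    best_gain, best_index = 0.0, None
    for t in range(min_len, len(frames) - min_len + 1):
        gain = whole - gaussian_cost(frames[:t]) - gaussian_cost(frames[t:])
        if gain > best_gain:
            best_gain, best_index = gain, t
    return best_gain, best_index


def level_by_level_segment(frames, levels=2, min_gain=5.0, min_len=3):
    """Split level by level: at each level every current segment is split at most once."""
    bounds = [0, len(frames)]
    for _ in range(levels):
        new_bounds = set(bounds)
        for start, end in zip(bounds, bounds[1:]):
            gain, t = best_split(frames[start:end], min_len)
            if t is not None and gain > min_gain:
                new_bounds.add(start + t)
        if len(new_bounds) == len(bounds):
            break  # no segment could be split further
        bounds = sorted(new_bounds)
    return bounds


if __name__ == "__main__":
    # Toy energy track: silence, speech, silence (8 frames each).
    silence = [0.10, 0.12, 0.09, 0.11, 0.10, 0.12, 0.09, 0.11]
    speech = [5.0, 5.2, 4.9, 5.1, 5.0, 5.2, 4.9, 5.1]
    print(level_by_level_segment(silence + speech + silence))
```

Because each level only searches inside the segments produced by the previous level, the search stays much cheaper than testing every combination of boundaries, which is the kind of saving the abstract attributes to level-building. A real front end would use multi-dimensional features and would still need to label each resulting segment as speech or silence.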
Source
Journal of Shenzhen University (Science and Engineering) (《深圳大学学报(理工版)》)
Indexed in: EI, CAS, PKU Core Journals (北大核心)
2012, No. 4, pp. 328-334 (7 pages)
Funding
National Natural Science Foundation of China (61005020)
Fundamental Research Funds for the Central Universities (10JBT01)
Keywords
speech signal processing
speaker identification
distributed speech recognition
level-building
voice activity detection
likelihood distance