Abstract
Taking the emulation of human hearing as its starting point, this paper presents a gammatone filterbank model based on the auditory model of the human cochlea. Passing the test speech through the filterbank yields high-dimensional auditory feature vectors. Applying principal component analysis (PCA) and the discrete cosine transform (DCT) to these vectors yields, respectively, gammatone coefficients and gammatone filterbank cepstral coefficients, together with their derived features, all of which can be used to characterize a speaker. Experiments show that, compared with conventional Mel-frequency cepstral features, a speaker recognition system using the proposed features achieves a clearly higher recognition rate and better robustness.
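The pipeline summarized in the abstract (gammatone filterbank, then PCA or DCT) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract gives no parameters, so everything here is an assumption chosen for illustration, including the function names, the 4th-order gammatone impulse response with the commonly used 1.019 bandwidth factor, the Glasberg-Moore ERB formula, 32 channels between 50 Hz and 0.45 times the sampling rate, 25 ms frames with a 10 ms hop, log compression, and 13 output coefficients.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula, assumed)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_low, f_high, n):
    """n centre frequencies equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(4.37 * f_low / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * f_high / 1000.0 + 1.0)
    e = np.linspace(lo, hi, n)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, fs, order=4, duration=0.032):
    """Unit-energy FIR approximation of a gammatone impulse response."""
    t = np.arange(int(round(duration * fs))) / fs
    ir = (t ** (order - 1)
          * np.exp(-2.0 * np.pi * 1.019 * erb(fc) * t)
          * np.cos(2.0 * np.pi * fc * t))
    return ir / np.sqrt(np.sum(ir ** 2))

def gammatone_log_energies(signal, fs, n_filters=32, frame_len=0.025, hop=0.010):
    """Per-frame log energy in each gammatone channel: shape (frames, n_filters)."""
    centres = erb_space(50.0, 0.45 * fs, n_filters)
    flen, fhop = int(round(frame_len * fs)), int(round(hop * fs))
    n_frames = 1 + (len(signal) - flen) // fhop
    energies = np.empty((n_frames, n_filters))
    for k, fc in enumerate(centres):
        y = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        for m in range(n_frames):
            seg = y[m * fhop: m * fhop + flen]
            energies[m, k] = np.mean(seg ** 2)
    return np.log(energies + 1e-12)  # small floor avoids log(0)

def dct_cepstrum(log_e, n_ceps=13):
    """DCT-II across channels, keeping the first n_ceps coefficients."""
    n_filters = log_e.shape[1]
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    np.arange(n_filters) + 0.5) / n_filters)
    return log_e @ basis.T

def pca_reduce(log_e, n_components=13):
    """Project mean-centred features onto the top principal directions."""
    centred = log_e - log_e.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

# Usage on one second of synthetic noise sampled at 16 kHz.
rng = np.random.default_rng(0)
x = rng.standard_normal(16000)
log_e = gammatone_log_energies(x, 16000)
cepstral = dct_cepstrum(log_e)   # DCT branch
reduced = pca_reduce(log_e)      # PCA branch
```

Both branches start from the same high-dimensional filterbank output. The DCT branch uses a fixed basis, whereas the PCA branch learns its basis from the data, so in practice the PCA projection would be estimated on training speech rather than on the utterance being tested.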
Source
《电子学报》
EI
CAS
CSCD
PKU Core Journal
2010, No. 3, pp. 525-528 (4 pages)
Acta Electronica Sinica
Keywords
speech signal processing
gammatone filter
auditory feature extraction
cepstral coefficients