Abstract
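This paper proposes a new Mel-frequency-domain feature based on principal component analysis (PCA) and the Fisher criterion. It is obtained by applying PCA to the Mel-domain spectrum and then selecting feature components in order of their Fisher ratio. The feature makes full use of the statistical correlation between frequency bands and distinguishes speakers more compactly and effectively. Compared with the conventional approach of selecting components by their eigenvalues, the resulting feature vector achieves the greatest class separability at the same dimensionality. Finally, we implement a text-independent automatic speaker recognition system whose back end uses vector quantization for cluster analysis. Experiments on a speech corpus show that the proposed feature vector yields a higher recognition rate than conventional feature vectors of the same dimensionality, confirming that it is compact, discriminative, and low in redundancy.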
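The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the synthetic three-band data, and the particular Fisher-ratio definition (variance of the class means divided by the mean within-class variance, per component) are all assumptions made for the example.

```python
import numpy as np

def pca_basis(X):
    """Return (mean, eigenvalues, eigenvectors) of the rows of X, eigenvalues descending."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return mean, eigvals[order], eigvecs[:, order]

def fisher_ratio(Y, labels):
    """Per-component ratio: variance of class means / mean within-class variance (assumed definition)."""
    classes = np.unique(labels)
    means = np.array([Y[labels == c].mean(axis=0) for c in classes])
    within = np.array([Y[labels == c].var(axis=0) for c in classes]).mean(axis=0)
    return means.var(axis=0) / (within + 1e-12)

def select_components(X, labels, k):
    """Project onto all principal axes, then keep the k axes with the largest Fisher ratio."""
    mean, _, vecs = pca_basis(X)
    Y = (X - mean) @ vecs
    ratios = fisher_ratio(Y, labels)
    keep = np.argsort(ratios)[::-1][:k]
    return mean, vecs[:, keep], ratios[keep]

# Synthetic stand-in for Mel-spectrum vectors (hypothetical data, 3 "bands", 2 speakers).
rng = np.random.default_rng(0)
n = 500
labels = np.repeat([0, 1], n)
X = rng.normal(size=(2 * n, 3)) * np.array([10.0, 0.5, 1.0])
X[:, 1] += labels * 4.0        # only band 1 separates the two speakers

mean, W, ratios = select_components(X, labels, k=1)
top_eig_vec = pca_basis(X)[2][:, 0]
print("axis kept by Fisher ratio:", np.argmax(np.abs(W[:, 0])))
print("axis kept by eigenvalue:  ", np.argmax(np.abs(top_eig_vec)))
```

In this toy setup the largest-eigenvalue axis follows the high-variance band that carries no speaker information, while the Fisher-ratio ranking picks the band that separates the speakers, which is the contrast the abstract draws.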
A new feature vector for speaker recognition, the Mel Frequency Principal Coefficient (MFPC), is proposed. It is derived by applying Principal Component Analysis to the Mel-scale spectrum vector. MFPC efficiently exploits the correlation among different frequency channels, which is mainly caused by vocal tract resonance and has been found to vary consistently from one speaker to another. Feature coefficients are chosen according to their Fisher Ratio. Compared with the conventional Frequency Cepstrum Coefficient, the proposed feature vector gives a greater distance between classes at the same dimensionality. A text-independent speaker recognition system has been implemented, using Vector Quantization to design the codebook of each reference speaker. Experimental results demonstrate that the proposed feature vector is compact, discriminative, and low in redundancy.
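A vector-quantization back end of the kind mentioned above might look like the sketch below. It is an assumption-laden illustration: the abstract does not say how the codebooks are designed, so plain k-means is swapped in here, and the function names, codebook size, and synthetic feature data are all hypothetical.

```python
import numpy as np

def train_codebook(features, size, iters=20, seed=0):
    """Design a codebook with plain k-means (an assumed stand-in for the paper's method)."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every frame to every codeword.
        d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        for j in range(size):
            members = features[nearest == j]
            if len(members):                    # leave empty cells unchanged
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))

# Hypothetical feature frames for two reference speakers (4-dimensional).
rng = np.random.default_rng(1)
train = {
    "A": rng.normal(loc=0.0, size=(300, 4)),
    "B": rng.normal(loc=5.0, size=(300, 4)),
}
codebooks = {name: train_codebook(x, size=4) for name, x in train.items()}

test_utterance = rng.normal(loc=5.0, size=(50, 4))   # unseen frames drawn like speaker B
result = identify(test_utterance, codebooks)
print(result)
```

Because the decision uses only the average distortion over frames, it does not depend on what was said, which is what makes this style of back end suitable for text-independent recognition.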
Source
《电路与系统学报》
CSCD
2002, No. 1, pp. 116-119 (4 pages)
Journal of Circuits and Systems
Funding
Supported by the National Natural Science Foundation of China (39870194)
Keywords
Principal component analysis; Fisher criterion
Speaker recognition
Speech recognition
Mel Frequency Principal Coefficient (MFPC)
Principal Component Analysis (PCA)
Vector Quantization (VQ)
Fisher Ratio