Abstract
To address the shortcomings of homogeneous HMMs in exploiting duration information for speech recognition, this paper formally defines a left-to-right inhomogeneous hidden Markov model suited to describing speech signals, and proves that the model can be specified equivalently by its state transition probabilities or by its state duration distributions. On this basis, a Duration Distribution Based HMM (DDBHMM) is proposed. Speaker-independent continuous speech recognition experiments show that DDBHMM, using state duration information alone, clearly outperforms the classical HMM (a 17.8% reduction in error rate). These results demonstrate the good performance of DDBHMM and open the way to describing and exploiting important properties of speech such as duration, speaking rate, temporal discontinuity, and the correlation between speech features.
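The equivalence between the two representations can be illustrated with a small sketch. It is not the paper's proof or implementation. It assumes a single left-to-right state whose self-transition ("stay") probability depends on how many frames the state has already occupied, and converts between that and a duration distribution. The function names `duration_to_stay_probs` and `stay_probs_to_duration` are hypothetical.

```python
def duration_to_stay_probs(pmf):
    """Convert a duration pmf into duration-dependent stay probabilities.

    pmf[d-1] = P(D = d) for d = 1..len(pmf).
    Returns stay[d-1] = P(D > d | D >= d), i.e. the probability of
    remaining in the state after having spent d frames in it.
    """
    stay = []
    survival = 1.0  # P(D >= d), starts at P(D >= 1) = 1
    for p in pmf:
        if survival <= 0.0:
            stay.append(0.0)  # duration already exhausted
        else:
            # Clamp to guard against tiny negative values from rounding.
            stay.append(max(0.0, 1.0 - p / survival))
        survival -= p
    return stay


def stay_probs_to_duration(stay):
    """Inverse mapping: recover P(D = d) from the stay probabilities."""
    pmf = []
    survival = 1.0  # P(D >= d)
    for a in stay:
        pmf.append(survival * (1.0 - a))  # leave exactly after d frames
        survival *= a                     # still in the state after d frames
    return pmf


if __name__ == "__main__":
    pmf = [0.2, 0.5, 0.3]  # illustrative values, not from the paper
    stay = duration_to_stay_probs(pmf)
    print(stay)
    print(stay_probs_to_duration(stay))
```

In this sketch a homogeneous HMM corresponds to a constant stay probability `a`, which forces the geometric duration distribution `a**(d-1) * (1 - a)`; allowing the stay probability to vary with `d` is what lets an arbitrary duration distribution be represented.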
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2004, No. 1, pp. 46-49 (4 pages)
Acta Electronica Sinica
Keywords
Duration
Speech recognition
DDBHMM
Markov processes
Mathematical models
Probability