
Articulatory feature-based audio-visual dual-stream speech recognition model (cited by: 1)
Abstract: This paper presents an articulatory feature (AF)-based audio-visual dual-stream dynamic Bayesian network (DBN) model, AF_AV_DBN, for speech recognition. The conditional probability relations of each node and the asynchrony constraints between articulatory features are defined. Speech recognition experiments are carried out on an audio-visual connected-digit speech database, and recognition performance at different signal-to-noise ratios (SNRs) is compared with that of audio-only and video-only single-stream DBN models. The results show that AF_AV_DBN achieves the best recognition performance at low SNR, and that its recognition rate declines relatively gently as noise increases, indicating that the model is highly robust to noise and better suited to speech recognition in low-SNR environments.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core, 2009, No. 7, pp. 2481-2483 (3 pages)
Funding: National Natural Science Foundation of China (60703104)
Keywords: dynamic Bayesian network (DBN); articulatory feature; audio-visual; speech recognition


