Speaker Recognition Technology Based on Lip Movement
Abstract: Most speaker recognition techniques rely on acoustic signals. This paper proposes a novel speaker recognition technique based on lip movement. Using the discrete cosine transform (DCT), visual features are extracted from image sequences of the speaker talking; these features reflect both the physiological characteristics of the speaker's mouth and the behavioral characteristics of the speaker's lip movement. Based on these features, a static-dynamic hybrid model is built for each speaker, in which the dynamic model is a semi-continuous hidden Markov model (SCHMM). Both a speaker identification system and a speaker verification system are implemented on a small visual corpus. For speaker identification, the accuracy in the text-dependent and text-independent modes reaches 100% and 99.7%, respectively; for speaker verification, the equal error rate (EER) in the text-dependent and text-independent modes is 0.09% and 0.33%, respectively.
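As a rough illustration of the DCT feature extraction described in the abstract, the sketch below computes a 2D DCT of a grayscale mouth region and keeps a small block of low-frequency coefficients as the per-frame feature vector. The abstract does not say which coefficients the authors keep, how the mouth region is located, or what image size is used, so the function names (`dct_matrix`, `dct2`, `frame_features`, `sequence_features`) and the `block` parameter are assumptions made for this example only, not the paper's method.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an n x n matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    basis[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return basis

def dct2(image: np.ndarray) -> np.ndarray:
    """Separable 2D DCT-II: transform the rows, then the columns."""
    rows, cols = image.shape
    return dct_matrix(rows) @ image @ dct_matrix(cols).T

def frame_features(mouth_roi: np.ndarray, block: int = 4) -> np.ndarray:
    """Keep the top-left block x block low-frequency coefficients.

    Assumption: a square low-frequency block; the paper may select
    coefficients differently.
    """
    coeffs = dct2(mouth_roi.astype(np.float64))
    return coeffs[:block, :block].ravel()

def sequence_features(frames, block: int = 4) -> np.ndarray:
    """Stack per-frame features into a (num_frames, block * block) array."""
    return np.stack([frame_features(f, block) for f in frames])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical input: 10 frames of a 16 x 24 grayscale mouth region.
    frames = rng.random((10, 16, 24))
    print(sequence_features(frames).shape)
```

A sequence of such vectors, one per video frame, is the kind of observation sequence a hidden Markov model can be trained on; the low-frequency block keeps coarse mouth shape and intensity while discarding fine detail.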
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, PKU Core, 2006, No. 12, pp. 85-88 (4 pages)
Keywords: lip movement, speaker identification, speaker verification, hidden Markov model (HMM), discrete cosine transform (DCT)
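The verification results in the abstract are reported as equal error rates (EER): the operating point at which the false acceptance rate equals the false rejection rate. The sketch below estimates the EER from two lists of match scores by sweeping a threshold over the observed scores. The function name `equal_error_rate`, the convention that higher scores mean a better match, and the sample scores are all assumptions for illustration; the paper's own scoring procedure is not described in the abstract.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores) -> float:
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))

    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)  # impostor trials wrongly accepted
        frr = np.mean(genuine < t)    # genuine trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            # Take the midpoint where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)

if __name__ == "__main__":
    # Hypothetical scores, not data from the paper.
    genuine = [0.9, 0.8, 0.7, 0.6]
    impostor = [0.1, 0.2, 0.3, 0.65]
    print(equal_error_rate(genuine, impostor))
```

With only a finite set of trials the two error curves rarely cross exactly, which is why the sketch reports the midpoint at the threshold where they are closest.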
References (8)

  • 1 Yao Hongxun, Gao Wen, Wang Rui, Lang Xianbo. Visual language: a survey of lipreading [J]. Acta Electronica Sinica (电子学报), 2001, 29(2): 239-246.
  • 2 J Luettin, N A Thacker, S W Beet. Speechreading Using Shape and Intensity Information [C]. In: Proc Int Conf on Spoken Language Processing, 1996
  • 3 J Luettin, N A Thacker, S W Beet. Speaker identification by lip-reading [C]. In: Proceedings of the 4th Int Conf on Spoken Language Processing (ICSLP'96), 1996, 1: 62-65
  • 4 M Acheroy et al. Multi-modal person verification tools using speech and images [C]. In: Proc Europ Conf on Multimedia Applications, Services and Techniques, 1996
  • 5 Yao Hongxun, Gao Wen, Li Jingmei, Lü Yajuan, Wang Rui. A real-time lip localization method for mouth shape recognition [J]. Journal of Software (软件学报), 2000, 11(8): 1126-1132.
  • 6 N A Fox, R B Reilly. Audio-Visual Speaker Identification Based on the Use of Dynamic Audio and Visual Features [C]. In: Proceedings of the 4th Int Conf on Audio- and Video-Based Biometric Person Authentication (AVBPA), Guildford, UK, 2003: 743-751
  • 7 S Lucey, T Chen. Improved audio-visual speaker recognition via the use of a hybrid combination strategy [C]. In: Conf on Audio- and Video-Based Biometric Person Authentication (AVBPA), Guildford, UK, 2003
  • 8 L R Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition [J]. Proceedings of the IEEE, 1989, 77(2)
