
Summary of Acoustic Models in Speech Recognition (cited by: 6)
Abstract Intelligent speech technology comprises speech recognition, natural language processing, and speech synthesis. Speech recognition is a key technology for human-computer interaction, and a recognition system usually requires both an acoustic model and a language model. The rise of neural networks has led to a sharp increase in the number of acoustic models, and combining neural-network-based acoustic models with traditional recognition models has greatly advanced speech recognition. As the front end of human-computer interaction, speech recognition has many research directions. This paper summarizes the current state of acoustic model research in three of them: text recognition, speaker recognition, and emotion recognition. It traces the evolution of speech recognition technology in as much detail as possible, to provide a useful reference for future research. It also summarizes and compares the mainstream speech recognition methods, introduces the advantages of end-to-end speech recognition models, analyzes development trends, and finally presents the challenges that current speech recognition tasks face.
Authors YE Shuo (叶硕); CHU Yu (褚钰); WANG Yi (王祎); LI Tian-gang (李田港) (Wuhan Research Institute of Posts and Telecommunications, Wuhan 430000, China)
Source Computer Technology and Development (《计算机技术与发展》), 2020, No. 3, pp. 181-186 (6 pages)
Funding Major Project of the Hubei Province Science and Technology Innovation Special Program, 2018 (2018AAA063).
Keywords speech recognition; acoustic model; neural network; deep learning