Investigating the Effect of Speaking Rate on Speech Formant Frequencies Using Pole-Locus Plots


Abstract: Formant frequencies of speech correspond one-to-one with the poles of the vocal tract system. Building on this correspondence, and addressing the problem that changes in speaking rate alter speech parameters, this paper proposes using pole-locus plots to investigate how speaking rate affects formants, and reports experiments with the method. An inverse filter is used to extract the poles of monosyllables and of connected digits, each spoken at a fast and a slow rate; the poles of all frames of an utterance together form its pole-locus plot. For monosyllables, the pole loci at the two rates are essentially the same. For connected digits, the pole loci of slow speech tend to span a larger dynamic range than those of fast speech, meaning that the formant frequencies vary more over the course of the utterance. The results indicate that speaking rate has no significant effect on the formant parameters of isolated monosyllables, but a clear effect on those of connected-word speech, where slow speech shows more formant variation.
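The abstract does not spell out the inverse filter or the framing parameters, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes standard autocorrelation-method linear prediction, whose prediction-error filter A(z) acts as an inverse filter and whose roots are the poles of the all-pole vocal tract model. The function names, the model order, the 25 ms frame length, and the 10 ms hop are all illustrative assumptions.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (an assumed stand-in for the paper's inverse filter).

    Returns [1, a1, ..., ap], the coefficients of the inverse filter A(z).
    Silent (all-zero) frames are not handled in this sketch.
    """
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order
    r = np.correlate(windowed, windowed, mode="full")[n - 1:n + order]
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:order + 1])
    return np.concatenate(([1.0], a))

def frame_poles(frame, order):
    """Poles of 1/A(z) for one frame, keeping only the upper half-plane."""
    poles = np.roots(lpc_coefficients(frame, order))
    return poles[poles.imag > 0]

def pole_locus(signal, fs, order=10, frame_len=0.025, hop=0.010):
    """Collect the poles of every frame; plotting them in the z-plane
    gives a pole-locus plot for the utterance."""
    n, h = int(frame_len * fs), int(hop * fs)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n + 1, h)]
    return np.concatenate([frame_poles(f, order) for f in frames])

def pole_to_formant(pole, fs):
    """Map a pole r*exp(j*theta) to a frequency and a bandwidth in Hz."""
    freq = np.angle(pole) * fs / (2 * np.pi)
    bandwidth = -np.log(np.abs(pole)) * fs / np.pi
    return freq, bandwidth
```

Under these assumptions, a scatter plot of the real and imaginary parts of `pole_locus(signal, fs)` for a fast and a slow recording of the same utterance would give the kind of side-by-side comparison the abstract describes: a wider spread of pole angles corresponds to more formant movement.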

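Authors: Hong Xuemin (洪学敏), Liu Huihua (刘惠华)

Affiliations: Department of Communication Engineering, Xiamen University; Department of Electronic and Information Engineering, Beijing Information Science and Technology University

Source: Journal of Beijing Information Science and Technology University (Natural Science Edition), 2015, No. 5, pp. 57-60 (4 pages)

Funding: Science and Technology Development Fund Project of the Beijing Municipal Education Commission (KM200410772004)

Keywords: speaking rate; speech formant; speech parameter; inverse filter; pole-locus plot

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]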



