Time-Frequency Characteristic Analysis and Endpoint Detection of Noise-Corrupted Speech Based on Wavelet Packet Decomposition
Abstract To address the mode-mixing problem that the Hilbert-Huang transform (HHT) exhibits in speech processing, this paper proposes a speech time-frequency analysis method based on wavelet packet decomposition (WPD). First, the noise-corrupted speech is decomposed by WPD, empirical mode decomposition (EMD) is applied to each component separately, and the resulting intrinsic mode functions (IMFs) are screened with a correlation-coefficient threshold criterion. Next, the Hilbert spectrum and the instantaneous energy spectrum of the speech signal are constructed. Finally, the WPD-based HHT instantaneous energy spectrum is applied to endpoint detection of noise-corrupted speech. Experimental results indicate that, compared with the original generalized dimension (OGD) and spectral entropy (SE) algorithms, the proposed method is more accurate, more stable, and more adaptive. It effectively describes the nonlinear, non-stationary time-frequency characteristics of speech signals and offers a new approach for speech signal research.
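The IMF screening step mentioned in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the threshold value or the exact form of the criterion, so the Pearson coefficient, the threshold of 0.1, the function names `pearson` and `select_components`, and the synthetic sinusoids standing in for IMFs are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(vx * vy)


def select_components(signal, components, threshold=0.1):
    """Indices of components whose |correlation| with the signal reaches the threshold.

    The threshold value is an assumption; the abstract does not specify one.
    """
    return [i for i, c in enumerate(components)
            if abs(pearson(signal, c)) >= threshold]


# Synthetic stand-ins for IMFs (assumption: real IMFs would come from EMD
# applied to each wavelet packet component of the noisy speech).
n = 400
tone = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]        # dominant part
weak = [0.3 * math.sin(2 * math.pi * 40 * k / n) for k in range(n)]  # weaker part
spurious = [0.01 * math.sin(2 * math.pi * 97 * k / n) for k in range(n)]  # not in signal

signal = [a + b for a, b in zip(tone, weak)]
components = [tone, weak, spurious]

# Components uncorrelated with the signal are treated as spurious and dropped.
kept = select_components(signal, components, threshold=0.1)
print(kept)
```

In this toy case the two sinusoids that actually make up the signal are kept and the unrelated one is discarded; with real speech the retained IMFs would then feed the Hilbert spectrum and instantaneous energy computation.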
Source Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 2, pp. 293-297 (5 pages)
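Funding Supported by the National Natural Science Foundation of China (60302027) and the Scientific Research Program of the Zhejiang Provincial Department of Education (Y201018050)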
Keywords speech endpoint detection; Hilbert-Huang transform; time-frequency analysis; correlation coefficient; threshold criterion; wavelet packet decomposition
