
Voice conversion using dynamic inter-frame features (Cited by 1)
Abstract: In conventional Gaussian mixture model (GMM)-based voice conversion, the predicted target spectrum is over-smoothed, which degrades the quality of the converted speech. This paper analyzes the cause of the over-smoothing and proposes an improved conversion algorithm that exploits dynamic inter-frame features: continuity and variance terms are added to the objective function used for parameter estimation, which improves the inter-frame continuity of the mapped features and maximizes their variance, thereby overcoming the over-smoothing. Experiments show that, while preserving the target-speaker tendency of the converted speech, the algorithm raises the mean opinion score of speech quality from 3.11 to 3.89, demonstrating that dynamic features are important for high-quality voice conversion.
Source: Journal of Tsinghua University (Science and Technology) (EI, CAS, CSCD, Peking University core journal), 2006, No. 10: 1767-1770, 1775 (5 pages)
Funding: National Natural Science Foundation of China (60275014)
Keywords: voice conversion; GMM (Gaussian mixture model); dynamic feature; variance
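The abstract describes two ingredients: a GMM-based spectral mapping, and a variance-maximizing term that counters over-smoothing (in the spirit of reference 7, Toda et al. 2005). As a rough illustration only, and not the paper's exact formulation, the sketch below implements frame-wise GMM regression followed by a global-variance postfilter that rescales the converted trajectory's per-dimension variance up to a target value; all parameter names and the diagonal-covariance assumption are hypothetical.

```python
import numpy as np

def gmm_convert(x, weights, mu_x, mu_y, var_x, cov_xy):
    """Frame-wise GMM regression E[y|x] under a joint GMM with
    diagonal covariances (hypothetical trained parameters).
    Shapes: x (D,), all mixture parameters (M, D)."""
    diff = x[None, :] - mu_x                              # (M, D)
    # log responsibility of each mixture for this frame
    log_p = -0.5 * np.sum(diff**2 / var_x + np.log(2 * np.pi * var_x), axis=1)
    log_p += np.log(weights)
    gamma = np.exp(log_p - log_p.max())
    gamma /= gamma.sum()
    # per-mixture conditional mean: mu_y + cov_xy / var_x * (x - mu_x)
    cond = mu_y + cov_xy / var_x * diff                   # (M, D)
    return gamma @ cond                                   # (D,)

def gv_postfilter(Y, gv_target):
    """Rescale the per-dimension variance of a converted trajectory
    Y (T, D) up to gv_target (D,), countering over-smoothing."""
    mean = Y.mean(axis=0)
    gv_conv = Y.var(axis=0)
    scale = np.sqrt(gv_target / np.maximum(gv_conv, 1e-10))
    return mean + scale * (Y - mean)
```

The postfilter leaves the per-dimension mean unchanged and maps the converted global variance exactly onto the target; the paper instead builds continuity and variance into the estimation objective itself, which this sketch does not attempt.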

References (7)

  • 1 Zuo Guoyu, Liu Wenju, Ruan Xiaogang. Research and progress of voice conversion technology [J]. Acta Electronica Sinica, 2004, 32(7): 1165-1172. (Cited by 32)
  • 2 Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion [J]. IEEE Trans Speech and Audio Processing, 1998, 6: 131-142.
  • 3 Kawahara H, Masuda-Katsuse I, de Cheveigné A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds [J]. Speech Communication, 1999, 27: 187-207.
  • 4 Mou Xiaolong, Hu Qixiu, Wu Wenhu. A text-independent speaker identification system using composite strategies [J]. Journal of Tsinghua University (Science and Technology), 1997, 37(3): 16-19. (Cited by 6)
  • 5 Toda T, Saruwatari H, Shikano K. Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum [C] // Proc ICASSP, 2001, 2: 841-844.
  • 6 CHEN Yining, CHU Min, Chang E, et al. Voice conversion with smoothed GMM and MAP adaptation [C] // Proc Eurospeech, Geneva, Switzerland, 2003: 2413-2416.
  • 7 Toda T, Black A W, Tokuda K. Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter [C] // Proc ICASSP, Philadelphia, USA, 2005: 9-12.

Secondary references (56)

  • 1 Kuwabara H, Sagisaka Y. Acoustic characteristics of speaker individuality: Control and conversion [J]. Speech Communication, 1995, 16(2): 165-173.
  • 2 Klatt D H, Klatt L C. Analysis, synthesis, and perception of voice quality variations among female and male talkers [J]. J Acoust Soc Am, 1990, 87(2): 820-857.
  • 3 Milenkovic P H. Voice source model for continuous control of pitch period [J]. J Acoust Soc Am, 1993, 93(2): 1087-1096.
  • 4 Matsumoto H, et al. Multidimensional representation of personal quality of vowels and its acoustical correlates [J]. IEEE Trans Audio and Electroacoustics, 1973, 21(5): 428-436.
  • 5 Furui S. Research on individuality features in speech waves and automatic speaker recognition techniques [J]. Speech Communication, 1986, 5(2): 183-197.
  • 6 Lee K S, et al. A new voice transformation based on both linear and nonlinear prediction [C] // Proc ICSLP, Philadelphia, USA, 1996: 1401-1404.
  • 7 Arslan L M. Speaker transformation algorithm using segmental codebooks (STASC) [J]. Speech Communication, 1999, 28(3): 211-226.
  • 8 Mizuno H, Abe M. Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt [J]. Speech Communication, 1995, 16(2): 165-173.
  • 9 Yoshimura T, et al. Speaker interpolation in HMM-based speech synthesis system [C] // Proc Eurospeech, Rhodes, Greece, 1997: 2523-2526.
  • 10 Childers D G. Glottal source modeling for voice conversion [J]. Speech Communication, 1995, 16(2): 127-138.

Co-cited literature (36)

References shared with citing literature (13)

  • 1 Zuo Guoyu, Liu Wenju, Ruan Xiaogang. Research and progress of voice conversion technology [J]. Acta Electronica Sinica, 2004, 32(7): 1165-1172. (Cited by 32)
  • 2 Sun Jun, Dai Beiqian, Zhang Jian. Source-to-target speaker glottal wave conversion based on GMM and probability-corrected codebooks [J]. Journal of Data Acquisition and Processing, 2007, 22(1): 19-24. (Cited by 2)
  • 3 Abe M, Nakamura S, Shikano K. Voice conversion through vector quantization [C] // Proc ICASSP, 1988, 1: 655-658.
  • 4 Arslan L M. Speaker transformation algorithm using segmental codebooks [J]. Speech Communication, 1999, 28(3): 211-226.
  • 5 Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion [J]. IEEE Trans Speech and Audio Processing, 1998, 6(2): 131-142.
  • 6 Kain A, Macon M W. Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction [C] // Proc ICASSP, 2001, 2: 813-816.
  • 7 Quatieri T F. Discrete-Time Speech Signal Processing: Principles and Practice. Beijing: Publishing House of Electronics Industry, 2004.
  • 8 Toda T, Saruwatari H, Shikano K. Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum [C] // Proc ICASSP, 2001, 2: 841-844.
  • 9 CHEN Yining, CHU Min, Chang E. Voice conversion with smoothed GMM and MAP adaptation [C] // Proc Eurospeech, Geneva, Switzerland, 2003, 1: 2413-2416.
  • 10 Toda T, Black A W, Tokuda K. Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter [C] // Proc ICASSP, 2005, 1: 9-12.

Citing literature (1)

Secondary citing literature (2)
