
Quantization of LSF parameters using a Gaussian mixture model (基于Gaussian混合模型的LSF参数量化方法)

Cited by: 2
Abstract: To quantize line spectral frequency (LSF) parameters efficiently, an LSF quantization algorithm based on the Gaussian mixture model (GMM) is proposed. Each LSF vector is assumed to belong to one of the Gaussian components of the GMM and is quantized with a quantizer designed for Gaussian random vectors. The bit allocation for a Gaussian vector is derived from the exact quantization error of a Gaussian variable, and each dimension of the LSF vector is quantized with a non-uniform scalar quantizer for a Gaussian random variable. The algorithm is compared with split vector quantization (Split-VQ) and with probability density function (PDF) based quantization. The memoryless LSF quantizer achieves transparent quantization at 21 b/frame, 3 b fewer than conventional Split-VQ.
Source: Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list, 2006, No. 10, pp. 1727-1730 (4 pages)
Funding: National Natural Science Foundation of China (grant 60272020)
Keywords: speech coding; vector quantization; Gaussian mixture model (GMM); line spectral frequency (LSF)
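
The abstract names two building blocks: a non-uniform scalar quantizer for a Gaussian variable, and a bit allocation driven by the exact quantization error. The following is a minimal Python sketch of both, not the authors' implementation. It assumes the classic Lloyd-Max iteration (in the sense of Max's "Quantizing for minimum distortion") for a unit-variance Gaussian and a greedy one-bit-at-a-time allocation. The greedy rule, the function names, and the parameters `iters` and `max_bits` are all assumptions made for illustration.

```python
import bisect
import math

def pdf(x):
    """Standard normal density (evaluates to 0.0 at +/- infinity)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lloyd_max_gaussian(n_levels, iters=500):
    """Lloyd-Max design for N(0, 1): returns (levels, edges, mse)."""
    # Start from evenly spaced reconstruction levels in [-2, 2].
    levels = [-2.0 + 4.0 * (k + 0.5) / n_levels for k in range(n_levels)]
    for _ in range(iters):
        # Nearest-neighbour condition: cell edges are midpoints of levels.
        mids = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        edges = [-math.inf] + mids + [math.inf]
        # Centroid condition: each level is the conditional mean of its cell.
        levels = [(pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
                  for lo, hi in zip(edges, edges[1:])]
    probs = [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]
    # With centroid levels, the MSE is the variance minus the level power.
    mse = 1.0 - sum(p * y * y for p, y in zip(probs, levels))
    return levels, edges, mse

def quantize(x, mean, std, levels, edges):
    """Quantize x ~ N(mean, std^2); returns (index, reconstruction)."""
    z = (x - mean) / std
    idx = bisect.bisect_right(edges[1:-1], z)
    return idx, mean + std * levels[idx]

def allocate_bits(variances, total_bits, max_bits=6):
    """Greedy allocation (an assumption): each bit goes to the dimension
    whose exact Lloyd-Max distortion drops the most."""
    if total_bits > max_bits * len(variances):
        raise ValueError("total_bits exceeds max_bits per dimension")
    # table[b] = MSE of a b-bit quantizer for N(0, 1); zero bits gives 1.0.
    table = [1.0] + [lloyd_max_gaussian(2 ** b)[2]
                     for b in range(1, max_bits + 1)]
    bits = [0] * len(variances)
    for _ in range(total_bits):
        gains = [v * (table[b] - table[b + 1]) if b < max_bits else -1.0
                 for v, b in zip(variances, bits)]
        best = max(range(len(bits)), key=gains.__getitem__)
        bits[best] += 1
    return bits
```

For two levels the iteration settles at reconstruction values of ±sqrt(2/π) with a distortion of 1 − 2/π, the known closed-form result for a one-bit Gaussian quantizer. In a GMM setting, `quantize` would be applied per dimension after decorrelating the vector with the selected component's statistics; that step is omitted here.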

References (9)

  • [1] Gray R M, Neuhoff D L. Quantization[J]. IEEE Trans Inform Theory, 1998, 44: 2325-2383.
  • [2] Gersho A, Gray R M. Vector Quantization and Signal Compression[M]. New York: Wiley, 1994.
  • [3] Subramaniam A D, Rao B D. PDF optimized parametric vector quantization of speech line spectral frequencies[J]. IEEE Trans Speech Audio Processing, 2003, 11(2): 130-142.
  • [4] Reynolds D A, Rose R C. Robust text-independent speaker identification using Gaussian mixture speaker models[J]. IEEE Trans Speech Audio Processing, 1995, 3(1): 72-83.
  • [5] Hedelin P, Skoglund J. Vector quantization based on Gaussian mixture models[J]. IEEE Trans Speech Audio Processing, 2000, 8: 385-401.
  • [6] Dempster A, Laird N, Rubin D. Maximum likelihood from incomplete data via the EM algorithm[J]. J R Statist Soc, 1977, 39: 1-38.
  • [7] Huang J, Schultheiss P M. Block quantization of correlated Gaussian random variables[J]. IEEE Trans Commun, 1963, 11(3): 289-296.
  • [8] Max J. Quantizing for minimum distortion[J]. IRE Trans Inform Theory, 1960, 6: 7-12.
  • [9] Paliwal K K, Atal B S. Efficient vector quantization of LPC parameters at 24 bits/frame[J]. IEEE Trans Speech Audio Processing, 1993, 1(1): 3-14.

Co-cited literature (20)

  • [1] ETSI EN 301 704 V7.2.1: Adaptive Multi-Rate (AMR) Speech Transcoding[S]. 2000.
  • [2] ITU-T G.729: Coding of Speech at 8 kbit/s Using Conjugate Structure Algebraic Code Excited Linear Prediction (CS-ACELP)[S]. 1996.
  • [3] ITU-T G.729A: Reduced Complexity 8 kbit/s CS-ACELP Speech Codec[S]. 1996.
  • [4] OTA Y, SUZUKI M, TSUCHINAGA Y, et al. Speech coding translation for IP and 3G mobile integrated network[A]. IEEE International Conference on Communications[C]. New York: IEEE Press, 2002. 114-118.
  • [5] GHENANIA M, LAMBLIN C. Low-cost smart transcoding algorithm between ITU-T G.729 (8 kbit/s) and 3GPP NB-AMR (12.2 kbit/s)[A]. European Signal Processing Conference[C]. Vienna: EUSIPCO Press, 2004, (3): 1681-1684.
  • [6] WU Jinchi (吴金池). Research on speech recognition systems[D]. National Central University, Taiwan, 2003. 9-17.
  • [7] KAIN A B. High Resolution Voice Transformation[D]. Oregon Health and Science University, 2001. 36-54.
  • [8] KANG Yongguo, SHUANG Zhiwei, TAO Jianhua, et al. Voice conversion algorithm combining Gaussian mixture model and codebook mapping[A]. 8th National Conference on Man-Machine Speech Communication[C]. 2005. 293-297.
  • [9] ITU-T P.800.1: Mean Opinion Score (MOS) Terminology[S]. 2003.
  • [10] ITU-T P.862.1: Mapping Function for Transforming P.862 Raw Result Scores to MOS-LQO[S]. 2003.

Citing literature (2)

Second-level citing literature (1)
