An Improved Speech Detection Algorithm Based on Time-domain Parameter (original title: 一种改进的基于时域参数的语音切分算法)

Cited by: 3


Abstract: This paper examines time-domain speech segmentation and, building on earlier work, proposes an improved algorithm that is adaptive, searches both forwards and backwards, and detects short-term pulse noise. The algorithm relies mainly on short-time parameters of the speech signal, including frame amplitude and zero-crossing rate (ZCR), and sets the thresholds needed for segmentation by a statistical method. Because background frames and unvoiced frames differ in ZCR, it then searches further for frames that qualify as unvoiced, and it also filters out short-duration pulse noise. Experiments show that the algorithm is accurate and robust, even in noisy conditions: with a tolerance of 60 ms, the segmentation error rate on the original speech is 5.04%, and for noisy speech with a signal-to-noise ratio (SNR) of at least 2 dB the error rate is 10% to 20% (that is, an accuracy of about 80% to 90%).
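
The abstract names three ingredients: statistically derived thresholds on short-time parameters, a forward and backward search driven by ZCR, and rejection of short pulse noise. The sketch below shows one way these could fit together in Python. It is not the authors' implementation. The function names, the use of the first `n_background` frames as a background estimate, the `mean + k * std` threshold rule, and the defaults for `frame_len`, `k`, `min_frames` and `max_extend` are all assumptions made for illustration, since the abstract does not give these details.

```python
from statistics import mean, stdev


def frame_features(samples, frame_len):
    """Return per-frame mean absolute amplitude and zero-crossing rate (ZCR)."""
    amps, zcrs = [], []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        amps.append(sum(abs(x) for x in frame) / frame_len)
        # A crossing is counted when two neighbouring samples differ in sign.
        crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
        zcrs.append(crossings / (frame_len - 1))
    return amps, zcrs


def adaptive_threshold(values, n_background, k):
    """Assumed rule: mean + k * std over the leading (background) frames."""
    head = values[:n_background]
    return mean(head) + k * stdev(head)


def runs_of_true(flags):
    """Return half-open (start, end) index pairs for each run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def detect_segments(samples, frame_len=160, n_background=10, k=3.0,
                    min_frames=3, max_extend=5):
    """Return speech segments as half-open (start_frame, end_frame) pairs."""
    amps, zcrs = frame_features(samples, frame_len)
    amp_thr = adaptive_threshold(amps, n_background, k)
    zcr_thr = adaptive_threshold(zcrs, n_background, k)
    segments = []
    for start, end in runs_of_true([a > amp_thr for a in amps]):
        if end - start < min_frames:
            continue  # too short to be speech: treated as pulse noise
        # Backward search: absorb low-amplitude, high-ZCR frames before the run.
        lo = start
        while lo > 0 and start - lo < max_extend and zcrs[lo - 1] > zcr_thr:
            lo -= 1
        # Forward search: the same test on the frames after the run.
        hi = end
        while hi < len(zcrs) and hi - end < max_extend and zcrs[hi] > zcr_thr:
            hi += 1
        segments.append((lo, hi))
    return segments
```

Multiplying a returned frame index by `frame_len` gives the sample offset. In this sketch, extended segments are not merged if they end up touching, and pulse rejection is applied before the ZCR search so that an isolated click cannot be widened into a segment.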

Authors: Lin Fan (林帆), Xu Mingxing (徐明星)

Affiliation: Department of Computer Science, Sun Yat-sen University (中山大学计算机科学系)

Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 4, pp. 164-167 (4 pages)

Funding: Key project of the National Natural Science Foundation (grant No. 60433030)

Keywords: speech segmentation; short-time parameters; adaptive; forward and backward search; short-term pulse noise detection

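Classification: TP391.12 [Automation and Computer Technology: Computer Application Technology]; TN912.3 [Electronics and Telecommunications: Communication and Information Systems]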


