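Abstract
A high-quality speech recognition or speech synthesis system needs the support of a high-quality continuous speech database whose design is guided by phonetic and linguistic knowledge and is scientifically sound, concise, and effective. At the present stage, a Chinese speech database should be limited to the segmental aspects of read speech. To describe sound variation in continuous speech, the following speech units are considered: (1) Syllables without tone (401). (2) Inter-syllabic diphones (415). (3) Inter-syllabic triphones (3035), obtained by rule-based generalization from 37 basic phones, using research results on inter-syllabic formant transitions. (4) The final-initial structures of all inter-syllabic transition segments, merged by the same method as the triphones (781 in total). To increase the variety of prosodic structures, and with the post-processing stage of speech recognition systems in mind, the corpus also includes 17 basic sentence patterns of Chinese. The 1993 and 1994 issues of the People's Daily, the "百家报刊精选" collection of selected newspapers and periodicals, and a number of television scripts and dictionary word lists served as the raw text of the corpus. From these, 2185 sentences and 388 phrases were selected as reading material; they cover 99.8% of the toneless syllables, 100% of the diphones, 99.6% of the triphones, and all 17 sentence patterns.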
Well-developed continuous speech recognition systems need a high-quality, scientifically designed, succinct and valid continuous speech database. At the first stage, the database should be mainly limited to read speech. To describe the very complex variations in continuous speech, we propose the following speech units: (1) 401 syllables without tone. (2) 415 inter-syllabic diphones. (3) 3035 inter-syllabic triphones. (4) 781 inter-syllabic final-initial structures. We also give 17 sentence patterns to include prosodic phenomena. Using an automatic method, 2185 sentences and 388 phrases are collected according to the above phonetic rules from a large corpus (recent years of the People's Daily and other sources) as the read text of a continuous speech recognition database in Standard Chinese. This set of sentences covers 99.8% of syllables without tone, 100% of inter-syllabic diphones, 99.6% of inter-syllabic triphones and 100% of sentence patterns.
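The abstract does not say how the automatic selection works. As an illustration only, one common way to pick a small set of sentences that covers many units is a greedy set-cover pass. The sketch below assumes that approach and simplifies the unit inventory: each sentence is a list of toneless syllables, and adjacent syllable pairs stand in for the inter-syllabic units. The function names `units` and `greedy_select` are hypothetical and do not come from the paper.

```python
def units(sentence):
    """Return the units found in one sentence (a list of toneless syllables).

    Assumption: adjacent syllable pairs are a simplified stand-in for the
    paper's inter-syllabic diphones and triphones.
    """
    syllables = {("syl", s) for s in sentence}
    junctions = {("junction", a, b) for a, b in zip(sentence, sentence[1:])}
    return syllables | junctions


def greedy_select(corpus):
    """Greedily pick sentences until every unit in the corpus is covered.

    Returns the indices of the chosen sentences and the fraction of
    corpus units they cover.
    """
    target = set().union(*(units(s) for s in corpus))
    covered = set()
    chosen = []
    remaining = list(range(len(corpus)))
    while remaining and covered != target:
        # Take the sentence that adds the most units not yet covered.
        best = max(remaining, key=lambda i: len(units(corpus[i]) - covered))
        gain = units(corpus[best]) - covered
        if not gain:
            break
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    coverage = len(covered) / len(target) if target else 1.0
    return chosen, coverage
```

Called on a list of syllabified sentences, `greedy_select` returns the indices of a covering subset together with the achieved coverage, which is the kind of figure the abstract reports for each unit type.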
Source
《声学学报》
EI
CSCD
PKU Core (Peking University Core Journals)
1999, No. 3, pp. 236-247 (12 pages)
Acta Acustica
Funding
Supported by the National 863 High-Tech Program, grant 863-306-03-09-1