2 articles found
Speaker adapted dynamic lexicons containing phonetic deviations of words
1
Authors: Bahram VAZIRNEZHAD, Farshad ALMASGANJ, Seyed Mohammad AHADI, Ari CHANEN. Journal of Zhejiang University-Science A (Applied Physics & Engineering), 2009, No. 10, pp. 1461-1475 (15 pages). Indexed in SCIE, EI, CAS, CSCD.
Speaker variability is an important source of speech variations which makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to the speaker variations is a well-known strategy to cope with the challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of the ASR systems. Although variations of the acoustic features constitute an important portion of the inter-speaker variations, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of variations which are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and also his/her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated later by having appropriate pronunciation variants for the lexicon entries which consider likely phone misclassifications besides pronunciation. In this paper, we introduce speaker adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structures, rate of speech, unigram probabilities and stress to generate pronunciation variants of words. Employing the set of speaker adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.
Keywords: Pronunciation models; Continuous speech recognition; Lexicon adaptation
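For readers unfamiliar with the metric quoted in the abstract above, the following is a minimal sketch of how word error rate (WER) is conventionally computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function names and the toy transcripts are illustrative assumptions and do not come from the paper. The abstract does not say whether its 10.1% and 7.4% figures are absolute or relative reductions, so the helper below shows only the relative form as one possible reading.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction (an assumed reading of 'reduction')."""
    return (baseline_wer - adapted_wer) / baseline_wer


# Toy transcripts (made up): one substitution and one deletion
# against a four-word reference.
wer = word_error_rate("a b c d", "a x c")
```

With these toy inputs the edit distance is 2 over 4 reference words, so `wer` is 0.5.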
Construction of a Multi-modal Domain Knowledge Graph Based on LEBERT (cited by: 4)
2
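Authors: LI Huayu (李华昱), FU Yafeng (付亚凤), YAN Yang (闫阳), LI Jiarui (李家瑞). Computer Systems & Applications (《计算机系统应用》), 2022, No. 11, pp. 79-90 (12 pages).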
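Multi-modal knowledge graphs (MMKGs) have emerged in recent years as a research hotspot in artificial intelligence. This paper presents a method for constructing a multi-modal domain knowledge graph, aimed at the problem that the knowledge system of the computer science discipline is large and scattered. First, a systematic multi-modal knowledge graph is built by crawling multi-modal data related to computer science. Because building a multi-modal knowledge graph requires substantial manpower and resources, the paper trains a joint entity-relation extraction model based on the LEBERT model and relation extraction rules, and finally produces a multi-modal knowledge graph for the computer science domain that can extract relation triples automatically.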
Keywords: multi-modal; knowledge graph; domain; LEBERT; relation extraction rules; lexicon adapter
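As a rough illustration of what a "relation triple" and a "relation extraction rule" can look like, here is a minimal sketch. It is not the paper's method: the LEBERT-based joint model is omitted entirely, and the `Triple` type, the rule table, the relation labels and the example sentence are all hypothetical. It only shows the general shape of rule-driven extraction into (head, relation, tail) form.

```python
import re
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A (head entity, relation, tail entity) fact for a knowledge graph."""
    head: str
    relation: str
    tail: str


# Hypothetical rule table: each regular expression maps a surface
# pattern to a relation label. Real rules would be domain specific.
RULES = [
    (re.compile(r"(?P<head>[\w ]+?) is a kind of (?P<tail>[\w ]+)"), "subclass_of"),
    (re.compile(r"(?P<head>[\w ]+?) is part of (?P<tail>[\w ]+)"), "part_of"),
]


def extract_triples(sentence: str) -> List[Triple]:
    """Apply every rule to the sentence and collect the matched triples."""
    triples: List[Triple] = []
    for pattern, relation in RULES:
        for match in pattern.finditer(sentence):
            triples.append(
                Triple(match["head"].strip(), relation, match["tail"].strip())
            )
    return triples


# Made-up example sentence from the computer science domain.
triples = extract_triples("A binary tree is a kind of tree")
```

In a pipeline like the one the abstract describes, a trained model would presumably supply the entity spans, with rules helping to decide the relation between them; this sketch collapses both steps into pattern matching for brevity.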