
BioTrHMM: named entity recognition algorithm based on transfer learning in biomedical texts (cited by: 18)
Abstract: To reduce the amount of labeled target-domain data that biomedical named entity recognition (NER) requires, this paper recasts NER in biomedical texts as a hidden Markov model problem based on transfer learning. The target-domain data set does not need extensive annotation; recognition and classification in the target domain are achieved through transfer learning. Data from a different but related domain serve as an auxiliary data set, and the data gravitation method is used to evaluate how much each auxiliary sample contributes to learning in the target domain. Weights are then computed over the auxiliary and target-domain data sets for transfer learning. Based on this weight-learning model, the paper constructs BioTrHMM, a hidden Markov model algorithm based on transfer learning. Experiments on data sets from the GENIA corpus show that BioTrHMM outperforms the traditional hidden Markov model algorithm and achieves good NER performance with only a small amount of labeled target-domain data.
Authors: Gao Bingtao (高冰涛); Zhang Yang (张阳); Liu Bin (刘斌) (College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China)
Source: Application Research of Computers (《计算机应用研究》); indexed in CSCD and the Peking University Core Journal list; 2019, Issue 1, pp. 45-48 (4 pages)
Funding: National Natural Science Foundation of China (61602388); Fundamental Research Funds for the Central Universities (2452015193, 2452015194, 2452016081)
Keywords: transfer learning; hidden Markov model; named entity recognition; text mining
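
The abstract describes training a hidden Markov model on target-domain data together with weighted auxiliary-domain data. The following is a minimal sketch of that general idea (instance-weighted HMM estimation with Viterbi decoding), not the authors' BioTrHMM implementation. The abstract does not give the data gravitation formula, so the per-sentence weights are assumed to be precomputed and passed in; the function names, the add-one smoothing, the tag names, and the toy sentences are all illustrative assumptions.

```python
import math
from collections import defaultdict


def train_weighted_hmm(weighted_sentences, smoothing=1.0):
    """Accumulate HMM counts from (sentence, weight) pairs.

    Each sentence is a list of (token, tag) pairs. The weight scales every
    count the sentence contributes, so low-weight auxiliary sentences
    influence the model less than target-domain sentences.
    """
    start = defaultdict(float)
    trans = defaultdict(lambda: defaultdict(float))
    emit = defaultdict(lambda: defaultdict(float))
    tags, vocab = set(), set()
    for sentence, weight in weighted_sentences:
        prev = None
        for token, tag in sentence:
            tags.add(tag)
            vocab.add(token)
            emit[tag][token] += weight
            if prev is None:
                start[tag] += weight
            else:
                trans[prev][tag] += weight
            prev = tag
    return {
        "start": start, "trans": trans, "emit": emit,
        "tags": sorted(tags),
        "n_vocab": len(vocab) + 1,  # one extra slot for unseen tokens
        "smoothing": smoothing,
    }


def _log_prob(counts, key, n_outcomes, smoothing):
    """Additively smoothed log-probability of `key` under weighted counts."""
    total = sum(counts.values()) + smoothing * n_outcomes
    return math.log((counts.get(key, 0.0) + smoothing) / total)


def viterbi(model, tokens):
    """Return the most probable tag sequence for a non-empty token list."""
    tags, s = model["tags"], model["smoothing"]
    n_t, n_v = len(tags), model["n_vocab"]
    score = {
        t: _log_prob(model["start"], t, n_t, s)
        + _log_prob(model["emit"][t], tokens[0], n_v, s)
        for t in tags
    }
    back = []
    for token in tokens[1:]:
        new_score, pointers = {}, {}
        for t in tags:
            # Best predecessor tag for reaching tag t at this position.
            best = max(
                tags,
                key=lambda p: score[p] + _log_prob(model["trans"][p], t, n_t, s),
            )
            new_score[t] = (
                score[best]
                + _log_prob(model["trans"][best], t, n_t, s)
                + _log_prob(model["emit"][t], token, n_v, s)
            )
            pointers[t] = best
        back.append(pointers)
        score = new_score
    path = [max(tags, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data: one target-domain sentence (weight 1.0) and one auxiliary
# sentence whose weight (0.5) is assumed to come from a separate
# sample-weighting step such as the data gravitation method.
target = [("IL-2", "PROTEIN"), ("activates", "O"), ("T", "CELL"), ("cells", "CELL")]
auxiliary = [("NF-kB", "PROTEIN"), ("binds", "O"), ("DNA", "DNA")]
model = train_weighted_hmm([(target, 1.0), (auxiliary, 0.5)])
path = viterbi(model, ["IL-2", "activates"])
```

Scaling counts by a per-sentence weight is only one simple way to let auxiliary data contribute less than target data; how BioTrHMM actually combines its weights is described in the paper itself, not in this record.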
