
Protein-Protein Interaction Identification Based on Relational Similarity (cited by: 4)
Abstract: Current protein-protein interaction (PPI) identification systems rely mainly on single sentences as evidence, and their training sets are small because annotated data is scarce. To address these problems, a method for automatic PPI identification is proposed that is framed as relational similarity analysis and uses large-scale text as evidence. First, the set of sentences describing a protein pair is obtained by automatically searching a large-scale biomedical text database (the whole PubMed database). Then, features are extracted from three perspectives (words, phrase structure, and dependency relations) to build a vector space model that represents the relation between a pair of proteins. Finally, the relation between the two proteins is classified according to the similarity between vectors. The training data is taken directly from existing PPI databases, so no additional manual annotation is needed. Experimental results show that the approach achieves a high F-score (74.2%).
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 6, pp. 229-232, 251 (5 pages)
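Funding: Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20103218120024); National Natural Science Foundation of China (61170043); National Natural Science Foundation of China, Young Scientists Fund (61202132); University Youth Science and Technology Innovation Fund (NS2012073)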
Keywords: protein-protein interaction; relational similarity; syntactic analysis; vector space model
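
The three steps named in the abstract (collect the sentences that mention a protein pair, turn them into one feature vector, and decide by vector similarity) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses only word features (the paper also extracts phrase-structure and dependency features), and the cosine measure and nearest-neighbour vote are assumptions, since the abstract does not say which similarity measure or decision rule was used. All function names, protein identifiers, and sentences are hypothetical toy values.

```python
import math
import re
from collections import Counter

# Placeholder tokens that replace the two protein names (an assumption of this sketch).
PLACEHOLDERS = ("PROT_A", "PROT_B")


def pair_vector(sentences, prot_a, prot_b):
    """Build one bag-of-words vector from all sentences mentioning a protein pair."""
    counts = Counter()
    for sentence in sentences:
        # Mask the protein names so the vector describes the relation, not the entities.
        masked = sentence.replace(prot_a, "PROT_A").replace(prot_b, "PROT_B")
        tokens = re.findall(r"[A-Za-z_]+", masked)
        counts.update(t.lower() for t in tokens if t not in PLACEHOLDERS)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(query, labelled, k=1):
    """Return the majority label among the k labelled vectors most similar to query.

    labelled is a list of (vector, interacts) pairs, where interacts is a bool.
    """
    ranked = sorted(labelled, key=lambda item: cosine(query, item[0]), reverse=True)
    top = ranked[:k]
    votes = sum(1 for _, interacts in top if interacts)
    return votes * 2 > len(top)


# Toy training pairs; in the paper the labels come from existing PPI databases.
train = [
    (pair_vector(["P1 binds to P2 in vitro.", "P1 interacts with P2."], "P1", "P2"), True),
    (pair_vector(["P3 and P4 were both measured in serum."], "P3", "P4"), False),
]
query = pair_vector(["P5 binds P6 and interacts with it."], "P5", "P6")
print(classify(query, train))
```

Masking the two protein names before counting words is a design choice of this sketch: it keeps the vector focused on how the pair is described, so a new pair can be compared with known pairs even when the names differ.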
