
Uncovering Atherosclerosis Risk Disease Genes with Entropy-Based Clustering and Double Screening Strategies
Abstract: Atherosclerosis (AS) is a complex pathophysiologic disease characterized by lipid accumulation in the vascular wall and regulated by multiple genetic and environmental factors. Identifying atherosclerotic risk disease genes can broaden the understanding of the mechanism of AS and guide its diagnosis and treatment. Although many computational approaches have been proposed for identifying risk disease genes, balancing inference accuracy against computational efficiency remains a major challenge. In this work, a novel method based on entropy-based clustering and double screening, named ECDS, was introduced to identify atherosclerotic risk disease genes. The algorithm integrates functional genomic information and the topological structure of the protein-protein interaction network into an entropy-based clustering method, and then applies a double screening strategy (a support vector machine and a similarity score) to predict atherosclerosis risk disease genes. ECDS identified 79 risk disease genes from macrophage samples and 113 from foam cell samples. These risk genes share similar biological functions and signaling pathways with known AS disease genes. The results show that ECDS is very effective for identifying atherosclerotic risk disease genes. In addition, ECDS can easily be extended to identify risk genes for other complex diseases.
Source: Acta Biophysica Sinica (《生物物理学报》), 2014, Issue 1, pp. 63-71 (9 pages). Indexed in: CAS, CSCD, Peking University Core Journals.
Funding: Supported by grants from the National Natural Science Foundation of China (61170134, 60775012) and the Doctorate Foundation of Northwestern Polytechnical University (cx201017).
Keywords: Atherosclerosis; Entropy-based clustering; Risk disease genes; Similarity score; Support vector machine
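
The abstract names entropy-based clustering on a protein-protein interaction network but does not give the entropy definition or the clustering procedure. The sketch below is therefore only an illustration under stated assumptions: it uses one common formulation of graph entropy (each vertex contributes the binary entropy of the fraction of its neighbors that lie inside the cluster) and a greedy seed-growth loop. The function names `vertex_entropy`, `graph_entropy`, and `grow_cluster`, and the toy network `toy_ppi`, are hypothetical and are not taken from the paper. The double screening step (support vector machine and similarity score) is not shown.

```python
import math


def vertex_entropy(graph, cluster, v):
    """Binary entropy of the inside/outside split of v's neighbors (assumed definition)."""
    neighbors = graph[v]
    if not neighbors:
        return 0.0
    p_in = sum(1 for u in neighbors if u in cluster) / len(neighbors)
    entropy = 0.0
    for p in (p_in, 1.0 - p_in):
        if p > 0.0:
            entropy -= p * math.log2(p)
    return entropy


def graph_entropy(graph, cluster):
    """Sum of vertex entropies over the whole network for a candidate cluster."""
    return sum(vertex_entropy(graph, cluster, v) for v in graph)


def grow_cluster(graph, seed):
    """Greedily refine a cluster around `seed` so that graph entropy decreases."""
    cluster = {seed} | set(graph[seed])
    best = graph_entropy(graph, cluster)

    # Step 1: drop any seed neighbor whose removal lowers the entropy.
    for v in sorted(graph[seed]):
        trial = cluster - {v}
        e = graph_entropy(graph, trial)
        if e < best:
            cluster, best = trial, e

    # Step 2: add boundary vertices for as long as the entropy keeps decreasing.
    improved = True
    while improved:
        improved = False
        boundary = {u for v in cluster for u in graph[v]} - cluster
        for u in sorted(boundary):
            trial = cluster | {u}
            e = graph_entropy(graph, trial)
            if e < best:
                cluster, best = trial, e
                improved = True
    return cluster, best


# Hypothetical toy network: two triangles joined by a single edge (c-d).
toy_ppi = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

toy_cluster, toy_entropy = grow_cluster(toy_ppi, "a")
```

On this toy network the seed `"a"` should settle on its own triangle, since adding the bridge vertex `"d"` would raise the entropy of `"d"`, `"e"`, and `"f"`. In a setting like the one the abstract describes, the edges would come from a protein-protein interaction database and could be weighted by functional genomic information, which this unweighted sketch does not attempt.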