
Study on Dynamic Clustering Analysis Method for Gene Expression Data Based on Multidimension Pseudo F-statistics

Cited by: 12
Abstract: The K-means clustering algorithm is an iterative relocation algorithm widely used for clustering gene expression data. Given a specified number of clusters K and a clustering objective function, it iteratively updates the clustering until the objective function reaches a minimum, which yields a reasonably good clustering. However, K-means depends strongly on its parameters, and the number of clusters cannot change during the clustering process. To address these shortcomings, the paper introduces the idea of dynamically adjusting the number of clusters, together with a multidimensional pseudo F-statistic, and proposes a dynamic K-means clustering algorithm for gene expression data based on that statistic. Experimental results show that the algorithm can adjust the number of clusters dynamically and identify the best number of clusters, thereby achieving better clustering quality.
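The abstract does not define the multidimensional pseudo F-statistic or the rule used to adjust K, so the following is only a minimal sketch of the general idea, under stated assumptions. It substitutes the classical Calinski-Harabasz pseudo-F, $F = \frac{B/(k-1)}{W/(n-k)}$, where $B$ is the between-cluster and $W$ the within-cluster sum of squares, and it replaces dynamic adjustment with a plain scan over candidate values of K. The function names (`farthest_first`, `kmeans`, `pseudo_f`, `best_k`) and the farthest-first initialization are illustrative choices, not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def farthest_first(data, k):
    """Deterministic initialization (an assumption, not the paper's method):
    start from the first point, then repeatedly add the point that is
    farthest from all centers chosen so far."""
    centers = [list(data[0])]
    while len(centers) < k:
        nxt = max(data, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(data, k, iters=100):
    """Plain K-means: alternate assignment and center update until
    the labels stop changing or the iteration limit is reached."""
    centers = farthest_first(data, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in data]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean(members)
    return labels, centers


def pseudo_f(data, labels, centers):
    """Calinski-Harabasz pseudo-F: (B / (k - 1)) / (W / (n - k)).
    Requires 2 <= k < n."""
    n, k = len(data), len(centers)
    grand = mean(data)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(data, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], grand)
                  for j in range(k))
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def best_k(data, k_min=2, k_max=6):
    """Score every K in [k_min, k_max] and return the K with the largest
    pseudo-F, along with all scores."""
    scores = {}
    for k in range(k_min, k_max + 1):
        labels, centers = kmeans(data, k)
        scores[k] = pseudo_f(data, labels, centers)
    return max(scores, key=scores.get), scores
```

Calling `best_k(data)` on a list of equal-length numeric vectors (for example, one expression profile per gene) returns the selected K and the score for each candidate. A genuinely dynamic variant would split or merge clusters during the iterations instead of rerunning K-means for each K. The abstract does not give enough detail to reconstruct that step.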
Source: Journal of System Simulation (《系统仿真学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, No. 3, pp. 586-589, 601 (5 pages)
Funding: Hunan Provincial Natural Science Foundation (03JJY3095)
Keywords: clustering analysis; gene expression data; pseudo F-statistic; dynamic K-means clustering

