Research on an Improved BIRCH Clustering Analysis Algorithm and Its Application (cited by: 6)

An Improved BIRCH Clustering Algorithm and Its Applications
Abstract: This paper analyzes the shortcomings of the BIRCH (Balanced Iterative Reducing and Clustering Using Hierarchies) algorithm, a clustering algorithm for large datasets, and proposes an improved multi-threshold BIRCH algorithm based on the sum of squared deviations. The improved algorithm uses the sum of squared deviations to establish the correlation between clusters, which improves on relying solely on the distance between cluster centers. The split factor is set to the maximum cluster diameter, which avoids the drawback of choosing it empirically. Finally, the improved algorithm is applied to the cluster analysis of gene sequence graphical representation data.
Source: Journal of Zhanjiang Normal College (《湛江师范学院学报》), 2009, No. 3, pp. 83-87 (5 pages)
Funding: Key Project of the Natural Science Foundation of Hunan Province (06JJ4076)
Keywords: BIRCH algorithm; clustering feature; gene graphical representation data
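
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard BIRCH clustering feature CF = (N, LS, SS) and its usual additivity property. The class and function names (`CF`, `sse_increase`, `max_diameter`) are hypothetical and chosen only for illustration. This is not the paper's implementation, and the record does not show how the paper combines these quantities.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CF:
    """Clustering feature: point count N, linear sum LS, sum of squared norms SS."""
    n: int
    ls: List[float]
    ss: float

    @classmethod
    def from_points(cls, points: Sequence[Sequence[float]]) -> "CF":
        dim = len(points[0])
        ls = [sum(p[i] for p in points) for i in range(dim)]
        ss = sum(x * x for p in points for x in p)
        return cls(len(points), ls, ss)

    def merge(self, other: "CF") -> "CF":
        # CF additivity: merging two clusters adds their features component-wise.
        return CF(self.n + other.n,
                  [a + b for a, b in zip(self.ls, other.ls)],
                  self.ss + other.ss)

    def sse(self) -> float:
        # Sum of squared deviations from the centroid: SS - ||LS||^2 / N.
        return self.ss - sum(x * x for x in self.ls) / self.n

    def diameter(self) -> float:
        # Root mean squared pairwise distance inside the cluster.
        if self.n < 2:
            return 0.0
        sq = (2 * self.n * self.ss - 2 * sum(x * x for x in self.ls)) / (self.n * (self.n - 1))
        return math.sqrt(max(sq, 0.0))


def sse_increase(a: CF, b: CF) -> float:
    """Growth in the sum of squared deviations if clusters a and b were merged."""
    return a.merge(b).sse() - a.sse() - b.sse()


def max_diameter(cfs: Sequence[CF]) -> float:
    """Largest cluster diameter, used here as an assumed data-driven threshold."""
    return max(cf.diameter() for cf in cfs)


a = CF.from_points([[0.0, 0.0], [2.0, 0.0]])
b = CF.from_points([[10.0, 0.0], [12.0, 0.0]])
c = CF.from_points([[0.0, 0.0], [6.0, 0.0]])
print(sse_increase(a, b), max_diameter([a, b, c]))
```

A small `sse_increase` means the two clusters are close in both position and spread, so unlike a pure centroid distance it also accounts for cluster sizes. Taking the threshold from `max_diameter` replaces a hand-tuned constant with a value derived from the data.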