A Web Text Clustering Algorithm Combining Hierarchical and Density-Based Clustering

Algorithm of Web Text Classification Based on Hierarchical and Density Clustering
Abstract: Given the large scale of the experimental data and the complex shapes of the sample data, this paper proposes HL-DBSCAN, a clustering algorithm that combines hierarchical clustering with DBSCAN. The hierarchical idea is used to modify DBSCAN, which on its own has high time complexity and suits only small spatial data sets. The resulting algorithm is more widely applicable, reflects the regularities of each cluster better, and improves classification accuracy. Experiments and analysis of the results show good clustering results and indicate that HL-DBSCAN is feasible for text clustering.
Author: Lin Guoping (林国平)
Source: Journal of Taiyuan Normal University (Natural Science Edition), 2008, No. 3, pp. 45-48 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10671173)
Keywords: hierarchical clustering; DBSCAN algorithm; web text clustering
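
The abstract does not say how HL-DBSCAN combines its two stages, so the following is only a minimal sketch of one possible arrangement, not the paper's algorithm. It assumes a top-down (divisive) hierarchical step that splits the data wherever some coordinate has a gap wider than `eps`, then runs plain DBSCAN inside each leaf. The names `split_by_gaps`, `hl_dbscan_sketch`, `eps`, and `min_pts` are illustrative and do not come from the paper.

```python
from math import dist  # Python 3.8+


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = [k for k in range(len(points))
                  if dist(points[j], points[k]) <= eps]
            if len(nb) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nb)
    return labels


def split_by_gaps(points, idx, eps):
    """Assumed hierarchical step: recursively split the index list `idx`
    at the first coordinate gap wider than eps. Points on opposite sides
    of such a gap are more than eps apart, so no eps-neighborhood crosses
    the split."""
    for d in range(len(points[0])):
        order = sorted(idx, key=lambda i: points[i][d])
        for pos in range(1, len(order)):
            if points[order[pos]][d] - points[order[pos - 1]][d] > eps:
                return (split_by_gaps(points, order[:pos], eps)
                        + split_by_gaps(points, order[pos:], eps))
    return [idx]


def hl_dbscan_sketch(points, eps, min_pts):
    """Run DBSCAN separately in each leaf and renumber clusters globally."""
    if not points:
        return []
    labels = [-1] * len(points)
    next_id = 0
    for leaf in split_by_gaps(points, list(range(len(points))), eps):
        local = dbscan([points[i] for i in leaf], eps, min_pts)
        for i, lab in zip(leaf, local):
            if lab != -1:
                labels[i] = next_id + lab
        next_id += max(local) + 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(hl_dbscan_sketch(points, eps=1.5, min_pts=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Under this assumption, each DBSCAN call only compares points within one leaf, so the quadratic neighbor search runs over smaller groups when the data contains wide gaps. On data without such gaps the split does nothing and the sketch reduces to plain DBSCAN.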