
A Novel Algorithm for Discretization of Continuous Features Based on DBSCAN Clustering (cited by: 2)
Abstract: Discretization of continuous features is an important preprocessing step in data analysis, and data analysis based on rough set theory requires that the discretized result preserve the discernibility relation of the original information system as far as possible. This paper proposes a new discretization algorithm that uses the dependency of the decision feature on the condition feature set in a decision information system as the evaluation function for dynamically adjusting the parameters of the DBSCAN clustering algorithm. The adjustment continues until the dependency of the decision feature on the condition feature set after discretization reaches a pre-specified threshold. Algorithm analysis and experiments show that the algorithm is feasible.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 13, pp. 149-151 (3 pages)
Keywords: discretization; DBSCAN clustering algorithm; feature dependency
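
The abstract describes a loop that discretizes with DBSCAN, measures the dependency of the decision feature on the condition features, and adjusts the clustering parameters until a threshold is met. The sketch below illustrates that idea only; it is not the authors' implementation. The dependency is the standard rough-set measure, the fraction of objects whose condition-equivalence class is consistent in its decision value. Everything else is an assumption made for illustration: the function names, clustering each attribute separately in one dimension, halving eps as the adjustment rule, and giving each noise value its own code.

```python
from collections import defaultdict


def dbscan_1d(values, eps, min_pts):
    """Plain DBSCAN on a single attribute; returns one label per value (-1 = noise)."""
    n = len(values)
    # Brute-force eps-neighborhoods (each point counts itself as a neighbor).
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


def encode(values, eps, min_pts):
    """Map each value to a discrete code: its cluster label, or a per-value noise code.

    Assumption: noise points keep their own code so they stay discernible.
    """
    labels = dbscan_1d(values, eps, min_pts)
    return [lab if lab != -1 else ("noise", v) for lab, v in zip(labels, values)]


def dependency(rows, decisions):
    """Rough-set dependency: share of objects in decision-consistent condition classes."""
    seen = defaultdict(set)
    count = defaultdict(int)
    for row, d in zip(rows, decisions):
        key = tuple(row)
        seen[key].add(d)
        count[key] += 1
    positive = sum(count[k] for k in seen if len(seen[k]) == 1)
    return positive / len(rows)


def discretize(columns, decisions, eps_start, min_pts=2,
               threshold=1.0, shrink=0.5, max_iter=20):
    """Shrink eps (hypothetical adjustment rule) until dependency >= threshold."""
    for k in range(max_iter):
        eps = eps_start * shrink ** k
        rows = list(zip(*(encode(col, eps, min_pts) for col in columns)))
        gamma = dependency(rows, decisions)
        if gamma >= threshold:
            break
    return rows, eps, gamma


if __name__ == "__main__":
    attribute = [1.0, 1.2, 1.4, 3.0, 3.2, 3.4]
    decision = [0, 0, 0, 1, 1, 1]
    rows, eps, gamma = discretize([attribute], decision, eps_start=2.0)
    print(rows, eps, gamma)
```

In this toy run the first eps merges all six values into one interval, so the dependency is 0; after one halving the values split into two intervals that match the decision classes and the loop stops. A smaller eps yields finer intervals, which is why shrinking it was chosen here as the direction of adjustment.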

References (8)

  • 1. Pawlak Z. Rough Sets[J]. Int. Journal of Computer and Information Sciences, 1982, 11(5): 341-356.
  • 2. Pawlak Z. Rough Set: Theory and its Applications to Data Analysis[J]. Cybernetics and Systems, 1998, 29(7): 661-688.
  • 3. Han JW, Kamber M. Data Mining Concepts and Techniques[M]. San Francisco, CA: Morgan Kaufmann Publishers, Inc, 2001.
  • 4. Liu H, Hussain F, Tan CL, et al. Discretization: An Enabling Technique[J]. Data Mining and Knowledge Discovery, 2002, 6(4): 393-423.
  • 5. Nguyen SH, Skowron A. Quantization of Real-valued Attributes, Rough Set and Boolean Reasoning Approaches[C]. In: Proc of the Second Joint Annual Conference on Information Sciences, Wrightsville Beach, North Carolina, 1995: 34-37.
  • 6. Hu XH, Cercone N. Learning in Relational Databases: A Rough Set Approach[J]. Inter J of Computational Intelligence, 1995, 11(2): 323-338.
  • 7. Blake CL, Merz CJ. UCI Repository of Machine Learning Databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
  • 8. Ma Shuai, Wang Tengjiao, Tang Shiwei, Yang Dongqing, Gao Jun. A fast clustering algorithm based on reference points and density[J]. Journal of Software (软件学报), 2003, 14(6): 1089-1095. Cited by: 108.

Second-level references (8)

  • 1. Han JW, Kamber M. Data Mining Concepts and Techniques. Beijing: Higher Education Press, 2001. 145-176.
  • 2. Kaufman L, Rousseeuw PJ. Finding Groups in Data: an Introduction to Cluster Analysis. New York: John Wiley & Sons, 1990.
  • 3. Ester M, Kriegel HP, Sander J, Xu X. A density based algorithm for discovering clusters in large spatial databases with noise. In: Simoudis E, Han JW, Fayyad UM, eds. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining. Portland: AAAI Press, 1996. 226-231.
  • 4. Guha S, Rastogi R, Shim K. CURE: an efficient clustering algorithm for large databases. In: Haas LM, Tiwary A, eds. Proceedings of the ACM SIGMOD International Conference on Management of Data. Seattle: ACM Press, 1998. 73-84.
  • 5. Agrawal R, Gehrke J, Gunopulos D, Raghavan P. Automatic subspace clustering of high dimensional data for data mining application. In: Haas LM, Tiwary A, eds. Proceedings of the ACM SIGMOD International Conference on Management of Data. Seattle: ACM Press, 1998. 94-105.
  • 6. Nanopoulos A, Theodoridis Y, Manolopoulos Y. C^2P: clustering based on closest pairs. In: Apers PMG, Atzeni P, Ceri S, Paraboschi S, Ramamohanarao K, Snodgrass RT, eds. Proceedings of the 27th International Conference on Very Large Data Bases. Roma: Morgan Kaufmann Publishers, 2001. 331-340.
  • 7. Berchtold S, Bohm C, Kriegel H-P. The pyramid-technique: towards breaking the curse of dimensionality. In: Haas LM, Tiwary A, eds. Proceedings of the ACM SIGMOD International Conference on Management of Data. Seattle: ACM Press, 1998. 142-153.
  • 8. Yu C, Ooi BC, Tan K-L, Jagadish HV. Indexing the distance: an efficient method to KNN processing. In: Apers PMG, Atzeni P, Ceri S, Paraboschi S, Ramamohanarao K, Snodgrass RT, eds. Proceedings of the 27th International Conference on Very Large Data Bases. Roma: Morgan Kaufmann Publishers, 2001. 421-430.

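Documents sharing references with this paper: 107

Documents co-cited with this paper: 11

Citing documents: 2

Second-level citing documents: 3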
