Abstract
Most traditional clustering methods are based on distances or similarities between samples, which requires the data under analysis to be quantitative. Data mining, however, involves large amounts of qualitative data, for which these traditional methods are no longer feasible, so a clustering method that handles qualitative data effectively is needed. Rough set theory is an effective tool for dealing with qualitative data. After a detailed exposition of the relevant rough set concepts, this paper uses the concept of attribute importance to propose a clustering method that can handle qualitative data effectively. An empirical analysis on data is then carried out with the method and yields good clustering results.
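To make the terms equivalence relation and attribute importance concrete, here is a minimal Python sketch on a toy qualitative table. It is an illustration only and not the paper's algorithm: the abstract does not give the paper's definition of attribute importance, so the granularity-based measure used here (one minus the ratio of indiscernible pairs with and without the attribute) is an assumption, and the names `partition`, `ind_size`, `significance`, and the sample data are hypothetical.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Equivalence classes of the indiscernibility relation IND(attrs):
    objects fall in the same class when they agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for idx, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(idx)
    return sorted(blocks.values(), key=min)

def ind_size(rows, attrs):
    """Number of ordered pairs (x, y) that are indiscernible under attrs."""
    return sum(len(block) ** 2 for block in partition(rows, attrs))

def significance(rows, attrs, a):
    """Assumed measure: 1 - |IND(attrs)| / |IND(attrs without a)|.
    It is 0 when dropping `a` merges no classes, and grows as more merge."""
    rest = [b for b in attrs if b != a]
    return 1 - ind_size(rows, attrs) / ind_size(rows, rest)

# Hypothetical qualitative data: color and shape carry the same information.
rows = [
    {"color": "red",  "shape": "round",  "size": "small"},
    {"color": "red",  "shape": "round",  "size": "large"},
    {"color": "blue", "shape": "square", "size": "small"},
    {"color": "blue", "shape": "square", "size": "large"},
]
attrs = ["color", "shape", "size"]

for a in attrs:
    print(a, significance(rows, attrs, a))
```

In this toy table, dropping `size` merges objects that were previously distinguishable, so it scores higher than `color` or `shape`, each of which is redundant given the other. No distance between values is ever computed, which is why this kind of measure applies to qualitative data.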
Source
Computer Technology and Development (《计算机技术与发展》)
2007, No. 12, pp. 89-91, 95 (4 pages)
Funding
Supported by the Program for New Century Excellent Talents in University of the Ministry of Education of China (NCET-04-0608)
Keywords
attribute importance
clustering analysis
rough set
equivalence relation