Abstract
Classical rough set theory can handle only discrete attributes, so the continuous attributes in a decision table must be discretized before the table is processed. This paper proposes a domain-independent, cloud-model-based method for discretizing continuous attributes in decision tables that is particularly suited to large data volumes. The method first uses the cloud transform to partition the domain of each continuous attribute into multiple cloud-based qualitative concepts according to the actual distribution of the data, and then merges adjacent qualitative concepts using feedback on the degree of uncertainty of the decision table. This discretization is a soft partition that better matches both the actual data distribution and the way people think. In addition, merging adjacent qualitative concepts effectively increases the granularity of information in the information system, which improves the statistical significance and predictive strength of the mined rules.
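The two stages named in the abstract (cloud-based concepts, then merging of neighboring concepts under feedback from the decision table) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the cloud transform itself is skipped and the concepts are assumed to be given as hypothetical `(Ex, En)` pairs, membership uses only the expectation curve of a normal cloud (hyper-entropy is ignored), and the `consistency` score is an assumed stand-in for the decision-table uncertainty measure, which the abstract does not specify. All function names are illustrative.

```python
import math


def membership(x, ex, en):
    """Expectation curve of a normal cloud with expectation ex and entropy en."""
    return math.exp(-((x - ex) ** 2) / (2 * en ** 2))


def discretize(x, concepts, groups):
    """Index of the group containing the concept with maximal membership at x."""
    best = max(range(len(concepts)), key=lambda i: membership(x, *concepts[i]))
    return next(g for g, members in enumerate(groups) if best in members)


def consistency(labels, decisions):
    """Assumed uncertainty proxy: share of objects whose label implies one decision."""
    seen = {}
    for lab, dec in zip(labels, decisions):
        seen.setdefault(lab, set()).add(dec)
    return sum(1 for lab in labels if len(seen[lab]) == 1) / len(labels)


def merge_neighbors(values, decisions, concepts):
    """Greedily merge adjacent concepts while the consistency score does not drop."""
    groups = [[i] for i in range(len(concepts))]  # concepts assumed sorted by Ex

    def score(gs):
        return consistency([discretize(v, concepts, gs) for v in values], decisions)

    base = score(groups)
    i = 0
    while i < len(groups) - 1:
        trial = groups[:i] + [groups[i] + groups[i + 1]] + groups[i + 2:]
        if score(trial) >= base:
            groups = trial  # keep the merge and retry at the same position
        else:
            i += 1          # merge would add uncertainty; move to the next pair
    return groups


# Hypothetical single-attribute decision table with four initial concepts.
concepts = [(1.0, 0.5), (2.0, 0.5), (5.0, 0.5), (6.0, 0.5)]
values = [0.9, 1.1, 2.1, 1.9, 5.2, 4.8, 6.1, 5.9]
decisions = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(merge_neighbors(values, decisions, concepts))  # [[0, 1], [2, 3]]
```

In this toy table the four initial concepts collapse into two groups, because merging within each decision class leaves the table consistent while merging across the classes would not. A fuller treatment would presumably derive new cloud parameters for each merged concept rather than only grouping indices.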
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2003, No. 1, pp. 33-38 (6 pages)
Pattern Recognition and Artificial Intelligence
Funding
Supported by the National Natural Science Foundation of China (No. 69975024)