Abstract
To address the problem that test-cost-sensitive attribute reduction methods may change the decision domain, this paper introduces the concept of decision domain distribution preservation and, taking the cost of acquiring data into account, proposes a test-cost attribute reduction algorithm for probabilistic neighborhood rough sets that preserves the decision domain. First, building on conditional information quantity, it defines a monotonic probabilistic neighborhood conditional information quantity for the (α,β) positive domain distribution and uses it to obtain the core attributes of the decision table. Second, it gives an attribute importance measure based on neighborhood conditional entropy and constructs an attribute influence evaluation indicator from attribute importance and each attribute's own test cost. Then, taking the core attribute set as the initial set, it iteratively adds attributes with high importance and low test cost to the core set and outputs the reduced subset. Case analysis and experiments on UCI datasets show that, compared with the comparison algorithms, the proposed algorithm obtains reduced subsets with lower test cost without reducing classification accuracy, while keeping the decision domain distribution unchanged.
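The three-step procedure in the abstract (find core attributes, score the remaining attributes by importance and test cost, grow the core set iteratively) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `quality` function is a toy monotone stand-in for the probabilistic neighborhood conditional information quantity, the score `gain / cost` is one hypothetical way to combine importance and test cost, and the names `COVERS`, `COST`, `core_attributes`, and `greedy_reduct` are invented.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# Toy data (hypothetical): each attribute "resolves" some items; more is better.
COVERS: Dict[str, Set[int]] = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2, 3}, "d": {4}}
COST: Dict[str, float] = {"a": 1.0, "b": 1.0, "c": 5.0, "d": 2.0}


def quality(subset: FrozenSet[str]) -> float:
    """Monotone stand-in measure: number of items resolved by the subset."""
    return float(len(set().union(*(COVERS[a] for a in subset))))


def core_attributes(attrs: List[str], q: Callable[[FrozenSet[str]], float]) -> Set[str]:
    """An attribute is core if removing it from the full set lowers the measure."""
    full = q(frozenset(attrs))
    return {a for a in attrs if q(frozenset(attrs) - {a}) < full}


def greedy_reduct(
    attrs: List[str],
    core: Set[str],
    cost: Dict[str, float],
    q: Callable[[FrozenSet[str]], float],
) -> Set[str]:
    """Grow the core set until the measure matches that of the full attribute set."""
    selected = set(core)
    target = q(frozenset(attrs))
    while q(frozenset(selected)) < target:
        base = q(frozenset(selected))
        best, best_score = None, 0.0
        for a in sorted(set(attrs) - selected):
            gain = q(frozenset(selected | {a})) - base  # importance of a
            score = gain / cost[a]  # assumed trade-off: high gain, low cost
            if score > best_score:
                best, best_score = a, score
        if best is None:  # no remaining attribute improves the measure
            break
        selected.add(best)
    return selected


ATTRS = ["a", "b", "c", "d"]
core = core_attributes(ATTRS, quality)
reduct = greedy_reduct(ATTRS, core, COST, quality)
total_cost = sum(COST[a] for a in reduct)
```

On this toy data the core is `{"d"}` and the sketch selects `{"a", "b", "d"}` at total cost 4, rather than the more expensive `{"c", "d"}` at cost 7. A real implementation would replace `quality` with the paper's measure and check that the (α,β) decision domain distribution is preserved, which this sketch does not model.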
Authors
罗致豪
叶军
詹诗颖
Luo Zhihao; Ye Jun; Zhan Shiying (School of Information Engineering, Jiangxi University of Water Resources & Power, Nanchang 330099, China; Key Laboratory of Smart Water Conservancy in Jiangxi Province, Nanchang 330099, China)
Source
《计算机应用研究》
Peking University Core Journal
2025, No. 11, pp. 3370-3377 (8 pages)
Application Research of Computers
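Funding
National Natural Science Foundation of China (62166027, 62566041)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ211920).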
Keywords
testing cost
attribute reduction
decision domain
probabilistic neighborhood rough set
conditional information quantity
attribute importance
influence evaluation indicators