Abstract
The naive Bayesian classifier is simple and efficient, but its attribute independence assumption limits its use on real-world data. This paper proposes a new algorithm that avoids the direct effect on classification performance of attribute reduction carried out during data preprocessing. The algorithm generates a number of attribute subsets on the training set by random attribute selection, builds a naive Bayesian classifier on each subset, and selects among these classifiers with a simulated annealing genetic algorithm. Experiments show that the method performs better than the traditional naive Bayesian method.
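The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes categorical attributes, a bit-mask encoding of attribute subsets, validation accuracy as the fitness function, one-point crossover with single-bit mutation, and a Metropolis-style acceptance rule with geometric cooling. None of these choices is stated in the abstract, and all function names and parameters (`train_nb`, `sa_ga_select`, `t0`, `cooling`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels, attrs):
    """Fit a categorical naive Bayes model using only the attribute indices in attrs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, class) -> counts of values
    domains = defaultdict(set)           # attribute -> values seen in training
    for row, y in zip(rows, labels):
        for a in attrs:
            value_counts[(a, y)][row[a]] += 1
            domains[a].add(row[a])
    return class_counts, value_counts, domains, list(attrs)


def predict_nb(model, row):
    """Return the class with the highest add-one-smoothed log posterior."""
    class_counts, value_counts, domains, attrs = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for a in attrs:
            count = value_counts[(a, c)][row[a]]
            score += math.log((count + 1) / (n_c + len(domains[a])))
        if score > best_score:
            best, best_score = c, score
    return best


def accuracy(attrs, train, valid):
    """Assumed fitness: accuracy on a held-out set of a model trained on attrs."""
    model = train_nb(train[0], train[1], attrs)
    hits = sum(predict_nb(model, r) == y for r, y in zip(valid[0], valid[1]))
    return hits / len(valid[1])


def sa_ga_select(train, valid, n_attrs, pop_size=8, generations=20,
                 t0=0.1, cooling=0.9, seed=0):
    """Search attribute subsets with a GA whose replacement step uses SA acceptance.

    Requires n_attrs >= 2 so that a crossover point exists.
    """
    rng = random.Random(seed)

    def random_mask():
        mask = [rng.random() < 0.5 for _ in range(n_attrs)]
        if not any(mask):                      # never allow an empty subset
            mask[rng.randrange(n_attrs)] = True
        return mask

    def fitness(mask):
        return accuracy([i for i, keep in enumerate(mask) if keep], train, valid)

    pop = [random_mask() for _ in range(pop_size)]
    fit = [fitness(m) for m in pop]
    best_fit, best_mask = max(zip(fit, pop))
    temp = t0
    for _ in range(generations):
        for i in range(pop_size):
            mate = pop[rng.randrange(pop_size)]
            cut = rng.randrange(1, n_attrs)    # one-point crossover
            child = pop[i][:cut] + mate[cut:]
            j = rng.randrange(n_attrs)         # single-bit mutation
            child[j] = not child[j]
            if not any(child):
                child[j] = True
            f = fitness(child)
            # Always accept improvements; accept worse children with
            # probability exp(delta / temp), which shrinks as temp cools.
            if f >= fit[i] or rng.random() < math.exp((f - fit[i]) / temp):
                pop[i], fit[i] = child, f
            if f > best_fit:
                best_fit, best_mask = f, child[:]
        temp *= cooling
    return [i for i, keep in enumerate(best_mask) if keep], best_fit


def make_toy_data(n, seed):
    """Hypothetical data: four binary attributes, label equal to attribute 0."""
    rng = random.Random(seed)
    rows = [[rng.randint(0, 1) for _ in range(4)] for _ in range(n)]
    return rows, [r[0] for r in rows]
```

Calling `sa_ga_select(make_toy_data(40, 1), make_toy_data(20, 2), n_attrs=4)` returns the selected attribute indices together with their validation accuracy. Scoring subsets on held-out data rather than on the training data is a design choice of this sketch, made so that the search does not simply reward overfitting.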
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2007, Issue 9, pp. 219-221 (3 pages)
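Funding
Natural Science Foundation of Anhui Higher Education Institutions (KJ2007B236)
Key Project of the Natural Science Foundation of Anhui Higher Education Institutions (2006kj027A)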
Keywords
Data mining
Naive Bayesian
Simulated annealing algorithms
Genetic algorithms
Feature reduction
Fitness function