Abstract
This paper proposes a spam filtering method based on rough set theory and the Bayesian classification algorithm. A rough set reduction algorithm is applied to the email sample set to remove redundant features that have little influence on the filtering result, which lowers the dimensionality of the input sample set. This addresses two drawbacks of the Bayesian classifier: long training time and the large storage space occupied by the sample set. Experiments show that the method can improve both the accuracy of email filtering and the speed of training.
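The abstract does not say which reduction algorithm or Bayesian variant the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: a greedy, QuickReduct-style forward selection driven by the rough set dependency degree, followed by a naive Bayes classifier with Laplace smoothing trained on the reduced feature set. All function names and the toy data are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict
from math import log


def dependency(rows, labels, attrs):
    """Rough set dependency degree: share of samples whose equivalence
    class under `attrs` contains a single label (the positive region)."""
    classes = defaultdict(list)
    for row, y in zip(rows, labels):
        classes[tuple(row[a] for a in attrs)].append(y)
    positive = sum(len(ys) for ys in classes.values() if len(set(ys)) == 1)
    return positive / len(rows)


def greedy_reduct(rows, labels):
    """Assumed reduction step: add the attribute that raises the dependency
    degree the most until it matches that of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        candidates = [a for a in all_attrs if a not in chosen]
        best = max(candidates,
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return sorted(chosen)


def train_nb(rows, labels, attrs):
    """Count label priors and per-attribute value frequencies."""
    prior = Counter(labels)
    counts = defaultdict(Counter)  # (label, attribute) -> value counts
    for row, y in zip(rows, labels):
        for a in attrs:
            counts[(y, a)][row[a]] += 1
    return prior, counts


def predict_nb(model, row, attrs):
    """Pick the label with the highest log posterior.
    Laplace smoothing assumes binary feature values (0 or 1)."""
    prior, counts = model
    total = sum(prior.values())

    def score(y):
        s = log(prior[y] / total)
        for a in attrs:
            s += log((counts[(y, a)][row[a]] + 1) / (prior[y] + 2))
        return s

    return max(prior, key=score)


# Toy data (invented): columns are word-presence flags for
# "free", "winner", "meeting", "the".
rows = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

reduct = greedy_reduct(rows, labels)
model = train_nb(rows, labels, reduct)
print(reduct, predict_nb(model, [1, 0, 0, 1], reduct))
```

On this toy set the reduction keeps only the first two columns, so the classifier stores and scans half as many feature counts; that is the kind of saving in training time and storage the abstract refers to, although the size of the gain on real mail corpora is not something this sketch can show.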
Source
《南昌大学学报(工科版)》
CAS
2009, No. 1, pp. 45-48 (4 pages)
Journal of Nanchang University(Engineering & Technology)
Funding
Science and Technology Program of the Jiangxi Provincial Department of Education (赣教技字[2007]23号; 赣教技字[2007]344号)