Abstract
Standard naive Bayes cannot handle continuous attributes. This paper proposes a supervised discretization method based on a Bayes measure that, without prior knowledge, automatically finds the most appropriate number of intervals and the boundaries between them. The MDL principle is then applied to control the number of intervals, balancing the accuracy of the learned discretization against its complexity. Experiments on UCI machine learning data sets validate the method and indicate that classification accuracy is substantially improved.
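The abstract does not define the Bayes measure, so the paper's own criterion cannot be reproduced here. As a rough illustration of the general idea (supervised cut-point selection with an MDL-based stopping rule), the sketch below substitutes the well-known entropy-based recursive partitioning of Fayyad and Irani. The function names `entropy` and `mdl_cut_points` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def _split(xs, ys, cuts):
    """Recursively split the sorted (xs, ys) while the MDL criterion accepts."""
    n = len(ys)
    k = len(set(ys))
    if k < 2:
        return  # a pure interval needs no further cut
    ent = entropy(ys)

    # Find the boundary that minimizes the weighted class entropy.
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # cannot cut between identical attribute values
        e = (i / n) * entropy(ys[:i]) + ((n - i) / n) * entropy(ys[i:])
        if best is None or e < best[0]:
            best = (e, i)
    if best is None:
        return
    e, i = best
    left, right = ys[:i], ys[i:]

    # MDL acceptance test (Fayyad-Irani form, NOT the paper's Bayes measure):
    # accept the cut only if the information gain exceeds the coding cost.
    gain = ent - e
    k1, k2 = len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * ent - k1 * entropy(left) - k2 * entropy(right)
    )
    if gain <= (math.log2(n - 1) + delta) / n:
        return

    cuts.append((xs[i - 1] + xs[i]) / 2)  # midpoint between neighbors
    _split(xs[:i], left, cuts)
    _split(xs[i:], right, cuts)


def mdl_cut_points(values, labels):
    """Return sorted cut points for one continuous attribute."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    cuts = []
    _split(xs, ys, cuts)
    return sorted(cuts)
```

The number of intervals is not fixed in advance: recursion stops as soon as a candidate cut fails the MDL test, which is one way to trade accuracy against complexity. The resulting intervals could then be fed to a naive Bayes classifier as categorical values.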
Source
《仪器仪表学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2005, No. 8, pp. 786-789 (4 pages)
Chinese Journal of Scientific Instrument
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60275026).