A CLIF_NB Text Classification Learning Method Based on Naive Bayes (cited by: 1)
Abstract: The conditional independence assumption in the Naive Bayes method often conflicts with real data. To address this, the paper proposes the CLIF_NB text classification learning method. Using mutual information theory, the method computes the maximum correlation probability between feature attributes and replaces linearly inseparable attributes with combined variable sets, which relaxes the restriction imposed by the conditional independence assumption. It also learns a series of classifiers to reduce classification errors on the training set, and combines them into a CLIF_NB classifier with higher classification accuracy.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2005, No. 9, pp. 1575-1577 (3 pages)
Funding: Supported by the National "973" Key Basic Research Program (G1998030414)
Keywords: text classification; Naive Bayes; conditional independence hypothesis
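
The abstract gives no formulas, so the following is only a minimal sketch of the kind of step it describes: estimating mutual information between pairs of feature attributes and replacing the most strongly dependent pair with a single combined variable. It assumes binary term-presence features and a pairwise merge; the function names (`mutual_information`, `most_dependent_pair`, `merge_pair`) and the toy data are hypothetical and are not taken from the paper. The classifier-series step is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


def most_dependent_pair(columns):
    """Return (i, j, mi) for the pair of columns with the highest mutual information."""
    best = None
    for i, j in combinations(range(len(columns)), 2):
        mi = mutual_information(columns[i], columns[j])
        if best is None or mi > best[2]:
            best = (i, j, mi)
    return best


def merge_pair(columns, i, j):
    """Drop columns i and j and append one combined column of value tuples."""
    merged = list(zip(columns[i], columns[j]))
    rest = [col for k, col in enumerate(columns) if k not in (i, j)]
    return rest + [merged]


# Made-up term-presence features for six documents (illustration only).
columns = [
    [1, 1, 0, 0, 1, 0],  # term A
    [1, 1, 0, 0, 1, 0],  # term B, always co-occurs with term A
    [1, 0, 1, 0, 1, 0],  # term C
]
i, j, mi = most_dependent_pair(columns)
new_columns = merge_pair(columns, i, j)
```

A Naive Bayes model trained on `new_columns` would treat the merged pair as one attribute, so it no longer multiplies the likelihoods of two terms that are in fact strongly dependent. Whether CLIF_NB merges pairs or larger variable sets, and how it thresholds the dependence, is not stated in the abstract.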