Feature selection algorithm for uncertain text classification (cited by: 2)
Abstract A novel feature selection algorithm, FSUNT, is proposed based on the Hilbert-Schmidt independence criterion (HSIC), with a focus on the vagueness and uncertainty that may arise during feature selection. For text data whose class labels are uncertain while all other feature values are certain, features are ranked by their HSIC dependence on the uncertain class labels, and the final feature subset is selected from that ranking. Extensive experiments on real and simulated datasets show that the algorithm achieves good classification performance and stability.
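The ranking step described in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the FSUNT implementation from the paper: it assumes each uncertain class label is given as a vector of class probabilities, uses linear kernels on both the feature and the labels, and uses the standard biased empirical HSIC estimate tr(KHLH)/(n-1)^2. The names `hsic`, `rank_features`, `X`, and `P` are hypothetical.

```python
import numpy as np


def hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


def rank_features(X, P, k):
    """Rank the columns of X by HSIC dependence on soft labels P.

    X: (n_docs, n_features) array of certain feature values.
    P: (n_docs, n_classes) array; row i holds the class probabilities
       of document i (an assumed encoding of label uncertainty).
    Returns the indices of the k highest-scoring features and all scores.
    """
    L = P @ P.T  # linear kernel on the uncertain labels (assumption)
    scores = np.array([
        hsic(np.outer(X[:, j], X[:, j]), L)  # linear kernel on feature j
        for j in range(X.shape[1])
    ])
    order = np.argsort(-scores)  # descending by score
    return order[:k], scores


# Toy usage: four documents, three term-count features, two classes.
X = np.array([[3.0, 1.0, 0.0],
              [2.0, 1.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
P = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9],
              [0.2, 0.8]])
top, scores = rank_features(X, P, k=1)
print(top, scores)
```

In this toy data the first feature varies with the label probabilities, the second is constant, and the third is uncorrelated with them, so only the first receives a clearly positive score. Other kernels (for example a Gaussian kernel on the features) could be substituted without changing the structure of the sketch.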
Source Journal on Communications (《通信学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2009, No. 8, pp. 32-38, 44 (8 pages in total)
Funding National High Technology Research and Development Program of China ("863" Program) (2006AA01Z451, 2007AA01Z474, 2007AA010502); National Natural Science Foundation of China (60873204)
Keywords feature selection; uncertain data; text classification


