
Feature Selection Based on CHI and Genetic Algorithm (基于CHI与遗传算法的特征选择)

Cited by: 3
Abstract: In a Web text information filtering system, the optimal feature subset found by feature selection directly affects both the speed and the accuracy of classification. To address this, the paper proposes a feature selection method that combines the CHI statistic with a genetic algorithm. The CHI statistic is first applied to the original feature set as an initial screening that removes redundant features and noise. A genetic algorithm then performs a second round of feature selection on the resulting subset, yielding an optimal feature subset that represents the problem space. The result is lower-dimensional data and improved classification accuracy.
Source: Information Technology and Informatization (《信息技术与信息化》), 2007, No. 1, pp. 43-44 (2 pages)
Keywords: feature selection; CHI; genetic algorithm (GA)
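
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the fitness function, the genetic operators, or any parameter values, so the leave-one-out nearest-neighbour fitness, the tournament selection, the one-point crossover, the mutation rate, and all function names (`chi_square`, `chi_filter`, `fitness`, `ga_select`) are assumptions made for the example. The toy documents are likewise invented.

```python
import random

def chi_square(docs, labels, term, cls):
    """CHI statistic of `term` for class `cls` from a 2x2 contingency table."""
    a = b = c = d = 0
    for words, label in zip(docs, labels):
        has = term in words
        if has and label == cls:
            a += 1          # term present, document in class
        elif has:
            b += 1          # term present, document in another class
        elif label == cls:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def chi_filter(docs, labels, k):
    """Stage 1: keep the k terms with the highest max-over-classes CHI score."""
    vocab = sorted({t for d in docs for t in d})
    classes = sorted(set(labels))
    score = {t: max(chi_square(docs, labels, t, c) for c in classes) for t in vocab}
    return sorted(vocab, key=lambda t: -score[t])[:k]

def fitness(mask, terms, docs, labels, penalty=0.01):
    """Toy fitness (assumption): leave-one-out 1-NN accuracy minus a size penalty."""
    chosen = {t for t, bit in zip(terms, mask) if bit}
    if not chosen:
        return 0.0
    views = [d & chosen for d in docs]
    correct = 0
    for i, v in enumerate(views):
        # nearest neighbour = the other document sharing the most selected terms
        j = max((j for j in range(len(views)) if j != i),
                key=lambda j: len(v & views[j]))
        correct += labels[j] == labels[i]
    return correct / len(docs) - penalty * len(chosen)

def ga_select(terms, docs, labels, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Stage 2: evolve bit masks over the CHI-filtered terms (needs >= 2 terms)."""
    rng = random.Random(seed)
    n = len(terms)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    fit = lambda m: fitness(m, terms, docs, labels)
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        nxt = pop[:2]                                  # elitism: keep the two best
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fit)      # tournament selection
            p2 = max(rng.sample(pop, 3), key=fit)
            cut = rng.randrange(1, n)                  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (rng.random() < p_mut) for b in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fit)
    return [t for t, bit in zip(terms, best) if bit]

# Invented toy corpus: each document is a set of terms.
docs = [{"goal", "match", "team", "the"}, {"team", "score", "the", "win"},
        {"match", "win", "a", "goal"}, {"stock", "market", "the", "bank"},
        {"bank", "rate", "a", "market"}, {"stock", "rate", "the", "trade"}]
labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

candidates = chi_filter(docs, labels, k=8)       # stage 1: CHI screening
selected = ga_select(candidates, docs, labels)   # stage 2: GA refinement
print(candidates)
print(selected)
```

In this toy corpus the words "the" and "a" occur equally often in both classes, so their CHI score is zero and stage 1 discards them before the genetic algorithm runs. The size penalty in the assumed fitness is one simple way to push the search toward smaller subsets; a real system would more likely score each mask with the accuracy of the actual classifier on held-out data.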


