

Feature Selection Algorithm Based on Potential Difference
Abstract: For high-dimensional data, the time cost of data analysis and data mining algorithms grows exponentially with the number of dimensions. An appropriate feature selection method can reduce the dimensionality of the data while preserving its original discriminative power. The chi-square (Chi2) statistic is used to quantify the correlation between attributes, and the independence confidence level α is obtained from the chi-square table. For a given feature subset, two ordered lists are built from α: one ranks all features by their correlation with the class attribute, and the other ranks all features by their correlation with a reference attribute. Features are then selected according to the potential difference (difference in position) of each feature between the two lists. A theoretical analysis of the algorithm is given, together with experimental results and their analysis on sample data, showing that the algorithm preserves the accuracy of data analysis while using fewer dimensions.
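The selection idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the chi-square independence-test p-value (via `scipy.stats.chi2_contingency`) as a stand-in for the confidence level α, and the function names (`rank_by_p`, `potential_difference`) and the choice of reference attribute are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency


def chi2_p(x, y):
    """Build the contingency table of two discrete columns and return
    the chi-square independence-test p-value (proxy for alpha)."""
    xs, ys = np.unique(x), np.unique(y)
    table = np.zeros((len(xs), len(ys)), dtype=int)
    for xv, yv in zip(x, y):
        table[np.searchsorted(xs, xv), np.searchsorted(ys, yv)] += 1
    _, p, _, _ = chi2_contingency(table)
    return p


def rank_by_p(data, target):
    """Rank feature indices by ascending p-value, i.e. the feature
    most strongly correlated with `target` comes first."""
    ps = [chi2_p(data[:, j], target) for j in range(data.shape[1])]
    return list(np.argsort(ps))


def potential_difference(data, cls, ref_idx):
    """For each feature, the difference between its position in the
    class-correlation ranking and in the reference-attribute ranking."""
    class_rank = rank_by_p(data, cls)           # ordered by correlation with class
    ref_rank = rank_by_p(data, data[:, ref_idx])  # ordered by correlation with reference
    pos_c = {f: i for i, f in enumerate(class_rank)}
    pos_r = {f: i for i, f in enumerate(ref_rank)}
    return {f: pos_c[f] - pos_r[f] for f in pos_c}
```

How the position differences are thresholded to pick the final feature subset is left open here; the sketch only computes the per-feature scores that the paper's selection step would consume.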
Published in: Journal of Jilin University (Information Science Edition), CAS, 2007, No. 1, pp. 50-56 (7 pages)
Funding: National Natural Science Foundation of China (60275026)
Keywords: data mining; feature selection; Chi2 statistic; correlation probability; potential difference
