
Hybrid feature selection method for number of software faults prediction

Cited by: 2
Abstract: Irrelevant and redundant features in software defect datasets degrade the performance of models that predict the number of software faults. To address this, the paper proposes HFSNFP, a hybrid feature selection method for number-of-faults prediction. First, HFSNFP uses the ReliefF algorithm to compute the relevance between each feature and the number of faults, and keeps the m most relevant features. Next, it groups these m features with spectral clustering, based on the correlation between every pair of features. Finally, it applies a wrapper-based search that picks the most relevant features from each resulting cluster in turn to form the final feature subset. Experimental results show that, compared with five existing filter-based feature selection methods, HFSNFP raises the PD value (detection rate) while lowering the PF value (false alarm rate), and achieves better G-measure and RMSE values. Compared with two existing wrapper-based feature selection methods, HFSNFP maintains number-of-faults prediction performance while significantly reducing the running time of feature selection.
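The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the absolute Pearson correlation stands in for ReliefF, the absolute feature-feature correlation is used as the clustering affinity, the wrapper criterion is the cross-validated RMSE of a linear regression, and the per-cluster search rule is a guess at one reasonable reading of "in turn". The function names (`relevance`, `cv_rmse`, `hfsnfp_sketch`) and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def relevance(X, y):
    """Stand-in for ReliefF (assumption): |Pearson correlation| with the fault count."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant features get relevance 0
    return np.abs(Xc.T @ yc) / denom


def cv_rmse(X, y, cols):
    """Wrapper criterion (assumption): 5-fold cross-validated RMSE of linear regression."""
    scores = cross_val_score(LinearRegression(), X[:, cols], y,
                             scoring="neg_root_mean_squared_error", cv=5)
    return -scores.mean()


def hfsnfp_sketch(X, y, m=8, k=3, seed=0):
    # Stage 1: keep the m features most relevant to the fault count.
    rel = relevance(X, y)
    top = np.argsort(rel)[::-1][:m]  # indices in descending relevance

    # Stage 2: spectral clustering of those m features on a feature-feature affinity.
    # Assumes no constant feature among the top m (corrcoef would yield NaN).
    affinity = np.abs(np.corrcoef(X[:, top], rowvar=False))
    labels = SpectralClustering(n_clusters=k, affinity="precomputed",
                                random_state=seed).fit_predict(affinity)

    # Stage 3: visit the clusters in turn; within a cluster, try features from most
    # to least relevant and keep the first one that lowers the wrapper RMSE.
    selected, best = [], np.inf
    for c in range(k):
        for f in top[labels == c]:  # boolean mask preserves relevance order
            score = cv_rmse(X, y, selected + [int(f)])
            if score < best:
                selected.append(int(f))
                best = score
                break
    return selected, best


# Synthetic data (hypothetical): two noisy copies of three latent signals plus
# four pure-noise features, with Poisson-distributed fault counts.
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 3))
X = np.hstack([base + 0.1 * rng.normal(size=(n, 3)) for _ in range(2)]
              + [rng.normal(size=(n, 4))])
y = rng.poisson(np.exp(0.5 * base[:, 0] + 0.3 * base[:, 1]))

selected, rmse = hfsnfp_sketch(X, y, m=8, k=3)
print("selected feature indices:", selected, "CV RMSE:", round(rmse, 3))
```

Because at most one feature is taken per cluster, the wrapper stage fits the model only a handful of times, which is consistent with the abstract's claim that the hybrid design is cheaper than a pure wrapper search over all features.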
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal (北大核心), 2018, No. 2, pp. 487-492, 502 (7 pages in total)
Funding: Hubei University Excellent Course Program (013665 150145)
Keywords: number of software faults prediction; feature selection; spectral clustering; wrapper-based feature selection
