
Influence of Prior Information of Microarray Data on Variable Selection Based on LASSO: A Simulation Study (cited by: 3)
Abstract Objective: To explore the influence of prior information in microarray data on LASSO-based variable selection methods. Methods: After specifying a true model, prior information was incorporated step by step, and simulations programmed in R and MATLAB compared its influence on variable selection by LASSO and by two group LASSO (gLASSO) variants: non-overlap group LASSO (nogLASSO) and overlap group LASSO (ogLASSO). Results: Classical LASSO and ogLASSO achieved good prediction accuracy on the simulated microarray data (AUC: LASSO = 0.8915 ≈ ogLASSO = 0.8923 > nogLASSO = 0.8396; MSE: nogLASSO = 0.1358 > ogLASSO = 0.0975 ≈ LASSO = 0.0928), and LASSO gave the most interpretable model (average numbers of genes selected: 21.52, 111.95, and 101.01, respectively). When gene pathway information was used and [X295] was misclassified into the 19th pathway, nogLASSO selected it into the model far less often even though its effect size was unchanged, and prediction accuracy declined markedly, whereas ogLASSO was more robust. Conclusion: Incorporating the prior information in microarray data did not improve the prediction performance or efficiency of LASSO-based variable selection; classical LASSO remains an effective method for handling microarray data.
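The contrast between LASSO and nogLASSO reported above can be illustrated with their shrinkage (proximal) operators. This is a minimal sketch and not the authors' simulation code: the function names, the toy coefficient values, the penalty level `lam`, and the square-root-of-group-size weight (the common Yuan and Lin convention) are all assumptions made for illustration. It shows why a group penalty keeps or drops whole pathways, so a gene assigned to the wrong pathway can be dropped together with its null neighbours.

```python
import math


def soft_threshold(z, lam):
    """Shrink one coefficient toward zero (proximal operator of lam * |b|)."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def group_soft_threshold(z_group, lam):
    """Shrink a whole group at once (proximal operator of
    lam * sqrt(group size) * ||b_g||_2, an assumed weighting)."""
    norm = math.sqrt(sum(z * z for z in z_group))
    threshold = lam * math.sqrt(len(z_group))
    if norm <= threshold:
        # The group as a whole is too weak: every member is set to zero.
        return [0.0] * len(z_group)
    scale = 1.0 - threshold / norm
    return [scale * z for z in z_group]


def count_nonzero(coefs):
    """Number of coefficients that remain in the model."""
    return sum(1 for b in coefs if b != 0.0)


lam = 0.5  # hypothetical penalty level

# Hypothetical pathway: one strong gene plus three near-null genes.
strong_with_noise = [2.0, 0.1, -0.1, 0.05]
lasso_fit = [soft_threshold(z, lam) for z in strong_with_noise]
group_fit = group_soft_threshold(strong_with_noise, lam)

# Hypothetical misassignment: a moderately strong gene placed in a
# pathway of 15 null genes.
misplaced = [0.9] + [0.0] * 15
misplaced_lasso = soft_threshold(misplaced[0], lam)
misplaced_group = group_soft_threshold(misplaced, lam)

print("LASSO keeps", count_nonzero(lasso_fit), "of 4 genes")
print("Group penalty keeps", count_nonzero(group_fit), "of 4 genes")
print("Misplaced gene under LASSO:", misplaced_lasso)
print("Misplaced gene under group penalty:", misplaced_group[0])
```

In this toy setting the element-wise penalty retains only the strong gene, while the group penalty retains the entire pathway, which is consistent with the much larger average model sizes reported for the group methods. The misplaced gene survives element-wise shrinkage but is removed with its group, mirroring the behaviour described for [X295].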
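Source: Chinese Journal of Health Statistics (《中国卫生统计》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 3, pp. 407-409, 413 (4 pages)
Funding: National Natural Science Foundation of China (81373103); Chongqing Science and Technology Commission Basic and Frontier Research Program (cstc2013jcyjA10009)
Keywords: Variable selection; Least Absolute Shrinkage and Selection Operator (LASSO); Simulation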

References (9)

  • 1. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 1996, 58(1): 267-288.
  • 2. Efron B, Hastie T, Johnstone I, et al. Least angle regression. The Annals of Statistics, 2004, 32(2): 407-499.
  • 3. Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2006, 68(1): 49-67.
  • 4. Liu J, Ye JP. Fast overlapping group lasso. CoRR, abs/1009.0306.
  • 5. Zhang Xiuxiu, Wang Hui, Tian Shuangshuang, Qiao Nan, Yan Lina, Wang Tong. LASSO-based independent variable selection in regression analysis of high-dimensional data (高维数据回归分析中基于LASSO的自变量选择) [J]. Chinese Journal of Health Statistics, 2013, 30(6): 922-926. Cited by: 30.
  • 6. Albert JH, Chib S. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 1993, 88(422): 669-679.
  • 7. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 2010, 33(1): 1-22.
  • 8. James G, Witten D, Hastie T, et al. An Introduction to Statistical Learning with Applications in R. New York: Springer, 2013.
  • 9. Motyer AJ, McKendry C, Galbraith S, et al. LASSO model selection with post-processing for a genome-wide association study data set. BMC Proceedings, 2011, 5(9): 1-4.

