Funding: Supported by the National Natural Science Foundation of China (No. 11171263).
Abstract: The smooth integration of counting and absolute deviation (SICA) penalized variable selection procedure for high-dimensional linear regression models was proposed by Lv and Fan (2009). In this article, we extend their idea to Cox's proportional hazards (PH) model by using a penalized log partial likelihood with the SICA penalty. The number of regression coefficients is allowed to grow with the sample size. Based on an approximation to the inverse of the Hessian matrix, the proposed method can be easily carried out with the smoothing quasi-Newton (SQN) algorithm. Under appropriate sparsity conditions, we show that the resulting estimator of the regression coefficients possesses the oracle property. We perform an extensive simulation study to compare our approach with other methods and illustrate it on the well-known PBC data set for predicting survival from risk factors.
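For readers unfamiliar with the SICA penalty, the following is a minimal sketch of its usual form, rho_a(t) = (a + 1)|t| / (a + |t|) with shape parameter a > 0, which approaches the counting (L0) penalty as a tends to 0 and the absolute-deviation (L1) penalty as a tends to infinity. The function names and the tuning parameter `lam` are hypothetical, chosen for illustration only; this is not the authors' implementation and it does not include the partial likelihood or the SQN algorithm.

```python
def sica_penalty(t, a):
    """SICA penalty rho_a(t) = (a + 1) * |t| / (a + |t|), for a > 0.

    Small a behaves like the L0 (counting) penalty;
    large a behaves like the L1 (absolute deviation) penalty.
    """
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    t = abs(t)
    return (a + 1.0) * t / (a + t)


def sica_total(beta, lam, a):
    """Total penalty lam * sum_j rho_a(beta_j).

    `lam` is a hypothetical tuning parameter name; in a penalized
    log partial likelihood this term would be subtracted from the
    log partial likelihood (assumed form, for illustration).
    """
    return lam * sum(sica_penalty(b, a) for b in beta)
```

As a usage illustration, `sica_total([0.0, 1.0, -2.0], lam=0.5, a=1.0)` penalizes only the two nonzero coefficients, and `sica_penalty(1.0, a)` equals 1 for every admissible `a`.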
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11171112, 11101114 and 11201190) and the National Statistical Science Research Major Program of China (Grant No. 2011LZ051).
Abstract: In many applications, covariates can be naturally grouped. For example, in gene expression data analysis, genes belonging to the same pathway might be viewed as a group. This paper studies the variable selection problem for censored survival data in the additive hazards model when covariates are grouped. A hierarchical regularization method is proposed to simultaneously estimate parameters and select important variables at both the group level and the within-group level. For situations in which the number of parameters tends to infinity as the sample size increases, we establish an oracle property and asymptotic normality of the proposed estimators. Numerical results indicate that the hierarchically penalized method performs better than some existing methods such as lasso, smoothly clipped absolute deviation (SCAD) and adaptive lasso.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11201306), the Innovation Program of Shanghai Municipal Education Commission (Grant No. 13YZ065), the Fundamental Research Project of Shanghai Normal University (Grant No. SK201207), the scholarship under the State Scholarship Fund by the China Scholarship Council in 2011, and the Research Grant Council of Hong Kong, Hong Kong, China (Grant No. #HKBU2028/10P).
Abstract: For analyzing correlated binary data with high-dimensional covariates, we propose a two-stage shrinkage approach. First, we construct a weighted least-squares (WLS) type function using a special weighting scheme on the non-conservative vector field of the generalized estimating equations (GEE) model. Second, we define a penalized WLS in the spirit of the adaptive LASSO for simultaneous variable selection and parameter estimation. The proposed procedure enjoys the oracle properties in a high-dimensional framework where the number of parameters grows to infinity with the number of clusters. Moreover, we prove the consistency of the sandwich formula of the covariance matrix even when the working correlation matrix is misspecified. For the selection of the tuning parameter, we develop a consistent penalized quadratic form (PQF) function criterion. The performance of the proposed method is assessed through a comparison with existing methods and through an application to a crossover trial in a pain relief study.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401006, 11671299 and 11671042), a grant from the University Grants Council of Hong Kong, the China Postdoctoral Science Foundation (Grant No. 2017M611083), and the National Statistical Science Research Program of China (Grant No. 2015LY55).
Abstract: The purpose of this paper is twofold. First, we investigate estimation for varying coefficient partially linear models in which covariates in the nonparametric part are measured with errors. As there may be some spurious covariates in the linear part, a penalized profile least squares estimation is suggested with the assistance of the smoothly clipped absolute deviation penalty. However, because the estimator is often biased due to the existence of measurement errors, a bias correction is proposed, and estimation consistency with the oracle property is proved. Second, based on the estimator, a test statistic is constructed to check a linear hypothesis on the parameters, and its asymptotic properties are studied. We prove that the existence of measurement errors makes the limiting null distribution intractable, so that a Monte Carlo approximation is required, whereas the absence of the errors can lead to a chi-square limit. Furthermore, confidence regions for the parameter of interest can also be constructed. Simulation studies and a real data example are conducted to examine the performance of our estimators and test statistic.
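The smoothly clipped absolute deviation (SCAD) penalty mentioned in the abstracts above can be sketched in its usual piecewise form: linear near zero, quadratic in a transition zone, and constant for large coefficients. This is a minimal illustration assuming the commonly used default a = 3.7; the function and parameter names are hypothetical, and the sketch does not reproduce the profile least squares estimation or the bias correction described in the abstract.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty value at coefficient t for tuning parameter lam > 0.

    Piecewise form (assumed standard definition, a > 2):
      |t| <= lam          : lam * |t|
      lam < |t| <= a*lam  : (2*a*lam*|t| - t^2 - lam^2) / (2*(a - 1))
      |t| > a*lam         : lam^2 * (a + 1) / 2
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0
```

Because the penalty is flat beyond `a * lam`, large coefficients are not shrunk further, which is the feature usually credited for the reduced bias of SCAD relative to the lasso.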