In this article, a partially linear single-index model for longitudinal data is investigated. Generalized penalized spline least squares estimates of the unknown parameters are proposed. All parameters can be estimated simultaneously by the proposed method while the features of longitudinal data are taken into account. The existence, strong consistency, and asymptotic normality of the estimators are proved under suitable conditions. A simulation study is conducted to investigate the finite-sample performance of the proposed method. The approach can also be used to study the pure single-index model for longitudinal data.
In longitudinal data analysis, the primary interest is in estimating the regression parameters for the marginal expectations of the longitudinal responses; the longitudinal correlation parameters are of secondary interest. The joint likelihood function for longitudinal data is difficult to specify, particularly because the responses are correlated. Marginal models, such as generalized estimating equations (GEEs), have received much attention; they rely on assumptions about the first two moments of the data and a working correlation structure. Confidence regions and hypothesis tests are then constructed from asymptotic normality. This approach is sensitive to misspecification of the variance function and the working correlation structure, which may yield inefficient and inconsistent estimates and lead to wrong conclusions. To overcome this problem, we propose an empirical likelihood (EL) procedure based on a set of estimating equations for the parameter of interest and discuss its characteristics and asymptotic properties. We also provide an algorithm based on EL principles for estimating the regression parameters and constructing their confidence region. We apply the proposed method to two case examples.
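To make the empirical likelihood idea concrete, the following is a minimal sketch for the simplest possible estimating equation, g(x, mu) = x - mu, with independent scalar observations. It is not the procedure of the article above, which handles correlated longitudinal responses; the function name `el_log_ratio` and the bisection solver are illustrative assumptions.

```python
import math

def el_log_ratio(x, mu, iters=200):
    """Return -2 * log empirical likelihood ratio for a candidate mean `mu`.

    Illustrative only: assumes i.i.d. scalar data and the estimating
    function g(x_i, mu) = x_i - mu.
    """
    z = [xi - mu for xi in x]
    zmin, zmax = min(z), max(z)
    if not (zmin < 0.0 < zmax):
        return math.inf  # mu lies outside the convex hull of the data
    # The Lagrange multiplier lam solves sum z_i / (1 + lam * z_i) = 0.
    # The left-hand side is strictly decreasing on (-1/zmax, -1/zmin),
    # so plain bisection finds the unique root.
    lo, hi = -1.0 / zmax, -1.0 / zmin
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        score = sum(zi / (1.0 + lam * zi) for zi in z)
        if score > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)
```

Under standard conditions the returned statistic is compared with a chi-squared quantile (one degree of freedom for a scalar mean), and the confidence region is the set of `mu` values whose statistic falls below that quantile.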
Logic regression is an adaptive regression method that searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome, and thus reveals interaction effects associated with the response. In this study, we extend logic regression to longitudinal data with a binary response and propose the "Transition Logic Regression Method" to find interactions related to the response. In this method, interaction effects over time are found by an annealing algorithm with the Akaike Information Criterion (AIC) as the score function of the model. First- and second-order Markov dependence are allowed, to capture the correlation among successive observations of the same individual in a longitudinal binary response. The performance of the method is evaluated in a simulation study under various conditions. The proposed method is used to find interactions of SNPs and other risk factors related to low HDL over time in data from 329 participants of the longitudinal TLGS study.
In this article, a robust generalized estimating equation is used for the analysis of partial linear mixed models for longitudinal data. The authors approximate the nonparametric function by a regression spline. Under some regularity conditions, the asymptotic properties of the estimators are obtained. To avoid computing a high-dimensional integral, a robust Monte Carlo Newton-Raphson algorithm is used. Simulations are carried out to study the performance of the proposed robust estimators, and the authors also study the robustness and efficiency of the proposed estimators by simulation. Finally, two real longitudinal data sets are analyzed.
In scientific research, it is often necessary to observe different indicators of individuals repeatedly over time and to analyze the observed results. The variables involved are mainly of two types: ordinal variables and continuous variables. When analyzing data with different types of variables, the correlation between multiple indicators of an individual must be considered, and the observations of multiple indicators of an individual at different times are often analyzed jointly in order to obtain more accurate results. Joint analysis often yields more information than separate analyses of the individual variables. In this paper, ordinal and continuous variables are modeled numerically. Based on a latent variable model, multivariate longitudinal data containing both ordinal and continuous variables are analyzed jointly, and an approximate value of the marginal likelihood is obtained by numerical integration.
We develop an interquantile shrinkage estimation method to examine the underlying commonality structure of regression coefficients across various quantile levels for longitudinal data in a data-driven manner. This method provides deeper insight into the relationship between the response and covariates, leading to enhanced estimation efficiency and model interpretability. We propose a fused penalized generalized estimating equation (GEE) estimator with a non-crossing constraint, which automatically promotes constancy in estimates across neighboring quantiles. By accounting for within-subject correlation in longitudinal data, the GEE estimator improves estimation efficiency. We employ a nested alternating direction method of multipliers (ADMM) algorithm to minimize the regularized objective function. The asymptotic properties of the penalized estimators are established. Furthermore, in the presence of irrelevant predictors, we develop a doubly penalized GEE estimator to simultaneously select active variables and identify commonality across quantiles. Numerical studies demonstrate the superior performance of the proposed methods in terms of estimation efficiency. We illustrate the methodology by analyzing a longitudinal wage dataset.
With the advent of modern devices, such as smartphones and wearable devices, high-dimensional data are collected on many participants for a period of time or even in perpetuity. For this type of data, dependencies between and within data batches exist because data are collected from the same individual over time. Under the framework of streamed data, individual historical data are not available because of the storage and computation burden. Computationally efficient methods with statistical guarantees are urgently needed to analyze high-dimensional streamed data and make reliable inferences in practice. In addition, the homogeneity assumption on the model parameters may not remain valid over time. To address these issues, we develop a new renewable debiased-lasso inference method for high-dimensional streamed data that allows dependencies between and within data batches and allows model parameters to change gradually. We establish the large-sample properties of the proposed estimators, including consistency and asymptotic normality. The numerical results, including simulations and a real data analysis, show the superior performance of the proposed method.
Prediction plays an important role in data analysis. Model averaging generally provides better prediction than any of its component models. Although model averaging has been extensively investigated under independent errors, few authors have considered model averaging for semiparametric models with correlated errors. In this paper, the authors offer an optimal model averaging method to improve prediction in the partially linear model for longitudinal data. The model averaging weights are obtained by minimizing a criterion that is an unbiased estimator of the expected in-sample squared error loss plus a constant. Asymptotic properties, including asymptotic optimality and consistency of the averaging weights, are established under two scenarios: (i) all candidate models are misspecified; (ii) correct models are available in the candidate set. Simulation studies and an empirical example show the promise of the proposed procedure over other competitive methods.
In the interdisciplinary realm of statistics, genetics, and epidemiology, longitudinal sibling pair data offer a unique perspective for investigating complex diseases and traits, allowing the exploration of the dynamic processes of gene expression over time while controlling for numerous confounding factors. Missing-not-at-random (MNAR) data are commonly encountered in such studies, but no statistical methods in the literature have been specifically developed to handle MNAR data in complex longitudinal data. Here, we propose a new statistical method that jointly models longitudinal data from sib-pairs and MNAR data. Extensive simulations demonstrate the excellent finite-sample properties of the proposed method.
In the health field, longitudinal studies involve the recording of clinical observations of the same sample of patients over successive periods, referred to as waves. This type of database serves as a valuable source of information and insights, particularly when examining the temporal aspect, allowing the extraction of relevant and non-obvious knowledge. The triadic concept analysis theory has been proposed to describe the ternary relationships between objects, attributes, and conditions. In this study, we present a methodology for exploring longitudinal health databases using both the triadic theory and triadic rules, which are similar to association rules but incorporate temporal relations. Through four case studies, we demonstrate the potential of applying triadic analysis to longitudinal databases to identify risk patterns, enhance decision-making processes, and deepen our understanding of temporal dynamics. These findings suggest a promising approach for describing longitudinal databases and obtaining insights to improve clinical decision-support systems for disease treatment.
A partially linear model with longitudinal data is considered, and empirical likelihood inference for the regression coefficients and the baseline function is investigated. The empirical log-likelihood ratios are proven to be asymptotically chi-squared, and the corresponding confidence regions for the parameters of interest are then constructed. From the empirical likelihood ratio functions, we also obtain the maximum empirical likelihood estimates of the regression coefficients and the baseline function and prove their asymptotic normality. Numerical studies are conducted to compare the performance of the empirical likelihood and the normal approximation-based method, and a real example is analysed.
In this paper, we consider the semiparametric regression model for longitudinal data. To account for the correlation within groups, a generalized empirical log-likelihood ratio statistic for the unknown parameters in the model is proposed by introducing a working covariance matrix. It is proved that the proposed statistic is asymptotically standard chi-squared under suitable conditions, and hence it can be used to construct confidence regions for the parameters. A simulation study is conducted to compare the proposed method with the generalized least squares method in terms of coverage accuracy and average lengths of the confidence intervals.
Model averaging has received much attention in recent years. This paper considers semiparametric model averaging for high-dimensional longitudinal data. To minimize the prediction error, the authors estimate the model weights using a leave-subject-out cross-validation procedure. Asymptotic optimality of the proposed method is proved in the sense that leave-subject-out cross-validation achieves the lowest possible prediction loss asymptotically. Simulation studies show that the proposed model averaging method performs much better than some commonly used model selection and averaging methods.
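The leave-subject-out idea can be sketched as follows: all observations of one subject are held out together, so that within-subject correlation does not leak between the training and validation sets. This is a generic illustration, not the weight-estimation procedure of the paper; the names `leave_subject_out_splits`, `cv_squared_error`, `fit`, and `predict` are hypothetical.

```python
def leave_subject_out_splits(subject_ids):
    """Yield (subject, train_idx, test_idx), holding out one subject at a time."""
    for s in sorted(set(subject_ids)):
        test_idx = [i for i, sid in enumerate(subject_ids) if sid == s]
        train_idx = [i for i, sid in enumerate(subject_ids) if sid != s]
        yield s, train_idx, test_idx

def cv_squared_error(subject_ids, y, fit, predict):
    """Average squared prediction error over all held-out observations.

    `fit(train_idx)` returns a fitted model; `predict(model, i)` returns the
    prediction for observation i. Both are user-supplied callables.
    """
    total = 0.0
    for _, train_idx, test_idx in leave_subject_out_splits(subject_ids):
        model = fit(train_idx)
        total += sum((y[i] - predict(model, i)) ** 2 for i in test_idx)
    return total / len(y)
```

In a model averaging setting, a criterion of this kind would be evaluated for a weighted combination of candidate-model predictions and minimized over the weights.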
Varying-coefficient models with longitudinal observations are very useful in epidemiology and some other practical fields. In this paper, a reducing component procedure is proposed for estimating the unknown functions and their derivatives in very general models, in which the unknown coefficient functions admit different or the same degrees of smoothness and the covariates can be time-dependent. The asymptotic properties of the estimators, such as consistency, rate of convergence, and asymptotic distribution, are derived. The asymptotic results show that the asymptotic variance of the reducing component estimators is smaller than that of the existing estimators when the coefficient functions admit different degrees of smoothness. Finite-sample properties of the procedures are studied through Monte Carlo simulations.
Modeling the mean and covariance simultaneously is a common strategy to efficiently estimate the mean parameters when applying generalized estimating equation techniques to longitudinal data. In this article, using generalized estimating equation techniques, we propose a new class of regression models for parameterizing covariance structures. With a novel Cholesky factor, the entries of the decomposition have moving average and log innovation interpretations and are modeled as the regression coefficients in both the mean and the linear functions of covariates. The resulting estimators for the covariance are shown to be consistent and asymptotically normally distributed. Simulation studies and a real data analysis show that the proposed approach yields highly efficient estimators for the parameters in the mean and provides parsimonious estimation of the covariance structure.
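As background for the Cholesky-based parameterization, the sketch below factors a covariance matrix as L D L^T with L unit lower triangular. In the usual moving-average reading of this factorization, the below-diagonal entries of L act as moving-average coefficients and the logarithms of the diagonal of D are log innovation variances. This is the standard textbook factorization, assumed here for illustration; it is not necessarily the novel Cholesky factor proposed in the article, and `ldl_decompose` is a hypothetical name.

```python
import math

def ldl_decompose(sigma):
    """Factor a symmetric positive-definite matrix as L D L^T.

    Returns (L, d): L is unit lower triangular (list of rows) and d holds
    the diagonal of D.
    """
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    return L, d

def log_innovation_variances(d):
    """Unconstrained parameters: the logs of the innovation variances."""
    return [math.log(v) for v in d]
```

Because the entries of L and the log innovation variances are unconstrained, they can be modeled by regression on covariates while the implied covariance matrix stays positive definite.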
In this paper, we study the local asymptotic behavior of the regression spline estimator in the framework of the marginal semiparametric model. Similarly to Zhu, Fung and He (2008), we give an explicit expression for the asymptotic bias of the regression spline estimator of the nonparametric function f. Our results also show that the asymptotic bias of the regression spline estimator does not depend on the working covariance matrix, which distinguishes regression splines from smoothing splines and the seemingly unrelated kernel. To understand the local bias result, we show that the regression spline estimator can be obtained iteratively by applying the standard weighted least squares regression spline estimator to pseudo-observations. At each iteration, the bias of the estimator is unchanged and only the variance is updated.
In this paper, we use profile empirical likelihood to construct confidence regions for the regression coefficients in the partially linear model with longitudinal data. The main contribution is that the within-subject correlation is taken into account to improve estimation efficiency. We assume a semiparametric structure for the covariances of the observation errors within each subject and employ both the first-order and the second-order moment conditions of the observation errors to construct the estimating equations. Although nonparametric estimators are involved, the empirical log-likelihood ratio statistic still tends in distribution to a standard χ²_p variable after the nuisance parameters are profiled away. A simulation study is also conducted.
In this paper, we consider the problem of variable selection and model detection in varying coefficient models with longitudinal data. We propose a combined penalization procedure to select the significant variables, detect the true structure of the model, and estimate the unknown regression coefficients simultaneously. With appropriate selection of the tuning parameters, we show that the proposed procedure is consistent in both variable selection and the separation of varying and constant coefficients, and that the penalized estimators have the oracle property. The finite-sample performance of the proposed method is illustrated by simulation studies and a real data analysis.
Empirical likelihood inference for partially linear errors-in-variables models with longitudinal data is investigated. Under regularity conditions, it is shown that the empirical log-likelihood ratio at the true parameters converges to the standard chi-squared distribution. Furthermore, we consider some estimates of the unknown parameter, and the resulting estimators are shown to be asymptotically normal. Simulations and a real data analysis are given to illustrate the performance of the proposed method.
For longitudinal data with a left-censored response, we propose a composite quantile regression (CQR) estimator of the regression parameter. Statistical properties of the CQR estimator, such as consistency and asymptotic normality, are studied under relaxed assumptions on the correlation structure of the error terms. The performance of the CQR estimator is investigated via simulation studies and a real dataset analysis.
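For orientation, composite quantile regression combines the check (pinball) losses at several quantile levels while sharing one slope vector across levels and allowing a separate intercept per level. The sketch below evaluates that objective for uncensored data only; it deliberately ignores left censoring and within-subject correlation, so it is not the estimator of the article, and the names `check_loss` and `cqr_objective` are illustrative.

```python
def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def cqr_objective(y, x, beta, intercepts, taus):
    """Composite quantile objective with a common slope vector `beta`.

    y: list of responses; x: list of covariate rows;
    intercepts[k] is the intercept paired with quantile level taus[k].
    """
    total = 0.0
    for b_k, tau in zip(intercepts, taus):
        for yi, xi in zip(y, x):
            fitted = b_k + sum(bj * xij for bj, xij in zip(beta, xi))
            total += check_loss(yi - fitted, tau)
    return total
```

An estimator would minimize this objective jointly over `beta` and `intercepts`, for example over an equally spaced grid of quantile levels.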
Funding (partially linear single-index model for longitudinal data): Supported by the National Natural Science Foundation of China (10571008), the Natural Science Foundation of Henan (092300410149), and the Core Teacher Foundation of Henan (2006141).
Funding (robust generalized estimating equation for partial linear mixed models): Supported by the Natural Science Foundation of China (10371042, 10671038).
Funding (interquantile shrinkage estimation for longitudinal data): Supported by the National Key R&D Program of China (Grant No. 2022YFA1003800), the National Natural Science Foundation of China (Grant Nos. 12301344, 12471265, 12231011 and 71988101), and the Research Grant Council, University Grant Committee of Hong Kong Special Administrative Region (Grant No. 14303622).
Funding (renewable debiased-lasso inference for high-dimensional streamed data): Supported by the National Key R&D Program of China (Grant No. 2022YFA1003702) and the National Natural Science Foundation of China (Grant No. 12271441).
Funding (optimal model averaging for the partially linear model with longitudinal data): Supported by the National Natural Science Foundation of China under Grant Nos. 11971421, 71925007, 72091212, and 12288201; the Yunling Scholar Research Fund of Yunnan Province under Grant No. YNWR-YLXZ-2018-020; the CAS Project for Young Scientists in Basic Research under Grant No. YSBR-008; and the Start-Up Grant from Kunming University of Science and Technology under Grant No. KKZ3202207024.
Funding (joint modeling of longitudinal sib-pair data and MNAR data): This work was supported by the National Natural Science Foundation of China (12171451).
Funding: The first author was supported by the National Natural Science Foundation of China (Grant No. 10571008), the Natural Science Foundation of Beijing (Grant No. 1072004), and the Science and Technology Development Project of Education Committee of Beijing City (Grant No. KM200510005009). The second author was supported by a grant of the Research Grant Council of Hong Kong (Grant No. HKBU7060/04P).
Abstract: A partially linear model with longitudinal data is considered. Empirical likelihood inference for the regression coefficients and the baseline function is investigated; the empirical log-likelihood ratio is proven to be asymptotically chi-squared, and the corresponding confidence regions for the parameters of interest are then constructed. Using the empirical likelihood ratio functions, we also obtain the maximum empirical likelihood estimates of the regression coefficients and the baseline function, and prove their asymptotic normality. Numerical studies are conducted to compare the performance of the empirical likelihood and the normal approximation-based method, and a real example is analysed.
Funding: China Postdoctoral Science Foundation Funded Project (20080430633); Shanghai Postdoctoral Scientific Program (08R214121); the National Natural Science Foundation of China (10871013); the Research Fund for the Doctoral Program of Higher Education (20070005003); the Natural Science Foundation of Beijing (1072004); the Basic Research and Frontier Technology Foundation of He'nan (072300410090).
Abstract: In this paper, we consider the semiparametric regression model for longitudinal data. To account for the correlation within groups, a generalized empirical log-likelihood ratio statistic for the unknown parameters in the model is suggested by introducing a working covariance matrix. It is proved that the proposed statistic is asymptotically standard chi-squared under suitable conditions, and hence it can be used to construct confidence regions for the parameters. A simulation study is conducted to compare the proposed method with the generalized least squares method in terms of coverage accuracy and average lengths of the confidence intervals.
Funding: The Ministry of Science and Technology of China under Grant No. 2016YFB0502301; the Academy for Multidisciplinary Studies of Capital Normal University; and the National Natural Science Foundation of China under Grant Nos. 11971323 and 11529101.
Abstract: Model averaging has received much attention in recent years. This paper considers semiparametric model averaging for high-dimensional longitudinal data. To minimize the prediction error, the authors estimate the model weights using a leave-subject-out cross-validation procedure. Asymptotic optimality of the proposed method is proved in the sense that leave-subject-out cross-validation achieves the lowest possible prediction loss asymptotically. Simulation studies show that the proposed model averaging method performs much better than some commonly used model selection and averaging methods.
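As a rough illustration of the leave-subject-out idea mentioned in this abstract, the sketch below holds out all observations of one subject at a time, refits each candidate model on the remaining subjects, and then picks the averaging weight that minimizes the cross-validated squared error. It is not the authors' procedure: the two ordinary least squares candidates, the grid search over a single weight, the toy data, and the function names `leave_subject_out_predictions` and `two_model_cv_weight` are all assumptions made for this example.

```python
import numpy as np


def leave_subject_out_predictions(designs, y, subject_ids):
    """Out-of-sample predictions for each candidate design matrix.

    Each subject's rows are predicted from an OLS fit (an assumed,
    simplified candidate model) on all other subjects.
    """
    preds = np.empty((len(y), len(designs)))
    for m, X in enumerate(designs):
        for s in np.unique(subject_ids):
            held_out = subject_ids == s
            beta, *_ = np.linalg.lstsq(X[~held_out], y[~held_out], rcond=None)
            preds[held_out, m] = X[held_out] @ beta
    return preds


def two_model_cv_weight(preds, y, grid_size=101):
    """Grid search for the weight w on model 0 (1 - w goes to model 1)."""
    grid = np.linspace(0.0, 1.0, grid_size)
    losses = [
        np.mean((y - (w * preds[:, 0] + (1.0 - w) * preds[:, 1])) ** 2)
        for w in grid
    ]
    best = int(np.argmin(losses))
    return grid[best], losses[best]


# Toy longitudinal data (hypothetical): 30 subjects, 4 visits each,
# with a subject-level random intercept inducing within-subject correlation.
rng = np.random.default_rng(0)
n_subjects, n_obs = 30, 4
subject_ids = np.repeat(np.arange(n_subjects), n_obs)
x1 = rng.normal(size=n_subjects * n_obs)
x2 = rng.normal(size=n_subjects * n_obs)
random_intercept = np.repeat(rng.normal(scale=0.5, size=n_subjects), n_obs)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + random_intercept + rng.normal(scale=0.5, size=x1.size)

ones = np.ones_like(x1)
designs = [np.column_stack([ones, x1]), np.column_stack([ones, x1, x2])]

preds = leave_subject_out_predictions(designs, y, subject_ids)
weight, cv_loss = two_model_cv_weight(preds, y)
print(f"weight on model 0: {weight:.2f}, CV loss: {cv_loss:.3f}")
```

Holding out whole subjects rather than single observations keeps correlated measurements from the same subject out of the training set. With more than two candidate models, the grid search would typically be replaced by a quadratic program over the weight simplex.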
Funding: Research Foundation for Doctor Programme (Grant No. 20060254006); the National Natural Science Foundation of China (Grant No. 10671089).
Abstract: Varying-coefficient models with longitudinal observations are very useful in epidemiology and other practical fields. In this paper, a reducing component procedure is proposed for estimating the unknown functions and their derivatives in very general models, in which the unknown coefficient functions admit different or the same degrees of smoothness and the covariates can be time-dependent. The asymptotic properties of the estimators, such as consistency, rate of convergence and asymptotic distribution, are derived. The asymptotic results show that the asymptotic variance of the reducing component estimators is smaller than that of the existing estimators when the coefficient functions admit different degrees of smoothness. Finite sample properties of our procedures are studied through Monte Carlo simulations.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11271347 and 11171321).
Abstract: Modeling the mean and covariance simultaneously is a common strategy to efficiently estimate the mean parameters when applying generalized estimating equation techniques to longitudinal data. In this article, using generalized estimating equation techniques, we propose a new kind of regression model for parameterizing covariance structures. Using a novel Cholesky factor, the entries in this decomposition have moving average and log innovation interpretations and are modeled as the regression coefficients in both the mean and the linear functions of covariates. The resulting estimators for the covariance are shown to be consistent and asymptotically normally distributed. Simulation studies and a real data analysis show that the proposed approach yields highly efficient estimators for the parameters in the mean, and provides parsimonious estimation for the covariance structure.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10671038, 10801039); Youth Science Foundation of Fudan University (Grant No. 08FQ29); Shanghai Leading Academic Discipline Project (Grant No. B118).
Abstract: In this paper, we study the local asymptotic behavior of the regression spline estimator in the framework of the marginal semiparametric model. Similarly to Zhu, Fung and He (2008), we give an explicit expression for the asymptotic bias of the regression spline estimator for the nonparametric function f. Our results also show that the asymptotic bias of the regression spline estimator does not depend on the working covariance matrix, which distinguishes regression splines from smoothing splines and the seemingly unrelated kernel. To understand the local bias result of the regression spline estimator, we show that the regression spline estimator can be obtained iteratively by applying the standard weighted least squares regression spline estimator to pseudo-observations. At each iteration, the bias of the estimator is unchanged and only the variance is updated.
Funding: Supported by NBRP (973 Program 2007CB814901) of China; NNSF project (10771123) of China; RFDP (20070422034) of China; NSF projects (ZR2010AZ001) of Shandong Province of China.
Abstract: In this paper we use profile empirical likelihood to construct confidence regions for regression coefficients in the partially linear model with longitudinal data. The main contribution is that the within-subject correlation is considered to improve estimation efficiency. We suppose a semi-parametric structure for the covariances of observation errors in each subject and employ both the first order and the second order moment conditions of the observation errors to construct the estimating equations. Although there are nonparametric estimators, the empirical log-likelihood ratio statistic still tends to a standard χ²_p variable in distribution after the nuisance parameters are profiled away. A data simulation is also conducted.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11501522, 11101014, 11001118 and 11171012); National Statistical Research Projects (Grant No. 2014LZ45); the Doctoral Fund of Innovation of Beijing University of Technology; the Science and Technology Project of the Faculty Adviser of Excellent PhD Degree Thesis of Beijing (Grant No. 20111000503); the Beijing Municipal Education Commission Foundation (Grant No. KM201110005029).
Abstract: In this paper, we consider the problem of variable selection and model detection in varying coefficient models with longitudinal data. We propose a combined penalization procedure to select the significant variables, detect the true structure of the model and estimate the unknown regression coefficients simultaneously. With appropriate selection of the tuning parameters, we show that the proposed procedure is consistent in both variable selection and the separation of varying and constant coefficients, and that the penalized estimators have the oracle property. The finite sample performance of the proposed method is illustrated by simulation studies and a real data analysis.
Funding: Supported by the State Key Program of National Natural Science Foundation of China (No. 12031016); the National Natural Science Foundation of China (No. 11801346); the Youth Fund for Humanities and Social Sciences Research of Ministry of Education (No. 18YJC910014); the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2020JM-276); the Fundamental Research Funds for the Central Universities (No. GK201901008).
Abstract: Empirical likelihood inference for partially linear errors-in-variables models with longitudinal data is investigated. Under regularity conditions, it is shown that the empirical log-likelihood ratio at the true parameters converges to the standard chi-squared distribution. Furthermore, we consider some estimates of the unknown parameter, and the resulting estimators are shown to be asymptotically normal. Some simulations and a real data analysis are given to illustrate the performance of the proposed method.
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 11601097 and 11471302); the State Key Program of National Natural Science Foundation of China (Grant No. 11231010).
Abstract: For longitudinal data with left-censored responses, we propose a composite quantile regression (CQR) estimator of the regression parameter. Statistical properties of the CQR estimator, such as consistency and asymptotic normality, are studied under relaxed assumptions on the correlation structure of the error terms. The performance of the CQR estimator is investigated via simulation studies and a real dataset analysis.