Funding: the National Natural Science Foundation of China (71631004, Key Project), the National Science Fund for Distinguished Young Scholars (71625001), the Basic Scientific Center Project of the National Science Foundation of China: Econometrics and Quantitative Policy Evaluation (71988101), the Science Foundation of the Ministry of Education of China (19YJA910003), and the China Scholarship Council Funded Project (201806315045).
Abstract: In this paper, we highlight some recent developments in a new route to evaluating macroeconomic policy effects, studied within the potential outcomes framework. First, the paper gives a brief introduction to the basic model setup in the modern econometric analysis of program evaluation. Second, primary attention is paid to causal effect estimation of macroeconomic policy with single time series data, together with some extensions to multiple time series data. Furthermore, we examine the connection of this new approach to traditional macroeconomic models for policy analysis and evaluation. Finally, we conclude by addressing some possible future research directions in statistics and econometrics.
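As a rough illustration of the potential outcomes setup referred to in this abstract (a generic textbook sketch in our own notation, not the paper's specific model), let $Y_t(1)$ and $Y_t(0)$ denote the outcomes that would be realized at time $t$ with and without a policy intervention $D_t \in \{0,1\}$; only one of the two is ever observed:

```latex
Y_t = D_t\,Y_t(1) + (1 - D_t)\,Y_t(0),
\qquad
\tau_t = \mathbb{E}\big[\,Y_t(1) - Y_t(0)\,\big],
```

where $\tau_t$ is the causal effect of the intervention at time $t$, the kind of estimand targeted by the single-time-series methods surveyed in the paper.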
Funding: financial support, in part, from the National Science Fund of China (NSFC) for Distinguished Young Scholars (71625001), the NSFC grant (71631004, Key Project), and the scholarship from the China Scholarship Council (CSC) under the Grant CSC (N201706310023).
Abstract: Since the financial crisis of 2008, risk measures, which are the core of risk management, have received increasing attention among economists and practitioners. This review concentrates on recent developments in the estimation of the most popular risk measures, namely, value at risk (VaR), expected shortfall (ES), and the expectile. After introducing the concept of risk measures, the focus is on discussing and comparing their econometric modeling. Then, parametric and nonparametric estimation of tail dependence is investigated. Finally, we conclude with insights into future research directions.
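For reference, the three risk measures named in this abstract are commonly defined as follows for a loss variable $X$ and level $\alpha \in (0,1)$ (standard textbook definitions in our notation, with ES stated under a continuity assumption; these are not reproduced from the review itself):

```latex
\mathrm{VaR}_\alpha(X) = \inf\{x \in \mathbb{R} : P(X \le x) \ge \alpha\},
\qquad
\mathrm{ES}_\alpha(X) = \mathbb{E}\big[X \mid X \ge \mathrm{VaR}_\alpha(X)\big],
\qquad
e_\tau(X) = \arg\min_{m \in \mathbb{R}} \mathbb{E}\big[\,|\tau - \mathbf{1}\{X < m\}|\,(X - m)^2\,\big],
```

the last being the $\tau$-expectile, the minimizer of an asymmetrically weighted squared loss.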
Funding: Supported by the National Natural Science Foundation of China (71631004, 72033008), the National Science Foundation for Distinguished Young Scholars (71625001), and the Science Foundation of the Ministry of Education of China (19YJA910003).
Abstract: The era of big data brings opportunities and challenges to developing new statistical methods and models to evaluate social programs, economic policies, or interventions. This paper provides a comprehensive review of some recent advances in statistical methodologies and models to evaluate programs with high-dimensional data. In particular, four kinds of methods for making valid statistical inference on treatment effects in high dimensions are addressed. The first is the so-called doubly robust type estimation, which models the outcome regression and propensity score functions simultaneously. The second is the covariate balance method for constructing treatment effect estimators. The third is the sufficient dimension reduction approach for causal inference. The last is machine learning procedures used, directly or indirectly, to make statistical inference on treatment effects. In this way, some of these methods and models are closely related to the debiased Lasso-type methods for high-dimensional regression models in the statistical literature. Finally, some future research topics are also discussed.
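As an illustration of the doubly robust idea mentioned first in this abstract, the standard augmented inverse probability weighting (AIPW) estimator of the average treatment effect combines an outcome regression $\hat m_d(x)$ with a propensity score $\hat e(x)$ (a generic textbook form in our notation; the high-dimensional versions reviewed in the paper add regularized first-stage estimators and debiasing):

```latex
\hat\tau_{\mathrm{AIPW}}
= \frac{1}{n}\sum_{i=1}^{n}\left[
\hat m_1(X_i) - \hat m_0(X_i)
+ \frac{D_i\,(Y_i - \hat m_1(X_i))}{\hat e(X_i)}
- \frac{(1 - D_i)\,(Y_i - \hat m_0(X_i))}{1 - \hat e(X_i)}
\right],
```

which remains consistent if either the outcome model or the propensity score model is correctly specified.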
Funding: Supported by the National Natural Science Foundation of China (71631004, 71571152), the Fundamental Research Funds for the Central Universities (20720171002, 20720170090), and the Fok Ying-Tong Education Foundation (151084).
Abstract: This paper highlights some recent developments in testing the predictability of asset returns, with a focus on linear mean regressions, quantile regressions, and nonlinear regression models. For these models, when predictors are highly persistent and their innovations are contemporaneously correlated with the dependent variable, the ordinary least squares estimator has a finite-sample bias, and its limiting distribution depends on an unknown nuisance parameter that is not consistently estimable. Without correcting for these issues, conventional test statistics are subject to serious size distortion and can generate misleading conclusions when testing the predictability of asset returns in real applications. In the past two decades, a sequence of studies has contributed to this subject and proposed various kinds of solutions, including, but not limited to, bias-correction procedures, the linear projection approach, the IVX filtering idea, variable addition approaches, the weighted empirical likelihood method, and the double-weight robust approach. In particular, to catch up with the fast-growing literature of the recent decade, we offer a selective overview of these methods. Finally, some future research topics, such as the econometric theory for predictive regressions with structural changes, nonparametric predictive models, and predictive models under more general data settings, are also discussed.
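A minimal version of the predictive regression setup described above, with sample size $T$ and a local-to-unity parameterization of persistence commonly used in this literature (our notation, not necessarily the paper's), is

```latex
y_t = \alpha + \beta x_{t-1} + u_t,
\qquad
x_t = \rho x_{t-1} + v_t,
\qquad
\rho = 1 + \frac{c}{T},
```

with $\mathrm{corr}(u_t, v_t) = \delta \neq 0$. The persistence of $x_t$ (through the unknown localizing constant $c$) together with the nonzero correlation $\delta$ is exactly what generates the finite-sample bias and the nonstandard limiting distribution of the OLS estimator of $\beta$.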
Funding: Supported by the National Natural Science Foundation of China (71131008, Key Project, and 71271179).
Abstract: In this review, we highlight some recent methodological and theoretical developments in the estimation and testing of large panel data models with cross-sectional dependence. The paper begins with a discussion of issues of cross-sectional dependence and introduces the concepts of weak and strong cross-sectional dependence. Then, primary attention is paid to spatial and factor approaches for modeling cross-sectional dependence in both linear and nonlinear (nonparametric and semiparametric) panel data models. Finally, we conclude with some speculations on future research directions.
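To fix ideas for the two modeling routes mentioned in this abstract (generic linear forms in our own notation, not the specific models surveyed), a factor (interactive effects) panel model and a spatial panel model can be written as

```latex
y_{it} = \beta' x_{it} + \lambda_i' f_t + \varepsilon_{it},
\qquad
y_{it} = \rho \sum_{j=1}^{N} w_{ij}\, y_{jt} + \beta' x_{it} + \varepsilon_{it},
```

where the unobserved common factors $f_t$ with loadings $\lambda_i$ typically generate strong cross-sectional dependence, while the spatial weights $w_{ij}$ typically induce weak dependence.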
Funding: Supported by the National Natural Science Foundation of China (70971113, 71131008, 71271179) and the Fundamental Research Funds for the Central Universities (2010221092, 2011221015).
Abstract: In this paper, we propose a new test for the stability of macroeconomic time series, based on the LASSO variable selection approach and nonparametric estimation of a time-varying model. The wild bootstrap is employed to obtain its data-dependent critical values. We apply the new method to test the stability of bivariate relations among 92 major Chinese macroeconomic time series. We find that more than 70% of the bivariate relations are significantly unstable.
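The wild bootstrap step mentioned in this abstract can be sketched generically as follows. This is an illustrative Python sketch under our own assumptions about the interface, not the authors' implementation: `fit_null` and `test_stat` stand for a user-supplied fit of the stable (null) model and for whatever stability statistic is computed from the data.

```python
import numpy as np

def wild_bootstrap_pvalue(y, x, test_stat, fit_null, n_boot=499, seed=0):
    """Generic wild bootstrap p-value for a stability test.

    fit_null(y, x)  -> (fitted values, residuals) under the null of stability
    test_stat(y, x) -> scalar statistic (larger = more evidence of instability)
    """
    rng = np.random.default_rng(seed)
    stat_obs = test_stat(y, x)
    fitted, resid = fit_null(y, x)
    stats_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Rademacher multipliers preserve conditional heteroskedasticity
        eta = rng.choice([-1.0, 1.0], size=len(y))
        y_star = fitted + resid * eta
        stats_boot[b] = test_stat(y_star, x)
    return (1 + np.sum(stats_boot >= stat_obs)) / (1 + n_boot)
```

The data-dependent critical value is then simply the appropriate upper quantile of `stats_boot`.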
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401497 and 11301435), the Fundamental Research Funds for the Central Universities (Grant No. T2013221043), the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, the Fundamental Research Funds for the Central Universities (Grant No. 20720140034), the National Institute on Drug Abuse, National Institutes of Health (Grant Nos. P50 DA036107 and P50 DA039838), and the National Science Foundation (Grant No. DMS1512422). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Drug Abuse, the National Institutes of Health, the National Science Foundation, or the National Natural Science Foundation of China.
Abstract: High-dimensional data have frequently been collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. The analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of the data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening in specific models and on the motivation for model-free feature screening procedures.
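As a concrete example of a marginal utility for feature screening, sure independence screening for the linear model ranks predictors by their absolute marginal correlation with the response and retains the top d of them. A minimal sketch, with our own illustrative function name and toy data, not code from the paper:

```python
import numpy as np

def sis_screen(X, y, d):
    """Rank predictors by absolute marginal correlation with y and
    return the indices of the d highest-ranked columns of X."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    marginal_corr = np.abs(Xc.T @ yc) / len(y)
    return np.argsort(marginal_corr)[::-1][:d]

# Toy usage: n = 100 observations, p = 1000 predictors, 3 active ones.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 1000))
y = 2 * X[:, 0] - 3 * X[:, 5] + 1.5 * X[:, 9] + rng.standard_normal(100)
print(sis_screen(X, y, d=20))  # the active indices 0, 5, 9 should typically appear
```

Model-free procedures replace the marginal correlation with utilities that require no specific model form, which is the second theme of the overview.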
Funding: the National Natural Science Foundation of China under Grant Nos. 71631004 and 72033008, the National Science Foundation for Distinguished Young Scholars under Grant No. 71625001, and the Science Foundation of the Ministry of Education of China under Grant No. 19YJA910003.
Abstract: Different covariate balance weighting methods have been proposed by researchers from different perspectives to estimate treatment effects. This paper gives a brief review of the covariate balancing propensity score method of Imai and Ratkovic (2014), the stable balance weighting procedure of Zubizarreta (2015), the calibration balance weighting approach of Chan et al. (2016), and the integrated propensity score technique of Sant'Anna et al. (2020). Simulations are conducted to illustrate the finite-sample performance of both the average treatment effect and quantile treatment effect estimators based on the different weighting methods. The simulation results show that, in general, the covariate balance weighting methods can outperform the conventional maximum likelihood estimation method, while the performance of the four covariate balance weighting methods varies with the data generating processes. Finally, the four covariate balance weighting methods are applied to estimate the treatment effect of college graduation on personal annual income.
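To illustrate the covariate-balance idea shared by these methods, the sketch below computes calibration-style weights for the control group so that its weighted covariate means match the treated-group means exactly. This is an entropy-balancing-type example under our own assumptions; it is not any one of the four reviewed estimators.

```python
import numpy as np
from scipy.optimize import minimize

def calibration_weights(X_control, X_treated):
    """Weights for control units such that the weighted control covariate
    means equal the treated covariate means (exponential tilting, dual form)."""
    target = X_treated.mean(axis=0)

    def dual(lam):
        # Convex dual objective: its gradient is (weighted control mean - target).
        return np.log(np.exp(X_control @ lam).sum()) - target @ lam

    res = minimize(dual, np.zeros(X_control.shape[1]), method="BFGS")
    w = np.exp(X_control @ res.x)
    return w / w.sum()

# Toy check: the weighted control means should match the treated means.
rng = np.random.default_rng(0)
Xc = rng.standard_normal((300, 3))
Xt = rng.standard_normal((100, 3)) + 0.5
w = calibration_weights(Xc, Xt)
print(w @ Xc, Xt.mean(axis=0))
```

With outcomes available, an effect on the treated could then be estimated as the treated outcome mean minus the weighted control outcome mean.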
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12271456, 12371270, and 71988101), the Ministry of Education Research in the Humanities and Social Sciences (Grant No. 22YJA910002), and the Shanghai Science and Technology Development Funds (Grant No. 23JC1402100).
Abstract: Conditional dependence plays a crucial role in various statistical procedures, including variable selection, network analysis, and causal inference. However, there remains a paucity of relevant research in the context of high-dimensional conditioning variables, a common challenge encountered in the era of big data. To address this issue, many existing studies impose certain model structures, yet high-dimensional conditioning variables often introduce spurious correlations in these models. In this paper, we systematically study the estimation biases inherent in widely used measures of conditional dependence when spurious variables are present in high-dimensional settings. We discuss the estimation inconsistency both intuitively and theoretically, demonstrating that conditional dependencies can be either overestimated or underestimated under different scenarios. To mitigate these biases and attain consistency, we introduce a measure of high-dimensional conditional dependence based on data splitting and refitting techniques. A conditional independence test is also developed using the newly advocated measure, with a tuning-free asymptotic null distribution. Furthermore, the proposed test is applied to generating high-dimensional network graphs in graphical modeling. The superior performance of the newly proposed methods is illustrated both theoretically and through simulation studies. We also apply the method to construct gene-gene networks using a dataset of breast invasive carcinoma, which yields interesting discoveries worth further scientific exploration.
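The data-splitting-and-refitting idea can be caricatured for a partial-correlation-type measure of the dependence between X and Y given high-dimensional Z. The sketch below is a stylized illustration under our own simplifying assumptions; the function name `split_refit_partial_corr` and the use of the Lasso for the selection stage are our own choices and do not reproduce the authors' estimator or test.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def split_refit_partial_corr(x, y, Z, seed=0):
    """Select conditioning variables on one half of the sample, then refit
    on the other half and compute the residual (partial) correlation."""
    n = len(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    half1, half2 = idx[: n // 2], idx[n // 2 :]

    # Stage 1: variable selection on the first half only.
    sel_x = np.flatnonzero(LassoCV(cv=5).fit(Z[half1], x[half1]).coef_)
    sel_y = np.flatnonzero(LassoCV(cv=5).fit(Z[half1], y[half1]).coef_)
    sel = np.union1d(sel_x, sel_y)

    # Stage 2: low-dimensional refit on the second half, then residual correlation.
    if sel.size > 0:
        Zs = Z[half2][:, sel]
        rx = x[half2] - LinearRegression().fit(Zs, x[half2]).predict(Zs)
        ry = y[half2] - LinearRegression().fit(Zs, y[half2]).predict(Zs)
    else:
        rx, ry = x[half2] - x[half2].mean(), y[half2] - y[half2].mean()
    return np.corrcoef(rx, ry)[0, 1]
```

Splitting decouples the selection of conditioning variables from the refitting step, which is the mechanism the paper exploits to avoid the spurious-variable biases it documents.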
Funding: Fan's research was supported by the National Natural Science Foundation of China under Grant No. 71671149, the Fundamental Research Funds for the Central Universities under Grant No. 20720171042, and the Natural Science Foundation of Fujian Province of China under Grant No. 2016J01340. Zhong's research was supported by the National Natural Science Foundation of China under Grant Nos. 11671334, 11301435, and 11401497.
Abstract: This paper studies the variable selection problem in the structural equation of a two-stage least squares (2SLS) model in the presence of endogeneity, which is commonly encountered in empirical economic studies. Model uncertainty and variable selection in the structural equation are important issues, as described in Andrews and Lu (2001) and Caner (2009). The authors propose an adaptive Lasso 2SLS estimator for a linear structural equation with endogeneity and show that it enjoys the oracle properties, i.e., consistency in both estimation and model selection. In Monte Carlo simulations, the authors demonstrate that the proposed estimator has smaller bias and MSE than the bridge-type GMM estimator (Caner, 2009). In a case study, the authors revisit the classic returns-to-education problem (Angrist and Krueger, 1991) using China population census data. The authors find that the education level not only has strong effects on income but that these effects are also heterogeneous across different age cohorts.
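In generic form (our notation; the paper's exact penalty and tuning may differ), an adaptive Lasso 2SLS estimator replaces the endogenous regressors by their first-stage fitted values $\hat X = P_Z X$, where $P_Z$ is the projection onto the instruments $Z$, and then solves a weighted $\ell_1$-penalized least squares problem:

```latex
\hat\beta = \arg\min_{\beta}\ \big\| y - \hat X \beta \big\|_2^2
+ \lambda_n \sum_{j=1}^{p} \frac{|\beta_j|}{|\tilde\beta_j|^{\gamma}},
```

where $\tilde\beta$ is an initial consistent estimator (for example, the ordinary 2SLS estimator) and $\gamma > 0$; the data-dependent weights are what deliver the oracle property.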
Funding: Financial support from the National Natural Science Foundation of China (NSFC) for Distinguished Scholars (71625001), the NSFC key projects with grant numbers 71631004, 72033008, and 71131008, and the Science Foundation of the Ministry of Education of China (19YJA910003).
Abstract: This paper proposes a new quantile regression model to characterize the heterogeneity in the distributional effects of maternal smoking during pregnancy on infant birth weight across the mother's age. By imposing a parametric restriction on the quantile functions of the potential outcome distributions conditional on the mother's age, we estimate the quantile treatment effects of maternal smoking during pregnancy on her baby's birth weight across different age groups of mothers. The results strongly show that the quantile effects of maternal smoking on low infant birth weight are negative and substantially heterogeneous across different ages.
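For reference, the conditional quantile treatment effect estimated in this setting is, in generic notation (ours, not necessarily the paper's),

```latex
\mathrm{QTE}(\tau \mid a)
= Q_{Y(1)}(\tau \mid A = a) - Q_{Y(0)}(\tau \mid A = a),
\qquad \tau \in (0,1),
```

where $Y(1)$ and $Y(0)$ are the potential birth weights with and without maternal smoking, $A$ is the mother's age, and $Q_{Y(d)}(\tau \mid A = a)$ denotes the conditional $\tau$-quantile of $Y(d)$; the negative effects reported in the abstract correspond to small $\tau$, the low-birth-weight part of the distribution.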
Funding: The authors thank the Editors and the anonymous referees for their helpful and constructive comments. The authors also gratefully acknowledge partial financial support from the National Science Fund for Distinguished Young Scholars (#71625001), the Natural Science Foundation of China grants (#7), and the scholarship from the China Scholarship Council under the Grant CSC N201706310023.
Abstract: To characterize heteroskedasticity, nonlinearity, and asymmetry in tail risk, this study investigates a class of conditional (dynamic) expectile models with partially varying coefficients, in which some coefficients are allowed to be constants while others are allowed to be unknown functions of random variables. A three-stage estimation procedure is proposed to estimate both the parametric constant coefficients and the nonparametric functional coefficients. Their asymptotic properties are investigated in a time series context, together with a new, simple, and easily implemented test for the goodness of fit of the models and a bandwidth selector based on a newly defined cross-validatory estimate of the expected expectile forecast error. The proposed methodology is data-analytic and sufficiently flexible to analyze complex and multivariate nonlinear structures without suffering from the curse of dimensionality. Finally, the proposed model is illustrated with simulated data and applied to analyzing the daily data of the S&P 500 return series.
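In generic notation (ours; the exact specification in the paper may differ), a partially varying-coefficient expectile model for the $\tau$-th conditional expectile of a return $Y_t$ can be written as

```latex
e_\tau(Y_t \mid X_t, Z_t, U_t) = X_t'\alpha + Z_t'\,g(U_t),
```

where $e_\tau(\cdot)$ is the conditional $\tau$-expectile, defined as the minimizer over $m$ of the asymmetric squared loss $\mathbb{E}\big[\,|\tau - \mathbf{1}\{Y_t < m\}|\,(Y_t - m)^2 \mid \cdot\,\big]$; $\alpha$ collects the constant coefficients and $g(\cdot)$ the unknown functional coefficients of the smoothing variable $U_t$.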