Abstract: Current high-dimensional feature screening methods still face significant challenges in handling mixed linear and nonlinear relationships, controlling redundant information, and improving model robustness. In this study, we propose a Dynamic Conditional Feature Screening (DCFS) method tailored for high-dimensional economic forecasting tasks. Our goal is to accurately identify key variables, enhance predictive performance, and provide both theoretical foundations and practical tools for macroeconomic modeling. The DCFS method constructs a comprehensive test statistic by integrating conditional mutual information with conditional regression error differences. By introducing a dynamic weighting mechanism, DCFS adaptively balances the linear and nonlinear contributions of features during the screening process. In addition, a dynamic thresholding mechanism is designed to control the false discovery rate (FDR), thereby improving the stability and reliability of the screening results. On the theoretical front, we prove that the proposed method satisfies the sure screening property and rank consistency, ensuring accurate identification of the truly important feature set in high-dimensional settings. Simulation results demonstrate that under purely linear, purely nonlinear, and mixed dependency structures, DCFS consistently outperforms classical screening methods such as SIS, CSIS, and IG-SIS in terms of true positive rate (TPR), false discovery rate (FDR), and rank correlation. These results highlight the accuracy, robustness, and stability of our method. Furthermore, an empirical analysis based on the U.S. FRED-MD macroeconomic dataset confirms the practical value of DCFS in real-world forecasting tasks. The experimental results show that DCFS achieves lower prediction errors (RMSE and MAE) and higher R² values in forecasting GDP growth. The selected key variables, including the Industrial Production Index (IP), Federal Funds Rate, Consumer Price Index (CPI), and Money Supply (M2), have clear economic interpretations, offering reliable support for economic forecasting and policy formulation.
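The simulation comparison above is reported in terms of TPR and FDR. As a minimal sketch of how these two metrics could be computed for one screened feature set, the snippet below assumes the truly active features are known (as they are in a simulation). The function name `screening_metrics` and the toy index sets are hypothetical and are not taken from the paper; the DCFS statistic itself is not reproduced here.

```python
def screening_metrics(selected, truly_active):
    """Return (TPR, FDR) of a screened feature set against a known active set."""
    selected, truly_active = set(selected), set(truly_active)
    true_pos = len(selected & truly_active)
    # TPR: share of the truly active features that were selected.
    tpr = true_pos / len(truly_active) if truly_active else 0.0
    # FDR: share of the selected features that are not truly active.
    fdr = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return tpr, fdr


# Hypothetical example: features 0-3 are active, the screen picked 0, 1, 2, 7.
tpr, fdr = screening_metrics(selected=[0, 1, 2, 7], truly_active=[0, 1, 2, 3])
print(f"TPR={tpr:.2f}, FDR={fdr:.2f}")
```

In a simulation study these values would typically be averaged over many replications.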
Abstract: L₁ regression is a robust alternative to least squares regression whenever there are outliers in the values of the response variable, or the errors follow a long-tailed distribution. To calculate the standard errors of the L₁ estimators, construct confidence intervals and test hypotheses about the parameters of the model, or calculate a robust coefficient of determination, it is necessary to have an estimate of a scale parameter τ. This parameter is such that τ²/n is the variance of the median of a sample of size n from the error distribution. [1] proposed a consistent, and therefore asymptotically unbiased, estimator of τ. However, this estimator is not stable in small samples, in the sense that it can increase when new independent variables are introduced into the model. When the errors follow the Laplace distribution, the maximum likelihood estimator of τ is the mean absolute error, that is, the mean of the absolute residuals. This estimator always decreases when new independent variables are added to the model. Our objective is to develop the asymptotic properties of the mean absolute error estimator analytically under several error distributions. We also performed a simulation study to compare the distributions of both estimators in small samples, with the objective of establishing conditions under which the mean absolute error is a good alternative to the estimator of [1].
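The link between τ and the mean absolute error under Laplace errors can be made explicit with a short derivation. The notation below (density f, Laplace scale b, residuals e_i) is assumed for illustration and may differ from the paper's own symbols. It uses the standard large-sample result that the median of n errors with density f and median zero has variance approximately 1/(4 n f(0)²).

```latex
% Scale parameter implied by Var(median) \approx \tau^2 / n:
\tau = \frac{1}{2 f(0)}

% Laplace errors with scale b > 0:
f(e) = \frac{1}{2b} \exp\!\left(-\frac{|e|}{b}\right)
\quad\Longrightarrow\quad
f(0) = \frac{1}{2b}, \qquad \tau = b

% Log-likelihood in b for residuals e_1, \dots, e_n and its maximizer:
\ell(b) = -n \log(2b) - \frac{1}{b} \sum_{i=1}^{n} |e_i|,
\qquad
\frac{\partial \ell}{\partial b} = -\frac{n}{b} + \frac{1}{b^{2}} \sum_{i=1}^{n} |e_i| = 0
\quad\Longrightarrow\quad
\hat{b} = \frac{1}{n} \sum_{i=1}^{n} |e_i|
```

Since τ = b in the Laplace case, the maximum likelihood estimator of τ is the mean of the absolute residuals, as the abstract states.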
Abstract: The rapid spread of the COVID-19 virus and its variants, especially in metropolitan areas around the world, became a major public health concern. Statistical modelling of the trend of the COVID-19 pandemic remains an urgent challenge in the United States, for which there are few solutions. In this paper, we demonstrate how Fourier terms that capture seasonality can be combined with ARIMA errors that capture the other dynamics in the data. We analyzed a 156-week national-level COVID-19 dataset covering 2020 to 2023 using a Dynamic Harmonic Regression model, including simulation analysis and accuracy improvement. Most importantly, we provide new pathways that may serve as targets for developing new solutions and approaches.
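A dynamic harmonic regression uses pairs of sine and cosine terms as regressors and models the remaining autocorrelation with an ARIMA error process. The sketch below only builds the Fourier regressors for a weekly series. The function name `fourier_terms`, the seasonal period of 52.18 weeks, and the choice of three harmonics are illustrative assumptions, not values reported in the abstract. Fitting the ARIMA errors would be left to a time-series library and is not shown.

```python
import math


def fourier_terms(n_obs, period, n_harmonics):
    """Build Fourier regressors for time points t = 1..n_obs.

    Each row holds [sin(2*pi*1*t/period), cos(2*pi*1*t/period), ...,
    sin(2*pi*K*t/period), cos(2*pi*K*t/period)] with K = n_harmonics.
    """
    rows = []
    for t in range(1, n_obs + 1):
        row = []
        for k in range(1, n_harmonics + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


# 156 weekly observations, yearly seasonality (assumed period), 3 harmonics.
X = fourier_terms(n_obs=156, period=52.18, n_harmonics=3)
print(len(X), len(X[0]))
```

The number of harmonics controls how flexible the seasonal pattern is and is commonly chosen by an information criterion such as AICc.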
Funding: Funding for this work was provided through the Research Groups Program under Grant Number R.G.P.1/189/41. I.M.A. and M.K.A. received the grant.
Abstract: The problem of predicting continuous scalar outcomes from functional predictors has received considerable interest in recent years in many fields, especially in the food industry. The k-nearest neighbor (k-NN) method of Near-Infrared Reflectance (NIR) analysis is practical, relatively easy to implement, and is becoming one of the most popular methods for assessing food quality from NIR data. The method is called the k-nearest neighbor classifier when it is used to classify categorical variables, and k-nearest neighbor regression when it is used to predict noncategorical variables. The objective of this paper is to use functional NIR spectroscopy to predict some chemical components with modern statistical models based on the kernel and k-NN procedures. Three NIR spectroscopy datasets are used as examples, namely the cookie dough, sugar, and Tecator data. Specifically, we propose three models for this kind of data, Functional Nonparametric Regression, Functional Robust Regression, and Functional Relative Error Regression, each with both the kernel and the k-NN approach, in order to compare them. The predictive power of the k-NN method was compared with that of the kernel method on these real data sets, and the experimental results show that the k-NN predictor is more efficient than the kernel predictor.
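The difference between the two approaches compared above can be illustrated on discretized curves. The sketch below is a simplified illustration, not the paper's estimators. It assumes curves sampled on a common grid, a discretized L2 distance, a plain average over the k nearest curves for k-NN, and a Nadaraya-Watson estimator with a fixed bandwidth and a quadratic kernel for the kernel approach. All function names and the toy data are hypothetical.

```python
import math


def l2_distance(curve_a, curve_b):
    """Discretized L2 distance between two curves on the same grid."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)) / len(curve_a)
    )


def knn_predict(train_curves, train_y, new_curve, k):
    """Average the responses of the k training curves closest to new_curve."""
    ranked = sorted(
        (l2_distance(c, new_curve), y) for c, y in zip(train_curves, train_y)
    )
    return sum(y for _, y in ranked[:k]) / k


def kernel_predict(train_curves, train_y, new_curve, bandwidth):
    """Nadaraya-Watson estimate with a fixed bandwidth and quadratic kernel."""
    weights = []
    for c in train_curves:
        u = l2_distance(c, new_curve) / bandwidth
        weights.append(max(0.0, 1.0 - u * u))  # zero weight outside the bandwidth
    total = sum(weights)
    if total == 0.0:
        return None  # no training curve falls inside the bandwidth
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy "spectra": flat curves at different levels, response equal to the level.
curves = [[0.0] * 3, [1.0] * 3, [2.0] * 3, [10.0] * 3]
responses = [0.0, 1.0, 2.0, 10.0]
query = [1.2] * 3
print(knn_predict(curves, responses, query, k=2))
print(kernel_predict(curves, responses, query, bandwidth=1.0))
```

The k-NN rule adapts the neighborhood size to the local density of curves, whereas a fixed bandwidth can leave a query with no neighbors at all, which is one commonly cited reason for preferring k-NN on unevenly spread functional data.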
Abstract: This paper proposes monthly models for the theoretical evaluation of the global horizontal irradiance of a tropical region in India, namely Sivagangai in Tamilnadu. The measured global horizontal irradiance comes from a 5 MW solar power plant located at Sivagangai. The data were monitored from May 2011 to April 2013. The theoretical assessment was carried out in Microsoft Visual Basic 2010 Express. A graphical user interface created in Visual Basic 2010 Express evaluated the empirical parameters used for model formulation, such as daily sunshine duration (S), maximum possible sunshine duration (S0), extraterrestrial horizontal global irradiance (H0), and extraterrestrial direct normal irradiance (G0). The proposed regression models were validated with statistical indicators computed from the predicted and the actual values for the region considered, namely mean bias error, root mean square error, and mean percentage error. The proposed monthly models were then compared with existing normalized models for global horizontal irradiance evaluation.
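The three validation indicators named above can be sketched as follows. The definitions use the common convention of predicted minus measured, with the percentage error taken relative to the measured value. The abstract does not state which sign convention the paper uses, so this is an assumption, and the function name and sample values are hypothetical.

```python
import math


def validation_indicators(measured, predicted):
    """Return (MBE, RMSE, MPE in percent) for paired measured/predicted values."""
    errors = [p - m for m, p in zip(measured, predicted)]
    n = len(errors)
    mbe = sum(errors) / n                              # mean bias error
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean square error
    # Mean percentage error relative to the measured values (must be nonzero).
    mpe = 100.0 * sum(e / m for e, m in zip(errors, measured)) / n
    return mbe, rmse, mpe


# Hypothetical monthly mean irradiance values in kWh/m^2/day.
mbe, rmse, mpe = validation_indicators([4.0, 5.0, 8.0], [5.0, 5.0, 6.0])
print(f"MBE={mbe:.3f}, RMSE={rmse:.3f}, MPE={mpe:.2f}%")
```

A negative MBE indicates underestimation on average under this convention. RMSE penalizes large deviations, so the two are usually reported together.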