Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150, 10926197, and 61201323.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. This paper provides a method that employs both mutual information and conditional mutual information to identify the causal structure of multivariate time series causal graphical models. A three-step procedure is developed to learn the contemporaneous and the lagged causal relationships of time series causal graphs. In contrast to conventional constraint-based algorithms, the proposed algorithm does not assume any particular family of distributions and is nonparametric. These properties are especially appealing for inference of time series causal graphs when prior knowledge about the data model is not available. Simulations and a case analysis demonstrate the effectiveness of the method.
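As a rough illustration of the two quantities this abstract relies on, the sketch below computes plug-in (frequency-count) estimates of mutual information and conditional mutual information for discrete samples. It is not the paper's three-step procedure; the function names and the toy common-cause data are assumptions made for the example, and continuous time series would need a different estimator.

```python
from collections import Counter
from math import log


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


# Hypothetical toy data: Z is a common cause of X and Y.
# X and Y are each a copy of Z that is flipped in one case out of four,
# and the two flips are enumerated independently of each other.
x, y, z = [], [], []
for zi in (0, 1):
    for a in (0, 0, 0, 1):
        for b in (0, 0, 0, 1):
            z.append(zi)
            x.append(zi ^ a)
            y.append(zi ^ b)

mi_xy = mutual_information(x, y)                          # positive: X and Y share Z
cmi_xy_given_z = conditional_mutual_information(x, y, z)  # zero up to rounding
print(mi_xy, cmi_xy_given_z)
```

In a constraint-based search, a positive mutual information together with a vanishing conditional mutual information is the kind of evidence used to drop the edge between X and Y once Z is conditioned on.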
Abstract: Current high-dimensional feature screening methods still face significant challenges in handling mixed linear and nonlinear relationships, controlling redundant information, and improving model robustness. In this study, we propose a Dynamic Conditional Feature Screening (DCFS) method tailored for high-dimensional economic forecasting tasks. Our goal is to accurately identify key variables, enhance predictive performance, and provide both theoretical foundations and practical tools for macroeconomic modeling. The DCFS method constructs a comprehensive test statistic by integrating conditional mutual information with conditional regression error differences. By introducing a dynamic weighting mechanism, DCFS adaptively balances the linear and nonlinear contributions of features during the screening process. In addition, a dynamic thresholding mechanism is designed to control the false discovery rate (FDR), thereby improving the stability and reliability of the screening results. On the theoretical front, we prove that the proposed method satisfies the sure screening property and rank consistency, ensuring accurate identification of the truly important feature set in high-dimensional settings. Simulation results demonstrate that under purely linear, purely nonlinear, and mixed dependency structures, DCFS consistently outperforms classical screening methods such as SIS, CSIS, and IG-SIS in terms of true positive rate (TPR), false discovery rate (FDR), and rank correlation. These results highlight the accuracy, robustness, and stability of the method. Furthermore, an empirical analysis based on the U.S. FRED-MD macroeconomic dataset confirms the practical value of DCFS in real-world forecasting tasks. The experimental results show that DCFS achieves lower prediction errors (RMSE and MAE) and higher R² values in forecasting GDP growth. The selected key variables, including the Industrial Production Index (IP), Federal Funds Rate, Consumer Price Index (CPI), and Money Supply (M2), have clear economic interpretations, offering reliable support for economic forecasting and policy formulation.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61073133, 60973067, and 61175053) and the Fundamental Research Funds for the Central Universities of China (No. 2011ZD010).
Abstract: Numerous models have been proposed to reduce the classification error of Naive Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective method of reducing the classification error of a classifier, this paper proposes a double-layer Bayesian classifier ensembles (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset contained in the new instance and finally combines all the classifiers, assigning each a weight according to the conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60375003 and 60972150) and the Science and Technology Innovation Foundation of Northwestern Polytechnical University (Grant No. 2007KJ01033).
Abstract: The general mutual information (GMI) and general conditional mutual information (GCMI) are considered as measures of lag dependence in nonlinear time series. Both measures are invariant under transformation. The statistics based on GMI and GCMI are estimated using the correlation integral. Under the hypothesis of independent series, the estimators have Gaussian asymptotic distributions. Simulations on generated nonlinear series show that the methods frequently identify the correct lags.
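To make the correlation-integral idea concrete, here is a minimal sketch of one common form of a second-order generalized mutual information, ln C_XY(r) - ln C_X(r) - ln C_Y(r), where C(r) is the fraction of point pairs closer than r in the max norm. Whether the paper uses exactly this form, the radius r, the helper names, and the toy lag-2 series are all assumptions of this example.

```python
from itertools import combinations
from math import log
import random


def correlation_integral(points, r):
    """Fraction of distinct pairs of points whose max-norm distance is below r."""
    n = len(points)
    close = sum(
        1 for p, q in combinations(points, 2)
        if max(abs(a - b) for a, b in zip(p, q)) < r
    )
    return close / (n * (n - 1) / 2)


def general_mutual_information(x, y, r):
    """Correlation-integral estimate: ln C_XY(r) - ln C_X(r) - ln C_Y(r)."""
    c_x = correlation_integral([(v,) for v in x], r)
    c_y = correlation_integral([(v,) for v in y], r)
    c_xy = correlation_integral(list(zip(x, y)), r)
    return log(c_xy) - log(c_x) - log(c_y)


def lag_dependence(series, lag, r):
    """Dependence between the series and its copy shifted by `lag` steps."""
    return general_mutual_information(series[:-lag], series[lag:], r)


# Hypothetical nonlinear series in which x_t depends only on x_{t-2}.
rng = random.Random(0)
series = [0.7, 0.7]
for _ in range(400):
    series.append(1.0 - 0.5 * series[-2] ** 2 + rng.gauss(0.0, 0.1))

gmi_lag1 = lag_dependence(series, 1, r=0.1)
gmi_lag2 = lag_dependence(series, 2, r=0.1)
print(gmi_lag1, gmi_lag2)
```

Scanning candidate lags and keeping those with a clearly larger statistic is the intended use; a significance threshold would come from the asymptotic distribution mentioned in the abstract, which this sketch does not implement.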
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150 and 10926197.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. Traditional causality inference methods have the salient limitation that the model must be linear with Gaussian noise. Although additive model regression can effectively infer the nonlinear causal relationships of additive nonlinear time series, it requires the contemporaneous causal relationships among variables to be linear, and it is not always valid for testing conditional independence relations. This paper provides a nonparametric method that employs both mutual information and conditional mutual information to identify the causal structure of a class of nonlinear time series models, extending additive nonlinear time series to nonlinear structural vector autoregressive models. An algorithm is developed to learn the contemporaneous and the lagged causal relationships of variables. Simulations demonstrate the effectiveness of the proposed method.
Funding: Supported by the Innovation Program for Quantum Science and Technology (2021ZD0301701) and the National Natural Science Foundation of China (12175104).
Abstract: Correctly quantifying the direct association between variables based on observed data is a valuable topic of study. On the one hand, many traditional methods can only measure linear direct association. On the other hand, certain existing measures of direct association between two variables become unstable when a parent variable has a strong influence on both variables. To address these issues, we propose a measure, the independent conditional mutual information (ICMI), to quantify the direct association between two variables in a three-variable network. Additionally, we use simulated data to numerically compare the stability and reliability of the ICMI with those of other measures of direct association under different conditions. The numerical results show that ICMI performs more stably in many cases than known measures such as unique information, conditional mutual information, and partial correlation. The statistical power results show that ICMI is more reliable across different functional forms. We further use our measure to analyze a network consisting of family finance, social security, and the residence of senior citizens.
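For orientation, the sketch below shows partial correlation, one of the baseline measures named in this abstract, applied to a three-variable network in which a parent Z drives both X and Y. It does not implement ICMI, whose definition is not given here; the data-generating model and all names are assumptions chosen for the example.

```python
from math import sqrt
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of X and Y after removing the linear effect of Z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical linear network: Z -> X and Z -> Y, with no direct X-Y link.
rng = random.Random(1)
z = [rng.gauss(0.0, 1.0) for _ in range(2000)]
x = [zi + 0.5 * rng.gauss(0.0, 1.0) for zi in z]
y = [zi + 0.5 * rng.gauss(0.0, 1.0) for zi in z]

r_xy = pearson(x, y)                   # large: induced by the shared parent
r_xy_z = partial_correlation(x, y, z)  # close to zero: no direct association
print(r_xy, r_xy_z)
```

Partial correlation only removes linear effects of the parent, which is the limitation that motivates information-theoretic measures of direct association.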
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60375003) and the Chinese Aviation Foundation (Grant No. 03153059).
Abstract: An information-theoretic method is proposed to test Granger causality and contemporaneous conditional independence in Granger causality graph models. In the graphs, the vertex set denotes the component series of the multivariate time series, the directed edges denote causal dependence, and the undirected edges reflect instantaneous dependence. The presence of an edge is measured by a statistic based on conditional mutual information and tested by a permutation procedure. Furthermore, for the relations found to exist, a statistic based on the difference between the general conditional mutual information and the linear conditional mutual information is proposed to test for nonlinearity. The significance of the nonlinearity test statistic is determined by a bootstrap method based on surrogate data. We investigate the finite-sample behavior of the procedure through simulated time series with different dependence structures, including linear and nonlinear relations.
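A permutation test of a conditional mutual information statistic can be sketched as follows. This is a simplified illustration for discrete, non-temporal samples: it shuffles X within each stratum of Z so that the X-Z relationship is preserved while any direct X-Y link is broken. The paper's actual statistic and permutation scheme for time series may differ, and every name and parameter below is an assumption of the example.

```python
from collections import Counter
from math import log
import random


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log(c / n) for c in Counter(samples).values())


def cmi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


def permute_within_strata(x, z, rng):
    """Shuffle x separately inside each group of indices that share a z value."""
    groups = {}
    for i, zi in enumerate(z):
        groups.setdefault(zi, []).append(i)
    permuted = list(x)
    for indices in groups.values():
        values = [x[i] for i in indices]
        rng.shuffle(values)
        for i, v in zip(indices, values):
            permuted[i] = v
    return permuted


def conditional_permutation_test(x, y, z, n_perm, rng):
    """p-value for the null hypothesis that X and Y are independent given Z."""
    observed = cmi(x, y, z)
    exceed = sum(
        1 for _ in range(n_perm)
        if cmi(permute_within_strata(x, z, rng), y, z) >= observed
    )
    return (1 + exceed) / (1 + n_perm)


# Hypothetical data: Z influences X, and X directly influences Y.
rng = random.Random(0)
z = [rng.randint(0, 1) for _ in range(300)]
x = [zi ^ (rng.random() < 0.3) for zi in z]
y = [xi ^ (rng.random() < 0.1) for xi in x]

p_value = conditional_permutation_test(x, y, z, n_perm=200, rng=rng)
print(p_value)
```

A small p-value supports keeping the edge between X and Y; the add-one correction keeps the estimate strictly positive for a finite number of permutations.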