Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 71103179, 71102129, and 11121403, and by the Program for Young Innovative Research Team in China University of Political Science and Law.
Abstract: The recent financial crisis highlighted the inherent weaknesses of the financial market. To explore the mechanism that maintains the financial market as a system, we study the interactions within the U.S. financial market from a network perspective. In conditional Granger causality network analysis, network density and in-degree and out-degree rankings are important indicators for analyzing the conditional causal relationships among financial agents and, in turn, for assessing the stability of the U.S. financial system. We find that the topological structure of the G-causality network of the U.S. financial market changed across different stages over the last decade, especially during the recent global financial crisis. The density of the G-causality network is much higher during the 2007-2009 crisis and peaks in 2008, the most turbulent period of the crisis. Ranked by in-degree and out-degree, insurance companies sit at the top of the 68 financial institutions during the crisis. They act as hubs that are more easily influenced by other financial institutions and that simultaneously influence others during the global financial disturbance.
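The three indicators named in this abstract can be computed directly from a directed adjacency matrix. The following is a minimal sketch, not the authors' code: the four institutions and the edge values are hypothetical, and an entry `edges[i][j] == 1` is assumed to mean that a conditional Granger causality test found institution `i` to G-cause institution `j`.

```python
# Hypothetical 4-node directed network; edges[i][j] == 1 is assumed to mean
# "node i Granger-causes node j". The values are made up for illustration.
names = ["bank", "broker", "insurer", "hedge_fund"]
edges = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def network_density(adj):
    """Fraction of the n*(n-1) possible directed links that are present."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return links / (n * (n - 1))


def out_degrees(adj):
    """Number of other nodes that each node influences (row sums)."""
    n = len(adj)
    return [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]


def in_degrees(adj):
    """Number of other nodes that influence each node (column sums)."""
    n = len(adj)
    return [sum(adj[i][j] for i in range(n) if i != j) for j in range(n)]


density = network_density(edges)
ranking_in = sorted(zip(names, in_degrees(edges)), key=lambda p: -p[1])
ranking_out = sorted(zip(names, out_degrees(edges)), key=lambda p: -p[1])

print(f"density = {density:.2f}")
print("in-degree ranking:", ranking_in)
print("out-degree ranking:", ranking_out)
```

In this toy network the insurer tops both rankings, which is the pattern the abstract describes as a hub that both receives and transmits influence.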
Funding: National Natural Science Foundation of China, Grant/Award Numbers: 62106009, 62276010; R&D Program of Beijing Municipal Education Commission, Grant/Award Numbers: KM202210005030, KZ202210005009.
Abstract: Inferring causal protein signalling networks from human immune system cell data is a promising approach to unravelling the underlying tissue signalling biology and the dysfunction in diseased cells, and it has attracted considerable attention within the bioinformatics field. Recently, Bayesian network (BN) techniques have gained significant popularity for inferring causal protein signalling networks from multiparameter single-cell data. However, current BN methods may exhibit high computational complexity and ignore interactions among protein signalling molecules from different single cells. A novel BN method for learning causal protein signalling networks, based on a parallel discrete artificial bee colony and named PDABC, is presented. Specifically, PDABC is a score-based BN method that uses the parallel artificial bee colony to search for the globally optimal causal protein signalling network, that is, the one with the highest discrete K2 metric. Experimental results on several simulated datasets, as well as on a previously published multiparameter fluorescence-activated cell sorter dataset, indicate that PDABC surpasses existing state-of-the-art methods in both performance and computational efficiency.
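To make the scoring step concrete, here is a minimal sketch of a K2 score for a single node, assuming the abstract's "discrete K2 metric" refers to the standard K2 score for discrete Bayesian networks. It is not the PDABC implementation, and it covers only the score that a search procedure would maximise, not the bee colony search itself. The function name `log_k2_node`, the variables `A` and `B`, and the toy data are hypothetical.

```python
from collections import Counter
from math import lgamma, log


def log_k2_node(data, child, parents, arity):
    """Log K2 score contribution of `child` given the parent set `parents`.

    data:  list of dicts mapping variable name -> discrete state
    arity: dict mapping variable name -> number of states r
    Implements sum_j [ log((r-1)!) - log((N_ij + r - 1)!) + sum_k log(N_ijk!) ].
    """
    r = arity[child]
    n_ij = Counter()   # count of each observed parent configuration j
    n_ijk = Counter()  # count of each (parent configuration j, child state k)
    for row in data:
        cfg = tuple(row[p] for p in parents)
        n_ij[cfg] += 1
        n_ijk[(cfg, row[child])] += 1

    score = 0.0
    for n in n_ij.values():
        # lgamma(r) = log((r-1)!) and lgamma(n + r) = log((n + r - 1)!)
        score += lgamma(r) - lgamma(n + r)
    for n in n_ijk.values():
        score += lgamma(n + 1)  # log(N_ijk!)
    # Unobserved parent configurations contribute exactly zero, so they are skipped.
    return score


# Hypothetical toy data in which B always copies A.
toy_data = [{"A": 0, "B": 0}] * 4 + [{"A": 1, "B": 1}] * 4
toy_arity = {"A": 2, "B": 2}

score_with_parent = log_k2_node(toy_data, "B", ["A"], toy_arity)
score_without_parent = log_k2_node(toy_data, "B", [], toy_arity)
print(score_with_parent, score_without_parent)
```

On this toy data the structure with the edge A → B scores higher than the structure without it, which is the kind of comparison a score-based search uses to prefer one network over another.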
Abstract: Causal analysis is a powerful tool for unravelling the complexity of data, and hence for providing clues toward, for example, better platform design and more efficient interoperability and service management. Data science will surely benefit from advances in this field. Here we introduce to this community a recent finding in physics on causality and the rigorous, quantitative causality analysis that follows from it. The resulting formula is concise in form, involving only common statistics, namely sample covariances. A corollary is that causation implies correlation, but not vice versa, resolving the long-standing philosophical debate over correlation versus causation. The applicability to big data analysis is validated with time series purportedly generated with hidden processes. As a demonstration, a preliminary application to the gross domestic product (GDP) data of the United States, China, and Japan reveals some subtle USA-China-Japan relations in certain periods.
Abstract: Statistical approaches for evaluating causal effects and for discovering causal networks are discussed in this paper. A causal relation between two variables is different from an association or correlation between them. A measure of association between two variables may change dramatically from positive to negative when a third variable is omitted, which is called the Yule-Simpson paradox. We discuss how to evaluate the causal effect of a treatment or exposure on an outcome so as to avoid the Yule-Simpson paradox. Surrogates and intermediate variables are often used to reduce measurement cost or duration when measurement of the endpoint variables is expensive, inconvenient, or infeasible, or when they are unobservable in practice. Many criteria for surrogates have been proposed. However, for a surrogate satisfying these criteria, it is possible that a treatment has a positive effect on the surrogate, which in turn has a positive effect on the outcome, and yet the treatment has a negative effect on the outcome; this is called the surrogate paradox. We discuss criteria for surrogates that avoid the surrogate paradox. Causal networks, which describe the causal relationships among a large number of variables, have been applied in many research fields. It is important to discover the structures of causal networks from observed data. We propose a recursive approach for discovering a causal network, in which the structural learning of a large network is decomposed recursively into the learning of small networks. To discover causal relationships further, we present an active learning approach based on external interventions on some variables. When the focus is on the causes of an outcome of interest rather than on the whole network, we propose a local learning approach to discover the causes that affect the outcome.
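The sign reversal described in this abstract is easy to reproduce numerically. The sketch below uses illustrative counts that are not taken from the paper; the treatment labels `A` and `B` and the stratum names are hypothetical. It computes the difference in success rates within each stratum of a third variable and again after that variable is omitted.

```python
# Illustrative (successes, total) counts per treatment within each stratum of
# a third variable. These numbers are not from the paper.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, total):
    return successes / total


def risk_difference(a, b):
    """Success rate under treatment A minus success rate under treatment B."""
    return rate(*a) - rate(*b)


# Association within each stratum (third variable held fixed).
stratified = {s: risk_difference(c["A"], c["B"]) for s, c in counts.items()}

# Association after omitting the third variable (strata pooled).
pooled = {
    t: tuple(sum(counts[s][t][k] for s in counts) for k in (0, 1))
    for t in ("A", "B")
}
marginal = risk_difference(pooled["A"], pooled["B"])

for stratum, diff in stratified.items():
    print(f"{stratum}: A - B = {diff:+.3f}")
print(f"pooled: A - B = {marginal:+.3f}")
```

Treatment A looks better within every stratum yet worse once the strata are pooled, because the two treatments are spread very unevenly across the strata.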
Abstract: The Chinese air transport system has evolved considerably over the last decade, with a strong increase in the number of flights operated and a consequent reduction in their punctuality. In this contribution, we propose modelling the process of delay propagation using complex networks, in which nodes represent airports and a link is assigned between a pair of airports when delay propagation is detected between them. Delay time series are analysed with the well-known Granger causality, which detects whether one time series is causing the dynamics observed in a second one. The results indicate that delays are mostly propagated from small and regional airports, and through flights operated by turbo-prop aircraft. These insights can be used to design strategies for dampening delay propagation, for instance by including small airports in the system's Collaborative Decision Making.
Abstract: We investigated the application of Causal Bayesian Networks (CBNs) to large data sets in order to predict user intent via internet search prediction. Sample data are taken from search engine logs (Excite, Altavista, and Alltheweb). These logs are parsed and sorted to create a data structure that is used to build a CBN. This network is used to predict the next term or terms that the user may be about to search for (type). We examined the application of CBNs, compared with Naive Bayes and Bayes Net classifiers, on very large datasets. To simulate our proposed results, we took a small sample of search data logs to predict intentional query typing. Additionally, problems that arise with the use of such a data structure are addressed individually, along with the solutions used and their prediction accuracy and sensitivity.