Osteoporosis is a known risk factor for rotator cuff tears (RCTs), but the causal relationship and underlying mechanisms remain unclear. This study aims to evaluate the impact of osteoporosis on RCT risk and investigate their genetic associations. Using data from the UK Biobank (n=457,871), cross-sectional analyses demonstrated that osteoporosis was significantly associated with an increased risk of RCTs (adjusted OR [95%CI] = 1.38 [1.25–1.52]). A longitudinal analysis of a subset of patients (n=268,117) followed over 11 years revealed that osteoporosis increased the risk of RCTs (adjusted HR [95%CI] = 1.56 [1.29–1.87]), with notable variation between sexes in sex-stratified analyses. Causal inference methods, including propensity score matching, inverse probability weighting, causal random forest, and survival random forest models, further confirmed the causal effect from both cross-sectional and longitudinal perspectives.
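A minimal sketch of one of the causal inference approaches named above, inverse probability weighting, for a binary exposure (osteoporosis) and binary outcome (RCT). The simulated data, confounders, and column names are hypothetical and not taken from the UK Biobank analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
age = rng.normal(60, 10, n)                      # hypothetical confounder
sex = rng.integers(0, 2, n)                      # hypothetical confounder
# exposure (osteoporosis) depends on the confounders
p_expo = 1 / (1 + np.exp(-(-6 + 0.08 * age + 0.5 * sex)))
osteo = rng.binomial(1, p_expo)
# outcome (RCT) depends on the exposure and the confounders
p_out = 1 / (1 + np.exp(-(-5 + 0.4 * osteo + 0.05 * age + 0.3 * sex)))
rct = rng.binomial(1, p_out)

X = np.column_stack([age, sex])
ps = LogisticRegression().fit(X, osteo).predict_proba(X)[:, 1]  # propensity scores
w = osteo / ps + (1 - osteo) / (1 - ps)                         # inverse-probability weights
# weighted risk difference (average treatment effect on the risk scale)
ate = np.average(rct[osteo == 1], weights=w[osteo == 1]) - \
      np.average(rct[osteo == 0], weights=w[osteo == 0])
print(f"IPW-estimated risk difference: {ate:.3f}")
```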
Understanding how renewable energy generation affects electricity prices is essential for designing efficient and sustainable electricity markets. However, most existing studies rely on regression-based approaches that capture correlations but fail to identify causal relationships, particularly in the presence of non-linearities and confounding factors. This limits their value for informing policy and market design in the context of the energy transition. To address this gap, we propose a novel causal inference framework based on local partially linear double machine learning (DML). Our method isolates the true impact of predicted wind and solar power generation on electricity prices by controlling for high-dimensional confounders and allowing for non-linear, context-dependent effects. This represents a substantial methodological advancement over standard econometric techniques. Applying this framework to the UK electricity market over the period 2018–2024, we produce the first robust causal estimates of how renewables affect day-ahead wholesale electricity prices. We find that wind power exerts a U-shaped causal effect: at low penetration levels, a 1 GWh increase reduces prices by up to £7/MWh; the effect weakens at mid-levels and intensifies again at higher penetration. Solar power consistently reduces prices at low penetration levels, by up to £9/MWh per additional GWh, but its marginal effect diminishes quickly. Importantly, the magnitude of these effects has increased over time, reflecting the growing influence of renewables on price formation as their share in the energy mix rises. These findings offer a sound empirical basis for improving the design of support schemes, refining capacity planning, and enhancing electricity market efficiency. By providing a robust causal understanding of renewable impacts, our study contributes both methodological innovation and actionable insights to guide future energy policy.
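A compact sketch of the partially linear DML idea with cross-fitting, using scikit-learn. The variable roles (wind generation as treatment, price as outcome, demand-like features as confounders) and the synthetic data are illustrative assumptions, not the paper's actual specification or its "local" extension.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def partially_linear_dml(Y, D, X, n_splits=5, seed=0):
    """Cross-fitted estimate of theta in Y = theta*D + g(X) + e, with D = m(X) + v."""
    y_res, d_res = np.zeros_like(Y, dtype=float), np.zeros_like(D, dtype=float)
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        g = RandomForestRegressor(random_state=seed).fit(X[train], Y[train])
        m = RandomForestRegressor(random_state=seed).fit(X[train], D[train])
        y_res[test] = Y[test] - g.predict(X[test])   # outcome residual
        d_res[test] = D[test] - m.predict(X[test])   # treatment residual
    return np.dot(d_res, y_res) / np.dot(d_res, d_res)  # final-stage OLS on residuals

# illustrative synthetic data: price depends non-linearly on confounders, linearly on wind
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))                      # confounders (e.g. demand, gas price)
D = X[:, 0] + rng.normal(size=2000)                 # wind generation (treatment)
Y = -3.0 * D + np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(size=2000)  # price
print(partially_linear_dml(Y, D, X))                # should be close to -3.0
```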
Offshore drilling costs are high, and the downhole environment is complex. Improving the rate of penetration (ROP) can effectively shorten offshore drilling cycles and improve economic benefits. It is difficult for current ROP models to guarantee prediction accuracy and model robustness at the same time. To address these issues, a new ROP prediction model was developed in this study, which treats ROP as a time series signal (ROP signal). The model is based on the temporal convolutional network (TCN) framework and integrates ensemble empirical mode decomposition (EEMD) and Bayesian network causal inference (BN); the resulting model is named EEMD-BN-TCN. Within the proposed model, the EEMD decomposes the original ROP signal into multiple sets of sub-signals. The BN determines the causal relationship between the sub-signals and the key physical parameters (weight on bit and revolutions per minute) and carries out a preliminary reconstruction of the sub-signals based on that causal relationship. The TCN predicts the signals reconstructed by the BN. When applying this model to an actual production well, the average absolute percentage error of the prediction decreased from 18.4% with TCN alone to 9.2% with EEMD-BN-TCN. In addition, compared with other models, the EEMD-BN-TCN can improve the decomposed ROP signal by regulating weight on bit and revolutions per minute, ultimately enhancing ROP.
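A minimal sketch of the EEMD stage only, assuming the third-party PyEMD package (installed as EMD-signal). The synthetic ROP series and parameter choices are illustrative; the downstream BN reconstruction and TCN prediction stages are not shown.

```python
import numpy as np
from PyEMD import EEMD   # pip install EMD-signal

# synthetic stand-in for an ROP time series (the real signal would come from drilling logs)
t = np.linspace(0, 10, 1000)
rop = 20 + 5 * np.sin(2 * np.pi * 0.3 * t) + 2 * np.sin(2 * np.pi * 2.0 * t) \
      + np.random.default_rng(0).normal(0, 0.5, t.size)

eemd = EEMD(trials=100)          # ensemble of noise-assisted EMD runs
imfs = eemd.eemd(rop, t)         # rows are intrinsic mode functions (the sub-signals)
print(imfs.shape)                # (n_imfs, 1000); sub-signals would then feed the BN/TCN stages
```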
Diabetic kidney disease (DKD), with increasing global prevalence, lacks effective therapeutic targets to halt or reverse its progression. Therapeutic targets supported by causal genetic evidence are more likely to succeed in randomized clinical trials. In this study, we integrated large-scale plasma proteomics, genetic-driven causal inference, and experimental validation to identify prioritized targets for DKD using the UK Biobank (UKB) and FinnGen cohorts. Among 2844 diabetic patients (528 with DKD), we identified 37 targets significantly associated with incident DKD, supported by both observational and causal evidence. Of these, 22% (8/37) of the potential targets are currently under investigation for DKD or other diseases. Our prospective study confirmed that higher levels of the three prioritized targets, insulin-like growth factor binding protein 4 (IGFBP4), family with sequence similarity 3 member C (FAM3C), and prostaglandin D2 synthase (PTGDS), were associated with 4.35-, 3.51-, and 3.57-fold increased likelihoods of developing DKD, respectively. In addition, population-level protein-altering variant (PAV) analysis and in vitro experiments cross-validated FAM3C and IGFBP4 as potential new target candidates for DKD, acting through the classic NLR family pyrin domain containing 3 (NLRP3)-caspase-1-gasdermin D (GSDMD) apoptotic axis. Our results demonstrate that integrating omics data mining with causal inference may be a promising strategy for prioritizing therapeutic targets.
Associations of per- and polyfluoroalkyl substances (PFAS) with lipid metabolism have been documented, but research remains scarce regarding the effect of PFAS on lipid variability. To understand their relationship more deeply, a step forward in causal inference is needed. To address this, we conducted a longitudinal study with three repeated measurements involving 201 participants in Beijing, among whom 100 eligible participants were included in the present study. Twenty-three PFAS and four lipid indicators were assessed at each visit. We used linear mixed models and quantile g-computation models to investigate associations between PFAS and blood lipid levels. A latent class growth model described PFAS serum exposure patterns, and a generalized linear model estimated associations between these patterns and lipid variability. Our study found that PFDA was associated with increased TC (β=0.083, 95%CI: 0.011, 0.155) and HDL-C (β=0.106, 95%CI: 0.034, 0.178). The PFAS mixture also showed a positive relationship with TC (β=0.06, 95%CI: 0.02, 0.10), with PFDA contributing most positively. Compared to the low trajectory group, the middle trajectory group for PFDA was associated with the VIM of TC (β=0.756, 95%CI: 0.153, 1.359). Furthermore, PFDA showed biological gradients with lipid metabolism. This is the first repeated-measures study to identify the impact of PFAS serum exposure patterns on lipid metabolism, the first to estimate the association between PFAS and blood lipid levels in middle-aged and elderly Chinese adults, and it reinforces the evidence for their causal relationship through epidemiological studies.
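A minimal sketch of a random-intercept linear mixed model for repeated measurements of the kind described above, assuming statsmodels and pandas. The data frame, column names (participant id, log-transformed PFDA, TC), and covariates are hypothetical, not the study's actual variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical repeated-measures data: 100 participants x 3 visits
rng = np.random.default_rng(0)
n, visits = 100, 3
df = pd.DataFrame({
    "pid": np.repeat(np.arange(n), visits),
    "visit": np.tile(np.arange(visits), n),
    "log_pfda": rng.normal(0, 1, n * visits),
    "age": np.repeat(rng.normal(60, 8, n), visits),
})
subject_effect = np.repeat(rng.normal(0, 0.3, n), visits)       # random intercept per participant
df["TC"] = 4.8 + 0.08 * df["log_pfda"] + 0.01 * df["age"] + subject_effect \
           + rng.normal(0, 0.2, n * visits)

# linear mixed model: TC ~ PFDA + age, with a random intercept grouped by participant
model = smf.mixedlm("TC ~ log_pfda + age", data=df, groups=df["pid"])
print(model.fit().summary())
```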
The goal of point cloud completion is to reconstruct raw scanned point clouds acquired from incomplete observations due to occlusion and restricted viewpoints. Numerous methods use a partial-to-complete framework, directly predicting missing components via global characteristics extracted from incomplete inputs. However, this makes detail recovery challenging, as global characteristics fail to provide complete missing-component specifics. A new point cloud completion method named Point-PC is proposed. A memory network and a causal inference model are separately designed to introduce shape priors and select absent shape information as supplementary geometric factors for aiding completion. Concretely, a memory mechanism is proposed to store complete shape features and their associated shapes in a key-value format. The authors design a pre-training strategy that uses contrastive learning to map incomplete shape features into the complete shape feature domain, enabling retrieval of analogous shapes from incomplete inputs. In addition, the authors employ backdoor adjustment to eliminate confounders, which are shape prior components sharing identical semantic structures with incomplete inputs. Experiments conducted on three datasets show that our method achieves superior performance compared to state-of-the-art approaches. The code for Point-PC can be accessed at https://github.com/bizbard/Point-PC.git.
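A toy sketch of the key-value memory retrieval idea described above, written with NumPy only. The feature dimension, memory size, and cosine-similarity lookup are illustrative assumptions rather than details of Point-PC.

```python
import numpy as np

rng = np.random.default_rng(0)
d, mem_size = 128, 512
keys = rng.normal(size=(mem_size, d))          # complete-shape feature keys
values = rng.normal(size=(mem_size, 1024, 3))  # associated complete shapes (point sets)

def retrieve(query, k=3):
    """Return the k stored shapes whose keys are most similar (cosine) to the query feature."""
    q = query / np.linalg.norm(query)
    sims = (keys / np.linalg.norm(keys, axis=1, keepdims=True)) @ q
    top = np.argsort(-sims)[:k]
    return values[top], sims[top]

partial_feature = rng.normal(size=d)           # encoded feature of an incomplete input
shapes, scores = retrieve(partial_feature)
print(shapes.shape, scores)                    # (3, 1024, 3) prior shapes and their similarities
```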
Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop explainable artificial intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: “Estimating average treatment effect: A brief review and beyond” from Dr. Kun Kuang, “Attribution problems in counterfactual inference” from Prof. Lian Li, “The Yule–Simpson paradox and the surrogate paradox” from Prof. Zhi Geng, “Causal potential theory” from Prof. Lei Xu, “Discovering causal information from observational data” from Prof. Kun Zhang, “Formal argumentation in causal reasoning and explanation” from Profs. Beishui Liao and Huaxin Huang, “Causal inference with complex experiments” from Prof. Peng Ding, “Instrumental variables and negative controls for observational studies” from Prof. Wang Miao, and “Causal inference with interference” from Dr. Zhichao Jiang.
Regression is a widely used econometric tool in research. In observational studies, based on a number of assumptions, regression-based statistical control methods attempt to estimate the causal effect of a treatment on an outcome by adding control variables. However, this approach may not produce reliable estimates of causal effects. Beyond the shortcomings of the method itself, this lack of confidence is mainly related to ambiguous formulations in econometrics, such as the definition of selection bias, the selection of core control variables, and the method of testing for robustness. Within the framework of causal models, we clarify the assumptions behind causal inference using regression-based statistical controls, as described in econometrics, and discuss how to select core control variables to satisfy these assumptions and how to conduct robustness tests for regression estimates.
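A small illustration of why the choice of control variables matters: with a simulated confounder, the regression that omits it recovers a biased treatment coefficient, while adding the control variable recovers the true effect. The data-generating process is made up for the example and uses statsmodels.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 10000
z = rng.normal(size=n)                        # confounder (affects both treatment and outcome)
d = 0.8 * z + rng.normal(size=n)              # treatment
y = 2.0 * d + 1.5 * z + rng.normal(size=n)    # outcome; true treatment effect = 2.0

naive = sm.OLS(y, sm.add_constant(d)).fit()                           # omits the confounder
adjusted = sm.OLS(y, sm.add_constant(np.column_stack([d, z]))).fit()  # controls for z
print("naive estimate:   ", round(naive.params[1], 2))    # biased upward (about 2.7)
print("adjusted estimate:", round(adjusted.params[1], 2)) # close to 2.0
```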
Causal inference prevails in the field of laparoscopic surgery. Once the causality between an intervention and an outcome is established, the intervention can be applied to a target population to improve clinical outcomes. In many clinical scenarios, interventions are applied longitudinally in response to patients' conditions. Such longitudinal data comprise static variables, such as age, gender, and comorbidities, and dynamic variables, such as the treatment regime, laboratory variables, and vital signs. Some dynamic variables can act as both confounder and mediator for the effect of an intervention on the outcome; in such cases, simple adjustment with a conventional regression model will bias the effect sizes. To address this, numerous statistical methods are being developed for causal inference; these include, but are not limited to, the marginal structural Cox regression model, dynamic treatment regimes, and the Cox regression model with time-varying covariates. This technical note provides a gentle introduction to such models and illustrates their use with an example in the field of laparoscopic surgery.
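A minimal sketch of a Cox model with time-varying covariates fitted to data in long (start-stop) format, assuming the lifelines package. The simulated patients, intervals, and column names are invented for illustration and are not the note's clinical example.

```python
import numpy as np
import pandas as pd
from lifelines import CoxTimeVaryingFitter

# simulate long-format (start-stop) data: up to two intervals per patient,
# with a treatment that may switch on in the second interval
rng = np.random.default_rng(0)
rows = []
for pid in range(200):
    age = int(rng.integers(40, 80))
    treated_late = int(rng.integers(0, 2))          # whether treatment starts in interval 2
    event1 = rng.random() < 0.05 + 0.002 * (age - 40)   # interval 1: untreated
    rows.append((pid, 0, 5, 0, age, int(event1)))
    if not event1:
        p2 = 0.20 + 0.003 * (age - 40) - 0.10 * treated_late  # treatment lowers risk
        rows.append((pid, 5, 10, treated_late, age, int(rng.random() < p2)))

df = pd.DataFrame(rows, columns=["id", "start", "stop", "treatment", "age", "event"])

ctv = CoxTimeVaryingFitter()
ctv.fit(df, id_col="id", event_col="event", start_col="start", stop_col="stop")
ctv.print_summary()   # hazard ratios for the time-varying treatment and for age
```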
Statistical approaches for evaluating causal effects and for discovering causal networks are discussed in this paper. A causal relation between two variables is different from an association or correlation between them. An association measurement between two variables may change dramatically from positive to negative when a third variable is omitted, a phenomenon called the Yule–Simpson paradox. We shall discuss how to evaluate the causal effect of a treatment or exposure on an outcome so as to avoid the Yule–Simpson paradox. Surrogates and intermediate variables are often used to reduce measurement costs or duration when measurement of endpoint variables is expensive, inconvenient, infeasible or unobservable in practice. There are many criteria for surrogates. However, it is possible that, for a surrogate satisfying these criteria, a treatment has a positive effect on the surrogate, which in turn has a positive effect on the outcome, and yet the treatment has a negative effect on the outcome; this is called the surrogate paradox. We shall discuss criteria for surrogates that avoid the surrogate paradox. Causal networks, which describe the causal relationships among a large number of variables, have been applied to many research fields. It is important to discover the structure of causal networks from observed data. We propose a recursive approach for discovering a causal network in which structural learning of a large network is decomposed recursively into learning of small networks. Further, to discover causal relationships, we present an active learning approach based on external interventions on some variables. When we focus on the causes of an outcome of interest, instead of discovering a whole network, we propose a local learning approach to discover the causes that affect the outcome.
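A worked numerical illustration of the Yule–Simpson paradox, with counts patterned on the classic kidney-stone example: the treatment looks better within each subgroup but worse once the subgroups are aggregated.

```python
# recoveries / totals, stratified by a confounder (case severity)
mild   = {"treated": (81, 87),   "untreated": (234, 270)}   # recovery rates 0.93 vs 0.87
severe = {"treated": (192, 263), "untreated": (55, 80)}     # recovery rates 0.73 vs 0.69

def rate(pair):
    return pair[0] / pair[1]

for group, data in [("mild", mild), ("severe", severe)]:
    print(group, "treated:", round(rate(data["treated"]), 2),
          "untreated:", round(rate(data["untreated"]), 2))

# aggregated over severity the ordering reverses, because severe cases are mostly treated
total_treated = (mild["treated"][0] + severe["treated"][0],
                 mild["treated"][1] + severe["treated"][1])
total_untreated = (mild["untreated"][0] + severe["untreated"][0],
                   mild["untreated"][1] + severe["untreated"][1])
print("overall treated:", round(rate(total_treated), 2),      # 0.78
      "overall untreated:", round(rate(total_untreated), 2))  # 0.83
```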
Image classification algorithms are commonly based on the independent and identically distributed (i.i.d.) assumption, but in practice the Out-Of-Distribution (OOD) problem is widespread: the contexts of images seen at prediction time are usually unseen during training. In this case, existing models trained under the i.i.d. assumption show limited generalisation. Causal inference is an important way to learn the causal associations that are invariant across different environments, thus improving the generalisation ability of the model. However, existing methods usually require partitioning of the environment to learn invariant features, and these partitions mostly have imbalance problems due to the lack of constraints. In this paper, we propose a balanced causal learning framework (BCL) that addresses both how to divide the dataset in a balanced way and how to balance training after the division; it automatically generates fine-grained balanced data partitions in an unsupervised manner and balances the training difficulty of different classes, thereby enhancing the generalisation ability of models in different environments. Experiments on the OOD datasets NICO and NICO++ demonstrate that BCL achieves stable predictions on OOD data, and we also find that models using BCL focus more accurately on the foreground of images compared with the existing causal inference method, which effectively improves generalisation ability.
This study's main purpose is to use Bayesian structural time-series models to investigate the causal effect of an earthquake on the Borsa Istanbul Stock Index. The results reveal a significant negative impact on stock market value during the post-treatment period. The results indicate rapid divergence from counterfactual predictions, and the actual stock index is lower than would have been expected in the absence of an earthquake. The curves of the actual stock value and the counterfactual prediction after the earthquake suggest a pattern of reconvergence once the stock market resumes its activities. The cumulative impact is negative in relative terms, as evidenced by a decrease of about 30% in the BIST-100 index. These results have significant implications for investors and policymakers, emphasizing the need to prepare for natural disasters to minimize their adverse effects on stock market valuations.
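A heavily simplified sketch of a Bayesian structural time-series impact analysis, assuming the pycausalimpact package (a community Python port of Google's CausalImpact). The series, the unaffected control covariate, and the simulated post-event drop are invented; this is not the paper's actual data or specification.

```python
import numpy as np
import pandas as pd
from causalimpact import CausalImpact   # pip install pycausalimpact

# hypothetical daily series: y is the index of interest, x1 an unaffected control series
rng = np.random.default_rng(0)
n, event = 120, 90
x1 = 100 + np.cumsum(rng.normal(0, 1, n))
y = 1.2 * x1 + rng.normal(0, 1, n)
y[event:] -= 15                          # simulated post-earthquake drop

data = pd.DataFrame({"y": y, "x1": x1})
pre_period, post_period = [0, event - 1], [event, n - 1]

ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())                      # average and cumulative effect vs. the counterfactual
```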
The computational experiments method is an essential tool for analyzing, designing, managing, and integrating complex systems. However, a significant challenge arises in constructing agents with human-like characteristics to form an AI society. Agent modeling typically encompasses four levels: 1) the autonomy features of agents, e.g., perception, behavior, and decision-making; 2) the evolutionary features of agents, e.g., bounded rationality, heterogeneity, and learning evolution; 3) the social features of agents, e.g., interaction, cooperation, and competition; 4) the emergent features of agents, e.g., gaming with environments or regulatory strategies. Traditional modeling techniques primarily derive from ABMs (agent-based models) and incorporate various emerging technologies (e.g., machine learning, big data, and social networks), which can enhance modeling capabilities while amplifying complexity [1].
Deep learning relies on learning from extensive data to generate prediction results. This approach may inadvertently capture spurious correlations within the data, leading to models that lack interpretability and robustness. Researchers have developed more profound and stable causal inference methods based on cognitive neuroscience. By replacing the correlation model with a stable and interpretable causal model, it is possible to mitigate the misleading nature of spurious correlations and overcome the limitations of model calculations. In this survey, we provide a comprehensive and structured review of causal inference methods in deep learning. Brain-like inference ideas are discussed from a brain-inspired perspective, and the basic concepts of causal learning are introduced. The article describes the integration of causal inference with traditional deep learning algorithms and illustrates its application to large model tasks as well as to specific modalities in deep learning. The current limitations of causal inference and future research directions are discussed. Moreover, the commonly used benchmark datasets and the corresponding download links are summarized.
This study aims to conduct an in-depth analysis of social media data using causal inference methods to explore the underlying mechanisms driving user behavior patterns. By leveraging large-scale social media datasets, this research develops a systematic analytical framework that integrates techniques such as propensity score matching, regression analysis, and regression discontinuity design to identify the causal effects of content characteristics, user attributes, and social network structures on user interactions, including clicks, shares, comments, and likes. The empirical findings indicate that factors such as sentiment, topical relevance, and network centrality have significant causal impacts on user behavior, with notable differences observed among various user groups. This study not only enriches the theoretical understanding of social media data analysis but also provides data-driven decision support and practical guidance for fields such as digital marketing, public opinion management, and digital governance.
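A small sketch of nearest-neighbour propensity score matching in an observational social-media setting, using scikit-learn. The simulated "verified account" treatment, "shares" outcome, and confounders are invented for illustration and are not the study's variables.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 4000
followers = rng.lognormal(6, 1, n)                           # confounder
sentiment = rng.normal(0, 1, n)                              # confounder
X = np.column_stack([np.log(followers), sentiment])
treated = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 6))))  # e.g. verified account
shares = 5 + 2 * treated + 3 * X[:, 0] + sentiment + rng.normal(0, 1, n)

ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]
# match each treated unit to the control unit with the closest propensity score
controls, treats = np.where(treated == 0)[0], np.where(treated == 1)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[controls].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treats].reshape(-1, 1))
matched_controls = controls[idx.ravel()]

att = np.mean(shares[treats] - shares[matched_controls])
print(f"ATT estimate from 1:1 PS matching: {att:.2f}")       # true effect is 2
```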
Artificial Intelligence (AI) has revolutionized education by enabling personalized learning experiences through adaptive platforms. However, traditional AI-driven systems primarily rely on correlation-based analytics, limiting their ability to uncover the causal mechanisms behind learning outcomes. This study explores the integration of Knowledge Graphs (KGs) and Causal Inference (CI) as a novel approach to enhance AI-driven educational systems. KGs provide a structured representation of educational knowledge, facilitating intelligent content recommendations and adaptive learning pathways, while CI enables AI systems to move beyond pattern recognition to identify cause-and-effect relationships in student learning. By combining these methods, this research aims to optimize personalized learning path recommendations, improve educational decision-making, and ensure AI-driven interventions are both data-informed and causally validated. Case studies from real-world applications, including intelligent tutoring systems and MOOC platforms, illustrate the practical impact of this approach. The findings contribute to advancing AI-driven education by fostering a balance between knowledge modeling, adaptability, and empirical rigor.
The utilization of big Earth data has provided insights into the planet we inhabit in unprecedented dimensions and scales. Unraveling the concealed causal connections within intricate data holds paramount importance for attaining a profound comprehension of the Earth system. Statistical methods founded on correlation have predominated in Earth system science (ESS) for a long time. Nevertheless, correlation does not imply causation, especially when confronted with spurious correlations arising from big data. Consequently, traditional correlation and regression methods are inadequate for addressing causation-related problems in the Earth system. In recent years, propelled by advancements in causal theory and inference methods, particularly the maturity of causal discovery and causal graphical models, causal inference has demonstrated vigorous vitality in various research directions in the Earth system, such as revealing regularities, understanding processes, testing hypotheses, and improving physical models. This paper commences by delving into the origins, connotations, and development of causality, subsequently outlining the principal frameworks of causal inference and the methods commonly used in ESS. Additionally, it reviews the applications of causal inference in the main branches of the Earth system and summarizes the challenges and development directions of causal inference in ESS. In the big Earth data era, causal inference, as an important method of big data analysis, can, along with physical models and machine learning, assist the paradigm transformation of ESS from a model-driven paradigm to one that integrates both mechanism and data. Looking forward, the establishment of a meticulously structured and normalized causal theory can act as a foundational cornerstone for fostering causal cognition in ESS and propel the leap from fragmented research towards a comprehensive understanding of the Earth system.
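A minimal sketch of the conditional-independence test that underlies constraint-based causal discovery (e.g. the PC algorithm), written with NumPy and SciPy only. The three synthetic climate-like variables and the chain structure are invented for illustration.

```python
import numpy as np
from scipy import stats

def partial_corr_test(x, y, z):
    """Test whether x and y are independent given z, via partial correlation."""
    # residualize x and y on z, then correlate the residuals
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r, _ = stats.pearsonr(rx, ry)
    # Fisher z-test for the partial correlation (conditioning set size k = 1)
    n, k = len(x), 1
    zstat = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - k - 3)
    return r, 2 * (1 - stats.norm.cdf(abs(zstat)))

# chain structure: sst -> pressure -> rainfall (so sst is independent of rainfall given pressure)
rng = np.random.default_rng(0)
sst = rng.normal(size=3000)
pressure = 0.8 * sst + rng.normal(size=3000)
rainfall = -0.7 * pressure + rng.normal(size=3000)

print(stats.pearsonr(sst, rainfall)[0])            # marginally correlated
print(partial_corr_test(sst, rainfall, pressure))  # near-zero partial correlation, large p-value
```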
Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms against adversarial attacks. Most of the existing methods are just stopgaps for specific adversarial samples. The main obstacle is that it remains unclear how adversarial samples fool deep learning models. The underlying working mechanism of adversarial samples has not been well explored, and it is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples. The self-attention/transformer is adopted as a powerful tool in this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, the working mechanism of adversarial samples is revealed, and instructive analysis is provided. Then, we propose simple and effective adversarial sample detection and recognition methods according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments are conducted to demonstrate the effectiveness of the proposed methods. Our methods outperform the state-of-the-art defense methods under various adversarial attacks.
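For context on how adversarial samples are generated, a minimal fast-gradient-sign (FGSM-style) attack against a hand-written logistic regression classifier in NumPy. This illustrates the attack being defended against, not the paper's causal detection method; the data, perturbation size, and model are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# train a tiny logistic regression on synthetic two-class data
X = np.vstack([rng.normal(-1, 1, (500, 20)), rng.normal(1, 1, (500, 20))])
y = np.r_[np.zeros(500), np.ones(500)]
w, b = np.zeros(20), 0.0
for _ in range(200):                              # plain gradient descent on cross-entropy
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * np.mean(p - y)

# FGSM: perturb an input in the sign direction of the loss gradient w.r.t. the input
x = X[0]                                          # a class-0 example
p = 1 / (1 + np.exp(-(x @ w + b)))
grad_x = (p - 0.0) * w                            # d(cross-entropy)/dx for true label 0
x_adv = x + 2.0 * np.sign(grad_x)                 # large epsilon so the flip is obvious here

print("clean prediction:      ", 1 / (1 + np.exp(-(x @ w + b))))      # near 0 (correct)
print("adversarial prediction:", 1 / (1 + np.exp(-(x_adv @ w + b))))  # pushed to the wrong class
```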
Multimodal documents combining language and graphs are widespread in print media as well as in electronic media. One of the most important tasks in comprehending graph-text combinations is the construction of causal chains among the meaning entities provided by the modalities. In this study we focus on the role of annotation position and of the shape of graph lines in simple line graphs on causal attributions concerning the event presented by the annotation and the processes (i.e., increases and decreases) and states (no-changes) in the domain value of the graphs, presented by the process-lines and state-lines. Based on an experimental investigation of readers' inferences under different conditions, guidelines for the design of multimodal documents including text and statistical information graphics are suggested. One suggestion is that the position and the number of verbal annotations should be selected appropriately; another is that graph line smoothing should be done cautiously.
Causal inference plays a crucial role in biomedical studies and social sciences. Over the years, researchers have devised various methods to facilitate causal inference, particularly in observational studies. Among these methods, the doubly robust estimator distinguishes itself through a remarkable feature: it retains its consistency even when only one of the two components, either the propensity score model or the outcome mean model, is correctly specified, rather than demanding correctness in both simultaneously. In this paper, we focus on scenarios where semiparametric models are employed for both the propensity score and the outcome mean. Semiparametric models offer a valuable blend of the interpretability of parametric models and the adaptability of nonparametric models. In this context, achieving correct model specification involves both accurately specifying the unknown function and consistently estimating the unknown parameter. We introduce a novel concept: the relaxed doubly robust estimator. It operates in a manner reminiscent of the traditional doubly robust estimator but with a reduced requirement for double robustness. In essence, it only mandates a consistent estimate of the unknown parameter, without requiring correct specification of the unknown function; that is, it only necessitates a partially correct model specification. We conduct a thorough analysis to establish the double robustness and semiparametric efficiency of our proposed estimator. Furthermore, we bolster our findings with comprehensive simulation studies to illustrate the practical implications of our approach.
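For reference, a compact sketch of the traditional doubly robust (AIPW) estimator that the relaxed version builds on, using scikit-learn. The parametric nuisance models and simulated data are illustrative assumptions, not the paper's semiparametric construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 3))
propensity = 1 / (1 + np.exp(-X[:, 0] + 0.5 * X[:, 1]))
A = rng.binomial(1, propensity)                               # treatment
Y = 1.0 * A + X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=n)    # true ATE = 1.0

# nuisance models: propensity score and outcome means under treatment / control
ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
mu1 = LinearRegression().fit(X[A == 1], Y[A == 1]).predict(X)
mu0 = LinearRegression().fit(X[A == 0], Y[A == 0]).predict(X)

# AIPW: outcome-model contrast plus an inverse-probability-weighted residual correction
aipw = (mu1 - mu0
        + A * (Y - mu1) / ps
        - (1 - A) * (Y - mu0) / (1 - ps))
print(f"doubly robust ATE estimate: {aipw.mean():.3f}")       # close to 1.0
```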