Osteoporosis is a known risk factor for rotator cuff tears (RCTs), but the causal relationship and underlying mechanisms remain unclear. This study aims to evaluate the impact of osteoporosis on RCT risk and investigate their genetic associations. Using data from the UK Biobank (n = 457871), cross-sectional analyses demonstrated that osteoporosis was significantly associated with an increased risk of RCTs (adjusted OR [95% CI] = 1.38 [1.25–1.52]). A longitudinal analysis of a subset of patients (n = 268117) over 11 years revealed that osteoporosis increased the risk of RCTs (adjusted HR [95% CI] = 1.56 [1.29–1.87]), an effect that varied notably between sexes in sex-stratified analysis. Causal inference methods, including propensity score matching, inverse probability weighting, causal random forest, and survival random forest models, further confirmed the causal effect from both cross-sectional and longitudinal perspectives.
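As a rough illustration of inverse probability weighting, one of the causal inference methods named in the abstract above, the following minimal sketch compares a crude risk difference with a weighted one. The counts, the single binary confounder `z`, and all function names are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the UK Biobank analysis.

```python
from collections import defaultdict

# Hypothetical counts: (confounder z, exposure a) -> (group size, events).
TABLE = {
    (0, 1): (20, 4),
    (0, 0): (80, 8),
    (1, 1): (60, 30),
    (1, 0): (40, 16),
}


def build_rows(table):
    """Expand the count table into one (z, a, y) tuple per person."""
    rows = []
    for (z, a), (n, events) in table.items():
        rows += [(z, a, 1)] * events + [(z, a, 0)] * (n - events)
    return rows


def crude_risk_difference(rows):
    """Unadjusted risk in the exposed minus risk in the unexposed."""
    def risk(arm):
        ys = [y for _, a, y in rows if a == arm]
        return sum(ys) / len(ys)
    return risk(1) - risk(0)


def ipw_risk_difference(rows):
    """Weight each person by 1 / P(own exposure | z), then compare arms."""
    n_z = defaultdict(int)
    n_za = defaultdict(int)
    for z, a, _ in rows:
        n_z[z] += 1
        n_za[(z, a)] += 1

    weighted_events = defaultdict(float)
    weight_total = defaultdict(float)
    for z, a, y in rows:
        propensity = n_za[(z, a)] / n_z[z]  # estimated P(A = a | Z = z)
        weight = 1.0 / propensity
        weighted_events[a] += weight * y
        weight_total[a] += weight

    risk_exposed = weighted_events[1] / weight_total[1]
    risk_unexposed = weighted_events[0] / weight_total[0]
    return risk_exposed - risk_unexposed


ROWS = build_rows(TABLE)
CRUDE = crude_risk_difference(ROWS)
WEIGHTED = ipw_risk_difference(ROWS)
print(f"crude risk difference: {CRUDE:.3f}")
print(f"IPW risk difference:   {WEIGHTED:.3f}")
```

In this toy table the exposed group is concentrated in the high-risk stratum, so the crude difference overstates the effect; weighting by the inverse of the estimated exposure probability rebalances the two arms across strata.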
Offshore drilling costs are high, and the downhole environment is even more complex. Improving the rate of penetration (ROP) can effectively shorten offshore drilling cycles and improve economic benefits. Current ROP models struggle to guarantee prediction accuracy and robustness at the same time. To address this, a new ROP prediction model was developed in this study that treats ROP as a time series signal (ROP signal). The model, named EEMD-BN-TCN, is based on the temporal convolutional network (TCN) framework and integrates ensemble empirical mode decomposition (EEMD) and Bayesian network causal inference (BN). Within the proposed model, the EEMD decomposes the original ROP signal into multiple sets of sub-signals. The BN determines the causal relationship between the sub-signals and the key physical parameters (weight on bit and revolutions per minute) and carries out a preliminary reconstruction of the sub-signals based on that relationship. The TCN then predicts the signals reconstructed by the BN. When the model was applied to an actual production well, the average absolute percentage error of the prediction decreased from 18.4% with TCN alone to 9.2% with EEMD-BN-TCN. In addition, compared with other models, EEMD-BN-TCN can improve the decomposed ROP signal by regulating weight on bit and revolutions per minute, ultimately enhancing ROP.
The goal of point cloud completion is to reconstruct raw scanned point clouds acquired from incomplete observations due to occlusion and restricted viewpoints. Numerous methods use a partial-to-complete framework, directly predicting missing components via global characteristics extracted from incomplete inputs. However, this makes detail recovery challenging, as global characteristics fail to provide complete specifics of the missing components. A new point cloud completion method named Point-PC is proposed. A memory network and a causal inference model are separately designed to introduce shape priors and select absent shape information as supplementary geometric factors for aiding completion. Concretely, a memory mechanism is proposed to store complete shape features and their associated shapes in a key-value format. The authors design a pre-training strategy that uses contrastive learning to map incomplete shape features into the complete shape feature domain, enabling retrieval of analogous shapes from incomplete inputs. In addition, the authors employ backdoor adjustment to eliminate confounders, which are shape prior components sharing identical semantic structures with incomplete inputs. Experiments conducted on three datasets show that the proposed method achieves superior performance compared to state-of-the-art approaches. The code for Point-PC is available at https://github.com/bizbard/Point-PC.git.
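For readers unfamiliar with the backdoor adjustment mentioned in the abstract above, its standard textbook form is shown below. The symbols are illustrative assumptions rather than the paper's own notation: X stands for the incomplete input, Y for the completed shape, and Z for a confounder that satisfies the backdoor criterion.

```latex
P\bigl(Y \mid do(X = x)\bigr) = \sum_{z} P\bigl(Y \mid X = x,\, Z = z\bigr)\, P(Z = z)
```

The sum averages the conditional distribution over the marginal distribution of Z rather than over P(Z | X = x), which is what removes the confounder's influence on the estimated effect.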
Associations between per- and polyfluoroalkyl substances (PFAS) and lipid metabolism have been documented, but research on the effect of PFAS on lipid variability remains scarce. A deeper understanding of their relationship requires a step forward in causal inference. To address these gaps, we conducted a longitudinal study with three repeated measurements involving 201 participants in Beijing, of whom 100 eligible participants were included in the present study. Twenty-three PFAS and four lipid indicators were assessed at each visit. We used linear mixed models and quantile g-computation models to investigate associations between PFAS and blood lipid levels. A latent class growth model described PFAS serum exposure patterns, and a generalized linear model demonstrated associations between these patterns and lipid variability. Our study found that PFDA was associated with increased TC (β = 0.083, 95% CI: 0.011, 0.155) and HDL-C (β = 0.106, 95% CI: 0.034, 0.178). The PFAS mixture also showed a positive relationship with TC (β = 0.06, 95% CI: 0.02, 0.10), with PFDA contributing most positively. Compared to the low trajectory group, the middle trajectory group for PFDA was associated with VIM of TC (β = 0.756, 95% CI: 0.153, 1.359). Furthermore, PFDA showed biological gradients with lipid metabolism. This is the first repeated-measures study to identify the impact of PFAS serum exposure patterns on lipid metabolism and the first to estimate the association between PFAS and blood lipid levels in middle-aged and elderly Chinese adults; it reinforces the epidemiological evidence for a causal relationship between them.
Image classification algorithms are commonly based on the independent and identically distributed (i.i.d.) assumption, but in practice the out-of-distribution (OOD) problem is widespread: the contexts of images seen at prediction time are usually unseen during training. In this case, existing models trained under the i.i.d. assumption generalise poorly. Causal inference is an important method for learning causal associations that are invariant across different environments, thus improving the generalisation ability of the model. However, existing methods usually require partitioning of the environment to learn invariant features, and these partitions mostly suffer from imbalance problems due to the lack of constraints. In this paper, we propose a balanced causal learning framework (BCL) that addresses both how to divide the dataset in a balanced way and how to balance training after the division. BCL automatically generates fine-grained balanced data partitions in an unsupervised manner and balances the training difficulty of different classes, thereby enhancing the generalisation ability of models in different environments. Experiments on the OOD datasets NICO and NICO++ demonstrate that BCL achieves stable predictions on OOD data. We also find that models using BCL focus more accurately on the foreground of images than the existing causal inference method, which effectively improves the generalisation ability.
This study’s main purpose is to use Bayesian structural time-series models to investigate the causal effect of an earthquake on the Borsa Istanbul Stock Index. The results reveal a significant negative impact on stock market value during the post-treatment period. The results indicate rapid divergence from counterfactual predictions, and the actual stock index is lower than would have been expected in the absence of an earthquake. The curves of the actual stock value and the counterfactual prediction after the earthquake suggest a pattern of reconvergence once the stock market resumes its activities. The cumulative impact is negative in relative terms, as evidenced by a 30% decrease in the BIST-100 index. These results have significant implications for investors and policymakers, emphasizing the need to prepare for natural disasters to minimize their adverse effects on stock market valuations.
Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop explainable artificial intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: “Estimating average treatment effect: A brief review and beyond” from Dr. Kun Kuang, “Attribution problems in counterfactual inference” from Prof. Lian Li, “The Yule–Simpson paradox and the surrogate paradox” from Prof. Zhi Geng, “Causal potential theory” from Prof. Lei Xu, “Discovering causal information from observational data” from Prof. Kun Zhang, “Formal argumentation in causal reasoning and explanation” from Profs. Beishui Liao and Huaxin Huang, “Causal inference with complex experiments” from Prof. Peng Ding, “Instrumental variables and negative controls for observational studies” from Prof. Wang Miao, and “Causal inference with interference” from Dr. Zhichao Jiang.
Regression is a widely used econometric tool in research. In observational studies, based on a number of assumptions, regression-based statistical control methods attempt to analyze the causation between treatment and outcome by adding control variables. However, this approach may not produce reliable estimates of causal effects. In addition to the shortcomings of the method, this lack of confidence is mainly related to ambiguous formulations in econometrics, such as the definition of selection bias, selection of core control variables, and method of testing for robustness. Within the framework of the causal models, we clarify the assumption of causal inference using regression-based statistical controls, as described in econometrics, and discuss how to select core control variables to satisfy this assumption and conduct robustness tests for regression estimates.
Causal inference prevails in the field of laparoscopic surgery. Once the causality between an intervention and outcome is established, the intervention can be applied to a target population to improve clinical outcomes. In many clinical scenarios, interventions are applied longitudinally in response to patients’ conditions. Such longitudinal data comprise static variables, such as age, gender, and comorbidities, and dynamic variables, such as the treatment regime, laboratory variables, and vital signs. Some dynamic variables can act as both the confounder and mediator for the effect of an intervention on the outcome; in such cases, simple adjustment with a conventional regression model will bias the effect sizes. To address this, numerous statistical methods are being developed for causal inference; these include, but are not limited to, the marginal structural Cox regression model, dynamic treatment regime, and Cox regression model with time-varying covariates. This technical note provides a gentle introduction to such models and illustrates their use with an example in the field of laparoscopic surgery.
Statistical approaches for evaluating causal effects and for discovering causal networks are discussed in this paper. A causal relation between two variables is different from an association or correlation between them. A measure of association between two variables may change dramatically from positive to negative when a third variable is omitted, which is called the Yule-Simpson paradox. We shall discuss how to evaluate the causal effect of a treatment or exposure on an outcome so as to avoid the Yule-Simpson paradox. Surrogates and intermediate variables are often used to reduce measurement costs or duration when measurement of endpoint variables is expensive, inconvenient, infeasible, or unobservable in practice. Many criteria for surrogates have been proposed. However, it is possible that, for a surrogate satisfying these criteria, a treatment has a positive effect on the surrogate, which in turn has a positive effect on the outcome, yet the treatment has a negative effect on the outcome; this is called the surrogate paradox. We shall discuss criteria for surrogates that avoid the surrogate paradox.
Causal networks, which describe the causal relationships among a large number of variables, have been applied in many research fields. It is important to discover the structures of causal networks from observed data. We propose a recursive approach for discovering a causal network in which structural learning of a large network is decomposed recursively into learning of small networks. To further discover causal relationships, we present an active learning approach based on external interventions on some variables. When the focus is on the causes of an outcome of interest rather than on the whole network, we propose a local learning approach to discover the causes that affect the outcome.
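The sign reversal behind the Yule-Simpson paradox described above can be checked with a few lines of arithmetic. The sketch below uses the frequently cited kidney-stone treatment counts as illustrative data; they are not taken from this paper, and the variable and function names are assumptions made for the example.

```python
# (treatment, stone size) -> (successes, patients); illustrative counts only.
COUNTS = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}
SIZES = ("small", "large")


def stratum_rate(treatment, size):
    """Success rate within one stone-size stratum."""
    successes, patients = COUNTS[(treatment, size)]
    return successes / patients


def pooled_rate(treatment):
    """Success rate when the stone-size variable is omitted."""
    successes = sum(COUNTS[(treatment, s)][0] for s in SIZES)
    patients = sum(COUNTS[(treatment, s)][1] for s in SIZES)
    return successes / patients


def adjusted_rate(treatment):
    """Stratum rates averaged over the overall stone-size distribution."""
    total = sum(patients for _, patients in COUNTS.values())
    rate = 0.0
    for size in SIZES:
        size_total = sum(COUNTS[(t, size)][1] for t in ("A", "B"))
        rate += (size_total / total) * stratum_rate(treatment, size)
    return rate


for size in SIZES:
    print(size, round(stratum_rate("A", size), 3), round(stratum_rate("B", size), 3))
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
print("adjusted", round(adjusted_rate("A"), 3), round(adjusted_rate("B"), 3))
```

Treatment A has the higher success rate in each stratum, yet B looks better once stone size is omitted, because A was given mostly to the harder large-stone cases. Averaging the stratum rates over a common stone-size distribution restores the ordering seen within strata.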
Deep learning relies on learning from extensive data to generate prediction results. This approach may inadvertently capture spurious correlations within the data, leading to models that lack interpretability and robustness. Researchers have developed more profound and stable causal inference methods based on cognitive neuroscience. By replacing the correlation model with a stable and interpretable causal model, it is possible to mitigate the misleading nature of spurious correlations and overcome the limitations of model calculations. In this survey, we provide a comprehensive and structured review of causal inference methods in deep learning. Brain-like inference ideas are discussed from a brain-inspired perspective, and the basic concepts of causal learning are introduced. The article describes the integration of causal inference with traditional deep learning algorithms and illustrates its application to large model tasks as well as specific modalities in deep learning. The current limitations of causal inference and future research directions are discussed. Moreover, the commonly used benchmark datasets and the corresponding download links are summarized.
This study aims to conduct an in-depth analysis of social media data using causal inference methods to explore the underlying mechanisms driving user behavior patterns. By leveraging large-scale social media datasets, this research develops a systematic analytical framework that integrates techniques such as propensity score matching, regression analysis, and regression discontinuity design to identify the causal effects of content characteristics, user attributes, and social network structures on user interactions, including clicks, shares, comments, and likes. The empirical findings indicate that factors such as sentiment, topical relevance, and network centrality have significant causal impacts on user behavior, with notable differences observed among various user groups. This study not only enriches the theoretical understanding of social media data analysis but also provides data-driven decision support and practical guidance for fields such as digital marketing, public opinion management, and digital governance.
Artificial Intelligence (AI) has revolutionized education by enabling personalized learning experiences through adaptive platforms. However, traditional AI-driven systems primarily rely on correlation-based analytics, limiting their ability to uncover the causal mechanisms behind learning outcomes. This study explores the integration of Knowledge Graphs (KGs) and Causal Inference (CI) as a novel approach to enhance AI-driven educational systems. KGs provide a structured representation of educational knowledge, facilitating intelligent content recommendations and adaptive learning pathways, while CI enables AI systems to move beyond pattern recognition to identify cause-and-effect relationships in student learning. By combining these methods, this research aims to optimize personalized learning path recommendations, improve educational decision-making, and ensure AI-driven interventions are both data-informed and causally validated. Case studies from real-world applications, including intelligent tutoring systems and MOOC platforms, illustrate the practical impact of this approach. The findings contribute to advancing AI-driven education by fostering a balance between knowledge modeling, adaptability, and empirical rigor.
Causal inference plays a crucial role in biomedical studies and social sciences. Over the years, researchers have devised various methods to facilitate causal inference, particularly in observational studies. Among these methods, the doubly robust estimator distinguishes itself through a remarkable feature: it remains consistent when only one of its two components (the propensity score model or the outcome mean model) is correctly specified, rather than requiring both to be correct. In this paper, we focus on scenarios where semiparametric models are employed for both the propensity score and the outcome mean. Semiparametric models combine the interpretability of parametric models with the adaptability of nonparametric models. In this context, correct model specification involves both accurately specifying the unknown function and consistently estimating the unknown parameter. We introduce a novel concept: the relaxed doubly robust estimator. It operates in a manner reminiscent of the traditional doubly robust estimator but with a reduced requirement for double robustness. In essence, it requires only a consistent estimate of the unknown parameter, without requiring correct specification of the unknown function; that is, a partially correct model specification suffices. We conduct a thorough analysis to establish the double robustness and semiparametric efficiency of the proposed estimator. Furthermore, we support our findings with comprehensive simulation studies that illustrate the practical implications of our approach.
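The classical double robustness property described above can be demonstrated on a toy example. The sketch below implements the standard augmented inverse probability weighting (AIPW) form of the doubly robust estimator, not the relaxed estimator proposed in the paper. The counts, the single binary covariate `z`, and every function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical counts: (covariate z, treatment a) -> (group size, events).
TABLE = {
    (0, 1): (20, 4),
    (0, 0): (80, 8),
    (1, 1): (60, 30),
    (1, 0): (40, 16),
}


def build_rows(table):
    """Expand the count table into one (z, a, y) tuple per unit."""
    rows = []
    for (z, a), (n, events) in table.items():
        rows += [(z, a, 1)] * events + [(z, a, 0)] * (n - events)
    return rows


def fit_outcome_model(rows):
    """Correctly specified outcome model: mean of y within each (a, z) cell."""
    total, count = defaultdict(float), defaultdict(int)
    for z, a, y in rows:
        total[(a, z)] += y
        count[(a, z)] += 1
    return lambda a, z: total[(a, z)] / count[(a, z)]


def fit_propensity_model(rows):
    """Correctly specified propensity model: P(a = 1 | z) by stratum."""
    treated, count = defaultdict(int), defaultdict(int)
    for z, a, _ in rows:
        treated[z] += a
        count[z] += 1
    return lambda z: treated[z] / count[z]


def aipw_ate(rows, outcome_model, propensity_model):
    """AIPW estimate of the average treatment effect E[Y(1)] - E[Y(0)]."""
    total = 0.0
    for z, a, y in rows:
        e = propensity_model(z)
        m1, m0 = outcome_model(1, z), outcome_model(0, z)
        mu1 = m1 + a * (y - m1) / e              # augmented term for Y(1)
        mu0 = m0 + (1 - a) * (y - m0) / (1 - e)  # augmented term for Y(0)
        total += mu1 - mu0
    return total / len(rows)


ROWS = build_rows(TABLE)
GOOD_OUTCOME = fit_outcome_model(ROWS)
GOOD_PROPENSITY = fit_propensity_model(ROWS)
OVERALL_MEAN = sum(y for _, _, y in ROWS) / len(ROWS)


def bad_outcome(a, z):
    """Misspecified outcome model: ignores both a and z."""
    return OVERALL_MEAN


def bad_propensity(z):
    """Misspecified propensity model: ignores z."""
    return 0.5


ATE_OUTCOME_ONLY = aipw_ate(ROWS, GOOD_OUTCOME, bad_propensity)
ATE_PROPENSITY_ONLY = aipw_ate(ROWS, bad_outcome, GOOD_PROPENSITY)
ATE_BOTH_WRONG = aipw_ate(ROWS, bad_outcome, bad_propensity)
print(ATE_OUTCOME_ONLY, ATE_PROPENSITY_ONLY, ATE_BOTH_WRONG)
```

With either working model correct, the estimate matches the stratum-standardized effect of 0.10 in this table; with both misspecified it drifts to roughly 0.22, which is the failure mode that double robustness does not protect against.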
This study aims to assess the Average Treatment Effect (ATE) of receiving special education services on revised Item Response Theory (IRT) scaled math achievement test scores. By employing a methodological repertoire comprising linear regression with ordinary least squares (OLS), propensity score matching (PSM), Bayesian Additive Regression Trees (BART), and Multilayer Perceptron (MLP), we examine the impact of these interventions. Leveraging data from the Early Childhood Longitudinal Study Kindergarten 2010-11 cohort (ECLS-K:2011), we systematically analyze the ATE of special education services on students' math achievement. The results show that all models yield negative ATE results, suggesting a deleterious effect of special education services on fifth-grade math scores. Furthermore, we employ Principal Component Analysis (PCA) to corroborate these findings, aligning with outcomes obtained from causal inference and Machine Learning (ML) based methods. This research emphasizes the importance of method diversity in educational research and highlights the need for assessments of intervention effectiveness to help educational practices and policies.
Understanding the characteristics and driving factors behind changes in vegetation ecosystem resilience is crucial for mitigating both current and future impacts of climate change. Despite recent advances in resilience research, significant knowledge gaps remain regarding the drivers of resilience changes. In this study, we investigated the dynamics of ecosystem resilience across China and identified potential driving factors using the kernel normalized difference vegetation index (kNDVI) from 2000 to 2020. Our results indicate that vegetation resilience in China has exhibited an increasing trend over the past two decades, with a notable breakpoint occurring around 2012. We found that precipitation was the dominant driver of changes in ecosystem resilience, accounting for 35.82% of the variation across China, followed by monthly average maximum temperature (Tmax) and vapor pressure deficit (VPD), which explained 28.95% and 28.31% of the variation, respectively. Furthermore, we revealed that daytime and nighttime warming have asymmetric impacts on vegetation resilience, with temperature factors such as Tmin and Tmax becoming more influential, while the importance of precipitation slightly decreases after the resilience change point. Overall, our study highlights the key roles of water availability and temperature in shaping vegetation resilience and underscores the asymmetric effects of daytime and nighttime warming on ecosystem resilience.
The utilization of big Earth data has provided insights into the planet we inhabit in unprecedented dimensions and scales. Unraveling the concealed causal connections within intricate data is of paramount importance for attaining a profound comprehension of the Earth system. Statistical methods founded on correlation have long predominated in Earth system science (ESS). Nevertheless, correlation does not imply causation, especially when confronted with spurious correlations resulting from big data. Consequently, traditional correlation and regression methods are inadequate for addressing causation-related problems in the Earth system. In recent years, propelled by advancements in causal theory and inference methods, particularly the maturity of causal discovery and causal graphical models, causal inference has demonstrated vigorous vitality in various research directions in the Earth system, such as revealing regularities, understanding processes, testing hypotheses, and improving physical models. This paper commences by delving into the origins, connotations, and development of causality, subsequently outlining the principal frameworks of causal inference and the commonly used methods in ESS. Additionally, it reviews the applications of causal inference in the main branches of the Earth system and summarizes the challenges and development directions of causal inference in ESS. In the big Earth data era, causal inference, as an important method of big data analysis, can work alongside physical models and machine learning to assist the transformation of ESS from a model-driven paradigm to one that integrates both mechanism and data. Looking forward, the establishment of a meticulously structured and normalized causal theory can act as a cornerstone for fostering causal cognition in ESS and propel the leap from fragmented research towards a comprehensive understanding of the Earth system.
Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms against adversarial attacks. Most of the existing methods are just stopgaps for specific adversarial samples. The main obstacle is that it remains unclear how adversarial samples fool deep learning models. The underlying working mechanism of adversarial samples has not been well explored, and it is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples. The self-attention/transformer is adopted as a powerful tool in this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, the working mechanism of adversarial samples is revealed, and instructive analysis is provided. Then, we propose simple and effective adversarial sample detection and recognition methods according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments are conducted to demonstrate the effectiveness of the proposed methods. Our methods outperform the state-of-the-art defense methods under various adversarial attacks.
BACKGROUND: Although obstructive sleep apnea hypoventilation syndrome (OSAHS) is one of the most prevalent sleep disorders, information on its immunologic foundation is limited. The immunological underpinnings of certain major psychiatric diseases have been uncovered in recent years thanks to the extensive use of genome-wide association studies (GWAS) and genotyping techniques using high-density genetic markers (e.g., SNPs or CNVs), but this approach has not yet been applied to OSAHS. Using a Mendelian randomization analysis, we analyzed the causal link between immune cells and the illness in order to understand the immunological bases of OSAHS. AIM: To investigate the association of immune cells with OSAHS via genetic methods, guiding future clinical research. METHODS: A comprehensive two-sample Mendelian randomization study was conducted to investigate the causal relationship between immune cell characteristics and OSAHS. Summary statistics for each immune cell feature were obtained from the GWAS catalog. Information on 731 immune cell properties, such as morphologic parameters, median fluorescence intensity, absolute cell counts, and relative cell counts, was compiled using publicly available genetic databases. The robustness, heterogeneity, and horizontal pleiotropy of the results were confirmed using extensive sensitivity analyses. RESULTS: Following false discovery rate (FDR) correction, no statistically significant effect of OSAHS on immunophenotypes was observed. However, two lymphocyte subsets were found to have a significant association with the risk of OSAHS: Basophil %CD33dim HLA DR- CD66b- (OR = 1.03, 95% CI = 1.01-1.03, P < 0.001) and CD38 on IgD+ CD24- B cell (OR = 1.04, 95% CI = 1.02-1.04, P = 0.019). CONCLUSION: This study shows a strong link between immune cells and OSAHS through a genetic approach, thus offering direction for potential future medical research.
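The abstract above does not say which false discovery rate procedure was used. Assuming the common Benjamini-Hochberg method, a minimal sketch of the adjustment looks like the following; the function name and the four p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


EXAMPLE_P = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
EXAMPLE_Q = benjamini_hochberg(EXAMPLE_P)
print(EXAMPLE_Q)
```

An association would then be reported as significant when its adjusted value falls below the chosen FDR threshold, which matters in a screen of 731 immune traits where many raw p-values fall below 0.05 by chance alone.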
Multimodal documents combining language and graphs are widespread in print media as well as in electronic media. One of the most important tasks in comprehending graph-text combinations is the construction of causal chains among the meaning entities provided by the modalities. In this study we focus on how annotation position and the shape of graph lines in simple line graphs affect causal attributions concerning the event presented by the annotation and the processes (i.e., increases and decreases) and states (no changes) in the domain value of the graphs, as presented by process lines and state lines. Based on an experimental investigation of readers' inferences under different conditions, guidelines for the design of multimodal documents including text and statistical information graphics are suggested. One suggestion is that the position and number of verbal annotations should be selected appropriately; another is that graph line smoothing should be done cautiously.
Funding (osteoporosis and rotator cuff tear study): the Scientific Research Innovation Capability Support Project for Young Faculty (ZYGXQNJSKYCXNLZCXM-H8), Fundamental Research Funds for the Central Universities (2024ZYGXZR077), Guangdong Basic and Applied Basic Research Foundation (2023B1515120006), Guangzhou Basic and Applied Basic Research Foundation (2024A04J5776), the Research Fund (2023QN10Y421), and Guangzhou Talent Recruitment Team Program (2024D03J0004), all related to this study.
Funding: financial support from the National Natural Science Foundation of China (Grant No. U24B2029); the Key Projects of the National Natural Science Foundation of China (Grant No. 52334001); the Strategic Cooperation Technology Projects of CNPC and CUPB (Grant No. ZLZX2020-02); the China University of Petroleum, Beijing (Grant No. ZX20230042).
Abstract: Offshore drilling costs are high, and the downhole environment is even more complex. Improving the rate of penetration (ROP) can effectively shorten offshore drilling cycles and improve economic benefits. Current ROP models struggle to guarantee prediction accuracy and robustness at the same time. To address this, a new ROP prediction model was developed in this study, which treats ROP as a time series signal (ROP signal). The model is based on the temporal convolutional network (TCN) framework and integrates ensemble empirical mode decomposition (EEMD) and Bayesian network causal inference (BN); it is named EEMD-BN-TCN. Within the proposed model, the EEMD decomposes the original ROP signal into multiple sets of sub-signals. The BN determines the causal relationship between the sub-signals and the key physical parameters (weight on bit and revolutions per minute) and carries out a preliminary reconstruction of the sub-signals based on that causal relationship. The TCN predicts the signals reconstructed by the BN. When the model was applied to an actual production well, the average absolute percentage error of the prediction decreased from 18.4% with TCN to 9.2% with EEMD-BN-TCN. In addition, compared with other models, the EEMD-BN-TCN can improve the decomposition signal of ROP by regulating weight on bit and revolutions per minute, ultimately enhancing ROP.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2020YFB1711704.
Abstract: The goal of point cloud completion is to reconstruct raw scanned point clouds acquired from incomplete observations due to occlusion and restricted viewpoints. Numerous methods use a partial-to-complete framework, directly predicting missing components via global characteristics extracted from incomplete inputs. However, this makes detail recovery challenging, as global characteristics fail to provide complete specifics of the missing components. A new point cloud completion method named Point-PC is proposed. A memory network and a causal inference model are separately designed to introduce shape priors and select absent shape information as supplementary geometric factors for aiding completion. Concretely, a memory mechanism is proposed to store complete shape features and their associated shapes in a key-value format. The authors design a pre-training strategy that uses contrastive learning to map incomplete shape features into the complete shape feature domain, enabling retrieval of analogous shapes from incomplete inputs. In addition, the authors employ backdoor adjustment to eliminate confounders, which are shape prior components sharing identical semantic structures with incomplete inputs. Experiments conducted on three datasets show that the method achieves superior performance compared to state-of-the-art approaches. The code for Point-PC is available at https://github.com/bizbard/Point-PC.git.
Funding: supported by the National Natural Science Foundation of China (No. 82404365); the Noncommunicable Chronic Diseases-National Science and Technology Major Project (No. 2023ZD0513200); China Medical Board (No. 15-230); China Postdoctoral Science Foundation (Nos. 2023M730317 and 2023T160066); the Fundamental Research Funds for the Central Universities (No. 3332023042); the Open Project of Hebei Key Laboratory of Environment and Human Health (No. 202301); the National Key Research and Development Program of China (No. 2022YFC3703000); the Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences (No. 2022-JKCS-11); the CAMS Innovation Fund for Medical Sciences (No. 2022-I2M-JB-003); the Programs of the National Natural Science Foundation of China (No. 21976050).
Abstract: Associations of per- and polyfluoroalkyl substances (PFAS) with lipid metabolism have been documented, but research on the effect of PFAS on lipid variability remains scarce. A deeper understanding of their relationship requires a step forward in causal inference. To address this, we conducted a longitudinal study with three repeated measurements involving 201 participants in Beijing, of whom 100 eligible participants were included in the present study. Twenty-three PFAS and four lipid indicators were assessed at each visit. We used linear mixed models and quantile g-computation models to investigate associations between PFAS and blood lipid levels. A latent class growth model described PFAS serum exposure patterns, and a generalized linear model demonstrated associations between these patterns and lipid variability. Our study found that PFDA was associated with increased TC (β=0.083, 95% CI: 0.011, 0.155) and HDL-C (β=0.106, 95% CI: 0.034, 0.178). The PFAS mixture also showed a positive relationship with TC (β=0.06, 95% CI: 0.02, 0.10), with PFDA contributing most positively. Compared to the low trajectory group, the middle trajectory group for PFDA was associated with VIM of TC (β=0.756, 95% CI: 0.153, 1.359). Furthermore, PFDA showed biological gradients with lipid metabolism. This is the first repeated-measures study to identify the impact of PFAS serum exposure patterns on lipid metabolism and the first to estimate the association between PFAS and blood lipid levels in middle-aged and elderly Chinese; it reinforces the evidence of their causal relationship through epidemiological studies.
Funding: TaiShan Scholars Program (Grant No. tsqn202211289); National Key R&D Program of China (Grant No. 2021YFC3300203); Oversea Innovation Team Project of the “20 Regulations for New Universities” funding program of Jinan (Grant No. 2021GXRC073); and the Excellent Youth Scholars Program of Shandong Province (Grant No. 2022HWYQ-048).
Abstract: Image classification algorithms are commonly based on the independent and identically distributed (i.i.d.) assumption, but in practice the Out-Of-Distribution (OOD) problem is widespread: the contexts of images at prediction time are usually unseen during training. In this case, existing models trained under the i.i.d. assumption generalise poorly. Causal inference is an important method for learning causal associations that are invariant across different environments, thus improving the generalisation ability of the model. However, existing methods usually require partitioning of the environment to learn invariant features, and these partitions mostly have imbalance problems due to the lack of constraints. In this paper, we propose a balanced causal learning framework (BCL) that addresses both how to divide the dataset in a balanced way and how to balance training after the division. It automatically generates fine-grained balanced data partitions in an unsupervised manner and balances the training difficulty of different classes, thereby enhancing the generalisation ability of models in different environments. Experiments on the OOD datasets NICO and NICO++ demonstrate that BCL achieves stable predictions on OOD data. We also find that models using BCL focus more accurately on the foreground of images than the existing causal inference method, which effectively improves the generalisation ability.
Abstract: This study’s main purpose is to use Bayesian structural time-series models to investigate the causal effect of an earthquake on the Borsa Istanbul Stock Index. The results reveal a significant negative impact on stock market value during the post-treatment period. The results indicate rapid divergence from counterfactual predictions, and the actual stock index is lower than would have been expected in the absence of an earthquake. The curves of the actual stock value and the counterfactual prediction after the earthquake suggest a reconvening pattern when the stock market resumes its activities. The cumulative impact is negative in relative terms, as evidenced by a 30% decrease in the BIST-100 index. These results have significant implications for investors and policymakers, emphasizing the need to prepare for natural disasters to minimize their adverse effects on stock market valuations.
Abstract: Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop explainable artificial intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: “Estimating average treatment effect: A brief review and beyond” from Dr. Kun Kuang, “Attribution problems in counterfactual inference” from Prof. Lian Li, “The Yule–Simpson paradox and the surrogate paradox” from Prof. Zhi Geng, “Causal potential theory” from Prof. Lei Xu, “Discovering causal information from observational data” from Prof. Kun Zhang, “Formal argumentation in causal reasoning and explanation” from Profs. Beishui Liao and Huaxin Huang, “Causal inference with complex experiments” from Prof. Peng Ding, “Instrumental variables and negative controls for observational studies” from Prof. Wang Miao, and “Causal inference with interference” from Dr. Zhichao Jiang.
Funding: This research was funded by the National Natural Science Foundation of China (Grant No. 72074060).
Abstract: Regression is a widely used econometric tool in research. In observational studies, based on a number of assumptions, regression-based statistical control methods attempt to analyze the causation between treatment and outcome by adding control variables. However, this approach may not produce reliable estimates of causal effects. In addition to the shortcomings of the method, this lack of confidence is mainly related to ambiguous formulations in econometrics, such as the definition of selection bias, the selection of core control variables, and the method of testing for robustness. Within the framework of causal models, we clarify the assumption of causal inference using regression-based statistical controls, as described in econometrics, and discuss how to select core control variables to satisfy this assumption and conduct robustness tests for regression estimates.
Funding: the National Natural Science Foundation of China (82272180); Open Foundation of Key Laboratory of Digital Technology in Medical Diagnostics of Zhejiang Province (SZZD202206); the Sichuan Medical Association Scientific Research Project (S21019); the Key Research and Development Project of Zhejiang Province (2021C03071); Zhejiang Medical and Health Science and Technology Project (2017ZD001).
Abstract: Causal inference prevails in the field of laparoscopic surgery. Once the causality between an intervention and outcome is established, the intervention can be applied to a target population to improve clinical outcomes. In many clinical scenarios, interventions are applied longitudinally in response to patients’ conditions. Such longitudinal data comprise static variables, such as age, gender, and comorbidities, and dynamic variables, such as the treatment regime, laboratory variables, and vital signs. Some dynamic variables can act as both confounder and mediator for the effect of an intervention on the outcome; in such cases, simple adjustment with a conventional regression model will bias the effect sizes. To address this, numerous statistical methods are being developed for causal inference; these include, but are not limited to, the marginal structural Cox regression model, dynamic treatment regime, and Cox regression model with time-varying covariates. This technical note provides a gentle introduction to such models and illustrates their use with an example in the field of laparoscopic surgery.
Abstract: Statistical approaches for evaluating causal effects and for discovering causal networks are discussed in this paper. A causal relation between two variables is different from an association or correlation between them. An association measurement between two variables may change dramatically from positive to negative when a third variable is omitted, which is called the Yule-Simpson paradox. We discuss how to evaluate the causal effect of a treatment or exposure on an outcome so as to avoid the Yule-Simpson paradox. Surrogates and intermediate variables are often used to reduce measurement costs or duration when measurement of endpoint variables is expensive, inconvenient, infeasible or unobservable in practice. There have been many criteria for surrogates. However, it is possible that for a surrogate satisfying these criteria, a treatment has a positive effect on the surrogate, which in turn has a positive effect on the outcome, but the treatment has a negative effect on the outcome, which is called the surrogate paradox. We discuss criteria for surrogates that avoid the surrogate paradox. Causal networks, which describe the causal relationships among a large number of variables, have been applied to many research fields. It is important to discover structures of causal networks from observed data. We propose a recursive approach for discovering a causal network in which structural learning of a large network is decomposed recursively into learning of small networks. To further discover causal relationships, we present an active learning approach in terms of external interventions on some variables. When the focus is on the causes of an outcome of interest, instead of discovering a whole network, we propose a local learning approach to discover the causes that affect the outcome.
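The sign reversal described in the abstract above can be illustrated with a small numeric sketch. The counts, the treatment labels `A` and `B`, and the stratum names are assumptions chosen for illustration only, not data from the paper; the adjustment shown is plain standardization over the third variable, which is one common way to avoid the paradox and not necessarily the authors' method.

```python
from fractions import Fraction

# Illustrative (assumed) counts: (recovered, total) per treatment and stratum.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}
STRATA = ("small", "large")


def stratum_rate(treatment, stratum):
    """Recovery rate within one stratum of the third variable."""
    recovered, total = counts[treatment][stratum]
    return Fraction(recovered, total)


def crude_rate(treatment):
    """Recovery rate when the third variable is omitted (strata pooled)."""
    recovered = sum(counts[treatment][s][0] for s in STRATA)
    total = sum(counts[treatment][s][1] for s in STRATA)
    return Fraction(recovered, total)


def adjusted_rate(treatment):
    """Standardized rate: stratum rates weighted by the overall stratum shares."""
    sizes = {s: sum(counts[t][s][1] for t in counts) for s in STRATA}
    n = sum(sizes.values())
    return sum(Fraction(sizes[s], n) * stratum_rate(treatment, s) for s in STRATA)


crude_diff = crude_rate("A") - crude_rate("B")           # pooled comparison
adjusted_diff = adjusted_rate("A") - adjusted_rate("B")  # stratified comparison

print("crude A - B:   ", float(crude_diff))
print("adjusted A - B:", float(adjusted_diff))
```

In this sketch treatment `A` does better within each stratum, yet looks worse once the strata are pooled, because `A` was given mostly to the harder stratum. Standardizing over the stratum distribution restores the within-stratum direction.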
Funding: supported in part by the Key Scientific Technological Innovation Research Project of the Ministry of Education; the Joint Funds of the National Natural Science Foundation of China (U22B2054); the National Natural Science Foundation of China (62076192, 61902298, 61573267, 61906150, and 62276199); the 111 Project; the Program for Cheung Kong Scholars and Innovative Research Team in University (IRT 15R53); the Science and Technology Innovation Project from the Chinese Ministry of Education; the Key Research and Development Program in Shaanxi Province of China (2019ZDLGY03-06); the China Postdoctoral Fund (2022T150506).
Abstract: Deep learning relies on learning from extensive data to generate prediction results. This approach may inadvertently capture spurious correlations within the data, leading to models that lack interpretability and robustness. Researchers have developed more profound and stable causal inference methods based on cognitive neuroscience. By replacing the correlation model with a stable and interpretable causal model, it is possible to mitigate the misleading nature of spurious correlations and overcome the limitations of model calculations. In this survey, we provide a comprehensive and structured review of causal inference methods in deep learning. Brain-like inference ideas are discussed from a brain-inspired perspective, and the basic concepts of causal learning are introduced. The article describes the integration of causal inference with traditional deep learning algorithms and illustrates its application to large model tasks as well as specific modalities in deep learning. The current limitations of causal inference and future research directions are discussed. Moreover, the commonly used benchmark datasets and the corresponding download links are summarized.
Abstract: This study aims to conduct an in-depth analysis of social media data using causal inference methods to explore the underlying mechanisms driving user behavior patterns. By leveraging large-scale social media datasets, this research develops a systematic analytical framework that integrates techniques such as propensity score matching, regression analysis, and regression discontinuity design to identify the causal effects of content characteristics, user attributes, and social network structures on user interactions, including clicks, shares, comments, and likes. The empirical findings indicate that factors such as sentiment, topical relevance, and network centrality have significant causal impacts on user behavior, with notable differences observed among various user groups. This study not only enriches the theoretical understanding of social media data analysis but also provides data-driven decision support and practical guidance for fields such as digital marketing, public opinion management, and digital governance.
Abstract: Artificial Intelligence (AI) has revolutionized education by enabling personalized learning experiences through adaptive platforms. However, traditional AI-driven systems primarily rely on correlation-based analytics, limiting their ability to uncover the causal mechanisms behind learning outcomes. This study explores the integration of Knowledge Graphs (KGs) and Causal Inference (CI) as a novel approach to enhance AI-driven educational systems. KGs provide a structured representation of educational knowledge, facilitating intelligent content recommendations and adaptive learning pathways, while CI enables AI systems to move beyond pattern recognition to identify cause-and-effect relationships in student learning. By combining these methods, this research aims to optimize personalized learning path recommendations, improve educational decision-making, and ensure AI-driven interventions are both data-informed and causally validated. Case studies from real-world applications, including intelligent tutoring systems and MOOC platforms, illustrate the practical impact of this approach. The findings contribute to advancing AI-driven education by fostering a balance between knowledge modeling, adaptability, and empirical rigor.
Funding: supported in part by the US National Science Foundation [Grant Numbers DMS 1953526, 2122074, 2310942]; the US National Institutes of Health [Grant Number R01DC021431]; the American Family Funding Initiative of UW-Madison.
Abstract: Causal inference plays a crucial role in biomedical studies and social sciences. Over the years, researchers have devised various methods to facilitate causal inference, particularly in observational studies. Among these methods, the doubly robust estimator distinguishes itself through a remarkable feature: it remains consistent even when only one of its two components (the propensity score model or the outcome mean model) is correctly specified, rather than requiring both to be correct simultaneously. In this paper, we focus on scenarios where semiparametric models are employed for both the propensity score and the outcome mean. Semiparametric models offer a valuable blend of the interpretability of parametric models and the adaptability of nonparametric models. In this context, achieving correct model specification involves both accurately specifying the unknown function and consistently estimating the unknown parameter. We introduce a novel concept: the relaxed doubly robust estimator. It operates in a manner reminiscent of the traditional doubly robust estimator but with a reduced requirement for double robustness. In essence, it only requires a consistent estimate of the unknown parameter, without requiring the correct specification of the unknown function. This means that it needs only a partially correct model specification. We conduct a thorough analysis to establish the double robustness and semiparametric efficiency of our proposed estimator. Furthermore, we support our findings with comprehensive simulation studies to illustrate the practical implications of our approach.
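The double-robustness property mentioned in the abstract above can be sketched with the standard augmented inverse probability weighting (AIPW) estimator. This is the traditional doubly robust estimator, not the relaxed estimator the paper proposes; the data-generating process, the true effect of 1.0, and all function names below are assumptions made up for the illustration.

```python
import random


def simulate(n, seed=0):
    """Assumed toy data: binary confounder x, treatment t, outcome y (true ATE = 1.0)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        e = 0.8 if x else 0.2                 # true propensity depends on x
        t = 1 if rng.random() < e else 0
        y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def aipw_ate(rows, propensity, outcome):
    """AIPW estimate of the average treatment effect.

    propensity(x) models P(T=1 | x); outcome(x, t) models E[Y | x, t].
    """
    total = 0.0
    for x, t, y in rows:
        e = propensity(x)
        m1, m0 = outcome(x, 1), outcome(x, 0)
        total += m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return total / len(rows)


def true_ps(x):
    return 0.8 if x else 0.2


def wrong_ps(x):
    return 0.5  # ignores the confounder


def true_om(x, t):
    return 1.0 * t + 2.0 * x


def wrong_om(x, t):
    return 0.0  # ignores both treatment and confounder


rows = simulate(20000)
ate_ps_only = aipw_ate(rows, true_ps, wrong_om)   # only the propensity model is right
ate_om_only = aipw_ate(rows, wrong_ps, true_om)   # only the outcome model is right
ate_neither = aipw_ate(rows, wrong_ps, wrong_om)  # both models are wrong

print(ate_ps_only, ate_om_only, ate_neither)
```

Under these assumptions the first two estimates should land near the true effect of 1.0, while the third is visibly biased, which is the behaviour the term "doubly robust" refers to.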
Abstract: This study aims to assess the Average Treatment Effect (ATE) of receiving special education services on revised Item Response Theory (IRT) scaled math achievement test scores. By employing a methodological repertoire comprising linear regression with ordinary least squares (OLS), propensity score matching (PSM), Bayesian Additive Regression Trees (BART), and Multilayer Perceptron (MLP), we examine the impact of these interventions. Leveraging data from the Early Childhood Longitudinal Study Kindergarten 2010-11 cohort (ECLS-K:2011), we systematically analyze the ATE of special education services on students' math achievement. The results show that all models yield negative ATE results, suggesting a deleterious effect of special education services on fifth-grade math scores. Furthermore, we employ Principal Component Analysis (PCA) to corroborate these findings, aligning with outcomes obtained from causal inference and Machine Learning (ML) based methods. This research emphasizes the importance of method diversity in educational research and highlights the need for assessments of intervention effectiveness to help educational practices and policies.
Funding: National Key Research and Development Program, No. 2021xjkk0303.
Abstract: Understanding the characteristics and driving factors behind changes in vegetation ecosystem resilience is crucial for mitigating both current and future impacts of climate change. Despite recent advances in resilience research, significant knowledge gaps remain regarding the drivers of resilience changes. In this study, we investigated the dynamics of ecosystem resilience across China and identified potential driving factors using the kernel normalized difference vegetation index (kNDVI) from 2000 to 2020. Our results indicate that vegetation resilience in China has exhibited an increasing trend over the past two decades, with a notable breakpoint occurring around 2012. We found that precipitation was the dominant driver of changes in ecosystem resilience, accounting for 35.82% of the variation across China, followed by monthly average maximum temperature (Tmax) and vapor pressure deficit (VPD), which explained 28.95% and 28.31% of the variation, respectively. Furthermore, we revealed that daytime and nighttime warming has asymmetric impacts on vegetation resilience, with temperature factors such as Tmin and Tmax becoming more influential, while the importance of precipitation slightly decreases after the resilience change point. Overall, our study highlights the key roles of water availability and temperature in shaping vegetation resilience and underscores the asymmetric effects of daytime and nighttime warming on ecosystem resilience.
Funding: supported by the Basic Science Center for Tibetan Plateau Earth System (BCTPES, NSFC project Grant No. 41988101); the National Natural Science Foundation of China (Grant No. 42101397).
Abstract: The utilization of big Earth data has provided insights into the planet we inhabit in unprecedented dimensions and scales. Unraveling the concealed causal connections within intricate data is of paramount importance for attaining a profound comprehension of the Earth system. Statistical methods founded on correlation have long predominated in Earth system science (ESS). Nevertheless, correlation does not imply causation, especially when confronted with spurious correlations resulting from big data. Consequently, traditional correlation and regression methods are inadequate for addressing causation-related problems in the Earth system. In recent years, propelled by advancements in causal theory and inference methods, particularly the maturity of causal discovery and causal graphical models, causal inference has demonstrated vigorous vitality in various research directions in the Earth system, such as revealing regularities, understanding processes, testing hypotheses, and improving physical models. This paper begins by examining the origins, connotations, and development of causality, and then outlines the principal frameworks of causal inference and the commonly used methods in ESS. Additionally, it reviews the applications of causal inference in the main branches of the Earth system and summarizes the challenges and development directions of causal inference in ESS. In the big Earth data era, as an important method of big data analysis, causal inference, along with physical models and machine learning, can assist the transformation of ESS from a model-driven paradigm to a paradigm that integrates both mechanism and data. Looking forward, the establishment of a meticulously structured and normalized causal theory can act as a foundational cornerstone for fostering causal cognition in ESS and propel the leap from fragmented research towards a comprehensive understanding of the Earth system.
Funding: supported by the National Key Research and Development Program of China (No. 2020AAA0140002); the Natural Science Foundation of China (Nos. U1836217, 62076240, 62006225, 61906199, 62071468, 62176025 and U21B200389); the CAAI-Huawei MindSpore Open Fund.
Abstract: Deep learning-based models are vulnerable to adversarial attacks. Defense against adversarial attacks is essential for sensitive and safety-critical scenarios. However, deep learning methods still lack effective and efficient defense mechanisms against adversarial attacks. Most of the existing methods are just stopgaps for specific adversarial samples. The main obstacle is that it is still unclear how adversarial samples fool deep learning models. The underlying working mechanism of adversarial samples has not been well explored, and this is the bottleneck of adversarial attack defense. In this paper, we build a causal model to interpret the generation and performance of adversarial samples. The self-attention/transformer is adopted as a powerful tool in this causal model. Compared to existing methods, causality enables us to analyze adversarial samples more naturally and intrinsically. Based on this causal model, the working mechanism of adversarial samples is revealed, and instructive analysis is provided. Then, we propose simple and effective adversarial sample detection and recognition methods according to the revealed working mechanism. The causal insights enable us to detect and recognize adversarial samples without any extra model or training. Extensive experiments are conducted to demonstrate the effectiveness of the proposed methods. Our methods outperform the state-of-the-art defense methods under various adversarial attacks.
Funding: Supported by the Doctoral Research Fund Project of Henan Provincial Hospital of Traditional Chinese Medicine, No. 2022BSJJ10.
Abstract: BACKGROUND: Although obstructive sleep apnea hypopnea syndrome (OSAHS) is one of the most prevalent sleep disorders, information on its immunologic foundation is limited. The immunological underpinnings of certain major psychiatric diseases have been uncovered in recent years thanks to the extensive use of genome-wide association studies (GWAS) and genotyping techniques using high-density genetic markers (e.g., SNPs or CNVs). However, this approach has not yet been applied to OSAHS. Using a Mendelian randomization analysis, we analyzed the causal link between immune cells and the illness in order to understand the immunological bases of OSAHS. AIM: To investigate the association of immune cells with OSAHS via genetic methods, guiding future clinical research. METHODS: A comprehensive two-sample Mendelian randomization study was conducted to investigate the causal relationship between immune cell characteristics and OSAHS. Summary statistics for each immune cell feature were obtained from the GWAS catalog. Information on 731 immune cell properties, such as morphologic parameters, median fluorescence intensity, absolute cell counts, and relative cell counts, was compiled using publicly available genetic databases. The robustness, heterogeneity, and horizontal pleiotropy of the results were assessed using extensive sensitivity analyses. RESULTS: Following false discovery rate (FDR) correction, no statistically significant effect of OSAHS on immunophenotypes was observed. However, two lymphocyte subsets were found to have a significant association with the risk of OSAHS: Basophil %CD33dim HLA DR- CD66b- (OR=1.03, 95% CI=1.01-1.03, P<0.001); CD38 on IgD+ CD24- B cell (OR=1.04, 95% CI=1.02-1.04, P=0.019). CONCLUSION: This study shows a strong link between immune cells and OSAHS through a genetic approach, thus offering direction for potential future medical research.
Funding: Supported in part by DFG (German Science Foundation) in ITRG1247 ‘Cross-modal Interaction in Natural and Artificial Cognitive Systems’ (CINACS)
Abstract: Multimodal documents combining language and graphs are widespread in print media as well as in electronic media. One of the most important tasks in comprehending graph-text combinations is the construction of causal chains among the meaning entities provided by the modalities. In this study we focus on how annotation position and the shape of graph lines in simple line graphs affect causal attributions between the event presented by the annotation and the processes (i.e., increases and decreases) and states (no-changes) in the domain value of the graphs, presented by the process-lines and state-lines. Based on an experimental investigation of readers' inferences under different conditions, guidelines for the design of multimodal documents that include text and statistical information graphics are suggested. One suggestion is that the position and the number of verbal annotations should be selected appropriately; another is that graph line smoothing should be done cautiously.