[Literature Summary] Causal Discovery and Inference: concepts and recent methodological advances

2023. 3. 28. 09:10 · Data science

Ref1. Causal Discovery and Inference: concepts and recent methodological advances

Notes

[What to expect from this paper]

  • Provides an overview of automated causal inference and emerging approaches to causal discovery from i.i.d. data and from time series
  • Reviews fundamental concepts such as manipulations, causal models, sample predictive modelling, causal predictive modelling, and structural equation models
  • Discusses the constraint-based approach to causal discovery, which relies on conditional independence relationships in the data, and the assumptions underlying its validity
  • Focuses on causal discovery based on structural equation models and discusses the identifiability of the causal structure implied by appropriately defined structural equation models
  • Shows that independence between the error term and causes, together with appropriate structural constraints on the structural equation, makes it possible to identify the causal direction between two variables in the two-variable case
  • Discusses recent advances in causal discovery from time series, mentioning traditionally challenging problems such as causal discovery from subsampled data and in the presence of confounding time series
  • Lists several open questions in the field of causal discovery and inference
  • Serves as a reference for researchers who are new to these topics
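
The two-variable point above (independent error term plus a structural constraint identifies the direction) can be sketched numerically. This is a minimal illustration, not the paper's procedure: it assumes a linear model with uniform (non-Gaussian) noise, and the helper `residual_dependence` is a hypothetical, crude dependence score (correlation of squares) standing in for a proper independence test.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Regress `effect` on `cause` by least squares, then score how strongly
    the squared residuals depend on the squared (centered) regressor.
    A value near 0 is consistent with independent noise."""
    c = cause - cause.mean()
    e = effect - effect.mean()
    slope = (c @ e) / (c @ c)
    resid = e - slope * c
    return np.corrcoef(resid**2, c**2)[0, 1]

rng = np.random.default_rng(0)
n = 20_000
# Assumed ground truth for the illustration: X causes Y, noise is uniform.
x = rng.uniform(-1, 1, n)
y = x + rng.uniform(-1, 1, n)

forward = residual_dependence(x, y)   # fit Y = b*X + E
backward = residual_dependence(y, x)  # fit X = b*Y + E

print(f"X -> Y residual dependence: {forward:+.3f}")
print(f"Y -> X residual dependence: {backward:+.3f}")
```

In the true direction the score stays close to zero, while in the reverse direction the residual is uncorrelated with the regressor but clearly not independent of it. With Gaussian noise both scores would be close to zero, which is why the non-Gaussianity assumption matters.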

Key takeaways

  • Keywords: Causal inference, Causal discovery, Structural equation model, Conditional independence, Statistical independence, Identifiability
  1. Manipulation: setting one or more variables in a system from outside (an intervention) and observing the effect on the other variables; causal claims are defined relative to such manipulations.
  2. Causal models: mathematical models that describe the causal relationships between variables in a system and therefore what happens when a variable is manipulated.
  3. Sample predictive modelling: predicting the value of a variable for units drawn from the same, unmanipulated population as the sample. Associations are enough; no causal knowledge is needed.
  4. Causal predictive modelling: predicting the value of a variable after a manipulation, typically from observational data. This requires causal knowledge.
  5. Structural equation models: models in which each variable is written as a function of its direct causes and an error term.
  6. Constraint-based approach: a method of causal discovery that uses the conditional independence relationships found in the data to constrain the set of possible causal graphs.
  7. Causal discovery based on structural equation models: a method of causal discovery that restricts the functional form and noise distribution of the structural equations (for example, requiring the error term to be independent of the causes) so that the causal structure becomes identifiable.
  8. Linear causal models with non-Gaussian noise: a class of models with linear causal relationships and non-Gaussian error terms, under which the causal direction becomes identifiable.
  9. Causal discovery from time series: causal discovery that uses the temporal ordering and dependence in time series data to identify causal relationships.
  10. Causal discovery from subsampled data: causal discovery when the time series is measured at a lower frequency than the one at which the causal interactions take place.
  11. Confounding time series: a problem in causal discovery from time series where unobserved series affect both the apparent cause and the apparent effect.
  12. Counterfactual analysis: reasoning about what the outcome would have been for the same unit had the treatment been different; comparing a treatment group with a control group estimates the average of such effects.
  13. Bayesian networks: a probabilistic graphical model that represents the conditional independence structure among variables as a directed acyclic graph; when the edges are read as direct causal relations it is called a causal Bayesian network.
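
To make the subsampling problem concrete, here is a small sketch under stated assumptions: a first-order vector autoregression with Gaussian noise and a hypothetical transition matrix `A` describing a chain x1 → x2 → x3. The helper names `simulate_var1` and `fit_var1` are made up for this example. It only shows how subsampling distorts the estimated structure; it is not a method for recovering the true one.

```python
import numpy as np

def simulate_var1(A, n_steps, rng):
    """Simulate x[t] = A @ x[t-1] + noise with standard normal noise."""
    d = A.shape[0]
    x = np.zeros((n_steps, d))
    for t in range(1, n_steps):
        x[t] = A @ x[t - 1] + rng.standard_normal(d)
    return x

def fit_var1(x):
    """Least-squares estimate of the lag-1 transition matrix."""
    past, future = x[:-1], x[1:]
    # Solves future ≈ past @ B, where B is the transpose of the transition matrix.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T

rng = np.random.default_rng(0)
# Assumed causal-frequency dynamics: x1 -> x2 -> x3, no direct x1 -> x3 edge.
A = np.array([[0.5, 0.0, 0.0],
              [0.6, 0.5, 0.0],
              [0.0, 0.6, 0.5]])

x = simulate_var1(A, 50_000, rng)
A_full = fit_var1(x)       # every time step observed
A_sub = fit_var1(x[::2])   # only every second time step observed

print("x1 -> x3 coefficient, full data:      ", round(A_full[2, 0], 3))
print("x1 -> x3 coefficient, subsampled data:", round(A_sub[2, 0], 3))
```

With every second sample dropped, the fitted lag-1 matrix approximates `A @ A` rather than `A`, so a spurious direct x1 → x3 coefficient (about 0.6 × 0.6 = 0.36) appears even though no such edge exists at the true causal frequency.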

Summary

Causal discovery and causal inference are related but distinct tasks in statistics and machine learning. Causal discovery identifies the causal relationships between variables in a system, usually from observational data, while causal inference uses those relationships to predict the effects of manipulations or to draw other conclusions about the system. Recent methodological advances have improved our ability to do both in complex systems, using techniques such as Bayesian networks, structural equation modelling, and counterfactual analysis.

Bayesian networks are a popular representation for causal discovery, allowing researchers to model the relationships between variables in a probabilistic framework. They can be used to infer causal relationships from observational data (under additional assumptions), and to make predictions about the behaviour of the system under different conditions. Recent advances in Bayesian network modelling include efficient algorithms for learning the network's structure from data and the integration of causal discovery with other statistical techniques such as clustering and regression.
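
Structure learning of this kind often rests on conditional independence tests. As a minimal sketch, assume a linear-Gaussian chain X → Y → Z (the coefficients are made up) and use partial correlation as the test statistic; `partial_corr` is a hypothetical helper written for this note, not a library function.

```python
import numpy as np

def partial_corr(a, b, given):
    """Correlation between a and b after linearly regressing out `given`."""
    G = np.column_stack([np.ones_like(given), given])
    res_a = a - G @ np.linalg.lstsq(G, a, rcond=None)[0]
    res_b = b - G @ np.linalg.lstsq(G, b, rcond=None)[0]
    return np.corrcoef(res_a, res_b)[0, 1]

rng = np.random.default_rng(0)
n = 20_000
# Assumed ground truth: X -> Y -> Z.
x = rng.standard_normal(n)
y = 0.8 * x + rng.standard_normal(n)
z = 0.8 * y + rng.standard_normal(n)

marginal = np.corrcoef(x, z)[0, 1]   # X and Z are dependent
conditional = partial_corr(x, z, y)  # but independent given Y

print(f"corr(X, Z)     = {marginal:.3f}")
print(f"corr(X, Z | Y) = {conditional:.3f}")
```

X and Z are clearly correlated, yet the partial correlation given Y is close to zero, so a constraint-based method would drop the X–Z edge. The same pattern arises from the fork X ← Y → Z, which is why conditional independence alone typically identifies a set of equivalent graphs rather than a single one.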

Structural equation modelling (SEM) is another approach to causal inference, in which the relationships between variables are modelled in terms of their underlying causal mechanisms. SEM allows researchers to test hypotheses about the causal relationships between variables and to estimate the strength and direction of these relationships. Recent advances in SEM include new methods for dealing with missing data and the integration of SEM with machine learning techniques such as deep neural networks.
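
A toy example of estimating the strength of a relationship in a linear SEM, assuming the structure is already known. The model (a common cause Z of X and Y, and all coefficients) is invented for illustration, and `ols` is a small hypothetical helper around NumPy's least-squares solver.

```python
import numpy as np

def ols(target, *predictors):
    """Least-squares slopes of `target` on the predictors (intercept dropped)."""
    design = np.column_stack([np.ones(len(target)), *predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 20_000
# Assumed structural equations:
#   Z = e_z
#   X = 1.0 * Z + e_x
#   Y = 2.0 * X + 3.0 * Z + e_y
z = rng.standard_normal(n)
x = 1.0 * z + rng.standard_normal(n)
y = 2.0 * x + 3.0 * z + rng.standard_normal(n)

naive = ols(y, x)[0]        # ignores the common cause Z
adjusted = ols(y, x, z)[0]  # matches the structural equation for Y

print(f"Y on X alone:  {naive:.2f}")
print(f"Y on X and Z:  {adjusted:.2f}")
```

Regressing Y on X alone gives a slope near 3.5, because it absorbs the path through Z, while the regression that mirrors the structural equation recovers the coefficient 2.0. The structure has to come from somewhere else (prior knowledge or causal discovery); the regression itself cannot tell which specification is the causal one.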

Counterfactual analysis is a third approach to causal inference, which models the system's behaviour under hypothetical scenarios. It allows researchers to estimate the causal effects of interventions by comparing the system's behaviour with and without the intervention. Recent advances in counterfactual analysis include new methods for estimating causal effects from observational data and the integration of counterfactual analysis with estimation techniques such as causal forests and propensity score matching.
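
For a single unit, a counterfactual can be computed from a structural equation with the standard three-step recipe usually attributed to Pearl (abduction, action, prediction). The sketch below assumes a known linear equation Y = beta · X + U with a made-up `beta`; the function name `counterfactual_y` is hypothetical.

```python
def counterfactual_y(x_obs, y_obs, x_new, beta=2.0):
    """Counterfactual outcome under the assumed model Y = beta * X + U."""
    u = y_obs - beta * x_obs   # abduction: recover this unit's noise term
    return beta * x_new + u    # action (set X to x_new) and prediction

# A unit observed with X = 1.0 and Y = 3.5: what would Y have been with X = 0.0?
y_factual = 3.5
y_cf = counterfactual_y(x_obs=1.0, y_obs=y_factual, x_new=0.0)

print("counterfactual Y:", y_cf)
print("effect of X = 1 vs X = 0 for this unit:", y_factual - y_cf)
```

The noise term inferred from the observed unit is held fixed while X is changed, which is what separates a unit-level counterfactual from a population-level comparison between treated and untreated groups.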
