Causal inference is the process of drawing conclusions about causal effects from data on the conditions under which an effect occurs. In quantitative research, causal inferences inform evidence-based decision-making in public policy, medicine, business, and other fields. This introduction provides an overview of causal inference concepts, methods, assumptions, and current best practices.
Foundational Concepts
Causes and Effects
A cause is an event or condition that brings about or increases the likelihood of an outcome called an effect. Causal effects indicate how much altering the cause changes the effect, compared to what would have happened without intervening. Causal relationships are unidirectional from cause to effect (1).
Confounding Variables
A relationship between a cause and effect may be spurious if a third variable (confounder) affects both. For example, a study may find that people who exercise more have lower rates of heart disease. However, a healthy diet affects both exercise and heart disease, confounding their relationship. Analysts try to isolate direct causal relationships from confounding (2).
Ceteris Paribus
Estimating causal effects requires the ceteris paribus (all else being equal) assumption: all other relevant factors are held constant while the cause-effect relationship is examined. This isolates the specific impact of the cause being studied (3).
Counterfactuals
Causal inference relies on counterfactuals, comparing what actually happened to what would have happened in the absence of the cause. The effect is the difference between the factual and counterfactual outcomes. Since the counterfactual is unobserved, it must be estimated (4).
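To make the counterfactual contrast concrete, here is a minimal Python sketch on simulated data (the +5 effect and all numbers are illustrative assumptions, not estimates from any study):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Both potential outcomes are simulated for every unit; in real data only
# one of the two is ever observed (the "fundamental problem" of causal inference).
y0 = rng.normal(50, 10, n)      # outcome without treatment
y1 = y0 + 5.0                   # outcome with treatment (assumed effect: +5)

t = rng.integers(0, 2, n)                 # random treatment indicator
y_obs = np.where(t == 1, y1, y0)          # the factual outcome we actually see

print((y1 - y0).mean())                                # true effect: 5.0
print(y_obs[t == 1].mean() - y_obs[t == 0].mean())     # estimate: ~5.0
```

The true effect is computable here only because the simulation reveals both potential outcomes; real analyses must estimate the missing counterfactual.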
Randomized Experiments
In randomized controlled experiments, study subjects are randomly assigned to treatment or control groups, which balances confounders, both observed and unobserved, across groups in expectation. Comparing group outcomes then isolates the treatment effect. This is considered the “gold standard” for causal inference (5).
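A short simulated illustration of why randomization works: assignment that ignores the confounder balances it across arms, so a plain difference in means recovers the assumed effect of 2.0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

health = rng.normal(0, 1, n)            # a would-be confounder
t = rng.integers(0, 2, n)               # assignment ignores `health` entirely
y = 2.0 * t + 3.0 * health + rng.normal(0, 1, n)

# Randomization balances the confounder across arms in expectation,
# so a simple difference in means is an unbiased effect estimate.
print(health[t == 1].mean() - health[t == 0].mean())   # ~0: covariate balance
print(y[t == 1].mean() - y[t == 0].mean())             # ~2.0: treatment effect
```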
Observational Studies
In observational studies, researchers cannot randomize exposures. Instead, they use statistical methods to emulate experiments and approximate counterfactuals as closely as possible using observational data (6).
Structural Causal Models
These graphical models describe assumed causal relationships between variables, represented as nodes connected by directed edges. They help researchers encode substantive assumptions to guide analysis. Related formal frameworks include regression, simultaneous equation, and potential outcomes models (7).
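The sketch below encodes a toy structural causal model using the hypothetical diet/exercise/heart-disease structure from the confounding example above; the coefficients are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Structural equations for a three-node graph:
# diet -> exercise, diet -> heart_risk, exercise -> heart_risk.
diet = rng.normal(0, 1, n)                       # exogenous background variable
exercise = 0.8 * diet + rng.normal(0, 1, n)
heart_risk = -0.5 * exercise - 0.7 * diet + rng.normal(0, 1, n)

# Intervening with do(exercise = e) deletes the diet -> exercise edge:
def do_exercise(e):
    return -0.5 * e - 0.7 * diet + rng.normal(0, 1, n)

print(do_exercise(1.0).mean() - do_exercise(0.0).mean())  # ~ -0.5: causal effect
print(np.corrcoef(exercise, heart_risk)[0, 1])            # association differs
```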
The Ladder of Causation
Pearl's metaphor describes three levels of increasingly powerful causal reasoning: association (observing correlations), intervention (predicting the effects of actions), and counterfactuals (reasoning about what would have happened under different circumstances). Claims higher on the ladder require progressively stronger assumptions and study designs (8).
Causal Identification
A causal effect is said to be identified when there is sufficient information to point identify its magnitude in the population. Identification depends on research design and untestable assumptions encoded in models (9).
Key Assumptions and Threats to Validity
Ignorability
Treatment assignment must be independent of the potential outcomes, conditional on pretreatment covariates; randomization guarantees this. Such independence permits unbiased effect estimates, while residual dependence between exposure and potential outcomes given covariates violates ignorability (10).
Positivity
Each study unit must have a non-zero probability of receiving each treatment level. When some units could never (or would always) receive the treatment, their counterfactual outcomes cannot be estimated from the data (11).
Consistency
The treatment must be well defined, so that a unit's observed outcome under its received treatment equals its potential outcome under that treatment. Hidden variation in how “the same” treatment is delivered violates consistency (12).
Correct model specification
Statistical models used to estimate effects must accurately reflect the true structural relationships, or estimates may be biased. Analysts probe specification choices with robustness checks (13).
Confounding and endogeneity
Unobserved common causes that influence both the treatment and outcome complicate isolating the treatment’s impact. Analysts control for observed confounders but unmeasured ones still threaten validity (14).
Omitted variable bias
Uncontrolled confounding due to unobserved common causes results in biased effect estimates. Sensitivity analysis assesses how strong confounding would need to be to alter inferences (15).
Selection bias
When assignment to treatment is associated with potential outcomes, effects differ systematically for treated versus control groups. Experimental randomization avoids selection bias (16).
Reverse causation
When the outcome also causes the exposure, effect estimates do not accurately reflect the causal direction of interest. Longitudinal data and experiments help avoid reverse causation (17).
Methods for Observational Studies
Matching
Cases are matched on observed covariates to create balanced treated and control groups. Matches aim to replicate random assignment using observed characteristics. Matching lessens dependence between exposure and confounders (18).
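A minimal 1-nearest-neighbor matching sketch on simulated data (the confounding structure and the 2.0 effect are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000

x = rng.normal(0, 1, n)                      # observed confounder
t = rng.binomial(1, 1 / (1 + np.exp(-x)))    # treatment more likely at high x
y = 2.0 * t + 3.0 * x + rng.normal(0, 1, n)

treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]

# 1-nearest-neighbor matching: pair each treated unit with its closest control on x.
diffs = [y[i] - y[control[np.argmin(np.abs(x[control] - x[i]))]] for i in treated]

print(np.mean(diffs))                          # ~2.0 after matching
print(y[t == 1].mean() - y[t == 0].mean())     # naive contrast, biased upward
```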
Stratification and Weighting
Observations are divided into strata within which covariate values are similar, so that treatment is approximately unrelated to the confounders inside each stratum. Stratum-specific comparisons are then averaged to control confounding. Weighting observations serves the same goal of balancing groups on covariates (19).
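A stratification sketch under the same simulated setup as the matching example; quintiles of the confounder serve as strata:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

x = rng.normal(0, 1, n)                      # observed confounder
t = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = 2.0 * t + 3.0 * x + rng.normal(0, 1, n)

# Stratify on quintiles of x, compare arms within strata, then average.
strata = np.digitize(x, np.quantile(x, [0.2, 0.4, 0.6, 0.8]))
effects, sizes = [], []
for s in np.unique(strata):
    m = strata == s
    if 0 < t[m].sum() < m.sum():             # both arms present (positivity)
        effects.append(y[m & (t == 1)].mean() - y[m & (t == 0)].mean())
        sizes.append(m.sum())

print(np.average(effects, weights=sizes))    # ~2.0; coarse strata leave some bias
```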
Regression
Including covariates in regression models helps adjust for confounding influences. Parametric assumptions limit flexibility. Causal interpretations rely on correctly specifying the functional forms (20).
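A regression-adjustment sketch using ordinary least squares on simulated data; including the confounder in the design matrix is what performs the adjustment:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000

x = rng.normal(0, 1, n)                      # observed confounder
t = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = 2.0 * t + 3.0 * x + rng.normal(0, 1, n)

# OLS of y on [1, t, x]: including the confounder adjusts the estimate for t.
X = np.column_stack([np.ones(n), t, x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[1])   # ~2.0; dropping x from the design matrix would bias this upward
```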
Instrumental Variables
Instruments induce exogenous variation in the exposure unrelated to confounders or outcomes except through treatment. This mimics random assignment. Valid instruments are key for identification (21).
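One simple IV estimator is the Wald ratio, sketched below on simulated data under assumed instrument validity and a constant treatment effect:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000

u = rng.normal(0, 1, n)                      # unobserved confounder
z = rng.binomial(1, 0.5, n)                  # instrument: shifts t, hits y only via t
t = (0.8 * z + u + rng.normal(0, 1, n) > 0.4).astype(float)
y = 2.0 * t + 3.0 * u + rng.normal(0, 1, n)

# Wald estimator: reduced-form effect of z on y over the first-stage effect on t.
itt_y = y[z == 1].mean() - y[z == 0].mean()
itt_t = t[z == 1].mean() - t[z == 0].mean()
print(itt_y / itt_t)                          # ~2.0
print(y[t == 1].mean() - y[t == 0].mean())    # naive contrast, confounded by u
```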
Differences-in-differences
Comparing changes over time between treated and untreated groups controls for time-invariant unobserved confounders. Identification rests on the parallel-trends assumption: absent treatment, both groups would have followed the same trajectory (22).
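A 2x2 difference-in-differences sketch on simulated two-period data that satisfies parallel trends by construction:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000

g = rng.binomial(1, 0.5, n)                  # 1 = group treated in period 2
y_pre = 10.0 + 4.0 * g + rng.normal(0, 1, n)              # baseline gap of 4
y_post = y_pre + 1.5 + 2.0 * g + rng.normal(0, 1, n)      # common trend + effect

did = ((y_post[g == 1].mean() - y_pre[g == 1].mean())
       - (y_post[g == 0].mean() - y_pre[g == 0].mean()))
print(did)   # ~2.0: the baseline gap and the shared trend both difference out
```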
Regression Discontinuity
Sharp cutoffs for treatment eligibility produce a local experiment around the threshold. Comparing outcomes just above and below the cutoff isolates the treatment effect (23).
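A crude regression discontinuity sketch comparing means in a narrow window around an assumed cutoff; applied work instead uses local linear regression with data-driven bandwidths:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 50_000

score = rng.uniform(-1, 1, n)                # running variable, cutoff at 0
t = (score >= 0).astype(float)               # sharp assignment rule
y = 2.0 * t + 1.5 * score + rng.normal(0, 1, n)

# Compare mean outcomes in a narrow window on either side of the cutoff.
h = 0.05
above = y[(score >= 0) & (score < h)].mean()
below = y[(score < 0) & (score >= -h)].mean()
print(above - below)   # ~2.0, plus a small bias from the slope inside the window
```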
Propensity Scores
The propensity for treatment given observed covariates is used to balance groups or match units to estimate effects. Overlap and model specification are key assumptions (24).
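An inverse-propensity-weighting sketch with a logistic propensity model, which is correctly specified here by construction:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 20_000

x = rng.normal(0, 1, (n, 2))                          # observed confounders
t = rng.binomial(1, 1 / (1 + np.exp(-x.sum(axis=1))))
y = 2.0 * t + x.sum(axis=1) + rng.normal(0, 1, n)

# Estimate e(x) = P(T=1 | x), then reweight by inverse propensity (IPW).
e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
e = np.clip(e, 0.01, 0.99)                            # guard against extreme weights
ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(ate)   # ~2.0 given overlap and a correctly specified propensity model
```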
Synthetic Controls
A weighted combination of units forms a synthetic control group that matches important predictors of the outcomes. Comparisons isolate the effect of treatment (25).
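A stylized synthetic control sketch: non-negative weights summing to one are chosen so the donor pool reproduces the treated unit's simulated pre-treatment path (scipy is assumed available; real applications also match on predictors, not just past outcomes):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(10)
pre_periods, n_donors = 20, 10

# Donor outcome paths; the treated unit tracks a mix of the first three donors.
donors = rng.normal(0, 1, (pre_periods, n_donors)).cumsum(axis=0)
treated_pre = donors[:, :3].mean(axis=1) + rng.normal(0, 0.1, pre_periods)

# Choose non-negative weights summing to 1 that best reproduce the
# treated unit's pre-treatment path from the donor pool.
loss = lambda w: np.sum((treated_pre - donors @ w) ** 2)
res = minimize(loss, np.full(n_donors, 1 / n_donors),
               bounds=[(0, 1)] * n_donors,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
print(np.round(res.x, 2))   # weight concentrates on the first three donors
# Post-treatment, the gap (treated_post - donors_post @ res.x) estimates the effect.
```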
Causal Machine Learning
Flexible nonparametric methods such as neural networks, random forests, and boosted regression estimate heterogeneous treatment effects from high-dimensional data. Sample splitting and cross-validation guard against overfitting (26).
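One common strategy is the T-learner, sketched below: fit a separate outcome model per treatment arm and difference their predictions (the data and the effect heterogeneity are simulated):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(11)
n = 20_000

x = rng.uniform(-2, 2, (n, 3))
t = rng.binomial(1, 0.5, n)                       # randomized for simplicity
tau = 1.0 + x[:, 0]                               # effect varies with feature 0
y = tau * t + np.sin(x[:, 1]) + rng.normal(0, 1, n)

# T-learner: fit one outcome model per arm, difference the predictions.
m1 = RandomForestRegressor(n_estimators=200, min_samples_leaf=25)
m1.fit(x[t == 1], y[t == 1])
m0 = RandomForestRegressor(n_estimators=200, min_samples_leaf=25)
m0.fit(x[t == 0], y[t == 0])
cate = m1.predict(x) - m0.predict(x)              # unit-level effect estimates

print(np.corrcoef(cate, tau)[0, 1])               # recovers the heterogeneity pattern
```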
Structural Equation Modeling
These models encode theory-based causal relationships between multiple variables and use contrasts between observed and model-implied covariance matrices to estimate effects. Strict assumptions limit flexibility (27).
Mediation Analysis
This estimates how effects occur through intermediate variables that transmit some of the causal influence. Structural equation modeling is often used for mediation analysis (28).
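A product-of-coefficients mediation sketch, assuming linear models, no treatment-mediator interaction, and sequential ignorability (all satisfied here by construction):

```python
import numpy as np

rng = np.random.default_rng(12)
n = 50_000

t = rng.binomial(1, 0.5, n)                  # randomized exposure
m = 0.6 * t + rng.normal(0, 1, n)            # mediator
y = 0.5 * m + 0.3 * t + rng.normal(0, 1, n)  # direct 0.3, indirect 0.6 * 0.5 = 0.3

def ols(cols, target):
    X = np.column_stack([np.ones(n), *cols])
    return np.linalg.lstsq(X, target, rcond=None)[0]

a = ols([t], m)[1]          # exposure -> mediator path
b, c = ols([m, t], y)[1:]   # mediator -> outcome path; direct exposure path
print(a * b, c)             # indirect effect ~0.30, direct effect ~0.30
```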
Simulation-based Analysis
Generating simulated counterfactual outcomes under different causal scenarios facilitates assessing the identification, bias, efficiency, and sensitivity of effect estimates (29).
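A small Monte Carlo sketch comparing a naive and a covariate-adjusted estimator across repeated simulated samples from a known confounded process:

```python
import numpy as np

rng = np.random.default_rng(13)

def one_draw(n=2_000):
    x = rng.normal(0, 1, n)                  # confounder
    t = rng.binomial(1, 1 / (1 + np.exp(-x)))
    y = 2.0 * t + 3.0 * x + rng.normal(0, 1, n)
    naive = y[t == 1].mean() - y[t == 0].mean()
    X = np.column_stack([np.ones(n), t, x])
    adjusted = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return naive, adjusted

draws = np.array([one_draw() for _ in range(500)])
print(draws.mean(axis=0))   # naive is biased away from 2.0; adjusted centers on it
print(draws.std(axis=0))    # Monte Carlo spread gauges each estimator's precision
```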
Best Practices
Clearly justify causal hypotheses before analysis using theory and study design choices. Avoid data dredging and p-hacking (30).
Carefully assess if identification assumptions are justified, and conduct sensitivity analysis to evaluate threats to validity (31).
Use placebo and null effect tests to check for spurious findings (32).
Examine effect heterogeneity and boundary conditions of effects across units and contexts (33).
Assess replicability of results across independent, well-powered studies analyzing different samples and measures (34).
Make analytic choices transparent by pre-registering analysis plans and providing access to data and code (35).
Interpret effect sizes cautiously and communicate limitations and uncertainty around causal claims (36).
Triangulate evidence from multiple methods with complementary strengths and limitations (37).
Applications
Public Policy Causal Inference
Randomized controlled trials and quasi-experiments assess policy impacts on educational attainment, poverty, health behaviors, and other social outcomes (38).
Business Causal Inference
Companies use A/B testing, regression discontinuity, and other methods to estimate the causal effects of pricing, advertising, recommendation algorithms, and product features on customer behavior (39).
Epidemiology and Medicine
Randomized clinical trials provide the gold standard for evaluating new treatments. Observational studies also investigate risk factors for diseases using longitudinal data and natural experiments (40).
Economics
Instrumental variables, regression discontinuity, differences-in-differences, and generalized method of moments models identify the effects of policies on important economic outcomes in settings where randomized experiments are infeasible (41).
AI and Machine Learning
Flexible nonparametric methods are increasingly used to discover subtle and heterogeneous treatment effects in high-dimensional data such as images, text, and genomes (42).
Psychology and Neuroscience
Controlled experiments isolate causal mechanisms of cognitive processes, emotions, decision-making, and social behavior. Brain imaging provides neural evidence (43).
Emerging Directions
Automating Causal Inference
Algorithms automate parts of the causal inference workflow, including covariate selection, model specification, and heterogeneity detection in high-dimensional data (44).
Ensemble Methods
Combining estimates from diverse methods can enhance robustness. Stacking, averaging, and weighting techniques synthesize varied models (45).
Causal Discovery Algorithms
These data-driven methods search over possible causal graphs and dependency relationships to discover plausible models from observational data (46).
External Validity
Transporting causal inferences to new populations or settings remains challenging. New methods extrapolate experimental findings using theoretical moderators (47).
Integrating Domain Knowledge
Injecting substantive expertise into causal models improves assumptions and identification. Formalizing domain theories facilitates generalizable insights (48).
Conclusion
Causal inference leverages observational and experimental data to produce credible estimates of cause-effect relationships that inform actionable decision-making across many knowledge domains. Ongoing advances in computational power, algorithmic innovations, and interdisciplinary collaborations continue to strengthen the rigor, accuracy, and breadth of causal insights derived from data. Used judiciously and transparently, causal inference provides a valuable evidence base for improving social welfare, health, commerce, governance, and more.
References
- Pearl, J., & Mackenzie, D. (2018). The book of why: the new science of cause and effect. Basic Books.
- Elwert, F., & Winship, C. (2014). Endogenous selection bias: The problem of conditioning on a collider variable. Annual review of sociology, 40, 31-53.
- King, G., Keohane, R. O., & Verba, S. (1994). Designing social inquiry: Scientific inference in qualitative research. Princeton university press.
- Morgan, S. L., & Winship, C. (2014). Counterfactuals and causal inference. Cambridge University Press.
- Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
- Rosenbaum, P. R. (2010). Design of observational studies (Vol. 10). New York, NY: Springer.
- Pearl, J. (2009). Causality. Cambridge university press.
- Shmueli, G. (2010). To explain or to predict?. Statistical science, 289-310.
- Angrist, J. D., & Pischke, J. S. (2008). Mostly harmless econometrics: An empiricist’s companion. Princeton university press.
- Imbens, G. W., & Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.
- Crump, R. K., Hotz, V. J., Imbens, G. W., & Mitnik, O. A. (2008). Nonparametric tests for treatment effect heterogeneity. The Review of Economics and Statistics, 90(3), 389-405.
- Cole, S. R., & Hernán, M. A. (2008). Constructing inverse probability weights for marginal structural models. American journal of epidemiology, 168(6), 656-664.
- Claassen, C., & Heskes, T. (2012, June). A Bayesian Approach to Constraint Based Causal Inference. In UAI (pp. 207-216).
- Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2014). Causality and endogeneity: Problems and solutions. The Oxford handbook of leadership and organizations, 93.
- Frank, K. A. (2000). Impact of a confounding variable on a regression coefficient. Sociological Methods & Research, 29(2), 147-194.
- Winship, C., & Mare, R. D. (1992). Models for sample selection bias. Annual review of sociology, 18(1), 327-350.
- Gelman, A., & Imbens, G. (2013). Why ask why? Forward causal inference and reverse causal questions. Technical Report 19614, National Bureau of Economic Research, Cambridge, MA.
- Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical science: a review journal of the Institute of Mathematical Statistics, 25(1), 1.
- Lunceford, J. K., & Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in medicine, 23(19), 2937-2960.
- Stock, J. H., & Watson, M. W. (2003). Introduction to econometrics (Vol. 104). Boston: Addison Wesley.
- Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American statistical Association, 91(434), 444-455.
- Lechner, M. (2010). The estimation of causal effects by difference-in-difference methods. University of St. Gallen Department of Economics Working Paper Series, (2010-28).
- Lee, D. S., & Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of economic literature, 48(2), 281-355.
- Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate behavioral research, 46(3), 399-424.
- Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program. Journal of the American statistical Association, 105(490), 493-505.
- Athey, S., & Imbens, G. (2019). Machine learning methods for estimating heterogeneous causal effects. stat, 1050(5).
- Kline, R. B. (2015). Principles and practice of structural equation modeling. Guilford publications.
- VanderWeele, T. J. (2016). Mediation analysis: a practitioner’s guide. Annual review of public health, 37, 17-32.
- Keele, L., Titiunik, R., & Zubizarreta, J. R. (2015). Enhancing a geographic regression discontinuity design through matching to estimate the effect of ballot initiatives on voter turnout. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(1), 223-239.
- Gelman, A., & Loken, E. (2014). The statistical crisis in science: Data-dependent analysis—a “garden of forking paths”—explains why many statistically significant comparisons don't hold up. American Scientist, 102(6), 460-466.
- Cornfield, J., Haenszel, W., Hammond, E. C., Lilienfeld, A. M., Shimkin, M. B., & Wynder, E. L. (1959). Smoking and lung cancer: recent evidence and a discussion of some questions. JNCI: Journal of the National Cancer Institute, 22(1), 173-203.
- Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584-585.
- Tipton, E. (2013). Improving generalizations from experiments using propensity score subclassification: Assumptions, properties, and contexts. Journal of Educational and Behavioral Statistics, 38(3), 239-266.
- McShane, B. B., Böckenholt, U., & Hansen, K. T. (2016). Adjusting for publication bias in meta-analysis: An evaluation of selection methods and some cautionary notes. Perspectives on Psychological Science, 11(5), 730-749.
- Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., … & Laitin, D. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31.
- Kaplan, D., & Irvin, V. L. (2015). Likelihood of null effects of large NHLBI clinical trials has increased over time. PloS one, 10(8), e0132382.
- Tashakkori, A., & Teddlie, C. (Eds.). (2010). Sage handbook of mixed methods in social & behavioral research. Sage.
- Ludwig, J., Kling, J. R., & Mullainathan, S. (2011). Mechanism experiments and policy evaluations. Journal of Economic Perspectives, 25(3), 17-38.
- Lewis, R. A., & Rao, J. M. (2015). The unfavorable economics of measuring the returns to advertising. The Quarterly Journal of Economics, 130(4), 1941-1973.
- Banerjee, A. V., & Duflo, E. (2009). The experimental approach to development economics. Annu. Rev. Econ., 1(1), 151-178.
- Angrist, J. D., & Pischke, J. S. (2010). The credibility revolution in empirical economics: How better research design is taking the con out of econometrics. Journal of economic perspectives, 24(2), 3-30.
- Athey, S. (2019). The impact of machine learning on economics. In The economics of artificial intelligence: An agenda (pp. 507-547). University of Chicago Press.
- Koch, P., Gens, R., Savchuk, S., Vreeken, R., Kaulartz, M., Heumann, C., … & Scheffer, T. (2018, July). Causally motivated feature selection: experiments with expertise modeling. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1245-1254).
- Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259.
- Spirtes, P., Glymour, C. N., Scheines, R., Heckerman, D., Meek, C., Cooper, G., & Richardson, T. (2000). Causation, prediction, and search. MIT press.
- Tipton, E. (2014). Generalizing the results from social experiments: Theory and evidence from Mexico and India (Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences).
- Pearl, J., & Bareinboim, E. (2014). External validity: From do-calculus to transportability across populations. Statistical Science, 29(4), 579-595.