Observational Methods

Observational methods estimate causal effects from non-experimental data by adjusting for confounding variables under explicit identifying assumptions about how units were selected into treatment.

Overview

Observational methods are used when experimental or quasi-experimental designs are not available. They rely on the assumption that every confounding variable is observed and adjusted for (unconfoundedness, also called selection on observables). Under that assumption, statistical adjustment is enough to identify the causal effect.

Key Advantages:
* Applicable to any observational dataset in which the relevant confounders are measured
* Utilize existing data sources
* Cost-effective compared to experiments
* Allow for large sample sizes

Key Limitations:
* Strong unconfoundedness assumption required
* Vulnerable to omitted variable bias
* Selection on unobservables cannot be ruled out
* Require rich covariate data

Method Details

Propensity Score Matching

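Matches treated and control units with similar propensity scores, where the propensity score is the estimated probability of receiving treatment given the observed covariates.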

When to use: Rich covariate data, clear treatment definition
Key assumption: Unconfoundedness given observed covariates
Strengths: Intuitive matching logic, balances covariates
Limitations: Depends on a correctly specified propensity model, discards unmatched units, common support issues
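
The matching step can be sketched in a few lines. This is a minimal illustration, not Causal Agent's implementation: it assumes propensity scores have already been estimated, and the function name `match_att` and the `caliper` parameter are hypothetical names chosen for this example. It performs 1-nearest-neighbour matching with replacement and estimates the average treatment effect on the treated (ATT).

```python
def match_att(scores, treated, outcomes, caliper=0.1):
    """1-nearest-neighbour matching (with replacement) on the propensity score.

    Returns (att, n_matched). Treated units with no control unit inside
    the caliper are dropped rather than matched poorly.
    """
    controls = [i for i, t in enumerate(treated) if not t]
    if not controls:
        raise ValueError("no control units available for matching")
    diffs = []
    for i, t in enumerate(treated):
        if not t:
            continue
        # Closest control by absolute propensity-score distance.
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            diffs.append(outcomes[i] - outcomes[j])
    if not diffs:
        raise ValueError("no treated unit has a control within the caliper")
    return sum(diffs) / len(diffs), len(diffs)


# Toy data, invented for illustration only.
scores = [0.80, 0.60, 0.30, 0.75, 0.55, 0.10]
treated = [1, 1, 1, 0, 0, 0]
outcomes = [10.0, 8.0, 7.0, 7.0, 6.0, 3.0]
att, n_matched = match_att(scores, treated, outcomes, caliper=0.1)
print(att, n_matched)  # 2.5 2
```

The third treated unit (score 0.30) has no control within the caliper and is dropped, which is how common support problems show up in practice: the estimate then describes only the matched treated units.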

Propensity Score Weighting

Reweights observations to balance treatment and control groups.

When to use: When matching is not feasible or desirable
Key assumption: Unconfoundedness and positivity
Strengths: Uses all observations, flexible weighting schemes
Limitations: Sensitive to extreme weights, model dependence
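
As a rough sketch of the weighting idea, the function below computes a normalized (Hajek) inverse probability weighting estimate of the average treatment effect from propensity scores that are assumed to be already estimated. The name `ipw_ate` and the `clip` bounds are illustrative assumptions, not part of Causal Agent.

```python
def ipw_ate(scores, treated, outcomes, clip=(0.01, 0.99)):
    """Normalized (Hajek) IPW estimate of the average treatment effect.

    Treated units get weight 1/e and control units get weight 1/(1 - e),
    where e is the propensity score. Both groups must be non-empty.
    """
    lo, hi = clip
    num1 = den1 = num0 = den0 = 0.0
    for e, t, y in zip(scores, treated, outcomes):
        e = min(max(e, lo), hi)  # clip scores to limit extreme weights
        if t:
            w = 1.0 / e
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - e)
            num0 += w * y
            den0 += w
    # Difference of weighted outcome means.
    return num1 / den1 - num0 / den0
```

Clipping trades a small amount of bias for lower variance. Reporting results at several clip bounds is one way to show how sensitive the estimate is to extreme weights.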

Backdoor Adjustment

Controls for confounders identified through causal graphs.

When to use: When the causal graph is well understood
Key assumption: Backdoor criterion satisfied
Strengths: Principled confounder selection
Limitations: Requires causal graph knowledge
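
For a discrete adjustment set that satisfies the backdoor criterion, the adjustment is usually written as the formula below. The symbols T (treatment), Y (outcome), and Z (adjustment set) are notation introduced here for illustration.

```latex
P(Y = y \mid \mathrm{do}(T = t)) = \sum_{z} P(Y = y \mid T = t, Z = z)\, P(Z = z)
```

In words: estimate the outcome distribution within each stratum of Z, then average those strata using the marginal distribution of Z rather than the distribution of Z among the treated.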

Linear Regression

Controls for confounders through linear regression adjustment.

When to use: Linear relationships, continuous outcomes
Key assumption: Correct functional form, no omitted variables
Strengths: Simple, interpretable, widely understood
Limitations: Strong functional form assumptions
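
A typical specification, written in notation introduced here (T_i for treatment, X_i for the covariate vector), is:

```latex
Y_i = \alpha + \tau T_i + \beta^{\top} X_i + \varepsilon_i
```

If the covariates X_i capture all confounding and the linear form is correct, the coefficient \tau can be read as the treatment effect. This form also assumes the effect is the same for every unit; adding interactions between T_i and X_i relaxes that.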

Implementation in Causal Agent

Causal Agent automatically selects and implements appropriate observational methods:

from causal_agent import CausalAgent

# Causal Agent selects best observational method
agent = CausalAgent()
result = agent.analyze(
    data=observational_data,
    treatment='treatment_variable',
    outcome='outcome_variable',
    covariates=['covar1', 'covar2', 'covar3']
)

Automatic Method Selection

Causal Agent chooses methods based on:
* Data characteristics (sample size, covariate richness)
* Treatment assignment patterns
* Outcome variable type
* User preferences and constraints

Covariate Selection

Automatic confounder detection
Causal graph-based selection when available
Statistical significance-based inclusion
Domain knowledge integration

Assumption Validation

Unconfoundedness

Cannot be directly tested
Sensitivity analyses for unobserved confounding
Placebo tests using pre-treatment outcomes
Comparison with experimental benchmarks when available

Positivity/Common Support

Propensity score distribution overlap
Trimming observations outside common support
Diagnostic plots and statistics
Sensitivity to support restrictions
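
One simple way to define common support is the min/max rule: keep only units whose propensity score falls inside the range covered by both groups. The sketch below assumes scores are already estimated; `trim_to_common_support` is a hypothetical helper name, and other rules (for example fixed cutoffs such as 0.1 and 0.9) are equally reasonable.

```python
def trim_to_common_support(scores, treated):
    """Return indices of units inside the overlap of both groups' score ranges.

    Both groups must be non-empty.
    """
    t_scores = [s for s, t in zip(scores, treated) if t]
    c_scores = [s for s, t in zip(scores, treated) if not t]
    lower = max(min(t_scores), min(c_scores))
    upper = min(max(t_scores), max(c_scores))
    return [i for i, s in enumerate(scores) if lower <= s <= upper]


# Toy scores, invented for illustration only.
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.05]
treated = [1, 1, 1, 0, 0, 0]
kept = trim_to_common_support(scores, treated)
print(kept)  # [1, 2, 3]
```

Trimming changes the population the estimate describes, so it helps to report how many units were dropped and how the estimate moves under alternative support definitions.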

Correct Specification

Model specification tests
Functional form diagnostics
Residual analysis
Cross-validation approaches

Balance Assessment

Covariate Balance

Standardized mean differences
Variance ratios
Kolmogorov-Smirnov tests
Graphical balance assessment
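
The first two diagnostics are simple to compute for a single covariate. The sketch below uses only the Python standard library; `balance_stats` is a hypothetical helper name. It needs at least two units per group, and the covariate must vary within each group.

```python
from math import sqrt
from statistics import mean, variance


def balance_stats(x, treated):
    """Return (standardized mean difference, variance ratio) for one covariate."""
    xt = [v for v, t in zip(x, treated) if t]
    xc = [v for v, t in zip(x, treated) if not t]
    vt, vc = variance(xt), variance(xc)  # sample variances
    # Difference in means scaled by the pooled standard deviation.
    smd = (mean(xt) - mean(xc)) / sqrt((vt + vc) / 2)
    return smd, vt / vc


# Toy covariate, invented for illustration only.
smd, ratio = balance_stats([2.0, 4.0, 6.0, 1.0, 3.0, 5.0], [1, 1, 1, 0, 0, 0])
print(smd, ratio)  # 0.5 1.0
```

A common rule of thumb treats an absolute standardized mean difference below about 0.1 as adequate balance, and a variance ratio near 1 as desirable. These are conventions, not formal tests.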

Propensity Score Balance

Propensity score distribution comparison
Stratification balance tests
Matching quality diagnostics
Weighting effectiveness measures
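
One widely used weighting effectiveness measure is the Kish effective sample size, which shrinks as weights become more uneven. The helper below is an illustrative sketch; whether Causal Agent reports this particular statistic is an assumption.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))  # 4.0
```

With equal weights the effective sample size equals the number of units. A value far below the nominal sample size signals that a few observations dominate the weighted estimate.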

Best Practices

Study Design

Collect rich covariate data
Include pre-treatment outcomes when possible
Consider multiple comparison groups
Document data collection process

Analysis

Check balance before and after adjustment
Conduct sensitivity analyses
Use multiple methods for robustness
Report all diagnostic results

Interpretation

Acknowledge unconfoundedness assumption
Discuss potential sources of bias
Consider external validity
Report confidence intervals and uncertainty

Sensitivity Analysis

Unobserved Confounding

Rosenbaum bounds for matched samples
Imbens sensitivity analysis
Simulation-based approaches
Benchmarking against known confounders

Model Specification

Alternative functional forms
Different covariate sets
Various matching/weighting schemes
Robustness to outliers

Sample Restrictions

Different common support definitions
Trimming strategies
Subgroup analyses
Temporal stability

Common Challenges

Data Quality Issues

Missing covariate data
Measurement error in variables
Inconsistent variable definitions
Sample selection issues

Methodological Challenges

Curse of dimensionality
Extreme propensity scores
Poor covariate balance
Model dependence

Interpretation Issues

Distinguishing correlation from causation
Communicating uncertainty
Addressing skepticism about assumptions
Policy relevance of estimates

Advanced Topics

Machine Learning Methods

Targeted maximum likelihood estimation (TMLE)
Double machine learning
Causal forests
Neural network-based methods

Multiple Treatments

Generalized propensity scores
Multiple treatment matching
Dose-response relationships
Treatment interaction effects

Time-Varying Treatments

Marginal structural models
G-computation
Inverse probability weighting over time
Sequential ignorability
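
For reference, inverse probability weighting over time is commonly implemented with stabilized weights of the following form. The notation is introduced here for illustration: A_{it} is the treatment of unit i at time t, L_{it} the time-varying covariates, an overbar denotes history up to that time, and K is the last time point.

```latex
SW_i = \prod_{t=0}^{K} \frac{P(A_{it} \mid \bar{A}_{i,t-1})}{P(A_{it} \mid \bar{A}_{i,t-1}, \bar{L}_{it})}
```

Under sequential ignorability and positivity at every time point, fitting the outcome model on the reweighted sample estimates the parameters of a marginal structural model.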