Decision Path Comparisons: Similar Datasets, Different Methods

This document provides side-by-side comparisons of how CAIS selects different methods for similar datasets that differ in only one or two characteristics. The comparisons illustrate the decision tree logic and the method selection criteria.

Overview

Small changes in dataset characteristics can lead to dramatically different method selections. This document shows:

- How minor data differences affect method choice
- Why certain methods are preferred over others
- What happens when key assumptions are violated
- How to interpret method selection decisions

Comparison 1: Randomized vs. Observational Education Data

Scenario: Evaluating the impact of a tutoring program on student test scores

Randomized Version

Dataset Characteristics:
- Students randomly assigned to tutoring program
- Balanced baseline characteristics
- Rich covariate information available
- Perfect compliance with assignment

        flowchart TD
    A[Randomized Tutoring Study] --> B{Is this randomized?}
    B -->|Yes ✓| C{Are covariates available?}
    C -->|Yes ✓| D[Linear Regression<br/>with Covariates]

    style A fill:#e3f2fd
    style B fill:#e8f5e8
    style C fill:#fff3e0
    style D fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Linear Regression with Covariates

Reasoning:
✓ Randomization ensures causal identification
✓ Covariates improve precision (reduce standard errors)
✓ No selection bias concerns
✓ Straightforward interpretation

Expected Results:
- Unbiased treatment effect estimate
- Narrow confidence intervals (high precision)
- Clear causal interpretation
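
The precision argument above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not CAIS output: the variable names (`prior_score`, `tutoring`, `test_score`), the sample size, and every coefficient are assumptions chosen for illustration. It compares the plain difference in means with the covariate-adjusted coefficient, computed by residualizing on the covariate (the Frisch-Waugh-Lovell approach).

```python
import random
import statistics

def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(v, x):
    """Residuals of v after regressing it on x (with intercept)."""
    slope = ols_slope(x, v)
    mv, mx = statistics.fmean(v), statistics.fmean(x)
    return [vi - mv - slope * (xi - mx) for vi, xi in zip(v, x)]

# Simulated randomized study (all numbers are illustrative assumptions).
rng = random.Random(0)
n, true_effect = 2000, 5.0
prior_score = [rng.gauss(0, 1) for _ in range(n)]                  # baseline covariate
tutoring = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]  # random assignment
test_score = [true_effect * t + 10 * p + rng.gauss(0, 2)
              for t, p in zip(tutoring, prior_score)]

# Unadjusted: the slope on a 0/1 treatment equals the difference in means.
unadjusted = ols_slope(tutoring, test_score)
# Adjusted: coefficient on tutoring in test_score ~ tutoring + prior_score.
adjusted = ols_slope(residualize(tutoring, prior_score),
                     residualize(test_score, prior_score))
print(f"unadjusted: {unadjusted:.2f}, adjusted: {adjusted:.2f}")
```

Both estimators target the same effect under randomization; the adjusted one removes outcome variance explained by the covariate, which is why its estimates typically sit closer to the true value.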

Observational Version

Dataset Characteristics:
- Students self-select into tutoring program
- Systematic differences in baseline characteristics
- Same rich covariate information available
- Good overlap in covariate distributions

        flowchart TD
    A[Observational Tutoring Study] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|Yes ✓| F{Instrumental variable?}
    F -->|No ✗| G{Rich covariates?}
    G -->|Yes ✓| H{Good overlap?}
    H -->|Yes ✓| I[Propensity Score<br/>Matching]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#fff3e0
    style F fill:#ffebee
    style G fill:#fff3e0
    style H fill:#fff3e0
    style I fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Propensity Score Matching

Reasoning:
❌ No randomization (selection bias present)
✓ Rich covariates available for matching
✓ Good covariate overlap enables valid matches
✓ Can control for observed confounders

Expected Results:
- Potentially biased if unobserved confounders exist
- Wider confidence intervals (less precision)
- Requires strong unconfoundedness assumption
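
As a rough illustration of what matching does with self-selected data, the sketch below estimates a propensity score and matches each treated student to the control with the nearest score. It is a simplified stand-in, not the CAIS implementation: it assumes a single observed confounder (a hypothetical `motivation` variable), a hand-rolled logistic fit, and 1-nearest-neighbour matching with replacement. All data are simulated.

```python
import math
import random
import statistics

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, t, steps=300, lr=0.5):
    """One-covariate logistic regression fitted by gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, ti in zip(x, t):
            err = ti - sigmoid(b0 + b1 * xi)
            g0 += err
            g1 += err * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def matched_att(score, t, y):
    """Effect on the treated via 1-nearest-neighbour matching (with replacement)."""
    controls = [(s, yi) for s, ti, yi in zip(score, t, y) if ti == 0]
    diffs = []
    for s, ti, yi in zip(score, t, y):
        if ti == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - s))
            diffs.append(yi - y_match)
    return statistics.fmean(diffs)

# Simulated self-selection: motivated students enrol more often and score higher.
rng = random.Random(1)
n, true_effect = 1000, 2.0
motivation = [rng.gauss(0, 1) for _ in range(n)]
tutoring = [1 if rng.random() < sigmoid(m) else 0 for m in motivation]
test_score = [true_effect * t + 3 * m + rng.gauss(0, 1)
              for t, m in zip(tutoring, motivation)]

b0, b1 = fit_logistic(motivation, tutoring)
pscore = [sigmoid(b0 + b1 * m) for m in motivation]

treated_scores = [s for s, t in zip(test_score, tutoring) if t == 1]
control_scores = [s for s, t in zip(test_score, tutoring) if t == 0]
naive = statistics.fmean(treated_scores) - statistics.fmean(control_scores)
att = matched_att(pscore, tutoring, test_score)
print(f"naive: {naive:.2f}, matched: {att:.2f}")
```

The matched estimate recovers the simulated effect only because the confounder is observed here. If `motivation` were missing from the data, matching would inherit the same bias as the naive comparison.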

Side-by-Side Comparison:

Randomized vs. Observational Comparison
Aspect	Randomized Version	Observational Version
Method Selected	Linear Regression + Covariates	Propensity Score Matching
Identification	Randomization	Unconfoundedness assumption
Bias Risk	Low (no selection bias by design)	Possible (unobserved confounders)
Precision	High (uses all data)	Lower (matched sample only)
Assumptions	Minimal	Strong (no unmeasured confounding)

—

Comparison 2: Cross-Sectional vs. Panel Policy Data

Scenario: Evaluating the impact of minimum wage increases on employment

Cross-Sectional Version

Dataset Characteristics:
- Single time point after policy implementation
- States with and without minimum wage increases
- Rich economic and demographic controls
- No pre-policy baseline data

        flowchart TD
    A[Cross-Sectional Policy Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|Yes ✓| F{Instrumental variable?}
    F -->|No ✗| G{Rich covariates?}
    G -->|Yes ✓| H{Good overlap?}
    H -->|Yes ✓| I[Propensity Score<br/>Methods]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#fff3e0
    style F fill:#ffebee
    style G fill:#fff3e0
    style H fill:#fff3e0
    style I fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Propensity Score Methods

Reasoning:
❌ No randomization (policy endogenous)
❌ No panel data (single time point)
❌ No clear running variable
✓ Rich covariates for matching/weighting

Limitations:
⚠️ Cannot control for unobserved state characteristics
⚠️ Policy adoption may be endogenous
⚠️ Strong unconfoundedness assumption required

Panel Version

Dataset Characteristics:
- Multiple time periods before and after policy
- Staggered implementation across states
- Same rich controls available
- Clear treatment timing variation

        flowchart TD
    A[Panel Policy Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|Yes ✓| D{Treatment timing varies?}
    D -->|Yes ✓| E[Difference-in-Differences]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#fff3e0
    style D fill:#fff3e0
    style E fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Difference-in-Differences

Reasoning:
❌ No randomization (policy endogenous)
✓ Panel data with timing variation
✓ Can control for time-invariant confounders
✓ Exploits policy timing for identification

Advantages:
✓ Controls for unobserved state characteristics
✓ Differences out time trends common to treated and control states
✓ More credible identification than cross-sectional
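
The core arithmetic behind this choice is simple enough to write out. The sketch below covers only the two-group, two-period case and does not address staggered timing; the employment rates are made-up numbers, and `did_estimate` is a hypothetical helper rather than a CAIS function.

```python
import statistics

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences computed on group means."""
    treated_change = statistics.fmean(treated_post) - statistics.fmean(treated_pre)
    control_change = statistics.fmean(control_post) - statistics.fmean(control_pre)
    return treated_change - control_change

# Hypothetical employment rates (%) for two treated and two control states.
treated_pre, treated_post = [60.0, 62.0], [61.0, 63.0]
control_pre, control_post = [55.0, 57.0], [58.0, 60.0]

# Post-period comparison only (what cross-sectional data would allow).
cross_sectional_gap = statistics.fmean(treated_post) - statistics.fmean(control_post)
did = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(cross_sectional_gap, did)  # 3.0 -2.0
```

In this toy example the post-period gap is positive only because treated states started from a higher level. Subtracting each group's own baseline removes that fixed difference, which is the sense in which the panel version controls for time-invariant unobservables.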

Side-by-Side Comparison:

Cross-Sectional vs. Panel Comparison
Aspect	Cross-Sectional Version	Panel Version
Method Selected	Propensity Score Methods	Difference-in-Differences
Identification	Unconfoundedness	Parallel trends
Controls For	Observed confounders only	Time-invariant unobservables
Key Assumption	No unmeasured confounding	Parallel trends
Credibility	Lower (strong assumptions)	Higher (weaker assumptions)

—

Comparison 3: Sharp vs. Fuzzy Discontinuity

Scenario: Evaluating scholarship program effects on college enrollment

Sharp Discontinuity Version

Dataset Characteristics:
- Test score determines scholarship eligibility
- Sharp cutoff at score = 1200
- All students above cutoff get scholarship
- No students below cutoff get scholarship

        flowchart TD
    A[Sharp RDD Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable with cutoff?}
    D -->|Yes ✓| E{Sharp discontinuity?}
    E -->|Yes ✓| F[Sharp Regression<br/>Discontinuity]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#fff3e0
    style E fill:#fff3e0
    style F fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Sharp Regression Discontinuity

Reasoning:
✓ Clear running variable (test score)
✓ Sharp cutoff at 1200
✓ Treatment probability jumps from 0 to 1
✓ Local randomization around cutoff

Implementation:
- Compare students just above/below cutoff
- Estimate local treatment effect
- Check continuity assumptions
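
The implementation steps above can be sketched as a local linear comparison at the cutoff. This is an illustrative sketch on simulated data: the bandwidth of 100 points, the enrollment probabilities, and the helper names are assumptions, and a uniform kernel with a fixed bandwidth stands in for the data-driven choices a full analysis would make.

```python
import random
import statistics

def fit_line(x, y):
    """Intercept and slope of a least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def sharp_rdd(running, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, fitting one line per side."""
    left = [(r - cutoff, o) for r, o in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, o) for r, o in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left  # difference of the two intercepts at the cutoff

# Simulated students: the scholarship is assigned exactly at score >= 1200.
rng = random.Random(2)
n, cutoff, true_jump = 20000, 1200, 0.2
scores = [rng.uniform(1000, 1400) for _ in range(n)]
enrolled = []
for s in scores:
    p = 0.3 + 0.001 * (s - cutoff) + (true_jump if s >= cutoff else 0.0)
    enrolled.append(1.0 if rng.random() < p else 0.0)

effect = sharp_rdd(scores, enrolled, cutoff, bandwidth=100)
print(f"estimated jump in enrollment at the cutoff: {effect:.3f}")
```

Centering the running variable at the cutoff makes each fitted intercept the predicted enrollment rate right at 1200, so the estimate is local to students near the threshold.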

Fuzzy Discontinuity Version

Dataset Characteristics:
- Same test score running variable
- Same cutoff at score = 1200
- Scholarship probability increases but doesn’t reach 100%
- Some students below cutoff still get scholarships

        flowchart TD
    A[Fuzzy RDD Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable with cutoff?}
    D -->|Yes ✓| E{Sharp discontinuity?}
    E -->|No ✗| F{Fuzzy discontinuity?}
    F -->|Yes ✓| G[Fuzzy Regression<br/>Discontinuity]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#fff3e0
    style E fill:#ffebee
    style F fill:#fff3e0
    style G fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Fuzzy Regression Discontinuity

Reasoning:
✓ Clear running variable (test score)
✓ Discontinuous jump in treatment probability
❌ Treatment probability doesn't reach 100%
✓ Can use IV approach with cutoff as instrument

Implementation:
- First stage: cutoff predicts scholarship probability
- Second stage: predicted scholarship affects enrollment
- Estimate local average treatment effect (LATE)
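
The two stages above reduce to a ratio of two jumps at the cutoff: the jump in enrollment divided by the jump in scholarship receipt. The sketch below computes that ratio on simulated data. The take-up rates (20% below, 80% above), the bandwidth, and the helper names are illustrative assumptions, not values from CAIS.

```python
import random
import statistics

def intercept(x, y):
    """Intercept of a least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx

def jump_at_cutoff(running, values, cutoff, bandwidth):
    """Discontinuity in `values` at the cutoff, fitting one line per side."""
    left = [(r - cutoff, v) for r, v in zip(running, values)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, v) for r, v in zip(running, values)
             if cutoff <= r <= cutoff + bandwidth]
    return intercept(*zip(*right)) - intercept(*zip(*left))

# Simulated students: crossing 1200 raises scholarship probability from 0.2 to 0.8.
rng = random.Random(3)
n, cutoff, true_effect = 40000, 1200, 0.25
scores, scholarship, enrolled = [], [], []
for _ in range(n):
    s = rng.uniform(1000, 1400)
    got = 1.0 if rng.random() < (0.8 if s >= cutoff else 0.2) else 0.0
    p_enroll = 0.3 + 0.0005 * (s - cutoff) + true_effect * got
    scores.append(s)
    scholarship.append(got)
    enrolled.append(1.0 if rng.random() < p_enroll else 0.0)

first_stage = jump_at_cutoff(scores, scholarship, cutoff, 100)   # jump in treatment
reduced_form = jump_at_cutoff(scores, enrolled, cutoff, 100)     # jump in outcome
late = reduced_form / first_stage
print(f"first stage: {first_stage:.2f}, LATE: {late:.2f}")
```

Dividing by the first-stage jump rescales the outcome discontinuity to the students whose scholarship status actually changes at the cutoff. A small first-stage jump makes this ratio unstable, which parallels the weak-instrument problem.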

Side-by-Side Comparison:

Sharp vs. Fuzzy RDD Comparison
Aspect	Sharp RDD	Fuzzy RDD
Method Selected	Sharp RDD	Fuzzy RDD (IV approach)
Treatment Assignment	Deterministic at cutoff	Probabilistic at cutoff
Identification	Direct comparison at the cutoff	Cutoff used as an instrument
Interpretation	Average treatment effect at the cutoff	Local average treatment effect for compliers at the cutoff
Complexity	Simpler	More complex (two-stage)

—

Comparison 4: Strong vs. Weak Instrument

Scenario: Evaluating the effect of education on earnings

Strong Instrument Version

Dataset Characteristics:
- Distance to college as instrument for education
- Strong first-stage relationship (F > 50)
- Credible exclusion restriction
- Large sample size

        flowchart TD
    A[Strong IV Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|No ✗| F{Continuous treatment}
    F --> G{Instrumental variable?}
    G -->|Yes ✓| H{Strong instrument?}
    H -->|Yes ✓| I[Instrumental Variables]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#ffebee
    style F fill:#fff3e0
    style G fill:#fff3e0
    style H fill:#fff3e0
    style I fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Instrumental Variables

Reasoning:
✓ Strong first-stage relationship (F = 52.3)
✓ Credible exclusion restriction
✓ Handles unmeasured confounding
✓ Large sample provides adequate power

Expected Results:
- Consistent estimates
- Reasonable precision
- Valid inference
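
To see why an instrument helps when confounders are unmeasured, the sketch below compares ordinary least squares with the single-instrument IV estimator, cov(z, y) / cov(z, x), on simulated data. The data-generating process (an unobserved `ability` term, a 0.10 return to education, standardized distance) is entirely assumed for illustration.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def ols_estimate(x, y):
    return cov(x, y) / cov(x, x)

def iv_estimate(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)

# Simulated data: ability raises both education and earnings but is unobserved.
rng = random.Random(4)
n, true_return = 5000, 0.10
distance = [rng.gauss(0, 1) for _ in range(n)]   # instrument (standardized)
ability = [rng.gauss(0, 1) for _ in range(n)]    # unmeasured confounder
education = [13 - 1.0 * d + 1.0 * a + rng.gauss(0, 1)
             for d, a in zip(distance, ability)]
log_earnings = [true_return * e + 0.30 * a + rng.gauss(0, 0.5)
                for e, a in zip(education, ability)]

ols = ols_estimate(education, log_earnings)
iv = iv_estimate(distance, education, log_earnings)
print(f"OLS: {ols:.3f}, IV: {iv:.3f}, true: {true_return}")
```

In this setup OLS overstates the return because ability moves education and earnings together, while the IV estimate uses only the variation in education driven by distance. That works here because the simulated distance affects earnings only through education, which is the exclusion restriction.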

Weak Instrument Version

Dataset Characteristics:
- Same distance to college instrument
- Weak first-stage relationship (F < 10)
- Same exclusion restriction
- Same sample size

        flowchart TD
    A[Weak IV Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|No ✗| F{Continuous treatment}
    F --> G{Instrumental variable?}
    G -->|Yes ✓| H{Strong instrument?}
    H -->|No ✗| I[⚠️ Weak Instrument<br/>Consider Alternatives]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#ffebee
    style F fill:#fff3e0
    style G fill:#fff3e0
    style H fill:#ffebee
    style I fill:#fff3e0

Agent Decision:

⚠️ Weak Instrument Detected: Consider Alternatives

Problems with Weak IV:
❌ First-stage F-statistic = 8.2 (< 10 threshold)
❌ Biased estimates in finite samples
❌ Invalid inference (confidence intervals too narrow)
❌ Sensitive to small violations of exclusion restriction

Recommended Alternatives:
1. Find stronger instruments
2. Use limited information maximum likelihood (LIML)
3. Consider observational methods with rich controls
4. Collect more data to improve first-stage power
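
The strength check that separates the two versions can be sketched as follows. With a single instrument, the first-stage F-statistic is R² / (1 − R²) × (n − 2). The helper names and the simulated first-stage coefficients are assumptions; the threshold of 10 is the one quoted above.

```python
import random
import statistics

def first_stage_f(z, x):
    """F-statistic for one instrument z in a regression of x on z."""
    n = len(z)
    mz, mx = statistics.fmean(z), statistics.fmean(x)
    szz = sum((a - mz) ** 2 for a in z)
    sxx = sum((b - mx) ** 2 for b in x)
    szx = sum((a - mz) * (b - mx) for a, b in zip(z, x))
    r2 = szx ** 2 / (szz * sxx)
    return r2 / (1 - r2) * (n - 2)

def is_weak(f_stat, threshold=10.0):
    """Flag instruments whose first-stage F falls below the threshold."""
    return f_stat < threshold

# Simulated first stages with a large and a tiny coefficient on the instrument.
rng = random.Random(7)
n = 500
distance = [rng.gauss(0, 1) for _ in range(n)]
educ_strong = [0.5 * z + rng.gauss(0, 1) for z in distance]
educ_weak = [0.02 * z + rng.gauss(0, 1) for z in distance]

f_strong = first_stage_f(distance, educ_strong)
f_weak = first_stage_f(distance, educ_weak)
print(f"strong: F = {f_strong:.1f}, weak: F = {f_weak:.1f}")
print(is_weak(52.3), is_weak(8.2))  # False True
```

The last line applies the check to the two F-statistics from this comparison: 52.3 passes and 8.2 is flagged.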

Side-by-Side Comparison:

Strong vs. Weak IV Comparison
Aspect	Strong Instrument	Weak Instrument
First-Stage F	52.3 (strong)	8.2 (weak)
Method Selected	Standard IV	Alternative methods recommended
Bias Risk	Low	High (finite sample bias)
Inference	Valid	Unreliable (tests over-reject, intervals too narrow)
Sensitivity	Robust	Highly sensitive

—

Comparison 5: Good vs. Poor Covariate Overlap

Scenario: Evaluating job training program effectiveness

Good Overlap Version

Dataset Characteristics:
- Observational data with selection bias
- Rich set of baseline characteristics
- Good overlap in covariate distributions
- Treated and control units across full covariate range

        flowchart TD
    A[Good Overlap Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|Yes ✓| F{Instrumental variable?}
    F -->|No ✗| G{Rich covariates?}
    G -->|Yes ✓| H{Good overlap?}
    H -->|Yes ✓| I[Propensity Score<br/>Matching]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#fff3e0
    style F fill:#ffebee
    style G fill:#fff3e0
    style H fill:#fff3e0
    style I fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Propensity Score Matching

Reasoning:
✓ Rich covariates available
✓ Excellent covariate overlap (common support)
✓ Can find good matches for most treated units
✓ Transparent balance assessment

Expected Results:
- High-quality matches
- Good balance on observables
- Credible causal estimates (if unconfoundedness holds)
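
One simple way to quantify "good overlap" is to compare the ranges of estimated propensity scores in the two groups. The sketch below uses a min/max rule on hypothetical scores; it is a crude diagnostic chosen for illustration, and the document does not state which overlap criterion CAIS applies.

```python
def common_support(scores_treated, scores_control):
    """Overlap region of two propensity-score samples and the share of units inside it."""
    low = max(min(scores_treated), min(scores_control))
    high = min(max(scores_treated), max(scores_control))
    all_scores = list(scores_treated) + list(scores_control)
    inside = sum(low <= s <= high for s in all_scores)
    return low, high, inside / len(all_scores)

# Hypothetical propensity scores for the two versions of the job training data.
good = common_support([0.2, 0.4, 0.6, 0.8], [0.1, 0.3, 0.5, 0.7])
poor = common_support([0.7, 0.8, 0.9, 0.95], [0.05, 0.1, 0.2, 0.75])
print(good)  # (0.2, 0.7, 0.75)
print(poor)  # (0.7, 0.75, 0.25)
```

In the first case three quarters of the units fall inside the shared range, so most treated units have comparable controls. In the second only a quarter do, which is the situation the poor overlap version describes.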

Poor Overlap Version

Dataset Characteristics:
- Same observational data structure
- Same rich baseline characteristics
- Poor overlap in covariate distributions
- Treated units concentrated in one region of covariate space

        flowchart TD
    A[Poor Overlap Data] --> B{Is this randomized?}
    B -->|No ✗| C{Panel data available?}
    C -->|No ✗| D{Running variable?}
    D -->|No ✗| E{Binary treatment?}
    E -->|Yes ✓| F{Instrumental variable?}
    F -->|No ✗| G{Rich covariates?}
    G -->|Yes ✓| H{Good overlap?}
    H -->|No ✗| I[Propensity Score<br/>Weighting]

    style A fill:#e3f2fd
    style B fill:#ffebee
    style C fill:#ffebee
    style D fill:#ffebee
    style E fill:#fff3e0
    style F fill:#ffebee
    style G fill:#fff3e0
    style H fill:#ffebee
    style I fill:#e8f5e8

Agent Decision:

🎯 Selected Method: Propensity Score Weighting

Reasoning:
✓ Rich covariates available
❌ Poor covariate overlap (limited common support)
❌ Matching would discard many observations
✓ Weighting keeps the full sample, though extreme weights must be checked

Caveats:
⚠️ Extrapolation required (poor overlap)
⚠️ High variance in weights possible
⚠️ Results may be sensitive to specification
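
The weight-variance caveat can be made concrete with inverse probability weights and an effective sample size. The sketch below is illustrative only: the propensity scores are hypothetical, and the effective sample size formula, (Σw)² / Σw², is one common diagnostic rather than something the document attributes to CAIS.

```python
def ipw_weights(treatment, propensity):
    """Inverse probability weights: 1/p for treated units, 1/(1-p) for controls."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treatment, propensity)]

def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / sum of w^2."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

treatment = [1, 1, 0, 0]

# Good overlap: every unit had an even chance of treatment.
balanced = ipw_weights(treatment, [0.5, 0.5, 0.5, 0.5])
# Poor overlap: one treated unit had only a 5% chance of being treated.
skewed = ipw_weights(treatment, [0.9, 0.05, 0.1, 0.5])

ess_balanced = effective_sample_size(balanced)
ess_skewed = effective_sample_size(skewed)
print(ess_balanced, round(ess_skewed, 2), round(max(skewed), 1))
```

With even propensities all four units count fully. With the skewed scores a single unit carries a weight near 20 and the effective sample size drops below two, so the estimate rests largely on one observation.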

Side-by-Side Comparison:

Good vs. Poor Overlap Comparison
Aspect	Good Overlap	Poor Overlap
Method Selected	Propensity Score Matching	Propensity Score Weighting
Common Support	Excellent	Limited
Sample Usage	High (good matches)	Full sample (with weights)
Extrapolation	Minimal	Substantial
Variance	Lower	Higher (extreme weights)

Key Learning Points

Decision Tree Sensitivity

Small changes in data characteristics can lead to dramatically different method selections:

- Randomization Status: Completely changes the analysis approach
- Data Structure: Panel vs. cross-sectional determines method families
- Instrument Strength: Weak instruments undermine standard IV estimates and inference
- Overlap Quality: Affects choice between matching and weighting
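
The branches traversed in the five comparisons can be collected into a single function. This is a sketch that mirrors only the paths drawn in the flowcharts above; it is not the CAIS implementation. `DatasetProperties`, its field names, and `select_method` are hypothetical, and any path the comparisons do not show returns a placeholder instead of a guess.

```python
from dataclasses import dataclass

NOT_SHOWN = "Path not shown in these comparisons"

@dataclass
class DatasetProperties:
    randomized: bool = False
    rich_covariates: bool = False
    panel: bool = False
    timing_varies: bool = False
    running_variable: bool = False
    sharp_cutoff: bool = False
    fuzzy_cutoff: bool = False
    binary_treatment: bool = True
    has_instrument: bool = False
    strong_instrument: bool = False
    good_overlap: bool = False

def select_method(p: DatasetProperties) -> str:
    """Walk the questions in the order the flowcharts ask them."""
    if p.randomized:
        return "Linear Regression with Covariates" if p.rich_covariates else NOT_SHOWN
    if p.panel:
        return "Difference-in-Differences" if p.timing_varies else NOT_SHOWN
    if p.running_variable:
        if p.sharp_cutoff:
            return "Sharp Regression Discontinuity"
        return "Fuzzy Regression Discontinuity" if p.fuzzy_cutoff else NOT_SHOWN
    if p.has_instrument:
        # Assumption: the strength check applies to binary treatments as well.
        if p.strong_instrument:
            return "Instrumental Variables"
        return "Weak Instrument: Consider Alternatives"
    if p.binary_treatment and p.rich_covariates:
        return "Propensity Score Matching" if p.good_overlap else "Propensity Score Weighting"
    return NOT_SHOWN

# Flipping one property changes the result, as in Comparison 5.
good = DatasetProperties(rich_covariates=True, good_overlap=True)
poor = DatasetProperties(rich_covariates=True, good_overlap=False)
print(select_method(good))  # Propensity Score Matching
print(select_method(poor))  # Propensity Score Weighting
```

Because the checks run in a fixed order, an earlier property such as randomization or panel structure decides the method before later ones are consulted, which is why a single changed characteristic can reroute the whole path.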

Method Hierarchy

CAIS follows a clear hierarchy of method preferences:

1. Experimental Methods: Always preferred when randomization is available
2. Natural Experiments: RDD and strong IV are next best
3. Quasi-Experiments: DiD with credible parallel trends
4. Observational Methods: Matching/weighting with rich covariates
5. Regression Methods: Last resort on observational data, with strong assumptions

Assumption Importance

Different methods rely on different assumptions:

- Randomization: Minimal assumptions, strongest identification
- Parallel Trends: Moderate assumptions, good identification
- Exclusion Restriction: Strong assumptions, requires careful validation
- Unconfoundedness: Very strong assumptions, often untestable

Practical Implications

Understanding these comparisons helps with:

- Study Design: Plan data collection to enable better methods
- Method Selection: Understand why CAIS chooses specific approaches
- Result Interpretation: Know the limitations of your selected method
- Robustness Checking: Test sensitivity across similar methods

Next Steps

- Apply to Your Data: Use these comparisons to understand your method selection
- Design Better Studies: Plan data collection to enable stronger methods
- Validate Assumptions: Check key assumptions for your selected method
- Explore Alternatives: Consider how small data changes might improve identification

Related Resources:
- Case Studies: Detailed case studies by domain
- Method Selection Decision Tree: Complete decision tree documentation
- Dataset Properties and Method Selection Gallery: Visual method selection examples