Chapter 11 Final Words

11.1 Common misconception

  • PS results are ‘causal’;
  • regression results are ‘non-causal’.

No. ‘Results from both methods should lead to the same conclusions.’ (D’Agostino Jr 1998)

When the results deviate, it is important to investigate why!

Establishing causality requires establishing temporality and integrating subject-area expertise.

11.2 Benefits of PS

  • Intuitive: compares two similar groups

  • 2-step process

    • Encourages researchers to think about the treatment generation process.
    • The outcome model is fitted with only the important variables.
    • Allows more thinking at the design stage (a clean separation from the outcome-model-building process); see the sketch after this list.
  • Fit a rich PS model (with higher-order terms): the focus is on prediction, so there is less worry about overparameterization.

  • Reduces dimension, which is helpful when the exposure is frequent but the outcome is rare (the events-per-variable problem).

    • Smaller outcome model may be helpful in diagnostic checks.
  • Diagnostics

    • Balance checking is much easier than regression diagnostics such as residual or influence plots.
    • Graphical comparison helps identify areas of non-overlap.
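
To make the 2-step process concrete, here is a minimal sketch in Python on simulated data. It assumes NumPy and scikit-learn, and uses inverse probability weighting rather than matching in the outcome stage; all variable names and modelling choices are illustrative, not prescriptions from this chapter.

    # Step 1: model treatment assignment; Step 2: use the fitted PS in the
    # outcome stage. Simulated data, illustrative only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    a = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x1 - 0.25 * x2))))  # treatment
    y = 1.0 * a + x1 + 0.5 * x2 + rng.normal(size=n)                # true effect = 1

    # Step 1 (design stage): a rich PS model; prediction is the goal, so
    # higher-order terms and interactions are cheap to include.
    X = np.column_stack([x1, x2, x1 * x2, x1 ** 2, x2 ** 2])
    ps = LogisticRegression(max_iter=1000).fit(X, a).predict_proba(X)[:, 1]
    # Plotting ps[a == 1] against ps[a == 0] (e.g., mirrored histograms)
    # is the graphical check for areas of non-overlap.

    # Step 2 (analysis stage): compare outcomes using PS-based weights.
    w = np.where(a == 1, 1 / ps, 1 / (1 - ps))
    effect = (np.average(y[a == 1], weights=w[a == 1])
              - np.average(y[a == 0], weights=w[a == 0]))
    print(f"IPW estimate: {effect:.2f} (truth: 1.00)")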

11.3 Limitations of PS

  • Matching population vs. target population: often not the same.
    • PS matching may give an effect estimate for a subset that is difficult to identify in the actual population!
  • Matching may discard many subjects from the study!
  • The standardized mean difference (SMD) is very commonly used, but it may not be enough to judge balance; check other useful summaries as well. A sketch of the SMD computation follows this list.
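
Since balance checking via SMDs recurs throughout PS analyses, here is a minimal sketch of the computation in Python; the helper name, the optional-weights interface, and the pooled-SD convention are illustrative assumptions, not definitions from this chapter.

    import numpy as np

    def smd(x, a, w=None):
        """SMD of covariate x between treated (a == 1) and control (a == 0)."""
        w = np.ones_like(x, dtype=float) if w is None else w  # unweighted default
        m1 = np.average(x[a == 1], weights=w[a == 1])
        m0 = np.average(x[a == 0], weights=w[a == 0])
        v1 = np.average((x[a == 1] - m1) ** 2, weights=w[a == 1])
        v0 = np.average((x[a == 0] - m0) ** 2, weights=w[a == 0])
        return (m1 - m0) / np.sqrt((v1 + v0) / 2)  # pooled-SD denominator

    # |SMD| < 0.1 is a common rule of thumb, but SMD only summarizes means:
    # also compare variance ratios and full distributions.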

11.4 When may PS not be useful?

  • When the outcome is common (e.g., the number of events is at least five times the number of available variables), PS may not have any advantage over regression modelling [ref, 17-5; March 20, 2022].
  • PS can do nothing about unmeasured confounding; neither can outcome regression.
    • Consider instrumental variable (IV) approaches.
  • Non-parametric (ML) approaches can be used to relax the linearity assumption when estimating the PS, but variance estimation becomes difficult.
    • Doubly robust methods should be used when non-parametric (ML) approaches are used; see the sketch after this list.
    • For more, see Lee, Lessler, and Stuart (2010); Pirracchio, Petersen, and Van Der Laan (2015); Alam, Moodie, and Stephens (2019); Naimi, Mishler, and Kennedy (2017); and Balzer and Westling (2021).
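
To illustrate the doubly robust idea, here is a minimal AIPW (augmented inverse probability weighting) sketch in Python. The function name and inputs are hypothetical: ps, m1, and m0 are assumed to come from previously fitted PS and outcome models (possibly ML-based).

    import numpy as np

    def aipw_ate(y, a, ps, m1, m0):
        """AIPW estimate of the average treatment effect.

        ps : estimated P(A = 1 | X); m1, m0 : predicted outcomes under
        treatment and under control. The estimator is consistent if either
        the PS model or the outcome model is correct (double robustness).
        """
        psi1 = a * (y - m1) / ps + m1              # term targeting E[Y(1)]
        psi0 = (1 - a) * (y - m0) / (1 - ps) + m0  # term targeting E[Y(0)]
        return np.mean(psi1 - psi0)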

11.5 Software

References

Alam, Shomoita, Erica EM Moodie, and David A Stephens. 2019. “Should a Propensity Score Model Be Super? The Utility of Ensemble Procedures for Causal Adjustment.” Statistics in Medicine 38 (9): 1690–1702.
Balzer, Laura B, and Ted Westling. 2021. “Demystifying Statistical Inference When Using Machine Learning in Causal Research.” American Journal of Epidemiology.
D’Agostino Jr, Ralph B. 1998. “Propensity Score Methods for Bias Reduction in the Comparison of a Treatment to a Non-Randomized Control Group.” Statistics in Medicine 17 (19): 2265–81.
Lee, Brian K, Justin Lessler, and Elizabeth A Stuart. 2010. “Improving Propensity Score Weighting Using Machine Learning.” Statistics in Medicine 29 (3): 337–46.
Naimi, Ashley I, Alan E Mishler, and Edward H Kennedy. 2017. “Challenges in Obtaining Valid Causal Effect Estimates with Machine Learning Algorithms.” arXiv Preprint arXiv:1711.07137.
Pirracchio, Romain, Maya L Petersen, and Mark Van Der Laan. 2015. “Improving Propensity Score Estimators’ Robustness to Model Misspecification Using Super Learner.” American Journal of Epidemiology 181 (2): 108–19.