hdPS and its machine learning extensions in residual confounding control

Author

Ehsan Karim, School of Population and Public Health, UBC

Published

June 4, 2023

Background

Retrospective health care claims datasets are frequently criticized for lacking complete information on potential confounders. As a result, treatment effects estimated from such data sources may be subject to residual confounding. Electronic administrative records routinely collect a large volume of health-related information, much of which is usually not considered in conventional pharmacoepidemiological studies.

Proposal to reduce residual confounding bias

In 2009, a high-dimensional propensity score (hdPS) algorithm was proposed that uses such information as surrogates or proxies for mismeasured and unobserved confounders in an effort to reduce residual confounding bias. Since then, many machine learning and semi-parametric extensions of this algorithm have been proposed to make better use of the wealth of high-dimensional proxy information.

Schneeweiss et al. (2009)

Purpose of the workshop

This workshop will

  1. demonstrate the logic, steps, and implementation guidelines of hdPS using an open data source as an example (with reproducible R code),
  2. familiarize participants with the difference between the propensity score and hdPS,
  3. explain the rationale for using the machine learning extensions of hdPS, along with their statistical properties, and
  4. discuss the advantages and controversies of hdPS, and the reporting guidelines to follow when writing a manuscript.

Workshop prerequisite

Attendees should have prior knowledge of multiple regression analysis and a working knowledge of R (e.g., basic data manipulation and regression fitting).

R Code

R code for data creation and hdPS analysis can be found in the GitHub repo (codes directory).

Version history

Different versions and updates of the materials were presented in the following sessions:

Additional relevant talks (selected):

Citation

How to cite

Karim, ME. (2023). High-dimensional propensity score and its machine learning extensions in residual confounding control in pharmacoepidemiologic studies. Zenodo. DOI: 10.5281/zenodo.7894083.

Comments

For any comments regarding this document, reach out to me.