Discussion about Propensity Score

Uses of propensity score matching

Usefulness

balance covariates
addressing high-dimensional covariates
when outcome is rare
more flexible ways to deal with modelling
ability to assess overlap
somewhat more intuitive
explicit assumptions

Beyond answering causal questions in observational studies, propensity score matching can be helpful in the following scenarios:

Covariate adjustment in randomized trials: In some cases, propensity score matching can be applied to randomized controlled trials (RCTs) to adjust for chance imbalances in covariates that may remain after randomization (Zeng et al. 2021), especially when the sample size is small. However, covariate adjustment in RCTs remains controversial.

Subgroup analysis: Propensity score matching can be used to conduct subgroup analysis, where the goal is to explore treatment effects within specific subgroups defined by one or more covariates (Dong et al. 2020).

Limitations

multiple steps, too many choices
less efficient
target population may shift
residual confounding still an issue

Propensity score weighting

flowchart LR
  AA((propensity score weighting)) --> A[(Exposure modelling <br/>to estimate <br/>propensity scores)] 
  AA --> B[(Converting propensity scores  <br/>to  <br/>inverse probability weights,  <br/>or IPW)]
  AA --> C[(Assess balance <br/>and summary of IPW  <br/>in the weighted data)]
  AA --> D[(Outcome <br/>modelling in <br/>the IPW weighted data)]
  style AA fill:#f9f,stroke:#333,stroke-width:4px
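
The last two steps of the flowchart can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the toy data, the weights, and the helper names (`weighted_mean`, `smd`, `effective_sample_size`, `mean_difference`) are all hypothetical, the weights are taken as given from the earlier steps, and the standardized mean difference is only one common balance diagnostic. Software packages differ in how they compute its denominator.

```python
import math

def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def weighted_var(values, weights):
    # Simple weighted variance with the sum of weights as denominator.
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)

def split(values, treated, flag):
    # Keep the entries belonging to one exposure group.
    return [v for v, a in zip(values, treated) if a == flag]

def smd(x, treated, weights):
    """Standardized mean difference of covariate x between exposure groups."""
    x1, w1 = split(x, treated, 1), split(weights, treated, 1)
    x0, w0 = split(x, treated, 0), split(weights, treated, 0)
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd

def effective_sample_size(weights):
    # Summary of the weights: equals n when all weights are equal.
    return sum(weights) ** 2 / sum(w * w for w in weights)

def mean_difference(y, treated, weights):
    """Weighted difference in mean outcome, treated minus untreated."""
    y1, w1 = split(y, treated, 1), split(weights, treated, 1)
    y0, w0 = split(y, treated, 0), split(weights, treated, 0)
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)

# Hypothetical toy data: x is a binary confounder, y = 2 * x + 1 * treated.
x       = [1, 1, 0, 1, 0, 0]
treated = [1, 1, 1, 0, 0, 0]
y       = [3, 3, 1, 2, 0, 0]
ipw     = [1.5, 1.5, 3.0, 3.0, 1.5, 1.5]  # assumed output of the weighting step
ones    = [1.0] * len(x)

smd_before = smd(x, treated, ones)   # imbalance in the raw data
smd_after = smd(x, treated, ipw)     # imbalance in the weighted data
ess = effective_sample_size(ipw)
effect_before = mean_difference(y, treated, ones)
effect_after = mean_difference(y, treated, ipw)

print(f"SMD before: {smd_before:.3f}, after: {smd_after:.3f}, ESS: {ess:.2f}")
print(f"Crude difference: {effect_before:.3f}, weighted: {effect_after:.3f}")
```

In this toy data the weighted groups have the same mean of x, and the weighted mean difference recovers the effect of 1 that was built into y, whereas the crude difference is inflated by confounding.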

Average Treatment Effect (ATE)

\[ IPW_i = \begin{cases} \frac{1}{\text{PS}_i} & \text{if individual } i \text{ is treated (} A_i = 1 \text{)} \\ \frac{1}{1 - \text{PS}_i} & \text{if individual } i \text{ is not treated (} A_i = 0 \text{)} \end{cases} \]
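
As a small worked example of the ATE formula, the sketch below converts propensity scores to weights. The function name `ate_weights` and the four propensity scores are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def ate_weights(ps, treated):
    """ATE weights: 1 / PS for treated, 1 / (1 - PS) for untreated."""
    weights = []
    for p, a in zip(ps, treated):
        if not 0.0 < p < 1.0:
            # The weights are undefined when a propensity score is 0 or 1.
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if a == 1 else 1.0 / (1.0 - p))
    return weights

# Hypothetical propensity scores for two treated and two untreated individuals.
ps = [0.8, 0.5, 0.2, 0.25]
treated = [1, 1, 0, 0]
weights = ate_weights(ps, treated)

for p, a, w in zip(ps, treated, weights):
    print(f"PS = {p:.2f}, A = {a}, IPW = {w:.3f}")
```

A treated individual with a low propensity score, or an untreated individual with a high one, receives a large weight, which is why the distribution of the weights is worth summarizing before outcome modelling.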

Average Treatment Effect on the Treated (ATT)

\[ IPW_i = \begin{cases} 1 & \text{if individual } i \text{ is treated (} A_i = 1 \text{)} \\ \frac{\text{PS}_i}{1 - \text{PS}_i} & \text{if individual } i \text{ is not treated (} A_i = 0 \text{)} \end{cases} \]
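
The ATT weights leave the treated group untouched and reweight the untreated group by the odds of treatment, so that the untreated resemble the treated. The sketch below illustrates this on hypothetical toy data with a single binary covariate. The function name `att_weights` and the assumption that the propensity scores are known exactly are for illustration only.

```python
def att_weights(ps, treated):
    """ATT weights: 1 for treated, PS / (1 - PS) for untreated."""
    weights = []
    for p, a in zip(ps, treated):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if a == 1 else p / (1.0 - p))
    return weights

# Hypothetical toy data: two of the three individuals with x = 1 are treated,
# and one of the three with x = 0, so PS = 2/3 when x = 1 and 1/3 when x = 0.
x = [1, 1, 0, 1, 0, 0]
treated = [1, 1, 1, 0, 0, 0]
ps = [2 / 3 if xi == 1 else 1 / 3 for xi in x]
w = att_weights(ps, treated)

def group_mean(values, weights, flag):
    # Weighted mean of values within one exposure group.
    pairs = [(v, wt) for v, wt, a in zip(values, weights, treated) if a == flag]
    return sum(v * wt for v, wt in pairs) / sum(wt for _, wt in pairs)

treated_mean = group_mean(x, w, 1)                      # unchanged by weighting
control_mean_raw = group_mean(x, [1.0] * len(x), 0)     # before weighting
control_mean_att = group_mean(x, w, 0)                  # after ATT weighting

print(f"treated mean of x: {treated_mean:.3f}")
print(f"untreated mean of x: raw {control_mean_raw:.3f}, weighted {control_mean_att:.3f}")
```

After weighting, the untreated group has the same covariate mean as the treated group, which is the sense in which the target population becomes the treated.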

Gradient Boosted Models

Gradient Boosted Models (GBM) are an ensemble technique used in machine learning. They combine multiple shallow decision trees (usually regarded as weak learners) to create a more useful predictive model. Overall performance improves as weak models are added iteratively, each one focusing on correcting the errors of the previous models in the sequence. These models are promising for real-world data analysis because they can automatically incorporate nonlinear and interaction terms among the covariates.

Traditional logistic regression models require manual specification of interaction terms, which can be challenging and time-consuming, especially when there are many potential interactions to consider. The twang package in R, which uses GBMs to estimate propensity scores, can help overcome this challenge. GBMs can automatically capture interaction effects and nonlinear relationships among covariates without manual specification. This makes GBMs an appealing choice for propensity score estimation in complex datasets with potential interactions.
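
To make the boosting idea concrete, here is a minimal from-scratch sketch in Python. It is not how twang or any production GBM library is implemented. It assumes a single covariate, depth-1 trees (stumps), a fixed number of trees, and log-loss, and the names `fit_stump` and `boost_propensity` as well as the toy data are hypothetical. Each round fits a stump to the current residuals (observed exposure minus predicted probability) and adds a shrunken copy of it to the model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(x, residuals):
    """Least-squares stump on one covariate: (threshold, left value, right value).

    Assumes x contains at least two distinct values.
    """
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def boost_propensity(x, a, n_trees=300, learning_rate=0.3):
    """Estimate P(A = 1 | x) by gradient boosting of stumps on the log-odds scale."""
    p_bar = sum(a) / len(a)
    scores = [math.log(p_bar / (1.0 - p_bar))] * len(x)  # start at the marginal log-odds
    for _ in range(n_trees):
        # Residuals are the negative gradient of the log-loss w.r.t. the scores.
        residuals = [ai - sigmoid(s) for ai, s in zip(a, scores)]
        thr, lv, rv = fit_stump(x, residuals)
        # Add a shrunken copy of the new weak learner to the running model.
        scores = [s + learning_rate * (lv if xi <= thr else rv)
                  for xi, s in zip(x, scores)]
    return [sigmoid(s) for s in scores]

# Hypothetical toy data: exposure is common only for mid-range covariate values,
# a nonlinear pattern that a logistic model linear in x cannot represent.
x = list(range(1, 13))
a = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
ps = boost_propensity(x, a)

for xi, ai, p in zip(x, a, ps):
    print(f"x = {xi:2d}, A = {ai}, estimated PS = {p:.3f}")
```

With one covariate and stumps, the sketch can only pick up a nonlinear shape. Capturing interactions requires deeper trees that split on more than one covariate. Because the toy data are perfectly separated by x, the estimated scores drift toward 0 and 1 as trees are added, which would produce extreme weights; in practice the number of trees and the shrinkage need tuning.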