Extensions
Uses of propensity score matching
Usefulness
- balance covariates
- addressing high-dimensional covariates
- when outcome is rare
- more flexible modelling choices
- ability to assess overlap
- somewhat more intuitive
- explicit assumptions
Beyond answering causal questions in observational studies, propensity score matching can be helpful in the following scenarios:
Covariate adjustment in randomized trials: In some cases, propensity score matching can be applied to randomized controlled trials to adjust for chance imbalances in covariates that may remain despite randomization (Zeng et al. 2021), especially in small samples. However, covariate adjustment in RCTs remains controversial.
Subgroup analysis: Propensity score matching can be used to conduct subgroup analysis, where the goal is to explore treatment effects within specific subgroups defined by one or more covariates (Dong et al. 2020).
Limitations
- multiple steps, too many choices
- less efficient
- target population may shift
- residual confounding still an issue
Propensity score weighting
```mermaid
flowchart LR
  AA((propensity score weighting)) --> A[(Exposure modelling <br/>to estimate <br/>propensity scores)]
  AA --> B[(Converting propensity scores <br/>to <br/>inverse probability weights, <br/>or IPW)]
  AA --> C[(Assess balance <br/>and summary of IPW <br/>in the weighted data)]
  AA --> D[(Outcome <br/>modelling in <br/>the IPW weighted data)]
  style AA fill:#f9f,stroke:#333,stroke-width:4px
```
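The four steps in the diagram above can be sketched end-to-end. The sketch below uses simulated data and scikit-learn's LogisticRegression as the exposure model; these are illustrative choices, not the only options, and the simulated effect size is arbitrary.

```python
# Minimal sketch of the four propensity score weighting steps (ATE version),
# on simulated data with one true confounder. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
L = rng.normal(size=(n, 2))                        # measured covariates
p_true = 1 / (1 + np.exp(-(L[:, 0] + 0.5 * L[:, 1])))
A = rng.binomial(1, p_true)                        # exposure
Y = 1.0 * A + L[:, 0] + rng.normal(size=n)        # outcome; true effect = 1

# Step 1: exposure modelling to estimate propensity scores
ps = LogisticRegression().fit(L, A).predict_proba(L)[:, 1]

# Step 2: convert propensity scores to inverse probability weights (ATE)
w = np.where(A == 1, 1 / ps, 1 / (1 - ps))

# Step 3: assess balance -- weighted standardized mean difference per covariate
def smd(x, a, w):
    m1 = np.average(x[a == 1], weights=w[a == 1])
    m0 = np.average(x[a == 0], weights=w[a == 0])
    s = np.sqrt((x[a == 1].var() + x[a == 0].var()) / 2)
    return (m1 - m0) / s

balance = [smd(L[:, j], A, w) for j in range(L.shape[1])]

# Step 4: outcome modelling in the weighted data (weighted difference in means)
ate = (np.average(Y[A == 1], weights=w[A == 1])
       - np.average(Y[A == 0], weights=w[A == 0]))
print(balance, ate)
```

After weighting, the standardized mean differences should be close to zero and the weighted difference in means should approximate the true effect.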
Average Treatment Effect (ATE)
\[\begin{align*} IPW_i &= \begin{cases} \frac{1}{\text{PS}_i} & \text{if individual } i \text{ is treated (} A_i = 1 \text{)} \\ \frac{1}{1 - \text{PS}_i} & \text{if individual } i \text{ is not treated (} A_i = 0 \text{)} \end{cases} \end{align*}\]

Average Treatment Effect on the Treated (ATT)
\[\begin{align*} IPW_i &= \begin{cases} 1 & \text{if individual } i \text{ is treated (} A_i = 1 \text{)} \\ \frac{\text{PS}_i}{1 - \text{PS}_i} & \text{if individual } i \text{ is not treated (} A_i = 0 \text{)} \end{cases} \end{align*}\]

Gradient Boosted Models
Gradient Boosted Models (GBMs) are an ensemble technique in machine learning: they combine many shallow decision trees (usually described as weak learners) into a more useful predictive model. Overall performance improves iteratively, with each new tree focusing on correcting the errors of the trees before it in the sequence. These models are very promising in real-world data analysis scenarios because they can automatically incorporate nonlinear and interaction terms among the covariates.
Traditional logistic regression models require manual specification of interaction terms, which can be challenging and time-consuming, especially when there are many potential interactions to consider. The TWANG package in R, which uses GBMs to estimate propensity scores, can help overcome this challenge: GBMs capture interaction effects and non-linear relationships among covariates without manual specification. This ability to model interactions without explicit intervention from the analyst makes GBMs an appealing choice for propensity score estimation in complex datasets with potential interactions.
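As a rough illustration (TWANG itself is R-based, and its balance-driven stopping rules are not reproduced here), scikit-learn's GradientBoostingClassifier can play the same role: estimating propensity scores from data where the exposure depends on a nonlinear interaction that the analyst never has to specify, followed by the ATT weights defined above. The simulated data and tuning values are assumptions for the sketch.

```python
# GBM-based propensity scores plus ATT weights, sketched in Python.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
n = 2000
L = rng.normal(size=(n, 2))
# exposure depends on an interaction and a squared term -- neither is
# specified by hand below; the trees discover them
logit = L[:, 0] * L[:, 1] + L[:, 0] ** 2 - 1
A = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# shallow trees (weak learners) added sequentially, each correcting the last
gbm = GradientBoostingClassifier(n_estimators=100, max_depth=2,
                                 learning_rate=0.05)
ps = gbm.fit(L, A).predict_proba(L)[:, 1]

# ATT weights: treated get 1, untreated get PS / (1 - PS)
w_att = np.where(A == 1, 1.0, ps / (1 - ps))
print(ps.min(), ps.max())
```

The resulting weights can then be fed into the same balance assessment and weighted outcome model as in the logistic regression workflow.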