Rethinking Residual Confounding Bias Reduction: Why Vanilla hdPS Alone is No Longer Enough

7.1 Bross formula

The Bross formula (Bross 1966; Schneeweiss 2006) for the Bias Multiplier assesses potential bias by combining two things: the imbalance in the prevalence of the unmeasured confounder between the exposure groups, and the association between the confounder and the outcome.

To calculate the bias contributed by not adjusting for a covariate, we need to make an educated guess (i.e., an assumption) about 3 components:

prevalence of a binary unmeasured confounder (\(U\)) among exposed (\(P_{UA_1}\))
prevalence of that binary unmeasured confounder among unexposed (\(P_{UA_0}\))
association between that binary unmeasured confounder and the outcome, i.e., the risk of the outcome among those with versus without \(U\) (\(RR_{UY} = \frac{P(Y = 1 \mid U = 1)}{P(Y = 1 \mid U = 0)}\))

The above components let us calculate the amount of bias (known as the ‘Bias Multiplier’) that arises when we omit adjusting for \(U\), using the Bross formula:

\[\text{Bias}_U = \frac{P_{UA_1} (RR_{UY} - 1) + 1}{P_{UA_0} (RR_{UY} - 1) + 1}\]

This formula is helpful for understanding the impact of unmeasured confounding by a binary variable. Because \(U\) is unmeasured, we have to plug in assumed values for its prevalence in each exposure group and for its risk ratio with the outcome.
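As a quick illustration, the formula can be wrapped in a small R helper. This is a minimal sketch: the function name `bross_bias` and the input values (40% prevalence among exposed, 20% among unexposed, risk ratio of 2) are assumptions chosen for illustration, not values from any study.

```r
# Hypothetical helper: Bross bias multiplier for a binary confounder.
# p_a1: prevalence of the confounder among the exposed
# p_a0: prevalence of the confounder among the unexposed
# rr:   confounder-outcome risk ratio
bross_bias <- function(p_a1, p_a0, rr) {
  (p_a1 * (rr - 1) + 1) / (p_a0 * (rr - 1) + 1)
}

# Assumed values, for illustration only
bias_u <- bross_bias(p_a1 = 0.4, p_a0 = 0.2, rr = 2)
bias_u  # (0.4 * 1 + 1) / (0.2 * 1 + 1) = 1.4 / 1.2, about 1.17
```

Under these assumed inputs, omitting \(U\) would inflate the estimated risk ratio by roughly 17%. If the prevalences are equal, or if the risk ratio is 1, the multiplier is exactly 1 (no bias).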

7.2 Calculating bias from a recurrence covariate

For recurrence covariates (\(R\)), we do not need to make assumptions, because they are measured in the data. We simply plug in \(R\) instead of \(U\) in the following calculations:

prevalence of a binary recurrence variable among exposed (\(P_{RA_1}\))
prevalence of that binary recurrence variable among unexposed (\(P_{RA_0}\))
association between that binary recurrence variable and the outcome, i.e., the risk of the outcome among those with versus without \(R\) (\(RR_{RY} = \frac{P(Y = 1 \mid R = 1)}{P(Y = 1 \mid R = 0)}\))

These components let us empirically calculate the amount of bias:

\[\text{Bias}_R = \frac{P_{RA_1} (RR_{RY} - 1) + 1}{P_{RA_0} (RR_{RY} - 1) + 1}\]

Here, \(RR_{RY}\) is the crude risk ratio between the recurrence covariate and the outcome, \(Y\) is the outcome, \(A\) is the exposure, and \(R\) is a recurrence covariate.
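The empirical version can be sketched in a few lines of R. The vectors below are toy data invented for illustration (they are not from the chapter's dataset), and the variable names are assumptions.

```r
# Toy data, invented for illustration (10 patients)
A <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # exposure
R <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 1)  # binary recurrence covariate
Y <- c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0)  # outcome

p_ra1 <- mean(R[A == 1])                     # prevalence of R among exposed
p_ra0 <- mean(R[A == 0])                     # prevalence of R among unexposed
rr_ry <- mean(Y[R == 1]) / mean(Y[R == 0])   # crude risk ratio of Y by R

# Bross bias multiplier using the observed quantities
bias_r <- (p_ra1 * (rr_ry - 1) + 1) / (p_ra0 * (rr_ry - 1) + 1)
```

Note that the crude risk ratio is undefined when no one without the covariate has the outcome; a real implementation would need to handle such sparse cells.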

Because these quantities are observed, we can calculate the absolute log of the bias, \(|\log(\text{Bias}_R)|\), for every recurrence covariate (Schneeweiss et al. 2009). For each data dimension, we can then rank the recurrence covariates by the amount of bias (confounding or imbalance) each could likely adjust for.

7.3 Calculating bias from all recurrence covariates

In our example, we simply plug in each recurrence covariate one by one to calculate \(|\log(\text{Bias}_R)|\):

R=rec_dx_D64_once
R=rec_dx_D75_sporadic
…
R=rec_dx_E07_frequent
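The one-by-one calculation can be sketched as a loop over covariate columns. This is a minimal illustration under stated assumptions: the data frame `covs`, its column names (`cov1` to `cov3`), and the helper `bias_one` are hypothetical and the data are invented; the actual analysis uses the package function shown in the next subsection.

```r
# Toy data, invented for illustration (10 patients)
A <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # exposure
Y <- c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0)  # outcome
covs <- data.frame(
  cov1 = c(1, 1, 1, 0, 1, 0, 0, 0, 0, 1),
  cov2 = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
  cov3 = c(0, 0, 1, 0, 1, 1, 1, 0, 1, 0)
)

# Hypothetical helper: Bross bias multiplier for one binary covariate
bias_one <- function(R, A, Y) {
  p1 <- mean(R[A == 1])                    # prevalence among exposed
  p0 <- mean(R[A == 0])                    # prevalence among unexposed
  rr <- mean(Y[R == 1]) / mean(Y[R == 0])  # crude covariate-outcome risk ratio
  (p1 * (rr - 1) + 1) / (p0 * (rr - 1) + 1)
}

# Apply the same calculation to every covariate column
bias_all <- sapply(covs, bias_one, A = A, Y = Y)
abs_log_bias_all <- abs(log(bias_all))
```

In this toy data, `cov2` is balanced across exposure groups and unrelated to the outcome, so its multiplier is 1 and its absolute log bias is 0.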

7.4 Obtain absolute log of bias

We calculate \(|\log(\text{Bias}_R)|\) for all recurrence covariates.

The absolute log of the Bias Multiplier, \(|\log(\text{Bias}_R)|\), is a symmetric measure of the potential bias introduced by a recurrence covariate: upward and downward bias of the same relative magnitude receive the same value. This makes it easier to compare and rank recurrence covariates.
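A short check of the symmetry property, using assumed multipliers (1.25 and its reciprocal 0.8) picked only for illustration:

```r
bias_up <- 1.25        # assumed multiplier: inflates the estimate
bias_down <- 1 / 1.25  # its reciprocal (0.8): deflates by the same factor

# On the raw scale the two sit at different distances from 1 (0.25 vs 0.20),
# but on the absolute log scale they coincide.
abs_log_up <- abs(log(bias_up))
abs_log_down <- abs(log(bias_down))
isTRUE(all.equal(abs_log_up, abs_log_down))
```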

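library(autoCovariateSelection)

# Rank recurrence covariates by the absolute log of the Bross bias multiplier
# k: number of top-ranked covariates to keep
out3 <- get_prioritised_covariates(df = out2,
                                   patientIdVarname = "idx",
                                   exposureVector = basetable$exposure,
                                   outcomeVector = basetable$outcome,
                                   patientIdVector = patientIds,
                                   k = 100)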

This returns the absolute log of the multiplicative bias for each recurrence covariate (from the univariate Bross formula). We can use this information to prioritize recurrence covariates in the next step.

7.5 Convert to Absolute log of multiplicative bias

Here are a few covariates and their associated absolute log of the multiplicative bias:

rec_dx_I10_once : 0.115
rec_dx_R73_once : 0.088
rec_dx_I10_frequent : 0.068
rec_dx_R60_once : 0.054
rec_dx_E78_once : 0.038
rec_dx_M79_once : 0.017
rec_dx_E87_once : 0.015
rec_dx_I51_once : 0.013
rec_dx_I50_once : 0.011
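To show how such values feed into prioritization, the listed numbers can be put in a named vector and sorted. This is only a sketch: the vector is typed in by hand from the list above, and keeping the top 3 is an arbitrary choice for illustration.

```r
# Values copied from the list above
abs_log_bias <- c(
  rec_dx_I10_once     = 0.115,
  rec_dx_R73_once     = 0.088,
  rec_dx_I10_frequent = 0.068,
  rec_dx_R60_once     = 0.054,
  rec_dx_E78_once     = 0.038,
  rec_dx_M79_once     = 0.017,
  rec_dx_E87_once     = 0.015,
  rec_dx_I51_once     = 0.013,
  rec_dx_I50_once     = 0.011
)

# Rank from largest to smallest potential bias and keep the top 3
ranked <- sort(abs_log_bias, decreasing = TRUE)
top3 <- names(head(ranked, 3))

# Back-transform to the multiplier scale; because the sign of the log is
# dropped, each value corresponds to a multiplier of either m or 1/m.
multiplier_scale <- exp(ranked)
```

For example, an absolute log bias of 0.115 corresponds to a multiplier of about 1.12 (or its reciprocal).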

And here is the translated table with descriptions:

Hypertension (once) : 0.115
Elevated blood glucose level : 0.088
Hypertension (frequent) : 0.068
Edema : 0.054
Pure hypercholesterolemia : 0.038
Musculoskeletal pain : 0.017
Hypokalemia : 0.015
Heart disease : 0.013
Heart failure : 0.011

(Choi and Shi 2001)

SMD vs Bias multiplier

The standardized mean difference (SMD) is useful for assessing covariate balance in the propensity score literature, but it uses only exposure and covariate information. The Bross formula, in contrast, also incorporates outcome information. When investigating empirical covariates or recurrence covariates whose interpretations are unknown, it may be safer to use the multiplicative bias term from the Bross formula to identify proxy covariates that are helpful in predicting the outcome.
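The contrast can be made concrete with a small sketch. The helper names and the input values (prevalences of 0.4 and 0.2, risk ratios of 1 and 2) are assumptions for illustration; the SMD uses the usual formula for a binary covariate.

```r
# Hypothetical helpers, written for this comparison only
smd_binary <- function(p1, p0) {
  (p1 - p0) / sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
}
abs_log_bross <- function(p1, p0, rr) {
  abs(log((p1 * (rr - 1) + 1) / (p0 * (rr - 1) + 1)))
}

# Assumed prevalences of a binary covariate among exposed and unexposed
p1 <- 0.4
p0 <- 0.2

# SMD depends only on the imbalance between exposure groups
smd <- smd_binary(p1, p0)

# The Bross term also depends on the covariate-outcome risk ratio
no_outcome_link <- abs_log_bross(p1, p0, rr = 1)  # unrelated to the outcome
outcome_link <- abs_log_bross(p1, p0, rr = 2)     # associated with the outcome
```

Here the SMD flags the same imbalance in both scenarios, whereas the Bross term is 0 when the covariate is unrelated to the outcome and positive when it is associated with the outcome.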

(Stuart, Lee, and Leacy 2013)