7 Step 4: Prioritize
7.1 Bross formula
The Bross formula (Bross 1966; Schneeweiss 2006) for the Bias Multiplier assesses the potential bias from not adjusting for a covariate. It considers both the imbalance in the prevalence of the unmeasured confounder between the exposure groups and the association between the confounder and the outcome.
To use the formula, we need to make an educated guess (i.e., an assumption) about 3 components:
- prevalence of a binary unmeasured confounder (\(U\)) among exposed (\(P_{UA_1}\))
- prevalence of that binary unmeasured confounder among unexposed (\(P_{UA_0}\))
- association between that binary unmeasured confounder and the outcome, expressed as a risk ratio (\(RR_{UY} = \frac{P(Y = 1 \mid U = 1)}{P(Y = 1 \mid U = 0)}\))
From these components, the Bross formula gives the amount of bias (known as the 'Bias Multiplier') that results when we omit adjustment for \(U\):
\[\text{Bias}_U = \frac{P_{UA_1} (RR_{UY} - 1) + 1}{P_{UA_0} (RR_{UY} - 1) + 1}\]
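This formula is helpful for understanding the impact of unmeasured confounding by a binary variable. We supply an assumed prevalence of the confounder in each exposure group and an assumed risk ratio between the confounder and the outcome. A Bias Multiplier of 1 means no bias, which happens when the prevalence is the same in both groups or when \(RR_{UY} = 1\).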
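As a minimal sketch of the formula above, the Bias Multiplier can be written as a small R function. The function name `bross_bias`, its argument names, and the input values are hypothetical and chosen only for illustration; they are not estimates from any data.

```r
# Bias Multiplier from the Bross formula for a binary confounder.
# p_exposed:   prevalence of the confounder among the exposed
# p_unexposed: prevalence of the confounder among the unexposed
# rr_uy:       risk ratio between the confounder and the outcome
bross_bias <- function(p_exposed, p_unexposed, rr_uy) {
  (p_exposed * (rr_uy - 1) + 1) / (p_unexposed * (rr_uy - 1) + 1)
}

# Assumed (illustrative) inputs: the confounder is twice as common among
# the exposed and doubles the risk of the outcome
bias_u <- bross_bias(p_exposed = 0.4, p_unexposed = 0.2, rr_uy = 2)
bias_u
```

With these assumed inputs the calculation is \((0.4 \times 1 + 1) / (0.2 \times 1 + 1) = 1.4 / 1.2 \approx 1.17\), i.e., the unadjusted estimate would be inflated by roughly 17%.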
7.2 Calculating bias from a recurrence covariate
For recurrence covariates (\(R\)), no assumption is needed. Because \(R\) is observed, we plug in \(R\) instead of \(U\) and estimate the same components from the data:
- prevalence of a binary recurrence variable among exposed (\(P_{RA_1}\))
- prevalence of that binary recurrence variable among unexposed (\(P_{RA_0}\))
- association between that binary recurrence variable and the outcome, expressed as a risk ratio (\(RR_{RY} = \frac{P(Y = 1 \mid R = 1)}{P(Y = 1 \mid R = 0)}\))
These components let us calculate the amount of bias empirically:
\[\text{Bias}_R = \frac{P_{RA_1} (RR_{RY} - 1) + 1}{P_{RA_0} (RR_{RY} - 1) + 1}\]
Here, \(RR_{RY}\) is the crude risk ratio between the recurrence covariate and the outcome, \(Y\) is the outcome, \(A\) is the exposure, and \(R\) is a recurrence covariate.
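The empirical version can be sketched directly from three vectors. The toy vectors `a`, `r`, and `y` below are hypothetical and serve only to show where each component comes from; this is not the internal code of any package.

```r
# Hypothetical toy data for 10 patients
a <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)  # exposure A
r <- c(1, 1, 1, 0, 0, 1, 0, 0, 0, 0)  # binary recurrence covariate R
y <- c(1, 1, 0, 1, 0, 1, 0, 0, 0, 0)  # outcome Y

p_ra1 <- mean(r[a == 1])  # prevalence of R among exposed
p_ra0 <- mean(r[a == 0])  # prevalence of R among unexposed

# Crude risk ratio: outcome risk when R = 1 divided by outcome risk when R = 0
rr_ry <- mean(y[r == 1]) / mean(y[r == 0])

# Bias Multiplier for R, following the formula above
bias_r <- (p_ra1 * (rr_ry - 1) + 1) / (p_ra0 * (rr_ry - 1) + 1)
bias_r
```

In real data a covariate can have no outcome events in one of its levels, which makes the crude risk ratio zero or undefined; the sketch does not handle that case.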
Since every component is estimated from the data, we can calculate the absolute log of the Bias Multiplier, \(\lvert \log(\text{Bias}_R) \rvert\) (the log-absolute-bias), for all of the recurrence covariates (Schneeweiss et al. 2009). Within each data dimension, we can then rank the recurrence covariates by the amount of bias (confounding or imbalance) each could likely adjust for.
7.3 Calculating bias from all recurrence covariates
In our example, we simply plug in each recurrence covariate one by one to calculate its \(\text{log-absolute-bias}\):
- R = rec_dx_D64_once
- R = rec_dx_D75_sporadic
- …
- R = rec_dx_E07_frequent
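Plugging in the covariates one by one amounts to applying the same calculation to every recurrence column. The following is a rough sketch on a hypothetical toy data frame; the column names `rec_a`, `rec_b`, `rec_c` and the helper `log_abs_bias` are made up for illustration and do not come from the example data.

```r
# Hypothetical toy data: exposure, outcome, and three binary recurrence covariates
dat <- data.frame(
  exposure = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  outcome  = c(1, 1, 0, 1, 0, 1, 0, 0, 0, 0),
  rec_a    = c(1, 1, 1, 0, 0, 1, 0, 0, 0, 0),
  rec_b    = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
  rec_c    = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 1)
)

# Absolute log of the Bross Bias Multiplier for one binary covariate r
log_abs_bias <- function(r, a, y) {
  p1 <- mean(r[a == 1])                    # prevalence among exposed
  p0 <- mean(r[a == 0])                    # prevalence among unexposed
  rr <- mean(y[r == 1]) / mean(y[r == 0])  # crude risk ratio with the outcome
  abs(log((p1 * (rr - 1) + 1) / (p0 * (rr - 1) + 1)))
}

# Apply the calculation to each recurrence covariate in turn
rec_names <- c("rec_a", "rec_b", "rec_c")
bias_all <- sapply(dat[rec_names], log_abs_bias,
                   a = dat$exposure, y = dat$outcome)

# Rank covariates from most to least potential bias
sort(bias_all, decreasing = TRUE)
```

Here `rec_c` has the same prevalence in both exposure groups, so its Bias Multiplier is 1 and it ranks last regardless of how strongly it is associated with the outcome.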
7.4 Obtain log of absolute-bias
We calculate the \(\text{log-absolute-bias}\) for all recurrence covariates.
The absolute log of the Bias Multiplier, \(\lvert \log(\text{Bias}_R) \rvert\), is a symmetric measure of the potential bias introduced by the recurrence covariate: upward and downward bias of the same relative size get the same value, which makes it easier to compare and rank recurrence covariates.
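The symmetry can be checked with two illustrative Bias Multipliers (assumed values, not taken from the data): one that doubles the estimate and one that halves it.

```r
bias_up   <- 2    # assumed multiplier: estimate inflated two-fold
bias_down <- 0.5  # assumed multiplier: estimate deflated two-fold

# On the raw scale the two are not equally far from the "no bias" value of 1
abs(bias_up - 1)
abs(bias_down - 1)

# On the absolute log scale they are treated as equally large
lab_up   <- abs(log(bias_up))
lab_down <- abs(log(bias_down))
all.equal(lab_up, lab_down)
```

A covariate with no potential bias has a multiplier of 1 and therefore an absolute log of 0, so larger values always mean more potential bias.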
# Rank recurrence covariates by the absolute log of the Bross Bias Multiplier
out3 <- get_prioritised_covariates(df = out2,
                                   patientIdVarname = "idx",
                                   exposureVector = basetable$exposure,
                                   outcomeVector = basetable$outcome,
                                   patientIdVector = patientIds,
                                   k = 100)
This returns the absolute log of the multiplicative bias for each recurrence covariate (from the univariate Bross formula). We can use this information to prioritize recurrence covariates in the next step.
7.5 Convert to Absolute log of multiplicative bias
Here are a few covariates and their associated absolute log of the multiplicative bias:

| Recurrence covariate | Absolute log of multiplicative bias |
| --- | --- |
| rec_dx_I10_once | 0.115 |
| rec_dx_R73_once | 0.088 |
| rec_dx_I10_frequent | 0.068 |
| rec_dx_R60_once | 0.054 |
| rec_dx_E78_once | 0.038 |
| rec_dx_M79_once | 0.017 |
| rec_dx_E87_once | 0.015 |
| rec_dx_I51_once | 0.013 |
| rec_dx_I50_once | 0.011 |
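And here is the same table with each diagnosis code translated to its description:

| Description | Absolute log of multiplicative bias |
| --- | --- |
| Hypertension (once) | 0.115 |
| Elevated blood glucose level (once) | 0.088 |
| Hypertension (frequent) | 0.068 |
| Edema (once) | 0.054 |
| Pure hypercholesterolemia (once) | 0.038 |
| Musculoskeletal pain (once) | 0.017 |
| Hypokalemia (once) | 0.015 |
| Heart disease (once) | 0.013 |
| Heart failure (once) | 0.011 |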
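The translation from covariate names to descriptions can be sketched as a simple lookup. The values and descriptions are the ones listed above; the object names `bias_tab` and `code_desc` are hypothetical, and the code assumes that covariate names follow the pattern `rec_dx_<code>_<level>`, as they do in this listing.

```r
# Values copied from the listing above
bias_tab <- data.frame(
  covariate = c("rec_dx_I10_once", "rec_dx_R73_once", "rec_dx_I10_frequent",
                "rec_dx_R60_once", "rec_dx_E78_once", "rec_dx_M79_once",
                "rec_dx_E87_once", "rec_dx_I51_once", "rec_dx_I50_once"),
  log_abs_bias = c(0.115, 0.088, 0.068, 0.054, 0.038, 0.017, 0.015, 0.013, 0.011)
)

# Lookup from 3-character diagnosis code to the description used above
code_desc <- c(
  I10 = "Hypertension",
  R73 = "Elevated blood glucose level",
  R60 = "Edema",
  E78 = "Pure hypercholesterolemia",
  M79 = "Musculoskeletal pain",
  E87 = "Hypokalemia",
  I51 = "Heart disease",
  I50 = "Heart failure"
)

# Extract the code and the recurrence level from each covariate name
bias_tab$code  <- sub("^rec_dx_([A-Z][0-9]+)_.*$", "\\1", bias_tab$covariate)
bias_tab$level <- sub("^rec_dx_[A-Z][0-9]+_", "", bias_tab$covariate)
bias_tab$description <- unname(code_desc[bias_tab$code])

bias_tab[, c("description", "level", "log_abs_bias")]
```

Keeping the recurrence level next to the description distinguishes rows that share a diagnosis code, such as the two hypertension covariates.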