8 Step 5: Covariates
We select two types of covariates for the next step, in which they are analyzed using propensity scores or alternative approaches:
8.1 Ideal number of prioritised covariates
Based on the calculated absolute log of the bias multiplier, we select the top \(k\) recurrence covariates to use in the hdPS analyses later. Below is a plot of the absolute log of the bias multiplier for all recurrence covariates:
We used \(k = 100\) covariates selected by the hdPS algorithm (we call them ‘hdPS covariates’). What should be the cutpoint?
The absolute log of the bias multiplier has a null value of 0. Any value above 0 indicates confounding bias that would be removed by adjusting for the associated recurrence covariate. For large proxy data sources, \(k = 500\) is suggested (Schneeweiss et al. 2009).
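The ranking step can be sketched as follows. This is a minimal illustration, not the code used for the analysis here: it assumes binary recurrence covariates and the Bross bias multiplier as described in Schneeweiss et al. (2009). The function names, covariate names, and all numbers are hypothetical.

```python
import math

def abs_log_bias(p_c1, p_c0, rr_cd):
    """Absolute log of the Bross bias multiplier for one binary covariate.

    p_c1:  prevalence of the covariate among the exposed
    p_c0:  prevalence of the covariate among the unexposed
    rr_cd: covariate-outcome relative risk
    """
    # For protective associations (rr_cd < 1), use the reciprocal.
    rr = rr_cd if rr_cd >= 1 else 1.0 / rr_cd
    multiplier = (p_c1 * (rr - 1) + 1) / (p_c0 * (rr - 1) + 1)
    return abs(math.log(multiplier))

def top_k(covariates, k):
    """Return the names of the k covariates with the largest absolute log bias."""
    ranked = sorted(
        covariates,
        key=lambda c: abs_log_bias(c["p_c1"], c["p_c0"], c["rr_cd"]),
        reverse=True,
    )
    return [c["name"] for c in ranked[:k]]

# Hypothetical recurrence covariates (made-up values, for illustration only).
example = [
    {"name": "rec_dx_A_once", "p_c1": 0.30, "p_c0": 0.10, "rr_cd": 2.0},
    {"name": "rec_dx_B_freq", "p_c1": 0.20, "p_c0": 0.20, "rr_cd": 3.0},
    {"name": "rec_dx_C_spor", "p_c1": 0.05, "p_c0": 0.15, "rr_cd": 0.5},
]
selected = top_k(example, k=2)
```

In this toy input, the second covariate is equally prevalent in both exposure groups, so its multiplier is 1 and its absolute log is 0 (the null value). It is ranked last even though it is strongly associated with the outcome.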
8.2 Investigator-specified covariates
\(25\) investigator-specified covariates are selected based on variables in the DAG that are available in the data set:
- 14 demographic, behavioral, health history related variables
  - Mostly categorical
- 11 lab variables
  - Mostly continuous
We should also add necessary interactions of these investigator-specified covariates, or add other useful model specifications (e.g., polynomials).
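Interaction and polynomial terms for the investigator-specified covariates could be built along these lines. This is only a sketch: the variable names (`age`, `bmi`, `smoker`) and the helper `expand_terms` are hypothetical and do not refer to the actual variables in the data set.

```python
def expand_terms(row, squared, interactions):
    """Return a copy of `row` with squared terms and pairwise products added.

    row:          dict mapping covariate name to a numeric value
    squared:      names of continuous covariates to square
    interactions: pairs of covariate names to multiply together
    """
    out = dict(row)  # leave the input row untouched
    for name in squared:
        out[f"{name}_sq"] = row[name] ** 2
    for a, b in interactions:
        out[f"{a}_x_{b}"] = row[a] * row[b]
    return out

# Hypothetical subject with one binary and two continuous covariates.
row = {"age": 50.0, "bmi": 30.0, "smoker": 1}
expanded = expand_terms(row, squared=["age"], interactions=[("age", "smoker")])
```

Which terms to include is a subject-matter decision guided by the DAG. The code only makes the chosen specification explicit.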
8.3 hdPS model
Here, C denotes the investigator-specified covariates and EC denotes the hdPS covariates (Schneeweiss et al. 2009).
The hdPS can then be used as a matching, weighting, or stratifying variable, or as a covariate (usually in deciles) in the outcome model.
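Two of these uses, weighting and decile stratification, can be sketched as below. The sketch assumes the hdPS has already been estimated for each subject (for example, from a logistic regression of exposure on C and EC). The scores, exposure indicators, and function names are hypothetical. The weights shown are the standard inverse probability of treatment weights for the average treatment effect, which is one possible choice rather than necessarily the one used in this analysis.

```python
def ate_weights(treated, ps):
    """Inverse probability weights: 1/ps if treated, 1/(1 - ps) otherwise."""
    return [1.0 / p if a == 1 else 1.0 / (1.0 - p) for a, p in zip(treated, ps)]

def decile_strata(ps):
    """Assign each score to a decile (1..10) by rank, giving near-equal strata."""
    n = len(ps)
    order = sorted(range(n), key=lambda i: ps[i])  # indices from lowest to highest score
    strata = [0] * n
    for rank, i in enumerate(order):
        strata[i] = rank * 10 // n + 1
    return strata

# Hypothetical estimated scores for 20 subjects, all strictly between 0 and 1.
ps = [0.05 + 0.045 * i for i in range(20)]
treated = [i % 2 for i in range(20)]

weights = ate_weights(treated, ps)
strata = decile_strata(ps)
```

The decile indicator can then enter the outcome model as a categorical covariate, or the weights can be supplied to a weighted outcome model. Scores very close to 0 or 1 produce extreme weights, so the weight distribution is worth inspecting before use.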