8  Step 5: Covariates

We select two types of covariates for the next step (analysis using propensity scores or alternative approaches): covariates prioritised by the hdPS algorithm, and investigator-specified covariates.

8.1 Ideal number of prioritised covariates

Based on the calculated absolute log of the bias multiplier (the log-absolute-bias), we select the top \(k\) recurrence covariates to be used later in the hdPS analyses. Below is a plot of the absolute log of the bias multiplier for all recurrence covariates:

We used \(k = 100\) covariates selected by the hdPS algorithm (we call them ‘hdPS covariates’). What should the cutpoint be?

The absolute log of the bias multiplier has a null value of 0. Any value above 0 indicates confounding bias that would be removed by adjusting for the associated recurrence covariate. For large proxy data sources, \(k = 500\) is suggested (Schneeweiss et al. 2009).
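The ranking step can be sketched in a few lines. The sketch below assumes the Bross bias multiplier as described by Schneeweiss et al. (2009), with the covariate-outcome relative risk inverted when it is below 1. The function names, covariate names, and numbers are hypothetical and serve only as illustration; they are not the output of any hdPS package.

```python
import math

def bias_multiplier(p_c1, p_c0, rr_cd):
    """Bross bias multiplier for one binary recurrence covariate.

    p_c1, p_c0: prevalence of the covariate among exposed / unexposed.
    rr_cd: covariate-outcome relative risk.
    """
    # Assumption: protective associations (RR < 1) are inverted before use.
    if rr_cd < 1:
        rr_cd = 1.0 / rr_cd
    return (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)

def top_k_covariates(stats, k):
    """Rank covariates by |log(bias multiplier)| and keep the top k."""
    scores = {
        name: abs(math.log(bias_multiplier(*values)))
        for name, values in stats.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Made-up inputs: (p_c1, p_c0, rr_cd) for three hypothetical covariates.
stats = {
    "rec_dx_A_once": (0.30, 0.10, 2.0),
    "rec_dx_B_freq": (0.20, 0.20, 3.0),
    "rec_rx_C_once": (0.05, 0.15, 0.5),
}
selected, scores = top_k_covariates(stats, k=2)
print(selected)
```

A covariate that is equally prevalent in both exposure groups (the second one here) has a bias multiplier of 1 and therefore a score of 0, so it is ranked last regardless of how strongly it is associated with the outcome.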

8.2 Investigator-specified covariates

We select 25 investigator-specified covariates from the variables in the DAG that are available in the data set.

We should also add any necessary interactions between these investigator-specified covariates, or other useful model specifications (e.g., polynomial terms).
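As a rough illustration of what adding such terms means in practice, the sketch below appends a product term and a squared term to each record. The helper `add_terms` and the variable names (`age`, `diabetes`, `bmi`) are hypothetical and are not taken from the data set used here.

```python
def add_terms(rows, interactions=(), squares=()):
    """Return copies of the rows with product and squared terms added."""
    expanded = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        for a, b in interactions:
            new_row[f"{a}:{b}"] = row[a] * row[b]
        for v in squares:
            new_row[f"{v}^2"] = row[v] ** 2
        expanded.append(new_row)
    return expanded

# Two made-up records with hypothetical covariate names.
rows = [
    {"age": 50, "diabetes": 1, "bmi": 30.0},
    {"age": 40, "diabetes": 0, "bmi": 22.5},
]
expanded = add_terms(rows, interactions=[("age", "diabetes")], squares=["bmi"])
print(expanded[0])
```

Which interactions or polynomial terms to include is a subject-matter decision; the code only shows the mechanics.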

Hypothesized directed acyclic graph (DAG), drawn based on the analyst’s best understanding of the literature

  • 14 demographic, behavioral, health history related variables
    • Mostly categorical
  • 11 lab variables
    • Mostly continuous

8.3 hdPS model

The hdPS model estimates the probability of exposure given both sets of covariates, where C = investigator-specified covariates and EC = hdPS covariates (Schneeweiss et al. 2009).
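One common way to write such a model, assuming the propensity score is estimated by logistic regression, is shown below. The exposure symbol \(A\) and the coefficient names \(\alpha_0\), \(\alpha_C\), and \(\alpha_{EC}\) are notation introduced here for illustration only.

\[
\operatorname{logit} P(A = 1 \mid C, EC) = \alpha_0 + \alpha_C^{\top} C + \alpha_{EC}^{\top} EC
\]

Under this assumption, the fitted probability \(\hat{P}(A = 1 \mid C, EC)\) for each subject is that subject's hdPS.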

The hdPS can then be used as a matching, weighting, or stratifying variable, or as a covariate (usually in deciles) in the outcome model.
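Two of these uses, decile stratification and weighting, can be sketched directly from a vector of estimated scores. The sketch below assumes the scores have already been estimated; the function names and the numbers are made up for illustration, and the weights shown are unstabilized inverse probability weights, which is only one of several weighting schemes.

```python
def ps_deciles(ps):
    """Assign each propensity score to a decile (1 to 10) by rank.

    Tied scores are split by their order in the input, which a real
    analysis may want to handle differently.
    """
    n = len(ps)
    order = sorted(range(n), key=lambda i: ps[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles

def ipw_weights(exposure, ps):
    """Unstabilized inverse probability weights: 1/ps if exposed, 1/(1-ps) otherwise."""
    return [1 / p if a == 1 else 1 / (1 - p) for a, p in zip(exposure, ps)]

# Made-up scores and exposure indicators for ten subjects.
ps = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
exposure = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

deciles = ps_deciles(ps)
weights = ipw_weights(exposure, ps)
print(deciles)
```

The decile labels can enter the outcome model as a categorical covariate or define strata, while the weights would be passed to a weighted outcome model.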