flowchart LR
  S(Super Learner) --> l(Logistic regression)
  S --> g(LASSO)
  S --> m(Multivariate Adaptive Regression Splines MARS)
  style S fill:#90EE90;
15 Ensemble
Tip
We show an example of a super learner with three candidate learners: logistic regression, LASSO, and multivariate adaptive regression splines (MARS).
If you want to know more about Super Learner, look at other tutorials.
The super learning approach differs fundamentally from the pure ML or LASSO approaches discussed earlier: here, every candidate learner is fit with the exposure as its outcome (response) variable.
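To make the idea concrete, here is a minimal base-R sketch of stacking on simulated data. It is an illustration under stated assumptions, not the algorithm the `SuperLearner` package runs: the data, the three logistic candidate models, the 5-fold split, and the softmax-based weight search are all hypothetical choices made for this example.

```r
# Minimal stacking sketch in base R (simulated data; all names are hypothetical)
set.seed(123)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
exposure <- rbinom(n, 1, plogis(-0.5 + x1 + 0.5 * x2^2))
dat <- data.frame(exposure, x1, x2)

# Candidate learners: three logistic models of increasing flexibility,
# each with the exposure as the response
candidates <- list(
  null   = exposure ~ 1,
  linear = exposure ~ x1 + x2,
  quad   = exposure ~ x1 + x2 + I(x1^2) + I(x2^2)
)

# Step 1: out-of-fold predictions from every candidate (5-fold CV)
K <- 5
fold <- sample(rep(seq_len(K), length.out = n))
Z <- matrix(NA_real_, n, length(candidates),
            dimnames = list(NULL, names(candidates)))
for (k in seq_len(K)) {
  held_out <- fold == k
  for (j in seq_along(candidates)) {
    fit_j <- glm(candidates[[j]], family = binomial, data = dat[!held_out, ])
    Z[held_out, j] <- predict(fit_j, newdata = dat[held_out, ],
                              type = "response")
  }
}

# Step 2: convex weights that minimise the cross-validated squared error
cv_risk <- function(theta) {
  w <- exp(theta) / sum(exp(theta))  # softmax: weights >= 0 and sum to 1
  mean((dat$exposure - Z %*% w)^2)
}
opt <- optim(rep(0, ncol(Z)), cv_risk)
sl_weights <- exp(opt$par) / sum(exp(opt$par))
names(sl_weights) <- colnames(Z)

# Step 3: refit every candidate on all data and combine the predictions
full_pred <- sapply(candidates, function(f)
  predict(glm(f, family = binomial, data = dat), type = "response"))
ps_ensemble <- as.numeric(full_pred %*% sl_weights)

round(sl_weights, 3)
```

The ensemble prediction `ps_ensemble` is a weighted average of the candidates' predicted exposure probabilities, so it plays the role of the propensity score.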
15.1 Build model formula based on all variables
proxy.list <- names(out3$autoselected_covariate_df[,-1])
length(proxy.list)
#> [1] 100
covform <- paste0(investigator.specified.covariates, collapse = "+")
proxyform <- paste0(proxy.list, collapse = "+")
rhsformula <- paste0(c(covform, proxyform), collapse = "+")
ps.formula <- as.formula(paste0("exposure", "~", rhsformula))
The propensity score formula includes the investigator-specified covariates together with all 100 proxies.
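The formula construction can be checked on a small example. The variable names below are hypothetical placeholders, not columns of the tutorial's data; the second construction with `reformulate()` is an equivalent base-R alternative shown only for comparison.

```r
# Hypothetical variable names, for illustration only
investigator.specified.covariates <- c("age", "sex", "smoking")
proxy.list <- c("rec_dx_001_once", "rec_dx_002_freq")

# Same steps as above: paste the two sets of terms, then build the formula
covform    <- paste0(investigator.specified.covariates, collapse = "+")
proxyform  <- paste0(proxy.list, collapse = "+")
rhsformula <- paste0(c(covform, proxyform), collapse = "+")
ps.formula <- as.formula(paste0("exposure", "~", rhsformula))

# Equivalent construction without manual string pasting
ps.formula2 <- reformulate(c(investigator.specified.covariates, proxy.list),
                           response = "exposure")

ps.formula
all.vars(ps.formula)
```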
15.2 Fit the PS model with super learner
require(WeightIt)
W.out <- weightit(ps.formula,
                  data = hdps.data,
                  estimand = "ATE",
                  method = "super",
                  SL.library = c("SL.glm",
                                 "SL.glmnet",
                                 "SL.earth"))
#> Loading required namespace: glmnet
#> Loading required namespace: earth
The propensity score model is fit with the super learner algorithm so that the inverse probability weights can be calculated.
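As a reminder of how propensity scores turn into weights for the ATE, the sketch below applies the standard inverse probability weighting formula to four made-up subjects. The numbers are hypothetical and are not taken from `hdps.data`.

```r
# Hypothetical propensity scores and exposure indicators for four subjects
ps       <- c(0.20, 0.50, 0.80, 0.25)
exposure <- c(1, 0, 1, 0)

# ATE weights: 1 / ps for the exposed, 1 / (1 - ps) for the unexposed
ipw <- ifelse(exposure == 1, 1 / ps, 1 / (1 - ps))
ipw
```

Propensity scores close to 0 or 1 produce large weights, so it is worth inspecting the distribution of the estimated scores before fitting the outcome model.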
15.3 Obtain log-OR from unadjusted outcome model
summary(W.out$ps)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.01826 0.22575 0.39324 0.42094 0.59037 0.98809
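out.formula <- as.formula(paste0("outcome", "~", "exposure"))
# Weighted outcome model: exposure only, with inverse probability weights
fit <- glm(out.formula,
           data = hdps.data,
           weights = W.out$weights,
           family = binomial(link = "logit"))
fit.summary <- summary(fit)$coef["exposure",
                                 c("Estimate",
                                   "Std. Error",
                                   "Pr(>|z|)")]
# Replace the model-based standard error with the robust (sandwich) one
fit.summary[2] <- sqrt(sandwich::sandwich(fit)[2,2])
require(lmtest)
# Note: confint() on a glm object appears to ignore method = "hc1", so this
# interval is not computed from the robust standard error above
conf.int <- confint(fit, "exposure", level = 0.95, method = "hc1")
fit.summary_with_ci.sl <- c(fit.summary, conf.int)
knitr::kable(t(round(fit.summary_with_ci.sl,2)))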
Estimate | Std. Error | Pr(>|z|) | 2.5 % | 97.5 % |
---|---|---|---|---|
0.42 | 0.1 | 0 | 0.31 | 0.53 |
Summary of results (log-OR).
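In the table, the interval (0.31, 0.53) is narrower than estimate ± 1.96 × robust standard error, which would be roughly 0.42 ± 0.20. This suggests that the reported interval is not based on the sandwich standard error. The sketch below shows one way to compute a robust Wald interval by hand in base R. It uses simulated data and hypothetical variable names, and the hand-rolled HC0 sandwich is an illustration of the idea rather than a replacement for the `sandwich` package.

```r
# Simulated data, for illustration only (not the tutorial's hdps.data)
set.seed(42)
n <- 400
x <- rnorm(n)
exposure <- rbinom(n, 1, plogis(0.4 * x))
outcome  <- rbinom(n, 1, plogis(-0.5 + 0.4 * exposure + 0.3 * x))

# Inverse probability weights from a simple logistic propensity score model
ps <- fitted(glm(exposure ~ x, family = binomial))
w  <- ifelse(exposure == 1, 1 / ps, 1 / (1 - ps))

# Weighted outcome model; non-integer weights trigger a warning that is
# suppressed here because the point estimate is still what we want
fit <- suppressWarnings(
  glm(outcome ~ exposure, family = binomial(link = "logit"), weights = w)
)

# Sandwich (HC0) covariance: bread %*% meat %*% bread
X      <- model.matrix(fit)
scores <- X * (residuals(fit, "working") * weights(fit, "working"))
bread  <- summary(fit)$cov.unscaled
V      <- bread %*% crossprod(scores) %*% bread

# Wald interval on the log-OR scale built from the robust standard error
est       <- coef(fit)["exposure"]
robust_se <- sqrt(V["exposure", "exposure"])
wald_ci   <- est + c(-1, 1) * qnorm(0.975) * robust_se

# Exponentiate to report an odds ratio with its interval
exp(c(OR = unname(est), lower = wald_ci[1], upper = wald_ci[2]))
```

When the `lmtest` and `sandwich` packages are available, `lmtest::coefci(fit, parm = "exposure", vcov. = sandwich::sandwich)` is a shorter route to the same kind of interval.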