Chapter 7 Pre-packaged software

7.1 tmle

  • The tmle package can handle
    • both binary and
    • continuous outcomes, and
    • uses the SuperLearner package to construct both models just like we did in the steps above.
  • The default SuperLearner library for estimating the outcome includes (S. Gruber, Van Der Laan, and Kennedy 2020)
    • SL.glm: generalized linear models (GLMs)
    • SL.glmnet: LASSO
    • tmle.SL.dbarts2: modeling and prediction using BART
  • The default library for estimating the propensity scores includes
    • SL.glm: generalized linear models (GLMs)
    • tmle.SL.dbarts.k.5: SL wrappers for modeling and prediction using BART
    • SL.gam: generalized additive models (GAMs)
  • It is certainly possible to use a different set of learners
    • More methods can be added by
      • specifying lists of models in the Q.SL.library (for the outcome model) and
      • g.SL.library (for the propensity score model) arguments.
  • Note also that the outcome \(Y\) is required to lie within the range \([0,1]\) for this method as well,
    • so we need to pass in the transformed data and then transform the estimate back.
set.seed(1444) 
# transform the outcome to fall within the range [0,1]
min.Y <- min(ObsData$Y)
max.Y <- max(ObsData$Y)
ObsData$Y_transf <- (ObsData$Y-min.Y)/(max.Y-min.Y)

# run tmle from the tmle package 
ObsData.noYA <- dplyr::select(ObsData, 
                              !c(Y_transf, Y, A))
SL.library = c("SL.glm", 
               "SL.glmnet", 
               "SL.xgboost")
tmle.fit <- tmle::tmle(Y = ObsData$Y_transf, 
                   A = ObsData$A, 
                   W = ObsData.noYA, 
                   family = "gaussian", 
                   V = 3,
                   Q.SL.library = SL.library, 
                   g.SL.library = SL.library)
tmle.fit
##  Additive Effect
##    Parameter Estimate:  0.0073229
##    Estimated Variance:  2.0642e-06
##               p-value:  3.4526e-07
##     95% Conf Interval: (0.0045069, 0.010139) 
## 
##  Additive Effect among the Treated
##    Parameter Estimate:  0.0054449
##    Estimated Variance:  3.4095e-06
##               p-value:  0.00319
##     95% Conf Interval: (0.0018258, 0.0090641) 
## 
##  Additive Effect among the Controls
##    Parameter Estimate:  0.01266
##    Estimated Variance:  1.9251e-06
##               p-value:  <2e-16
##     95% Conf Interval: (0.0099407, 0.01538)
summary(tmle.fit)
##  Initial estimation of Q
##   Procedure: cv-SuperLearner, ensemble
##   Model:
##       Y ~  SL.glm_All + SL.glmnet_All + SL.xgboost_All
## 
##   Coefficients: 
##        SL.glm_All    0.316376 
##     SL.glmnet_All    0.4996009 
##    SL.xgboost_All    0.1840231 
## 
##   Cross-validated R squared :  0.0607 
## 
##  Estimation of g (treatment mechanism)
##   Procedure: SuperLearner, ensemble   Empirical AUC = 0.9388 
## 
##   Model:
##       A ~  SL.glm_All + SL.glmnet_All + SL.xgboost_All 
## 
##   Coefficients: 
##        SL.glm_All    0 
##     SL.glmnet_All    0.6490267 
##    SL.xgboost_All    0.3509733 
## 
##  Estimation of g.Z (intermediate variable assignment mechanism)
##   Procedure: No intermediate variable 
## 
##  Estimation of g.Delta (missingness mechanism)
##   Procedure: No missingness, ensemble
## 
##  Bounds on g: (0.0076, 1) 
## 
##  Bounds on g for ATT/ATE: (0.0076, 0.9924) 
## 
##  Additive Effect
##    Parameter Estimate:  0.0073229
##    Estimated Variance:  2.0642e-06
##               p-value:  3.4526e-07
##     95% Conf Interval: (0.0045069, 0.010139) 
## 
##  Additive Effect among the Treated
##    Parameter Estimate:  0.0054449
##    Estimated Variance:  3.4095e-06
##               p-value:  0.00319
##     95% Conf Interval: (0.0018258, 0.0090641) 
## 
##  Additive Effect among the Controls
##    Parameter Estimate:  0.01266
##    Estimated Variance:  1.9251e-06
##               p-value:  <2e-16
##     95% Conf Interval: (0.0099407, 0.01538)
tmle_est_tr <- tmle.fit$estimates$ATE$psi
tmle_est_tr
## [1] 0.00732285
# transform back the ATE estimate
tmle_est <- (max.Y-min.Y)*tmle_est_tr
tmle_est
## [1] 2.870557
saveRDS(tmle_est, file = "data/tmle.RDS")
tmle_ci <- paste("(", 
                 round((max.Y-min.Y)*tmle.fit$estimates$ATE$CI[1], 3), ", ", 
                 round((max.Y-min.Y)*tmle.fit$estimates$ATE$CI[2], 3), ")", sep = "")
tmle.ci <- (max.Y-min.Y)*tmle.fit$estimates$ATE$CI
saveRDS(tmle.ci, file = "data/tmleci.RDS")
## ATE from tmle package: 2.870557(1.767, 3.974)

Notes about the tmle package:

  • does not scale the outcome for you (a small sketch of helper functions for this follows this list)
  • can give some error messages when dealing with variable types it is not expecting
  • practically all steps are packed into one function: very easy to use, but you need to dig a little to understand exactly what it does
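
Since the package does not rescale the outcome for you, it can be convenient to wrap the transformation in small helper functions. Below is a minimal sketch; the names scale01 and unscale01 are made up for illustration and are not part of the tmle package.

# minimal sketch: map Y to [0,1] before calling tmle(), and map estimates back
# (scale01 / unscale01 are hypothetical helper names, not tmle functions)
scale01 <- function(y, lo = min(y), hi = max(y)) {
  (y - lo) / (hi - lo)
}
unscale01 <- function(est, lo, hi) {
  est * (hi - lo)  # an additive effect only needs rescaling by the outcome range
}

# usage with the objects defined above:
# ObsData$Y_transf <- scale01(ObsData$Y)
# unscale01(tmle.fit$estimates$ATE$psi, min.Y, max.Y)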


7.2 tmle (reduced computation)

We can reduce computing time by reusing the SuperLearner propensity score predictions saved earlier (when using the WeightIt package), passing them to tmle through the g1W argument.

ps.obj <- readRDS(file = "data/ipwslps.RDS")
ps.SL <- ps.obj$weights
tmle.fit2 <- tmle::tmle(Y = ObsData$Y_transf, 
                   A = ObsData$A, 
                   W = ObsData.noYA, 
                   family = "gaussian",
                   V = 3,
                   Q.SL.library = SL.library, 
                   g1W = ps.SL)
tmle.fit2
##  Additive Effect
##    Parameter Estimate:  0.0079113
##    Estimated Variance:  0.0063697
##               p-value:  0.92104
##     95% Conf Interval: (-0.14852, 0.16434) 
## 
##  Additive Effect among the Treated
##    Parameter Estimate:  0.016964
##    Estimated Variance:  0.043265
##               p-value:  0.935
##     95% Conf Interval: (-0.39072, 0.42465) 
## 
##  Additive Effect among the Controls
##   Warning: Procedure failed to converge
##    Parameter Estimate:  0.00063631
##    Estimated Variance:  9.7303e-07
##               p-value:  0.51888
##     95% Conf Interval: (-0.0012971, 0.0025697)
# transform back ATE estimate
(max.Y-min.Y)*tmle.fit2$estimates$ATE$psi
## [1] 3.101232
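
Note that the g1W argument of tmle() expects the conditional treatment probabilities \(P(A=1 \mid W)\) themselves rather than the inverse probability weights. If the saved WeightIt object was fit with a propensity-score-based method, the fitted propensity scores are stored in its ps component; a minimal sketch of pulling out both pieces from the object loaded above:

# sketch: a weightit object stores both fitted propensity scores and weights
ps.obj <- readRDS(file = "data/ipwslps.RDS")
head(ps.obj$ps)       # P(A = 1 | W), when the weighting method estimates them
head(ps.obj$weights)  # the inverse probability weights used for the IPW estimates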

7.3 sl3 (optional)

# install sl3 if not done so
# remotes::install_github("tlverse/sl3")

The sl3 package is a newer package that implements two types of Super Learning:

  • discrete Super Learning,
    • in which the best prediction algorithm (based on cross-validation) from a specified library is returned, and
  • ensemble Super Learning,
    • in which an optimal (cross-validation based) weighted combination of the predictions from the specified library is returned.

The first step is to create an sl3 task, which keeps track of the roles of the variables in our problem (Coyle, Hejazi, Malenica, et al. 2021).

require(sl3)
# create sl3 task, specifying outcome and covariates 
rhc_task <- make_sl3_Task(
  data = ObsData, 
  covariates = colnames(ObsData)[-which(names(ObsData) == "Y")],
  outcome = "Y"
)
rhc_task
## A sl3 Task with 5735 obs and these nodes:
## $covariates
##  [1] "Disease.category"      "Cancer"                "Cardiovascular"       
##  [4] "Congestive.HF"         "Dementia"              "Psychiatric"          
##  [7] "Pulmonary"             "Renal"                 "Hepatic"              
## [10] "GI.Bleed"              "Tumor"                 "Immunosupperssion"    
## [13] "Transfer.hx"           "MI"                    "age"                  
## [16] "sex"                   "edu"                   "DASIndex"             
## [19] "APACHE.score"          "Glasgow.Coma.Score"    "blood.pressure"       
## [22] "WBC"                   "Heart.rate"            "Respiratory.rate"     
## [25] "Temperature"           "PaO2vs.FIO2"           "Albumin"              
## [28] "Hematocrit"            "Bilirubin"             "Creatinine"           
## [31] "Sodium"                "Potassium"             "PaCo2"                
## [34] "PH"                    "Weight"                "DNR.status"           
## [37] "Medical.insurance"     "Respiratory.Diag"      "Cardiovascular.Diag"  
## [40] "Neurological.Diag"     "Gastrointestinal.Diag" "Renal.Diag"           
## [43] "Metabolic.Diag"        "Hematologic.Diag"      "Sepsis.Diag"          
## [46] "Trauma.Diag"           "Orthopedic.Diag"       "race"                 
## [49] "income"                "A"                     "Y_transf"             
## 
## $outcome
## [1] "Y"
## 
## $id
## NULL
## 
## $weights
## NULL
## 
## $offset
## NULL
## 
## $time
## NULL

Next, we create our SuperLearner. To do this,

  • we need to specify a selection of machine learning algorithms we want to include as candidates, as well as
  • a metalearner that the SuperLearner will use to combine or choose from the machine learning algorithms provided (Coyle, Hejazi, Malenica, et al. 2021).
# see what algorithms are available for a continuous outcome 
# (similar can be done for a binary outcome)
sl3_list_learners("continuous")
##  [1] "Lrnr_arima"                     "Lrnr_bartMachine"              
##  [3] "Lrnr_bilstm"                    "Lrnr_bound"                    
##  [5] "Lrnr_caret"                     "Lrnr_cv_selector"              
##  [7] "Lrnr_dbarts"                    "Lrnr_earth"                    
##  [9] "Lrnr_expSmooth"                 "Lrnr_gam"                      
## [11] "Lrnr_gbm"                       "Lrnr_glm"                      
## [13] "Lrnr_glm_fast"                  "Lrnr_glmnet"                   
## [15] "Lrnr_grf"                       "Lrnr_gru_keras"                
## [17] "Lrnr_gts"                       "Lrnr_h2o_glm"                  
## [19] "Lrnr_h2o_grid"                  "Lrnr_hal9001"                  
## [21] "Lrnr_HarmonicReg"               "Lrnr_hts"                      
## [23] "Lrnr_lstm"                      "Lrnr_lstm_keras"               
## [25] "Lrnr_mean"                      "Lrnr_multiple_ts"              
## [27] "Lrnr_nnet"                      "Lrnr_nnls"                     
## [29] "Lrnr_optim"                     "Lrnr_pkg_SuperLearner"         
## [31] "Lrnr_pkg_SuperLearner_method"   "Lrnr_pkg_SuperLearner_screener"
## [33] "Lrnr_polspline"                 "Lrnr_randomForest"             
## [35] "Lrnr_ranger"                    "Lrnr_rpart"                    
## [37] "Lrnr_rugarch"                   "Lrnr_screener_correlation"     
## [39] "Lrnr_solnp"                     "Lrnr_stratified"               
## [41] "Lrnr_svm"                       "Lrnr_tsDyn"                    
## [43] "Lrnr_xgboost"

The chosen candidate algorithms can be created and collected in a Stack.

# initialize candidate learners
lrn_glm <- make_learner(Lrnr_glm)
lrn_lasso <- make_learner(Lrnr_glmnet) # alpha default is 1
xgb_5 <- Lrnr_xgboost$new(nrounds = 5)

# collect learners in stack
stack <- make_learner(
  Stack, lrn_glm, lrn_lasso, xgb_5
)

The stack is then given to the SuperLearner.

# to make an ensemble SuperLearner
sl_meta <- Lrnr_nnls$new()
sl <- Lrnr_sl$new(
  learners = stack,
  metalearner = sl_meta)

# or a discrete SuperLearner
sl_disc_meta <- Lrnr_cv_selector$new()
sl_disc <- Lrnr_sl$new(
  learners = stack, 
  metalearner = sl_disc_meta
)

The SuperLearner is then trained on the sl3 task we created at the start, after which it can be used to make predictions.

set.seed(1444)

# train SL
sl_fit <- sl$train(rhc_task)
# or for discrete SL
# sl_fit <- sl_disc$train(rhc_task)

# make predictions
sl3_data <- ObsData
sl3_data$sl_preds <- sl_fit$predict()

sl3_est <- mean(sl3_data$sl_preds[sl3_data$A == 1]) - 
  mean(sl3_data$sl_preds[sl3_data$A == 0])

sl3_est
## [1] 5.331201
saveRDS(sl3_est, file = "data/sl3.RDS")
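
The predictions above are made under each subject's observed treatment. To obtain G-computation-style potential outcome predictions from the same fit, we could predict on new tasks in which A is set to 1 and to 0 for everyone. A minimal sketch, reusing ObsData, the covariate list, and sl_fit from above (it keeps the same task definition, so this only illustrates the prediction API):

# sketch: predict under both treatment assignments (G-computation style)
covs <- colnames(ObsData)[-which(names(ObsData) == "Y")]
data_A1 <- ObsData; data_A1$A <- 1   # everyone treated
data_A0 <- ObsData; data_A0$A <- 0   # everyone untreated
task_A1 <- make_sl3_Task(data = data_A1, covariates = covs, outcome = "Y")
task_A0 <- make_sl3_Task(data = data_A0, covariates = covs, outcome = "Y")
mean(sl_fit$predict(task_A1) - sl_fit$predict(task_A0))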

Notes about the sl3 package:

  • fairly easy to implement, and the structure is easy to understand
  • large selection of candidate algorithms provided
  • unclear why the resulting estimate is so different from the other methods
  • very different structure from the SuperLearner package, but very customizable
  • could use more explanation of when to use which metalearner and of what the metalearner construction actually does


7.4 RHC results

Gathering previously saved results:

fit.reg <- readRDS(file = "data/adjreg.RDS")
TEr <- fit.reg$coefficients[2]
CIr <- as.numeric(confint(fit.reg, 'A'))
fit.matched <- readRDS(file = "data/match.RDS")
TEm <- fit.matched$coefficients[2]
CIm <- as.numeric(confint(fit.matched, 'A'))
TEg <- readRDS(file = "data/gcomp.RDS")
CIg <- readRDS(file = "data/gcompci.RDS")
CIgc <- CIg$percent[4:5]
TE1g <- readRDS(file = "data/gcompxg.RDS")
TE2g <- readRDS(file = "data/gcompls.RDS")
TE3g <- readRDS(file = "data/gcompsl.RDS")
ipw <- readRDS(file = "data/ipw.RDS")
TEi <- ipw$coefficients[2]
CIi <- as.numeric(confint(ipw, 'A'))
ipwsl <- readRDS(file = "data/ipwsl.RDS")
TEsli <- ipwsl$coefficients[2]
CIsli <- as.numeric(confint(ipwsl, 'A'))
tmleh <- readRDS(file = "data/tmlepointh.RDS")
tmlecih <- readRDS(file = "data/tmlecih.RDS")
tmlesl <- readRDS(file = "data/tmle.RDS")
tmlecisl <- readRDS(file = "data/tmleci.RDS")
slp <- readRDS(file = "data/sl3.RDS")
ci.b <- rep(NA,2)
ks <- 2.01
ci.ks <- c(0.6,3.41)
point <- as.numeric(c(TEr, TEm, TEg, TE1g, TE2g,  
                      TE3g, TEi, TEsli, tmleh, 
                      tmlesl, slp, ks))
CIs <- cbind(CIr, CIm, CIgc, ci.b, ci.b, ci.b, 
             CIi, CIsli, tmlecih, tmlecisl, 
             ci.b, ci.ks)    
method.list <- c("Adj. Reg","PS match", 
                 "G-comp (linear reg.)","G-comp (xgboost)", 
                 "G-comp (lasso)", "G-comp (SL)", 
                 "IPW (logistic)", "IPW (SL)", 
                 "TMLE (9 steps)", "TMLE (package)", 
                 "sl3 (package)", "Keele and Small (2021) paper") 
results <- data.frame(method.list) 
results$Estimate <- round(point,2)
results$`2.5 %` <- CIs[1,] 
results$`97.5 %` <- CIs[2,] 
kable(results,digits = 2)%>%
  row_spec(10, bold = TRUE, color = "white", background = "#D7261E")  
method.list                    Estimate   2.5 %   97.5 %
Adj. Reg                           2.90    1.37     4.43
PS match                           3.25    1.45     5.05
G-comp (linear reg.)               2.90    1.52     4.59
G-comp (xgboost)                   4.11
G-comp (lasso)                     2.72
G-comp (SL)                        1.91
IPW (logistic)                     3.24    1.88     4.60
IPW (SL)                           2.90    1.51     4.29
TMLE (9 steps)                     2.73    1.59     3.87
TMLE (package)                     2.87    1.77     3.97
sl3 (package)                      5.33
Keele and Small (2021) paper       2.01    0.60     3.41

Keele and Small (2021) used TMLE-SL based on an ensemble of 3 different learners: (1) GLM, (2) random forests, and (3) LASSO.
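
For reference, a comparable library could be specified with the SuperLearner wrappers used earlier; a minimal sketch (SL.randomForest requires the randomForest package to be installed):

# sketch: GLM + random forest + LASSO library in SuperLearner wrapper form
SL.library.ks <- c("SL.glm", "SL.randomForest", "SL.glmnet")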

7.5 Other packages

Other packages that may be useful:

Package   Resources                                      Notes
ltmle     CRAN vignette                                  Longitudinal
tmle3     GitHub, framework overview, tlverse handbook   tmle3 is still under development
aipw      GitHub, CRAN vignette                          Newer package for AIPW (another DR method)
Others    van der Laan research group

You can find many other related packages on CRAN or GitHub.

References

Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, and Oleg Sofrygin. 2021. sl3: Modern Pipelines for Machine Learning and Super Learning. https://github.com/tlverse/sl3. https://doi.org/10.5281/zenodo.1342293.
Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, Rachael Phillips, and Alan Hubbard. 2021. Targeted Learning in R: Causal Data Science with the tlverse Software Ecosystem. https://github.com/tlverse/tlverse-handbook. https://tlverse.org/tlverse-handbook/.
Gruber, S., M. Van Der Laan, and C. Kennedy. 2020. Package 'tmle'. https://cran.r-project.org/web/packages/tmle/tmle.pdf.
Keele, Luke, and Dylan S. Small. 2021. “Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference Using Five Empirical Applications.” The American Statistician, 1–9.