9  Literature

9.1 PubMed

A PubMed search combining the terms plasmode, simulation, and high-dimensional propensity returned 7 papers (searched on April 23, 2023):

flowchart LR
  A[PubMed] --> p4(Karim et al. 2018 \nEpidemiology)
  p4 --> ml1
  p4 --> ml0[Hybrid]
  A[PubMed] --> p2(Tian et al. 2018 \nInt J Epidemiol.)
  p2 --> ml1[Pure LASSO]
  A[PubMed] --> p5(Wyss et al. 2018 \nEpidemiology)
  p5 --> sl1[vary k,\nk=25,100:500\nSuper \nLearner]
  p5 --> ct1
  A[PubMed] --> p1(Benasseur et al. 2022 \nPharmacoepidemiol Drug Saf.)
  p1 --> ml2[Low k,\nk = 10]
  p1 --> ct1[cTMLE]
  A[PubMed] --> p7(Neugebauer et al. 2015 \nStat Med.)
  p7 --> O2[time-varying \ninterventions]
  A[PubMed] --> p6(Franklin et al. 2015 \nAm J Epidemiol.)
  p6 --> ml1
  p6 --> ml0
  A[PubMed] --> p3(Schneeweiss et al. 2018 \nClin Epidemiol.)
  p3 --> O1[Review]
  style p1 fill:#f44,stroke-width:2px,stroke:#f00,color:#fff;
  style p3 fill:#f44,stroke-width:2px,stroke:#f00,color:#fff;
  style p7 fill:#f44,stroke-width:2px,stroke:#f00,color:#fff;
  style p5 fill:#ffff00,stroke-width:2px,stroke:#ffcc00,color:#000;
  style p2 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;
  style p4 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;
  style p6 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;

9.2 Outside of PubMed

flowchart LR
  S[Simulations] --> p0(Pang et al. 2016 \nInt. J Biostat.)
  p0 --> t1[TMLE, \nNo \nsuper \nlearner]
  D --> p00(Pang et al. 2016 \nEpidemiology)
  p00 --> t1
  D[Data \nanalysis] --> p1(Ju et al. 2019 \nJ App Stat.)
  p1 --> sl1[Super \nlearner, \nNo TMLE, \n bias not \nused as a\nperformance \nmeasure]
  D --> p3(Schneeweiss et al. 2017 \nEpidemiology)
  p3 --> ml1[LASSO]
  S --> p4(Weberpals et al. 2021 \nEpidemiology)
  p4 --> ml1[LASSO]
  p4 --> ml2[Autoencoder]
  S --> p5(Ju et al. 2019 \nStat Meth Med Res.)
  p5 --> t1
  p5 --> t2[cTMLE, \nmore about \ntime \ncomplexity]
  S --> p6(Low et al. 2015 \nJ Comp Eff Res.)
  p6 --> ml1
  
  style p1 fill:#ffff00,stroke-width:2px,stroke:#ffcc00,color:#000;
  style p4 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;
  style p3 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;
  style p6 fill:#9f9,stroke-width:2px,stroke:#090,color:#000;
  style p5 fill:#44f,stroke-width:2px,stroke:#00f,color:#fff;
  style p0 fill:#44f,stroke-width:2px,stroke:#00f,color:#fff;
  style p00 fill:#44f,stroke-width:2px,stroke:#00f,color:#fff;

9.3 Reviews and Guidelines

flowchart LR
  r[Reviews] --> p1(Wyss et al. 2022 \nPharmacoepidemiol Drug Saf.)
  r --> p0(Schneeweiss et al. 2018 \nClin Epidemiol.)
  g[Guideline] --> p2(Rassen et al. 2022 \nPharmacoepidemiol Drug Saf.)
  g --> p3(Tazare et al. 2022 \nPharmacoepidemiol Drug Saf.)
  style p0 fill:#f44,stroke-width:2px,stroke:#f00,color:#fff;
  style p1 fill:#f44,stroke-width:2px,stroke:#f00,color:#fff;
  style p2 fill:#b03,stroke-width:2px,stroke:#600,color:#fff;
  style p3 fill:#b03,stroke-width:2px,stroke:#600,color:#fff;

The review papers pointed to some promising directions, but the guideline papers placed little emphasis on recent machine learning advances.

Criticisms of simulations in the literature
  • Plasmode simulations are somewhat simplistic
  • Simulations are needed in which the model form is misspecified:
    • when the analyst does not know the exact variables
    • interactions, polynomials, and more complex functional forms
    • when it is unknown which of the many candidate variables are useful: a high \(k\) can admit many noise variables
  • The use of TMLE and super learner is sub-optimal to some extent
  • 95% coverage probability is rarely assessed
  • Doubly robust methods are known to perform better in low-dimensional settings in real-world situations
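
Two of the points above, model-form misspecification and the rarely reported 95% coverage probability, can be illustrated with a small sketch. This is a toy parametric simulation, not a plasmode simulation, and it uses ordinary least squares rather than hdPS, TMLE, or super learner. The data-generating model, the sample sizes, and the function names (`simulate_once`, `ols_effect`, `run_simulation`) are all assumptions made for illustration. The misspecified analysis omits a squared confounder term that affects both treatment and outcome.

```python
import numpy as np


def simulate_once(rng, n=500, true_effect=1.0):
    """Draw one data set (assumed, illustrative data-generating model)."""
    x = rng.normal(size=n)
    # Treatment depends on x and x^2, so x^2 acts as a confounder.
    p_treat = 1.0 / (1.0 + np.exp(-(0.5 * x + 0.5 * x**2 - 0.5)))
    a = rng.binomial(1, p_treat)
    y = true_effect * a + x + x**2 + rng.normal(size=n)
    return x, a, y


def ols_effect(design, y):
    """Return the coefficient on column 1 (treatment) and its model-based SE."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1], np.sqrt(cov[1, 1])


def run_simulation(n_reps=500, n=500, true_effect=1.0, seed=2023):
    """Compare bias and 95% Wald-interval coverage for two outcome models."""
    rng = np.random.default_rng(seed)
    results = {"correct": [], "misspecified": []}
    for _ in range(n_reps):
        x, a, y = simulate_once(rng, n, true_effect)
        ones = np.ones(n)
        designs = {
            # Correct model includes the polynomial term.
            "correct": np.column_stack([ones, a, x, x**2]),
            # Misspecified model omits x^2.
            "misspecified": np.column_stack([ones, a, x]),
        }
        for name, design in designs.items():
            est, se = ols_effect(design, y)
            covered = est - 1.96 * se <= true_effect <= est + 1.96 * se
            results[name].append((est, covered))
    summary = {}
    for name, rows in results.items():
        estimates = np.array([r[0] for r in rows])
        coverage = float(np.mean([r[1] for r in rows]))
        summary[name] = {
            "bias": float(estimates.mean() - true_effect),
            "coverage": coverage,
        }
    return summary


if __name__ == "__main__":
    for model, stats in run_simulation().items():
        print(model, stats)
```

Reporting coverage alongside bias matters because the two can diverge: an estimator may show only moderate bias while its confidence intervals miss the true effect far more often than the nominal 5%.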