References

Austin, Peter C, Geoffrey M Anderson, Candemir Cigsar, and Andrea Gruneir. 2012. “Comparing the Cohort Design and the Nested Case–Control Design in the Presence of Both Time-Invariant and Time-Dependent Treatment and Competing Risks: Bias and Precision.” Pharmacoepidemiology and Drug Safety 21 (7): 714–24.
Balzer, Laura B, and Ted Westling. 2021. “Demystifying Statistical Inference When Using Machine Learning in Causal Research.” American Journal of Epidemiology.
Benasseur, Imane, Denis Talbot, Madeleine Durand, Anne Holbrook, Alexis Matteau, Brian J Potter, Christel Renoux, Mireille E Schnitzer, Jean-Éric Tarride, and Jason R Guertin. 2022. “A Comparison of Confounder Selection and Adjustment Methods for Estimating Causal Effects Using Large Healthcare Databases.” Pharmacoepidemiology and Drug Safety 31 (4): 424–33.
Beyersmann, Jan, Martin Wolkewitz, and Martin Schumacher. 2008. “The Impact of Time-Dependent Bias in Proportional Hazards Modelling.” Statistics in Medicine 27 (30): 6439–54.
Brookhart, M Alan, Sebastian Schneeweiss, Kenneth J Rothman, Robert J Glynn, Jerry Avorn, and Til Stürmer. 2006. “Variable Selection for Propensity Score Models.” American Journal of Epidemiology 163 (12): 1149–56.
Bross, Irwin DJ. 1966. “Spurious Effects from an Extraneous Variable.” Journal of Chronic Diseases 19 (6): 637–47.
Charlson, Mary E, Peter Pompei, Kathy L Ales, and C Ronald MacKenzie. 1987. “A New Method of Classifying Prognostic Comorbidity in Longitudinal Studies: Development and Validation.” Journal of Chronic Diseases 40 (5): 373–83.
Choi, BCK, and F Shi. 2001. “Risk Factors for Diabetes Mellitus by Age and Sex: Results of the National Population Health Survey.” Diabetologia 44: 1221–31.
Connolly, John G, Sebastian Schneeweiss, Robert J Glynn, and Joshua J Gagne. 2019. “Quantifying Bias Reduction with Fixed-Duration Versus All-Available Covariate Assessment Periods.” Pharmacoepidemiology and Drug Safety 28 (5): 665–70.
Centers for Disease Control and Prevention. 2021. “National Health and Nutrition Examination Survey (NHANES).” National Center for Health Statistics.
Elixhauser, Anne, Claudia Steiner, D Robert Harris, and Rosanna M Coffey. 1998. “Comorbidity Measures for Use with Administrative Data.” Medical Care, 8–27.
Ernster, Virginia L. 1994. “Nested Case-Control Studies.” Preventive Medicine 23 (5): 587–90.
Franklin, Jessica M, Wesley Eddings, Robert J Glynn, and Sebastian Schneeweiss. 2015. “Regularized Regression Versus the High-Dimensional Propensity Score for Confounding Adjustment in Secondary Database Analyses.” American Journal of Epidemiology 182 (7): 651–59.
Greenland, Sander, Judea Pearl, and James M Robins. 1999. “Causal Diagrams for Epidemiologic Research.” Epidemiology, 37–48.
Hansen, Ben B. 2008. “The Prognostic Analogue of the Propensity Score.” Biometrika 95 (2): 481–88.
Hernán, Miguel A, and Sarah L Taubman. 2008. “Does Obesity Shorten Life? The Importance of Well-Defined Interventions to Answer Causal Questions.” International Journal of Obesity 32 (3): S8–14.
Hossain, Md Belal. 2025. “Chapter 2: High-Dimensional Disease Risk Score for Dealing with Residual Confounding in Estimating Treatment Effects with a Survival Outcome.” In Harnessing the power of causal inference; predictive analytics for survival outcomes with health administrative data: applications to tuberculosis research.
Hossain, Md Belal, Huah Shin Ng, Feng Zhu, Helen Tremlett, and Mohammad Ehsanul Karim. 2025. “Simultaneously Dealing with Immortal Time Bias and Residual Confounding: A Case Study of a High-Dimensional Propensity Score Approach with a Nested Case–Control Framework in Multiple Sclerosis Research.” Pharmacoepidemiology and Drug Safety 34 (7): e70174.
Hossain, Md Belal, Mohsen Sadatsafavi, Hubert Wong, Victoria J Cook, James C Johnston, and Mohammad Ehsanul Karim. 2025. “Enhancing Risk Prediction Base on Health Administrative Data Using High-Dimensional Prediction Model.” Journal of Clinical Epidemiology, 111857.
Hossain, Md Belal, Hubert Wong, Mohsen Sadatsafavi, James C Johnston, Victoria J Cook, and Mohammad Ehsanul Karim. 2024. “Benefits of Repeated Matched-Cohort and Nested Case–Control Analyses with Time-Dependent Exposure in Observational Studies.” Statistics in Biosciences, 1–29.
Jones, Mark, and Robert Fowler. 2016. “Immortal Time Bias in Observational Studies of Time-to-Event Outcomes.” Journal of Critical Care 36: 195–99.
Ju, Cheng, Mary Combs, Samuel D Lendle, Jessica M Franklin, Richard Wyss, Sebastian Schneeweiss, and Mark J van der Laan. 2019. “Propensity Score Prediction for Electronic Healthcare Databases Using Super Learner and High-Dimensional Propensity Score Methods.” Journal of Applied Statistics 46 (12): 2216–36.
Ju, Cheng, Susan Gruber, Samuel D Lendle, Antoine Chambaz, Jessica M Franklin, Richard Wyss, Sebastian Schneeweiss, and Mark J van der Laan. 2019. “Scalable Collaborative Targeted Learning for High-Dimensional Data.” Statistical Methods in Medical Research 28 (2): 532–54.
Karim, ME. 2023. “Rethinking Residual Confounding Bias Reduction: Why Vanilla hdPS Alone Is No Longer Enough.”
Karim, ME, and Y Lei. 2025. “Is There a Competitive Advantage to Using Multivariate Statistical or Machine Learning Methods over the Bross Formula in the hdPS Framework for Bias and Variance Estimation?” PLoS One 20 (5): e0324639.
Karim, ME, and MH Mondol. 2025. “Finding the Optimal Number of Splits and Repetitions in Double Cross-Fitting Targeted Maximum Likelihood Estimators.” Pharmaceutical Statistics.
Karim, Mohammad Ehsanul, Paul Gustafson, John Petkau, Yinshan Zhao, Afsaneh Shirani, Elaine Kingwell, Charity Evans, Mia Van Der Kop, Joel Oger, and Helen Tremlett. 2014. “Marginal Structural Cox Models for Estimating the Association Between β-Interferon Exposure and Disease Progression in a Multiple Sclerosis Cohort.” American Journal of Epidemiology 180 (2): 160–71.
Karim, Mohammad Ehsanul, Md Belal Hossain, Huah Shin Ng, Feng Zhu, Hanna A Frank, and Helen Tremlett. 2025. “Evaluating the Role of High-Dimensional Proxy Data in Confounding Adjustment in Multiple Sclerosis Research: A Case Study.” Pharmacoepidemiology and Drug Safety 34 (2): e70112.
Karim, Mohammad Ehsanul, and Yang Lei. 2025. “How Effective Are Machine Learning and Doubly Robust Estimators in Incorporating High-Dimensional Proxies to Reduce Residual Confounding?” Pharmacoepidemiology and Drug Safety 34 (5): e70155.
Karim, Mohammad Ehsanul, Menglan Pang, and Robert W Platt. 2018. “Can We Train Machine Learning Methods to Outperform the High-Dimensional Propensity Score Algorithm?” Epidemiology 29 (2): 191–98.
Karim, Mohammad Ehsanul, Fabio Pellegrini, Robert W Platt, Gabrielle Simoneau, Julie Rouette, and Carl de Moor. 2022. “The Use and Quality of Reporting of Propensity Score Methods in Multiple Sclerosis Literature: A Review.” Multiple Sclerosis Journal 28 (9): 1317–23.
Klein, Samuel, Amalia Gastaldelli, Hannele Yki-Järvinen, and Philipp E Scherer. 2022. “Why Does Obesity Cause Diabetes?” Cell Metabolism 34 (1): 11–20.
Kumamaru, Hiraku, Joshua J Gagne, Robert J Glynn, Soko Setoguchi, and Sebastian Schneeweiss. 2016. “Comparison of High-Dimensional Confounder Summary Scores in Comparative Studies of Newly Marketed Medications.” Journal of Clinical Epidemiology 76: 200–208.
Kumamaru, Hiraku, Sebastian Schneeweiss, Robert J Glynn, Soko Setoguchi, and Joshua J Gagne. 2016. “Dimension Reduction and Shrinkage Methods for High Dimensional Disease Risk Scores in Historical Data.” Emerging Themes in Epidemiology 13: 1–10.
Liu, Mengling, Wenbin Lu, Roy E Shore, and Anne Zeleniuch-Jacquotte. 2010. “Cox Regression Model with Time-Varying Coefficients in Nested Case–Control Studies.” Biostatistics 11 (4): 693–706.
Lix, Lisa M, Jacqueline Quail, Opeyemi Fadahunsi, and Gary F Teare. 2013. “Predictive Performance of Comorbidity Measures in Administrative Databases for Diabetes Cohorts.” BMC Health Services Research 13: 1–12.
Lix, LM, J Quail, G Teare, and B Acan. 2011. “Performance of Comorbidity Measures for Predicting Outcomes in Population-Based Osteoporosis Cohorts.” Osteoporosis International 22: 2633–43.
Low, Yen Sia, Blanca Gallego, and Nigam Haresh Shah. 2016. “Comparing High-Dimensional Confounder Control Methods for Rapid Cohort Studies from Electronic Health Records.” Journal of Comparative Effectiveness Research 5 (2): 179–92.
Mondol, MH, and ME Karim. 2024. “Towards Robust Causal Inference in Epidemiological Research: Employing Double Cross-Fit TMLE in Right Heart Catheterization Data.” American Journal of Epidemiology, kwae447.
Naimi, Ashley I, and Brian W Whitcomb. 2020. “Estimating Risk Ratios and Risk Differences Using Regression.” American Journal of Epidemiology 189 (6): 508–10.
Neugebauer, Romain, Julie A Schmittdiel, Zheng Zhu, Jeremy A Rassen, John D Seeger, and Sebastian Schneeweiss. 2015. “High-Dimensional Propensity Score Algorithm in Comparative Effectiveness Research with Time-Varying Interventions.” Statistics in Medicine 34 (5): 753–81.
Nguyen, Tri-Long, Thomas PA Debray, Bora Youn, Gabrielle Simoneau, and Gary S Collins. 2024. “Confounder Adjustment Using the Disease Risk Score: A Proposal for Weighting Methods.” American Journal of Epidemiology 193 (2): 377–88.
Pang, Menglan, Tibor Schuster, Kristian B Filion, Maria Eberg, and Robert W Platt. 2016. “Targeted Maximum Likelihood Estimation for Pharmacoepidemiologic Research.” Epidemiology 27 (4): 570.
Pang, Menglan, Tibor Schuster, Kristian B Filion, Mireille E Schnitzer, Maria Eberg, and Robert W Platt. 2016. “Effect Estimation in Point-Exposure Studies with Binary Outcomes and High-Dimensional Covariate Data–a Comparison of Targeted Maximum Likelihood Estimation and Inverse Probability of Treatment Weighting.” The International Journal of Biostatistics 12 (2).
Rassen, Jeremy A, Patrick Blin, Sebastian Kloss, Romain S Neugebauer, Robert W Platt, Anton Pottegård, Sebastian Schneeweiss, and Sengwee Toh. 2023. “High-Dimensional Propensity Scores for Empirical Covariate Selection in Secondary Database Studies: Planning, Implementation, and Reporting.” Pharmacoepidemiology and Drug Safety 32 (2): 93–106.
Robert, Dennis. 2020. autoCovariateSelection: Automatic Covariate Selection. https://CRAN.R-project.org/package=autoCovariateSelection.
Rubin, Donald B. 1997. “Estimating Causal Effects from Large Data Sets Using Propensity Scores.” Annals of Internal Medicine 127 (8 Part 2): 757–63.
Rubin, Donald B, and Neal Thomas. 1996. “Matching Using Estimated Propensity Scores: Relating Theory to Practice.” Biometrics, 249–64.
Schneeweiss, Sebastian. 2006. “Sensitivity Analysis and External Adjustment for Unmeasured Confounders in Epidemiologic Database Studies of Therapeutics.” Pharmacoepidemiology and Drug Safety 15 (5): 291–303.
———. 2018. “Automated Data-Adaptive Analytics for Electronic Healthcare Data to Study Causal Treatment Effects.” Clinical Epidemiology, 771–88.
Schneeweiss, Sebastian, Wesley Eddings, Robert J Glynn, Elisabetta Patorno, Jeremy Rassen, and Jessica M Franklin. 2017. “Variable Selection for Confounding Adjustment in High-Dimensional Covariate Spaces When Analyzing Healthcare Databases.” Epidemiology 28 (2): 237–48.
Schneeweiss, Sebastian, and Malcolm Maclure. 2000. “Use of Comorbidity Scores for Control of Confounding in Studies Using Administrative Databases.” International Journal of Epidemiology 29 (5): 891–98.
Schneeweiss, Sebastian, Jeremy A Rassen, Robert J Glynn, Jerry Avorn, Helen Mogun, and M Alan Brookhart. 2009. “High-Dimensional Propensity Score Adjustment in Studies of Treatment Effects Using Health Care Claims Data.” Epidemiology 20 (4): 512.
Schuster, Tibor, Wilfrid Kouokam Lowe, and Robert W Platt. 2016. “Propensity Score Model Overfitting Led to Inflated Variance of Estimated Odds Ratios.” Journal of Clinical Epidemiology 80: 97–106.
Schuster, Tibor, Menglan Pang, and Robert W Platt. 2015. “On the Role of Marginal Confounder Prevalence–Implications for the High-Dimensional Propensity Score Algorithm.” Pharmacoepidemiology and Drug Safety 24 (9): 1004–7.
Shalit, Uri, Fredrik D Johansson, and David Sontag. 2017. “Estimating Individual Treatment Effect: Generalization Bounds and Algorithms.” In International Conference on Machine Learning, 3076–85. PMLR.
Shi, Claudia, David Blei, and Victor Veitch. 2019. “Adapting Neural Networks for the Estimation of Treatment Effects.” Advances in Neural Information Processing Systems 32.
Simoneau, Gabrielle, Fabio Pellegrini, Thomas PA Debray, Julie Rouette, Johanna Muñoz, Robert W Platt, John Petkau, et al. 2022. “Recommendations for the Use of Propensity Score Methods in Multiple Sclerosis Research.” Multiple Sclerosis Journal 28 (9): 1467–80.
Stuart, Elizabeth A, Brian K Lee, and Finbarr P Leacy. 2013. “Prognostic Score–Based Balance Measures Can Be a Useful Diagnostic for Propensity Score Methods in Comparative Effectiveness Research.” Journal of Clinical Epidemiology 66 (8): S84–90.
Tazare, John, Richard Wyss, Jessica M Franklin, Liam Smeeth, Stephen JW Evans, Shirley V Wang, Sebastian Schneeweiss, Ian J Douglas, Joshua J Gagne, and Elizabeth J Williamson. 2022. “Transparency of High-Dimensional Propensity Score Analyses: Guidance for Diagnostics and Reporting.” Pharmacoepidemiology and Drug Safety 31 (4): 411–23.
Tian, Yuxi, Martijn J Schuemie, and Marc A Suchard. 2018. “Evaluating Large-Scale Propensity Score Performance Through Real-World and Synthetic Data Experiments.” International Journal of Epidemiology 47 (6): 2005–14.
VanderWeele, Tyler J. 2019. “Principles of Confounder Selection.” European Journal of Epidemiology 34: 211–19.
Von Korff, Michael, Edward H Wagner, and Kathleen Saunders. 1992. “A Chronic Disease Score from Automated Pharmacy Data.” Journal of Clinical Epidemiology 45 (2): 197–203.
Weberpals, Janick, Tim Becker, Jessica Davies, Fabian Schmich, Dominik Rüttinger, Fabian J Theis, and Anna Bauer-Mehren. 2021. “Deep Learning-Based Propensity Scores for Confounding Control in Comparative Effectiveness Research: A Large-Scale, Real-World Data Study.” Epidemiology 32 (3): 378–88.
Westreich, Daniel, Stephen R Cole, Michele Jonsson Funk, M Alan Brookhart, and Til Stürmer. 2011. “The Role of the c-Statistic in Variable Selection for Propensity Score Models.” Pharmacoepidemiology and Drug Safety 20 (3): 317–20.
Wyss, Richard, Alan R Ellis, M Alan Brookhart, Michele Jonsson Funk, Cynthia J Girman, Ross J Simpson Jr, and Til Stürmer. 2015. “Matching on the Disease Risk Score in Comparative Effectiveness Research of New Treatments.” Pharmacoepidemiology and Drug Safety 24 (9): 951–61.
Wyss, Richard, Sebastian Schneeweiss, Mark van der Laan, Samuel D Lendle, Cheng Ju, and Jessica M Franklin. 2018. “Using Super Learner Prediction Modeling to Improve High-Dimensional Propensity Score Estimation.” Epidemiology 29 (1): 96–106.
Wyss, Richard, Chen Yanover, Tal El-Hay, Dimitri Bennett, Robert W Platt, Andrew R Zullo, Grammati Sari, et al. 2022. “Machine Learning for Improving High-Dimensional Proxy Confounder Adjustment in Healthcare Database Studies: An Overview of the Current Literature.” Pharmacoepidemiology and Drug Safety 31 (9): 932–43.
Zhang, Di, and Jessica Kim. 2019. “Use of Propensity Score and Disease Risk Score for Multiple Treatments with Time-to-Event Outcome: A Simulation Study.” Journal of Biopharmaceutical Statistics 29 (6): 1103–15.
Zivich, Paul N, and Alexander Breskin. 2021. “Machine Learning for Causal Inference: On the Use of Cross-Fit Estimators.” Epidemiology 32 (3): 393.