This document is a very basic introduction to machine learning for medicine.


  • Part I: Basics
    • Basic Machine Learning Terminologies (pre-reading)
    • Supervised vs. Unsupervised Learning Algorithms
      • Concepts
      • Examples
    • Over-Fitting, Measuring Performance and Model Tuning
      • Overall steps
  • Part II: Application in Medical Literature
    • Model development
    • Model validation
    • Clinical implementation


This tutorial adopts a code-first philosophy, demonstrating the analyses through one real data analysis problem taken from the literature.

  • This tutorial is not theory-focused and does not use simulated data to explain the ideas. Given the focus on implementation, theory is beyond its scope.


  • The prerequisites are knowledge of multiple regression analysis and basic probability. Software demonstrations and code are provided in R, although proficiency in R is not required to understand the concepts.
  • Watch the tutorial video

Key References

  • Bi, Q., Goodman, K. E., Kaminsky, J., & Lessler, J. (2019). What is machine learning? A primer for the epidemiologist. American Journal of Epidemiology, 188(12), 2222-2239.
  • Liu, Y., Chen, P. H. C., Krause, J., & Peng, L. (2019). How to read articles that use machine learning: users’ guides to the medical literature. JAMA, 322(18), 1806-1816.
  • Kuhn, M., & Johnson, K. (2013) [chapter 4] “Over-Fitting and Model Tuning.” In: Applied predictive modeling. Springer, New York, NY.

Additional useful references:

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning (2nd ed.). New York: Springer.
  • Vittinghoff, E., Glidden, D. V., Shiboski, S. C., & McCulloch, C. E. (2011) [chapter 10] “Predictor Selection.” In: Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. Springer.
  • Steyerberg, E. W. (2019). Clinical prediction models (2nd ed.). Springer International Publishing.

Version History

Version 1.1 was created for the course MEDI 504A 001 Emerging Topics in Experimental Medicine - EMRG TOP EXP MED, delivered in 2021W1. Some of the materials were initially prepared for a Continuing Professional Development course (UBC Department of Medicine CPD event, November 3, 2020).

Feel free to reach out with any comments, corrections, or suggestions.

Contributor List

Ehsan Karim (SPPH, UBC)


The online version of this book is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). You may share and adapt the content, and you may distribute your contributions under the same license, provided that you give appropriate credit and do not use the material for commercial purposes.

How to cite

Karim, ME (2021) “Understanding Basics and Usage of Machine Learning in Medical Literature,” URL:, (v1.1).