Overview

Author	Affiliation
Ehsan Karim; ehsan.karim@ubc.ca	SPPH, UBC

Introduction

Organization

Tentative breakdown:

Seminar Schedule
Time	Topic	Presenter
25 mins	Terminology + Q/A	EK
25 mins	MS example of prediction + Q/A	HF
10 mins	First gap
50 mins	2 TB examples of prediction + Q/A	BH
10 mins	Second gap
20 mins	ICU example of causal inference + Q/A	MM
15 mins	More methodological examples of causal inference	EK
15 mins	Closing remarks and overall Q/A session	EK / MB / HF / MM

Acknowledgements

Summary of Grants
Funding.Source	Grant.Title
Michael Smith Health Research BC	Causal inference for large admin health care databases [Salary award]
UBC OER Fund	Advanced Epidemiological Methods [Implementation Grant]
Work Learn	Applying Machine Learning in Health Data
NSERC	Improving Causal Inference Methods for Big Data [Discovery Grant]
NSERC	Improving Causal Inference Methods for Big Data [Launch Supplements]
MS Canada	Reducing Residual Confounding in MS Research: ML Approach [Catalyst Grant]
MS Canada	Development and Validation of the MS Comorbidity Summary Index [Discovery Grant]
SPPH, UBC	Data Science in Health [Start-up fund]


Terminology

ML vs. AI vs. Parametric Regression

Artificial Intelligence (AI): AI is a broad field of computer science dedicated to creating systems capable of performing tasks that typically require human intelligence.

Machine Learning (ML): ML is a subset of AI that focuses on the development of algorithms that allow computers to learn from and make predictions or decisions based on data.

Parametric Regression: Parametric regression is a traditional statistical technique used to model the relationship between a dependent variable and one or more independent variables.

It assumes a specific form f() for the function that describes this relationship, usually a linear or polynomial function, and estimates the parameters of this function from the data.
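
To make the idea of a fixed functional form concrete, here is a minimal sketch in Python that assumes the simplest case, a straight line f(x) = b0 + b1 * x, and estimates its two parameters by ordinary least squares. The function name fit_simple_linear and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_simple_linear(x, y):
    """Estimate (intercept, slope) of f(x) = b0 + b1 * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-products of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data generated without noise from y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_simple_linear(x, y)
print(b0, b1)
```

Because the form of f() is fixed in advance, the whole fit is summarised by two numbers. Many ML methods instead let the shape of f() be determined by the data.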

Types of ML

Supervised Learning: The algorithm learns from labeled training data and makes predictions based on new, unseen data.
- Binary outcome
- Continuous outcome
- Survival outcome
- Ensemble methods (Type I): Training the same model on different samples (of the same data)
- Ensemble methods (Type II): Training different models on the same data
  - Super learner
Unsupervised Learning: The algorithm identifies patterns and relationships in unlabeled data.
- k-means
Semi-supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data during training.
Deep Learning: A subset of ML involving neural networks with many layers, capable of learning from large amounts of data.
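
The two ensemble types listed under supervised learning can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the method used in the seminar examples: Type I is shown as bootstrap resampling of the same straight-line model, and Type II as an equally weighted average of two different models. A super learner would instead choose the weights by cross-validation, which this sketch does not do. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Return a predictor based on a least-squares straight line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:  # degenerate resample (all x equal): fall back to the mean of y
        return lambda new_x: y_bar
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return lambda new_x: y_bar + slope * (new_x - x_bar)


def fit_mean(x, y):
    """Return a predictor that ignores x and always predicts the mean of y."""
    y_bar = mean(y)
    return lambda new_x: y_bar


def ensemble_type_1(x, y, n_models=50, seed=1):
    """Type I: the same model fitted to bootstrap resamples of the same data."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_line([x[i] for i in idx], [y[i] for i in idx]))
    return lambda new_x: mean(m(new_x) for m in models)


def ensemble_type_2(x, y):
    """Type II: different models fitted to the same data, equally weighted."""
    models = [fit_line(x, y), fit_mean(x, y)]
    return lambda new_x: mean(m(new_x) for m in models)


# Hypothetical toy data generated without noise from y = 1 + 2x
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]

type_1 = ensemble_type_1(x, y)
type_2 = ensemble_type_2(x, y)
print(type_1(5.0), type_2(5.0))
```

In both cases the final prediction is an average over several fitted models; the two types differ only in whether the variation comes from the data samples or from the choice of model.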

Public Health Research Goals

General goals include (Karim and pi-OER team, 2024; Karim, 2021b):

Prediction
- MS example about developing a comorbidity score. Comorbidity means the presence of other co-occurring chronic conditions in someone with a disease of interest.
- TB examples
  - Missing data
  - Using additional variables from health administrative databases that are not usually used
Causal inference
- Double robust methods
- Choice of ML methods within the double robust framework
- Using ML methods in the large data (large number of variables) context
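
As a rough illustration of what a double robust method computes, the sketch below implements augmented inverse probability weighting (AIPW) in Python. AIPW is one widely used doubly robust estimator; the source does not say which estimator the seminar covers, so that choice is an assumption. The sketch also assumes a single binary confounder and estimates both nuisance models (propensity score and outcome model) by simple stratum-specific means. In practice these two models are where ML methods would be plugged in. The function name aipw_ate and the toy data are hypothetical.

```python
from statistics import mean


def aipw_ate(records):
    """AIPW estimate of the average treatment effect.

    records: list of (l, a, y) with confounder l, binary treatment a, outcome y.
    Assumes every stratum of l contains both treated and untreated subjects.
    """
    strata = sorted({l for l, _, _ in records})
    ps = {}  # propensity score: P(A = 1 | L = l)
    mu = {}  # outcome model: E[Y | A = a, L = l]
    for l in strata:
        rows = [(a, y) for (li, a, y) in records if li == l]
        ps[l] = mean(a for a, _ in rows)
        for arm in (0, 1):
            mu[(arm, l)] = mean(y for a, y in rows if a == arm)

    contributions = []
    for l, a, y in records:
        m1, m0, e = mu[(1, l)], mu[(0, l)], ps[l]
        # Outcome-model contrast plus inverse-probability-weighted residuals
        contributions.append(
            m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)
        )
    return mean(contributions)


# Hypothetical toy data: y = 2 * a + 3 * l, so the true effect of a is 2.
# Treatment is more common when l = 1, which confounds the crude comparison.
records = [
    (0, 1, 2.0), (0, 0, 0.0), (0, 0, 0.0), (0, 0, 0.0),
    (1, 1, 5.0), (1, 1, 5.0), (1, 1, 5.0), (1, 0, 3.0),
]

crude = (mean(y for _, a, y in records if a == 1)
         - mean(y for _, a, y in records if a == 0))
adjusted = aipw_ate(records)
print(crude, adjusted)
```

The estimator combines an outcome model with a propensity score model and remains consistent if either one is correctly specified, which is why the choice of ML method for each model matters within this framework.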

References

Karim, M. E. (2021b). Understanding Basics and Usage of Machine Learning in Medical Literature. v1.1. URL: https://ehsanx.github.io/into2ML/.

Karim, M. E. and pi-OER team (2024). Advanced Epidemiological Methods. Accessed on: April 20, 2024. URL: https://ehsanx.github.io/EpiMethods/.