Overview
Introduction
Organization
Tentative breakdown:
Time | Topic | Presenter |
---|---|---|
25 mins | Terminology + Q/A | EK |
25 mins | MS example of prediction + Q/A | HF |
10 mins | First break | |
50 mins | Two TB examples of prediction + Q/A | BH |
10 mins | Second break | |
20 mins | ICU example of causal inference + Q/A | MM |
15 mins | More methodological examples of causal inference | EK |
15 mins | Closing remarks and overall Q/A session | EK / MB / HF / MM |
Acknowledgements
Funding source | Grant title |
---|---|
Michael Smith Health Research BC | Causal inference for large admin health care databases [Salary award] |
UBC OER Fund | Advanced Epidemiological Methods [Implementation Grant] |
Work Learn | Applying Machine Learning in Health Data |
NSERC | Improving Causal Inference Methods for Big Data [Discovery Grant] |
NSERC | Improving Causal Inference Methods for Big Data [Launch Supplements] |
MS Canada | Reducing Residual Confounding in MS Research: ML Approach [Catalyst Grant] |
MS Canada | Development and Validation of the MS Comorbidity Summary Index [Discovery Grant] |
SPPH, UBC | Data Science in Health [Start-up fund] |
Terminology
ML vs. AI vs. Parametric Regression
Artificial Intelligence (AI): AI is a broad field of computer science dedicated to creating systems capable of performing tasks that typically require human intelligence.
Machine Learning (ML): ML is a subset of AI that focuses on the development of algorithms that allow computers to learn from and make predictions or decisions based on data.
Parametric Regression: Parametric regression is a traditional statistical technique used to model the relationship between a dependent variable and one or more independent variables.
It assumes a specific functional form f() for this relationship, usually linear or polynomial, and estimates the parameters of that function from the data.
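As a minimal sketch of this idea, assume the simplest parametric form, a straight line f(x) = b0 + b1 * x, and estimate its two parameters by ordinary least squares. The toy data and the function name `fit_line` are hypothetical and serve only as an illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Estimate intercept b0 and slope b1 of y = b0 + b1 * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical toy data generated exactly from y = 1 + 2x
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
b0, b1 = fit_line(x, y)
print(b0, b1)
```

The form of f() is fixed in advance, so fitting reduces to estimating two numbers. Many ML methods instead let the data determine the shape of f().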
Types of ML
Supervised Learning: The algorithm learns from labeled training data and makes predictions based on new, unseen data.
- Binary outcome
- Continuous outcome
- Survival outcome
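- Ensemble methods (Type I): Training the same model on different samples of the same data (for example, bootstrap resamples, as in bagging)
- Ensemble methods (Type II): Training different models on the same data and combining their predictions (for example, stacking)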
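The two ensemble types can be sketched as follows. This is an illustration under simplifying assumptions, not a reference implementation: the base learners (`fit_mean`, `fit_line`), the helper names, and the toy data are all hypothetical, Type I is shown as bootstrap aggregation, and Type II uses an equal-weight average where stacking would learn the weights.

```python
import random


def average(values):
    values = list(values)
    return sum(values) / len(values)


def fit_mean(x, y):
    """Model A: ignore x and always predict the mean outcome."""
    m = average(y)
    return lambda x_new: m


def fit_line(x, y):
    """Model B: least-squares straight line."""
    x_bar, y_bar = average(x), average(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:  # degenerate resample in which every x is identical
        return lambda x_new: y_bar
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    return lambda x_new: b0 + b1 * x_new


def ensemble_type_1(fit, x, y, n_models=25, seed=0):
    """Type I: the same model fitted to different bootstrap resamples of the data."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit([x[i] for i in idx], [y[i] for i in idx]))
    return lambda x_new: average(m(x_new) for m in models)


def ensemble_type_2(fits, x, y):
    """Type II: different models fitted to the same data, combined with equal weights."""
    models = [fit(x, y) for fit in fits]
    return lambda x_new: average(m(x_new) for m in models)


# Hypothetical toy data, roughly y = 1 + 2x with small noise
x = [0, 1, 2, 3, 4, 5]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
bagged = ensemble_type_1(fit_line, x, y)
combined = ensemble_type_2([fit_mean, fit_line], x, y)
print(bagged(2.5), combined(2.5))
```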
Unsupervised Learning: The algorithm identifies patterns and relationships in unlabeled data.
Semi-supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data during training.
Deep Learning: A subset of ML involving neural networks with many layers, capable of learning from large amounts of data.
Public Health Research Goals
General goals include (Karim and pi-OER team, 2024; Karim, 2021b):
- Prediction
  - Multiple sclerosis (MS) example: developing a comorbidity score. Comorbidity means the presence of additional chronic conditions alongside the disease of interest.
  - Tuberculosis (TB) examples
- Missing data
- Using additional variables from health administrative databases that are not usually used
- Causal inference
  - Double robust methods
  - Choice of ML methods within the double robust framework
  - Using ML methods in the large data (large number of variables) context
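To make the double robust idea concrete, the sketch below computes one common double robust estimator, augmented inverse probability weighting (AIPW), for an average treatment effect. It combines a treatment (propensity) model with an outcome model, and the estimate remains consistent if either one is correctly specified. Everything here is an assumption for illustration: the data are made up, the function names are hypothetical, and both models are plain stratum averages standing in for whichever regression or ML learner would be chosen in practice.

```python
from collections import defaultdict

# Hypothetical records: (binary confounder L, binary treatment A, outcome Y)
data = (
    [(0, 1, y) for y in (3, 5)]
    + [(0, 0, y) for y in (1, 2, 3, 2, 1, 3)]
    + [(1, 1, y) for y in (7, 8, 9, 8, 7, 9)]
    + [(1, 0, y) for y in (5, 7)]
)


def average(values):
    values = list(values)
    return sum(values) / len(values)


def fit_propensity(data):
    """Treatment model: P(A = 1 | L), the treated share within each level of L."""
    by_l = defaultdict(list)
    for l, a, _ in data:
        by_l[l].append(a)
    return {l: average(a_vals) for l, a_vals in by_l.items()}


def fit_outcome(data):
    """Outcome model: E[Y | A, L], the mean of Y within each (A, L) cell."""
    by_cell = defaultdict(list)
    for l, a, y in data:
        by_cell[(a, l)].append(y)
    return {cell: average(y_vals) for cell, y_vals in by_cell.items()}


def aipw_ate(data):
    """AIPW estimate of the average treatment effect E[Y(1)] - E[Y(0)]."""
    e = fit_propensity(data)
    m = fit_outcome(data)
    contributions = []
    for l, a, y in data:
        m1, m0 = m[(1, l)], m[(0, l)]
        contributions.append(
            m1 - m0                            # outcome-model prediction
            + a * (y - m1) / e[l]              # weighted residual, treated
            - (1 - a) * (y - m0) / (1 - e[l])  # weighted residual, untreated
        )
    return average(contributions)


# Crude comparison that ignores the confounder L
naive = average(y for _, a, y in data if a == 1) - average(y for _, a, y in data if a == 0)
ate = aipw_ate(data)
print(naive, ate)
```

In this toy data the crude difference in means is 4, while the AIPW estimate is 2, the effect built into both strata of L. The choice of ML method enters through `fit_propensity` and `fit_outcome`, which could be replaced by more flexible learners when there are many covariates.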
References
Karim, M. E. (2021b). Understanding Basics and Usage of Machine Learning in Medical Literature. v1.1. URL: https://ehsanx.github.io/into2ML/.
Karim, M. E. and pi-OER team (2024). Advanced Epidemiological Methods. Accessed on: April 20, 2024. URL: https://ehsanx.github.io/EpiMethods/.