Research questions

Background

When starting a research project, one of the first steps is to clearly define the research topic or question. We will primarily focus on two types of research questions:

  1. predictive (a set of predictors is used to predict one outcome)
  2. associational or causal (the association between an outcome and an exposure, adjusting for confounders and risk factors for the outcome)
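To make the phrase "adjusting for confounders" concrete, here is a minimal sketch in Python. All counts are hypothetical and were chosen only so that the effect is easy to see; the function names are illustrative and do not come from any of the tutorials. The sketch compares a crude risk difference between exposed and unexposed groups with a risk difference standardized over the levels of a single binary confounder.

```python
# Hypothetical counts, invented for illustration only:
# (confounder level, exposure level) -> (number of people, number of events)
counts = {
    (0, 1): (100, 10),
    (0, 0): (400, 40),
    (1, 1): (400, 120),
    (1, 0): (100, 30),
}


def risk(n_people, n_events):
    """Proportion of people who experienced the outcome."""
    return n_events / n_people


def crude_risk_difference(counts):
    """Risk in the exposed minus risk in the unexposed, ignoring the confounder."""
    totals = {0: [0, 0], 1: [0, 0]}
    for (_, exposure), (n_people, n_events) in counts.items():
        totals[exposure][0] += n_people
        totals[exposure][1] += n_events
    return risk(*totals[1]) - risk(*totals[0])


def stratum_risk_differences(counts):
    """Risk difference computed separately within each confounder level."""
    strata = sorted({confounder for confounder, _ in counts})
    return {
        c: risk(*counts[(c, 1)]) - risk(*counts[(c, 0)])
        for c in strata
    }


def adjusted_risk_difference(counts):
    """Stratum-specific risk differences averaged over the overall
    confounder distribution (direct standardization)."""
    diffs = stratum_risk_differences(counts)
    sizes = {c: counts[(c, 1)][0] + counts[(c, 0)][0] for c in diffs}
    total = sum(sizes.values())
    return sum(diffs[c] * sizes[c] / total for c in diffs)


print("crude risk difference:   ", round(crude_risk_difference(counts), 3))
print("adjusted risk difference:", round(adjusted_risk_difference(counts), 3))
```

In these invented counts the exposure is more common where the confounder is present, and the confounder raises the risk of the outcome, so the crude comparison suggests an association that disappears within each confounder level. Real analyses usually adjust through regression models rather than a single stratification, but the logic is the same.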
Important

Datasets:

All of the datasets used in this tutorial can be accessed from this GitHub repository folder

Overview of tutorials

Predictive questions

The previous chapter covered how to access external data. This chapter introduces the different types of research questions and lays the groundwork for the chapters that follow. The next chapter examines causal questions in more depth: the complexities of causal associations and the types of variables to include in adjustment models for accurate treatment effect estimation. The chapter after that is dedicated entirely to predictive questions, covering their distinguishing attributes and the methods best suited to addressing them.

RHC Data

The first tutorial shows how to use the right heart catheterization (RHC) dataset to answer a predictive research question: developing a prediction model for length of stay. It covers cleaning and processing the raw data into an analyzable format and introduces concepts that are foundational for subsequent analyses.

Data from NHANES

Part 1: prepare data

Part 2: work with data

The second tutorial comes in two parts (Part 1 for downloading and preparing the data, Part 2 for analyzing it) and provides an in-depth guide to building a predictive model for diastolic blood pressure using NHANES data from the 2013-14 cycle.
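A predictive question is usually judged by how well the model predicts outcomes for observations it has not seen. The sketch below illustrates that idea in Python with simulated data; it does not use the RHC or NHANES data, and the variable names, coefficients, and split proportion are assumptions made for illustration. It fits a one-predictor linear regression on a training portion and reports the root mean squared error (RMSE) on a held-out portion, next to a mean-only baseline.

```python
import math
import random


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(y_true, y_pred):
    """Root mean squared prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Simulated (hypothetical) data: age in years, diastolic blood pressure in mmHg.
rng = random.Random(2024)
age = [rng.uniform(20, 80) for _ in range(200)]
dbp = [60 + 0.2 * a + rng.gauss(0, 8) for a in age]

# Hold out the last 25% of observations; the model never sees them during fitting.
split = int(0.75 * len(age))
train_age, test_age = age[:split], age[split:]
train_dbp, test_dbp = dbp[:split], dbp[split:]

intercept, slope = fit_simple_linear_regression(train_age, train_dbp)
test_pred = [intercept + slope * a for a in test_age]

# Baseline: predict the training mean for every held-out observation.
train_mean = sum(train_dbp) / len(train_dbp)
baseline_pred = [train_mean] * len(test_dbp)

model_rmse = rmse(test_dbp, test_pred)
baseline_rmse = rmse(test_dbp, baseline_pred)
print(f"held-out RMSE, linear model:       {model_rmse:.2f}")
print(f"held-out RMSE, mean-only baseline: {baseline_rmse:.2f}")
```

The point of the comparison is the evaluation criterion: for a predictive question the individual coefficients matter less than the error on data held back from fitting.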

Causal questions

Data from CCHS

The third tutorial walks through a study of the relationship between osteoarthritis (OA) and cardiovascular disease (CVD) among Canadian adults from 2001 to 2005. Using cycles 1.1 to 3.1 of the Canadian Community Health Survey (CCHS), the study explores whether OA increases (more accurately, whether OA is associated with) the risk of developing CVD.

Data from NHANES

The fourth tutorial analyzes NHANES data to explore the relationship between health predictors and cholesterol levels (an associational/causal question). After setting up the survey design and handling missing data, it builds regression models with varying sets of predictors and derives standard errors and p-values that account for the survey's design.
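As a rough illustration of why the survey structure matters, the Python sketch below compares an unweighted mean with a mean weighted by sampling weights, where each weight is read as the number of people in the population that a respondent represents. The values and weights are hypothetical, not NHANES records, and the function names are illustrative. The sketch covers only the point estimate; design-based standard errors also depend on strata and clusters, which are not shown here.

```python
# Hypothetical respondents, invented for illustration:
# (cholesterol in mg/dL, sampling weight)
respondents = [
    (180.0, 1000.0),
    (220.0, 3000.0),
    (200.0, 1000.0),
    (240.0, 5000.0),
]


def unweighted_mean(records):
    """Simple average that treats every respondent equally."""
    return sum(value for value, _ in records) / len(records)


def weighted_mean(records):
    """Average in which each respondent counts in proportion to their weight."""
    total_weight = sum(weight for _, weight in records)
    return sum(value * weight for value, weight in records) / total_weight


print("unweighted mean:", unweighted_mean(respondents))
print("weighted mean:  ", weighted_mean(respondents))
```

When respondents with larger weights tend to have different outcome values, as in these invented numbers, the two estimates diverge, and only the weighted one targets the population the survey was designed to represent.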

Tip

Optional Content:

You will find that some sections conclude with an optional video walkthrough that demonstrates the code. Keep in mind that the content might have been updated since these videos were recorded. Watching these videos is optional.

Warning

Bug Report:

Fill out this form to report any issues with the tutorial.

Reference