Assessing interaction in epidemiological studies
2023-02-20
Chapter 1 Definitions
1.1 Effect modification
Effect modification occurs when the causal effect of an exposure (A) on an outcome (Y) depends on the level of a third factor (B). In this scenario the exposure-outcome association differs across the strata of a second exposure, which is called the effect modifier. An interaction term is often added to a logistic regression model to assess it.
![An illustration of possible effect modification by a dichotomous factor $B$ (tobacco smoking [**smk**]) while investigating the impact of a dichotomous factor $A$ (alcohol [**alc**]) on the dichotomous outcome $Y$ (oral cancer [**oc**]).\label{fig:dagem}](images/dagem.png)
Figure 1.1: An illustration of possible effect modification by a dichotomous factor \(B\) (tobacco smoking [smk]) while investigating the impact of a dichotomous factor \(A\) (alcohol [alc]) on the dichotomous outcome \(Y\) (oral cancer [oc]).
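To make the definition concrete, here is a minimal sketch in base R that computes the risk ratio for A separately within each stratum of B. The counts are hypothetical and chosen only for illustration; they are not taken from the example data used later.

```r
# Hypothetical counts (NOT from OCdata): cases and totals by exposure A
# within each stratum of the potential effect modifier B.
strata <- data.frame(
  B     = c(0, 0, 1, 1),
  A     = c(0, 1, 0, 1),
  cases = c(10, 20, 15, 90),
  total = c(100, 100, 100, 100)
)
strata$risk <- strata$cases / strata$total

# Risk ratio for A (exposed vs unexposed) within each stratum of B
rr_by_B <- sapply(split(strata, strata$B), function(s) {
  s$risk[s$A == 1] / s$risk[s$A == 0]
})
rr_by_B
```

With these made-up numbers the risk ratio for A is 2 when B = 0 and 6 when B = 1, which is the pattern described as effect modification on the ratio scale.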
1.2 Interaction
Interaction concerns the causal effect of a combination of multiple exposures (A and B) on an outcome (Y). In other words, it is the joint causal effect of two exposures on an outcome.
![An illustration of possible interaction while investigating the impact of two dichotomous factors: $A$ (alcohol [**alc**]) and $B$ (tobacco smoking [**smk**]) on the dichotomous outcome $Y$ (oral cancer [**oc**]).\label{fig:dag}](images/dag.png)
Figure 1.2: An illustration of possible interaction while investigating the impact of two dichotomous factors: \(A\) (alcohol [alc]) and \(B\) (tobacco smoking [smk]) on the dichotomous outcome \(Y\) (oral cancer [oc]).
1.3 Example data
Data source: K. Rothman and Keller (1972)
require(interactionR)
data(OCdata)
dim(OCdata)
## [1] 458 3
summary(OCdata)
## oc alc smk
## Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:1.000 1st Qu.:1.0000
## Median :1.0000 Median :1.000 Median :1.0000
## Mean :0.5284 Mean :0.893 Mean :0.9105
## 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.000 Max. :1.0000
Variables
- oc, oral cancer, outcome (Y)
- alc, alcohol use, first exposure (A)
- smk, smoking, second exposure (B)
outcome = "oc"
ex = "alc"
dataset = OCdata
M <- table(dataset[[ex]], dataset[[outcome]])
rownames(M) <- c("Exposure -", "Exposure +")
colnames(M) <- c("Outcome -", "Outcome +")
M
##
## Outcome - Outcome +
## Exposure - 38 11
## Exposure + 178 231
1.3.1 Crude risk ratio
require(mosaic)
relrisk(M, verbose = TRUE)
##
## Odds Ratio
##
## Proportions
## Prop. 1: 0.7755
## Prop. 2: 0.4352
## Rel. Risk: 0.5612
##
## Odds
## Odds 1: 3.455
## Odds 2: 0.7706
## Odds Ratio: 0.2231
##
## 95 percent confidence interval:
## 0.4656 < RR < 0.6764
## 0.1109 < OR < 0.4487
## NULL
## [1] 0.561189
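The reported risk ratio is below 1 even though most exposed subjects have the outcome. A minimal base R sketch, using only the counts printed in the table above, reproduces the numbers. The reading that the proportions refer to the first column ("Outcome -") is inferred from the printed values, not from the mosaic documentation, so treat it as an assumption.

```r
# 2x2 table as printed above: rows = exposure (-, +), columns = outcome (-, +)
M <- matrix(c(38, 11,
              178, 231),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("Exposure -", "Exposure +"),
                            c("Outcome -", "Outcome +")))

# Row-wise proportion falling in the FIRST column ("Outcome -")
prop <- M[, 1] / rowSums(M)          # matches Prop. 1 and Prop. 2 above

# Ratio of the second row to the first row
rr_first_col <- prop[2] / prop[1]    # matches Rel. Risk above

# Same comparison on the odds scale
odds <- prop / (1 - prop)
or_first_col <- odds[2] / odds[1]    # matches Odds Ratio above

round(c(RR = unname(rr_first_col), OR = unname(or_first_col)), 4)
```

Under this reading, the value 0.5612 is the risk ratio for not having the outcome, which is why the table is rearranged before further estimates are computed.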
1.3.2 Change exposure label if RR <1
This step is not necessary if RR > 1. The following calculation assumes that the exposure and the stratification factor are risk factors for the outcome (RR > 1), not protective factors. If either is protective, the estimates of RERI and AP will be invalid, although the estimate of SI is not affected by this condition.
M3 <- matrix(c(M[2,2],M[2,1],M[1,2],M[1,1]), nrow = 2, byrow = TRUE)
require(epiR)
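For orientation, the three measures named above can be sketched in base R using their standard textbook definitions: relative excess risk due to interaction (RERI), attributable proportion (AP) and synergy index (SI). The risk ratios below are hypothetical values chosen for illustration, and the variable names are not part of the original analysis.

```r
# Hypothetical risk ratios, each relative to the doubly unexposed group
# (RR00 = 1). These are NOT estimates from OCdata.
rr10 <- 2.0   # exposed to A only
rr01 <- 3.0   # exposed to B only
rr11 <- 7.0   # exposed to both A and B

# Relative excess risk due to interaction
reri <- rr11 - rr10 - rr01 + 1

# Attributable proportion due to interaction
ap <- reri / rr11

# Synergy index
si <- (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))

c(RERI = reri, AP = ap, SI = si)
```

With these made-up inputs, RERI is 3, AP is 3/7 and SI is 2. All three are written for the case where each exposure raises risk, which is why the labels are switched first when RR < 1.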
1.3.3 Get detailed estimates from 2x2 table
require(epiR)
res <- epi.2by2(dat = M3, method = "cross.sectional",
                conf.level = 0.95, units = 1,
                interpret = FALSE,
                outcome = "as.columns")
res
## Outcome + Outcome - Total Prevalence *
## Exposed + 231 178 409 0.56 (0.52 to 0.61)
## Exposed - 11 38 49 0.22 (0.12 to 0.37)
## Total 242 216 458 0.53 (0.48 to 0.57)
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Prevalence ratio 2.52 (1.48, 4.26)
## Odds ratio 4.48 (2.23, 9.02)
## Attrib prevalence in the exposed * 0.34 (0.21, 0.47)
## Attrib fraction in the exposed (%) 60.25 (32.65, 76.54)
## Attrib prevalence in the population * 0.30 (0.18, 0.43)
## Attrib fraction in the population (%) 57.51 (29.69, 74.33)
## -------------------------------------------------------------------
## Uncorrected chi2 test that OR = 1: chi2(1) = 20.335 Pr>chi2 = <0.001
## Fisher exact test that OR = 1: Pr>chi2 = <0.001
## Wald confidence limits
## CI: confidence interval
## * Outcomes per population unit
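The attributable measures in this output can be checked by hand. The sketch below uses only the four cell counts from the table and the usual definitions of attributable prevalence and attributable fraction. It reproduces the point estimates only (not the confidence intervals), and the variable names are illustrative.

```r
# Cell counts from the table above
exp_pos   <- 231; exp_neg   <- 178   # Exposed +
unexp_pos <- 11;  unexp_neg <- 38    # Exposed -

p_exp   <- exp_pos / (exp_pos + exp_neg)        # prevalence in the exposed
p_unexp <- unexp_pos / (unexp_pos + unexp_neg)  # prevalence in the unexposed
p_all   <- (exp_pos + unexp_pos) /
  (exp_pos + exp_neg + unexp_pos + unexp_neg)   # overall prevalence

pr <- p_exp / p_unexp                # prevalence ratio

attrib_prev_exp <- p_exp - p_unexp   # attributable prevalence in the exposed
attrib_frac_exp <- (pr - 1) / pr     # attributable fraction in the exposed
attrib_prev_pop <- p_all - p_unexp   # attributable prevalence in the population
attrib_frac_pop <- attrib_prev_pop / p_all  # attributable fraction in the population

round(c(attrib_prev_exp, 100 * attrib_frac_exp,
        attrib_prev_pop, 100 * attrib_frac_pop), 2)
```

Rounded to two decimals, these agree with the point estimates printed above: 0.34, 60.25, 0.30 and 57.51.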
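Check the results yourself. Because p1 and p0 are parsed from the rounded prevalences printed in the table (0.56 and 0.22), the values below differ slightly from the estimates reported by epi.2by2.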
p1 <- as.numeric(strsplit(as.character(res$tab$` Prevalence *`), " ")[[1]][1])
p0 <- as.numeric(strsplit(as.character(res$tab$` Prevalence *`), " ")[[2]][1])
RR <- p1/p0
RR
## [1] 2.545455
RD <- p1 - p0
RD
## [1] 0.34
OR <- (p1/(1-p1))/(p0/(1-p0))
OR
## [1] 4.512397
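As a cross-check that avoids rounding, the same three quantities can be computed directly from the cell counts in the table. This is a minimal sketch in base R; the `_exact` names are illustrative and do not appear in the original code.

```r
# Counts from the 2x2 table above
exposed_cases   <- 231; exposed_total   <- 409
unexposed_cases <- 11;  unexposed_total <- 49

# Unrounded prevalences
p1_exact <- exposed_cases / exposed_total
p0_exact <- unexposed_cases / unexposed_total

RR_exact <- p1_exact / p0_exact
RD_exact <- p1_exact - p0_exact
OR_exact <- (p1_exact / (1 - p1_exact)) / (p0_exact / (1 - p0_exact))

round(c(RR = RR_exact, RD = RD_exact, OR = OR_exact), 2)
```

Computed this way, the ratio and odds ratio round to 2.52 and 4.48, matching the epi.2by2 output, while the difference is 0.34 either way.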