Concepts (R)
Confounding
Confounding is a pervasive concern in epidemiology, especially in observational studies focusing on causality. Epidemiologists need to carefully select confounders to avoid biased results due to third factors affecting the relationship between exposure and outcome. Commonly used methods for selecting confounders, such as the change-in-estimate criterion or sole reliance on p-value-based statistical methods, may be inadequate or even problematic.
Epidemiologists need a more formalized system for confounder selection, incorporating causal diagrams (Greenland, Pearl, and Robins 1999; Tennant et al. 2021) and counterfactual reasoning. This includes an understanding of the underlying causal relationships and the potential impacts of different variables on the observed association. Understanding the temporal order and causal pathways is crucial for accurate confounder control.
However, it is possible that epidemiologists may lack comprehensive knowledge about the causal roles of all variables and hence may need to resort to empirical criteria (VanderWeele 2019) such as the disjunctive cause criterion, or other variable selection methods such as machine learning approaches. While these methods can provide more sophisticated analyses and help address the high dimensionality and complex structures of modern epidemiological data, epidemiologists need to understand how these approaches function, along with their benefits and limitations, to avoid introducing additional bias into the analysis.
Effect modifier
Effect modification and interaction are two distinct concepts in epidemiology (VanderWeele 2009; Bours 2021). Effect modification occurs when the causal effect of an exposure (A) on an outcome (Y) varies based on the levels of a third factor (B).
In this scenario, the association between the exposure and the outcome differs within the strata of a second exposure, which acts as the effect modifier. For instance, the impact of alcohol (A) on oral cancer (Y) might differ based on tobacco smoking (B).
On the other hand, interaction refers to the joint causal effect of two exposures (A and B) on an outcome (Y). It examines how the combination of multiple exposures influences the outcome, such as the combined effect of alcohol (A) and tobacco smoking (B) on oral cancer (Y).
In essence, while effect modification looks at how a third factor influences the relationship between an exposure and an outcome, interaction focuses on the combined effect of two exposures on the outcome.
Table 2 fallacy
The “Table 2 Fallacy” in epidemiology refers to the misleading practice of presenting multiple adjusted effect estimates from a single statistical model in one table, often resulting in misinterpretation. This occurs when researchers report both the effects of the primary exposure and the effects of secondary exposures (often adjustment variables for the primary exposure) without adequately distinguishing between the types of effects or considering the causal relationships among variables.
This idea highlights the potential for misunderstanding in interpreting the effects of various exposures on an outcome when they are reported together, leading to confusion over the nature and magnitude of the relationships and possibly influencing the design and interpretation of further studies (Westreich and Greenland 2013). The fallacy demonstrates the need for careful consideration of the types of effects estimated and reported in statistical models, urging researchers to be clear about the distinctions and implications of controlled direct effects, total effects, and the presence of confounding or mediating variables.
Reading list
Confounding key reference: (VanderWeele 2019; Tennant et al. 2021)
Effect modification key reference: (VanderWeele 2009; Bours 2021)
Table 2 fallacy key reference: (Westreich and Greenland 2013)
Optional reading:
Video Lessons
Defining the Causal Effect: Potential Outcomes
To understand causality, one must first be able to imagine a world that does not exist. The potential outcomes framework formalizes this by defining the causal effect of an exposure in terms of what would have happened under different exposure scenarios. Let us define the key notations:
- A: The exposure status of an individual (e.g., \(A=1\) if a smoker, \(A=0\) if a non-smoker).
- Y: The outcome of interest (e.g., hypertension).
- L: A measured covariate or potential confounder.
- U: An unmeasured variable.
For any individual, we can define two potential outcomes:
- Y(A=1): The outcome that would be observed if the individual were a smoker.
- Y(A=0): The outcome that would be observed if that same individual were a non-smoker at the same point in time.
The Individual Treatment Effect (TE) is the difference between these two potential outcomes for a single person: \(TE = Y(A=1) - Y(A=0)\). For example, if a patient named John smokes (\(A=1\)) and develops hypertension, while he would not have developed hypertension had he not smoked (\(A=0\)), the causal effect of smoking for John is present.
The Fundamental Problem of Causal Inference
The definition of the individual TE immediately presents a profound challenge. For any given individual, we can only ever observe one of their potential outcomes. If John smokes, we observe \(Y(A=1)\), but his counterfactual outcome, \(Y(A=0)\), remains unobserved forever. This is known as the fundamental problem of causal inference; it is a problem of missing data where half the data is always missing for every subject.
Because the individual TE is unobservable, the goal of epidemiology shifts from the individual to the population. We instead seek to estimate the Average Treatment Effect (ATE), defined as the average of the individual effects across all subjects in a population: \(ATE = E[Y(A=1)] - E[Y(A=0)]\).
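A toy simulation (with hypothetical variable names and effect sizes) can make this notation concrete. Unlike real data, it generates both potential outcomes for every subject, so the individual TE and the ATE can be computed directly.

```r
# Hypothetical simulation: both potential outcomes are generated for every subject,
# something that is never possible with real data.
set.seed(1)
n  <- 1e5
u  <- rnorm(n)                                # individual-level risk factor
y1 <- rbinom(n, 1, plogis(-1 + 1.0 + u))      # Y(A=1): outcome if the person smoked
y0 <- rbinom(n, 1, plogis(-1 + 0.0 + u))      # Y(A=0): outcome if the same person did not smoke

te  <- y1 - y0      # individual treatment effects (observable only in a simulation)
ate <- mean(te)     # average treatment effect, E[Y(A=1)] - E[Y(A=0)]
ate
```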
From Association to Causation: The Role of Confounding
In the real world, we cannot directly observe both potential outcomes for a population. Instead, we observe outcomes in two different groups of people: those who happened to be exposed (smokers) and those who were not (non-smokers). We can calculate the associational difference between these groups: \(E[Y \mid A=1] - E[Y \mid A=0]\). A critical error is to assume this associational difference is equal to the causal ATE.
This difference arises because of confounding. The groups of smokers and non-smokers may differ systematically on factors that also affect the outcome. For instance, individuals with lower socioeconomic status may be more likely to smoke and also have a higher underlying risk of hypertension for reasons unrelated to smoking (e.g., diet, stress). In this case, the observed difference in outcomes is a mixture of the true treatment effect and these pre-existing, systematic differences between the groups.
The Observational Study Solution: Conditional Exchangeability
Randomized Controlled Trials (RCTs) are the gold standard for causal inference because the process of randomization, with a large enough sample size, ensures that the exposed and unexposed groups are, on average, identical (“exchangeable”) on all baseline characteristics, both measured and unmeasured. In an RCT, any systematic differences are eliminated, making the associational difference a valid estimate of the causal ATE.
In observational studies, where randomization is not possible, we cannot achieve this level of exchangeability. Instead, we strive for conditional exchangeability. This is the assumption that, within strata of the measured confounders, the exposed and unexposed groups are exchangeable. By estimating the effect of smoking separately within each level of the confounder(s) \(L\) (e.g., estimating the effect of smoking separately for different age groups) and then averaging these stratum-specific effects, we can aim to reconstruct the causal ATE. This process of stratification, or “adjustment,” is the conceptual basis for controlling for confounding in observational research. However, its validity rests entirely on the critical and untestable assumption that we have successfully identified and measured all important common causes of the exposure and the outcome.
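A minimal sketch of this logic, assuming (hypothetically) that a single binary confounder \(L\) is the only common cause of exposure and outcome; all names and numbers are invented for illustration:

```r
# Hypothetical observational data: L is a common cause of smoking (A) and hypertension (Y)
set.seed(2)
n <- 1e6
L <- rbinom(n, 1, 0.5)                           # e.g., older age group
A <- rbinom(n, 1, plogis(-1 + 2 * L))            # smoking more likely when L = 1
Y <- rbinom(n, 1, plogis(-2 + 0.5 * A + 1.5 * L))

# Crude (associational) risk difference: biased by confounding
mean(Y[A == 1]) - mean(Y[A == 0])

# Stratum-specific risk differences, averaged over the distribution of L (standardization)
rd_l <- sapply(0:1, function(l) mean(Y[A == 1 & L == l]) - mean(Y[A == 0 & L == l]))
sum(rd_l * prop.table(table(L)))   # closer to the causal ATE, assuming conditional exchangeability given L
```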
What is included in this Video Lesson:
- 0:00 Introduction
- 0:16 Notations
- 2:40 Treatment Effect
- 6:13 Real-world Problem of the counterfactual definition
- 9:44 Real-world Solution in Observational Setting
The timestamps are also included in the YouTube video description.
To properly address confounding, researchers need a tool to translate their subject-matter knowledge and assumptions about the world into a formal structure. Directed Acyclic Graphs (DAGs) serve this purpose, providing a visual language and a set of rigorous rules for identifying sources of bias and guiding statistical analysis.
The Grammar of Causal Diagrams
A DAG is a graphical model of causal relationships between variables. Its components follow a simple grammar:
- Nodes: Represent variables (e.g., smoking, hypertension, age).
- Arrows (Directed Edges): Represent a direct causal effect from one variable to another.
- Directed: The arrows have a single head, indicating the assumed direction of causality.
- Acyclic: A path of arrows cannot form a closed loop. This enforces the principle of temporality: a variable cannot be its own cause.
Crucially, the most powerful assumptions in a DAG are the absent arrows. The absence of an arrow between two variables represents a strong claim of no direct causal effect.
Paths: Causal and Non-Causal
A path is any sequence of arrows connecting two variables, regardless of the direction of the arrowheads. When assessing the relationship between an exposure like smoking (\(A\)) and an outcome like hypertension (\(Y\)), paths can be categorized into two critical types:
- Causal Paths (Front-door paths): These are paths that begin with an arrow originating from \(A\) and moving toward \(Y\) (e.g., \(A \rightarrow \text{Stress} \rightarrow Y\)). These paths transmit the causal effect of \(A\) on \(Y\) that we wish to estimate.
- Non-Causal Paths (Back-door paths): These are paths between \(A\) and \(Y\) that begin with an arrow pointing into \(A\) (e.g., \(A \leftarrow \text{Age} \rightarrow Y\)). These paths are sources of non-causal association (confounding) that can bias our estimate. The goal of adjustment is to “block” these backdoor paths.
The Three Elementary Causal Structures
All complex DAGs are composed of three fundamental building blocks. Understanding how information flows through these structures is the key to using DAGs to identify and control for bias.
- The Fork (Confounding): The structure is \(A \leftarrow L \rightarrow Y\). Here, \(L\) is a common cause of both the exposure \(A\) and the outcome \(Y\).
- Example: Age (\(L\)) is a common cause of both smoking habits (\(A\)) and hypertension (\(Y\)).
- Rule: The backdoor path through a common cause is open by default, creating a spurious association. To remove this confounding, one must condition on the confounder \(L\), which blocks the path.
- The Chain (Mediation): The structure is \(A \rightarrow M \rightarrow Y\). Here, \(M\) is a mediator that lies on the causal pathway.
- Example: Smoking (\(A\)) causes chronic inflammation (\(M\)), which in turn causes hypertension (\(Y\)).
- Rule: The causal path through a mediator is open by default. To estimate the total effect of \(A\) on \(Y\), one must not condition on the mediator \(M\). Doing so would block this part of the causal effect.
- The Collider (Selection/Collider Bias): The structure is \(A \rightarrow L \leftarrow Y\). Here, \(L\) is a common effect of both \(A\) and \(Y\).
- Example: Both smoking (\(A\)) and a genetic predisposition (\(Y\)) can lead to a specific biomarker level (\(L\)).
- Rule: The path through a collider is blocked by default. However, conditioning on the collider \(L\) opens the path, inducing a spurious, non-causal association between \(A\) and \(Y\). Adjusting for a collider is a critical error that introduces bias; the short simulation after this list illustrates the effect.
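The collider rule can be checked with a few lines of simulation; \(A\) and \(Y\) below are generated independently, and all variable names are hypothetical:

```r
# Hypothetical collider structure A -> L <- Y, with A and Y independent by construction
set.seed(3)
n <- 2e5
A <- rnorm(n)                        # e.g., smoking intensity
Y <- rnorm(n)                        # e.g., genetic predisposition, independent of A
L <- rbinom(n, 1, plogis(A + Y))     # biomarker: a common effect (collider)

cor(A, Y)                   # close to 0: no marginal association
cor(A[L == 1], Y[L == 1])   # clearly negative: conditioning on the collider opens the path
```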
Applying the Rules with Dagitty
In practice, causal systems can be highly complex. Software such as Dagitty.net automates the application of these path-blocking rules. Given a user-drawn DAG, Dagitty can identify all open backdoor paths and determine the minimal sufficient adjustment sets: the smallest set of covariates that, if conditioned on, will block all backdoor paths and allow for an unbiased estimation of the total causal effect.
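A minimal sketch of this workflow with the dagitty R package; the DAG below is hypothetical and simply mirrors the examples above (a confounder, a mediator, and a collider):

```r
# Requires the 'dagitty' package: install.packages("dagitty")
library(dagitty)

g <- dagitty("dag {
  Smoking      [exposure]
  Hypertension [outcome]
  Age -> Smoking
  Age -> Hypertension
  Smoking -> Stress -> Hypertension
  Smoking -> Biomarker
  Hypertension -> Biomarker
}")

paths(g, "Smoking", "Hypertension")   # all paths between exposure and outcome, flagged open/closed
adjustmentSets(g, effect = "total")   # minimal sufficient adjustment set(s); here { Age }
```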
The video lesson is split into 3 parts.
Example DAG code can be accessed from this GitHub repository folder.
In the absence of a fully specified DAG, researchers can rely on a set of empirical criteria that require less stringent assumptions.
- Pre-treatment Criterion: Adjust for any variable that occurs chronologically before the exposure. This approach can fail by incorrectly adjusting for a pre-exposure collider, thereby inducing M-bias.
- Common Cause Criterion: Adjust only for variables known to be common causes of both the exposure and the outcome. This is often too conservative.
- Disjunctive Cause Criterion: This is a highly recommended practical strategy. It states that one should control for any pre-exposure covariate that is a cause of the exposure, OR a cause of the outcome, OR both. This criterion strikes a robust balance, ensuring sufficient adjustment under a wide range of unknown causal structures.
- Modified Disjunctive Cause Criterion: This refines the disjunctive criterion with crucial exceptions:
- Exclude Instrumental Variables & Z-Bias: An instrumental variable causes the exposure but does not affect the outcome except through the exposure. One must avoid adjusting for known instruments, as doing so can amplify bias due to unmeasured confounding, a phenomenon known as Z-bias.
- Include Proxies for Unmeasured Confounders: If a true common cause is unmeasured, one should adjust for a measured variable that serves as a proxy for it, as this will typically reduce bias.
- Finally, to estimate the total causal effect, any known mediators on the causal pathway must also be excluded from the adjustment set. A rough sketch applying these ideas with dagitty follows this list.
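The sketch below is only a rough illustration of the disjunctive cause criterion using graph ancestry in the dagitty R package; the DAG and covariates are hypothetical, and in practice the criterion is applied from substantive knowledge rather than from an already-known DAG:

```r
library(dagitty)

g2 <- dagitty("dag {
  Smoking      [exposure]
  Hypertension [outcome]
  Age    -> Smoking
  Age    -> Hypertension
  Income -> Smoking
  Diet   -> Hypertension
  Smoking -> Stress -> Hypertension
}")

covariates <- c("Age", "Income", "Diet", "Stress")

causes_exposure <- covariates %in% setdiff(ancestors(g2, "Smoking"), "Smoking")
causes_outcome  <- covariates %in% setdiff(ancestors(g2, "Hypertension"), "Hypertension")
mediators       <- covariates %in% setdiff(descendants(g2, "Smoking"), "Smoking")

# Disjunctive cause criterion for the total effect: cause of exposure OR outcome, minus mediators
covariates[(causes_exposure | causes_outcome) & !mediators]
# -> "Age" "Income" "Diet"; the modified criterion would additionally drop Income,
#    which in this DAG is an instrument (it affects Hypertension only through Smoking)
```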
Statistical methods can also be used for variable selection, but their application requires careful consideration of the research goal: prediction versus causal inference.
- Change-in-Estimate: This method retains a covariate if its inclusion changes the exposure effect estimate by a certain threshold (e.g., 10%). However, this approach is flawed and not valid for non-collapsible effect measures, such as odds ratios (ORs) and hazard ratios (HRs). For these measures, a change in the estimate can occur even in the absence of confounding.
- Statistical Significance: Methods like stepwise regression using p-values or AIC are strongly discouraged for confounder selection. They are designed for prediction, not causal inference, and result in invalid p-values and confidence intervals for the final model.
- Machine Learning: Algorithms like LASSO and Random Forests are excellent for high-dimensional prediction. Their primary role in causal inference is in developing propensity score (PS) models, which is a prediction task. The goal is to create a score that balances measured covariates between the exposed and unexposed groups, mimicking randomization; a minimal propensity-score sketch follows this list.
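A minimal propensity-score sketch using logistic regression and inverse-probability weighting; variable names and coefficients are hypothetical, and a real analysis would also check covariate balance:

```r
set.seed(4)
n      <- 5000
age    <- rnorm(n, 50, 10)
income <- rnorm(n)
smoke  <- rbinom(n, 1, plogis(-3 + 0.05 * age - 0.3 * income))
hyp    <- rbinom(n, 1, plogis(-5 + 0.06 * age - 0.1 * income + 0.4 * smoke))

# Propensity score model: a pure prediction task (could equally be fit with LASSO or a random forest)
ps <- glm(smoke ~ age + income, family = binomial)$fitted.values
w  <- ifelse(smoke == 1, 1 / ps, 1 / (1 - ps))     # inverse-probability-of-treatment weights

# Weighted outcome model: the weights aim to balance measured covariates across exposure groups
glm(hyp ~ smoke, family = quasibinomial, weights = w)
```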
A crucial, and often overlooked, aspect of statistical adjustment is the concept of collapsibility. An effect measure is said to be collapsible if the marginal (crude) measure of association is equal to a weighted average of the stratum-specific measures of association after conditioning on another variable. This property has profound implications for how we interpret adjusted estimates.
In the absence of confounding, some effect measures, like the Risk Difference (RD) and Risk Ratio (RR), are collapsible. This means that if a variable is not a confounder, adjusting for it will not change the effect estimate. However, other common measures, most notably the Odds Ratio (OR), are non-collapsible.
The non-collapsibility of the odds ratio is a mathematical property stemming from the non-linearity of the logistic model’s link function. It means that the adjusted OR can be different from the crude OR even when there is no confounding. This phenomenon, where an association in a population differs from the association within its subgroups, is also known as Simpson’s Paradox (in the absence of confounding). This is precisely why the change-in-estimate criterion for confounder selection is invalid when using odds ratios—a change in the OR upon adjustment does not necessarily signal the presence of confounding.
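Non-collapsibility can be demonstrated with a simulation in which the exposure is randomized, so there is no confounding by construction (variable names are hypothetical):

```r
set.seed(5)
n <- 1e6
A <- rbinom(n, 1, 0.5)                 # exposure assigned at random: no confounding
L <- rbinom(n, 1, 0.5)                 # strong risk factor, independent of A
Y <- rbinom(n, 1, plogis(-2 + 1.5 * A + 2 * L))

exp(coef(glm(Y ~ A + L, family = binomial))["A"])   # conditional OR, about exp(1.5) = 4.5
exp(coef(glm(Y ~ A,     family = binomial))["A"])   # marginal OR is noticeably smaller,
                                                    # even though L is not a confounder
```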
Simpson’s Paradox is a statistical phenomenon where an association observed in a population is different from—and often in the opposite direction of—the associations observed in all of its subgroups. This paradox is a powerful illustration of how failing to account for a key third variable (a confounder or a collider) can lead to completely erroneous conclusions.
A famous example is the “Birthweight Paradox,” where maternal smoking appeared to be protective against infant mortality among low-birthweight infants, a finding that contradicted the known harms of smoking. This occurred because birthweight acted as a collider. Adjusting for it induced a spurious association between smoking and other unmeasured causes of mortality (e.g., birth defects).
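A compact simulation of this collider structure, with hypothetical names and coefficients: smoking is truly harmful, an unmeasured birth defect strongly causes both low birthweight and mortality, and low birthweight is the collider.

```r
set.seed(6)
n      <- 5e5
smoke  <- rbinom(n, 1, 0.3)
defect <- rbinom(n, 1, 0.05)                                   # unmeasured cause of LBW and death
lbw    <- rbinom(n, 1, plogis(-2 + 1.5 * smoke + 3 * defect))  # low birthweight: a collider
death  <- rbinom(n, 1, plogis(-4 + 0.5 * smoke + 3 * defect))  # smoking is truly harmful

coef(glm(death ~ smoke, family = binomial))["smoke"]                     # positive: harmful overall
coef(glm(death ~ smoke, family = binomial, subset = lbw == 1))["smoke"]  # negative among LBW infants:
                                                                         # the spurious "protective" finding
```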
The effect of an exposure may not be uniform across a population. A third variable can alter the exposure-outcome relationship, a phenomenon that leads to frequent confusion between two distinct concepts: interaction and effect modification.
Formal Definitions
While often used interchangeably, these terms address different causal questions:
- Effect Modification: This occurs when the causal effect of a single exposure (e.g., smoking) on an outcome (hypertension) differs across strata of a second variable (e.g., education level). The question is: “Is the effect of smoking different for people with high education versus people with low education?” This involves only one intervention (on smoking). The variable ‘education’ is treated as a baseline characteristic defining subgroups.
- Interaction: This refers to the joint causal effect of two exposures (e.g., smoking and low education) on an outcome (hypertension). The question is: “Is the effect of intervening on both smoking and education greater than the sum of the effects of intervening on each one alone?” This involves two distinct interventions and assesses synergy or antagonism.
Implications for Confounding Control
The distinction is critical for analytical strategy:
- To assess Effect Modification: When investigating if education modifies the effect of smoking on hypertension, a researcher only needs to control for the set of confounders of the `smoking -> hypertension` relationship.
- To assess Interaction: When investigating the causal interaction between smoking and education, a researcher must control for all confounders of the `smoking -> hypertension` relationship AND all confounders of the `education -> hypertension` relationship. This is a much more demanding requirement.
The Role of the Scale: Effect Measure Modification
Whether modification is detected can depend on the statistical scale used (e.g., additive scale for Risk Difference vs. multiplicative scale for Risk Ratio). For this reason, the more precise term is effect measure modification. A statistical finding of interaction is a property of the chosen model and does not necessarily correspond to a specific biological mechanism.
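A small numeric illustration of this scale dependence, using hypothetical risks in four exposure-by-modifier groups:

```r
# Hypothetical risks of the outcome by exposure A and potential modifier B
risk <- c(A0_B0 = 0.05, A1_B0 = 0.10,
          A0_B1 = 0.20, A1_B1 = 0.40)

# Ratio scale: no effect measure modification
risk["A1_B0"] / risk["A0_B0"]   # 2.0
risk["A1_B1"] / risk["A0_B1"]   # 2.0

# Difference scale: clear effect measure modification
risk["A1_B0"] - risk["A0_B0"]   # 0.05
risk["A1_B1"] - risk["A0_B1"]   # 0.20
```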
To revisit or deepen your grasp of these two concepts, consider reviewing this external tutorial.
One of the most common errors in reporting observational research is the Table 2 Fallacy. This fallacy is the practice of presenting a single multivariable regression model and interpreting the coefficients for all variables—the primary exposure and all adjustment covariates—as if they are equally valid estimates of the total causal effect of each variable on the outcome.
Why A Single Model Fails: A DAG-Based Explanation
A multivariable regression model is built to answer a single, specific causal question. The adjustment set required to estimate the causal effect of one variable is often different from the set required to estimate the effect of another.
Consider a DAG for the effects of smoking, age, and hypertension:
- Causal Question 1: What is the total effect of Smoking on Hypertension?
  - Assume Age is a common cause of both Smoking and Hypertension. To get an unbiased estimate of the total effect of Smoking, one must adjust for Age. The appropriate model is `Hypertension ~ Smoking + Age`. The coefficient for Smoking can be interpreted as the total causal effect.
- Causal Question 2: What is the total effect of Age on Hypertension?
  - In this same DAG, Smoking may be a mediator of the effect of Age (i.e., `Age -> Smoking -> Hypertension`). To estimate the total effect of Age, one must not adjust for the mediator, Smoking. The model built for Question 1 does adjust for Smoking. Therefore, the coefficient for Age in that first model is not an estimate of the total effect; it is an estimate of the controlled direct effect: the effect of Age on Hypertension that does not operate through the Smoking pathway. A short simulation after this list makes the contrast concrete.
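A short simulation following the DAG above (Age causes Smoking; Age and Smoking cause Hypertension); all names and coefficients are hypothetical:

```r
set.seed(7)
n       <- 1e5
age     <- rnorm(n, 50, 10)
smoking <- rbinom(n, 1, plogis(-4 + 0.06 * age))                  # Age -> Smoking
hyp     <- rbinom(n, 1, plogis(-6 + 0.07 * age + 0.5 * smoking))  # Age, Smoking -> Hypertension

# Question 1: total effect of Smoking (adjust for the confounder Age)
m1 <- glm(hyp ~ smoking + age, family = binomial)
# Question 2: total effect of Age (do NOT adjust for the mediator Smoking)
m2 <- glm(hyp ~ age, family = binomial)

coef(m1)[c("smoking", "age")]  # 'age' here is only a controlled direct effect
coef(m2)["age"]                # total effect of age (includes the pathway through smoking)
```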
Best Practices for Reporting
To avoid the Table 2 Fallacy, analysis and reporting must be driven by a “one exposure, one model” principle:
- Be Explicit: Clearly state the single primary exposure of interest for each model.
- Use Multiple Models: If causal effects are desired for multiple variables, fit a separate, correctly specified model for each one.
- Structure Tables Clearly: The primary results table should only show the effect estimate for the main exposure of interest. The covariates used for adjustment should be listed in a footnote, not in the table with their own effect estimates.
Video Lesson Slides
Confounding
Effect modification
Table 2 fallacy
Links
Confounding
Effect modification
Table 2 fallacy