Below we discuss the assumptions in more detail.
Conditional exchangeability
Suppose doctors prescribe Rosuvastatin more often to patients who belong to any of the following categories:
- White (race)
- Male (sex)
- Older (age at least 50)
Conditional exchangeability, also known as conditional ignorability or unconfoundedness, is an assumption that treatment assignment is independent of potential outcomes, given a set of measured confounders. If conditional exchangeability is satisfied, we can consider the treatment and control groups exchangeable or similar in terms of potential outcomes.
\(Y(A=1), Y(A=0) \perp A \mid L\): Treatment assignment is independent of the potential outcomes, given \(L\)
To achieve conditional exchangeability, we need to balance the distribution of these confounders (race, sex, and age) between the treatment and control groups.
By doing so, we can minimize the impact of confounding variables on the treatment-outcome relationship, allowing for a more accurate estimation of the causal effect of Rosuvastatin intake on total cholesterol levels.
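One common way to check whether the confounders are balanced between the treatment and control groups is the standardized mean difference (SMD). The sketch below is only an illustration under stated assumptions: the records are made-up toy data, and the names `smd_binary`, `balance_table`, and `make_rows` are hypothetical helpers, not part of any library or of the analysis described here.

```python
import math

def smd_binary(treated_values, control_values):
    """Standardized mean difference for a 0/1 covariate."""
    p1 = sum(treated_values) / len(treated_values)
    p0 = sum(control_values) / len(control_values)
    pooled_sd = math.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced if equal, maximally imbalanced otherwise.
        return 0.0 if p1 == p0 else math.copysign(math.inf, p1 - p0)
    return (p1 - p0) / pooled_sd

def balance_table(rows, covariates):
    """Return {covariate: SMD} comparing treated (1) with control (0) rows."""
    treated = [r for r in rows if r["treated"] == 1]
    control = [r for r in rows if r["treated"] == 0]
    return {
        c: smd_binary([r[c] for r in treated], [r[c] for r in control])
        for c in covariates
    }

def make_rows(treated, n, n_white, n_male, n_age50):
    """Build n toy records; the first n_white are white, and so on (hypothetical data)."""
    return [
        {
            "treated": treated,
            "white": int(i < n_white),
            "male": int(i < n_male),
            "age50": int(i < n_age50),
        }
        for i in range(n)
    ]

# Toy data: 10 treated and 10 control patients with different covariate mixes.
rows = make_rows(1, 10, 7, 8, 6) + make_rows(0, 10, 5, 4, 6)
smd = balance_table(rows, ["white", "male", "age50"])
for name, value in smd.items():
    print(f"{name}: SMD = {value:.3f}")
```

In this toy data, sex is the most imbalanced covariate and age is perfectly balanced. A frequently used rule of thumb (an assumption here, not a claim from this text) treats an absolute SMD below 0.1 as acceptable balance.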
Achieving conditional exchangeability can be challenging, especially in observational studies where treatment assignment is not randomized. General suggestions in a real-world scenario:
- To strengthen the plausibility of the conditional exchangeability assumption, researchers should carefully identify and measure all relevant confounders that might influence both the treatment assignment and the outcome. This often requires domain-specific knowledge and a thorough understanding of the causal relationships among variables.
- It is impossible to prove that conditional exchangeability holds in a given study. Therefore, researchers should perform sensitivity analyses to assess the robustness of their causal effect estimates to potential violations of this assumption.
- Researchers should be transparent about the assumptions made in their analyses, including conditional exchangeability, and clearly communicate the limitations of their study findings due to these assumptions.
Positivity
Positivity is an important assumption in causal inference that ensures a non-zero probability of receiving each treatment level (or exposure) in every stratum of the confounding variables. In other words, for every combination of the confounders, there should be at least some individuals who received the treatment and some who did not.
\(0 < P(A=1 \mid L) < 1\): Subjects are eligible to receive either treatment level, given \(L\)
In the context of the association between Rosuvastatin intake and total cholesterol levels, where race, sex, and age are confounders, the positivity assumption implies that there must be people from every combination of race, sex, and age who have taken Rosuvastatin as well as those who haven’t.
Assume that there are 2 race categories, 2 sex categories, and 2 age categories in the data. This gives \(2 \times 2 \times 2 = 8\) possible combinations of race, sex, and age:
- White, Male, Less than 50
- White, Male, At least 50
- White, Female, Less than 50
- White, Female, At least 50
- Asian, Male, Less than 50
- Asian, Male, At least 50
- Asian, Female, Less than 50
- Asian, Female, At least 50
The positivity assumption implies that there should be individuals taking (\(A=1\)) and not taking the treatment (\(A=0\)) in each of these 8 categories.
- If there are no control individuals (not receiving the treatment) for the “Asian, Male, At least 50” category, it means the positivity assumption is violated for this particular combination of confounders.
- If the sample size allows, we could consider combining categories, such as merging age groups or collapsing race categories, to create more balanced groups. This may help to satisfy the positivity assumption but might also result in a loss of information or reduced ability to detect specific treatment effects.
- When positivity is violated for a category, the causal effect estimates obtained from the study may not generalize to the entire population, because the results might not apply to the “Asian, Male, At least 50” category. This limits the external validity of the study findings.
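The check described above can be sketched by tabulating treated and control counts in each of the 8 strata. This is a minimal illustration on made-up records. The function name `positivity_violations` and the data are hypothetical, and the check only detects empirical (in-sample) violations.

```python
from collections import Counter
from itertools import product

RACE = ["White", "Asian"]
SEX = ["Male", "Female"]
AGE = ["Less than 50", "At least 50"]

def positivity_violations(records):
    """records: iterable of (race, sex, age_group, treated) tuples, treated in {0, 1}.

    Returns the strata that lack treated or control individuals,
    together with the observed counts (n_treated, n_control).
    """
    counts = Counter(records)
    violations = []
    for stratum in product(RACE, SEX, AGE):
        n_treated = counts[stratum + (1,)]
        n_control = counts[stratum + (0,)]
        if n_treated == 0 or n_control == 0:
            violations.append((stratum, n_treated, n_control))
    return violations

# Toy data: every stratum has 2 treated and 2 control patients,
# except one stratum that has 3 treated patients and no controls.
records = []
for stratum in product(RACE, SEX, AGE):
    if stratum == ("Asian", "Male", "At least 50"):
        records += [stratum + (1,)] * 3
    else:
        records += [stratum + (1,)] * 2 + [stratum + (0,)] * 2

violations = positivity_violations(records)
for stratum, n_treated, n_control in violations:
    print(", ".join(stratum), "-> treated:", n_treated, "control:", n_control)
```

With these toy records, only the “Asian, Male, At least 50” stratum is flagged, because it contains treated patients but no controls.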
(Causal) consistency
When the causal consistency assumption holds, it implies that the treatment is well-defined, and there is only one version of the treatment that can be applied to all treated individuals. This allows for a clear interpretation of the causal effect, as the treatment effect can be attributed to the specific, well-defined treatment. Violation of the causal consistency assumption occurs when there are multiple versions of the treatment, and the treatment effect is not well-defined.
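For reference, a common formal statement of this assumption, written in the same notation as the other assumptions (this is the standard textbook form, not a formula taken from this text), is:

\(Y = Y(A=a)\) if \(A = a\): The observed outcome of a subject who received treatment level \(a\) equals that subject’s potential outcome under \(a\)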
If some patients receive 5 mg of Rosuvastatin, while others receive 10 mg or 20 mg, it would be unclear which specific version of the treatment is responsible for the observed effect on total cholesterol levels. In such cases, the causal effect estimation may be biased or difficult to interpret.
- To ensure causal consistency, researchers should carefully design their study to control the treatment version and dosage, making sure that all treated individuals receive the same version of the treatment.
- In cases where multiple treatment versions are present, researchers may need to modify their analysis to account for these different versions, such as by estimating separate causal effects for each dosage level or considering the treatment as a continuous variable (e.g., dose-response analysis).
No interference
The no interference assumption states that the potential outcome for one individual is not affected by the treatment status of any other individual. In other words, each individual’s potential outcomes are independent of the treatment assignments of others.
If some patients in the treatment group share their Rosuvastatin with control patients, the no interference assumption is violated, because the treatment status of one individual now affects the potential outcomes of another. In such a scenario, the treatment effect estimate may be biased because the control group is not truly “untreated”: some control patients receive the treatment indirectly.
Models are correctly specified
We also assume that the models used to (1) estimate propensity scores and (2) analyze outcomes are accurate representations of the underlying relationships between the variables involved. Correct model specification means not only including the necessary covariates (which is also relevant for the conditional exchangeability, or no unmeasured confounding, assumption), but also including them in the correct functional form.
If there is evidence that age and sex interact in determining the probability of being prescribed Rosuvastatin, we should include the interaction term (age × sex) in the propensity score model along with the main effects.
By including the relevant interaction terms in the propensity score model, we can improve the model’s ability to accurately balance the covariates between the treatment and control groups, which in turn enhances the validity of the causal effect estimates.
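To illustrate why the interaction term matters, the sketch below fits two logistic propensity score models to made-up cell counts, one with main effects only and one with the age × sex interaction. Everything here is an assumption for illustration: the counts are invented, the helper names (`design_row`, `fit_logistic`, `propensity`) are hypothetical, and the plain gradient-ascent fit stands in for whatever statistical software would be used in practice.

```python
import math

# Hypothetical toy data: (age_50_plus, male) -> (number treated, number of patients)
cells = {
    (0, 0): (10, 100),
    (0, 1): (20, 100),
    (1, 0): (20, 100),
    (1, 1): (70, 100),
}

def design_row(age, male, interaction):
    """Intercept, main effects, and optionally the age x sex product."""
    row = [1.0, float(age), float(male)]
    if interaction:
        row.append(float(age * male))
    return row

def propensity(beta, age, male, interaction):
    """Fitted probability of treatment from the logistic model."""
    x = design_row(age, male, interaction)
    linear = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-linear))

def fit_logistic(cells, interaction, steps=20000, lr=1.0):
    """Maximum likelihood fit by gradient ascent on the mean log-likelihood."""
    n_total = sum(n for _, n in cells.values())
    beta = [0.0] * (4 if interaction else 3)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for (age, male), (treated, n) in cells.items():
            x = design_row(age, male, interaction)
            p = propensity(beta, age, male, interaction)
            for j, xj in enumerate(x):
                grad[j] += (treated - n * p) * xj
        beta = [b + lr * g / n_total for b, g in zip(beta, grad)]
    return beta

beta_main = fit_logistic(cells, interaction=False)
beta_int = fit_logistic(cells, interaction=True)

for (age, male), (treated, n) in cells.items():
    print(
        f"age50={age} male={male} observed={treated / n:.3f} "
        f"main_only={propensity(beta_main, age, male, False):.3f} "
        f"with_interaction={propensity(beta_int, age, male, True):.3f}"
    )
```

In this toy example, the model with the interaction reproduces the observed proportion treated in every age-by-sex cell, whereas the main-effects-only model cannot match the cell where both characteristics are present, so its propensity scores are misspecified there.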
- Ensuring that the identifiability assumptions hold is crucial for obtaining unbiased and meaningful causal effect estimates. When these assumptions are violated, the causal effect estimates may be biased or not identifiable at all.
- By providing evidence or arguments supporting the plausibility of the identifiability assumptions in the study and addressing any potential violations (via sensitivity analyses), we will increase the credibility and validity of the causal inference.
- Merely stating the assumptions without ensuring that they hold in our data can lead to biased or misleading results and undermine the validity of the study findings.