Propensity Score Analyses in Complex Surveys

Published

October 9, 2023

Topics

This lecture, which focuses on propensity score analysis in the context of survey data analysis, consists of five main sections, followed by three optional sections.

Previous Content
  • Connection with Previous Courses: The first three sections may overlap with content from prerequisite courses (such as SPPH 500-007 or SPPH 506) and can serve as a refresher if you have taken either of those courses. If you have not taken those courses, these sections provide adequate background for the fourth section (the main component of this lecture). The fifth section is about causal assumptions, which again overlaps with the discussion materials of the prerequisite courses.
  • Connection with Previous Lectures: Within our current course SPPH 604, prior to this lecture, you have learned about survey data analysis and missing data analysis. These topics will be instrumental in enhancing your understanding of propensity score analysis within the context of complex survey data analysis. The concept of subpopulations will be particularly valuable when handling the matching aspect of the analysis, where part of the data is omitted within the complex survey data analysis context. Nonetheless, accurate variance estimation remains crucial despite such omission, a topic we will revisit in the lab this week.

The following two tasks will be helpful in answering pre-reading quizzes for the week.

Video content, and estimated time to watch

Pre-reading Task 1:

These sections in the current lecture include videos, which will take about 1 hour and 35 minutes to watch.

There are additional optional sections and videos listed for those who are interested.

Main Reference

Pre-reading Task 2:

Note: The same reference was also a prerequisite for the SPPH 500-007 guest lecture and discussion.

There are additional optional references listed in each section for those who are interested.

Upcoming and Related Content
  • Class activity: Critically reviewing an article that implemented propensity score analysis in a survey data analysis context (within the class; after the discussion).
  • Optional materials: Additional and advanced content is available for those who are interested, but exploring those sections is not mandatory.
  • What is coming next in this week: After going through the required materials, you will be prepared for the upcoming lab content, where you will have the opportunity to implement the ideas discussed up to this point in the lecture in a series of real data analyses.
  • What is coming next in upcoming lectures: An extension of these ideas will be discussed in an upcoming lecture, where we will explore the ramifications of using machine learning methods to estimate propensity scores.

Key terms

Before delving into the various sections, let’s define some key terms and acronyms that will be used throughout:

  • Propensity Scores (PS): The probability of a unit being assigned a particular treatment given observed covariates.
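  • Average Treatment Effect (ATE): The average difference between the outcomes that would be observed if everyone in the population were treated and if no one were treated; it measures the overall impact of an intervention across an entire population.
  • Average Treatment Effect on the Treated (ATT): The same average difference, restricted to the units that actually received the treatment.
  • Standardized Mean Difference (SMD): The difference in the mean (or proportion) of a covariate between two groups, divided by a pooled standard deviation; it quantifies the magnitude of the difference between groups on a unit-free scale.
  • Inverse Probability Weighting (IPW): A technique that weights each unit by the inverse of the probability of receiving the treatment it actually received, creating a synthetic sample in which treatment assignment is independent of the observed covariates.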
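For readers who prefer notation, the key terms above can be written compactly as follows. This is only a sketch under assumed notation that does not come from the lecture itself: a binary treatment A (1 = treated, 0 = untreated), observed covariates L, potential outcomes Y(1) and Y(0), and, for a continuous covariate, group means and variances with subscripts 1 (treated) and 0 (untreated).

```latex
\begin{align*}
e(L) &= P(A = 1 \mid L) && \text{(PS)} \\
\text{ATE} &= E[\,Y(1) - Y(0)\,] \\
\text{ATT} &= E[\,Y(1) - Y(0) \mid A = 1\,] \\
\text{SMD} &= \frac{\bar{x}_{1} - \bar{x}_{0}}{\sqrt{(s_{1}^{2} + s_{0}^{2})/2}} \\
w_{\text{ATE}} &= \frac{A}{e(L)} + \frac{1 - A}{1 - e(L)} && \text{(IPW for the ATE)}
\end{align*}
```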

Summary

The outlined topics provide a comprehensive exploration into various facets of PS methods and their application in observational studies and surveys. Beginning with an in-depth look into key concepts and calculations related to ATE and ATT, the content navigates through the practical application and diagnostic checks of covariate balance using the SMD. It further elucidates the methodology and application of PS, particularly focusing on matching and weighting to mitigate bias and create comparable groups for causal inference. The intricacies of employing PS methods within surveys are explored, highlighting different approaches and the incorporation of design variables in PS and outcome models. Fundamental assumptions for causal inference, namely Conditional Exchangeability, Positivity, and Causal Consistency, are dissected to form a foundational understanding for conducting robust causal analyses. Additionally, the content optionally delves into the nuances of implementing IPW in surveys and provides a structured framework for reporting analyses using PS methods in research manuscripts. Lastly, additional optional content features insightful workshops, offering more explanations of PS method implementations in research contexts.

Outline of Topics

Section 1: Target Parameters for an Associational or Causal Analysis

The section provides an in-depth exploration of the concepts and calculations related to ATE and ATT, pivotal components in assessing the impact of interventions in research. Through illustrative examples, it underscores the concept of counterfactuals and introduces the expressions for ATE and ATT, distinguishing between their applications and implications across different research contexts, particularly in RCTs and observational studies. The content further elaborates on variations of ATE and ATT, such as Population and Sample ATE/ATT, highlighting their relevance in policy-making and subgroup analysis. The nuanced differences between ATE and ATT, especially in observational studies where treatment assignment might be influenced by various factors, are emphasized, underscoring the importance of meticulous selection and interpretation of these estimands in alignment with the research question and study design.
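To make the distinction between ATE and ATT concrete, here is a small sketch in Python using a hypothetical table of counterfactual outcomes. All values are made up for illustration. In real data only one of the two potential outcomes is observed per unit, which is why these quantities must be estimated rather than computed directly.

```python
# Each tuple: (treated indicator, outcome if treated, outcome if untreated).
# Hypothetical values; both potential outcomes are never observed together in practice.
units = [
    (1, 10, 6),
    (1, 8, 5),
    (0, 7, 6),
    (0, 9, 9),
]

# Individual-level effects: outcome if treated minus outcome if untreated
effects = [y1 - y0 for _, y1, y0 in units]

# ATE: average effect over all units
ate = sum(effects) / len(effects)

# ATT: average effect over the units that actually received treatment
treated_effects = [y1 - y0 for a, y1, y0 in units if a == 1]
att = sum(treated_effects) / len(treated_effects)

print("ATE:", ate)
print("ATT:", att)
```

In this toy table the treated units happen to benefit more than the untreated units would have, so the ATT is larger than the ATE. This is the kind of divergence that can arise in observational studies when treatment assignment is related to how much a unit benefits.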

Section 2: Diagnostics for Checking Covariate Balance

This section delves into the concept and application of the SMD, a statistical measure utilized to quantify the magnitude of difference between two groups. SMD not only facilitates the comparison of group differences when variables are measured in diverse units but also serves as a pivotal metric in assessing the balance of covariates between groups in observational studies, where achieving and verifying balance often necessitates additional statistical methodologies.
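As an illustration of the SMD for a continuous covariate, the sketch below divides the difference in group means by a pooled standard deviation. The function name and the age values are hypothetical, and this unweighted version ignores survey weights; in a weighted or survey setting the means and variances would be replaced by their weighted counterparts.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for a continuous covariate: mean difference over the pooled SD."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical ages in each group (illustrative values only)
age_treated = [52, 60, 58, 65, 55]
age_control = [45, 50, 48, 55, 47]

smd_age = standardized_mean_difference(age_treated, age_control)
print(f"SMD for age: {smd_age:.2f}")
```

Because the SMD is unit-free, the same threshold can be applied to every covariate. An absolute SMD below 0.1 is a commonly cited rule of thumb for acceptable balance, although other cut-offs are also used.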

Section 3: Propensity Score Matching: Steps

This section elucidates the concept and application of PS in observational studies to mitigate bias stemming from confounding when treatment assignment is non-random. PS represents the probability of a unit being assigned a particular treatment given observed covariates and is pivotal in creating comparable groups for estimating causal effects. Through methodologies like logistic regression or machine learning techniques, PS is modeled, ensuring that selected variables are confounders or risk factors for the outcome while typically excluding instruments and noise variables. The section further distinguishes between PS matching and PS weighting, elaborating on their respective methodologies and applications in creating balanced comparison groups or synthetic samples. The four-step process of PS modeling, encompassing covariate specification, subject matching via PS (or weighting using IPW), covariate balance checking, and treatment effect estimation, is delineated, providing a structured approach to employing PS in research. The comparative analysis of PS and regression highlights their assumptions, applications, and the intuitive, diagnostic, and flexible advantages PS matching might offer.
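The matching step of the four-step process described above can be sketched as follows. This is a minimal illustration, not the procedure used in the lecture or lab: it assumes the propensity scores have already been estimated (step 1), uses greedy 1:1 nearest-neighbour matching without replacement, and applies a caliper on the raw PS scale (a caliper on the logit scale is also common). The function name, the caliper value, and the PS values are hypothetical.

```python
def greedy_match(ps_treated, ps_control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns a list of (treated_index, control_index) pairs. A treated unit
    stays unmatched if its closest available control is outside the caliper.
    """
    available = set(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # Closest remaining control; ties are broken by the lower index
        j = min(available, key=lambda k: (abs(ps_control[k] - p), k))
        if abs(ps_control[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical estimated propensity scores
ps_treated = [0.30, 0.62, 0.90]
ps_control = [0.28, 0.35, 0.60, 0.10]

pairs = greedy_match(ps_treated, ps_control)
print(pairs)
```

Here the third treated unit has no control within the caliper and is left unmatched. After matching, balance would be checked (step 3, for example with the SMD) before estimating the treatment effect in the matched sample (step 4).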

Section 4: Propensity Score Matching in Complex Surveys

This section covers how PS models and outcome models are fitted within the context of surveys, and how each approach makes use of the survey design variables. A tabular summary compares different approaches, such as those by Zanutto (2006), DuGoff et al. (2014), Austin et al. (2018), and Lenis et al. (2017). The section also discusses whether and how design variables should be incorporated in the PS and outcome models.

Section 5: Assumptions for Causal Analyses

This section covers three fundamental assumptions in causal inference: Conditional Exchangeability, Positivity, and Causal Consistency. Conditional Exchangeability assumes that, within levels of the observed covariates, treatment assignment is independent of the potential outcomes (that is, there is no unmeasured confounding), so that any remaining difference in outcomes between treatment groups can be attributed to the treatment itself. Positivity requires that every individual, regardless of their covariate combination, has a non-zero probability of receiving each treatment level; this is vital for estimating causal effects and avoids division by zero (or by values close to zero) in methods like propensity score weighting. Lastly, Causal Consistency stipulates that the outcome observed for an individual under the treatment actually received equals that individual's potential outcome under that treatment level, so that the potential outcomes framework is well-defined. Together, these assumptions form the foundation for valid causal analyses and interpretable conclusions.
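Of the three assumptions, positivity is the one that can be partially examined in the data. The sketch below is one hedged way to do so, assuming estimated propensity scores are already available: it reports the region of common support and flags units with extreme scores. The function name, the threshold eps, and the input values are hypothetical. Passing such a check does not prove that positivity holds, and conditional exchangeability and causal consistency cannot be verified this way.

```python
def positivity_summary(ps, treated, eps=0.01):
    """Summarise the overlap of estimated propensity scores between groups."""
    ps_t = [p for p, a in zip(ps, treated) if a == 1]
    ps_c = [p for p, a in zip(ps, treated) if a == 0]

    # Common support: range of PS values seen in BOTH groups
    lower = max(min(ps_t), min(ps_c))
    upper = min(max(ps_t), max(ps_c))

    # Units with PS very close to 0 or 1 (would receive very large weights)
    extreme = [i for i, p in enumerate(ps) if p < eps or p > 1 - eps]
    # Units whose PS falls outside the common support
    outside = [i for i, p in enumerate(ps) if p < lower or p > upper]

    return {
        "common_support": (lower, upper),
        "extreme": extreme,
        "outside_support": outside,
    }


# Hypothetical estimated PS and treatment indicators
ps = [0.05, 0.40, 0.70, 0.995, 0.20, 0.60]
treated = [0, 0, 0, 1, 1, 1]

summary = positivity_summary(ps, treated)
print(summary)
```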

(Optional: Section 6) Propensity Score Weighting in Complex Surveys

This section, designated as advanced and optional, explains how to implement IPW in surveys, walking through both the basic IPW process and its application within complex surveys. A tabular summary contrasts the steps of the basic IPW process with those in complex surveys: fitting the PS model, converting PS to IPW, checking balance using the SMD in IPW-weighted data, and applying the outcome model with the respective weights. The section also introduces an alternative IPW for targeting the ATT and describes the methodological adaptations needed when weighting within survey data.
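The step of converting PS to IPW can be sketched as follows, for both the ATE and the ATT targets. The function name and all input values are hypothetical. Multiplying the IPW by the survey weight, as done at the end, is shown here as an assumption about one common way of combining the two sets of weights; it is not necessarily the exact procedure in the section's table.

```python
def ipw_weights(ps, treated, estimand="ATE"):
    """Convert propensity scores into inverse probability weights.

    ATE: treated units get 1 / ps, untreated units get 1 / (1 - ps).
    ATT: treated units get 1, untreated units get ps / (1 - ps).
    """
    weights = []
    for p, a in zip(ps, treated):
        if not 0 < p < 1:
            raise ValueError("propensity scores must be strictly between 0 and 1")
        if estimand == "ATE":
            weights.append(1 / p if a == 1 else 1 / (1 - p))
        elif estimand == "ATT":
            weights.append(1.0 if a == 1 else p / (1 - p))
        else:
            raise ValueError("estimand must be 'ATE' or 'ATT'")
    return weights


# Hypothetical inputs: estimated PS, treatment indicator, survey weights
ps = [0.25, 0.50, 0.80]
treated = [1, 0, 0]
survey_w = [100.0, 200.0, 50.0]

w_ate = ipw_weights(ps, treated, "ATE")
w_att = ipw_weights(ps, treated, "ATT")

# ASSUMPTION: combine the two by multiplying IPW by the survey weight
combined_ate = [w * s for w, s in zip(w_ate, survey_w)]
print(w_ate, w_att, combined_ate)
```

The combined weights would then be used when checking balance with the SMD and when fitting the outcome model, with variance estimation that accounts for the survey design.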

(Optional: Section 7) Reporting Guidelines for a Propensity Score Analysis

This section, earmarked as advanced and optional, provides a structured and detailed framework for reporting analyses using PS methods in manuscripts, aiming to enhance the transparency and rigor of research findings. The guidelines, presented in a comprehensive table, delineate specific reporting items across various sections of a manuscript, including the Abstract, Introduction, Methods, Results, and Discussion.

(Optional: Section 8) Additional Contents / Workshops

This section features video content from two workshops. Both workshops, accessible via embedded YouTube videos and accompanied by tutorial websites, offer in-depth insights and practical applications of PS methods in research contexts.

Important

If you notice any issues or mistakes on this webpage, please feel free to report them.