flowchart LR
AA((propensity score matching)) --> A[(Exposure modelling <br/>to estimate <br/>propensity scores)]
AA --> B[(Propensity score <br/>Matching)]
AA --> C[(Assess balance <br/>and overlap)]
AA --> D[(Outcome <br/>modelling)]
style AA fill:#f9f,stroke:#333,stroke-width:4px
The following lists are not exhaustive; they are meant to show the different possibilities at each step.
Step 1: Propensity score estimation
flowchart LR
subgraph ZA[" "]
direction LR
A[(Exposure <br/>modelling <br/>to<br/> estimate <br/>propensity <br/>scores)] --> A1(Variable selection)
A --> A2(Model specification)
end
subgraph ZAA[" "]
direction LR
A1 --> A11[Consult <br/>subject-area <br/>experts]
A1 --> A13[Build DAG <br/>from <br/>literature review]
A1 --> A12[Analytic methods <br/>to sort out <br/>problematic variables- <br/>collinearity]
A2 --> A21[Add interactions]
A2 --> A22[Add polynomial terms]
A2 --> A23[Add complex functions <br/>of one or more <br/>covariates]
A2 --> A24[Use machine learning <br/>methods to identify <br/>complex patterns]
end
subgraph ZAAA[" "]
direction LR
A21 --> A221[Unstable <br/>results - <br/>exposure <br/>model <br/>coef SEs <br/>are too <br/>large <br/>or infinite-<br/> Go back to <br/>Model <br/>specification <br/>step]
A22 --> A221
A23 --> A221
A24 --> A221
A11 --> A111[Carefully think <br/>about <br/>variable roles]
A13 --> A111
A12 --> A112[Merge covariates, <br/>covariate levels, <br/>omit some <br/>variables, <br/>find alternative and <br/>less problematic <br/>variable]
style A111 fill:#bbf,stroke:#f66,stroke-width:2px,color:#fff,stroke-dasharray: 5 5
end
ZA -.-> ZAA -.-> ZAAA
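As a minimal sketch of the exposure modelling step, the snippet below fits a logistic exposure model with one interaction term and turns the fitted model into propensity scores. The toy data, the covariate names (`age`, `sex`), and the helper functions are all assumptions made for illustration; in practice a statistical package would normally be used instead of a hand-written fitting loop.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def design_row(age, sex):
    """Intercept, two main effects, and one interaction term (age x sex)."""
    return [1.0, age, sex, age * sex]

def fit_logistic(rows, exposure, lr=0.1, n_iter=5000):
    """Fit a logistic exposure model by batch gradient ascent on the log-likelihood."""
    beta = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for row, a in zip(rows, exposure):
            resid = a - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Toy data, made up for illustration: standardized age, binary sex, binary exposure
age = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -1.2, 0.3, 0.8]
sex = [0, 1, 0, 1, 0, 1, 0, 1, 1, 0]
exposure = [0, 0, 1, 1, 0, 1, 1, 0, 0, 1]

rows = [design_row(x, s) for x, s in zip(age, sex)]
beta = fit_logistic(rows, exposure)

# Propensity score = estimated probability of exposure given the covariates
ps = [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in rows]
```

If the exposure groups were perfectly separated by the covariates, the coefficients in a fit like this would keep growing instead of settling, which corresponds to the "unstable results" node in the diagram and is a signal to revisit the model specification.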
The choice of variables to include in the propensity score model should be based on prior knowledge; purely empirical (data-driven) selection of variables is usually discouraged. Understanding the role of each variable (e.g., effect modifier vs. confounder) determines the analysis strategy.
flowchart LR
A1110[\Variable roles\] --> A1111{{Select variables <br/>causing outcome}}
A1110 --> A1112{{Select variables <br/>causing both outcome <br/>and exposure}}
A1110 --> A1113{{Omit variables <br/>only causing <br/>exposure}}
A1110 --> A1114{{Omit variables <br/>in the causal <br/>pathway}}
A1110 --> A1115{{Omit variables <br/>that are effects <br/>of the outcome}}
A1110 --> A1116{{Omit variables <br/>that are <br/>simply noise}}
A1110 --> A1117{{Add useful <br/>proxies}}
style A1110 fill:#bbf,stroke:#f66,stroke-width:2px,color:#fff,stroke-dasharray: 5 5
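The variable-role rules in the diagram above can be written down as a simple lookup. This is only a sketch: the role labels and the variable names below are hypothetical, and assigning a role to each variable still has to come from subject-area knowledge or a DAG, not from code.

```python
# Role labels are hypothetical names paraphrasing the nodes of the diagram above.
INCLUDE_ROLES = {"outcome_cause", "confounder", "proxy"}
OMIT_ROLES = {"exposure_only_cause", "mediator", "outcome_effect", "noise"}

def select_covariates(roles):
    """Return the variables to keep in the propensity score model.

    `roles` maps each variable name to one of the role labels above.
    """
    unknown = set(roles.values()) - INCLUDE_ROLES - OMIT_ROLES
    if unknown:
        raise ValueError(f"Unrecognized roles: {sorted(unknown)}")
    return sorted(name for name, role in roles.items() if role in INCLUDE_ROLES)

# Hypothetical variables with roles assigned from prior knowledge
roles = {
    "age": "confounder",                      # causes both exposure and outcome
    "baseline_severity": "outcome_cause",     # causes the outcome
    "physician_preference": "exposure_only_cause",
    "post_treatment_biomarker": "mediator",   # on the causal pathway
    "follow_up_visits": "outcome_effect",     # effect of the outcome
    "record_id_parity": "noise",
    "neighbourhood_income": "proxy",          # proxy for an unmeasured confounder
}
selected = select_covariates(roles)
```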
Step 2: Propensity score matching
flowchart LR
subgraph ZA[" "]
direction LR
B[(Propensity <br/>score <br/>Matching)] --> B1(Matching<br/> methods)
B --> B2(Matching <br/>ratios)
B --> B3(Replacement)
end
subgraph ZAA[" "]
direction LR
B1 --> B11[Nearest <br/>neighbor]
B1 --> B12[Caliper, <br/>0.2*SD of <br/>logit of <br/>propensity <br/>score]
B1 --> B13[Optimal]
B1 --> B14[Full]
B2 --> B21[Fixed <br/>ratio]
B2 --> B23[Variable <br/>ratio]
B3 --> B33[With]
B3 --> B34[Without]
end
subgraph ZAAA[" "]
direction RL
B21 --> B211[1:1, pair]
B21 --> B212[1:M,<br/> where M>1 <br/>can be <br/>any integer]
B1 --> B0>Some <br/>combinations <br/>assuming <br/>sample size <br/>does not <br/>reduce <br/>too much]
B2 --> B0
B3 --> B0
end
ZA -.-> ZAA -.-> ZAAA
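One combination from the diagram above (nearest neighbor, 1:1, without replacement, with a caliper of 0.2 times the SD of the logit of the propensity score) can be sketched as follows. The function name, the greedy processing order, and the choice to compute the SD over all units combined are assumptions of this sketch, and the propensity scores are made up.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def greedy_match(ps_treated, ps_control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Distances are taken on the logit of the propensity score; a treated unit
    stays unmatched if its nearest available control lies outside the caliper.
    Returns a list of (treated_index, control_index) pairs.
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    # Assumption: SD of the logit computed over treated and control units together
    caliper = caliper_sd * statistics.stdev(lt + lc)
    available = set(range(len(lc)))
    pairs = []
    for i, li in enumerate(lt):
        if not available:
            break
        # Nearest available control; ties broken by the lower index
        j = min(available, key=lambda k: (abs(lc[k] - li), k))
        if abs(lc[j] - li) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # without replacement
    return pairs

# Made-up propensity scores for illustration
ps_treated = [0.80, 0.60, 0.52, 0.30]
ps_control = [0.58, 0.31, 0.20, 0.50, 0.10]
pairs = greedy_match(ps_treated, ps_control)
```

In this toy example the first treated unit has no control within the caliper and is dropped, which illustrates how the caliper trades sample size for closer matches.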
Step 3: Balance of the propensity score matched dataset
flowchart TD
subgraph ZA[" "]
direction LR
C[(Assess balance <br/>and overlap)] --> C1(SMD)
C --> C2(Variance <br/>ratio)
C --> C3(Visualization, <br/>overlapping histograms, <br/>Love plot, <br/>balance tables)
end
subgraph ZAA[" "]
direction LR
C1 --> C11[\Unsatisfactory balance <br/>or overlap\]
C2 --> C11
C3 --> C11
C3 --> C31[\Propensity scores <br/>too close <br/>to 0 or 1\]
end
subgraph ZAAA[" "]
direction RL
C11 --> C111[/Go back to <br/>Exposure modelling step\]
C31 --> C111
C31 --> C112[/Trimming, <br/>not preferred\]
end
ZA -.-> ZAA -.-> ZAAA
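The two numeric balance diagnostics in the diagram above, the standardized mean difference (SMD) and the variance ratio, can be computed for a single continuous covariate as in the sketch below. The data are made up; the pooled-SD form of the SMD is one common definition and is assumed here.

```python
import math
import statistics

def smd(x_treated, x_control):
    """Standardized mean difference, using sqrt((s_t^2 + s_c^2) / 2) as the denominator."""
    pooled_sd = math.sqrt(
        (statistics.variance(x_treated) + statistics.variance(x_control)) / 2
    )
    return (statistics.mean(x_treated) - statistics.mean(x_control)) / pooled_sd

def variance_ratio(x_treated, x_control):
    """Ratio of sample variances, treated over control."""
    return statistics.variance(x_treated) / statistics.variance(x_control)

# Made-up ages in a matched sample
age_treated = [52, 62, 72]
age_control = [52, 60, 68]

age_smd = smd(age_treated, age_control)
age_vr = variance_ratio(age_treated, age_control)
```

Commonly used rules of thumb (not stated in the text above, so treat them as an assumption) are an absolute SMD below 0.1 and a variance ratio between 0.5 and 2. Values outside such ranges would point back to the exposure modelling step.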
Step 4: Treatment effect estimation from the outcome model
flowchart TD
subgraph ZA[" "]
direction LR
D[(Outcome <br/>modelling)] --> D1(Crude)
D --> D2(Adjusted)
D --> D4(Variance <br/>estimation)
end
subgraph ZAA[" "]
direction LR
D2 --> D221[All covariates, <br/>preferred]
D2 --> D222[Partial list <br/>of covariates]
D4 --> D41[Cluster options]
D4 --> D42[Bootstrap options]
end
ZA -.-> ZAA
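As a rough sketch of the crude branch combined with a bootstrap option for variance estimation, the snippet below estimates the mean outcome difference within matched pairs and resamples whole pairs (treating each pair as a cluster) to get a standard error. The outcome values, function names, and number of bootstrap replicates are assumptions for illustration; whether a pair bootstrap is appropriate depends on the matching design.

```python
import random
import statistics

def crude_effect(pairs):
    """Mean within-pair outcome difference (treated minus control)."""
    return statistics.mean(y_t - y_c for y_t, y_c in pairs)

def pair_bootstrap_se(pairs, n_boot=2000, seed=1):
    """Bootstrap standard error, resampling matched pairs with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        estimates.append(crude_effect(resample))
    return statistics.stdev(estimates)

# Made-up outcomes for five matched pairs: (treated outcome, control outcome)
matched_outcomes = [(5.0, 4.0), (6.5, 5.0), (4.0, 4.5), (7.0, 5.5), (5.5, 5.0)]

estimate = crude_effect(matched_outcomes)
se = pair_bootstrap_se(matched_outcomes)
```

An adjusted analysis would instead regress the outcome on the exposure and the covariates in the matched sample; the crude version is shown only because it fits in a few lines.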
Reporting
- We should report the results and interpret the treatment effect estimates in the context of the research question and the underlying assumptions.
- We need to clearly communicate the limitations and potential biases of the analysis, and report the results of useful sensitivity analyses.
flowchart TD
subgraph ZA[" "]
direction LR
D6(Sensitivity <br/>Analysis <br/>for <br/>overall process) --> D5(Sensitivity <br/>Analysis <br/>for unmeasured <br/>confounding)
end
subgraph ZAA[" "]
direction LR
D5 --> D51[Rosenbaum <br/>bounds]
D5 --> D52[Quantitative <br/>bias analysis, <br/>E-value]
D6 --> D62[Alternative <br/>matching <br/>algorithm]
D6 --> D61[Alternative <br/>model <br/>specifications]
D6 --> D63[Alternative <br/>missing data <br/>method]
end
ZA -.-> ZAA
- We should also discuss the implications of the findings for the target population and the broader scientific literature.
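For the E-value node in the sensitivity analysis diagram, a minimal sketch for a risk ratio point estimate is given below. It uses the standard formula E = RR + sqrt(RR × (RR − 1)), applied to 1/RR when the estimate is below 1. The function name and the example value are assumptions; E-values for other effect measures or for confidence limits need additional steps that are not shown.

```python
import math

def e_value(rr):
    """E-value for a risk ratio point estimate.

    Minimum strength of association (on the risk ratio scale) that an
    unmeasured confounder would need with both exposure and outcome to
    fully explain away the observed estimate.
    """
    if rr <= 0:
        raise ValueError("Risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))

# Hypothetical observed risk ratio
observed_rr = 2.0
ev = e_value(observed_rr)
```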