This function creates a publication-ready, survey-weighted Kaplan-Meier plot with an attached "Number at Risk" table and censor markings, using `ggplot2` and `patchwork`.
Usage
svykmplot(
formula,
design,
time_unit = "years",
time_breaks = seq(0, 16, by = 4),
legend_title = "Strata",
risk_table_title = "Number at Risk",
palette = NULL,
show_pval = TRUE,
show_censor_marks = TRUE,
base_font_size = 11,
table_font_size = 3.5
)Arguments
- formula
A survival formula, e.g., `Surv(stime, status) ~ strata_var`.
- design
A survey design object created with the `survey` package.
- time_unit
The unit for time on the x-axis. One of `"days"`, `"months"`, or `"years"`. This determines the divisor for the time variable.
- time_breaks
A numeric vector of breaks for the x-axis, e.g., `seq(0, 10, by = 2)`.
- legend_title
A character string for the legend title.
- risk_table_title
A character string for the "Number at Risk" table's y-axis title.
- palette
A character vector of color codes (e.g., hex codes) to use for the strata.
- show_pval
Logical. If `TRUE`, calculates and displays the survey-weighted log-rank test p-value.
- show_censor_marks
Logical. If `TRUE`, displays censor markings (`+`) on the survival curves.
- base_font_size
Base font size for the plot theme.
- table_font_size
Font size for the text in the risk table.
Value
A list containing the following components:
- plot
The final combined `ggplot` object (plot + table) from `patchwork`.
- table
A `data.frame` (tibble) of the number-at-risk data.
- p_value
The numeric p-value from the survey-weighted log-rank test (if `show_pval = TRUE`).
- km_fit
The `svykm` fit object from the `survey` package.
- logrank_test
The `svylogrank` test object (if `show_pval = TRUE`).
Examples
if (FALSE) { # \dontrun{
# This example uses the 'nhanes_mortality' dataset,
if (requireNamespace("survey", quietly = TRUE)) {
# 1. Load the data
data(nhanes_mortality)
# 2. Create the main survey design object
analytic_design <- survey::svydesign(
strata = ~strata,
id = ~psu,
weights = ~survey_weight,
data = nhanes_mortality,
nest = TRUE
)
# 3. Create a subsetted design for females
design_female <- subset(analytic_design, sex == "Female")
# 4. Define the formula
km_formula <- Surv(stime, status) ~ caff
# 5. Define a 4-color palette
distinct_palette <- c("#377EB8", "#FF7F00", "#4DAF4A", "#E41A1C")
# 6. Run the function
km_results_female <- svykmplot(
formula = km_formula,
design = design_female,
legend_title = "Caffeine Consumption",
time_unit = "days", # Use "days" so divisor is 1
time_breaks = seq(0, 240, by = 60),
palette = distinct_palette,
show_pval = TRUE,
show_censor_marks = TRUE
)
# 7. Display the plot (and correct the x-axis label)
if (requireNamespace("ggplot2", quietly = TRUE)) {
km_results_female$plot +
ggplot2::labs(x = "Follow-up Time (Months)")
}
# 8. Print the at-risk table
print(km_results_female$table)
}
} # }