Testing Interactions

We will use the following article by Flegal et al. (2016). We will use the same data as before.

Objective: To statistically test whether an interaction term contributes significantly to a model when using complex survey data (NHANES) that has been imputed using mice.

The Scenario:

Outcome (\(Y\)): BMI (BMXBMI)
Exposure (\(X\)): Age (RIDAGEYR)
Moderator (\(Z\)): Smoking Status (SMQ020)

Question: Does the effect of Age on BMI depend on whether the person smokes?

load("Data/missingdata/Flegal2016.RData")
ls()
#> [1] "dat.full"
str(dat.full)
#> 'data.frame':    10175 obs. of  12 variables:
#>  $ SEQN    : 'labelled' int  73557 73558 73559 73560 73561 73562 73563 73564 73565 73566 ...
#>   ..- attr(*, "label")= chr "Respondent sequence number"
#>  $ RIDAGEYR: 'labelled' int  69 54 72 9 73 56 0 61 42 56 ...
#>   ..- attr(*, "label")= chr "Age in years at screening"
#>  $ RIAGENDR: Factor w/ 2 levels "Male","Female": 1 1 1 1 2 1 1 2 1 2 ...
#>  $ DMDEDUC2: Factor w/ 7 levels "Less than 9th grade",..: 3 3 4 NA 5 4 NA 5 3 3 ...
#>  $ RIDRETH3: Factor w/ 6 levels "Mexican American",..: 4 3 3 3 3 1 3 3 2 3 ...
#>  $ RIDEXPRG: Factor w/ 3 levels "Yes, positive lab pregnancy test",..: NA NA NA NA NA NA NA NA NA NA ...
#>  $ WTINT2YR: 'labelled' num  13281 23682 57215 55201 63710 ...
#>   ..- attr(*, "label")= chr "Full sample 2 year interview weight"
#>  $ SDMVPSU : 'labelled' int  1 1 1 2 2 1 1 1 2 1 ...
#>   ..- attr(*, "label")= chr "Masked variance pseudo-PSU"
#>  $ SDMVSTRA: 'labelled' int  112 108 109 109 116 111 105 114 106 112 ...
#>   ..- attr(*, "label")= chr "Masked variance pseudo-stratum"
#>  $ BMXBMI  : 'labelled' num  26.7 28.6 28.9 17.1 19.7 41.7 NA 35.7 NA 26.5 ...
#>   ..- attr(*, "label")= chr "Body Mass Index (kg/m**2)"
#>  $ SMQ020  : Factor w/ 3 levels "Yes","No","Don't know": 1 1 1 NA 2 1 NA 2 1 1 ...
#>  $ SMQ040  : Factor w/ 3 levels "Every day","Some days",..: 3 2 3 NA NA 3 NA NA 3 1 ...

Step 1: Imputation

Before modeling, we must handle missing data to avoid bias.

library(mice) 
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following objects are masked from 'package:base':
#> 
#>     cbind, rbind
library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart
# 1. Impute the data
# Recommendation: Set m = 20 for interaction tests to stabilize variance.
imp <- mice(dat.full, m = 20, maxit = 5)
#> 
#>  iter imp variable
#>   1   1  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   2  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   3  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   4  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   5  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   6  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   7  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   8  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   9  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   10  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   11  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   12  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   13  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   14  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   15  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   16  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   17  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   18  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   19  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   1   20  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   1  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   2  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   3  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   4  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   5  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   6  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   7  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   8  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   9  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   10  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   11  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   12  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   13  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   14  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   15  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   16  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   17  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   18  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   19  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   2   20  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   1  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   2  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   3  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   4  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   5  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   6  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   7  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   8  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   9  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   10  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   11  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   12  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   13  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   14  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   15  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   16  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   17  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   18  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   19  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   3   20  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   1  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   2  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   3  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   4  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   5  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   6  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   7  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   8  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   9  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   10  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   11  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   12  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   13  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   14  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   15  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   16  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   17  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   18  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   19  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   4   20  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   1  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   2  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   3  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   4  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   5  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   6  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   7  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   8  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   9  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   10  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   11  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   12  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   13  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   14  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   15  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   16  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   17  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   18  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   19  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#>   5   20  DMDEDUC2  RIDEXPRG  BMXBMI  SMQ020  SMQ040
#> Warning: Number of logged events: 200

# 2. Extract the data into a list
# We cannot use 'with()' easily with svydesign, so we create a list of
# completed dataframes to loop over manually.
imp_list <- complete(imp, "all")

Step 2: Fitting the Nested Models

To test an interaction, we must compare two models:

  • The Full Model: Includes the interaction term (Age * Smoking).
  • The Reduced Model: Includes only main effects (Age + Smoking).

We use lapply to fit these models across all 20 imputed datasets.

# --- Helper: Calculate Design Degrees of Freedom (dfcom) ---
# We need this to prevent NaN errors in the final test.
# It is usually (Number of PSUs - Number of Strata).
# We calculate it once using the first imputed dataset.
temp_des <- svydesign(id = ~SDMVPSU, 
                      strata = ~SDMVSTRA, 
                      weights = ~WTINT2YR, 
                      data = imp_list[[1]], 
                      nest = TRUE)
df_degrees_freedom <- degf(temp_des)
# May need to do subsetting if the original data was reduced.

Model A: The Full Model (With Interaction)

fit_full_list <- lapply(imp_list, function(df) {
  # 1. Define survey design for this iteration
  des <- svydesign(id = ~SDMVPSU, 
                   strata = ~SDMVSTRA, 
                   weights = ~WTINT2YR, 
                   data = df, 
                   nest = TRUE)
  
  # 2. Fit the GLM
  svyglm(BMXBMI ~ RIDAGEYR * SMQ020, design = des)
})

Model B: The Reduced Model (No Interaction)

fit_reduced_list <- lapply(imp_list, function(df) {
  des <- svydesign(id = ~SDMVPSU, 
                   strata = ~SDMVSTRA, 
                   weights = ~WTINT2YR, 
                   data = df, nest = TRUE)
  
  svyglm(BMXBMI ~ RIDAGEYR + SMQ020, design = des)
})

Step 3: Pooling and Testing

We convert our lists of models into mira objects (Multiply Imputed Repeated Analyses) so the mice package recognizes them. Then, we use the Wald Test (\(D_1\)).

Important

Note that pool.compare(fit_full, fit_reduced, method = "wald") is deprecated. D1 correctly accounts for the small sample correction inherent in your survey design (dfcom), whereas pool.compare might be overestimating the precision.

# 1. Convert to 'mira' class
fit_full <- list(analyses = fit_full_list)
class(fit_full) <- "mira"

fit_reduced <- list(analyses = fit_reduced_list)
class(fit_reduced) <- "mira"

# 2. Run the D1 Statistic (Multivariate Wald Test)
# We pass the dfcom we calculated earlier to ensure stability
test_result <- D1(fit_full, 
                  fit_reduced, 
                  dfcom = df_degrees_freedom)

Step 4: Detailed Breakdown of the Output

When you run print(test_result), you get a row of statistics.

names(test_result)
#> [1] "call"     "result"   "formulas" "m"        "method"   "use"      "dfcom"
test_result$dfcom
#> [1] 15
library(kableExtra)
kable(test_result$result)
F.value df1 df2 P(>F) RIV
0.5023733 2 2.625538 0.6535277 2.959959

The Definitions

Metric What it is How to Interpret
test statistic The F-value of the Wald test. High values (> 4) suggest the models are significantly different. Values near 1 suggest the interaction adds nothing.
df1 Numerator Degrees of Freedom. Number of parameters added.
dfcom Complete Data Degrees of Freedom. From survey design. Low dfcom (< 30) means limited power.
riv Relative Increase in Variance. Low (< 0.5) means stable imputations. High (> 1) means large uncertainty.
df2 Denominator Degrees of Freedom. Effective sample size after MI and design.
p.value Significance Level. < 0.05 indicates significant interaction.

Step 5: How to Report This

Scenario A: The Interaction Significant (p < 0.05)

“To test whether the association between age and BMI was modified by smoking status, we compared nested survey-weighted regression models using the multiparameter Wald test (\(D_1\)) on 20 imputed datasets. We found a significant interaction between age and smoking (\(F(2, 350.5) = 4.52, p = 0.01\)). Consequently, we present stratified results for smokers and non-smokers.”

Scenario B: The Interaction NOT Significant (p > 0.05)

“We tested for effect modification by smoking status using the \(D_1\) Wald statistic. The interaction term was not statistically significant (\(F(2, 105.2) = 0.88, p = 0.48\)). Therefore, the interaction term was removed, and we proceeded with the main effects model adjusted for smoking status.”

Important

You can use the same D1 function and workflow to test any set of nested models to check specification.

  • If you add an interaction term (e.g., Age * Smoking), the \(D_1\) test checks if any of these interaction coefficients are non-zero.

  • If you want to test whether adding a set of covariates (e.g., Race, Education, and Income) improves your model, you can compare a “Base Model” (without them) to a “Full Model” (with them). The \(D_1\) test will tell you if the set of covariates significantly improves the fit.