Naive Per Protocol Estimation

The next effect estimate we calculate is the naive Per Protocol (naive PP) estimate. This corresponds to comparing all those who were randomized to receive the treatment, and truly did receive it against those who were randomized to receive the control and truly did receive the control.

The estimation of this treatment effect corresponds to:

\[logit(Y) = \gamma_{0} + \gamma_{A} A + \gamma_{t} t\]

For this estimate, we start by artificially censoring the dataset so that we only retain observations from individuals at time points where they were adherent to their randomized treatment.

#### First artificially censor the original data within each arm
simulated.data.treated <- simulated.data[simulated.data$Z==1 & 
                                           simulated.data$Alag1==1,]
simulated.data.control <- simulated.data[simulated.data$Z==0 & 
                                           simulated.data$Alag1==0,]

#### Join these to form the complete artificially censored dataset
simulated.data.combined <- rbind(simulated.data.treated, 
                                 simulated.data.control)
head(simulated.data.combined)
##   Z id t0         B          L1 L2 L2lag   cavgL1 A Alag1 Y
## 1 1  1  0 0.2780523  2.64317182  1     0 2.643172 1     1 0
## 2 1  1  1 0.2780523  6.14841344  1     1 4.395793 1     1 0
## 3 1  1  2 0.2780523  1.56694834  1     1 3.452845 1     1 0
## 4 1  1  3 0.2780523 -0.03690539  1     1 2.580407 1     1 0
## 5 1  1  4 0.2780523  0.32103495  1     1 2.128533 1     1 0
## 6 1  1  5 0.2780523 -2.22462997  0     1 1.403006 1     1 0

Next, we fit the pooled logistic regression model:

naivePPFit <- glm(Y ~ t0 + A, 
                  data = simulated.data.combined, 
                  family = binomial(link="logit"))
naivePPFit
## 
## Call:  glm(formula = Y ~ t0 + A, family = binomial(link = "logit"), 
##     data = simulated.data.combined)
## 
## Coefficients:
## (Intercept)           t0            A  
##   -10.96716      0.09479      1.26137  
## 
## Degrees of Freedom: 99371 Total (i.e. Null);  99369 Residual
## Null Deviance:       2365 
## Residual Deviance: 2051  AIC: 2057

Lastly, we calculate the appropriate standard error, p-values, and confidence intervals using the clustered sandwich standard errors:

cov.m1 <- vcovCL(naivePPFit, type = "HC0", 
                 cluster = simulated.data.combined$id)
co.test <- coeftest(naivePPFit, vcov = cov.m1)
co.CI <- coefci(naivePPFit, vcov = cov.m1, level = 0.95)
cat("Naive PP effect estimate:", "\n",
        "log(OR)", round(coefficients(naivePPFit)[["A"]],3), "\n",
      "95% CI:", round(co.CI["A","2.5 %"],3), ",",round(co.CI["A","97.5 %"],3), "\n",
      "p-value", round(co.test["A","Pr(>|z|)"],3), "\n",
      "robust standard error", round(sqrt(diag(cov.m1))[["A"]],3))
## Naive PP effect estimate: 
##  log(OR) 1.261 
##  95% CI: 0.895 , 1.628 
##  p-value 0 
##  robust standard error 0.187

The naive PP estimate concludes that the effect of treatment is significant, with a log odds ratio of 1.2613734. This result would indicate that the receipt of treatment is associated with an increase in the likelihood of the outcome occurring. However, naive PP estimates may be subject to bias if non-adherence is associated with the outcome of interest.

Alternatively, we can again use the summ() function to obtain the same estimates:

summ(naivePPFit, robust = "HC0", 
     confint = TRUE, digits = 3, cluster="id", 
     model.info = FALSE, model.fit = FALSE)
Est. 2.5% 97.5% z val. p
(Intercept) -10.967 -11.746 -10.188 -27.605 0.000
t0 0.095 0.080 0.109 12.669 0.000
A 1.261 0.896 1.627 6.769 0.000
Standard errors: Cluster-robust, type = HC0