One way to identify collinear predictors is a hierarchical clustering approach. Which R function can be used to run a hierarchical cluster analysis?
Which R function can be used to visualize a correlation matrix that shows the relationships between continuous variables?
Suppose we aim to develop a model to predict diabetes (a binary variable) based on some sociodemographic and clinical risk factors. We fit a logistic regression model as follows: mod <- glm(diabetes ~ age + sex + race + education + triglycerides + protein + bilirubin + phosphorus + sodium + potassium + globulin + calcium, data = dat.train, family = binomial). The predicted probability of diabetes is calculated as: pred.diabetes <- predict(mod, type = 'response', newdata = dat.test). How would you calculate the area under the curve (AUC) on the test data (dat.test)?
Which methods could be used to measure prediction error for continuous outcomes?
Suppose you aim to build a model to predict cardiovascular disease (CVD) among Canadian adults using logistic regression. Which methods could be used to deal with model overfitting? (Select ALL that apply)