In a recent homework assignment, we were instructed to run 27 linear models, each time adding one more variable (the goal was to plot the changes in R² against the changes in adjusted R²). I found it difficult to create formulas like this algorithmically. The code I ended up using looked like the following (note that the first column in the data frame is the dependent variable and all the rest are prospective independent variables):
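# Build the formula string for the first `howfar` columns of d:
# column 1 is the response, columns 2..howfar are the predictors.
make.formula <- function(howfar) {
  formula <- c()
  for (i in 1:howfar) {
    if (i == 1) {
      formula <- paste(formula, names(d)[i], '~')
    } else if (i == howfar) {
      formula <- paste(formula, names(d)[i], '')
    } else {
      formula <- paste(formula, names(d)[i], '+')
    }
  }
  return(formula)
}
# One formula string per model size, converted to formula objects and then fitted.
formulas <- lapply(seq(2, length(d)), make.formula)
formulas <- lapply(formulas, as.formula)
fits <- lapply(formulas, lm, data = d)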
This works, but seems far from ideal, and my impression is that anything I'm doing with a for-loop in R is probably not being done the best way. Is there an easier way to algorithmically construct formulas for a given data frame?
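For comparison, here is a rough sketch of one loop-free way the same nested formulas might be built, using base R's reformulate(). The data frame below is only a stand-in (a few columns of the built-in mtcars) chosen so the snippet runs on its own, and the names response, predictors, r2 and adj_r2 are illustrative, not part of the assignment.

```r
# Stand-in data (an assumption for illustration): first column is the
# response, the remaining columns are the candidate predictors.
d <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt")]

response   <- names(d)[1]
predictors <- names(d)[-1]

# One formula per prefix of the predictor list:
# mpg ~ cyl, mpg ~ cyl + disp, mpg ~ cyl + disp + hp, ...
formulas <- lapply(seq_along(predictors), function(k) {
  reformulate(predictors[seq_len(k)], response = response)
})

# Fit each model against the same data frame.
fits <- lapply(formulas, function(f) lm(f, data = d))

# Collect both statistics so they can be plotted against model size.
r2     <- sapply(fits, function(m) summary(m)$r.squared)
adj_r2 <- sapply(fits, function(m) summary(m)$adj.r.squared)
```

This relies on the column names being syntactically valid R names; names containing spaces or other special characters would likely need backticks around them before being passed as term labels.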