I have been trying to figure out how the subset argument in R's lm() function works. In particular, the following code puzzles me:
data(mtcars)
summary(lm(mpg ~ wt, data=mtcars))
summary(lm(mpg ~ wt, cyl, data=mtcars))
In both cases the regression has 32 observations:
dim(lm(mpg ~ wt, cyl ,data=mtcars)$model)
[1] 32 2
dim(lm(mpg ~ wt ,data=mtcars)$model)
[1] 32 2
yet the coefficients change (along with the R-squared). The help page doesn't say much about this:
subset: an optional vector specifying a subset of observations to be used in the fitting process.
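To make the comparison concrete, here is a minimal sketch of what seems to be going on. It assumes that, because data is passed by name, the second positional argument cyl is matched to subset, and that a numeric subset is treated as a vector of row indices rather than a logical filter. The object names fit_pos, fit_named, fit_idx and idx are only illustrative.

```r
data(mtcars)

# Assumption: with data= named, the positional `cyl` is matched to `subset`
fit_pos   <- lm(mpg ~ wt, cyl, data = mtcars)
fit_named <- lm(mpg ~ wt, data = mtcars, subset = cyl)
all.equal(coef(fit_pos), coef(fit_named))

# Assumption: a numeric subset is used as row indices, so the values of cyl
# (4, 6 and 8) pick rows 4, 6 and 8 of mtcars over and over again
idx     <- mtcars$cyl
fit_idx <- lm(mpg ~ wt, data = mtcars[idx, ])
all.equal(coef(fit_pos), coef(fit_idx))

# Still 32 rows in the model frame, but only three distinct observations
nrow(fit_pos$model)
sort(unique(idx))
```

If both all.equal() calls return TRUE, that would explain why the number of observations stays at 32 while the coefficients change. A logical vector such as subset = cyl == 4 is probably what one would normally pass to keep only some of the rows.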