Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm trying to combine a list of data.frames that have different numbers of columns. The obvious do.call(rbind, df.lst) fails because the column counts do not match, so the real problem is padding the narrower data.frames with NAs.

df.lst <- list(A=data.frame(a=c(1,2),b=c(5,4),d=c(2,3),e=c(1,1),f=c(1,2),g=c(1,2)),
               B=data.frame(a=c(1,2),b=c(3,2),d=c(2,3)),
               C=data.frame(a=c(1,2),b=c(4,3),d=c(1,2),e=c(1,3))
               )

I can see that I need the maximum number of columns across the data.frames in the list, which I can get with the following code,

max(sapply(df.lst,ncol))

but after that I'm stuck. I have seen it suggested that indexing over the list can fill in the missing entries with NAs automatically.

Once I have the padded list, I anticipate a simple do.call() as described earlier. (I'm trying to keep the answer to base R, and although there are many similar questions, I can't seem to find an answer to this precise one.)
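
For reference, here is a minimal sketch of the failure, assuming the df.lst defined above. rbind on data.frames expects every argument to have the same number of columns, so the direct call stops with an error before any padding can happen. The tryCatch wrapper and the names n.cols and res are only there for illustration, so the snippet runs to completion.

```r
df.lst <- list(A=data.frame(a=c(1,2),b=c(5,4),d=c(2,3),e=c(1,1),f=c(1,2),g=c(1,2)),
               B=data.frame(a=c(1,2),b=c(3,2),d=c(2,3)),
               C=data.frame(a=c(1,2),b=c(4,3),d=c(1,2),e=c(1,3)))

# Number of columns in each data.frame; they differ across the list
n.cols <- sapply(df.lst, ncol)
n.cols

# Direct row-binding raises an error; capture its message instead of stopping
res <- tryCatch(do.call(rbind, df.lst),
                error = function(e) conditionMessage(e))
res
```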


1 Answer

If you want to stick with base R, you can do something like this:

### Get all the column names
col <- unique(unlist(sapply(df.lst, names)))
col
## [1] "a" "b" "d" "e" "f" "g"

### Fill the missing columns with NA
df.lst <- lapply(df.lst, function(df) {
  df[, setdiff(col, names(df))] <- NA
  df
})

### Then bind the rows
do.call(rbind, df.lst)
##     a b d  e  f  g
## A.1 1 5 2  1  1  1
## A.2 2 4 3  1  2  2
## B.1 1 3 2 NA NA NA
## B.2 2 2 3 NA NA NA
## C.1 1 4 1  1 NA NA
## C.2 2 3 2  3 NA NA
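
If this comes up more than once, the steps above could be wrapped in a small function. The following is only a sketch: rbind_fill_base is a hypothetical name, not something that exists in base R, and it assumes every element of the list is a data.frame with named columns. It also reorders the columns so that every padded data.frame has the same layout, which rbind does not strictly need (it matches columns by name) but makes the intermediate list easier to inspect.

```r
# Hypothetical helper (not part of base R): pad each data.frame, then row-bind
rbind_fill_base <- function(dfs) {
  # Union of all column names, in order of first appearance
  all.cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    # Add every missing column as an NA vector of the right length
    for (nm in setdiff(all.cols, names(df))) {
      df[[nm]] <- rep(NA, nrow(df))
    }
    # Put the columns in a common order
    df[all.cols]
  })
  do.call(rbind, padded)
}

df.lst <- list(A=data.frame(a=c(1,2),b=c(5,4),d=c(2,3),e=c(1,1),f=c(1,2),g=c(1,2)),
               B=data.frame(a=c(1,2),b=c(3,2),d=c(2,3)),
               C=data.frame(a=c(1,2),b=c(4,3),d=c(1,2),e=c(1,3)))

out <- rbind_fill_base(df.lst)
out
```

Unlike the version above, this leaves the original df.lst untouched, since the padding happens on copies inside the function.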
