I have a data frame that's maybe best approximated as:
library(data.table)
z <- rep("z",5)
y <- c(rep("st",2),rep("co",2),"fu")
var1 <- c(rep("a",2),rep("b",2),"c")
var2 <- c("y","y","y","z","x")
transp <- c("bus","plane","train","bus","bus")
sample1 <- sample(1:10, 5)
sample2 <- sample(1:10, 5)
# data.table() keeps sample1/sample2 numeric; cbind() would coerce every column to character
df <- data.table(z, y, var1, var2, transp, sample1, sample2)
> df
   z  y var1 var2 transp sample1 sample2
1: z st    a    y    bus       4       3
2: z st    a    y  plane      10       7
3: z co    b    y  train       8       9
4: z co    b    z    bus       1       5
5: z fu    c    x    bus       6       4
Every var1/var2 combination I need is already present in the table. I want to expand the table so that each existing var1/var2 combination has a row for every transp option found in this vector:
transtype <- c("bus","train")
Notice that "plane" is an option in df but not in transtype. I would like to keep the existing row with transp = "plane", but not add any new "plane" rows. In the added rows, z and y need to be filled in with the values belonging to that var1/var2 combination, and sample1 and sample2 should be NA. The result should be:
> result
   z  y var1 var2 transp sample1 sample2
1: z st    a    y    bus       4       3
2: z st    a    y  plane      10       7
3: z st    a    y  train      NA      NA
4: z co    b    y  train       8       9
5: z co    b    y    bus      NA      NA
6: z co    b    z    bus       1       5
7: z co    b    z  train      NA      NA
8: z fu    c    x    bus       6       4
9: z fu    c    x  train      NA      NA
The data.table approaches I've come up with, based on "Fastest way to add rows for missing values in a data.frame?" and "Data.table: Add rows for missing combinations of 2 factors without losing associated descriptive factors", end up expanding to every possible combination of var1 and var2 values, not just the combinations that already exist in the table. I also don't know how to keep the values of z and y. For example:
setkey(df, var1, var2, transp)
x <- df[CJ(var1, var2, transp, unique = TRUE)]
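To make the problem concrete, here is a self-contained version of that attempt. It is only a sketch for illustration: the sample values are hard-coded to the ones printed above rather than drawn with sample(), and the table is built directly with data.table(), which is my simplification of the setup code. It shows that the cross join produces far more rows than wanted and leaves z and y empty in every added row.

```r
library(data.table)

# Same toy data as above, with fixed sample values so the run is reproducible
df <- data.table(
  z       = "z",
  y       = c("st", "st", "co", "co", "fu"),
  var1    = c("a", "a", "b", "b", "c"),
  var2    = c("y", "y", "y", "z", "x"),
  transp  = c("bus", "plane", "train", "bus", "bus"),
  sample1 = c(4L, 10L, 8L, 1L, 6L),
  sample2 = c(3L, 7L, 9L, 5L, 4L)
)
setkey(df, var1, var2, transp)

# Cross join of every distinct var1, var2 and transp value
x <- df[CJ(var1, var2, transp, unique = TRUE)]

# 3 var1 values * 3 var2 values * 3 transp values = 27 rows, not the 9 wanted
n_rows <- nrow(x)

# Only 4 var1/var2 pairs actually exist in df
n_pairs <- nrow(unique(df[, .(var1, var2)]))

# In every row added by the join, z and y are NA instead of being filled in
added_have_na <- x[is.na(sample1), all(is.na(z) & is.na(y))]

print(c(n_rows = n_rows, n_pairs = n_pairs))
print(added_have_na)
```

So the join invents pairs such as var1 = "a" with var2 = "x", adds "plane" rows everywhere, and drops the z and y values, which are the three things I am trying to avoid.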
Maybe I should be using dplyr? Or maybe I'm missing something simple? I went through the data.table documentation and can't come up with a solution.