I want to calculate the mean of each of several columns in a data.table, grouped by another column. My question is similar to two other questions on SO (one and two), but I couldn't apply those answers to my problem.
Here is an example:
library(data.table)
dtb <- fread(input = "condition,var1,var2,var3
one,100,1000,10000
one,101,1001,10001
one,102,1002,10002
two,103,1003,10003
two,104,1004,10004
two,105,1005,10005
three,106,1006,10006
three,107,1007,10007
three,108,1008,10008
four,109,1009,10009
four,110,1010,10010")
dtb
#     condition var1 var2  var3
#  1:       one  100 1000 10000
#  2:       one  101 1001 10001
#  3:       one  102 1002 10002
#  4:       two  103 1003 10003
#  5:       two  104 1004 10004
#  6:       two  105 1005 10005
#  7:     three  106 1006 10006
#  8:     three  107 1007 10007
#  9:     three  108 1008 10008
# 10:      four  109 1009 10009
# 11:      four  110 1010 10010
Calculating a single mean is easy; e.g. for "var1": dtb[ , mean(var1), by = condition]. But this quickly becomes cumbersome if there are many variables and you need to write out all of them. Thus, dtb[, list(mean(var1), mean(var2), mean(var3)), by = condition] is undesirable. I need the column names to be dynamic, and I wish to end up with something like this:
   condition  var1   var2    var3
1:       one 101.0 1001.0 10001.0
2:       two 104.0 1004.0 10004.0
3:     three 107.0 1007.0 10007.0
4:      four 109.5 1009.5 10009.5
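For reference, one common way to get this shape of result is to apply mean over .SD, data.table's per-group subset of the data. The following is only a minimal sketch: it rebuilds the example table with data.table() instead of fread(), and the names value_cols and res are my own, chosen for illustration. It also assumes every column other than condition is numeric and should be averaged.

```r
library(data.table)

# Same values as the fread() example above, built compactly.
dtb <- data.table(
  condition = rep(c("one", "two", "three", "four"), times = c(3, 3, 3, 2)),
  var1 = 100:110,
  var2 = 1000:1010,
  var3 = 10000:10010
)

# Assumption: every column except the grouping column should be averaged.
# value_cols is a hypothetical helper name, not part of data.table.
value_cols <- setdiff(names(dtb), "condition")

# .SD holds the columns listed in .SDcols for the current group;
# lapply() applies mean() to each of them, keeping the column names.
res <- dtb[, lapply(.SD, mean), by = condition, .SDcols = value_cols]
res
```

Because value_cols is computed from names(dtb), adding more variable columns would not require changing the call. If only some columns should be averaged, .SDcols can instead be given an explicit character vector of names.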