I am looking for patterns for manipulating data.table objects whose structure resembles that of data frames created with melt from the reshape2 package. I am dealing with data tables with millions of rows, so performance is critical.
The general form of the question is: is there a way to group on a subset of the values in a column and have the result of the grouping operation create one or more new columns?
A specific form of the question is how to use data.table to accomplish the equivalent of what dcast does in the following:
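library(data.table)
library(reshape2)
library(plyr)  # provides .() used in the subset argument

input <- data.table(
  id       = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  variable = c('x', 'y', 'y', 'x', 'y', 'y', 'x', 'x', 'y', 'other'),
  value    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

# Sum value per id, one output column per variable, keeping only 'x' and 'y'
reshape2::dcast(input,
  id ~ variable, sum,
  subset = .(variable %in% c('x', 'y')),
  value.var = 'value')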
the output of which is

  id  x  y
1  1  1  5
2  2  4 11
3  3 15  9
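
For comparison, here is a minimal sketch of one direction that might work in pure data.table, with no claim that it is the fastest approach. It assumes the set of wanted values ('x' and 'y') is known up front and that the data.table package is recent enough to provide its own dcast method; the names result and wide are only illustrative.

```r
library(data.table)

input <- data.table(
  id       = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  variable = c('x', 'y', 'y', 'x', 'y', 'y', 'x', 'x', 'y', 'other'),
  value    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

# Option 1: filter rows in i, build one summed column per wanted value in j,
# and group by id. The column names are hard-coded (an assumption of this sketch).
result <- input[variable %in% c('x', 'y'),
                list(x = sum(value[variable == 'x']),
                     y = sum(value[variable == 'y'])),
                by = id]

# Option 2: filter first, then reshape with data.table's own dcast method,
# which avoids hard-coding the output column names.
wide <- data.table::dcast(input[variable %in% c('x', 'y')],
                          id ~ variable,
                          value.var = 'value',
                          fun.aggregate = sum)

print(result)
print(wide)
```

Option 1 makes a single grouped pass but needs the columns listed by hand; Option 2 generalizes to any set of values at the cost of a separate filtering step.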