I have a data frame in R where one of the fields is composite (space-delimited). Here's an example of what I have:
users=c(1,2,3)
items=c("23 77 49", "10 18 28", "20 31 84")
df = data.frame(users,items)
(I don't build it; this is just for illustrative purposes.)
users items
1 23 77 49
2 10 18 28
3 20 31 84
I want to flatten the second column in order to have a list of (non-unique) user IDs and an individual item per row. So I want to end up with:
user item
1 23
1 77
1 49
2 10
2 18
2 28
3 20
3 31
3 84
I tried:
data.frame(user = df$users, item = unlist(strsplit(as.character(df$items), " ")))
But I get "arguments imply differing number of rows". I understand why, but can't find a solution to give me the result I want. Any ideas?
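One possible way around the error, sketched in base R: keep the result of strsplit as a list, then repeat each user ID once per item in its row so that both columns end up the same length. This assumes the delimiter is always a single space; the names `parts` and `flat` are just illustrative.

```r
# Example data from the question
users <- c(1, 2, 3)
items <- c("23 77 49", "10 18 28", "20 31 84")
df <- data.frame(users, items)

# Split each composite field into its pieces (a list, one element per row)
parts <- strsplit(as.character(df$items), " ")

# Repeat each user ID once per item so both columns have the same length
flat <- data.frame(
  user = rep(df$users, times = lengths(parts)),
  item = unlist(parts)
)
print(flat)
```

The original attempt fails because df$users has 3 elements while the unlisted items have 9; rep() with lengths(parts) closes that gap, and it also copes with rows that hold different numbers of items.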
Also, what is the most efficient way to do this, given that I have more than 20 million rows?
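For the size concern, here is a hedged sketch of a fully vectorised base R variant. It is not benchmarked on 20 million rows, so treat it as a starting point rather than the fastest option. It assumes the items are whole numbers separated by exactly one space, and `flatten_items` is a hypothetical helper name, not an existing function.

```r
# Example data from the question, keeping items as character (not factor)
users <- c(1, 2, 3)
items <- c("23 77 49", "10 18 28", "20 31 84")
df <- data.frame(users, items, stringsAsFactors = FALSE)

# Hypothetical helper: flatten a space-delimited column without looping over rows
flatten_items <- function(d) {
  # fixed = TRUE treats " " as a literal string and skips regex matching
  parts <- strsplit(d$items, " ", fixed = TRUE)
  data.frame(
    user = rep.int(d$users, lengths(parts)),
    # Assumption: every item is an integer; drop as.integer() otherwise
    item = as.integer(unlist(parts, use.names = FALSE)),
    stringsAsFactors = FALSE
  )
}

flat_fast <- flatten_items(df)
print(flat_fast)
```

Using fixed = TRUE and use.names = FALSE avoids some overhead that tends to matter on large inputs. If base R is still too slow, packages such as data.table or tidyr (separate_rows) are commonly used for this kind of reshaping and may be worth timing on a sample.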