Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Wanting to use dplyr and case_when to collapse a series of indicator columns into a single column. The challenge is I want to be able to collapse over an unspecified/dynamic number of columns.

Consider the following dataset, gear has been split into a series of indicator columns.

library(dplyr)
data(mtcars)
mtcars = mtcars %>%
  mutate(g2 = ifelse(gear == 2, 1, 0),
         g3 = ifelse(gear == 3, 1, 0),
         g4 = ifelse(gear == 4, 1, 0)) %>%
  select(g2, g3, g4)

I am trying to write a function that does the reverse.

When I know how many cases this can be done as follows:

combine_indices = function(db, cols, vals){
  db %>% mutate(new_col = case_when(!!sym(cols[1]) == 1 ~ vals[1],
                                    !!sym(cols[2]) == 1 ~ vals[2],
                                    !!sym(cols[3]) == 1 ~ vals[3]))
}

cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)

However, I would like the combine_indices function to handle any number of index columns (right now it works for exactly three).

According to the documentation (?case_when), "if your patterns are stored in a list, you can splice that in with !!!". But I can not get this working:

patterns = list(sym(cols[1] == 1 ~ vals[1],
                sym(cols[2] == 1 ~ vals[2],
                sym(cols[3] == 1 ~ vals[3])

mtcars %>% mutate(new_col = case_when(!!!patterns))

Only produces a new column filled with NAs.

If !!!patterns worked, then it would be straightforward to take the lists cols and vals and generate patterns. However, I can not get the quosures correct. Hoping someone more familiar with quosures knows how.

Note - some similar questions here of SO were solved using joins or other functions. However, I am restricted to using case_when because of how it translates to sql when using dbplyr.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
1.0k views
Welcome To Ask or Share your Answers For Others

1 Answer

We can create a string of conditions, use parse_exprs and splice it (!!!).

library(dplyr)
library(rlang)

combine_indices = function(db, cols, vals){
   db %>% mutate(new_col = case_when(!!!parse_exprs(paste(cols, '== 1 ~', vals))))
}


cols = c("g2", "g3", "g4")
vals = c(2,3,4)
combine_indices(mtcars, cols, vals)

which returns :

#   g2 g3 g4 new_col
#1   0  0  1       4
#2   0  0  1       4
#3   0  0  1       4
#4   0  1  0       3
#5   0  1  0       3
#6   0  1  0       3
#....

where paste generates the conditions for case_when dynamically.

paste(cols, '== 1 ~', vals)
#[1] "g2 == 1 ~ 2" "g3 == 1 ~ 3" "g4 == 1 ~ 4"

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...