Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a small problem with dplyr's group_by() function. After running this:

datasetALL %>% group_by(YEAR,Region) %>% summarise(count_number = n()) 

Here is the result:

    YEAR Region count_number
   <int>  <int>        <int>
1   1946      1            2
2   1946      2            3
3   1946      3            1
4   1946      5            1
5   1947      3            1
6   1947      4            1

I would like something like this:

    YEAR Region count_number
   <int>  <int>        <int>
1   1946      1            2
2   1946      2            3
3   1946      3            1
4   1946      5            1
5   1946      4            0 # row order is not important
6   1947      1            0
7   1947      2            0
8   1947      3            1
9   1947      4            1
10  1947      5            0

I tried to use complete() from the tidyr package, but it did not produce the missing rows.

1 Answer

Using complete() from the tidyr package should work; its documentation describes the arguments.

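What probably happened is that you did not remove the grouping. After summarise(), the result is still grouped (by YEAR, since only the last grouping level is dropped), and complete() works within each group. Within a single YEAR group, the only Region values it sees are the ones already present for that year, so no new combinations are added. The fix is to remove the grouping first and then call complete().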

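library(dplyr)
library(tidyr)

datasetALL %>% 
    group_by(YEAR, Region) %>% 
    summarise(count_number = n()) %>%
    ungroup() %>%                      # drop the remaining YEAR grouping
    # column names are case-sensitive (YEAR, not Year); missing combinations get a count of 0
    complete(YEAR, Region, fill = list(count_number = 0))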
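To make the effect of the grouping visible, here is a minimal sketch on made-up data. The toy tibble standing in for datasetALL and the intermediate names counts and completed are assumptions for illustration only; the real data will differ. It checks which grouping is left after summarise() and then completes the ungrouped result.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for datasetALL (not the asker's data)
datasetALL <- tibble(
  YEAR   = c(1946L, 1946L, 1946L, 1947L, 1947L),
  Region = c(1L, 1L, 2L, 2L, 3L)
)

# summarise() drops only the last grouping level, so YEAR stays as a group
counts <- datasetALL %>%
  group_by(YEAR, Region) %>%
  summarise(count_number = n())

group_vars(counts)  # the grouping that is still in place

# Remove the grouping, then fill in every YEAR x Region combination with 0
completed <- counts %>%
  ungroup() %>%
  complete(YEAR, Region, fill = list(count_number = 0L))

print(completed)
```

With two years and three regions in the toy data, completed should hold one row per YEAR and Region pair, with zero counts for the pairs that never occur. Using 0L rather than 0 keeps count_number an integer column, matching what n() returns.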
