Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have a data frame in R where the rows represent events, and one column is the date of the event. The thing the event is happening to is described by an ID column. So for each ID there are multiple entries.

How do I filter the data frame so that I retain only the most recent event for each ID? The IDs are integers and the dates are in the form mm/dd/yyyy.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
471 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can try

library(dplyr)
df %>% 
  group_by(ID) %>%
  slice(which.max(as.Date(date, '%m/%d/%Y')))

data

df <- data.frame(ID= rep(1:3, each=3), date=c('02/20/1989',
'03/14/2001', '02/25/1990',  '04/20/2002', '02/04/2005', '02/01/2008',
'08/22/2011','08/20/2009', '08/25/2010' ), stringsAsFactors=FALSE)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...