Sample data in CSV format. Save it in a file named broken_posix.csv:
Date
3/10/2012 23:00
3/11/2012 0:00
3/11/2012 1:00
3/11/2012 2:00
3/11/2012 3:00
3/11/2012 4:00
3/11/2012 5:00
3/11/2012 6:00
3/11/2012 7:00
3/11/2012 8:00
3/11/2012 9:00
3/11/2012 10:00
3/11/2012 11:00
3/11/2012 12:00
3/11/2012 13:00
3/11/2012 14:00
3/11/2012 15:00
3/11/2012 16:00
3/11/2012 17:00
3/11/2012 18:00
3/11/2012 19:00
3/11/2012 20:00
3/11/2012 21:00
3/11/2012 22:00
3/11/2012 23:00
3/12/2012 0:00
3/12/2012 1:00
3/12/2012 2:00
3/12/2012 3:00
3/12/2012 4:00
3/12/2012 5:00
3/12/2012 6:00
3/12/2012 7:00
3/12/2012 8:00
3/12/2012 9:00
3/12/2012 10:00
3/12/2012 11:00
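If it is easier than copying the rows by hand, the same file can be generated from R. This is only a sketch: the names `ts` and `stamp` are made up for illustration, and UTC is used purely to build a gap-free hourly sequence, not because the original data is known to be in UTC.

```r
# Build 37 hourly timestamps starting at 2012-03-10 23:00.
# UTC is an assumption here; it is chosen only because it has no DST gaps.
ts <- seq(as.POSIXct("2012-03-10 23:00", tz = "UTC"), by = "hour", length.out = 37)

# Format as month/day/year hour:minute, then drop the leading zeros
# so the text matches the sample above ("3/11/2012 2:00", not "03/11/2012 02:00").
stamp <- format(ts, "%m/%d/%Y %H:%M", tz = "UTC")
stamp <- sub("^0", "", stamp)   # leading zero of the month
stamp <- sub(" 0", " ", stamp)  # leading zero of the hour

# Write a one-column CSV with the header "Date" into the working directory.
write.csv(data.frame(Date = stamp), "broken_posix.csv",
          row.names = FALSE, quote = FALSE)
```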
So I have this file, broken_posix.csv. I can read the file just fine with
a_var <- read.csv("broken_posix.csv")
Then I can convert it to POSIX using
a_var_posixct = as.POSIXct(strptime( as.character( a_var$Date) , '%m/%d/%Y %H:%M'))
or with
a_var_posixlt = strptime(as.character( a_var$Date) , '%m/%d/%Y %H:%M')
The problem is that when I use POSIXct, I get 4 NA values in my date vector every year. When I use POSIXlt, I get one NA value, on March 11, 2012 at 2:00 (the start of daylight saving time).
You'll see what I mean when you run
which(is.na(a_var_posixct))
which(is.na(a_var_posixlt))
a_var_posixct[4]
a_var_posixlt[4]
The fourth value is always NA whenever you apply an operation to it, even though for POSIXlt it clearly prints as a date value.
I've tried omitting the value, only to end up messing up the rest of the POSIX vector. I've tried assigning the POSIX vector to itself in an attempt to clear the NA flag, to no effect. I've even tried converting it to a character value, only to lose the hour and minute formatting.
I think this happens because of daylight saving time. It's very frustrating to deal with, because when I run other functions on the dates I have to work around the NA values since I can't change them. I could aggregate the data by day, or just use Date objects, but that doesn't seem like the right approach.
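One way to test the daylight saving hypothesis is to parse the same strings in a time zone that has no DST transitions. The sketch below assumes the timestamps can be treated as UTC for the purpose of the check, and uses America/New_York only as a stand-in for a local zone that skipped 2:00 on that date; the vector `x` and both result names are made up for illustration. Whether the skipped hour actually comes back as NA in the second call can depend on the platform and R version, so no output is shown for it.

```r
# The three hours around the transition, written as in the CSV.
x <- c("3/11/2012 1:00", "3/11/2012 2:00", "3/11/2012 3:00")

# Parsed as UTC: there is no spring-forward gap, so 2:00 is a valid time.
utc <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "UTC")
is.na(utc)

# Parsed in a zone where 2:00 did not exist on 2012-03-11.
# Assumption: America/New_York stands in for the machine's local zone.
ny <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "America/New_York")
is.na(ny)
```

If the NA disappears under UTC, the parsing code is fine and the missing value comes from the local time zone's skipped hour rather than from the data.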