This is a good question, and highlights some of the difficulty in dealing with dates in R. The lubridate package is very handy, so below I present two approaches, one using base (as suggested by @RJ-) and the other using lubridate.
Recreate (the first few rows of) the dataframe from the original post:
foo <- data.frame(start.time = c("2012-02-06 15:47:00",
                                 "2012-02-06 15:02:00",
                                 "2012-02-22 10:08:00"),
                  duration   = c(1, 2, 3))
Convert to a date-time class (two ways to do this). Note that strptime returns a POSIXlt object while ymd_hms returns POSIXct; both inherit from POSIXt.
# using base::strptime
t.str <- strptime(foo$start.time, "%Y-%m-%d %H:%M:%S")
# using lubridate::ymd_hms
library(lubridate)
t.lub <- ymd_hms(foo$start.time)
Now, extract time as decimal hours
# using base::format
h.str <- as.numeric(format(t.str, "%H")) +
as.numeric(format(t.str, "%M"))/60
# using lubridate::hour and lubridate::minute
h.lub <- hour(t.lub) + minute(t.lub)/60
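As an aside, the example timestamps all have zero seconds, so hours and minutes are enough here. If your data carries non-zero seconds, a minimal base-R sketch might look like the following; `decimal_hour` is a hypothetical helper name of my own, not a function from base R or lubridate, and the timestamp is made up for illustration.

```r
# Hypothetical helper: time of day as decimal hours, seconds included
decimal_hour <- function(t) {
  lt <- as.POSIXlt(t)                      # exposes $hour, $min and $sec fields
  lt$hour + lt$min / 60 + lt$sec / 3600
}

# Made-up timestamp with non-zero seconds
x <- as.POSIXct("2012-02-06 15:47:30", tz = "UTC")
decimal_hour(x)
```

The same helper should also accept the POSIXlt object returned by strptime, since as.POSIXlt leaves it unchanged.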
Demonstrate that these approaches are equal:
identical(h.str, h.lub)
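A word of caution on identical(): it demands an exact match, which is strict for decimal numbers. If two routes to the same decimal hour ever differ by floating-point rounding, all.equal() is the more forgiving check. A small sketch with made-up values, not taken from the data above:

```r
a <- 0.1 + 0.2
b <- 0.3

exact <- identical(a, b)          # FALSE: the two doubles differ in the last bits
close <- isTRUE(all.equal(a, b))  # TRUE: equal within a small tolerance
```

Wrapping all.equal() in isTRUE() matters because it returns a character description, rather than FALSE, when the values differ.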
Then choose one of the above approaches to assign the decimal hour to foo$hr:
foo$hr <- h.str
# If you prefer, the choice can be made at random:
foo$hr <- if(runif(1) > 0.5){ h.str } else { h.lub }
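Then plot using the ggplot2 package:
library(ggplot2)
# foo$hr is numeric (decimal hours), so label a continuous x axis as HH:MM
hm_label <- function(h) sprintf("%02d:%02d", as.integer(h %/% 1), as.integer(round((h %% 1) * 60)))
ggplot(foo, aes(x = hr, y = duration)) +
  geom_point() +
  scale_x_continuous(labels = hm_label)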