Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Is there some way to use rollapply (from the zoo package, or something similar) or its optimized variants (rollmean, rollmedian, etc.) to compute rolling functions over a time-based window instead of one based on a number of observations? What I want is simple: for each element in an irregular time series, I want to compute a rolling function with an N-day window. That is, the window should include all the observations up to N days before the current observation. Time series may also contain duplicates.

Here follows an example. Given the following time series:

      date  value
 1/11/2011      5
 1/11/2011      4
 1/11/2011      2
 8/11/2011      1
13/11/2011      0
14/11/2011      0
15/11/2011      0
18/11/2011      1
21/11/2011      4
 5/12/2011      3

A rolling median with a 5-day window, aligned to the right, should result in the following calculation:

> c(
    median(c(5)),
    median(c(5,4)),
    median(c(5,4,2)),
    median(c(1)),
    median(c(1,0)), 
    median(c(0,0)),
    median(c(0,0,0)),
    median(c(0,0,0,1)),
    median(c(1,4)),
    median(c(3))
   )

 [1] 5.0 4.5 4.0 1.0 0.5 0.0 0.0 0.0 2.5 3.0
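
For reference, the expected values above can be reproduced with a brute-force base R sketch along the following lines. The helper name roll_by_days is hypothetical (it is not part of zoo), and it recomputes every window from scratch, so it only illustrates the window definition, not an optimized solution.

```r
# Dates and values from the example above (dates are day/month/year)
dates  <- as.Date(c("1/11/2011", "1/11/2011", "1/11/2011", "8/11/2011",
                    "13/11/2011", "14/11/2011", "15/11/2011", "18/11/2011",
                    "21/11/2011", "5/12/2011"), format = "%d/%m/%Y")
values <- c(5, 4, 2, 1, 0, 0, 0, 1, 4, 3)

# Hypothetical helper: for each observation i, apply FUN to all observations
# up to and including i whose date lies within n_days before dates[i]
roll_by_days <- function(dates, values, n_days, FUN = median) {
  values <- as.numeric(values)  # keep the result type stable for vapply
  vapply(seq_along(values), function(i) {
    candidates <- seq_len(i)
    in_window  <- candidates[dates[candidates] >= dates[i] - n_days]
    FUN(values[in_window])
  }, numeric(1))
}

result <- roll_by_days(dates, values, 5)
result
```

Because each window is rebuilt independently, the cost grows with the window size for every observation, which is what rollmedian avoids for fixed-width windows.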

I have already found some solutions out there, but they are usually tricky, which usually means slow. I managed to implement my own rolling function calculation. The problem is that for very long time series the optimized version of median (rollmedian) can make a huge difference in run time, since it takes the overlap between windows into account. I would like to avoid reimplementing it. I suspect there is some trick with the rollapply parameters that will make it work, but I cannot figure it out. Thanks in advance for the help.


1 Answer

As of version 1.9.8 (on CRAN 25 Nov 2016), data.table has gained the ability to perform non-equi joins, which can be used here.

The OP has requested

for each element in an irregular time series, I want to compute a rolling function with an N-day window. That is, the window should include all the observations up to N days before the current observation. Time series may also contain duplicates.

Note that the OP has requested to include all the observations up to N days before the current observation. This is different from requesting all the observations up to N days before the current day.

For the latter, I would expect one value for 1/11/2011, i.e., median(c(5, 4, 2)) = 4.

Apparently, the OP expects an observation-based rolling window which is limited to N days. Therefore, the join conditions of the non-equi join have to consider the row number as well.
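
To make the distinction concrete, here is a minimal base R sketch that uses only the three observations dated 1/11/2011. The variable names (first_day, obs_based, day_based) are assumptions chosen for illustration.

```r
# The three values observed on 1/11/2011, in their original order
first_day <- c(5, 4, 2)

# Observation-based window: row i only sees rows 1..i of the same day
obs_based <- vapply(seq_along(first_day),
                    function(i) median(first_day[seq_len(i)]),
                    numeric(1))

# Day-based window: one value for the whole day
day_based <- median(first_day)

obs_based  # 5.0 4.5 4.0
day_based  # 4
```

The first three values of the OP's expected result match obs_based, which is why the row number has to enter the join conditions.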

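library(data.table)
n_days <- 5L
# rn is the row number; it limits each window to rows up to the current one
setDT(DT)[, rn := .I][
  # one lookup row per observation: row number, date, and window start
  .(ur = rn, ud = date, ld = date - n_days), 
  on = .(rn <= ur, date <= ud, date >= ld),
  # by = .EACHI computes the median once per lookup row
  median(as.double(value)), by = .EACHI]$V1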
[1] 5.0 4.5 4.0 1.0 0.5 0.0 0.0 0.0 2.5 3.0

For the sake of completeness, a possible solution for the day-based rolling window could be:

setDT(DT)[.(ud = unique(date), ld = unique(date) - n_days), on = .(date <= ud, date >= ld), 
   median(as.double(value)), by = .EACHI]
         date       date  V1
1: 2011-11-01 2011-10-27 4.0
2: 2011-11-08 2011-11-03 1.0
3: 2011-11-13 2011-11-08 0.5
4: 2011-11-14 2011-11-09 0.0
5: 2011-11-15 2011-11-10 0.0
6: 2011-11-18 2011-11-13 0.0
7: 2011-11-21 2011-11-16 2.5
8: 2011-12-05 2011-11-30 3.0
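
In the output above, both join columns are printed as date: the first holds the upper bound of the window (ud) and the second the lower bound (ld). One possible way to get clearer column names is to build the lookup table first and attach the medians to it. This is only a sketch; the names windows and med are hypothetical, and the data is rebuilt inline so that the block is self-contained.

```r
library(data.table)

# Same data as in the Data section below, with dates written in ISO format
DT <- data.table(
  date  = as.IDate(c("2011-11-01", "2011-11-01", "2011-11-01", "2011-11-08",
                     "2011-11-13", "2011-11-14", "2011-11-15", "2011-11-18",
                     "2011-11-21", "2011-12-05")),
  value = c(5, 4, 2, 1, 0, 0, 0, 1, 4, 3)
)
n_days <- 5L

# One row per distinct date: window end (ud) and window start (ld)
windows <- unique(DT[, .(ud = date)])[, ld := ud - n_days]

# Same non-equi join as above; $V1 extracts the medians in lookup order
windows[, med := DT[windows, on = .(date <= ud, date >= ld),
                    median(as.double(value)), by = .EACHI]$V1]
windows
```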

Data

library(data.table)
DT <- fread("      date  value
 1/11/2011      5
 1/11/2011      4
 1/11/2011      2
 8/11/2011      1
13/11/2011      0
14/11/2011      0
15/11/2011      0
18/11/2011      1
21/11/2011      4
 5/12/2011      3")[
   # coerce date from character string to integer date class
   , date := as.IDate(date, "%d/%m/%Y")]
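
The format string matters here because the dates are written day first. A small base R check (as.Date is used instead of as.IDate only to keep the sketch free of dependencies; the variable names are illustrative):

```r
# "1/11/2011" read as day/month/year versus month/day/year
day_first   <- as.Date("1/11/2011", format = "%d/%m/%Y")
month_first <- as.Date("1/11/2011", format = "%m/%d/%Y")

format(day_first)    # "2011-11-01"
format(month_first)  # "2011-01-11"
```

With the wrong format the first three observations would land in January, and the 5-day windows would no longer match the expected result.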
