

Hello!
I have stacked data in a data.frame with 2 factors, an ordered POSIXct column, and the actual value as numeric (as if for lattice::xyplot).
I would like to calculate the first difference using the "diff" function within the corresponding subsets/partitions. Since the data.frame is organized by factors and has sorted dates, "by" seems like a good candidate for the job. However, it returns just a dumb list of vectors.
It seems that I can either use expand.grid to remap the results of "by" and hope that I don't mess up the order, or use "unique(subset(x,select=c(foo,bar)))".
Overall, that looks like quite a few steps for such a task, not counting the assignment of those differences back to the original data.frame starting from the 2nd position in each partition (as diff returns a shorter vector).
Am I on the right track or is there an easier way to do that?
Mikhail
If you would post a subset of your data so that we can see what you
are talking about, we could probably help you come up with a solution.
On Sat, Mar 3, 2012 at 7:50 PM, Mikhail Titov
It'd be doubly helpful if you could post desired output as well.
If you haven't seen it before, the easiest way to post R data is to
use the dput() function to get a plaintext (mailing list friendly)
representation. If your data is large, dput(head(DATA, 30)) should
suffice.
(We wouldn't want to clog those internet tubes...)
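As a minimal sketch of that suggestion: the data frame DATA below is a made-up stand-in (its columns and values are assumptions, not the poster's data). It shows what gets pasted into a message and how a reader can turn the pasted text back into an object.

```r
# Hypothetical stand-in for the poster's data
DATA <- data.frame(
  day = as.Date("2011-01-01") + 0:2,
  foo = factor(c("A", "A", "B")),
  val = c(0.1, 0.4, 0.9)
)

# Plain-text representation, suitable for pasting into an email
dput(head(DATA, 30))

# A reader can rebuild the object from that text
txt <- capture.output(dput(head(DATA, 30)))
restored <- eval(parse(text = txt))
```

In practice the reader would simply paste the dput() output after `restored <-` in their own session.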
Michael
"R. Michael Weylandt" writes:
> It'd be doubly helpful if you could post desired output as well.
I beg everyone's pardon, I suddenly realized that in my case the solution is
trivial. Here is an example with mock data.
Let's generate some data:
#+begin_src R
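# One row per (day, bar, foo) combination; day varies fastest
qq <-
  expand.grid(
    day=seq(ISOdate(2011,1,1),ISOdate(2011,12,31),by='day'),
    bar=1:4,
    foo=factor(c('A','B','G','I'))
  )
# One sine period over the year, scaled by bar and phase-shifted by foo
ww <-
  within(qq,
    val <- bar * sin(as.double(day-day[1],"days")
                     / as.double(diff(range(day)),"days")
                     * 2*pi
                     + as.numeric(foo)/2
                     )
  )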
#+end_src
We can take a look at it with
#+begin_src R :results graphics :exports both :file z.png
library(lattice)
xyplot(val~day|foo, ww, groups=bar, type='l')
#+end_src
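Now, since we ditch the first element of each partition anyway,
we can apply diff to the entire data set at once: the only differences that
straddle two partitions land on the first row of a partition.
Then we ditch the very first element of each partition
(here every partition starts on 2011-01-01).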
#+begin_src R
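# diff() returns one element fewer, so fill from the 2nd row on; row 1 stays NA
ww[-1,"diff"] <- diff(ww$val)
# Drop the first day of every partition
ee <- subset(ww, day>ISOdate(2011,1,1))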
#+end_src
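The shortcut above relies on every partition starting on the same date. For data where that does not hold, one possible alternative is a per-partition diff with ave(). This is a sketch on a shortened mock data set (5 days, 2 levels per factor, a simplified formula), and the column name diff_by_group is made up for illustration. It also checks that both approaches agree on the rows that are kept.

```r
# Shortened mock data in the same shape as ww (sizes and formula are assumptions)
qq <- expand.grid(
  day = seq(ISOdate(2011, 1, 1), ISOdate(2011, 1, 5), by = "day"),
  bar = 1:2,
  foo = factor(c("A", "B"))
)
ww <- within(qq,
  val <- bar * sin(as.numeric(day - day[1], units = "days") + as.numeric(foo) / 2)
)

# Per-partition first difference, padded with a leading NA so that the
# result has the same length as the partition it came from
ww$diff_by_group <- ave(ww$val, ww$foo, ww$bar,
                        FUN = function(v) c(NA, diff(v)))

# Whole-vector shortcut, as in the message
ww[-1, "diff"] <- diff(ww$val)

# After dropping the first day of each partition, both columns should agree
ee <- subset(ww, day > ISOdate(2011, 1, 1))
same <- isTRUE(all.equal(ee$diff, ee$diff_by_group))
```

With ave() the first row of each partition is NA, so those rows can be filtered with is.na() instead of by date.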
And the final result:
#+begin_src R :results graphics :exports both :file x.png
xyplot(diff~day|foo, ee, groups=bar, type='l')
#+end_src
Mikhail
