

Hello!
I have stacked data in a data.frame with 2 factors, an ordered POSIXct column, and the actual value as numeric (as if for lattice::xyplot).
I would like to calculate the first difference using the "diff" function within the corresponding subsets/partitions. Since the data.frame is organized by factors and has sorted dates, "by" seems like a good candidate for the job. However, it returns just a dumb list of vectors.
It seems that I can either use expand.grid to remap the results of "by" and hope that I don't mess up the order, or use "unique(subset(x,select=c(foo,bar)))".
Overall, that looks like quite a few steps for such a task, not counting assigning those differences back to the original data.frame starting from the 2nd position in each partition (as diff returns a shorter vector).
Am I on the right track or is there an easier way to do that?
Mikhail
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


If you would post a subset of your data so that we can see what you
are talking about, we could probably help you come up with a solution.
On Sat, Mar 3, 2012 at 7:50 PM, Mikhail Titov < [hidden email]> wrote:
> Am I on the right track or is there an easier way to do that?

Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


It'd be doubly helpful if you could post desired output as well.
If you haven't seen it before, the easiest way to post R data is to
use the dput() function to get a plaintext (mailing list friendly)
representation. If your data is large, dput(head(DATA, 30)) should
suffice.
(We wouldn't want to clog those internet tubes...)
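As a small illustration of the dput() suggestion, here is a sketch with a made-up data frame. DATA and its columns are placeholders invented for this example, not the poster's data.

```r
# Hypothetical stand-in for the real data; names are invented for illustration.
DATA <- data.frame(
  foo = factor(rep(c("A", "B"), each = 3)),
  day = rep(seq(as.Date("2011-01-01"), by = "day", length.out = 3), times = 2),
  val = c(1.0, 1.5, 2.5, 4.0, 3.0, 5.0)
)

# Print a plain-text representation of the first rows, ready to paste into an email.
dput(head(DATA, 30))

# A reader can rebuild the object from the pasted text; this round trip
# through deparse()/parse() mimics that copy-and-paste step.
txt <- paste(deparse(head(DATA, 30)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
```

Because the pasted text carries the column classes (factor, Date, numeric) along with the values, the rebuilt object should behave like the original when others run code against it.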
Michael
On Sat, Mar 3, 2012 at 8:55 PM, jim holtman < [hidden email]> wrote:
> If you would post a subset of your data so that we can see what you
> are talking about, we could probably help you come up with a solution.


"R. Michael Weylandt" < [hidden email]> writes:
> It'd be doubly helpful if you could post desired output as well.
I beg everyone's pardon; I suddenly realized that in my case the solution is
trivial. Here is an example with mock-up data.
Let's generate some data:
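#+begin_src R
# Mock-up data: one row per day for every bar/foo combination
qq <-
  expand.grid(
    day=seq(ISOdate(2011,1,1),ISOdate(2011,12,31),by='day'),
    bar=1:4,
    foo=factor(c('A','B','G','I'))
  )
# Sine wave over the year: amplitude set by bar, phase shifted by foo
ww <-
  within(qq,
    val <- bar * sin(as.double(day-day[1],"days")
                     / as.double(diff(range(day)),"days")
                     * 2*pi
                     + as.numeric(foo)/2
                     )
  )
#+end_src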
We can take a look at it with
#+begin_src R :results graphics :exports both :file z.png
library(lattice)
xyplot(val~day|foo,ww,group=ww$bar, type='l')
#+end_src
Now, since we drop the first element in each partition anyway,
we can apply diff to the entire data set at once
and then drop the first element of each partition.
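#+begin_src R
# Difference the whole column; row 1 has no predecessor and stays NA
ww[-1,"diff"] <- diff(ww$val)
# Drop the first day of each partition, whose difference spans two partitions
ee <- subset(ww, day>ISOdate(2011,1,1))
#+end_src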
And the final result:
#+begin_src R :results graphics :exports both :file x.png
xyplot(diff~day|foo,ee,group=ee$bar, type='l')
#+end_src
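The shortcut relies on the rows being ordered by partition and then by date. Where that is not guaranteed, a per-group first difference could be sketched with ave(), which returns a vector aligned with the original rows. The data frame d and the helper lagdiff below are made up for illustration and are not part of the example above.

```r
# Small made-up data set: 2 x 2 partitions, 4 days each, ordered by partition then day.
d <- expand.grid(
  day = seq(ISOdate(2011, 1, 1), by = "day", length.out = 4),
  bar = 1:2,
  foo = factor(c("A", "B"))
)
d$val <- seq_len(nrow(d))^2

# Hypothetical helper: first difference padded with NA so the length matches the group.
lagdiff <- function(v) c(NA, diff(v))

# ave() applies the helper within each foo/bar combination and keeps row alignment.
d$diff_by_group <- ave(d$val, d$foo, d$bar, FUN = lagdiff)

# Whole-vector shortcut: difference everything, then blank out the first row of each partition.
d$diff_whole <- c(NA, diff(d$val))
d$diff_whole[d$day == min(d$day)] <- NA

# For data ordered this way the two approaches agree.
all.equal(d$diff_by_group, d$diff_whole)
```

The ave() version does not depend on the partitions being contiguous, although the rows within each partition still need to be sorted by date for the differences to be meaningful.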

Mikhail

