

Dear R People:
The aggregate function works very well on regular time series.
Is there a version for zoo or its that would take daily data and
convert it to monthly, please?
Thanks in advance,
Sincerely,
Erin

Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [hidden email]
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


See ?aggregate.zoo, e.g.
library(zoo)
z <- zoo(1:1000, as.Date("2000-01-01") + 0:999)
aggregate(z, as.yearmon, mean)
or replace mean with whatever summarization you want.
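As a rough sketch of swapping in other summaries (the variable names are illustrative only, and this assumes the zoo package is installed): arguments given after the function are passed on to it, so `tail, 1` should pick the last observation of each month.

```r
library(zoo)

# 1000 daily observations starting on 2000-01-01
z <- zoo(1:1000, as.Date("2000-01-01") + 0:999)

# Monthly mean, as in the example above
monthly_mean <- aggregate(z, as.yearmon, mean)

# Monthly total
monthly_sum <- aggregate(z, as.yearmon, sum)

# Last observation of each month; the trailing 1 is passed on to tail()
monthly_last <- aggregate(z, as.yearmon, tail, 1)

head(monthly_mean, 3)
```

The result is again a zoo series, now indexed by "yearmon" values rather than dates.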
On Sun, Mar 7, 2010 at 5:29 PM, Erin Hodgess < [hidden email]> wrote:
> Dear R People:
>
> The aggregate function works very well on regular time series.
>
> Is there a version for zoo or its that would take daily data and
> convert it to monthly, please?


x <- c(0,0,1,2,3,0,0,4,5,6)
How can I identify the regions of non-zeros and average c(1,2,3) and c(4,5,6) to get 2 and 5?
Thanks


Try this:
> x <- c(0,0,1,2,3,0,0,4,5,6)
> # partition the data: each zero starts a new group
> x.p <- split(x, cumsum(x == 0))
> # now only process groups of length > 1
> x.mean <- lapply(x.p, function(a){
+     if (length(a) == 1) return(NULL)
+     # drop the leading zero, then average the rest
+     return(list(grp=tail(a, -1), mean=mean(tail(a, -1))))
+ })
> # now only return the real values
> x.mean[unlist(lapply(x.mean, length) != 0)]
$`2`
$`2`$grp
[1] 1 2 3
$`2`$mean
[1] 2
$`4`
$`4`$grp
[1] 4 5 6
$`4`$mean
[1] 5
On Sun, Mar 7, 2010 at 9:48 PM, Daren Tan < [hidden email]> wrote:
>
> x <- c(0,0,1,2,3,0,0,4,5,6)
>
>
>
> How to identify the regions of nonzeros and average c(1,2,3) and c(4,5,6)
> to get 2 and 5.

Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?


Try this:
x <- c(0,0,1,2,3,0,0,4,5,6)
rl <- rle(x == 0)
grp <- rep(seq_along(rl$lengths), rl$lengths)
res <- tapply(x, grp, mean)
res[res > 0]
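To see what each step produces, here is a commented walk-through of the same idea. One assumption on my part: the final filter `res > 0` would also drop a non-zero run whose mean happens to be zero or negative, so this sketch selects runs by the rle() flags instead. The name `nonzero_means` is illustrative only.

```r
x <- c(0, 0, 1, 2, 3, 0, 0, 4, 5, 6)

# Run-length encode the zero / non-zero pattern
rl <- rle(x == 0)
# rl$lengths: 2 3 2 3
# rl$values:  TRUE FALSE TRUE FALSE   (TRUE marks a run of zeros)

# Give every element the index of the run it belongs to
grp <- rep(seq_along(rl$lengths), rl$lengths)
# grp: 1 1 2 2 2 3 3 4 4 4

# Mean of every run, zero runs included
res <- tapply(x, grp, mean)

# Keep only the runs that rle() flagged as non-zero
nonzero_means <- res[!rl$values]
nonzero_means
```

For the data in this thread both filters give 2 and 5; they differ only when negative values are present.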
I hope it helps.
Best,
Dimitris
On 3/8/2010 3:48 AM, Daren Tan wrote:
>
> x <- c(0,0,1,2,3,0,0,4,5,6)
>
>
>
> How to identify the regions of nonzeros and average c(1,2,3) and c(4,5,6) to get 2 and 5.

Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014


Hi Jim, I was following this thread and found your answer perfect.
However, I could not understand the meaning of the expression
"cumsum(x == 0)". If I paste it into the R window, I get the following:
> cumsum(x == 0)
[1] 1 2 2 2 2 3 4 4 4 4
I went through the help page of the cumsum() function and understand
that it calculates the cumulative sum, but I could not really
understand the meaning of cumsum(x == 0).
Would you please explain that?
Thanks,
-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of jim holtman
Sent: 08 March 2010 08:32
To: Daren Tan
Cc: [hidden email]
Subject: Re: [R] Average regions of nonzeros


What I was looking for was each string of non-zero values and where it
'broke'. I could have used 'rle', but I sometimes find this approach just
as easy. Every place there is a zero, x == 0 will be TRUE, which has the
value 1. 'cumsum' will generate a running sum of these values. When there
is a non-zero value, the running sum does not change, so consecutive values
of cumsum are the same. That is what you saw when pasting the expression
into the window. Notice that the run of '2's begins with a value of zero
and then includes all the non-zero values following. By using 'split', I
can create a list of each group. If the group is of length 1, then it only
contains a zero and I ignore it. If the length is greater than 1, then we
have some non-zero values, and we have to throw away the leading zero in
the group (tail(a, -1)) and then take the mean.
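The steps described here can be traced one at a time. This is only a sketch of the same idea; the intermediate names (`is_zero`, `group_id`, `parts`, `run_means`) are made up for illustration.

```r
x <- c(0, 0, 1, 2, 3, 0, 0, 4, 5, 6)

# TRUE wherever x is zero; TRUE counts as 1 in arithmetic
is_zero <- x == 0

# Running count of zeros seen so far: it steps up at each zero and
# stays flat across the non-zero values that follow
group_id <- cumsum(is_zero)
# group_id: 1 2 2 2 2 3 4 4 4 4

# One list element per group; each group starts with a zero
parts <- split(x, group_id)
# parts$`2` is c(0, 1, 2, 3) and parts$`4` is c(0, 4, 5, 6)

# Keep groups longer than 1, drop their leading zero, and average
run_means <- sapply(parts[lengths(parts) > 1],
                    function(a) mean(tail(a, -1)))
run_means
```

Note that this relies on every non-zero run being preceded by a zero, which holds for the vector in this thread.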
HTH
On Mon, Mar 8, 2010 at 3:26 AM, bogaso.christofer <[hidden email]> wrote:
> Hi Jim I was following this thread and found that your answer is perfect
> there. However I could not comprehend the meaning of the expression "
> cumsum(x == 0)". If I paste it in R window, I get following :
>
> > cumsum(x == 0)
> [1] 1 2 2 2 2 3 4 4 4 4
>
> I gone through the help page of cumsum() function I correctly understand
> that this function calculates the cumulative sum. But could not understand
> really the meaning of cumsum(x == 0)
>
> Would you please explain that?

Jim Holtman
Cincinnati, OH
+1 513 646 9390


Nice use of cumsum(). Here is a small improvement:
> x <- c(0,0,1,2,3,0,0,4,5,6)
> x.groups <- split(x, (x != 0) * cumsum(x == 0))[-1]
> x.groups
$`2`
[1] 1 2 3
$`4`
[1] 4 5 6
> lapply(x.groups, mean)
$`2`
[1] 2
$`4`
[1] 5
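The masked key works because multiplying by (x != 0) sends every zero to group 0, which is then discarded. Two caveats, as far as I can tell: a vector that starts with non-zero values gives that first run the key 0 as well, and a vector with no zeros at all has only one group. The sketch below adds 1 to the running count and drops the zero group by name to cover both cases. The function name `mean_nonzero_runs` is hypothetical, and the sketch assumes x contains no NA values.

```r
# Hypothetical helper: mean of each run of consecutive non-zero values
mean_nonzero_runs <- function(x) {
  # Zeros get key 0; each non-zero run gets a positive key that changes
  # whenever a zero is crossed. The + 1 keeps a leading non-zero run
  # from sharing key 0 with the zeros.
  key <- (x != 0) * (cumsum(x == 0) + 1)
  groups <- split(x, key)
  # Drop the group of zeros by name rather than by position
  groups <- groups[names(groups) != "0"]
  vapply(groups, mean, numeric(1))
}

mean_nonzero_runs(c(0, 0, 1, 2, 3, 0, 0, 4, 5, 6))
mean_nonzero_runs(c(1, 2, 0, 3))
```

The names on the result are the internal group keys, so wrap the call in unname() if only the means are wanted.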
On Mon, Mar 8, 2010 at 11:02 AM, jim holtman < [hidden email]> wrote:


Or for real power and flexibility, see the 'convert()' function in package
tis.
Jeff
Gabor Grothendieck < [hidden email]> writes:
> See ?aggregate.zoo, e.g.
>
> library(zoo)
> z <- zoo(1:1000, as.Date("2000-01-01") + 0:999)
> aggregate(z, as.yearmon, mean)
>
> or replace mean with whatever summarization you want.
>
> On Sun, Mar 7, 2010 at 5:29 PM, Erin Hodgess < [hidden email]> wrote:
>> Dear R People:
>>
>> The aggregate function works very well on regular time series.
>>
>> Is there a version for zoo or its that would take daily data and
>> convert it to monthly, please?
>>

Jeff

