Zoo - bug ???

Zoo - bug ???

sayan dasgupta
Hi folks,

I am not sure whether the following is a bug or expected behaviour.

Here is an example:

a <- zoo(c(NA,1:9),1:10)

Now if I do

rollapply(a,FUN=mean,width=3,align="right")

I get
> rollapply(a,FUN=mean,width=3,align="right")
 3  4  5  6  7  8  9 10
NA NA NA NA NA NA NA NA

But I shouldn't be getting NA, right? For index 10, for example, I should get
(1/3)*(9+8+7)

Similarly

> rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA NA NA NA NA NA NA NA


Zoo version :

> installed.packages()["zoo","Version"]
[1] "1.6-3"
>


My machine details

> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-intel32

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252
LC_MONETARY=English_India.1252 LC_NUMERIC=C
[5] LC_TIME=English_India.1252

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] zoo_1.6-3      rcom_2.2-1     rscproxy_1.3-1 Revobase_3.2.0

loaded via a namespace (and not attached):
[1] grid_2.10.1    lattice_0.18-3 tools_2.10.1
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Zoo - bug ???

Gavin Simpson
On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:

> Hi folks,
>
> I am confused whether the following is a bug or it is fine
>
> Here is the explanation
>
> a <- zoo(c(NA,1:9),1:10)
>
> Now If I do
>
> rollapply(a,FUN=mean,width=3,align="right")

mean() has argument na.rm which defaults to FALSE. As such, if NA are in
the computation the mean is undefined and the answer will be NA. If you
pass na.rm = TRUE to rollapply, mean ignores the NA and works on the
remaining values:

> rollapply(a,FUN=mean,width=3,align="right", na.rm = TRUE)
  3   4   5   6   7   8   9  10
1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0
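
To see the effect of na.rm in isolation, here is a minimal base-R sketch (no zoo needed). It looks only at the first right-aligned window of a, which covers indices 1 to 3, and at the last one, which covers indices 8 to 10. The variable names w and w_last are just for illustration.

```r
# First window of the series c(NA, 1:9) with width 3: values NA, 1, 2
w <- c(NA, 1, 2)

mean(w)                # NA: a single NA makes the default mean undefined
mean(w, na.rm = TRUE)  # 1.5: the NA is dropped, leaving mean(c(1, 2))

# Last window: values 7, 8, 9. No NA here, so na.rm makes no difference.
w_last <- c(7, 8, 9)
mean(w_last)           # 8
```

So na.rm = TRUE changes the result only for windows that actually contain an NA.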

HTH

G


--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Re: Zoo - bug ???

sayan dasgupta
On Tue, Jul 13, 2010 at 5:27 PM, Gavin Simpson <[hidden email]>wrote:

> On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:
> > Hi folks,
> >
> > I am confused whether the following is a bug or it is fine
> >
> > Here is the explanation
> >
> > a <- zoo(c(NA,1:9),1:10)
> >
> > Now If I do
> >
> > rollapply(a,FUN=mean,width=3,align="right")
>
> mean() has argument na.rm which defaults to FALSE. As such, if NA are in
> the computation the mean is undefined and the answer will be NA. If you
> pass na.rm = TRUE to rollapply, mean ignores the NA and works on the
> remaining values:
>
> > rollapply(a,FUN=mean,width=3,align="right", na.rm = TRUE)
>   3   4   5   6   7   8   9  10
> 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0
>

That is fine, but the point is that, logically, only the first 2 values of
the rollapply result should be NA.
For index 10, as I mentioned, the result should be the mean of 9, 8 and 7,
and there is no NA among them.
So it should not return NA.

Re: Zoo - bug ???

Gavin Simpson
On Tue, 2010-07-13 at 17:41 +0530, sayan dasgupta wrote:

> This is fine but the problem is logically when you are doing rollapply
> only the first 2 values should be NA
> but suppose for index 10 as I have mentioned the rollapply should be a
> mean of 9, 8, 7 and there is no NA here.
> So it should not return NA

Indeed, there seems to be something odd happening here: consider,

> rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA NA NA NA NA NA NA NA
> rollapply(a,FUN=mean,width=3, na.rm = FALSE)
 2  3  4  5  6  7  8  9
NA  2  3  4  5  6  7  8
> rollapply(a,FUN=mean,width=3, na.rm = TRUE)
  2   3   4   5   6   7   8   9
1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0

and if you debug zoo:::rollapply.zoo, the top one gets passed off to
rollmean early on in the code, whilst the second (na.rm = FALSE) is
handled by rollapply itself. And I see why this is happening. If ...
contains anything at all, then the code will not enter the switch
statement which passes off control to functions like rollmean() (in this
case). This explains the difference between the first call and the second
call with na.rm = FALSE.

And of course, this is mentioned on ?rollapply. Must read the help!!!

So, as rollmean doesn't accept an na.rm argument or pass it on, you need
to do

rollapply(a,FUN=mean,width=3, na.rm = FALSE)

This is not a bug: ?rollapply tells you what it does and points you
to ?rollmean, which states that it doesn't work for inputs with NAs. To
get the behaviour you want, though, you have to use the somewhat odd
workaround of forcing computation via rollapply by providing an extra
argument, even a gibberish one, e.g.:

rollapply(a,FUN=mean,width=3, foo = 1)

will work.
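
Why does every window come back NA on the rollmean path, rather than just the first one? One plausible explanation is that an NA entering a running sum never leaves it. This assumes rollmean builds its result from cumulative sums, which is an assumption about the implementation and not something stated in this thread. A base-R sketch with two hypothetical helpers, running_mean_cumsum and running_mean_direct, neither of which is zoo's actual code:

```r
# Hypothetical helper: rolling mean via cumulative sums (illustration only,
# not zoo's actual code).
running_mean_cumsum <- function(x, k) {
  n <- length(x)
  cs <- cumsum(x)  # once an NA is summed in, every later entry is NA too
  (cs[k:n] - c(0, cs[seq_len(n - k)])) / k
}

# Hypothetical helper: call mean() on each window separately.
running_mean_direct <- function(x, k) {
  n <- length(x)
  sapply(seq_len(n - k + 1), function(i) mean(x[i:(i + k - 1)]))
}

x <- c(NA, 1:9)
running_mean_cumsum(x, 3)  # NA for all 8 windows
running_mean_direct(x, 3)  # NA 2 3 4 5 6 7 8
```

The direct version matches the na.rm = FALSE output above: only the window that actually contains the NA is affected.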

HTH

G


Re: Zoo - bug ???

Gabor Grothendieck
In reply to this post by sayan dasgupta
On Tue, Jul 13, 2010 at 7:43 AM, sayan dasgupta <[hidden email]> wrote:

> Hi folks,
>
> I am confused whether the following is a bug or it is fine
>
> Here is the explanation
>
> a <- zoo(c(NA,1:9),1:10)
>
> Now If I do
>
> rollapply(a,FUN=mean,width=3,align="right")
>
> I get
>> rollapply(a,FUN=mean,width=3,align="right")
>  3  4  5  6  7  8  9 10
> NA NA NA NA NA NA NA NA
>
> But I shouldn't be getting NA right ? i.e for index 10 I should get
> (1/3)*(9+8+7)
>
> Similarly
>
>> rollapply(a,FUN=mean,width=3)
>  2  3  4  5  6  7  8  9
> NA NA NA NA NA NA NA NA

This is documented behavior (thanks to Gavin for pointing this out)
but I agree that it is undesirable and we will consider how to address
this.  In the meantime use
rollapply(a, 3, "mean")
so that it does not use rollmean or if you want NAs removed when doing
the mean calculation use na.rm = TRUE as Gavin suggested.
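
A sketch of these options side by side, using the series from the original post. It assumes the zoo package is installed. The anonymous-function wrapper (r2) is not from this thread; it rests on my assumption that any FUN other than a bare mean avoids the rollmean shortcut. Printed output may differ between zoo versions, so none is shown here.

```r
library(zoo)  # assumes the zoo package is installed

a <- zoo(c(NA, 1:9), 1:10)

# Option 1: name the function as a string, as suggested above.
r1 <- rollapply(a, 3, "mean")

# Option 2 (an assumption, not from the thread): wrap mean() in an
# anonymous function so that FUN is no longer the bare symbol `mean`.
r2 <- rollapply(a, 3, function(x) mean(x))

# Option 3: drop NAs inside each window before averaging.
r3 <- rollapply(a, 3, mean, na.rm = TRUE)

print(cbind(r1, r2, r3))
```

Options 1 and 2 keep an NA wherever a window contains one; option 3 averages whatever non-NA values remain in the window.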

Re: Zoo - bug ???

clangkamp
In reply to this post by Gavin Simpson
I am not sure whether this is 100% related, but I also get some unexpected behaviour with NA and rollapply.
In the following example I would expect to see a 1 instead of an NA in row 3 (the mean of [1, NA, NA]).
What do you think?


> mean(c(1,NA, NA), na.rm=TRUE)
[1] 1

> A1a<-zoo(c(NA, NA, 1,2,3,4,5,6, NA, NA))
> A1b<-rollapply(A1a,4,mean, na.rm=TRUE, na.pad=FALSE, align="right")
> View(cbind(A1a,A1b))
> cbind(A1a,A1b)
   A1a A1b
1   NA  NA
2   NA  NA
3    1  NA
4    2 1.5
5    3 2.0
6    4 2.5
7    5 3.5
8    6 4.5
9   NA 5.0
10  NA 5.5
Christian Langkamp
christian.langkamp-at-gmxpro.de

Re: Zoo - bug ???

Joshua Ulrich
Hi Christian,

On Tue, Nov 30, 2010 at 8:22 AM, clangkamp <[hidden email]> wrote:
>
> I am not sure whether this is 100 % related, but I also get some unexpected
> behaviour with NA and rollapply:
> In the following example I would expect in Line 3 to see a 1 instead of an
> NA (mean out of [1, NA, NA]
> What do you think ?
>

I'm confused.  Why do you expect the 3rd element to have a non-NA
value when you've specified width=4?  The first 3 elements will be
missing (or NA if na.pad=TRUE) when width=4 and align="right".

The results are as you expect when width=3:

> A1b <- rollapply(A1a,3,mean, na.rm=TRUE, na.pad=FALSE, align="right")
> cbind(A1a,A1b)
   A1a A1b
1   NA  NA
2   NA  NA
3    1 1.0
4    2 1.5
5    3 2.0
6    4 3.0
7    5 4.0
8    6 5.0
9   NA 5.5
10  NA 6.0
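
To make the role of width concrete, here is a base-R sketch that computes a right-aligned rolling mean by hand. The helper roll_right is hypothetical and is not zoo's implementation. With width k, the first output sits at index k, which is why index 3 has a value for width 3 but not for width 4.

```r
# Hypothetical helper: right-aligned rolling mean over full windows only.
# Output element i (for i >= k) summarises x[(i - k + 1):i].
roll_right <- function(x, k, ...) {
  idx <- k:length(x)
  out <- sapply(idx, function(i) mean(x[(i - k + 1):i], ...))
  names(out) <- idx  # label each result with the index it is aligned to
  out
}

x <- c(NA, NA, 1, 2, 3, 4, 5, 6, NA, NA)

roll_right(x, 3, na.rm = TRUE)  # indices 3..10: 1.0 1.5 2.0 3.0 4.0 5.0 5.5 6.0
roll_right(x, 4, na.rm = TRUE)  # indices 4..10: 1.5 2.0 2.5 3.5 4.5 5.0 5.5
```

The two result vectors correspond to the A1b columns shown for width 3 and width 4, minus the NA rows that cbind adds for the missing leading indices.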

HTH,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com


Re: Zoo - bug ???

clangkamp
Dear Joshua,
thank you for the pointer to the width argument. Even with na.pad=TRUE it still doesn't seem to work, but prepending an extra NA did.

A1a<-zoo(c( NA, NA, 1,2,3,4,5,6, NA, NA))
A1b<-rollapply(A1a,4,mean, na.rm=TRUE, na.pad=FALSE, align="right")
A1c<-rollapply(A1a,4,mean, na.rm=TRUE, na.pad=TRUE, align="right")
cbind(A1a,A1b, A1c)
   A1a A1b A1c
1   NA  NA  NA
2   NA  NA  NA
3    1  NA  NA
4    2 1.5 1.5

vs. the outcome with another leading NA:
> cbind(A1a,A1b, A1c)
   A1a A1b A1c
1   NA  NA  NA
2   NA  NA  NA
3   NA  NA  NA
4    1 1.0 1.0
5    2 1.5 1.5

But for now the issue is solved with the workaround.

Re: Zoo - bug ???

Gabor Grothendieck
On Tue, Nov 30, 2010 at 1:01 PM, clangkamp <[hidden email]> wrote:

>
> Dear Joshua
> thank you for the pointer to the width element. Actually still with
> na.pad=TRUE it doesn't seem to work but attaching an extra row of NAs did.
>
> A1a<-zoo(c( NA, NA, 1,2,3,4,5,6, NA, NA))
> A1b<-rollapply(A1a,4,mean, na.rm=TRUE, na.pad=FALSE, align="right")
> A1c<-rollapply(A1a,4,mean, na.rm=TRUE, na.pad=TRUE, align="right")
> cbind(A1a,A1b, A1c)
>   A1a A1b A1c
> 1   NA  NA  NA
> 2   NA  NA  NA
> 3    1  NA  NA
> 4    2 1.5 1.5
>
> vs. the same outcome with another NA
>> cbind(A1a,A1b, A1c)
>   A1a A1b A1c
> 1   NA  NA  NA
> 2   NA  NA  NA
> 3   NA  NA  NA
> 4    1 1.0 1.0
> 5    2 1.5 1.5
>
> But for now the issue is solved with the workaround.

The way rollapply works is that the input at each iteration must be k
long.   For the first k-1 entries of the input the corresponding
output for align= "right" is either NA if na.pad = TRUE or the entry
is just dropped if na.pad = FALSE.  That is correct and intended
behavior and also ensures that the variance of each output component
is the same in simple cases because each such output component is
derived from the same number of input components.

I believe what you are asking for is a new feature where optionally,
for k = 4 and align = "right" the first output component would be
FUN(x[1], ...), the second would be FUN(x[1:2], ...), the third would
be FUN(x[1:3], ...) and the remaining ones would be the same as they
are now.

The devel version of zoo actually has such a feature in it if you
specify the argument partial = TRUE.  Try this:

> library(zoo)
> source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=802&root=zoo")
> A1a <- zoo(c( NA, NA, 1,2,3,4,5,6, NA, NA))
> rollapply(A1a, 4, mean, na.rm = TRUE, align = "right", partial = TRUE)
  1   2   3   4   5   6   7   8   9  10
NaN NaN 1.0 1.5 2.0 2.5 3.5 4.5 5.0 5.5
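
The partial-window rule described above can also be sketched in base R, without the devel version. The helper roll_right_partial is hypothetical; it only mimics the described behaviour for align = "right" and is not zoo's code.

```r
# Hypothetical helper: right-aligned rolling mean with partial windows.
# For i < k the window is x[1:i]; from i = k onward it is the usual k values.
roll_right_partial <- function(x, k, ...) {
  out <- sapply(seq_along(x), function(i) {
    start <- max(1, i - k + 1)  # clamp the window at the start of the series
    mean(x[start:i], ...)
  })
  names(out) <- seq_along(x)
  out
}

x <- c(NA, NA, 1, 2, 3, 4, 5, 6, NA, NA)
roll_right_partial(x, 4, na.rm = TRUE)
# NaN NaN 1.0 1.5 2.0 2.5 3.5 4.5 5.0 5.5
```

The first two results are NaN rather than NA because, once na.rm = TRUE drops the NAs, those windows are empty and the mean of an empty vector is NaN.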

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: Zoo - bug ???

clangkamp
Dear Gabor, partial = TRUE works wonders; that is the bit I was missing. Thanks, Christian