
var/sd and NAs in R2.7.0


var/sd and NAs in R2.7.0

McGehee, Robert
Hello all,
I just upgraded to R 2.7.0 and found that the behavior of 'var' and 'sd'
has changed in the presence of NAs (this wasn't explicit in the NEWS file,
though I see it probably has to do with the change for cor/cov). Anyway,
I just want to make sure that it was intentional to produce an error
when all values are NA and na.rm=TRUE, rather than returning NA (like
R 2.6.2) or NaN (like the function 'mean' does). That is, isn't the
purpose of 'na.rm=TRUE', in part, to suppress these error messages?

Specifically,
> var(c(NA, NA, NA), na.rm=TRUE) # R2.6.2
[1] NA  
> var(c(NA, NA, NA), na.rm=TRUE) # R2.7.0
Error during wrapup: no complete observations in cov/cor

I think I can get the old behavior by setting use='p', but the 'sd'
function does not have a 'use' argument and I'd like not to get an error
here. Anyway, I'm a fan of the old behavior (not producing an error),
but if there was a reason to change this when na.rm=TRUE, I would
request that the 'sd' function be updated to be able to revert to the
old behavior as well.

FYI: I 'apply' these functions to large matrices of stock return time
series with missing values, and don't want the whole calculation to fail
just because I'm missing stock returns for one company.
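
The use case described above can be sketched roughly as follows. This is only an illustration under stated assumptions: 'returns' is a made-up matrix, and 'safe_var' is a hypothetical helper (not part of R) that drops NAs itself and returns NA when fewer than two observations remain, so that 'apply' never reaches the error path.

```r
# Hypothetical helper: variance that yields NA instead of an error
# when a column has fewer than two non-missing values.
safe_var <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) < 2L) NA_real_ else var(x)
}

# Made-up returns matrix: one company (column "C") has no data at all.
returns <- cbind(A = c(0.01, -0.02, 0.03),
                 B = c(0.00, NA, 0.02),
                 C = c(NA_real_, NA_real_, NA_real_))

# One variance per column; the all-NA column gives NA, not an error.
col_vars <- apply(returns, 2, safe_var)
col_sds  <- sqrt(col_vars)  # NA propagates for the all-NA column
```

Because the helper filters the NAs before calling 'var', it does not depend on how a given R version treats na.rm=TRUE for all-NA input.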

Thanks,
Robert

Robert McGehee, CFA
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Tel: 617/392-8396    Fax:617/476-6389
mailto:[hidden email]




______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: var/sd and NAs in R2.7.0

Gabor Grothendieck
Try

var(c(NA, NA, NA), use = "pairwise.complete.obs")
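
A minimal check of this suggestion, assuming an all-NA numeric vector (the names 'x' and 'v' are illustrative only): with use = "pairwise.complete.obs" the call is expected to return NA rather than stop with an error.

```r
# All-NA numeric input, as in the original report.
x <- c(NA_real_, NA_real_, NA_real_)

# Pairwise deletion leaves no usable pairs, so the result is missing.
v <- var(x, use = "pairwise.complete.obs")
is.na(v)
```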


On Fri, May 16, 2008 at 10:56 AM, McGehee, Robert
<[hidden email]> wrote:

> Specifically,
>> var(c(NA, NA, NA), na.rm=TRUE) # R2.6.2
> [1] NA
>> var(c(NA, NA, NA), na.rm=TRUE) # R2.7.0
> Error during wrapup: no complete observations in cov/cor
>

Re: var/sd and NAs in R2.7.0

Gerlanc, Daniel
Perhaps _sd_ should take a ... argument.
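
A rough sketch of what that could look like for plain vectors. Note that 'sd_dots' is a hypothetical name used for illustration and is not the actual definition of sd in any R release; it simply forwards extra arguments such as 'use' to var.

```r
# Hypothetical variant of sd() for plain vectors that forwards extra
# arguments (such as 'use') to var(). Not the actual definition in R.
sd_dots <- function(x, ...) sqrt(var(as.vector(x), ...))

# Some values missing: computed from the non-missing observations.
s_ok  <- sd_dots(c(1, 2, 3, NA), use = "pairwise.complete.obs")

# Everything missing: NA instead of an error.
s_all <- sd_dots(rep(NA_real_, 3), use = "pairwise.complete.obs")
```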

-- Dan

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Gabor Grothendieck
Sent: Friday, May 16, 2008 11:03 AM
To: McGehee, Robert
Cc: R-devel
Subject: Re: [Rd] var/sd and NAs in R2.7.0

Try

var(c(NA, NA, NA), use = "pairwise.complete.obs")



Re: var/sd and NAs in R2.7.0

McGehee, Robert
In reply to this post by Gabor Grothendieck
I know I can get around this, I just would prefer that if R is breaking
backwards compatibility, then it's intentional (maybe it is, I just
don't know). That is, I don't want to require my entire company to
upgrade to 2.7.0 just so I can deploy a fix here, and I'd prefer not to
check the argument list of var every time I use it.

if ("use" %in% names(formals(var))) {
        var(x, na.rm=TRUE, use="p")
} else {
        var(x, na.rm=TRUE)
}
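
One way to avoid branching on the argument list is to trap the error instead. A minimal sketch, assuming a hypothetical helper 'var_or_na' (not part of R) that relies only on tryCatch, so it should give the same result whether the installed var signals an error or returns NA for all-NA input:

```r
# Hypothetical helper: fall back to NA if var() signals an error.
# Caveat: this also hides unrelated errors (e.g. non-numeric input).
var_or_na <- function(x) {
  tryCatch(var(x, na.rm = TRUE), error = function(e) NA_real_)
}

v_missing <- var_or_na(c(NA_real_, NA_real_, NA_real_))  # all missing
v_partial <- var_or_na(c(1, 2, 3, NA))                   # some missing
```

The trade-off is that every error is swallowed, so a narrower handler may be preferable in production code.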



Re: var/sd and NAs in R2.7.0

McGehee, Robert
Oops, I just realized that var does have a 'use' argument in 2.6.2, so I
can just use Gabor's suggestion for var. Sorry about that, Gabor.


Re: var/sd and NAs in R2.7.0

Simon Urbanek
In reply to this post by McGehee, Robert
Robert,

this was discussed before:
https://stat.ethz.ch/pipermail/r-devel/2007-December/047594.html

and it *is* mentioned in NEWS:

     o   co[rv](use = "complete.obs") now always gives an error if there
         are no complete cases: they used to give NA if
         method = "pearson" but an error for the other two methods.
         (Note that this is pretty arbitrary, but zero-length vectors
         always give an error so it is at least consistent.)

         cor(use="pair") used to give diagonal 1 even if the variable
         was completely missing for the rank methods but NA for the
         Pearson method: it now gives NA in all cases.

         cor(use="pair") for the rank methods gave a matrix result with
         dimensions > 0 even if one of the inputs had 0 columns.

[Note: sd(.., na.rm=TRUE) ends up calling cov(.., use="complete.obs"),
so the first item applies to sd and var as well.]
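
To see the behavior described in the first NEWS item directly, the following sketch calls var with use = "complete.obs" on an all-NA vector and captures the condition instead of letting it stop the session. The variable names are illustrative only.

```r
# All-NA numeric input: there are no complete cases.
x <- c(NA_real_, NA_real_, NA_real_)

# Capture the error message (if any) rather than aborting.
res <- tryCatch(var(x, use = "complete.obs"),
                error = function(e) conditionMessage(e))

# TRUE means an error was signalled and its message was captured.
is.character(res)
```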

Cheers,
Simon


On May 16, 2008, at 11:19 AM, McGehee, Robert wrote:

> I know I can get around this, I just would prefer that if R is breaking
> backwards compatibility, then it's intentional (maybe it is, I just
> don't know). That is, I don't want to require my entire company to
> upgrade to 2.7.0 just so I can deploy a fix here, and I'd prefer not to
> check the argument list of var every time I use it.
>