R 'base' returning 0 as sum of NAs

Alex Ivan Howard
Dear R Team

The following line returns 0 (zero) as answer:
sum(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)

One would, however, have expected it to return 'NaN', as is the case with
function 'mean':

> mean(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
[1] NaN

The problem in other words:
I have a vector filled with missing numbers. I run the 'sum' function on
it, but instruct it to remove all missing values first. Consequently, the
sum function is left with an empty numeric vector. There is nothing to sum
over, so it shouldn't actually be able to return a concrete numeric value?
Shouldn't it thus rather return either NA ('unknown'/'missing') or - in the
fashion of the mean function - NaN ('not a number')?

With the current state of affairs, the sum function poses the grave danger
of introducing zeros to one's data (and subsequently other values as well,
as soon as the zeros get taken up in further calculations).

I hope my e-mail finds you well and I wish the R team all of the best for
2017 :)

Kind regards

Alex I. Howard

Web: www.nova.org.za
Phone: +27 (0) 44 695 0749
VoiP: +27 (0) 87 751 3490
Fax: +27 (0) 86 538 7958

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: R 'base' returning 0 as sum of NAs

Duncan Murdoch-2
On 11/01/2017 5:33 AM, Alex Ivan Howard wrote:

> Dear R Team
>
> The following line returns 0 (zero) as answer:
> sum(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
>
> One would, however, have expected it to return 'NaN', as is the case with
> function 'mean':
>
>> mean(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
> [1] NaN
>

After the NAs are removed, those two expressions are equivalent to

sum(numeric())
mean(numeric())

It is reasonable that an empty sum is zero.  The mean is 0/0, so NaN is
reasonable.
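
To make the equivalence concrete, here is a minimal sketch. The names `x`, `kept` and `a` are illustrative only and do not come from the original post. It shows that `na.rm = TRUE` leaves an empty vector behind, and that 0 is the value that keeps sums consistent when vectors are concatenated.

```r
x <- c(NA_real_, NA_real_, NA_real_, NA_real_)

# na.rm = TRUE drops the NAs first, leaving an empty numeric vector
kept <- x[!is.na(x)]
length(kept)

# so the two calls reduce to the empty cases sum(numeric()) and mean(numeric())
same_as_empty_sum <- identical(sum(x, na.rm = TRUE), sum(numeric()))
mean_is_nan       <- is.nan(mean(x, na.rm = TRUE))   # mean is 0/0

# 0 is the additive identity: appending an empty vector must not change a sum
a <- c(1.5, 2.5)
additive <- sum(c(a, kept)) == sum(a) + sum(kept)
```

If the empty sum were NA or NaN instead, the last comparison could not hold, because adding NA or NaN to any number never gives that number back.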

If this doesn't suit your needs, then you should put in special checks
for empty datasets.
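
One possible shape for such a check is sketched below. `sum_or_na` is a hypothetical helper name, not a base R function, and returning `NA_real_` for the all-missing case is an assumption about what the caller wants.

```r
# Hypothetical helper: behaves like sum(x, na.rm = TRUE), but returns NA
# when no non-missing values remain, instead of the empty-sum value 0.
sum_or_na <- function(x) {
  if (all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = TRUE)
}

all_missing <- sum_or_na(c(NA_real_, NA_real_))  # nothing left to sum
partial     <- sum_or_na(c(1, NA, 2))            # NAs dropped, rest summed
empty       <- sum_or_na(numeric())              # empty input treated as missing
```

Note that `is.na()` is also TRUE for NaN, so NaN values count as missing here, which matches what `na.rm = TRUE` removes.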

Duncan Murdoch


Re: R 'base' returning 0 as sum of NAs

Hervé Pagès-2
On 01/11/2017 02:33 AM, Alex Ivan Howard wrote:
> There is nothing to sum
> over, so it shouldn't actually be able to return a concrete numeric value?

How much did you spend at the grocery store if you didn't buy anything?
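
The same point as a small sketch with made-up data: the shopper names and amounts are invented for illustration, and shopper "c" is a known customer who bought nothing.

```r
# Illustrative data: "c" is a valid factor level with no purchases
shopper <- factor(c("a", "a", "b"), levels = c("a", "b", "c"))
amount  <- c(4.5, 5.5, 3)

# split() keeps the empty level, so "c" gets numeric(0), whose sum is 0
spent <- vapply(split(amount, shopper), sum, numeric(1))
spent
```

Here a total of 0 for "c" is the correct answer, not a missing one.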

H.
