Sum function and missing values --- need to mimic SAS sum function


Allen Bingham
I understand that in order to get the sum function to ignore missing values
I need to supply the argument na.rm=TRUE. However, when summing numeric
values in which ALL components are NA, the result is 0 instead of NA, which
is what I would get from SAS (where a missing value is shown as ".").

Accordingly, I've had to go to 'extreme' measures to get the sum function to
return NA if all arguments are missing (and otherwise give me the sum of all
non-NA elements).

So for example here's a snippet of code that ALMOST does what I want:

 
SumValue <- apply(subset(InputDataFrame,
                         !is.na(Variable.1) | !is.na(Variable.2),
                         select = c(Variable.1, Variable.2)),
                  1, sum, na.rm = TRUE)

Strictly speaking, this does NOT give me records with NA for SumValue.
Instead it gives me no value at all for any record in which both Variable.1
and Variable.2 are NA, which is "good enough" for my purposes.
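
To make the behaviour described above concrete, here is a minimal sketch using a small made-up data frame (the values in InputDataFrame are hypothetical; only the column names come from the post). It suggests that the subset() call drops the all-NA row, so the result is shorter than the input rather than containing NA:

```r
# Hypothetical data: row 3 has both variables missing
InputDataFrame <- data.frame(Variable.1 = c(1, NA, NA),
                             Variable.2 = c(2, 3, NA))

# The snippet from the post
SumValue <- apply(subset(InputDataFrame,
                         !is.na(Variable.1) | !is.na(Variable.2),
                         select = c(Variable.1, Variable.2)),
                  1, sum, na.rm = TRUE)

nrow(InputDataFrame)  # 3 input rows
length(SumValue)      # 2 sums: the all-NA row was dropped, not set to NA
```

Because the result no longer lines up with the rows of the input, it cannot be assigned back as a new column without further work.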

I'm guessing with a little more work I could come up with a way to adapt the
code above so that I could get it to work like SAS's sum function ...

... but before I go that extra mile I thought I'd ask others if they know of
functions in either base R ... or in a package that will better mimic the
SAS sum function.

Any suggestions?

Thanks.
______________________________________
Allen Bingham
[hidden email]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Sum function and missing values --- need to mimic SAS sum function

Fox, John
Dear Allen,

This seems reasonably straightforward to me, suggesting that I might not properly understand what you want to do. How about something like the following?

> mysum <- function(...){
+   x <- c(...)
+   if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
+ }

> mysum(1, 2, 3, NA)
[1] 6
> mysum(1:3)
[1] 6
> mysum(NA, NA, NA)
[1] NA
> mysum(c(NA, NA, NA))
[1] NA
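
Since the original question was about row-wise sums over a data frame, here is a minimal sketch of how mysum() might be used with apply(). The data frame df and its values are made up for illustration, and mysum() is repeated so the example is self-contained:

```r
# mysum() as defined above: NA if everything is missing,
# otherwise the sum of the non-NA values
mysum <- function(...) {
  x <- c(...)
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
}

# Hypothetical data: row 3 has both variables missing
df <- data.frame(Variable.1 = c(1, NA, NA),
                 Variable.2 = c(2, 3, NA))

# Apply mysum() to each row (MARGIN = 1); every input row is kept
df$SumValue <- apply(df[, c("Variable.1", "Variable.2")], 1, mysum)
df$SumValue
```

Unlike the subset() approach, this keeps one result per input row, so the sums can be stored as a column.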

I hope this helps,
 John

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Sun, 25 Jan 2015 15:21:52 -0800
 "Allen Bingham" <[hidden email]> wrote:

> I understand that in order to get the sum function to ignore missing values
> I need to supply the argument na.rm=TRUE. However, when summing numeric
> values in which ALL components are "NA" ... the result is 0.0 ... instead of
> (what I would get from SAS) of NA (or in the case of SAS ".").

Re: Sum function and missing values --- need to mimic SAS sum function

Jim Lemon-4
In reply to this post by Allen Bingham
Hi Allen,
How about this:

sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))

Jim



Re: Sum function and missing values --- need to mimic SAS sum function

Peter Dalgaard-2
Ouch. Please avoid ifelse() in non-vectorized contexts. John Fox has the right idea.

-pd

On 26 Jan 2015, at 01:21 , Jim Lemon <[hidden email]> wrote:

> Hi Allen,
> How about this:
>
> sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))
>
> Jim

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]


Re: Sum function and missing values --- need to mimic SAS sum function

Martin Maechler-5
In reply to this post by Jim Lemon-4
>>>>> Jim Lemon <[hidden email]>
>>>>>     on Mon, 26 Jan 2015 11:21:03 +1100 writes:

    > Hi Allen, How about this:

    > sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))

Excuse, Jim, but that's yet another  "horrible misuse of  ifelse()"

John Fox's reply *did* contain  the "proper" solution

     if (all(is.na(x))) NA else sum(x, na.rm=TRUE)

The ifelse() function should never be used in such cases.
Read more after googling
 
    "Do NOT use ifelse()"

    -- include the quotes in your search --

or directly at
   http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html

Yes, this was on R-help just a month ago.
Martin
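
To illustrate why ifelse() is a poor fit for a single TRUE/FALSE decision, here is a minimal sketch with made-up values. ifelse() is vectorized over its test argument and returns a result shaped like the test, so a length-one test gives a length-one answer, whereas if/else simply returns whatever the chosen branch produces:

```r
test <- TRUE  # a single logical value, as all(is.na(x)) would be

# ifelse() shapes its result like 'test', so only the first element survives
a <- ifelse(test, c(10, 20, 30), 0)

# if/else returns the whole value of the chosen branch
b <- if (test) c(10, 20, 30) else 0

length(a)  # 1
length(b)  # 3
```

In the sum_w_NA() case both branches happen to be single values, so the result comes out right; the objection appears to be about using a vectorized tool for a scalar decision.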


Re: Sum function and missing values --- need to mimic SAS sum function

Sven E. Templer
You can also set 'na.rm' in sum() according to the 'NA state' of x (where x
is the vector holding your data):

sum(x, na.rm=!all(is.na(x)))
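
A minimal sketch of how this one-liner behaves, wrapped in a hypothetical helper sum_na() and compared with the if/else version (here called sum_if(); neither name is from the thread) on a few made-up inputs. One difference that may matter: for a zero-length vector the one-liner gives 0 while the if/else version gives NA, because all() of an empty logical vector is TRUE.

```r
# Hypothetical wrapper around the one-liner:
# na.rm is switched off only when every value is NA
sum_na <- function(x) sum(x, na.rm = !all(is.na(x)))

# if/else version from earlier in the thread, for comparison
sum_if <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)

sum_na(c(1, NA, 3))   # 4
sum_na(c(NA, NA))     # NA
sum_na(numeric(0))    # 0: na.rm = FALSE, and the sum of an empty vector is 0
sum_if(numeric(0))    # NA
```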

On 26 January 2015 at 13:45, Martin Maechler
<[hidden email]> wrote:

>>>>>> Jim Lemon <[hidden email]>
>>>>>>     on Mon, 26 Jan 2015 11:21:03 +1100 writes:
>
>     > Hi Allen, How about this:
>
>     > sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))
>
> Excuse, Jim, but that's yet another  "horrible misuse of  ifelse()"
>
> John Fox's reply *did* contain  the "proper" solution
>
>      if (all(is.na(x))) NA else sum(x, na.rm=TRUE)

Re: Sum function and missing values --- need to mimic SAS sum function

MacQueen, Don
In reply to this post by Allen Bingham
I'm a little puzzled by the assertion that the result is 0.0 when all the
elements are NA:

> sum(NA)
[1] NA

> sum(c(NA,NA))
[1] NA

> sum(rep(NA, 10))
[1] NA

> sum(as.numeric(letters[1:4]))
[1] NA
Warning message:
NAs introduced by coercion


Considering that the example snippet of code has several other aspects
besides using sum(), among them subsetting rows of a data frame when there
are apparently NAs in some of its variables, I wonder if the reason for
the failure of that snippet has been misunderstood?


--
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062






Re: Sum function and missing values --- need to mimic SAS sum function

Ista Zahn
Try with na.rm=TRUE.
On Jan 26, 2015 4:04 PM, "MacQueen, Don" <[hidden email]> wrote:

> I'm a little puzzled by the assertion that the result is 0.0 when all the
> elements are NA:
>
> > sum(NA)
> [1] NA

Re: Sum function and missing values --- need to mimic SAS sum function

Henrik Bengtsson-3
In case anyone wonders, this behavior is expected and consistent with
the note "the sum of an empty set is zero, by definition" in
help("sum"), i.e.

> x <- numeric(0)
> str(x)
 num(0)
> sum(x)
[1] 0

Analogously, prod(numeric(0)) gives 1.0.


To the OP: if what you are ultimately after is the sample mean, note
that mean() returns NaN in this case, e.g.

> x <- rep(NA_real_, times=10)
> mean(x, na.rm=TRUE)
[1] NaN
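
Put differently, na.rm=TRUE has the effect of dropping the NA values first, and when nothing is left the sum is taken over an empty vector. A minimal sketch with made-up data:

```r
x <- rep(NA_real_, times = 3)

kept <- x[!is.na(x)]   # what remains once the NAs are dropped
length(kept)           # 0: nothing is left

sum(kept)              # 0, the sum of an empty vector
sum(x, na.rm = TRUE)   # 0, the same computation
mean(kept)             # NaN, i.e. 0 / 0
```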

/Henrik

On Mon, Jan 26, 2015 at 1:17 PM, Ista Zahn <[hidden email]> wrote:

> Try with na.rm=TRUE.

Re: Sum function and missing values --- need to mimic SAS sum function

Allen Bingham
In reply to this post by MacQueen, Don
Don,

The default for the sum function is to NOT remove NAs before summing (i.e.,
na.rm=FALSE). Here are the results with na.rm=TRUE:

> sum(NA,na.rm=TRUE)
[1] 0
> sum(c(NA,NA),na.rm=TRUE)
[1] 0
> sum(rep(NA,10),na.rm=TRUE)
[1] 0
> sum(as.numeric(letters[1:4]),na.rm=TRUE)
[1] 0
Warning message:
NAs introduced by coercion

Hope that explains it a bit better.

Others have replied with suggested solutions to my 'problem', and the one by
John Fox is what I need (an actual function that I can use in an apply
statement), although the suggested code by Sven Templer is appealing in its
simplicity.
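
For the row-wise case, a loop-free alternative to apply() can be sketched with rowSums(). This is not something proposed in the thread, just one possible approach under the assumption that the columns are numeric; the data frame df and the helper names cols, totals and all_missing are made up for illustration.

```r
# Hypothetical data: row 3 has both variables missing
df <- data.frame(Variable.1 = c(1, NA, NA),
                 Variable.2 = c(2, 3, NA))
cols <- c("Variable.1", "Variable.2")

totals <- rowSums(df[cols], na.rm = TRUE)      # all-NA rows come out as 0 here
all_missing <- rowSums(!is.na(df[cols])) == 0  # rows with no non-NA value
totals[all_missing] <- NA                      # mimic the SAS behaviour for those rows

totals
```

On large data frames this avoids calling an R function once per row, at the cost of being slightly less obvious than the if/else version.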

Allen

Re: Sum function and missing values --- need to mimic SAS sum function

Allen Bingham
In reply to this post by Sven E. Templer
Sven and John,

Thanks for your suggested code ... hits the mark! The code by John is what I need to be able to use in an apply function, but I really like the simplicity of Sven's suggestion.

Also thanks to all who replied --- really helped broaden my knowledge of R.

Allen

-----Original Message-----
From: Sven E. Templer [mailto:[hidden email]]
Sent: Monday, January 26, 2015 6:56 AM
To: Martin Maechler
Cc: Jim Lemon; r-help mailing list; Allen Bingham
Subject: Re: [R] Sum function and missing values --- need to mimic SAS sum function

you can also define 'na.rm' in sum() by 'NA state' of x (where x is your vector holding the data):

sum(x, na.rm=!all(is.na(x)))
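
A short sketch of how this one-liner behaves, using made-up vectors (the wrapper name sven_sum is hypothetical). na.rm is switched off only when every element is NA, so sum() then returns NA as usual. One edge case: for a zero-length vector all(is.na(x)) is TRUE, so the result is 0, whereas the if/else form would return NA.

```r
# Hypothetical wrapper around the one-liner above.
sven_sum <- function(x) sum(x, na.rm = !all(is.na(x)))

x_mixed  <- c(1, NA, 3)
x_all_na <- c(NA_real_, NA_real_)
x_empty  <- numeric(0)

sven_sum(x_mixed)   # na.rm is TRUE: the NA is dropped, giving 4
sven_sum(x_all_na)  # na.rm is FALSE: the result is NA
sven_sum(x_empty)   # all(logical(0)) is TRUE, and the sum of nothing is 0
```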

On 26 January 2015 at 13:45, Martin Maechler <[hidden email]> wrote:

>>>>>> Jim Lemon <[hidden email]>
>>>>>>     on Mon, 26 Jan 2015 11:21:03 +1100 writes:
>
>     > Hi Allen, How about this:
>
>     > sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))
>
> Excuse, Jim, but that's yet another  "horrible misuse of  ifelse()"
>
> John Fox's reply *did* contain  the "proper" solution
>
>      if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
>
> The ifelse() function should never be used in such cases.
> Read more after googling
>
>     "Do NOT use ifelse()"
>
>     -- include the quotes in your search --
>
> or directly at
>    http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html
>
> Yes, this has been on R-help a month ago..
> Martin
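
One way to see the difference between the two constructs, as a sketch with made-up data: ifelse() is vectorised over its test and returns something shaped like the test, while if/else returns the chosen branch unchanged. With a length-one test, as in sum_w_NA above, the answer happens to be the same, but the shape rule can bite when a branch is longer than the test.

```r
x <- c(1, 2, 3)

# The test has length 1, so ifelse() returns a single element of 'yes'.
via_ifelse <- base::ifelse(length(x) > 0, x, NA)

# if/else returns the whole selected branch.
via_if <- if (length(x) > 0) x else NA

print(via_ifelse)  # 1
print(via_if)      # 1 2 3
```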

Re: Sum function and missing values --- need to mimic SAS sum function

Hervé Pagès-2
In reply to this post by Martin Maechler-5
Hi Martin,

On 01/26/2015 04:45 AM, Martin Maechler wrote:

>>>>>> Jim Lemon <[hidden email]>
>>>>>>      on Mon, 26 Jan 2015 11:21:03 +1100 writes:
>
>      > Hi Allen, How about this:
>
>      > sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))
>
> Excuse, Jim, but that's yet another  "horrible misuse of  ifelse()"
>
> John Fox's reply *did* contain  the "proper" solution
>
>       if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
>
> The ifelse() function should never be used in such cases.
> Read more after googling
>
>      "Do NOT use ifelse()"
>
>      -- include the quotes in your search --
>
> or directly at
>     http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html

Interesting. You could have added the following item to your list:

   4. less likely to play strange tricks on you:

      > ifelse(TRUE, a <- 2L, a <- 3L)
      [1] 2
      > a
      [1] 3

Yeah I've seen people using ifelse() that way and being totally
confused...

Cheers,
H.


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: Sum function and missing values --- need to mimic SAS sum function

Boris Steipe
In reply to this post by Sven E. Templer


> sum(x, na.rm=!all(is.na(x)))


That's the kind of idiom that brings the poor chap who has to maintain it to tears.

;-)


Re: Sum function and missing values --- need to mimic SAS sum function

Rolf Turner
On 27/01/15 13:42, Boris Steipe wrote:
>
>
>> sum(x, na.rm=!all(is.na(x)))
>
>
> That's the kind of idiom that brings the poor chap who has to maintain it to tears.
>
> ;-)

It looks perfectly lucid to me.  If you think that that's obscure code,
you ain't been around! :-)

cheers,

Rolf Turner

--
Rolf Turner
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Home phone: +64-9-480-4619


Re: Sum function and missing values --- need to mimic SAS sum function

Bert Gunter
In reply to this post by Hervé Pagès-2
Huh??

> ifelse(TRUE, a <- 2L, a <- 3L)
[1] 2
> a
[1] 2

Please clarify.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll





Re: Sum function and missing values --- need to mimic SAS sum function

Sven E. Templer
Maybe this is due to the usage of rep() in ifelse():

f.rep <- function(ans){ans <- rep(ans,1);return(ans)}
f <- function(ans){return(ans)}

f(a <- 123) # no print here
f.rep(a <- 123) # prints:
# [1] 123



Re: Sum function and missing values --- need to mimic SAS sum function

Hervé Pagès-2
In reply to this post by Bert Gunter
On 01/27/2015 02:54 AM, Bert Gunter wrote:
> Huh??
>
>> ifelse(TRUE, a <- 2L, a <- 3L)
> [1] 2
>> a
> [1] 2
>
> Please clarify.

In Bioconductor ifelse() is a generic function (with methods for Rle
objects) so all its arguments are evaluated before dispatch can
happen. You can reproduce with:

setGeneric("ifelse")

## A dummy method so the dispatch mechanism will need to evaluate the
## 'no' arg before dispatch can actually happen.
setMethod("ifelse", c(no="data.frame"),
   function(test, yes, no)
     stop("I'm kind of broken on data frames, don't use me like that"))

Then:

   > ifelse(TRUE, a <- 2L, a <- 3L)
   [1] 2
   > a
   [1] 3

Delayed evaluation is a world full of surprises...
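
The same effect can be sketched without S4, assuming a hypothetical function eager_ifelse() that forces both arguments before choosing one (a stand-in for what dispatch on 'no' does, not the Bioconductor code). Base ifelse() with a TRUE test never touches 'no', so only the first assignment runs.

```r
# Hypothetical stand-in for a generic that must evaluate 'no' to dispatch.
eager_ifelse <- function(test, yes, no) {
  force(yes)  # runs a <- 2L
  force(no)   # runs a <- 3L
  if (test) yes else no
}

a <- 0L
lazy_result <- base::ifelse(TRUE, a <- 2L, a <- 3L)
a_after_lazy <- a    # 2: 'no' was never evaluated

a <- 0L
eager_result <- eager_ifelse(TRUE, a <- 2L, a <- 3L)
a_after_eager <- a   # 3: both assignments ran, the last one wins
```

Both calls return 2, but the value left in a differs, which matches the two transcripts shown in this thread.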

H.
