SAPPLY function XXXX


SAPPLY function XXXX

Dan Abner
Hello everyone,

I am attempting to write a function that counts the number of non-missing
values in each column of a data frame using the sapply function. The
following code produces the error message below.


> n.valid<-sapply(data1,sum(!is.na))
Error in !is.na : invalid argument type

Ultimately, I would like this to be one component of a larger function
that will produce PROC CONTENTS-style output. Something like:

data1.contents<-data.frame(Variable=names(data1),
 Class=sapply(data1,class),
 n.valid=sapply(data1,sum(!is.na)),
 n.miss=sapply(data1,sum(is.na)))
data1.contents

Any suggestions/assistance are appreciated.

Thank you,

Daniel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: SAPPLY function XXXX

Erik Iverson-3
Dan,


> I am attempting to write a function to count the number of non-missing
> values of each column in a data frame using the sapply function. I have the
> following code which is receiving the error message below.
>
>
>> n.valid<-sapply(data1,sum(!is.na))
> Error in !is.na : invalid argument type

The second argument to sapply is FUN, which must be a function. is.na is
indeed a function, but !is.na is not: the ! operator is being applied to the
function object itself rather than to the result of calling it:

 > !is.na
Error in !is.na : invalid argument type
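
To make the distinction concrete, here is a small check, offered as a sketch rather than as part of the original reply. Negate() is a base R helper that is not mentioned elsewhere in this thread; it builds the negated predicate as a real function.

```r
# is.na is a function object, so it can be passed as FUN.
is.function(is.na)

# Applying `!` to the function object itself raises an error,
# which is captured here instead of stopping the script.
res <- tryCatch(!is.na, error = function(e) e)
inherits(res, "error")

# Negate() returns a new function that inverts a predicate's result.
not.na <- Negate(is.na)
is.function(not.na)
not.na(c(1, NA, 3))   # TRUE FALSE TRUE
```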

You need to write your own function to do what you want. Luckily this is
easy. Let's write one to count the number of non-missing values in a vector.

# Count the non-missing values in a vector
countValid <- function(x) {
    sum(!is.na(x))
}

Now you have a function that does what you want, so you can use sapply
with it.

sapply(data1, countValid)

You could also use an anonymous (unnamed) function within sapply to the
same effect.

sapply(data1, function(x) sum(!is.na(x)))

NB: none of this is tested!
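
Putting the pieces together, the PROC CONTENTS-style table from the original question might look like the sketch below. The data frame here is a hypothetical stand-in for data1 with made-up column names and values, and taking only the first element of class() is an assumption made so that each column contributes exactly one string.

```r
# Hypothetical example data; `data1` stands in for the poster's data frame.
data1 <- data.frame(
  id    = 1:5,
  score = c(2.5, NA, 3.1, NA, 4.0),
  group = c("a", "b", NA, "b", "a"),
  stringsAsFactors = FALSE
)

data1.contents <- data.frame(
  Variable = names(data1),
  # class() can return several values; keep the first one per column
  Class    = sapply(data1, function(x) class(x)[1]),
  # each FUN is a real function of one column, not a bare expression
  n.valid  = sapply(data1, function(x) sum(!is.na(x))),
  n.miss   = sapply(data1, function(x) sum(is.na(x))),
  row.names = NULL,
  stringsAsFactors = FALSE
)
data1.contents
```

Wrapping this in function(df) with df in place of data1 would give the reusable version the question asks about.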

--Erik



Re: SAPPLY function XXXX

Erik Iverson-3
In reply to this post by Dan Abner

> Ultimately, I would like for this to be 1 component in a larger function
> that will produce PROC CONTENTS style output. Something like...
>
> data1.contents<-data.frame(Variable=names(data1),
>  Class=sapply(data1,class),
>  n.valid=sapply(data1,sum(!is.na)),
>  n.miss=sapply(data1,sum(is.na)))
> data1.contents

I also meant to mention ?describe in the Hmisc package:

E.g.,

 > library(Hmisc)
 > describe(c(NA, 1:10))

There is also a useful method for data.frame objects.

--Erik


Re: SAPPLY function XXXX

Dan Abner
In reply to this post by Erik Iverson-3
Perfect Erik! Thank you!



On Wed, May 4, 2011 at 4:22 PM, Erik Iverson <[hidden email]> wrote:

> You could also do an anonymous (unnamed) function within sapply to the same
> effect.
>
> sapply(data1, function(x) sum(!is.na(x)))


Re: SAPPLY function XXXX

PIKAL Petr
In reply to this post by Erik Iverson-3
Hi

[hidden email] wrote on 04.05.2011 22:26:59:

> Erik Iverson <[hidden email]>
> Sent by: [hidden email]
> 04.05.2011 22:26
> To: Dan Abner <[hidden email]>
>
> > Ultimately, I would like for this to be 1 component in a larger function
> > that will produce PROC CONTENTS style output. Something like...
> >
> > data1.contents<-data.frame(Variable=names(data1),
> >  Class=sapply(data1,class),
> >  n.valid=sapply(data1,sum(!is.na)),
> >  n.miss=sapply(data1,sum(is.na)))
> > data1.contents
>
> Also meant to mention to see ?describe in the Hmisc package:
>
> E.g.,
>
>  > describe(c(NA, 1:10))
>
> There is also a useful method for data.frame objects.

colSums(is.na(data1))
colSums(!is.na(data1))

also give the number of missing and non-missing values in each column of a
data frame.
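
As a rough illustration of why this works, the sketch below uses a hypothetical two-column data frame in place of data1. is.na() applied to a data frame returns a logical matrix, so colSums() can count the TRUE values column by column. Note that colSums() returns doubles while sum() on a logical vector returns integers, hence the comparison with all.equal() rather than identical().

```r
# Hypothetical example data standing in for `data1`.
data1 <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, NA, "c"),
  stringsAsFactors = FALSE
)

# is.na() on a data frame gives a logical matrix, one column per variable.
miss.matrix <- is.na(data1)

n.miss  <- colSums(miss.matrix)    # missing values per column
n.valid <- colSums(!miss.matrix)   # non-missing values per column

# Compare with the sapply approach from earlier in the thread.
all.equal(n.valid, sapply(data1, function(x) sum(!is.na(x))))
```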

Regards
Petr
