[R] How to count the number of NAs in each column of a df?

Michael Kubovy
I would like to remove columns of a df which have too many NAs.

I think summary() should give me this information, but I don't
know how to access it.

Advice?
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to count the number of NAs in each column of a df?

Richard M. Heiberger
# Keep only the columns of mydf that contain fewer than k NAs
drop.col.kna <- function(mydf, k)
  mydf[sapply(mydf, function(x) sum(is.na(x))) < k]


tmp <- data.frame(matrix(1:24, 6,4, dimnames=list(letters[1:6], LETTERS[1:4])))
tmp[1:3,1] <- NA
tmp[2:5,2] <- NA
tmp[6,3] <- NA

drop.col.kna(tmp, 0)
drop.col.kna(tmp, 1)
drop.col.kna(tmp, 2)
drop.col.kna(tmp, 3)
drop.col.kna(tmp, 4)
drop.col.kna(tmp, 5)
drop.col.kna(tmp, 6)
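
For anyone who only wants the counts asked for in the subject line, the counting step inside the function above can be run on its own. This is a minimal sketch on the same example data; the name na.counts is just an illustrative choice, not something from the thread.

```r
# Rebuild the example data frame used in this thread
tmp <- data.frame(matrix(1:24, 6, 4, dimnames = list(letters[1:6], LETTERS[1:4])))
tmp[1:3, 1] <- NA
tmp[2:5, 2] <- NA
tmp[6, 3] <- NA

# Number of NAs in each column, returned as a named integer vector
# (na.counts is a hypothetical name chosen for this sketch)
na.counts <- sapply(tmp, function(x) sum(is.na(x)))
print(na.counts)  # A has 3 NAs, B has 4, C has 1, D has 0
```

Comparing this vector against a threshold gives the logical index that the function uses to select columns.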

Re: [R] How to count the number of NAs in each column of a df?

Chuck Cleland
Richard M. Heiberger wrote:

> drop.col.kna <- function(mydf, k)
>  mydf[sapply(mydf, function(x) sum(is.na(x))) < k]
>
> tmp <- data.frame(matrix(1:24, 6,4, dimnames=list(letters[1:6], LETTERS[1:4])))
> tmp[1:3,1] <- NA
> tmp[2:5,2] <- NA
> tmp[6,3] <- NA
>
> drop.col.kna(tmp, 0)
> drop.col.kna(tmp, 1)
> drop.col.kna(tmp, 2)
> drop.col.kna(tmp, 3)
> drop.col.kna(tmp, 4)
> drop.col.kna(tmp, 5)
> drop.col.kna(tmp, 6)

  Possibly simpler (does not require a new function definition and seems
highly intuitive) might be something like this:

tmp.dropna <- tmp[,colSums(is.na(tmp)) < 2]

tmp.dropna
   C  D
a 13 19
b 14 20
c 15 21
d 16 22
e 17 23
f NA 24
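
One caveat with the tmp[, ...] form, sketched below on the same example data: when only a single column passes the test, R's default drop = TRUE turns the result into a plain vector rather than a data frame. Adding drop = FALSE avoids that. The names one.col and kept are illustrative only.

```r
# Rebuild the example data frame used in this thread
tmp <- data.frame(matrix(1:24, 6, 4, dimnames = list(letters[1:6], LETTERS[1:4])))
tmp[1:3, 1] <- NA
tmp[2:5, 2] <- NA
tmp[6, 3] <- NA

# Only column D has fewer than 1 NA, so a single column is selected
one.col <- tmp[, colSums(is.na(tmp)) < 1]                # collapses to a vector
kept    <- tmp[, colSums(is.na(tmp)) < 1, drop = FALSE]  # stays a data frame

print(is.data.frame(one.col))
print(is.data.frame(kept))
```

Indexing with a single argument, as in tmp[colSums(is.na(tmp)) < 1], also keeps the data frame, which is what the function version above relies on.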

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Re: How to count the number of NAs in each column of a df?

Michael Kubovy
Dear Jim (25 minutes!), Richard (27 minutes!), and Chuck,

Thanks to your hints, I have come up with what I hope is a pithy
idiom that drops the columns of a data frame (df) that contain more
than (e.g.) 30 NAs.

# keep only the columns with at most 30 NAs
df <- df[, colSums(is.na(df)) <= 30, drop = FALSE]
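
If a fixed count such as 30 does not suit data frames of different lengths, the same kind of filter can be written with a proportion of missing values instead. This is only a sketch of one possible variant: the data frame dat, the cutoff max.na.frac, and the 0.5 value are all made up for illustration and do not come from the thread.

```r
# Hypothetical example data: x is 75% NA, y is 25% NA, z has no NAs
dat <- data.frame(x = c(1, NA, NA, NA),
                  y = c(1, 2, NA, 4),
                  z = 1:4)

# Assumed cutoff: drop any column that is more than half missing
max.na.frac <- 0.5

# colMeans() of the logical matrix gives the fraction of NAs per column
keep <- colMeans(is.na(dat)) <= max.na.frac
dat.clean <- dat[, keep, drop = FALSE]

print(names(dat.clean))
```

Here x is dropped and y and z are kept; drop = FALSE guards against the result collapsing to a vector when only one column survives.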

I wonder if we have a place to keep R programming idioms (which  
probably get unnecessarily reinvented). Is the R-Wiki suitable?