Delete missing values


John Sorkin
I am trying to delete rows containing missing values from a groupedData object. Several of the columns are character (sexChar, HAPI, rs2304785); the rest are numeric. For some reason I am excluding all rows with missing values. Your suggestions for corrections would be appreciated.

This did not work
        GC2 <- GC[c("logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$
                rs2304795)),  ]
nor did
        GC2 <- GC["logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$
                rs2304795),  ]

John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

410-605-7119
- NOTE NEW EMAIL ADDRESS:
[hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Delete missing values

Marc Schwartz (via MN)
On Wed, 2005-12-14 at 21:34 -0500, John Sorkin wrote:

> I am trying to delete rows containing missing values from a
> groupeddata object. Several of the columns are character (sexChar,
> HAPI, rs2304785) the rest are numeric. For some reason I am excluding
> all rows with missing values. Your suggestions for corrections would
> be appreciated.
>
> This did not work
> GC2 <- GC[c("logtg" != NA & "ctime" != NA & !is.na("sexChar") & !
> is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$
> rs2304795)),  ]
> nor did
> GC2 <- GC["logtg" != NA & "ctime" != NA & !is.na("sexChar") & !
> is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$
> rs2304795),  ]
>
> John

John,

You cannot use:

  Values != NA

to get a TRUE/FALSE vector showing which elements of Values are not
NA.

For example:

> a <- sample(c(NA, 1:5), 20, replace = TRUE)

> a
 [1]  2  3  3  1  3  4  5  3 NA  4  3  2  1  2  2 NA  2  2 NA  1

> a != NA
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

or

> a == NA
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA


NA stands for a missing (unknown) value, so any comparison with NA, as
above, yields NA as well.  Simply put:

> NA == NA
[1] NA      # Note that this is not TRUE
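
As a small illustration of what this means for subsetting (a sketch only; the vector x and its values are made up here), indexing with the result of a comparison against NA selects nothing useful, because every element of the index is itself NA:

```r
# Toy vector; the values are made up for illustration.
x <- c(2, NA, 5)

# Comparing with NA gives NA for every element, and an NA index selects NA.
bad <- x[x != NA]

# is.na() returns a proper TRUE/FALSE vector that is safe to index with.
good <- x[!is.na(x)]

print(bad)
print(good)
```

Here bad has the same length as x but holds only NA, while good holds the two non-missing values.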


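That is why there is a specific function for this test, which you already
use in some cases above: is.na(). Note that it has to be applied to the
column itself (for example GC$HAPI), not to the quoted column name, since
is.na("HAPI") only tests whether the string "HAPI" is missing.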

> !is.na(a)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
[12]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE


which then can be used as such:

> a[!is.na(a)]
 [1] 2 3 3 1 3 4 5 3 4 3 2 1 2 2 2 2 1
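
The same idea carries over to the columns of a data frame. The following is a minimal sketch, assuming a toy data frame that stands in for GC (the two columns are borrowed from the question, the values are invented). It also shows why the quoted names in the original attempt cannot work: a quoted name is just a character string, not the column.

```r
# Hypothetical toy data standing in for GC; the values are made up.
GC <- data.frame(
  logtg   = c(1.2, NA, 0.8, 1.5),
  sexChar = c("M", "F", NA, "F"),
  stringsAsFactors = FALSE
)

# A quoted name is a one-element character vector, not the column.
quoted_test <- is.na("sexChar")   # FALSE: the string itself is not missing
quoted_cmp  <- "logtg" != NA      # NA: any comparison with NA is NA

# Refer to the columns themselves and test each one with is.na().
keep <- !is.na(GC$logtg) & !is.na(GC$sexChar)
GC2  <- GC[keep, ]

print(GC2)
```

With this toy data only the first and fourth rows are kept.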


In the case of a data frame (which a groupedData object contains), you
can use complete.cases() to access the rows that do not have missing
values.  So, if your initial object is called GC, you should be able to
use:

  GC2 <- GC[complete.cases(GC), ]

An alternative is to use na.omit() as follows:

  GC2 <- na.omit(GC)

See ?complete.cases and ?na.omit for more information.
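
If only some of the columns should be checked for missing values, as in the original question, complete.cases() can be given just those columns. The following is a sketch under assumptions: the data frame is a made-up stand-in for GC, and the notes column and the vector name cols are hypothetical, added only to show the difference from na.omit(), which looks at every column.

```r
# Hypothetical toy data standing in for GC; the values are made up.
GC <- data.frame(
  logtg   = c(1.2, NA, 0.8, 1.5, 1.1),
  BMI     = c(24.1, 30.2, NA, 27.5, 22.0),
  sexChar = c("M", "F", "F", NA, "M"),
  notes   = c(NA, "a", "b", "c", "d"),   # hypothetical extra column
  stringsAsFactors = FALSE
)

# Columns that must be non-missing for a row to be kept.
cols <- c("logtg", "BMI", "sexChar")

# Keep rows that are complete in the selected columns only.
GC2 <- GC[complete.cases(GC[, cols]), ]

# na.omit() drops a row if ANY column has a missing value.
GC3 <- na.omit(GC)

# na.omit() records which rows it dropped in the "na.action" attribute.
dropped <- attr(GC3, "na.action")

print(GC2)
print(GC3)
```

In this toy example GC2 keeps rows 1 and 5, because the missing value in notes is ignored, while GC3 keeps only row 5.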

HTH,

Marc Schwartz
