select and hold missing

Val-17
I have a data frame:
dfc <- read.table( text= 'week v1 v2
  w1  11  11
  w1  .    42
  w1  31  32
  w2  31  52
  w2  41  .
  w3  51  82
  w2  11  22
  w3  11  12
  w4  21  202
  w1  31  72
  w2  71  52', header = TRUE, as.is = TRUE, na.strings=c("",".","NA") )

I want to create a new variable, diff = v2 - v1, and remove rows based
on this "diff" value, as shown below.
dfc$diff <-  dfc$v2 - dfc$v1
I want to remove rows where diff <= 0 or diff >= 100, and keep all
other rows, including those where diff is NA.
dfca      <- dfc[((dfc$diff) > 0) & ((dfc$diff) < 100), ]

However, the result is not what I wanted. I want the output as follows:
  week v1 v2 diff
  w1 NA  42  NA
  w1 31 32    1
  w2 31 52   21
  w2 41  NA  NA
  w3 51 82   31
  w2 11 22   11
  w3 11 12    1
  w1 31 72   41

However, I got this. Why is it setting every value in those rows to NA?
   week v1 v2 diff
  <NA> NA NA   NA
  w1 31 32    1
 w2 31 52   21
 <NA> NA NA   NA
  w3 51 82   31
  w2 11 22   11
  w3 11 12    1
  w1 31 72   41

Any help?
Thank you.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: select and hold missing

Bert Gunter-2
"Why it is setting all row values  NA?"

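Because the row index is NA for those rows: where diff is NA, the
comparison also evaluates to NA, and indexing a data frame with a
logical NA returns a row filled with NA. For example: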

> z <- data.frame(a=letters[1:3],b = 1:3); x <- c(TRUE,NA,FALSE)
> z[x,]
      a  b
1     a  1
NA <NA> NA
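
To see where the NA index comes from, here is a minimal sketch using a made-up vector d (not part of the original data) that stands in for the diff column. It shows that a comparison against NA yields NA rather than FALSE, and two ways of dealing with it.

```r
# d is a hypothetical stand-in for dfc$diff: one value in range,
# one missing, one out of range.
d <- c(1, NA, 150)

# Comparisons involving NA return NA, not FALSE.
keep <- d > 0 & d < 100
print(keep)              # TRUE NA FALSE

# which() returns only the positions that are TRUE, so no all-NA
# rows would appear, but the missing value is dropped as well.
idx <- which(keep)
print(idx)               # 1

# OR-ing with is.na() turns the NA into TRUE, so the missing
# value is kept.
keep_na <- keep | is.na(d)
print(keep_na)           # TRUE TRUE FALSE
```

Which form to use depends on whether rows with missing values should be kept (is.na) or dropped (which).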

Change your logical comparison so that rows with an NA diff are kept
explicitly (using with() to simplify entry):

> dfc[with(dfc, diff > 0 & diff < 100 | is.na(diff)), ]
   week v1 v2 diff
2    w1 NA 42   NA
3    w1 31 32    1
4    w2 31 52   21
5    w2 41 NA   NA
6    w3 51 82   31
7    w2 11 22   11
8    w3 11 12    1
10   w1 31 72   41

Cheers,
Bert
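
For reference, the whole thread can be reproduced as one self-contained script. This is a sketch that combines the data from the question with the is.na() condition from the reply; the helper name in_range is introduced here for illustration only and does not appear in either message.

```r
# Data from the original question; "." marks a missing value.
dfc <- read.table(text = "week v1 v2
w1 11 11
w1 . 42
w1 31 32
w2 31 52
w2 41 .
w3 51 82
w2 11 22
w3 11 12
w4 21 202
w1 31 72
w2 71 52", header = TRUE, as.is = TRUE, na.strings = c("", ".", "NA"))

dfc$diff <- dfc$v2 - dfc$v1

# in_range (hypothetical name) is NA wherever diff is NA.
in_range <- dfc$diff > 0 & dfc$diff < 100

# Keep rows that are in range or have a missing diff.
dfca <- dfc[is.na(dfc$diff) | in_range, ]
print(dfca)

# By contrast, subset() treats NA in the condition as FALSE,
# so it drops the rows with a missing diff.
dfcb <- subset(dfc, diff > 0 & diff < 100)
```

Here dfca holds the eight rows requested in the question, while dfcb holds only the six rows with a non-missing diff.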


On Wed, Sep 12, 2018 at 1:39 PM Val <[hidden email]> wrote:

> [...]