Problem Subsetting Rows that Have NA's

Tom La Bone
This has every appearance of being a bug. If it is not a bug, can
someone tell me what I am asking for when I ask for "x[x[,2]==0,]". Thanks.

 > #here is the toy dataset
 > x <- rbind(c(1,1),c(2,2),c(3,3),c(4,0),c(5,0),c(6,NA),
+   c(7,NA),c(8,NA),c(9,NA),c(10,NA)
+ )
 > x
       [,1] [,2]
  [1,]    1    1
  [2,]    2    2
  [3,]    3    3
  [4,]    4    0
  [5,]    5    0
  [6,]    6   NA
  [7,]    7   NA
  [8,]    8   NA
  [9,]    9   NA
[10,]   10   NA
 >
 > #it contains rows that have NA's
 > x[is.na(x[,2]),]
      [,1] [,2]
[1,]    6   NA
[2,]    7   NA
[3,]    8   NA
[4,]    9   NA
[5,]   10   NA
 >
 > #seems like an unreasonable answer to a reasonable question
 > x[x[,2]==0,]
      [,1] [,2]
[1,]    4    0
[2,]    5    0
[3,]   NA   NA
[4,]   NA   NA
[5,]   NA   NA
[6,]   NA   NA
[7,]   NA   NA
 >
 > #this is more what I was expecting
 > x[which(x[,2]==0),]
      [,1] [,2]
[1,]    4    0
[2,]    5    0
 >

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Problem Subsetting Rows that Have NA's

Ben Tupper-2
Hi,

It's related to how NAs are treated in comparison operations.  See the Details section of https://www.rdocumentation.org/packages/base/versions/3.4.1/topics/Comparison

You can try something like this...

x[which(x[,2] %in% 0),]
#      [,1] [,2]
# [1,]    4    0
# [2,]    5    0


... but I'm not sure if it is bulletproof.  Others may have more insight.
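
A minimal sketch of why the two forms differ, using the toy matrix from the original post (the names eq and inn are only illustrative). `==` propagates NA into the logical index, whereas `%in%` is built on match() and returns only TRUE or FALSE, so the which() wrapper should not be strictly needed in that case.

```r
# Toy matrix from the original post
x <- rbind(c(1,1), c(2,2), c(3,3), c(4,0), c(5,0),
           c(6,NA), c(7,NA), c(8,NA), c(9,NA), c(10,NA))

eq  <- x[, 2] == 0     # comparison with NA yields NA
inn <- x[, 2] %in% 0   # match()-based test, never yields NA

eq       # FALSE FALSE FALSE TRUE TRUE NA NA NA NA NA
inn      # FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE

x[inn, ] # only rows 4 and 5; no all-NA rows
```

Note that `%in%` follows match() semantics, so `NA %in% NA` is TRUE; that only matters if the right-hand side itself contains NA.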

Cheers,
Ben



> On Oct 24, 2017, at 3:05 PM, BooBoo <[hidden email]> wrote:
>
> This has every appearance of being a bug. If it is not a bug, can someone tell me what I am asking for when I ask for "x[x[,2]==0,]". Thanks.

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecocast Reports: http://seascapemodeling.org/ecocast.html
Tick Reports: https://report.bigelow.org/tick/
Jellyfish Reports: https://jellyfish.bigelow.org/jellyfish/





Re: Problem Subsetting Rows that Have NA's

Ek Esawi
In reply to this post by Tom La Bone
z <- x[x[,2]==0 & !is.na(x[,2]),]  seems to work and gets you what you want,
but it doesn't answer your question.
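
One way to see why this works, sketched on the same toy matrix (the name keep is illustrative): in R's three-valued logic `NA & FALSE` is FALSE, so ANDing the comparison with `!is.na()` turns every NA in the index into FALSE.

```r
x <- rbind(c(1,1), c(2,2), c(3,3), c(4,0), c(5,0),
           c(6,NA), c(7,NA), c(8,NA), c(9,NA), c(10,NA))

NA & FALSE   # FALSE: the result cannot depend on the unknown value
NA & TRUE    # NA: the result does depend on the unknown value

# For the NA rows, x[, 2] == 0 is NA and !is.na(x[, 2]) is FALSE,
# so the combined index contains only TRUE and FALSE.
keep <- x[, 2] == 0 & !is.na(x[, 2])
z <- x[keep, ]
z            # rows 4 and 5 only
```

This gives the same rows as `x[which(x[,2]==0),]` from the original post.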

Best of luck,
EK


Re: Problem Subsetting Rows that Have NA's

Ista Zahn
In reply to this post by Tom La Bone
On Tue, Oct 24, 2017 at 3:05 PM, BooBoo <[hidden email]> wrote:
> This has every appearance of being a bug. If it is not a bug, can someone
> tell me what I am asking for when I ask for "x[x[,2]==0,]". Thanks.

You are asking for elements of x where the second column is equal to zero.

help("==")

and

help("[")

explain what happens when missing values are involved. I agree that
the behavior is surprising, but your first instinct when you discover
something surprising should be to read the documentation, not to post
to this list. After having read the documentation you may post back
here if anything remains unclear.

Best,
Ista


Re: Problem Subsetting Rows that Have NA's

Tom La Bone
On 10/25/2017 4:38 AM, Ista Zahn wrote:

> On Tue, Oct 24, 2017 at 3:05 PM, BooBoo <[hidden email]> wrote:
>> This has every appearance of being a bug. If it is not a bug, can someone
>> tell me what I am asking for when I ask for "x[x[,2]==0,]". Thanks.
> You are asking for elements of x where the second column is equal to zero.
>
> help("==")
>
> and
>
> help("[")
>
> explain what happens when missing values are involved. I agree that
> the behavior is surprising, but your first instinct when you discover
> something surprising should be to read the documentation, not to post
> to this list. After having read the documentation you may post back
> here if anything remains unclear.
>
> Best,
> Ista

I wanted to know if this was a bug so that I could report it if so. You
say it is not, so you answered my question. As far as me not reading the
documentation, I challenge anyone to read the cited help pages and
predict the observed behavior based on the information given in those
pages.


Re: Problem Subsetting Rows that Have NA's

David Winsemius

> On Oct 25, 2017, at 6:57 AM, BooBoo <[hidden email]> wrote:
> I wanted to know if this was a bug so that I could report it if so. You say it is not, so you answered my question. As far as me not reading the documentation, I challenge anyone to read the cited help pages and predict the observed behavior based on the information given in those pages.

Some of us do share (or at least remember feeling) your pain. The ?Extract page is long and complex and there are several features that I find non-intuitive. But they are deemed desirable by others. I think I needed to read that page about ten times (with multiple different problems that needed explication) before it started to sink in. You are apparently on that same side of the split opinions on the feature of returning rows with logical NA's as I am. I've learned to use `which`, and I push back when the conoscienti says it's not needed.

 After you read it a few more times you may come to a different opinion. Many people come to R with preconceived notions of what words like "equals" or "list" or "vector" mean and then complain about the documentation. You would be better advised to spend more time studying the language. The help pages are precise but terse, and you need to spend time with the examples and with other tutorial material to recognize the gotcha's.

Here's a couple of possibly helpful rules regarding "[[" and "[" and logical indexing:

Nothing _equals_ NA.
Selection operations with NA logical index item return NA.  (Justified as a warning feature as I understand it.)
"[" always returns a list.
"[[" returns only one thing, but even that thing could be a list.
Generally you want "[[" if you plan on testing for equality with a vector.

The "R Inferno" by Burns is an effort to detail many more of the unexpected or irregular aspects of R (mostly inherited from S).

--
Best of luck in your studies.



David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


Re: Problem Subsetting Rows that Have NA's

Bert Gunter-2
... Just to be clear:

David's end summary

"[" always returns a list.
"[[" returns only one thing, but even that thing could be a list.
Generally you want "[[" if you plan on testing for equality with a vector.


applies to indexing on a **list**, of course, and not to vectors, matrices,
etc.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: Problem Subsetting Rows that Have NA's

David Winsemius
In reply to this post by David Winsemius

> On Oct 25, 2017, at 11:17 AM, David Winsemius <[hidden email]> wrote:
> Some of us do share (or at least remember feeling) your pain. The ?Extract page is long and complex and there are several features that I find non-intuitive. But they are deemed desirable by others. I think I needed to read that page about ten times (with multiple different problems that needed explication) before it started to sink in. You are apparently on that same side of the split opinions on the feature of returning rows with logical NA's as I am. I've learned to use `which`, and I push back when the conoscienti says it's not needed.


horrible misspelling of cognoscenti


> After you read it a few more times you may come to a different opinion. Many people come to R with preconceived notions of what words like "equals" or "list" or "vector" mean and then complain about the documentation. You would be better advised to spend more time studying the language. The help pages are precise but terse, and you need to spend time with the examples and with other tutorial material to recognize the gotcha's.
>
> Here's a couple of possibly helpful rules regarding "[[" and "[" and logical indexing:
>
> Nothing _equals_ NA.
> Selection operations with NA logical index item return NA.  (Justified as a warning feature as I understand it.)
> "[" always returns a list.

That's not true or even half true. "[" always returns a list if its first argument is a list and it only has two arguments.

If X is a list and you ask for X[vector] you get a list

If you ask for X[vector, ] you may get a list or a vector.

If you ask for X[two_column_matrix] you get a vector.
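
A small sketch of these three cases, assuming X is a plain list for the first case and a data frame (here called df) for the other two; both objects are made up for illustration and are not from the thread.

```r
# Case 1: a list indexed with a single vector stays a list
X <- list(a = 1:3, b = "text")
class(X["a"])            # "list"
class(X[["a"]])          # "integer": "[[" extracts the element itself

# Cases 2 and 3 use a data frame, which is a list of columns
df <- data.frame(u = 1:3, v = c(2.5, 3.5, 4.5))

is.list(df[1:2, ])       # TRUE: still a data frame
is.list(df[1:2, "u"])    # FALSE: a single column is dropped to a vector

# A two-column matrix index picks (row, column) cells and gives a vector
idx <- cbind(c(1, 3), c(2, 2))
df[idx]                  # 2.5 4.5
```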

I should be flogged.


> "[[" returns only one thing, but even that thing could be a list.

Horribly imprecise.

> Generally you want "[[" if you plan on testing for equality with a vector.

Don't listen to me. Read ....

>
> The "R Inferno" by Burns is an effort to detail many more of the unexpected or irregular aspects of R (mostly inherited from S).

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


Re: Problem Subsetting Rows that Have NA's

Peter Dalgaard-2
In reply to this post by Tom La Bone
It's not a bug, and the rationale has been hashed over since the beginning of time...

It is a bit of an annoyance in some contexts and part of the rationale for the existence of subset().

If you need an explanation, start with elementary vector indexing:

colors <- c("red", "green", "blue")
colors[c(1,3,2,NA,3)]

You pretty clearly want the result to be a vector of length 5 with 4th element NA, right?

Same story if you index into a data frame:

> airquality[c(1,3,2,NA,2),]
    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
3      12     149 12.6   74     5   3
2      36     118  8.0   72     5   2
NA     NA      NA   NA   NA    NA  NA
2.1    36     118  8.0   72     5   2

Now, that's not an argument that you also get NA rows from logical indexing, but then comes the issue of automatic coercion: In colors[NA], the NA is actually mode "logical". If we removed NA indexes in logical indexing, we would have to explain why colors[c(1,NA)] has length 2 but colors[NA] has length zero (which it currently does not).
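
The coercion point can be checked directly. The following is only a sketch: the colors vector is the one defined above, and the matrix is the toy dataset from the original post.

```r
colors <- c("red", "green", "blue")

length(colors[c(1, 3, 2, NA, 3)])  # 5: numeric index, 4th element is NA
length(colors[c(1, NA)])           # 2: NA is coerced to a numeric index
length(colors[NA])                 # 3: bare NA is logical and gets recycled
length(colors[NA_integer_])        # 1: an explicitly integer NA

# subset() treats NA in the condition as FALSE, so no NA rows come back
x <- rbind(c(1,1), c(2,2), c(3,3), c(4,0), c(5,0),
           c(6,NA), c(7,NA), c(8,NA), c(9,NA), c(10,NA))
s <- subset(x, x[, 2] == 0)
s                                  # rows 4 and 5 only
```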

-pd


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]


Re: Problem Subsetting Rows that Have NA's

JohnDee
In reply to this post by Tom La Bone
On Tue, 24 Oct 2017 15:05:01 -0400
BooBoo <[hidden email]> wrote:

> This has every appearance of being a bug. If it is not a bug, can
> someone tell me what I am asking for when I ask for "x[x[,2]==0,]".
> Thanks.
>
As others have pointed out, not a bug, but very "unintuitively"
explained in the documentation.  On the other hand, be glad it isn't
a .dbf file and that you are not using dBASE.  Ancient history of
course, but dBASE used to convert missing data (NAs) into 0s when you
weren't looking. If you were extremely unlucky and stuck in the role of
autodidact, your success in both noticing and identifying the problem
could be a long time coming.  In the meantime, your instructors might
be numerate, and point out rudely your inability to carry out even
simple calculations of averages, or they could be innumerate and
happily accept your mistakes as gospel because you included
"quantified" results in your paper.

Good luck.
