by() subset by factor gives unexpected results

by() subset by factor gives unexpected results

Myles English
I am having trouble understanding how the 'by' function works.  Using
this bit of code:

i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"))
j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'))

plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
by(i, i$B, function(s){ points(s$x, s$y, col=s$B) })
by(j, j$B, function(s){ points(s$x, s$y, col=s$B) })

I would have expected the point at (1,1) to be coloured red.  When
plotted, this row is indeed red:

> i[1,]
  x y   B
1 1 0 red

however, this next point is green on the plot even though I would like
it to be red:

> j[1,]
  x y   B
1 1 1 red

How can I achieve that?

Myles

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: by() subset by factor gives unexpected results

Myles English

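The answer (thanks to Mark Leeds) was that column B is a factor rather
than a character vector: data.frame() converts strings to factors by
default, and col= then uses the factor's integer codes, not its labels.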
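A minimal sketch of why the point came out green. It assumes the behaviour of R before 4.0.0, where data.frame() converted strings to factors by default; stringsAsFactors = TRUE is spelled out so the sketch behaves the same on newer versions. The names codes and red_code are only illustrative.

```r
j <- data.frame(x = c(1, 2, 3), y = c(1, 1, 1),
                B = c("red", "blue", "green"),
                stringsAsFactors = TRUE)

levels(j$B)       # levels are sorted alphabetically: "blue" "green" "red"
as.integer(j$B)   # integer codes behind the labels: 3 1 2

# Subsets handed to the function by by() keep the full set of levels,
# so the "red" row still carries code 3 there.
codes <- as.integer(j$B)
red_code <- codes[j$B == "red"]

# Colour actually drawn for the "red" row: the third palette entry,
# which is a shade of green in the default palettes.
palette()[red_code]
```

In i the only levels are "blue" and "red", so "red" gets code 2, and the second palette entry happens to be a red. That is why i[1,] looked right.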

on [2017-08-05] at 08:57 Myles English writes:

> I am having trouble understanding how the 'by' function works.  Using
> this bit of code:
>
> i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"))
> j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'))

The use of I() prevents conversion to a factor:

i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=I(c("red","blue","blue")))
j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=I(c('red','blue','green')))
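A quick way to check what I() does to the column, as a sketch; the names i_asis and i_chr are made up for the comparison. Both I() and stringsAsFactors = FALSE leave B as character data, so col= sees the colour names themselves.

```r
i_asis <- data.frame(x = c(1, 2, 3), B = I(c("red", "blue", "blue")))
i_chr  <- data.frame(x = c(1, 2, 3), B = c("red", "blue", "blue"),
                     stringsAsFactors = FALSE)

is.factor(i_asis$B)      # FALSE: no conversion took place
is.character(i_asis$B)   # TRUE: still character data underneath
class(i_asis$B)          # "AsIs", the marker class added by I()
class(i_chr$B)           # "character"
```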



Re: by() subset by factor gives unexpected results

mark leeds
Putting the answer here for posterity. I didn't send it to R-help
initially because I wasn't sure what the OP wanted; I guessed right.
Sorry for the confusion in the thread.


MY GUESS AT WHAT YOU WANT IS BELOW
#===================================================================

i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"),
                stringsAsFactors = FALSE)
j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'),
                stringsAsFactors = FALSE)

plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
points(i$x, i$y, col = i$B)
points(j$x, j$y, col = j$B)
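If the by() call from the original post is kept and B is still a factor, another option is to convert to labels inside the function. This is only a sketch under that assumption: stringsAsFactors = TRUE is set explicitly to reproduce the pre-4.0.0 default, and the names res and cols are illustrative.

```r
j <- data.frame(x = c(1, 2, 3), y = c(1, 1, 1),
                B = c("red", "blue", "green"),
                stringsAsFactors = TRUE)

plot(0, 0, type = "n", xlim = c(0, 4), ylim = c(0, 1))

res <- by(j, j$B, function(s) {
  cols <- as.character(s$B)      # colour names, not integer codes
  points(s$x, s$y, col = cols)
  cols                           # returned so the colours can be inspected
})
```

Each group is now drawn in the colour its label names, so the point at (1,1) is red.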



Re: by() subset by factor gives unexpected results

Bert Gunter-2
In reply to this post by Myles English
... and, of course, by() should not be used at all for this sort of
thing in practice, as the "col" argument can be a vector. See
?plot.default if you were not aware of this already.

j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c("red","blue","green"),
                stringsAsFactors = FALSE)

with(j,plot(x,y, col=B, xlim=c(0,4), ylim=c(0,1.2)))
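The same vectorised col= idea also works when B stays a factor, or when the group labels are not colour names, by looking the colours up in a named vector. A sketch only: colour_map and point_cols are hypothetical names, not anything from the thread.

```r
j <- data.frame(x = c(1, 2, 3), y = c(1, 1, 1),
                B = c("red", "blue", "green"),
                stringsAsFactors = TRUE)

# Hypothetical lookup table from group label to colour.
colour_map <- c(red = "red", blue = "blue", green = "green")

# Index by label (as.character), not by the factor's integer codes.
point_cols <- unname(colour_map[as.character(j$B)])

with(j, plot(x, y, col = point_cols, xlim = c(0, 4), ylim = c(0, 1.2)))
```

Indexing with as.character(j$B) matters here: indexing colour_map with the factor itself would use the integer codes and pick entries by position.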


Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


