## by() subset by factor gives unexpected results

I am having trouble understanding how the 'by' function works.  Using this bit of code:

    i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"))
    j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'))
    plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
    by(i, i$B, function(s){ points(s$x, s$y, col=s$B) })
    by(j, j$B, function(s){ points(s$x, s$y, col=s$B) })

I would have expected the point at (1,1) to be coloured red.  When plotted, this row is indeed red:

    > i[1,]
      x y   B
    1 1 0 red

however, this next point is green on the plot even though I would like it to be red:

    > j[1,]
      x y   B
    1 1 1 red

How can I achieve that?

Myles
## Re: by() subset by factor gives unexpected results

The answer (thanks to Mark Leeds) has to do with column B being a factor instead of a character vector.

On [2017-08-05] at 08:57 Myles English writes:

> I am having trouble understanding how the 'by' function works.  Using
> this bit of code:
>
>     i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"))
>     j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'))

The use of I() prevents conversion to a factor:

    i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=I(c("red","blue","blue")))
    j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=I(c('red','blue','green')))

>     plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
>     by(i, i$B, function(s){ points(s$x, s$y, col=s$B) })
>     by(j, j$B, function(s){ points(s$x, s$y, col=s$B) })
>
> I would have expected the point at (1,1) to be coloured red.  When
> plotted, this row is indeed red:
>
>     > i[1,]
>       x y   B
>     1 1 0 red
>
> however, this next point is green on the plot even though I would like
> it to be red:
>
>     > j[1,]
>       x y   B
>     1 1 1 red
>
> How can I achieve that?
>
> Myles
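
A minimal sketch of the mechanism, assuming the data frame was built with `stringsAsFactors = TRUE`. That was the `data.frame()` default before R 4.0.0; it is spelled out here so the behaviour reproduces on newer versions. When a factor is passed as `col`, its integer level codes are used as indices into the colour palette. Levels are sorted alphabetically, so in `j` the value "red" carries code 3, which is a green in the default palette. In `i` the only levels are "blue" and "red", so "red" carries code 2, which happens to be a red. Converting with `as.character()` at the point of use is shown as an alternative to `I()`.

```r
# Make the factor conversion explicit (assumption: this matches the
# pre-4.0 default that the original code relied on).
j <- data.frame(x = c(1, 2, 3), y = c(1, 1, 1),
                B = c("red", "blue", "green"),
                stringsAsFactors = TRUE)

lev   <- levels(j$B)       # "blue" "green" "red" (alphabetical order)
codes <- as.integer(j$B)   # 3 1 2: the values col= actually receives
names <- as.character(j$B) # "red" "blue" "green": the colours intended

# Alternative fix: keep the factor for grouping, but pass the level
# names (not the codes) to col=.
plot(0, 0, type = "n", xlim = c(0, 4), ylim = c(0, 1))
invisible(by(j, j$B, function(s) {
  points(s$x, s$y, col = as.character(s$B))
}))
```

Either approach works because `col` then receives colour names rather than palette indices. `I()` changes how the column is stored, while `as.character()` leaves the data frame untouched and converts only where the colour is needed.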