Faceted bar plot shows wrong counts (ggplot2)

Helios de Rosario
I have encountered a problem with faceted bar plots. I have tried to
create something like the example explained in the ggplot2 book (see pp.
126-128):

library(ggplot2)
mpg4 <- subset(mpg, manufacturer %in% c("audi", "volkswagen", "jeep"))
mpg4$manufacturer <- as.character(mpg4$manufacturer)
mpg4$model <- as.character(mpg4$model)

base <- ggplot(mpg4, aes(fill = model)) +
  geom_bar(position = "dodge") +
  opts(legend.position = "none")
base + aes(x = model) +
  facet_grid(. ~ manufacturer)

That example works fine; the bar heights are just the same as the
counts in the table:

table(mpg4[,1:2])
            model
manufacturer a4 a4 quattro a6 quattro grand cherokee 4wd gti jetta new beetle
  audi        7          8          3                  0   0     0          0
  jeep        0          0          0                  8   0     0          0
  volkswagen  0          0          0                  0   5     9          6
            model
manufacturer passat
  audi            0
  jeep            0

But in other cases this does not occur. For instance, take a small
subset of data(diamonds):

diamonds25 <- droplevels(diamonds[1:25,2:3])
table(diamonds25)
           color
cut         E F H I J
  Fair      1 0 0 0 0
  Good      1 0 0 1 4
  Very Good 1 0 3 1 4
  Premium   3 1 0 1 0
  Ideal     1 0 0 1 2

And change the variables mapped in the previous plot:

base <- ggplot(diamonds25, aes(fill = cut)) +
  geom_bar(position = "dodge") +
  opts(legend.position = "none")
base + aes(x = cut) +
  facet_grid(. ~ color)

I see all bars with height = 1.
I have observed this problem (wrong bar heights, but not always = 1)
in other cases when all counts are very small or zero.
What's wrong here?

Regards,
Helios

sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggplot2_0.9.0

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.1       grid_2.14.2
 [5] MASS_7.3-17        memoise_0.1        munsell_0.3        plyr_1.7.1
 [9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.0
[13] stringr_0.6



INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Faceted bar plot shows wrong counts (ggplot2)

Michael Weylandt
You get the "good" behavior with

base + aes(x = cut) + facet_wrap(~ color, ncol = 5)

so this seems buggy to me.

If someone here doesn't step forward with more insight, I'd forward it
to the ggplot list to see if one of the developers there can give an
explanation or possibly make the official call that it's a bug.

There was another report of a possible bug in facet_grid() today that
could be related:
https://groups.google.com/group/ggplot2/browse_thread/thread/5213ac35da6b36d4

Michael

Re: Faceted bar plot shows wrong counts (ggplot2)

Helios de Rosario
Michael,

Thanks for the pointer to the discussion on the ggplot list. It seems
that the reason for this behaviour of facet_grid() is already known and
is being discussed by the developers of ggplot2.

facet_grid() reduces the original data frame with unique() before
applying the stats.  If the data frame has any other column that
prevents duplicated rows, counts are correctly computed.

E.g.

diamonds25 <- droplevels(diamonds[1:25,]) # Keep all columns

# Everything else as before:
base <- ggplot(diamonds25, aes(fill = cut)) +
  geom_bar(position = "dodge") +
  opts(legend.position = "none")
base + aes(x = cut) +
  facet_grid(. ~ color)
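
The effect of unique() on the counts can also be checked without any plotting. The following is only a base-R sketch of the mechanism described above, not the actual ggplot2 code: it rebuilds a 25-row data frame from the cut/color table in the original post (the names toy, cells and kept are made up for this illustration) and compares the counts before and after dropping duplicated rows.

```r
# Rebuild a two-column data frame with the cut/color counts from the post.
cuts   <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
colors <- c("E", "F", "H", "I", "J")
counts <- matrix(
  c(1, 0, 0, 0, 0,
    1, 0, 0, 1, 4,
    1, 0, 3, 1, 4,
    3, 1, 0, 1, 0,
    1, 0, 0, 1, 2),
  nrow = 5, byrow = TRUE,
  dimnames = list(cut = cuts, color = colors)
)

# One row per (cut, color) cell, then repeat each cell Freq times.
cells <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
toy   <- cells[rep(seq_len(nrow(cells)), cells$Freq), c("cut", "color")]

full_counts    <- table(toy)          # the counts the bars should show
reduced_counts <- table(unique(toy))  # counts after duplicates are dropped

print(full_counts)
print(reduced_counts)                 # no cell can exceed 1 here

# An extra column that makes every row distinct protects the counts,
# which mirrors keeping all columns of diamonds in the example above.
toy$id <- seq_len(nrow(toy))
kept   <- unique(toy)
print(nrow(kept))
```

With only the two columns, every combination of cut and color survives unique() exactly once, which matches the bars of height 1 in the faulty plot.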


Helios


Re: Faceted bar plot shows wrong counts (ggplot2)

Hadley Wickham-2
And it's now fixed in the dev version.
Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/