Grouped boxplots using ggplot() from ggplot2.

Grouped boxplots using ggplot() from ggplot2.

Rolf Turner

I have the task of producing some boxplot graphics with the requirement
that these have the same general appearance as a set of such graphics
as were produced last year.  I do not have access to the code that was
used to produce the "last year" graphics.

There are multiple boxplots per figure and these boxplots appear in
groups (with two boxplots in each group in the simplest instance; there
are four or more per group in other instances, but I figure that if I
can work out how to handle two, then ....).

After a bit of Googling I found that ggplot() does basically what I
want.  However my mindset seems to be substantially incompatible with
that of ggplot() and I cannot figure out how to make some adjustments
which are needed in order to make my plots look like last year's.

In last year's graphics the boxes were unfilled and were distinguished
(within groups) by their boundary colours, which were "red" and "black"
in the simple two-per-group instance.  I achieved the "unfilled" effect
by setting alpha=0 inside the call to geom_boxplot().  (Is this the
Right Thing to Do?)  However I cannot get the boundary colours of the
boxes to be "red" and "black".

I have attached a sourceable script ("demo.txt") showing what I have
tried so far.  I don't really understand the code; I simply copied and
adjusted code that I saw on stackoverflow.  (Fairly mindlessly I'm afraid.)

Problems:

(1) The borders of the boxes are distinct, but they are sort-of-pink and
sort-of-blue, and I cannot for the life of me figure out how to make
them red and black.

(2) Putting in "color=Type" seemed to have the effect of creating two
legends, one with the desired legend title but all in black, and one
with legend title equal to "Type" but using the colours that actually
appear. How can I get just one "appropriate" legend?

(3) Last year's graphics have the x-axis starting at 0 (rather than at
c. 3.5).  I tried using + xlim(0,8.5) but got told "Error: Discrete
value supplied to continuous scale".  How can I make the appropriate
adjustment?

(4) Last year's graphics have y-axis tick marks, labels and grid lines
at 700, 800, 900, ..., 2000, 2100.  How can I reproduce this?

I actually had several additional questions, but thought I'd better
scrounge around a bit more before posting this, and thereby managed
(mirabile dictu!) to answer them myself.

Can anyone help me out with questions (1) --- (4)?  Please keep it
simple and very explicit, for I am a bear of very little brain and long
words bother me!

Thanks.

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Attachment: demo.txt (874 bytes)

Re: Grouped boxplots using ggplot() from ggplot2.

Jeff Newmiller
When you understand the strong dependence on how the data controls ggplot,
using it gets much easier. I still have to google details sometimes
though. Note that it can be very difficult to make a weird plot (e.g.
multiple parallel axes) in ggplot because it is very internally
consistent... a blessing and a curse.

1) Colour is assigned in the scale according to order of levels of the
factor. Note that while they are both discrete, the so-called "discrete"
scales auto-colour, but "manual" scales require you to specify the exact
colour sequence.

2) Assigning constants to properties is done outside the mapping (aes).
Note that "colour" is for lines and shape outlines, while "fill" is
colour meant to fill in shapes. When the names of these two scales are the
same and the values are the same, the legends will merge. If not, they
will be shown separately.

3) Discrete scales are controlled by the levels in the data. To prevent
ggplot from removing missing levels, use the drop=FALSE argument.

4) Breaks are a property of the scale.

My changes were:

Year <- factor( rep( 4:8, each = 50, times = 2 ), levels = 0:8 )
DemoDat <- data.frame(Year = Year, Score = c( X0 , X1 ), Type = Type )

ggplot( data = DemoDat
       , aes( x = Year, y = Score, color = Type )
       , fill = NULL
       ) +
     geom_boxplot( position = position_dodge(1) ) +
     theme_minimal() +
     scale_colour_manual( name = "National v. Local"
                        , values = c( "red", "black" ) ) +
     scale_x_discrete( drop = FALSE ) +
     scale_y_continuous( breaks = seq( 700, 2100, 100 ) )
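
To make point 1 concrete: with an unnamed values vector the colours follow the order of the factor levels, and a character column such as Type normally ends up with alphabetical levels ("Local" before "National"), so the first colour goes to "Local". The following is a minimal sketch, not part of the original code, that pins each colour to a level by name so the result does not depend on level order. The data frame `toy` and the vector `type_cols` are made up for illustration.

```r
library(ggplot2)

# Toy data (hypothetical): two boxes per year, as in the demo
toy <- data.frame(
  Year  = factor(rep(4:5, each = 20, times = 2)),
  Score = c(seq(1000, 1390, by = 10), seq(900, 1290, by = 10)),
  Type  = rep(c("National", "Local"), each = 40)
)

# A named vector ties each colour to a level, whatever the level order is
type_cols <- c(National = "red", Local = "black")

p <- ggplot(toy, aes(x = Year, y = Score, colour = Type)) +
  geom_boxplot(position = position_dodge(1)) +
  scale_colour_manual(name = "National v. Local", values = type_cols)

# Colours actually used, one row per box
unique(layer_data(p)[, c("colour", "group")])
```

If the legend should list "National" first, the usual approach is to set the level order on the factor itself, e.g. factor(Type, levels = c("National", "Local")).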

Good luck with your graphics grammar!

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


Re: Grouped boxplots using ggplot() from ggplot2.

Rolf Turner

Dear Jeff,

Thanks very much for this cogent advice and for taking the trouble to
steer me in the right direction.  However I am not quite out of the
woods yet.

(1) I'm still getting two legends.  How do I stop this from happening?

(2) The boxes are "filled" (with pinkish and blueish colours --- which
are referenced in the second of the two legends that I get).  How can I
get "unfilled" boxes?

(3) The y-axis scale runs only from 800 to 1800, rather than from 700 to
2100.  How can I force it to run from 700 to 2100?

(4) With the modified code we now get some "outliers" (points beyond the
whisker tips) plotted --- which I didn't get before (and don't want,
because "last year's" graphics did not include outliers).  How can I
suppress the plotting of outliers?

I have attached a pdf containing the results of running the code that
you provided, so that you can readily see what is happening.

May I prevail upon your good graces to enlighten me about questions
(1) --- (4) above?

Ever so humbly grateful.

cheers,

Rolf


Attachment: demoPlot.pdf (7K)

Re: Grouped boxplots using ggplot() from ggplot2.

Jeff Newmiller
1) I don't know... it looks to me like you did not run my code. I have
included a complete reprex below... try it out in a fresh session. If you
still get the problem, check your sessionInfo package versions against
mine.

2) This still smells like your fill parameter is inside the aes function
with Type as value. This causes a legend to be created, and since that
legend has a different name ("Type") than the colour scale, they are
separated. Confirm that you are using fill outside the aes function
(because you don't want fill to depend on the data) and have the constant
NULL as value (so it won't generate any fill graphical representation).

3) I missed that... limits set through ylim() or scale_y_continuous(limits=)
constrain which data are included as input into the graph, and breaks= only
places the ticks. The coord_cartesian function forces the limits as desired.

4) While showing outliers is a standard semantic feature of boxplots
whether produced by ggplot or lattice or base or non-R solution, you can
please the client by making the outliers transparent.
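
The difference in point 3 between limiting the scale and limiting the coordinate system can be checked on toy data. This is a small sketch under stated assumptions: the data frame `toy` and the cut-off of 50 are invented for illustration and have nothing to do with the demo data. Limits set on the scale turn out-of-range values into missing values before the boxplot statistics are computed, while coord_cartesian() only zooms the view.

```r
library(ggplot2)

# Toy data (hypothetical): one group holding the values 1 to 100
toy  <- data.frame(g = factor("a"), y = 1:100)
base <- ggplot(toy, aes(x = g, y = y)) + geom_boxplot()

# Scale limits: values above 50 are dropped (with a warning) before the stats
p_scale <- base + scale_y_continuous(limits = c(0, 50))

# Coordinate limits: all 100 values feed the stats, only the view is clipped
p_zoom  <- base + coord_cartesian(ylim = c(0, 50))

# Medians as computed by the boxplot stat in each case
med_scale <- suppressWarnings(layer_data(p_scale)$middle)
med_zoom  <- layer_data(p_zoom)$middle
c(scale = med_scale, zoom = med_zoom)
```

In the thread's plot the scores appear to sit well inside 700 to 2100, so either approach should draw the same boxes there; the distinction matters once the limits cut into the data.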

There is a link to the generated image below.

################
# Simulate some data:
Type <- rep( c( "National", "Local" ), each = 250 )
M0   <- 1300+50*(0:4)
set.seed( 42 )
M1   <- M0 + runif( 5, -100, -50 )
X0   <- rnorm( 250, rep( M0, each = 50 ), 150 )
X1   <- rnorm( 250, rep( M1, each = 50 ), 100 )

library(ggplot2)
Year <- factor( rep( 4:8, each = 50, times = 2)
               , levels = 0:8 )
DemoDat <- data.frame( Year = Year
                      , Score = c( X0, X1 )
                      , Type = Type
                      )

ggplot( data = DemoDat
       , aes( x = Year
            , y = Score
            , color = Type
            )
       , fill = NULL
       ) +
     geom_boxplot( position = position_dodge( 1 )
                 , outlier.alpha = 0
                 ) +
     theme_minimal() +
     scale_colour_manual( name = "National v. Local"
                        , values = c( "red", "black" ) ) +
     scale_x_discrete( drop = FALSE ) +
     scale_y_continuous( breaks=seq( 700, 2100, 100 ) ) +
     coord_cartesian( ylim = c( 700, 2100 ) )

# ![](https://i.imgur.com/wUVYU5H.png)

#' Created on 2018-07-28 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
################
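
A note on the two cosmetic settings in the reprex, offered as a sketch rather than a correction. As far as the ggplot() help page indicates, extra arguments passed through `...` are not currently used, so `fill = NULL` in the ggplot() call probably has no effect of its own; the boxes look empty because, once fill is no longer mapped inside aes(), geom_boxplot() falls back to its plain default fill (white in ggplot2 3.x, as far as is known here). Setting `fill = NA` in the geom asks for no fill explicitly, and the geom_boxplot() documentation suggests `outlier.shape = NA` for hiding outliers, as an alternative to drawing them fully transparent. The data frame `toy` below is invented for illustration.

```r
library(ggplot2)

# Toy data (hypothetical): two types per year, roughly like the demo
set.seed(1)
toy <- data.frame(
  Year  = factor(rep(4:5, each = 30, times = 2)),
  Score = rnorm(120, mean = 1300, sd = 150),
  Type  = rep(c("National", "Local"), each = 60)
)

p <- ggplot(toy, aes(x = Year, y = Score, colour = Type)) +
  geom_boxplot(position = position_dodge(1),
               fill = NA,             # constant outside aes(): no fill, no fill legend
               outlier.shape = NA) +  # do not draw outlier points at all
  scale_colour_manual(name = "National v. Local",
                      values = c(National = "red", Local = "black")) +
  theme_minimal()
p
```

Hiding the points does not change the box or whisker statistics; the outlying values are still part of the data.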


> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggplot2_3.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17     pillar_1.2.3     compiler_3.4.4   plyr_1.8.4       bindr_0.1.1      tools_3.4.4
 [7] digest_0.6.15    memoise_1.1.0    evaluate_0.10.1  tibble_1.4.2     gtable_0.2.0     debugme_1.1.0
[13] pkgconfig_2.0.1  rlang_0.2.1      reprex_0.2.0     rstudioapi_0.7   yaml_2.1.19      bindrcpp_0.2.2
[19] stringr_1.3.1    withr_2.1.2      dplyr_0.7.6      knitr_1.20       devtools_1.13.6  rprojroot_1.3-2
[25] grid_3.4.4       tidyselect_0.2.4 glue_1.2.0       R6_2.2.2         processx_3.1.0   rmarkdown_1.10
[31] clipr_0.4.1      purrr_0.2.5      callr_2.0.4      magrittr_1.5     whisker_0.3-2    scales_0.5.0
[37] backports_1.1.2  htmltools_0.3.6  assertthat_0.2.0 colorspace_1.3-2 stringi_1.2.3    lazyeval_0.2.1
[43] munsell_0.5.0    crayon_1.3.4




Re: Grouped boxplots using ggplot() from ggplot2.

Rolf Turner

On 29/07/18 02:54, Jeff Newmiller wrote:

> 1) I don't know... it looks to me like you did not run my code.

Aaaarrrgghhh.  I *thought* I had, but instead left "fill=Type" inside
the aes() call and neglected to add fill=NULL outside this call.
Duhhhh!!! It's tough being mentally challenged, let me assure you.

> I have
> included a complete reprex below... try it out in a fresh session. If
> you still get the problem, check your sessionInfo package versions
> against mine.

Yep.  Works like a charm.

> 2) This still smells like your fill parameter is inside the aes function
> with Type as value. This causes a legend to be created, and since that
> legend has a different name ("Type") than the colour scale, they are
> separated. Confirm that you are using fill outside the aes function
> (because you don't want fill to depend on the data) and have the
> constant NULL as value (so it won't generate any fill graphical
> representation).

Yeah.  Well.  Duhhh.  I'm a retread.
>
> 3) I missed that... limits set through ylim() or scale_y_continuous(limits=)
> constrain which data are included as input into the graph, and breaks= only
> places the ticks. The coord_cartesian function forces the limits as desired.

Bewdy, ta.

>
> 4) While showing outliers is a standard semantic feature of boxplots
> whether produced by ggplot or lattice or base or non-R solution,

Indeed.  But the client is always right! :-)

> you can
> please the client by making the outliers transparent.

And your code shows me how!  Which I need.  Bewdy, ta.

> There is a link to the generated image below.
>
> ################
> # Simulate some data:
> Type <- rep( c( "National", "Local" ), each = 250 )
> M0   <- 1300+50*(0:4)
> set.seed( 42 )
> M1   <- M0 + runif( 5, -100, -50 )
> X0   <- rnorm( 250, rep( M0, each = 50 ), 150 )
> X1   <- rnorm( 250, rep( M1, each = 50 ), 100 )
>
> library(ggplot2)
> Year <- factor( rep( 4:8, each = 50, times = 2)
>                , levels = 0:8 )
> DemoDat <- data.frame( Year = Year
>                       , Score = c( X0, X1 )
>                       , Type = Type
>                       )
>
> ggplot( data = DemoDat
>        , aes( x = Year
>             , y = Score
>             , color = Type
>             )
>        , fill = NULL
>        ) +
>      geom_boxplot( position = position_dodge( 1 )
>                  , outlier.alpha = 0
>                  ) +
>      theme_minimal() +
>      scale_colour_manual( name = "National v. Local"
>                         , values = c( "red", "black" ) ) +
>      scale_x_discrete( drop = FALSE ) +
>      scale_y_continuous( breaks=seq( 700, 2100, 100 ) ) +
>      coord_cartesian( ylim = c( 700, 2100 ) )
>
> # ![](https://i.imgur.com/wUVYU5H.png)

Looks perfect.  Thanks *HUGELY* for your patience with my stupidity.

<SNIP>

cheers,

Rolf


Re: Grouped boxplots using ggplot() from ggplot2.

Richard M. Heiberger
## I recommend using lattice for this task.
## First I show the example from my book and package (HH).
## Then I use this on your example.

library(HH)       ## Package supporting Heiberger and Holland,
                  ## Statistical Analysis and Data Display (Second
                  ## edition, 2015)
HHscriptnames(4)  ## Filename on your computer for script for all
                  ## Chapter 4 examples

## this is Chunk 23

###################################################
### code chunk number 23: grap.tex:1953-1981
###################################################
bwdata <- data.frame(Y=(rt(80, df=5)*5 + rep(c(20,25,15,22, 22,28,16,14), 10)),
                     week=ordered(rep(c(1:4, 1:4), each=10)),
                     treatment= rep(c("A", "B"), each=40))

position(bwdata$week) <- c(1, 2, 4, 8)
levels(bwdata$week) <- c(1, 2, 4, 8)

bwdata$week.treatment <- with(bwdata, interaction(treatment, week))
position(bwdata$week.treatment) <-
   as.vector(t(outer(c(1, 2, 4, 8), c(-.18,.18), "+")))

BR <- likertColor(2, colorFunctionOption="default")[2:1]

## uses panel.bwplot.intermediate.hh to control position and colors
## hhpdf("bwplotposition.pdf", width=7, height=5)
bwplot(Y ~ week.treatment, data=bwdata,
       panel=panel.bwplot.intermediate.hh, xlim=c(0, 9),
       box.width=.25,
       pch=c(17, 16), col=BR,
       xlab="Week", ylab=list(rot=0),
       scales=list(x=list(at=position(bwdata$week), tck=1)),
       key=list(
          text=list(c("A","B"), col=BR),
          points=list(pch=c(17, 16), col=BR),
          space="top", columns=2, between=1, border=TRUE,
          title="Treatment", cex.title=.9)) +
   layer(panel.abline(h=37, col="gray60", lty=3, lwd=2))
## hhdev.off()


## The placement features provided by panel.bwplot.intermediate.hh are
## 1. The two boxes at each time position are clearly distinguished
##    from boxes at other time positions.
##
## 2. Times do not need to be evenly spaced.


## Now your sample data and lattice code for your desired graph


## Script to demonstrate what I am trying to do.
##

## Simulate some data:
Year <- factor(rep(4:8,each=50,times=2))
Type <- rep(c("National","Local"),each=250)
M0   <- 1300+50*(0:4)
set.seed(42)
M1   <- M0 + runif(5,-100,-50)
X0   <- rnorm(250,rep(M0,each=50),150)
X1   <- rnorm(250,rep(M1,each=50),100)
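## factor(Type) so that levels(DemoDat$Type) below also works in R >= 4.0,
## where data.frame() no longer converts strings to factors by default.
DemoDat <- data.frame(Year=Year,Score=c(X0,X1),Type=factor(Type))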

if (FALSE) { ## Rolf Turner's original code
## Grouped boxplots:
library(ggplot2)
print(ggplot(data=DemoDat) +
    geom_boxplot(aes(x=Year, y=Score, color=Type,fill=Type),
                 position=position_dodge(1),alpha=0) +
    theme_minimal() +
    scale_fill_discrete(name="National v. Local") +
    ylim(700,2100))
}


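## Type goes first so that the levels alternate within each year
## (Local.4, National.4, Local.5, ...), matching the positions below.
DemoDat$Year.Type <- with(DemoDat, interaction(Type, Year))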
position(DemoDat$Year.Type) <-
   as.vector(t(outer(c(4, 5, 6, 7, 8), c(-.18, .18), "+")))
RB <- c("red", "black")

SYT <-
bwplot(Score ~ Year.Type, data=DemoDat,
       panel=panel.bwplot.intermediate.hh,
       xlim=c(-.1, 9.1),
       ylim=c(690, 2110),
       box.width=.22,
       col=RB,
       xlab="Year", ylab=list(rot=0),
       scales=list(x=list(at=0:9, tck=1),
                   y=list(at=seq(700, 2100, 100), tick=1)),
       par.settings=list(box.dot=list(pch="|"),
                         plot.symbol=list(pch="-", col=RB, cex=1.5)),
       key=list(
         text=list(levels(DemoDat$Type), col=RB, cex=.8),
         lines=list(col=RB), size=1.5,
         columns=2, between=.5, between.columns=.6,
         space="right", border=FALSE,
         title="\nNational v. Local", cex.title=.9),
       main="Matches specifications"
       )
SYT

update(SYT, main="Outliers made invisible, not recommended",
       par.settings=list(plot.symbol=list(cex=0)))
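
The position vectors in the scripts above depend on how interaction() orders its levels: the first factor varies fastest. The following base-R sketch, using small made-up factors named after those in the book example, shows that the flattened outer() result lists both offsets for the first centre, then both for the next, which is the order of the levels of interaction(treatment, week) with treatment first.

```r
# Made-up factors: two treatments at each of three weeks
treatment <- factor(rep(c("A", "B"), times = 3))
week      <- factor(rep(c(1, 2, 4), each = 2))

# First factor varies fastest in the levels of an interaction
lev <- levels(interaction(treatment, week))
lev
# "A.1" "B.1" "A.2" "B.2" "A.4" "B.4"

# Offsets added to each centre, then flattened centre by centre
pos <- as.vector(t(outer(c(1, 2, 4), c(-.18, .18), "+")))
pos
# 0.82 1.18 1.82 2.18 3.82 4.18

# One position per level, in matching order
data.frame(level = lev, position = pos)
```

Swapping the arguments, interaction(week, treatment), would give levels that run through all weeks for "A" before starting on "B", and the same position vector would then no longer line up with them.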

## Rich

