lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth


Michael Friendly
In the HistData package, I have a data frame, PearsonLee, containing
observations on heights of parent and child, in weighted form:

library(HistData)

 > str(PearsonLee)
'data.frame':   746 obs. of  6 variables:
  $ child    : num  59.5 59.5 59.5 60.5 60.5 61.5 61.5 61.5 61.5 61.5 ...
  $ parent   : num  62.5 63.5 64.5 62.5 66.5 59.5 60.5 62.5 63.5 64.5 ...
  $ frequency: num  0.5 0.5 1 0.5 1 0.25 0.25 0.5 1 0.25 ...
  $ gp       : Factor w/ 4 levels "fd","fs","md",..: 2 2 2 2 2 2 2 2 2 2 ...
  $ par      : Factor w/ 2 levels "Father","Mother": 1 1 1 1 1 1 1 1 1 1 ...
  $ chl      : Factor w/ 2 levels "Daughter","Son": 2 2 2 2 2 2 2 2 2 2 ...

I want to make a 2x2 set of plots of child ~ parent | par+chl, with
regression lines and loess smooths that incorporate weights=frequency.
The "frequencies" are not integers, so I can't simply expand the data
frame.
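
As a side illustration of the point about expanding the data frame: with integer weights, a weighted lm() fit has the same coefficients as an unweighted fit on a data frame in which each row is repeated that many times, so expansion is only an option for integer counts. The sketch below uses a small made-up data frame (d, with columns x, y and w, all hypothetical and not taken from PearsonLee) and also shows that lm() accepts fractional weights directly.

```r
# Toy data (hypothetical): integer weights w play the role of counts.
d <- data.frame(x = 1:6,
                y = c(2.1, 2.9, 4.2, 4.8, 6.1, 7.2),
                w = c(1, 2, 1, 3, 2, 1))

# Expand: repeat row i exactly w[i] times, then fit without weights.
expanded <- d[rep(seq_len(nrow(d)), times = d$w), ]
fit_expanded <- lm(y ~ x, data = expanded)

# Fit the original rows with weights instead.
fit_weighted <- lm(y ~ x, data = d, weights = w)

# Fractional weights cannot be expanded this way, but lm() takes them as-is.
# Rescaling all weights by a constant leaves the coefficients unchanged.
fit_fractional <- lm(y ~ x, data = d, weights = w / 4)

same_as_expanded   <- all.equal(coef(fit_weighted), coef(fit_expanded))
same_after_scaling <- all.equal(coef(fit_weighted), coef(fit_fractional))
```

Only the coefficients are compared here; standard errors from the expanded fit would differ because the apparent sample size changes.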

I'd also like to use different colors for the regression and smoothed lines.
Here's what I've tried using xyplot, all unsuccessful.  I suppose I could
also use ggplot2, if I could do what I want.

xyplot(child ~ parent|par+chl, data=PearsonLee, weights=frequency,
       type=c("p", "r", "smooth"))
xyplot(child ~ parent|par+chl, data=PearsonLee,  type=c("p", "r", "smooth"))

panel.lmline and panel.smooth don't have a weights= argument, though
lm() and loess() do.

# Try to control line colors: unsuccessful -- only one value of
# col.line is used
xyplot(child ~ parent|par+chl, data=PearsonLee,
       type=c("p", "r", "smooth"), col.line=c("red", "blue"))

## try to use panel functions ... unsuccessfully
xyplot(child ~ parent|par+chl, data=PearsonLee, type="p",
        panel = function(x, y, ...) {
            panel.xyplot(x, y, ...)
            panel.lmline(x, y, col="blue", ...)
            panel.smooth(x, y, col="red", ...)
            }
)
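
One possible way to get weighted fits into each lattice panel is to compute them inside a custom panel function, using the subscripts argument to pick out the weights for the rows shown in that panel. This is a sketch under stated assumptions, not a recipe taken from this thread: the helper name make_weighted_panel and its arguments are made up for illustration, while panel.xyplot(), panel.abline(), panel.lines() and the subscripts mechanism are standard lattice.

```r
library(lattice)

# Hypothetical helper: returns a panel function that draws the points plus
# a weighted regression line and a weighted loess curve. `w` is the full
# weight vector, in the same row order as the data passed to xyplot().
make_weighted_panel <- function(w, col.lm = "blue", col.loess = "red") {
  function(x, y, subscripts, ...) {
    wp <- w[subscripts]                      # weights for this panel's rows
    panel.xyplot(x, y, ...)
    # Weighted least-squares line
    panel.abline(reg = lm(y ~ x, weights = wp), col = col.lm, lwd = 2)
    # Weighted loess curve, evaluated on a grid over the panel's x range
    xs  <- seq(min(x), max(x), length.out = 50)
    fit <- loess(y ~ x, weights = wp)
    panel.lines(xs, predict(fit, data.frame(x = xs)),
                col = col.loess, lwd = 2)
  }
}

data(PearsonLee, package = "HistData")
p <- xyplot(child ~ parent | par + chl, data = PearsonLee, type = "p",
            panel = make_weighted_panel(PearsonLee$frequency))
print(p)
```

Because the two lines are drawn by separate calls, each can take its own colour, which sidesteps the single col.line value used by type=c("r", "smooth").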

The following, using base graphics, illustrates the difference between
the weighted and unweighted lines, for the total data frame:

with(PearsonLee,
     {
     lim <- c(55,80)
     xv <- seq(55,80, .5)
     sunflowerplot(parent,child, number=frequency, xlim=lim, ylim=lim,
                   seg.col="gray", size=.1)
     # unweighted
     abline(lm(child ~ parent), col="green", lwd=2)
     lines(xv, predict(loess(child ~ parent), data.frame(parent=xv)),
           col="green", lwd=2)
     # weighted
     abline(lm(child ~ parent, weights=frequency), col="blue", lwd=2)
     lines(xv, predict(loess(child ~ parent, weights=frequency),
                       data.frame(parent=xv)), col="blue", lwd=2)
   })

thanks,
-Michael



--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

djmuseR
Hi Michael:

Here's one way to get it from ggplot2. To avoid possible overplotting,
I jittered the points horizontally by +/- 0.2. I also reduced the point
size from the default 2 and increased the line thickness to 1.5 for
both fitted curves. In ggplot2, the term faceting is synonymous with
conditioning (by groups).

library('HistData')
library('ggplot2')
ggplot(PearsonLee, aes(x = parent, y = child)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weights = PearsonLee$weight),
               colour = 'green', se = FALSE, size = 1.5) +
   geom_smooth(aes(weights = PearsonLee$weight),
               colour = 'red', se = FALSE, size = 1.5) +
   facet_grid(chl ~ par)

# If you prefer a legend, here's one take, pulling the legend inside
# to the upper left corner. This requires a bit more 'trickery', but
# the tricks are found in the ggplot2 book.

ggplot(PearsonLee, aes(x = parent, y = child)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weights = PearsonLee$weight,
               colour = 'Linear'), se = FALSE, size = 1.5) +
   geom_smooth(aes(weights = PearsonLee$weight,
               colour = 'Loess'), se = FALSE, size = 1.5) +
   facet_grid(chl ~ par) +
   scale_colour_manual(breaks = c('Linear', 'Loess'),
                       values = c('green', 'red')) +
   opts(legend.position = c(0.14, 0.885),
        legend.background = theme_rect(fill = 'white'))


HTH,
Dennis

On Fri, Oct 21, 2011 at 8:22 AM, Michael Friendly <[hidden email]> wrote:

> In the HistData package, I have a data frame, PearsonLee, containing
> observations on heights of parent and child, in weighted form:
> [...]

Re: lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

Michael Friendly
Thanks very much, Dennis.  See below for something I don't understand.

On 10/21/2011 12:15 PM, Dennis Murphy wrote:

> Hi Michael:
>
> Here's one way to get it from ggplot2. To avoid possible overplotting,
> I jittered the points horizontally by ± 0.2. I also reduced the point
> size from the default 2 and increased the line thickness to 1.5 for
> both fitted curves. In ggplot2, the term faceting is synonymous with
> conditioning (by groups).
>
> library('HistData')
> library('ggplot2')
> ggplot(PearsonLee, aes(x = parent, y = child)) +
>     geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
>     geom_smooth(method = lm, aes(weights = PearsonLee$weight),
>                 colour = 'green', se = FALSE, size = 1.5) +
>     geom_smooth(aes(weights = PearsonLee$weight),
>                 colour = 'red', se = FALSE, size = 1.5) +
>     facet_grid(chl ~ par)
This seems to work, but I don't understand *why*, since the weight variable
is PearsonLee$frequency, not PearsonLee$weight.

 > PearsonLee$weight
NULL
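
A small base-R sketch of what a NULL weight vector means for a fit: `$` on a data frame returns NULL for a column that does not exist, and lm() treats weights = NULL as "no weights", so a fit handed such a vector is just the ordinary unweighted fit. The data frame d below is made up for illustration; whether ggplot2 of that era handled the misspelled weights aesthetic in exactly this way is an assumption, not something established in this thread.

```r
# Toy data (hypothetical), with a `frequency` column but no `weight` column.
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(1.2, 1.9, 3.4, 3.8, 5.3),
                frequency = c(0.25, 0.5, 1, 0.5, 0.25))

no_such_column <- is.null(d$weight)   # `$` gives NULL for a missing column

fit_null  <- lm(y ~ x, data = d, weights = d$weight)   # weights = NULL
fit_plain <- lm(y ~ x, data = d)                       # no weights at all
fit_freq  <- lm(y ~ x, data = d, weights = frequency)  # genuinely weighted

null_equals_plain <- all.equal(coef(fit_null), coef(fit_plain))
freq_differs      <- !isTRUE(all.equal(coef(fit_freq), coef(fit_plain)))
```

So a plot that appears to "work" with a NULL weight would be showing unweighted lines.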

I get an error if I try to use PearsonLee$frequency as the weights= variable.

 > ggplot(PearsonLee, aes(x = parent, y = child)) +
+    geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
+    geom_smooth(method = lm, aes(weights = PearsonLee$frequency),
+                colour = 'green', se = FALSE, size = 1.5) +
+    geom_smooth(aes(weights = PearsonLee$frequency),
+                colour = 'red', se = FALSE, size = 1.5) +
+    facet_grid(chl ~ par)
Error in eval(expr, envir, enclos) : object 'weight' not found

In the form below, it makes sense to me and does work, using
weight=frequency in the initial aes(), and weight= in geom_smooth:

ggplot(PearsonLee, aes(x = parent, y = child, weight=frequency)) +
    geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
    geom_smooth(method = lm, aes(weight = PearsonLee$frequency),
                colour = 'green', se = FALSE, size = 1.5) +
    geom_smooth(aes(weight = PearsonLee$frequency),
                colour = 'red', se = FALSE, size = 1.5) +
    facet_grid(chl ~ par)




Re: lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

djmuseR
Hi Michael:

The necessary argument to geom_smooth() is weight, not weights (my
fault, sorry), so try this instead:

ggplot(PearsonLee, aes(x = parent, y = child)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weight = frequency,
               colour = 'Linear'), se = FALSE, size = 1.5) +
   geom_smooth(aes(weight = frequency,
               colour = 'Loess'), se = FALSE, size = 1.5) +
   facet_grid(chl ~ par) +
   scale_colour_manual(breaks = c('Linear', 'Loess'),
                       values = c('green', 'red')) +
   opts(legend.position = c(0.14, 0.885),
        legend.background = theme_rect(fill = 'white'))

Dennis
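
For readers on a current ggplot2, the same plot can be sketched as follows. opts() and theme_rect() were later replaced by theme() and element_rect(), and line thickness is now set with linewidth. This version assumes ggplot2 3.5.0 or later (for legend.position.inside); it also names method = "loess" and formula = y ~ x explicitly, which the code above left to the defaults.

```r
library(HistData)
library(ggplot2)

p <- ggplot(PearsonLee, aes(x = parent, y = child)) +
  geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
  # Weighted linear fit; `weight` (singular) is the aesthetic name
  geom_smooth(aes(weight = frequency, colour = "Linear"),
              method = "lm", formula = y ~ x, se = FALSE, linewidth = 1.5) +
  # Weighted loess smooth
  geom_smooth(aes(weight = frequency, colour = "Loess"),
              method = "loess", formula = y ~ x, se = FALSE, linewidth = 1.5) +
  facet_grid(chl ~ par) +
  scale_colour_manual(breaks = c("Linear", "Loess"),
                      values = c(Linear = "green", Loess = "red")) +
  # Assumes ggplot2 >= 3.5.0 for the "inside" legend placement
  theme(legend.position = "inside",
        legend.position.inside = c(0.14, 0.885),
        legend.background = element_rect(fill = "white"))
print(p)
```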

On Fri, Oct 21, 2011 at 11:57 AM, Michael Friendly <[hidden email]> wrote:

> Thanks very much, Dennis.  See below for something I don't understand.
> [...]