Thinning Lattice Plot

Thinning Lattice Plot

Elliot
Is there an easy way to "thin" a lattice plot? I often create plots from
large data sets, and use the "pdf" command to save them to a file, but the
resulting files can be huge, because every point in the underlying dataset
is rendered in the plot, even though it isn't possible to see that much
detail.

For example:

require(Hmisc)
x <- rnorm(1e6)

pdf("test.pdf")
Ecdf(x)
dev.off()

The resulting PDF file is 31 MB. Is there an easy way to get a smaller PDF
file without having to manually prune the dataset?

Thanks.

- Elliot

--
Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
134 Mount Auburn Street | Cambridge, MA | 02138
Phone: (617) 503-4619 | Email: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Thinning Lattice Plot

David Winsemius

On Jul 30, 2012, at 2:13 PM, Elliot Joel Bernstein wrote:

> Is there an easy way to "thin" a lattice plot? I often create plots from
> large data sets, and use the "pdf" command to save them to a file, but the
> resulting files can be huge, because every point in the underlying dataset
> is rendered in the plot, even though it isn't possible to see that much
> detail.
>
> For example:
>
> require(Hmisc)
> x <- rnorm(1e6)
>
> pdf("test.pdf")
> Ecdf(x)
> dev.off()
>
> The resulting PDF file is 31 MB. Is there an easy way to get a smaller PDF
> file without having to manually prune the dataset?

There are plotting routines that display the density of distributions.
I use hexbin fairly frequently, but that is for 2-D plots. If you
want the ECDF of a 1-D vector, you could use cumsum() on the output
of hist(), or quantile(), with suitable arguments to control the
degree of aggregation. Either of these yields an 8 KB file on my
machine.

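 > library(lattice)
 > h <- hist(x, plot=FALSE)   # bin once; no plot is drawn
 > pdf("test.pdf")
 > # cumulative proportion at the right edge of each bin
 > xyplot( cumsum(h$counts)/sum(h$counts) ~ h$breaks[-1] )
 > dev.off()
quartz
      2

 > pdf("test.pdf")
 > # 101 quantiles stand in for the full sample
 > xyplot( (0:100)/100 ~ quantile(x, probs=(0:100)/100) )
 > dev.off()
quartz
      2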
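
To see how much the aggregation buys, the comparison below is a minimal sketch using base graphics only (not the lattice calls above). The sample size, the 101 quantile levels, and the temporary file names are illustrative assumptions, and pdf() is called with compress = FALSE so that the difference in the number of drawn points is not masked by compression.

```r
set.seed(1)
x <- rnorm(1e5)  # smaller than the 1e6 in the thread, to keep the demo quick

full_file <- tempfile(fileext = ".pdf")
thin_file <- tempfile(fileext = ".pdf")

# Full ECDF: every observation becomes a point in the PDF
pdf(full_file, compress = FALSE)
plot(sort(x), seq_along(x) / length(x), pch = ".",
     xlab = "x", ylab = "Proportion <= x")
invisible(dev.off())

# Thinned ECDF: 101 quantiles summarize the same curve
p <- (0:100) / 100
pdf(thin_file, compress = FALSE)
plot(quantile(x, probs = p), p, type = "l",
     xlab = "x", ylab = "Proportion <= x")
invisible(dev.off())

# File sizes in bytes
sizes <- c(full = file.size(full_file), thinned = file.size(thin_file))
print(sizes)
```

The exact sizes depend on the R version and platform, so none are quoted here; the point is only that the thinned file scales with the number of quantiles rather than with the number of observations.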




David Winsemius, MD
Alameda, CA, USA

Re: Thinning Lattice Plot

David Carlson
You might also check ?pdf on your system. On Windows, compression is the
default. Your code creates a 186K file, although it is slow to load,
reflecting the overhead of decompressing the file.
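
The effect of compression can be checked directly. The sketch below assumes an R version whose pdf() has the compress argument (see ?pdf, as suggested above). The helper write_ecdf_pdf(), the sample size, and the temporary file names are made up for illustration, and base plot() stands in for Ecdf() so that no extra packages are needed.

```r
set.seed(1)
x <- rnorm(1e5)

# Hypothetical helper: draw the full ECDF of x into a PDF file
write_ecdf_pdf <- function(file, compress) {
  pdf(file, compress = compress)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(sort(x), seq_along(x) / length(x), pch = ".",
       xlab = "x", ylab = "Proportion <= x")
  invisible(file)
}

raw_file <- write_ecdf_pdf(tempfile(fileext = ".pdf"), compress = FALSE)
zip_file <- write_ecdf_pdf(tempfile(fileext = ".pdf"), compress = TRUE)

# File sizes in bytes
sizes <- c(uncompressed = file.size(raw_file),
           compressed   = file.size(zip_file))
print(sizes)
```

Compression shrinks the file but not the number of points a viewer has to draw, which is consistent with the slow loading noted above.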

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



Re: Thinning Lattice Plot

Deepayan Sarkar
In reply to this post by Elliot
On Tue, Jul 31, 2012 at 2:43 AM, Elliot Joel Bernstein
<[hidden email]> wrote:

> Is there an easy way to "thin" a lattice plot? I often create plots from
> large data sets, and use the "pdf" command to save them to a file, but the
> resulting files can be huge, because every point in the underlying dataset
> is rendered in the plot, even though it isn't possible to see that much
> detail.
>
> For example:
>
> require(Hmisc)
> x <- rnorm(1e6)
>
> pdf("test.pdf")
> Ecdf(x)
> dev.off()

(This is not a lattice plot, BTW.)

> The resulting PDF file is 31 MB.

Hmm, for me it's 192K. Perhaps you have not bothered to update R recently.

> Is there any easy way to get a smaller pdf
> file without having to manually prune the dataset?

In general, as David noted, you need to do some sort of data
summarization: great if tools are available to do that; otherwise, you
have to do it yourself. In this case, for example, it seems reasonable to do

Ecdf(quantile(x, probs = ppoints(500, a=1)))

If you would rather not do this yourself, ecdfplot() in latticeExtra allows

library(latticeExtra)
ecdfplot(x, f.value = ppoints(500, a=1))
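
For readers unfamiliar with ppoints(): with a = 1 it returns an evenly spaced grid that includes both 0 and 1, so the 500 quantiles run from the sample minimum to the sample maximum. A small base-R check follows; the sample size is an arbitrary choice for illustration.

```r
p <- ppoints(500, a = 1)

# ppoints(n, a) is (1:n - a) / (n + 1 - 2 * a);
# with a = 1 this reduces to (0:(n - 1)) / (n - 1)
stopifnot(isTRUE(all.equal(p, seq(0, 1, length.out = 500))))

set.seed(1)
x <- rnorm(1e5)
q <- quantile(x, probs = p)

# The thinned data keep the extremes of x but hold only 500 values
stopifnot(length(q) == 500)
stopifnot(isTRUE(all.equal(unname(q[c(1, 500)]), range(x))))
```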

-Deepayan

Re: Thinning Lattice Plot

Elliot
Thanks everyone for your replies. I didn't know about the ecdfplot
function, so I'll start using that instead of Ecdf. Why is Ecdf not a
lattice plot? The result certainly looks like other lattice plots, and the
arguments are similar to those of other lattice plots. In fact, internally
it seems to just call the "histogram" function with a different prepanel and
panel function. Is it not considered a lattice plot only because it isn't
part of the lattice package?

Thanks.

- Elliot




Re: Thinning Lattice Plot

Bert Gunter
Well, yes.

Terminology-wise, I guess one could say that it's a trellis plot in
the Hmisc package.

But I'd agree that this is nitpicking.

-- Bert




--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Thinning Lattice Plot

Deepayan Sarkar
In reply to this post by Elliot
On Tue, Jul 31, 2012 at 6:43 PM, Elliot Joel Bernstein
<[hidden email]> wrote:

> Thanks everyone for your replies. I didn't know about the ecdfplot function,
> so I'll start using that instead of Ecdf. Why is Ecdf not a lattice plot?
> The result certainly looks like other lattice plots, the arguments are
> similar to other lattice plots. In fact, internally it seems to just call
> the "histogram" function with a different prepanel and panel function. Is it
> not considered a lattice plot only because it isn't part of the lattice
> package?

Of course not. What you are saying is a valid description of the
Ecdf.formula() method, which definitely produces a lattice plot (or
trellis plot if you prefer). However, the example you gave, namely,

x <- rnorm(1e6)
Ecdf(x)

ends up calling Ecdf.default(), which is very much a traditional
graphics function. I should add that this is for Hmisc 3.9-2, and I
don't know whether the behaviour is different in other versions.

Note that Ecdf() has more features than ecdfplot(), in particular it
allows weights.
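
The distinction comes from S3 method dispatch on the class of the first argument. The toy generic below is hypothetical (show_kind and its methods are not part of Hmisc or lattice); it only mimics how a call with a numeric vector, such as Ecdf(x), reaches the default method, while a call with a one-sided formula, such as Ecdf(~ x), reaches the formula method.

```r
# Hypothetical generic, for illustration only
show_kind <- function(x, ...) UseMethod("show_kind")

# Stand-in for a traditional-graphics method such as Ecdf.default()
show_kind.default <- function(x, ...) "default method (traditional graphics)"

# Stand-in for a lattice-based method such as Ecdf.formula()
show_kind.formula <- function(x, ...) "formula method (lattice/trellis)"

x <- rnorm(10)
show_kind(x)    # numeric vector: no numeric method, so the default is used
show_kind(~ x)  # one-sided formula has class "formula"
```

With Hmisc loaded, methods(Ecdf) lists the methods actually available in the installed version.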

-Deepayan


Re: Thinning Lattice Plot

Elliot
I see. I typically use a (one-sided) formula as the first argument to Ecdf,
but didn't even think about that distinction in putting together this
example.

Thanks again for your help.

- Elliot
