

Hello
I need to plot a histogram, but instead of using bars, I'd like to plot
the data points. I've been doing it like this so far:
h <- hist(x, plot = FALSE)
plot(y = h$counts / sum(h$counts),
     x = h$breaks[2:length(h$breaks)],
     type = "p", log = "xy")
Sometimes I want to have a look at the "raw" data (avoiding any kind of
binning). When x only contains integers, it's easy to just use bins of
size 1 when generating h with "breaks = seq(0, max(x))".
Is there any way to do something similar when x consists of fractional
data? What I'm doing is setting a small bin length (for example, "breaks
= seq(0, 1, by = 1e-6)"), but there's still a chance that points will be
grouped in a single bin.
Is there a better way to do this kind of "raw histogram" plotting?
Thanks,
Andre
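
A minimal, self-contained sketch of the histogram-as-points approach described above. The simulated data, the choice of 50 bins, and the use of bin midpoints (h$mids) rather than right-hand break points are assumptions made for illustration only. Empty bins are dropped because a zero cannot be drawn on a logarithmic axis.

```r
set.seed(1)
x <- rexp(1e5)                        # simulated stand-in for the real data

h   <- hist(x, breaks = 50, plot = FALSE)   # bin without drawing bars
rel <- h$counts / sum(h$counts)             # relative frequency per bin
keep <- rel > 0                             # empty bins cannot go on a log axis

# One point per non-empty bin, on log-log axes
plot(h$mids[keep], rel[keep], type = "p", log = "xy",
     xlab = "x (bin midpoint)", ylab = "relative frequency")
```

Note that hist() treats a single number passed as breaks only as a suggestion, so the actual number of bins may differ slightly from 50.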
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


take a look at
?stem
There is still a place for hand tools in the age of integrated
circuits. Of course, avoiding binning isn't really desirable.
url: www.econ.uiuc.edu/~roger Roger Koenker
email [hidden email] Department of Economics
vox: 2173334558 University of Illinois
fax: 2172446678 Champaign, IL 61820


On Tue, Feb 26, 2008 at 4:10 PM, Andre Nathan < [hidden email]> wrote:
> Hello
>
> I need to plot a histogram, but instead of using bars, I'd like to plot
> the data points. I've been doing it like this so far:
>
> h <- hist(x, plot = FALSE)
> plot(y = h$counts / sum(h$counts),
>      x = h$breaks[2:length(h$breaks)],
>      type = "p", log = "xy")
Another approach would be to use ggplot2, where all statistical
transformations can be performed separately from their traditional
appearance:
install.packages("ggplot2")
qplot(x, stat="bin", geom="bar")
qplot(x, stat="bin")
Hadley

http://had.co.nz/
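
For readers on a current ggplot2 release, where qplot() is deprecated, the same idea (the "bin" statistic drawn with points instead of bars) might be sketched roughly as follows. The simulated data, the data frame name df, and the choice of 30 bins are assumptions for illustration and are not part of the original reply.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rexp(1000))      # simulated stand-in for the real data

# Bin the data with the "bin" statistic, but draw each bin count as a point
p <- ggplot(df, aes(x = x)) +
  geom_point(aes(y = after_stat(count)), stat = "bin", bins = 30)
print(p)
```

The statistic (how the data are summarised) and the geom (how the summary is drawn) are chosen independently, which is the separation the reply refers to.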


I know about stem, but the data set has 1 million points, so it's not
very useful here. I want to avoid binning just to have an idea about the
shape of the distribution, before deciding how I'll bin it.
Andre
On Tue, 2008-02-26 at 16:20 -0600, roger koenker wrote:
> take a look at
>
> ?stem
>
> There is still a place for hand tools in the age of integrated
> circuits. Of course, avoiding binning isn't really desirable.


Andre
If I understand you correctly, you could try a barplot() on the result
of table().
HTH ......
Peter Alspach
> -----Original Message-----
> From: [hidden email]
> [mailto: [hidden email]] On Behalf Of Andre Nathan
> Sent: Wednesday, 27 February 2008 1:34 p.m.
> To: roger koenker
> Cc: r-help
> Subject: Re: [R] "Raw" histogram plots
>
> I know about stem, but the data set has 1 million points, so
> it's not very useful here. I want to avoid binning just to
> have an idea about the shape of the distribution, before
> deciding how I'll bin it.
>
> Andre


If the goal is to get a sense of the 'shape' of the overall distribution
of 'x', then why not use:
plot(density(x))
?
HTH,
Marc Schwartz


On Tue, 26 Feb 2008, Andre Nathan wrote:
> I know about stem, but the data set has 1 million points, so it's not
> very useful here. I want to avoid binning just to have an idea about the
> shape of the distribution, before deciding how I'll bin it.
Ideas:
1) use a much smaller sample of the data (1000 should suffice)
2) use a density plot (see ?density), perhaps on a subsample
(although as that will bin the data on a fine grid, this does not matter
much).
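
A minimal sketch of both ideas, assuming simulated data in place of the real 1 million points. The variable names sub and d and the side-by-side layout are illustrative choices only.

```r
set.seed(1)
x <- rexp(1e6)                        # simulated stand-in for the 1 million points

sub <- sample(x, 1000)                # idea 1: a much smaller random sample
d   <- density(sub)                   # idea 2: kernel density estimate of it

op <- par(mfrow = c(1, 2))            # two panels side by side
stripchart(sub, method = "jitter", pch = 1, main = "raw subsample")
plot(d, main = "density of subsample")
par(op)                               # restore the previous layout
```

density() evaluates the estimate on a fixed grid, so its cost depends mostly on the grid size rather than on how many points are passed in.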

Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote:
> If I understand you correctly, you could try a barplot() on the result
> of table().
Hmm, table() does the counting exactly the way I want, i.e., just
counting individual values. Is there a way to extract the counts vs. the
values from a table, so that I can pass them as the x and y arguments to
plot()?
Thanks,
Andre


If I understand:
x <- rnorm(1e6)
out <- tapply(x, ceiling(x), length)
plot(as.numeric(names(out)), out)
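
Note that ceiling(x) acts as a grouping key one unit wide, so the tapply() call above still bins the data. A sketch of the same approach with a finer key, assuming a hypothetical resolution of one decimal place (the names digits, key and tab are illustrative and not from the original post):

```r
set.seed(1)
x <- rnorm(1e5)                       # simulated data, smaller than in the post

digits <- 1                           # hypothetical resolution: groups 0.1 wide
key <- round(x, digits)               # grouping key; ceiling(x) gives groups 1 wide

out <- tapply(x, key, length)         # number of points per distinct key
tab <- table(key)                     # table() produces the same counts

plot(as.numeric(names(out)), as.numeric(out), type = "p",
     xlab = "rounded value", ylab = "count")
```

Raising digits moves the plot closer to the unbinned data, at the cost of more groups that hold only a single point.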

Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


On Feb 27, 2008, at 8:16 AM, Andre Nathan wrote:
> On Wed, 20080227 at 14:15 +1300, Peter Alspach wrote:
>> If I understand you correctly, you could try a barplot() on the
>> result
>> of table().
>
> Hmm, table() does the counting exactly the way I want, i.e., just
> counting individual values. Is there a way to extract the counts
> vs. the
> values from a table, so that I can pass them as the x and y
> arguments to
> plot()?
>
x <- table(rbinom(20, 2, 0.5))
plot(names(x),x)
should do it. You can also try just plot(x). Use prop.table on table
if you want the relative frequencies instead.
> Thanks,
> Andre
Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
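
A sketch of the table()/names()/prop.table() combination applied to fractional data on log-log axes, as in the original question. The simulated data (geometric counts rescaled to steps of 0.5) are an assumption chosen so that values repeat. Converting the names explicitly with as.numeric() avoids relying on implicit coercion inside plot().

```r
set.seed(42)
x <- rgeom(1e4, prob = 0.3) / 2 + 0.5   # simulated fractional data: 0.5, 1, 1.5, ...

tab   <- table(x)                       # one cell per distinct value, no binning
vals  <- as.numeric(names(tab))         # the distinct values, back as numbers
freqs <- as.numeric(prop.table(tab))    # relative frequency of each value

plot(vals, freqs, type = "p", log = "xy",
     xlab = "value", ylab = "relative frequency")
```

With truly continuous data almost every value is unique, so every relative frequency collapses to 1/length(x); the approach is informative only when values repeat.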


Andre Nathan wrote:
> On Wed, 20080227 at 14:15 +1300, Peter Alspach wrote:
>> If I understand you correctly, you could try a barplot() on the result
>> of table().
>
> Hmm, table() does the counting exactly the way I want, i.e., just
> counting individual values. Is there a way to extract the counts vs. the
> values from a table, so that I can pass them as the x and y arguments to
> plot()?
>
> Thanks,
> Andre
Also take a look at the Hmisc package's spike histogram-related functions
such as histSpike and scat1d.

Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University


On Wed, 2008-02-27 at 08:48 -0500, Charilaos Skiadas wrote:
> x <- table(rbinom(20, 2, 0.5))
> plot(names(x),x)
>
> should do it. You can also try just plot(x). Use prop.table on table
> if you want the relative frequencies instead.
Yes, names is what I needed :) Thanks for the prop.table hint. I looked
everywhere but none of my searches hinted at table/prop.table. You guys'
help has been invaluable for me.
Thanks again,
Andre

