

# Can someone help with this simple frequency histogram problem (n = 15)?
# I use four classes: [90,95), [95,100), [100,105), [105,110).
# Their limits coincide with the limits obtained by pretty {base}.
# Proper frequencies would be: (1,5,6,3).
# But hist {graphics} gives me a histogram showing frequencies (1,8,3,3),
# with or without the argument breaks = ...
# Reproducible code below. Thanks.
set.seed(123)
x <- rnorm(15, mean=100, sd=5); x <- as.integer(x)
x <- sort(x)
x
breaks <- seq(90, 110, by=5); breaks
pretty(x, n=5) # pretty {base}
x.cut <- cut(x, breaks, right=FALSE); x.cut
freq <- table(x.cut); cbind(freq)
hist(x,breaks=breaks) # hist {graphics}
hist(x)
Never mind. Thanks.
I found that adding the argument right=FALSE to the hist() call fixes it.
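The difference can be checked without drawing anything. A minimal sketch, assuming the same x and breaks as in the code above; with plot = FALSE, hist() returns the bin counts instead of plotting. The names counts_right and counts_left are only for illustration.

```r
set.seed(123)
x <- sort(as.integer(rnorm(15, mean = 100, sd = 5)))
breaks <- seq(90, 110, by = 5)

# Default right = TRUE: bins are (a, b] (the first bin also includes its
# left end), so the three values equal to 100 are counted in (95, 100].
counts_right <- hist(x, breaks = breaks, plot = FALSE)$counts

# right = FALSE: bins are [a, b) (the last bin also includes its right end).
# For these data this gives the same counts as cut(x, breaks, right = FALSE).
counts_left <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts

rbind(counts_right, counts_left)
```

The first row should reproduce the (1,8,3,3) seen in the plot, the second the (1,5,6,3) from table(x.cut).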
Drawing a histogram of discrete data often leads to bad results.
Histograms are intended for continuous data, where no observations fall
on bin boundaries.
You often get a more faithful representation of discrete data using
something like
plot(table(x))
Duncan Murdoch
Also check out MASS::truehist, or simply consider setting breaks so that they do not coincide with data values. (That hist() does not do something like this, but instead actively aims for pretty breaks, is something of a design bug in my book, but it is ancient history and not easy to change at this point in time.)
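As a rough illustration of breaks that do not coincide with data values, assuming integer-valued x as in the original post: half-integer breaks can never equal an observation, so the right argument no longer matters. The name breaks_half is made up for this sketch.

```r
set.seed(123)
x <- sort(as.integer(rnorm(15, mean = 100, sd = 5)))

# Assumption: x is integer-valued, so breaks at half-integers can never
# coincide with an observation.
breaks_half <- seq(89.5, 109.5, by = 5)

h_right <- hist(x, breaks = breaks_half, plot = FALSE)
h_left  <- hist(x, breaks = breaks_half, right = FALSE, plot = FALSE)

# Both calls give the same counts, since no value lies on a bin boundary.
identical(h_right$counts, h_left$counts)
h_right$counts
```

Dropping plot = FALSE draws the histogram with these bins.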
pd
Also there is
I have not looked at the site for a while but I think it has some code in ?Splus which should work in R.
Regards
Duncan
Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2350
yes!!
including plot(<factor>)
[ if you really want, you can add something like 'lwd = 4' there ]
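For the data from the original post, the two variants might look like the sketch below. Filling in the empty levels of the factor is an assumption on my part, made so that gaps between observed values stay visible; tab and f are illustrative names.

```r
set.seed(123)
x <- sort(as.integer(rnorm(15, mean = 100, sd = 5)))

tab <- table(x)        # one count per distinct value
plot(tab, lwd = 4)     # vertical lines at the observed values, drawn thicker

# plot() on a factor draws a bar plot of the level counts.
# Listing all integers in the range as levels keeps the empty ones,
# so values that never occur show up as gaps.
f <- factor(x, levels = min(x):max(x))
plot(f)
```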
And relatedly, possibly more generally:
Many many people and hence useRs do
*NOT* distinguish between what R (and I think statistical graphics more
generally) calls *histograms* on one side vs
*bar plots* / *bar charts* / "spear charts"(?) etc on the other.
As Duncan said: the point is to visually distinguish quantities that are
inherently (mostly/almost) continuous ["mostly/..": think of quantum physics]
from those that are inherently "integer-like" or categorical.
We (the R user community, notably the graphically oriented
subset) should really strive to keep these concepts and the
corresponding visualizations separate as well as possible
[and educate the consumers of our graphics if necessary ..]
Martin Maechler
ETH Zurich and R Core Team
