
hist.default()$density


Martin Becker
Dear developers,

the current implementation of hist.default() calculates 'density' (and
'intensities') as
  dens <- counts/(n*h)
where h has been calculated before as
  h <- diff(fuzzybreaks)
which results in 'fuzzy' values for the density, see e.g.

 > tmp <- hist(1:10,breaks=c(-2.5,2.5,7.5,12.5),plot=FALSE)
 > print(tmp$density,digits=15)
[1] 0.0399999920000016 0.1000000000000000 0.0600000000000000

Since hist.default()$breaks are not the fuzzy breaks used for the
calculation of dens, the sum of the bins' areas differs from 1 by far
more than rounding error in many cases, see e.g.

 > print(sum(tmp$density*diff(tmp$breaks)),digits=15)
[1] 0.999999960000008

Is this intended, or should the calculation of dens read
  dens <- counts/(n*diff(breaks))
instead (or should hist.default()$breaks return the fuzzy breaks)?
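
The effect can be reproduced outside hist() with a short sketch. The way
the fuzz is built here (a shift of 1e-7 times the median bin width,
negative for the lowest break and positive for the others) is an
assumption chosen to match the numbers above, not a copy of the
hist.default() source. The names diddle, dens_fuzzy and dens_exact are
illustrative only.

```r
# Sketch only: the fuzz construction is an assumption, see the text above.
x      <- 1:10
breaks <- c(-2.5, 2.5, 7.5, 12.5)
n      <- length(x)

# Assumed fuzz: tiny shift relative to the median bin width, negative for
# the lowest break and positive for all other breaks.
diddle      <- 1e-7 * median(diff(breaks))
fuzzybreaks <- breaks + c(-diddle, rep(diddle, length(breaks) - 1))

# Counts per bin with right-closed intervals, lowest break included.
counts <- as.vector(table(cut(x, breaks, include.lowest = TRUE)))

dens_fuzzy <- counts / (n * diff(fuzzybreaks))  # widths from fuzzy breaks
dens_exact <- counts / (n * diff(breaks))       # widths from returned breaks

# Total area of the bins, measured with the breaks that hist() returns.
print(sum(dens_fuzzy * diff(breaks)), digits = 15)
print(sum(dens_exact * diff(breaks)), digits = 15)
```

Under these assumptions the first sum falls slightly short of 1, while
the second equals 1 up to rounding error.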

Best wishes
   Martin


--
Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 206
66123 Saarbruecken
Germany

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: hist.default()$density

Martin Becker
Dear developers,

since running 'example(hist)' produces
...
hist> sum(r$density * diff(r$breaks)) # == 1
[1] 0.9999999
...
I suppose that the current behaviour of hist() is neither intended nor
as documented.
So, please find attached (and inline below) a (trivial) patch for
hist.default().

Best wishes
  Martin


Index: src/library/graphics/R/hist.R
===================================================================
--- src/library/graphics/R/hist.R    (revision 51652)
+++ src/library/graphics/R/hist.R    (working copy)
@@ -111,7 +111,7 @@
     stop("negative 'counts'. Internal Error in C-code for \"bincount\"")
     if (sum(counts) < n)
     stop("some 'x' not counted; maybe 'breaks' do not span range of 'x'")
-    dens <- counts/(n*h)
+    dens <- counts/(n*diff(breaks))
     mids <- 0.5 * (breaks[-1L] + breaks[-nB])
     r <- structure(list(breaks = breaks, counts = counts,
             intensities = dens,




--
Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 217
66123 Saarbruecken
Germany


