Documentation for sd (stats) + suggestion

Documentation for sd (stats) + suggestion

PatrickT
I cannot file suggestions on Bugzilla, so I am writing here.

As far as I can tell, the manual help page for ``sd``

?sd

does not explicitly mention that the formula for the standard deviation is
the so-called "Bessel-corrected" formula (divide by n-1 rather than n).

I suggest it should be stated near the top.
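
For reference, the difference between the two denominators can be checked by hand. This is only a minimal sketch on made-up data; the variable names are illustrative.

```r
# Compare sd() with the n - 1 and n denominators computed by hand.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data with mean 5
n <- length(x)
ss <- sum((x - mean(x))^2)       # sum of squared deviations from the mean

sd_sample <- sqrt(ss / (n - 1))  # Bessel-corrected formula, as used by sd()
sd_pop    <- sqrt(ss / n)        # "population" formula

all.equal(sd(x), sd_sample)      # TRUE
sd_pop                           # 2
```

The two results differ only by the factor sqrt((n - 1) / n), so the gap shrinks as n grows.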

I would also suggest (feature request!) that either

 - a population standard deviation function, e.g. ``sdp`` or ``sd.p``, be
made available (that would be my preference)

or

 - the current ``sd`` be extended to accept a ``population=FALSE`` or
``sample=TRUE`` argument.

The same goes for the variance. Excel, Calc, etc. offer both versions.
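
The second option might look roughly like the sketch below. The name ``sd2`` and its ``population`` argument are hypothetical, chosen only for illustration; nothing like this exists in the stats package.

```r
# Hypothetical wrapper: 'sd2' and its 'population' argument are illustrative
# names, not part of base R or the stats package.
sd2 <- function(x, population = FALSE, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  n <- length(x)
  s <- stats::sd(x)              # existing behaviour: denominator n - 1
  if (population) {
    s * sqrt((n - 1) / n)        # rescale to denominator n
  } else {
    s
  }
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
sd2(x)                     # same as sd(x)
sd2(x, population = TRUE)  # 2, up to floating-point rounding
```

Defaulting to ``population = FALSE`` would keep existing code unchanged, which is presumably the main design constraint for any such extension.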

Motivation: I encourage my students to use R rather than Python (which has
become much more popular in recent years) on the grounds that it is easier
to get started with and is specialized for statistics. But then there is no
"population" formula for the standard deviation. And ``mode`` is not the
mode they expect: it returns the storage mode of an object, not the most
frequent value (btw I suggest adding a ``modes`` function to the core).
These are all things a beginner will look for in statistics software.
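
To illustrate the point about ``mode``, here is a minimal sketch of what a statistical-mode helper could look like. The name ``modes`` follows the suggestion above but is hypothetical; it is not an existing base R function, and the tie handling shown (return all tied values) is just one possible choice.

```r
mode(c(1, 2, 2, 3))   # "numeric": the storage mode, not the most frequent value

# Hypothetical helper: returns every value that attains the highest frequency,
# in order of first appearance. Assumes x has at least one non-missing value.
modes <- function(x) {
  x <- x[!is.na(x)]
  ux <- unique(x)                  # distinct values, original type preserved
  counts <- tabulate(match(x, ux)) # frequency of each distinct value
  ux[counts == max(counts)]
}

modes(c(1, 2, 2, 3, 3))    # 2 3
modes(c("a", "b", "a"))    # "a"
```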

Thanks for listening. And thanks for the great work!

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Documentation for sd (stats) + suggestion

S Ellison-2
> As far as I can tell, the manual help page for ``sd``
>
> ?sd
>
> does not explicitly mention that the formula for the standard deviation is
> the so-called "Bessel-corrected" formula (divide by n-1 rather than n).

See Details, where it says
"Details:

     Like 'var' this uses denominator n - 1.
"



Re: Documentation for sd (stats) + suggestion

Dario Strbenac-2
Good day,

It is implemented by the CRAN package multicon. The function is named popsd. But it does seem like something R should provide without creating a package dependency.

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

Re: Documentation for sd (stats) + suggestion

PatrickT
In reply to this post by S Ellison-2
Oh thanks, missed that. I expected the explanation to be near the top under
"Description." I may have scanned for the word "sample", which doesn't
appear. I could have searched harder. Apologies for the noise.

On Tue, Feb 19, 2019 at 5:59 PM S Ellison <[hidden email]> wrote:

> > As far as I can tell, the manual help page for ``sd``
> >
> > ?sd
> >
> > does not explicitly mention that the formula for the standard deviation
> is
> > the so-called "Bessel-corrected" formula (divide by n-1 rather than n).
>
> See Details, where it says
> "Details:
>
>      Like 'var' this uses denominator n - 1.
> "

Re: Documentation for sd (stats) + suggestion

PatrickT
In reply to this post by Dario Strbenac-2
Indeed. Thanks for your suggestions.

To elaborate briefly: the ``quantile`` function offers 9 methods (via its
``type`` argument). The ``sd`` function offers only one. The ``mad``
function offers ways to tweak the bias correction. The ``sd`` function
doesn't.
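
A small sketch of the contrast, on made-up data. It uses only the documented ``type`` argument of ``quantile`` and the ``constant`` argument of ``mad``; the variable names are illustrative.

```r
x <- c(1, 2, 4, 7, 11, 16)   # made-up data

# quantile() accepts type = 1, ..., 9 (default 7); the estimates can differ.
q25 <- sapply(1:9, function(t) {
  quantile(x, probs = 0.25, type = t, names = FALSE)
})
q25

# mad() exposes its scale factor (default 1.4826, which makes it consistent
# for the standard deviation under normality).
mad_default <- mad(x)
mad_raw     <- mad(x, constant = 1)   # plain median absolute deviation: 4

# sd() has no comparable argument.
args(sd)   # function (x, na.rm = FALSE)
```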

Are there good reasons against adding features to ``sd``? After all, it
must be one of the most popular statistics out there.

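Moreover, the default ``sd`` function, which divides by n-1, is not as well
founded as the variance is. The n-1 variance is unbiased, but its square
root is still a biased estimator of the standard deviation, noticeably so
for repeated small samples...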
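
The bias can be seen in a quick simulation. This is a rough sketch assuming standard normal data (true variance and standard deviation both 1) and samples of size 5; the seed and the number of replications are arbitrary choices.

```r
set.seed(1)
n    <- 5       # small sample size
reps <- 20000   # number of repeated samples

# Each row is one sample of size n from N(0, 1).
samples <- matrix(rnorm(n * reps), nrow = reps)

vars <- apply(samples, 1, var)  # n - 1 denominator
sds  <- sqrt(vars)              # what sd() would return for each row

mean(vars)  # close to 1: the n - 1 variance is unbiased
mean(sds)   # noticeably below 1 (theory gives about 0.94 for normal data, n = 5)
```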

While it's easy to roll your own function, I don't think we can expect
beginners to write something like:

sdp = function(x) sqrt(sum((x-mean(x))^2)/length(x))


On Wed, Feb 20, 2019 at 9:00 AM Dario Strbenac <[hidden email]>
wrote:

> Good day,
>
> It is implemented by the CRAN package multicon. The function is named
> popsd. But it does seem like something R should provide without creating a
> package dependency.
>
> --------------------------------------
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia