Different value between R variance and definition of variance


Different value between R variance and definition of variance

Tine-2
Hi!

Let us define a random variable:
 > x = seq(0,1,length=100)

If we calculate the variance following the definition E[(x-E(x))^2], we get:
 > mean( (x - mean(x))^2 ) # == mean(x^2) - mean(x)^2
0.08501684

And if we use the built-in R function var:
 > var(x)
0.08587559

Can anyone tell me why there is a difference?

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Different value between R variance and definition of variance

Stefano Calza-3
Hi!


Maybe because the sample variance has N-1 in the denominator (the degrees of freedom)?

so

all.equal((sum(x^2) - 100*mean(x)^2)/99, var(x)) ## TRUE

but

(sum(x^2) - 100*mean(x)^2)/100 # == your value
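
The two denominators can also be compared side by side. The following is only a minimal sketch: the names ss, var_n and var_nm1 are made up for illustration, and length.out is simply the full spelling of the length argument used above.

```r
# Sketch: the same sum of squared deviations, divided two ways.
x <- seq(0, 1, length.out = 100)
n <- length(x)

ss <- sum((x - mean(x))^2)   # sum of squared deviations from the mean

var_n   <- ss / n            # divide by n: the "definition" version
var_nm1 <- ss / (n - 1)      # divide by n - 1: what var() does

# Both comparisons should come out TRUE (up to floating-point tolerance).
print(all.equal(var_n, mean((x - mean(x))^2)))
print(all.equal(var_nm1, var(x)))
```

With n = 100 the two results differ by a factor of 100/99, which accounts for the gap between the two numbers in the original post.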


Stef

On Wed, Nov 28, 2007 at 09:56:58AM +0100, Tine wrote:
<Tine>Hi!
<Tine>
<Tine>Let us define random variable:
<Tine> > x = seq(0,1,length=100)
<Tine>
<Tine>If we calculate variance following definition E[(x-E(x))^2] we get:
<Tine> > mean( (x - mean(x))^2 ) # == mean(x^2) - mean(x)^2
<Tine>0.08501684
<Tine>
<Tine>And if we use internal R function var:
<Tine> > var(x)
<Tine>0.08587559
<Tine>
<Tine>Can anyone tells me why the difference?


Re: Different value between R variance and definition of variance

Daniel Nordlund
In reply to this post by Tine-2
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf
> Of Tine
> Sent: Wednesday, November 28, 2007 12:57 AM
> To: [hidden email]
> Subject: [R] Different value between R variance and definition of variance
>
> Hi!
>
> Let us define random variable:
>  > x = seq(0,1,length=100)
>
> If we calculate variance following definition E[(x-E(x))^2] we get:
>  > mean( (x - mean(x))^2 ) # == mean(x^2) - mean(x)^2
> 0.08501684
>
> And if we use internal R function var:
>  > var(x)
> 0.08587559
>
> Can anyone tells me why the difference?
>

I haven't seen a response, so I will chime in.  R calculates an unbiased estimate of the variance of the population from which your x is assumed to be a simple random sample of size n (in your case n=100).  That means dividing the sum of squared deviations by n-1 rather than n.  See any basic book on statistics.  So, using your formula, R "in effect" calculates

     mean((x-mean(x))^2) * n/(n-1)
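
For anyone who wants to check the n/(n-1) rescaling, or who needs the divide-by-n version, something along these lines should work. It is only a sketch: pop_var is a hypothetical helper written for this post, not a function in base R.

```r
# pop_var is a hypothetical helper, not part of base R:
# it undoes the n - 1 denominator that var() uses.
pop_var <- function(x) {
  n <- length(x)
  var(x) * (n - 1) / n
}

x <- seq(0, 1, length.out = 100)
n <- length(x)

# Rescaling the divide-by-n variance by n/(n-1) should reproduce var(x).
print(all.equal(mean((x - mean(x))^2) * n / (n - 1), var(x)))

# Going the other way recovers the value computed from the definition.
print(all.equal(pop_var(x), mean((x - mean(x))^2)))
```

As n grows, the factor n/(n-1) approaches 1, so the two versions only differ noticeably for small samples.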

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
