How to simulate correlated data

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

How to simulate correlated data

Lisa Wang
Hello there,

I would like to simulate X --Normal (20, 5)
                         Y-- Normal (40, 10)

and the correlation between X and Y is 0.6. How do I do it in R?

Thank you very much

Lisa Wang Msc.
Princess Margaret Hospital
Toronto, Ca

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to simulate correlated data

Doran, Harold
See ?mvrnorm

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Lisa Wang
Sent: Thursday, December 15, 2005 10:34 AM
To: R-Help
Subject: [R] How to simulate correlated data

Hello there,

I would like to simulate X --Normal (20, 5)
                         Y-- Normal (40, 10)

and the correlation between X and Y is 0.6. How do I do it in R?

Thank you very much

Lisa Wang Msc.
Princess Margaret Hospital
Toronto, Ca

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to simulate correlated data

Robin Hankin
In reply to this post by Lisa Wang
Hi

you need

library(mvtnorm)

then

a <- rmvnorm(n=10000,mean=c(20,40),sigma=matrix(c(5,0.6*sqrt(50),
0.6*sqrt(50),10),2,2))



will do what you want


HTH

rksh



On 15 Dec 2005, at 15:33, Lisa Wang wrote:

> Hello there,
>
> I would like to simulate X --Normal (20, 5)
>                          Y-- Normal (40, 10)
>
> and the correlation between X and Y is 0.6. How do I do it in R?
>
> Thank you very much
>
> Lisa Wang Msc.
> Princess Margaret Hospital
> Toronto, Ca
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting- 
> guide.html

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to simulate correlated data

Berwin A Turlach
In reply to this post by Lisa Wang
G'day Lisa,

>>>>> "LW" == Lisa Wang <[hidden email]> writes:

    LW> I would like to simulate X --Normal (20, 5) Y-- Normal (40,
    LW> 10) and the correlation between X and Y is 0.6.

    LW> How do I do it in R?
That depends on what you want the joint distribution to be. :)

If you want the joint distribution to be normal, you could use the
function mvrnorm() from the MASS package.

HTH.

Cheers,

        Berwin

========================== Full address ============================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)  
School of Mathematics and Statistics        +61 (8) 6488 3383 (self)      
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway                  
Crawley WA 6009                e-mail: [hidden email]
Australia                        http://www.maths.uwa.edu.au/~berwin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to simulate correlated data

Kristel Joossens
In reply to this post by Lisa Wang
So what you actually wnat is a multivariate normal distribution!
with mean c(20,40) and covariance matrix
cbind(c(5,0.6*sqrt(5,10)),c(0.6*sqrt(5,10),10))
[Since Corr(x,y) = Cov(x,y)/sqrt(Var(x)*Var(y))


Look at the mvtnorm package, for function rmvnorm


Trying RSiteSearch("Multivariate normal distribution")
should also bring you to the package

Best regrads,
Kristel



Lisa Wang wrote:

> Hello there,
>
> I would like to simulate X --Normal (20, 5)
>                          Y-- Normal (40, 10)
>
> and the correlation between X and Y is 0.6. How do I do it in R?
>
> Thank you very much
>
> Lisa Wang Msc.
> Princess Margaret Hospital
> Toronto, Ca
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
__________________________________________
Kristel Joossens        Ph.D. Student
Research Center ORSTAT  K.U. Leuven
Naamsestraat 69         Tel: +32 16 326929
3000 Leuven, Belgium    Fax: +32 16 326732
E-mail:  [hidden email]
http://www.econ.kuleuven.be/public/ndbae49

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How to simulate correlated data

Ted.Harding
In reply to this post by Lisa Wang
On 15-Dec-05 Lisa Wang wrote:
> Hello there,
>
> I would like to simulate X --Normal (20, 5)
>                          Y-- Normal (40, 10)
>
> and the correlation between X and Y is 0.6. How do I do it in R?

... and, as well as using mvrnorm (MASS) or rmvnorm (mvtnorm),
as have been suggested, you could simply do it "by hand":

If U, V are independent and N(0,1), then

  E(U + a*V)*(U - a*V) = 1 - a^2

  E(U+a*V)^2 = E(U - a*V) = 1 + a*2

so the correlation between (U + a*V) and U - a*V) is

  r = (1 - a^2)/(1 + a^2)

Hence, for -1 < r < 1, choose

  a = sqrt((1 - r)/(1 + r))

which, for r = 0.6, gives a = sqrt(0.4/1.6) = sqrt(1/4) = 1/2
(how nice! ... ).

Then Var(U + a*V) = 1 + a^2 = 1 + 1/4 = 5/4 (I smell more smooth
numbers coming ... ).

Then, since the correlation between two variables is unchanged
if you add a constant to either, or multiply either by a constant,
you can give (U + a*V) variance 5 by multiplying it by 2, and
give (U - a*V) variance 10 by multiplying by 2*sqrt(2), both still
having expectation 0. So finally add 10 and 20:

  X = 10 + 2*(U + V/2) ;  Y = 20 + 2*sqrt(2)*(U - V/2)

So you can get U and V by sampling from rnorm(), and then X and Y
as described.

(Which is how I used to do it before starting to use R, e.g. in
matlab/octave).

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 15-Dec-05                                       Time: 17:04:18
------------------------------ XFMail ------------------------------

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Loading...