On Sun, Apr 24, 2011 at 07:00:26PM -0400, Shane Phillips wrote:

> Hi, R-Helpers!

>

> I have a dataframe that contains a binomial variable. I need to add another random variable drawn from a normal distribution with a specific mean and standard deviation. This variable also needs to be correlated with the existing binomial variable with a specific correlation (say .75). Any ideas?

Hi.

If X, Y are dependent random variables and we want to generate y, so

that (x, y) is a pair from their joint distribution with known x,

then y should be generated from the conditional distribution P(Y|X=x).

If the probability P(X=x) is not too small, then this may be done by

rejection sampling: Generate pairs (X, Y) until the condition X=x is

satisfied and use the corresponding Y.
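
A rough sketch of the rejection step in R, assuming some pair generator is available. The function rpair() below is a hypothetical placeholder, not part of any package: it only thresholds a standard normal so that the example runs. The names rcond_y() and max_tries are likewise made up for illustration.

```r
# Hypothetical pair generator: returns one draw c(x, y) from some joint
# distribution. This toy version sets x = 1 when y > 0, else x = 0.
rpair <- function() {
  y <- rnorm(1)
  c(x = as.numeric(y > 0), y = y)
}

# Rejection sampling: draw pairs until X equals the observed x,
# then return the corresponding Y.
rcond_y <- function(x_obs, rpair, max_tries = 10000L) {
  for (i in seq_len(max_tries)) {
    p <- rpair()
    if (p[["x"]] == x_obs) return(p[["y"]])
  }
  stop("no pair with X == x_obs found; P(X = x) may be too small")
}

set.seed(1)
x_known <- c(0, 1, 1, 0, 1)   # stands in for the existing column
y_new <- vapply(x_known, rcond_y, numeric(1), rpair = rpair)
```

The expected number of tries for a given x is 1 / P(X = x), which is why the method is only practical when that probability is not too small.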

It remains to generate pairs (X, Y), where Y is a normal variable

and X a binomial one. The parameters of Y are known, the parameters

of X should be chosen somehow and the correlation of X and Y is

known. I suggest the following. Compute the distribution of X as a

vector of probabilities p_0, ..., p_n (see ?dbinom). Find a nondecreasing

function f() from reals to {0, ..., n} such that f(Y) has distribution

p_0, ..., p_n. The function may be determined by a sequence of

cutpoints a_1, ..., a_n defining f(y) as follows

  y                f(y)
  (-infty, a_1)    0
  [a_1, a_2)       1
  ...
  [a_n, infty)     n

For each i, the cutpoint a_i is the (p_0 + ... + p_{i-1})-quantile of Y

(see ?qnorm). See ?cut for computing f().
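
The cutpoints and f() might be computed along the following lines. The values of n, prob, mu and sigma are assumptions chosen only for illustration; they do not come from the original question.

```r
# Assumed example parameters (not from the original question)
n <- 5; prob <- 0.4        # binomial size and success probability of X
mu <- 10; sigma <- 2       # mean and standard deviation of Y

p <- dbinom(0:n, size = n, prob = prob)            # p_0, ..., p_n
a <- qnorm(cumsum(p)[1:n], mean = mu, sd = sigma)  # cutpoints a_1, ..., a_n

# f(y): index of the interval containing y, mapped to 0..n.
# right = FALSE makes the intervals closed on the left, [a_i, a_{i+1}).
f <- function(y) {
  as.integer(cut(y, breaks = c(-Inf, a, Inf), right = FALSE)) - 1L
}

set.seed(1)
y <- rnorm(1e5, mean = mu, sd = sigma)
x <- f(y)

# Empirical distribution of f(Y) next to the target probabilities
round(rbind(empirical = tabulate(x + 1L, nbins = n + 1) / length(y),
            target    = p), 3)
```

findInterval(y, a) gives the same values as f(y) here and may be more convenient than cut(), since it returns integers directly.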

The pair (f(Y), Y) has the required marginal distributions and, in my

opinion, the maximal possible correlation. If this correlation is lower

than the requested one, then I think there is no solution.
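
The correlation of (f(Y), Y) can be estimated by simulation before going further. The sketch below uses a standard normal Y, since correlation does not change when Y is shifted or rescaled. The function name max_cor() and the parameter values are assumptions for illustration. For a binary X with success probability 0.5 the value can also be computed directly as dnorm(0) / 0.5, about 0.798, so a requested correlation of .75 is feasible in that case but close to the limit.

```r
# Estimate the correlation of (f(Y), Y) by simulation.
max_cor <- function(n, prob, nsim = 1e5) {
  p <- dbinom(0:n, size = n, prob = prob)
  a <- qnorm(cumsum(p)[1:n])      # cutpoints a_1, ..., a_n for standard normal Y
  y <- rnorm(nsim)
  x <- findInterval(y, a)         # equals f(y): 0 below a_1, i on [a_i, a_{i+1})
  cor(x, y)
}

set.seed(1)
rho_binary <- max_cor(n = 1, prob = 0.5)   # binary X
rho_binom  <- max_cor(n = 5, prob = 0.4)   # assumed example parameters
c(binary = rho_binary, binomial = rho_binom)
```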

If the correlation of (f(Y), Y) is at least the required one, then use

a mixture of the distribution (f(Y), Y) and (X, Y), where X has the

required marginal distribution of X, but is generated independently

from Y. The mixture parameter may be determined as a solution of an

equation with one variable.
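
A sketch of the mixture step, under my reading of it. Both components have the same marginal distributions and the independent component has zero covariance, so the correlation of the mixture is lambda times the correlation of (f(Y), Y), where lambda is the probability of using the (f(Y), Y) component. Under that reading the equation is linear and lambda = rho / rho_max. All parameter values below are assumptions for illustration.

```r
set.seed(1)
n <- 1; prob <- 0.5          # assumed: binary X
mu <- 10; sigma <- 2         # assumed mean and sd of Y
rho <- 0.75                  # requested correlation
N <- 1e5

p <- dbinom(0:n, size = n, prob = prob)
a <- qnorm(cumsum(p)[1:n], mean = mu, sd = sigma)   # cutpoints

# Correlation of the pair (f(Y), Y), estimated by simulation
y0 <- rnorm(N, mu, sigma)
rho_max <- cor(findInterval(y0, a), y0)
stopifnot(rho <= rho_max)    # otherwise this construction gives no solution

lambda <- rho / rho_max      # mixture weight of the (f(Y), Y) component

# Draw from the mixture: with probability lambda use x = f(y),
# otherwise draw x from the binomial distribution independently of y.
y <- rnorm(N, mu, sigma)
dep <- runif(N) < lambda
x <- ifelse(dep, findInterval(y, a), rbinom(N, size = n, prob = prob))

cor(x, y)                    # should be close to rho
```

Pairs generated this way can then serve as the (X, Y) pairs in the rejection step, keeping Y from the first pair whose X matches the value already in the data frame.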

Hope this helps.

Petr Savicky.

______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.