Dear R-users,

I need your help with a problem in the code below. I want to simulate two
different samples, R1 and R2, each with 10 variables and 1000 observations.
The variables should be highly correlated within R1 and within R2, with no
correlation between R1 and R2. I also have a problem with the correlation
coefficient between two dichotomous variables: R supports only these types
of correlation coefficient: Pearson, Spearman and Kendall.

Thanks a lot in advance,

Thanoon

ords <- seq(0,1)

p <- 10

N <- 1000

percent_change <- 0.9

R1 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))

R2 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))

# for two 0/1 variables, Pearson's r is the phi coefficient

cor(R1, R2, method = "pearson")
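On the question of a coefficient for two dichotomous variables: for 0/1 data, Pearson's r is the same quantity as the phi coefficient computed from the 2x2 table, so cor(..., method = "pearson") already covers that case. A minimal sketch to check this; the names x, y, tab and phi, the seed and the 80% agreement rate are made up for illustration.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only
x <- sample(0:1, 1000, replace = TRUE)
y <- ifelse(runif(1000) < 0.8, x, 1 - x)     # y agrees with x about 80% of the time

# 2x2 contingency table; fixed factor levels keep both rows and both columns
tab <- table(factor(x, levels = 0:1), factor(y, levels = 0:1))
n00 <- as.numeric(tab[1, 1]); n01 <- as.numeric(tab[1, 2])
n10 <- as.numeric(tab[2, 1]); n11 <- as.numeric(tab[2, 2])

# phi = (n11*n00 - n10*n01) / sqrt(product of the four marginal totals)
phi <- (n11 * n00 - n10 * n01) / sqrt(prod(rowSums(tab), colSums(tab)))

all.equal(phi, cor(x, y, method = "pearson"))
```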

# take one column from each sample as the basis for a stronger correlation

v1 <- R1[, 1, drop = FALSE]

v2 <- R2[, 1, drop = FALSE]

# randomly choose which rows to retain

keep <- sample(nrow(v1), size = round(percent_change * nrow(v1)))

change <- setdiff(seq_len(nrow(v1)), keep)

# randomly draw new values for the rows being changed

new.change <- sample(ords, length(change), replace = TRUE)

# replace values in copy of original column

v1.samp <- v1

v1.samp[change,] <- new.change

# closer correlation

cor(v1, v1.samp, method = "pearson")
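As a rough check on what "closer correlation" means here: if a 0/1 column with equal probabilities is copied and a fraction 1 - percent_change of its rows is redrawn at random, the Pearson correlation between the original and the copy is about percent_change on average. A sketch under that assumption; perturb_copy, fracs and obs are hypothetical names, not part of the code above.

```r
# Hypothetical helper: copy x and redraw a fraction (1 - keep_frac) of its entries
perturb_copy <- function(x, keep_frac) {
  n_change <- length(x) - round(keep_frac * length(x))
  idx <- sample(length(x), n_change)               # positions to redraw
  x[idx] <- sample(0:1, n_change, replace = TRUE)  # new values, independent of x
  x
}

set.seed(42)                                       # arbitrary seed
x <- sample(0:1, 10000, replace = TRUE)
fracs <- c(0.5, 0.7, 0.9)
obs <- sapply(fracs, function(f) cor(x, perturb_copy(x, f)))

# observed correlations should sit near the retained fractions
round(cbind(keep_frac = fracs, correlation = obs), 2)
```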

# set correlated column as one of your other columns

R1[,2] <- v1.samp

R2[,2] <- v1.samp

R1

R2
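One possible way to get the structure asked for above (strong correlation among the variables within R1 and within R2, none between the two samples) is to give each sample its own latent normal factor and dichotomize at zero. This is only a sketch under that assumption, not the only approach. sim_block, rho and the column names are illustrative. Note that rho is the correlation on the latent normal scale; the correlation between the resulting 0/1 columns is lower, about (2/pi) * asin(rho), roughly 0.71 for rho = 0.9.

```r
# Hypothetical helper: n x p data frame of 0/1 variables sharing one latent factor
sim_block <- function(n, p, rho) {
  f <- rnorm(n)                             # factor shared by all p columns
  e <- matrix(rnorm(n * p), nrow = n)       # column-specific noise
  z <- sqrt(rho) * f + sqrt(1 - rho) * e    # latent normals, pairwise cor = rho
  as.data.frame((z > 0) * 1L)               # dichotomize at zero
}

set.seed(123)                               # arbitrary seed
R1 <- sim_block(1000, 10, rho = 0.9)
R2 <- sim_block(1000, 10, rho = 0.9)        # separate draw, so unrelated to R1
names(R1) <- paste0("a", 1:10)
names(R2) <- paste0("b", 1:10)

within1 <- cor(R1)
between <- cor(R1, R2)
range(within1[upper.tri(within1)])          # strong within-sample correlation
range(between)                              # close to zero
```

Raising or lowering rho moves the within-sample correlation; because the two calls to sim_block use independent random draws, the between-sample correlations differ from zero only by sampling noise.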


______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.