On 09/08/2011 12:02 AM, dadrivr wrote:

> Hi,

>

> I'm trying to reshape my data set from wide to tall format for multilevel

> modeling. Unfortunately, the function I typically use (make.univ from the

> multilevel package) does not appear to work with unbalanced data frames,

> which is what I'm dealing with.

>

> Below is an example of the columns of a data frame similar to what I'm

> working with:

> ID a1 a2 a4 b2 b3 b4 b5 b6

>

> Below is what I want the columns to be after reshaping the data to long

> format:

> ID a b time

>

> Here is an example data frame that I want to reshape:

> ID<- c(1,2,3)

> a1<- c(NA, rnorm(2))

> a2<- c(NA, rnorm(1), NA)

> a4<- c(NA, rnorm(2))

> b2<- c(rnorm(2), NA)

> b3<- rnorm(3)

> b4<- NA

> b5<- rnorm(3)

> b6<- rnorm(3)

> mydata<- as.data.frame(cbind(ID,a1,a2,a4,b2,b3,b4,b5,b6))

>

> What is the best way to do this efficiently with MANY variables with widely

> differing time ranges? Note that I will have to manually enter the time for

> a given measurement because in the wide format, the time is in the variable

> names. By the way, I have a fairly large data set, with some variables

> occurring at 2 time points and other variables occurring at 20 time points.

> Thanks for your help!

>

Hi dadrivr,

I think you can do what you want using the rep_n_stack function in the

prettyR package. If you want a data frame at the end, you will have to

pad out your input data frame so that the lengths of the columns will be

equal. You'll get lots of NAs, but without them, you won't get a data frame.

mydata$a3<-NA

mydata$a5<-NA

mydata$a6<-NA

mydata$b1<-NA

mydata

Now you have equal numbers of "a" and "b" columns. To reshape this into

three columns is easy:

rep_n_stack(mydata,to.stack=c("a1","a2","a3","a4","a5","a6",

"b1","b2","b3","b4","b5","b6"),stack.names=c("ab","time"))

If you want the "a" and "b" columns separate, try this:

rep_n_stack(mydata,to.stack=matrix(c(2,3,10,4,11,12,13,5,6,7,8,9),nrow=2,

byrow=TRUE),stack.names=c("a","time","b","time"))

Currently you have to pass the column indices directly to get the

correct order in the output. I hadn't anticipated the missing column

problem when I wrote the function.

Jim

______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide

http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.