split the data.frame

YIHSU CHEN
Dear R folks:

I wonder whether anyone has an elegant way of doing what I need to do.

I have a data frame with four columns: V1, V2, A1 and A2:

V1  V2  A1   A2
A    B    1.2  2.0
A    D    1.2  4.0
A    C    2.4  2.2

What I need to do is convert it into the following data frame with a new column x, where x is just A1 and A2 stacked, with each value placed next to its respective V1 and V2 in the first two columns:

V1  V2   x
A    B    1.2
A    B    2.0
A    D    1.2
A    D    4.0
A    C    2.4
A    C    2.2

I wonder whether there is an efficient way to do it, since I have a huge dataset.

Thank you very much

Yihsu

Yihsu Chen
The Johns Hopkins University

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: split the data.frame

Gabor Grothendieck
Try melt from the reshape package:

library(reshape)
melt(DF, 1:2)

You may need to re-sort the result if the order is important.
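For readers without the reshape package, the stacking and re-sorting can be sketched in base R. This is only an illustration under stated assumptions: the data frame name DF comes from the call above, while stacked, row_id, col_id and result are hypothetical names introduced here. As far as I know, melt() stacks one measured column after another (all of A1, then all of A2), so the sketch builds that order first and then interleaves the rows to match the layout requested in the original post.

```r
# Example data from the original post
DF <- data.frame(V1 = rep("A", 3),
                 V2 = c("B", "D", "C"),
                 A1 = c(1.2, 1.2, 2.4),
                 A2 = c(2.0, 4.0, 2.2),
                 stringsAsFactors = FALSE)

# Stack A1 on top of A2, repeating the id columns once per value column
stacked <- data.frame(V1 = rep(DF$V1, times = 2),
                      V2 = rep(DF$V2, times = 2),
                      x  = c(DF$A1, DF$A2),
                      stringsAsFactors = FALSE)

# Re-sort so each original row's A1 value is followed by its A2 value
row_id <- rep(seq_len(nrow(DF)), times = 2)  # original row of each stacked value
col_id <- rep(1:2, each = nrow(DF))          # 1 = came from A1, 2 = came from A2
result <- stacked[order(row_id, col_id), ]
rownames(result) <- NULL

print(result)
```

Sorting on an explicit row index, rather than on V1 and V2, keeps the original row order (B, D, C) instead of alphabetizing it.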




Re: split the data.frame

Robert Citek
On May 15, 2006, at 10:45 PM, YIHSU CHEN wrote:

> I wonder whether anyone has an elegant way of doing what I need to do.
>
> I have a data frame with four columns: V1, V2, A1 and A2:
>
> V1  V2  A1   A2
> A    B    1.2  2.0
> A    D    1.2  4.0
> A    C    2.4  2.2
>
> What I need to do is convert it into the following data frame
> with a new column x, where x is just A1 and A2 stacked, with each
> value placed next to its respective V1 and V2 in the first two columns:
>
> V1  V2   x
> A    B    1.2
> A    B    2.0
> A    D    1.2
> A    D    4.0
> A    C    2.4
> A    C    2.2
>
> I wonder whether there is an efficient way to do it, since I have
> a huge dataset.

How big is huge?  Also, what operating system are you using?

If your data set is really big, i.e. bigger than R can handle in  
memory, then you might want to write the data frame to disk,  
manipulate it there, and then read it back in.

For example:

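# Example data from the original post
myDF <- data.frame(V1 = rep("A", 3), V2 = c("B", "D", "C"),
                   A1 = c(1.2, 1.2, 2.4), A2 = c(2, 4, 2.2))
# Write the (V1, V2, A1) columns, then append the (V1, V2, A2) columns
write.table(subset(myDF, select = c(V1, V2, A1)), file = "foo.txt",
            row.names = FALSE, col.names = FALSE)
write.table(subset(myDF, select = c(V1, V2, A2)), file = "foo.txt",
            row.names = FALSE, col.names = FALSE, append = TRUE)
# Read the stacked file back in as a three-column data frame
newDF <- read.table("foo.txt", col.names = c("V1", "V2", "x"))
head(newDF, 10)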

There's also the operating system solution if using Linux or
Cygwin/Windows:

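# Same example data, written out tab-delimited so cut can split the fields
myDF <- data.frame(V1 = rep("A", 3), V2 = c("B", "D", "C"),
                   A1 = c(1.2, 1.2, 2.4), A2 = c(2, 4, 2.2))
write.table(myDF, file = "foo.txt", sep = "\t", na = "",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
# Emit fields 1,2,3 and then fields 1,2,4 into a single stacked file
system("{ cut -f1,2,3 foo.txt ; cut -f1,2,4 foo.txt ; } > bar.txt")
newDF <- read.table("bar.txt", sep = "\t", col.names = c("V1", "V2", "x"))
head(newDF, 10)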
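
Where cut is not available, the same on-disk stacking can be sketched in R alone by streaming the tab-delimited file in chunks, so that only chunk_size lines are held in memory at a time. This is a rough sketch, not a tuned solution: stack_file() and its arguments are hypothetical names introduced here, and it assumes a tab-delimited file with no header and no empty fields.

```r
# Hypothetical helper: for each value column, re-read the input file in
# chunks and append the id columns plus that value column to the output.
stack_file <- function(infile, outfile, id_cols = 1:2, value_cols = 3:4,
                       chunk_size = 10000) {
  out <- file(outfile, open = "w")
  on.exit(close(out), add = TRUE)
  for (vc in value_cols) {
    inp <- file(infile, open = "r")
    repeat {
      lines <- readLines(inp, n = chunk_size)
      if (length(lines) == 0) break
      # Assumes no empty fields: strsplit() drops a trailing empty field
      fields <- strsplit(lines, "\t", fixed = TRUE)
      kept <- vapply(fields,
                     function(f) paste(f[c(id_cols, vc)], collapse = "\t"),
                     character(1))
      writeLines(kept, out)
    }
    close(inp)
  }
  invisible(outfile)
}

# Usage with the example data, written to temporary files
myDF <- data.frame(V1 = rep("A", 3), V2 = c("B", "D", "C"),
                   A1 = c(1.2, 1.2, 2.4), A2 = c(2, 4, 2.2))
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
write.table(myDF, file = infile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
stack_file(infile, outfile)
newDF <- read.table(outfile, sep = "\t", col.names = c("V1", "V2", "x"))
print(newDF)
```

Like the cut version, this writes all of the A1 rows followed by all of the A2 rows, so the result still needs sorting if the interleaved order matters.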

Please post back letting us know what worked for you.

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software.  Distribute FLOSS
for Windows, Linux, *BSD, and MacOS X with BitTorrent
