Add rows to dataframe by split values

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Add rows to dataframe by split values

Bart Joosen
This post has NOT been accepted by the mailing list yet.
Hi,

I have a dataframe with some data from experiments:
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1         A     6,5-6,7     6,00    .....
  1         B     95,1-95,3  95,00  .....
  2         A     6,5-6,7     6,00    .....
...

Some results have double values, instead of 1 single numeric value, all separated by "-".
But now I want a dataframe with only 1 value per Batch/Test, and batches with double values should be renamed to 1a, 1b, ...

So
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1a        A     6,5
  1b        A     6,7     6,00    .....
  1a        B     95,1  95,00  .....
  1b        B     95,3  95,00  .....
  2a        A     6,5     6,00    .....
  2b        A     6,7     6,00    .....
...

I tried with by and strsplit, but couldn't figure it out.
Any help would be appreciated, example data below.

Bart


dat <-
structure(list(Batch = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L), Test = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), Result = structure(c(12L,
4L, 8L, 6L, 11L, 5L, 10L, 2L, 9L, 3L, 1L, 7L), .Label = c("6,604 - 6,59 - 6,585 - 6,588",
"6,61", "6,625 - 6,607", "6,675", "6,71 - 6,74", "6,76 - 6,75",
"6,79 - 6,80", "98,28 - 99,14", "98,52", "99,05 - 99,15", "99,74 - 98,19",
"99,77 - 99,62"), class = "factor"), X = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA), LL = structure(c(2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L), .Label = c("6,00", "95"), class = "factor")), .Names = c("Batch",
"Test", "Result", "X", "LL"), class = "data.frame", row.names = c(NA,
-12L))
Reply | Threaded
Open this post in threaded view
|

Re: Add rows to dataframe by split values

jayesh.baviskar
Hello Bart,

I solved your problem.

Please find the code given below,

****************************************************************************************************************************************
dat <-structure(list(Batch = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L), Test = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), Result = structure(c(12L, 4L, 8L, 6L, 11L, 5L, 10L, 2L, 9L, 3L, 1L, 7L), .Label = c("6,604 - 6,59 - 6,585 - 6,588",  "6,61", "6,625 - 6,607", "6,675", "6,71 - 6,74", "6,76 - 6,75", "6,79 - 6,80", "98,28 - 99,14", "98,52", "99,05 - 99,15", "99,74 - 98,19",  "99,77 - 99,62"), class = "factor"), X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), LL = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L), .Label = c("6.00", "95"), class = "factor")), .Names = c("Batch","Test", "Result", "X", "LL"), class = "data.frame", row.names = c(NA, -12L))

dat1<-NULL
rno<-0
for (thisResult in dat$Result)
{
  rno<-rno+1
  resTemp<-unlist(strsplit(thisResult," - "))
  i<-length(resTemp)
  if(i>1)
  {
    for(j in 1:i)
    {
      rbind(dat1,data.frame(Batch=paste(dat$Batch[rno] ,".",j),Test=dat$Test[rno],Result=resTemp[j],X=dat$X[rno],LL=dat$LL[rno]))->dat1
      j=j+1
    }
  }
  else
  {
    rbind(dat1,data.frame(Batch=toString(dat$Batch[rno]),Test=dat$Test[rno],Result=dat$Result[rno],X=dat$X[rno],LL=dat$LL[rno]))->dat1
  }
}
dat1
rm(dat1,i,j,resTemp,thisResult,rno)
***************************************************************************************************************************************

Output :

        Batch  Test      Result    X     LL
1        1 . 1    A        99,77    NA    95
2        1 . 2    A        99,62    NA    95
3        1         B        6,675    NA    6.00
4        2 . 1    A        98,28    NA    95
5        2 . 2    A        99,14    NA    95
6        2 . 1    B        6,76      NA    6.00
7        2 . 2    B        6,75      NA    6.00
8        3 . 1    A        99,74    NA    95
9        3 . 2    A        98,19    NA    95
10      3 . 1    B        6,71      NA    6.00
11      3 . 2    B        6,74      NA    6.00
12      4 . 1    A        99,05    NA    95
13      4 . 2    A        99,15    NA    95
14      4         B        6,61      NA    6.00
15      4         A        98,52    NA    95
16      4 . 1    B        6,625    NA    6.00
17      4 . 2    B        6,607    NA    6.00
18      4 . 1    B        6,604    NA    6.00
19      4 . 2    B        6,59      NA    6.00
20      4 . 3    B        6,585    NA    6.00
21      4 . 4    B        6,588    NA    6.00
22      4 . 1    B        6,79      NA    6.00
23      4 . 2    B        6,80      NA    6.00


Thanks & Regards,

Mr. Jayesh B. Baviskar
Pune, India.

On 2/20/2014 2:34 PM, Bart Joosen [via R] wrote:
Hi,

I have a dataframe with some data from experiments:
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1         A     6,5-6,7     6,00    .....
  1         B     95,1-95,3  95,00  .....
  2         A     6,5-6,7     6,00    .....
...

Some results have double values, instead of 1 single numeric value, all separated by "-".
But now I want a dataframe with only 1 value per Batch/Test, and batches with double values should be renamed to 1a, 1b, ...

So
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1a        A     6,5
  1b        A     6,7     6,00    .....
  1a        B     95,1  95,00  .....
  1b        B     95,3  95,00  .....
  2a        A     6,5     6,00    .....
  2b        A     6,7     6,00    .....
...

I tried with by and strsplit, but couldn't figure it out.
Any help would be appreciated, example data below.

Bart


dat <-
structure(list(Batch = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L), Test = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), Result = structure(c(12L,
4L, 8L, 6L, 11L, 5L, 10L, 2L, 9L, 3L, 1L, 7L), .Label = c("6,604 - 6,59 - 6,585 - 6,588",
"6,61", "6,625 - 6,607", "6,675", "6,71 - 6,74", "6,76 - 6,75",
"6,79 - 6,80", "98,28 - 99,14", "98,52", "99,05 - 99,15", "99,74 - 98,19",
"99,77 - 99,62"), class = "factor"), X = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA), LL = structure(c(2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L), .Label = c("6,00", "95"), class = "factor")), .Names = c("Batch",
"Test", "Result", "X", "LL"), class = "data.frame", row.names = c(NA,
-12L))



If you reply to this email, your message will be added to the discussion below:
http://r.789695.n4.nabble.com/Add-rows-to-dataframe-by-split-values-tp4685581.html
To start a new topic under R help, email [hidden email]
To unsubscribe from R, click here.
NAML

Reply | Threaded
Open this post in threaded view
|

Re: Add rows to dataframe by split values

arun kirshna
In reply to this post by Bart Joosen
Hi,
Try:
library(stringr)

res <- transform(dat[rep(1:nrow(dat),str_count(dat$Result,"-")+1),],Result=unlist(strsplit(as.character(dat$Result)," - ")),Batch=ave(Batch,Batch,FUN=function(x) paste0(x,letters[seq_along(x)])))


 row.names(res) <- 1:nrow(res)

# Based on the expected results, it could be also:
res1 <-  transform(dat[rep(1:nrow(dat),str_count(dat$Result,"-")+1),],Result=unlist(strsplit(as.character(dat$Result)," - ")),Batch=ave(Batch,Batch,Test,FUN=function(x) paste0(x,letters[seq_along(x)])))

 row.names(res1) <- 1:nrow(res1)
A.K.



Hi,

I have a dataframe with some data from experiments:
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1         A     6,5-6,7     6,00    .....
  1         B     95,1-95,3  95,00  .....
  2         A     6,5-6,7     6,00    .....
...

Some results have double values, instead of 1 single numeric value, all separated by "-".
But now I want a dataframe with only 1 value per Batch/Test, and batches with double values should be renamed to 1a, 1b, ...

So
Batch   Test  Result        LL        ......(many more columns, all the same within Batch/Test)
  1a        A     6,5
  1b        A     6,7     6,00    .....
  1a        B     95,1  95,00  .....
  1b        B     95,3  95,00  .....
  2a        A     6,5     6,00    .....
  2b        A     6,7     6,00    .....
...

I tried with by and strsplit, but couldn't figure it out.
Any help would be appreciated, example data below.

Bart

dat <-
structure(list(Batch = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L), Test = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), Result = structure(c(12L,
4L, 8L, 6L, 11L, 5L, 10L, 2L, 9L, 3L, 1L, 7L), .Label = c("6,604 - 6,59 - 6,585 - 6,588",
"6,61", "6,625 - 6,607", "6,675", "6,71 - 6,74", "6,76 - 6,75",
"6,79 - 6,80", "98,28 - 99,14", "98,52", "99,05 - 99,15", "99,74 - 98,19",
"99,77 - 99,62"), class = "factor"), X = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA), LL = structure(c(2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L), .Label = c("6,00", "95"), class = "factor")), .Names = c("Batch",
"Test", "Result", "X", "LL"), class = "data.frame", row.names = c(NA,
-12L))

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.