Apply function to one specific column / Alternative to for loop


Stageexp
Hi guys, I am a total newbie to R, so I hope this isn't a totally dumb question. I have a data frame in which each title sits in a row of its own, with the corresponding values in the rows below it. Let's take this example:

test_df <- data.frame(cbind(titel = "", x = 4:5, y = 1:2))
test_df = rbind(cbind(titel="1.Test", x="", y=""), test_df, cbind(titel="2.Test", x="", y=""), test_df, cbind(titel="3.Test", x="", y=""), test_df)

test_df
   titel x y
1 1.Test    
2        4 1
3        5 2
4 2.Test    
5        4 1
6        5 2
7 3.Test    
8        4 1
9        5 2

What I want to have is:
   titel x y
2 1.Test 4 1
3 1.Test 5 2
5 2.Test 4 1
6 2.Test 5 2
8 3.Test 4 1
9 3.Test 5 2

In my example, a title appears every third line, but in my real data there is no such pattern. Each title is followed by at least one data row, and the number of data rows varies from title to title.

I was able to solve my problem in a for loop with the following code:
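test_df$titel <- as.character(test_df$titel)  # make sure titel is character, not factor
for (i in seq_len(nrow(test_df)))
{
  # an empty title means "same title as the row above"; row 1 has no row above
  if (i > 1 && nchar(test_df$titel[i]) == 0) {
    test_df$titel[i] <- test_df$titel[i - 1]
  }
}
test_df <- subset(test_df, x != "")  # drop the title-only rows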


The problem is, I have a lot of data and the for loop is obviously very slow. Is there a more elegant way to achieve the same? I think I have to use the apply function, but I don't know how to use it with just one column.

Re: Apply function to one specific column / Alternative to for loop

umair durrani
This might be of some use: http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/

Umair Durrani

email: [hidden email]
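On the narrower question of applying a function to just one column: a column pulled out with `$` is an ordinary vector, so vectorized functions can be called on it directly, and `sapply()` covers functions that are not vectorized. A minimal sketch, where the data frame `df` and the helper `first_char` are made up purely for illustration:

```r
# Hypothetical example data; stringsAsFactors = FALSE keeps titel as character
df <- data.frame(titel = c("1.Test", "", "2.Test"), x = c(NA, 4, NA),
                 stringsAsFactors = FALSE)

# Vectorized functions work on the whole column at once
title_len <- nchar(df$titel)

# sapply() calls a function once per element of the column
first_char <- function(s) substr(s, 1, 1)   # illustrative helper
initials <- sapply(df$titel, first_char, USE.NAMES = FALSE)

# Assigning back to df$titel changes only that column
df$titel <- toupper(df$titel)
```

`apply()` itself is aimed at the rows or columns of a matrix, so for a single data frame column the direct call or `sapply()` is usually the simpler choice.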


> Date: Sat, 16 Nov 2013 07:30:29 -0800
> From: [hidden email]
> To: [hidden email]
> Subject: [R] Apply function to one specific column / Alternative to for loop
> --
> View this message in context: http://r.789695.n4.nabble.com/Apply-function-to-one-specific-column-Alternative-to-for-loop-tp4680566.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
     

Re: Apply function to one specific column / Alternative to for loop

arun kirshna
In reply to this post by Stageexp
Hi,
Try:
## Positions of the title rows (assumes the titles share some pattern, here "Test")
indx <- grep("Test", test_df[, 1])
## Drop the title rows and repeat each title once per data row that follows it
res <- within(test_df[-indx, ],
              titel <- rep(test_df$titel[indx], diff(c(indx, nrow(test_df) + 1)) - 1))

## If you need to change the class

res[] <- lapply(res, function(x) {
  if (any(grepl("[[:alpha:]]", x))) as.character(x) else as.numeric(as.character(x))
})


##Using data.frame(cbind()), etc. creates


A.K.
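
Since the real data reportedly has no fixed pattern, here is a sketch of a variant that does not depend on the titles matching "Test". It is not the method posted above; it swaps `grep()` for a `cumsum()` over the title positions. Assumptions: a title row is any row with a non-empty `titel`, the first row is a title, and the columns are character (the example is rebuilt with `stringsAsFactors = FALSE` to make that explicit). The helper `head_row` and the names `block`, `is_title` and `group` are illustrative only.

```r
# Rebuild the example with character columns in any R version
block <- data.frame(titel = "", x = c("4", "5"), y = c("1", "2"),
                    stringsAsFactors = FALSE)
head_row <- function(t) {
  data.frame(titel = t, x = "", y = "", stringsAsFactors = FALSE)
}
test_df <- rbind(head_row("1.Test"), block, head_row("2.Test"), block,
                 head_row("3.Test"), block)

# Assumption: a title row is any row whose titel is non-empty
is_title <- test_df$titel != ""

# cumsum() numbers the groups: each row gets the index of the latest title
# at or above it (this requires the first row to be a title)
group <- cumsum(is_title)

# Look up each row's title by group number, then drop the title rows
test_df$titel <- test_df$titel[is_title][group]
res <- test_df[!is_title, ]

# Convert the value columns from character to numeric
res[c("x", "y")] <- lapply(res[c("x", "y")], as.numeric)
```

Because everything is done with whole-vector operations, the number of data rows per title can vary freely.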





Re: Apply function to one specific column / Alternative to for loop

Stageexp
This option worked great for me! I knew there was a nicer and much faster way to solve this. One thing I already learned about R: Never use for-loops, there is always a better way :-)
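
A rough way to check the speed difference on your own machine is sketched below. The "never use for-loops" rule is probably too strong as a general statement; in this case the likely cost is the repeated element-by-element assignment into a data frame rather than the loop as such. The size `n`, the synthetic data and the function names `fill_loop` and `fill_vec` are all illustrative, and the timings will vary between machines and R versions.

```r
set.seed(1)
n <- 20000                                   # illustrative size

# Synthetic column: roughly one title per five rows, first row always a title
is_title <- c(TRUE, runif(n - 1) < 0.2)
titel <- ifelse(is_title, paste0(seq_len(n), ".Test"), "")
df <- data.frame(titel = titel, stringsAsFactors = FALSE)

# Loop version, following the approach in the original question
fill_loop <- function(d) {
  for (i in seq_len(nrow(d))) {
    if (i > 1 && nchar(d$titel[i]) == 0) d$titel[i] <- d$titel[i - 1]
  }
  d
}

# Vectorized version: cumsum() over the title positions gives each row
# the index of the latest title at or above it
fill_vec <- function(d) {
  has_title <- d$titel != ""
  d$titel <- d$titel[has_title][cumsum(has_title)]
  d
}

t_loop <- system.time(r_loop <- fill_loop(df))
t_vec  <- system.time(r_vec  <- fill_vec(df))

# Elapsed seconds for each version
print(c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]]))

# Both versions should fill the column identically
stopifnot(identical(r_loop$titel, r_vec$titel))
```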