|
Hi all,
I would like to create a new column in a data.frame (a1) to store 0/1 data converted from a factor, as below.

    a1$h2<-NULL
    for (i in 1:dim(a1)[1]) {
      if (a1$h1[i]=="H") a1$h2[i]<-1 else a1$h2[i]<-0
    }

My question: is it possible to remove the loop from the above code to achieve the desired result?

Thanks in advance,
Jin

Geoscience Australia Disclaimer: This e-mail (and files transmitted with it) is intended only for the person or entity to which it is addressed. If you are not the intended recipient, then you have received this e-mail by mistake and any use, dissemination, forwarding, printing or copying of this e-mail and its file attachments is prohibited. The security of emails transmitted cannot be guaranteed; by forwarding or replying to this email, you acknowledge and accept these risks.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. |
|
Hello,
Try:

> a1$h2 <- 0
> a1$h2[a1$h1=="H"] <- 1

Regards |
|
In reply to this post by Jin.Li@ga.gov.au
Hi
> My question: is it possible to remove the loop from above code to achieve
> the desired result?

Untested:

    a1$h2 <- (a1$h1=="H")*1

Regards
Petr |
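A short note on why the multiplication above works: the comparison returns a logical vector, and arithmetic on logicals coerces TRUE to 1 and FALSE to 0. The sketch below runs it on a small made-up data frame; the contents of `a1` and the helper name `is_h` are assumptions for illustration, since the poster's data is not shown in the thread.

```r
# Hypothetical toy data standing in for the poster's a1
a1 <- data.frame(h1 = factor(c("H", "L", "H", "M")))

# Comparing a factor with a string gives one TRUE/FALSE per row
is_h <- a1$h1 == "H"

# Multiplying by 1 coerces the logical vector to numeric 1/0
a1$h2 <- is_h * 1

print(a1)
```

No loop is needed because both the comparison and the multiplication operate on the whole column at once.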
|
In reply to this post by Jin.Li@ga.gov.au
On 07/03/2012 05:18 PM, [hidden email] wrote:
> My question: is it possible to remove the loop from above code to achieve the desired result?

Just to provide you with an embarrassment of alternatives:

    a1$h2<-ifelse(a1$h1=="H",1,0)

Jim |
|
In reply to this post by Jin.Li@ga.gov.au
And one more alternative:
a1$h2 <- apply(a1,1, function(x) if (x["h1"]=="H") 1 else 0 ) |
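One thing worth knowing about the `apply` variant: `apply` converts the data frame to a matrix first, so with a factor column present each row reaches the function as a character vector, and the function is called once per row. The sketch below illustrates this on made-up data (the data frame contents and the names `row_types`, `h2_apply` and `h2_vapply` are assumptions), next to a per-element `vapply` call that touches only the one column.

```r
# Hypothetical toy data with a factor column and a numeric column
a1 <- data.frame(h1 = factor(c("H", "L", "H")), x = c(1.5, 2, 3))

# Each row is handed to the function as a character vector,
# because the data frame is coerced to a character matrix
row_types <- apply(a1, 1, class)

# The row-wise solution from the post above
h2_apply <- apply(a1, 1, function(x) if (x["h1"] == "H") 1 else 0)

# Per-element alternative that only looks at the h1 column
h2_vapply <- vapply(as.character(a1$h1),
                    function(v) if (v == "H") 1 else 0,
                    numeric(1), USE.NAMES = FALSE)

print(row_types)
print(h2_apply)
print(h2_vapply)
```

Both give the same 0/1 values here, but neither is vectorised in the way the other replies in this thread are.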
|
In reply to this post by Jim Lemon
On Jul 3, 2012, at 5:08 AM, Jim Lemon wrote:

> Just to provide you with an embarrassment of alternatives:
>
> a1$h2<-ifelse(a1$h1=="H",1,0)

One more. Similar to Petr's, but perhaps a bit more accessible to a new R user:

    a1$h2 <- as.numeric(a1$h1=="H")

I wasn't sure whether NA's would be handled in the same manner by these two methods so I tested:

> ifelse( factor(c("H", "h", NA))=="H", 1, 0)
[1]  1  0 NA
> as.numeric( factor(c("H", "h", NA))=="H")
[1]  1  0 NA

--
David Winsemius, MD
West Hartford, CT |
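Extending the NA check above to the other approaches in this thread, as a sketch on the same three-element factor (the variable names are made up for illustration): the vectorised comparisons keep the NA, the two-step index assignment leaves the initial 0 in place because an NA in a logical index selects nothing on assignment, and the original loop stops with an error because `if` cannot branch on NA.

```r
h1 <- factor(c("H", "h", NA))

# Vectorised conversions propagate the NA
via_ifelse  <- ifelse(h1 == "H", 1, 0)
via_numeric <- as.numeric(h1 == "H")
via_product <- (h1 == "H") * 1

# Two-step index assignment: the NA position is not selected,
# so it keeps the 0 it was initialised with
via_index <- rep(0, length(h1))
via_index[h1 == "H"] <- 1

# The original loop fails at the NA element; the error is caught
# here and replaced by a marker string
loop_result <- tryCatch({
  out <- numeric(length(h1))
  for (i in seq_along(h1)) {
    if (h1[i] == "H") out[i] <- 1 else out[i] <- 0
  }
  out
}, error = function(e) "error")

print(via_ifelse)
print(via_index)
print(loop_result)
```

So if the factor can contain missing values, the choice between these methods is not only a matter of speed: it also decides whether a missing `h1` ends up as NA or as 0 in `h2`.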
|
In reply to this post by Bart Joosen
Thank you all for providing various alternatives. They are all pretty fast. Great help! Based on a test of a dataset with 800,000 rows, the time used varies from 0.04 to 11.56 s. The champion is:
> a1$h2 <- 0
> a1$h2[a1$h1=="H"] <- 1

Regards,
Jin |
|
On 2012-07-03 17:23, [hidden email] wrote:
> Thank you all for providing various alternatives. They are all pretty fast. Great help! Based on a test of a dataset with 800,000 rows, the time used varies from 0.04 to 11.56 s. The champion is:
>> a1$h2 <- 0
>> a1$h2[a1$h1=="H"] <- 1

Interesting. My testing shows that Petr's solution is about twice as fast. Not that it matters much - the time is pretty small in any case.

    a0 <- data.frame(h1 = sample(c("H","J","K"), 1e7, replace = TRUE),
                     stringsAsFactors = FALSE)
    a1 <- a0
    system.time({a1$h2 <- 0; a1$h2[a1$h1 == "H"] <- 1})
    #   user  system elapsed
    #   1.47    0.48    1.96
    a11 <- a1

    a1 <- a0
    system.time(a1$h2 <- (a1$h1 == "H") * 1)
    #   user  system elapsed
    #   0.37    0.17    0.56
    a12 <- a1
    all.equal(a11,a12)
    #[1] TRUE

Peter Ehlers |
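The same kind of comparison can be extended to the other vectorised suggestions in this thread. The following is only a sketch: the list name `candidates`, the seed, and the reduced size of 1e5 rows are assumptions chosen to keep the run short, and the row-wise `apply` variant is left out for the same reason. Timings depend on the machine and R version, so none are quoted here.

```r
set.seed(1)  # arbitrary seed so the sample is reproducible
n <- 1e5     # smaller than the 1e7 rows used above
a0 <- data.frame(h1 = sample(c("H", "J", "K"), n, replace = TRUE),
                 stringsAsFactors = FALSE)

# Each candidate takes the data frame and returns the new 0/1 column
candidates <- list(
  index   = function(d) { h2 <- rep(0, nrow(d)); h2[d$h1 == "H"] <- 1; h2 },
  product = function(d) (d$h1 == "H") * 1,
  numeric = function(d) as.numeric(d$h1 == "H"),
  ifelse  = function(d) ifelse(d$h1 == "H", 1, 0)
)

results <- list()
timings <- numeric(0)
for (m in names(candidates)) {
  # Store the elapsed time and keep the result for the equality check
  timings[m] <- system.time(results[[m]] <- candidates[[m]](a0))["elapsed"]
}

# Confirm that every method produced the same column
all_equal <- all(vapply(results,
                        function(r) identical(r, results$product),
                        logical(1)))

print(timings)
print(all_equal)
```

A single `system.time` call per method is noisy; repeating each call several times and comparing the medians gives a steadier picture.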
|
On 04/07/2012 12:43, Peter Ehlers wrote:

> Interesting. My testing shows that Petr's solution is about
> twice as fast. Not that it matters much - the time is pretty
> small in any case.

I got the same result. Petr's solution is the fastest. Good to know.

Pascal Oettli |
|
Thanks for your validation. Yes, Petr's solution is the fastest; it takes about 25% less time than the previous champion. I missed it in my earlier testing.
Jin |
