I'm sure this is simple enough, but an R site search on my subject
terms did not suggest a solution. I have a numeric vector with many
values that I wish to create a factor from having only a few levels.
Here is a toy example.

> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> x
 [1] A A A B B B C C C C
Levels: A A A B B B C C C C
> summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels. The results I would
like to see from printing the value and summary(x) are:

> x
 [1] A A A B B B C C C C
Levels: A B C
> summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: [hidden email] Tel: 416.864.5776 Fax: 416.864.3016

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi Kevin,
Here are two suggestions:

# Combination of levels() and table()
table(levels(x))
# A B C
# 3 3 4

# Or defining a function
mysummary <- function(x) table(levels(x)) # you can easily improve it :-)
mysummary(x)
# A B C
# 3 3 4

HTH,
Jorge

On Sun, Nov 1, 2009 at 3:51 PM, Kevin E. Thorpe wrote:

> I have a numeric vector with many
> values that I wish to create a factor from having only a few levels.
> Here is a toy example.
>
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> > x
>  [1] A A A B B B C C C C
> Levels: A A A B B B C C C C
> > summary(x)
> A A A B B B C C C C
> 3 0 0 3 0 0 4 0 0 0
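One caveat is worth sketching here: table(levels(x)) tabulates the level labels, not the observations, so it matches the desired summary in the toy example only because each of the ten values occurs exactly once. The snippet below is a minimal illustration under that reading. The names vals, labs and f are made up for the example, and the levels are merged with a levels() assignment so the code does not rely on duplicated labels being accepted by factor().

```r
# Hypothetical data: the value 1 occurs four times, 5 and 10 once each
vals <- c(1, 1, 1, 1, 5, 10)
labs <- c("A","A","A","B","B","B","C","C","C","C")

f <- factor(vals, levels = 1:10)
levels(f) <- labs      # assigning duplicated labels merges the levels

table(f)               # counts observations: A = 4, B = 1, C = 1
table(labs)            # counts labels only:  A = 3, B = 3, C = 4
```

The two tables agree only when every original value appears exactly once, as in the original post.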
In reply to this post by Kevin E. Thorpe
On Nov 1, 2009, at 3:51 PM, Kevin E. Thorpe wrote:

> I have a numeric vector with many
> values that I wish to create a factor from having only a few levels.
> Here is a toy example.
>
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))

You have thusly created a pathological situation. In 2.10.0 this is what
you might see:

> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore

What you _should_ have done was:

x2 <- factor(c("A","A","A","B","B","B","C","C","C","C"))

The usual approach to getting rid of unused factor levels is just to
apply the function factor() again without additional arguments.

> x <- factor(x)   # the "x" was from your code
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore
# but that will be the last time you will see the warning..
> summary(x)
A B C
3 3 4

-- 
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
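To see the "apply factor() again" idiom in isolation, here is a small sketch on a factor that has an ordinary unused level rather than duplicated ones. The objects f and g are invented for the illustration and are not part of the original exchange.

```r
# A factor with three declared levels, of which "C" never occurs
f <- factor(c("A", "B", "A"), levels = c("A", "B", "C"))
summary(f)       # "C" is still reported, with a count of 0

# Re-applying factor() keeps only the levels that actually occur
g <- factor(f)
levels(g)        # "A" "B"
summary(g)       # A = 2, B = 1
```

The data values are unchanged by the second call; only the set of levels shrinks.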
In reply to this post by Kevin E. Thorpe
Kevin E. Thorpe wrote:
> I have a numeric vector with many
> values that I wish to create a factor from having only a few levels.
> Here is a toy example.
>
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> > x
>  [1] A A A B B B C C C C
> Levels: A A A B B B C C C C
> > summary(x)
> A A A B B B C C C C
> 3 0 0 3 0 0 4 0 0 0

It's an anomaly inherited from S-PLUS (or so I have been told).
Actually, with the current R, you should get a warning:

> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore

This works (as documented on the help page for levels!):

> x <- 1:10
> x <- factor(x,levels=1:10)
> levels(x) <- c("A","A","A","B","B","B","C","C","C","C")
> table(x)
x
A B C
3 3 4

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907
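The help page for levels also describes a named-list form of the same assignment, which can be easier to read when many values collapse into a few groups. The following is only a sketch of that form applied to the toy data, with the old levels written out as character strings; it is not taken from the replies in this thread.

```r
x <- factor(1:10, levels = 1:10)

# Each list name is a new level; its elements are the old levels to merge
levels(x) <- list(A = c("1", "2", "3"),
                  B = c("4", "5", "6"),
                  C = c("7", "8", "9", "10"))

levels(x)        # "A" "B" "C"
table(x)         # A = 3, B = 3, C = 4
```

The list form makes the grouping explicit, whereas the character-vector form depends on the labels being listed in the same order as the existing levels.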
Peter Dalgaard wrote:
> It's an anomaly inherited from S-PLUS (or so I have been told).
> Actually, with the current R, you should get a warning:
>
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> Warning message:
> In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
>   duplicated levels will not be allowed in factors anymore
>
> This works (as documented on the help page for levels!):
>
> > x <- 1:10
> > x <- factor(x,levels=1:10)
> > levels(x) <- c("A","A","A","B","B","B","C","C","C","C")
> > table(x)
> x
> A B C
> 3 3 4

Thanks. That's exactly what I need. I knew it was simple. I've even
used levels() before, but it just didn't occur to me this time.

I'm clearly not on current R. :-) When I have some time, I'll upgrade.

Kevin