# How to convert a factor column into a numeric one?


## How to convert a factor column into a numeric one?

I have a data frame:

    > head(df)
      Time Temp Conc Repl    Log10
    1    0  -20    H    1 6.406547
    2    2  -20    H    1 5.738683
    3    7  -20    H    1 5.796394
    4   14  -20    H    1 4.413691
    5    0    4    H    1 6.406547
    7    7    4    H    1 5.705433
    > str(df)
    'data.frame':   177 obs. of  5 variables:
     $ Time : Factor w/ 4 levels "0","2","7","14": 1 2 3 4 1 3 4 1 3 4 ...
     $ Temp : Factor w/ 4 levels "-20","4","25",..: 1 1 1 1 2 2 2 3 3 3 ...
     $ Conc : Factor w/ 3 levels "H","L","M": 1 1 1 1 1 1 1 1 1 1 ...
     $ Repl : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Log10: num  6.41 5.74 5.8 4.41 6.41 ...
    > levels(df$Temp)
    [1] "-20" "4"   "25"  "45"
    > levels(df$Time)
    [1] "0"  "2"  "7"  "14"

As you can see, "Time" and "Temp" are currently factors, not numeric.

I would like to change these columns into numerical ones.

    df$Time<- as.numeric(df$Time)

doesn't work, as it changes to the factor level indices (1,2,3,4) instead of the values (0,2,7,14).

There must be a direct way of doing this in R.

I tried recode() in 'car':

    > df$Temp<- recode(df$Temp, '1=-20;2=25;3=4;4=45',as.factor.result=FALSE)
    > head(df)
      Time Temp Conc Repl     Freq
    1    0  -20    H    1 6.406547
    2    2  -20    H    1 5.738683
    3    7  -20    H    1 5.796394
    4   14  -20    H    1 4.413691
    5    0   45    H    1 6.406547
    7    7   45    H    1 5.705433

but note that the values for 'Temp' in rows 5 and 7 are 45 and not 4, as expected, although the result is numeric. The same happens if I use the order given by levels(df$Temp) instead of the sort order in the recode() 2nd argument.

Any hints?

    ================================================================
    Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [hidden email]
    Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
    824 Timberlake Drive                     Tel: 757-467-0954
    Virginia Beach, VA 23464-3239            Fax: 757-467-2947

    "Vere scire est per causas scire"

    ______________________________________________
    [hidden email] mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
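A small sketch of why `as.numeric()` returned 1, 2, 3, 4 in the question: a factor stores integer codes that index into its levels, and `as.numeric()` returns those codes rather than the labels. The vector `temp` below is a made-up stand-in for `df$Temp`, not the poster's data.

```r
# Hypothetical stand-in for df$Temp, with the same level order as in the post
temp <- factor(c("-20", "-20", "4", "25", "45"),
               levels = c("-20", "4", "25", "45"))

codes  <- as.integer(temp)      # internal codes: positions within levels(temp)
labels <- levels(temp)[codes]   # the labels those codes point to

print(codes)              # 1 1 2 3 4
print(labels)             # "-20" "-20" "4" "25" "45"
print(as.numeric(temp))   # 1 1 2 3 4: the codes, not the temperatures
```

This may also bear on the recode() result. If recode() matches its specification against the level labels rather than the codes, then `4=45` would hit the label "4" and turn it into 45. That is an assumption about how 'car' treats factors, not something confirmed in this thread.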

## Re: How to convert a factor column into a numeric one?

Dr. LaBudde,

Perhaps

    as.numeric(as.character(x))

is what you are looking for.

HTH,
Jorge
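Jorge's suggestion in runnable form, using a made-up factor as a stand-in for `df$Time`. The second idiom, `as.numeric(levels(f))[f]`, is the variant mentioned in R's `?factor` help page; it converts each level once and then indexes by the factor's codes. Both should give the same result.

```r
# Hypothetical stand-in for df$Time
time_f <- factor(c("0", "2", "7", "14", "0", "7"),
                 levels = c("0", "2", "7", "14"))

# Convert the labels, not the codes
via_character <- as.numeric(as.character(time_f))

# Convert each level once, then index by the factor's codes
via_levels <- as.numeric(levels(time_f))[time_f]

print(via_character)                          # 0 2 7 14 0 7
print(identical(via_character, via_levels))   # TRUE
```

If any label is not a valid number, both idioms yield `NA` for it with a coercion warning, so a quick `anyNA()` check on the result is a reasonable safeguard.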

## Re: How to convert a factor column into a numeric one?

Hi:

Try this:

    > dd <- data.frame(a = factor(rep(1:5, each = 4)),
    +                  b = factor(rep(rep(1:2, each = 2), 5)),
    +                  y = rnorm(20))
    > str(dd)
    'data.frame':   20 obs. of  3 variables:
     $ a: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
     $ b: Factor w/ 2 levels "1","2": 1 1 2 2 1 1 2 2 1 1 ...
     $ y: num  0.6396 1.467 1.8403 -0.0915 0.2711 ...
    > de <- within(dd, {
    +          a <- as.numeric(as.character(a))
    +          b <- as.numeric(as.character(b))
    +        } )
    > str(de)
    'data.frame':   20 obs. of  3 variables:
     $ a: num  1 1 1 1 2 2 2 2 3 3 ...
     $ b: num  1 1 2 2 1 1 2 2 1 1 ...
     $ y: num  0.6396 1.467 1.8403 -0.0915 0.2711 ...

HTH,
Dennis
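When several columns need the same treatment, the conversion can be applied by column name instead of writing one line per column. This is only a sketch: `factor_to_numeric` and `df_demo` are hypothetical names introduced here, and the data are made up to resemble the data frame in the question.

```r
# Hypothetical helper: convert a factor with numeric-looking labels to numeric
factor_to_numeric <- function(f) as.numeric(as.character(f))

# Made-up data frame shaped like the one in the question
df_demo <- data.frame(
  Time  = factor(c(0, 2, 7, 14)),
  Temp  = factor(c(-20, -20, 4, 4)),
  Conc  = factor(c("H", "H", "L", "M")),
  Log10 = c(6.41, 5.74, 5.80, 4.41)
)

# Columns to convert; Conc is left as a factor on purpose
cols <- c("Time", "Temp")
df_demo[cols] <- lapply(df_demo[cols], factor_to_numeric)

str(df_demo)
```

Listing the columns explicitly avoids accidentally converting a genuinely categorical column such as `Conc`, whose labels would all become `NA`.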


## Re: How to convert a factor column into a numeric one?

Exactly! Thanks.

At 12:49 AM 6/5/2011, Jorge Ivan Velez wrote:
>Perhaps
>
>as.numeric(as.character(x))
>
>is what you are looking for.

## Re: How to convert a factor column into a numeric one?

Thanks for your help.

As far as your question below is concerned, the data frame arose as a result of some data cleaning on an original data frame, which was changed into a table, modified, and changed back to a data frame:

    ttcrmean<- as.table(by(ngbe[,'Log10'],
       list(Time=ngbe$Time,Temp=ngbe$Temp,Conc=ngbe$Conc,Repl=ngbe$Replicate),
       mean))
    for (k in 1:3) {  #fix-up time zeroes
       for (l in 1:5) { #replicates
         t0val<- ttcrmean[1,3,k,l]
         for (j in 1:4) {  #temps
           ttcrmean[1,j,k,l]<- t0val
         } #j
       } #l
    } #k
    df<- na.omit(as.data.frame(ttcrmean))
    colnames(df)[5]<- 'Log10'

At 12:51 AM 6/5/2011, Joshua Wiley wrote:
>Hi Robert,
>
>I would also look into *why* those numeric columns are being stored as
>factors in the first place.  If you are reading the data in with
>read.table() or one of its wrapper functions (like read.csv), then it
>would be better to preempt the storage as a factor altogether rather
>than converting back to numeric.  For example, perhaps something is
>being used to indicate missing data that R does not recognize (e.g.,
>SAS uses ".").  Specifying na.strings = ".", would fix this.  See
>?read.table for some of the options available.
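The table round trip described above is the likely source of the factors: `as.data.frame()` applied to a table turns each dimension into a factor column and names the value column `Freq` by default. The sketch below uses a made-up 2 x 2 table (`tab`) in place of `ttcrmean`; `responseName` and `stringsAsFactors` are arguments of the table method of `as.data.frame()`. The last part illustrates Joshua's `na.strings` point on an invented CSV string.

```r
# Made-up table of means, standing in for ttcrmean
tab <- as.table(matrix(c(6.4, 5.7, 5.8, 4.4), nrow = 2,
                       dimnames = list(Time = c("0", "14"),
                                       Temp = c("-20", "4"))))

# Default conversion: dimensions become factors, values go in "Freq"
d1 <- as.data.frame(tab)
str(d1)

# Name the value column directly and keep the dimensions as character,
# then convert them to numeric
d2 <- as.data.frame(tab, responseName = "Log10", stringsAsFactors = FALSE)
d2$Time <- as.numeric(d2$Time)
d2$Temp <- as.numeric(d2$Temp)
str(d2)

# When reading from a file, declaring the missing-value marker up front
# lets the column be read as numeric (invented CSV text)
csv_text <- "Time,Log10\n0,6.41\n.,5.74\n7,5.80"
d3 <- read.csv(text = csv_text, na.strings = ".")
str(d3)
```

Setting `responseName` also replaces the separate `colnames(df)[5]<- 'Log10'` step, though whether that suits the original workflow is a judgment call.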

## Re: How to convert a factor column into a numeric one?

Thanks! Exactly what I wanted, and the same as what Jorge suggested.

At 12:49 AM 6/5/2011, Dennis Murphy wrote:
>Try this:
>
> > de <- within(dd, {
>+          a <- as.numeric(as.character(a))
>+          b <- as.numeric(as.character(b))
>+        } )