Saving data in an R package - how to maintain that a variable is a 'factor' when it is coded as 1, 2, 3...


Søren Højsgaard
I have a .txt file obtained by saving a data frame in which the first four columns are factors (but represented as 1, 2, 3, etc.). The header line and the first four data rows are:
 
"Pig" "Evit" "Cu" "Litter" "Start" "Weight" "Feed" "Time"
"4601" "1" "1" "1" 26.5 26.5 NA 1
"4601" "1" "1" "1" 26.5 27.59999 5.200005 2
"4601" "1" "1" "1" 26.5 36.5 17.6 3
"4601" "1" "1" "1" 26.5 40.29999 28.5 4

I would like to include that data set in an R package. When I load the data from the package, the first four columns are read in as numeric variables. This is consistent with the documentation of read.table, but it is not what I want! I can of course change the coding of the variables, but there ought to be another way. Can anyone help me with that?
Best regards
Søren Højsgaard

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Saving data in an R package - how to maintain that a variable is a 'factor' when it is coded as 1, 2, 3...

jholtman
Use the colClasses argument of read.table:

> x <- read.table('clipboard', header=TRUE,
+     colClasses=c(rep('factor',4), rep('numeric',4)))

> x

   Pig Evit Cu Litter Start   Weight      Feed Time
1 4601    1  1      1  26.5 26.50000        NA    1
2 4601    1  1      1  26.5 27.59999  5.200005    2
3 4601    1  1      1  26.5 36.50000 17.600000    3
4 4601    1  1      1  26.5 40.29999 28.500000    4
> str(x)
`data.frame':   4 obs. of  8 variables:
 $ Pig   : Factor w/ 1 level "4601": 1 1 1 1
 $ Evit  : Factor w/ 1 level "1": 1 1 1 1
 $ Cu    : Factor w/ 1 level "1": 1 1 1 1
 $ Litter: Factor w/ 1 level "1": 1 1 1 1
 $ Start : num  26.5 26.5 26.5 26.5
 $ Weight: num  26.5 27.6 36.5 40.3
 $ Feed  : num    NA  5.2 17.6 28.5
 $ Time  : num  1 2 3 4
>
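For anyone without the clipboard contents at hand, here is a minimal sketch that reproduces the same result from inline text. It assumes a version of R in which read.table accepts the text= argument; the object names (lines, x_default, x_factor) are made up for this illustration and do not come from the original post.

```r
# The header and four data rows quoted in the original post.
lines <- c(
  '"Pig" "Evit" "Cu" "Litter" "Start" "Weight" "Feed" "Time"',
  '"4601" "1" "1" "1" 26.5 26.5 NA 1',
  '"4601" "1" "1" "1" 26.5 27.59999 5.200005 2',
  '"4601" "1" "1" "1" 26.5 36.5 17.6 3',
  '"4601" "1" "1" "1" 26.5 40.29999 28.5 4'
)

# Default behaviour: type conversion is applied to quoted fields too,
# so the first four columns come back as numbers, not factors.
x_default <- read.table(text = lines, header = TRUE)

# With colClasses the first four columns are read as factors.
x_factor <- read.table(text = lines, header = TRUE,
                       colClasses = c(rep("factor", 4), rep("numeric", 4)))

sapply(x_default, class)
sapply(x_factor, class)
```

The first call shows the behaviour described in the question; the second shows the effect of colClasses on the same input.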




--
Jim Holtman
Cincinnati, OH
+1 513 247 0281

What is the problem you are trying to solve?


Re: Saving data in an R package - how to maintain that a variable is a 'factor' when it is coded as 1, 2, 3...

Brian Ripley
In reply to this post by Søren Højsgaard
See ?read.table, in particular the colClasses argument.

You can put a .R wrapper around a .tab file in the data directory of a
package.  Or, perhaps better, include the data as a .rda file.
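
Both suggestions can be sketched roughly as follows. This is only an illustration under assumptions: the data set name 'pigs' and the file names pigs.tab, pigs.R and pigs.rda are hypothetical, a temporary directory stands in for the package's data directory, and source(chdir = TRUE) is used to imitate data() evaluating a .R file from within that directory.

```r
# Hypothetical stand-in for <package>/data.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# The raw table from the original post, written as pigs.tab.
writeLines(c(
  '"Pig" "Evit" "Cu" "Litter" "Start" "Weight" "Feed" "Time"',
  '"4601" "1" "1" "1" 26.5 26.5 NA 1',
  '"4601" "1" "1" "1" 26.5 27.59999 5.200005 2',
  '"4601" "1" "1" "1" 26.5 36.5 17.6 3',
  '"4601" "1" "1" "1" 26.5 40.29999 28.5 4'
), file.path(data_dir, "pigs.tab"))

# Option 1: a .R wrapper next to the .tab file that fixes the column classes.
writeLines(c(
  'pigs <- utils::read.table("pigs.tab", header = TRUE,',
  '    colClasses = c(rep("factor", 4), rep("numeric", 4)))'
), file.path(data_dir, "pigs.R"))

# Evaluate the wrapper with the working directory set to data_dir,
# which is why the relative path "pigs.tab" resolves.
env1 <- new.env()
source(file.path(data_dir, "pigs.R"), local = env1, chdir = TRUE)

# Option 2: save the finished data frame once as .rda; the factor
# columns are stored as factors and come back as factors on load().
pigs <- env1$pigs
rda_file <- file.path(data_dir, "pigs.rda")
save(pigs, file = rda_file)

env2 <- new.env()
load(rda_file, envir = env2)
str(env2$pigs)
```

In a real package only one of the two routes would normally be kept in data/. With the .rda route the column types are fixed when the file is created, so nothing has to be specified when the data are loaded.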

On Fri, 13 Jan 2006, Søren Højsgaard wrote:

> I have a .txt file obtained by saving a data frame in which the first four columns are factors (but represented as 1,2,3 etc). The first four lines are
>
> "Pig" "Evit" "Cu" "Litter" "Start" "Weight" "Feed" "Time"
> "4601" "1" "1" "1" 26.5 26.5 NA 1
> "4601" "1" "1" "1" 26.5 27.59999 5.200005 2
> "4601" "1" "1" "1" 26.5 36.5 17.6 3
> "4601" "1" "1" "1" 26.5 40.29999 28.5 4
>
> I would like to include that data set in an R-package. When I load the data from the package the first four columns are read in as numeric variables. This is consistent with the documentation of read.table - but it is not what I want! I can of course change the coding of the variables, but there ought to be another way. Can anyone help me on that?
> Best regards
> Søren Højsgaard
--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595