converting character matrix to a dataframe

converting character matrix to a dataframe

John M. Miyamoto
Dear R-Help,
    Suppose I have a character matrix, e.g.,

(ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
ncol=3))

When I convert 'ch.mat' to a dataframe, the columns are converted to
factors:

(d1 <- data.frame(ch.mat))
mode(d1[,1])
is.factor(d1[,1])

To prevent this, I can use 'I' to protect the column vectors:

(d2 <- data.frame(x1 = I(ch.mat[,1]), x2 = I(ch.mat[,2]), x3 =
I(ch.mat[,3])))
mode(d2[,1])
is.factor(d2[,2])

but this method is cumbersome if the matrix has many columns.
The following code is reasonably efficient even if the matrix has
arbitrarily many columns.

(d3 <- data.frame(apply(ch.mat,2,function(x) data.frame(I(x)))))
mode(d3[,1])
is.factor(d3[,1])

Question:  Is there a more efficient method than the last one for
converting a character matrix to a dataframe while preventing the
automatic conversion of the column vectors to factors?

John

--------------------------------------------------------------------
John Miyamoto, Dept. of Psychology, Box 351525
University of Washington, Seattle, WA 98195-1525
Phone 206-543-0805, Fax 206-685-3157
Homepage http://faculty.washington.edu/jmiyamot/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: converting character matrix to a dataframe

Gabor Grothendieck
You could do it directly like this:

structure(split(ch.mat, col(ch.mat)),
       row.names = 1:nrow(ch.mat), .Names = 1:ncol(ch.mat),
       class = "data.frame")
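
A minimal sketch of how the construction above could be checked, assuming the same 'ch.mat' as in the original post. The object name 'd5' and the column names V1..V3 are illustrative choices and not part of the reply. split() cuts the matrix into one character vector per column, and structure() attaches the attributes that make the list a data frame without passing through data.frame(), so no factor conversion takes place.

```r
ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
                 ncol = 3)

# One character vector per column; col(ch.mat) gives each cell's column index.
cols <- split(ch.mat, col(ch.mat))

# Attach the data-frame attributes by hand (the names are illustrative).
d5 <- structure(cols,
                names = paste0("V", seq_len(ncol(ch.mat))),
                row.names = seq_len(nrow(ch.mat)),
                class = "data.frame")

is.character(d5[, 1])   # the columns stay character
is.factor(d5[, 1])
```

Because the object is assembled by hand, none of the usual data.frame() checks (equal column lengths, valid names) are run, so this is probably best kept for cases where the input is known to be a well-formed matrix.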


On 2/23/06, John M. Miyamoto <[hidden email]> wrote:

> [SNIP]


Re: converting character matrix to a dataframe

Prof Brian Ripley
In reply to this post by John M. Miyamoto
It is a bit more efficient to use as.data.frame in your apply.

You could make a copy of as.data.frame.matrix (under another name) and
remove the special-casing of character matrices.  This would efficiently
give you a data frame with character columns, but they would then not be
treated 'AsIs' in subsequent manipulations.  So this is only desirable if
efficiency is really important (and it seems unlikely to me that it is).
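
The distinction between plain character columns and 'AsIs' columns can be sketched as follows. This assumes an R version in which as.data.frame() accepts a 'stringsAsFactors' argument; that argument is not mentioned in this thread, and from R 4.0.0 onward strings are no longer converted to factors by default. The object names 'd.plain' and 'd.asis' are illustrative.

```r
ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
                 ncol = 3)

# Plain character columns: no factor conversion and no 'AsIs' class.
d.plain <- as.data.frame(ch.mat, stringsAsFactors = FALSE)
class(d.plain[, 1])

# Protected column: I() adds the 'AsIs' class, which data.frame() respects.
d.asis <- data.frame(x1 = I(ch.mat[, 1]))
class(d.asis[, 1])
inherits(d.asis$x1, "AsIs")
```

A plain character column carries no marker asking later calls to leave it alone, which is the trade-off described above; a column wrapped in I() keeps its 'AsIs' class inside the data frame.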

On Thu, 23 Feb 2006, John M. Miyamoto wrote:

> [SNIP]

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: converting character matrix to a dataframe

John M. Miyamoto
On Fri, 24 Feb 2006, Prof Brian Ripley wrote:

> It is a bit more efficient to use as.data.frame in your apply.
>
> You could make a copy of as.data.frame.matrix (under another name) and remove
> the special-casing of character matrices.  This would efficiently give you a
> data frame with character columns, but they would then not be treated 'AsIs'
> in subsequent manipulations.  So this is only desirable if efficiency is
> really important (and it seems unlikely to me that it is).
>
> On Thu, 23 Feb 2006, John M. Miyamoto wrote:
>
>> Dear R-Help,
>>    Suppose I have a character matrix, e.g.,
>>
>> (ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
>> ncol=3))
>>
>> When I convert 'ch.mat' to a dataframe, the columns are converted to
>> factors:
[SNIP]

>> The following code is reasonably efficient even if the matrix has
>> arbitrarily many columns.
>>
>> (d3 <- data.frame(apply(ch.mat,2,function(x) data.frame(I(x)))))
>> mode(d3[,1])
>> is.factor(d3[,1])
>>
>> Question:  Is there a more efficient method than the last one for
>> converting a character matrix to a dataframe while preventing the
>> automatic conversion of the column vectors to factors?

So I take it that this last solution would be:

(ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
ncol=3))

(d4 <- data.frame(apply(ch.mat, 2, function(x) as.data.frame(I(x)))))
mode(d4[,1])
is.factor(d4[,1])

You're right that I'm not really concerned with computational efficiency,
only with minimizing the amount of code that I have to write and remember.
The solution seems to be to write a function that accomplishes this task,
which I have done.  Thank you.
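
The function itself was not posted. One way such a helper might look is sketched below; the name 'char.mat.to.df', its 'col.names' argument, and the default column names x1, x2, ... are all hypothetical, chosen to mirror the 'd2' example from the original question.

```r
# Hypothetical helper: the name and the default column names are illustrative.
char.mat.to.df <- function(m, col.names = paste0("x", seq_len(ncol(m)))) {
  stopifnot(is.matrix(m), is.character(m))
  # Wrap each column in I() so that data.frame() leaves it as character.
  cols <- lapply(seq_len(ncol(m)), function(j) I(m[, j]))
  names(cols) <- col.names
  do.call(data.frame, cols)
}

ch.mat <- matrix(c('a','s','*','f','w','*','k','*','*','f','i','o'),
                 ncol = 3)

d6 <- char.mat.to.df(ch.mat)
mode(d6[, 1])
is.factor(d6[, 1])
```

Building the columns with lapply() and passing them to data.frame() in a single call avoids creating one small data frame per column, and the I() wrapper keeps the 'AsIs' protection used in the original 'd2' approach.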

John
