indexing data.frame columns


Peter Meilstrup
Consider the data.frame:

df <- data.frame(A = c(1,4,2,6,7,3,6), B= c(3,7,2,7,3,5,4), C =
c(2,7,5,2,7,4,5), index = c("A","B","A","C","B","B","C"))

I want to select the column specified in 'index' for every row of 'df', to
get

goal <- c(1, 7, 2, 2, 3, 5, 5)

This sounds a lot like the indexing-by-a-matrix you can do with arrays;

df[cbind(1:nrow(df), df$index)]

but this returns me values that are all characters where I want numbers.
(it seems that indexing by an array isn't well supported for data.frames.)

What is a better way to perform this selection operation?

Peter

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: indexing data.frame columns

ilai-2
On Thu, Apr 5, 2012 at 1:40 PM, Peter Meilstrup
<[hidden email]> wrote:

> Consider the data.frame:
>
> df <- data.frame(A = c(1,4,2,6,7,3,6), B= c(3,7,2,7,3,5,4), C =
> c(2,7,5,2,7,4,5), index = c("A","B","A","C","B","B","C"))
>
> I want to select the column specified in 'index' for every row of 'df', to
> get
>
> goal <- c(1, 7, 2, 2, 3, 5, 5)
>
> This sounds a lot like the indexing-by-a-matrix you can do with arrays;
>
> df[cbind(1:nrow(df), df$index)]
>
> but this returns me values that are all characters where I want numbers.

str(df[,-4][cbind(1:nrow(df),df$index)])
 num [1:7] 1 7 2 2 3 5 5

> (it seems that indexing by an array isn't well supported for data.frames.)

No, it's just that the index column in df is a factor, so as.matrix(df)
returns a matrix of characters.

>
> What is a better way to perform this selection operation?
>

Not that I know of

Cheers


> Peter
>
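
To illustrate the coercion described in this reply, here is a minimal sketch using the data frame from the original post. It assumes the 'index' column is a factor, which was the data.frame() default when this thread was written; stringsAsFactors = TRUE is passed explicitly because that default changed in R 4.0.0. The names m_all and m_num are only illustrative.

```r
# Data from the original post. stringsAsFactors is set explicitly
# (assumption: we want the factor behaviour discussed in the thread).
df <- data.frame(A = c(1, 4, 2, 6, 7, 3, 6),
                 B = c(3, 7, 2, 7, 3, 5, 4),
                 C = c(2, 7, 5, 2, 7, 4, 5),
                 index = c("A", "B", "A", "C", "B", "B", "C"),
                 stringsAsFactors = TRUE)

# With the factor column included, every cell becomes a character string.
m_all <- as.matrix(df)
typeof(m_all)   # "character"

# With only the numeric columns, the matrix stays numeric.
m_num <- as.matrix(df[c("A", "B", "C")])
typeof(m_num)   # "double"
```

So a single non-numeric column is enough to turn every value extracted through the matrix form into a string, which matches what the original post observed.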

Re: indexing data.frame columns

Milan Bouchet-Valat
In reply to this post by Peter Meilstrup
On Thursday, 5 April 2012 at 12:40 -0700, Peter Meilstrup wrote:

> Consider the data.frame:
>
> df <- data.frame(A = c(1,4,2,6,7,3,6), B= c(3,7,2,7,3,5,4), C =
> c(2,7,5,2,7,4,5), index = c("A","B","A","C","B","B","C"))
>
> I want to select the column specified in 'index' for every row of 'df', to
> get
>
> goal <- c(1, 7, 2, 2, 3, 5, 5)
>
> This sounds a lot like the indexing-by-a-matrix you can do with arrays;
>
> df[cbind(1:nrow(df), df$index)]
>
> but this returns me values that are all characters where I want numbers.
> (it seems that indexing by an array isn't well supported for data.frames.)
>
> What is a better way to perform this selection operation?

I think the problem is that the data frame is converted to a matrix
under the hood, so numeric values are converted to characters (since the
reverse is not possible). You can either do:

as.numeric(df[cbind(1:nrow(df), df$index)])
[1] 1 7 2 2 3 5 5

Or avoid the conversion by excluding the character column beforehand:

df[-ncol(df)][cbind(1:nrow(df), df$index)]
[1] 1 7 2 2 3 5 5


Regards
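
A variant of the second suggestion that spells out each step may be easier to adapt. This is only a sketch: value_cols, m, col_idx and picked are illustrative names, and it assumes every entry of 'index' names one of the value columns. match() on as.character(df$index) gives the column position for each row, so the result should not depend on whether 'index' is stored as a factor or as a character vector.

```r
# Data from the original post (stringsAsFactors set explicitly; the
# approach below is meant to work with either setting).
df <- data.frame(A = c(1, 4, 2, 6, 7, 3, 6),
                 B = c(3, 7, 2, 7, 3, 5, 4),
                 C = c(2, 7, 5, 2, 7, 4, 5),
                 index = c("A", "B", "A", "C", "B", "B", "C"),
                 stringsAsFactors = TRUE)

value_cols <- c("A", "B", "C")            # columns to pick values from
m <- as.matrix(df[value_cols])            # numeric matrix, no coercion to character

# Column position (1, 2 or 3) selected by 'index' in each row.
col_idx <- match(as.character(df$index), value_cols)

# Two-column (row, column) matrix index: one element per row of df.
picked <- m[cbind(seq_len(nrow(df)), col_idx)]
picked   # 1 7 2 2 3 5 5
```

Naming the value columns explicitly, rather than dropping the last column by position, keeps the selection correct if further non-numeric columns are added to the data frame later.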
