

I have a data frame with a column, say "x", consisting of values, each
value appearing a different number of times, e.g.
x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ...
and a vector, including e.g.:
y: 2,9,10,...
I need a subset of the dataframe: all rows where x is equal to one of
the values in y. Currently I use a loop for this, but because x and y
are large this is very slow.
Does anyone have an idea how to solve this problem faster?
Thank you,
Bernhard
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


On 2/8/2006 9:21 AM, Bernhard Baumgartner wrote:
> I have a dataframe with a column, say "x" consisting of values, each
> value appearing different times, e.g.
> x: 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ...
> and a vector, including e.g.:
> y: 2,9,10,...
> I need a subset of the dataframe: all rows where x is equal to one of
> the values in y. Currently I use a loop for this, but because x and y
> are large this is very slow.
> Is there any idea how to solve this problem faster?
It's actually very easy. Assume your dataframe is df, then
subset(df, x %in% y)
will give you what you want (assuming there is no column y in the
dataframe).
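As a small illustration of the call above, here is a sketch on toy data modelled on the values in the question. The contents of df and y and the extra column "value" are made up for the example; only subset() and %in% come from the reply.

```r
# Toy data modelled on the example in the question (values are illustrative).
df <- data.frame(x = c(1, 1, 1, 1, 2, 2, 4, 4, 4, 9, 10, 10, 10, 10, 10))
df$value <- seq_len(nrow(df))   # hypothetical second column
y <- c(2, 9, 10)

# Keep every row whose x occurs somewhere in y.
# df has no column named y, so y is taken from the calling environment.
result <- subset(df, x %in% y)
print(result)
```

If df did contain a column called y, subset() would use that column instead of the separate vector, which is the caveat mentioned above.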
Duncan Murdoch


In reply to this post by Bernhard Baumgartner
mydata <- data.frame(X = sample(1:10, 10000, replace = TRUE),
                     Y = sample(c(2, 9, 10), 10000, replace = TRUE))
# keep the rows whose X value occurs anywhere in Y
newdata <- mydata[mydata$X %in% unique(mydata$Y), ]
?"%in%"
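For comparison, a rough sketch of why the vectorised form can replace a loop. The loop below is only a guess at the kind of code described in the question (the original loop was not posted), and the separate vector y and the seed are assumptions made for the example.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
mydata <- data.frame(X = sample(1:10, 10000, replace = TRUE))
y <- c(2, 9, 10)

# Guess at a loop-based approach: collect matching row indices value by value.
keep <- integer(0)
for (v in y) {
  keep <- c(keep, which(mydata$X == v))
}
loop_result <- mydata[sort(keep), , drop = FALSE]

# Vectorised equivalent: one logical index built with %in%.
vec_result <- mydata[mydata$X %in% y, , drop = FALSE]

# Both approaches select the same rows, in the same order.
stopifnot(identical(rownames(loop_result), rownames(vec_result)))
```

The %in% version makes a single pass and avoids growing a vector inside the loop, which is usually where the time goes.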

Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894


In reply to this post by Bernhard Baumgartner
Dear Bernhard,
if I understand your question correctly,
maybe you want something like
df <- data.frame(x = sample(1:10, 100, replace = TRUE),
                 y = sample(1:5, 100, replace = TRUE))
subset(df, x %in% y)
Regards,
Stefano


In reply to this post by Bernhard Baumgartner
Sounds like you may need to use match().
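One way match() could be applied here, as a sketch on made-up data (df and y are illustrative, not taken from the original post): match(x, y) gives the position in y of each element of x, or NA where there is no match, so the non-NA entries mark the rows to keep.

```r
df <- data.frame(x = c(1, 1, 2, 4, 9, 10, 10))  # illustrative data
y <- c(2, 9, 10)

# Position of each df$x in y; NA where the value does not occur in y.
pos <- match(df$x, y)

# Rows with a non-NA position are the ones whose x is in y.
result <- df[!is.na(pos), , drop = FALSE]
print(result)
```

This selects the same rows as df$x %in% y, which is itself built on match().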


In reply to this post by Bernhard Baumgartner
Hi
something like
xx <- data.frame(x = sample(1:10, 100, replace = TRUE))
y <- c(2, 5, 8)
xx[xx$x %in% y, ]
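One detail worth noting, as a hedged aside: when the data frame has only one column, as xx does here, row indexing with [ drops the result to a plain vector by default. A sketch of keeping it a data frame with drop = FALSE (the seed is an arbitrary choice for reproducibility):

```r
set.seed(42)  # arbitrary seed
xx <- data.frame(x = sample(1:10, 100, replace = TRUE))
y <- c(2, 5, 8)

as_vector <- xx[xx$x %in% y, ]                # single column: result is a vector
as_frame  <- xx[xx$x %in% y, , drop = FALSE]  # stays a data frame

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
```

With two or more columns the default already returns a data frame, so this only matters in the single-column case.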
HTH
Petr
On 8 Feb 2006 at 15:21, Bernhard Baumgartner wrote:
From: "Bernhard Baumgartner" < [hidden email]>
Organization: Universitaet Regensburg
To: [hidden email]
Date sent: Wed, 08 Feb 2006 15:21:46 +0100
Priority: normal
Subject: [R] dataframe subset
> I have a dataframe with a column, say "x" consisting of values, each
> value appearing different times, e.g. x:
> 1,1,1,1,2,2,4,4,4,9,10,10,10,10,10 ... and a vector, including e.g.:
> y: 2,9,10,... I need a subset of the dataframe: all rows where x is
> equal to one of the values in y. Currently I use a loop for this, but
> because x and y are large this is very slow. Is there any idea how to
> solve this problem faster? Thank you, Bernhard
Petr Pikal
[hidden email]

