How do I subset a dataframe

eric
I have a data frame, zeespan. One of its columns is named "customer" and contains text. I would like to return a subset of the data frame containing all rows whose customer value does NOT begin with "ibm", "exxon", or "sears".

I tried ....  subset(zeespan, customer != c("ibm" | "exxon" | "sears") )

That didn't work and even if it did, the text would have to be an exact match where what I really want is "begins with".
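
A minimal sketch of why that attempt fails, using made-up values rather than the real zeespan data (which is not shown in the thread): `|` is a logical operator and cannot combine character strings, and `!=` against a vector compares element by element instead of testing membership in a set.

```r
# "|" works on logical/numeric values only, so combining strings raises an error.
# tryCatch() turns that error into a marker value so the script keeps running.
or_result <- tryCatch("ibm" | "exxon", error = function(e) "error")

# "!=" compares position by position (recycling the shorter vector);
# it does not ask "is this value absent from the set?"
elementwise <- c("ibm", "acme", "sears") != c("ibm", "exxon", "sears")
print(elementwise)
```

Here only the second position differs, so only the second comparison is TRUE, which is not the same thing as "not one of these three names".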

Suggestions on how to do this would be appreciated
Re: How do I subset a dataframe

Jorge I Velez
Hi eric,

See

R> ?"%in%"

and try the following (untested):

subset(zeespan, !customer %in% c("ibm" , "exxon" , "sears") )
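
As a rough illustration of what this call does, here is a sketch on a hypothetical zeespan (the column `amount` and all values are invented for the example). Note that `%in%` compares whole strings, so it only drops exact matches.

```r
# Hypothetical data; the real zeespan is not shown in the thread
zeespan <- data.frame(customer = c("ibm", "ibm global", "acme", "sears canada"),
                      amount = c(10, 20, 30, 40),
                      stringsAsFactors = FALSE)

# Drops rows whose customer is exactly "ibm", "exxon", or "sears"
exact <- subset(zeespan, !customer %in% c("ibm", "exxon", "sears"))
print(exact$customer)
```

In this toy data the row "ibm" is removed, while "ibm global" and "sears canada" stay, because neither is an exact match.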

HTH,
Jorge



On Sat, Aug 13, 2011 at 7:44 PM, eric <> wrote:

> I have a dataframe zeespan. One of the columns has the name "customer". The
> data in the customer column is text. I would like to return a subset of the
> dataframe with all rows that DON'T begin with either "ibm" or "exxon", or
> "sears" in the customer column.
>
> I tried ....  subset(zeespan, customer != c("ibm" | "exxon" | "sears") )
>
> That didn't work and even if it did, the text would have to be an exact
> match where what I really want is "begins with".
>
> Suggestions on how to do this would be appreciated
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/How-do-I-subset-a-dataframe-tp3742172p3742172.html
> Sent from the R help mailing list archive at Nabble.com.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: How do I subset a dataframe

Timothy Bates
In reply to this post by eric
Perhaps this:

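# TRUE where customer begins with one of the prefixes (^ applies to the whole group)
matches = grepl("^(ibm|sears|exxon)", zeespan$customer)
# keep only the rows that do NOT match
zee = zeespan[!matches,]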
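
A sketch of the "does not begin with" filter on a hypothetical zeespan (the values and the `amount` column are invented). It assumes a logical mask from `grepl()` and a pattern whose `^` anchor covers the whole alternation; without the parentheses, `^` binds only to the first alternative.

```r
# Hypothetical data; the real zeespan is not shown in the thread
zeespan <- data.frame(customer = c("ibm global", "acme", "sears canada",
                                   "exxon mobil", "bigsears"),
                      amount = 1:5,
                      stringsAsFactors = FALSE)

# TRUE for rows whose customer starts with any of the three prefixes
starts_with_prefix <- grepl("^(ibm|exxon|sears)", zeespan$customer)

# Negate the mask to keep everything else
zee <- zeespan[!starts_with_prefix, ]
print(zee$customer)

# Without the group, "sears" is matched anywhere in the string
unanchored <- grepl("^ibm|sears|exxon", "bigsears")
```

A logical mask is used instead of negative row indices because `zeespan[-grep(...), ]` returns zero rows when nothing matches.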

t

Re: How do I subset a dataframe

Mikhail Titov-2
In reply to this post by Jorge I Velez
Eric:

Create another column using grep and regular expression of your choice,
then subset based on that column.
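
One possible reading of this suggestion, sketched on a hypothetical zeespan; the helper column name `is_excluded`, the `amount` column, and the sample values are assumptions made for the example, not something from the thread.

```r
# Hypothetical data; the real zeespan is not shown in the thread
zeespan <- data.frame(customer = c("ibm global", "acme", "sears canada", "zenith"),
                      amount = c(10, 20, 30, 40),
                      stringsAsFactors = FALSE)

# Helper column: TRUE where customer begins with one of the prefixes
zeespan$is_excluded <- grepl("^(ibm|exxon|sears)", zeespan$customer)

# Subset on the helper column, then drop it from the result
kept <- subset(zeespan, !is_excluded, select = -is_excluded)
print(kept)
```

If the customer names vary in case, `grepl()` accepts `ignore.case = TRUE`.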

Jorge:

OP wants inexact match.

P.S. I'd use an RDBMS and SQL to pull the data of interest.

Mikhail
