find and

Ashta
Hi all,

I am trying to find the cities whose rows do not all have the same "var" value.
Within a city, var should be the same in every row; otherwise the city should
be excluded from the final data set.
Here is my sample data and my attempt. city1 and city4 should be excluded.

DF4 <- read.table(header=TRUE, text=' city  wk var
city1  1  x
city1  2  -
city1  3  x
city2  1  x
city2  2  x
city2  3  x
city2  4  x
city3  1  x
city3  2  x
city3  3  x
city3  4  x
city4  1  x
city4  2  x
city4  3  y
city4  4  y
city5  3  -
city5  4  -')

My attempt:

library(data.table)

test2 <- data.table(DF4, key="city,var")
ID1   <- test2[!duplicated(test2), ]
dps   <- ID1$city[duplicated(ID1$city)]
Ddup  <- which(test2$city %in% dps)

if (length(Ddup) != 0) {
  test2 <- test2[-Ddup, ]
}

want <- data.frame(test2)


I want to get the following result, but I am not getting it.

   city wk var
  city2  1   x
  city2  2   x
  city2  3   x
  city2  4   x
  city3  1   x
  city3  2   x
  city3  3   x
  city3  4   x
  city5  3   -
  city5  4   -

Can someone help me find out what the problem is?

Thank you.
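
A note on the attempt above, offered as an assumption because the post does not say which data.table version was used: since data.table 1.9.8, duplicated() on a data.table compares all columns by default instead of only the key columns. Every (city, wk, var) row in the sample is unique, so ID1 would keep all rows, every city would land in dps, and test2 would end up empty. The sketch below rebuilds the sample data compactly and changes only one thing, passing by = explicitly:

```r
library(data.table)

# Sample data from the post, rebuilt compactly (same 17 rows).
DF4 <- data.frame(
  city = rep(paste0("city", 1:5), times = c(3, 4, 4, 4, 2)),
  wk   = c(1:3, 1:4, 1:4, 1:4, 3:4),
  var  = c("x", "-", "x", rep("x", 8), "x", "x", "y", "y", "-", "-")
)

test2 <- data.table(DF4, key = "city,var")

# One row per distinct (city, var) pair. `by` is spelled out so the result
# does not depend on the version-specific default of duplicated().
ID1 <- test2[!duplicated(test2, by = c("city", "var"))]

# A city that still appears more than once has more than one var value.
dps <- unique(ID1$city[duplicated(ID1$city)])

# Drop every row belonging to those cities.
want <- as.data.frame(test2[!(city %in% dps)])
want
```

With the sample data this should leave only the rows for city2, city3 and city5.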

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: find and

Rui Barradas
Hello,

I believe this does it.


sp <- split(DF4, DF4$city)
want <- do.call(rbind, lapply(sp, function(x)
                if(length(unique(x$var)) == 1) x else NULL))
rownames(want) <- NULL
want


Hope this helps,

Rui Barradas

On 18-03-2017 13:51, Ashta wrote:

> [...]

Re: find and

Ulrik Stervbo-2
Using dplyr:

library(dplyr)

# Counting unique
DF4 %>%
  group_by(city) %>%
  filter(length(unique(var)) == 1)

# Counting not duplicated
DF4 %>%
  group_by(city) %>%
  filter(sum(!duplicated(var)) == 1)
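
The same per-group test can also be written with dplyr's n_distinct(), which counts distinct values directly. This is only a sketch of an equivalent spelling, not a claim that it is faster; the compact DF4 below is a rebuild of the sample data from the original post.

```r
library(dplyr)

# Sample data from the original post, rebuilt compactly (same 17 rows).
DF4 <- data.frame(
  city = rep(paste0("city", 1:5), times = c(3, 4, 4, 4, 2)),
  wk   = c(1:3, 1:4, 1:4, 1:4, 3:4),
  var  = c("x", "-", "x", rep("x", 8), "x", "x", "y", "y", "-", "-")
)

# Keep only the cities whose var takes exactly one distinct value.
want <- DF4 %>%
  group_by(city) %>%
  filter(n_distinct(var) == 1) %>%
  ungroup()
want
```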

HTH
Ulrik


On Sat, 18 Mar 2017 at 15:17 Rui Barradas <[hidden email]> wrote:

> [...]

Re: find and

Ashta
Thank you, Rui and Ulrik.

Rui, your option worked for the small data set, but when I applied it to
the big data set it ran for a long time without finishing and I had to
kill it. I don't know why.
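
One plausible reason, which the thread does not confirm, is that split() followed by do.call(rbind, ...) builds and then stitches together one data frame per city, and that can get slow when there are many cities. A base R sketch that avoids the per-city data frames counts distinct values with tapply() and then subsets once; the names n_var and keep_cities are made up for this illustration, and DF4 is a compact rebuild of the sample data.

```r
# Sample data from the original post, rebuilt compactly (same 17 rows).
DF4 <- data.frame(
  city = rep(paste0("city", 1:5), times = c(3, 4, 4, 4, 2)),
  wk   = c(1:3, 1:4, 1:4, 1:4, 3:4),
  var  = c("x", "-", "x", rep("x", 8), "x", "x", "y", "y", "-", "-")
)

# Number of distinct var values per city, named by city.
n_var <- tapply(DF4$var, DF4$city, function(x) length(unique(x)))

# Cities with exactly one distinct value, then a single vectorised subset.
keep_cities <- names(n_var)[n_var == 1]
want <- DF4[DF4$city %in% keep_cities, ]
rownames(want) <- NULL
want
```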


Ulrik's option worked fine for the big data set (> 1.5M records)
and took less than 2 minutes.

These two give me the same results:

# Counting unique
DF4 %>% group_by(city) %>% filter(length(unique(var)) == 1)
# Counting not duplicated
DF4 %>% group_by(city) %>% filter(sum(!duplicated(var)) == 1)

Thank you again.


On Sat, Mar 18, 2017 at 10:40 AM, Ulrik Stervbo <[hidden email]> wrote:

> [...]

Re: find and

Bert Gunter-2
In reply to this post by Rui Barradas
Here is a version similar to Rui's, but using ave() and logical
indexing to simplify a bit:


> DF4[with(DF4, ave(as.numeric(var), city, FUN = function(x)length(unique(x))) ==1), ]

    city wk var
4  city2  1   x
5  city2  2   x
6  city2  3   x
7  city2  4   x
8  city3  1   x
9  city3  2   x
10 city3  3   x
11 city3  4   x
16 city5  3   -
17 city5  4   -
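
A caveat for anyone running this today: as.numeric(var) works here only because var was read in as a factor. From R 4.0.0 on, read.table() defaults to stringsAsFactors = FALSE, so var would be character, as.numeric() would return NA for every row (with a warning), and every city would then look constant. A sketch that should behave the same either way converts through factor() first; DF4 below is a compact rebuild of the sample data and n_var is a name made up for this illustration.

```r
# Sample data from the original post, rebuilt compactly (same 17 rows).
DF4 <- data.frame(
  city = rep(paste0("city", 1:5), times = c(3, 4, 4, 4, 2)),
  wk   = c(1:3, 1:4, 1:4, 1:4, 3:4),
  var  = c("x", "-", "x", rep("x", 8), "x", "x", "y", "y", "-", "-")
)

# Integer codes for var, whether it is stored as character or as a factor.
codes <- as.integer(factor(DF4$var))

# For every row, the number of distinct var values within that row's city.
n_var <- ave(codes, DF4$city, FUN = function(x) length(unique(x)))

# Logical indexing keeps the original row names, as in the output above.
want <- DF4[n_var == 1, ]
want
```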


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Mar 18, 2017 at 7:15 AM, Rui Barradas <[hidden email]> wrote:

> [...]