how to simplify a data.frame and add the counts of duplicate rows as a new column

Simone Gabbriellini-3
Hello List,

I would like to simplify a data.frame like this

columnA columnB
user10 proj12
user10 proj19
user10 proj12

into something like:

columnA columnB columnC
user10 proj12 2
user10 proj19 1

I know unique() can simplify the data.frame, but how do I count and store the duplicates?

Thanks in advance for any help.

Best regards,
Simone

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: how to simplify a data.frame and add the counts of duplicate rows as a new column

Scott Chamberlain-3
See the plyr package, especially the function ddply(). For example, in your case:

library(plyr)

# Group by both columns and count the rows in each group
ddply(dataframe, .(columnA, columnB), summarise,
      columnC = length(columnB))
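
For a version that needs no extra package, a minimal sketch using base R's aggregate() on the example data from the question might look like the following. The names df and counts and the helper column of ones are assumptions made for illustration; this is an alternative to the ddply() call, not the only way to do it.

```r
# Example data copied from the question
df <- data.frame(
  columnA = c("user10", "user10", "user10"),
  columnB = c("proj12", "proj19", "proj12"),
  stringsAsFactors = FALSE
)

# Add a helper column of ones, then sum it within each
# (columnA, columnB) group to get the number of duplicate rows
df$columnC <- 1L
counts <- aggregate(columnC ~ columnA + columnB, data = df, FUN = sum)

print(counts)
```

Here counts has one row per distinct (columnA, columnB) pair, with columnC holding how many times that pair occurred in df.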

Scott
On Wednesday, March 2, 2011 at 9:10 AM, Simone Gabbriellini wrote:

> I know unique() can simplify the data.frame, but how do I count and store the duplicates?

Re: how to simplify a data.frame and add the counts of duplicate rows as a new column

Simone Gabbriellini-3
Many thanks, this is really a great solution!

Best,
Simone

On 02 Mar 2011, at 16:22, Scott Chamberlain wrote:

> See the plyr package, especially the function ddply(). For example, in your case:
>
> ddply(dataframe, .(columnA, columnB), summarise,
> columnC = length(columnB)
> )