how to create data.frames from vectors with duplicates


zhenjiang xu
Hi R users,

suppose I have two vectors,
 > x=c(1,2,3,4,5)
 > y=c('a','b','c','a','c')
How can I get a data.frame like this?
> xy
      count
a     5
b     2
c     8

I know a few ways to accomplish this. However, I have a huge number
of calculations of this kind, so I'd like an efficient solution. Thanks.

--
Best,
Zhenjiang

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: how to create data.frames from vectors with duplicates

Jorge I Velez
Hi Zhenjiang,

Try

table(unlist(mapply(function(x, y) rep(x, y), y, x)))

HTH,
Jorge



Re: how to create data.frames from vectors with duplicates

Henrique Dallazuanna
In reply to this post by zhenjiang xu
Try this:

rowsum(x, y)
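
For reference, rowsum() returns a one-column matrix rather than a data.frame. A minimal sketch of getting to the layout asked for in the question, assuming the column should be called "count" as in the original post (the object names rs and xy are only illustrative):

```r
# Example vectors from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# rowsum() sums x within each group of y and returns a one-column
# matrix whose row names are the (sorted) group labels
rs <- rowsum(x, y)

# Wrap the sums in a data.frame with a 'count' column and group row names
xy <- data.frame(count = rs[, 1], row.names = rownames(rs))
print(xy)
```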




--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: how to create data.frames from vectors with duplicates

Bert Gunter
In reply to this post by Jorge I Velez
Inline below:

On Wed, Aug 31, 2011 at 9:50 AM, Jorge I Velez <[hidden email]> wrote:
> Hi Zhenjiang,
>
> Try
>
> table(unlist(mapply(function(x, y) rep(x, y), y, x)))

Yikes! How about simply tapply(x,y,sum) ??
?tapply

-- Bert
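
A small sketch of the tapply() suggestion on the example data. One thing that may matter in practice is that tapply() accepts any summary function, not only sum; the extra "mean" column below is an illustration and was not part of the question:

```r
# Example vectors from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# tapply() applies a function to x within each group of y and
# returns a 1-d array named by group
sums  <- tapply(x, y, sum)
means <- tapply(x, y, mean)   # same call shape, different summary

# Assemble a data.frame with the groups as row names
xy <- data.frame(count = as.vector(sums),
                 mean  = as.vector(means),
                 row.names = names(sums))
print(xy)
```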


Re: how to create data.frames from vectors with duplicates

Marc Schwartz-3
In reply to this post by zhenjiang xu

See ?rep and ?as.data.frame.table

Try this:

> data.frame(table(rep(y, x)))
  Var1 Freq
1    a    5
2    b    2
3    c    8
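
A short sketch building on the rep()/table() approach, with two points that may be worth knowing. The column names can be set directly instead of Var1/Freq, and the approach assumes x holds non-negative whole numbers, since rep() truncates fractional repeat counts. The fractional vector x_frac is made up for illustration:

```r
# Example vectors from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# Name the grouping column 'y' and the frequency column 'count'
xy <- as.data.frame(table(y = rep(y, x)), responseName = "count")
print(xy)

# Caveat: rep() truncates non-integer 'times', so fractional values are lost
x_frac <- c(1.5, 2, 3, 4, 5)                              # hypothetical weights
via_rep    <- as.numeric(table(rep(y, x_frac))["a"])      # counts 1 + 4
via_tapply <- as.numeric(tapply(x_frac, y, sum)["a"])     # sums 1.5 + 4
print(c(via_rep = via_rep, via_tapply = via_tapply))
```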

HTH,

Marc Schwartz


Re: how to create data.frames from vectors with duplicates

Jorge I Velez
In reply to this post by Jorge I Velez
Also

tapply(x, y, sum)

HTH,
Jorge



Re: how to create data.frames from vectors with duplicates

Bert Gunter
In reply to this post by Henrique Dallazuanna
For the record, Henrique's use of rowsum() is about 10 times faster
than using tapply() (and presumably anything with table()) on my
computer.  It calls a C primitive.

-- Bert


Re: how to create data.frames from vectors with duplicates

William Dunlap
I'll put in a plug for vapply().

  > # 100,000 numbers in 17576 groups:
  > y <- rep(do.call(paste, c(list(sep=""), expand.grid(LETTERS,letters,letters))), length=1e5)
  > x <- seq_along(y)^2
  > system.time(val.vapply <- vapply(split(x, y), FUN=sum, FUN.VALUE=0))
     user  system elapsed
     0.18    0.02    0.20
  > system.time(val.rowsum <- rowsum(x, y))
     user  system elapsed
     0.14    0.00    0.15
  > system.time(val.tapply <- tapply(x, y, sum))
     user  system elapsed
     0.40    0.00    0.41
  > all(val.vapply==val.rowsum)
  [1] TRUE
  > all(val.vapply==val.tapply)
  [1] TRUE

S+ has fast functions groupSums, groupProds, etc. (one for
each of the standard summary functions) to deal with this
sort of thing.
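
The split() plus vapply() pattern from the timings above, sketched on the small example from the question. FUN.VALUE declares the type and length each group result must have, so the same pattern extends to summaries that return more than one number; the range() call is only an illustration:

```r
# Example vectors from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# split() produces a list with one numeric vector per group
groups <- split(x, y)

# One number per group: FUN.VALUE = numeric(1) enforces that shape
sums <- vapply(groups, FUN = sum, FUN.VALUE = numeric(1))

# Two numbers per group (min and max): the result is a 2 x n_groups matrix
ranges <- vapply(groups, FUN = range, FUN.VALUE = numeric(2))

print(sums)
print(ranges)
```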

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


Re: how to create data.frames from vectors with duplicates

zhenjiang xu
In reply to this post by Bert Gunter
Thanks for all your replies. I am using rowsum() and it looks efficient. I
hope to run some benchmarks in the near future and report back.
Or are any benchmark results already available?




--
Best,
Zhenjiang


Re: how to create data.frames from vectors with duplicates

djmuseR
Hi:

Here are a few informal timings on my machine with the following
example. The data.table package is worth investigating, particularly
in problems where its advantages can scale with size.

library(data.table)
dt <- data.table(x = sample(1:50, 1000000, replace = TRUE),
                  y = sample(letters[1:26], 1000000, replace = TRUE),
                  key = 'y')
system.time(dt[, list(count = sum(x)), by = 'y'])
   user  system elapsed
   0.02    0.00    0.02

# Data tables are also data frames, so we can use them as such:

system.time(with(dt, tapply(x, y, sum)))
   user  system elapsed
   0.39    0.00    0.39
system.time(with(dt, rowsum(x, y)))
   user  system elapsed
   0.04    0.00    0.03
system.time(aggregate(x ~ y, data = dt, FUN = sum))
   user  system elapsed
   1.87    0.00    1.87

So rowsum() is good, but data.table is a little better for this task.
Increasing the problem size favors both data.table and rowsum(), while
tapply() falls further behind (roughly 10x the rowsum() time in the
first example, 20x in the second). The ratio of rowsum() to data.table
time stays about the same (roughly 2x).

# 10M observations, 1000 groups
> dt <- data.table(x = sample(1:100, 10000000, replace = TRUE),
+                  y = sample(1:1000, 10000000, replace = TRUE),
+                  key = 'y')
> system.time(dt[, list(count = sum(x)), by = 'y'])
   user  system elapsed
   0.16    0.03    0.18
> system.time(with(dt, rowsum(x, y)))
   user  system elapsed
   0.36    0.04    0.40
> system.time(with(dt, tapply(x, y, sum)))
   user  system elapsed
   8.77    0.33    9.11
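
For anyone who wants to repeat a comparison like this on their own machine, here is a rough base-R-only sketch. The problem size, the seed, and the object names are arbitrary choices for illustration, and timings will differ between machines and R versions, so no expected numbers are shown:

```r
set.seed(1)                      # arbitrary seed, for reproducible data only
n <- 1e5                         # smaller than the runs above, to finish quickly
x <- sample(1:50, n, replace = TRUE)
y <- sample(letters, n, replace = TRUE)
d <- data.frame(x = x, y = y)

# Time three base R ways of computing the per-group sums
t_rowsum    <- system.time(r1 <- rowsum(x, y))
t_tapply    <- system.time(r2 <- tapply(x, y, sum))
t_aggregate <- system.time(r3 <- aggregate(x ~ y, data = d, FUN = sum))

# The three results should agree before the timings are compared
stopifnot(all(as.vector(r1) == as.vector(r2)),
          all(as.vector(r1) == r3$x))

# Elapsed seconds for each approach
print(c(rowsum    = t_rowsum[["elapsed"]],
        tapply    = t_tapply[["elapsed"]],
        aggregate = t_aggregate[["elapsed"]]))
```

A single system.time() call is noisy at this size, so repeating each timing a few times gives a fairer picture.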

HTH,
Dennis


Re: how to create data.frames from vectors with duplicates

zhenjiang xu
Thanks for benchmarking them. data.table is indeed worth looking at.




--
Best,
Zhenjiang

Re: how to create data.frames from vectors with duplicates

Peter Dalgaard-2
In reply to this post by zhenjiang xu

On Sep 8, 2011, at 03:18 , zhenjiang xu wrote:

> Thanks for all your replies. I am using rowsum() and it looks efficient. I
> hope I could do some benchmark sometime in near future and let people know.
> Or is there any benchmark result available?

I'm a bit surprised that no-one thought of xtabs(x ~ y)

It won't win the speed competition, though.
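
A minimal sketch of the xtabs() route on the example data. The conversion to a data.frame and the "count" column name are additions for illustration, following the layout in the original question:

```r
# Example vectors from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# With a left-hand side, xtabs() sums x within each level of y
tab <- xtabs(x ~ y)

# as.data.frame() on a table takes responseName for the value column
xy <- as.data.frame(tab, responseName = "count")
print(xy)
```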

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]
