summarize a vector

summarize a vector

Sam Steingold
I have a long numeric vector v (length N) and I want to create a shorter
vector of length N/k (rounded up) consisting of sums of consecutive
k-element subsequences of v:

v <- c(1,2,3,4,5,6,7,8,9,10)

N=10, k=3
===> [6,15,24,10]

I can, of course, iterate:

> w <- vector(mode="numeric",length=ceiling(N/k))
> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))

(modulo boundary conditions)
but I wonder if there is a better way.

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://think-israel.org http://thereligionofpeace.com
http://dhimmi.com http://truepeace.org http://www.PetitionOnline.com/tap12009/
Type louder, please.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: summarize a vector

David Winsemius

On Aug 10, 2012, at 12:20 PM, Sam Steingold wrote:

> I have a long numeric vector v (length N) and I want to create a shorter
> vector of length N/k consisting of sums of k-subsequences of v:
>
> v <- c(1,2,3,4,5,6,7,8,9,10)
>
> N=10, k=3
> ===> [6,15,24,10]
>
> I can, of course, iterate:
>
>> w <- vector(mode="numeric",length=ceiling(N/k))
>> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))
>
> (modulo boundary conditions)
> but I wonder if there is a better way.

Well, using v with parentheses instead of square-brackets might not be  
the right way, since v is not a function.
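For reference, the loop in the question has a second problem besides the parentheses: in R, ':' binds more tightly than '*', so i*k:(i+1)*k is parsed as i*(k:(i+1))*k rather than as a range of indices. A minimal corrected sketch follows, assuming 1-based blocks and a final block that may be shorter than k; the name block_sums_loop is only illustrative and does not come from the thread.

```r
# Hypothetical helper: sum consecutive blocks of k elements with an explicit loop.
block_sums_loop <- function(v, k) {
  n <- length(v)
  w <- numeric(ceiling(n / k))
  for (i in seq_along(w)) {
    first <- (i - 1) * k + 1      # computed separately, so ':' precedence cannot bite
    last  <- min(i * k, n)        # the final block may be shorter than k
    w[i] <- sum(v[first:last])    # square brackets index; v(...) would be a function call
  }
  w
}

block_sums_loop(1:10, 3)
# 6 15 24 10
```

Using seq_along(w) instead of 1:length(w) also keeps the loop from running when v is empty.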

Consider this alternate (no need to pre-allocate 'w'):

 > w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
 > w
  1  2  3  4
  6 15 24 10
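The work here is done by the grouping vector passed as the second argument to tapply(). A small sketch that prints it on its own may make the one-liner easier to read; the variable name groups is an addition for illustration, and length.out is the full name of the argument abbreviated as len above.

```r
v <- 1:10
N <- length(v)
k <- 3

# 1:(N/k + 1) is 1:4.33..., which ':' stops at 4, giving 1, 2, 3, 4
groups <- rep(1:(N / k + 1), each = k, length.out = N)
groups
# 1 1 1 2 2 2 3 3 3 4

w <- tapply(v, groups, sum)
as.vector(w)   # drop the group labels if a plain vector is wanted
# 6 15 24 10
```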

--

David Winsemius, MD
Alameda, CA, USA


Re: summarize a vector

Bert Gunter
... or perhaps even simpler:

> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
> sz(1:10,3)
 0  1  2  3
 6 15 24 10

Note that this works for k>n, where the previous solution does not.
> sz(1:10,15)
 0
55
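To see why this needs neither N nor rep(), it can help to look at the index on its own: integer division of the zero-based position by k gives the block number directly. The sketch below only unpacks the one-liner; idx is an illustrative name.

```r
x <- 1:10
k <- 3

# Zero-based position divided by k (integer division) is the block number.
idx <- (seq_along(x) - 1) %/% k
idx
# 0 0 0 1 1 1 2 2 2 3

sz <- function(x, k) tapply(x, (seq_along(x) - 1) %/% k, sum)
as.vector(sz(x, k))
# 6 15 24 10
```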

-- Bert

--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: summarize a vector

Sam Steingold
Thanks David & Bert.
It turned out that what I actually wanted was much simpler.
My vector's elements are 0 and 1, and the right way to "summarize" it is
hist(which(v==1))
However, your replies were quite educational!
Thanks again,
Sam.
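For a 0/1 vector the two approaches are closely related: the bin counts of hist(which(v==1)) are block sums when the bin edges fall on multiples of k. A rough sketch under that assumption follows; the data, the seed and the explicit breaks are made up for illustration, since the post above relies on hist()'s default breaks.

```r
set.seed(1)                             # arbitrary seed, only to make the sketch repeatable
v <- rbinom(20, size = 1, prob = 0.4)   # made-up 0/1 data
k <- 5

# Bin edges 0, k, 2k, ...: bins are (0, k], (k, 2k], ... so positions 1..k fall in bin 1.
breaks <- seq(0, ceiling(length(v) / k) * k, by = k)
counts <- hist(which(v == 1), breaks = breaks, plot = FALSE)$counts

block_sums <- as.vector(tapply(v, (seq_along(v) - 1) %/% k, sum))
all(counts == block_sums)
# TRUE
```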

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://camera.org http://iris.org.il
http://palestinefacts.org http://dhimmi.com http://truepeace.org
At war time "salt of the earth" becomes "cannon fodder".


Re: summarize a vector

David Winsemius
In reply to this post by Bert Gunter

On Aug 10, 2012, at 12:57 PM, Bert Gunter wrote:

> ... or perhaps even simpler:
>
>> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
>> sz(1:10,3)
> 0  1  2  3
> 6 15 24 10
>
> Note that this works for k>n, where the previous solution does not.
>> sz(1:10,15)
> 0
> 55

I agree that it is more elegant, but I do not get an error or an  
unexpected result with my method.

 > N=10
 > k=15
 > w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
 > w
  1
55

A different label but the same result. I'm protected from the typical  
1:0 problem that seq_along solves by including +1 in the second  
argument to ":"/seq(). Unless, of course, you set N to a negative  
number, but that wouldn't make much sense would it, and you get an  
error from rep() anyway.
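The "1:0 problem" mentioned here is that ':' counts downwards when its right-hand side is smaller than its left-hand side, instead of returning an empty sequence. A short illustration (the empty vector x is a made-up example):

```r
x <- numeric(0)

1:length(x)        # 1 0 : counts down rather than giving an empty sequence
seq_along(x)       # integer(0)

N <- 10; k <- 15
1:(N / k + 1)      # 1 : the "+ 1" keeps the upper end at or above 1
```

When N is an exact multiple of k the "+ 1" yields one group label more than needed, which is harmless because the length argument of rep() truncates the result to N elements.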

Best;
David.


David Winsemius, MD
Alameda, CA, USA


Re: summarize a vector

Bert Gunter
Oh yes, I stand corrected. I didn't look at your code carefully enough.

-- Bert


Re: summarize a vector

Michael Weylandt
In reply to this post by David Winsemius
I wouldn't be surprised if one could get an *apply-free solution by using diff(), cumsum() and selective indexing as well.
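One way this idea might be spelled out, as a sketch rather than a tested recipe: take the running total, pick it out at the end of each block, and difference those values. The function name block_sums_cumsum and the conversion to double (to sidestep integer overflow in cumsum() on long integer vectors) are assumptions added here.

```r
# Hypothetical helper: block sums from differences of the running total.
block_sums_cumsum <- function(v, k) {
  n <- length(v)
  ends <- seq_len(n %/% k) * k          # last position of every complete block
  if (n %% k != 0) ends <- c(ends, n)   # a shorter final block ends at n
  cs <- cumsum(as.numeric(v))           # running total, as double
  diff(c(0, cs[ends]))                  # total at block end minus total at previous end
}

block_sums_cumsum(1:10, 3)
# 6 15 24 10
```

For floating-point data, differencing large running totals can lose a little precision compared with summing each block directly.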

Cheers,
Michael


Re: summarize a vector

David Winsemius

On Aug 10, 2012, at 3:42 PM, Michael Weylandt wrote:

> I wouldn't be surprised if one couldn't get an *apply-free solution  
> by using diff(), cumsum() and selective indexing as well.

What about colSums on a matrix extended with the right number of zeros?

 > colSums(matrix (c(v, rep(0, 3- length(v)%%3) ) , nrow=3) )
[1]  6 15 24 10
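A version of the same idea for a general k might look like the sketch below. One detail is changed on purpose: the padding 3 - length(v)%%3 evaluates to 3 when the length is already a multiple of 3, which appends a whole extra column of zeros, so the sketch uses (-length(v)) %% k instead. The name block_sums_matrix is illustrative.

```r
# Hypothetical helper: pad with zeros to a multiple of k, reshape, sum the columns.
block_sums_matrix <- function(v, k) {
  pad <- (-length(v)) %% k            # 0 when length(v) is already a multiple of k
  colSums(matrix(c(v, rep(0, pad)), nrow = k))
}

block_sums_matrix(1:10, 3)   # 6 15 24 10
block_sums_matrix(1:9, 3)    # 6 15 24, with no trailing 0
```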

(My experience is that tapply is generally fairly fast anyway, much  
faster than apply.data.frame. So I do not lump all *apply solutions in  
the same efficiency category.)

--
David.


David Winsemius, MD
Alameda, CA, USA


Re: summarize a vector

Bert Gunter
Certainly ... but this is of course limited to the few C coded
functions available. Back to apply-type stuff for, say, median as a
summary statistic.
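For a statistic such as the median, where zero-padding would distort the last block, the tapply() approach from earlier in the thread carries over with the summary function as a parameter. A minimal sketch; block_apply and its argument names are made up for illustration.

```r
# Hypothetical helper: apply any summary function to consecutive blocks of k elements.
block_apply <- function(x, k, FUN, ...) {
  grp <- (seq_along(x) - 1) %/% k
  as.vector(tapply(x, grp, FUN, ...))
}

block_apply(1:10, 3, median)   # 2 5 8 10
block_apply(1:10, 3, sum)      # 6 15 24 10
```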

-- Bert


Re: summarize a vector

arun kirshna
In reply to this post by David Winsemius
Hi,

Same result, with data.frame:
dat1<-data.frame(V1=v[1:3],V2=v[4:6],V3=v[7:9],V4=c(v[10],rep(0,2)))
sapply(dat1,cumsum)[3,]
V1 V2 V3 V4
 6 15 24 10
 sapply(dat1,sum)
V1 V2 V3 V4
 6 15 24 10
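The data.frame above spells out the column boundaries by hand for k = 3. For an arbitrary k, one possible sketch builds the blocks with split(), which also avoids the zero padding; the name blocks is illustrative.

```r
v <- 1:10
k <- 3

# ceiling(position / k) labels the blocks 1, 2, 3, ...; the last block holds only 10
blocks <- split(v, ceiling(seq_along(v) / k))
sapply(blocks, sum)
#  1  2  3  4
#  6 15 24 10
```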
A.K.



