|
I have a long numeric vector v (length N) and I want to create a shorter
vector of length N/k consisting of sums of k-subsequences of v:

v <- c(1,2,3,4,5,6,7,8,9,10)

N=10, k=3
===> [6,15,24,10]

I can, of course, iterate:

> w <- vector(mode="numeric",length=ceiling(N/k))
> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))

(modulo boundary conditions)
but I wonder if there is a better way.

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
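For reference, a minimal corrected sketch of the loop in the post. It assumes square-bracket indexing, 1-based blocks, and a final block clipped at N; the helper names `first` and `last` are illustrative and not from the post. The explicit arithmetic avoids a precedence trap: in R, `:` binds tighter than `*`, so `i*k:(i+1)*k` does not mean `(i*k):((i+1)*k)`.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
N <- length(v)
k <- 3

# Pre-allocate one slot per block; the last block may be shorter than k.
w <- numeric(ceiling(N / k))
for (i in seq_along(w)) {
  first <- (i - 1) * k + 1   # first index of block i
  last  <- min(i * k, N)     # clip the final block at N
  w[i]  <- sum(v[first:last])
}
w
```

With the example vector this gives the block sums 6, 15, 24 and 10 requested in the post.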
|
On Aug 10, 2012, at 12:20 PM, Sam Steingold wrote:

> I can, of course, iterate:
>
>> w <- vector(mode="numeric",length=ceiling(N/k))
>> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))
>
> (modulo boundary conditions)
> but I wonder if there is a better way.

Well, using v with parentheses instead of square brackets might not be the
right way, since v is not a function.

Consider this alternative (no need to pre-allocate 'w'):

> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
> w
 1  2  3  4
 6 15 24 10

--
David Winsemius, MD
Alameda, CA, USA
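To see why the tapply() call works, it may help to look at the grouping vector on its own. The sketch below spells out `length.out` (the post abbreviates it as `len`, relying on partial argument matching) and stores the labels in a variable `g`, which is a name introduced here for illustration only.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
N <- length(v)
k <- 3

# Block labels: each label is repeated k times, then the result is
# truncated to N elements. 1:(N/k + 1) is 1:4.33, i.e. 1, 2, 3, 4.
g <- rep(1:(N / k + 1), each = k, length.out = N)
g

# tapply() applies sum() to the elements of v that share a label.
w <- tapply(v, g, sum)
w
```

The labels come out as 1 1 1 2 2 2 3 3 3 4, so the last group holds only the tenth element.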
|
... or perhaps even simpler:

> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
> sz(1:10,3)
 0  1  2  3
 6 15 24 10

Note that this works for k > N, where the previous solution does not.

> sz(1:10,15)
 0
55

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
|
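The integer-division index inside `sz` above can be inspected directly. This is a small sketch using the same definition; the only addition is as.vector(), used here on the assumption that a plain unnamed vector is wanted rather than the named one-dimensional array that tapply() returns.

```r
sz <- function(x, k) tapply(x, (seq_along(x) - 1) %/% k, sum)

# Zero-based positions divided by k give the block number of each element.
idx <- (seq_along(1:10) - 1) %/% 3
idx

# as.vector() drops the names and array attributes added by tapply().
as.vector(sz(1:10, 3))
as.vector(sz(1:10, 15))
```

The index is 0 0 0 1 1 1 2 2 2 3, which is why the result labels start at 0 rather than 1.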
Thanks David & Bert.

It turned out that what I actually wanted was much simpler.
My vector's elements are 0 and 1, and the right way to "summarize" it is

hist(which(v==1))

However, your replies were quite educational!

Thanks again,
Sam.
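For a 0/1 vector, the histogram of the positions of the ones and the block sums carry the same counts when the bin edges are aligned with the blocks. The sketch below is an illustration under assumptions not in the thread: the data are simulated, k = 5 is arbitrary, and the `breaks` vector is chosen by hand so that the right-closed bins (0,k], (k,2k], ... match blocks 1..k, k+1..2k, and so on.

```r
set.seed(1)                              # illustrative data, not from the thread
v <- rbinom(20, size = 1, prob = 0.4)
k <- 5

# Bin edges at multiples of k, extended to cover the whole vector.
breaks <- seq(0, ceiling(length(v) / k) * k, by = k)
h <- hist(which(v == 1), breaks = breaks, plot = FALSE)

via_hist   <- h$counts                                           # ones per bin
via_tapply <- as.vector(tapply(v, (seq_along(v) - 1) %/% k, sum)) # ones per block
all(via_hist == via_tapply)
```

Calling hist() without `plot = FALSE` draws the picture instead of only returning the counts.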
|
In reply to this post by Bert Gunter
On Aug 10, 2012, at 12:57 PM, Bert Gunter wrote:

> ... or perhaps even simpler:
>
>> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
>
> Note that this works for k>n, where the previous solution does not.
>> sz(1:10,15)
>  0
> 55

I agree that it is more elegant, but I do not get an error or an unexpected
result with my method.

> N=10
> k=15
> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
> w
 1
55

A different label but the same result. I'm protected from the typical 1:0
problem that seq_along solves by including +1 in the second argument to
":"/seq(). Unless, of course, you set N to a negative number, but that
wouldn't make much sense, would it, and you get an error from rep() anyway.

Best;
David.

David Winsemius, MD
Alameda, CA, USA
|
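The "1:0 problem" mentioned in the message above can be shown in a few lines. This is a generic sketch of the pitfall, with `n`, `down` and `upper` as illustrative names; it is not code from the thread.

```r
n <- 0

down  <- 1:n                   # counts down instead of being empty: 1 0
empty <- seq_len(n)            # integer(0), the usual safe replacement
also  <- seq_along(numeric(n)) # integer(0) as well

# With the "+ 1" in the upper bound, the sequence never counts down
# for N >= 0 and k > 0: here 1:(10/15 + 1) is 1:1.67, which is just 1.
N <- 10
k <- 15
upper <- 1:(N / k + 1)
upper
```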
Oh yes, I stand corrected. I didn't look at your code carefully enough.
-- Bert
|
In reply to this post by David Winsemius
I wouldn't be surprised if one could get an *apply-free solution by using
diff(), cumsum() and selective indexing as well.

Cheers,
Michael
|
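One way the diff()/cumsum() idea suggested above might look is sketched below. The function name `block_sums` and the variable `ends` are made up for this illustration. It takes the running total at the last index of each block and differences those totals. Note that a running total can overflow for large integer input or lose precision for long floating-point vectors, so this is not a drop-in replacement in every case.

```r
block_sums <- function(x, k) {
  n <- length(x)
  # Last index of each block; the final block ends at n even if it is short.
  ends <- pmin(seq_len(ceiling(n / k)) * k, n)
  # Running total at each block end; differencing recovers per-block sums.
  diff(c(0, cumsum(x)[ends]))
}

block_sums(1:10, 3)
block_sums(1:10, 15)
```

For 1:10 and k = 3 the block ends are 3, 6, 9, 10 and the running totals there are 6, 21, 45, 55, so the differences are 6, 15, 24, 10.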
On Aug 10, 2012, at 3:42 PM, Michael Weylandt wrote:

> I wouldn't be surprised if one couldn't get an *apply-free solution
> by using diff(), cumsum() and selective indexing as well.

What about colSums on a matrix extended with the right number of zeros?

> colSums(matrix (c(v, rep(0, 3- length(v)%%3) ) , nrow=3) )
[1]  6 15 24 10

(My experience is that tapply is generally fairly fast anyway, much faster
than apply.data.frame. So I do not lump all *apply solutions in the same
efficiency category.)

--
David.

David Winsemius, MD
Alameda, CA, USA
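A generalized sketch of the colSums() idea for arbitrary k follows; the function name `block_sums_mat` and the variable `pad` are illustrative. One detail differs from the one-liner in the message: the padding is computed as `(-length(x)) %% k`, which is 0 when the length is already a multiple of k. With `k - length(x) %% k`, a vector whose length is an exact multiple of k would be padded with k zeros and gain a trailing block sum of 0.

```r
block_sums_mat <- function(x, k) {
  pad <- (-length(x)) %% k   # zeros needed to reach a multiple of k (may be 0)
  # Each column of the k-row matrix is one block of k consecutive elements.
  colSums(matrix(c(x, rep(0, pad)), nrow = k))
}

block_sums_mat(1:10, 3)
block_sums_mat(1:9, 3)
```

The second call returns three sums, with no spurious fourth column.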
|
Certainly ... but this is of course limited to the few C-coded functions
available. Back to apply-type stuff for, say, median as a summary statistic.

-- Bert
|
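To illustrate the point above about summary statistics other than the sum, here is a small sketch that passes the statistic in as an argument. The name `block_stat` and its parameter `f` are hypothetical; the grouping index is the integer-division one from earlier in the thread.

```r
block_stat <- function(x, k, f) {
  # Group consecutive runs of k elements and apply f to each group.
  as.vector(tapply(x, (seq_along(x) - 1) %/% k, f))
}

v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
block_stat(v, 3, median)
block_stat(v, 3, max)
```

Zero-padding, as in the colSums() approach, would distort a median or a maximum of the short final block, which is one reason the grouping-based form generalizes more safely.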
In reply to this post by David Winsemius
Hi,

Same result, with data.frame:

dat1<-data.frame(V1=v[1:3],V2=v[4:6],V3=v[7:9],V4=c(v[10],rep(0,2)))
sapply(dat1,cumsum)[3,]
V1 V2 V3 V4
 6 15 24 10
sapply(dat1,sum)
V1 V2 V3 V4
 6 15 24 10

A.K.
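The data.frame above hard-codes the block boundaries for N = 10 and k = 3. A sketch that builds the blocks for any k, using split() in place of the hand-written columns, might look like this; the variable names `blocks` and `sums` are illustrative, and no zero padding is needed because the last block is simply shorter.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
k <- 3

# split() returns a list with one element per block label.
blocks <- split(v, ceiling(seq_along(v) / k))
sums <- sapply(blocks, sum)
sums
```

The result is a vector named "1" to "4" holding 6, 15, 24 and 10.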
