Add grid lines to levelplot

Add grid lines to levelplot

davavra
I'm using the levelplot function in the lattice package. I am plotting a
grid of cells and I want grid lines drawn between cells. I've spent a lot of
time trying different options. I've also looked at panel.levelplot. The
border parameter seems to be ignored or it doesn't mean cell border color.

 

panel.levelplot calls grid.rect but forces lwd=1e-5 instead of using the passed
lwd. On the surface, this looks mighty small. Could this be the problem?

 

I'm hoping there's a simple solution. Is there some way to get them without
writing my own panel function?

 

DAV

 

 



______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Effeciently sum 3d table

davavra
I have a large number of 3d tables that I wish to sum.
Is there an efficient way to do this? Or perhaps a function I can call?

I tried using do.call("sum",listoftables) but that returns a single value.

So far, it seems only a loop will do the job.


TIA,
DAV


Re: Effeciently sum 3d table

Petr Savicky
On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
> I have a large number of 3d tables that I wish to sum
> Is there an efficient way to do this? Or perhaps a function I can call?
>
> I tried using do.call("sum",listoftables) but that returns a single value.
>
> So far, it seems only a loop will do the job.

Hi.

Use lapply(), for example

  listoftables <- list(array(1:8, dim=c(2, 2, 2)), array(2:9, dim=c(2, 2, 2)))
  lapply(listoftables, sum)

  [[1]]
  [1] 36

  [[2]]
  [1] 44

Hope this helps.

Petr Savicky.


Re: Effeciently sum 3d table

glsnow
In reply to this post by davavra
Look at the Reduce function.
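
A minimal sketch of what that might look like, assuming every table in the list has the same dimensions. The two small arrays are made-up stand-ins for the real tables:

```r
# Two small 2x2x2 arrays standing in for the real tables (made-up data).
listoftables <- list(array(1:8, dim = c(2, 2, 2)),
                     array(2:9, dim = c(2, 2, 2)))

# Reduce() folds the list with "+": ((T1 + T2) + T3) + ...
total <- Reduce(`+`, listoftables)

dim(total)      # same shape as each input: 2 2 2
total[1, 1, 1]  # 1 + 2 = 3

# For contrast, do.call("sum", ...) collapses everything to one number.
do.call("sum", listoftables)  # 36 + 44 = 80
```

The result keeps the shape of the inputs because "+" works element by element on arrays of equal dimensions.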

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <[hidden email]> wrote:

> I have a large number of 3d tables that I wish to sum
> Is there an efficient way to do this? Or perhaps a function I can call?
>
> I tried using do.call("sum",listoftables) but that returns a single value.
>
> So far, it seems only a loop will do the job.
>
>
> TIA,
> DAV



--
Gregory (Greg) L. Snow Ph.D.
[hidden email]


Re: Effeciently sum 3d table

Bert Gunter
Define "sum" . Do you mean you want to get a single sum for each
array? -- get marginal sums for each array? -- get a single array in
which each value is the sum of all the individual values at the
position?

Due thought and consideration for those trying to help by formulating
your query carefully and concisely vastly increases the chance of
getting a useful answer. See the posting guide -- this is a skill that
needs to be learned and the guide is quite helpful. And I must
acknowledge that it is a skill that I also have not yet mastered.

Concerning your query, I would only note that the two responses from
Greg and Petr that you received are unlikely to be significantly
faster than just using loops, since both are still essentially looping
at the interpreted level. Whether either gives you what you want, I do
not know.

-- Bert

On Mon, Apr 16, 2012 at 8:53 AM, Greg Snow <[hidden email]> wrote:

> Look at the Reduce function.
>
> On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <[hidden email]> wrote:
>> I have a large number of 3d tables that I wish to sum
>> Is there an efficient way to do this? Or perhaps a function I can call?
>>
>> I tried using do.call("sum",listoftables) but that returns a single value.
>>
>> So far, it seems only a loop will do the job.
>>
>>
>> TIA,
>> DAV



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Effeciently sum 3d table

davavra
Thanks Gunter,

I mean what I think is the normal definition of 'sum' as in:
   T1 + T2 + T3 + ...
It never occurred to me that there would be a question.

I have gotten the impression that a for loop is very inefficient. Whenever I
change them to lapply calls there is a noticeable improvement in run time
for whatever reason. The problem with lapply here is that I effectively need
a global table to hold the final sum. lapply also wants to return a value.

You may be correct that in the long run, the loop is the best. There's a lot
of extraneous memory wastage holding all of the tables in a list as well as
the return 'values'.

As an alternate and given a pre-existing list of tables, I was thinking of
creating a temporary environment to hold the final result so it could be
passed globally to each lapply execution level but that seems clunky and
wasteful as well.

Example in partial code:

Env <- CreatEnv() # my own function
Assign('final',T1-T1,envir=env)
L<-listOfTables

lapply(L,function(t) {
        final <- get('final',envir=env) + t
        assign('final',final,envir=env)
        NULL
})

But I was hoping for a more elegant and hopefully more efficient solution.
Greg's suggestion for using Reduce seems in order but as yet I'm unfamiliar
with the function.

DAV



-----Original Message-----
From: Bert Gunter [mailto:[hidden email]]
Sent: Monday, April 16, 2012 12:42 PM
To: Greg Snow
Cc: David A Vavra; [hidden email]
Subject: Re: [R] Effeciently sum 3d table


Re: Effeciently sum 3d table

davavra
In reply to this post by Petr Savicky
Thanks Petr,

I'm after T1 + T2 + T3 + ... and your solution is giving a list of n items
each containing sum(T[i]). I guess I should have been clearer in stating
what I need.

Cheers,
DAV



-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On
Behalf Of Petr Savicky
Sent: Monday, April 16, 2012 11:07 AM
To: [hidden email]
Subject: Re: [R] Effeciently sum 3d table


Re: Effeciently sum 3d table

davavra
In reply to this post by glsnow
Thanks Greg,

I think this may be what I'm after but the documentation for it isn't
particularly clear. I hate it when someone documents a piece of code saying
it works kinda like some other code (running elsewhere, of course) making
the tacit assumption that everybody will immediately know what that means
and implies.

I'm sure I'll understand it once I know what it is trying to say. :) There's
an item in the examples which may be exactly what I'm after.

DAV


-----Original Message-----
From: Greg Snow [mailto:[hidden email]]
Sent: Monday, April 16, 2012 11:54 AM
To: David A Vavra
Cc: [hidden email]
Subject: Re: [R] Effeciently sum 3d table


Re: Effeciently sum 3d table

Bert Gunter
In reply to this post by davavra
David:

1. My first name is Bert.

2. " It never occurred to me that there would be a question."
Indeed. But in fact you got solutions for two different
interpretations (Greg's is what you wanted). That is what I meant when
I said that clarity in asking the question is important.

3. > I have gotten the impression that a for loop is very inefficient. Whenever I
> change them to lapply calls there is a noticeable improvement in run time
> for whatever reason.
I'd like to see your data on this. My experience is that they are
typically comparable. Chambers, in his "Software for Data Analysis"
book, says (p. 213), of apply-type functions rather than explicit
loops: "The computation should run faster... However, none of the
apply mechanisms changes the number of times the supplied function is
called, so serious improvements will be limited to iterating simple
calculations many times."

4. You can get serious improvements by vectorizing; and you can do
that here, if I understand correctly, because all your arrays have
identical dim = d. Here's how:

## assume your list of arrays is in listoftables

d <- dim(listoftables[[1]]) ## dimensions shared by all the arrays
alldat <- do.call(cbind,listoftables) ## this might be the slow part
ans <- array(rowSums(alldat), dim = d)

See ?rowSums for explanations and caveats, especially with NAs.
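
As a hedged check of the idea, the sketch below builds some made-up arrays of identical shape and compares the cbind/rowSums result with a plain fold over "+". It relies on cbind() treating each 3d array as an ordinary vector, so every array becomes one column:

```r
# Made-up data: 1000 arrays that all share the dimensions in d.
d <- c(2, 3, 4)
listoftables <- lapply(1:1000, function(i) array(i + seq_len(prod(d)), dim = d))

# Each 3d array becomes one column of a prod(d) x 1000 matrix.
alldat <- do.call(cbind, listoftables)

# Sum across columns, then restore the original shape.
ans <- array(rowSums(alldat), dim = d)

# Compare with the element-wise fold; all.equal() ignores integer vs double.
all.equal(ans, Reduce(`+`, listoftables))
```

Note that rowSums() returns doubles even for integer input, which is why the comparison uses all.equal() rather than identical().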

Cheers,
Bert

On Mon, Apr 16, 2012 at 11:35 AM, David A Vavra <[hidden email]> wrote:

> Thanks Gunter,
>
> I mean what I think is the normal definition of 'sum' as in:
>   T1 + T2 + T3 + ...
> It never occurred to me that there would be a question.

Re: Effeciently sum 3d table

David Winsemius
In reply to this post by davavra

On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:

> Thanks Petr,
>
> I'm after T1 + T2 + T3 + ...

Which would be one number ... i.e. the result you originally said you  
did not want.

> and your solution is giving a list of n items
> each containing sum(T[i]). I guess I should have been clearer in  
> stating
> what I need.

Or even now you _could_ be clearer. Do you want successive partial
sums? That would be:

Reduce("+", listoftables, accumulate=TRUE)
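
A small sketch of what the accumulating form returns, using three made-up arrays. The result is a list with one running total per input table, and its last element is the overall sum:

```r
# Three made-up 2x2x2 arrays.
listoftables <- list(array(1:8,  dim = c(2, 2, 2)),
                     array(2:9,  dim = c(2, 2, 2)),
                     array(3:10, dim = c(2, 2, 2)))

# accumulate = TRUE keeps every intermediate result of the fold.
partial <- Reduce(`+`, listoftables, accumulate = TRUE)

length(partial)        # 3: one running total per input table
partial[[2]][1, 1, 1]  # 1 + 2 = 3
partial[[3]][1, 1, 1]  # 1 + 2 + 3 = 6
```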





David Winsemius, MD
West Hartford, CT


Re: Effeciently sum 3d table

William Dunlap
In reply to this post by davavra
> Example in partial code:
>
> Env <- CreatEnv() # my own function
> Assign('final',T1-T1,envir=env)
> L<-listOfTables
>
> lapply(L,function(t) {
> final <- get('final',envir=env) + t
> assign('final',final,envir=env)
> NULL
> })

First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
     final <- get('final', envir=env) + t
     assign('final', final, envir=env)
     NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
sum(2:50001) # should be final[2,1]
# [1] 1250075000

You asked for something less "clunky".
You are fighting the system by using get() and assign(); just use
ordinary expression syntax to get and set variables:
final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#           [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum; just add each matrix to the current sum as soon as
you have computed it, and then forget about it.
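
A sketch of that running-sum pattern, under the assumption that each table can be produced on demand. make_table() is a hypothetical stand-in for whatever actually computes or reads one table:

```r
# Hypothetical generator: returns the i-th 2x2 matrix without storing a list.
make_table <- function(i) array(i:(i + 3), c(2, 2))

n <- 50000
final <- make_table(1)
for (i in seq_len(n)[-1]) {
  # Add the new table to the running sum; the table itself is not kept.
  final <- final + make_table(i)
}

final[2, 1]  # equals sum(2:50001)
```

Only the running total and one table at a time are held in memory, so the list of 50,000 matrices is never built.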

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: Effeciently sum 3d table

davavra
In reply to this post by Bert Gunter
Bert,

My apologies on the name.

I haven't kept any data on loop times. I don't know why lapply seems faster
but the difference is quite noticeable. It has struck me as odd. I would
have thought lapply would be slower. It has taken an effort to change my
thinking to force fit solutions to it but I've gotten used to it. As of now
I reserve loops to times when there are only a few iterations (as in 10) and
to solutions that require passing large amounts of information among
iterations. lapply is particularly handy when constructing lists.

As for vectorizing, see the code below. Note that it uses mapply but that
simply may have made implementation easier. However, if vectorizing gives an
improvement over looping, the mapply may be the reason.

> f<-function(x,y,z) catn("do something")
> Vectorize(f,c('x','y'))
function (x, y, z)
{
    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
    names <- if (is.null(names(args)))
        character(length(args))
    else names(args)
    dovec <- names %in% vectorize.args
    do.call("mapply", c(FUN = FUN, args[dovec],
        MoreArgs = list(args[!dovec]),
        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
}
<environment: 0x7fb3442553c8>

DAV


-----Original Message-----
From: Bert Gunter [mailto:[hidden email]]
Sent: Monday, April 16, 2012 3:07 PM
To: David A Vavra
Cc: [hidden email]
Subject: Re: [R] Effeciently sum 3d table


Re: Effeciently sum 3d table

David Winsemius
In reply to this post by David Winsemius

On Apr 16, 2012, at 3:26 PM, David Winsemius wrote:

>
> On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:
>
>> Thanks Petr,
>>
>> I'm after T1 + T2 + T3 + ...
>
> Which would be one number ... i.e. the result you originally said  
> you did not want.
>
>> and your solution is giving a list of n items
>> each containing sum(T[i]). I guess I should have been clearer in  
>> stating
>> what I need.
>
> Or even now you _could_ be clearer. Do you want successive partial
> sums? Those would come from:
>
> Reduce("+", listoftables, accumulate=TRUE)

If Dunlap's interpretation is correct, then consider this:

  L <- lapply(1:50000, function(i) array(i:(i+7), c(2,2,2)))
  system.time({final <- L[[1]]
  for(i in seq_along(L)[-1]) final <- final + L[[i]]
  final}  )
#   user  system elapsed
#  0.179   0.002   0.187

  system.time(Reduce("+", L))
#   user  system elapsed
#  0.150   0.002   0.157

 > identical(Reduce("+", L), final)
[1] TRUE
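
For readers who have not met Reduce before, here is a minimal sketch of what the call above is doing. It is not from the thread: the toy arrays and the names `tables`, `total` and `partial` are made up for illustration, and it assumes all arrays share the same dim.

```r
# Three toy 2x2x2 arrays standing in for the real tables (illustrative data).
a  <- array(1:8,  dim = c(2, 2, 2))
b  <- array(2:9,  dim = c(2, 2, 2))
c3 <- array(3:10, dim = c(2, 2, 2))
tables <- list(a, b, c3)

# Reduce folds `+` over the list from the left: ((a + b) + c3).
# The result is a single array of cell-wise sums with the same dim.
total <- Reduce(`+`, tables)
dim(total)
total[1, 1, 1]   # 1 + 2 + 3

# accumulate = TRUE keeps every intermediate result instead:
# a list holding a, a + b, and a + b + c3.
partial <- Reduce(`+`, tables, accumulate = TRUE)
length(partial)
identical(partial[[3]], total)
```

Reduce still calls `+` once per list element from interpreted code, so it is mainly a tidier way to write the loop rather than a faster one, which is consistent with the timings above.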


>
>
>
>
>>
>> Cheers,
>> DAV
>>
>>
>>
>> -----Original Message-----
>> From: [hidden email] [mailto:[hidden email]
>> ] On
>> Behalf Of Petr Savicky
>> Sent: Monday, April 16, 2012 11:07 AM
>> To: [hidden email]
>> Subject: Re: [R] Effeciently sum 3d table
>>
>> On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
>>> I have a large number of 3d tables that I wish to sum
>>> Is there an efficient way to do this? Or perhaps a function I can  
>>> call?
>>>
>>> I tried using do.call("sum",listoftables) but that returns a  
>>> single value.
>>
>>>
>>> So far, it seems only a loop will do the job.
>>
>> Hi.
>>
>> Use lapply(), for example
>>
>> listoftables <- list(array(1:8, dim=c(2, 2, 2)),
>>                      array(2:9, dim=c(2, 2, 2)))
>> lapply(listoftables, sum)
>>
>> [[1]]
>> [1] 36
>>
>> [[2]]
>> [1] 44
>>
>> Hope this helps.
>>
>> Petr Savicky.
>>
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


Re: Effeciently sum 3d table

davavra
In reply to this post by David Winsemius
> even now you _could_ be clearer

I fail to see why it's unclear.

>> I'm after T1 + T2 + T3 + ...
> Which would be one number ... i.e. the result you originally said you
> did not want.

I think it's precisely what I want. If I have two 3d tables, T1 and T2, then
consider either
        1) T1 + T2
        2) T1 - T2
(1) yields a third table equal to the sum of the individual cells and (2)
yields a table of cell-wise differences (all zeroes when T2 equals T1). At
least it does for matrices. Are you saying the T1+T2+T3+... above is
equivalent to:

   sum(T1)+sum(T2)+sum(T3)+....

when the table has more than two dimensions? When I tried it out by hand, I
got the result I'm after. What I want is a general solution. Reduce may be
the answer, but I find the documentation for it a bit daunting. Not to
mention that it is far from obvious that I should have originally thought of
using it.
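
The distinction being argued over can be checked directly. The following is only a sketch with made-up toy tables (`T1`, `T2` here are illustrative, not the poster's data); it shows that `+` and `-` act cell by cell on arrays of any dimension, while sum() collapses everything to one number.

```r
# Two toy 3d tables (illustrative data, not from the thread).
T1 <- array(1:8, dim = c(2, 2, 2))
T2 <- array(2:9, dim = c(2, 2, 2))

# Arithmetic operators work cell by cell and keep the dim attribute,
# whatever the number of dimensions.
cellwise <- T1 + T2
dim(cellwise)
cellwise[2, 1, 2]    # T1[2, 1, 2] + T2[2, 1, 2]

# sum() adds up every element of all its arguments, which is why
# do.call("sum", list(T1, T2)) returns a single value.
grand <- do.call("sum", list(T1, T2))
grand == sum(T1) + sum(T2)

# T1 - T1 is a convenient all-zero table of the right shape.
zero <- T1 - T1
```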

DAV



-----Original Message-----
From: David Winsemius [mailto:[hidden email]]
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra
Cc: 'Petr Savicky'; [hidden email]
Subject: Re: [R] Effeciently sum 3d table


On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:

> Thanks Petr,
>
> I'm after T1 + T2 + T3 + ...

Which would be one number ... i.e. the result you originally said you  
did not want.

> and your solution is giving a list of n items
> each containing sum(T[i]). I guess I should have been clearer in  
> stating
> what I need.

Or even now you _could_ be clearer. Do you want successive partial
sums? Those would come from:

Reduce("+", listoftables, accumulate=TRUE)




>
> Cheers,
> DAV
>
>
>
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]
> ] On
> Behalf Of Petr Savicky
> Sent: Monday, April 16, 2012 11:07 AM
> To: [hidden email]
> Subject: Re: [R] Effeciently sum 3d table
>
> On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
>> I have a large number of 3d tables that I wish to sum
>> Is there an efficient way to do this? Or perhaps a function I can  
>> call?
>>
>> I tried using do.call("sum",listoftables) but that returns a single  
>> value.
>
>>
>> So far, it seems only a loop will do the job.
>
> Hi.
>
> Use lapply(), for example
>
>  listoftables <- list(array(1:8, dim=c(2, 2, 2)),
>                       array(2:9, dim=c(2, 2, 2)))
>  lapply(listoftables, sum)
>
>  [[1]]
>  [1] 36
>
>  [[2]]
>  [1] 44
>
> Hope this helps.
>
> Petr Savicky.
>

David Winsemius, MD
West Hartford, CT


Re: Effeciently sum 3d table

Bert Gunter
In reply to this post by davavra
For purposes of clarity only...

On Mon, Apr 16, 2012 at 12:40 PM, David A Vavra <[hidden email]> wrote:

> Bert,
>
> My apologies on the name.
>
> I haven't kept any data on loop times. I don't know why lapply seems faster
> but the difference is quite noticeable. It has struck me as odd. I would
> have thought lapply would be slower. It has taken an effort to change my
> thinking to force fit solutions to it but I've gotten used to it. As of now
> I reserve loops to times when there are only a few iterations (as in 10) and
> to solutions that require passing large amounts of information among
> iterations. lapply is particularly handy when constructing lists.
>
> As for vectorizing, see the code below.

No. Despite the name, this is **not** what I mean by vectorization.
What I mean is pushing the loops down to the C level rather than doing
them at the interpreted level, which is where your code below still
leaves you.
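
To make that distinction concrete, here is a small sketch. It is not from the thread; the function `f` and the vectors `x` and `y` are hypothetical. Vectorize() only wraps a function in mapply(), so the function is still called once per element by the interpreter, whereas built-in vector operations do their looping in compiled code.

```r
# A scalar-only function (hypothetical example).
f <- function(x, y) if (x > y) x - y else 0

# Vectorize() wraps f in mapply(): f is still called once per element
# at the interpreted level, so this is a convenience, not a speed-up.
vf <- Vectorize(f)
x <- c(5, 1, 7)
y <- c(2, 4, 7)
r1 <- vf(x, y)

# The same computation written with operations that loop in compiled code.
r2 <- pmax(x - y, 0)

identical(r1, r2)
```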

-- Bert

> Note that it uses mapply but that
> simply may have made implementation easier. However, if vectorizing gives an
> improvement over looping, the mapply may be the reason.
>
>> f<-function(x,y,z) catn("do something")
>> Vectorize(f,c('x','y'))
> function (x, y, z)
> {
>    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
>    names <- if (is.null(names(args)))
>        character(length(args))
>    else names(args)
>    dovec <- names %in% vectorize.args
>    do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs =
> list(args[!dovec]),
>        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
> }
> <environment: 0x7fb3442553c8>
>
> DAV
>
>
> -----Original Message-----
> From: Bert Gunter [mailto:[hidden email]]
> Sent: Monday, April 16, 2012 3:07 PM
> To: David A Vavra
> Cc: [hidden email]
> Subject: Re: [R] Effeciently sum 3d table
>
> David:
>
> 1. My first name is Bert.
>
> 2. " It never occurred to me that there would be a question."
> Indeed. But in fact you got solutions for two different
> interpretations (Greg's is what you wanted). That is what I meant when
> I said that clarity in asking the question is important.
>
> 3. > I have gotten the impression that a for loop is very inefficient. Whenever I
>    > change them to lapply calls there is a noticeable improvement in run time
>    > for whatever reason.
> I'd like to see your data on this. My experience is that they are
> typically comparable. Chambers in his "Software for Data Analysis"
> book says (p. 213), of apply-type functions rather than explicit
> loops: "The computation should run faster... However, none of the
> apply mechanisms changes the number of times the supplied function is
> called, so serious improvements will be limited to iterating simple
> calculations many times."
>
> 4. You can get serious improvements by vectorizing; and you can do
> that here, if I understand correctly, because all your arrays have
> identical dim = d. Here's how:
>
> ## assume your list of arrays is in listoftables, each with dim = d
>
> alldat <- do.call(cbind, listoftables) ## this might be the slow part
> ans <- array(rowSums(alldat), dim = d)
>
> See ?rowSums for explanations and caveats, especially with NA's.
>
> Cheers,
> Bert
>
> On Mon, Apr 16, 2012 at 11:35 AM, David A Vavra <[hidden email]> wrote:
>> Thanks Gunter,
>>
>> I mean what I think is the normal definition of 'sum' as in:
>>   T1 + T2 + T3 + ...
>> It never occurred to me that there would be a question.
>>
>> I have gotten the impression that a for loop is very inefficient. Whenever I
>> change them to lapply calls there is a noticeable improvement in run time
>> for whatever reason. The problem with lapply here is that I effectively need
>> a global table to hold the final sum. lapply also wants to return a value.
>>
>> You may be correct that in the long run, the loop is the best. There's a lot
>> of extraneous memory wastage holding all of the tables in a list as well as
>> the return 'values'.
>>
>> As an alternate and given a pre-existing list of tables, I was thinking of
>> creating a temporary environment to hold the final result so it could be
>> passed globally to each lapply execution level but that seems clunky and
>> wasteful as well.
>>
>> Example in partial code:
>>
>> Env <- CreatEnv() # my own function
>> Assign('final',T1-T1,envir=env)
>> L<-listOfTables
>>
>> lapply(L,function(t) {
>>        final <- get('final',envir=env) + t
>>        assign('final',final,envir=env)
>>        NULL
>> })
>>
>> But I was hoping for a more elegant and hopefully more efficient solution.
>> Greg's suggestion for using reduce seems in order but as yet I'm unfamiliar
>> with the function.
>>
>> DAV
>>
>>
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:[hidden email]]
>> Sent: Monday, April 16, 2012 12:42 PM
>> To: Greg Snow
>> Cc: David A Vavra; [hidden email]
>> Subject: Re: [R] Effeciently sum 3d table
>>
>> Define "sum" . Do you mean you want to get a single sum for each
>> array? -- get marginal sums for each array? -- get a single array in
>> which each value is the sum of all the individual values at the
>> position?
>>
>> Due thought and consideration for those trying to help by formulating
>> your query carefully and concisely vastly increases the chance of
>> getting a useful answer. See the posting guide -- this is a skill that
>> needs to be learned and the guide is quite helpful. And I must
>> acknowledge that it is a skill that I also have not yet mastered.
>>
>> Concerning your query, I would only note that the two responses from
>> Greg and Petr that you received are unlikely to be significantly
>> faster than just using loops, since both are still essentially looping
>> at the interpreted level. Whether either give you what you want, I do
>> not know.
>>
>> -- Bert
>>
>> On Mon, Apr 16, 2012 at 8:53 AM, Greg Snow <[hidden email]> wrote:
>>> Look at the Reduce function.
>>>
>>> On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <[hidden email]> wrote:
>>>> I have a large number of 3d tables that I wish to sum
>>>> Is there an efficient way to do this? Or perhaps a function I can call?
>>>>
>>>> I tried using do.call("sum",listoftables) but that returns a single value.
>>>>
>>>> So far, it seems only a loop will do the job.
>>>>
>>>>
>>>> TIA,
>>>> DAV
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Effeciently sum 3d table

davavra
In reply to this post by William Dunlap
Thanks Bill,

For reasons that aren't important here, I must start from a list. Computing
the sum while generating the tables may be a solution but it means doing
something in one piece of code that is unrelated to the surrounding code.
Bad practice where I'm from. If it's needed it's needed but if I can avoid
doing so, I will.

I haven't done any timing but because of the extra operations of get and
assign, the non-loop implementation will likely suffer. It seems you have
shown this to be true.

DAV


-----Original Message-----
From: William Dunlap [mailto:[hidden email]]
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra; 'Bert Gunter'
Cc: [hidden email]
Subject: RE: [R] Effeciently sum 3d table

 

> Example in partial code:
>
> Env <- CreatEnv() # my own function
> Assign('final',T1-T1,envir=env)
> L<-listOfTables
>
> lapply(L,function(t) {
>     final <- get('final',envir=env) + t
>     assign('final',final,envir=env)
>     NULL
> })

First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
     final <- get('final', envir=env) + t
     assign('final', final, envir=env)
     NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
> sum( (2:50001) ) # should be final[2,1]
# [1] 1250075000

You asked for something less "clunky".
You are fighting the system by using get() and assign(), just use
ordinary expression syntax to get and set variables:
final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#           [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum, just add to the current sum when you have
computed one matrix and then forget about it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: Effeciently sum 3d table

Bert Gunter
David:

Here is a comparison of the gains to be made by vectorization (again,
assuming I have interpreted your query correctly)

## create a list of arrays
> z <- lapply(seq_len(10000),function(i)array(runif(24),dim=2:4))
## Using an apply type approach
> system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
   user  system elapsed
   0.62    0.00    0.62
## vectorizing via rowSums and cbind
> system.time(ans2 <-array(rowSums(do.call(cbind,z)),dim=2:4))
   user  system elapsed
   0.02    0.00    0.02
> identical(ans1,ans2)
[1] TRUE
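
A variant of the same idea, offered only as a sketch under the assumption that every array in the list has the same dim: stack the list into one higher-dimensional array and let rowSums() sum over the last dimension. The data and the names `stacked`, `ans3`, `with_na` and `without_na` are illustrative and not from the thread. It also shows the NA behaviour that ?rowSums warns about.

```r
# Toy data: 1000 arrays of dim 2x3x4 with whole-number values, so the
# comparison below is not affected by floating-point rounding.
d <- 2:4
z <- lapply(seq_len(1000), function(i) array(as.numeric(i + 0:23), dim = d))

# Stack the list into one 4d array whose last dimension indexes the tables.
stacked <- array(unlist(z), dim = c(d, length(z)))

# rowSums(x, dims = k) sums over every dimension after the first k,
# doing the looping in compiled code.
ans3 <- rowSums(stacked, dims = length(d))

# Same answer as the interpreted fold.
all.equal(ans3, Reduce(`+`, z))

# A missing cell propagates as NA unless na.rm = TRUE is given.
z[[1]][1, 1, 1] <- NA
stacked <- array(unlist(z), dim = c(d, length(z)))
with_na    <- rowSums(stacked, dims = length(d))
without_na <- rowSums(stacked, dims = length(d), na.rm = TRUE)
```

Note that unlist() copies all of the data once, so this trades memory for speed, in the same way as the do.call(cbind, ...) step above.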

Cheers,
Bert



On Mon, Apr 16, 2012 at 1:19 PM, David A Vavra <[hidden email]> wrote:

> Thanks Bill,
>
> For reasons that aren't important here, I must start from a list. Computing
> the sum while generating the tables may be a solution but it means doing
> something in one piece of code that is unrelated to the surrounding code.
> Bad practice where I'm from. If it's needed it's needed but if I can avoid
> doing so, I will.
>
> I haven't done any timing but because of the extra operations of get and
> assign, the non-loop implementation will likely suffer. It seems you have
> shown this to be true.
>
> DAV
>
>



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Effeciently sum 3d table

William Dunlap
In reply to this post by davavra
I generally prefer the list approach too.  I only mentioned that you didn't
need to have a list of inputs before starting the summation because
you said
  > There's a lot
  > of extraneous memory wastage holding all of the tables in a list as well as
  > the return 'values'.
I guess I misinterpreted that sentence.
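
For anyone who does want to avoid holding the whole list, the idea can be sketched as below. This is only an illustration: `make_table()` is a hypothetical stand-in for whatever code produces each table, and `n` is an arbitrary count.

```r
# make_table() is a hypothetical stand-in for whatever produces each table.
make_table <- function(i) array(i:(i + 7), dim = c(2, 2, 2))

n <- 1000
running <- make_table(1)
for (i in seq_len(n)[-1]) {
  # Each new table is added and then dropped; only 'running' is retained.
  running <- running + make_table(i)
}

# Same result as building the full list first and folding it.
identical(running, Reduce(`+`, lapply(seq_len(n), make_table)))
```

Only one table plus the running total is alive at any time, at the cost of mixing the summation into the code that generates the tables.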

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: David A Vavra [mailto:[hidden email]]
Sent: Monday, April 16, 2012 1:20 PM
To: William Dunlap
Cc: [hidden email]
Subject: RE: [R] Effeciently sum 3d table

Thanks Bill,

For reasons that aren't important here, I must start from a list. Computing the sum while generating the tables may be a solution but it means doing something in one piece of code that is unrelated to the surrounding code. Bad practice where I'm from. If it's needed it's needed but if I can avoid doing so, I will.

I haven't done any timing but because of the extra operations of get and assign, the non-loop implementation will likely suffer. It seems you have shown this to be true.

DAV
           

-----Original Message-----
From: William Dunlap [mailto:[hidden email]]
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra; 'Bert Gunter'
Cc: [hidden email]
Subject: RE: [R] Effeciently sum 3d table

> Example in partial code:
>
> Env <- CreatEnv() # my own function
> Assign('final',T1-T1,envir=env)
> L<-listOfTables
>
> lapply(L,function(t) {
>     final <- get('final',envir=env) + t
>     assign('final',final,envir=env)
>     NULL
> })

First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
     final <- get('final', envir=env) + t
     assign('final', final, envir=env)
     NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
> sum( (2:50001) ) # should be final[2,1]
# [1] 1250075000

You asked for something less "clunky".
You are fighting the system by using get() and assign(), just use
ordinary expression syntax to get and set variables:
final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#           [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum, just add to the current sum when you have
computed one matrix and then forget about it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf
> Of David A Vavra
> Sent: Monday, April 16, 2012 11:35 AM
> To: 'Bert Gunter'
> Cc: [hidden email]
> Subject: Re: [R] Effeciently sum 3d table
>
> Thanks Gunter,
>
> I mean what I think is the normal definition of 'sum' as in:
>    T1 + T2 + T3 + ...
> It never occurred to me that there would be a question.
>
> I have gotten the impression that a for loop is very inefficient. Whenever I
> change them to lapply calls there is a noticeable improvement in run time
> for whatever reason. The problem with lapply here is that I effectively need
> a global table to hold the final sum. lapply also  wants to return a value.
>
> You may be correct that in the long run, the loop is the best. There's a lot
> of extraneous memory wastage holding all of the tables in a list as well as
> the return 'values'.
>
> As an alternate and given a pre-existing list of tables, I was thinking of
> creating a temporary environment to hold the final result so it could be
> passed globally to each lapply execution level but that seems clunky and
> wasteful as well.
>
> Example in partial code:
>
> Env <- CreatEnv() # my own function
> Assign('final',T1-T1,envir=env)
> L<-listOfTables
>
> lapply(L,function(t) {
>     final <- get('final',envir=env) + t
>     assign('final',final,envir=env)
>     NULL
> })
>
> But I was hoping for a more elegant and hopefully more efficient solution.
> Greg's suggestion of using Reduce seems in order, but as yet I'm unfamiliar
> with the function.
>
> DAV
>
>
>
> -----Original Message-----
> From: Bert Gunter [mailto:[hidden email]]
> Sent: Monday, April 16, 2012 12:42 PM
> To: Greg Snow
> Cc: David A Vavra; [hidden email]
> Subject: Re: [R] Effeciently sum 3d table
>
> Define "sum" . Do you mean you want to get a single sum for each
> array? -- get marginal sums for each array? -- get a single array in
> which each value is the sum of all the individual values at the
> position?
>
> Due thought and consideration for those trying to help by formulating
> your query carefully and concisely vastly increases the chance of
> getting a useful answer. See the posting guide -- this is a skill that
> needs to be learned and the guide is quite helpful. And I must
> acknowledge that it is a skill that I also have not yet mastered.
>
> Concerning your query, I would only note that the two responses from
> Greg and Petr that you received are unlikely to be significantly
> faster than just using loops, since both are still essentially looping
> at the interpreted level. Whether either give you what you want, I do
> not know.
>
> -- Bert
>
> On Mon, Apr 16, 2012 at 8:53 AM, Greg Snow <[hidden email]> wrote:
> > Look at the Reduce function.
> >
> > On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <[hidden email]> wrote:
> >> I have a large number of 3d tables that I wish to sum
> >> Is there an efficient way to do this? Or perhaps a function I can call?
> >>
> >> I tried using do.call("sum",listoftables) but that returns a single value.
> >>
> >> So far, it seems only a loop will do the job.
> >>
> >>
> >> TIA,
> >> DAV
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>

Re: Effeciently sum 3d table

davavra
In reply to this post by Bert Gunter
OK. I'll take your word for it. The mapply function calls "do_mapply" so I
would have thought it is passing the operation down to the C code. I haven't
tracked it any further than below.

> mapply
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
{
    FUN <- match.fun(FUN)
    dots <- list(...)
    answer <- .Call("do_mapply", FUN, dots, MoreArgs, environment(),
        PACKAGE = "base")

... etc.


-----Original Message-----
From: Bert Gunter [mailto:[hidden email]]
Sent: Monday, April 16, 2012 4:13 PM
To: David A Vavra
Cc: [hidden email]
Subject: Re: [R] Effeciently sum 3d table

For purposes of clarity only...

On Mon, Apr 16, 2012 at 12:40 PM, David A Vavra <[hidden email]> wrote:
> Bert,
>
> My apologies on the name.
>
> I haven't kept any data on loop times. I don't know why lapply seems faster
> but the difference is quite noticeable. It has struck me as odd. I would
> have thought lapply would be slower. It has taken an effort to change my
> thinking to force fit solutions to it but I've gotten used to it. As of now
> I reserve loops to times when there are only a few iterations (as in 10) and
> to solutions that require passing large amounts of information among
> iterations. lapply is particularly handy when constructing lists.
>
> As for vectorizing, see the code below.

No. Despite the name, this is **not** what I mean by vectorization.
What I mean is pushing the loops down to the C level rather than doing
them at the interpreted level, which is where your code below still
leaves you.

-- Bert
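
To make the distinction concrete, here is a rough sketch of summing a list of equally sized arrays with a single call to rowSums(), whose loop runs in compiled code. The data are made-up stand-ins, and flattening with unlist() is just one assumed way to build the matrix; on real data the cost of that copy may outweigh the gain.

```r
# Hypothetical stand-in data: 50 arrays, all with the same dim d.
d <- c(2, 3, 4)
listoftables <- lapply(1:50, function(i) array(i, dim = d))

# One column per table, one row per cell position.
allDat <- matrix(unlist(listoftables), ncol = length(listoftables))

# rowSums() adds across tables in compiled code; array() restores the shape.
ans <- array(rowSums(allDat), dim = d)

# Same values as the interpreted-level fold over the list.
all.equal(ans, Reduce(`+`, listoftables))
```

Note that rowSums() returns doubles even for integer input, which is why the comparison uses all.equal() rather than identical().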

> Note that it uses mapply but that
> simply may have made implementation easier. However, if vectorizing gives an
> improvement over looping, the mapply may be the reason.
>
>> f<-function(x,y,z) catn("do something")
>> Vectorize(f,c('x','y'))
> function (x, y, z)
> {
>    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
>    names <- if (is.null(names(args)))
>        character(length(args))
>    else names(args)
>    dovec <- names %in% vectorize.args
>    do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs = list(args[!dovec]),
>        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
> }
> <environment: 0x7fb3442553c8>
>
> DAV
>
>
> -----Original Message-----
> From: Bert Gunter [mailto:[hidden email]]
> Sent: Monday, April 16, 2012 3:07 PM
> To: David A Vavra
> Cc: [hidden email]
> Subject: Re: [R] Effeciently sum 3d table
>
> David:
>
> 1. My first name is Bert.
>
> 2. " It never occurred to me that there would be a question."
> Indeed. But in fact you got solutions for two different
> interpretations (Greg's is what you wanted). That is what I meant when
> I said that clarity in asking the question is important.
>
> 3. "I have gotten the impression that a for loop is very inefficient.
> Whenever I change them to lapply calls there is a noticeable improvement
> in run time for whatever reason."
> I'd like to see your data on this. My experience is that they are
> typically comparable. Chambers in his "Software for Data Analysis"
> book says (p. 213), of apply-type functions used in place of explicit
> loops: "The computation should run faster... However, none of the
> apply mechanisms changes the number of times the supplied function is
> called, so serious improvements will be limited to iterating simple
> calculations many times."
>
> 4. You can get serious improvements by vectorizing; and you can do
> that here, if I understand correctly, because all your arrays have
> identical dim = d. Here's how:
>
> ## assume your list of arrays is in listoftables
>
> allDat <- do.call(cbind,listoftables) ## this might be the slow part
> ans <- array(rowSums(allDat), dim = d)
>
> See ?rowSums for explanations and caveats, especially with NAs.
>
> Cheers,
> Bert
>



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Effeciently sum 3d table

Bert Gunter
On Mon, Apr 16, 2012 at 1:39 PM, David A Vavra <[hidden email]> wrote:
> OK. I'll take your word for it. The mapply function calls "do_mapply" so I
> would have thought it is passing the operation down to the C code. I haven't
> tracked it any further than below.

No, they can't. Function evaluation must take place at the interpreted
level. However, don't take my word -- take Chambers's.

-- Bert
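
One way to check this for yourself is to count how often the supplied function runs. In the sketch below, the counter calls and the function f are illustrative names, not anything from the thread. mapply() and lapply() each evaluate the R-level function once per element, even though their outer loop lives in C, whereas the vectorized `+` handles all elements in a single call.

```r
calls <- 0
f <- function(x, y) {
  calls <<- calls + 1   # record each R-level evaluation of f
  x + y
}

x <- 1:1000
y <- 1001:2000

r1 <- mapply(f, x, y)   # C code drives the loop, but f is evaluated per pair
calls_mapply <- calls

calls <- 0
r2 <- unlist(lapply(seq_along(x), function(i) f(x[i], y[i])))
calls_lapply <- calls

r3 <- x + y             # one call; the loop over elements happens in C

c(calls_mapply, calls_lapply)   # one evaluation of f per element in both cases
```

All three results are the same; what differs is how many times the interpreter has to evaluate a function body.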




--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
