For Loop

For Loop

rsherry8

It is my impression that good R programmers make very little use of the
for statement. Please consider the following R statement:
    for (i in 1:(len-1)) s[i] = log(c1[i+1]/c1[i], base = exp(1))
One problem I have found with this statement is that s must exist before
the statement is run. Can it be written without using a for
loop? Would that be better?

Thanks,
Bob

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: For Loop

Bert Gunter-2
Bob:

Please, please spend some time with an R tutorial or two before you post
here. This list can help, but I think we assume that you have already made
an effort to learn basic R on your own. Your question is about as basic as
it gets, so it appears to me that you have not done this. There are many,
many R tutorials out there. Some suggestions, by no means comprehensive,
can be found here:
https://www.rstudio.com/online-learning/#r-programming

Others will no doubt respond, but you can answer it yourself after only a
few minutes with most R tutorials.

Cheers,
Bert

Re: For Loop

Richard M. Heiberger
In reply to this post by rsherry8
c1 <- 1:1000000
len <- 1000000
system.time(
s1 <- log(c1[-1]/c1[-len])
)
s <- c1[-len]
system.time(
for (i in 1:(len-1)) s[i] <- log(c1[i+1]/c1[i])
)
all.equal(s,s1)


>
> c1 <- 1:1000000
> len <- 1000000
> system.time(
+ s1 <- log(c1[-1]/c1[-len])
+ )
   user  system elapsed
  0.032   0.005   0.037
> s <- c1[-len]
> system.time(
+ for (i in 1:(len-1)) s[i] <- log(c1[i+1]/c1[i])
+ )
   user  system elapsed
  0.226   0.002   0.232
> all.equal(s,s1)
[1] TRUE
>

Much faster, and much easier to understand, when vectorized.


Re: For Loop

Wensui Liu
In reply to this post by rsherry8
Another version, just for fun (the index stops at len - 1 so that c1[i + 1] stays in range):

s <- parallel::pvec(1:(len - 1), function(i) log(c1[i + 1] / c1[i]))

Re: For Loop

Wensui Liu
In reply to this post by rsherry8
Or this one:

(Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:(len - 1)))


Re: For Loop

Jeff Newmiller
In reply to this post by rsherry8
I do use for loops a few times per month, but only wrapped around large chunks of vectorized calculations, not for this kind of use case. In those cases I also pre-allocate output vectors/lists (e.g. vector("list", len)) to avoid memory thrashing as you grow lists or other vectors one element at a time (v <- c(v, new_value) is an inefficient trick). Before going into the loop, I also create variables to hold intermediate results that would yield the same answer on every iteration (e.g. exp(1)).
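
A minimal sketch of the pre-allocation pattern described above, for cases where a loop really is needed. The data in c1 are hypothetical, chosen only for illustration; s_grown shows the element-by-element growth that the paragraph warns against.

```r
# Hypothetical input; any positive numeric vector would do.
c1 <- c(100, 102, 101, 105, 110)
len <- length(c1)

# Pre-allocate the result once, then fill it in place.
s <- numeric(len - 1)
for (i in seq_len(len - 1)) {
  s[i] <- log(c1[i + 1] / c1[i])
}

# The pattern to avoid: the vector is re-created on every iteration.
s_grown <- c()
for (i in seq_len(len - 1)) {
  s_grown <- c(s_grown, log(c1[i + 1] / c1[i]))
}

# Both loops give the same numbers; only the memory behaviour differs.
stopifnot(isTRUE(all.equal(s, s_grown)))
```

Pre-allocating also addresses the original complaint that s must exist before the loop runs: creating it with numeric(len - 1) makes that requirement explicit.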

As regards your toy example, I would use a one-liner:

s <- diff( log( c1 ) )

which avoids executing exp(1) at all, much less every time through the loop, and it uses vectorized incremental subtraction rather than division (by the laws of logarithms, log(a/b) = log(a) - log(b)). The default base for the log function is e, so it is unnecessary to specify it. Note that your loop accesses every element of c1 except the first and last twice: once as index i+1, and again in the next iteration as index i.
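
As a quick check of that identity, the sketch below compares the ratio form with the diff form on a small, hypothetical vector. It also confirms that base = exp(1) matches the default base of log().

```r
# Hypothetical input data, for illustration only.
c1 <- c(100, 102, 101, 105, 110)
len <- length(c1)

s_ratio <- log(c1[-1] / c1[-len])  # log of successive ratios
s_diff  <- diff(log(c1))           # successive differences of logs

# The two forms agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(s_ratio, s_diff)))

# Specifying base = exp(1) is the same as using the default.
stopifnot(isTRUE(all.equal(log(10, base = exp(1)), log(10))))
```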

You would be surprised how many iterative algorithms can be accomplished with cumsum and diff. Bill Dunlap has demonstrated examples quite a few times in the mailing list archives, if you have time to search.
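
One small illustration of the cumsum/diff pairing, assuming hypothetical data: cumsum undoes diff, so the original series can be rebuilt from its first value and the log differences without writing a loop.

```r
# Hypothetical input data, for illustration only.
c1 <- c(100, 102, 101, 105, 110)

# Forward step: successive log differences.
s <- diff(log(c1))

# Reverse step: accumulate the differences and rescale by the first value.
# The leading 0 stands for "no change yet" at the first position.
rebuilt <- c1[1] * exp(cumsum(c(0, s)))

stopifnot(isTRUE(all.equal(rebuilt, c1)))
```

An explicit loop would carry a running total from one iteration to the next; cumsum does the same accumulation in a single vectorized call.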


Re: For Loop

Ista Zahn
In reply to this post by Wensui Liu
On Sat, Sep 22, 2018 at 9:06 PM Wensui Liu <[hidden email]> wrote:
>
> or this one:
>
> (Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:(len - 1)))

Oh dear god no.


Re: For Loop

Ista Zahn
On Sun, Sep 23, 2018 at 10:09 AM Wensui Liu <[hidden email]> wrote:
>
> Why?

The operations required for this algorithm are vectorized, as are most
operations in R. There is no need to iterate through each element.
Using Vectorize to achieve the iteration is no better than using
*apply or a for-loop, and betrays the same basic lack of insight into
basic principles of programming in R.

And/or, if you want a more practical reason:

> c1 <- 1:1000000
> len <- 1000000
> system.time( s1 <- log(c1[-1]/c1[-len]))
   user  system elapsed
  0.031   0.004   0.035
> system.time(s2 <- Vectorize(function(i) log(c1[i + 1] / c1[i])) (1:len))
   user  system elapsed
  1.258   0.022   1.282

Best,
Ista
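
To make the point about Vectorize concrete, here is a small sketch on hypothetical data. The helper name log_ratio_at is made up for illustration. Vectorize() wraps its argument in mapply(), so it still calls the function once per element, like sapply(). The function being wrapped also already accepts a whole index vector, because subsetting, division and log() are all vectorized.

```r
# Hypothetical input data, for illustration only.
c1 <- c(100, 102, 101, 105, 110)
len <- length(c1)
idx <- seq_len(len - 1)

# Illustrative helper: log ratio of element i + 1 to element i.
log_ratio_at <- function(i) log(c1[i + 1] / c1[i])

via_vectorize <- Vectorize(log_ratio_at)(idx)  # one call per element (mapply)
via_sapply    <- sapply(idx, log_ratio_at)     # one call per element
via_direct    <- log_ratio_at(idx)             # one call for the whole vector
via_vector    <- log(c1[-1] / c1[-len])        # the idiomatic form

# All four give the same answer; they differ only in how many R-level
# function calls are made along the way.
stopifnot(
  isTRUE(all.equal(via_vectorize, via_vector)),
  isTRUE(all.equal(via_sapply, via_vector)),
  isTRUE(all.equal(via_direct, via_vector))
)
```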


Re: For Loop

Wensui Liu
Actually, with the parallel pvec, the user time is a lot shorter. Or did
I miss your invaluable insight somewhere?

> c1 <- 1:1000000
> len <- length(c1)
> rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
                  test replications elapsed relative user.self sys.self
1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
  user.child sys.child
1          0         0
> rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1] / c1[i])), replications = 100)
                                                               test
1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
  replications elapsed relative user.self sys.self user.child sys.child
1          100   9.079        1     2.571    4.138      9.736     8.046

Re: For Loop

Ista Zahn
Your output is mangled in my email, but on my system your pvec
approach takes more than twice as long:

c1 <- 1:1000000
len <- length(c1)
library(parallel)
library(rbenchmark)

regular <- function() log(c1[-1]/c1[-len])
iterate.parallel <- function() {
  pvec(1:(len - 1), mc.cores = 4,
       function(i) log(c1[i + 1] / c1[i]))
}

benchmark(regular(), iterate.parallel(),
          replications = 100,
          columns = c("test", "elapsed", "relative"))
##                 test elapsed relative
## 2 iterate.parallel()   7.517    2.482
## 1          regular()   3.028    1.000

Honestly, just use log(c1[-1]/c1[-len]). The code is simple and easy
to understand and it runs pretty fast. There is usually no reason to
make it more complicated.
--Ista


Re: For Loop

Wensui Liu
What you measured is the "elapsed" time in the default setting. You
might want to take a closer look at the beautiful benchmark() function
and see which time I am talking about.

I just provided a tentative solution for the person asking for it, and I
believe he has enough wisdom to decide what's best. Why bother to
judge others subjectively?
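
For readers following the disagreement about which time is being compared, a short sketch of the components that system.time() reports may help. "elapsed" is wall-clock time, "user.self" and "sys.self" are CPU time spent in the R process itself, and "user.child" and "sys.child" are CPU time spent in child processes, such as the forks that pvec creates. In the pvec output posted earlier in the thread, user.self is 2.571 but user.child is 9.736 and elapsed is 9.079, against user.self 4.484 and elapsed 4.617 for the vectorized expression. The vector size below is arbitrary and the timings will vary by machine.

```r
# Arbitrary illustrative size; actual timings depend on the machine.
c1 <- 1:100000
len <- length(c1)

tm <- system.time(s <- log(c1[-1] / c1[-len]))

# The returned object carries five named components.
print(names(tm))

# Individual components can be extracted by name.
wall_clock <- tm[["elapsed"]]    # wall-clock seconds
cpu_self   <- tm[["user.self"]]  # CPU seconds in this R process
```

A low user.self figure alone does not show that less work was done if the work was moved into child processes.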

Re: For Loop

Sorkin, John
At the risk of asking something fundamental . . . .

does log(c1[-1]/c1[-len])

do the following:


(1) use all elements of c1 and perform the calculation,

(2) delete the first element of c1 and perform the calculation,

(3) delete the first two elements of c1 and perform the calculation,

 . . .

(n) use only the last element of c1 and perform the calculation?


Thank you,

John







Re: For Loop

Duncan Murdoch-2
On 23/09/2018 2:36 PM, Sorkin, John wrote:

> At the risk of asking something fundamental . . . .
>
> does log(c1[-1]/c1[-len]
>
> do the following
>
>
> (1) use all elements of c and perform the calculation
>
> (2) delete the first element of c and perform the calculation,
>
> (2) delete the first two elements of c and perform the calculation,
>
>   . . .
>
> (n) use only the last element of c and perform the calculation.

c1[-1] creates a new vector which is a copy of c1 leaving out element 1,
and c1[-len] creates a new vector which copies everything except element
len.  So your (1) is closest to the truth.

It is very similar to (but probably a little faster than)

log(c1[2:len]/c1[1:(len-1)])

There are differences in borderline cases (like length(c1) != len, or
len < 2) that are not relevant in the original context.

Duncan Murdoch
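
[Editorial sketch, not part of the original post.] The equivalence and the
borderline case described above can be checked on a small vector. The
names x, n, s_neg, s_pos, s_diff, edge_neg and edge_pos are made up for
illustration, and the diff(log()) form is an alternative spelling that
assumes strictly positive values.

```r
c1 <- c(10, 20, 40, 80)
len <- length(c1)

# Negative indices drop elements: c1[-1] is elements 2..len,
# c1[-len] is elements 1..(len - 1).
s_neg <- log(c1[-1] / c1[-len])
s_pos <- log(c1[2:len] / c1[1:(len - 1)])

# The same values written as differences of logs.
s_diff <- diff(log(c1))

# Borderline case: a length-1 vector. Here 2:n is c(2, 1) and 1:(n - 1)
# is c(1, 0), so the positive-index form returns values (including NA)
# where the negative-index form returns an empty vector.
x <- 5
n <- length(x)
edge_neg <- log(x[-1] / x[-n])
edge_pos <- log(x[2:n] / x[1:(n - 1)])
```

For ordinary input (length(c1) == len and len >= 2) all three forms agree;
they part ways only in the borderline case.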

>
>
> Thank you,
>
> John
>
> ________________________________
> From: R-help <[hidden email]> on behalf of Wensui Liu <[hidden email]>
> Sent: Sunday, September 23, 2018 2:26 PM
> To: Ista Zahn
> Cc: [hidden email]
> Subject: Re: [R] For Loop
>
>
> What you measured is the "elapsed" time in the default setting. You
> might need to take a closer look at the beautiful benchmark() function
> and see what time I am talking about.
>
> I just provided a tentative solution for the person asking for it and
> believe he has enough wisdom to decide what's best. Why bother to
> judge others subjectively?
> On Sun, Sep 23, 2018 at 1:18 PM Ista Zahn <[hidden email]> wrote:
>>
>> On Sun, Sep 23, 2018 at 1:46 PM Wensui Liu <[hidden email]> wrote:
>>>
>>> actually, by the parallel pvec, the user time is a lot shorter. or did
>>> I somewhere miss your invaluable insight?
>>>
>>>> c1 <- 1:1000000
>>>> len <- length(c1)
>>>> rbenchmark::benchmark(log(c1[-1]/c1[-len]), replications = 100)
>>>                    test replications elapsed relative user.self sys.self
>>> 1 log(c1[-1]/c1[-len])          100   4.617        1     4.484    0.133
>>>    user.child sys.child
>>> 1          0         0
>>>> rbenchmark::benchmark(pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1] / c1[i])), replications = 100)
>>>                                                                 test
>>> 1 pvec(1:(len - 1), mc.cores = 4, function(i) log(c1[i + 1]/c1[i]))
>>>    replications elapsed relative user.self sys.self user.child sys.child
>>> 1          100   9.079        1     2.571    4.138      9.736     8.046
>>
>> Your output is mangled in my email, but on my system your pvec
>> approach takes more than twice as long:
>>
>> c1 <- 1:1000000
>> len <- length(c1)
>> library(parallel)
>> library(rbenchmark)
>>
>> regular <- function() log(c1[-1]/c1[-len])
>> iterate.parallel <- function() {
>>    pvec(1:(len - 1), mc.cores = 4,
>>         function(i) log(c1[i + 1] / c1[i]))
>> }
>>
>> benchmark(regular(), iterate.parallel(),
>>            replications = 100,
>>            columns = c("test", "elapsed", "relative"))
>> ##                 test elapsed relative
>> ## 2 iterate.parallel()   7.517    2.482
>> ## 1          regular()   3.028    1.000
>>
>> Honestly, just use log(c1[-1]/c1[-len]). The code is simple and easy
>> to understand and it runs pretty fast. There is usually no reason to
>> make it more complicated.
>> --Ista

Re: For Loop

Jeff Newmiller
In reply to this post by Sorkin, John
Below...

On Sun, 23 Sep 2018, Sorkin, John wrote:

> At the risk of asking something fundamental . . . .
>
> does log(c1[-1]/c1[-len]

You dropped the closing parenthesis.

log( c1[-1] / c1[-len] )

>
> do the following
>
>
> (1) use all elements of c and perform the calculation

No. a) "c" is the base "concatenate" function, and b) it is using two
different subsets of the elements in c1.

> (2) delete the first element of c and perform the calculation,

It does not change c1. c1[-1] is an expression that creates an entirely
new (but unnamed) vector that contains everything but the first element of
c1.

> (2) delete the first two elements of c and perform the calculation,

You are wandering into the weeds here...

> . . .
>
> (n) use only the last element of c and perform the calculation.

No, c1[-len] creates a temporary vector that contains all elements except
the one(s) at the position(s) stored in the variable "len".  Note that the
more conventional syntax here is c1[ -length(c1) ].

c1 <- 1:3
c1[ -1 ]
#> [1] 2 3
c1[ -length(c1) ]
#> [1] 1 2
c1[ -1 ] / c1[ -length( c1 ) ] # c(2,3)/c(1,2)
#> [1] 2.0 1.5
log( c1[ -1 ] / c1[ -length( c1 ) ] ) # log( c(2, 1.5) )
#> [1] 0.6931472 0.4054651

#' Created on 2018-09-23 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

Re: For Loop

Ista Zahn
In reply to this post by Wensui Liu
On Sun, Sep 23, 2018 at 2:26 PM Wensui Liu <[hidden email]> wrote:
>
> What you measured is the "elapsed" time in the default setting. You
> might need to take a closer look at the beautiful benchmark() function
> and see what time I am talking about.

I'm pretty sure you do not know what you are talking about.

>
> I just provided a tentative solution for the person asking for it and
> believe he has enough wisdom to decide what's best. Why bother to
> judge others subjectively?

You are giving bad and confused advice. Please stop doing that.

--Ista

Re: For Loop

Jeff Newmiller
In reply to this post by Wensui Liu
On Sun, 23 Sep 2018, Wensui Liu wrote:

> What you measured is the "elapsed" time in the default setting. You
> might need to take a closer look at the beautiful benchmark() function
> and see what time I am talking about.

When I am waiting for the answer, elapsed time is what matters to me.
Also, since each person usually has different hardware, running benchmark
with multiple expressions as Ista did lets you pay attention to relative
comparisons.

Keep in mind that parallel processing requires extra time just to
distribute the calculations to the workers, so it doesn't pay to
distribute tiny tasks like calculating the division of two numeric vector
elements. That is the essence of vectorizing... bundle your simple
calculations together so the processor can focus on getting answers rather
than managing processes or even interpreting R for loops.
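
[Editorial sketch, not part of the original post.] One way to compare
elapsed time for the vectorized expression against an explicit for loop
on the same machine. The vector size and the names s_vec, s_loop, vec_time
and loop_time are assumptions for illustration; timings vary by hardware,
so none are quoted here. Note that the loop pre-allocates its result with
numeric(), which is the "s must exist" step the original question
mentioned.

```r
c1 <- as.numeric(1:100000)
len <- length(c1)

# Vectorized: one call each to `[`, `/` and log() over whole vectors.
vec_time <- system.time(
  s_vec <- log(c1[-1] / c1[-len])
)

# Explicit loop: the result vector has to be allocated first.
loop_time <- system.time({
  s_loop <- numeric(len - 1)
  for (i in seq_len(len - 1)) {
    s_loop[i] <- log(c1[i + 1] / c1[i])
  }
})

# Elapsed (wall-clock) seconds for each approach.
print(vec_time["elapsed"])
print(loop_time["elapsed"])
```

Both approaches produce the same values; only the time spent differs.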

> I just provided a tentative solution for the person asking for it and
> believe he has enough wisdom to decide what's best. Why bother to
> judge others subjectively?

I would say that Ista has backed up his objections with measurable
performance metrics, so while his initial reaction was pretty subjective I
think your reaction at this point is really off the mark.

One confusing aspect of your response is that Ista reacted to your
use of the Vectorize function, but you responded as though he reacted
to your use of the pvec function. I mentioned drawbacks of using pvec
above, but it really is important to stress that the Vectorize function is
a usability facade and is in no way a performance enhancement to be
associated with what we refer to as vectorized (lowercase) code.

The Vectorize function creates a wrapper that uses lapply to evaluate its
arguments and then hands them to mapply, which calls your R function with
scalar inputs as many times as needed, stores the results in a list, and
finally simplifies that list into a vector or matrix result. This is
generally less efficient than a simple for loop would have been, and far
from as efficient as a true vectorized solution such as
log(c1[-1]/c1[-len]) will normally be. Vectorize is
syntactic sugar with a performance penalty.
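
[Editorial sketch, not part of the original post.] The per-element calling
that Vectorize performs can be made visible with a call counter. The
function name ratio_at and the counter variable calls are hypothetical,
introduced only for this illustration.

```r
c1 <- c(1, 2, 4, 8)
calls <- 0L

# Counts how many times R enters the function body.
ratio_at <- function(i) {
  calls <<- calls + 1L
  log(c1[i + 1] / c1[i])
}

# Through Vectorize(): the body runs once per element of 1:3.
via_vectorize <- Vectorize(ratio_at)(1:3)
calls_vectorize <- calls

# Called directly with the whole index vector: the body runs once,
# because `[`, `/` and log() already work element-wise.
calls <- 0L
direct <- ratio_at(1:3)
calls_direct <- calls
```

The two results are the same; the difference is how many R-level function
calls were needed to get them.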

Please pay attention to the comments offered by others on this list...
being told your solution is inferior doesn't feel good but it is a very
real opportunity for you to improve.

End comment.

Re: For Loop

Duncan Murdoch-2
On 23/09/2018 3:31 PM, Jeff Newmiller wrote:

[lots of good stuff deleted]

> Vectorize is
> syntactic sugar with a performance penalty.

[More deletions.]

I would say Vectorize isn't just "syntactic sugar".  When I use that
term, I mean something that looks nice but is functionally equivalent.

However, Vectorize() really does something useful:  some functions (e.g.
outer()) take other functions as arguments, but they assume the argument
is a vectorized function.  If it is not, they fail, or generate garbage
results.  Vectorize() is designed to modify the interface to a function
so it acts as if it is vectorized.
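
[Editorial sketch, not part of the original post.] A small case of the
outer() situation described above, assuming a hypothetical scalar-only
function named larger. max() collapses its inputs to one number, so
outer() cannot reshape the result, while the Vectorize() wrapper returns
one value per pair.

```r
# A scalar-only function: max() returns a single number for any input.
larger <- function(x, y) max(x, y)

# outer() passes whole vectors to FUN and expects a result of matching
# length, so the unwrapped function fails; the message is captured here.
plain <- tryCatch(
  outer(1:3, 1:3, larger),
  error = function(e) conditionMessage(e)
)

# Wrapped with Vectorize(), the function is applied pair by pair.
wrapped <- outer(1:3, 1:3, Vectorize(larger))
```

In this particular case the truly vectorized pmax() would do the same job
faster; Vectorize() earns its keep when no such function exists and
writing one is not worth the programmer time.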

The "performance penalty" part of your statement is true.  It will
generally save some computing cycles to write a new function using a for
loop instead of using Vectorize().  But that may waste some programmer time.

Duncan Murdoch
(writing as one of the authors of Vectorize())

P.S. I'd give an example of syntactic sugar, but I don't want to bruise
some other author's feelings :-).

Re: For Loop

Wensui Liu
Very insightful. Thanks, Duncan

In your opinion, is there any benefit to using parallelism in a corporate
computing environment where the data run to far more than a million rows
and the server has multiple cores?

The question of whether or not to go concurrent matters to me for
production tasks rather than as something academic.

Really appreciate your thoughts.



Re: For Loop

Jeff Newmiller
In reply to this post by Duncan Murdoch-2
On Sun, 23 Sep 2018, Duncan Murdoch wrote:

> On 23/09/2018 3:31 PM, Jeff Newmiller wrote:
>
> [lots of good stuff deleted]
>
>> Vectorize is
>> syntactic sugar with a performance penalty.
>
> [More deletions.]
>
> I would say Vectorize isn't just "syntactic sugar".  When I use that term, I
> mean something that looks nice but is functionally equivalent.
>
> However, Vectorize() really does something useful:  some functions (e.g.
> outer()) take other functions as arguments, but they assume the argument is a
> vectorized function.  If it is not, they fail, or generate garbage results.
> Vectorize() is designed to modify the interface to a function so it acts as
> if it is vectorized.
>
> The "performance penalty" part of your statement is true.  It will generally
> save some computing cycles to write a new function using a for loop instead
> of using Vectorize().  But that may waste some programmer time.
>
> Duncan Murdoch
> (writing as one of the authors of Vectorize())
>
> P.S. I'd give an example of syntactic sugar, but I don't want to bruise some
> other author's feelings :-).

Perhaps my writing needs some syntactic sugar: inefficient looping
algorithms can make sense when the calculations performed in each
iteration are long and/or involve large amounts of data. As I mentioned
earlier in this thread, I use for loops fairly often, and I use other
inefficient syntactic sugar as well, but only to organize large
blocks of already-vectorized (lowercase) calculation units.

In addition to the potential for inefficient use of programmer time,
vectorizing code increases the maximum amount of memory used during
execution of your program. A for loop is one simple way to allow memory
re-use so really large problems can be solved with limited resources, and
some syntactic sugar such as Vectorize can make it easier to keep track of
those for loops.
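
A minimal sketch of the memory point, under assumptions made up for this example (the function names and the chunk size are hypothetical): a for loop over chunks keeps the temporaries small, while each chunk is still handled by vectorized calls.

```r
# Fully vectorized: log(x) allocates a temporary vector as long as x.
total_vectorized <- function(x) sum(log(x))

# Chunked: the loop only ever holds chunk-sized temporaries, and that
# memory can be re-used from one iteration to the next.
total_chunked <- function(x, chunk_size = 1e5) {
  if (length(x) == 0) return(0)
  total <- 0
  starts <- seq(1, length(x), by = chunk_size)
  for (s in starts) {
    e <- min(s + chunk_size - 1, length(x))
    total <- total + sum(log(x[s:e]))  # vectorized work inside the loop
  }
  total
}

set.seed(1)
x <- runif(1e6, min = 1, max = 2)
agree <- isTRUE(all.equal(total_vectorized(x), total_chunked(x)))
print(agree)
```

The two results are compared with all.equal() rather than identical() because summing in a different order can change the last few bits of a floating-point total.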

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


Re: For Loop

Duncan Murdoch-2
In reply to this post by Wensui Liu
On 23/09/2018 4:00 PM, Wensui Liu wrote:
> Very insightful. Thanks, Duncan
>
> In your opinion, is there any benefit to using parallelism in a
> corporate computing environment where the data is far larger than a
> million rows and the server has multiple cores?

I would say "try it and see".  Sometimes it probably helps a lot,
sometimes it's probably detrimental.
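
One basic way to "try it and see" is to time the same work serially and in parallel. The sketch below uses the parallel package that ships with R. The workload slow_task, the input size and the choice of 2 cores are all assumptions made up for illustration, and forking via mclapply() is not available on Windows, so the sketch falls back to a single core there.

```r
library(parallel)

# Hypothetical workload standing in for a real production task.
slow_task <- function(i) sum(sqrt(seq_len(2e5))) + i

inputs <- 1:40

# mclapply() forks worker processes; on Windows mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_serial   <- system.time(r_serial   <- lapply(inputs, slow_task))
t_parallel <- system.time(r_parallel <- mclapply(inputs, slow_task,
                                                 mc.cores = n_cores))

# Same answers either way; only the elapsed time should differ.
same <- isTRUE(all.equal(r_serial, r_parallel))
print(same)
print(t_serial)
print(t_parallel)
```

For small tasks the cost of starting workers and moving data around can outweigh the gain, which is one reason measuring on the real workload is more reliable than guessing.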

Duncan Murdoch

P.S. I last worked in a corporate computing environment 40 years ago
when I was still wet behind the ears, so you'd probably want to ask
someone else.  However, more recently I worked in an academic
environment where I learned to say "try it and see" in many different
ways.  You just got the basic one today.

