

Hi,
I recently learned of the existence of R through a physicist friend who
uses it in his research. I've used Octave for a decade, and C for 35
years, but would like to learn R. These all have advantages and
disadvantages for certain tasks, but as I'm new to R I hardly know how
to evaluate them. Any suggestions?
Thanks!

This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> Hi,
>
> I recently learned of the existence of R through a physicist friend who
> uses it in his research. I've used Octave for a decade, and C for 35
> years, but would like to learn R. These all have advantages and
> disadvantages for certain tasks, but as I'm new to R I hardly know how
> to evaluate them. Any suggestions?
* C is fast, but with a syntax that is (to my mind) virtually
incomprehensible. (You probably think differently about this.)
* In C, you essentially have to roll your own for all tasks; in R,
practically anything (well ...) that you want to do has already
been programmed up. CRAN is a wonderful resource, and there's more
on github.
* The syntax of R meshes beautifully with *my* thought patterns; YMMV.
* Why not just bog in and try R out? It's free, it's readily available,
and there are a number of good online tutorials.
cheers,
Rolf Turner

Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +6493737599 ext. 88276


R has many similarities to Octave. Have a look at:

https://cran.r-project.org/doc/contrib/R-and-octave.txt
https://CRAN.R-project.org/package=matconv

Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1877GKXGROUP
email: ggrothendieck at gmail.com


On 1/28/2019 4:20 PM, Rolf Turner wrote:
>
> On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
>
>> Hi,
>>
>> I recently learned of the existence of R through a physicist friend
>> who uses it in his research. I've used Octave for a decade, and C for
>> 35 years, but would like to learn R. These all have advantages and
>> disadvantages for certain tasks, but as I'm new to R I hardly know how
>> to evaluate them. Any suggestions?
>
> * C is fast, but with a syntax that is (to my mind) virtually
> incomprehensible. (You probably think differently about this.)
I've been doing it long enough that I have little problem with it,
except for pointers. :)
> * In C, you essentially have to roll your own for all tasks; in R,
> practically anything (well ...) that you want to do has already
> been programmed up. CRAN is a wonderful resource, and there's more
> on github.
>
> * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
>
> * Why not just bog in and try R out? It's free, it's readily available,
> and there are a number of good online tutorials.
I just installed R on my Linux Fedora system, so I'll do that.
I wonder if you'd care to comment on the little project that prompted
this? As part of another project, I wanted to model population growth
starting from a handful of individuals. Growth is exponential in the
long run, of course, but I wanted to see how a few basic parameters
affected the outcome. Using Octave, I modeled each person as a "cell",
which in Octave carries a good deal of overhead. The program looped
over the entire population and updated each person according to the
parameters, which included random statistical variations. So with a
population of, say, 10,000 and an update step of 1 day, the program had
to execute 10,000 x 365 update operations for each year of growth. For
large populations, say 100,000, the program did not return even after
24 hours of run time.
So I switched to C, and used its "struct" declaration and an array of
structs to model the population. This allowed the program to complete in
under a minute as opposed to 24 hours+. So in line with your comments, C
is far more efficient than Octave.
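
For readers following along, here is a rough sketch of the kind of
per-person daily loop described above, written in Python rather than
Octave or C. The original program was not posted, so the record layout,
the function name and every parameter (birth and death probabilities,
founder age) are made-up placeholders, not the actual model.

```python
import random
from dataclasses import dataclass

DAYS_PER_YEAR = 365


@dataclass
class Person:
    """One record per individual, loosely analogous to a C struct."""
    age_days: int
    alive: bool = True


def simulate(n_founders, years, daily_birth_prob, daily_death_prob, seed=0):
    """Run a per-individual daily update loop.

    Returns (final population size, number of per-person updates done).
    All rates are hypothetical placeholders for illustration only.
    """
    rng = random.Random(seed)
    population = [Person(age_days=20 * DAYS_PER_YEAR) for _ in range(n_founders)]
    updates = 0
    for _ in range(years * DAYS_PER_YEAR):
        newborns = []
        for person in population:
            updates += 1  # one update operation per person per day
            person.age_days += 1
            if rng.random() < daily_death_prob:
                person.alive = False
            elif rng.random() < daily_birth_prob:
                newborns.append(Person(age_days=0))
        # Drop the dead and add today's newborns before the next day starts.
        population = [p for p in population if p.alive] + newborns
    return len(population), updates


if __name__ == "__main__":
    size, updates = simulate(n_founders=10, years=5,
                             daily_birth_prob=0.0005, daily_death_prob=0.0001)
    print(size, updates)
```

The update counter makes the cost visible: the work per simulated year
is (number of living people) x 365, which is the 10,000 x 365 figure
mentioned above once the population reaches 10,000.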
How do you think R would fare in this simulation?
Alan



I would say your question is foolish -- you disagree no doubt! -- because
the point of using R (or Octave or C++) is to take advantage of the
packages (= "libraries" in some languages; a library is something different
in R) it (or they) offers to simplify your task. Many of R's libraries are
written in C (or Fortran) and thus **are** fast as well as having
task-appropriate functionality and UIs.
So I think instead of pursuing this discussion you would do well to search.
I find rseek.org to be especially good for this sort of thing. Searching
there on "demography" brought up what appeared to be many appropriate hits
-- including the "demography" package! -- which you could then examine to
see whether and to what extent they provide the functionality you seek.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)


This would be a suitable application for NetLogo. The R package
RNetLogo provides an interface. In a few lines of code you get a
simulation with graphics.

Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1877GKXGROUP
email: ggrothendieck at gmail.com


S (R's predecessor) was designed by and for data analysts. R generally
follows that tradition. I think that simulations such as yours are not its
strength, although it can make analyzing (graphically and numerically) the
results of the simulation fun.
Bill Dunlap
TIBCO Software
wdunlap tibco.com


If you forge on with your preconceptions of how such a simulation should be implemented then you will be able to reproduce your failure just as spectacularly using R as you did using Octave. It is crucial to employ vectorization of your algorithms if you want good performance with either Octave or R. That vectorization may either be over time or over separate simulations.
I am running simulations of a million cases of power plant performance over 25 years in about a minute. I know someone who used R to simulate a CFD river flow problem for a class in a few minutes, while others using Fortran or Matlab were struggling to get comparable runs completed in many hours. I believe the difference was in how the data were structured and manipulated more than in the language that was being used. I think the strong capabilities for presenting results in R make it advantageous over Octave, though.
If your problems truly need a compiled language, the Rcpp package lets you mix C++ with R quite easily, and then you get the best of both worlds. (C and Fortran are supported, but they are a bit more finicky to set up than C++.)
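
To illustrate the point about data structure, here is one possible restructuring, sketched in plain Python. It is not vectorization in the R or Octave sense, and it is not the method used in the simulations mentioned above: it simply keeps one count per age class instead of one record per person, and it replaces individual-level randomness with expected values (a deterministic cohort projection). The function name, the yearly step and all rates are assumptions made for the example.

```python
def project_cohorts(counts, survival, fertility, years):
    """Advance expected head counts per age class by `years` annual steps.

    counts[a]    expected number of people in age class a
    survival[a]  probability of surviving from class a to class a + 1
                 (the last entry is unused: the oldest class drops out)
    fertility[a] expected births per person in class a per year

    All three lists must have the same length. Every value is a
    hypothetical placeholder, not data from the thread.
    """
    counts = list(counts)
    for _ in range(years):
        # Total births this year become the new youngest class.
        births = sum(n * f for n, f in zip(counts, fertility))
        # Survivors of class a move up to class a + 1.
        aged = [n * s for n, s in zip(counts[:-1], survival[:-1])]
        counts = [births] + aged
    return counts


if __name__ == "__main__":
    start = [0, 10, 0]            # ten founders in the middle class
    survival = [1.0, 1.0, 0.0]
    fertility = [0.0, 1.0, 0.0]
    print(project_cohorts(start, survival, fertility, years=1))
```

Each step costs work proportional to the number of age classes, no matter how many people are in them, which is the kind of gain that comes from changing the data layout rather than the language.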

Sent from my phone. Please excuse my brevity.


On 1/28/19 4:00 PM, Alan Feuerbacher wrote:
> How do you think R would fare in this simulation?
>
This sounds like a problem that would fit into a stochastic differential
equation. There are at least three packages in CRAN (and I suspect a
few more) that will handle simulations of stochastic differential
equations. Bert's suggestion to use Rseek should serve you well.
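
As a rough illustration of what simulating such an equation involves,
here is a minimal Python sketch. It assumes a specific model that was
not stated in the thread, dN = r N dt + sigma N dW (exponential growth
with multiplicative noise), and integrates it with the Euler-Maruyama
scheme. The function name and all parameter values are placeholders;
the CRAN packages mentioned above have their own interfaces.

```python
import math
import random


def euler_maruyama_growth(n0, rate, sigma, dt, steps, seed=0):
    """Simulate dN = rate*N*dt + sigma*N*dW with the Euler-Maruyama scheme.

    Returns the whole path as a list of length steps + 1.
    The model and every parameter are assumptions for illustration.
    """
    rng = random.Random(seed)
    n = float(n0)
    path = [n]
    for _ in range(steps):
        # Brownian increment over one step: normal with variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        n += rate * n * dt + sigma * n * dw
        n = max(n, 0.0)  # a population cannot go negative
        path.append(n)
    return path


if __name__ == "__main__":
    # One simulated year in daily steps, with made-up rate and noise level.
    path = euler_maruyama_growth(n0=10, rate=0.03, sigma=0.1,
                                 dt=1 / 365, steps=365)
    print(path[-1])
```

The cost here depends only on the number of time steps, not on the
population size, at the price of treating the population as a
continuous quantity.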

David.


Two additional comments:
- depending on the nature of your problem you may be able to get an
analytic solution using branching processes. I found this approach
successful when I once had to model stem cell growth.
- in addition to NetLogo another alternative to R would be the Julia
language which is motivated to some degree by Octave but is actually
quite different and is particularly suitable in terms of performance
for iterative computations where one iteration depends on the prior
one.
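
To make the branching-process remark concrete, here is a small Python
sketch for the simplest case, a Galton-Watson process in which every
individual independently leaves k offspring with probability p_k. That
choice of model, the function names and the example distribution are
assumptions for illustration, not the stem cell model mentioned above.
The expected size after n generations is Z0 * m^n, where m is the mean
number of offspring, and the extinction probability is the smallest
non-negative solution of q = sum_k p_k q^k, found here by iterating
from q = 0.

```python
def mean_offspring(offspring_probs):
    """Mean number of offspring, m = sum over k of k * p_k."""
    return sum(k * p for k, p in offspring_probs.items())


def expected_size(z0, offspring_probs, generations):
    """Expected population after `generations` steps: z0 * m ** generations."""
    return z0 * mean_offspring(offspring_probs) ** generations


def extinction_probability(offspring_probs, z0=1, tol=1e-12, max_iter=100000):
    """Probability that a line started by z0 founders eventually dies out.

    Iterates q <- sum_k p_k * q**k starting from q = 0, which converges
    to the smallest non-negative fixed point. Founders are assumed to be
    independent, so the result for z0 founders is q ** z0.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = sum(p * q ** k for k, p in offspring_probs.items())
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return q ** z0


if __name__ == "__main__":
    # Hypothetical offspring distribution: 0, 1 or 2 children.
    probs = {0: 0.25, 1: 0.25, 2: 0.5}
    print(mean_offspring(probs))
    print(expected_size(4, probs, 3))
    print(extinction_probability(probs, z0=4))
```

For this example distribution the fixed-point equation
q = 0.25 + 0.25 q + 0.5 q^2 has roots 0.5 and 1, so a single founder's
line dies out with probability 0.5, with no simulation required.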

Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1877GKXGROUP
email: ggrothendieck at gmail.com


On 1/28/2019 5:17 PM, Bert Gunter wrote:
> I would say your question is foolish -- you disagree no doubt! --
> because the point of using R (or Octave or C++) is to take advantage of
> the packages (= "libraries" in some languages; a library is something
> different in R) it (or they) offers to simplify your task. Many of R's
> libraries are written in C (or Fortran) and thus **are** fast as well as
> having task-appropriate functionality and UIs.
Yes, I'm well aware of the libraries in Octave. But so far as I was able
to see, none of them fit my needs. I used Octave at first because I'm
familiar with it, though I'm far from an expert.
> So I think instead of pursuing this discussion you would do well to
> search. I find rseek.org to be especially good for
> this sort of thing. Searching there on "demography" brought up what
> appeared to be many appropriate hits -- including the "demography"
> package! -- which you could then examine to see whether and to what
> extent they provide the functionality you seek.
I looked over the demography package, and it indeed appears to do what I
want. But it seems to be far more complicated than my simple problem
requires, and it has a steep learning curve.
Alan


On 1/28/2019 6:07 PM, William Dunlap wrote:
> S (R's predecessor) was designed by and for data analysts. R generally
> follows that tradition. I think that simulations such as yours are not
> its strength, although it can make analyzing (graphically and
> numerically) the results of the simulation fun.
At this point I think you're right on all counts.
Alan
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com < http://tibco.com>
>
>
> On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher <[hidden email]> wrote:
>
> On 1/28/2019 4:20 PM, Rolf Turner wrote:
> >
> > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> >
> >> Hi,
> >>
> >> I recently learned of the existence of R through a physicist friend
> >> who uses it in his research. I've used Octave for a decade, and C for
> >> 35 years, but would like to learn R. These all have advantages and
> >> disadvantages for certain tasks, but as I'm new to R I hardly know how
> >> to evaluate them. Any suggestions?
> >
> > * C is fast, but with a syntax that is (to my mind) virtually
> > incomprehensible. (You probably think differently about this.)
>
> I've been doing it long enough that I have little problem with it,
> except for pointers. :)
>
> > * In C, you essentially have to roll your own for all tasks; in R,
> > practically anything (well ...) that you want to do has already
> > been programmed up. CRAN is a wonderful resource, and there's more
> > on github.
> >
> > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
> >
> > * Why not just bog in and try R out? It's free, it's readily available,
> > and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential in
> the long run, of course, but I wanted to see how a few basic parameters
> affected the outcome. Using Octave, I modeled a single person as a
> "cell", which in Octave has a good deal of overhead. The program
> basically looped over the entire population, and updated each person
> according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say 100,000,
> the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete in
> under a minute as opposed to 24 hours+. So in line with your comments, C
> is far more efficient than Octave.
>
> How do you think R would fare in this simulation?
>
> Alan


On 1/28/2019 7:51 PM, Jeff Newmiller wrote:
> If you forge on with your preconceptions of how such a simulation should be implemented then you will be able to reproduce your failure just as spectacularly using R as you did using Octave.
I think I've come to the same conclusion. :)
> It is crucial to employ vectorization of your algorithms if you want good performance with either Octave or R. That vectorization may either be over time or over separate simulations.
Please explain further, if you don't mind. My background is not in
programming, but in analog microchip circuit design (I'm now retired).
Thus I'm a user of circuit simulators, not a programmer of them. Also,
I'm running this stuff on my home computers, either Linux or Windows
machines.
> I am running simulations of a million cases of power plant performance over 25 years in about a minute. I know someone who used R to simulate a CFD river flow problem in a class in a few minutes, while others using Fortran or Matlab were struggling to get comparable runs completed in many hours. I believe the difference was in how the data were structured and manipulated more than the language that was being used. I think the strong capabilities for presenting results using R makes using it advantageous over Octave, though.
After my failed attempt at using Octave, I realized that most likely the
main contributing factor was that I was not able to figure out an
efficient data structure to model one person. But C lent itself
perfectly to my idea of how to go about programming my simulation. So
here's a simplified pseudocode sort of example of what I did:

To model a single reproducing woman I used this C construct:

    typedef struct woman {
        int isAlive;
        int isPregnant;
        double age;
        . . .
    } WOMAN;

Then I allocated memory for a big array of these things, using the C
malloc() function, which gave me the equivalent of this statement:

    WOMAN women[NWOMEN]; /* An array of NWOMEN woman-structs */

After some initialization I set up two loops:

    for( j=0; j<numberOfYears; j++) {
        for(i=0; i< numberOfWomen; i++) {
            updateWomen();
        }
    }

The function updateWomen() figures out things like whether the woman
becomes pregnant or gives birth on a given day, dies, etc.
I added other refinements that are not relevant here, such as random
variations of various parameters, using the GNU Scientific Library
random number generator functions.
If you can suggest a data construct in R or Octave that does something
like this, and uses your idea of vectorization, I'd like to hear it. I'd
like to implement it and compare results with my C implementation.
> If your problems truly need a compiled language, the Rcpp package lets you mix C++ with R quite easily and then you get the best of both worlds. (C and Fortran are supported, but they are a bit more finicky to setup than C++).
I don't know the answer to that, but perhaps you can help decide.
Alan


On 1/29/2019 8:11 AM, Gabor Grothendieck wrote:
> Two additional comments:
>
> - depending on the nature of your problem you may be able to get an
> analytic solution using branching processes. I found this approach
> successful when I once had to model stem cell growth.
That sounds very interesting! Please see my reply to Jeff Newmiller. Not
being a mathematician, I have no clue how to go about this but would be
very interested to learn.
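
For readers wondering what the branching-process idea looks like in practice, here is a minimal sketch in R. It assumes a simple Galton-Watson process in which every woman independently has a Poisson-distributed number of daughters; the mean `m`, the function name `simulateGenerations`, and the Poisson choice are illustrative assumptions, not anything taken from Alan's model. The analytic result it checks is that the expected size of generation n is n0 * m^n, which needs no simulation at all.

```r
# Hypothetical Galton-Watson sketch: each individual has Poisson(m) daughters.
simulateGenerations <- function( n0, m, nGen ) {
  size <- numeric( nGen + 1 )
  size[ 1 ] <- n0
  for ( g in seq_len( nGen ) ) {
    # one vectorized draw per current individual, summed to get the next generation
    size[ g + 1 ] <- sum( rpois( size[ g ], lambda = m ) )
  }
  size
}

set.seed( 42 )
n0 <- 5; m <- 1.2; nGen <- 10
runs <- replicate( 2000, simulateGenerations( n0, m, nGen ) )
simulatedMean <- rowMeans( runs )       # average over runs, per generation
analyticMean  <- n0 * m ^ ( 0:nGen )    # E[Z_n] = n0 * m^n
print( round( cbind( simulatedMean, analyticMean ), 1 ) )
```

The two columns should track each other closely; a real model with age structure and gestation delays would need a more elaborate process than this one.
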
> - in addition to NetLogo another alternative to R would be the Julia
> language which is motivated to some degree by Octave but is actually
> quite different and is particularly suitable in terms of performance
> for iterative computations where one iteration depends on the prior
> one.
Given my response to Jeff Newmiller, do your comments still apply?
Alan
> On Mon, Jan 28, 2019 at 6:32 PM Gabor Grothendieck
> < [hidden email]> wrote:
>>
>> R has many similarities to Octave. Have a look at:
>>
>> https://cran.r-project.org/doc/contrib/R-and-octave.txt
>> https://CRAN.R-project.org/package=matconv
>>
>> On Mon, Jan 28, 2019 at 4:58 PM Alan Feuerbacher < [hidden email]> wrote:
>>>
>>> Hi,
>>>
>>> I recently learned of the existence of R through a physicist friend who
>>> uses it in his research. I've used Octave for a decade, and C for 35
>>> years, but would like to learn R. These all have advantages and
>>> disadvantages for certain tasks, but as I'm new to R I hardly know how
>>> to evaluate them. Any suggestions?
>>>
>>> Thanks!
>>
>>
>>
>> 
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com


On Tue, 29 Jan 2019, Alan Feuerbacher wrote:
> On 1/28/2019 7:51 PM, Jeff Newmiller wrote:
>> If you forge on with your preconceptions of how such a simulation should be
>> implemented then you will be able to reproduce your failure just as
>> spectacularly using R as you did using Octave.
>
> I think I've come to the same conclusion. :)
>
>> It is crucial to employ vectorization of your algorithms if you want good
>> performance with either Octave or R. That vectorization may either be over
>> time or over separate simulations.
>
> Please explain further, if you don't mind. My background is not in
> programming, but in analog microchip circuit design (I'm now retired). Thus
> I'm a user of circuit simulators, not a programmer of them. Also, I'm running
> this stuff on my home computers, either Linux or Windows machines.
>
>> I am running simulations of a million cases of power plant performance over
>> 25 years in about a minute. I know someone who used R to simulate a CFD
>> river flow problem in a class in a few minutes, while others using Fortran
>> or Matlab were struggling to get comparable runs completed in many hours. I
>> believe the difference was in how the data were structured and manipulated
>> more than the language that was being used. I think the strong capabilities
>> for presenting results using R makes using it advantageous over Octave,
>> though.
>
> After my failed attempt at using Octave, I realized that most likely the main
> contributing factor was that I was not able to figure out an efficient data
> structure to model one person. But C lent itself perfectly to my idea of how
> to go about programming my simulation. So here's a simplified pseudocode sort
> of example of what I did:
Don't model one person... model an array of people.
> To model a single reproducing woman I used this C construct:
>
>     typedef struct woman {
>         int isAlive;
>         int isPregnant;
>         double age;
>         . . .
>     } WOMAN;

# e.g.
Nwomen <- 100
women <- data.frame( isAlive = rep( TRUE, Nwomen )
                   , isPregnant = rep( FALSE, Nwomen )
                   , age = rep( 20, Nwomen )
                   )

> Then I allocated memory for a big array of these things, using the C malloc()
> function, which gave me the equivalent of this statement:
>
>     WOMAN women[NWOMEN]; /* An array of NWOMEN woman-structs */
>
> After some initialization I set up two loops:
>
>     for( j=0; j<numberOfYears; j++) {
>         for(i=0; i< numberOfWomen; i++) {
>             updateWomen();
>         }
>     }

for ( j in seq.int( numberOfYears ) ) {
  # let vectorized data storage automatically handle the other for loop
  women <- updateWomen( women )
}
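
To make the "let vectorized data storage handle the other loop" comment concrete, the following sketch compares a C-style row-by-row update with a vectorized one for a single, deliberately trivial rule (aging the living). The helper names `ageByLoop` and `ageVectorized` and the sample values are made up for illustration; a real update step would carry many more rules.

```r
# Hypothetical helpers; timeStep is in years and the values are made up.
ageByLoop <- function( women, timeStep ) {
  for ( i in seq_len( nrow( women ) ) ) {   # C-style: one row at a time
    if ( women$isAlive[ i ] ) {
      women$age[ i ] <- women$age[ i ] + timeStep
    }
  }
  women
}

ageVectorized <- function( women, timeStep ) {
  alive <- women$isAlive                    # logical vector, one entry per woman
  women$age[ alive ] <- women$age[ alive ] + timeStep
  women
}

women <- data.frame( isAlive = c( TRUE, FALSE, TRUE )
                   , age = c( 20, 85, 31 )
                   )
byLoop <- ageByLoop( women, timeStep = 1 / 365 )
byVector <- ageVectorized( women, timeStep = 1 / 365 )
print( all.equal( byLoop, byVector ) )
```

Both functions produce the same data frame; the vectorized version simply hands the inner loop to R's internals, which is where the speed difference on large populations would be expected to come from.
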
> The function updateWomen() figures out things like whether the woman becomes
> pregnant or gives birth on a given day, dies, etc.
You can use your "fixed size" allocation strategy with flags indicating
whether specific rows are in use, or you can work only with valid rows and
add rows as needed for children. It is best to compute a logical vector
that identifies all of the birthing mothers as a subset of the data frame,
build a set of children rows using the birthing-mothers data frame as
input, and then rbind the new rows to the updated women data frame as
appropriate.

The clearest approach for individual decision calculations is the
vectorized "ifelse" function, though under certain circumstances putting
an indexed subset on the left side of an assignment can modify memory
"in place". (The functional-programming restriction against this is
probably a foreign idea to a dyed-in-the-wool C programmer, but R usually
prevents you from modifying the variable that was input to a function,
automatically making a local copy of the input as needed in order to
prevent such backwash into the caller's context.)
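
A small sketch may help with the two ideas in that paragraph: a function works on its own copy of the data frame, and new children are appended with rbind. The function `markDead` and all the values here are hypothetical, and the rule "every living pregnant woman gives birth" is a placeholder for whatever the real model decides.

```r
# Hypothetical example: markDead() and the column values are illustrative only.
markDead <- function( DF, maxAge ) {
  DF$isAlive[ DF$age >= maxAge ] <- FALSE  # modifies the function's local copy
  DF                                       # return the updated copy
}

women <- data.frame( isAlive = c( TRUE, TRUE, TRUE )
                   , isPregnant = c( FALSE, TRUE, TRUE )
                   , age = c( 25, 30, 90 )
                   )
updated <- markDead( women, maxAge = 80 )
# the caller's `women` is untouched until the result is assigned back
print( women$isAlive )
print( updated$isAlive )

# logical vector picking out birthing mothers (placeholder rule, see above)
birthIdx <- updated$isAlive & updated$isPregnant
children <- data.frame( isAlive = rep( TRUE, sum( birthIdx ) )
                      , isPregnant = rep( FALSE, sum( birthIdx ) )
                      , age = rep( 0, sum( birthIdx ) )
                      )
women <- rbind( updated, children )        # append one row per birth
print( nrow( women ) )
```

The point for a C programmer is the assignment `updated <- markDead( women, ... )`: without it, the change made inside the function is simply discarded.
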
> I added other refinements that are not relevant here, such as random
> variations of various parameters, using the GNU Scientific Library random
> number generator functions.
R has quite sophisticated random number generation by default.
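
As a rough illustration of the built-in generators, the sketch below draws one random value per woman in a single call. The probability, the lifespan distribution and the variable names are assumptions chosen only for the example; R's default generator is the Mersenne Twister, and `set.seed` makes a run repeatable.

```r
set.seed( 2019 )                 # fix the seed so a run can be repeated
nWomen <- 1000
pregProb <- 0.01                 # assumed per-step conception probability
lifespan <- rnorm( nWomen, mean = 80, sd = 10 )   # assumed lifespan spread
conceives <- runif( nWomen ) < pregProb           # one Bernoulli draw per woman
print( RNGkind()[ 1 ] )          # name of the uniform generator in use
print( sum( conceives ) )        # how many conceived in this step
```
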
> If you can suggest a data construct in R or Octave that does something like
> this, and uses your idea of vectorization, I'd like to hear it. I'd like to
> implement it and compare results with my C implementation.
>
>> If your problems truly need a compiled language, the Rcpp package lets you
>> mix C++ with R quite easily and then you get the best of both worlds. (C
>> and Fortran are supported, but they are a bit more finicky to setup than
>> C++).
>
> I don't know the answer to that, but perhaps you can help decide.
>
> Alan
>
>

Jeff Newmiller The ..... ..... Go Live...
DCN:< [hidden email]> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k



On 1/29/2019 11:50 PM, Jeff Newmiller wrote:
Thanks very much for providing these coding examples! I think this is a
good way to learn some R.
Alan


On 1/29/2019 11:50 PM, Jeff Newmiller wrote:
> On Tue, 29 Jan 2019, Alan Feuerbacher wrote:
>
>> After my failed attempt at using Octave, I realized that most likely
>> the main contributing factor was that I was not able to figure out an
>> efficient data structure to model one person. But C lent itself
>> perfectly to my idea of how to go about programming my simulation. So
>> here's a simplified pseudocode sort of example of what I did:
>
> Don't model one person... model an array of people.
>
>> To model a single reproducing woman I used this C construct:
>>
>>     typedef struct woman {
>>         int isAlive;
>>         int isPregnant;
>>         double age;
>>         . . .
>>     } WOMAN;
>
> # e.g.
> Nwomen <- 100
> women <- data.frame( isAlive = rep( TRUE, Nwomen )
>                    , isPregnant = rep( FALSE, Nwomen )
>                    , age = rep( 20, Nwomen )
>                    )
>
>> Then I allocated memory for a big array of these things, using the C
>> malloc() function, which gave me the equivalent of this statement:
>>
>>     WOMAN women[NWOMEN]; /* An array of NWOMEN woman-structs */
>>
>> After some initialization I set up two loops:
>>
>>     for( j=0; j<numberOfYears; j++) {
>>         for(i=0; i< numberOfWomen; i++) {
>>             updateWomen();
>>         }
>>     }
>
> for ( j in seq.int( numberOfYears ) ) {
>   # let vectorized data storage automatically handle the other for loop
>   women <- updateWomen( women )
> }
>
>> The function updateWomen() figures out things like whether the woman
>> becomes pregnant or gives birth on a given day, dies, etc.
>
> You can use your "fixed size" allocation strategy with flags indicating
> whether specific rows are in use, or you can only work with valid rows
> and add rows as needed for children... best to compute a logical vector
> that identifies all of the birthing mothers as a subset of the data
> frame, and build a set of children rows using the birthing mothers data
> frame as input, and then rbind the new rows to the updated women
> dataframe as appropriate. The most clear approach for individual
> decision calculations is the use of the vectorized "ifelse" function,
> though under certain circumstances putting an indexed subset on the left
> side of an assignment can modify memory "in place" (the
> functional-programming restriction against this is probably a foreign
> idea to a dyed-in-the-wool C programmer, but R usually prevents you from
> modifying the variable that was input to a function, automatically
> making a local copy of the input as needed in order to prevent such
> backwash into the caller's context).
Hi Jeff,
I'm well along in implementing your suggestions, but I don't understand
the last paragraph. Here is part of the experimenting I've done so far:
*=======*=======*=======*=======*=======*=======*
updatePerson <- function() {
  ifelse( women$isAlive,
    {
      # Check whether to kill off this person, if she's pregnant whether
      # to give birth, whether to make her pregnant again.
      women$age = women$age + timeStep
      # Check if the person has reached maxAge
    }
  )
}
calculatePopulation <- function() {
  lastDate = 0
  jd = 0
  while( jd < maxDate ) {
    for( i in seq_len( nWomen ) ) {
      updatePerson();
    }
    todaysDateInt = floor(jd/dpy)
    NAlive[todaysDateInt] = nWomen - nDead
    # Do various other things
    todaysDate = todaysDate + timeStep
    jd = jd + timeStep
  }
}
nWomen <- 5
numberOfYears <- 30
women <- data.frame( isAlive = rep_len( TRUE, nWomen )
                   , isPregnant = rep_len( FALSE, nWomen )
                   , nChildren = rep_len( 0L, nWomen )
                   , ageInt = rep_len( 0L, nWomen )
                   , age = rep_len( 0, nWomen )
                   , dateOfPregnancy = rep_len( 0, nWomen )
                   , endDateLastPregnancy = rep_len( 0.0, nWomen )
                   , minBirthAge = rep_len( 0, nWomen )
                   , maxBirthAge = rep_len( 0, nWomen )
                   )
# . . .
calculatePopulation()
*=======*=======*=======*=======*=======*=======*
The above code (in its complete form) executes without errors. I don't
understand at least two things:
In the updatePerson function, in the ifelse statement, how do I change
the appropriate values in the women dataframe?
I don't understand most of your last paragraph at all.
Thanks so much for your help in learning R!
Alan



Your code seems to be attempting to modify global variables from within
functions... R purposely makes this hard to do. Don't fight it. Instead,
use function arguments and produce function outputs with your functions.
Also, the ifelse function does not control the flow of execution of code... it
selects values from two vectors according to the state of the logical
input vector. Note that both candidate vectors must be computed in full
before ifelse can do its magic, so ifelse can be significantly slower than
assigning into an indexed vector if only a small fraction of the vector
will be changing.
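
A small sketch of the difference, using made-up vectors (age, isAlive and timeStep here are toy values, not your simulation state). The last two lines are only there to show that a block of code handed to ifelse is not run once per element.

```r
age <- c( 10, 20, 30, 40 )
isAlive <- c( TRUE, FALSE, TRUE, TRUE )
timeStep <- 5

# ifelse builds a complete new vector: "age + timeStep" and "age" are both
# full-length vectors, and elements are then picked from one or the other
ageViaIfelse <- ifelse( isAlive, age + timeStep, age )

# indexed assignment only touches the selected elements
ageViaIndex <- age
ageViaIndex[ isAlive ] <- ageViaIndex[ isAlive ] + timeStep

# ifelse is not flow control: code placed in a "branch" is evaluated once
# for the whole vector, not once per TRUE element
counter <- 0
invisible( ifelse( isAlive, { counter <- counter + 1; age }, age ) )
```

Both approaches give the same ages here; counter ends up at 1 even though three elements of isAlive are TRUE.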
Below is some proof-of-concept code. It mostly modifies values in-place
within the data frame rather than using ifelse.
You might want to read the Intro to R document available through the R
console via:
RShowDoc("R-intro")
to look up numeric indexing and logical indexing syntax while reading
through this.
#####################
makeNewWomen <- function( nWomen ) {
  data.frame( isAlive = rep_len( TRUE, nWomen )
            , isPregnant = rep_len( FALSE, nWomen )
            , nChildren = rep_len( 0L, nWomen )
            , age = rep_len( 0, nWomen )
            , dateOfPregnancy = rep_len( 0, nWomen )
            , endDateLastPregnancy = rep_len( 0.0, nWomen )
            )
}
updateWomen <- function( DF
                       , jd
                       , maxAge
                       , timeStep
                       , pregProb
                       , gestation
                       , minBirthAge
                       , maxBirthAge
                       ) {
  DF$isAlive[ maxAge <= DF$age ] <- FALSE
  fertileIdx <- with( DF, isAlive & !isPregnant & minBirthAge <= age & age <= maxBirthAge )
  conceiveIdx <- fertileIdx
  conceiveIdx[ conceiveIdx ] <- sample( c( FALSE, TRUE )
                                      , size = sum( fertileIdx )
                                      , replace = TRUE
                                      , prob = c( 1 - pregProb, pregProb )
                                      )
  DF$isPregnant[ conceiveIdx ] <- TRUE
  DF$dateOfPregnancy[ conceiveIdx ] <- jd
  birthIdx <- with( DF, isAlive & isPregnant & ( dateOfPregnancy + gestation ) <= jd )
  femalechild <- sample( c( FALSE, TRUE )
                       , size = sum( birthIdx ) # random within birthing group
                       , replace = TRUE
                       , prob = c( 0.5, 0.5 )
                       )
  DF$isPregnant[ birthIdx ] <- FALSE # pregnancy over
  birthIdx[ birthIdx ] <- femalechild # track births further only where female
  # DF$age <- ifelse( DF$isAlive
  #                 , DF$age + timeStep
  #                 , DF$age
  #                 )
  DF$age[ DF$isAlive ] <- DF$age[ DF$isAlive ] + timeStep
  numNotAlive <- sum( !DF$isAlive )
  numBirths <- sum( birthIdx )
  if ( 0 < numBirths ) { # if needed, start female babies in existing or new rows
    if ( 0 < numNotAlive ) {
      reuseidx <- which( !DF$isAlive )
      if ( numBirths <= numNotAlive ) {
        # can fit all new births into existing DF
        reuseidx <- reuseidx[ seq.int( numBirths ) ]
        DF[ reuseidx, ] <- makeNewWomen( numBirths )
      } else {
        DF[ reuseidx, ] <- makeNewWomen( length( reuseidx ) )
        DF <- rbind( DF
                   , makeNewWomen( numBirths - length( reuseidx ) )
                   )
      }
    } else { # no empty rows in DF
      DF <- rbind( DF
                 , makeNewWomen( numBirths )
                 )
    }
  }
  DF # return the updated data frame to the caller
}
calculatePopulation <- function( nWomen
                               , maxDate
                               , dpy
                               , pregProb
                               , maxAge
                               , timeStep
                               , gestation
                               , minBirthAge
                               , maxBirthAge
                               , prealloc
                               ) {
  jd <- 0
  nextSampleJd <- jd + dpy
  numSamples <- maxDate %/% dpy
  result <- data.frame( jd = rep( NA, numSamples )
                      , NAlive = rep( NA, numSamples )
                      , NPreg = rep( NA, numSamples )
                      , NNotAlive = rep( NA, numSamples )
                      )
  i <- 1L
  DF <- makeNewWomen( prealloc )
  DF$isAlive <- seq.int( prealloc ) <= nWomen # leave most entries "dead"
  while( jd < maxDate ) {
    DF <- updateWomen( DF
                     , jd
                     , maxAge
                     , timeStep
                     , pregProb
                     , gestation
                     , minBirthAge
                     , maxBirthAge
                     )
    if ( nextSampleJd <= jd ) {
      result$jd[ i ] <- jd
      result$NAlive[ i ] <- sum( DF$isAlive )
      result$NPreg[ i ] <- sum( DF$isPregnant )
      result$NNotAlive[ i ] <- sum( !DF$isAlive ) # store in row i only, not the whole column
      nextSampleJd <- nextSampleJd + dpy
      i <- i + 1L
    }
    # Do various other things
    jd <- jd + timeStep
  }
  result
}
nWomen <- 5
numberOfYears <- 30
maxDate <- 300 * 365
dpy <- 365
pregProb <- 0.01
maxAge <- 50 * 365
minBirthAge <- 18 * 365
maxBirthAge <- 45 * 365
timeStep <- 30
gestation <- 30 * 9
prealloc <- 10000
set.seed(42)
simresult <- calculatePopulation( nWomen
                                , maxDate
                                , dpy
                                , pregProb
                                , maxAge
                                , timeStep
                                , gestation
                                , minBirthAge
                                , maxBirthAge
                                , prealloc
                                )
plot( simresult$jd/365, simresult$NAlive )
plot( simresult$jd/365, simresult$NNotAlive )
plot( simresult$jd/365, simresult$NPreg )
#####################
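
The least obvious idiom in the code above is probably the index-into-itself assignment used for conceiveIdx and birthIdx. Here it is in isolation, as a sketch with made-up names and numbers (eligible, chosen and the 0.3 probability are illustrative only): a logical vector is indexed by itself, so only the TRUE positions are overwritten with fresh random draws and the FALSE positions stay FALSE.

```r
set.seed( 1 )                                   # arbitrary seed, for repeatability only
eligible <- c( TRUE, FALSE, TRUE, TRUE, FALSE ) # e.g. the rows that could conceive
chosen <- eligible
# one random draw per eligible row; rows that were FALSE are never touched
chosen[ chosen ] <- sample( c( FALSE, TRUE )
                          , size = sum( eligible )
                          , replace = TRUE
                          , prob = c( 0.7, 0.3 )
                          )
```

The result has the same length as eligible, so it can be used directly as a row index into the data frame, and it can never select a row that was not eligible to begin with.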
On Sat, 2 Feb 2019, Alan Feuerbacher wrote:
> On 1/29/2019 11:50 PM, Jeff Newmiller wrote:
>> On Tue, 29 Jan 2019, Alan Feuerbacher wrote:
>>
>>> After my failed attempt at using Octave, I realized that most likely the
>>> main contributing factor was that I was not able to figure out an
>>> efficient data structure to model one person. But C lent itself perfectly
>>> to my idea of how to go about programming my simulation. So here's a
>>> simplified pseudocode sort of example of what I did:
>>
>> Don't model one person... model an array of people.
>>
>>> To model a single reproducing woman I used this C construct:
>>>
>>> typedef struct woman {
>>> int isAlive;
>>> int isPregnant;
>>> double age;
>>> . . .
>>> } WOMAN;
>>
>> # e.g.
>> Nwomen <- 100
>> women <- data.frame( isAlive = rep( TRUE, Nwomen )
>> , isPregnant = rep( FALSE, Nwomen )
>> , age = rep( 20, Nwomen )
>> )
>>
>>> Then I allocated memory for a big array of these things, using the C
>>> malloc() function, which gave me the equivalent of this statement:
>>>
>>> WOMAN women[NWOMEN]; /* An array of NWOMEN womanstructs */
>>>
>>> After some initialization I set up two loops:
>>>
>>> for( j=0; j<numberOfYears; j++) {
>>> for(i=1; i< numberOfWomen; i++) {
>>> updateWomen();
>>> }
>>> }
>>
>> for ( j in seq.int( numberOfYears ) ) {
>> # let vectorized data storage automatically handle the other for loop
>> women <- updateWomen( women )
>> }
>>
>>> The function updateWomen() figures out things like whether the woman
>>> becomes pregnant or gives birth on a given day, dies, etc.
>>
>> You can use your "fixed size" allocation strategy with flags indicating
>> whether specific rows are in use, or you can only work with valid rows and
>> add rows as needed for children... best to compute a logical vector that
>> identifies all of the birthing mothers as a subset of the data frame, and
>> build a set of children rows using the birthing mothers data frame as
>> input, and then rbind the new rows to the updated women dataframe as
>> appropriate. The most clear approach for individual decision calculations
>> is the use of the vectorized "ifelse" function, though under certain
>> circumstances putting an indexed subset on the left side of an assignment
>> can modify memory "in place" (the functional-programming restriction
>> against this is probably a foreign idea to a dyed-in-the-wool C programmer,
>> but R usually prevents you from modifying the variable that was input to a
>> function, automatically making a local copy of the input as needed in order
>> to prevent such backwash into the caller's context).
>
> Hi Jeff,
>
> I'm well along in implementing your suggestions, but I don't understand the
> last paragraph. Here is part of the experimenting I've done so far:
>
> *=======*=======*=======*=======*=======*=======*
> updatePerson <- function() {
> ifelse( women$isAlive,
> {
> # Check whether to kill off this person, if she's pregnant whether
> # to give birth, whether to make her pregnant again.
> women$age = women$age + timeStep
> # Check if the person has reached maxAge
> }
> )
> }
>
> calculatePopulation <- function() {
> lastDate = 0
> jd = 0
> while( jd < maxDate ) {
> for( i in seq_len( nWomen ) ) {
> updatePerson();
> }
> todaysDateInt = floor(jd/dpy)
> NAlive[todaysDateInt] = nWomen - nDead
> # Do various other things
> todaysDate = todaysDate + timeStep
> jd = jd + timeStep
> }
> }
>
> nWomen <- 5
> numberOfYears <- 30
> women <- data.frame( isAlive = rep_len( TRUE, nWomen )
> , isPregnant = rep_len( FALSE, nWomen )
> , nChildren = rep_len( 0L, nWomen )
> , ageInt = rep_len( 0L, nWomen )
> , age = rep_len( 0, nWomen )
> , dateOfPregnancy = rep_len( 0, nWomen )
> , endDateLastPregnancy = rep_len( 0.0, nWomen )
> , minBirthAge = rep_len( 0, nWomen )
> , maxBirthAge = rep_len( 0, nWomen )
> )
>
> # . . .
>
> calculatePopulation()
>
> *=======*=======*=======*=======*=======*=======*
>
> The above code (in its complete form) executes without errors. I don't
> understand at least two things:
>
> In the updatePerson function, in the ifelse statement, how do I change the
> appropriate values in the women dataframe?
>
> I don't understand most of your last paragraph at all.
>
> Thanks so much for your help in learning R!
>
> Alan
>
> 
>

Jeff Newmiller The ..... ..... Go Live...
DCN:< [hidden email]> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k


