Re: The case for freezing CRAN

Re: The case for freezing CRAN

Therneau, Terry M., Ph.D.
There is a central assertion to this argument that I don't follow:

> At the end of the day most published results obtained with R just won't be reproducible.

This is a very strong assertion. What is the evidence for it?

  I write a lot of Sweave/knitr in house as a way of documenting complex analyses, and a
glm() based logistic regression looks the same yesterday as it will tomorrow.

Terry Therneau

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: The case for freezing CRAN

Frank Harrell
Well put Terry.

I'd like to make a different point.  Freezing R packages may make certain analyses more reproducible in the short term, but essentially one is just deferring the problem until later.  A statistician who doesn't update packages for a long time will be quite surprised when multiple packages "break" for them after a major new release comes out.  I prefer incremental updating, keeping all the packages I use up to date with the latest CRAN submissions.

Frank

Therneau, Terry M., Ph.D. wrote
There is a central assertion to this argument that I don't follow:

> At the end of the day most published results obtained with R just won't be reproducible.

This is a very strong assertion. What is the evidence for it?

  I write a lot of Sweave/knitr in house as a way of documenting complex analyses, and a
glm() based logistic regression looks the same yesterday as it will tomorrow.

Terry Therneau

Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: The case for freezing CRAN

Michael Weylandt
In reply to this post by Therneau, Terry M., Ph.D.
On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D." <[hidden email]> wrote:

> There is a central assertion to this argument that I don't follow:
>
>> At the end of the day most published results obtained with R just won't be reproducible.
>
> This is a very strong assertion. What is the evidence for it?

If I've understood Jeroen correctly, his point might be alternatively phrased as "won't be reproducED" (i.e., end user difficulties, not software availability).

Michael


Re: The case for freezing CRAN

Therneau, Terry M., Ph.D.


On 03/20/2014 07:48 AM, Michael Weylandt wrote:

> On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D." <[hidden email]> wrote:
>
>> There is a central assertion to this argument that I don't follow:
>>
>>> At the end of the day most published results obtained with R just won't be reproducible.
>>
>> This is a very strong assertion. What is the evidence for it?
>
> If I've understood Jeroen correctly, his point might be alternatively phrased as "won't be reproducED" (i.e., end user difficulties, not software availability).
>
> Michael
>

That was my point as well.  Of the 30+ Sweave documents that I've produced I can't think
of one that will change its output with a new version of R.  My 0/30 estimate is at odds
with the "nearly all" assertion.  Perhaps I only do dull things?

Terry T.


Re: The case for freezing CRAN

Kevin Coombes

On 3/20/2014 9:00 AM, Therneau, Terry M., Ph.D. wrote:

>
>
> On 03/20/2014 07:48 AM, Michael Weylandt wrote:
>> On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D."
>> <[hidden email]> wrote:
>>
>>> There is a central assertion to this argument that I don't follow:
>>>
>>>> At the end of the day most published results obtained with R just
>>>> won't be reproducible.
>>>
>>> This is a very strong assertion. What is the evidence for it?
>>
>> If I've understood Jeroen correctly, his point might be alternatively
>> phrased as "won't be reproducED" (i.e., end user difficulties, not
>> software availability).
>>
>> Michael
>>
>
> That was my point as well.  Of the 30+ Sweave documents that I've
> produced I can't think of one that will change its output with a new
> version of R.  My 0/30 estimate is at odds with the "nearly all"
> assertion.  Perhaps I only do dull things?
>
> Terry T.
>

The only concrete example that comes to mind from my own Sweave reports
was actually caused by BioConductor and not CRAN. I had a set of
analyses that used DNAcopy, and the results changed substantially with a
new release of the package in which the default values of the main
function call were changed.  As a result, I've taken to writing out more of
the defaults that I previously just accepted.  There have been a few
minor issues similar to this one (with changes to parts of the Mclust
package ??). So my estimates are somewhat higher than 0/30 but are still
a long way from "almost all".
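(A sketch of that defensive style, using glm() rather than DNAcopy, since the actual defaults that changed are not given here; the data frame is simulated purely for illustration:)

```r
# Write out the arguments you rely on, even where they currently match the
# package defaults, so a later change in defaults cannot silently alter
# the output of an old Sweave report.

set.seed(1)  # simulated data, purely for illustration
d <- data.frame(age = rnorm(100, 50, 10), dose = runif(100))
d$case <- rbinom(100, 1, plogis(-2 + 0.03 * d$age))

# Fragile: silently inherits whatever the current defaults happen to be.
fit1 <- glm(case ~ age + dose, data = d, family = binomial)

# More robust: the choices are recorded in the script itself
# (epsilon and maxit below are the current glm.control() defaults).
fit2 <- glm(case ~ age + dose, data = d,
            family  = binomial(link = "logit"),
            control = glm.control(epsilon = 1e-8, maxit = 25))
```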

Kevin


Re: The case for freezing CRAN

Dirk Eddelbuettel
In reply to this post by Therneau, Terry M., Ph.D.

No attempt to summarize the thread, but a few highlighted points:

 o Karl's suggestion of versioned / dated access to the repo by adding a
   layer to webaccess is (as usual) nice.  It works on the 'supply' side. But
   Jeroen's problem is on the demand side.  Even when we know that an
   analysis was done on 20xx-yy-zz, and we reconstruct CRAN that day, it only
   gives us a 'ceiling' estimate of what was on the machine.  In production
   or lab environments, installations get stale.  Maybe packages were already
   a year old?  To me, this is an issue that needs to be addressed on the
   user's 'demand' side.  But just writing out version numbers is not
   good enough.

 o Roger correctly notes that R scripts and packages are just one issue.
   Compilers, libraries and the OS matter.  To me, the natural approach these
   days would be to think of something based on Docker or Vagrant or (if you
   must, VirtualBox).  The newer alternatives make snapshotting very cheap
   (eg by using Linux LXC).  That approach reproduces a full environment as
   best as we can while still ignoring the hardware layer (and some readers
   may recall the infamous Pentium bug of two decades ago).

 o Reproducibility will probably remain the responsibility of study
   authors. If an investigator on a mega-grant wants to (or needs to) freeze,
   they do have the tools now.  Letting the needs of a few push work onto
   those already overloaded (ie CRAN) and change the workflow of everybody
   is a non-starter.

 o As Terry noted, Jeroen made some strong claims about exactly how flawed
   the existing system is and keeps coming back to the example of 'a JSS
   paper that cannot be re-run'.  I would really like to see empirics on
   this.  Studies of reproducibility appear to be publishable these days, so
   maybe some enterprising grad student wants to run with the idea of
   actually _testing_ this.  We may be above Terry's 0/30 and nearer to
   Kevin's 'low'/30.  But let's bring some data to the debate.

 o Overall, I would tend to think that our CRAN standards of releasing with
   tests, examples, and checks on every build and release already do a much
   better job of keeping things tidy and workable than most if not all
   other related / similar open source projects.  I would of course welcome
   contradictory examples.

Dirk
 
--
Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com


Re: The case for freezing CRAN

glsnow
On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel <[hidden email]> wrote:
[snip]

>      (and some readers
>    may recall the infamous Pentium bug of two decades ago).

It was a "Flaw" not a "Bug".  At least I remember the Intel people
making a big deal about that distinction.

But I do remember the time well, I was a biostatistics Ph.D. student
at the time and bought one of the flawed pentiums.  My attempts at
getting the chip replaced resulted in a major run around and each
person that I talked to would first try to explain that I really did
not need the fix because the only people likely to be affected were
large corporations and research scientists.  I will admit that I was
not a large corporation, but if a Ph.D. student in biostatistics is
not a research scientist, then I did not know what they defined one
as.  When I pointed this out they would usually then say that it still
would not matter, unless I did a few thousand floating point
operations I was unlikely to encounter one of the problematic
divisions.  I would then point out that some days I did over 10,000
floating point operations before breakfast (I had checked after the
1st person told me this and 10,000 was a low estimate of a lower bound
of one set of simulations) at which point they would admit that I had
a case and then send me to talk to someone else who would start the
process over.



[snip]
> --
> Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com
>



--
Gregory (Greg) L. Snow Ph.D.
[hidden email]


Re: The case for freezing CRAN

Karl Millar
In reply to this post by Dirk Eddelbuettel
Given versioned / dated snapshots of CRAN, and an agreement that
reproducibility is the responsibility of the study author, the author
simply needs to sync all their packages to a chosen date, run the analysis,
and publish the chosen date.  It is true that this doesn't cover
compilers, the OS, system packages etc, but in my experience those are
significantly more stable than CRAN packages.


Also, my previous description of how to serve up a dated CRAN was way too
complicated.  Since most of the files on CRAN never change, they don't need
version control.  Only the metadata about which versions are current really
needs to be tracked, and that's small enough that it could be stored in
static files.
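(A sketch of that author-side workflow, assuming such a dated snapshot repository existed; the URL below is entirely hypothetical, and the package names and filenames are placeholders:)

```r
# Hypothetical: a CRAN repository frozen at the date of the analysis.
snapshot <- "https://cran-snapshot.example.org/2014-03-20"  # made-up URL
options(repos = c(CRAN = snapshot))

# Sync the packages the analysis needs to that date, then run it.
install.packages(c("survival", "ape"))
source("analysis.R")

# Publish the chosen date, plus the versions actually loaded.
writeLines(capture.output(sessionInfo()), "sessionInfo.txt")
```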




On Thu, Mar 20, 2014 at 6:32 AM, Dirk Eddelbuettel <[hidden email]> wrote:

>
> No attempt to summarize the thread, but a few highlighted points:
>
>  o Karl's suggestion of versioned / dated access to the repo by adding a
>    layer to webaccess is (as usual) nice.  It works on the 'supply' side.
> But
>    Jeroen's problem is on the demand side.  Even when we know that an
>    analysis was done on 20xx-yy-zz, and we reconstruct CRAN that day, it
> only
>    gives us a 'ceiling' estimate of what was on the machine.  In production
>    or lab environments, installations get stale.  Maybe packages were
> already
>    a year old?  To me, this is an issue that needs to be addressed on the
>    'demand' side of the user. But just writing out version numbers is not
>    good enough.
>
>  o Roger correctly notes that R scripts and packages are just one issue.
>    Compilers, libraries and the OS matter.  To me, the natural approach
> these
>    days would be to think of something based on Docker or Vagrant or (if
> you
>    must, VirtualBox).  The newer alternatives make snapshotting very cheap
>    (eg by using Linux LXC).  That approach reproduces a full environment as
>    best as we can while still ignoring the hardware layer (and some readers
>    may recall the infamous Pentium bug of two decades ago).
>
>  o Reproducibility will probably remain the responsibility of study
>    authors. If an investigator on a mega-grant wants to (or needs to)
> freeze,
>    they do have the tools now.  Requiring the need of a few to push work on
>    those already overloaded (ie CRAN) and changing the workflow of
> everybody
>    is a non-starter.
>
>  o As Terry noted, Jeroen made some strong claims about exactly how flawed
>    the existing system is and keeps coming back to the example of 'a JSS
>    paper that cannot be re-run'.  I would really like to see empirics on
>    this.  Studies of reproducibility appear to be publishable these days,
> so
>    maybe some enterprising grad student wants to run with the idea of
>    actually _testing_ this.  We may be above Terry's 0/30 and nearer to
>    Kevin's 'low'/30.  But let's bring some data to the debate.
>
>  o Overall, I would tend to think that our CRAN standards of releasing with
>    tests, examples, and checks on every build and release already do a much
>    better job of keeping things tidy and workable than in most if not all
>    other related / similar open source projects. I would of course welcome
>    contradictory examples.
>
> Dirk
>
> --
> Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com
>


Re: The case for freezing CRAN

Carl Boettiger
In reply to this post by glsnow
There seems to be some question of how frequently changes to software
packages result in irreproducible results.

I am sure Terry is correct that research using functions like `glm` and
other functions that are shipped with base R are quite reliable; and after
all they already benefit from being versioned with R releases as Jeroen
argues.

In my field of ecology and evolution, the situation is quite different.
 Packages are frequently developed by scientists without any background in
programming and become widely used, such as [geiger](
http://cran.r-project.org/web/packages/geiger/), with 463 papers citing it
and probably many more using it that do not cite it (both because it is
sometimes used only as a dependency of another package or just because our
community isn't great at citing packages).  The package has changed
substantially over the time it has been on CRAN and many functions that
would once run based on older versions could no longer run on newer ones.
Its dependencies, notably the phylogenetics package ape, have changed
continually over that interval, with both bug fixes and substantial changes
to the basic data structure.  The ape package has 1,276 citations (again a
lower bound).  I suspect that correctly identifying the right version of
the software used in any of these thousands of papers would prove difficult
and for a large fraction the results would simply not execute successfully.
It would be much harder to track down cases where the bug fixes would have
any impact on the result.  I have certainly seen both problems in the
hundreds of Sweave/knitr files I have produced over the years that use
these packages.

Even work that simply relies on an archived package becomes a substantial
reproducibility challenge for other scientists, even when an expert
familiar with the package (e.g. the original author) would have no
problem.  The informatics team at the Evolutionary Synthesis Center
recently reached this conclusion in an exercise reproducing several papers,
including one of my own that used an archived package (odesolve, whose
replacement, deSolve, does not use quite the same function call for the
same `lsoda` function).

New methods are being published all the time, and I think it is excellent
that in ecology and evolution it is increasingly standard to publish R
packages implementing those methods, as a scan of any table of contents in
"methods in Ecology and Evolution", for instance, will quickly show.  But
unlike `glm`, these methods have a long way to go before they are fully
tested and debugged, and reproducing any work based on them requires a
close eye to the versions (particularly when unit tests and even detailed
changelogs are not common). The methods are invariably built by
"user-developers", researchers developing the code for their own needs, and
thus these packages can themselves fall afoul of changes as they depend and
build upon work of other nascent ecology and evolution packages.

Detailed reproducibility studies of published work in this area are still
hard to come by, not least because the actual code used by researchers is
seldom published (other than when it is published as its own R
package).  But incompatibilities between successive versions of the 100s of
packages in our domain, along with the interdependencies of those packages,
might provide some window into the difficulties of computational
reproducibility.  I suspect changes in these fast-moving packages are far
more often the culprit than differences in compilers and operating systems.

Cheers,

Carl








On Thu, Mar 20, 2014 at 10:23 AM, Greg Snow <[hidden email]> wrote:

> On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel <[hidden email]> wrote:
> [snip]
>
> >      (and some readers
> >    may recall the infamous Pentium bug of two decades ago).
>
> It was a "Flaw" not a "Bug".  At least I remember the Intel people
> making a big deal about that distinction.
>
> But I do remember the time well, I was a biostatistics Ph.D. student
> at the time and bought one of the flawed pentiums.  My attempts at
> getting the chip replaced resulted in a major run around and each
> person that I talked to would first try to explain that I really did
> not need the fix because the only people likely to be affected were
> large corporations and research scientists.  I will admit that I was
> not a large corporation, but if a Ph.D. student in biostatistics is
> not a research scientist, then I did not know what they defined one
> as.  When I pointed this out they would usually then say that it still
> would not matter, unless I did a few thousand floating point
> operations I was unlikely to encounter one of the problematic
> divisions.  I would then point out that some days I did over 10,000
> floating point operations before breakfast (I had checked after the
> 1st person told me this and 10,000 was a low estimate of a lower bound
> of one set of simulations) at which point they would admit that I had
> a case and then send me to talk to someone else who would start the
> process over.
>
>
>
> [snip]
> > --
> > Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com
> >
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> [hidden email]
>



--
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


Re: The case for freezing CRAN

Marc Schwartz-3
In reply to this post by glsnow

On Mar 20, 2014, at 12:23 PM, Greg Snow <[hidden email]> wrote:

> On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel <[hidden email]> wrote:
> [snip]
>
>>     (and some readers
>>   may recall the infamous Pentium bug of two decades ago).
>
> It was a "Flaw" not a "Bug".  At least I remember the Intel people
> making a big deal about that distinction.
>
> But I do remember the time well, I was a biostatistics Ph.D. student
> at the time and bought one of the flawed pentiums.  My attempts at
> getting the chip replaced resulted in a major run around and each
> person that I talked to would first try to explain that I really did
> not need the fix because the only people likely to be affected were
> large corporations and research scientists.  I will admit that I was
> not a large corporation, but if a Ph.D. student in biostatistics is
> not a research scientist, then I did not know what they defined one
> as.  When I pointed this out they would usually then say that it still
> would not matter, unless I did a few thousand floating point
> operations I was unlikely to encounter one of the problematic
> divisions.  I would then point out that some days I did over 10,000
> floating point operations before breakfast (I had checked after the
> 1st person told me this and 10,000 was a low estimate of a lower bound
> of one set of simulations) at which point they would admit that I had
> a case and then send me to talk to someone else who would start the
> process over.


Further segue:

That (1994) was a watershed moment for Intel as a company, a time during which Intel's future was quite literally at stake. Intel's internal response to that debacle, which fundamentally altered their own perception of just who their customer was (the OEMs like IBM, COMPAQ and Dell versus end users like us), took time to be realized, as the impact of increasingly negative PR took hold. It was also a good example of the impact of public perception (a flawed product) versus the reality of how infrequently the flaw would be observed in "typical" computing. "Perception is reality", as some would observe.

Intel ultimately spent somewhere in the neighborhood of $500 million (in 1994 U.S. dollars), as I recall, to implement a large scale Pentium chip replacement infrastructure targeted to end users. The "Intel Inside" marketing campaign was also an outgrowth of that time period.

Regards,

Marc Schwartz


> [snip]
>> --
>> Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com
>>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> [hidden email]
>


Re: The case for freezing CRAN

Marc Schwartz-3

On Mar 20, 2014, at 1:02 PM, Marc Schwartz <[hidden email]> wrote:

>
> On Mar 20, 2014, at 12:23 PM, Greg Snow <[hidden email]> wrote:
>
>> On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel <[hidden email]> wrote:
>> [snip]
>>
>>>    (and some readers
>>>  may recall the infamous Pentium bug of two decades ago).
>>
>> It was a "Flaw" not a "Bug".  At least I remember the Intel people
>> making a big deal about that distinction.
>>
>> But I do remember the time well, I was a biostatistics Ph.D. student
>> at the time and bought one of the flawed pentiums.  My attempts at
>> getting the chip replaced resulted in a major run around and each
>> person that I talked to would first try to explain that I really did
>> not need the fix because the only people likely to be affected were
>> large corporations and research scientists.  I will admit that I was
>> not a large corporation, but if a Ph.D. student in biostatistics is
>> not a research scientist, then I did not know what they defined one
>> as.  When I pointed this out they would usually then say that it still
>> would not matter, unless I did a few thousand floating point
>> operations I was unlikely to encounter one of the problematic
>> divisions.  I would then point out that some days I did over 10,000
>> floating point operations before breakfast (I had checked after the
>> 1st person told me this and 10,000 was a low estimate of a lower bound
>> of one set of simulations) at which point they would admit that I had
>> a case and then send me to talk to someone else who would start the
>> process over.
>
>
> Further segue:
>
> That (1994) was a watershed moment for Intel as a company. A time during which Intel's future was quite literally at stake. Intel's internal response to that debacle, which fundamentally altered their own perception of just who their customer was (the OEM's like IBM, COMPAQ and Dell versus the end users like us), took time to be realized, as the impact of increasingly negative PR took hold. It was also a good example of the impact of public perception (a flawed product) versus the realities of how infrequently the flaw would be observed in "typical" computing. "Perception is reality", as some would observe.
>
> Intel ultimately spent somewhere in the neighborhood of $500 million (in 1994 U.S. dollars), as I recall, to implement a large scale Pentium chip replacement infrastructure targeted to end users. The "Intel Inside" marketing campaign was also an outgrowth of that time period.
>


Quick correction, thanks to Peter, on my assertion that the "Intel Inside" campaign arose from the 1994 Pentium issue. It actually started in 1991.

I had a faulty recollection from my long ago reading of Andy Grove's 1996 book, "Only The Paranoid Survive", that the slogan arose from Intel's reaction to the Pentium fiasco. It actually pre-dated that time frame by a few years.

Thanks Peter!

Regards,

Marc


Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Rainer Krug-3
In reply to this post by Dirk Eddelbuettel
Dirk Eddelbuettel <[hidden email]> writes:

>  o Roger correctly notes that R scripts and packages are just one issue.
>    Compilers, libraries and the OS matter.  To me, the natural approach these
>    days would be to think of something based on Docker or Vagrant or (if you
>    must, VirtualBox).  The newer alternatives make snapshotting very cheap
>    (eg by using Linux LXC).  That approach reproduces a full environment as
>    best as we can while still ignoring the hardware layer (and some readers
>    may recall the infamous Pentium bug of two decades ago).

These two tools look very interesting - but I have, even after reading a
few discussions of their differences, no idea which one is better suited
to be used for what has been discussed here: Making it possible to run
the analysis later to reproduce results using the same versions used in
the initial analysis.

Am I right in saying:

- Vagrant uses VMs to emulate the hardware
- Docker does not

wherefore
- Vagrant is slower and requires more space
- Docker is faster and requires less space

Therefore, could one say that Vagrant is more "robust" in the long run?

How do they compare in relation to different platforms? Vagrant seems to
be platform agnostic, I can develop and run on Linux, Mac and Windows -
how does it work with Docker?

I just followed [1] and set up Docker on OS X - looks promising - it also
uses an underlying VM. So both should be equal in regard to
reproducibility in the long run?
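(For concreteness, pinning a full environment with Docker might look like the fragment below; the image tag, package name and filenames are illustrative assumptions, not tested values:)

```dockerfile
# Pin the OS image so compilers, system libraries and the R version are
# fixed -- not just the CRAN packages (tag and package are illustrative).
FROM ubuntu:12.04
RUN apt-get update && apt-get install -y r-base-core
COPY analysis.R /analysis/
CMD ["Rscript", "/analysis/analysis.R"]
```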

Please note: I see these questions in the light of this discussion of
reproducibility, not in regard to deployment of applications, which is
what the discussions on the web are about.

Any comments, thoughts, remarks?

Rainer


Footnotes:
[1]  http://docs.docker.io/en/latest/installation/mac/

--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982


Re: The case for freezing CRAN

Therneau, Terry M., Ph.D.
In reply to this post by Therneau, Terry M., Ph.D.
This has been a fascinating discussion.

Carl Boettiger replied with a set of examples where the world is much more fragile than
my examples.  That was useful.  It seems that people in my area (medical research and
survival analysis) are more careful with their packages (whew!).

Gabor Csardi discussed the problems with maintaining a package with lots of dependencies.
I maintain the survival package which currently has 246 reverse dependencies and take a
slightly different view, which could be described as "the price of fame".  I feel a
responsibility to not break R.  I have automated scripts which download the latest copy of
all 246, using the install-tests option, and run them all. Most updates have 1-3 issues.  
About 25% of the time it turns out to be a problem that I introduced, and in all the
others I have found the other package authors to be responsive.  It is a nuisance, yes,
but also worth it.  I've built the test scripts over several years, with help from several
others; a place to share this information would be a useful addition.
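(The details of those scripts are not shown here, but the core loop might be sketched with base R's tools package; everything below is illustrative, not the actual code:)

```r
library(tools)

# Reverse dependencies of survival, from the CRAN metadata.
db   <- available.packages()
revs <- package_dependencies("survival", db = db, reverse = TRUE,
                             which = c("Depends", "Imports", "Suggests"))[["survival"]]

# Install each package keeping its tests, then run them; any failure is
# inspected by hand to decide whether the bug is upstream or downstream.
for (pkg in revs) {
  install.packages(pkg, INSTALL_opts = "--install-tests")
  status <- testInstalledPackage(pkg, types = "tests")
  if (status != 0L) message("Needs a look: ", pkg)
}
```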

This process also keeps me honest about any updates that are not backwards compatible.
There is hardly a single option that is not used by some other package, somewhere.

Terry Therneau


Re: Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Philippe Grosjean-3
In reply to this post by Rainer Krug-3

..............................................<°}))><........
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons University, Belgium
( ( ( ( (
..............................................................

On 21 Mar 2014, at 10:59, Rainer M Krug <[hidden email]> wrote:

> Dirk Eddelbuettel <[hidden email]> writes:
>
>> o Roger correctly notes that R scripts and packages are just one issue.
>>   Compilers, libraries and the OS matter.  To me, the natural approach these
>>   days would be to think of something based on Docker or Vagrant or (if you
>>   must, VirtualBox).  The newer alternatives make snapshotting very cheap
>>   (eg by using Linux LXC).  That approach reproduces a full environment as
>>   best as we can while still ignoring the hardware layer (and some readers
>>   may recall the infamous Pentium bug of two decades ago).
>
> These two tools look very interesting - but I have, even after reading a
> few discussions of their differences, no idea which one is better suited
> to be used for what has been discussed here: Making it possible to run
> the analysis later to reproduce results using the same versions used in
> the initial analysis.
>
> Am I right in saying:
>
> - Vagrant uses VMs to emulate the hardware
> - Docker does not
>
Yes.


> wherefore
> - Vagrant is slower and requires more space
> - Docker is faster and requires less space
>
It depends. For instance, if you run R in VirtualBox under Windows, it may run faster depending on the code you run and, say, the LAPACK library used. On Linux, R code typically runs 2-3% slower in the VM than natively, but on a Windows host, most of my R code runs faster in the VM… But yes, you need more RAM.

With Vagrant, you do not need to keep your VM once you no longer use it. Disk space then shrinks to a few kB, corresponding to the Vagrant configuration file. I guess the same is true for Docker?

A big advantage of Vagrant + VirtualBox is that you get very similar virtual hardware, no matter whether your host system is Linux, Windows or Mac OS X. I see this as a good point for better reproducibility.


> Therefore, could one say that Vagrant is more "robust" in the long run?
>
Maybe… but it depends almost entirely on how VirtualBox will support old VMs in the future!

PhG

> How do they compare in relation to different platforms? Vagrant seems to
> be platform agnostic, I can develop and run on Linux, Mac and Windows -
> how does it work with Docker?
>
> I just followed [1] and set up Docker on OSX - looks promising - it also
> uses an underlying VM. So both should be equal in regards to
> reproducibility in the long run?
>
> Please note: I see these questions in the light of this discussion of
> reproducibility, and not in regards to the application-deployment
> scenarios that the discussions on the web focus on.
>
> Any comments, thoughts, remarks?
>
> Rainer
>
>
> Footnotes:
> [1]  http://docs.docker.io/en/latest/installation/mac/
>
> --
> Rainer M. Krug
> email: Rainer<at>krugs<dot>de
> PGP: 0x0F52F982
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
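[Editorial note: the Vagrant approach discussed in the message above can be sketched as a minimal Vagrantfile. The box name, memory setting, and provisioning commands below are illustrative assumptions, not a tested recipe from the thread.]

```ruby
# Sketch of a Vagrantfile pinning an environment for a reproducible analysis.
# The box name and provisioning details are hypothetical examples.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"      # hypothetical pinned base box
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 2048                     # enough RAM for the analysis
  end
  # Install R at provision time; for real reproducibility one would pin
  # exact package versions here as well
  config.vm.provision "shell", inline: <<-SHELL
    apt-get update -qq
    apt-get install -y r-base
  SHELL
end
```

Once such a file is committed next to the analysis scripts, `vagrant up` rebuilds the same environment later, subject to the box still being available.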

Re: The case for freezing CRAN

Gábor Csárdi
In reply to this post by Therneau, Terry M., Ph.D.
On Fri, Mar 21, 2014 at 8:43 AM, Therneau, Terry M., Ph.D. <
[hidden email]> wrote:
[...]

>
> Gabor Csardi discussed the problems with maintaining a package with lots
> of dependencies.
> I maintain the survival package which currently has 246 reverse
> dependencies and take a slightly different view, which could be described
> as "the price of fame".  I feel a responsibility to not break R.  I have
> automated scripts which download the latest copy of all 246, using the
> install-tests option, and run them all. Most updates have 1-3 issues.
>  About 25% of the time it turns out to be a problem that I introduced, and
> in all the others I have found the other package authors to be responsive.
>  It is a nuisance, yes, but also worth it.  I've built the test scripts
> over several years, with help from several others; a place to share this
> information would be a useful addition.
>

Well, maybe you are just a better programmer and maintainer than me, and I
am alone with my problems. I hope that this is the case.

I actually do run automated tests against the reverse dependencies. It
downloads ~3GB of packages, the output is 500KB (much of it is the
compilation of my package, though), and it contains the word 'error' ~80 times
and the word 'warning' ~270 times:
http://pave.igraph.org/job/igraph-r-check-deps/15/consoleFull

This process also keeps me honest about any updates that are not backwards
> compatible.


Not really, this would only be true if all 246 packages had proper tests
for all of their survival uses. Unlikely. It definitely helps, I am not
saying that it does not, but I also think that it is up to the maintainer
of the package to test it, including testing it against newer versions of
its dependencies. Simply because the maintainers know best how their
packages are supposed to work, and how it is supposed to be tested.

The other thing is that quite often I do want to break the API, and this
would be much easier with having a CRAN-devel, so that there is some time
for the problems to come up.

Gabor

There is hardly a single option that is not used by some other package,
> somewhere.


[...]

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: The case for freezing CRAN

Dirk Eddelbuettel
In reply to this post by Therneau, Terry M., Ph.D.

On 21 March 2014 at 07:43, Therneau, Terry M., Ph.D. wrote:
| This has been a fascinating discussion.

I am not so sure. Seems more like rehashing of old and known arguments, while
some folks try to push their work (Hi Jeroen :) onto already overloaded
others.  The only real thing I learned so far is that Philippe is busy
earning publication credits along the lines of the 'damn, just go and test it'
suggestion I made (somewhat flippantly) in my last email.

| I maintain the survival package which currently has 246 reverse dependencies and take a
| slightly different view, which could be described as "the price of fame".  I feel a
| responsibility to not break R.  I have automated scripts which download the latest copy of
| all 246, using the install-tests option, and run them all. Most updates have 1-3 issues.  

Same here, but as a somewhat younger package Rcpp is so far "only" at 189 and
counting, with pretty decent growth.  My experience has been positive too,
and CRAN appears appreciative for us doing preemptive work and trying to be
careful about not introducing breaking changes.  I too see the latter part as
something we owe the users of our package: a "promise" not to mess with the
interface unless we absolutely must.  

| but also worth it.  I've built the test scripts over several years, with help from several
| others; a place to share this information would be a useful addition.

I put my script on GitHub next to Rcpp itself; turns out that another thread
participant had a need for exactly that script just yesterday.

Dirk

--
Dirk Eddelbuettel | [hidden email] | http://dirk.eddelbuettel.com

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Gábor Csárdi
In reply to this post by Philippe Grosjean-3
You might want to look at packer as well, which can build virtual machines
from an ISO, without any user interaction. I successfully used it to build
VMs with Linux, OSX and Windows. It can also create vagrant boxes. You can
specify provisioners, e.g. to install R, or a set of R packages, etc. It is
under heavy development, by the same team as vagrant.

Gabor

On Fri, Mar 21, 2014 at 9:03 AM, Philippe GROSJEAN <
[hidden email]> wrote:

> [...]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Rainer Krug-3
Gábor Csárdi <[hidden email]> writes:

> You might want to look at packer as well, which can build virtual machines
> from an ISO, without any user intaraction. I successfully used it to build
> VMs with Linux, OSX and Windows. It can also create vagrant boxes. You can
> specify provisioners, e.g. to install R, or a set of R packages, etc. It is
> under heavy development, by the same team as vagrant.

I think I am getting lost in these - I looked at Docker, and it looks
promising, but I actually didn't even manage to sh into the running
container. Is there somewhere a howto on how one can use these with R, for
the purpose discussed in this thread? If not, I really think this would
be needed. It is extremely difficult for me to translate what I want to
do into the deployment / management / development scenarios discussed in
the blogs I have found.

Cheers,

(a confused)
Rainer


>
> Gabor
>
> On Fri, Mar 21, 2014 at 9:03 AM, Philippe GROSJEAN <
> [hidden email]> wrote:
>
>> [...]
>
--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Gábor Csárdi
On Fri, Mar 21, 2014 at 11:12 AM, Rainer M Krug <[hidden email]> wrote:

> Gábor Csárdi <[hidden email]> writes:
>
> > You might want to look at packer as well, which can build virtual
> machines
> > from an ISO, without any user interaction. I successfully used it to
> build
> > VMs with Linux, OSX and Windows. It can also create vagrant boxes. You
> can
> > specify provisioners, e.g. to install R, or a set of R packages, etc. It
> is
> > under heavy development, by the same team as vagrant.
>
> I think I am getting lost in these - I looked at Docker, and it looks
> promising, but I actually didn't even manage to sh into the running
> container. Is there somewhere a howto on how one can use these with R, for
> the purpose discussed in this thread? If not, I really think this would
> be needed. It is extremely difficult for me to translate what I want to
> do into the deployment / management / development scenarios discussed in
> the blogs I have found.
>
I haven't tried Docker, so I cannot say anything about that. The purpose of
vagrant and packer is slightly different, but there seems to be some
overlap.

Packer helps you build a virtual machine from an ISO, automatically,
without any human interaction. That's pretty much it. The result can be a
VirtualBox, VMWare, etc. virtual machine, or even a vagrant box. I used it
to build Ubuntu, OSX and Windows boxes, it works great if you have a
working configuration. If you need to tweak a config to install additional
software, etc. then it requires some experimenting and patience, because
debugging is not that great.

Vagrant manages disposable virtual machines. I.e. it takes a vagrant box,
which is essentially a VM and some extra configuration info, provisions it,
which usually means installing software or setting up a development
environment, and then manages it, so that you can ssh to it, or do whatever
you want with it.

There are a number of boxes available, so if you want a minimal VM with
Ubuntu32, it takes one command to create it from a public box, another to
start it, and a third to ssh into it. It is literally a couple of minutes;
downloading the box takes longest. If you already have the box, it is
even quicker.

You can use packer and vagrant together. Packer creates the vagrant box,
sets up a very minimal environment. Then you can use vagrant with this box.

In my opinion it is somewhat cumbersome to use this for everyday work,
although good virtualization software definitely helps.

Gabor



______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
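[Editorial note: the packer-plus-vagrant workflow Gábor describes can be sketched as a few shell commands. The template and box file names are hypothetical, and this assumes packer and vagrant are installed and on the PATH.]

```shell
# Build a VirtualBox-based vagrant box from an ISO, driven by a packer template
packer build -only=virtualbox-iso template.json

# Register the resulting box with vagrant under a local name
vagrant box add r-analysis output/r-analysis.box

# Create a Vagrantfile for that box, boot the VM, and ssh into it
vagrant init r-analysis
vagrant up
vagrant ssh
```

As Gábor notes, the packer step only needs a working configuration once; after that, recreating the disposable VM is just the last three commands.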

Re: Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Philippe Grosjean-3
On 21 Mar 2014, at 20:21, Gábor Csárdi <[hidden email]> wrote:

> [...]
Additional info: you access R in the VM from the host via ssh. With X11 forwarding enabled there, you also get the GUI stuff. It works like a charm, but there are still some problems on my side when I disconnect and reconnect to the same R process. I can work around this with, say, screen. However, if any X11 window is displayed when I disconnect, R crashes immediately on reconnection.
Best,

PhG





______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
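[Editorial note: the ssh/X11/screen setup Philippe describes can be sketched roughly as follows. It assumes X11 forwarding is enabled in the guest's sshd; the screen session name is arbitrary.]

```shell
# Connect to the VM with X11 forwarding so R graphics windows open on the host
vagrant ssh -- -X

# Inside the VM: run R under screen so the session survives a disconnect
screen -S analysis
R
# Detach with Ctrl-a d; reattach later with: screen -r analysis
# Per Philippe's caveat, an open X11 device at disconnect time may crash R
# when the session is reattached
```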