Upgrading a package to which other packages are LinkingTo

Kirill Müller
Hi


I'd like to suggest making R more informative when a user updates a
package A where there's at least one package B that has "LinkingTo: A"
in its DESCRIPTION file.

To illustrate the problem, assume package A is updated so that its C/C++
header interface (in inst/include) is changed. For package B to pick up
these changes, we need to reinstall package A. In extreme cases, if B
also imports A and uses functions from A's shared library, failure to
reinstall B may lead to all sorts of undefined behavior.

I've stumbled over this recently for A = Rcpp 0.12.8 and B = dplyr 0.5.0
[1], with a bug fix available in Rcpp 0.12.8.2. Simply upgrading Rcpp to
0.12.8.2 wasn't enough to propagate the bug fix to dplyr; we need to
reinstall dplyr 0.5.0 too.

I've prepared an example with R-devel r71799. The initial configuration
[2] is Rcpp 0.12.8 and dplyr 0.5.0. There is no warning from R after
upgrading Rcpp to 0.12.8.2 [3], and no warning when loading the (now
"broken") dplyr 0.5.0 linked against Rcpp 0.12.8 but importing Rcpp
0.12.8.2 [4].

As a remedy, I'd like to suggest that upgrading Rcpp gives a warning
about installed packages that are LinkingTo it [3], and that loading
dplyr gives a warning that it has been built against a different version
of Rcpp [4], just like the warning when packages are built against a
different version of R.
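
For illustration, the first half of this suggestion (finding the installed packages that are LinkingTo an upgraded package) can be sketched in a few lines of R. This is only a sketch, not how R implements anything: linking_to() is a hypothetical helper, and it assumes the character matrix layout returned by utils::installed.packages(), which includes a LinkingTo column by default.

```r
# Hypothetical helper (not part of R): names of installed packages whose
# LinkingTo field mentions `pkg`.
# `installed` is a character matrix shaped like utils::installed.packages().
linking_to <- function(pkg, installed = utils::installed.packages()) {
  field <- installed[, "LinkingTo"]
  hits <- vapply(field, function(x) {
    if (is.na(x)) return(FALSE)
    # Split "Rcpp (>= 0.12.0), BH" into individual entries ...
    deps <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
    # ... and drop any version requirement in parentheses.
    deps <- sub("[[:space:]]*\\(.*$", "", deps)
    pkg %in% deps
  }, logical(1), USE.NAMES = FALSE)
  unname(installed[hits, "Package"])
}
```

Calling linking_to("Rcpp") against a real library would list the packages that are candidates for a warning or a reinstall after Rcpp is upgraded.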

Thanks.


Best regards

Kirill


[1] https://github.com/hadley/dplyr/issues/2308#issuecomment-267495075
[2] https://travis-ci.org/krlmlr/pkg.upgrade.test#L589-L593
[3] https://travis-ci.org/krlmlr/pkg.upgrade.test#L619-L645
[4] https://travis-ci.org/krlmlr/pkg.upgrade.test#L671-L703

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Upgrading a package to which other packages are LinkingTo

Duncan Murdoch-2
I think there's one typo in your post which may confuse some readers;
I've edited it inline below.  My comments on the suggestion are at the
bottom of the message.


On 16/12/2016 5:35 AM, Kirill Müller wrote:

> Hi
>
>
> I'd like to suggest to make R more informative when a user updates a
> package A where there's at least one package B that has "LinkingTo: A"
> in its description.
>
> To illustrate the problem, assume package A is updated so that its C/C++
> header interface (in inst/include) is changed. For package B to pick up
> these changes, we need to reinstall package A.

This should be "reinstall package B", I think.

> In extreme cases, if B
> also imports A and uses functions from A's shared library, failure to
> reinstall B may lead to all sorts of undefined behavior.
>
> I've stumbled over this recently for A = Rcpp 0.12.8 and B = dplyr 0.5.0
> [1], with a bug fix available in Rcpp 0.12.8.2. Simply upgrading Rcpp to
> 0.12.8.2 wasn't enough to propagate the bug fix to dplyr; we need to
> reinstall dplyr 0.5.0 too.
>
> I've prepared an example with R-devel r71799. The initial configuration
> [2] is Rcpp 0.12.8 and dplyr 0.5.0. There is no warning from R after
> upgrading Rcpp to 0.12.8.2 [3], and no warning when loading the (now
> "broken") dplyr 0.5.0 linked against Rcpp 0.12.8 but importing Rcpp
> 0.12.8.2 [4].
>
> As a remedy, I'd like to suggest that upgrading Rcpp gives a warning
> about installed packages that are LinkingTo it [3], and that loading
> dplyr gives a warning that it has been built against a different version
> of Rcpp [4], just like the warning when packages are built against a
> different version of R.

I'd call it a bug that we allow the situation to exist without some sort
of warning or error.

Your suggestion is one remedy, but might lead to too many warnings (or
too many unnecessary recompiles).

An argument could be made that it's a bug in package A that it has
updated its interface in a way that breaks packages that use it.

Perhaps the solution is to recommend that packages which export their
C-level entry points either guarantee them not to change or offer
(require?) version checks by user code.  So dplyr should start out by
saying "I'm using Rcpp interface 0.12.8".  If Rcpp has a new version
with a compatible interface, it replies "that's fine".  If Rcpp has
changed its interface, it says "Sorry, I don't support that any more."
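
The handshake described above could look roughly like the following sketch. Everything here is an assumption made for illustration: interface_ok() is a hypothetical function, not part of Rcpp or R, and the idea that the providing package tracks an "oldest compatible" version is just one possible way to answer the question.

```r
# Hypothetical provider-side check (not an existing Rcpp or R function).
# built:             version of the provider the client was compiled against
# oldest_compatible: earliest provider version with the same interface
# current:           provider version installed right now
interface_ok <- function(built, oldest_compatible, current) {
  built <- as.package_version(built)
  built >= as.package_version(oldest_compatible) &&
    built <= as.package_version(current)
}
```

With placeholder numbers, interface_ok("0.12.8", "0.12.0", "0.12.8.2") answers "that's fine", while interface_ok("0.12.8", "0.12.8.2", "0.12.8.2") answers "not supported any more". These numbers say nothing about which real Rcpp releases are compatible with each other.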

Duncan Murdoch

Re: Upgrading a package to which other packages are LinkingTo

Dirk Eddelbuettel

On 16 December 2016 at 08:20, Duncan Murdoch wrote:
| Perhaps the solution is to recommend that packages which export their
| C-level entry points either guarantee them not to change or offer
| (require?) version checks by user code.  So dplyr should start out by
| saying "I'm using Rcpp interface 0.12.8".  If Rcpp has a new version
| with a compatible interface, it replies "that's fine".  If Rcpp has
| changed its interface, it says "Sorry, I don't support that any more."

We try. But it's hard, and I'd argue, likely impossible.

For example I even added a "frozen" package [1] in the sources / unit tests
to test for just this. In practice you just cannot hit every possible access
point of the (rich, in our case) API so the tests pass too often.

Which is why we relentlessly test against reverse-depends to _at least ensure
buildability_ from our releases.

As for seamless binary upgrade, I don't think it can work in practice.  Ask
Uwe one day why he rebuilds everything every time on Windows. And for what it
is worth, we essentially do the same in Debian.

Sometimes you just need to rebuild.  That may be the price of admission for
using the convenience of rich C++ interfaces.

Dirk

[1] https://github.com/RcppCore/Rcpp/tree/master/inst/unitTests/testRcppPackage


--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

Re: Upgrading a package to which other packages are LinkingTo

Duncan Murdoch-2
On 16/12/2016 8:37 AM, Dirk Eddelbuettel wrote:

>
> As for seamless binary upgrade, I don't think it can work in practice.  Ask
> Uwe one day why he rebuilds everything every time on Windows. And for what it
> is worth, we essentially do the same in Debian.
>
> Sometimes you just need to rebuild.  That may be the price of admission for
> using the convenience of rich C++ interfaces.
>

Okay, so would you say that Kirill's suggestion is not overkill?  Every
time package B uses LinkingTo: A, R should assume it needs to rebuild B
when A is updated?

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Upgrading a package to which other packages are LinkingTo

Dirk Eddelbuettel

On 16 December 2016 at 10:14, Duncan Murdoch wrote:
| On 16/12/2016 8:37 AM, Dirk Eddelbuettel wrote:
| >
| > Which is why we relentlessly test against reverse-depends to _at least ensure
| > buildability_ from our releases.

I meant to also add:  "... against a large corpus of other packages."
The intent is to empirically answer this.

| Okay, so would you say that Kirill's suggestion is not overkill?  Every
| time package B uses LinkingTo: A, R should assume it needs to rebuild B
| when A is updated?

Based on my experience, this is a "halting problem" -- i.e. one cannot know ex ante.

So "every time" would be overkill to me.  Sometimes you know you must
recompile (but try to be very prudent with public-facing API).  Many times
you do not. It is hard to pin down.

At work we have a bunch of servers with Rcpp and many packages against them
(installed system-wide for all users). We _very really_ needs rebuild.  

Dirk

Re: Upgrading a package to which other packages are LinkingTo

Duncan Murdoch-2
On 16/12/2016 10:40 AM, Dirk Eddelbuettel wrote:

> Based on my experience, this is a "halting problem" -- i.e. one cannot know ex ante.
>
> So "every time" would be overkill to me.  Sometimes you know you must
> recompile (but try to be very prudent with public-facing API).  Many times
> you do not. It is hard to pin down.
>
> At work we have a bunch of servers with Rcpp and many packages against them
> (installed system-wide for all users). We _very really_ needs rebuild.

So that comes back to my suggestion:  you should provide a way for a
dependent package to ask if your API has changed.  If you say it hasn't,
the package is fine.  If you say it has, the package should abort,
telling the user they need to reinstall it.  (Because it's a hard
question to answer, you might get it wrong and say it's fine when it's
not.  But that's easy to fix:  just make a new release that does require
a rebuild.)

Duncan Murdoch

Re: Upgrading a package to which other packages are LinkingTo

Dirk Eddelbuettel

On 16 December 2016 at 11:00, Duncan Murdoch wrote:
| On 16/12/2016 10:40 AM, Dirk Eddelbuettel wrote:
| > At work we have a bunch of servers with Rcpp and many packages against them
| > (installed system-wide for all users). We _very really_ needs rebuild.

Edit:  "We _very rarely_ need rebuilds" is what was meant there.
 
| So that comes back to my suggestion:  you should provide a way for a
| dependent package to ask if your API has changed.  If you say it hasn't,
| the package is fine.  If you say it has, the package should abort,
| telling the user they need to reinstall it.  (Because it's a hard
| question to answer, you might get it wrong and say it's fine when it's
| not.  But that's easy to fix:  just make a new release that does require
| a rebuild.)

Sure.

We have always increased the higher-order version number when that is needed.

One problem with your proposal is that the testing code may run after the
package load, and in the case where it matters ... that very code may not get
reached because the package didn't load.

Dirk

Re: Upgrading a package to which other packages are LinkingTo

Kirill Müller
Thanks for discussing this.

On 16.12.2016 17:19, Dirk Eddelbuettel wrote:

> On 16 December 2016 at 11:00, Duncan Murdoch wrote:
> | On 16/12/2016 10:40 AM, Dirk Eddelbuettel wrote:
> | > On 16 December 2016 at 10:14, Duncan Murdoch wrote:
> | > | On 16/12/2016 8:37 AM, Dirk Eddelbuettel wrote:
> | > | >
> | > | > On 16 December 2016 at 08:20, Duncan Murdoch wrote:
> | > | > | Perhaps the solution is to recommend that packages which export their
> | > | > | C-level entry points either guarantee them not to change or offer
> | > | > | (require?) version checks by user code.  So dplyr should start out by
> | > | > | saying "I'm using Rcpp interface 0.12.8".  If Rcpp has a new version
> | > | > | with a compatible interface, it replies "that's fine".  If Rcpp has
> | > | > | changed its interface, it says "Sorry, I don't support that any more."
Sounds good to me, I was considering something similar. dplyr can simply
query Rcpp's current version in .onLoad(), compare it to the version at
installation time and act accordingly.
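
A minimal sketch of such a load-time check, under stated assumptions: check_built_against() is a hypothetical helper, not something dplyr or Rcpp provides, and the commented usage relies on top-level code in a package's R/ directory being evaluated when the package is installed, so that the recorded version is the one present at build time.

```r
# Hypothetical helper a package could call from .onLoad().
# built:   dependency version recorded when the package was installed
# current: dependency version found at load time
check_built_against <- function(pkg, built,
                                current = utils::packageVersion(pkg)) {
  built <- as.package_version(built)
  same <- built == current
  if (!same) {
    warning(sprintf(
      "built against %s %s but %s is installed; consider reinstalling",
      pkg, format(built), format(current)
    ), call. = FALSE)
  }
  invisible(same)
}

# Possible use inside a package (assumption, shown as comments only):
# built_rcpp <- utils::packageVersion("Rcpp")   # evaluated at install time
# .onLoad <- function(libname, pkgname) {
#   check_built_against("Rcpp", built_rcpp)     # evaluated at load time
# }
```

Whether a mismatch should produce a warning or an error is a design choice; the sketch warns, which matches the existing message for packages built under a different version of R.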

> | > | Okay, so would you say that Kirill's suggestion is not overkill?  Every
> | > | time package B uses LinkingTo: A, R should assume it needs to rebuild B
> | > | when A is updated?
> | >
> | > Based on my experience, this is a "halting problem" -- i.e. one cannot know ex ante.
> | >
> | > So "every time" would be overkill to me.  Sometimes you know you must
> | > recompile (but try to be very prudent with public-facing API).  Many times
> | > you do not. It is hard to pin down.
I'd argue that recompiling/reinstalling B is cheap enough and the safest
option. So unless there is a risk, why not simply do it every time A
updates? This could be implemented with a perhaps small change in R:
When installing A, treat all packages that have A in both LinkingTo and
Imports as dependencies that need to be reinstalled.


-Kirill

Re: Upgrading a package to which other packages are LinkingTo

Karl Millar
In reply to this post by Dirk Eddelbuettel
A couple of points:
  - rebuilding dependent packages is needed if there is an ABI change,
not just an API change.  For packages like Rcpp which export inline
functions or macros that might have changed, this is potentially any
change to existing functions, but for packages like Matrix, it isn't
really an issue at all IIUC.

  - If we're looking into a way to check if package APIs are
compatible, then that's something that's relevant for all packages,
since they all export an R API.  I believe that CRAN only tests
package compatibility with the most recent versions of packages on
CRAN that import or depend on it.  There's no guarantee that a package
update won't contain API or behaviour changes that break older
versions of packages, packages not on CRAN or any scripts that use the
package, and these sorts of breakages do happen semi-regularly.

  - AFAICT, the only difference with packages like Rcpp is that you can
potentially have all of your CRAN packages at the latest version, but
some of them might have inlined code from an older version of Rcpp
even after running update.packages().  While that is an issue, in my
experience that's been a lot less trouble than the general case of
backwards compatibility.

Karl

Re: Upgrading a package to which other packages are LinkingTo

Duncan Murdoch-2
On 16/12/2016 12:35 PM, Karl Millar wrote:
> A couple of points:
>   - rebuilding dependent packages is needed if there is an ABI change,
> not just an API change.  For packages like Rcpp which export inline
> functions or macros that might have changed, this is potentially any
> change to existing functions, but for packages like Matrix, it isn't
> really an issue at all IIUC.

This is why someone else needs to do this, not me.  I know the three
words that ABI stands for, but not what they mean in practice.

>
>   - If we're looking into a way to check if package APIs are
> compatible, then that's something that's relevant for all packages,
> since they all export an R API.  I believe that CRAN only tests
> package compatibility with the most recent versions of packages on
> CRAN that import or depend on it.  There's no guarantee that a package
> update won't contain API or behaviour changes that break older
> versions of packages, packages not on CRAN or any scripts that use the
> package, and these sorts of breakages do happen semi-regularly.

That's correct.
>
>  - AFAICT, the only difference with packages like Rcpp is that you can
> potentially have all of your CRAN packages at the latest version, but
> some of them might have inlined code from an older version of Rcpp
> even after running update.packages().  While that is an issue, in my
> experience that's been a lot less trouble than the general case of
> backwards compatibility.

I think that's an important difference.  Package authors can play nicely
with each other and keep their sources completely compatible, and
package users can still end up with broken libraries that aren't fixed
by anything simpler than re-installing everything.

We do have (imperfect) processes in place to help with the general
compatibility problem, but nothing to help with this one.

Duncan Murdoch

>
> Karl
>
> On Fri, Dec 16, 2016 at 8:19 AM, Dirk Eddelbuettel <[hidden email]> wrote:
>>
>> On 16 December 2016 at 11:00, Duncan Murdoch wrote:
>> | On 16/12/2016 10:40 AM, Dirk Eddelbuettel wrote:
>> | > On 16 December 2016 at 10:14, Duncan Murdoch wrote:
>> | > | On 16/12/2016 8:37 AM, Dirk Eddelbuettel wrote:
>> | > | >
>> | > | > On 16 December 2016 at 08:20, Duncan Murdoch wrote:
>> | > | > | Perhaps the solution is to recommend that packages which export their
>> | > | > | C-level entry points either guarantee them not to change or offer
>> | > | > | (require?) version checks by user code.  So dplyr should start out by
>> | > | > | saying "I'm using Rcpp interface 0.12.8".  If Rcpp has a new version
>> | > | > | with a compatible interface, it replies "that's fine".  If Rcpp has
>> | > | > | changed its interface, it says "Sorry, I don't support that any more."
>> | > | >
>> | > | > We try. But it's hard, and I'd argue, likely impossible.
>> | > | >
>> | > | > For example I even added a "frozen" package [1] in the sources / unit tests
>> | > | > to test for just this. In practice you just cannot hit every possible access
>> | > | > point of the (rich, in our case) API so the tests pass too often.
>> | > | >
>> | > | > Which is why we relentlessly test against reverse-depends to _at least ensure
>> | > | > buildability_ from our releases.
>> | >
>> | > I meant to also add:  "... against a large corpus of other packages."
>> | > The intent is to empirically answer this.
>> | >
>> | > | > As for seamless binary upgrade, I don't think in can work in practice.  Ask
>> | > | > Uwe one day we he rebuilds everything every time on Windows. And for what it
>> | > | > is worth, we essentially do the same in Debian.
>> | > | >
>> | > | > Sometimes you just need to rebuild.  That may be the price of admission for
>> | > | > using the convenience of rich C++ interfaces.
>> | > | >
>> | > |
>> | > | Okay, so would you say that Kirill's suggestion is not overkill?  Every
>> | > | time package B uses LinkingTo: A, R should assume it needs to rebuild B
>> | > | when A is updated?
>> | >
>> | > Based on my experience is a "halting problem" -- i.e. cannot know ex ante.
>> | >
>> | > So "every time" would be overkill to me.  Sometimes you know you must
>> | > recompile (but try to be very prudent with public-facing API).  Many times
>> | > you do not. It is hard to pin down.
>> | >
>> | > At work we have a bunch of servers with Rcpp and many packages against them
>> | > (installed system-wide for all users). We _very really_ needs rebuild.
>>
>> Edit:  "We _very rarely_ need rebuilds" is what was meant there.
>>
>> | So that comes back to my suggestion:  you should provide a way for a
>> | dependent package to ask if your API has changed.  If you say it hasn't,
>> | the package is fine.  If you say it has, the package should abort,
>> | telling the user they need to reinstall it.  (Because it's a hard
>> | question to answer, you might get it wrong and say it's fine when it's
>> | not.  But that's easy to fix:  just make a new release that does require
>>
>> Sure.
>>
>> We have always increased the higher-order version number when that is needed.
>>
>> One problem with your proposal is that the testing code may run after the
>> package load, and in the case where it matters ... that very code may not get
>> reached because the package didn't load.
>>
>> Dirk
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]
>>

Re: Upgrading a package to which other packages are LinkingTo

Gábor Csárdi
I think that this problem is actually more general than just ABI
versioning. The common definition of ABI refers to compiled code, but
with R packages similar problems might happen (and they do happen)
without any compiled code.

I think the key issue is the concept of build-time dependencies. While
R packages usually do not distinguish between build-time and
run-time dependencies, they still do exist, and I think ideally we
would need to treat them differently.

AFAIK LinkingTo is the only form of build-time dependency that is
completely explicit, so it is relatively easy to handle. The other
frequent form of build-time dependency is a function call to the other
package that happens at install time. E.g. with reference classes or R6*
classes you frequently include code like this in yourpackage:

myclass <- R6::R6Class(...)

and this code is evaluated at install time. So if the R6 package is
updated, the installed version of myclass in yourpackage is not affected
at all. In fact, if the new version of R6 is not compatible with the
myclass object created by the old version, then yourpackage will be
broken. (This AFAIK cannot happen with R6, so it is not the best
example, but it can happen in other similar cases.)

The key here is that R6 is a build-time dependency of yourpackage,
similarly to packages linking to (i.e. LinkingTo) Rcpp.
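
A package could detect this situation itself. The following is only a
sketch, assuming the package records the dependency's version in a
top-level constant (evaluated at install time) and compares it on load.
`check_built_against()` and `built_against_utils` are hypothetical names,
not part of R or R6, and `utils` merely stands in for the dependency so
that the snippet runs without extra packages.

```r
# Hypothetical helper: compare the version of `dep` seen at install time
# with the version that is installed now.
check_built_against <- function(dep, built_against,
                                current = utils::packageVersion(dep)) {
  built_against <- as.package_version(built_against)
  current <- as.package_version(current)
  if (current != built_against) {
    warning(sprintf("built against %s %s, but %s is installed now; consider reinstalling",
                    dep, as.character(built_against), as.character(current)),
            call. = FALSE)
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# In a package, this top-level assignment is evaluated once, at install time.
built_against_utils <- utils::packageVersion("utils")

# This would typically be called from .onLoad(); called directly here.
check_built_against("utils", built_against_utils)
```

Whether a version mismatch really requires a reinstall cannot be decided
by such a check alone; it only makes the mismatch visible.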

Another possible type of build-time dependency is if you put objects
from another package in yourpackage. E.g.

myfun <- otherpkg::fun

Then a copy of otherpkg::fun will be saved in yourpackage. If you
install a new version of otherpkg, yourpackage is unaffected, and if
otherpkg::fun uses some (possibly internal) API from otherpkg, that
has changed in the new version of otherpkg, you might easily end up
with a broken yourpackage again.
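
The copy semantics can be reproduced outside of any package. In this
sketch a plain environment stands in for the namespace of otherpkg (an
assumption made only so the example is self-contained), and reassigning
`fun` simulates installing a new version:

```r
# Stand-in for otherpkg's namespace (hypothetical; a plain environment here).
otherpkg <- new.env()
otherpkg$fun <- function() "old behaviour"

# Equivalent of `myfun <- otherpkg::fun` at install time: a copy is stored.
myfun <- otherpkg$fun

# Simulate upgrading otherpkg without reinstalling yourpackage.
otherpkg$fun <- function() "new behaviour"

myfun()         # still runs the old definition
otherpkg$fun()  # runs the new definition
```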

I think one lesson is to avoid running code at install time. This is
not a new thing, AFAIR it is even mentioned in 'Writing R extensions'.
Instead of running code at install time, you might consider running it
in `.onLoad()`, and then these "problems" go away. But you obviously
cannot always avoid it.
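
A minimal sketch of the `.onLoad()` variant, with `stats::median`
standing in for `otherpkg::fun` so that it runs without additional
packages. The name `my_median` is made up for this example.

```r
# Placeholder so that the binding exists in the package namespace.
my_median <- NULL

.onLoad <- function(libname, pkgname) {
  # Resolved every time the package is loaded, so it always refers to the
  # currently installed version of the other package.
  my_median <<- stats::median
}
```

The trade-off is that the work is repeated on every load instead of once
at install time.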

Gabor

* I think the R6 package is great, and I am not speaking in any way
against it. I just needed an example, and I know R6 much better than
reference classes, or other similar packages.

Re: Upgrading a package to which other packages are LinkingTo

Gábor Csárdi
FWIW I wrote a tool that tests which dependencies of a package are
build-time dependencies:
https://github.com/r-hub/builddeps

It is not very smart, just "brute-force", really. It tries to install the
package several times, leaving out one dependency at a time, and if the
installation fails, then the missing package is a build-time dependency.
(First it tries with the LinkingTo dependencies only, and if that succeeds,
then these are the only build-time dependencies.)
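
The search described above can be sketched roughly as follows. This is
not the code of builddeps; `find_build_deps()` and the `installs_ok`
callback (which receives the dependencies that are available and returns
TRUE if the package installs) are assumptions made for illustration, and
a toy function replaces the real installation attempt.

```r
# Hypothetical sketch of the leave-one-out search.
find_build_deps <- function(deps, linking_to, installs_ok) {
  # Shortcut: if the LinkingTo dependencies alone are enough, we are done.
  if (installs_ok(linking_to)) return(linking_to)
  build_deps <- character()
  for (d in deps) {
    # Leave out one dependency; a failed install marks it as build-time.
    if (!installs_ok(setdiff(deps, d))) build_deps <- c(build_deps, d)
  }
  build_deps
}

# Toy stand-in for a real installation attempt: this imaginary package
# installs only if both "Rcpp" and "R6" are available.
toy_install <- function(available) all(c("Rcpp", "R6") %in% available)

find_build_deps(c("Rcpp", "R6", "magrittr"),
                linking_to = "Rcpp", installs_ok = toy_install)
```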

It does download all dependent packages, and runs R CMD INSTALL several
times, so it is expensive. It is better to run it with binary packages.

It is mostly trivial, except that
1) it needs to edit DESCRIPTION and NAMESPACE to omit a dependency.
DESCRIPTION is easy, NAMESPACE somewhat more difficult, because there is a
parser for it, but no "writer".
2) the dependencies need to be considered in a topological order, otherwise
one gets wrong answers.
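
For point 2, a topological order can be computed with a few lines of
base R. The sketch below is a generic implementation over a made-up
dependency graph (`pkgA`, `pkgB`, `pkgC` are hypothetical), not the
approach taken in builddeps:

```r
# `deps` maps each package to the packages it depends on.
topo_order <- function(deps) {
  pkgs <- unique(c(names(deps), unlist(deps, use.names = FALSE)))
  done <- character()
  while (length(done) < length(pkgs)) {
    # A package is ready once all of its dependencies have been processed.
    ready <- Filter(function(p) !(p %in% done) && all(deps[[p]] %in% done),
                    pkgs)
    if (length(ready) == 0) stop("dependency cycle detected")
    done <- c(done, ready)
  }
  done
}

deps <- list(pkgC = c("pkgA", "pkgB"), pkgB = "pkgA", pkgA = character())
topo_order(deps)  # dependencies come before the packages that need them
```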

I wrote this mainly for R-hub, to know which binary packages need to be
rebuilt after a package update, but if you use it and have feedback, please
email me or open an issue in the GitHub repo.

Gabor

On Fri, Dec 16, 2016 at 9:27 PM, Gábor Csárdi <[hidden email]>
wrote:

> I think that this problem is actually more general than just ABI
> versioning. The common definition of ABI refers to compiled code, but
> with R packages similar problems might happen (and they do happen)
> without any compiled code.
>
> I think the key issue is the concept of build-time dependencies. While
> R packages usually do not distinguish between build-time and
> run-time dependencies, they still do exist, and I think ideally we
> would need to treat them differently.
>
> AFAIK LinkingTo is the only form of build-time dependency that is
> completely explicit, so it is relatively easy to handle. The other
> frequent form of build-time dependency is a function call to the other
> package that happens at install time. E.g. with reference classes or R6*
> classes you frequently include code like this in yourpackage:
>
> myclass <- R6::R6Class(...)
>
> and this code is evaluated at install time. So if the R6 package is
> updated, the installed version of myclass in yourpackage is not affected
> at all. In fact, if the new version of R6 is not compatible with the
> myclass object created by the old version, then yourpackage will be
> broken. (This AFAIK cannot happen with R6, so it is not the best
> example, but it can happen in other similar cases.)
>
> The key here is that R6 is a build-time dependency of yourpackage,
> similarly to packages linking to (i.e. LinkingTo) Rcpp.
>
> Another possible type of build-time dependency is if you put objects
> from another package in yourpackage. E.g.
>
> myfun <- otherpkg::fun
>
> Then a copy of otherpkg::fun will be saved in yourpackage. If you
> install a new version of otherpkg, yourpackage is unaffected, and if
> otherpkg::fun uses some (possibly internal) API from otherpkg, that
> has changed in the new version of otherpkg, you might easily end up
> with a broken yourpackage again.
>
> I think one lesson is to avoid running code at install time. This is
> not a new thing, AFAIR it is even mentioned in 'Writing R extensions'.
> Instead of running code at install time, you might consider running it
> in `.onLoad()`, and then these "problems" go away. But you obviously
> cannot always avoid it.
>
> Gabor
>
> * I think the R6 package is great, and I am not speaking in any way
> against it. I just needed an example, and I know R6 much better than
> reference classes, or other similar packages.
>
