install.packages and dependency version checking

install.packages and dependency version checking

Prof Brian Ripley
I've started to implement checks for package versions on dependencies in
install.packages().  However, this is revealing a number of
problems/misconceptions.


(A) We do not check versions when loading namespaces, and the namespace
registry does not contain version information.  So that for example
(rtracklayer)

Depends: R (>= 2.7.0), Biobase, methods, RCurl
Imports: XML (>= 1.98-0), IRanges, Biostrings

will never check the version of namespace XML that is loaded, either
already loaded or resulting from loading this package's namespace.  For
this to be operational we would need to extend the syntax of the import()
and importFrom() directives in a NAMESPACE file to allow version
restrictions. I am not sure this is worth doing, as an alternative is to
put the imported package in Depends.

The version dependence will in a future release cause an update of XML
when rtracklayer is installed, if needed (and available).
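
A minimal sketch of the kind of check that is missing at namespace load time, assuming the version of the loaded namespace is supplied by the caller. parse_dep() and dep_satisfied() are hypothetical helpers, not part of R, and only the >= operator is handled.

```r
# Split a DESCRIPTION entry such as "XML (>= 1.98-0)" into name and version.
# parse_dep() is a hypothetical helper; entries without a version give NA.
parse_dep <- function(spec) {
  spec <- trimws(spec)
  name <- sub("\\s*\\(.*$", "", spec)
  version <- if (grepl("\\(", spec)) {
    sub("^.*\\(\\s*>=\\s*([^)[:space:]]+)\\s*\\).*$", "\\1", spec)
  } else {
    NA_character_
  }
  list(name = name, version = version)
}

# TRUE if the given version meets the >= requirement (or there is none).
dep_satisfied <- function(spec, loaded_version) {
  dep <- parse_dep(spec)
  if (is.na(dep$version)) return(TRUE)
  package_version(loaded_version) >= package_version(dep$version)
}

old_ok <- dep_satisfied("XML (>= 1.98-0)", "1.96-0")
new_ok <- dep_satisfied("XML (>= 1.98-0)", "1.98-1")
```

In a live session the second argument could come from something like as.character(packageVersion("XML")), but that is left out here so the sketch runs without XML installed.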


(B) Things like (package stam)

Depends: R (>= 2.7.0), GO.db (>= 2.1.3), Biobase (>= 1.99.5), pamr (>=
         1.37.0), cluster (>= 1.11.10), annaffy (>= 1.11.5), methods (>=
         2.7.0), utils (>= 2.7.0)

are redundant: the versions of methods and utils are always the same as
that of R.

And there is no point in having a package in both Depends: and Imports:,
as Biostrings has.
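
The redundancy of version requirements on methods and utils can be seen directly, assuming a current R in which packageVersion() is available:

```r
# Base packages carry the version of R itself, so a requirement such as
# methods (>= 2.7.0) says nothing beyond R (>= 2.7.0).
r_version <- getRversion()
base_pkgs <- c("methods", "utils")
same_as_r <- vapply(
  base_pkgs,
  function(p) packageVersion(p) == r_version,
  logical(1)
)
print(same_as_r)
```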


(C) There is no check on the version of a package suggested by Suggests:,
unless the package itself provides one (and I found no instances).
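
A sketch of what a check provided by the package itself might look like, assuming a current R where requireNamespace() and packageVersion() exist (both postdate this thread). suggested_ok() is a hypothetical helper, and "stats" stands in for a suggested package only because it ships with R.

```r
# TRUE only if the suggested package can be loaded and is recent enough.
# suggested_ok() is a hypothetical helper, not part of R.
suggested_ok <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= package_version(min_version)
}

if (suggested_ok("stats", "2.7.0")) {
  message("suggested package is usable")
} else {
  message("suggested package missing or too old; skipping optional code")
}
```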


(D) We can really only handle >= dependencies on package versions (but
then I can see no other ops in use).  install.packages() will find the
latest version available on the repositories, and we possibly need to
check version requirements on the same dependency many times.  Given that
BioC has a penchant for having version dependencies on unavailable
versions (e.g. last week on IRanges (>= 1.1.7) with 1.1.4 available), we
may be able to satisfy the requirements of some packages and not others.
(In that case the strategy used is to install the latest available version
if the one installed does not suffice for those we can satisfy, and report
the problem(s).)
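
The strategy in (D) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in install.packages(): plan_dependency() is a hypothetical helper, pkgA and pkgB are made-up dependents, and all version strings are passed in rather than read from a library or repository.

```r
# required:  named character vector, one >= requirement per dependent package
# installed: installed version of the dependency, or NA if not installed
# available: latest version on the repositories
plan_dependency <- function(required, installed, available) {
  req <- package_version(required)
  satisfiable <- req <= package_version(available)
  if (is.na(installed)) {
    unmet <- satisfiable
  } else {
    # Requirements we could meet that the installed version does not meet.
    unmet <- satisfiable & (req > package_version(installed))
  }
  list(
    install = any(unmet),                          # fetch the latest version?
    unsatisfiable = names(required)[!satisfiable]  # problems to report
  )
}

required <- c(pkgA = "1.1.0", pkgB = "1.1.7")
plan <- plan_dependency(required, installed = "1.0.9", available = "1.1.4")
print(plan)
```

With 1.1.4 as the latest available version, the requirement from pkgA can be met by installing it, while the requirement from pkgB is reported as unsatisfiable.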


(E) One of the arguments that has been used to do this version checking at
install time is to avoid installing packages that cannot work. It would be
possible to extend the approach to do so, but I am going to leave that to
those who advocated it.


The net effect of the current changes will be that if there is a
dependence that is already installed but a later version is available and
will help satisfy a >= dependence, it will be added to the list of
packages to be installed.  As we have seen with Matrix this last week,
that can have downsides in stopping previously functional packages
working.

This is work in progress: there is no way to write a test suite that will
encapsulate all the possible scenarios, so we need to get experience
until 2.9.0 is released.  Please report any quirks to R-devel if they are
completely reproducible (and preferably with the code change needed to fix
them, since the chance of anyone else being able to reproduce them is
fairly slim).

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: install.packages and dependency version checking

Gabor Grothendieck
On Mon, Dec 15, 2008 at 12:35 PM, Prof Brian Ripley
<[hidden email]> wrote:
> (D) We can really only handle >= dependencies on package versions (but then
> I can see no other ops in use).  install.packages() will find the latest

Ryacas works with XML 1.96-0; however, newer versions of XML released
after Ryacas break it, so a new release of Ryacas would, at the moment,
need XML (= 1.96-0).


Re: install.packages and dependency version checking

Martin Morgan
In reply to this post by Prof Brian Ripley
Prof Brian Ripley <[hidden email]> writes:

> I've started to implement checks for package versions on dependencies
> in install.packages().  However, this is revealing a number of
> problems/misconceptions.
>
>
> (A) We do not check versions when loading namespaces, and the
> namespace registry does not contain version information.  So that for
> example (rtracklayer)
>
> Depends: R (>= 2.7.0), Biobase, methods, RCurl
> Imports: XML (>= 1.98-0), IRanges, Biostrings
>
> will never check the version of namespace XML that is loaded, either
> already loaded or resulting from loading this package's namespace.
> For this to be operational we would need to extend the syntax of the
> import() and importFrom() directives in a NAMESPACE file to allow
> version restrictions. I am not sure this is worth doing, as an
> alternative is to put the imported package in Depends.

Without XML in Imports:, references in the package name space to
functions in XML rely on the user not adjusting their search
path. Often XML may well be 'infrastructure' that the end-user has no
use for, and should not be contributing to the possibility of
unexpected name collisions by cluttering their search path.

> The version dependence will in a future release cause an update of XML
> when rtracklayer is installed, if needed (and available).
>
>
> (B) Things like (package stam)
>
> Depends: R (>= 2.7.0), GO.db (>= 2.1.3), Biobase (>= 1.99.5), pamr (>=
>          1.37.0), cluster (>= 1.11.10), annaffy (>= 1.11.5), methods (>=
>          2.7.0), utils (>= 2.7.0)
>
> are redundant: the versions of methods and utils are always the same as
> that of R.
>
> And there is no point in having a package in both Depends: and
> Imports:, as Biostrings has.

Imports: (and import() in NAMESPACE) gives the name space reliable
access to specific functions / classes; Depends: gives the user access
to (possibly a greater diversity of) functions.
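
The distinction can be illustrated with a small sketch. "splines" stands in for XML only because it ships with R and is not attached by default; loadNamespace() is roughly what an import arranges, while Depends: would additionally attach the package.

```r
# Loading a namespace makes its functions available to importing code
# without changing the user's search path.
attached_before <- "package:splines" %in% search()
ns <- loadNamespace("splines")
loaded <- "splines" %in% loadedNamespaces()
attached_after <- "package:splines" %in% search()

# The function is reachable through the namespace environment itself.
bs_fun <- get("bs", envir = ns)

print(c(loaded = loaded, attached = attached_after))
```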

> (C) There is no check on the version of a package suggested by
> Suggests:, unless the package itself provides one (and I found no
> instances).
>
>
> (D) We can really only handle >= dependencies on package versions (but
> then I can see no other ops in use).  install.packages() will find the
> latest version available on the repositories, and we possibly need to
> check version requirements on the same dependency many times.  Given
> that BioC has a penchant for having version dependencies on
> unavailable versions (e.g. last week on IRanges (>= 1.1.7) with 1.1.4
> available), we may be able to satisfy the requirements of some
> packages and not others. (In that case the strategy used is to install
> the latest available version if the one installed does not suffice for
> those we can satisfy, and report the problem(s).)

To clarify, I guess you mean that IRanges 1.1.4 would be installed for
packages that specified, say, IRanges >= 1.1.0, but that the package
depending on 1.1.7 would not install. It would be a mistake to install
a package with unsatisfied dependencies.

> (E) One of the arguments that has been used to do this version
> checking at install time is to avoid installing packages that cannot
> work. It would be possible to extend the approach to do so, but I am
> going to leave that to those who advocated it.
>
>
> The net effect of the current changes will be that if there is a
> dependence that is already installed but a later version is available
> and will help satisfy a >= dependence, it will be added to the list of
> packages to be installed.  As we have seen with Matrix this last week,
> that can have downsides in stopping previously functional packages
> working.
>
> This is work in progress: there is no way to write a test suite that
> will encapsulate all the possible scenarios, so we need to get
> experience until 2.9.0 is released.  Please report any quirks to
> R-devel if they are completely reproducible (and preferably with the
> code change needed to fix them, since the chance of anyone else being
> able to reproduce them is fairly slim).

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


Re: install.packages and dependency version checking

Prof Brian Ripley
On Mon, 15 Dec 2008, Martin Morgan wrote:

> Prof Brian Ripley <[hidden email]> writes:
>
>> I've started to implement checks for package versions on dependencies
>> in install.packages().  However, this is revealing a number of
>> problems/misconceptions.
>>
>>
>> (A) We do not check versions when loading namespaces, and the
>> namespace registry does not contain version information.  So that for
>> example (rtracklayer)
>>
>> Depends: R (>= 2.7.0), Biobase, methods, RCurl
>> Imports: XML (>= 1.98-0), IRanges, Biostrings
>>
>> will never check the version of namespace XML that is loaded, either
>> already loaded or resulting from loading this package's namespace.
>> For this to be operational we would need to extend the syntax of the
>> import() and importFrom() directives in a NAMESPACE file to allow
>> version restrictions. I am not sure this is worth doing, as an
>> alternative is to put the imported package in Depends.
>
> Without XML in Imports:, references in the package name space to
> functions in XML rely on the user not adjusting their search
> path. Often XML may well be 'infrastructure' that the end-user has no
> use for, and should not be contributing to the possibility of
> unexpected name collisions by cluttering their search path.

But if they have a version requirement it is going unchecked.

>> The version dependence will in a future release cause an update of XML
>> when rtracklayer is installed, if needed (and available).
>>
>>
>> (B) Things like (package stam)
>>
>> Depends: R (>= 2.7.0), GO.db (>= 2.1.3), Biobase (>= 1.99.5), pamr (>=
>>          1.37.0), cluster (>= 1.11.10), annaffy (>= 1.11.5), methods (>=
>>          2.7.0), utils (>= 2.7.0)
>>
>> are redundant: the versions of methods and utils are always the same as
>> that of R.
>>
>> And there is no point in having a package in both Depends: and
>> Imports:, as Biostrings has.
>
> Imports: (and import() in NAMESPACE) gives the name space reliable
> access to specific functions / classes; Depends: gives the user access
> to (possibly a greater diversity of) functions.
>
>> (C) There is no check on the version of a package suggested by
>> Suggests:, unless the package itself provides one (and I found no
>> instances).
>>
>>
>> (D) We can really only handle >= dependencies on package versions (but
>> then I can see no other ops in use).  install.packages() will find the
>> latest version available on the repositories, and we possibly need to
>> check version requirements on the same dependency many times.  Given
>> that BioC has a penchant for having version dependencies on
>> unavailable versions (e.g. last week on IRanges (>= 1.1.7) with 1.1.4
>> available), we may be able to satisfy the requirements of some
>> packages and not others. (In that case the strategy used is to install
>> the latest available version if the one installed does not suffice for
>> those we can satisfy, and report the problem(s).)
>
> To clarify, I guess you mean that IRanges 1.1.4 would be installed for
> packages that specified, say, IRanges >= 1.1.0,

Yes, if a recent enough version is not already installed.

> but that the package depending on 1.1.7 would not install.

Only if it uses lazy loading.  It might install but not load otherwise.

> It would be a mistake to install a package with unsatisfied
> dependencies.

That's a point of view: see (E).  Others would argue that the bug is in
the BioC release procedures, in that packages with impossible dependencies
should never be released.

>> (E) One of the arguments that has been used to do this version
>> checking at install time is to avoid installing packages that cannot
>> work. It would be possible to extend the approach to do so, but I am
>> going to leave that to those who advocated it.
>>
>>
>> The net effect of the current changes will be that if there is a
>> dependence that is already installed but a later version is available
>> and will help satisfy a >= dependence, it will be added to the list of
>> packages to be installed.  As we have seen with Matrix this last week,
>> that can have downsides in stopping previously functional packages
>> working.
>>
>> This is work in progress: there is no way to write a test suite that
>> will encapsulate all the possible scenarios, so we need to get
>> experience until 2.9.0 is released.  Please report any quirks to
>> R-devel if they are completely reproducible (and preferably with the
>> code change needed to fix them, since the chance of anyone else being
>> able to reproduce them is fairly slim).

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: install.packages and dependency version checking

Robert Gentleman
In reply to this post by Prof Brian Ripley
Hi,

Prof Brian Ripley wrote:

> I've started to implement checks for package versions on dependencies in
> install.packages().  However, this is revealing a number of
> problems/misconceptions.
>
>
> (A) We do not check versions when loading namespaces, and the namespace
> registry does not contain version information.  So that for example
> (rtracklayer)
>
> Depends: R (>= 2.7.0), Biobase, methods, RCurl
> Imports: XML (>= 1.98-0), IRanges, Biostrings
>
> will never check the version of namespace XML that is loaded, either
> already loaded or resulting from loading this package's namespace.  For
> this to be operational we would need to extend the syntax of the
> import() and importFrom() directives in a NAMESPACE file to allow
> version restrictions. I am not sure this is worth doing, as an
> alternative is to put the imported package in Depends.
>
> The version dependence will in a future release cause an update of XML
> when rtracklayer is installed, if needed (and available).
>
>

  I think we need to have this functionality in both Imports and Depends;
see my response to another point for why.

> (B) Things like (package stam)
>
> Depends: R (>= 2.7.0), GO.db (>= 2.1.3), Biobase (>= 1.99.5), pamr (>=
>         1.37.0), cluster (>= 1.11.10), annaffy (>= 1.11.5), methods (>=
>         2.7.0), utils (>= 2.7.0)
>
> are redundant: the versions of methods and utils are always the same as
> that of R.
>
> And there is no point in having a package in both Depends: and Imports:,
> as Biostrings has.

  I don't think that is true.  There are cases where both Imports and Depends
are reasonable.  The purpose of importing is to ensure correct resolution of
symbols in the internal functions of a package. I would do that in almost all
cases.  In some instances I want users to see functionality from another package
- and I can then either a) (re)export those functions, or if there are lots of
them, then b) just put the package also in Depends.  Now, a) is a bit less
useful than it could be since R CMD check gets annoyed about these re-exported
functions (I don't think it should care, the man page exists and is findable).

>
>
> (C) There is no check on the version of a package suggested by
> Suggests:, unless the package itself provides one (and I found no
> instances).

  It may be worthwhile, but this is a less frequent use case and I would
prioritize it lower than having that functionality in Imports.

>
>
> (D) We can really only handle >= dependencies on package versions (but
> then I can see no other ops in use).  install.packages() will find the
> latest version available on the repositories, and we possibly need to
> check version requirements on the same dependency many times.  Given
> that BioC has a penchant for having version dependencies on unavailable
> versions (e.g. last week on IRanges (>= 1.1.7) with 1.1.4 available), we
> may be able to satisfy the requirements of some packages and not others.
> (In that case the strategy used is to install the latest available
> version if the one installed does not suffice for those we can satisfy,
> and report the problem(s).)
>

  I suspect one needs = (basically as Gabor pointed out, some packages have issues).

>
> (E) One of the arguments that has been used to do this version checking
> at install time is to avoid installing packages that cannot work. It
> would be possible to extend the approach to do so, but I am going to
> leave that to those who advocated it.
>
>
> The net effect of the current changes will be that if there is a
> dependence that is already installed but a later version is available
> and will help satisfy a >= dependence, it will be added to the list of
> packages to be installed.  As we have seen with Matrix this last week,
> that can have downsides in stopping previously functional packages working.
>
> This is work in progress: there is no way to write a test suite that
> will encapsulate all the possible scenarios, so we need to get experience
> until 2.9.0 is released.  Please report any quirks to R-devel if they
> are completely reproducible (and preferably with the code change needed
> to fix them, since the chance of anyone else being able to reproduce
> them is fairly slim).
>
  thanks
    Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[hidden email]


Re: install.packages and dependency version checking

Kurt Hornik
>>>>> Robert Gentleman writes:

> Hi,

Two quick (late) comments:

> Prof Brian Ripley wrote:
>> I've started to implement checks for package versions on dependencies in
>> install.packages().  However, this is revealing a number of
>> problems/misconceptions.
>>
>>
>> (A) We do not check versions when loading namespaces, and the namespace
>> registry does not contain version information.  So that for example
>> (rtracklayer)
>>
>> Depends: R (>= 2.7.0), Biobase, methods, RCurl
>> Imports: XML (>= 1.98-0), IRanges, Biostrings
>>
>> will never check the version of namespace XML that is loaded, either
>> already loaded or resulting from loading this package's namespace.  For
>> this to be operational we would need to extend the syntax of the
>> import() and importFrom() directives in a NAMESPACE file to allow
>> version restrictions. I am not sure this is worth doing, as an
>> alternative is to put the imported package in Depends.
>>
>> The version dependence will in a future release cause an update of XML
>> when rtracklayer is installed, if needed (and available).
>>
>>

>       I think we need to have this functionality in both Imports and Depends,
>   see my response to another point for why.

>> (B) Things like (package stam)
>>
>> Depends: R (>= 2.7.0), GO.db (>= 2.1.3), Biobase (>= 1.99.5), pamr (>=
>> 1.37.0), cluster (>= 1.11.10), annaffy (>= 1.11.5), methods (>=
>> 2.7.0), utils (>= 2.7.0)
>>
>> are redundant: the versions of methods and utils are always the same as
>> that of R.
>>
>> And there is no point in having a package in both Depends: and Imports:,
>> as Biostrings has.

>   I don't think that is true.  There are cases where both Imports and Depends
> are reasonable.  The purpose of importing is to ensure correct resolution of
> symbols in the internal functions of a package. I would do that in almost all
> cases.  In some instances I want users to see functionality from another package
> - and I can then either a) (re)export those functions, or if there are lots of
> them, then b) just put the package also in Depends.  Now, a) is a bit less
> useful than it could be since R CMD check gets annoyed about these re-exported
> functions (I don't think it should care, the man page exists and is findable).

>>
>>
>> (C) There is no check on the version of a package suggested by
>> Suggests:, unless the package itself provides one (and I found no
>> instances).

>   It may be worthwhile, but this is a less frequent use case and I
> would prioritize it lower than having that functionality in Imports.

I think it would be good to have this too, because (see below) often
Suggests are used for "conditional Depends".

>>
>>
>> (D) We can really only handle >= dependencies on package versions (but
>> then I can see no other ops in use).  install.packages() will find the
>> latest version available on the repositories, and we possibly need to
>> check version requirements on the same dependency many times.  Given
>> that BioC has a penchant for having version dependencies on unavailable
>> versions (e.g. last week on IRanges (>= 1.1.7) with 1.1.4 available), we
>> may be able to satisfy the requirements of some packages and not others.
>> (In that case the strategy used is to install the latest available
>> version if the one installed does not suffice for those we can satisfy,
>> and report the problem(s).)
>>

>   I suspect one needs = (basically as Gabor pointed out, some packages
>   have issues).

It would be good to support all comparison ops, I think (including !=).
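
Supporting every comparison operator is straightforward on the comparison side, since package_version objects already implement them. A minimal sketch, where version_ok() is a hypothetical helper and treating "=" as an alias for "==" is an assumption based on the spelling used earlier in this thread:

```r
# Compare two version strings with an operator given as a string.
# version_ok() is a hypothetical helper, not part of R.
version_ok <- function(have, op, want) {
  if (op == "=") op <- "=="   # assumed alias, as in "XML (= 1.96-0)"
  stopifnot(op %in% c(">=", ">", "<=", "<", "==", "!="))
  cmp <- match.fun(op)
  cmp(package_version(have), package_version(want))
}

exact_match <- version_ok("1.98-1", "=", "1.96-0")    # newer XML, exact pin
not_broken  <- version_ok("1.98-1", "!=", "1.96-0")   # exclude one release
print(c(exact_match = exact_match, not_broken = not_broken))
```

The harder part, deciding which version to fetch when only the latest one is available on the repositories, is not addressed by this sketch.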

Not something doable right away, but it would be very good to
allow for alternatives in dependency specs so that one can say things
like

    Depends: foo | bar

indicating that one needs foo or bar, and most likely that if neither is
there installing the first one should be attempted.
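
A rough sketch of how such a spec might be resolved. The "foo | bar" syntax is only the proposal above, not something DESCRIPTION files support, and resolve_alternatives() is a hypothetical helper that takes the set of installed package names as an argument.

```r
# Pick the first alternative that is already installed; if none is,
# fall back to the first one and mark it for installation.
resolve_alternatives <- function(spec, installed) {
  alts <- trimws(strsplit(spec, "|", fixed = TRUE)[[1]])
  present <- alts[alts %in% installed]
  if (length(present) > 0) {
    list(use = present[1], install = character(0))
  } else {
    list(use = alts[1], install = alts[1])
  }
}

have_bar <- resolve_alternatives("foo | bar", installed = c("bar", "baz"))
have_none <- resolve_alternatives("foo | bar", installed = character(0))
```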

Best
-k


>>
>> (E) One of the arguments that has been used to do this version checking
>> at install time is to avoid installing packages that cannot work. It
>> would be possible to extend the approach to do so, but I am going to
>> leave that to those who advocated it.
>>
>>
>> The net effect of the current changes will be that if there is a
>> dependence that is already installed but a later version is available
>> and will help satisfy a >= dependence, it will be added to the list of
>> packages to be installed.  As we have seen with Matrix this last week,
>> that can have downsides in stopping previously functional packages working.
>>
>> This is work in progress: there is no way to write a test suite that
>> will encapsulate all the possible scenarios, so we need to get experience
>> until 2.9.0 is released.  Please report any quirks to R-devel if they
>> are completely reproducible (and preferably with the code change needed
>> to fix them, since the chance of anyone else being able to reproduce
>> them is fairly slim).
>>
>   thanks
>     Robert
