How to (appropriately) use require in a package?

Joshua Wiley-2
Dear All,

What is the preferred way for Package A to initialize a cluster and load
Package B on all nodes?

I am writing a package that parallelizes some functions through the use of
a cluster if useRs are on a Windows machine (using parLapply and family).
I also make use of another package in some of my code, so it is necessary
to load the required packages on each slave once the cluster is started.

Right now, I do this by evaluating require(packages) on each slave;
however, R CMD check has a note that I should remove the "require" from my
code.

Thanks!

Josh

--
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: How to (appropriately) use require in a package?

Chris Green
I am not an expert here, but if it's a package, couldn't (shouldn't?) you
include Package B in one of the Depends: or Imports: lines in the
DESCRIPTION file? That would ensure Package B is automatically made
accessible whenever Package A is loaded. For example, see the Writing R
Extensions manual:

http://cran.fhcrc.org/doc/manuals/r-release/R-exts.html#Package-Dependencies


Chris Green
Ph.D. Student, Statistics
University of Washington, Seattle





Re: How to (appropriately) use require in a package?

Joshua Wiley-2
In reply to this post by Joshua Wiley-2
Someone kindly pointed out that it is not clear from my email why Depends
will not work.  A more complete example is:

PkgA:
f <- function(ncores) {
  cl <- makeCluster(ncores)

  # load PkgB on every worker (this is the call R CMD check flags)
  clusterEvalQ(cl, {
    require(PkgB)
  })
  # [other code]

  ### this is the code I want to work and need to be able to call
  ### PkgB functions on each of the cluster slaves
  output <- parLapply(cl, 1:n, function(i) {
    # [code from my package and using some functions from PkgB]
  })

}

As far as I know, adding PkgB to the Depends (or Imports) of PkgA does not
mean that a cluster started by PkgA will automatically have PkgB loaded and
its functions available.

Thanks!




Re: How to (appropriately) use require in a package?

Prof Brian Ripley
The safe, elegant way to do this is to use namespace scoping: it is
still not at all clear why 'other code' needs PkgB *on the search path*.

In other cases seen in CRAN submissions, 'other code' has been in PkgA's
namespace, and hence things in PkgB's exports have been visible as it
was imported by PkgA and hence in the environment tree for functions in
PkgA.  Then namespace scoping will ensure that PkgB's namespace is
loaded on the cluster workers.
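
A minimal sketch of relying on namespaces instead of require(), under stated assumptions: the base package 'tools' stands in for PkgB, and f() is a hypothetical stand-in for a function in PkgA. Because a standalone script has no NAMESPACE file, the sketch uses an explicit tools:: qualification rather than an import directive; in a real package the dependency would also be declared under Imports: in DESCRIPTION.

```r
library(parallel)

# Hypothetical stand-in for a function in PkgA; 'tools' plays the role of PkgB.
f <- function(paths, ncores = 2L) {
  cl <- makeCluster(ncores)
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down

  res <- parLapply(cl, paths, function(p) {
    # Namespace-qualified call: the worker loads the 'tools' namespace
    # on first use; nothing is attached to its search path.
    tools::file_ext(p)
  })
  unlist(res)
}

exts <- f(c("a.txt", "b.csv"))
```

No require() call appears anywhere, so there is nothing for R CMD check to flag on that account.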



--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: How to (appropriately) use require in a package?

Joshua Wiley-2
Dear Professor Ripley,

PkgB does not need to be on the search path; importing into the namespace
is fine.  I did not realize that namespace scoping ensures that, if a
cluster is created from within a package, that package's entire environment
tree is available on all the workers.
I had tried to apply how makeCluster works from an interactive R session, where
functions from packages that are loaded when the cluster is created are not
available on the workers, to how it works from within a package.
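
The difference comes down to how R serializes a function's enclosing environment when it is sent to a worker. A small sketch that needs no cluster, assuming the base package 'tools' as a stand-in for a package namespace: a function that lives in a namespace is serialized with a reference to that namespace by name, and unserializing it looks the namespace up again (loading it if necessary). Packages that are merely attached in the master session are not transmitted at all.

```r
# 'tools' stands in for a package namespace (PkgA in the thread).
fn <- tools::file_ext

# Round-trip the function the way it would travel to a worker.
bytes <- serialize(fn, connection = NULL)
fn2 <- unserialize(bytes)

# The enclosing environment is restored as the namespace itself, not a copy.
same_ns <- identical(environment(fn2), asNamespace("tools"))
```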

Thanks for your reply,

Josh




Re: How to (appropriately) use require in a package?

Henrik Singmann
Dear Joshua,

Sorry for resurrecting this thread, but I was on holiday earlier. I also had this problem, and unfortunately using loadNamespace() as suggested by Prof. Ripley didn't work in my case: the cluster is created by the user, and the call executed on the cluster can contain additional function calls from the loaded package that are passed by the user.

I settled on the following construct that doesn't seem to raise issues with R CMD check:

junk <- clusterCall(cl = cl, "require", package = "lme4", character.only = TRUE)

Let's hope it continues to not raise any flags.
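
A runnable variant of the construct above, as a sketch only: the base package 'tools' is substituted for lme4 so that it runs without extra installs, and the cluster is created here to play the role of the user-supplied one. The second call is an added check, not part of the original construct, that the package ended up on each worker's search path.

```r
library(parallel)

cl <- makeCluster(2L)  # stands in for a cluster created by the user

# Attach the package on every worker; each element is TRUE on success.
attached <- clusterCall(cl, "require", package = "tools", character.only = TRUE)

# Check that the package is now on each worker's search path.
on_path <- clusterEvalQ(cl, "package:tools" %in% search())

stopCluster(cl)
```

Attaching matters in this situation because user-supplied code run on the workers may call the package's functions without a pkg:: prefix.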

Best,
Henrik



--
Dr. Henrik Singmann
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
