no visible binding for global variable for data sets in a package

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

no visible binding for global variable for data sets in a package

Michael Friendly
I'm updating the Lahman package of baseball statistics to the 2013
release. In addition to
the main data sets, the package also contains several convenience
functions that make use
of these data sets.  These now trigger the notes below from R CMD check
run with
Win builder, R-devel.  How can I avoid these?

* using R Under development (unstable) (2014-08-25 r66471)
* using platform: x86_64-w64-mingw32 (64-bit)
   ...
* checking R code for possible problems ... NOTE
Label: no visible binding for global variable 'battingLabels'
Label: no visible binding for global variable 'pitchingLabels'
Label: no visible binding for global variable 'fieldingLabels'
battingStats: no visible binding for global variable 'Batting'
battingStats: no visible global function definition for 'mutate'
playerInfo: no visible binding for global variable 'Master'
teamInfo: no visible binding for global variable 'Teams'

One such function:

## function for accessing variable labels

Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
fieldingLabels)) {
     wanted <- which(labels[,1]==var)
     if (length(wanted)) labels[wanted[1],2] else var
}


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Martin Maechler
>>>>> Michael Friendly <[hidden email]>
>>>>>     on Tue, 26 Aug 2014 17:58:34 -0400 writes:

    > I'm updating the Lahman package of baseball statistics to the 2013
    > release. In addition to
    > the main data sets, the package also contains several convenience
    > functions that make use
    > of these data sets.  These now trigger the notes below from R CMD check
    > run with
    > Win builder, R-devel.  How can I avoid these?

    > * using R Under development (unstable) (2014-08-25 r66471)
    > * using platform: x86_64-w64-mingw32 (64-bit)
    > ...
    > * checking R code for possible problems ... NOTE
    > Label: no visible binding for global variable 'battingLabels'
    > Label: no visible binding for global variable 'pitchingLabels'
    > Label: no visible binding for global variable 'fieldingLabels'
    > battingStats: no visible binding for global variable 'Batting'
    > battingStats: no visible global function definition for 'mutate'
    > playerInfo: no visible binding for global variable 'Master'
    > teamInfo: no visible binding for global variable 'Teams'

    > One such function:

    > ## function for accessing variable labels

    > Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
    > fieldingLabels)) {
    > wanted <- which(labels[,1]==var)
    > if (length(wanted)) labels[wanted[1],2] else var
    > }

and you are using the data sets you mentioned before,
(and the checking has been changed recently here).

This is a bit subtle:
Your data sets are part of your package (thanks to the default
lazyData), but *not* part of the namespace of your package.
Now, the reasoning goes as following: if someone uses a function
from your package, say Label() above,
by
        Lahman::Label(..)
and your package has not been attached to the search path,
your user will get an error, as the datasets are not found by
Label().

If you consider something like   Lahman::Label(..)
for a bit and the emphasis we put on R functions being the
primary entities, you can understand the current, i.e. new,
R CMD check warnings.

I see the following two options for you:

1) export all these data sets from your NAMESPACE
   For this (I thinK), you must define them in  Lahman/R/ possibly via a
   Lahman/R/sysdata.rda

2) rewrite your functions such that ensure the data sets are
   loaded when they are used.


"2)" actually works by adding    
    stopifnot(require(Lahman, quietly=TRUE))
  as first line in Label() and other such functions.

It works in the sense that  Lahman::Label("yearID")  will
work even when Lahman is not in the search path,
but   R-devel CMD check   will still give the same NOTE,
though you can argue that that note is actally a "false positive".
   
Not sure about another elegant way to make "2)" work, apart from
using  data() on each of the datasets inside the
function.  As I haven't tried it, that may *still* give a
(false) NOTE..

This is a somewhat interesting problem, and I wonder if everyone
else has solved it with '1)' rather than a version of '2)'.

Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Peter Dalgaard-2
The change would seem to be this

      \item \command{R CMD check} now by default checks code usage
      directly on the package namespace without loading and attaching
      the package and its suggests and enhances.

and perhaps the remedies could be stated more clearly?

Putting the data objects in the namespace would seem the obvious thing to do.

One oddity is that they are sort of in there already:

> Lahman::battingLabels
    variable                             label
1   playerID                    Player ID code
2     yearID                              Year
3      stint                    Player's stint
4     teamID                              Team
5       lgID                            League
6          G                             Games
7  G_batting                   Games as batter
8         AB                           At Bats
9          R                              Runs
10         H                              Hits
11       X2B                           Doubles
12       X3B                           Triples
13        HR                          Homeruns
14       RBI                    Runs Batted In
15        SB                      Stolen Bases
16        CS                   Caught Stealing
17        BB                     Base on Balls
18        SO                        Strikeouts
19       IBB                 Intentional walks
20       HBP                      Hit by pitch
21        SH                    Sacrifice hits
22        SF                   Sacrifice flies
23      GIDP        Grounded into double plays
24     G_Old Old version of games (deprecated)

But then again, actually not:

> Lahman::Label()
Error in rbind(battingLabels, pitchingLabels, fieldingLabels) :
  object 'battingLabels' not found

This is documented (:: can access datasets made available by lazy-loading), but it sort of suggests that we might want to improve consistency in this area.

-pd  


On 27 Aug 2014, at 11:24 , Martin Maechler <[hidden email]> wrote:

>>>>>> Michael Friendly <[hidden email]>
>>>>>>    on Tue, 26 Aug 2014 17:58:34 -0400 writes:
>
>> I'm updating the Lahman package of baseball statistics to the 2013
>> release. In addition to
>> the main data sets, the package also contains several convenience
>> functions that make use
>> of these data sets.  These now trigger the notes below from R CMD check
>> run with
>> Win builder, R-devel.  How can I avoid these?
>
>> * using R Under development (unstable) (2014-08-25 r66471)
>> * using platform: x86_64-w64-mingw32 (64-bit)
>> ...
>> * checking R code for possible problems ... NOTE
>> Label: no visible binding for global variable 'battingLabels'
>> Label: no visible binding for global variable 'pitchingLabels'
>> Label: no visible binding for global variable 'fieldingLabels'
>> battingStats: no visible binding for global variable 'Batting'
>> battingStats: no visible global function definition for 'mutate'
>> playerInfo: no visible binding for global variable 'Master'
>> teamInfo: no visible binding for global variable 'Teams'
>
>> One such function:
>
>> ## function for accessing variable labels
>
>> Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
>> fieldingLabels)) {
>> wanted <- which(labels[,1]==var)
>> if (length(wanted)) labels[wanted[1],2] else var
>> }
>
> and you are using the data sets you mentioned before,
> (and the checking has been changed recently here).
>
> This is a bit subtle:
> Your data sets are part of your package (thanks to the default
> lazyData), but *not* part of the namespace of your package.
> Now, the reasoning goes as following: if someone uses a function
> from your package, say Label() above,
> by
> Lahman::Label(..)
> and your package has not been attached to the search path,
> your user will get an error, as the datasets are not found by
> Label().
>
> If you consider something like   Lahman::Label(..)
> for a bit and the emphasis we put on R functions being the
> primary entities, you can understand the current, i.e. new,
> R CMD check warnings.
>
> I see the following two options for you:
>
> 1) export all these data sets from your NAMESPACE
>   For this (I thinK), you must define them in  Lahman/R/ possibly via a
>   Lahman/R/sysdata.rda
>
> 2) rewrite your functions such that ensure the data sets are
>   loaded when they are used.
>
>
> "2)" actually works by adding    
>   stopifnot(require(Lahman, quietly=TRUE))
>  as first line in Label() and other such functions.
>
> It works in the sense that  Lahman::Label("yearID")  will
> work even when Lahman is not in the search path,
> but   R-devel CMD check   will still give the same NOTE,
> though you can argue that that note is actally a "false positive".
>
> Not sure about another elegant way to make "2)" work, apart from
> using  data() on each of the datasets inside the
> function.  As I haven't tried it, that may *still* give a
> (false) NOTE..
>
> This is a somewhat interesting problem, and I wonder if everyone
> else has solved it with '1)' rather than a version of '2)'.
>
> Martin
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Joshua Wiley-2
I have had similar notes, but in cases where the dataset was created
internally by a function:

* checking R code for possible problems ... NOTE
vm_diagnostics: no visible binding for global variable 'Median'
vm_diagnostics: no visible binding for global variable 'Index'
vm_diagnostics: no visible binding for global variable 'LL'
vm_diagnostics: no visible binding for global variable 'UL'

and the relevant code is here (the function creates a dataset called
'sigma' and always names the variables Index, Median, LL, UL):

  p.sigma <- ggplot(sigma, aes(Index, Median, ymin = LL, ymax = UL)) +
    geom_pointrange() +
    labs(y = "Median + 95% CI for Sigma") +
    theme_classic()

is the appropriate way to remove warnings in this case also to have the
function attach the data it internally created?




On Wed, Aug 27, 2014 at 8:09 PM, peter dalgaard <[hidden email]> wrote:

> The change would seem to be this
>
>       \item \command{R CMD check} now by default checks code usage
>       directly on the package namespace without loading and attaching
>       the package and its suggests and enhances.
>
> and perhaps the remedies could be stated more clearly?
>
> Putting the data objects in the namespace would seem the obvious thing to
> do.
>
> One oddity is that they are sort of in there already:
>
> > Lahman::battingLabels
>     variable                             label
> 1   playerID                    Player ID code
> 2     yearID                              Year
> 3      stint                    Player's stint
> 4     teamID                              Team
> 5       lgID                            League
> 6          G                             Games
> 7  G_batting                   Games as batter
> 8         AB                           At Bats
> 9          R                              Runs
> 10         H                              Hits
> 11       X2B                           Doubles
> 12       X3B                           Triples
> 13        HR                          Homeruns
> 14       RBI                    Runs Batted In
> 15        SB                      Stolen Bases
> 16        CS                   Caught Stealing
> 17        BB                     Base on Balls
> 18        SO                        Strikeouts
> 19       IBB                 Intentional walks
> 20       HBP                      Hit by pitch
> 21        SH                    Sacrifice hits
> 22        SF                   Sacrifice flies
> 23      GIDP        Grounded into double plays
> 24     G_Old Old version of games (deprecated)
>
> But then again, actually not:
>
> > Lahman::Label()
> Error in rbind(battingLabels, pitchingLabels, fieldingLabels) :
>   object 'battingLabels' not found
>
> This is documented (:: can access datasets made available by
> lazy-loading), but it sort of suggests that we might want to improve
> consistency in this area.
>
> -pd
>
>
> On 27 Aug 2014, at 11:24 , Martin Maechler <[hidden email]>
> wrote:
>
> >>>>>> Michael Friendly <[hidden email]>
> >>>>>>    on Tue, 26 Aug 2014 17:58:34 -0400 writes:
> >
> >> I'm updating the Lahman package of baseball statistics to the 2013
> >> release. In addition to
> >> the main data sets, the package also contains several convenience
> >> functions that make use
> >> of these data sets.  These now trigger the notes below from R CMD check
> >> run with
> >> Win builder, R-devel.  How can I avoid these?
> >
> >> * using R Under development (unstable) (2014-08-25 r66471)
> >> * using platform: x86_64-w64-mingw32 (64-bit)
> >> ...
> >> * checking R code for possible problems ... NOTE
> >> Label: no visible binding for global variable 'battingLabels'
> >> Label: no visible binding for global variable 'pitchingLabels'
> >> Label: no visible binding for global variable 'fieldingLabels'
> >> battingStats: no visible binding for global variable 'Batting'
> >> battingStats: no visible global function definition for 'mutate'
> >> playerInfo: no visible binding for global variable 'Master'
> >> teamInfo: no visible binding for global variable 'Teams'
> >
> >> One such function:
> >
> >> ## function for accessing variable labels
> >
> >> Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
> >> fieldingLabels)) {
> >> wanted <- which(labels[,1]==var)
> >> if (length(wanted)) labels[wanted[1],2] else var
> >> }
> >
> > and you are using the data sets you mentioned before,
> > (and the checking has been changed recently here).
> >
> > This is a bit subtle:
> > Your data sets are part of your package (thanks to the default
> > lazyData), but *not* part of the namespace of your package.
> > Now, the reasoning goes as following: if someone uses a function
> > from your package, say Label() above,
> > by
> >       Lahman::Label(..)
> > and your package has not been attached to the search path,
> > your user will get an error, as the datasets are not found by
> > Label().
> >
> > If you consider something like   Lahman::Label(..)
> > for a bit and the emphasis we put on R functions being the
> > primary entities, you can understand the current, i.e. new,
> > R CMD check warnings.
> >
> > I see the following two options for you:
> >
> > 1) export all these data sets from your NAMESPACE
> >   For this (I thinK), you must define them in  Lahman/R/ possibly via a
> >   Lahman/R/sysdata.rda
> >
> > 2) rewrite your functions such that ensure the data sets are
> >   loaded when they are used.
> >
> >
> > "2)" actually works by adding
> >       stopifnot(require(Lahman, quietly=TRUE))
> >  as first line in Label() and other such functions.
> >
> > It works in the sense that  Lahman::Label("yearID")  will
> > work even when Lahman is not in the search path,
> > but   R-devel CMD check   will still give the same NOTE,
> > though you can argue that that note is actally a "false positive".
> >
> > Not sure about another elegant way to make "2)" work, apart from
> > using  data() on each of the datasets inside the
> > function.  As I haven't tried it, that may *still* give a
> > (false) NOTE..
> >
> > This is a somewhat interesting problem, and I wonder if everyone
> > else has solved it with '1)' rather than a version of '2)'.
> >
> > Martin
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: [hidden email]  Priv: [hidden email]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



--
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

John Maindonald
In reply to this post by Martin Maechler
Re solution 2, the following is in the function tabFarsDead()
the latest (0.55) version of gamclass:

  data('FARS', package='gamclass', envir=environment())
  FARS <- get("FARS", envir=environment())

The second statement is, strictly, redundant, but it
makes the syntax checker happy.  Another possibility
might be:

  FARS <- NULL
 data('FARS', package='gamclass', envir=environment())

I do not know whether this passes.

An FAQ that offers preferred solutions to such chestnuts,
or a web page, or a blog, would seem to me useful.


John Maindonald             email: [hidden email]<mailto:[hidden email]>

phone : +61 2 (6125)3473    fax  : +61 2(6125)5549

Centre for Mathematics & Its Applications, Room 1194,

John Dedman Mathematical Sciences Building (Building 27)

Australian National University, Canberra ACT 0200.


On 27 Aug 2014, at 20:00, <[hidden email]<mailto:[hidden email]>> <[hidden email]<mailto:[hidden email]>> wrote:

From: Martin Maechler <[hidden email]<mailto:[hidden email]>>
Subject: Re: [Rd] no visible binding for global variable for data sets in a package
Date: 27 August 2014 19:24:36 AEST
To: Michael Friendly <[hidden email]<mailto:[hidden email]>>
Cc: r-devel <[hidden email]<mailto:[hidden email]>>
Reply-To: Martin Maechler <[hidden email]<mailto:[hidden email]>>


Michael Friendly <[hidden email]<mailto:[hidden email]>>
   on Tue, 26 Aug 2014 17:58:34 -0400 writes:

I'm updating the Lahman package of baseball statistics to the 2013
release. In addition to
the main data sets, the package also contains several convenience
functions that make use
of these data sets.  These now trigger the notes below from R CMD check
run with
Win builder, R-devel.  How can I avoid these?

* using R Under development (unstable) (2014-08-25 r66471)
* using platform: x86_64-w64-mingw32 (64-bit)
...
* checking R code for possible problems ... NOTE
Label: no visible binding for global variable 'battingLabels'
Label: no visible binding for global variable 'pitchingLabels'
Label: no visible binding for global variable 'fieldingLabels'
battingStats: no visible binding for global variable 'Batting'
battingStats: no visible global function definition for 'mutate'
playerInfo: no visible binding for global variable 'Master'
teamInfo: no visible binding for global variable 'Teams'

One such function:

## function for accessing variable labels

Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
fieldingLabels)) {
wanted <- which(labels[,1]==var)
if (length(wanted)) labels[wanted[1],2] else var
}

and you are using the data sets you mentioned before,
(and the checking has been changed recently here).

This is a bit subtle:
Your data sets are part of your package (thanks to the default
lazyData), but *not* part of the namespace of your package.
Now, the reasoning goes as following: if someone uses a function
from your package, say Label() above,
by
Lahman::Label(..)
and your package has not been attached to the search path,
your user will get an error, as the datasets are not found by
Label().

If you consider something like   Lahman::Label(..)
for a bit and the emphasis we put on R functions being the
primary entities, you can understand the current, i.e. new,
R CMD check warnings.

I see the following two options for you:

1) export all these data sets from your NAMESPACE
  For this (I thinK), you must define them in  Lahman/R/ possibly via a
  Lahman/R/sysdata.rda

2) rewrite your functions such that ensure the data sets are
  loaded when they are used.


"2)" actually works by adding
   stopifnot(require(Lahman, quietly=TRUE))
 as first line in Label() and other such functions.

It works in the sense that  Lahman::Label("yearID")  will
work even when Lahman is not in the search path,
but   R-devel CMD check   will still give the same NOTE,
though you can argue that that note is actally a "false positive".

Not sure about another elegant way to make "2)" work, apart from
using  data() on each of the datasets inside the
function.  As I haven't tried it, that may *still* give a
(false) NOTE..

This is a somewhat interesting problem, and I wonder if everyone
else has solved it with '1)' rather than a version of '2)'.

Martin


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Michael Friendly
In reply to this post by Martin Maechler
On 8/27/2014 5:24 AM, Martin Maechler wrote:

>>>>>> Michael Friendly <[hidden email]>
>>>>>>      on Tue, 26 Aug 2014 17:58:34 -0400 writes:
>      > I'm updating the Lahman package of baseball statistics to the 2013
>      > release. In addition to
>      > the main data sets, the package also contains several convenience
>      > functions that make use
>      > of these data sets.  These now trigger the notes below from R CMD check
>      > run with
>      > Win builder, R-devel.  How can I avoid these?
>
>      > * using R Under development (unstable) (2014-08-25 r66471)
>      > * using platform: x86_64-w64-mingw32 (64-bit)
>      > ...
>      > * checking R code for possible problems ... NOTE
>      > Label: no visible binding for global variable 'battingLabels'
>      > Label: no visible binding for global variable 'pitchingLabels'
>      > Label: no visible binding for global variable 'fieldingLabels'
>      > battingStats: no visible binding for global variable 'Batting'
>      > battingStats: no visible global function definition for 'mutate'
>      > playerInfo: no visible binding for global variable 'Master'
>      > teamInfo: no visible binding for global variable 'Teams'
>
>      > One such function:
>
>      > ## function for accessing variable labels
>
>      > Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
>      > fieldingLabels)) {
>      > wanted <- which(labels[,1]==var)
>      > if (length(wanted)) labels[wanted[1],2] else var
>      > }
>
> and you are using the data sets you mentioned before,
> (and the checking has been changed recently here).
>
> This is a bit subtle:
> Your data sets are part of your package (thanks to the default
> lazyData), but *not* part of the namespace of your package.
> Now, the reasoning goes as following: if someone uses a function
> from your package, say Label() above,
> by
> Lahman::Label(..)
> and your package has not been attached to the search path,
> your user will get an error, as the datasets are not found by
> Label().
>
> If you consider something like   Lahman::Label(..)
> for a bit and the emphasis we put on R functions being the
> primary entities, you can understand the current, i.e. new,
> R CMD check warnings.
Thanks for this explicit explanation.  Now I understand why this occurs.
>
> I see the following two options for you:
>
> 1) export all these data sets from your NAMESPACE
>     For this (I thinK), you must define them in  Lahman/R/ possibly via a
>     Lahman/R/sysdata.rda
Not sure I quite understand how this would work.  My NAMESPACE currently
exports
the few functions in this package:

# all the rest is data
export(battingStats,
     playerInfo,teamInfo,
     Label
     )

Do you mean to simply add all the data sets ('globals')  that are
referred to in these functions?

# all the rest is data
export(battingStats,
     playerInfo,teamInfo,
     Label,
     battingLabels, pitchingLabels, fieldingLabels,
     Batting, Master, Teams
     )

That seems a bit odd.  Can you actually export data?  Maybe there is a
need for a
separate NAMESPACE declaration, that might be called either of

exportdata()
globaldata()

>
> 2) rewrite your functions such that ensure the data sets are
>     loaded when they are used.
>
>
> "2)" actually works by adding
>     stopifnot(require(Lahman, quietly=TRUE))
>    as first line in Label() and other such functions.
>
> It works in the sense that  Lahman::Label("yearID")  will
> work even when Lahman is not in the search path,
> but   R-devel CMD check   will still give the same NOTE,
> though you can argue that that note is actally a "false positive".
So, this would be version 1 of "2)":

Label <- function(var, labels) {
     stopifnot(require(Lahman, quietly=TRUE))
     if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels,
fieldingLabels)
     wanted <- which(labels[,1]==var)
     if (length(wanted)) labels[wanted[1],2] else var
}

And this would be version 2, using data():

Label <- function(var, labels) {
     stopifnot(require(Lahman, quietly=TRUE))
     if(missing(labels)) {
         data(battingLabels); data(pitchingLabels); data(fieldingLabels)
         labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
         }
     wanted <- which(labels[,1]==var)
     if (length(wanted)) labels[wanted[1],2] else var
}


>    
> Not sure about another elegant way to make "2)" work, apart from
> using  data() on each of the datasets inside the
> function.  As I haven't tried it, that may *still* give a
> (false) NOTE..
>
> This is a somewhat interesting problem, and I wonder if everyone
> else has solved it with '1)' rather than a version of '2)'.
>
> Martin
>    


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Michael Friendly
On 8/27/2014 9:29 AM, Michael Friendly wrote:

>> It works in the sense that  Lahman::Label("yearID")  will
>> work even when Lahman is not in the search path,
>> but   R-devel CMD check   will still give the same NOTE,
>> though you can argue that that note is actally a "false positive".
> So, this would be version 1 of "2)":
>
> Label <- function(var, labels) {
>      stopifnot(require(Lahman, quietly=TRUE))
>      if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels,
> fieldingLabels)
>      wanted <- which(labels[,1]==var)
>      if (length(wanted)) labels[wanted[1],2] else var
> }
>
> And this would be version 2, using data():
>
> Label <- function(var, labels) {
>      stopifnot(require(Lahman, quietly=TRUE))
>      if(missing(labels)) {
>          data(battingLabels); data(pitchingLabels); data(fieldingLabels)
>          labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
>          }
>      wanted <- which(labels[,1]==var)
>      if (length(wanted)) labels[wanted[1],2] else var
> }
>
>

Just to follow up:  R-devel likes this less than it does my initial
version.  I still get no visible binding NOTES, and complaint about
using data() in a function:

* checking R code for possible problems ... NOTE
Label: no visible binding for global variable 'battingLabels'
Label: no visible binding for global variable 'pitchingLabels'
Label: no visible binding for global variable 'fieldingLabels'
battingStats: no visible binding for global variable 'Batting'
battingStats: no visible global function definition for 'mutate'
playerInfo: no visible binding for global variable 'Master'
teamInfo: no visible binding for global variable 'Teams'

Found the following calls to data() loading into the global environment:
File 'Lahman/R/Label.R':
   data(battingLabels)
   data(pitchingLabels)
   data(fieldingLabels)
See section 'Good practice' in '?data'.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Peter Dalgaard-2
Using data() in a package with lazyloaded data seems like asking for trouble to me. But you're making me curious (and I'm too lazy[*] to set up for rebuilding the package myself):

Did you actually try putting battingLabels & friends in the namespace? What happened?

A workaround could be to use rbind(Lahman::battingLabels,....) which runs a bit against the grain for me.

I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)

- pd


[*] "Burdened with pressing obligations", if you like.


On 27 Aug 2014, at 16:07 , Michael Friendly <[hidden email]> wrote:

> On 8/27/2014 9:29 AM, Michael Friendly wrote:
>>> It works in the sense that  Lahman::Label("yearID")  will
>>> work even when Lahman is not in the search path,
>>> but   R-devel CMD check   will still give the same NOTE,
>>> though you can argue that that note is actally a "false positive".
>> So, this would be version 1 of "2)":
>>
>> Label <- function(var, labels) {
>>     stopifnot(require(Lahman, quietly=TRUE))
>>     if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels,
>> fieldingLabels)
>>     wanted <- which(labels[,1]==var)
>>     if (length(wanted)) labels[wanted[1],2] else var
>> }
>>
>> And this would be version 2, using data():
>>
>> Label <- function(var, labels) {
>>     stopifnot(require(Lahman, quietly=TRUE))
>>     if(missing(labels)) {
>>         data(battingLabels); data(pitchingLabels); data(fieldingLabels)
>>         labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
>>         }
>>     wanted <- which(labels[,1]==var)
>>     if (length(wanted)) labels[wanted[1],2] else var
>> }
>>
>>
>
> Just to follow up:  R-devel likes this less than it does my initial version.  I still get no visible binding NOTES, and complaint about
> using data() in a function:
>
> * checking R code for possible problems ... NOTE
> Label: no visible binding for global variable 'battingLabels'
> Label: no visible binding for global variable 'pitchingLabels'
> Label: no visible binding for global variable 'fieldingLabels'
> battingStats: no visible binding for global variable 'Batting'
> battingStats: no visible global function definition for 'mutate'
> playerInfo: no visible binding for global variable 'Master'
> teamInfo: no visible binding for global variable 'Teams'
>
> Found the following calls to data() loading into the global environment:
> File 'Lahman/R/Label.R':
>  data(battingLabels)
>  data(pitchingLabels)
>  data(fieldingLabels)
> See section 'Good practice' in '?data'.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

hadley wickham
> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)

I don't think lazyloaded data are in the same environment as exported
functions - getExportedValue() (called by ::) looks first in the
"exports" namespace, then in the "lazydata" namespace:

function (ns, name)
{
    getInternalExportName <- function(name, ns) {
        exports <- getNamespaceInfo(ns, "exports")
        if (exists(name, envir = exports, inherits = FALSE))
            get(get(name, envir = exports, inherits = FALSE),
                envir = ns)
        else {
            ld <- getNamespaceInfo(ns, "lazydata")
            if (exists(name, envir = ld, inherits = FALSE))
                get(name, envir = ld, inherits = FALSE)
            else stop(gettextf("'%s' is not an exported object from
'namespace:%s'",
                name, getNamespaceName(ns)), call. = FALSE, domain = NA)
        }
    }
    ns <- asNamespace(ns)
    if (isBaseNamespace(ns))
        get(name, envir = ns, inherits = FALSE)
    else getInternalExportName(name, ns)
}


(But maybe you just meant the library() and :: behaves as is lazydata
and exports were the same thing)

Hadley

--
http://had.co.nz/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Prof Brian Ripley
In reply to this post by Michael Friendly
http://cran.r-project.org/doc/manuals/r-patched/R-exts.html#Suggested-packages 
suggests e.g. Lahman::battingLabels, and that does work for lazy-loaded
datasets (which is what these appear to be).

We have seen a couple of other instances in which this was needed for
code within the package.  However, in this case I would have thought
battingLabels etc were really system data and should be in sysdata.rda.
  See ?data.

As the reference suggests, putting Lahman on the search path does not
necessarily work correctly: what if a user (or someone else's package!)
puts another package on the search path which masks battingLabels?
Assuming you want the version in the package, you have to say so.  (If
you do not, supply argument 'labels'.)


On 27/08/2014 15:07, Michael Friendly wrote:

> On 8/27/2014 9:29 AM, Michael Friendly wrote:
>>> It works in the sense that  Lahman::Label("yearID")  will
>>> work even when Lahman is not in the search path,
>>> but   R-devel CMD check   will still give the same NOTE,
>>> though you can argue that that note is actally a "false positive".
>> So, this would be version 1 of "2)":
>>
>> Label <- function(var, labels) {
>>      stopifnot(require(Lahman, quietly=TRUE))
>>      if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels,
>> fieldingLabels)
>>      wanted <- which(labels[,1]==var)
>>      if (length(wanted)) labels[wanted[1],2] else var
>> }
>>
>> And this would be version 2, using data():
>>
>> Label <- function(var, labels) {
>>      stopifnot(require(Lahman, quietly=TRUE))
>>      if(missing(labels)) {
>>          data(battingLabels); data(pitchingLabels); data(fieldingLabels)
>>          labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
>>          }
>>      wanted <- which(labels[,1]==var)
>>      if (length(wanted)) labels[wanted[1],2] else var
>> }
>>
>>
>
> Just to follow up:  R-devel likes this less than it does my initial
> version.  I still get no visible binding NOTES, and complaint about
> using data() in a function:
>
> * checking R code for possible problems ... NOTE
> Label: no visible binding for global variable 'battingLabels'
> Label: no visible binding for global variable 'pitchingLabels'
> Label: no visible binding for global variable 'fieldingLabels'
> battingStats: no visible binding for global variable 'Batting'
> battingStats: no visible global function definition for 'mutate'
> playerInfo: no visible binding for global variable 'Master'
> teamInfo: no visible binding for global variable 'Teams'
>
> Found the following calls to data() loading into the global environment:
> File 'Lahman/R/Label.R':
>    data(battingLabels)
>    data(pitchingLabels)
>    data(fieldingLabels)
> See section 'Good practice' in '?data'.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


--
Brian D. Ripley,                  [hidden email]
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Michael Friendly
In reply to this post by Peter Dalgaard-2
On 8/27/2014 10:41 AM, peter dalgaard wrote:
> Using data() in a package with lazyloaded data seems like asking for trouble to me. But you're making me curious (and I'm too lazy[*] to set up for rebuilding the package myself):
>
> Did you actually try putting battingLabels & friends in the namespace? What happened?
Well, I tried this, with NAMESPACE as

# all the rest is data, except we try to export the data sets used
globally in these functions
export(battingStats,
     playerInfo,teamInfo,
     Label,
     battingLabels, pitchingLabels, fieldingLabels,
     Batting, Master, Teams
     )

R CMD check was even more unhappy, failing immediately, so the attempted
cure was worse than the original disease.

* checking whether package 'Lahman' can be installed ...Warning: running
command '"C:/R/R-3.0.3/bin/x64/Rcmd.exe" INSTALL -l
"C:/eclipse/Lahman2013.Rcheck" --no-html
"C:\DOCUME~2\WORKSP~1\LAHMAN~1"' had status 1
  ERROR
Installation failed.
See 'C:/eclipse/Lahman2013.Rcheck/00install.out' for details.

00install.out said:

Error in namespaceExport(ns, exports) :
   undefined exports: battingLabels, pitchingLabels, fieldingLabels,
Batting, Master, Teams
Error: loading failed
Execution halted

My conclusions so far are:

- These are just NOTEs, so I will ignore them for now.  But they will
probably trigger the CRAN
maintainers to notice them when I resubmit and I will have to plead
guilty with an explanation.  This makes more useless work for all involved.

- Peter Dalgaard noted the change in R-devel, and nobody so far has
suggested a working remedy, so a clean solution
seems warranted.  It mentions 'now by default' -- is there a switch for
this?

The change would seem to be this

       \item \command{R CMD check} now by default checks code usage
       directly on the package namespace without loading and attaching
       the package and its suggests and enhances.

and perhaps the remedies could be stated more clearly?

Alternatively, it isn't common, but it is by no means rare for package
functions to need to use data sets
in the package, so some mechanism to declare these, perhaps in NAMESPACE
is needed.

>
> A workaround could be to use rbind(Lahman::battingLabels,....) which runs a bit against the grain for me.
>
> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)
>
> - pd
>
>
> [*] "Burdened with pressing obligations", if you like.
>
>
> On 27 Aug 2014, at 16:07 , Michael Friendly <[hidden email]> wrote:
>
>> On 8/27/2014 9:29 AM, Michael Friendly wrote:
>>>> It works in the sense that  Lahman::Label("yearID")  will
>>>> work even when Lahman is not in the search path,
>>>> but   R-devel CMD check   will still give the same NOTE,
>>>> though you can argue that that note is actally a "false positive".
>>> So, this would be version 1 of "2)":
>>>
>>> Label <- function(var, labels) {
>>>      stopifnot(require(Lahman, quietly=TRUE))
>>>      if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels,
>>> fieldingLabels)
>>>      wanted <- which(labels[,1]==var)
>>>      if (length(wanted)) labels[wanted[1],2] else var
>>> }
>>>
>>> And this would be version 2, using data():
>>>
>>> Label <- function(var, labels) {
>>>      stopifnot(require(Lahman, quietly=TRUE))
>>>      if(missing(labels)) {
>>>          data(battingLabels); data(pitchingLabels); data(fieldingLabels)
>>>          labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
>>>          }
>>>      wanted <- which(labels[,1]==var)
>>>      if (length(wanted)) labels[wanted[1],2] else var
>>> }
>>>
>>>
>> Just to follow up:  R-devel likes this less than it does my initial version.  I still get no visible binding NOTES, and complaint about
>> using data() in a function:
>>
>> * checking R code for possible problems ... NOTE
>> Label: no visible binding for global variable 'battingLabels'
>> Label: no visible binding for global variable 'pitchingLabels'
>> Label: no visible binding for global variable 'fieldingLabels'
>> battingStats: no visible binding for global variable 'Batting'
>> battingStats: no visible global function definition for 'mutate'
>> playerInfo: no visible binding for global variable 'Master'
>> teamInfo: no visible binding for global variable 'Teams'
>>
>> Found the following calls to data() loading into the global environment:
>> File 'Lahman/R/Label.R':
>>   data(battingLabels)
>>   data(pitchingLabels)
>>   data(fieldingLabels)
>> See section 'Good practice' in '?data'.
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

hadley wickham
On Wed, Aug 27, 2014 at 12:51 PM, Michael Friendly <[hidden email]> wrote:

> On 8/27/2014 10:41 AM, peter dalgaard wrote:
>>
>> Using data() in a package with lazyloaded data seems like asking for
>> trouble to me. But you're making me curious (and I'm too lazy[*] to set up
>> for rebuilding the package myself):
>>
>> Did you actually try putting battingLabels & friends in the namespace?
>> What happened?
>
> Well, I tried this, with NAMESPACE as
>
> # all the rest is data, except we try to export the data sets used globally
> in these functions
>
> export(battingStats,
>     playerInfo,teamInfo,
>     Label,
>     battingLabels, pitchingLabels, fieldingLabels,
>     Batting, Master, Teams
>     )
>
> R CMD check was even more unhappy, failing immediately, so the attempted
> cure was worse than the original disease.

If you want to export your datasets, you'll need to put them in R/sysdata.rda.

But I think Brian suggested the right fix: use Lahmann::Batting
instead of Batting. (And leave everything else as is)

Hadley

--
http://had.co.nz/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Peter Dalgaard-2
In reply to this post by Michael Friendly

On 27 Aug 2014, at 19:51 , Michael Friendly <[hidden email]> wrote:

> - Peter Dalgaard noted the change in R-devel, and nobody so far has suggested a working remedy, so a clean solution
> seems warranted.

Actually, both Peter Dalgaard and Brian Ripley suggested Lahman:battingLabels, and, with the current semantics at least, it does seem to be recommendable to namespace-qualify any such datasets, lest you be a potential victim of variable masking.

(See also my reply to Hadley, once I get it finalized.)

-pd
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Peter Dalgaard-2
In reply to this post by hadley wickham

On 27 Aug 2014, at 16:48 , Hadley Wickham <[hidden email]> wrote:

>> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)
>
> I don't think lazyloaded data are in the same environment as exported
> functions - getExportedValue() (called by ::) looks first in the
> "exports" namespace, then in the "lazydata" namespace:
>
> function (ns, name)
> {
>    getInternalExportName <- function(name, ns) {
>        exports <- getNamespaceInfo(ns, "exports")
>        if (exists(name, envir = exports, inherits = FALSE))
>            get(get(name, envir = exports, inherits = FALSE),
>                envir = ns)
>        else {
>            ld <- getNamespaceInfo(ns, "lazydata")
>            if (exists(name, envir = ld, inherits = FALSE))
>                get(name, envir = ld, inherits = FALSE)
>            else stop(gettextf("'%s' is not an exported object from
> 'namespace:%s'",
>                name, getNamespaceName(ns)), call. = FALSE, domain = NA)
>        }
>    }
>    ns <- asNamespace(ns)
>    if (isBaseNamespace(ns))
>        get(name, envir = ns, inherits = FALSE)
>    else getInternalExportName(name, ns)
> }
>
>
> (But maybe you just meant the library() and :: behaves as is lazydata
> and exports were the same thing)
>
> Hadley
>
> --
> http://had.co.nz/

I meant that

a) :: gives results as if the data was in the namespace

b) if you do

> library(MASS)
> ls("package:MASS")
  [1] "abbey"              "accdeaths"          "addterm"          
  [4] "Aids2"              "Animals"            "anorexia"          
  [7] "area"               "as.fractions"       "bacteria"          
 [10] "bandwidth.nrd"      "bcv"                "beav1"            
 ....

data and functions get put together in the same environment (however, this is not the namespace environment, see later).

What puzzles me is why the distinction between lazydata and exports was there to begin with.

The implication of the current setup is clearly that pkg::foo() cannot access package::dat by referring to `dat` whereas it can do so if invoked as library(pkg); foo().

We also have

> get("accdeaths", environment(MASS::addterm))
Error in get("accdeaths", environment(MASS::addterm)) :
  object 'accdeaths' not found
> library(MASS)
> get("accdeaths", environment(MASS::addterm))
       Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
1973  9007  8106  8928  9137 10017 10826 11317 10744  9713  9938  9161
1974  7750  6981  8038  8422  8714  9512 10120  9823  8743  9129  8710
...

which confused me at first, but it actually just means that "accdeaths" is found on the search path in the latter case. This strikes me as somewhat dangerous: If a package uses one of its own datasets, it can be masked by a later attach() or the global env.

(I suspect that someone already explained all this a while back, but I just wasn't listening at the time...)

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Martin Maechler
>>>>> peter dalgaard <[hidden email]>
>>>>>     on Wed, 27 Aug 2014 21:09:47 +0200 writes:

    > On 27 Aug 2014, at 16:48 , Hadley Wickham <[hidden email]> wrote:

    >>> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)
    >>
    >> I don't think lazyloaded data are in the same environment as exported
    >> functions - getExportedValue() (called by ::) looks first in the
    >> "exports" namespace, then in the "lazydata" namespace:
    >>
    >> function (ns, name)
    >> {
    >> getInternalExportName <- function(name, ns) {
    >> exports <- getNamespaceInfo(ns, "exports")
    >> if (exists(name, envir = exports, inherits = FALSE))
    >> get(get(name, envir = exports, inherits = FALSE),
    >> envir = ns)
    >> else {
    >> ld <- getNamespaceInfo(ns, "lazydata")
    >> if (exists(name, envir = ld, inherits = FALSE))
    >> get(name, envir = ld, inherits = FALSE)
    >> else stop(gettextf("'%s' is not an exported object from
    >> 'namespace:%s'",
    >> name, getNamespaceName(ns)), call. = FALSE, domain = NA)
    >> }
    >> }
    >> ns <- asNamespace(ns)
    >> if (isBaseNamespace(ns))
    >> get(name, envir = ns, inherits = FALSE)
    >> else getInternalExportName(name, ns)
    >> }
    >>
    >>
    >> (But maybe you just meant the library() and :: behaves as is lazydata
    >> and exports were the same thing)
    >>
    >> Hadley
    >>
    >> --
    >> http://had.co.nz/

    > I meant that

    > a) :: gives results as if the data was in the namespace

    > b) if you do

    >> library(MASS)
    >> ls("package:MASS")
    > [1] "abbey"              "accdeaths"          "addterm"          
    > [4] "Aids2"              "Animals"            "anorexia"          
    > [7] "area"               "as.fractions"       "bacteria"          
    > [10] "bandwidth.nrd"      "bcv"                "beav1"            
    > ....

    > data and functions get put together in the same environment (however, this is not the namespace environment, see later).

    > What puzzles me is why the distinction between lazydata and exports was there to begin with.

    > The implication of the current setup is clearly that pkg::foo() cannot access package::dat by referring to `dat` whereas it can do so if invoked as library(pkg); foo().

    > We also have

    >> get("accdeaths", environment(MASS::addterm))
    > Error in get("accdeaths", environment(MASS::addterm)) :
    > object 'accdeaths' not found
    >> library(MASS)
    >> get("accdeaths", environment(MASS::addterm))
    > Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
    > 1973  9007  8106  8928  9137 10017 10826 11317 10744  9713  9938  9161
    > 1974  7750  6981  8038  8422  8714  9512 10120  9823  8743  9129  8710
    > ...

    > which confused me at first, but it actually just means that "accdeaths" is found on the search path in the latter case. This strikes me as somewhat dangerous: If a package uses one of its own datasets, it can be masked by a later attach() or the global env.

    > (I suspect that someone already explained all this a while back, but I just wasn't listening at the time...)

I did say (as first reply in this thread) that the lazyloaded
datasets where in the package environment, but not
in the package's *namespace* environment -- which is somewhat
exceptional, as otherwise, the namespace envir is typically a superset
of the package envir, and I did mention the solution with
sysdata.rda + NAMESPACE which Brian and Hadley later mentioned, too.
However Michael did not get it well enough (because you
need to read other material), and I do think that is not an
ideal solution for data sets that have help pages, etc; notably
because creating sysdata.rda is not at all part of the package
sources, and hence harder for maintenance and "open source transparency".

The one additional important in the thread was the special
semantic of '::'  which here returns something for
 <pkg>::<obj> which is *not* part of the <pkg>'s namespace,
whereas the use of '::' does suggest to Joe Average to be
working with namespace (as opposed to package) contents.
I agree with your suggestion, Peter, that this looks more
confusing than it should, and ideally, we'd find a better setup.

OTOH, if we additionally allowed something like   exportData(),
we would additionally get data that is both in the package and
namespace environments and other (not exported) data that is
only in the package, so add room for even more confusion.

Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: no visible binding for global variable for data sets in a package

Therneau, Terry M., Ph.D.
In reply to this post by Michael Friendly
Here is one rationale for the change, which was useful for my understanding

This arises in the survival package in the survexp function:
     survexp <- function(formula, data, weights, subset, na.action,.... ratetable=survexp.us)

The argument has been changed to survival::survexp.us, soon to be uploaded to CRAN.

Assume that a user has an object named "survexp.us" in their working data, and calls
survexp without providing a ratetable arguement.  Should they get the package default or
their own object?  I'd vote for the first, but the current behavior is the latter.  This
was presented to me as "fixing a bug that has long been present, but we never noticed before."

Now, if the user adds "ratetable=survexp.us" to the call they should get their own
object.  I've always hated computer systems that ignore explicit direction, assuming they
know more than me.

Terry Therneau

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel