|
I'm updating the Lahman package of baseball statistics to the 2013
release. In addition to the main data sets, the package also contains several convenience functions that make use of these data sets. These now trigger the notes below from R CMD check run with Win builder, R-devel. How can I avoid these? * using R Under development (unstable) (2014-08-25 r66471) * using platform: x86_64-w64-mingw32 (64-bit) ... * checking R code for possible problems ... NOTE Label: no visible binding for global variable 'battingLabels' Label: no visible binding for global variable 'pitchingLabels' Label: no visible binding for global variable 'fieldingLabels' battingStats: no visible binding for global variable 'Batting' battingStats: no visible global function definition for 'mutate' playerInfo: no visible binding for global variable 'Master' teamInfo: no visible binding for global variable 'Teams' One such function: ## function for accessing variable labels Label <- function(var, labels=rbind(battingLabels, pitchingLabels, fieldingLabels)) { wanted <- which(labels[,1]==var) if (length(wanted)) labels[wanted[1],2] else var } -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. & Chair, Quantitative Methods York University Voice: 416 736-2100 x66249 Fax: 416 736-5814 4700 Keele Street Web:http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
>>>>> Michael Friendly <[hidden email]>
>>>>> on Tue, 26 Aug 2014 17:58:34 -0400 writes: > I'm updating the Lahman package of baseball statistics to the 2013 > release. In addition to > the main data sets, the package also contains several convenience > functions that make use > of these data sets. These now trigger the notes below from R CMD check > run with > Win builder, R-devel. How can I avoid these? > * using R Under development (unstable) (2014-08-25 r66471) > * using platform: x86_64-w64-mingw32 (64-bit) > ... > * checking R code for possible problems ... NOTE > Label: no visible binding for global variable 'battingLabels' > Label: no visible binding for global variable 'pitchingLabels' > Label: no visible binding for global variable 'fieldingLabels' > battingStats: no visible binding for global variable 'Batting' > battingStats: no visible global function definition for 'mutate' > playerInfo: no visible binding for global variable 'Master' > teamInfo: no visible binding for global variable 'Teams' > One such function: > ## function for accessing variable labels > Label <- function(var, labels=rbind(battingLabels, pitchingLabels, > fieldingLabels)) { > wanted <- which(labels[,1]==var) > if (length(wanted)) labels[wanted[1],2] else var > } and you are using the data sets you mentioned before, (and the checking has been changed recently here). This is a bit subtle: Your data sets are part of your package (thanks to the default lazyData), but *not* part of the namespace of your package. Now, the reasoning goes as following: if someone uses a function from your package, say Label() above, by Lahman::Label(..) and your package has not been attached to the search path, your user will get an error, as the datasets are not found by Label(). If you consider something like Lahman::Label(..) for a bit and the emphasis we put on R functions being the primary entities, you can understand the current, i.e. new, R CMD check warnings. I see the following two options for you: 1) export all these data sets from your NAMESPACE For this (I thinK), you must define them in Lahman/R/ possibly via a Lahman/R/sysdata.rda 2) rewrite your functions such that ensure the data sets are loaded when they are used. "2)" actually works by adding stopifnot(require(Lahman, quietly=TRUE)) as first line in Label() and other such functions. It works in the sense that Lahman::Label("yearID") will work even when Lahman is not in the search path, but R-devel CMD check will still give the same NOTE, though you can argue that that note is actally a "false positive". Not sure about another elegant way to make "2)" work, apart from using data() on each of the datasets inside the function. As I haven't tried it, that may *still* give a (false) NOTE.. This is a somewhat interesting problem, and I wonder if everyone else has solved it with '1)' rather than a version of '2)'. Martin ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
The change would seem to be this
\item \command{R CMD check} now by default checks code usage directly on the package namespace without loading and attaching the package and its suggests and enhances. and perhaps the remedies could be stated more clearly? Putting the data objects in the namespace would seem the obvious thing to do. One oddity is that they are sort of in there already: > Lahman::battingLabels variable label 1 playerID Player ID code 2 yearID Year 3 stint Player's stint 4 teamID Team 5 lgID League 6 G Games 7 G_batting Games as batter 8 AB At Bats 9 R Runs 10 H Hits 11 X2B Doubles 12 X3B Triples 13 HR Homeruns 14 RBI Runs Batted In 15 SB Stolen Bases 16 CS Caught Stealing 17 BB Base on Balls 18 SO Strikeouts 19 IBB Intentional walks 20 HBP Hit by pitch 21 SH Sacrifice hits 22 SF Sacrifice flies 23 GIDP Grounded into double plays 24 G_Old Old version of games (deprecated) But then again, actually not: > Lahman::Label() Error in rbind(battingLabels, pitchingLabels, fieldingLabels) : object 'battingLabels' not found This is documented (:: can access datasets made available by lazy-loading), but it sort of suggests that we might want to improve consistency in this area. -pd On 27 Aug 2014, at 11:24 , Martin Maechler <[hidden email]> wrote: >>>>>> Michael Friendly <[hidden email]> >>>>>> on Tue, 26 Aug 2014 17:58:34 -0400 writes: > >> I'm updating the Lahman package of baseball statistics to the 2013 >> release. In addition to >> the main data sets, the package also contains several convenience >> functions that make use >> of these data sets. These now trigger the notes below from R CMD check >> run with >> Win builder, R-devel. How can I avoid these? > >> * using R Under development (unstable) (2014-08-25 r66471) >> * using platform: x86_64-w64-mingw32 (64-bit) >> ... >> * checking R code for possible problems ... NOTE >> Label: no visible binding for global variable 'battingLabels' >> Label: no visible binding for global variable 'pitchingLabels' >> Label: no visible binding for global variable 'fieldingLabels' >> battingStats: no visible binding for global variable 'Batting' >> battingStats: no visible global function definition for 'mutate' >> playerInfo: no visible binding for global variable 'Master' >> teamInfo: no visible binding for global variable 'Teams' > >> One such function: > >> ## function for accessing variable labels > >> Label <- function(var, labels=rbind(battingLabels, pitchingLabels, >> fieldingLabels)) { >> wanted <- which(labels[,1]==var) >> if (length(wanted)) labels[wanted[1],2] else var >> } > > and you are using the data sets you mentioned before, > (and the checking has been changed recently here). > > This is a bit subtle: > Your data sets are part of your package (thanks to the default > lazyData), but *not* part of the namespace of your package. > Now, the reasoning goes as following: if someone uses a function > from your package, say Label() above, > by > Lahman::Label(..) > and your package has not been attached to the search path, > your user will get an error, as the datasets are not found by > Label(). > > If you consider something like Lahman::Label(..) > for a bit and the emphasis we put on R functions being the > primary entities, you can understand the current, i.e. new, > R CMD check warnings. > > I see the following two options for you: > > 1) export all these data sets from your NAMESPACE > For this (I thinK), you must define them in Lahman/R/ possibly via a > Lahman/R/sysdata.rda > > 2) rewrite your functions such that ensure the data sets are > loaded when they are used. > > > "2)" actually works by adding > stopifnot(require(Lahman, quietly=TRUE)) > as first line in Label() and other such functions. > > It works in the sense that Lahman::Label("yearID") will > work even when Lahman is not in the search path, > but R-devel CMD check will still give the same NOTE, > though you can argue that that note is actally a "false positive". > > Not sure about another elegant way to make "2)" work, apart from > using data() on each of the datasets inside the > function. As I haven't tried it, that may *still* give a > (false) NOTE.. > > This is a somewhat interesting problem, and I wonder if everyone > else has solved it with '1)' rather than a version of '2)'. > > Martin > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: [hidden email] Priv: [hidden email] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
I have had similar notes, but in cases where the dataset was created
internally by a function: * checking R code for possible problems ... NOTE vm_diagnostics: no visible binding for global variable 'Median' vm_diagnostics: no visible binding for global variable 'Index' vm_diagnostics: no visible binding for global variable 'LL' vm_diagnostics: no visible binding for global variable 'UL' and the relevant code is here (the function creates a dataset called 'sigma' and always names the variables Index, Median, LL, UL): p.sigma <- ggplot(sigma, aes(Index, Median, ymin = LL, ymax = UL)) + geom_pointrange() + labs(y = "Median + 95% CI for Sigma") + theme_classic() is the appropriate way to remove warnings in this case also to have the function attach the data it internally created? On Wed, Aug 27, 2014 at 8:09 PM, peter dalgaard <[hidden email]> wrote: > The change would seem to be this > > \item \command{R CMD check} now by default checks code usage > directly on the package namespace without loading and attaching > the package and its suggests and enhances. > > and perhaps the remedies could be stated more clearly? > > Putting the data objects in the namespace would seem the obvious thing to > do. > > One oddity is that they are sort of in there already: > > > Lahman::battingLabels > variable label > 1 playerID Player ID code > 2 yearID Year > 3 stint Player's stint > 4 teamID Team > 5 lgID League > 6 G Games > 7 G_batting Games as batter > 8 AB At Bats > 9 R Runs > 10 H Hits > 11 X2B Doubles > 12 X3B Triples > 13 HR Homeruns > 14 RBI Runs Batted In > 15 SB Stolen Bases > 16 CS Caught Stealing > 17 BB Base on Balls > 18 SO Strikeouts > 19 IBB Intentional walks > 20 HBP Hit by pitch > 21 SH Sacrifice hits > 22 SF Sacrifice flies > 23 GIDP Grounded into double plays > 24 G_Old Old version of games (deprecated) > > But then again, actually not: > > > Lahman::Label() > Error in rbind(battingLabels, pitchingLabels, fieldingLabels) : > object 'battingLabels' not found > > This is documented (:: can access datasets made available by > lazy-loading), but it sort of suggests that we might want to improve > consistency in this area. > > -pd > > > On 27 Aug 2014, at 11:24 , Martin Maechler <[hidden email]> > wrote: > > >>>>>> Michael Friendly <[hidden email]> > >>>>>> on Tue, 26 Aug 2014 17:58:34 -0400 writes: > > > >> I'm updating the Lahman package of baseball statistics to the 2013 > >> release. In addition to > >> the main data sets, the package also contains several convenience > >> functions that make use > >> of these data sets. These now trigger the notes below from R CMD check > >> run with > >> Win builder, R-devel. How can I avoid these? > > > >> * using R Under development (unstable) (2014-08-25 r66471) > >> * using platform: x86_64-w64-mingw32 (64-bit) > >> ... > >> * checking R code for possible problems ... NOTE > >> Label: no visible binding for global variable 'battingLabels' > >> Label: no visible binding for global variable 'pitchingLabels' > >> Label: no visible binding for global variable 'fieldingLabels' > >> battingStats: no visible binding for global variable 'Batting' > >> battingStats: no visible global function definition for 'mutate' > >> playerInfo: no visible binding for global variable 'Master' > >> teamInfo: no visible binding for global variable 'Teams' > > > >> One such function: > > > >> ## function for accessing variable labels > > > >> Label <- function(var, labels=rbind(battingLabels, pitchingLabels, > >> fieldingLabels)) { > >> wanted <- which(labels[,1]==var) > >> if (length(wanted)) labels[wanted[1],2] else var > >> } > > > > and you are using the data sets you mentioned before, > > (and the checking has been changed recently here). > > > > This is a bit subtle: > > Your data sets are part of your package (thanks to the default > > lazyData), but *not* part of the namespace of your package. > > Now, the reasoning goes as following: if someone uses a function > > from your package, say Label() above, > > by > > Lahman::Label(..) > > and your package has not been attached to the search path, > > your user will get an error, as the datasets are not found by > > Label(). > > > > If you consider something like Lahman::Label(..) > > for a bit and the emphasis we put on R functions being the > > primary entities, you can understand the current, i.e. new, > > R CMD check warnings. > > > > I see the following two options for you: > > > > 1) export all these data sets from your NAMESPACE > > For this (I thinK), you must define them in Lahman/R/ possibly via a > > Lahman/R/sysdata.rda > > > > 2) rewrite your functions such that ensure the data sets are > > loaded when they are used. > > > > > > "2)" actually works by adding > > stopifnot(require(Lahman, quietly=TRUE)) > > as first line in Label() and other such functions. > > > > It works in the sense that Lahman::Label("yearID") will > > work even when Lahman is not in the search path, > > but R-devel CMD check will still give the same NOTE, > > though you can argue that that note is actally a "false positive". > > > > Not sure about another elegant way to make "2)" work, apart from > > using data() on each of the datasets inside the > > function. As I haven't tried it, that may *still* give a > > (false) NOTE.. > > > > This is a somewhat interesting problem, and I wonder if everyone > > else has solved it with '1)' rather than a version of '2)'. > > > > Martin > > > > ______________________________________________ > > [hidden email] mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: [hidden email] Priv: [hidden email] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > -- Joshua F. Wiley Ph.D. Student, UCLA Department of Psychology http://joshuawiley.com/ Senior Analyst, Elkhart Group Ltd. http://elkhartgroup.com Office: 260.673.5518 [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Martin Maechler
Re solution 2, the following is in the function tabFarsDead()
the latest (0.55) version of gamclass: data('FARS', package='gamclass', envir=environment()) FARS <- get("FARS", envir=environment()) The second statement is, strictly, redundant, but it makes the syntax checker happy. Another possibility might be: FARS <- NULL data('FARS', package='gamclass', envir=environment()) I do not know whether this passes. An FAQ that offers preferred solutions to such chestnuts, or a web page, or a blog, would seem to me useful. John Maindonald email: [hidden email]<mailto:[hidden email]> phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Mathematics & Its Applications, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200. On 27 Aug 2014, at 20:00, <[hidden email]<mailto:[hidden email]>> <[hidden email]<mailto:[hidden email]>> wrote: From: Martin Maechler <[hidden email]<mailto:[hidden email]>> Subject: Re: [Rd] no visible binding for global variable for data sets in a package Date: 27 August 2014 19:24:36 AEST To: Michael Friendly <[hidden email]<mailto:[hidden email]>> Cc: r-devel <[hidden email]<mailto:[hidden email]>> Reply-To: Martin Maechler <[hidden email]<mailto:[hidden email]>> Michael Friendly <[hidden email]<mailto:[hidden email]>> on Tue, 26 Aug 2014 17:58:34 -0400 writes: I'm updating the Lahman package of baseball statistics to the 2013 release. In addition to the main data sets, the package also contains several convenience functions that make use of these data sets. These now trigger the notes below from R CMD check run with Win builder, R-devel. How can I avoid these? * using R Under development (unstable) (2014-08-25 r66471) * using platform: x86_64-w64-mingw32 (64-bit) ... * checking R code for possible problems ... NOTE Label: no visible binding for global variable 'battingLabels' Label: no visible binding for global variable 'pitchingLabels' Label: no visible binding for global variable 'fieldingLabels' battingStats: no visible binding for global variable 'Batting' battingStats: no visible global function definition for 'mutate' playerInfo: no visible binding for global variable 'Master' teamInfo: no visible binding for global variable 'Teams' One such function: ## function for accessing variable labels Label <- function(var, labels=rbind(battingLabels, pitchingLabels, fieldingLabels)) { wanted <- which(labels[,1]==var) if (length(wanted)) labels[wanted[1],2] else var } and you are using the data sets you mentioned before, (and the checking has been changed recently here). This is a bit subtle: Your data sets are part of your package (thanks to the default lazyData), but *not* part of the namespace of your package. Now, the reasoning goes as following: if someone uses a function from your package, say Label() above, by Lahman::Label(..) and your package has not been attached to the search path, your user will get an error, as the datasets are not found by Label(). If you consider something like Lahman::Label(..) for a bit and the emphasis we put on R functions being the primary entities, you can understand the current, i.e. new, R CMD check warnings. I see the following two options for you: 1) export all these data sets from your NAMESPACE For this (I thinK), you must define them in Lahman/R/ possibly via a Lahman/R/sysdata.rda 2) rewrite your functions such that ensure the data sets are loaded when they are used. "2)" actually works by adding stopifnot(require(Lahman, quietly=TRUE)) as first line in Label() and other such functions. It works in the sense that Lahman::Label("yearID") will work even when Lahman is not in the search path, but R-devel CMD check will still give the same NOTE, though you can argue that that note is actally a "false positive". Not sure about another elegant way to make "2)" work, apart from using data() on each of the datasets inside the function. As I haven't tried it, that may *still* give a (false) NOTE.. This is a somewhat interesting problem, and I wonder if everyone else has solved it with '1)' rather than a version of '2)'. Martin [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Martin Maechler
On 8/27/2014 5:24 AM, Martin Maechler wrote:
>>>>>> Michael Friendly <[hidden email]> >>>>>> on Tue, 26 Aug 2014 17:58:34 -0400 writes: > > I'm updating the Lahman package of baseball statistics to the 2013 > > release. In addition to > > the main data sets, the package also contains several convenience > > functions that make use > > of these data sets. These now trigger the notes below from R CMD check > > run with > > Win builder, R-devel. How can I avoid these? > > > * using R Under development (unstable) (2014-08-25 r66471) > > * using platform: x86_64-w64-mingw32 (64-bit) > > ... > > * checking R code for possible problems ... NOTE > > Label: no visible binding for global variable 'battingLabels' > > Label: no visible binding for global variable 'pitchingLabels' > > Label: no visible binding for global variable 'fieldingLabels' > > battingStats: no visible binding for global variable 'Batting' > > battingStats: no visible global function definition for 'mutate' > > playerInfo: no visible binding for global variable 'Master' > > teamInfo: no visible binding for global variable 'Teams' > > > One such function: > > > ## function for accessing variable labels > > > Label <- function(var, labels=rbind(battingLabels, pitchingLabels, > > fieldingLabels)) { > > wanted <- which(labels[,1]==var) > > if (length(wanted)) labels[wanted[1],2] else var > > } > > and you are using the data sets you mentioned before, > (and the checking has been changed recently here). > > This is a bit subtle: > Your data sets are part of your package (thanks to the default > lazyData), but *not* part of the namespace of your package. > Now, the reasoning goes as following: if someone uses a function > from your package, say Label() above, > by > Lahman::Label(..) > and your package has not been attached to the search path, > your user will get an error, as the datasets are not found by > Label(). > > If you consider something like Lahman::Label(..) > for a bit and the emphasis we put on R functions being the > primary entities, you can understand the current, i.e. new, > R CMD check warnings. > > I see the following two options for you: > > 1) export all these data sets from your NAMESPACE > For this (I thinK), you must define them in Lahman/R/ possibly via a > Lahman/R/sysdata.rda Not sure I quite understand how this would work. My NAMESPACE currently exports the few functions in this package: # all the rest is data export(battingStats, playerInfo,teamInfo, Label ) Do you mean to simply add all the data sets ('globals') that are referred to in these functions? # all the rest is data export(battingStats, playerInfo,teamInfo, Label, battingLabels, pitchingLabels, fieldingLabels, Batting, Master, Teams ) That seems a bit odd. Can you actually export data? Maybe there is a need for a separate NAMESPACE declaration, that might be called either of exportdata() globaldata() > > 2) rewrite your functions such that ensure the data sets are > loaded when they are used. > > > "2)" actually works by adding > stopifnot(require(Lahman, quietly=TRUE)) > as first line in Label() and other such functions. > > It works in the sense that Lahman::Label("yearID") will > work even when Lahman is not in the search path, > but R-devel CMD check will still give the same NOTE, > though you can argue that that note is actally a "false positive". Label <- function(var, labels) { stopifnot(require(Lahman, quietly=TRUE)) if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) wanted <- which(labels[,1]==var) if (length(wanted)) labels[wanted[1],2] else var } And this would be version 2, using data(): Label <- function(var, labels) { stopifnot(require(Lahman, quietly=TRUE)) if(missing(labels)) { data(battingLabels); data(pitchingLabels); data(fieldingLabels) labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) } wanted <- which(labels[,1]==var) if (length(wanted)) labels[wanted[1],2] else var } > > Not sure about another elegant way to make "2)" work, apart from > using data() on each of the datasets inside the > function. As I haven't tried it, that may *still* give a > (false) NOTE.. > > This is a somewhat interesting problem, and I wonder if everyone > else has solved it with '1)' rather than a version of '2)'. > > Martin > -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. & Chair, Quantitative Methods York University Voice: 416 736-2100 x66249 Fax: 416 736-5814 4700 Keele Street Web:http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
On 8/27/2014 9:29 AM, Michael Friendly wrote:
>> It works in the sense that Lahman::Label("yearID") will >> work even when Lahman is not in the search path, >> but R-devel CMD check will still give the same NOTE, >> though you can argue that that note is actally a "false positive". > So, this would be version 1 of "2)": > > Label <- function(var, labels) { > stopifnot(require(Lahman, quietly=TRUE)) > if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels, > fieldingLabels) > wanted <- which(labels[,1]==var) > if (length(wanted)) labels[wanted[1],2] else var > } > > And this would be version 2, using data(): > > Label <- function(var, labels) { > stopifnot(require(Lahman, quietly=TRUE)) > if(missing(labels)) { > data(battingLabels); data(pitchingLabels); data(fieldingLabels) > labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) > } > wanted <- which(labels[,1]==var) > if (length(wanted)) labels[wanted[1],2] else var > } > > Just to follow up: R-devel likes this less than it does my initial version. I still get no visible binding NOTES, and complaint about using data() in a function: * checking R code for possible problems ... NOTE Label: no visible binding for global variable 'battingLabels' Label: no visible binding for global variable 'pitchingLabels' Label: no visible binding for global variable 'fieldingLabels' battingStats: no visible binding for global variable 'Batting' battingStats: no visible global function definition for 'mutate' playerInfo: no visible binding for global variable 'Master' teamInfo: no visible binding for global variable 'Teams' Found the following calls to data() loading into the global environment: File 'Lahman/R/Label.R': data(battingLabels) data(pitchingLabels) data(fieldingLabels) See section 'Good practice' in '?data'. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
Using data() in a package with lazyloaded data seems like asking for trouble to me. But you're making me curious (and I'm too lazy[*] to set up for rebuilding the package myself):
Did you actually try putting battingLabels & friends in the namespace? What happened? A workaround could be to use rbind(Lahman::battingLabels,....) which runs a bit against the grain for me. I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.) - pd [*] "Burdened with pressing obligations", if you like. On 27 Aug 2014, at 16:07 , Michael Friendly <[hidden email]> wrote: > On 8/27/2014 9:29 AM, Michael Friendly wrote: >>> It works in the sense that Lahman::Label("yearID") will >>> work even when Lahman is not in the search path, >>> but R-devel CMD check will still give the same NOTE, >>> though you can argue that that note is actally a "false positive". >> So, this would be version 1 of "2)": >> >> Label <- function(var, labels) { >> stopifnot(require(Lahman, quietly=TRUE)) >> if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels, >> fieldingLabels) >> wanted <- which(labels[,1]==var) >> if (length(wanted)) labels[wanted[1],2] else var >> } >> >> And this would be version 2, using data(): >> >> Label <- function(var, labels) { >> stopifnot(require(Lahman, quietly=TRUE)) >> if(missing(labels)) { >> data(battingLabels); data(pitchingLabels); data(fieldingLabels) >> labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) >> } >> wanted <- which(labels[,1]==var) >> if (length(wanted)) labels[wanted[1],2] else var >> } >> >> > > Just to follow up: R-devel likes this less than it does my initial version. I still get no visible binding NOTES, and complaint about > using data() in a function: > > * checking R code for possible problems ... NOTE > Label: no visible binding for global variable 'battingLabels' > Label: no visible binding for global variable 'pitchingLabels' > Label: no visible binding for global variable 'fieldingLabels' > battingStats: no visible binding for global variable 'Batting' > battingStats: no visible global function definition for 'mutate' > playerInfo: no visible binding for global variable 'Master' > teamInfo: no visible binding for global variable 'Teams' > > Found the following calls to data() loading into the global environment: > File 'Lahman/R/Label.R': > data(battingLabels) > data(pitchingLabels) > data(fieldingLabels) > See section 'Good practice' in '?data'. > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: [hidden email] Priv: [hidden email] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)
I don't think lazyloaded data are in the same environment as exported functions - getExportedValue() (called by ::) looks first in the "exports" namespace, then in the "lazydata" namespace: function (ns, name) { getInternalExportName <- function(name, ns) { exports <- getNamespaceInfo(ns, "exports") if (exists(name, envir = exports, inherits = FALSE)) get(get(name, envir = exports, inherits = FALSE), envir = ns) else { ld <- getNamespaceInfo(ns, "lazydata") if (exists(name, envir = ld, inherits = FALSE)) get(name, envir = ld, inherits = FALSE) else stop(gettextf("'%s' is not an exported object from 'namespace:%s'", name, getNamespaceName(ns)), call. = FALSE, domain = NA) } } ns <- asNamespace(ns) if (isBaseNamespace(ns)) get(name, envir = ns, inherits = FALSE) else getInternalExportName(name, ns) } (But maybe you just meant the library() and :: behaves as is lazydata and exports were the same thing) Hadley -- http://had.co.nz/ ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Michael Friendly
http://cran.r-project.org/doc/manuals/r-patched/R-exts.html#Suggested-packages
suggests e.g. Lahman::battingLabels, and that does work for lazy-loaded datasets (which is what these appear to be). We have seen a couple of other instances in which this was needed for code within the package. However, in this case I would have thought battingLabels etc were really system data and should be in sysdata.rda. See ?data. As the reference suggests, putting Lahman on the search path does not necessarily work correctly: what if a user (or someone else's package!) puts another package on the search path which masks battingLabels? Assuming you want the version in the package, you have to say so. (If you do not, supply argument 'labels'.) On 27/08/2014 15:07, Michael Friendly wrote: > On 8/27/2014 9:29 AM, Michael Friendly wrote: >>> It works in the sense that Lahman::Label("yearID") will >>> work even when Lahman is not in the search path, >>> but R-devel CMD check will still give the same NOTE, >>> though you can argue that that note is actally a "false positive". >> So, this would be version 1 of "2)": >> >> Label <- function(var, labels) { >> stopifnot(require(Lahman, quietly=TRUE)) >> if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels, >> fieldingLabels) >> wanted <- which(labels[,1]==var) >> if (length(wanted)) labels[wanted[1],2] else var >> } >> >> And this would be version 2, using data(): >> >> Label <- function(var, labels) { >> stopifnot(require(Lahman, quietly=TRUE)) >> if(missing(labels)) { >> data(battingLabels); data(pitchingLabels); data(fieldingLabels) >> labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) >> } >> wanted <- which(labels[,1]==var) >> if (length(wanted)) labels[wanted[1],2] else var >> } >> >> > > Just to follow up: R-devel likes this less than it does my initial > version. I still get no visible binding NOTES, and complaint about > using data() in a function: > > * checking R code for possible problems ... NOTE > Label: no visible binding for global variable 'battingLabels' > Label: no visible binding for global variable 'pitchingLabels' > Label: no visible binding for global variable 'fieldingLabels' > battingStats: no visible binding for global variable 'Batting' > battingStats: no visible global function definition for 'mutate' > playerInfo: no visible binding for global variable 'Master' > teamInfo: no visible binding for global variable 'Teams' > > Found the following calls to data() loading into the global environment: > File 'Lahman/R/Label.R': > data(battingLabels) > data(pitchingLabels) > data(fieldingLabels) > See section 'Good practice' in '?data'. > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Brian D. Ripley, [hidden email] Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Peter Dalgaard-2
On 8/27/2014 10:41 AM, peter dalgaard wrote:
> Using data() in a package with lazyloaded data seems like asking for trouble to me. But you're making me curious (and I'm too lazy[*] to set up for rebuilding the package myself): > > Did you actually try putting battingLabels & friends in the namespace? What happened? Well, I tried this, with NAMESPACE as # all the rest is data, except we try to export the data sets used globally in these functions export(battingStats, playerInfo,teamInfo, Label, battingLabels, pitchingLabels, fieldingLabels, Batting, Master, Teams ) R CMD check was even more unhappy, failing immediately, so the attempted cure was worse than the original disease. * checking whether package 'Lahman' can be installed ...Warning: running command '"C:/R/R-3.0.3/bin/x64/Rcmd.exe" INSTALL -l "C:/eclipse/Lahman2013.Rcheck" --no-html "C:\DOCUME~2\WORKSP~1\LAHMAN~1"' had status 1 ERROR Installation failed. See 'C:/eclipse/Lahman2013.Rcheck/00install.out' for details. 00install.out said: Error in namespaceExport(ns, exports) : undefined exports: battingLabels, pitchingLabels, fieldingLabels, Batting, Master, Teams Error: loading failed Execution halted My conclusions so far are: - These are just NOTEs, so I will ignore them for now. But they will probably trigger the CRAN maintainers to notice them when I resubmit and I will have to plead guilty with an explanation. This makes more useless work for all involved. - Peter Dalgaard noted the change in R-devel, and nobody so far has suggested a working remedy, so a clean solution seems warranted. It mentions 'now by default' -- is there a switch for this? The change would seem to be this \item \command{R CMD check} now by default checks code usage directly on the package namespace without loading and attaching the package and its suggests and enhances. and perhaps the remedies could be stated more clearly? Alternatively, it isn't common, but it is by no means rare for package functions to need to use data sets in the package, so some mechanism to declare these, perhaps in NAMESPACE is needed. > > A workaround could be to use rbind(Lahman::battingLabels,....) which runs a bit against the grain for me. > > I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.) > > - pd > > > [*] "Burdened with pressing obligations", if you like. > > > On 27 Aug 2014, at 16:07 , Michael Friendly <[hidden email]> wrote: > >> On 8/27/2014 9:29 AM, Michael Friendly wrote: >>>> It works in the sense that Lahman::Label("yearID") will >>>> work even when Lahman is not in the search path, >>>> but R-devel CMD check will still give the same NOTE, >>>> though you can argue that that note is actally a "false positive". >>> So, this would be version 1 of "2)": >>> >>> Label <- function(var, labels) { >>> stopifnot(require(Lahman, quietly=TRUE)) >>> if(missing(labels)) labels <- rbind(battingLabels, pitchingLabels, >>> fieldingLabels) >>> wanted <- which(labels[,1]==var) >>> if (length(wanted)) labels[wanted[1],2] else var >>> } >>> >>> And this would be version 2, using data(): >>> >>> Label <- function(var, labels) { >>> stopifnot(require(Lahman, quietly=TRUE)) >>> if(missing(labels)) { >>> data(battingLabels); data(pitchingLabels); data(fieldingLabels) >>> labels <- rbind(battingLabels, pitchingLabels, fieldingLabels) >>> } >>> wanted <- which(labels[,1]==var) >>> if (length(wanted)) labels[wanted[1],2] else var >>> } >>> >>> >> Just to follow up: R-devel likes this less than it does my initial version. I still get no visible binding NOTES, and complaint about >> using data() in a function: >> >> * checking R code for possible problems ... NOTE >> Label: no visible binding for global variable 'battingLabels' >> Label: no visible binding for global variable 'pitchingLabels' >> Label: no visible binding for global variable 'fieldingLabels' >> battingStats: no visible binding for global variable 'Batting' >> battingStats: no visible global function definition for 'mutate' >> playerInfo: no visible binding for global variable 'Master' >> teamInfo: no visible binding for global variable 'Teams' >> >> Found the following calls to data() loading into the global environment: >> File 'Lahman/R/Label.R': >> data(battingLabels) >> data(pitchingLabels) >> data(fieldingLabels) >> See section 'Good practice' in '?data'. >> >> ______________________________________________ >> [hidden email] mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. & Chair, Quantitative Methods York University Voice: 416 736-2100 x66249 Fax: 416 736-5814 4700 Keele Street Web:http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
On Wed, Aug 27, 2014 at 12:51 PM, Michael Friendly <[hidden email]> wrote:
> On 8/27/2014 10:41 AM, peter dalgaard wrote: >> >> Using data() in a package with lazyloaded data seems like asking for >> trouble to me. But you're making me curious (and I'm too lazy[*] to set up >> for rebuilding the package myself): >> >> Did you actually try putting battingLabels & friends in the namespace? >> What happened? > > Well, I tried this, with NAMESPACE as > > # all the rest is data, except we try to export the data sets used globally > in these functions > > export(battingStats, > playerInfo,teamInfo, > Label, > battingLabels, pitchingLabels, fieldingLabels, > Batting, Master, Teams > ) > > R CMD check was even more unhappy, failing immediately, so the attempted > cure was worse than the original disease. If you want to export your datasets, you'll need to put them in R/sysdata.rda. But I think Brian suggested the right fix: use Lahmann::Batting instead of Batting. (And leave everything else as is) Hadley -- http://had.co.nz/ ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Michael Friendly
On 27 Aug 2014, at 19:51 , Michael Friendly <[hidden email]> wrote: > - Peter Dalgaard noted the change in R-devel, and nobody so far has suggested a working remedy, so a clean solution > seems warranted. Actually, both Peter Dalgaard and Brian Ripley suggested Lahman:battingLabels, and, with the current semantics at least, it does seem to be recommendable to namespace-qualify any such datasets, lest you be a potential victim of variable masking. (See also my reply to Hadley, once I get it finalized.) -pd -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: [hidden email] Priv: [hidden email] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by hadley wickham
On 27 Aug 2014, at 16:48 , Hadley Wickham <[hidden email]> wrote: >> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.) > > I don't think lazyloaded data are in the same environment as exported > functions - getExportedValue() (called by ::) looks first in the > "exports" namespace, then in the "lazydata" namespace: > > function (ns, name) > { > getInternalExportName <- function(name, ns) { > exports <- getNamespaceInfo(ns, "exports") > if (exists(name, envir = exports, inherits = FALSE)) > get(get(name, envir = exports, inherits = FALSE), > envir = ns) > else { > ld <- getNamespaceInfo(ns, "lazydata") > if (exists(name, envir = ld, inherits = FALSE)) > get(name, envir = ld, inherits = FALSE) > else stop(gettextf("'%s' is not an exported object from > 'namespace:%s'", > name, getNamespaceName(ns)), call. = FALSE, domain = NA) > } > } > ns <- asNamespace(ns) > if (isBaseNamespace(ns)) > get(name, envir = ns, inherits = FALSE) > else getInternalExportName(name, ns) > } > > > (But maybe you just meant the library() and :: behaves as is lazydata > and exports were the same thing) > > Hadley > > -- > http://had.co.nz/ I meant that a) :: gives results as if the data was in the namespace b) if you do > library(MASS) > ls("package:MASS") [1] "abbey" "accdeaths" "addterm" [4] "Aids2" "Animals" "anorexia" [7] "area" "as.fractions" "bacteria" [10] "bandwidth.nrd" "bcv" "beav1" .... data and functions get put together in the same environment (however, this is not the namespace environment, see later). What puzzles me is why the distinction between lazydata and exports was there to begin with. The implication of the current setup is clearly that pkg::foo() cannot access package::dat by referring to `dat` whereas it can do so if invoked as library(pkg); foo(). We also have > get("accdeaths", environment(MASS::addterm)) Error in get("accdeaths", environment(MASS::addterm)) : object 'accdeaths' not found > library(MASS) > get("accdeaths", environment(MASS::addterm)) Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov 1973 9007 8106 8928 9137 10017 10826 11317 10744 9713 9938 9161 1974 7750 6981 8038 8422 8714 9512 10120 9823 8743 9129 8710 ... which confused me at first, but it actually just means that "accdeaths" is found on the search path in the latter case. This strikes me as somewhat dangerous: If a package uses one of its own datasets, it can be masked by a later attach() or the global env. (I suspect that someone already explained all this a while back, but I just wasn't listening at the time...) -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: [hidden email] Priv: [hidden email] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
>>>>> peter dalgaard <[hidden email]>
>>>>> on Wed, 27 Aug 2014 21:09:47 +0200 writes: > On 27 Aug 2014, at 16:48 , Hadley Wickham <[hidden email]> wrote: >>> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.) >> >> I don't think lazyloaded data are in the same environment as exported >> functions - getExportedValue() (called by ::) looks first in the >> "exports" namespace, then in the "lazydata" namespace: >> >> function (ns, name) >> { >> getInternalExportName <- function(name, ns) { >> exports <- getNamespaceInfo(ns, "exports") >> if (exists(name, envir = exports, inherits = FALSE)) >> get(get(name, envir = exports, inherits = FALSE), >> envir = ns) >> else { >> ld <- getNamespaceInfo(ns, "lazydata") >> if (exists(name, envir = ld, inherits = FALSE)) >> get(name, envir = ld, inherits = FALSE) >> else stop(gettextf("'%s' is not an exported object from >> 'namespace:%s'", >> name, getNamespaceName(ns)), call. = FALSE, domain = NA) >> } >> } >> ns <- asNamespace(ns) >> if (isBaseNamespace(ns)) >> get(name, envir = ns, inherits = FALSE) >> else getInternalExportName(name, ns) >> } >> >> >> (But maybe you just meant the library() and :: behaves as is lazydata >> and exports were the same thing) >> >> Hadley >> >> -- >> http://had.co.nz/ > I meant that > a) :: gives results as if the data was in the namespace > b) if you do >> library(MASS) >> ls("package:MASS") > [1] "abbey" "accdeaths" "addterm" > [4] "Aids2" "Animals" "anorexia" > [7] "area" "as.fractions" "bacteria" > [10] "bandwidth.nrd" "bcv" "beav1" > .... > data and functions get put together in the same environment (however, this is not the namespace environment, see later). > What puzzles me is why the distinction between lazydata and exports was there to begin with. > The implication of the current setup is clearly that pkg::foo() cannot access package::dat by referring to `dat` whereas it can do so if invoked as library(pkg); foo(). > We also have >> get("accdeaths", environment(MASS::addterm)) > Error in get("accdeaths", environment(MASS::addterm)) : > object 'accdeaths' not found >> library(MASS) >> get("accdeaths", environment(MASS::addterm)) > Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov > 1973 9007 8106 8928 9137 10017 10826 11317 10744 9713 9938 9161 > 1974 7750 6981 8038 8422 8714 9512 10120 9823 8743 9129 8710 > ... > which confused me at first, but it actually just means that "accdeaths" is found on the search path in the latter case. This strikes me as somewhat dangerous: If a package uses one of its own datasets, it can be masked by a later attach() or the global env. > (I suspect that someone already explained all this a while back, but I just wasn't listening at the time...) I did say (as first reply in this thread) that the lazyloaded datasets where in the package environment, but not in the package's *namespace* environment -- which is somewhat exceptional, as otherwise, the namespace envir is typically a superset of the package envir, and I did mention the solution with sysdata.rda + NAMESPACE which Brian and Hadley later mentioned, too. However Michael did not get it well enough (because you need to read other material), and I do think that is not an ideal solution for data sets that have help pages, etc; notably because creating sysdata.rda is not at all part of the package sources, and hence harder for maintenance and "open source transparency". The one additional important in the thread was the special semantic of '::' which here returns something for <pkg>::<obj> which is *not* part of the <pkg>'s namespace, whereas the use of '::' does suggest to Joe Average to be working with namespace (as opposed to package) contents. I agree with your suggestion, Peter, that this looks more confusing than it should, and ideally, we'd find a better setup. OTOH, if we additionally allowed something like exportData(), we would additionally get data that is both in the package and namespace environments and other (not exported) data that is only in the package, so add room for even more confusion. Martin ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
|
In reply to this post by Michael Friendly
Here is one rationale for the change, which was useful for my understanding
This arises in the survival package in the survexp function: survexp <- function(formula, data, weights, subset, na.action,.... ratetable=survexp.us) The argument has been changed to survival::survexp.us, soon to be uploaded to CRAN. Assume that a user has an object named "survexp.us" in their working data, and calls survexp without providing a ratetable arguement. Should they get the package default or their own object? I'd vote for the first, but the current behavior is the latter. This was presented to me as "fixing a bug that has long been present, but we never noticed before." Now, if the user adds "ratetable=survexp.us" to the call they should get their own object. I've always hated computer systems that ignore explicit direction, assuming they know more than me. Terry Therneau ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
| Free forum by Nabble | Edit this page |
