# RFC: What should ?foo do?

## RFC: What should ?foo do?

Currently ?foo does help("foo"), which looks for a man page with alias foo. If foo happens to be a function call, it will do a bit more, so

    ?mean(something)

will find the mean method for something if mean happens to be an S4 generic. There are also the type?foo variations, e.g. methods?foo, or package?foo.

I think these are all too limited.

The easiest search should be the most permissive. Users should need to do extra work to limit their search to man pages, with exact matches, as ? does.

We don't currently have a general purpose search for "foo", or something like it. We come close with RSiteSearch, and so possibly ?foo should mean RSiteSearch("foo"), but there are problems with that: it can't limit itself to the current version of R, and it doesn't work when you're offline (or when search.r-project.org is down). We also have help.search("foo"), but it is too limited. I'd like to have a local search that looks through the man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific to the current R installation, and I think ? should be attached to that search.

Comments, please.

Duncan Murdoch
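As a rough illustration of the exact-match behaviour described above, the sketch below assumes a standard R installation with the utils package available. It relies on help() handing back its matches as an object that is only displayed when printed, so assigning the result lets us inspect it without opening a page. The topic name no_such_topic_zzz is made up for the example.

```r
# Exact-match lookup: what ?mean does today, via help("mean").
hit <- utils::help("mean")

# A made-up topic with no man page: the exact-match lookup finds nothing.
miss <- utils::help("no_such_topic_zzz")

length(hit) > 0    # TRUE: at least one man page has the alias "mean"
length(miss) == 0  # TRUE: no alias matches, so ? has nothing to show
```

The second case is the one the proposal is about: an exact alias match either succeeds or leaves the user with nothing to follow up on.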
## Re: RFC: What should ?foo do?

Duncan Murdoch wrote:
> The easiest search should be the most permissive. Users should need to do extra work to limit their search to man pages, with exact matches, as ? does.
>
> [...]
>
> I'd like to have a local search that looks through the man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific to the current R installation, and I think ? should be attached to that search.

Duncan,

I agree in principle with the points that you raise. I suspect that, at least in part, it might assist new users with some of the issues that were raised in the latest incarnation of the 'we need better documentation' thread on r-help.

I am not convinced that ?foo should do this, however. help("foo") conceptually seems predicated upon the notion that a user is looking for a reference/help page for a specific function or descriptor called 'foo'. The user knows the name of the function or descriptor and should not have to wait for a search function to locate it or conceptually related terms. If the user has a large number of CRAN packages installed, such a search can take a rather long time. That's an issue, for example, with help.search().

That being said, and being a firm believer in incrementalism, perhaps the first step should be to create a new function, called esearch() [as in extended search] or doc.search() [as in documentation search] or even search.all(). This new function would facilitate searching all of the local objects that you list and perhaps others. It would by default be uber-inclusive of all categories of such objects. It would support functionality along the lines of help.search() in allowing for the use of regex and fuzzy matching via grep()/agrep().

The downside of this approach is that we would add yet another search function to the list of those already available, each of which searches a focused subset of the potential targets for assistance, whether local or online. Thus, it would require some level of understanding of the general structure of the myriad of local and online resources of R related assistance.

Perhaps ?help could be augmented a bit in elucidating some of these issues. The See Also there does not reference apropos(), for example, and it might be worthwhile adding something along the lines of the bullets in the "Do your homework before posting" section in the Posting Guide.

Thus ?help can become something of a "first place to look - local centralized help resource" for users to identify the tiered help resources that are available and to also provide a framework for how to use those resources. One could also have links to the online pages for R News, R Books, the R Wiki, the R Graph Gallery, Contributed Documentation, Bioconductor and Other Documentation, so that users become more aware of help resources beyond the documentation installed with R by default.

A longer term plan could be to look to consolidate some of these functions into a single help/search function, perhaps circa R version 3.0.0. That would enable some time for thoughtful consideration and feedback.

## Re: RFC: What should ?foo do?

> I am not convinced that ?foo should do this however. help("foo") conceptually seems predicated upon the notion that a user is looking for a reference/help page for a specific function or descriptor called 'foo'. The user knows the name of the function or descriptor and should not have to wait for a search function to locate it or conceptually related terms.

Is that true? That does limit the usefulness of ? as it implies that you must already know the function that you need.

> If the user has a large number of CRAN packages installed, such a search can take a rather long time. That's an issue for example with help.search().

But that's just a problem with the current implementation. Better indexing could make full text search of all documentation practically instantaneous. This is one argument for a centralised documentation web site - such indices are much easier to set up in a modern web development environment. I could imagine this being an eventual feature of crantastic.org, but it relies on some tool to turn Rd files into a form more easily parsed with off-the-shelf tools (ideally xml).

Hadley

## Re: RFC: What should ?foo do?

>> If the user has a large number of CRAN packages installed, such a search can take a rather long time. That's an issue for example with help.search().
>
> But that's just a problem with the current implementation. Better indexing could make full text search of all documentation practically instantaneous. This is one argument for a centralised documentation web site - such indices are much easier to set up in a modern web development environment.
The search is a lot faster in 2.7.x, but is limited by disc speed (and hence is relatively slow on Windows -- and I have a box which has both Linux and Windows on, so I've tested this on identical hardware). R is a dynamic environment which can change libraries (and their order), add or remove packages ....

The text search is pretty close to instantaneous (try it a second time) -- the time is taken building the database for the first use.

Yes, we could do this differently, but some credit for the work already done would be an encouragement to continue to improve.

Brian D. Ripley, Professor of Applied Statistics, University of Oxford
http://www.stats.ox.ac.uk/~ripley/

## Re: RFC: What should ?foo do?

> I am not convinced that ?foo should do this however. help("foo") conceptually seems predicated upon the notion that a user is looking for a reference/help page for a specific function or descriptor called 'foo'. The user knows the name of the function or descriptor ...

With the current definition, that's correct, though man("foo") might be a better match to Unix users' expectations for a function that did that. For a naive user, help("foo") suggests that they're looking for help on "foo".

> ... and should not have to wait for a search function to locate it or conceptually related terms. If the user has a large number of CRAN packages installed, such a search can take a rather long time. That's an issue for example with help.search().

As Brian and Hadley said, that's an implementation issue, already being addressed.

> That being said and being a firm believer in incrementalism, perhaps the first step should be to create a new function, called esearch() [as in extended search] or doc.search() [as in documentation search] or even search.all(). [...] It would support functionality along the lines of help.search() in allowing for the use of regex and fuzzy matching via grep()/agrep().

Definitely there would need to be a new function, with a new name; if we were attaching the name to ? somehow, then it wouldn't matter much what name was used.

I haven't done it, but I suspect we could introduce special behaviour for ??foo very easily. We could even have a whole hierarchy:

    ?foo, ??foo, ???foo, ????foo, ...

> The downside of this approach is that we would add yet another search function to the list of those already available, each of which searches a focused subset of the potential targets for assistance, whether local or online. Thus, it would require some level of understanding of the general structure of the myriad of local and online resources of R related assistance.

Part of the idea behind my suggestion is that it should be somewhat automatic for a new user to learn about the different types of help. One way for this to happen is the current one: expect them to find and read the manuals. The suggestion is to make it easier to find the different types. The risk of this is that exposing a new user to a wide range of different kinds of results would just be confusing.

> Perhaps ?help could be augmented a bit in elucidating some of these issues. [...] Thus ?help can become something of a "first place to look - local centralized help resource" for users to identify the tiered help resources that are available and to also provide a framework for how to use those resources.

Those are probably good ideas, but my guess would be that few users read ?help.

> A longer term plan could be to look to consolidate some of these functions into a single help/search function perhaps circa R version 3.0.0.

As all the recent bug reports show, we don't really get feedback until code is released, so there's not much of an advantage of 3.0.0 (unless we really break the current system) over 2.8.0.

Duncan Murdoch
## Re: RFC: What should ?foo do?

Duncan Murdoch wrote:
> I haven't done it, but I suspect we could introduce special behaviour for ??foo very easily. We could even have a whole hierarchy:
>
> ?foo, ??foo, ???foo, ????foo, ...

Heh, that's rather nice, actually. In words, that could read

    ?foo:    tell me about foo!
    ??foo:   what can you tell me about foo?
    ???foo:  what can you tell me about things like foo?
    ????foo: I don't know what I'm looking for but it might be something related to foo?

You do have to be careful about messing with ?, though. I think many people, including me, would pretty quickly go nuts if ?par suddenly didn't work the way we're used to.

Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
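To make the question-mark hierarchy concrete: R parses ??foo as a call to ? wrapped around another call to ?, so the number of marks can be recovered from the unevaluated expression. The following is only a sketch of that idea; count_marks is a hypothetical helper, not an existing R function, and it says nothing about how ? itself would actually be changed.

```r
# Hypothetical helper (not part of R): count the leading question marks
# in an unevaluated query such as ???foo.
count_marks <- function(expr) {
  depth <- 0L
  # Each unary `?` wraps its operand in a call of length 2.
  while (is.call(expr) && length(expr) == 2L &&
         identical(expr[[1L]], as.name("?"))) {
    depth <- depth + 1L
    expr <- expr[[2L]]
  }
  list(depth = depth, topic = paste(deparse(expr), collapse = ""))
}

query <- parse(text = "???foo")[[1L]]  # parse without evaluating
res <- count_marks(query)
res$depth  # 3
res$topic  # "foo"
```

A dispatcher could then map depth 1 to the exact man-page lookup and larger depths to progressively broader searches, along the lines of the wording suggested above.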
## Re: RFC: What should ?foo do?

Duncan Murdoch wrote:
> The easiest search should be the most permissive. Users should need to do extra work to limit their search to man pages, with exact matches, as ? does.

While I like the idea, I don't really agree with the sentiment above. I think that the easiest search should be the one that you want the result of most often. And at least for me that is the man page for the function, so I can check some detail; and it works pretty well. I use site searches much less frequently and would be happy to type more for them.

> We don't currently have a general purpose search for "foo", or something like it. We come close with RSiteSearch, and so possibly ?foo should mean RSiteSearch("foo"), but there are problems with that: it can't limit itself to the current version of R, and it doesn't work when you're offline (or when search.r-project.org is down.) We also have help.search("foo"), but it is too limited. I'd like to have a local search that looks through the man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific to the current R installation, and I think ? should be attached to that search.

I think that would be very useful (although there will be some decisions on which tool to use to achieve this). But it will also be problematic, as one will get tons of hits for some things, and then selecting the one you really want will be a pain.

I would rather see that be one of the dyadic forms, say

    site?foo

or

    all?foo

One could even imagine refining that for different subsets of the docs you have mentioned:

    help?foo    # only man pages
    guides?foo  # the manuals, R Extensions etc.

and so on.

You did not make a suggestion as to how we would get the equivalent of ?foo now, if a decision to move were taken.

Robert Gentleman, PhD, Program in Computational Biology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
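The dyadic forms suggested above already parse in R as a two-argument call to ?, so the type and topic can be pulled apart without evaluating anything. A minimal sketch follows; split_query and the scopes table are hypothetical names introduced here for illustration, and the scope descriptions simply paraphrase the message above.

```r
# Hypothetical helper (not part of R): split a `type?topic` query into its
# two parts without evaluating it.
split_query <- function(expr) {
  stopifnot(is.call(expr), identical(expr[[1L]], as.name("?")))
  if (length(expr) == 3L) {
    # Dyadic form, e.g. guides?foo
    list(type = deparse(expr[[2L]]), topic = deparse(expr[[3L]]))
  } else {
    # Bare ?foo: treated here as the exact man-page lookup
    list(type = "help", topic = deparse(expr[[2L]]))
  }
}

# Scopes named in the proposal; descriptions are illustrative only.
scopes <- c(
  help   = "man pages only",
  guides = "the manuals, R Extensions etc.",
  all    = "broad search across all documentation",
  site   = "broad search across all documentation"
)

q <- split_query(parse(text = "guides?foo")[[1L]])
q$topic            # "foo"
scopes[[q$type]]   # "the manuals, R Extensions etc."
```

Keeping the bare form mapped to the exact lookup would preserve today's ?foo while leaving the broader scopes one extra word away.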
## Re: RFC: What should ?foo do?

Would it be possible to build per-package help databases during the install (or build?) process? That way all that work could be shifted to a one-off compilation.

> Yes, we could do this differently, but some credit for the work already done would be an encouragement to continue to improve.

Sorry! Writing a good text search engine is hard and I didn't mean to disparage the existing work of R-core. The current system works great for the majority of uses.

Hadley
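The idea of shifting the expensive step to install time can be sketched with a toy inverted index that is built once, stored on disk, and only read back at search time. Everything here is an assumption made for illustration: the docs vector is made-up text rather than real Rd content, build_index is a hypothetical helper, and this is not how help.search() stores its database.

```r
# Toy documentation set: names are topics, values stand in for page text.
docs <- c(
  mean   = "Arithmetic mean of a numeric vector",
  median = "Median value of a numeric vector",
  par    = "Set or query graphical parameters"
)

# Build an inverted index (word -> topics). This is the slow, one-off step.
build_index <- function(docs) {
  index <- list()
  for (topic in names(docs)) {
    words <- unique(strsplit(tolower(docs[[topic]]), "[^a-z]+")[[1L]])
    for (w in words[nzchar(words)]) {
      index[[w]] <- c(index[[w]], topic)
    }
  }
  index
}

# "Install time": build once and store on disk.
index_file <- tempfile(fileext = ".rds")
saveRDS(build_index(docs), index_file)

# "Search time": load the stored index and look a word up.
index <- readRDS(index_file)
index[["vector"]]  # "mean" "median"
```

With one such file per package, adding or removing a package would only touch that package's index, which fits the point made earlier in the thread that the set of installed packages can change at any time.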
## Re: RFC: What should ?foo do?

On Fri, Apr 25, 2008 at 7:46 AM, Peter Dalgaard wrote:
> Duncan Murdoch wrote:
>> I haven't done it, but I suspect we could introduce special behaviour for ??foo very easily. We could even have a whole hierarchy:
>>
>> ?foo, ??foo, ???foo, ????foo, ...
>
> Heh, that's rather nice, actually. In words, that could read
>
> ?foo: tell me about foo!
>
> ??foo: what can you tell me about foo?
>
> ???foo: what can you tell me about things like foo?
>
> ????foo: I don't know what I'm looking for but it might be something related foo?

I like the idea, but why not do it automatically and then display the results on a single page? (i.e. list results in order of specificity)

Hadley
## Re: RFC: What should ?foo do?

 I'd be interested to know how many R users are aware of the dyadic form - I suspect it's very very few. Hadley
## Re: RFC: What should ?foo do?

One reason not to do that is that in single-threaded R you are pretty much stuck until it is done. Presumably the more specific search is quicker than the less specific one. And even if we could act on results as soon as they were available, I think a lot of users would wait for the search to stop, so there'd be a perception that it was too slow.

One possible change to ?foo that would not be so painful for old-time users would be to try it under its current meaning first, and only fall back to a more general search if that fails.

Consistent with this idea would be something like the "I feel lucky" search on Google, i.e. ?foo would go immediately to the best match, while ??foo would present a list of possible matches. This is not consistent with current behaviour, where ?foo will present a list if it matches two or more topics, but I think we can always rank one ahead of the other based on their ordering in the search list. I don't know if it will be so easy to rank hits coming from help.search(), or from other searches that don't exist yet: but maybe it doesn't matter. If someone doesn't like what they get from ?foo, they can always try ??foo.

Duncan
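The "current meaning first, broader search only on failure" idea can be sketched with the two existing functions. This is a minimal illustration under stated assumptions, not a proposal for the actual implementation: help_or_search is a hypothetical wrapper, the topic no_such_topic_zzz is made up, and the sketch relies on help() returning an empty result when no man page matches.

```r
# Hypothetical wrapper (not part of R): exact lookup first, broader search second.
help_or_search <- function(topic) {
  stopifnot(is.character(topic), length(topic) == 1L)
  exact <- utils::help(topic)   # what ?foo does today
  if (length(exact) > 0L) {
    return(exact)               # a man page exists: behave exactly as before
  }
  utils::help.search(topic)     # otherwise fall back to a keyword search
}

found    <- help_or_search("par")                # exact match exists
fallback <- help_or_search("no_such_topic_zzz")  # made-up topic, triggers the fallback

class(found)
class(fallback)
```

Because the fallback only runs when the exact lookup finds nothing, a call such as help_or_search("par") never pays the cost of the broader search, which addresses the speed concern raised at the top of this message.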
## Re: RFC: What should ?foo do?

On Apr 25, 2008, at 8:46 AM, Peter Dalgaard wrote:
> Duncan Murdoch wrote:
>> I haven't done it, but I suspect we could introduce special behaviour for ??foo very easily. We could even have a whole hierarchy:
>>
>> ?foo, ??foo, ???foo, ????foo, ...
>
> You do have to be careful about messing with ?, though. I think many people, including me, would pretty quickly go nuts if ?par suddenly didn't work the way we're used to.

I strongly agree with that.

One potential way out could be to offer some extended fall-back in case the man page doesn't exist. (I'm not sure I like that, either, but I could get used to it ;).)

I don't really have a problem with the status quo, and I think if you want proper advanced searches, you should be using (or implementing) them in the GUIs anyway. That is what the new users will be using (and looking for) in the first place. If they have to count the question marks instead, they won't know about it (although I like the idea for advanced users).

Cheers,
Simon
## Re: RFC: What should ?foo do?

On 4/25/2008 10:16 AM, Robert Gentleman wrote:
> While I like the idea, I don't really agree with the sentiment above. I think that the easiest search should be the one that you want the result of most often. And at least for me that is the man page for the function, so I can check some detail; and it works pretty well. I use site searches much less frequently and would be happy to type more for them.

That's true. What's your feeling about what should happen when ?foo fails?

> I would rather see that be one of the dyadic forms, say site?foo or all?foo [...]
>
> You did not make a suggestion as to how we would get the equivalent of ?foo now, if a decision to move were taken.

I didn't say, but I would assume there would be a way to do it, and it shouldn't be hard to invoke. Maybe help?foo as you suggested, or man?foo.

Duncan Murdoch
## Re: RFC: What should ?foo do?

> Consistent with this idea would be something like the "I feel lucky" search on Google, i.e. ?foo would go immediately to the best match, while ??foo would present a list of possible matches. This is not consistent with current behaviour, where ?foo will present a list if it matches two or more topics, but I think we can always rank one ahead of the other based on their ordering in the search list. I don't know if it will be so easy to rank hits coming from help.search(), or from other searches that don't exist yet: but maybe it doesn't matter. If someone doesn't like what they get from ?foo, they can always try ??foo.

That seems like a good compromise to me - it's a metaphor familiar to most people, and corresponds roughly to the current behaviour of help and help.search.

Hadley
## Re: RFC: What should ?foo do?

On 4/25/2008 10:41 AM, Simon Urbanek wrote:
> One potential way out could be to offer some extended fall-back in case the man page doesn't exist. (I'm not sure I like that, either, but I could get used to it ;).)
>
> I don't really have a problem with status quo and I think if you want proper advanced searches, you should be using (or implementing them) in the GUIs anyway. That is what the new users will be using (and looking for) in the first place. If they have to count the question marks instead, they won't know about it (although I like the idea for advanced users).

I'd like to try to have the search common across all platforms, but the GUIs could present it and the results in their own way.

Duncan Murdoch
## Re: RFC: What should ?foo do?

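> It seems logical to me that such a resource be embedded up front in "Intro" with it also being included within the existing help system and referenced in the start up banner message.

That would help if anyone actually read the startup banner. The next time you're in front of an audience of people who have been using R for a couple of weeks, ask them how to cite R.

Hadley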
## Re: RFC: What should ?foo do?

hadley wickham wrote:
>> It seems logical to me that such a resource be embedded up front in "Intro" with it also being included within the existing help system and referenced in the start up banner message.
>
> That would help if anyone actually read the startup banner. The next time you're in front of an audience of people who have been using R for a couple of weeks, ask them how to cite R.

That might be supportive of having the startup banner be nothing more than a version/copyright notice and then leave you at a '>' prompt.

I am open to the notion, however, that some do read it, perhaps less so over time as they just want to get to the prompt and effectively become oblivious to the banner.

Of course, it ('citation()') is also referenced in the main R FAQ...

Marc