RFC: What should ?foo do?

Duncan Murdoch
Currently ?foo does help("foo"), which looks for a man page with alias
foo.  If foo happens to be a function call, it will do a bit more, so

?mean(something)

will find the mean method for something if mean happens to be an S4
generic.  There are also the type?foo variations, e.g. methods?foo, or
package?foo.
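
The forms described above can be inspected without opening a help page, because R parses `?` as an ordinary unary or binary operator. The following is a minimal sketch, assuming a standard R installation; the variable names are illustrative only and are not part of the original post.

```r
# `?` is parsed as a function call, so the forms differ only in shape.
unary     <- quote(?foo)           # one argument: the topic
dyadic    <- quote(methods?foo)    # two arguments: the type, then the topic
call_form <- quote(?mean(x))       # the topic is itself a function call

length(unary)             # `?` plus one argument
length(dyadic)            # `?` plus two arguments
is.call(call_form[[2L]])  # the topic of ?mean(x) is a call, not a name

# help() only displays a page when its result is printed, so assigning the
# result lets a script inspect which man pages matched.
pages <- utils::help("mean")
class(pages)
basename(as.character(pages))
```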

I think these are all too limited.

The easiest search should be the most permissive.  Users should need to
do extra work to limit their search to man pages, with exact matches, as
? does.

We don't currently have a general purpose search for "foo", or something
like it.  We come close with RSiteSearch, and so possibly ?foo should
mean RSiteSearch("foo"), but
there are problems with that: it can't limit itself to the current
version of R, and it doesn't work when you're offline (or when
search.r-project.org is down.)  We also have help.search("foo"), but it
is too limited. I'd like to have a local search that looks through the
man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific
to the current R installation, and I think ? should be attached to that
search.

Comments, please.

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: RFC: What should ?foo do?

Marc Schwartz
Duncan Murdoch wrote:

> Currently ?foo does help("foo"), which looks for a man page with alias
> foo.  If foo happens to be a function call, it will do a bit more, so
>
> ?mean(something)
>
> will find the mean method for something if mean happens to be an S4
> generic.  There are also the type?foo variations, e.g. methods?foo, or
> package?foo.
>
> I think these are all too limited.
>
> The easiest search should be the most permissive.  Users should need to
> do extra work to limit their search to man pages, with exact matches, as
> ? does.
>
> We don't currently have a general purpose search for "foo", or something
> like it.  We come close with RSiteSearch, and so possibly ?foo should
> mean RSiteSearch("foo"), but
> there are problems with that: it can't limit itself to the current
> version of R, and it doesn't work when you're offline (or when
> search.r-project.org is down.)  We also have help.search("foo"), but it
> is too limited. I'd like to have a local search that looks through the
> man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific
> to the current R installation, and I think ? should be attached to that
> search.
>
> Comments, please.
>
> Duncan Murdoch

Duncan,

I agree in principle with the points that you raise. I suspect that at
least in part, it might assist new users with some of the issues that
were raised in the latest incarnation of the 'we need better
documentation' thread on r-help.

I am not convinced that ?foo should do this however. help("foo")
conceptually seems predicated upon the notion that a user is looking for
a reference/help page for a specific function or descriptor called
'foo'. The user knows the name of the function or descriptor and should
not have to wait for a search function to locate it or conceptually
related terms. If the user has a large number of CRAN packages
installed, such a search can take a rather long time. That's an issue
for example with help.search().

That being said and being a firm believer in incrementalism, perhaps the
first step should be to create a new function, called esearch() [as in
extended search] or doc.search() [as in documentation search] or even
search.all(). This new function would facilitate searching all of the
local objects that you list and perhaps others. It would by default be
uber-inclusive of all categories of such objects. It would support
functionality along the lines of help.search() in allowing for the use
of regex and fuzzy matching via grep()/agrep().
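
As a rough illustration of the kind of function suggested here, the sketch below searches a small set of document texts with either a regular expression (`grepl()`) or approximate matching (`agrepl()`). The name `doc_search()`, its arguments, and the sample `docs` vector are hypothetical; nothing like this exists in R under that name, and a real implementation would have to collect the texts from the installed man pages, vignettes, and so on.

```r
# Hypothetical helper: search a named character vector of document texts.
doc_search <- function(pattern, docs, fuzzy = FALSE, ignore.case = TRUE) {
  hit <- if (fuzzy) {
    # approximate matching, tolerant of small typos in the pattern
    agrepl(pattern, docs, ignore.case = ignore.case)
  } else {
    # ordinary regular-expression matching
    grepl(pattern, docs, ignore.case = ignore.case)
  }
  names(docs)[hit]
}

# Made-up sample data standing in for locally installed documentation.
docs <- c(
  "man/lm"            = "Fitting linear models",
  "vignette/intro"    = "Writing package vignettes",
  "DESCRIPTION/stats" = "R statistical functions"
)

doc_search("^fitting", docs)               # regex search
doc_search("vignete", docs)                # misspelling: exact search finds nothing
doc_search("vignete", docs, fuzzy = TRUE)  # fuzzy search tolerates the typo
```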

The downside of this approach is that we would add yet another search
function to the list of those already available, each of which searches
a focused subset of the potential targets for assistance, whether local
or online. Thus, it would require some level of understanding of the
general structure of the myriad of local and online resources of R
related assistance.

Perhaps ?help could be augmented a bit in elucidating some of these
issues. The See Also there does not reference apropos() for example and
it might be worthwhile adding something along the lines of the bullets
in the "Do your homework before posting" section in the Posting Guide.
Thus ?help can become something of a "first place to look - local
centralized help resource" for users to identify the tiered help
resources that are available and to also provide a framework for how to
use those resources. One could also have links to the online pages for R
News, R Books, the R Wiki, the R Graph Gallery, Contributed
Documentation, Bioconductor and Other Documentation, so that users
become more aware of help resources beyond the documentation installed
with R by default.

A longer term plan could be to look to consolidate some of these
functions into a single help/search function perhaps circa R version
3.0.0. That would enable some time for thoughtful consideration and
feedback.

That's my US\$ $\displaystyle e^{i\pi} + \sum_{n=1}^\infty \frac{1}{2^n}
+ 2(10^{-2})$

:-)

HTH,

Marc

Re: RFC: What should ?foo do?

hadley wickham
>  I am not convinced that ?foo should do this however. help("foo")
>  conceptually seems predicated upon the notion that a user is looking for
>  a reference/help page for a specific function or descriptor called
>  'foo'. The user knows the name of the function or descriptor and should
>  not have to wait for a search function to locate it or conceptually

Is that true?  That does limit the usefulness of ? as it implies that
you must already know the function that you need.

>  related terms. If the user has a large number of CRAN packages
>  installed, such a search can take a rather long time. That's an issue
>  for example with help.search().

But that's just a problem with the current implementation.  Better
indexing could make full text search of all documentation practically
instantaneous.  This is one argument for a centralised documentation
web site - such indices are much easier to set up in a modern web
development environment.  I could imagine this being an eventual
feature of crantastic.org, but it requires some tool to turn Rd
files into a form more easily parsed with off-the-shelf tools (ideally
XML).

Hadley

--
http://had.co.nz/

Re: RFC: What should ?foo do?

Prof Brian Ripley
On Thu, 24 Apr 2008, hadley wickham wrote:

>>  I am not convinced that ?foo should do this however. help("foo")
>>  conceptually seems predicated upon the notion that a user is looking for
>>  a reference/help page for a specific function or descriptor called
>>  'foo'. The user knows the name of the function or descriptor and should
>>  not have to wait for a search function to locate it or conceptually
>
> Is that true?  That does limit the usefulness of ? as it implies that
> you must already know the function that you need.
>
>>  related terms. If the user has a large number of CRAN packages
>>  installed, such a search can take a rather long time. That's an issue
>>  for example with help.search().
>
> But that's just a problem with the current implementation.  Better
> indexing could make full text search of all documentation practically
> instantaneous.  This is one argument for a centralised documentation
> web site - such indices are much easier to set up in a modern web
> development environment.  I could imagine this being an eventual
> feature of crantastic.org, but it requires some tool to turn Rd
> files into a form more easily parsed with off-the-shelf tools (ideally
> XML).

The search is a lot faster in 2.7.x, but is limited by disc speed (and
hence is relatively slow on Windows -- and I have a box which has both
Linux and Windows on, so I've tested this on identical hardware). R is a
dynamic environment which can change libraries (and their order), add or
remove packages ....  The text search is pretty close to instantaneous
(try it a second time) -- the time is taken building the database for the
first use.
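
The behaviour described here (a slow first search while the database is built, then fast repeat searches, with a rebuild needed when the libraries change) can be sketched as a lazily built cache. This is not the actual help.search() implementation; `index_cache`, `get_index()` and `slow_build()` are hypothetical names used only for illustration.

```r
# Hypothetical cache: the index is built on first use and reused afterwards,
# unless the library paths it was built for have changed.
index_cache <- new.env()

get_index <- function(build_index, lib_paths = .libPaths()) {
  stale <- is.null(index_cache$index) ||
    !identical(index_cache$lib_paths, lib_paths)
  if (stale) {
    index_cache$index     <- build_index()  # the slow, disc-bound step
    index_cache$lib_paths <- lib_paths      # remember what it was built for
  }
  index_cache$index
}

# Stand-in for the expensive database build; counts how often it runs.
n_builds <- 0L
slow_build <- function() {
  n_builds <<- n_builds + 1L
  c(mean = "Arithmetic Mean", median = "Median Value")
}

first  <- get_index(slow_build)  # builds the index
second <- get_index(slow_build)  # reuses the cached index
n_builds                         # the build ran only once
```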

Yes, we could do this differently, but some credit for the work already
done would be an encouragement to continue to improve.


--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: RFC: What should ?foo do?

Duncan Murdoch
In reply to this post by Marc Schwartz
Marc Schwartz wrote:

> Duncan Murdoch wrote:
>  
>> Currently ?foo does help("foo"), which looks for a man page with alias
>> foo.  If foo happens to be a function call, it will do a bit more, so
>>
>> ?mean(something)
>>
>> will find the mean method for something if mean happens to be an S4
>> generic.  There are also the type?foo variations, e.g. methods?foo, or
>> package?foo.
>>
>> I think these are all too limited.
>>
>> The easiest search should be the most permissive.  Users should need to
>> do extra work to limit their search to man pages, with exact matches, as
>> ? does.
>>
>> We don't currently have a general purpose search for "foo", or something
>> like it.  We come close with RSiteSearch, and so possibly ?foo should
>> mean RSiteSearch("foo"), but
>> there are problems with that: it can't limit itself to the current
>> version of R, and it doesn't work when you're offline (or when
>> search.r-project.org is down.)  We also have help.search("foo"), but it
>> is too limited. I'd like to have a local search that looks through the
>> man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific
>> to the current R installation, and I think ? should be attached to that
>> search.
>>
>> Comments, please.
>>
>> Duncan Murdoch
>>    
>
> Duncan,
>
> I agree in principle with the points that you raise. I suspect that at
> least in part, it might assist new users with some of the issues that
> were raised in the latest incarnation of the 'we need better
> documentation' thread on r-help.
>
> I am not convinced that ?foo should do this however. help("foo")
> conceptually seems predicated upon the notion that a user is looking for
> a reference/help page for a specific function or descriptor called
> 'foo'. The user knows the name of the function or descriptor ...

With the current definition, that's correct, though man("foo") might be
a better match to Unix users' expectations for a function that did that.
For a naive user, help("foo") suggests that they're looking for help on
"foo".
>
> ...  and should
> not have to wait for a search function to locate it or conceptually
> related terms.  If the user has a large number of CRAN packages
> installed, such a search can take a rather long time. That's an issue
> for example with help.search().
>  

As Brian and Hadley said, that's an implementation issue, already being
addressed.

> That being said and being a firm believer in incrementalism, perhaps the
> first step should be to create a new function, called esearch() [as in
> extended search] or doc.search() [as in documentation search] or even
> search.all(). This new function would facilitate searching all of the
> local objects that you list and perhaps others. It would by default be
> uber-inclusive of all categories of such objects. It would support
> functionality along the lines of help.search() in allowing for the use
> of regex and fuzzy matching via grep()/agrep().
>  

Definitely there would need to be a new function, with a new name; if we
were attaching the name to ? somehow, then it wouldn't matter much what
name was used.

I haven't done it, but I suspect we could introduce special behaviour
for ??foo very easily.  We could even have a whole hierarchy:

?foo, ??foo, ???foo, ????foo, ...
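
Because the parser treats each leading `?` as a nested call, the depth of such a hierarchy can be read directly off the parsed expression. The following is a minimal sketch; `help_depth()` is a hypothetical helper written for illustration, not something that exists in R.

```r
# Hypothetical helper: count the leading question marks in an expression
# such as ???foo and recover the topic underneath them.
help_depth <- function(expr) {
  depth <- 0L
  while (is.call(expr) && length(expr) == 2L &&
         identical(expr[[1L]], as.name("?"))) {
    depth <- depth + 1L   # peel off one unary `?`
    expr  <- expr[[2L]]
  }
  list(depth = depth, topic = deparse(expr))
}

help_depth(quote(?foo))$depth
help_depth(quote(???foo))$depth
help_depth(quote(??foo))$topic
```

A dispatcher could then switch on `depth` to choose between an exact lookup and progressively broader searches.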

> The downside of this approach is that we would add yet another search
> function to the list of those already available, each of which searches
> a focused subset of the potential targets for assistance, whether local
> or online. Thus, it would require some level of understanding of the
> general structure of the myriad of local and online resources of R
> related assistance.
>  

Part of the idea behind my suggestion is that it should be somewhat
automatic for a new user to learn about the different types of help.  
One way for this to happen is the current one:  expect them to find and
read the manuals.  The suggestion is to make it easier to find the
different types.  The risk of this is that exposing a new user to a wide
range of different kinds of results would just be confusing.

> Perhaps ?help could be augmented a bit in elucidating some of these
> issues. The See Also there does not reference apropos() for example and
> it might be worthwhile adding something along the lines of the bullets
> in the "Do your homework before posting" section in the Posting Guide.
> Thus ?help can become something of a "first place to look - local
> centralized help resource" for users to identify the tiered help
> resources that are available and to also provide a framework for how to
> use those resources. One could also have links to the online pages for R
> News, R Books, the R Wiki, the R Graph Gallery, Contributed
> Documentation, Bioconductor and Other Documentation, so that users
> become more aware of help resources beyond the documentation installed
> with R by default.
>  

Those are probably good ideas, but my guess would be that few users read
?help.

> A longer term plan could be to look to consolidate some of these
> functions into a single help/search function perhaps circa R version
> 3.0.0. That would enable some time for thoughtful consideration and
> feedback.
>  

As all the recent bug reports show, we don't really get feedback until
code is released, so there's not much of an advantage of 3.0.0 (unless
we really break the current system) over 2.8.0.

Duncan Murdoch

> That's my US\$ $\displaystyle e^{i\pi} + \sum_{n=1}^\infty \frac{1}{2^n}
> + 2(10^{-2})$
>
> :-)
>
> HTH,
>
> Marc
>

Re: RFC: What should ?foo do?

Peter Dalgaard
Duncan Murdoch wrote:
> I haven't done it, but I suspect we could introduce special behaviour
> for ??foo very easily.  We could even have a whole hierarchy:
>
> ?foo, ??foo, ???foo, ????foo, ...
>
>  
Heh, that's rather nice, actually. In words, that could read

?foo: tell me about foo!
??foo: what can you tell me about foo?
???foo: what can you tell me about things like foo?
????foo: I don't know what I'm looking for but it might be something
related to foo?

You do have to be careful about messing with ?, though. I think many
people, including me, would pretty quickly go nuts if ?par suddenly
didn't work the way we're used to.

--

   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907

Re: RFC: What should ?foo do?

Robert Gentleman
In reply to this post by Duncan Murdoch


Duncan Murdoch wrote:

> Currently ?foo does help("foo"), which looks for a man page with alias
> foo.  If foo happens to be a function call, it will do a bit more, so
>
> ?mean(something)
>
> will find the mean method for something if mean happens to be an S4
> generic.  There are also the type?foo variations, e.g. methods?foo, or
> package?foo.
>
> I think these are all too limited.
>
> The easiest search should be the most permissive.  Users should need to
> do extra work to limit their search to man pages, with exact matches, as
> ? does.

   While I like the idea, I don't really agree with the sentiment above.
I think that the easiest search should be the one that you want the
result of most often.
And at least for me that is the man page for the function, so I can
check some detail; and it works pretty well.  I use site searches much
less frequently and would be happy to type more for them.

>
> We don't currently have a general purpose search for "foo", or something
> like it.  We come close with RSiteSearch, and so possibly ?foo should
> mean RSiteSearch("foo"), but
> there are problems with that: it can't limit itself to the current
> version of R, and it doesn't work when you're offline (or when
> search.r-project.org is down.)  We also have help.search("foo"), but it
> is too limited. I'd like to have a local search that looks through the
> man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific
> to the current R installation, and I think ? should be attached to that
> search.

  I think that would be very useful (although there will be some
decisions on which tool to use to achieve this). But, it will also be
problematic, as one will get tons of hits for some things, and then
selecting the one you really want will be a pain.

  I would rather see that be one of the dyadic forms, say

   site?foo

  or
   all?foo

  one could even imagine refining that for different subsets of the docs
you have mentioned;

   help?foo #only man pages
   guides?foo #the manuals, R Extensions etc

and so on.
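
The dyadic forms suggested above could be dispatched on the left-hand operand of `?`. The sketch below only maps a parsed expression to the collections that would be searched; the `scopes` table and `search_scope()` are hypothetical, and the assumption that a bare ?foo keeps meaning "man pages only" is one possible reading of the proposal, not a decision taken in this thread.

```r
# Hypothetical mapping from the type in type?topic to what gets searched.
scopes <- list(
  help   = "man pages",
  guides = "manuals",
  all    = c("man pages", "manuals", "FAQs", "vignettes", "DESCRIPTION files")
)

search_scope <- function(expr) {
  stopifnot(is.call(expr), identical(expr[[1L]], as.name("?")))
  if (length(expr) == 2L) {
    type  <- "help"                 # assumption: bare ?foo means man pages
    topic <- expr[[2L]]
  } else {
    type  <- deparse(expr[[2L]])    # left operand, e.g. `all` in all?foo
    topic <- expr[[3L]]
  }
  if (is.null(scopes[[type]])) stop("unknown help type: ", type)
  list(topic = deparse(topic), collections = scopes[[type]])
}

search_scope(quote(guides?foo))
search_scope(quote(all?foo))$collections
```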

   You did not make a suggestion as to how we would get the equivalent
of ?foo now, if a decision to move were taken.


>
> Comments, please.
>
> Duncan Murdoch
>

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[hidden email]

Re: RFC: What should ?foo do?

hadley wickham
In reply to this post by Prof Brian Ripley
> > But that's just a problem with the current implementation.  Better
> > indexing could make full text search of all documentation practically
> > instantaneous.  This is one argument for a centralised documentation
> > web site - such indices are much easier to set up in a modern web
> > development environment.  I could imagine this being an eventual
> > feature of crantastic.org, but it requires some tool to turn Rd
> > files into a form more easily parsed with off-the-shelf tools (ideally
> > XML).
> >
>
>  The search is a lot faster in 2.7.x, but is limited by disc speed (and
> hence is relatively slow on Windows -- and I have a box which has both Linux
> and Windows on, so I've tested this on identical hardware). R is a dynamic
> environment which can change libraries (and their order), add or remove
> packages ....  The text search is pretty close to instantaneous (try it a
> second time) -- the time is taken building the database for the first use.

Would it be possible to build per package help databases during the
install (or build?) process?  That way all that work could be shifted
to a one-off compilation.
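
One way to picture this idea: each package writes a small index file once, at install time, and a search simply reads whatever index files are present. The sketch below is an assumption-laden illustration, not how R stores its help data; `index_dir`, `write_package_index()` and `search_indices()` are hypothetical, and the entries are made-up samples.

```r
# Hypothetical location for the per-package index files.
index_dir <- file.path(tempdir(), "help-index")
dir.create(index_dir, showWarnings = FALSE)

# "Install time": write one small index file per package, once.
write_package_index <- function(pkg, entries) {
  saveRDS(entries, file.path(index_dir, paste0(pkg, ".rds")))
}

# "Search time": read the ready-made files; nothing has to be rebuilt.
search_indices <- function(pattern) {
  files <- list.files(index_dir, pattern = "\\.rds$", full.names = TRUE)
  hits <- lapply(files, function(f) {
    entries <- readRDS(f)
    names(entries)[grepl(pattern, entries, ignore.case = TRUE)]
  })
  names(hits) <- sub("\\.rds$", "", basename(files))
  hits[lengths(hits) > 0L]   # keep only packages with at least one hit
}

write_package_index("stats", c(lm = "Fitting Linear Models",
                               sd = "Standard Deviation"))
write_package_index("base",  c(mean = "Arithmetic Mean"))

search_indices("linear")
```

Installing or removing a package then only adds or deletes one file, which fits the point made earlier in the thread that the set of libraries can change at any time.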

>  Yes, we could do this differently, but some credit for the work already
> done would be an encouragement to continue to improve.

Sorry! Writing a good text search engine is hard and I didn't mean to
disparage the existing work of R-core.  The current system works great
for the majority of uses.

Hadley

--
http://had.co.nz/

Re: RFC: What should ?foo do?

hadley wickham
In reply to this post by Peter Dalgaard
On Fri, Apr 25, 2008 at 7:46 AM, Peter Dalgaard
<[hidden email]> wrote:

> Duncan Murdoch wrote:
>  > I haven't done it, but I suspect we could introduce special behaviour
>  > for ??foo very easily.  We could even have a whole hierarchy:
>  >
>  > ?foo, ??foo, ???foo, ????foo, ...
>  >
>  >
>  Heh, that's rather nice, actually. In words, that could read
>
>  ?foo: tell me about foo!
>  ??foo: what can you tell me about foo?
>  ???foo: what can you tell me about things like foo?
>  ????foo: I don't know what I'm looking for but it might be something
>  related to foo?

I like the idea, but why not do it automatically and then display the
results on a single page?  (i.e. list results in order of specificity)


Hadley


--
http://had.co.nz/

Re: RFC: What should ?foo do?

hadley wickham
In reply to this post by Robert Gentleman
>   I would rather see that be one of the dyadic forms, say
>
>    site?foo
>
>   or
>    all?foo

I'd be interested to know how many R users are aware of the dyadic
form - I suspect it's very very few.

Hadley


--
http://had.co.nz/

Re: RFC: What should ?foo do?

Duncan Murdoch
In reply to this post by hadley wickham
On 4/25/2008 10:18 AM, hadley wickham wrote:

> On Fri, Apr 25, 2008 at 7:46 AM, Peter Dalgaard
> <[hidden email]> wrote:
>> Duncan Murdoch wrote:
>>  > I haven't done it, but I suspect we could introduce special behaviour
>>  > for ??foo very easily.  We could even have a whole hierarchy:
>>  >
>>  > ?foo, ??foo, ???foo, ????foo, ...
>>  >
>>  >
>>  Heh, that's rather nice, actually. In words, that could read
>>
>>  ?foo: tell me about foo!
>>  ??foo: what can you tell me about foo?
>>  ???foo: what can you tell me about things like foo?
>>  ????foo: I don't know what I'm looking for but it might be something
>>  related to foo?
>
> I like the idea, but why not do it automatically and then display the
> results on a single page?  (i.e. list results in order of specificity)

One reason not to do that is that in single-threaded R you are pretty
much stuck until it is done.  Presumably the more specific search is
quicker than the less specific one.  And even if we could act on results
as soon as they were available, I think a lot of users would wait for
the search to stop, so there'd be a perception that it was too slow.

One possible change to ?foo that would not be so painful for old-time
users would be to try it under its current meaning first, and only fall
back to a more general search if that fails.

Consistent with this idea would be something like the "I feel lucky"
search on Google, i.e. ?foo would go immediately to the best match,
while ??foo would present a list of possible matches.  This is not
consistent with current behaviour, where ?foo will present a list if it
matches two or more topics, but I think we can always rank one ahead of
the other based on their ordering in the search list.  I don't know if
it will be so easy to rank hits coming from help.search(), or from other
searches that don't exist yet:  but maybe it doesn't matter.  If someone
doesn't like what they get from ?foo, they can always try ??foo.
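
The "current meaning first, broader search only on failure" idea can be sketched as a thin wrapper around help(). `help_with_fallback()` is a hypothetical name; the default fallback of `utils::help.search` is an assumption made for illustration, and the example passes a stand-in fallback so that it runs quickly and opens no pager.

```r
# Hypothetical wrapper: exact man-page lookup first, broader search second.
help_with_fallback <- function(topic, fallback = utils::help.search) {
  pages <- utils::help(topic)   # current meaning of ?foo; nothing is shown
                                # until the result is printed
  if (length(pages) > 0L) {
    list(kind = "exact", result = pages)
  } else {
    list(kind = "search", result = fallback(topic))
  }
}

# Stand-in fallback used only to show the control flow.
res <- help_with_fallback(
  "no_such_topic_zzz",
  fallback = function(topic) paste("would search for", topic)
)
res$kind
res$result
```

Users who rely on the existing behaviour would see no change for topics that have a man page; only a failed lookup would trigger the slower search.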

Duncan

Re: RFC: What should ?foo do?

Simon Urbanek
In reply to this post by Peter Dalgaard

On Apr 25, 2008, at 8:46 AM, Peter Dalgaard wrote:

> Duncan Murdoch wrote:
>> I haven't done it, but I suspect we could introduce special behaviour
>> for ??foo very easily.  We could even have a whole hierarchy:
>>
>> ?foo, ??foo, ???foo, ????foo, ...
>>
>>
> Heh, that's rather nice, actually. In words, that could read
>
> ?foo: tell me about foo!
> ??foo: what can you tell me about foo?
> ???foo: what can you tell me about things like foo?
> ????foo: I don't know what I'm looking for but it might be something
> related to foo?
>
> You do have to be careful about messing with ?, though. I think many
> people, including me, would pretty quickly go nuts if ?par suddenly
> didn't work the way we're used to.
>

I strongly agree with that.

One potential way out could be to offer some extended fall-back in  
case the man page doesn't exist. (I'm not sure I like that, either,  
but I could get used to it ;).)

I don't really have a problem with the status quo and I think if you want
proper advanced searches, you should be using (or implementing) them
in the GUIs anyway. That is what the new users will be using (and
looking for) in the first place. If they have to count the question
marks instead, they won't know about it (although I like the idea for
advanced users).

Cheers,
Simon

Re: RFC: What should ?foo do?

Duncan Murdoch
In reply to this post by Robert Gentleman
On 4/25/2008 10:16 AM, Robert Gentleman wrote:

>
> Duncan Murdoch wrote:
>> Currently ?foo does help("foo"), which looks for a man page with alias
>> foo.  If foo happens to be a function call, it will do a bit more, so
>>
>> ?mean(something)
>>
>> will find the mean method for something if mean happens to be an S4
>> generic.  There are also the type?foo variations, e.g. methods?foo, or
>> package?foo.
>>
>> I think these are all too limited.
>>
>> The easiest search should be the most permissive.  Users should need to
>> do extra work to limit their search to man pages, with exact matches, as
>> ? does.
>
>    While I like the idea, I don't really agree with the sentiment above.
> I think that the easiest search should be the one that you want the
> result of most often.
> And at least for me that is the man page for the function, so I can
> check some detail; and it works pretty well.  I use site searches much
> less frequently and would be happy to type more for them.

That's true.

What's your feeling about what should happen when ?foo fails?


>
>>
>> We don't currently have a general purpose search for "foo", or something
>> like it.  We come close with RSiteSearch, and so possibly ?foo should
>> mean RSiteSearch("foo"), but
>> there are problems with that: it can't limit itself to the current
>> version of R, and it doesn't work when you're offline (or when
>> search.r-project.org is down.)  We also have help.search("foo"), but it
>> is too limited. I'd like to have a local search that looks through the
>> man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc., specific
>> to the current R installation, and I think ? should be attached to that
>> search.
>
>   I think that would be very useful (although there will be some
> decisions on which tool to use to achieve this). But, it will also be
> problematic, as one will get tons of hits for some things, and then
> selecting the one you really want will be a pain.
>
>   I would rather see that be one of the dyadic forms, say
>
>    site?foo
>
>   or
>    all?foo
>
>   one could even imagine refining that for different subsets of the docs
> you have mentioned;
>
>    help?foo #only man pages
>    guides?foo #the manuals, R Extensions etc
>
> and so on.
>
>    You did not make a suggestion as to how we would get the equivalent
> of ?foo now, if a decision to move were taken.

I didn't say, but I would assume there would be a way to do it, and it
shouldn't be hard to invoke.  Maybe help?foo as you suggested, or man?foo.

Duncan Murdoch

Re: RFC: What should ?foo do?

hadley wickham
In reply to this post by Duncan Murdoch
>  Consistent with this idea would be something like the "I feel lucky" search
> on Google, i.e. ?foo would go immediately to the best match, while ??foo
> would present a list of possible matches.  This is not consistent with
> current behaviour, where ?foo will present a list if it matches two or more
> topics, but I think we can always rank one ahead of the other based on their
> ordering in the search list.  I don't know if it will be so easy to rank
> hits coming from help.search(), or from other searches that don't exist yet:
> but maybe it doesn't matter.  If someone doesn't like what they get from
> ?foo, they can always try ??foo.

That seems like a good compromise to me - it's a metaphor familiar to
most people, and corresponds roughly to the current behaviour of help
and help.search.

Hadley

--
http://had.co.nz/

Re: RFC: What should ?foo do?

Duncan Murdoch
In reply to this post by Simon Urbanek
On 4/25/2008 10:41 AM, Simon Urbanek wrote:

> On Apr 25, 2008, at 8:46 AM, Peter Dalgaard wrote:
>
>> Duncan Murdoch wrote:
>>> I haven't done it, but I suspect we could introduce special behaviour
>>> for ??foo very easily.  We could even have a whole hierarchy:
>>>
>>> ?foo, ??foo, ???foo, ????foo, ...
>>>
>>>
>> Heh, that's rather nice, actually. In words, that could read
>>
>> ?foo: tell me about foo!
>> ??foo: what can you tell me about foo?
>> ???foo: what can you tell me about things like foo?
>> ????foo: I don't know what I'm looking for but it might be something
>> related to foo?
>>
>> You do have to be careful about messing with ?, though. I think many
>> people, including me, would pretty quickly go nuts if ?par suddenly
>> didn't work the way we're used to.
>>
>
> I strongly agree with that.
>
> One potential way out could be to offer some extended fall-back in  
> case the man page doesn't exist. (I'm not sure I like that, either,  
> but I could get used to it ;).)
>
> I don't really have a problem with the status quo and I think if you want
> proper advanced searches, you should be using (or implementing) them
> in the GUIs anyway. That is what the new users will be using (and
> looking for) in the first place. If they have to count the question
> marks instead, they won't know about it (although I like the idea for
> advanced users).

I'd like to try to have the search common across all platforms, but the
GUIs could present it and the results in their own way.

Duncan Murdoch

Re: RFC: What should ?foo do?

Marc Schwartz
In reply to this post by Duncan Murdoch
Duncan Murdoch wrote:

> Marc Schwartz wrote:
>> Duncan Murdoch wrote:
>>  
>>> Currently ?foo does help("foo"), which looks for a man page with
>>> alias foo.  If foo happens to be a function call, it will do a bit
>>> more, so
>>>
>>> ?mean(something)
>>>
>>> will find the mean method for something if mean happens to be an S4
>>> generic.  There are also the type?foo variations, e.g. methods?foo,
>>> or package?foo.
>>>
>>> I think these are all too limited.
>>>
>>> The easiest search should be the most permissive.  Users should need
>>> to do extra work to limit their search to man pages, with exact
>>> matches, as ? does.
>>>
>>> We don't currently have a general purpose search for "foo", or
>>> something like it.  We come close with RSiteSearch, and so possibly
>>> ?foo should mean RSiteSearch("foo"), but
>>> there are problems with that: it can't limit itself to the current
>>> version of R, and it doesn't work when you're offline (or when
>>> search.r-project.org is down.)  We also have help.search("foo"), but
>>> it is too limited. I'd like to have a local search that looks through
>>> the man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc.,
>>> specific to the current R installation, and I think ? should be
>>> attached to that search.
>>>
>>> Comments, please.
>>>
>>> Duncan Murdoch
>>>    
>>
>> Duncan,
>>
>> I agree in principle with the points that you raise. I suspect that at
>> least in part, it might assist new users with some of the issues that
>> were raised in the latest incarnation of the 'we need better
>> documentation' thread on r-help.
>>
>> I am not convinced that ?foo should do this however. help("foo")
>> conceptually seems predicated upon the notion that a user is looking
>> for a reference/help page for a specific function or descriptor called
>> 'foo'. The user knows the name of the function or descriptor ...
>
> With the current definition, that's correct, though man("foo") might be
> a better match to Unix users' expectations for a function that did that.
> For a naive user, help("foo") suggests that they're looking for help on
> "foo".

I agree that man("foo") would be consistent with looking for help with a
known specific function. I also agree that Linux/Unix users would expect
such behavior. However, I am not sure that Windows users would be so
inclined. Certainly as a former Windows user and despite many years of
programming experience in various environments, I would not, out of the
gate, have instinctively known or thought about man("foo").

That is not to argue against moving in that direction however. In fact,
as part of any future consolidation of the myriad help and search
functions, it would make a great deal of sense that ?foo become an alias
for man("foo") rather than help("foo").

Thus, the other help/search related functions could also consolidate
around a mechanism with one key distinction, that being local versus
online sources.

>> ...  and should not have to wait for a search function to locate it or
>> conceptually related terms.  If the user has a large number of CRAN
>> packages installed, such a search can take a rather long time. That's
>> an issue for example with help.search().
>>  
>
> As Brian and Hadley said, that's an implementation issue, already being
> addressed.

My comment there was more of an observation than a criticism, and my
apologies if it was taken as the latter. I think it is reasonable to
expect that if a useR has 1,300+ CRAN packages installed, it is going to
take longer to search that infrastructure than if the useR only has a
few.

I would want to have a reasonable expectation however, that if I used
?t.test as opposed to help.search("t test"), the result would be
forthcoming in a more efficient manner in the former case than in the
latter. In the former case, I am typically looking for a specific
function in a package that is in the search path. In the latter, I am
searching for related terms/concepts in all installed packages, etc.
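
As a rough sketch of that contrast, using t.test as in the example above (the package = "stats" restriction is an assumption, added only to keep the search quick):

```r
# Direct lookup: resolve one known topic among the attached packages.
# This is what ?t.test does.
page <- help("t.test")

# Conceptual search: scan the help database for aliases, concepts and
# titles that match the phrase. Restricted to one package here; without
# the restriction every installed package is searched.
hits <- help.search("t test", package = "stats")

# Printing `page` displays the man page; printing `hits` lists the matches.
```

The first call only has to resolve a single alias, while the second has to consult the whole help database, which is where the difference in response time comes from.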

>> That being said and being a firm believer in incrementalism, perhaps
>> the first step should be to create a new function, called esearch()
>> [as in extended search] or doc.search() [as in documentation search]
>> or even search.all(). This new function would facilitate searching all
>> of the local objects that you list and perhaps others. It would by
>> default be uber-inclusive of all categories of such objects. It would
>> support functionality along the lines of help.search() in allowing for
>> the use of regex and fuzzy matching via grep()/agrep().
>>  
>
> Definitely there would need to be a new function, with a new name; if we
> were attaching the name to ? somehow, then it wouldn't matter much what
> name was used.
>
> I haven't done it, but I suspect we could introduce special behaviour
> for ??foo very easily.  We could even have a whole hierarchy:
>
> ?foo, ??foo, ???foo, ????foo, ...

Conceptually, my initial reaction, which I think is consistent to an
extent with the differentiation that Peter made in his reply, is
positive, though as always, there is the risk of confusion.

From the perspective of naive users and the KISS approach, I would tend
to favor the basic distinction of:

1. ?foo or man("foo") - look for the man page for a known specific
function in the current searchpath

2. help.search("foo") - look for conceptual links related to 'foo', with
some appropriate wrappers that default to either local or online sources.


>> The downside of this approach is that we would add yet another search
>> function to the list of those already available, each of which
>> searches a focused subset of the potential targets for assistance,
>> whether local or online. Thus, it would require some level of
>> understanding of the general structure of the myriad of local and
>> online resources of R related assistance.
>>  
>
> Part of the idea behind my suggestion is that it should be somewhat
> automatic for a new user to learn about the different types of help.  
> One way for this to happen is the current one:  expect them to find and
> read the manuals.  The suggestion is to make it easier to find the
> different types.  The risk of this is that exposing a new user to a wide
> range of different kinds of results would just be confusing.

I will admit a little ambivalence here. Part of me thinks that a useR
*should* at minimum, read "An Introduction to R" or at least be inclined
to look there as their first resource. It does seem that there is some
expectation from new users that they can just dive in and become
productive with R immediately, whether or not they have prior
programming experience and whether or not they have experience with
other statistical applications. In fact, there is an argument to be made
that such prior experience can bias their expectations and frame of
reference.

Reading "Intro" can assist them in beginning to understand the
conceptual differences in R as compared to these other environments,
such as methods, vectorized functions, object structures and accessor
functions, etc.

If a user has found and knows how to use lm() and construct model
formulae for example, why is it that they don't know about coef(),
effects(), fitted(), etc. when these are listed on the help page for lm?

They didn't read far enough or they skipped right over the See Also
section to the examples?

The first instinct has become to post to r-help (as just happened),
rather than use the phenomenal resources that this community has already
made available.

So rather than taking a little time to read a bit more, and in the long
run, save themselves time from posting and waiting for a reply, they
default to posting.

It is interesting to note that by users doing this, they are in effect
providing substantive praise to this community and the support provided
by the lists, in that they have come to expect a pretty rapid response
from the community 24x7. I suspect that the volume of certain categories
of e-mails might be quite different if the typical response time on the
lists was hours rather than minutes...

The first presumption of a useR should be that the available
documentation might cover these issues, or that somebody else has quite
possibly already asked the same question. Thus, if the answer is not in
"Intro", the next step should be to search the list archives.

These are the issues that are covered in the Posting Guide, which
clearly many don't utilize either.

So, that being the case, how do we provide a conceptual framework for
seeking assistance in using R and how do we behaviorally modify useRs to
actually utilize those resources to their own benefit?

I am not looking to solve 100% of the needs, but again within the notion
of incrementalism and Pareto's 80/20 Rule, how do we address a
reasonable majority of the needs? How do we get the biggest bang for
the investment of time?

>> Perhaps ?help could be augmented a bit in elucidating some of these
>> issues. The See Also there does not reference apropos() for example
>> and it might be worthwhile adding something along the lines of the
>> bullets in the "Do your homework before posting" section in the
>> Posting Guide. Thus ?help can become something of a "first place to
>> look - local centralized help resource" for users to identify the
>> tiered help resources that are available and to also provide a
>> framework for how to use those resources. One could also have links to
>> the online pages for R News, R Books, the R Wiki, the R Graph Gallery,
>> Contributed Documentation, Bioconductor and Other Documentation, so
>> that users become more aware of help resources beyond the
>> documentation installed with R by default.
>>  
>
> Those are probably good ideas, but my guess would be that few users read
> ?help.

As I note above, somehow we need to get users to look to a central
resource that is platform independent. That resource should include some
type of overview of the local and online help resources that are
available for R, and perhaps a suggested hierarchy of use.

It seems logical to me that such a resource be embedded up front in
"Intro" with it also being included within the existing help system and
referenced in the start up banner message.

>> A longer term plan could be to look to consolidate some of these
>> functions into a single help/search function perhaps circa R version
>> 3.0.0. That would enable some time for thoughtful consideration and
>> feedback.
>>  
>
> As all the recent bug reports show, we don't really get feedback until
> code is released, so there's not much of an advantage of 3.0.0 (unless
> we really break the current system) over 2.8.0.

Good point.

Regards,

Marc


Re: RFC: What should ?foo do?

hadley wickham
>  It seems logical to me that such a resource be embedded up front in "Intro"
> with it also being included within the existing help system and referenced
> in the start up banner message.

That would help if anyone actually read the startup banner.  The next
time you're in front of an audience of people who have been using R
for a couple of weeks, ask them how to cite R.

Hadley


--
http://had.co.nz/


Re: RFC: What should ?foo do?

Robert Gentleman
In reply to this post by Duncan Murdoch


Duncan Murdoch wrote:

> On 4/25/2008 10:16 AM, Robert Gentleman wrote:
>>
>> Duncan Murdoch wrote:
>>> Currently ?foo does help("foo"), which looks for a man page with
>>> alias foo.  If foo happens to be a function call, it will do a bit
>>> more, so
>>>
>>> ?mean(something)
>>>
>>> will find the mean method for something if mean happens to be an S4
>>> generic.  There are also the type?foo variations, e.g. methods?foo,
>>> or package?foo.
>>>
>>> I think these are all too limited.
>>>
>>> The easiest search should be the most permissive.  Users should need
>>> to do extra work to limit their search to man pages, with exact
>>> matches, as ? does.
>>
>>    While I like the idea, I don't really agree with the sentiment
>> above. I think that the easiest search should be the one that you want
>> the result of most often.
>> And at least for me that is the man page for the function, so I can
>> check some detail; and it works pretty well.  I use site searches much
>> less frequently and would be happy to type more for them.
>
> That's true.
>
> What's your feeling about what should happen when ?foo fails?

   Present a list of man pages with spellings close to foo, so the user
can select one (we have the tools to do this in many places right now,
and it would be a great help, IMHO, as spelling and capitalization
behavior vary both between and within individuals).
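
A minimal sketch of such a close-spelling fallback, assuming approximate matching with agrep(). The helper name suggest_topics and the choice of base function names as candidates are illustrative only; a real fallback would match against the help aliases:

```r
# Hypothetical helper (not part of R): suggest topics whose spelling is
# close to what the user typed.
suggest_topics <- function(topic, candidates, max.distance = 0.2) {
  # agrep() does approximate matching; ignore.case = TRUE also covers
  # differences in capitalization.
  agrep(topic, candidates, max.distance = max.distance,
        ignore.case = TRUE, value = TRUE)
}

# Candidates here are simply the object names in base.
suggest_topics("Sys.Time", ls(baseenv()))
```

Because agrep() matches the pattern anywhere inside a candidate, longer names containing a near match are returned as well, so the list would probably need ranking or trimming before being shown to the user.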

>
>
>>
>>>
>>> We don't currently have a general purpose search for "foo", or
>>> something like it.  We come close with RSiteSearch, and so possibly
>>> ?foo should mean RSiteSearch("foo"), but
>>> there are problems with that: it can't limit itself to the current
>>> version of R, and it doesn't work when you're offline (or when
>>> search.r-project.org is down.)  We also have help.search("foo"), but
>>> it is too limited. I'd like to have a local search that looks through
>>> the man pages, manuals, FAQs, vignettes, DESCRIPTION files, etc.,
>>> specific to the current R installation, and I think ? should be
>>> attached to that search.
>>
>>   I think that would be very useful (although there will be some
>> decisions on which tool to use to achieve this). But, it will also be
>> problematic, as one will get tons of hits for some things, and then
>> selecting the one you really want will be a pain.
>>
>>   I would rather see that be one of the dyadic forms, say
>>
>>    site?foo
>>
>>   or
>>    all?foo
>>
>>   one could even imagine refining that for different subsets of the
>> docs you have mentioned;
>>
>>    help?foo #only man pages
>>    guides?foo #the manuals, R Extensions etc
>>
>> and so on.
>>
>>    You did not make a suggestion as to how we would get the
>> equivalent of ?foo now, if a decision to move were taken.
>
> I didn't say, but I would assume there would be a way to do it, and it
> shouldn't be hard to invoke.  Maybe help?foo as you suggested, or man?foo.

   If not then I would be strongly opposed -- I really think we want to
make the most common thing the easiest to do.  And if we really think
that might be different for different people, then my favored
alternative would be to separate the "short-cut", ? in this case, from
the command, so that users have some freedom to customize.

   I also wonder whether one could provide some mechanism that
distinguishes what is local from what is on the internet. Something that
would make tools like Spotlight much more valuable, IMHO, is telling me
what I have on my computer and what I can get if I want to, at least as
some form of option.


   Robert

>
> Duncan Murdoch
>

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[hidden email]


Re: RFC: What should ?foo do?

Marc Schwartz
In reply to this post by hadley wickham
hadley wickham wrote:
>>  It seems logical to me that such a resource be embedded up front in "Intro"
>> with it also being included within the existing help system and referenced
>> in the start up banner message.
>
> That would help if anyone actually read the startup banner.  The next
> time you're in front of an audience of people who have been using R
> for a couple of weeks, ask them how to cite R.
>
> Hadley

That might be supportive of having the startup banner be nothing more
than a version/copyright notice and then leave you at a '>' prompt.

I am open to the notion however, that some do read it, perhaps less so
over time as they just want to get to the prompt and effectively become
oblivious to the banner.

Of course, it ('citation()') is also referenced in the main R FAQ...

Marc


Re: RFC: What should ?foo do?

Deepayan Sarkar
For what it's worth, I use ?foo mostly to look up usage of functions
that I know I want to use, and find it perfect for that (one benefit
over help() is that completion works for ?). The only thing I miss is
the ability to do the equivalent of help("foo", package = "bar");
?bar::foo gives the help page for "::". Perhaps that would be
something to consider for addition.
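
For illustration, the package-qualified form looks like this, with "par" and "graphics" standing in for foo and bar (both are placeholders chosen only because they ship with R):

```r
# Package-qualified lookup through help().
page <- help("par", package = "graphics")
# Printing `page` displays the help page for par from graphics.

# The operator form wished for above would be written as:
#   ?graphics::par
# which, as described in this message, brought up the help page for "::"
# at the time of writing.
```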

-Deepayan
