a generic 'attach'?


Bill.Venables
Is there any reason why 'attach' is not generic in R?

I notice that it is in another system, for example, and I can see some
applications if it were so in R.

Bill Venables.


Bill Venables,
CMIS, CSIRO Laboratories,
PO Box 120, Cleveland, Qld. 4163
AUSTRALIA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: a generic 'attach'?

Peter Dalgaard
<[hidden email]> writes:

> Is there any reason why 'attach' is not generic in R?
>
> I notice that it is in another system, for example,

I wonder which one? ;-)

> and I can see some
> applications if it were so in R.

I suppose there is no particular reason, except that it was probably
"good enough for now" at some point in time.

Apropos attach(), and apologies in advance for the lengthy rant that
follows:

There are a couple of other annoyances with the attach/detach
mechanism that could do with a review. In particular, detach() is not
behaving according to documentation (return value is really NULL). I
feel that sensible semantics for editing an attached database and
storing it back would be useful. The current semantics tend to get
people in trouble, and some of the stuff you need to explain really
feels quite odd:

attach(airquality)
airquality$Month <- factor(airquality$Month)
# oops, that's not going to work. You need:
detach(airquality)
attach(airquality)

(notice in particular that this tends to keep two copies of the data
in memory at a time).
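
To make the pitfall concrete, here is a minimal sketch using only base R and the built-in airquality data. The names global_is_factor and attached_is_factor are illustrative and do not come from the original post.

```r
# attach() copies the columns of the data frame onto the search path.
attach(airquality)

# This creates a modified copy of 'airquality' in the current workspace;
# the database attached at position 2 is left untouched.
airquality$Month <- factor(airquality$Month)

global_is_factor   <- is.factor(airquality$Month)                  # workspace copy
attached_is_factor <- is.factor(get("Month", pos = "airquality"))  # attached copy

# Clean up: drop the attached database and the workspace copy.
detach("airquality")
rm(airquality)
```

The workspace copy has a factor Month while the attached copy does not, which is why the detach/attach round trip is needed.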

You can actually modify a database after attaching it (I'm
deliberately not saying "data frame", because it will not be one at
that stage), but it leads to contortions like

assign("Month", factor(Month), "airquality")

or

with(pos.to.env(2), Month <- factor(Month))

(or even with(pos.to.env(match("airquality",search())),....))
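
As a rough sketch of both workarounds on the built-in airquality data (base R only; as.environment("airquality") is used here in place of pos.to.env(2), and the result variable names are illustrative):

```r
attach(airquality)

# Variant 1: assign() with the database name as its third ('pos') argument.
assign("Month", factor(Month), "airquality")
after_assign <- is.factor(get("Month", pos = "airquality"))

# Variant 2: evaluate the assignment inside the attached environment.
with(as.environment("airquality"), Day <- factor(Day))
after_with <- is.factor(get("Day", pos = "airquality"))

# The data frame in package 'datasets' is not changed by either call.
original_untouched <- !is.factor(airquality$Month)

detach("airquality")
```

Both variants change only the attached database, so the edits are lost at detach() time unless they are copied out first.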

I've been thinking on and off about these matters. It is a bit tricky
because we'd rather not break code (and books!) all over the place,
but suppose we

(a) allowed with() to have its first argument interpreted like the 3rd
    argument in assign()

(b) made detach() do what it claims: return the (possibly modified)
    database. This requires that more metadata are kept around than
    currently. Also, the semantics of

    attach(airquality)
    assign("foo", function(bar)baz, "airquality")
    aq <- detach(airquality)

    would need to be sorted out. Presumably "foo" needs to be dropped
    with a warning.
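
One way proposal (b) might behave can be sketched in user code. The helper detach_value() below is hypothetical and not part of base R; it is only an illustration of "return the possibly modified database, dropping non-data objects with a warning", under the assumption that the database started life as a data frame.

```r
# Hypothetical helper (not part of base R): detach a database and return
# its contents as a data frame, dropping functions with a warning.
detach_value <- function(name) {
  env  <- as.environment(name)
  objs <- mget(ls(env), envir = env)
  is_fun <- vapply(objs, is.function, logical(1))
  if (any(is_fun)) {
    warning("dropping non-data objects: ",
            paste(names(objs)[is_fun], collapse = ", "))
    objs <- objs[!is_fun]
  }
  detach(name, character.only = TRUE)
  # ls() sorts names, so the original column order is lost here.
  as.data.frame(objs, stringsAsFactors = FALSE)
}

attach(airquality)
assign("Month", factor(Month), "airquality")
assign("foo", function(bar) bar, "airquality")
aq <- detach_value("airquality")   # warns that 'foo' is dropped
```

The lost column order (and row names) in this sketch is an instance of the extra metadata that a real implementation would have to keep around.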

Potentially, one could then also devise mechanisms for load/store
directly to/from the search path.

Alternative ideas include changing the search path itself to be an
actual list of objects (rather than a nesting of environments), but
that leads to the same sort of issues.


--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])                  FAX: (+45) 35327907

Re: a generic 'attach'?

Bill.Venables
What have I started?  I had nothing anywhere near as radical as that in mind, Peter...

One argument against making 'attach' generic might be that such a move would slow it down a bit, but I can't really see why speed would be much of an issue with 'attach'.

I've noticed that David Brahm's package, g.data, for example really has a method for attach as part of it, (well almost), but he has to call it g.data.attach.

Another package that has an obvious application for a method for attach is the filehash package of Roger Peng.

And as it happens I have another, but for now I call it 'Attach', which is pretty unsatisfying from an aesthetic point of view.

I think I'll just sow the seed for now.  The thing about generic functions is that if they exist people sometimes find quite innovative uses for them, and if they come at minimal cost, and break no existing code, I suggest we think about implementing them.

(Notice I have had no need to use a 'compatibility with another system' argument at any stage...)
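
For illustration only, an S3 generic along these lines could look roughly like the sketch below. The name gattach and the "kv_store" class are invented for this example; this is not the 'Attach' function mentioned above, nor the API of g.data or filehash.

```r
# Hypothetical generic; 'gattach' and the "kv_store" class are invented here.
gattach <- function(what, ...) UseMethod("gattach")

# Default method: defer to base::attach() for lists, data frames and files.
gattach.default <- function(what, pos = 2L,
                            name = deparse(substitute(what)), ...) {
  force(name)
  invisible(base::attach(what, pos = pos, name = name, ...))
}

# Method for a toy key-value store held as a classed list.
gattach.kv_store <- function(what, pos = 2L, name = "kv_store", ...) {
  invisible(base::attach(unclass(what), pos = pos, name = name, ...))
}

store <- structure(list(alpha = 1, beta = "two"), class = "kv_store")
gattach(store, name = "store")
alpha_seen <- get("alpha", pos = "store")
detach("store")

gattach(airquality)                  # falls through to the default method
default_attached <- "airquality" %in% search()
detach("airquality")
```

Because the default method simply forwards to base::attach(), existing calls would keep their current behaviour in this sketch.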

---

Another, even more minor issue I've wondered about is giving rm() as its return value the object, or list of objects, removed.  Thus

newName <- rm(object)

would become essentially a renaming of an object in memory.
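
The idea can be approximated today with a small wrapper. The function rm_value() below is hypothetical (base rm() returns nothing useful), and it handles a single object named by a character string rather than the unquoted form shown above.

```r
# Hypothetical wrapper: remove one object and return its value.
rm_value <- function(name, envir = parent.frame()) {
  value <- get(name, envir = envir, inherits = FALSE)
  rm(list = name, envir = envir)
  value
}

object  <- 1:10
newName <- rm_value("object")

# 'object' is gone from the calling environment; 'newName' holds its value.
object_still_there <- exists("object", inherits = FALSE)
```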

For some reason I seem to recall that this was indeed a feature of a very early version of the S language, but dropped out (I think) when S3 was introduced.  Have I got that completely wrong?  (I seem to recall a lot of code had to be scrapped at that stage, including something rather reminiscent of the R with(), but I digress...)

Bill.


Re: a generic 'attach'?

Brian Ripley
What are you proposing the generic be, and how should it be described?

Most of the current attach seems to be general; the only parts which are
specific to save() images and lists are

         value <- .Internal(attach(NULL, pos, name))
         load(what, envir = as.environment(pos))
     }
     else value <- .Internal(attach(what, pos, name))

So maybe it is not attach() but some internal version of it (which
populates a frame on the search list) which needs to be generic.  Indeed,
dbLoad() in pkg filehash looks like just what one wants here.

[That code is a bit strange: is not 'value' the environment into which you
want to load things?  So why go via as.environment?]
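
A rough sketch of that split, with attach() left alone and a separate generic doing the populating. The names populate() and attach_via() are hypothetical, and this is not how base R or filehash is actually organised. It relies on attach(NULL, ...) returning the new environment, which the methods then fill directly.

```r
# Hypothetical generic that fills an already-attached environment.
populate <- function(src, envir, ...) UseMethod("populate")

# A plain list is copied into the environment element by element.
populate.list <- function(src, envir, ...) {
  list2env(src, envir = envir)
  invisible(envir)
}

# A character string is treated as the path of a save() image.
populate.character <- function(src, envir, ...) {
  load(src, envir = envir)
  invisible(envir)
}

# attach(NULL, ...) creates an empty database and returns its environment,
# so the methods can load into it without going via as.environment().
attach_via <- function(src, pos = 2L, name = "db") {
  env <- attach(NULL, pos = pos, name = name)
  populate(src, env)
}

attach_via(list(a = 1, b = 2), name = "from_list")

zz <- 42
image_file <- tempfile(fileext = ".RData")
save(zz, file = image_file)
rm(zz)
attach_via(image_file, name = "from_file")

a_seen  <- get("a",  pos = "from_list")
zz_seen <- get("zz", pos = "from_file")
detach("from_file")
detach("from_list")
```

New storage back ends would then only need a populate() method, and the contract of attach() itself would not have to be redefined.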

The devil is in the `well almost'.

On Sun, 5 Feb 2006 [hidden email] wrote:

> What have I started?  I had nothing anywhere near as radical as that in
> mind, Peter...
>
> One argument against making 'attach' generic might be that such a move
> would slow it down a bit, but I can't really see why speed would be much
> of an issue with 'attach'.

Speed is not an issue.  The major issue in making a function generic is
describing what a generic function is required to do (including what it is
required to return), and thereby ensuring that you do not break existing
code without unduly limiting future uses.

> I've noticed that David Brahm's package, g.data, for example really has
> a method for attach as part of it, (well almost), but he has to call it
> g.data.attach.

Another candidate is lazyload/lazydata databases.

> Another package that has an obvious application for a method for attach
> is the filehash package of Roger Peng.
>
> And as it happens I have another, but for now I call it 'Attach', which
> is pretty unsatisfying from an aesthetic point of view.
>
> I think I'll just sow the seed for now.  The thing about generic
> functions is that if they exist people sometimes find quite innovative
> uses for them, and if they come at minimal cost, and break no existing
> code, I suggest we think about implementing them.
>
> (Notice I have had no need to use a 'compatibility with another system'
> argument at any stage...)

A good point, as it is not actually documented to be generic under that
system, as far as I can see.

[...]

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: a generic 'attach'?

Roger D. Peng
I think having a generic attach might be useful in the end.  But I agree
that some more thought needs to go into how such a generic would behave.
I've always avoided using `attach()' precisely because I didn't fully
understand the semantics.

One related possibility would be to create a method for `with()' (which
is already generic) which would work on (in my case) "filehash"
databases.  It still wouldn't be quite as nice as `attach()' for
interactive work but it could serve some purposes.
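
A minimal sketch of such a method, using a made-up "toy_db" class that keeps its values in an environment. This stands in for a real database class and does not use the filehash API.

```r
# Hypothetical class: a key-value store backed by an environment.
toy_db <- function(...) {
  structure(list(store = list2env(list(...))), class = "toy_db")
}

# with() is an S3 generic, so a method can expose the stored values
# to the expression without attaching anything to the search path.
with.toy_db <- function(data, expr, ...) {
  vals <- as.list(data$store)
  eval(substitute(expr), vals, enclos = parent.frame())
}

db  <- toy_db(x = 1:3, y = 10)
res <- with(db, sum(x) + y)
```

This sketch reads every value up front; a method for a disk-backed store would more plausibly fetch keys lazily.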

-roger

--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
