Enhancement request: anonymous connections

Enhancement request: anonymous connections

Seth Falcon-2
I would like to be able to use anonymous connections in R and have
them close themselves when they go out of scope.

Here is an example of what I think should work, but does not at
present:

## create test file
x <- 1:10
fn <- "anon-con-test-x.rda"
save(x, file=fn)
testUrl <- paste("file:/", getwd(), fn, sep="/")

## use an anonymous connection to load data from
## the URL as suggested in help(load).
for (i in 1:50) {
    print(load(url(testUrl)))
}

[snip some output]
Error in url(testUrl) : all connections are in use

If such a feature is not possible/desired for the next release, it
might be good to add a note to the documentation for connections that
mentions this issue with anonymous connections.
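
Until something like that exists, the usual workaround is to give each connection a name and close it explicitly. A minimal sketch, assuming a throwaway text file stands in for the .rda URL above (the names tmp, z, n_before and n_after are illustrative only):

```r
## Illustrative only: a temporary text file stands in for the .rda URL above.
tmp <- tempfile(fileext = ".txt")
cat("a line of text\n", file = tmp)

n_before <- nrow(showConnections())   # user connections currently allocated
z <- character(0)
for (i in 1:50) {
    con <- file(tmp, open = "r")      # create and open
    z <- c(z, readLines(con))         # use
    close(con)                        # close and release the slot
}
n_after <- nrow(showConnections())    # same count as before the loop
```

Because every iteration releases its own slot, the loop should not accumulate connections no matter how many times it runs.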


--
 + seth

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Enhancement request: anonymous connections

Duncan Murdoch
On 12/27/2005 7:59 PM, Seth Falcon wrote:

> I would like to be able to use anonymous connections in R and have
> them close themselves when they go out of scope.
>
> Here is an example of what I think should work, but does not at
> present:
>
> ## create test file
> x <- 1:10
> fn <- "anon-con-test-x.rda"
> save(x, file=fn)
> testUrl <- paste("file:/", getwd(), fn, sep="/")
>
> ## use an anonymous connection to load data from
> ## the URL as suggested in help(load).
> for (i in 1:50) {
>     print(load(url(testUrl)))
> }
>
> [snip some output]
> Error in url(testUrl) : all connections are in use
>
> If such a feature is not possible/desired for the next release, it
> might be good to add a note to the documentation for connections that
> mentions this issue with anonymous connections.

This is a bug in load, isn't it?  load() opens the connection but
doesn't close it.

I think a fix is to add a line to load() as shown below:

Index: load.R
===================================================================
--- load.R      (revision 36884)
+++ load.R      (working copy)
@@ -11,6 +11,7 @@
      if(!isOpen(con)) {
          ## code below assumes that the connection is open ...
          open(con, "rb")
+        on.exit(close(con))
      }

      magic <- readChar(con, 5)

Duncan Murdoch


Re: Enhancement request: anonymous connections

Seth Falcon-2
On 27 Dec 2005, [hidden email] wrote:
> This is a bug in load, isn't it?  load() opens the connection but
> doesn't close it.

Well, it may be that load needs a small fix, but that doesn't fix
anonymous connections in general, IMO.

The loop could easily have been:

for (i in 1:50) {
    print(load(url(testUrl, open="r")))
}

And it doesn't need to be related to url or load:

cat("a line of text\n", file="another-example.txt")
z <- NULL
for (i in 1:50) {
    z <- c(z, readLines(file("another-example.txt", open="r")))
}

Also, connections are "in use" even if they are closed:

for (i in 1:50) {
    if (isOpen(file("another-example.txt")))
        stop("you will not get here")
}
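
The same point can be checked without exhausting the table. A small sketch, assuming showConnections() is used to count allocated slots (the variable names are illustrative only):

```r
tmp <- tempfile(fileext = ".txt")
cat("a line of text\n", file = tmp)

n_before <- nrow(showConnections())
con <- file(tmp)                      # created, but never opened
opened_flag <- isOpen(con)            # FALSE
n_created <- nrow(showConnections())  # one more entry than n_before
close(con)                            # releases the slot, opened or not
n_closed <- nrow(showConnections())   # back to n_before
```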


--
+ seth


Re: Enhancement request: anonymous connections

Byron Ellis
I think what you're suggesting is that connections should become  
first-class citizens in the R world by becoming a CONSXP, an  
EXTPTRSXP + Finalizer or whatever (which wouldn't bother me a bit,  
BTW). Then they'd play by the same rules as everything else, although  
you might want to have an explicit close as well as a close on  
finalize, that way you could close and then reopen if you keep a  
reference around (It also really annoys me that close NULLs the value  
of the symbol right now). It'd also be nice to *gasp* have an API for  
I/O from the C side of things as well.

If we wanted to be truly radical we'd just accept that graphics  
devices and event loops are just special cases of the connection and  
merge the whole thing, thus more-or-less reinventing CLIM. :-)

---
Byron Ellis ([hidden email])
"Oook" -- The Librarian


Re: Enhancement request: anonymous connections

Duncan Murdoch
In reply to this post by Seth Falcon-2
On 12/28/2005 9:50 AM, Seth Falcon wrote:
> On 27 Dec 2005, [hidden email] wrote:
>
>>This is a bug in load, isn't it?  load() opens the connection but
>>doesn't close it.
>
>
> Well, it may be that load needs a small fix, but that doesn't fix
> anonymous connections in general, IMO.

No it doesn't.  However, I've committed the small fix.

>
> The loop could easily have been:
>
> for (i in 1:50) {
>     print(load(url(testUrl, open="r")))
> }
>
> And it doesn't need to be related to url or load:
>
> cat("a line of text\n", file="another-example.txt")
> z <- NULL
> for (i in 1:50) {
>     z <- c(z, readLines(file("another-example.txt", open="r")))
> }
>
> Also, connections are "in use" even if they are closed:
>
> for (i in 1:50) {
>     if (isOpen(file("another-example.txt")))
>         stop("you will not get here")
> }

I think the general problem is that R doesn't have references (or at
least, they aren't in a final, documented state).  If the garbage
collector closed a connection, then things would go wrong when there
were two copies of it:  the second one would be messed up when the first
was destroyed.  If we had references, then opening a connection could
create a connection object and a reference to it; the connection object
would remain as long as there were any references to it, and could be
destroyed (and automatically closed) after the last reference was gone.
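
One way to approximate this at the R level is to wrap the connection in an environment, which has reference semantics, and register a finalizer on it. This is only a sketch under that assumption; managed_file() and close_quietly() are hypothetical helpers, not part of R's connection API:

```r
## Hypothetical helper names; nothing here is part of base R's connection API.
close_quietly <- function(ref) {
    ## Finalizer: close the wrapped connection, ignoring errors if it is
    ## already gone.
    try(close(ref$con), silent = TRUE)
    invisible(NULL)
}

managed_file <- function(path, open = "r") {
    ref <- new.env()                      # environments have reference semantics
    ref$con <- file(path, open = open)
    reg.finalizer(ref, close_quietly, onexit = TRUE)
    ref
}

tmp <- tempfile(fileext = ".txt")
cat("a line of text\n", file = tmp)

n_before <- nrow(showConnections())
h <- managed_file(tmp)
h2 <- h                                   # a second reference, not a copy
first_line <- readLines(h2$con, n = 1)
n_open <- nrow(showConnections())
rm(h, h2)                                 # drop the last references
invisible(gc())                           # finalizers run as part of collection
n_after <- nrow(showConnections())
```

The finalizer is defined at top level so that its own environment does not hold on to the wrapper. Anything that extracts h$con and keeps it separately still bypasses the scheme, which is the two-copies problem described above.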

Duncan Murdoch


Re: Enhancement request: anonymous connections

Prof Brian Ripley
On Sun, 1 Jan 2006, Duncan Murdoch wrote:

> On 12/28/2005 9:50 AM, Seth Falcon wrote:
>> On 27 Dec 2005, [hidden email] wrote:
>>
>>> This is a bug in load, isn't it?  load() opens the connection but
>>> doesn't close it.
>>
>>
>> Well, it may be that load needs a small fix, but that doesn't fix
>> anonymous connections in general, IMO.
>
> No it doesn't.  However, I've committed the small fix.

It was not even a bug in load: close() and open() are not pairs.  (I
didn't pick the names!) Your `fix' destroys a connection, which is
not the documented behaviour and far more dangerous than leaving it open.

The lifecycle of a connection is  (see e.g. my R-news article)

  create->open->close->destroy

and close() does both of the last two.  Please revert this change.
Ideally we would close and not destroy if the connection was opened, but
that needs a better C-level interface in place of this R-level one.
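
The four stages can be observed from the R level. A short sketch, assuming a temporary text file (variable names are illustrative only):

```r
tmp <- tempfile(fileext = ".txt")
cat("a line of text\n", file = tmp)

con <- file(tmp)                 # create: slot allocated, nothing opened
state_created <- isOpen(con)     # FALSE
open(con, "r")                   # open
state_opened <- isOpen(con)      # TRUE
close(con)                       # close and destroy in one step

## The R object still exists but no longer refers to a usable connection,
## so even asking whether it is open signals an error.
state_closed <- tryCatch(isOpen(con), error = function(e) "error")
```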

>> The loop could easily have been:
>>
>> for (i in 1:50) {
>>     print(load(url(testUrl, open="r")))
>> }
>>
>> And it doesn't need to be related to url or load:
>>
>> cat("a line of text\n", file="another-example.txt")
>> z <- NULL
>> for (i in 1:50) {
>>     z <- c(z, readLines(file("another-example.txt", open="r")))
>> }
>>
>> Also, connections are "in use" even if they are closed:
>>
>> for (i in 1:50) {
>>     if (isOpen(file("another-example.txt")))
>>         stop("you will not get here")
>> }
>
> I think the general problem is that R doesn't have references (or at
> least, they aren't in a final, documented state).  If the garbage
> collector closed a connection, then things would go wrong when there
> were two copies of it:  the second one would be messed up when the first
> was destroyed.  If we had references, then opening a connection could
> create a connection object and a reference to it; the connection object
> would remain as long as there were any references to it, and could be
> destroyed (and automatically closed) after the last reference was gone.

However, that just isn't how connections are documented in the Green Book
(referenced on all the relevant help pages, so required reading) and
getConnection() allows you to create an R object pointing to a connection
that previously had none.  The OP has never told us what `anonymous
connections' are, but it is quite possible that his unstated ideas are
incompatible with the documentation.
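
A sketch of the getConnection() point, assuming the new connection's number is found by comparing the row names of showConnections() before and after creating it (variable names are illustrative only):

```r
tmp <- tempfile(fileext = ".txt")
cat("a line of text\n", file = tmp)

ids_before <- rownames(showConnections())
con <- file(tmp, open = "r")
new_id <- setdiff(rownames(showConnections()), ids_before)

## A second R object for the same underlying connection.
con2 <- getConnection(as.integer(new_id))
same_target <- identical(summary(con2)$description,
                         summary(con)$description)

close(con2)   # destroys the connection itself, so 'con' is unusable too
con_still_usable <- tryCatch(isOpen(con), error = function(e) FALSE)
```

Since any code can obtain such an object from the number alone, the absence of a visible R object does not by itself show that a connection is unwanted.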

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Enhancement request: anonymous connections

Duncan Murdoch
On 1/1/2006 1:05 PM, Prof Brian Ripley wrote:

> On Sun, 1 Jan 2006, Duncan Murdoch wrote:
>
>
>>On 12/28/2005 9:50 AM, Seth Falcon wrote:
>>
>>>On 27 Dec 2005, [hidden email] wrote:
>>>
>>>
>>>>This is a bug in load, isn't it?  load() opens the connection but
>>>>doesn't close it.
>>>
>>>
>>>Well, it may be that load needs a small fix, but that doesn't fix
>>>anonymous connections in general, IMO.
>>
>>No it doesn't.  However, I've committed the small fix.
>
>
> It was not even a bug in load: close() and open() are not pairs.  (I
> didn't pick the names!) Your `fix' destroys a connection, which is
> not the documented behaviour and far more dangerous than leaving it open.
>
> The lifecycle of a connection is  (see e.g. my R-news article)
>
>   create->open->close->destroy
>
> and close() does both of the last two.  Please revert this change.
> Ideally we would close and not destroy if the connection was opened, but
> that needs a better C-level interface in place of this R-level one.

Sorry about that.  I've done the reversion.

Is it worth putting that C-level interface in place to make load() more
compatible with ?connections which says,

      'open' opens a connection.  In general functions using connections
      will open them if they are not open, but then close them again, so
      to leave a connection open call 'open' explicitly.

?  I suppose a natural way to do it would be to add a "destroy=TRUE"
argument to close(), and then have load() do

  on.exit(close(con, destroy=FALSE))

but maybe it would be better to add a separate function to do this, for
better green book compatibility.
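
Without a destroy argument, one R-level pattern is for a function to close only the connections it created itself. A minimal sketch, not the proposed interface; read_first_line() is a hypothetical helper and the refusal to open a caller's unopened connection is an assumption made to avoid destroying it:

```r
## 'read_first_line' is a hypothetical helper, not a base R function.
read_first_line <- function(src) {
    if (is.character(src)) {
        con <- file(src, open = "r")   # we created it, so we may destroy it
        on.exit(close(con))
    } else {
        if (!isOpen(src)) stop("pass an open connection or a file path")
        con <- src                     # caller's connection: leave it alone
    }
    readLines(con, n = 1)
}

tmp <- tempfile(fileext = ".txt")
cat("first\nsecond\n", file = tmp)

from_path <- read_first_line(tmp)
con <- file(tmp, open = "r")
from_con_1 <- read_first_line(con)
from_con_2 <- read_first_line(con)     # continues where the last call stopped
still_open <- isOpen(con)
close(con)
```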

Duncan Murdoch


Re: Enhancement request: anonymous connections

Seth Falcon-2
In reply to this post by Prof Brian Ripley
On  1 Jan 2006, [hidden email] wrote:
> However, that just isn't how connections are documented in the Green
> Book (referenced on all the relevant help pages, so required
> reading) and getConnection() allows you to create an R object
> pointing to a connection that previously had none.  

Yes, p. 384 of my copy of the Green Book explains:

    Once a connection has been opened, the evaluation manager keeps it
    open until it is explicitly closed, or the session ends, even if
    no corresponding S connection object exists.

What are some advantages of this design choice?  

> The OP has never told us what `anonymous connections' are, but it is
> quite possible that his unstated ideas are incompatible with the
> documentation.

Yep.  The current behavior is as documented in the Green Book and yes,
the enhancement I would like is incompatible with that documentation.

What is an anonymous connection?  Well, really what I want is for the
evaluation manager to clean up connections that have no corresponding
S connection objects referring to them.  Then the code examples that
are part of this thread would work.


+ seth
