write.csv

Lipatz Jean-Luc
Hi all,

I am currently studying how to generalize the use of R in my statistical institute, and I encountered a problem that I cannot report on Bugzilla (I cannot understand why). Sorry for using this mailing list, but I am really worried about the problem itself and its possible implications for using R in a professional data production context.
The issue with 'write.csv' is that it does not check whether there is enough space on disk and does not report a failure to write data.

Example (R 3.4.0, 32-bit Windows; I also reproduced the problem with older versions and under Mac OS X):

> fwrite(as.list(1:1000000),"G:/Test")
Error in fwrite(as.list(1:1e+06), "G:/Test") :
  No space left on device: 'G:/Test'
> write.csv(1:1000000,"G:/Test")
>

This is a big concern, because it means that you could save some important data at one point in time and discover much later that you actually lost them. I suppose that the fix is relatively straightforward, but how can we be sure that there is no other function with the same bad properties? Is the lesson that you should not use an R function, even from the core, without having personally tested it against extreme conditions? And wouldn't it be the work of the developers to do such elementary tests?

Thanks


Jean-Luc LIPATZ
Insee - Direction générale

Coordinator for the development of R and the implementation of alternatives to SAS


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: write.csv

Duncan Murdoch-2
On 04/07/2017 5:40 AM, Lipatz Jean-Luc wrote:
> Hi all,
>
> I am currently studying how to generalize the usage of R in my statistical institute and I encountered a problem that I cannot declare on bugzilla (cannot understand why).

Bugzilla was badly abused by spammers last year, so you need to have
your account created manually by one of the admins to post there.  Write
to me privately if you'd like me to create an account for you.  (If you
want it attached to a different email address, that's fine.)

> Sorry for trying this mailing list but I am really worried about the
> problem itself and the possible implications in using R in a
> professionnal data production context.

> The issue about 'write.csv' is that it just doesn't check if there is enough space on disk and doesn't report failure to write data.
>
> Example (R 3.4.0 windows 32 bits, but I reproduced the problem with older versions and under Mac OS/X)
>
>> fwrite(as.list(1:1000000),"G:/Test")
> Error in fwrite(as.list(1:1e+06), "G:/Test") :
>   No space left on device: 'G:/Test'
>> write.csv(1:1000000,"G:/Test")
>>
>
> I have a big concern here, because it means that you could save some important data at one point of time and discover a long time after that you actually lost them.
> I suppose that the fix is relatively straightforward, but how can we
> be sure that there is no another function with the same bad properties?

R is open source.  You could work out the patch for this bug, and in the
process see the pattern of coding that leads to it.  Then you'll know if
other functions use the same buggy pattern.

> Is the lesson that you should not use a R function, even from the core, without having personnally tested it against extreme conditions?

I think the answer to that is yes.  Most people never write such big
files that they fill their disk:  if they did, all sorts of things would
go wrong on their systems.  So this kind of extreme condition isn't
often tested.  It's not easy to test in a platform independent way:  R
would need to be able to create a volume with a small capacity.  That's
a very system-dependent thing to do.

> And wouldn't it be the work of the developpers to do such elementary tests?

Again, R is open source.  You can and should contribute code (and
therefore become one of the developers) if you are working in unusual
conditions.

R states quite clearly in the welcome message every time it starts: "R
is free software and comes with ABSOLUTELY NO WARRANTY."  This is
essentially the same lack of warranty that you get with commercial
software, though it's stated a lot more clearly.

Duncan Murdoch


Re: write.csv

Lipatz Jean-Luc
I would really like the bug to be fixed. At least this one, because I know people in my institute who use this function.
I understand your arguments about open source, but I also saw on this mailing list a proposed fix for this bug that received no answer from the people who are able to include it in the distribution. It looks as if there are interesting bugs and then the other ones.
I don't understand the other arguments: the example was reproduced with a simple USB key, and you cannot assume that a disk will always be empty enough, especially when it has several users.

JLL



Re: write.csv

Nathan Sosnovske (via R devel mailing list)
This doesn't really strike me as a bug. Lots of (most?) programming languages expect you to handle this as an error condition. If you tried the same thing in C you would get the same error.


Re: write.csv

Duncan Murdoch-2
In reply to this post by Lipatz Jean-Luc
On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
> I would really like the bug fixed. At least this one, because I know people in my institute using this function.
> I understand your arguments about open source, but I also saw in this mail list a proposal for a fix for this bug for which there were no answer from the people who are able to include it in the distribution. It looks like if there were interesting bugs and the other ones.

Please post a link to that, and I'll look.  Bug reports should be posted
to the bug list.  It's unfortunate that it is currently so difficult to
do so, but if they are only posted here, they are often overlooked.

> I don't understand the other arguments : the example was reproduced with a simple USB key and you cannot state that a disk will eternally be empty enough, specially when it has several users.

I am not denying that it's a bug, I'm just saying that it is a difficult
one to test automatically (so we probably won't add a regression test
once it's fixed), and it's not one that has been reported often.  I
didn't know there were any reports before yours.

Duncan Murdoch


Re: write.csv

Duncan Murdoch-2
In reply to this post by R devel mailing list
On 04/07/2017 8:46 AM, Nathan Sosnovske wrote:
> This doesn't really strike me as a bug. Lots of (most?) programming languages expect you to handle this as an error condition. If you tried the same thing in C you would get the same error.

The bug is that there is no error signalled.  It looks as though the
write succeeded, when it didn't.

Duncan Murdoch


Re: write.csv

Nathan Sosnovske (via R devel mailing list)
Ah, I misread the example code (went straight to the line where the error was raised for fwrite). Apologies Jean-Luc.


Re: write.csv

Joris FA Meys
In reply to this post by Lipatz Jean-Luc
I tested this myself, and the "reason" why write.csv() does not give any
error is that a file is created. I tested the following on a USB stick
with only 32 MB of free space:

write.csv(data.frame(V=rnorm(2e7),
                     V2= rnorm(2e7),
                     V3 = rnorm(2e7)),
          file = "G:/Test.csv")

X <- read.csv("G:/Test.csv")

Gives:

> str(X)
'data.frame':    506336 obs. of  4 variables:
 $ X : int  1 2 3 4 5 6 7 8 9 10 ...
 $ V : num  0.0666 -1.2052 -0.2288 -0.4758 1.9168 ...
 $ V2: num  -0.304 -1.766 -1.611 -0.221 -1.118 ...
 $ V3: num  -0.6774 0.0841 0.2062 1.7053 -0.2105 ...

So the first part of the data is actually stored. I totally agree that at
least a warning could be given to tell you that not all lines were saved.
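
Until such a warning exists, one way to catch a truncated file is to re-read it right after writing. The following is only a minimal sketch: `verify_csv_lines()` and `write_csv_checked()` are hypothetical helper names, not part of R, and the line count check assumes that no field contains an embedded newline.

```r
# Hypothetical helper: check that a CSV file has a header plus n_rows
# complete lines. Assumes no embedded newlines inside quoted fields.
verify_csv_lines <- function(file, n_rows) {
  lines <- tryCatch(
    readLines(file, warn = TRUE),
    # readLines() warns about an incomplete final line, a sign of truncation
    warning = function(w) {
      stop("'", file, "' looks truncated: ", conditionMessage(w), call. = FALSE)
    }
  )
  expected <- n_rows + 1L  # write.csv() always writes a header line
  if (length(lines) != expected) {
    stop(sprintf("'%s' has %d lines, expected %d",
                 file, length(lines), expected), call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical wrapper: write, then re-read the file to confirm it is complete.
write_csv_checked <- function(x, file, ...) {
  x <- as.data.frame(x)
  write.csv(x, file = file, ...)
  verify_csv_lines(file, nrow(x))
  invisible(file)
}
```

Re-reading doubles the I/O, so this trades speed for the guarantee that a short write does not go unnoticed.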

While Duncan's reaction might come off a bit direct, please understand that
they are not employees but volunteers. You can demand things from a
company, but in the case of R that's actually rather rude, even when not
intended that way.

Given my limited C skills and my wife hating it when I'm solving other
people's problems in the middle of the night, I'm not hacking in the R core
myself. But as for now, I can offer you this very naive and for big
datasets very time consuming function to check beforehand whether you have
enough space:

testSpace <- function(df,dir){
   # Total number of characters needed to print every value
   totchar <- do.call(sum,
                      lapply(df,
                             function(i) sum(nchar(as.character(i)))))
   # On Windows! Reduce the path to its drive letter, e.g. "G:"
   path <- path.expand(dir)
   path <- gsub("(^[A-Z]{1}:)/.*","\\1",path)

   # wmic reports the free space of each drive in bytes
   disks <- system("wmic logicaldisk get freespace, caption",
                   intern = TRUE)

   available <- disks[grep(path,disks)]
   available <- gsub("\\D","",available)
   # Assume 2 bytes per char in UTF-8, which is very liberal
   # but not uncommon; both sides of the comparison are in bytes
   totchar*2 < as.numeric(available)
}

Gives after about half a minute:

> mydf <- data.frame(V=rnorm(1e7))
> testSpace(mydf, "G:/text.csv")
[1] FALSE
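
A cheaper way to size the output, sketched here under stated assumptions, is to write a random sample of rows to a temporary file and scale its size up to the full data set. `estimate_csv_bytes()` is a hypothetical name; the result is only an estimate and assumes the sampled rows are representative of the rest.

```r
# Hypothetical helper: estimate the size in bytes of the CSV that
# write.csv() would produce for df, from a random sample of n rows.
estimate_csv_bytes <- function(df, n = 1000L) {
  n <- min(n, nrow(df))
  if (n == 0L) return(0)
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  # Keep the original row names so their width is represented in the sample
  idx <- sort(sample.int(nrow(df), n))
  write.csv(df[idx, , drop = FALSE], tmp)
  # Bytes per sampled row, scaled to all rows (the header is counted too,
  # which slightly overestimates the total)
  file.size(tmp) / n * nrow(df)
}
```

The estimate could then be compared with the free space reported for the target drive, for example the value extracted from the wmic call above.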

Best regards
Joris

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
[hidden email]
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php


Re: write.csv

Jim Hester
In reply to this post by Duncan Murdoch-2
On linux at least you can use `/dev/full` [1] to test writing to a full device.

    > echo 'foo' > /dev/full
    bash: echo: write error: No space left on device

Although that won't be a perfect test for this case where part of the
file is written successfully.

An alternative suggestion for testing this is to create and mount a
loop device [2] with a small file.

[1]: https://en.wikipedia.org/wiki//dev/full
[2]: https://stackoverflow.com/a/16044420/2055486
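
The same check can be attempted from R rather than from bash. The following is only a sketch, assuming a Linux system where /dev/full exists; the helper name `write_failure_is_reported` and the row count are made up for illustration. Whether it returns TRUE or FALSE depends on the R version in use, so no result is shown.

```r
# Sketch only: returns TRUE if R signals an error or warning when write.csv
# targets a device that is always full, FALSE if the failure is silent,
# and NA where the target does not exist (e.g. no /dev/full on this system).
# Note: a failure to open the target (such as "permission denied") is also
# reported as TRUE, so this does not distinguish open errors from write errors.
write_failure_is_reported <- function(target = "/dev/full", n = 1e5) {
  if (!file.exists(target)) return(NA)
  tryCatch({
    # n rows are written so that the output is larger than a typical
    # stdio buffer (an assumption about the platform's buffering).
    write.csv(data.frame(x = seq_len(n)), file = target)
    FALSE
  },
  warning = function(w) TRUE,
  error = function(e) TRUE)
}

write_failure_is_reported()
```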

On Tue, Jul 4, 2017 at 3:38 PM, Duncan Murdoch <[hidden email]> wrote:

> On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
>>
>> I would really like the bug fixed. At least this one, because I know
>> people in my institute using this function.
>> I understand your arguments about open source, but I also saw on this mailing
>> list a proposed fix for this bug that received no answer from
>> the people who are able to include it in the distribution. It looks as if
>> there were interesting bugs and then the other ones.
>
>
> Please post a link to that, and I'll look.  Bug reports should be posted to
> the bug list.  It's unfortunate that it is currently so difficult to do so,
> but if they are only posted here, they are often overlooked.
>
>> I don't understand the other arguments: the example was reproduced with a
>> simple USB key and you cannot assume that a disk will always be empty
>> enough, especially when it has several users.
>
>
> I am not denying that it's a bug, I'm just saying that it is a difficult one
> to test automatically (so we probably won't add a regression test once it's
> fixed), and it's not one that has been reported often.  I didn't know there
> were any reports before yours.
>
> Duncan Murdoch
>
>
>> JLL

Re: write.csv

Duncan Murdoch-2
On 04/07/2017 10:01 AM, Jim Hester wrote:
> On linux at least you can use `/dev/full` [1] to test writing to a full device.
>
>     > echo 'foo' > /dev/full
>     bash: echo: write error: No space left on device

Unfortunately, I get a permission denied error if I try to write there
from MacOS.  I don't know if Windows has an equivalent.

I've taken a look at the code.  Essentially it comes down to a call to
the C function vfprintf, which is supposed to return the number of bytes
written, or a negative value for an error. This return value is often
not checked; in particular, write.table and friends don't check it.

I'll add code to signal an error if there's a negative value.

I don't think it's feasible to check the number of bytes (formatted text
with possible translation to a different encoding could have any number
of bytes) if it's positive.  So hopefully all of our file systems will
correctly signal an error, and not just report how many bytes were
successfully written.
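
Until such a check is in place, a user can verify the result at the R level by reading the file back. The following is a minimal sketch, not the fix described above: `checked_write_csv` is a hypothetical helper, and it assumes that no field contains an embedded newline. Counting lines is a coarse check, since a file truncated inside its last line would still pass.

```r
# Hypothetical helper (not part of base R): write a data frame with write.csv,
# then confirm that the file on disk has the expected number of lines.
checked_write_csv <- function(df, path, ...) {
  write.csv(df, file = path, ...)
  expected <- nrow(df) + 1L             # header line plus one line per row
  written  <- length(readLines(path))   # re-read what actually reached the disk
  if (written != expected) {
    stop(sprintf("'%s' has %d lines, expected %d: the write may have failed",
                 path, written, expected))
  }
  invisible(path)
}

out <- tempfile(fileext = ".csv")
checked_write_csv(data.frame(x = 1:1000), out)
```

Reading the whole file back doubles the I/O cost, so this only makes sense for data where silent loss would be expensive.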

>
> Although that won't be a perfect test for this case where part of the
> file is written successfully.
>
> An alternative suggestion for testing this is to create and mount a
> loop device [2] with a small file.
>
> [1]: https://en.wikipedia.org/wiki//dev/full
> [2]: https://stackoverflow.com/a/16044420/2055486

Loop devices sound ideal, but seem to be Linux-only (at least with that
recipe).

Duncan



Re: write.csv

Nathan Sosnovske via R-devel
The best way to test on Windows would probably be creating a small virtual hard disk (via CreateVirtualDisk), mounting it, and writing to the mounted location. I believe the drive could even be mounted to an arbitrary location on the filesystem (instead of a drive letter) so that drive letter conflicts don't come into play.


Re: write.csv

January Weiner-3
Dear Jean-Luc,

Neither write.csv nor save nor save.image nor any other default write
function in R checks whether enough space remains. While this might indeed be
a problem that one should take care of sooner or later, I would
strongly recommend using data.table::fwrite as the workhorse for saving
CSV files anyway.

First, it takes better care of error conditions (as you demonstrated).
Second, it is much faster:

> system.time(fwrite(list(a=1:1e8), file="test.csv"))
   user  system elapsed
  4.672   0.572   0.857
> system.time(write.csv(list(a=1:1e8), file="test.csv"))
   user  system elapsed
165.056   2.684 176.832

That said, I think that the larger issue here is that the logic behind the
family of functions for saving data in base R is different from the logic
of fwrite(). While fwrite only writes its contents to a file, save(),
write.csv() and family are based on R connections and can write not
only to a file, but to any sort of connection. For example, you can
write directly to a compressed file:

df <- data.frame(a=1:1000)
gz <- gzfile("file.gz")
write.csv(df, file=gz)

It can be a socket, a pipe, a URL, and so on.
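
To make the connection-based design concrete, here is a small sketch in which the same write.csv call targets two different kinds of connection: a gzip-compressed file and an in-memory text connection. The temporary path and the variable names are chosen only for this illustration.

```r
df <- data.frame(a = 1:1000)

# 1. A gzip-compressed file connection (the path is a temporary file).
#    write.csv opens the connection itself and closes it when done.
gz_path <- tempfile(fileext = ".csv.gz")
write.csv(df, file = gzfile(gz_path), row.names = FALSE)
from_gz <- read.csv(gz_path)   # read.csv detects gzip compression on reading

# 2. An in-memory text connection that collects the output lines.
tc <- textConnection(NULL, open = "w")
write.csv(df, file = tc, row.names = FALSE)
csv_lines <- textConnectionValue(tc)
close(tc)

identical(from_gz$a, df$a)   # did the compressed round trip preserve the data?
length(csv_lines)            # header line plus one line per row
```

Because each connection type has its own low-level write routine, an error check that works for one of these targets does not automatically cover the other.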

The problem might be that there is no easy, general way of testing for the
specific errors. Internally (see the code in the connections.c file in the R
sources), the Rconnection object has a member called "write", which is a
pointer to the function that writes the data to the connection, with a different
function for each type of connection. I do not fully understand all
of this code, but since the functions used for writing return errors (I
think) in different ways, it could be that a reasonable solution is not
straightforward.

In the end and at the moment, as usual, you are faced with a compromise
between safety and freedom. If you need freedom or flexibility, you need to
use the core R functions, which allow you to compress data on the fly or use
all sorts of connections. If you would rather stay safe, tell your users to
use fwrite for crucial data.

Best,

j.


--
-------- January Weiner --------------------------------------


Re: write.csv

Nathan Sosnovske via R-devel
In reply to this post by Nathan Sosnovske via R-devel
As a follow-up to this, Martin Maechler suggested that I write a small script that allows this to be tested on Windows from R. The attached script demonstrates creating a very small (4 MiB) VHD, mounting it to a path on the system (in this case c:/smallmount), and then calling write.csv with a large data frame on the newly mounted path.

The method I attached has two limitations.

1) It will only work if run as administrator. This is a limitation of the OS.
2) It will only work on Windows 8.1/Server 2012 R2 or higher, as it uses PowerShell commands that only exist on those operating systems. I believe it could be made to work on Windows 7, but it would then need to be written using the Windows C APIs.

If a regression test is created for this issue, I would be more than happy to integrate this method into it, provided there is interest and it works in the environment where the automated Windows builds and tests are run.

-----Original Message-----
From: Nathan Sosnovske
Sent: Tuesday, July 4, 2017 8:02 AM
To: 'Duncan Murdoch' <[hidden email]>; Jim Hester <[hidden email]>
Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
Subject: RE: [Rd] write.csv

The best way to test on Windows would probably be creating a small virtual hard disk (via CreateVirtualDisk), mounting it, and writing to the mounted location. I believe the drive could even be mounted to an arbitrary location on the filesystem (instead of a drive letter) so that drive letter conflicts don't come into play.

-----Original Message-----
From: R-devel [mailto:[hidden email]] On Behalf Of Duncan Murdoch
Sent: Tuesday, July 4, 2017 7:53 AM
To: Jim Hester <[hidden email]>
Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
Subject: Re: [Rd] write.csv

On 04/07/2017 10:01 AM, Jim Hester wrote:
> On Linux at least, you can use `/dev/full` [1] to test writing to a full device.
>
>     > echo 'foo' > /dev/full
>     bash: echo: write error: No space left on device

Unfortunately, I get a permission denied error if I try to write there from MacOS.  I don't know if Windows has an equivalent.
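
The `/dev/full` idea can also be tried from C. The sketch below is only an illustration: `write_checked` is a hypothetical helper, not part of R, and it assumes a Linux system where `/dev/full` exists and is writable. It writes a string to a path and reports the stage at which the first failure is detected. Because stdio buffers output, a short write to a full device will usually be reported by `fflush` or `fclose` rather than by the write call itself.

```c
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration, not R's code):
 * write `text` to `path` and report the first stage that fails.
 * Returns 0 on success, otherwise 1 (open), 2 (write), 3 (flush), 4 (close). */
int write_checked(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return 1;

    /* With a buffered stream this may succeed even if the device is full. */
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return 2;
    }

    /* Flushing forces the data out, so a full device is detected here. */
    if (fflush(fp) == EOF) {
        fclose(fp);
        return 3;
    }

    /* fclose can also fail and must be checked. */
    if (fclose(fp) == EOF)
        return 4;

    return 0;
}
```

On such a system, `write_checked("/dev/full", "foo\n")` should return a non-zero stage code, while the same call with `/dev/null` returns 0.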

I've taken a look at the code.  Essentially it comes down to a call to the C function vfprintf, which is supposed to return the number of bytes written, or a negative value for an error. This return value is often not checked; in particular, write.table and friends don't check it.

I'll add code to signal an error if there's a negative value.

I don't think it's feasible to check the number of bytes (formatted text with possible translation to a different encoding could have any number of bytes) if it's positive.  So hopefully all of our file systems will correctly signal an error, and not just report how many bytes were successfully written.
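
A minimal C sketch of the kind of check described above. It is not R's actual source: `checked_fprintf` is a hypothetical wrapper name, and the return codes are assumptions chosen for the example. It treats a negative `vfprintf` result as a failure and deliberately does not compare a non-negative count with an expected length.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper (an assumption for illustration): format to `fp`
 * and turn a negative vfprintf result into an explicit failure code.
 * Returns 0 on success and -1 if vfprintf reported an error. */
int checked_fprintf(FILE *fp, const char *fmt, ...)
{
    va_list ap;
    int res;

    va_start(ap, fmt);
    res = vfprintf(fp, fmt, ap);
    va_end(ap);

    if (res < 0) {
        /* A caller such as a table writer would signal an error here. */
        return -1;
    }

    /* A non-negative count is not checked further: the number of bytes
     * depends on the formatting and on any encoding conversion. */
    return 0;
}
```

A caller would stop writing and raise an error as soon as the wrapper returns -1. Since stdio output is buffered, the results of `fflush` and `fclose` still need to be checked separately.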

>
> Although that won't be a perfect test for this case where part of the
> file is written successfully.
>
> An alternative suggestion for testing this is to create and mount a
> loop device [2] with a small file.
>
> [1]: https://en.wikipedia.org/wiki//dev/full
> [2]: https://stackoverflow.com/a/16044420/2055486

Loop devices sound ideal, but seem to be Linux-only (at least with that recipe).

Duncan


>
> On Tue, Jul 4, 2017 at 3:38 PM, Duncan Murdoch <[hidden email]> wrote:
>> On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
>>>
>>> I would really like the bug fixed. At least this one, because I know
>>> people in my institute using this function.
>>> I understand your arguments about open source, but I also saw on
>>> this mailing list a proposal for a fix for this bug for which there
>>> was no answer from the people who are able to include it in the
>>> distribution. It looks as if there are interesting bugs and the other ones.
>>
>>
>> Please post a link to that, and I'll look.  Bug reports should be
>> posted to the bug list.  It's unfortunate that it is currently so
>> difficult to do so, but if they are only posted here, they are often overlooked.
>>
>>> I don't understand the other arguments: the example was reproduced
>>> with a simple USB key, and you cannot assume that a disk will
>>> always have enough free space, especially when it has several users.
>>
>>
>> I am not denying that it's a bug, I'm just saying that it is a
>> difficult one to test automatically (so we probably won't add a
>> regression test once it's fixed), and it's not one that has been
>> reported often.  I didn't know there were any reports before yours.
>>
>> Duncan Murdoch
>>
>>
>>> JLL
>>>
>>>
>>> -----Original Message-----
>>> From: Duncan Murdoch [mailto:[hidden email]]
>>> Sent: Tuesday, 4 July 2017 14:24
>>> To: Lipatz Jean-Luc; [hidden email]
>>> Subject: Re: [Rd] write.csv
>>>
>>> On 04/07/2017 5:40 AM, Lipatz Jean-Luc wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I am currently studying how to generalize the usage of R in my
>>>> statistical institute and I encountered a problem that I cannot
>>>> report on Bugzilla (I cannot understand why).
>>>
>>>
>>> Bugzilla was badly abused by spammers last year, so you need to have
>>> your account created manually by one of the admins to post there.
>>> Write to me privately if you'd like me to create an account for you.  
>>> (If you want it attached to a different email address, that's fine.)
>>>
>>>> Sorry for trying this mailing list, but I am really worried about the
>>>> problem itself and the possible implications of using R in a
>>>> professional data production context.
>>>>
>>>> The issue about 'write.csv' is that it just doesn't check if there
>>>> is enough space on disk and doesn't report failure to write data.
>>>>
>>>> Example (R 3.4.0 windows 32 bits, but I reproduced the problem with
>>>> older versions and under Mac OS/X)
>>>>
>>>> > fwrite(as.list(1:1000000),"G:/Test")
>>>> Error in fwrite(as.list(1:1e+06), "G:/Test") :
>>>>   No space left on device: 'G:/Test'
>>>> > write.csv(1:1000000,"G:/Test")
>>>> >
>>>>
>>>> I have a big concern here, because it means that you could save
>>>> some important data at one point in time and discover much
>>>> later that you actually lost it.
>>>
>>>> I suppose that the fix is relatively straightforward, but how can
>>>> we be sure that there is no other function with the same bad properties?
>>>
>>> R is open source.  You could work out the patch for this bug, and in
>>> the process see the pattern of coding that leads to it.  Then you'll
>>> know if other functions use the same buggy pattern.
>>>
>>>> Is the lesson that you should not use an R function, even from the
>>>> core, without having personally tested it against extreme conditions?
>>>
>>>
>>> I think the answer to that is yes.  Most people never write such big
>>> files that they fill their disk:  if they did, all sorts of things
>>> would go wrong on their systems.  So this kind of extreme condition
>>> isn't often tested.  It's not easy to test in a platform independent
>>> way:  R would need to be able to create a volume with a small
>>> capacity.  That's a very system-dependent thing to do.
>>>
>>>> And wouldn't it be the work of the developers to do such
>>>> elementary tests?
>>>
>>>
>>> Again, R is open source.  You can and should contribute code (and
>>> therefore become one of the developers) if you are working in
>>> unusual conditions.
>>>
>>> R states quite clearly in the welcome message every time it starts:
>>> "R is free software and comes with ABSOLUTELY NO WARRANTY."  This is
>>> essentially the same lack of warranty that you get with commercial
>>> software, though it's stated a lot more clearly.
>>>
>>> Duncan Murdoch
>>>
>>

Re: write.csv

Uwe Ligges-3
This is a bit difficult:

R binaries (both for R base and R packages) are still built with Windows
Server 2008 and my desktop machine is Windows 7, hence at least
currently such a check would not get executed on the machines R core /
CRAN use ...

Best,
Uwe

On 11.07.2017 00:59, Nathan Sosnovske via R-devel wrote:

> As a follow-up to this, Martin Maechler suggested that I write a small script that allows this to be tested on Windows from R. The attached script demonstrates creating a very small (4 MiB) VHD, mounting it at a path on the system (in this case c:/smallmount), and then calling write.csv to write a large data frame to the newly mounted path.
>
> The method I attached has two limitations.
>
> 1) It will only work if run as administrator. This is a limitation of the OS.
> 2) It will only work on Windows 8.1/Server 2012 R2 or higher, as it uses PowerShell commands that only exist on those operating systems. I believe it could be made to work on Windows 7, but it would need to be written using the Windows C APIs at that point.
>
> If a regression test is created for this issue, I would be more than happy to integrate this method into it, provided there is interest and it works in the environment where the automated Windows builds and tests are run.

Re: write.csv

R devel mailing list
Two thoughts:

1) Will the new server that you are setting up with Server 2016 eventually host build and test? If so, this could at least run on that.
2) CreateVirtualDisk and OpenVirtualDisk are C functions that are available in Windows 7 and Server 2008 R2. So these are options that we could use, but it would require creating a small program to drive creation/mounting/deletion of the disk and compiling it at build time.

Nathan

-----Original Message-----
From: Uwe Ligges [mailto:[hidden email]]
Sent: Tuesday, July 11, 2017 5:09 AM
To: Nathan Sosnovske <[hidden email]>; Duncan Murdoch <[hidden email]>; Jim Hester <[hidden email]>
Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
Subject: Re: [Rd] write.csv

This is a bit difficult:

R binaries (both for R base and R packages) are still built with Windows Server 2008 and my desktop machine is Windows 7, hence at least currently such a check would not get executed on the machines R core / CRAN use ...

Best,
Uwe


Re: write.csv

Uwe Ligges-3


On 12.07.2017 05:09, Nathan Sosnovske wrote:
> Two thoughts:
>
> 1) Will the new server that you are setting up with Server 2016 eventually host build and test? If so, this could at least run on that.
> 2) CreateVirtualDisk and OpenVirtualDisk are C functions that are available in Windows 7 and Server 2008 R2. So these are options that we could use, but it would require creating a small program to drive creation/mounting/deletion of the disk and compiling it at build time.

OK, thank you. Then the best way forward is probably to do 1) and
include the test you propose, but keep it conditional on availability of
admin permissions and an OS that supports it, please.

Best,
Uwe


>
> Nathan
>
> -----Original Message-----
> From: Uwe Ligges [mailto:[hidden email]]
> Sent: Tuesday, July 11, 2017 5:09 AM
> To: Nathan Sosnovske <[hidden email]>; Duncan Murdoch <[hidden email]>; Jim Hester <[hidden email]>
> Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
> Subject: Re: [Rd] write.csv
>
> This is a bit difficult:
>
> R binaries (both for R base and R packages) are still built with Windows Server 2008 and my desktop machine is Windows 7, hence at least currently such a check would not get executed on the machines R core / CRAN use ...
>
> Best,
> Uwe
>
> On 11.07.2017 00:59, Nathan Sosnovske via R-devel wrote:
>> As a follow up to this, Martin Maechler suggested that I write a small script that allows this to be tested on Windows from R. The attached script demonstrates creating and mounting a very small (4 MiB) VHD to a path on the system (in this case c:/smallmount) and then calling write.csv with a large dataframe to the newly mounted path.
>>
>> The method I attached has two limitations.
>>
>> 1) It will only work if run as administrator. This is a limitation of the OS.
>> 2) It will only work on Windows 8.1/Server 2012R2 or higher, as it uses powershell commands that only exist on those operating systems. I believe it could be made to work on Windows 7, but it would need to be written using the windows C apis at that point.
>>
>> If a regression test is created for this issue I would be more than happy to integrate this method into that if there is interest and if it would work in the environment where automated builds/tests for windows are run.
>>
>> -----Original Message-----
>> From: Nathan Sosnovske
>> Sent: Tuesday, July 4, 2017 8:02 AM
>> To: 'Duncan Murdoch' <[hidden email]>; Jim Hester
>> <[hidden email]>
>> Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
>> Subject: RE: [Rd] write.csv
>>
>> The best way to test on Windows would probably be creating a small virtual hard disk (via CreateVirtualDisk), mounting it, and writing to the mounted location. I believe the drive could even be mounted to an arbitrary location on the filesystem (instead of a drive letter) so that drive letter conflicts don't come into play.
>>
>> -----Original Message-----
>> From: R-devel [mailto:[hidden email]] On Behalf Of
>> Duncan Murdoch
>> Sent: Tuesday, July 4, 2017 7:53 AM
>> To: Jim Hester <[hidden email]>
>> Cc: [hidden email]; Lipatz Jean-Luc <[hidden email]>
>> Subject: Re: [Rd] write.csv
>>
>> On 04/07/2017 10:01 AM, Jim Hester wrote:
>>> On linux at least you can use `/dev/full` [1] to test writing to a full device.
>>>
>>>       > echo 'foo' > /dev/full
>>>       bash: echo: write error: No space left on device
>>
>> Unfortunately, I get a permission denied error if I try to write there from MacOS.  I don't know if Windows has an equivalent.
>>
>> I've taken a look at the code.  Essentially it comes down to a call to the C function vfprintf, which is supposed to return the number of bytes written, or a negative value for an error. This return value is often not checked; in particular, write.table and friends don't check it.
>>
>> I'll add code to signal an error if there's a negative value.
>>
>> I don't think it's feasible to check the number of bytes (formatted text with possible translation to a different encoding could have any number of bytes) if it's positive.  So hopefully all of our file systems will correctly signal an error, and not just report how many bytes were successfully written.
>>
>>>
>>> Although that won't be a perfect test for this case where part of the
>>> file is written successfully.
>>>
>>> An alternative suggestion for testing this is to create and mount a
>>> loop device [2] with a small file.
>>>
>>> [1]: https://en.wikipedia.org/wiki//dev/full
>>> [2]: https://stackoverflow.com/a/16044420/2055486
>>
>> Loop devices sound ideal, but seem to be Linux-only (at least with that recipe).
>>
>> Duncan
>>
>>
>>>
>>> On Tue, Jul 4, 2017 at 3:38 PM, Duncan Murdoch <[hidden email]> wrote:
>>>> On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
>>>>>
>>>>> I would really like the bug fixed. At least this one, because I
>>>>> know people in my institute who use this function.
>>>>> I understand your arguments about open source, but I also saw on
>>>>> this mailing list a proposal for a fix for this bug which got
>>>>> no answer from the people who are able to include it in the
>>>>> distribution. It looks as if there are interesting bugs and then the other ones.
>>>>
>>>>
>>>> Please post a link to that, and I'll look.  Bug reports should be
>>>> posted to the bug list.  It's unfortunate that it is currently so
>>>> difficult to do so, but if they are only posted here, they are often overlooked.
>>>>
>>>>> I don't understand the other arguments: the example was reproduced
>>>>> with a simple USB key, and you cannot assume that a disk will
>>>>> always have enough free space, especially when it has several users.
>>>>
>>>>
>>>> I am not denying that it's a bug, I'm just saying that it is a
>>>> difficult one to test automatically (so we probably won't add a
>>>> regression test once it's fixed), and it's not one that has been
>>>> reported often.  I didn't know there were any reports before yours.
>>>>
>>>> Duncan Murdoch
>>>>
>>>>
>>>>> JLL
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Duncan Murdoch [mailto:[hidden email]]
>>>>> Sent: Tuesday, July 4, 2017 14:24
>>>>> To: Lipatz Jean-Luc; [hidden email]
>>>>> Subject: Re: [Rd] write.csv
>>>>>
>>>>> On 04/07/2017 5:40 AM, Lipatz Jean-Luc wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am currently studying how to generalize the usage of R in my
>>>>>> statistical institute and I encountered a problem that I cannot
>>>>>> declare on bugzilla (cannot understand why).
>>>>>
>>>>>
>>>>> Bugzilla was badly abused by spammers last year, so you need to
>>>>> have your account created manually by one of the admins to post there.
>>>>> Write to me privately if you'd like me to create an account for you.
>>>>> (If you want it attached to a different email address, that's
>>>>> fine.)
>>>>>
>>>>>> Sorry for trying this mailing list but I am really worried about
>>>>>> the problem itself and the possible implications in using R in a
>>>>>> professional data production context.
>>>>>>
>>>>>> The issue with 'write.csv' is that it just doesn't check if there
>>>>>> is enough space on disk and doesn't report failure to write data.
>>>>>>
>>>>>> Example (R 3.4.0, Windows 32-bit, but I reproduced the problem
>>>>>> with older versions and under Mac OS X)
>>>>>>
>>>>>> > fwrite(as.list(1:1000000),"G:/Test")
>>>>>> Error in fwrite(as.list(1:1e+06), "G:/Test") :
>>>>>>   No space left on device: 'G:/Test'
>>>>>> > write.csv(1:1000000,"G:/Test")
>>>>>> >
>>>>>>
>>>>>> I have a big concern here, because it means that you could save
>>>>>> some important data at one point in time and discover long
>>>>>> afterwards that you actually lost it.
>>>>>> I suppose that the fix is relatively straightforward, but how
>>>>>> can we be sure that there is no other function with the same bad properties?
>>>>>
>>>>> R is open source.  You could work out the patch for this bug, and
>>>>> in the process see the pattern of coding that leads to it.  Then
>>>>> you'll know if other functions use the same buggy pattern.
>>>>>
>>>>>> Is the lesson that you should not use an R function, even from the
>>>>>> core, without having personally tested it against extreme conditions?
>>>>>
>>>>>
>>>>> I think the answer to that is yes.  Most people never write such
>>>>> big files that they fill their disk:  if they did, all sorts of
>>>>> things would go wrong on their systems.  So this kind of extreme
>>>>> condition isn't often tested.  It's not easy to test in a
>>>>> platform-independent way:  R would need to be able to create a
>>>>> volume with a small capacity.  That's a very system-dependent thing to do.
>>>>>
>>>>>> And wouldn't it be the work of the developers to do such
>>>>>> elementary tests?
>>>>>
>>>>>
>>>>> Again, R is open source.  You can and should contribute code (and
>>>>> therefore become one of the developers) if you are working in
>>>>> unusual conditions.
>>>>>
>>>>> R states quite clearly in the welcome message every time it starts:
>>>>> "R is free software and comes with ABSOLUTELY NO WARRANTY."  This
>>>>> is essentially the same lack of warranty that you get with
>>>>> commercial software, though it's stated a lot more clearly.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>
>>
>>
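
Until write.csv reports failed writes itself, a caller can guard against the silent truncation described in this thread by reading the file back. The following is only a sketch under stated assumptions: `write_csv_checked` is a hypothetical helper, not part of base R, and comparing row counts is a coarse check that would miss a file cut off inside its last line.

```r
# Hypothetical helper (not part of base R): write a data frame with
# write.csv, then read it back and confirm that every row reached the disk.
write_csv_checked <- function(x, path) {
  x <- as.data.frame(x)
  write.csv(x, path, row.names = FALSE)

  # A missing or unreadable file shows up as a read error (mapped to -1),
  # a truncated file as a row-count mismatch.
  written <- tryCatch(nrow(read.csv(path)), error = function(e) -1L)

  if (written != nrow(x)) {
    stop(sprintf("write to '%s' looks incomplete: expected %d rows, found %d",
                 path, nrow(x), written))
  }
  invisible(path)
}

# Usage on a disk with free space: returns the path invisibly.
tmp <- tempfile(fileext = ".csv")
write_csv_checked(data.frame(id = 1:1000), tmp)
```

Reading the whole file back doubles the I/O cost, so for very large outputs a cheaper check may be preferable.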

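The `/dev/full` suggestion quoted above can also be driven from R. A minimal sketch, assuming a Linux system where `/dev/full` exists: `signals_on_full_device` is a hypothetical name, and the function cannot tell a failure to open the device apart from a failure to write to it. No particular result is claimed here, since the outcome depends on the R version and platform.

```r
# Hypothetical probe: does a writer function signal a warning or an error
# when its target is a full device?
# Returns TRUE if a condition was signalled, FALSE if the call completed
# silently, and NA where /dev/full is not available.
signals_on_full_device <- function(
    writer = function(path) write.csv(1:100000, path)) {
  if (!file.exists("/dev/full")) return(NA)
  tryCatch({
    writer("/dev/full")
    FALSE
  },
  warning = function(w) TRUE,
  error   = function(e) TRUE)
}

signals_on_full_device()
```

As noted in the quoted message, this does not cover the case where part of the file is written successfully before the device fills up.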