Col names in a data frame

Col names in a data frame

Bernard McGarvey
Here is an example piece of code to illustrate an issue:

rm(list=ls()) # Clear Workspace
#
Data1 <- matrix(data=rnorm(9,0,1),nrow=3,ncol=3)
Colnames1 <- c("(A)","(B)","(C)")
colnames(Data1) <- Colnames1
print(Data1)
DataFrame1 <- data.frame(Data1)
print(DataFrame1)
colnames(DataFrame1) <- Colnames1
print(DataFrame1)

The results I get are:

            (A)        (B)        (C)
[1,]  0.4739417  1.3138868  0.4262165
[2,] -2.1288083  1.0333770  1.1543404
[3,] -0.3401786 -0.7023236 -0.2336880
        X.A.       X.B.       X.C.
1  0.4739417  1.3138868  0.4262165
2 -2.1288083  1.0333770  1.1543404
3 -0.3401786 -0.7023236 -0.2336880
         (A)        (B)        (C)
1  0.4739417  1.3138868  0.4262165
2 -2.1288083  1.0333770  1.1543404
3 -0.3401786 -0.7023236 -0.2336880

So when I convert the matrix with these headings to a data frame, the parentheses are replaced by periods (and an X is prepended), but if I assign the names after creating the data frame, the column headings are correct.

Any ideas on why this occurs?

Thanks


Bernard McGarvey
Director, Fort Myers Beach Lions Foundation, Inc.
Retired (Lilly Engineering Fellow).

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Col names in a data frame

Jeff Newmiller
rm(list=ls()) is a bad practice... especially when posting examples. It doesn't clean out everything and it removes objects created by the user.

Read ?data.frame, particularly regarding the check.names parameter. The intent is to make it easier to use DF$A notation, though DF$`(A)` is usable if you are somewhat experienced with R.
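
As a rough illustration of the two access styles mentioned above, here is a minimal sketch. The object names m, df_checked and df_raw are made up for this example and are not from the original post.

```r
# Hypothetical objects for illustration only.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("(A)", "(B)", "(C)")))

# Default (check.names = TRUE): names are made syntactically valid.
df_checked <- data.frame(m)
names(df_checked)   # "X.A." "X.B." "X.C."
df_checked$X.A.     # plain DF$name notation works

# check.names = FALSE keeps the names as given; `$` then needs backticks.
df_raw <- data.frame(m, check.names = FALSE)
names(df_raw)       # "(A)" "(B)" "(C)"
df_raw$`(A)`
df_raw[["(A)"]]     # equivalent, without backticks
```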

--
Sent from my phone. Please excuse my brevity.

Re: Col names in a data frame

Sarah Goslee
In reply to this post by Bernard McGarvey
Hi,

data.frame() checks names by default to ensure that column names are
legal, but there's an argument to change that.

From ?data.frame()


check.names: logical.  If ‘TRUE’ then the names of the variables in the
          data frame are checked to ensure that they are syntactically
          valid variable names and are not duplicated.  If necessary
          they are adjusted (by ‘make.names’) so that they are.
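
To see what that adjustment does to the names from the original post, make.names() can be called directly. This is only a small sketch; the variable names nms, fixed and deduped are assumptions for the example.

```r
# The column names from the original post.
nms <- c("(A)", "(B)", "(C)")

# Invalid characters become "." and an "X" is prepended where the
# name would otherwise not start with a valid character.
fixed <- make.names(nms)
fixed     # "X.A." "X.B." "X.C."

# With unique = TRUE (which is what check.names asks for),
# duplicated names are also disambiguated.
deduped <- make.names(c("a", "a"), unique = TRUE)
deduped   # "a" "a.1"
```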

Sarah

--
Sarah Goslee (she/her)
http://www.numberwright.com

Re: Col names in a data frame

Duncan Murdoch-2
In reply to this post by Bernard McGarvey
On 21/01/2021 3:58 p.m., Bernard McGarvey wrote:

> Any ideas on why this occurs?

By default, data.frame() uses names that are legal variable names, so
you can do things like DataFrame1$X.A. You can stop this change by saying

DataFrame1 <- data.frame(Data1, check.names=FALSE)

Duncan Murdoch

Re: Col names in a data frame

R help mailing list-2
In reply to this post by Bernard McGarvey
It looks to me as if the names are run through make.names in the
data frame case, while that doesn't happen for matrices. Peeking
into the `colnames<-` code supports this idea, but that in turn
uses `names<-`, which is a primitive and so defies further easy
peeking.

The data.frame function provides the check.names parameter to
switch this on / off, but for other classes this checking doesn't
seem to be provided.

Perhaps the idea behind this discrepancy is to enable the use of
the $ operator to access columns of the data frame, while that's
not possible for matrices anyway. (Personally, I don't find the
results of make.names that useful, though, and I tend to sanitise
them myself when working with data frames with unwieldy column
names).
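
A sketch of what sanitising the names by hand might look like. The helper clean_names() is hypothetical and not something posted in this thread; it simply drops the offending characters instead of turning them into periods.

```r
# Hypothetical helper: keep only letters, digits, "." and "_".
clean_names <- function(x) {
  x <- gsub("[^A-Za-z0-9._]", "", x)
  make.names(x, unique = TRUE)  # safety net for empty or duplicated results
}

df <- data.frame(matrix(1:6, nrow = 2))

# Assigning names after construction does not run make.names().
names(df) <- c("(A)", "(B)", "(C)")
raw <- names(df)      # "(A)" "(B)" "(C)"

names(df) <- clean_names(names(df))
cleaned <- names(df)  # "A" "B" "C"
```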

Best regards, Jan


Re: Col names in a data frame

Bernard McGarvey
In reply to this post by Duncan Murdoch-2
Thanks - I had seen that parameter but did not think the ( would be illegal but now I understand why it considers it illegal.

Thanks again

Bernard
Sent from my iPhone so please excuse the spelling!

> On Jan 21, 2021, at 4:14 PM, Duncan Murdoch <[hidden email]> wrote:
>
> By default, data.frame() uses names that are legal variable names, so you can do things like DataFrame1$X.A. You can stop this change by saying
>
> DataFrame1 <- data.frame(Data1, check.names=FALSE)
>
> Duncan Murdoch

Why is rm(list=ls()) bad practice?

J C Nash
In reply to this post by Jeff Newmiller
In a separate thread Jeff Newmiller wrote:
> rm(list=ls()) is a bad practice... especially when posting examples. It doesn't clean out everything and it removes objects created by the user.

This query is to ask

1) Why is it bad practice to clear the workspace when presenting an example?
I'm assuming here that people who will try R-help examples will not run them in the
middle of something else, which I agree would be unfortunate. However, one of the
not very nice aspects of R is that it is VERY easy to have stuff hanging around (including
overloaded functions and operators) that get you into trouble, and indeed make it harder
to reproduce those important "minimal reproducible examples".  This includes the .RData
contents. (For information, I can understand the attraction, but I seem to have been
burned much more often than I've benefited from a pre-warmed oven.)

2) Is there a good command that really does leave a blank workspace? For testing
purposes, it would be useful to have an assured blank canvas.

This post is definitely not to start an argument, but to try to find ways to reduce
the possibilities for unanticipated outcomes in examples.

Cheers, JN

Re: Why is rm(list=ls()) bad practice?

Bert Gunter-2
Do you mean:
rm(list = ls(all = TRUE))
?
... or something else?
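
For what it is worth, the difference between the two forms can be shown in a scratch environment, so that nothing in the reader's own workspace is touched. This is only a sketch; the environment e and the object names are made up. Note that all = TRUE partially matches the all.names argument of ls(), and that neither form detaches packages or resets options.

```r
# Scratch environment standing in for the workspace.
e <- new.env()
assign("x", 1, envir = e)
assign(".hidden", 2, envir = e)

# Plain ls() skips names beginning with ".", so they survive.
rm(list = ls(e), envir = e)
left_after_plain <- ls(e, all.names = TRUE)   # ".hidden"

# all.names = TRUE picks up the hidden objects as well.
rm(list = ls(e, all.names = TRUE), envir = e)
left_after_all <- ls(e, all.names = TRUE)     # character(0)
```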

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: Why is rm(list=ls()) bad practice?

Duncan Murdoch-2
In reply to this post by J C Nash
On 21/01/2021 5:20 p.m., J C Nash wrote:
> In a separate thread Jeff Newmiller wrote:
>> rm(list=ls()) is a bad practice... especially when posting examples. It doesn't clean out everything and it removes objects created by the user.
>
> This query is to ask
>
> 1) Why is it bad practice to clear the workspace when presenting an example?
> I'm assuming here that people who will try R-help examples will not run them in the
> middle of something else, which I agree would be unfortunate.

I think that's exactly the concern.  I doubt it would have happened in
this instance, but in other cases, people might copy and paste a
complete example before reading it.  It's safer to say:  "Run this code
in a clean workspace:", rather than cleaning it out yourself.

Duncan Murdoch


> However, one of the
> not very nice aspects of R is that it is VERY easy to have stuff hanging around (including
> overloaded functions and operators) that get you into trouble, and indeed make it harder
> to reproduce those important "minimal reproducible examples".  This includes the .RData
> contents. (For information, I can understand the attraction, but I seem to have been
> burned much more often than I've benefited from a pre-warmed oven.)
>
> 2) Is there a good command that really does leave a blank workspace? For testing
> purposes, it would be useful to have an assured blank canvas.

Yes, start R with

   R --vanilla

Duncan Murdoch

Re: Why is rm(list=ls()) bad practice?

J C Nash
Thanks Duncan for a clear argument about the "why".

The suggestion of R --vanilla started a train of thought that one could do something like

 clearws <- function(){ # Try to clear workspace
   tmp <- readline("Are you sure you want to clear the workspace? ")
   print(tmp)
   if ( substr(toupper(tmp),1,1) != "Y" ){
       return(0)
   }
   #  rm(tmp)
   tgt<-parent.env(environment())
   print(ls(tgt))
   rm(list = ls(tgt, all.names = TRUE), envir = tgt) # clear all objects, including hidden ones
   gc() # free up memory and report the memory usage
   # What should we return?
   # Can we offer interactive control?
   # How about packages?
 }

and call clearws() at the start of an example.

Contact me off-list if you have suggestions for improving this. I'd like to be able to preface
examples with such a function that would render the session "vanilla" but from inside. That means
removing packages that might have altered / replaced / masked standard functions. That might not
be possible, but the idea is attractive to me as someone who mostly uses R in tests of tools that
get used by others.

Best, JN


On 2021-01-21 6:05 p.m., Duncan Murdoch wrote:

> I think that's exactly the concern.  I doubt it would have happened in this instance, but in other cases, people might
> copy and paste a complete example before reading it.  It's safer to say:  "Run this code in a clean workspace:", rather
> than cleaning it out yourself.
>
> Yes, start R with
>
>   R --vanilla

Re: Why is rm(list=ls()) bad practice?

Duncan Murdoch-2
I think it's always difficult and sometimes impossible to take an
existing session and convert it to the vanilla state, but it's very easy
to run a new instance of R from an existing one.

So instead of a clearws() function, I'd suggest a "runInVanilla"
function, that takes some code as input, starts up a vanilla session and
collects the output.

This is quite similar to what reprex::reprex does, maybe not different
at all.
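
A very rough sketch of what such a function might look like. Only the name runInVanilla comes from the suggestion above; the implementation, the code argument and the use of a temporary script file are assumptions, and it presumes the Rscript executable sits in R.home("bin").

```r
# Hypothetical sketch, not an existing function.
runInVanilla <- function(code) {
  # Write the code (a character vector of lines) to a temporary script.
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script))
  writeLines(code, script)

  # Start a fresh R process; --vanilla skips saved workspaces and profiles.
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, c("--vanilla", shQuote(script)),
          stdout = TRUE, stderr = TRUE)
}

# The child session sees only what the script itself creates.
out <- runInVanilla(c("x <- 1 + 1", "print(x)", "print(ls())"))
out
```

Because the code runs in a separate process, nothing in the calling session is removed or changed; only the captured output comes back.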

Duncan Murdoch

On 22/01/2021 10:37 a.m., J C Nash wrote:

> Contact me off-list if you have suggestions for improving this. I'd like to be able to preface
> examples with such a function that would render the session "vanilla" but from inside. That means
> removing packages that might have altered / replaced / masked standard functions. That might not
> be possible, but the idea is attractive to me as someone who mostly uses R in tests of tools that
> get used by others.