A slight trap in read.table/read.csv.


Rolf Turner

I had occasion recently to read in a one-line *.csv file that
looked like:

"CandidateName","NSN","Ethnicity","dob","gender"
"Smith, Mary Jane",111222333,"E","2/25/1989","F"

That "F" (for female) in the last field got transformed to
FALSE.  Apparently read.csv (and hence read.table) infers that
a field is logical if all of its entries are F's and T's.

If I change the file to

"CandidateName","NSN","Ethnicity","dob","gender"
"Smith, Mary Jane",111222333,"E","2/25/1989","F"
"Mingdinkler, Melvin Queue",999888777,"01/04/1942","M"

then the read functions correctly interpret the last field
as being character.
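
The behaviour described above can be reproduced without a file on disk. This is only a minimal sketch, assuming a version of R in which read.csv() accepts the text argument. stringsAsFactors = FALSE is passed explicitly so the result does not depend on the default of the R version in use, and the "E" ethnicity value in the second data row is a placeholder added here, because that row as posted has only four fields.

```r
# Header plus one data row: every entry of "gender" is "F".
one_row <- paste(
  '"CandidateName","NSN","Ethnicity","dob","gender"',
  '"Smith, Mary Jane",111222333,"E","2/25/1989","F"',
  sep = "\n"
)
d1 <- read.csv(text = one_row, stringsAsFactors = FALSE)
class(d1$gender)  # "logical": the quoted "F" was converted to FALSE

# Adding a row whose gender is "M" stops the conversion.
# NOTE: the "E" ethnicity value in this row is an assumed placeholder.
two_rows <- paste(
  one_row,
  '"Mingdinkler, Melvin Queue",999888777,"E","01/04/1942","M"',
  sep = "\n"
)
d2 <- read.csv(text = two_rows, stringsAsFactors = FALSE)
class(d2$gender)  # "character"
```

The quotes in the file do not matter here: they are stripped while the fields are read, and type guessing happens afterwards on the bare strings.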

The translation of "F" into FALSE resulted in some mysterious
contretemps in further analysis, which it took me a while to
track down.

I solved the problem by putting in a colClasses argument in my
call to read.csv().  But I really think that the read functions
are being too clever by half here.  If field entries are surrounded
by quotes, shouldn't they be left as character?  Even if they are
all F's and T's?
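
The colClasses workaround mentioned above might look roughly like the following. It is a sketch rather than the call actually used: it relies on read.csv() accepting a named colClasses vector, so that only the gender column is forced to character and the other columns are still guessed.

```r
csv_text <- paste(
  '"CandidateName","NSN","Ethnicity","dob","gender"',
  '"Smith, Mary Jane",111222333,"E","2/25/1989","F"',
  sep = "\n"
)

# Force only the "gender" column to character; leave the rest to be guessed.
fixed <- read.csv(text = csv_text, colClasses = c(gender = "character"))
class(fixed$gender)  # "character"
fixed$gender         # "F"
```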

Furthermore, using T's and F's to represent TRUE's and FALSE's is
bad practice anyway.  Since FALSE and TRUE are reserved words, it
would make sense for the read functions to assume that a field is
logical if it consists entirely of these words.  But T's and F's
.... I don't think so.

I would argue that this behaviour should be changed.  I can see no
downside to such a change.

        cheers,

                Rolf Turner


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: A slight trap in read.table/read.csv.

Sharpie
Rolf Turner wrote
> I solved the problem by putting in a colClasses argument in my
> call to read.csv().  But I really think that the read functions
> are being too clever by half here.  If field entries are surrounded
> by quotes, shouldn't they be left as character?  Even if they are
> all F's and T's?

It has been my experience that fields surrounded by quotes are
interpreted as factors unless the stringsAsFactors switch has been set
to FALSE.  So it seems the default behavior of read.table is to be
clever.  Annoying as these behaviors are, changing them would probably
break existing code that expects the function to behave the way it does.
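
A small sketch of the stringsAsFactors point, using made-up data (the column names id, grp and flag are hypothetical). The default of stringsAsFactors was TRUE before R 4.0.0 and is FALSE from 4.0.0 on, so both values are passed explicitly here.

```r
csv_text <- paste(
  '"id","grp","flag"',
  '"a","x","T"',
  '"b","y","F"',
  sep = "\n"
)

as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$grp)   # "factor"
class(as_char$grp)     # "character"

# The all-T/F column becomes logical under either setting.
class(as_factor$flag)  # "logical"
class(as_char$flag)    # "logical"
```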

-Charlie
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University

Re: A slight trap in read.table/read.csv.

David Winsemius
In reply to this post by Rolf Turner

On Feb 28, 2010, at 4:55 PM, Rolf Turner wrote:

> Furthermore using F's and T's to represent TRUE's and FALSE's is
> bad practice anyway.  Since FALSE and TRUE are reserved words it
> would make sense for the read function to assume that a field is
> logical if it consists entirely of these words.  But T's and F's
> .... I don't think so.

It is documented that conversion to logical will be attempted, so it
does make sense that T/F would become TRUE and FALSE, since that is
typical behavior elsewhere. But at the very least this sentence in the
type.convert help page:
"Given a character vector, it attempts to convert it to logical,
integer, numeric or complex, and failing that converts it to factor
unless as.is = TRUE."
... ought to be clarified. It is not at all clear from it that the
conversion to logical will still be attempted even if as.is=TRUE, i.e.
that the only conversion not attempted would be to factor.
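
The point about as.is can be checked directly on type.convert(). A minimal sketch with made-up input vectors:

```r
tf <- c("F", "T", "F")   # only T/F entries
fm <- c("F", "M", "F")   # contains something other than T/F

# as.is = TRUE suppresses only the conversion to factor.
class(type.convert(tf, as.is = TRUE))   # "logical"
class(type.convert(fm, as.is = TRUE))   # "character"
class(type.convert(fm, as.is = FALSE))  # "factor"
```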


Re: A slight trap in read.table/read.csv.

Gabor Grothendieck
In reply to this post by Rolf Turner
It is strange.  Even in R itself T and F are not guaranteed to be TRUE
and FALSE.

> T <- 1:3
> T
[1] 1 2 3



Re: A slight trap in read.table/read.csv.

Jim Lemon
In reply to this post by Rolf Turner
On 03/01/2010 08:55 AM, Rolf Turner wrote:

>...
> Furthermore using F's and T's to represent TRUE's and FALSE's is
> bad practice anyway.  Since FALSE and TRUE are reserved words it
> would make sense for the read function to assume that a field is
> logical if it consists entirely of these words.  But T's and F's
> .... I don't think so.
>
> I would argue that this behaviour should be changed.  I can see no
> downside to such a change.
>
Hi Rolf,
I think that the answer is buried in the history of Truth and Falsity,
in that T and F once were valid ways to abbreviate these fundamental but
widely disputed concepts. The number of messages on the help list that
contain this usage indicates that an awful lot of people would be
chucking a tanty if automatic conversion were dropped.

Jim


Re: A slight trap in read.table/read.csv.

Don MacQueen
In reply to this post by Gabor Grothendieck
There is, however, an important distinction.

Quoting from ?TRUE  (or ?logical):

'TRUE' and 'FALSE' are reserved words denoting logical constants
      in the R language, whereas 'T' and 'F' are global variables whose
      initial values set to these.  All four are 'logical(1)' vectors.

>  TRUE <- 3
Error in TRUE <- 3 : invalid (do_set) left-hand side to assignment

In other words, the rule is
   T is TRUE unless otherwise defined by the user
(ditto for F)

So this rule apparently applies to input from a file. Using
colClasses is then an example of "otherwise defined by the user."

I think it's logical (pun not particularly intended) and consistent
(though perhaps not ideal, but that's another question...)
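
The distinction between the reserved words and the variables T and F can be sketched as follows. The function name uses_T is made up for illustration; the assignment to TRUE is wrapped in tryCatch() so the script keeps running after the error.

```r
# T and F are ordinary variables, so a local assignment can shadow them.
uses_T <- function() {
  T <- 0      # hypothetical user variable, e.g. a temperature
  isTRUE(T)   # T no longer means TRUE inside this function
}
uses_T()      # FALSE

# TRUE is a reserved word; trying to assign to it is an error.
assign_result <- tryCatch(
  eval(parse(text = "TRUE <- 3")),
  error = function(e) "error"
)
assign_result  # "error"
```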

-Don




--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
[hidden email]


Re: A slight trap in read.table/read.csv.

Peter Ehlers
In reply to this post by Rolf Turner
On 2010-02-28 14:55, Rolf Turner wrote:

>
> Furthermore using F's and T's to represent TRUE's and FALSE's is
> bad practice anyway.  Since FALSE and TRUE are reserved words it
> would make sense for the read function to assume that a field is
> logical if it consists entirely of these words.  But T's and F's
> .... I don't think so.
>
> I would argue that this behaviour should be changed.  I can see no
> downside to such a change.
>

I agree with Rolf. Indeed, I'm not fond of the use of T/F for
TRUE/FALSE at all.


--
Peter Ehlers
University of Calgary


Re: A slight trap in read.table/read.csv.

Mike Prager
In reply to this post by Rolf Turner
Rolf Turner <[hidden email]> wrote:

>
> I solved the problem by putting in a colClasses argument in my
> call to read.csv().  But I really think that the read functions
> are being too clever by half here.  If field entries are surrounded
> by quotes, shouldn't they be left as character?  Even if they are
> all F's and T's?
>
> Furthermore using F's and T's to represent TRUE's and FALSE's is
> bad practice anyway.  Since FALSE and TRUE are reserved words it
> would make sense for the read function to assume that a field is
> logical if it consists entirely of these words.  But T's and F's
> .... I don't think so.
>
> I would argue that this behaviour should be changed.  I can see no
> downside to such a change.
>

I agree with you, Rolf, that this is horrid behavior. It is such
automatic devices that have made people hate (e.g.) Microsoft
Word with a passion.

Yet, in R this is a designed-in bug (i.e., a feature) that
probably can't be changed without breaking some legacy code.
But at least T and F could be removed soon as synonyms for
TRUE and FALSE. We have seen that "_" was removed as an
assignment operator, and the world did not crumble. The use of T
and F is no less error-prone, and possibly more.

The only immediate solution to this accretion of overly clever
behavior would be for someone to write new functions (say,
Read.csv) that didn't do all those conversions behind the
scenes. I'm not about to do that. Are you?
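
For what it is worth, a wrapper of the kind suggested above need not be long. The following is only a sketch under one assumption about what "no conversions" should mean: every column is read as character and any conversion is left to the caller. The name read_csv_chr is made up for illustration and is not an existing function.

```r
# Hypothetical wrapper: read every column as character, convert nothing.
read_csv_chr <- function(...) {
  read.csv(..., colClasses = "character")
}

csv_text <- paste(
  '"CandidateName","NSN","Ethnicity","dob","gender"',
  '"Smith, Mary Jane",111222333,"E","2/25/1989","F"',
  sep = "\n"
)
raw <- read_csv_chr(text = csv_text)
sapply(raw, class)  # every column is "character"
raw$gender          # "F"
```

The price of this design is that numeric columns such as NSN also arrive as character and have to be converted explicitly afterwards.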

Best of luck!

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.


Re: A slight trap in read.table/read.csv.

Rolf Turner

On 9/03/2010, at 11:17 AM, Mike Prager wrote:

> Yet, in R this is a designed-in bug (i.e., a feature) that
> probably can't be changed without breaking some legacy code.
> But at least T and F could be removed soon as synonyms for
> TRUE and FALSE. We have seen that "_" was removed as an
> assignment operator, and the world did not crumble. The use of T
> and F is no less error-prone, and possibly more.

        I would definitely support the removal of the use of T
        and F for TRUE and FALSE.  Some code would break, but
        it would be easy to trace the source of the problem and
        easy to fix.
>
> The only immediate solution to this accretion of overly clever
> behavior would be for someone to write new functions (say,
> Read.csv) that didn't do all those conversions behind the
> scenes. I'm not about to do that. Are you?


        NFL!!!

                cheers,

                        Rolf


Re: A slight trap in read.table/read.csv.

Peter Ehlers
Ditching T/F for TRUE/FALSE would get my vote, too.

  -Peter Ehlers


--
Peter Ehlers
University of Calgary


Re: A slight trap in read.table/read.csv.

PIKAL Petr
In reply to this post by Rolf Turner
Hi

[hidden email] wrote on 09.03.2010 01:44:38:

>
>    I would definitely support the removal of the use of T
>    and F for TRUE and FALSE.  Some code would break, but
>    it would be easy to trace the source of the problem and
>    easy to fix.

I would respectfully oppose it. Removing T/F may be fine for writing
functions and other programming work, but for everyone who uses R more or
less interactively the change could be quite a burden, especially since so
many functions take TRUE/FALSE values for their parameters.

In those (and many other) instances I almost exclusively use the T/F
shortcut.

read.delim(file, header = TRUE, sep = "\t", quote = "\"", dec = ".",
           fill = TRUE, comment.char = "", ...)
lm(formula, data, subset, weights, na.action, method = "qr", model = TRUE,
   x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL,
   offset, ...)

If this had to be changed, I would vote for a change that gives users
some other shortcut for setting function parameters interactively.

Regards
Petr
 



Re: A slight trap in read.table/read.csv.

Patrick Connolly-4
On Tue, 09-Mar-2010 at 08:14AM +0100, Petr PIKAL wrote:


|> I would respectfully oppose it. It may be quite convenient for making code
|> for functions and other programming stuff but all using R more or less in
|> interactive way this change could be quite a burden especially when there
|> are many functions which use TRUE/FALSE for setting its parameters.
|>
|> In those (and many others) instances I almost exclusively use T/F
|> shortcut.
|>
|> read.delim(file, header = TRUE, sep = "\t", quote="\"", dec=".", fill =
|> TRUE, comment.char="", ...)
|> lm(formula, data, subset, weights, na.action,method = "qr", model = TRUE,
|> x = FALSE, y = FALSE, qr = TRUE,     singular.ok = TRUE, contrasts = NULL,
|> offset, ...)

If you use ESS, you have the benefit of completions.  Depending on
what else could begin with T or F, you can press the TAB key after
typing the first letter or two.  Admittedly, three keystrokes isn't
much shorter than TRUE -- but they are all with the left hand. You
always get at least a 40% discount with FALSE. :-) -- except in the
'unlikely event' that you have objects named FALLOW or something
else a lot like FALSE.







--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.  
   ___    Patrick Connolly  
 {~._.~}                   Great minds discuss ideas    
 _( Y )_           Average minds discuss events
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)                        ..... Eleanor Roosevelt
         
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


Re: A slight trap in read.table/read.csv.

Barry Rowlingson
On Tue, Mar 9, 2010 at 8:11 AM, Patrick Connolly
<[hidden email]> wrote:

> If you use ESS, you have the benefit of completions.  Depending on
> what else could begin with T or F, you can press the TAB key after
> typing the first letter or two.  Admittedly, three keystrokes isn't
> much shorter than TRUE -- but they are all with the left hand. You
> always get at least a 40% discount with FALSE. :-) -- except in the
> 'unlikely event' that you have objects named FALLOW or something
> else a lot like FALSE.

 "FALSETTO"? Anyone analysing choral music in R?

 I'd somehow got the impression that T and F were going to be removed
as values, but there's no mention of it in ?FALSE. The package check
utilities warn you if you use them.

 Just looking at the source for the underlying cause of the read.csv
behaviour, which is type.convert in R which is do_typecvt in C, and it
admits:

/* This is a horrible hack

There's a couple of instances of:

 if (strcmp(s, "F") == 0 || strcmp(s, "FALSE") == 0)

which are what do it. And are obviously not language-dependent - do
French people have 'VRAI' and 'FAUX' in their CSV files? Do they call
them DSV files:
http://translate.google.com/#en|fr|comma-separated%20file
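
The effect of those hard-coded English strings can be seen from R itself. A small sketch with made-up input vectors; only the spellings tested here are claimed, and as.is = TRUE is passed so that unconverted input stays character:

```r
english <- c("TRUE", "FALSE")
short   <- c("T", "F")
french  <- c("VRAI", "FAUX")

class(type.convert(english, as.is = TRUE))  # "logical"
class(type.convert(short,   as.is = TRUE))  # "logical"
class(type.convert(french,  as.is = TRUE))  # "character": not recognised
```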

Got to keep the Academie Francaise happy...

Barry

--
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman


Re: A slight trap in read.table/read.csv.

Ivan Calandra
To answer your questions, French people don't have 'Vrai' and 'Faux'.
Since most people have no idea what "csv" means, there's no point in
using "dsv"! The "Académie Française" would indeed not be happy.

But to stay on the subject, I have personally never had trouble using T or
F. I sometimes use T/F, sometimes TRUE/FALSE, depending on my mood!
It looks like I should think about using TRUE/FALSE all the time, but since
I've never had problems, I don't feel bad about it!

Ivan


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
[hidden email]

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
