R as a programming language

Alexy Khrabrov
Greetings -- coming from Python/Ruby perspective, I'm wondering about  
certain features of R as a programming language.

Say I have a huge table t of the form

run     ord     unit    words   new
1       1       6939    1013    641
1       2       275     1001    518
1       3       3314    1008    488
1       4       14154   1018    463
1       5       2982    1006    421

Alternatively, it may have a part column in front.  For each run (in  
a part if present), I select ord and new columns as x and y and plot  
their functions in various ways.  t is huge.  So I want to select the  
subset to plot, as follows:

t.xy <- function(t, part=NA, run=NA) {
        if (is.na(run)) {
                # TODO does this entail a full copy -- or how do we do references in R?
                r <- t
        } else if (is.na(part)) {
                r <- t[t$run == run,]
        } else { # part present too
                r <- t[t$part == part & t$run == run,]
        }
        x <- r$ord
        y <- r$new
        xy.coords(x,y)
}

What I'm wondering about is whether r <-t will copy the complete t,  
and how do I minimize copying in R.  I heard it's a functional  
language -- is there lazy evaluation in place here?

Additionally, tried to use --args command line arguments, and found a  
way only due to David Brahm -- who helped with several important R  
points (thanks Dave!):

#!/bin/sh
# graph a fertility run
tail --lines=+4 "$0" | R --vanilla --slave --args $*; exit
args <- commandArgs()[-(1:4)]
...

And there is still no option processing along the lines of GNU long
options, or Python's or Ruby's optparse.
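
A minimal sketch of one way to handle this without the shell wrapper, assuming a version of R in which commandArgs() accepts trailingOnly = TRUE. The helper parse_args() and the option names below are hypothetical, invented for illustration; this is not an optparse equivalent, just a hand-rolled split of "--name=value" options from positional arguments.

```r
# Arguments that follow --args (or the script name under Rscript).
args <- commandArgs(trailingOnly = TRUE)

# Hypothetical helper: separate "--name=value" options from positional arguments.
parse_args <- function(args) {
  is_opt <- grepl("^--[^=]+=", args)
  opts <- args[is_opt]
  keys <- sub("^--([^=]+)=.*$", "\\1", opts)   # text between "--" and "="
  vals <- sub("^--[^=]+=", "", opts)           # text after the first "="
  list(options = as.list(setNames(vals, keys)),
       positional = args[!is_opt])
}

# Example with a made-up argument vector instead of the real command line.
parsed <- parse_args(c("--run=3", "--part=2", "fertility.rda"))
parsed$options$run     # "3" (still a string; convert with as.numeric() if needed)
parsed$positional      # "fertility.rda"
```

All values arrive as character strings, so numeric options need an explicit conversion.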

What are the semantics of parameter passing -- by value or by reference?

Is there anything less ugly than

print(paste("x=",x,"y=",y))

-- for routine printing?  Can [1] be eliminated from such simple  
printing?  What about formatted printing?

Is there a way to assign all of

a <- args[1]
b <- args[2]
c <- args[3]

in one fell swoop, à la Python's

a,b,c = args

What's the simplest way to check whether a filename ends in ".rda"?

Will ask more as I go programming...

(Will someone here please write an O'Reilly's "Programming in R"?  :)

Cheers,
Alexy
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: R as a programming language

Duncan Murdoch
On 11/7/2007 7:46 AM, Alexy Khrabrov wrote:
> Greetings -- coming from Python/Ruby perspective, I'm wondering about  
> certain features of R as a programming language.

Lots of questions; I'll intersperse some answers.

>
> Say I have a huge table t of the form
>
> run     ord     unit    words   new
> 1       1       6939    1013    641
> 1       2       275     1001    518
> 1       3       3314    1008    488
> 1       4       14154   1018    463
> 1       5       2982    1006    421
>
> Alternatively, it may have a part column in front.  For each run (in  
> a part if present), I select ord and new columns as x and y and plot  
> their functions in various ways.  t is huge.  So I want to select the  
> subset to plot, as follows:
>
> t.xy <- function(t,part=NA,run=NA) {
> if (is.na(run)) {
> # TODO does this entail a full copy -- or how do we do references  
> in R?
> r <- t

Semantically it acts as a full copy, though there is some internal
optimization that means the copy won't be made until necessary, i.e. one
of r or t changes.

There are some kinds of objects in R that are handled as references:
environments, external pointers, names, NULL. (I may have missed some.)
There are various kludges to expand this list to other kinds of objects,
the most common way being to wrap an object in an environment.  But
there is a fond wish that people use R as a functional language and
avoid doing this.
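
The difference between the two behaviours described above can be sketched as follows. The small data frame t1 and the environment names are made up for illustration; the point is only the contrast between value semantics for ordinary objects and reference semantics for environments.

```r
# Value semantics: after r <- t1 the two names behave as independent copies.
t1 <- data.frame(run = c(1, 1, 2), ord = c(1, 2, 1), new = c(641, 518, 488))
r <- t1
r$new[1] <- 0          # modifying r is what forces R to copy
t1$new[1]              # the original still holds 641

# Reference semantics: an environment is shared, not copied, on assignment.
e1 <- new.env()
assign("count", 1, envir = e1)
e2 <- e1
assign("count", 2, envir = e2)
get("count", envir = e1)   # the change made through e2 is visible through e1
```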

> } else if (is.na(part)) {
> r <- t[t$run == run,]
> } else { # part present too
> r <- t[t$part == part & t$run == run,]
> }
> x <- r$ord
> y <- r$new
> xy.coords(x,y)
> }
>
> What I'm wondering about is whether r <-t will copy the complete t,  
> and how do I minimize copying in R.  I heard it's a functional  
> language -- is there lazy evaluation in place here?

There is lazy evaluation of function arguments, but assignments trigger
evaluation of their RHS.
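
A small sketch of that distinction, using made-up functions f and g: an argument that is never used is never evaluated, while force() (or any use of the argument) triggers evaluation.

```r
# y is never used inside f, so the stop() call is never evaluated.
f <- function(x, y) x * 2
res <- f(21, stop("y was evaluated"))

# force() evaluates the argument, so the same call now fails.
g <- function(x, y) {
  force(y)
  x * 2
}
g_failed <- inherits(try(g(21, stop("y was evaluated")), silent = TRUE),
                     "try-error")
```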

>
> Additionally, tried to use --args command line arguments, and found a  
> way only due to David Brahm -- who helped with several important R  
> points (thanks Dave!):
>
> #!/bin/sh
> # graph a fertility run
> tail --lines=+4 "$0" | R --vanilla --slave --args $*; exit
> args <- commandArgs()[-(1:4)]
> ...
>
> And, still no option processing as in GNU long options, or python or  
> ruby's optparse.
>
> What's the semantics of parameter passing -- by value or by reference?

By value.

> Is there anything less ugly than
>
> print(paste("x=",x,"y=",y))
>
> -- for routine printing?  Can [1] be eliminated from such simple  
> printing?  What about formatted printing?

You can use cat() instead of print(), and avoid the numbering and
quoting.  Remember to explicitly specify a "\n" newline at the end.
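
For comparison, a short sketch of the two approaches with made-up values for x and y. By default cat() separates its arguments with a space, and sep = "" turns that off.

```r
x <- 1
y <- 2

print(paste("x=", x, "y=", y))           # output carries the [1] prefix and quotes
cat("x=", x, "y=", y, "\n")              # plain text, pieces separated by spaces
cat("x=", x, " y=", y, "\n", sep = "")   # full control over spacing
```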

At first I thought you were complaining about the syntax, which I find
ugly.  There was a proposal last year to overload + to do concatenation
of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
substantial resistance, on the grounds that + should be commutative.

> Is there a way to assign all of
>
> a <- args[1]
> b <- args[2]
> c <- args[3]
>
> in one fell swoop, a lá Python's
>
> a,b,c = args

No, but you can do

abc <- args[1:3]
names(abc) <- c('a', 'b', 'c')

and refer to the components as abc$a, etc.

> What's the simplest way to check whether a filename ends in ".rda"?

Probably something like

if (regexpr("\\.rda$", filename) > 0) ...

You double the escape char to get it entered into the RE, and then the
regexpr function uses it to escape the dot in the RE.
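
A brief sketch of why the escape matters, with made-up filenames. grepl() returns a logical directly, which may read more naturally in an if() than regexpr(...) > 0. endsWith() is an alternative that avoids regular expressions, assuming a version of R recent enough to provide it.

```r
filename <- "results.rda"

is_rda <- regexpr("\\.rda$", filename) > 0     # the test suggested above
same   <- grepl("\\.rda$", filename)           # same test, logical result

# Unescaped, the dot matches any character, so "xrda" passes by mistake.
unescaped <- grepl(".rda$", "xrda")
escaped   <- grepl("\\.rda$", "xrda")

# Vectorised over several names; only the first really ends in ".rda".
hits <- grepl("\\.rda$", c("a.rda", "a.rda.bak", "a.RDA"))

# No regular expression needed (assumes endsWith() is available).
plain <- endsWith(filename, ".rda")
```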

Duncan Murdoch

> Will ask more as I go programming...
>
> (Will someone here please write an O'Reilly's "Programming in R"?  :)
>
> Cheers,
> Alexy

Re: R as a programming language

Duncan Murdoch
On 11/7/2007 8:13 AM, Duncan Murdoch wrote:

> On 11/7/2007 7:46 AM, Alexy Khrabrov wrote:
>> Greetings -- coming from Python/Ruby perspective, I'm wondering about  
>> certain features of R as a programming language.
>
> Lots of question, I'll intersperse some answers.
>>
>> Say I have a huge table t of the form
>>
>> run     ord     unit    words   new
>> 1       1       6939    1013    641
>> 1       2       275     1001    518
>> 1       3       3314    1008    488
>> 1       4       14154   1018    463
>> 1       5       2982    1006    421
>>
>> Alternatively, it may have a part column in front.  For each run (in  
>> a part if present), I select ord and new columns as x and y and plot  
>> their functions in various ways.  t is huge.  So I want to select the  
>> subset to plot, as follows:
>>
>> t.xy <- function(t,part=NA,run=NA) {
>> if (is.na(run)) {
>> # TODO does this entail a full copy -- or how do we do references  
>> in R?
>> r <- t
>
> Semantically it acts as a full copy, though there is some internal
> optimization that means the copy won't be made until necessary, i.e. one
> of r or t changes.
>
> There are some kinds of objects in R that are handled as references:
> environments, external pointers, names, NULL. (I may have missed some.)
> There are various kludges to expand this list to other kinds of objects,
> the most common way being to wrap an object in an environment.  But
> there is a fond wish that people use R as a functional language and
> avoid doing this.
>
>> } else if (is.na(part)) {
>> r <- t[t$run == run,]
>> } else { # part present too
>> r <- t[t$part == part & t$run == run,]
>> }
>> x <- r$ord
>> y <- r$new
>> xy.coords(x,y)
>> }
>>
>> What I'm wondering about is whether r <-t will copy the complete t,  
>> and how do I minimize copying in R.  I heard it's a functional  
>> language -- is there lazy evaluation in place here?
>
> There is lazy evaluation of function arguments, but assignments trigger
> evaluation of their RHS.
>
>>
>> Additionally, tried to use --args command line arguments, and found a  
>> way only due to David Brahm -- who helped with several important R  
>> points (thanks Dave!):
>>
>> #!/bin/sh
>> # graph a fertility run
>> tail --lines=+4 "$0" | R --vanilla --slave --args $*; exit
>> args <- commandArgs()[-(1:4)]
>> ...
>>
>> And, still no option processing as in GNU long options, or python or  
>> ruby's optparse.
>>
>> What's the semantics of parameter passing -- by value or by reference?
>
> By value.
>
>> Is there anything less ugly than
>>
>> print(paste("x=",x,"y=",y))
>>
>> -- for routine printing?  Can [1] be eliminated from such simple  
>> printing?  What about formatted printing?
>
> You can use cat() instead of print(), and avoid the numbering and
> quoting.  Remember to explicitly specify a "\n" newline at the end.
>
> At first I thought you were complaining about the syntax, which I find
> ugly.  There was a proposal last year to overload + to do concatenation
> of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
> substantial resistance, on the grounds that + should be commutative.
>
>> Is there a way to assign all of
>>
>> a <- args[1]
>> b <- args[2]
>> c <- args[3]
>>
>> in one fell swoop, a lá Python's
>>
>> a,b,c = args
>
> No, but you can do
>
> abc <- args[1:3]
> names(abc) <- c('a', 'b', 'c')

Oops, this code assumed that args was a list already, and I think yours
was a character vector.  In that case you'd need

abc <- as.list(args[1:3])

on the first line.
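
Put together, the corrected version might look like the sketch below; the contents of args are made up. The last lines show a further option, assuming list2env() is available in your version of R: it creates the variables a, b and c directly in the current environment, which is closer to the Python idiom.

```r
args <- c("input.rda", "output.pdf", "3")   # stand-in for the real arguments

abc <- as.list(args[1:3])
names(abc) <- c("a", "b", "c")
abc$a                                        # "input.rda"

# On a plain character vector, [[ ]] works with names but $ does not.
v <- setNames(args[1:3], c("a", "b", "c"))
v[["c"]]                                     # "3"
dollar_fails <- inherits(try(v$a, silent = TRUE), "try-error")

# Create a, b and c as separate variables in the current environment.
list2env(abc, envir = environment())
```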

>
> and refer to the components as abc$a, etc.
>
>> What's the simplest way to check whether a filename ends in ".rda"?
>
> Probably something like
>
> if (regexpr("\\.rda$", filename) > 0) ...
>
> You double the escape char to get it entered into the RE, and then the
> regexpr function uses it to escape the dot in the RE.
>
> Duncan Murdoch
>
>> Will ask more as I go programming...
>>
>> (Will someone here please write an O'Reilly's "Programming in R"?  :)
>>
>> Cheers,
>> Alexy

Re: R as a programming language

Alexy Khrabrov
In reply to this post by Duncan Murdoch
On Nov 7, 2007, at 4:13 PM, Duncan Murdoch wrote:

>> And, still no option processing as in GNU long options, or python  
>> or  ruby's optparse.
>> What's the semantics of parameter passing -- by value or by  
>> reference?
>
> By value.

Thanks Duncan!  So if I have a huge table t, and the idea was to  
write a function t.xy(t, ...) to select slices of it, will the  
copying done by parameter passing forfeit all the aesthetic savings  
from refactoring?  What I'm dreading is having to explicitly select  
x and y from t,

if (<t has some shape>) {
        plot(t$this, t$that, ...)
} else if (<t has that shape>) {
        plot(t$smth_else, ...)
}

-- that way I do refer to parts of t and there's no copying except to  
plot (?), yet if indeed passing parameters by value copies them, one  
would have to refrain from writing functions!  Is that the state of  
things?

Cheers,
Alexy

Re: R as a programming language

Duncan Murdoch
On 11/7/2007 8:17 AM, Alexy Khrabrov wrote:

> On Nov 7, 2007, at 4:13 PM, Duncan Murdoch wrote:
>
>>> And, still no option processing as in GNU long options, or python  
>>> or  ruby's optparse.
>>> What's the semantics of parameter passing -- by value or by  
>>> reference?
>>
>> By value.
>
> Thanks Duncan!  So if I have a huge table t, and the idea was to  
> write a function t.xy(t, ...) to select slices of it, will parameter  
> passing copying waste forfeit all aesthetic savings from  
> refactoring?  What I'm dreading is having to explicitly select x and  
> y from t,
>
> if (<t has some shape>) {
> plot(t$this, t$that, ...)
> } else if (<t has that shape>) {
> plot(t$smth_else, ...)
> }
>
> -- that way I do refer to parts of t and there's no copying except to  
> plot (?), yet if indeed passing parameters by value copies them, one  
> would have to refrain from writing functions!  Is that the state of  
> things?

As long as your function doesn't modify t, no actual copy will be made.
My previous message explained it like this:

> Semantically it acts as a full copy, though there is some internal
> optimization that means the copy won't be made until necessary, i.e. one
> of r or t changes.

This applies to argument passing as well as assignments.  Argument
passing is very much like an assignment to a new variable in the local
frame.  The only difference is the lazy evaluation:  the assignment
won't take place until you use the value of that local variable, and if
you never use it, it won't take place at all.

Selecting slices will create new copies of the slices, but you won't get
a new copy of the full table.
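
As a rough illustration, here is a hypothetical variant of the t.xy function from the original post (the name xy_subset and the tiny table are made up). Passing the table in does not copy it; only the selected rows are copied, and the caller's table is left as it was.

```r
xy_subset <- function(tbl, run = NA) {
  if (!is.na(run)) {
    # Rebinds the local name to a new, smaller object; the caller's table
    # is untouched and only the selected rows are copied.
    tbl <- tbl[tbl$run == run, ]
  }
  xy.coords(tbl$ord, tbl$new)
}

big <- data.frame(run = c(1, 1, 2, 2),
                  ord = c(1, 2, 1, 2),
                  new = c(641, 518, 488, 463))
xy <- xy_subset(big, run = 2)
nrow(big)   # still 4 rows after the call
```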

Duncan Murdoch

Re: R as a programming language

Gabor Grothendieck
In reply to this post by Alexy Khrabrov
Most of these have been answered but here are a few
additional options.

On Nov 7, 2007 7:46 AM, Alexy Khrabrov <[hidden email]> wrote:
>
> Is there anything less ugly than
>
> print(paste("x=",x,"y=",y))
>

> library(gsubfn)
> a <- 1; b <- 2
> fn$cat("a = $a b = $b\n")
a = 1 b = 2

See the gsubfn home page at http://gsubfn.googlecode.com

> -- for routine printing?  Can [1] be eliminated from such simple
> printing?  What about formatted printing?
>

?format, ?formatC, ?prettyNum, ?sprintf
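
A few of these in use, as a quick sketch with made-up values; see the help pages for the full set of options.

```r
x <- 3.14159
n <- 42L

s1 <- sprintf("x = %.2f, n = %d", x, n)       # C-style format string
s2 <- formatC(x, digits = 3, format = "f")     # fixed notation, 3 decimals
s3 <- prettyNum(1234567, big.mark = ",")       # thousands separators

cat(s1, "\n")
cat(s2, s3, "\n")
```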

> Is there a way to assign all of
>
> a <- args[1]
> b <- args[2]
> c <- args[3]
>
> in one fell swoop, a lá Python's
>
> a,b,c = args
>

See:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/36820.html

Re: R as a programming language

Bert Gunter
In reply to this post by Alexy Khrabrov
>> (Will someone here please write an O'Reilly's "Programming in R"?  :)

Someone already has ... see Venables and Ripley's S PROGRAMMING.

**However** R is more than a general purpose programming language: it is a
programming language specifically designed for data analysis -- including
statistical graphics -- and statistics. So, IMHO anyway, it's really
impossible to discuss it without reference to the data structures and
procedures underlying such tasks. Because it is targeted to do those sorts
of things well, it may handle poorly some things that general purpose
languages do well (minimizing storage with the use of references, for
example).

My own experience is that one appreciates the power and beauty of the
language and the wisdom of the designers the more one uses it in real
applications. But I am not a computer scientist and have only a limited
exposure to standard CS concepts and algorithms, to say nothing of "real"
programming experience. So just my $.02.

Best regards,

Bert Gunter
Genentech Nonclinical Statistics

Re: R as a programming language

Alexy Khrabrov
With all due respect to the great book -- of which I own 2 copies I  
bought new -- it's not an "O'Reilly Programming in <X>" book.  The  
idea of a programming book like that is to thoroughly treat the  
language from a programmer's standpoint, in a fairly standard way,  
as has been done for Ruby or Python.

As I'm learning more of statistics with R, I prefer to do it with the  
book by Crawley.  It looks like most R books are written by  
statisticians who became programmers, not the other way around.  Over  
the years I've followed R on and off, and I forget its programming  
spirit in between; there's no "Programming ..." book to help.  
Statistics is hard to forget once you master it; syntax sugar melts  
away...

"Programming with Data" is the closest to an O'Reilly, but more  
advanced and esoteric than that.

Since R became a bona fide Open Source language with CRAN and all, an  
O'Reilly book by a [Python and Ruby] programmer-turned-statistician is  
long overdue!  If it systematically compared R with Ruby and Python,  
its closest Open Source cousins, it would help even more.  RPy and  
RRb are there to help, too.  Just my $0.01...

Cheers,
Alexy

On Nov 7, 2007, at 7:46 PM, Bert Gunter wrote:

>>> (Will someone here please write an O'Reilly's "Programming in  
>>> R"?  :)
>
> Someone already has ... see Venable and Ripley's S PROGRAMMING.
>
> **However** R is more than a general purpose programming language:  
> it is a
> programming language specifically designed for data analysis --  
> including
> statistical graphics -- and statistics. So, IMHO anyway, it's really
> impossible to discuss it without reference to the data structures and
> procedures underlying such tasks. Because it is targeted to do  
> those sorts
> of things well, it may handle poorly some things that general purpose
> languages do well (minimizing storage with the use of references, for
> example).
>
> My own experience is that one appreciates the power and beauty of the
> language and the wisdom of the designers the more one uses it in real
> applications. But I am not a computer scientist and have only a  
> limited
> exposure to standard CS concepts and algorithms, to say nothing of  
> "real"
> programming experience. So just my $.02.
>
> Best regards,
>
> Bert Gunter
> Genentech Nonclinical Statistics
>

Re: R as a programming language

Simon Blomberg-4
Although Crawley is an ecologist, not a programmer or statistician. But
he is an FRS. Maybe that counts for something. ;-)

Simon.

On Thu, 2007-11-08 at 01:56 +0300, Alexy Khrabrov wrote:

> With all due respect to the great book -- of which I own 2 copies I  
> bought new -- it's not an "O'Reilly Programming in <X>" book.  The  
> idea of a programming book like that is to thoroughly treat the  
> language from a programmer's standpoint, in a fairly standard way,  
> such as Ruby or Python.
>
> As I'm learning more of statistics with R, I prefer to do it with the  
> book by Crawley.  Looks like most of R books are written by  
> statisticians who became programmers, not the other way.  Through all  
> those years I periodically follow R, I forget its programming spirit  
> in between, and there's no "Programming ..." book to help.  
> Statistics is hard to forget once you master it; syntax sugar melts  
> away...
>
> "Programming with Data" is the closest to an O'Reilly, but more  
> advanced and esoteric than that.
>
> Since R became a bona fide Open Source language with CRAN and all, an  
> O'Reilly book by a [Python and Ruby] programmer-turn-statistician is  
> long overdue!  If it systematically compares R with Ruby and Python,  
> its closest Open Source cousins, it would help even more.  RPy and  
> RRb are there to help, too.  Just my $0.01...
>
> Cheers,
> Alexy
>
> On Nov 7, 2007, at 7:46 PM, Bert Gunter wrote:
>
> >>> (Will someone here please write an O'Reilly's "Programming in  
> >>> R"?  :)
> >
> > Someone already has ... see Venable and Ripley's S PROGRAMMING.
> >
> > **However** R is more than a general purpose programming language:  
> > it is a
> > programming language specifically designed for data analysis --  
> > including
> > statistical graphics -- and statistics. So, IMHO anyway, it's really
> > impossible to discuss it without reference to the data structures and
> > procedures underlying such tasks. Because it is targeted to do  
> > those sorts
> > of things well, it may handle poorly some things that general purpose
> > languages do well (minimizing storage with the use of references, for
> > example).
> >
> > My own experience is that one appreciates the power and beauty of the
> > language and the wisdom of the designers the more one uses it in real
> > applications. But I am not a computer scientist and have only a  
> > limited
> > exposure to standard CS concepts and algorithms, to say nothing of  
> > "real"
> > programming experience. So just my $.02.
> >
> > Best regards,
> >
> > Bert Gunter
> > Genentech Nonclinical Statistics
> >
>
--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.

Re: R as a programming language

Dale Steele
In reply to this post by Alexy Khrabrov
I'm anxiously awaiting my copy of the soon-to-be-published "A First
Course in Statistical Programming with R" by W. John Braun and
Duncan J. Murdoch (both University of Western Ontario), paperback,
ISBN-13: 9780521694247.

http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521694247

On 11/7/07, Alexy Khrabrov <[hidden email]> wrote:
> With all due respect to the great book -- of which I own 2 copies I

Re: R as a programming language

Thomas Lumley
In reply to this post by Duncan Murdoch
On Wed, 7 Nov 2007, Duncan Murdoch wrote:

>
> At first I thought you were complaining about the syntax, which I find
> ugly.  There was a proposal last year to overload + to do concatenation
> of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
> substantial resistance, on the grounds that + should be commutative.
>

My objection, at least, was that + should be *associative*.  I don't think
anyone would expect a + b and b+a to be the same for strings, but I do
think the fact that (a+b)+c and a+(b+c) would be different (if some of a,
b,c were strings) has real potential for ugliness.
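
To make the concern concrete, here is a sketch using a made-up operator %+% that adds numbers and concatenates as soon as either operand is a string. It is only a stand-in for the proposal, not the actual proposed implementation, but it shows how grouping changes the result once types are mixed.

```r
# Hypothetical operator: numeric addition, otherwise string concatenation.
`%+%` <- function(a, b) {
  if (is.numeric(a) && is.numeric(b)) a + b else paste0(a, b)
}

left  <- (1 %+% 2) %+% "x"   # 1 + 2 is added first, then "x" is appended
right <- 1 %+% (2 %+% "x")   # 2 is pasted onto "x" first, then 1 is prepended

left    # "3x"
right   # "12x"
```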

  -thomas

Thomas Lumley Assoc. Professor, Biostatistics
[hidden email] University of Washington, Seattle

Re: R as a programming language

hadley wickham
> My objection, at least, was that + should be *associative*.  I don't think
> anyone would expect a + b and b+a to be the same for strings, but I do
> think the fact that (a+b)+c and a+(b+c) would be different (if some of a,
> b,c were strings) has real potential for ugliness.

You're assuming an automatic cast from numbers into strings?  What if
a + "4" threw an error?

Hadley

--
http://had.co.nz/

Re: R as a programming language

Duncan Murdoch
In reply to this post by Thomas Lumley
On 11/8/2007 11:51 AM, Thomas Lumley wrote:

> On Wed, 7 Nov 2007, Duncan Murdoch wrote:
>
>>
>> At first I thought you were complaining about the syntax, which I find
>> ugly.  There was a proposal last year to overload + to do concatenation
>> of strings, so you'd type cat("x=" + x + "y=" + y + "\n"), but there was
>> substantial resistance, on the grounds that + should be commutative.
>>
>
> My objection, at least, was that + should be *associative*.  I don't think
> anyone would expect a + b and b+a to be the same for strings, but I do
> think the fact that (a+b)+c and a+(b+c) would be different (if some of a,
> b,c were strings) has real potential for ugliness.

Sorry, I forgot about that. I think there were complaints about both
commutativity and associativity.

I do think lack of associativity is a less impressive complaint, because
it doesn't even hold for floating point addition without mixing types:

 > x <- .Machine$double.eps/2
 > A <- (1 + x) + x
 > B <- 1 + (x + x)
 > A == B
[1] FALSE

As far as I can see, string concatenation would only lose associativity
when some of the operands were automatically converted to strings.
Mixed type operations often give surprising results.

Duncan Murdoch

Re: R as a programming language

Duncan Murdoch
In reply to this post by hadley wickham
On 11/8/2007 12:57 PM, hadley wickham wrote:
>> My objection, at least, was that + should be *associative*.  I don't think
>> anyone would expect a + b and b+a to be the same for strings, but I do
>> think the fact that (a+b)+c and a+(b+c) would be different (if some of a,
>> b,c were strings) has real potential for ugliness.
>
> You're assuming an automatic cast from numbers into strings?  What if
> a + "4" threw an error?

Disallowing mixed types was one variation on the proposal, but I'd say
that misses the benefit of making computed strings easier to read.  If
they're full of "as.character(x)" explicit conversions, they're no
easier to read than paste() (which doesn't need explicit conversions).

Duncan Murdoch

Re: R as a programming language

barry rowlingson
In reply to this post by hadley wickham
hadley wickham wrote:

> You're assuming an automatic cast from numbers into strings?  What if
> a + "4" threw an error?

  What's wrong with commas anyway when using cat():

  > cat("x is ",x,' and y is ',y,'\n',sep='')
  x is 1 and y is 2

  and there's always sprintf() for those moments when you want neat
formatting.

  Is it just me who thinks it's odd that in a language that is umpteen
years old we are still discussing the fundamentals of what essentially
makes up the 'hello world' example?

Barry

Re: R as a programming language

Duncan Murdoch
On 11/8/2007 1:26 PM, Barry Rowlingson wrote:
> hadley wickham wrote:
>
>> You're assuming an automatic cast from numbers into strings?  What if
>> a + "4" threw an error?
>
>   What's wrong with commas anyway when using cat():
>
>   > cat("x is ",x,' and y is ',y,'\n',sep='')
>   x is 1 and y is 2

Nothing wrong when using cat(), but we sometimes need to compute strings
when we aren't using cat().

>   and there's always sprintf() for those moments when you want neat
> formatting.

That's good when you want good control over the formatting, but it
doesn't tend to be all that readable, with the variables all listed at
the end, instead of in between the bits of string.
>
>   Is it just me who thinks it's odd that in a language that is umpteen
> years old we are still discussing the fundamentals of what essentially
> makes up the 'hello world' example?

Maybe.

Duncan

Re: R as a programming language

Gabor Grothendieck
In reply to this post by barry rowlingson
On Nov 8, 2007 1:26 PM, Barry Rowlingson <[hidden email]> wrote:

> hadley wickham wrote:
>
> > You're assuming an automatic cast from numbers into strings?  What if
> > a + "4" threw an error?
>
>  What's wrong with commas anyway when using cat():
>
>  > cat("x is ",x,' and y is ',y,'\n',sep='')
>  x is 1 and y is 2
>
>  and there's always sprintf() for those moments when you want neat
> formatting.
>
>  Is it just me who thinks it's odd that in a language that is umpteen
> years old we are still discussing the fundamentals of what essentially
> makes up the 'hello world' example?

The gsubfn package lets you do quasi-perl style
interpolation on the arguments of a function by
prefacing the function with fn$ like this:

library(gsubfn)
fn$cat("pi = $pi, e = `exp(1)`\n")


Re: R as a programming language

Ted.Harding-2
On 08-Nov-07 18:39:57, Gabor Grothendieck wrote:

> On Nov 8, 2007 1:26 PM, Barry Rowlingson <[hidden email]>
> wrote:
>> hadley wickham wrote:
>>
>> > You're assuming an automatic cast from numbers into strings?  What
>> > if
>> > a + "4" threw an error?
>>
>>  What's wrong with commas anyway when using cat():
>>
>>  > cat("x is ",x,' and y is ',y,'\n',sep='')
>>  x is 1 and y is 2
>>
>>  and there's always sprintf() for those moments when you want neat
>> formatting.
>>
>>  Is it just me who thinks it's odd that in a language that is umpteen
>> years old we are still discussing the fundamentals of what essentially
>> makes up the 'hello world' example?
>
> The gsubfn package lets you do quasi-perl style
> interpolation on the arguments of a function by
> prefacing the function with fn$ like this:
>
> library(gsubfn)
> fn$cat("pi = $pi, e = `exp(1)`\n")

Not sure if it's been mentioned already in this thread, but
could there be a case for using unix-shell-like "backquotes",
as in

"pi = `pi`, e = `exp(1)`"

??

(though this might break something else, and would probably
need to be wrapped in a special 'interpreter' function anyway;
I'm out of my depth here!).
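
For what it's worth, such a wrapper can be sketched in a few lines of base R. The function name interp() below is hypothetical, not an existing R or gsubfn function, and the sketch assumes that backquotes never need to appear literally in the string. It uses eval(parse()), so it should only be fed trusted strings.

```r
# Hypothetical helper: replace each `expr` in a string with the value of expr.
interp <- function(s, env = parent.frame()) {
  force(env)                           # fix the caller's environment up front
  m <- gregexpr("`[^`]*`", s)          # locate every backquoted span
  spans <- regmatches(s, m)[[1]]
  if (length(spans) == 0) return(s)    # nothing to interpolate
  values <- vapply(spans, function(span) {
    code <- substr(span, 2, nchar(span) - 1)   # drop the two backquotes
    # Evaluate the code in the caller's environment and format the result
    paste(format(eval(parse(text = code), envir = env)), collapse = " ")
  }, character(1))
  regmatches(s, m) <- list(unname(values))
  s
}

cat(interp("pi = `pi`, e = `exp(1)`"), "\n")
```

Because interp() returns the string instead of printing it, the result can be passed to cat(), used as a plot title, or stored.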

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Nov-07                                       Time: 19:24:09
------------------------------ XFMail ------------------------------


Re: R as a programming language

ALBERTO VIEIRA FERREIRA MONTEIRO
In reply to this post by Duncan Murdoch

Duncan Murdoch wrote:
>
>> and there's always sprintf() for those moments when you
>> want neat formatting.
>
> That's good when you want good control over the formatting, but it
> doesn't tend to be all that readable, with the variables all listed
> at the end, instead of in between the bits of string.
>
As the old saying goes, you can eat the cake and have it:

x <- rnorm(1)
cat("x is close to ", sprintf("%.1lf", x), " and closer to ",
  sprintf("%.10lf", x), "\n", sep = "")

:-)

I am using R as a generic programming language for doing
jobs in Windows that I can't do using DOS batch - things
like taking a text in Latin-1 and removing the accented
characters, or looping through a directory and renaming
files with weird names, or creating a .wpl file with the mp3s.
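
As a rough illustration of the renaming job, here is a minimal sketch using only base R. The function name clean_names() and the rule for what counts as a "weird" character are assumptions made for the example, not anything from my actual scripts.

```r
# Hypothetical helper: in every file name under 'dir', replace each run of
# characters outside [A-Za-z0-9._-] with a single underscore.
# Caveats: list.files() also returns subdirectories, and two names that
# clean to the same result would collide; neither case is handled here.
clean_names <- function(dir) {
  old <- list.files(dir)
  new <- gsub("[^A-Za-z0-9._-]+", "_", old)
  changed <- old != new
  ok <- file.rename(file.path(dir, old[changed]),
                    file.path(dir, new[changed]))
  # Return a record of what was renamed, without printing it
  invisible(data.frame(old = old[changed], new = new[changed], renamed = ok))
}
```

Calling clean_names() on a directory renames the files in place, so it is worth trying it on a scratch copy first.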

Alberto Monteiro


Re: R as a programming language

Duncan Murdoch
On 11/8/2007 2:27 PM, Alberto Monteiro wrote:

> Duncan Murdoch wrote:
>>
>>> and there's always sprintf() for those moments when you
>>> want neat formatting.
>>
>> That's good when you want good control over the formatting, but it
>> doesn't tend to be all that readable, with the variables all listed
>> at the end, instead of in between the bits of string.
>>
> As the old saying goes, you can eat the cake and have it:
>
> x <- rnorm(1)
> cat("x is close to ", sprintf("%.1lf", x), " and closer to ",
>   sprintf("%.10lf", x), "\n", sep = "")
>
> :-)

Yes, but that doesn't address my first objection to cat(), which you cut
out:

>> Nothing wrong when using cat(), but we sometimes need to compute strings
>> when we aren't using cat().


>
> I am using R as a generic programming language for doing
> jobs in Windows that I can't do using DOS batch - things
> like taking a text in Latin-1 and removing the accented
> characters, or looping through a directory and renaming
> files with weird names, or creating a .wpl file with the mp3s.

I didn't claim that this would allow R to do something it can't do now,
only that it wouldn't have to be so ugly when it did it.

Duncan Murdoch
