A few suggestions and perspectives from a PhD student


Antonin Klima
Dear Sir or Madam,

I am in the second year of my PhD in bioinformatics, after taking my Master’s in computer science, and I have been using R heavily during my PhD. As such, I have put together a list of features in R that, in my opinion, would be beneficial to add or could be improved. The first two are already implemented in packages, but because they are implemented as user-defined operators, their usefulness is greatly restricted. I hope you will find my suggestions interesting. If you find time, I would welcome any feedback on whether you find the suggestions useful, or on why you think they should not be implemented. I would also welcome being pointed to any features I might be unaware of that solve the issues I have pointed out below.

1) piping
Currently available in the package magrittr, piping makes code more readable by letting a line start at its natural starting point and follow with the functions that are applied, in order. The readability of several nested calls, each with a number of parameters, is almost zero; it is nearly easier to work the solution out oneself. A pipeline, in comparison, is very straightforward, especially together with point (2).

The magrittr package works rather well nevertheless; the shortcomings of piping not being native are not as severe as in point (2). Still, an intuitive symbol such as | would be helpful, and it sometimes bothers me that I have to parenthesize anonymous functions, which would probably not be required with a native pipe operator, much as it is not required in, e.g., lapply. That is,
1:5 %>% function(x) x+2
should be totally fine
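To illustrate the point: a bare-bones pipe that simply applies its right-hand side as a function (a sketch, nothing like magrittr's actual implementation, which also does dot substitution) already accepts an unparenthesized anonymous function, since the parenthesization requirement comes from magrittr's non-standard evaluation rather than from the R parser:

```r
## A minimal sketch of a pipe operator: apply the right-hand side,
## taken as an ordinary function value, to the left-hand side.
`%>%` <- function(lhs, rhs) rhs(lhs)

add2 <- function(x) x + 2
1:5 %>% add2
## [1] 3 4 5 6 7

## No parentheses needed around the anonymous function here:
1:5 %>% function(x) x + 2
## [1] 3 4 5 6 7
```
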

2) currying
Currently available in the package Curry. The idea is that, given a function such as foo = function(x, y) x + y, one would like to write, for example, lapply(1:5, foo(3)), and have the interpreter figure out that foo(3) does not yield a value result, but can still yield a function result - a function of y. This would be most useful for the various apply functions, rather than writing function(x) foo(3, x).

I suggest that currying would make the code easier to write and more readable, especially when using apply functions. One might imagine some confusion arising from such a feature, especially among people unfamiliar with functional programming, although R already treats functions as first-class values, so it could be just fine. One could also address it with special syntax, such as $foo(3) [$foo(x=3)], for partial application. The current Curry package has very limited usefulness because, being limited by the user-defined operator framework, it can only rarely contribute to less code or more readability. Compare for yourself:
$foo(x=3) vs foo %<% 3
and, given goo = function(a, b, c),
$goo(b=3) vs goo %><% list(b=3)

Moreover, one would often like currying to have the highest precedence. For example, when piping:
data %>% foo %>% foo1 %<% 3
if one wants to do data %>% foo %>% $foo1(x=3)
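Until syntax like $foo(x=3) exists, partial application can at least be sketched in a few lines of base R; 'partial' below is an invented helper name, not an existing function:

```r
## Partial application: fix some arguments now, supply the rest at
## call time. 'partial' is a hypothetical helper, not part of base R.
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

foo <- function(x, y) x + y

## Instead of lapply(1:5, function(y) foo(3, y)):
unlist(lapply(1:5, partial(foo, x = 3)))
## [1] 4 5 6 7 8
```
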

3) Code executable only when running the script itself
Whereas the first two suggestions borrow from Haskell and the like, this suggestion borrows from Python. I’m building a rather complicated pipeline using S4 classes. After defining a class and its methods, I also define how to build the class to my liking, based on my input data, using the just-defined methods. So I end up with a list of command-line arguments to process, and the way to create the class instance based on them. If I write this into the class file, however, the code also runs whenever the file is sourced from the next step in the pipeline, which needs the preceding class definitions.

A feature such as Python’s “if __name__ == \"__main__\"” would thus be useful. As it is, I had to create the run scripts as separate files. That is actually not so terrible, given that a class and its methods often span a few hundred lines, but still.
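One workaround, sketched below, relies on sys.nframe(): at the top level of a script run with Rscript it is 0, whereas code being source()d from another file runs inside the source() call and so sees a positive frame count. This is an observation about source()'s frame behaviour, not a documented __main__-style guarantee:

```r
## Rough R analogue of Python's __main__ guard: the block runs only
## when the file is executed directly (e.g. via Rscript), not when it
## is source()d from another script.
if (sys.nframe() == 0L) {
  args <- commandArgs(trailingOnly = TRUE)
  ## ... process command-line arguments and build the class instance
}
```
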

4) non-exported global variables
I also find it a shortcoming that I seem unable to create constants that do not get passed on to files that source the class definition. That is, if class1 features a global constant CONSTANT=3, and class2 sources class1, class2 will also include the constant. This 1) clutters the namespace when running the code interactively, and 2) potentially overwrites constants in case of a name clash. Some kind of export/non-export variable syntax, symbolic import, or namespace would be useful. I know that if I converted the code to a package I would get at least something like a namespace, but still.

I understand that the variable cannot simply not be imported, in general, as the functions will generally rely on it (otherwise it would not have to be there). But one could consider hiding it in an implicit per-file namespace, for example.
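A partial workaround along those lines exists in base R already: wrap the constant and the functions that need it in local(), so that source()ing the file exposes the functions (which keep the constant in their closure environment) but not the constant itself. The helper name is made up for illustration:

```r
## The constant lives only in the closure's environment, not in the
## global environment of whoever source()s this file.
scale_reps <- local({
  CONSTANT <- 3
  function(x) x * CONSTANT
})

scale_reps(2)
## [1] 6
exists("CONSTANT")
## [1] FALSE
```
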

5) S4 methods with same name, for different classes
Say I have an S4 class called datasetSingle, and another S4 class called datasetMulti, which gathers up a number of datasetSingle objects and adds some extra functionality on top. The datasetSingle class may have a method replicates that returns a named vector assigning a replicate number to each experiment name in the dataset. But I would also like a function of the same name for the datasetMulti class, returning a data frame, or list, covering the replicate numbers for all the included datasets.

But then I need to call setGeneric for the method. If the setGeneric call appears before both implementations, in their respective files, the second call resets the generic, losing the “replicates” definition for datasetSingle. Skipping the call in the code for datasetMulti means that 1) I have to remember that I had the function defined for datasetSingle, and 2) if I remove the function or change its name in datasetSingle, I now have to change the datasetMulti class file too. Moreover, if I would like a different generic for the datasetMulti version, I have to change it not in the datasetMulti class file but in the datasetSingle file, where it might not make much sense. In this case, I wanted an extra argument “datasets”, so that the method would return the replicates only for the datasets specified, rather than for all of them.

I made a wrapper that circumvents the first issue, but the second issue is not easy to work around.
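For the first issue, one possible guard is to create the generic only if it does not yet exist, and to give the generic a '...' argument so that individual methods can add their own parameters (such as 'datasets') without touching the other file. A sketch with invented class and slot names:

```r
library(methods)

## Create the generic only if no file has created it before us, so
## sourcing the class files in either order keeps all methods intact.
if (!isGeneric("replicates")) {
  setGeneric("replicates", function(object, ...) standardGeneric("replicates"))
}

setClass("datasetSingle", slots = c(reps = "numeric"))
setMethod("replicates", "datasetSingle", function(object, ...) object@reps)

## Because the generic carries '...', a datasetMulti method could later
## declare its own extra argument in its own file, e.g.:
## setMethod("replicates", "datasetMulti",
##           function(object, ..., datasets = NULL) { ... })
```
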

6) Many parameters freeze S4 method calls
If I specify roughly six or more parameters in an S4 method’s signature, I often get a “freeze” on the first method call. The process eats up a lot of memory before entering the call, after which it executes the call as normal (if it has not run out of memory, or I out of patience). Subsequent calls of the method do not incur this overhead. The memory taken can run into gigabytes, and the time into minutes. I suspect this might be due to generating an entry in the method dispatch table for each accepted signature combination. It can be worked around, but it certainly is not behaviour one would expect.

7) Default values for S4 methods
It would seem that it is not possible to set default parameter values for an S4 method in the usual way, definition = function(x, y=5). I resorted to making class unions with “missing” in the call signatures, with the method body starting with if (missing(param)) param <- DEFAULT_VALUE, but that certainly does not improve readability or ease of coding.
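One workaround that seems to avoid the class-union trick is to put the default on the generic's formals and dispatch only on the first argument, so the defaulted argument behaves like an ordinary one (a sketch with invented names; I may be missing caveats):

```r
library(methods)

setClass("dataset", slots = c(n = "numeric"))

## The default goes on the generic's formals; dispatching only on 'x'
## leaves 'y' as an ordinary defaulted argument.
setGeneric("scaleBy", function(x, y = 5) standardGeneric("scaleBy"),
           signature = "x")
setMethod("scaleBy", "dataset", function(x, y = 5) x@n * y)

d <- new("dataset", n = 2)
scaleBy(d)         ## default y = 5 is used
## [1] 10
scaleBy(d, y = 3)
## [1] 6
```
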


Thank you for your time if you have read this far. :) Looking forward to any answers.

Yours Sincerely,
Antonin Klima

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: A few suggestions and perspectives from a PhD student

Ista Zahn
On Fri, May 5, 2017 at 1:00 PM, Antonin Klima <[hidden email]> wrote:
> Dear Sir or Madam,
>
> I am in the second year of my PhD in bioinformatics, after taking my Master’s in computer science, and I have been using R heavily during my PhD. As such, I have put together a list of features in R that, in my opinion, would be beneficial to add or could be improved. The first two are already implemented in packages, but because they are implemented as user-defined operators, their usefulness is greatly restricted.

Why do you think being implemented in a contributed package restricts
the usefulness of a feature?

> I hope you will find my suggestions interesting. If you find time, I
> would welcome any feedback on whether you find the suggestions
> useful, or on why you think they should not be implemented. I would
> also welcome being pointed to any features I might be unaware
> of that solve the issues I have pointed out below.
>
> 1) piping
> Currently available in the package magrittr, piping makes code more readable by letting a line start at its natural starting point and follow with the functions that are applied, in order. The readability of several nested calls, each with a number of parameters, is almost zero; it is nearly easier to work the solution out oneself. A pipeline, in comparison, is very straightforward, especially together with point (2).

You may be surprised to learn that not everyone thinks pipes are a
good idea. Personally, I see some advantages, but there is also a big
downside, which is that they mess up the call stack and make tracking
down errors via traceback() more difficult.

There is a simple alternative to pipes already built into R that
gives you some of the advantages of %>% without messing up the call
stack.  Using Hadley's famous "little bunny foo foo" example:

foo_foo <- little_bunny()

## nesting (it is rough)
bop(
  scoop(
    hop(foo_foo, through = forest),
    up = field_mice
  ),
  on = head
)

## magrittr
foo_foo %>%
  hop(through = forest) %>%
  scoop(up = field_mice) %>%
  bop(on = head)

## regular R assignment
foo_foo -> .
  hop(., through = forest) -> .
  scoop(., up = field_mice) -> .
  bop(., on = head)

This is more limited than magrittr's %>%, but it gives you a lot of
the advantages without the disadvantages.

>
> The magrittr package works rather well nevertheless; the shortcomings of piping not being native are not as severe as in point (2). Still, an intuitive symbol such as | would be helpful, and it sometimes bothers me that I have to parenthesize anonymous functions, which would probably not be required with a native pipe operator, much as it is not required in, e.g., lapply. That is,
> 1:5 %>% function(x) x+2
> should be totally fine

That seems pretty small-potatoes to me.

>
> 2) currying
> Currently available in the package Curry. The idea is that, given a function such as foo = function(x, y) x + y, one would like to write, for example, lapply(1:5, foo(3)), and have the interpreter figure out that foo(3) does not yield a value result, but can still yield a function result - a function of y. This would be most useful for the various apply functions, rather than writing function(x) foo(3, x).

You can already do

lapply(1:5, foo, y = 3)

(assuming that foo has an argument named "y"; each element of 1:5 is
then passed as foo's first argument)

I'm stopping here since I don't have anything useful to say about your
subsequent points.

Best,
Ista

Re: A few suggestions and perspectives from a PhD student

Gabor Grothendieck
In reply to this post by Antonin Klima
Regarding the anonymous-function-in-a-pipeline point: one can already
do the following, which does use braces, but even so involves fewer
characters than the example shown.  Here { . * 2 } is basically a
lambda whose argument is dot. Would this be sufficient?

  library(magrittr)

  1.5 %>% { . * 2 }
  ## [1] 3

Regarding currying note that with magrittr Ista's code could be written as:

  1:5 %>% lapply(foo, y = 3)

or at the expense of slightly more verbosity:

  1:5 %>% Map(f = . %>% foo(y = 3))




--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: A few suggestions and perspectives from a PhD student

Antonin Klima
Thanks for the answers,

I’m aware of the ‘.’ option, just wanted to give a very simple example.

But the use of lapply's ‘…’ parameter had eluded me - thanks for enlightening me.

What do you mean by messing up the call stack? As far as I understand it, piping should translate into the same code as deep nesting, so I see only a tiny downside for debugging here, and no loss of time or space efficiency. Your alternative, by contrast, carries a chance of inadvertent error, since a variable is being reused and no one now checks for me whether it is passed correctly between the lines, and it requires specifying the variable every single time. For me, that solution is clearly inferior.

Too bad you didn’t find my other comments interesting though.

>Why do you think being implemented in a contributed package restricts
>the usefulness of a feature?

I guess it depends on your philosophy. It may not restrict it per se, although it would make a lot of sense to me to reuse the bash-style ‘|’ and have a shorter, more readable version. One gains an extra dependence on a package for an item that fits the language so well that it should be part of it. It is without doubt my most used operator, at least. Going through some of my folders, I found 101 uses in 750 lines, and 132 uses in 3303 lines. I would compare it to a computer game that is really good with a fan-created mod, but lacking otherwise. :)

So to me, it makes sense that if there is no doubt that a feature improves the language, and especially if people already use it extensively through a package, it should become part of the “standard”. The question is whether it is indeed very popular, and whether you share my view. But that is now up to you; I just wanted to point it out.

Best Regards,
Antonin

> On 05 May 2017, at 22:33, Gabor Grothendieck <[hidden email]> wrote:
>
> Regarding the anonymous-function-in-a-pipeline point one can already
> do this which does use brackets but even so it involves fewer
> characters than the example shown.  Here { . * 2 } is basically a
> lambda whose argument is dot. Would this be sufficient?
>
>  library(magrittr)
>
>  1.5 %>% { . * 2 }
>  ## [1] 3
>
> Regarding currying note that with magrittr Ista's code could be written as:
>
>  1:5 %>% lapply(foo, y = 3)
>
> or at the expense of slightly more verbosity:
>
>  1:5 %>% Map(f = . %>% foo(y = 3))
>
>
> On Fri, May 5, 2017 at 1:00 PM, Antonin Klima <[hidden email]> wrote:
>> Dear Sir or Madam,
>>
>> I am in 2nd year of my PhD in bioinformatics, after taking my Master’s in computer science, and have been using R heavily during my PhD. As such, I have put together a list of certain features in R that, in my opinion, would be beneficial to add, or could be improved. The first two are already implemented in packages, but given that it is implemented as user-defined operators, it greatly restricts its usefulness. I hope you will find my suggestions interesting. If you find time, I will welcome any feedback as to whether you find the suggestions useful, or why you do not think they should be implemented. I will also welcome if you enlighten me with any features I might be unaware of, that might solve the issues I have pointed out below.
>>
>> 1) piping
>> Currently available in package magrittr, piping makes the code better readable by having the line start at its natural starting point, and following with functions that are applied - in order. The readability of several nested calls with a number of parameters each is almost zero, it’s almost as if one would need to come up with the solution himself. Pipeline in comparison is very straightforward, especially together with the point (2).
>>
>> The package here works rather good nevertheless, the shortcomings of piping not being native are not quite as severe as in point (2). Nevertheless, an intuitive symbol such as | would be helpful, and it sometimes bothers me that I have to parenthesize anonymous function, which would probably not be required in a native pipe-operator, much like it is not required in f.ex. lapply. That is,
>> 1:5 %>% function(x) x+2
>> should be totally fine
>>
>> 2) currying
>> Currently available in package Curry. The idea is that, having a function such as foo = function(x, y) x+y, one would like to write for example lapply(foo(3), 1:5), and have the interpreter figure out ok, foo(3) does not make a value result, but it can still give a function result - a function of y. This would be indeed most useful for various apply functions, rather than writing function(x) foo(3,x).
>>
>> I suggest that currying would make the code easier to write, and more readable, especially when using apply functions. One might imagine that there could be some confusion with such a feature, especially from people unfamiliar with functional programming, although R already does take function as first-order arguments, so it could be just fine. But one could address it with special syntax, such as $foo(3) [$foo(x=3)] for partial application.  The current currying package has very limited usefulness, as, being limited by the user-defined operator framework, it only rarely can contribute to less code/more readability. Compare yourself:
>> $foo(x=3) vs foo %<% 3
>> goo = function(a,b,c)
>> $goo(b=3) vs goo %><% list(b=3)
>>
>> Moreover, one would often like currying to have highest priority. For example, when piping:
>> data %>% foo %>% foo1 %<% 3
>> if one wants to do data %>% foo %>% $foo(x=3)
>>
>> 3) Code executable only when running the script itself
>> Whereas the first two suggestions are somewhat stealing from Haskell and the like, this suggestion would be stealing from Python. I’m building quite a complicated pipeline, using S4 classes. After defining the class and its methods, I also define how to build the class to my likings, based on my input data, using various now-defined methods. So I end up having a list of command line arguments to process, and the way to create the class instance based on them. If I write it to the class file, however, I end up running the code when it is sourced from the next step in the pipeline, that needs the previous class definitions.
>>
>> A feature such as pythonic “if __name__ == __main__” would thus be useful. As it is, I had to create run scripts as separate files. Which is actually not so terrible, given the class and its methods often span a few hundred lines, but still.
>>
>> 4) non-exported global variables
>> I also find it lacking that I seem unable to create constants that do not get passed on to files that source the class definition. That is, if class1 defines a global constant CONSTANT = 3, then class2, which sources class1, will also pick up the constant. This 1) clutters the namespace when running the code interactively, and 2) potentially overwrites constants in case of a name clash. Some kind of export/non-export syntax for variables, or symbolic imports, or namespaces would be useful. I know that converting the code into a package would give me at least something like a namespace, but still.
>>
>> I understand that the variable cannot simply be left out in general, as the functions typically rely on it (otherwise it would not need to be there). But one could consider hiding it in an implicit per-file namespace, for example.
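One partial workaround available today is to source the file into its own environment, so its constants stay out of the caller's workspace. A sketch; the file `class1.R` is simulated with a temp file here:

```r
# Keep a sourced file's globals in their own environment instead of the
# caller's workspace. The file contents are a stand-in for class1.R.
class1_file <- tempfile(fileext = ".R")
writeLines(c(
  "CONSTANT <- 3",
  "add_const <- function(x) x + CONSTANT"
), class1_file)

class1 <- new.env()
sys.source(class1_file, envir = class1)

class1$add_const(1)   # 4: the function still sees CONSTANT ...
exists("CONSTANT")    # FALSE: ... but the caller's workspace stays clean
```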
>>
>> 5) S4 methods with same name, for different classes
>> Say I have an S4 class called datasetSingle, and another S4 class called datasetMulti, which gathers up a number of datasetSingle objects and adds some extra functionality on top. The datasetSingle class may have a method replicates that returns a named vector assigning a replicate number to each experiment name in the dataset. I would also like a function with the same name for the datasetMulti class that returns a data frame, or list, covering the replicate numbers for all the included datasets.
>>
>> But then I need to call setGeneric for the method, and if I call setGeneric before both implementations, the second call resets the generic, losing the "replicates" definition for datasetSingle. Skipping the call in the datasetMulti code means that 1) I have to remember that the function is already defined for datasetSingle, and 2) if I remove the function or rename it in datasetSingle, I now have to change the datasetMulti class file too. Moreover, if I want a different generic signature for the datasetMulti version, I have to change it not in the datasetMulti class file but in the datasetSingle file, where it may not make much sense. In this case, I wanted an additional argument "datasets", which would return the replicates only for the datasets specified rather than for all of them.
>>
>> I made a wrapper that could circumvent the first issue, but the second issue is not easy to circumvent.
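A common idiom that sidesteps the resetting problem is to guard each setGeneric call with isGeneric(). The class and slot details below are an illustrative sketch, not the poster's actual code:

```r
# Guard setGeneric() so a second file does not reset an existing generic.
if (!isGeneric("replicates")) {
  setGeneric("replicates", function(object, ...) standardGeneric("replicates"))
}
setClass("datasetSingle", slots = c(reps = "numeric"))
setMethod("replicates", "datasetSingle", function(object, ...) object@reps)

# The datasetMulti file repeats the same guard, then adds its own method:
if (!isGeneric("replicates")) {
  setGeneric("replicates", function(object, ...) standardGeneric("replicates"))
}
setClass("datasetMulti", slots = c(parts = "list"))
setMethod("replicates", "datasetMulti",
          function(object, ...) lapply(object@parts, replicates))
```

With the guard in both files, either file can be sourced first (or alone) without clobbering the other's method.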
>>
>> 6) Many parameters freeze S4 method calls
>> If I specify roughly six or more parameters for an S4 method, I often get a "freeze" on the first method call. The process eats up a lot of memory before entering the call, after which it executes the call as normal (if it does not run out of memory and I do not run out of patience). Subsequent calls of the method do not incur this overhead. The memory involved can run into gigabytes and the time into minutes. I suspect this is due to generating an entry in the dispatch table for each accepted signature. It can be worked around, but it certainly is not behaviour one would expect.
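One workaround along these lines is to restrict the dispatch signature so that only one argument participates in dispatch, keeping the method table small however many formals the generic has. A sketch with illustrative names:

```r
# Only `object` takes part in dispatch (signature = "object"); the other
# formals are ordinary arguments and do not multiply dispatch entries.
setClass("dataset", slots = c(v = "numeric"))

setGeneric("summarise_ds",
  function(object, a = 1, b = 2, c = 3, d = 4, e = 5, f = 6)
    standardGeneric("summarise_ds"),
  signature = "object")

setMethod("summarise_ds", "dataset",
  function(object, a = 1, b = 2, c = 3, d = 4, e = 5, f = 6)
    sum(object@v) + a + b + c + d + e + f)

summarise_ds(new("dataset", v = 1:3))   # 27
```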
>>
>> 7) Default values for S4 methods
>> It seems it is not possible to set default parameter values for an S4 method in the usual way, i.e. definition = function(x, y = 5). I resorted to making class unions with "missing" in the call signatures, with the method body starting with if (missing(param)) param <- DEFAULT_VALUE, but that certainly does not improve readability or ease of coding.
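For what it is worth, defaults can be given in the generic's formals, and with a restricted signature the defaulted argument does not take part in dispatch at all, which avoids the class-union-with-"missing" workaround. A small sketch (class and generic names illustrative):

```r
# Defaults live in the generic's formals; signature = "x" keeps y out of
# dispatch entirely, so no "missing" class union is needed.
setClass("num", slots = c(v = "numeric"))

setGeneric("shift", function(x, y = 5) standardGeneric("shift"),
           signature = "x")
setMethod("shift", "num", function(x, y = 5) x@v + y)

shift(new("num", v = 1))          # 6: default y = 5 applies
shift(new("num", v = 1), y = 2)   # 3
```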
>>
>>
>> Thank you for your time if you have finished reading thus far. :) Looking forward to any answer.
>>
>> Yours Sincerely,
>> Antonin Klima
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com


Re: A few suggestions and perspectives from a PhD student

Ista Zahn
On Mon, May 8, 2017 at 8:08 AM, Antonin Klima <[hidden email]> wrote:
> Thanks for the answers,
>
> I’m aware of the ‘.’ option, just wanted to give a very simple example.
>
> But the lapply '...' parameter use had eluded me; thanks for enlightening me.
>
> What do you mean by messing up the call stack? As far as I understand it, piping should translate into the same code as deep nesting.

Perhaps it should, but then magrittr is not really just a pipe. Here is a simple example:

library(magrittr)
data.frame(x = 1) %>%
    subset(y == 1)
traceback()

Error in eval(e, x, parent.frame()) : object 'y' not found
12: eval(e, x, parent.frame())
11: eval(e, x, parent.frame())
10: subset.data.frame(., y == 1)
9: subset(., y == 1)
8: function_list[[k]](value)
7: withVisible(function_list[[k]](value))
6: freduce(value, `_function_list`)
5: `_fseq`(`_lhs`)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: eval(quote(`_fseq`(`_lhs`)), env, env)
2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1: data.frame(x = 1) %>% subset(y == 1)

subset(data.frame(x = 1),
      y == 1)
traceback()

Error in eval(e, x, parent.frame()) : object 'y' not found
4: eval(e, x, parent.frame())
3: eval(e, x, parent.frame())
2: subset.data.frame(data.frame(x = 1), y == 1)
1: subset(data.frame(x = 1), y == 1)

It does pollute the call stack, making debugging harder.

> So then I only see a tiny downside for debugging here. No loss of
> time/space efficiency or anything. Compared with a chance of inadvertent
> error in your example, coming from the fact that a variable is being
> reused and no one now checks for me whether it is being passed between
> the lines. And with having to specify the variable every single time.
> For me, that solution is clearly inferior.

There are tradeoffs. As demonstrated above, the pipe is clearly
inferior in that it is doing a lot of complicated stuff under the
hood, and when you try to traceback() through the call stack you have
to sift through all that complicated stuff. That's a pretty big
drawback in my opinion.

>
> Too bad you didn’t find my other comments interesting though.

I did not say that.

>
>>Why do you think being implemented in a contributed package restricts
>>the usefulness of a feature?
>
> I guess it depends on your philosophy. It may not restrict it per se, although it would make a lot of sense to me to reuse the bash-style '|' and have a shorter, more readable version. One also takes on an extra package dependency for a feature that fits the language so well that it ought to be part of it. It is without doubt my most-used operator: going through some of my folders, I found 101 uses in 750 lines, and 132 uses in 3303 lines. I would compare it to a computer game that is really good with a fan-created mod but lacking otherwise. :)

One of the key strengths of R is that packages are not akin to "fan
created mods". They are a central and necessary part of the R system.

>
> So to me it makes sense that if there is no doubt a feature improves the language, and especially if people already use it extensively through a package, it should become part of the "standard". The question is whether it is indeed that popular, and whether you share my view. But that is now up to you; I just wanted to point it out, I guess.

>
> Best Regards,
> Antonin
>
>> On 05 May 2017, at 22:33, Gabor Grothendieck <[hidden email]> wrote:
>>
>> Regarding the anonymous-function-in-a-pipeline point, one can already
>> do this, which does use braces but even so involves fewer
>> characters than the example shown.  Here { . * 2 } is basically a
>> lambda whose argument is the dot. Would this be sufficient?
>>
>>  library(magrittr)
>>
>>  1.5 %>% { . * 2 }
>>  ## [1] 3
>>
>> Regarding currying note that with magrittr Ista's code could be written as:
>>
>>  1:5 %>% lapply(foo, y = 3)
>>
>> or at the expense of slightly more verbosity:
>>
>>  1:5 %>% Map(f = . %>% foo(y = 3))
>>
>>


Re: A few suggestions and perspectives from a PhD student

hadley wickham
> There are tradeoffs. As demonstrated above, the pipe is clearly
> inferior in that it is doing a lot of complicated stuff under the
> hood, and when you try to traceback() through the call stack you have
> to sift through all that complicated stuff. That's a pretty big
> drawback in my opinion.

To be precise, that is a problem with the current implementation of
the pipe. It's not a limitation of the pipe per se.

Hadley

--
http://hadley.nz


Re: A few suggestions and perspectives from a PhD student

Hilmar Berger-4
In reply to this post by Ista Zahn
Hi,

On 08/05/17 16:37, Ista Zahn wrote:
> One of the key strengths of R is that packages are not akin to "fan
> created mods". They are a central and necessary part of the R system.
>
I would tend to disagree here. R packages are, in their majority, not
maintained by the core R developers. Concepts, features and lifetime
depend mainly on the maintainers of the package (even though in theory
the GPL allows somebody to take over at any time). Several packages that
are critical for processing big data and providing "modern"
visualizations introduce concepts quite different from the legacy S/R
language. I do feel that, in a way, current core R strongly shows its
origin in S, while modern concepts (e.g. data.table, dplyr, ggplot, ...)
are often only available via extension packages. This is fine if one
considers R to be a statistical toolkit; as a programming language,
however, it introduces inconsistencies and uncertainties that could be
avoided if some of the "modern" parts (including language concepts)
were more integrated into core R.

Best regards,
Hilmar

--
Dr. Hilmar Berger, MD
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin
GERMANY

Phone:  + 49 30 28460 430
Fax:    + 49 30 28460 401
 
E-Mail: [hidden email]
Web   : www.mpiib-berlin.mpg.de


Re: A few suggestions and perspectives from a PhD student

Joris FA Meys
On Tue, May 9, 2017 at 9:47 AM, Hilmar Berger <[hidden email]>
wrote:

> Hi,
>
> On 08/05/17 16:37, Ista Zahn wrote:
>
>> One of the key strengths of R is that packages are not akin to "fan
>> created mods". They are a central and necessary part of the R system.
>>
> I would tend to disagree here. R packages are in their majority not
> maintained by the core R developers. Concepts, features and lifetime depend
> mainly on the maintainers of the package (even though in theory GPL will
> allow to somebody to take over anytime). Several packages that are critical
> for processing big data and providing "modern" visualizations introduce
> concepts quite different from the legacy S/R language. I do feel that in a
> way, current core R shows strongly its origin in S, while modern concepts
> (e.g. data.table, dplyr, ggplot, ...) are often only available via
> extension packages. This is fine if one considers R to be a statistical
> toolkit; as a programming language, however, it introduces inconsistencies
> and uncertainties which could be avoided if some of the "modern" parts
> (including language concepts) could be more integrated in core-R.
>
> Best regards,
> Hilmar
>

And I would tend to disagree here. R is built on the paradigm of a
functional programming language, and falls in the same group as Clojure,
Haskell and the like. It is a Turing-complete programming language in its
own right. That is quite a bit more than "a statistical toolkit". You can
say that about, e.g., the macro language of SPSS, but not about R.

Second, there is little "modern" about the ideas behind the tidyverse.
Piping is about as old as Unix itself. The grammar of graphics, on which
ggplot2 is based, stems from Leland Wilkinson's work on the SYSTAT
graphics system in the nineties. Hadley and colleagues did (and do) a
great job implementing these ideas in R, but the ideas themselves have a
respectable age.

Third, there is a lot of non-standard evaluation going on in all these
packages. Using them inside your own functions requires serious attention
(e.g. the difference between aes() and aes_() in ggplot2). Actually, even
though I definitely see the merits of these packages in data analysis, the
tidyverse feels like a (clean and powerful) macro language on top of R.
And that is good, but it does not mean these parts are essential to make R
a programming language. Rather the contrary, actually: relying too heavily
on these packages complicates things once you start developing your own
packages in R.

Fourth, the tidyverse masks quite a few native R functions. Obviously the
authors took great care to keep the functionality as close as one would
expect, but that is not always the case. The lag() function of dplyr, for
example, masks an S3 generic from the stats package. So if you work with
time series via the stats package, loading the tidyverse gives you
trouble.

Fifth, many of the tidyverse packages are at version 0.x.y: they are
still in beta development and their functionality might (and will)
change. Functions disappear, arguments are renamed, tags change, ...
Often the changes improve the packages, but they have broken older code
for me more than once. You cannot expect the R core team to incorporate
something that is bound to change.

Last but not least, the tidyverse actually sometimes works against new R
users, at least those who go beyond the classic data workflow. I
literally rewrote some code (from a consultant) that abused the *_ply
functions to create nested loops. Removing all that and rewriting the
code using a simple list in combination with a simple for loop sped it
up by a factor of 150. That has nothing to do with dplyr itself, which is
very fast; it has everything to do with that person having a hammer and
thinking everything he sees is a nail. The tidyverse is no reason not to
learn the concepts of the language it is built upon.

The one thing I would like to see, though, is the adaptation of the
statistical toolkit so that it can work with data.table and tibble
objects directly, as opposed to having to convert to a data.frame once
you start building models. And I believe that eventually there will be a
replacement for the data.frame that increases R's performance and
lessens its memory footprint.

So, all in all, I do admire the tidyverse and how it speeds up data
preparation for analysis. But the tidyverse is a powerful data toolkit,
not a programming language. And it will not make R a programming
language either, because R already is one.

Cheers
Joris




--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
[hidden email]
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: A few suggestions and perspectives from a PhD student

Lionel Henry
> Third, there's a lot of nonstandard evaluation going on in all these
> packages. Using them inside your own functions requires serious attention
> (eg the difference between aes() and aes_() in ggplot2). Actually, even
> though I definitely see the merits of these packages in data analysis, the
> tidyverse feels like a (clean and powerful) macro language on top of R.

That is going to change, as we have put a lot of effort into learning
how to deal with capturing functions. See the tidyeval framework, which
will enable full and flexible programmability of tidyverse grammars.

That said I agree that data analysis and package programming often
require different sets of tools.

Lionel


Re: A few suggestions and perspectives from a PhD student

Hilmar Berger-4
In reply to this post by Joris FA Meys


On 09/05/17 11:22, Joris Meys wrote:

>
> And I would tend to disagree here. R is build upon the paradigm of a
> functional programming language, and falls in the same group as
> clojure, haskell and the likes. It is a turing complete programming
> language on its own. That's quite a bit more than "a statistical
> toolkit". You can say that about eg the macro language of SPSS, but
> not about R.
>
My point was that inconsistencies are harder to tolerate when using R as
a programming language as opposed to a toolkit that just has to do a job.
> Second, there's little "modern" about the ideas behind the tidyverse.
> Piping is about as old as unix itself. The grammar of graphics, on
> which ggplot is based, stems from the SYStat graphics system from the
> nineties. Hadley and colleagues did (and do) a great job implementing
> these ideas in R, but the ideas do have a respectable age.
Those ideas still seem more modern than, e.g., stock R graphics,
designed probably in the seventies or eighties. The latter still do
their job for lots and lots of applications; however, the fact that many
newer packages use ggplot2 instead of plot() forces users to learn and
use different paradigms for things as simple as drawing a line.

I also want to make clear that I do not advocate including the whole
tidyverse in core R. I just believe that having core concepts well
supported in core R, instead of implemented in a package, might make
things more consistent. E.g. method chaining ("%>%") is a core language
feature in many languages.
>
> The one thing I would like to see though, is the adaptation of the
> statistical toolkit so that it can work with data.table and tibble
> objects directly, as opposed to having to convert to a data.frame once
> you start building the models. And I believe that eventually there
> will be a replacement for the data.frame that increases R's
> performance and lessens its burden on the memory.
>
Which is a perfect example of what I mean: improved functionality should
find its way into core R at some point, replacing or extending outdated
functionality. Otherwise, I do not know how hard it will be to develop
21st-century methods on top of a 1980s/90s language core, although I
admit that the R developers are doing a great job of making it possible.

Best,
Hilmar

