About performance of R

About performance of R

Suman
Hi there,

R has grown up and now has a vibrant community. It is the number one statistical package used by scientists. Its graphics capabilities are amazing.
Now it's time to provide native support in "R core" for distributed and parallel computing, for high performance on massive datasets.
And maybe base R functions should be replaced with the best R packages, like data.table, dplyr, and readr, for fast and efficient operations.


Thanks

Sent from my iPad
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: About performance of R

Bert Gunter
Did you consider the amount of code your "suggestions" would break?

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll





Re: About performance of R

Jeff Newmiller
a) Base R already includes the "parallel" package. Deciding to use more than one processor for a particular computation is a very high-level decision that can require knowledge of computing time cost, the importance of other tasks on the system, and the interdependence of computation results. It is not a decision that R should make automatically.

b) Most performance issues with R arise due to users choosing inefficient algorithms. Inserting parallelism inside existing algorithms will not fix that.
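
To illustrate point (b), here is a minimal sketch in base R. The function names running_loop and running_vec are hypothetical and chosen only for this example; the task (a running sum of squares) is an assumption picked because it is easy to write both ways. The first version grows its result with c() on every pass, which copies the whole vector each time; the second uses the vectorised cumsum(). Adding more processors would not remove the repeated copying in the first version.

```r
# Version 1: grows the result one element at a time.
# Each c(out, total) call allocates a new vector and copies the old one,
# so the total work grows roughly with the square of length(x).
running_loop <- function(x) {
  out <- c()
  total <- 0
  for (v in x) {
    total <- total + v^2
    out <- c(out, total)
  }
  out
}

# Version 2: the same result from vectorised base functions.
running_vec <- function(x) cumsum(x^2)

x <- as.numeric(1:20000)

# Both versions agree on the result.
stopifnot(isTRUE(all.equal(running_loop(x), running_vec(x))))

# Compare elapsed times; the exact numbers depend on the machine.
print(system.time(running_loop(x)))
print(system.time(running_vec(x)))
```

The timings are left to the reader to run, since they vary by machine and R version.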
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.


Re: About performance of R

David Winsemius

On May 27, 2015, at 8:00 AM, Suman wrote:

> Hi there,
>
> R has grown up and now has a vibrant community. It is the number one statistical package used by scientists. Its graphics capabilities are amazing.
> Now it's time to provide native support in "R core" for distributed and parallel computing, for high performance on massive datasets.
> And maybe base R functions should be replaced with the best R packages, like data.table, dplyr, and readr, for fast and efficient operations.
>
>
> Thanks
>
> Sent from my iPad

Generally, email exhortations from iPads are quite ineffective in promoting fundamental advances.

In the US, cheerleading at sports events is often attempted by small groups of scantily clad women of various ages, usually young, using coordinated dance movements. I wonder if something similar should be attempted by those desirous of more rapid advancement of computer software. On the other hand, I suppose the world has historically been more effective in this domain by waving large bundles of cash and stock options, rather than waving unclad female body parts.

Got any cash to wave?

--

David Winsemius
Alameda, CA, USA


Re: About performance of R

Duncan Murdoch-2
On 27/05/2015 11:00 AM, Suman wrote:
> Hi there,
>
> R has grown up and now has a vibrant community. It is the number one statistical package used by scientists. Its graphics capabilities are amazing.
> Now it's time to provide native support in "R core" for distributed and parallel computing, for high performance on massive datasets.
> And maybe base R functions should be replaced with the best R packages, like data.table, dplyr, and readr, for fast and efficient operations.

Given your first three sentences, I would say the current development
strategy for R is successful.  As Bert mentioned, one thing we have
always tried to do is to make improvements without large disruptions to
the existing code base.  I think we will continue to do that.

This means we are unlikely to make big, incompatible replacements. But
there's nothing stopping people from using data.table, dplyr, etc. even
if they aren't in the core.  In fact, having them outside of core R is
better:  there are only so many core R developers, and if they are
working on data.table, etc., they wouldn't be working on other things.

Compatible replacements are another question.  There is ongoing work on
making R faster, and making it easier to take advantage of multiple
processors.  I believe R 3.2.0 is faster than the R 3.1.x series in many
things, and changes like that are likely to continue.  Plus, there is
base support for explicit parallel programming in the parallel package,
as Jeff mentioned.
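
For readers who have not used it, explicit parallel programming with the parallel package might look like the minimal sketch below. The function slow_square is a hypothetical stand-in for an expensive, independent task, and the worker count of 2 is an arbitrary assumption; in practice the caller picks it based on the machine and on what else is running there, which is the kind of decision Jeff describes.

```r
library(parallel)

# Hypothetical stand-in for an expensive task whose results
# do not depend on each other.
slow_square <- function(i) {
  Sys.sleep(0.05)
  i^2
}

inputs <- 1:8

# The caller asks for parallelism explicitly and chooses the number
# of workers; 2 is an arbitrary value for this sketch.
cl <- makeCluster(2)
res_par <- parLapply(cl, inputs, slow_square)
stopCluster(cl)

# The sequential equivalent gives the same answer.
res_seq <- lapply(inputs, slow_square)
stopifnot(identical(res_par, res_seq))
```

Whether the parallel version is actually faster depends on the cost of each task relative to the overhead of starting workers and moving data to them.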

As to David and his large bundles: those would definitely be appreciated.

Duncan Murdoch
