Runnable R packages

David Lindelof-3
Dear all,

I’m working as a data scientist in a major tech company. I have been using
R for almost 20 years now and there’s one issue that’s been bugging me of
late. I apologize in advance if this has been discussed before.

R has traditionally been used for running short scripts or data analysis
notebooks, but there’s recently been a growing interest in developing full
applications in the language. Three examples come to mind:

1) The Shiny web application framework, which facilitates the development of
rich, interactive web applications
2) The httr package, which provides lower-level facilities than Shiny for
writing web services
3) Batch jobs run by data scientists according to, say, a cron schedule

Compared with other languages, R’s support for such applications is rather
poor. The Rscript program is generally used to run an R script or an
arbitrary R expression, but I feel it suffers from a few problems:

1) It encourages developers of batch jobs to provide their code in a single
R file (bad for code structure and unit-testability)
2) It provides no way to deal with dependencies on other packages
3) It provides no way to "run" an application provided as an R package

For example, let’s say I want to run a Shiny application that I provide as
an R package (to keep the code modular, to benefit from unit tests, and to
declare dependencies properly). I would then need to a) uncompress my R
package, b) somehow, ensure my dependencies are installed, and c) call
runApp(). This can get tedious, fast.

Other languages let the developer package their code in "runnable"
artefacts, and let the developer specify the main entry point. The
mechanics depend on the language but are remarkably similar, and suggest a
way to implement this in R. Through declarations in some file, the
developer can often specify dependencies and declare where the program’s
"main" function resides. Consider Java:

Artefact: .jar file
Declarations file: Manifest file
Entry point: declared as 'Main-Class'
Executed as: java -jar <jarfile>

Or Python:

Artefact: Python package, typically as .tar.gz source distribution file
Declarations file: setup.py (which specifies dependencies)
Entry point: special __main__() function
Executed as: python -m <package>

R has already much of this machinery:

Artefact: R package
Declarations file: DESCRIPTION
Entry point: ?
Executed as: ?

I feel that R could benefit from letting the developer specify, possibly in
DESCRIPTION, how to "run" the package. The package could then be run
through, for example, a new R CMD command, for example:

R CMD RUN <package> <args>

I’m sure there are plenty of wrinkles in this idea that need to be ironed
out, but is this something that has ever been considered, or that is on R’s
roadmap?

Thanks for reading so far,



David Lindelöf, Ph.D.
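
A minimal sketch of how an entry-point declaration in DESCRIPTION could be read, assuming a field named `Main`. That field name, the helper `entry_point()`, and the sample package metadata are all hypothetical; R does not recognise such a field today. Only `read.dcf()` is an existing base R function.

```r
# Hypothetical: "Main" is NOT a DESCRIPTION field that R gives any meaning to.
entry_point <- function(desc_file, field = "Main") {
  # read.dcf() returns a character matrix with one column per requested
  # field; a field that is absent from the file comes back as NA.
  value <- read.dcf(desc_file, fields = field)[1, field]
  if (is.na(value)) {
    stop("no '", field, "' field declared in ", desc_file)
  }
  unname(value)
}

# Illustration with a throwaway DESCRIPTION file (made-up package metadata).
desc <- tempfile()
writeLines(c("Package: runme",
             "Version: 0.0.0.9000",
             "Imports: shiny",
             "Main: runme::main"), desc)
entry_point(desc)
```

A runner along the lines of the proposed R CMD RUN could look up this value and call the named function, instead of relying on a fixed function name.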

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: Runnable R packages

Gergely Daróczi
Dear David, sharing some related (subjective) thoughts below.

On Mon, Jan 7, 2019 at 9:53 PM David Lindelof <[hidden email]> wrote:

>
> Dear all,
>
> I’m working as a data scientist in a major tech company. I have been using
> R for almost 20 years now and there’s one issue that’s been bugging me of
> late. I apologize in advance if this has been discussed before.
>
> R has traditionally been used for running short scripts or data analysis
> notebooks, but there’s recently been a growing interest in developing full
> applications in the language. Three examples come to mind:
>
> 1) The Shiny web application framework, which facilitates the development of
> rich, interactive web applications
> 2) The httr package, which provides lower-level facilities than Shiny for
> writing web services
> 3) Batch jobs run by data scientists according to, say, a cron schedule
>
> Compared with other languages, R’s support for such applications is rather
> poor. The Rscript program is generally used to run an R script or an
> arbitrary R expression, but I feel it suffers from a few problems:
>
> 1) It encourages developers of batch jobs to provide their code in a single
> R file (bad for code structure and unit-testability)

I think it rather encourages developers to create (internal) R
packages and use those from the batch jobs. This way the structure is
pretty clean, sharing code between scripts is easy, unit testing can
be done within the package etc.

> 2) It provides no way to deal with dependencies on other packages

See above: create R package(s) and use those from the scripts.

> 3) It provides no way to "run" an application provided as an R package
>
> For example, let’s say I want to run a Shiny application that I provide as
> an R package (to keep the code modular, to benefit from unit tests, and to
> declare dependencies properly). I would then need to a) uncompress my R
> package, b) somehow, ensure my dependencies are installed, and c) call
> runApp(). This can get tedious, fast.

You can provide your app as a Docker image, so that the end-user
simply calls a "docker pull" and then "docker run" -- that can be done
from a user-friendly script as well.
Of course, this requires Docker to be installed, but if that's a
problem, probably better to "ship" the app as a web application and
share a URL with the user, eg backed by shinyproxy.io


Re: Runnable R packages

Dirk Eddelbuettel
In reply to this post by David Lindelof-3

On 3 January 2019 at 11:43, David Lindelof wrote:
| Dear all,
|
| I’m working as a data scientist in a major tech company. I have been using
| R for almost 20 years now and there’s one issue that’s been bugging me of
| late. I apologize in advance if this has been discussed before.
|
| R has traditionally been used for running short scripts or data analysis
| notebooks, but there’s recently been a growing interest in developing full
| applications in the language. Three examples come to mind:
|
| 1) The Shiny web application framework, which facilitates the development of
| rich, interactive web applications
| 2) The httr package, which provides lower-level facilities than Shiny for
| writing web services
| 3) Batch jobs run by data scientists according to, say, a cron schedule

That is a bit of a weird classification of "full applications". I have done
this about as long as you but I also provided (at least as tests and demos)
  i)  GUI apps using tcl/tk (which comes with R) and
  ii) GUI apps with Qt (or even Wt), see my RInside package.

But my main weapon for 3) is littler. See

   https://cran.r-project.org/package=littler

and particularly the many examples at

   https://github.com/eddelbuettel/littler/tree/master/inst/examples
 
| Compared with other languages, R’s support for such applications is rather
| poor. The Rscript program is generally used to run an R script or an
| arbitrary R expression, but I feel it suffers from a few problems:
|
| 1) It encourages developers of batch jobs to provide their code in a single
| R file (bad for code structure and unit-testability)
| 2) It provides no way to deal with dependencies on other packages
| 3) It provides no way to "run" an application provided as an R package

Err, no. See the examples/ directory above. About every single one uses
packages.

As illustrations I have long-running and somewhat visible cronjobs that are
implemented the same way: CRANberries (since 2007, now running hourly) and
CRAN Policy Watch (running once a day). Because both are 'hacks' I never
published the code but there is not that much to it. CRANberries just queries
CRAN, compares to what it had last, and writes out variants of the
DESCRIPTION file to text where a static blog engine (like Hugo, but older)
makes a feed and html pages out of it.  Oh, and we tweet because "why not?".
 
| For example, let’s say I want to run a Shiny application that I provide as
| an R package (to keep the code modular, to benefit from unit tests, and to
| declare dependencies properly). I would then need to a) uncompress my R
| package, b) somehow, ensure my dependencies are installed, and c) call
| runApp(). This can get tedious, fast.

Disagree here too. At work, I just write my code, organize it in packages,
update the packages and have shiny expose whatever makes sense.

| Other languages let the developer package their code in "runnable"
| artefacts, and let the developer specify the main entry point. The
| mechanics depend on the language but are remarkably similar, and suggest a
| way to implement this in R. Through declarations in some file, the
| developer can often specify dependencies and declare where the program’s
| "main" function resides. Consider Java:
|
| Artefact: .jar file
| Declarations file: Manifest file
| Entry point: declared as 'Main-Class'
| Executed as: java -jar <jarfile>
|
| Or Python:
|
| Artefact: Python package, typically as .tar.gz source distribution file
| Declarations file: setup.py (which specifies dependencies)
| Entry point: special __main__() function
| Executed as: python -m <package>
|
| R has already much of this machinery:
|
| Artefact: R package
| Declarations file: DESCRIPTION
| Entry point: ?
| Executed as: ?
|
| I feel that R could benefit from letting the developer specify, possibly in
| DESCRIPTION, how to "run" the package. The package could then be run
| through, for example, a new R CMD command, for example:
|
| R CMD RUN <package> <args>
|
| I’m sure there are plenty of wrinkles in this idea that need to be ironed
| out, but is this something that has ever been considered, or that is on R’s
| roadmap?

Hm. If _you_ have an itch to scratch here, why don't _you_ implement a draft?

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: Runnable R packages

Murray Stokely
In reply to this post by David Lindelof-3
Some other major tech companies have in the past made wide use of Runnable R
Archives (".Rar" files), similar to Python .par files [1], and integrated
them completely into the proprietary R package build system in use there.
I thought a few systems like this had made their way to
CRAN or the UseR conferences, but I don't have a link.

Building something specific to your organization on top of the Python .par
framework to archive up R, your needed packages/shared libraries, and other
dependencies, with a runner script to R CMD RUN your entry point in a
sandbox, is a pretty straightforward way to have control in a way that makes
sense for your environment.

      - Murray

[1] https://google.github.io/subpar/subpar.html


Re: Runnable R packages

Dirk Eddelbuettel
In reply to this post by Gergely Daróczi

On 7 January 2019 at 22:09, Gergely Daróczi wrote:
| You can provide your app as a Docker image, so that the end-user
| simply calls a "docker pull" and then "docker run" -- that can be done
| from a user-friendly script as well.
| Of course, this requires Docker to be installed, but if that's a
| problem, probably better to "ship" the app as a web application and
| share a URL with the user, eg backed by shinyproxy.io

Excellent suggestion.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: Runnable R packages

Iñaki Ucar
In reply to this post by Gergely Daróczi
On Mon, 7 Jan 2019 at 22:09, Gergely Daróczi <[hidden email]> wrote:
>
> Dear David, sharing some related (subjective) thoughts below.
>
> You can provide your app as a Docker image, so that the end-user
> simply calls a "docker pull" and then "docker run" -- that can be done
> from a user-friendly script as well.
> Of course, this requires Docker to be installed, but if that's a
> problem, probably better to "ship" the app as a web application and
> share a URL with the user, eg backed by shinyproxy.io

If Docker is a problem, you can also try podman: same usage,
compatible with Dockerfiles and daemon-less, no admin rights required.

https://podman.io/

Iñaki


Re: Runnable R packages

David Lindelof-3
Belated thanks to all who replied to my initial query. In summary, three
approaches have been mentioned to run R code "in production": 1)
ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
Docker-like solutions, mentioned by Gergely and Iñaki; and 3) Solutions
based on Rscript or littler, mentioned by Dirk.

I can't speak to 1) because I don't currently use Shiny. And it seems to me
that Docker-like solutions will still need some "point of entry" for the R
application, which will have to be Rscript or littler.

In my first email, I observed that Rscript expects a single expression or a
single script, which is probably why (in my experience) many data
scientists tend to provide their code in a very limited number of files.
Gergely disagreed, arguing to the contrary that data scientists are
encouraged to provide their application as an R package called by a short
script executed by Rscript. But this doesn't happen where I work for
several reasons:

   - it implies installing your package on the production machine(s),
   including its dependencies, which must be done by hand
   - some machine learning platforms will simply not accept code provided
   as an R package
   - we have some "big data" use cases for which we need Spark; Spark can
   run R or Python code, but only when it is provided as a single file. (On
   the other hand, Spark can run applications provided as JAR files)

In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` header or Python's `__main__()` function. A new R CMD command
would take a package, install its dependencies, and run its "main"
function. If we have this machinery available, we could even consider
reaching out to Spark (and other tech stacks) developers and make it easier
to develop R applications for those platforms.

A candid comment from Dirk suggested that I should implement this myself,
which I would be happy to do, provided this is the normal procedure. Or is
there a more formal process I should follow?

Kind regards,

David Lindelöf


Re: Runnable R packages

Duncan Murdoch-2
On 31/01/2019 9:32 a.m., David Lindelof wrote:

> A candid comment from Dirk suggested that I should implement this myself,
> which I would be happy to do, provided this is the normal procedure. Or is
> there a more formal process I should follow?

You can't implement it to run under R CMD, but it should be
straightforward to put this in an R package, to be run by Rscript using
something like

   Rscript -e "yourpackage::run_main('somepackage')"

You can use the installation code from the `remotes` package, so
run_main() could be a pretty simple function.

Duncan Murdoch
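
A rough sketch of what such a run_main() might look like. It is only an illustration: the function body, the `entry` and `installer` arguments, and the convention that the target package exports a function called "main" are assumptions, not an existing API. It uses `install.packages()`, which also pulls in the Depends/Imports/LinkingTo of the package by default; for a local source tarball, `remotes::install_local()` would be the closer match to the suggestion above.

```r
# Hypothetical helper; "main" as the entry-point name is an assumed convention.
run_main <- function(pkg,
                     args = commandArgs(trailingOnly = TRUE),
                     entry = "main",
                     installer = utils::install.packages) {
  # Install the package (and, with install.packages' defaults, its hard
  # dependencies) only when it is not already available.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
  }
  # Fetch the exported entry point; this raises an error if the package
  # does not export a function with that name.
  main <- getExportedValue(pkg, entry)
  main(args)
}

# Demonstration with a package that ships with R: call tools::file_ext()
# as if it were the "main" function.
run_main("tools", args = "foo.tar.gz", entry = "file_ext")
```

Placed in a small helper package, it could be invoked as in the Rscript one-liner above, with any command-line arguments passed through to the entry point.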


Re: Runnable R packages

barry rowlingson
In reply to this post by Iñaki Ucar
On Thu, Jan 31, 2019 at 3:14 PM David Lindelof <[hidden email]> wrote:

>
> In summary, I'm convinced R would benefit from something similar to Java's
> `Main-Class` header or Python's `__main__()` function. A new R CMD command
> would take a package, install its dependencies, and run its "main"
> function.



I just created and built a very boilerplate R package called "runme". I can
install its dependencies and run its "main" function with:

 $ R CMD INSTALL runme_0.0.0.9000.tar.gz
 $ R -e 'runme::main()'

No new R CMDs needed. Now my choice of "main" is arbitrary, whereas with
python and java and C the entrypoint is more tightly specified (__name__ ==
"__main__" in python, int main(..) in C and so on). But I don't think
that's much of a problem.

Does that not satisfy your requirements close enough? If you want it in one
line then:

R CMD INSTALL runme_0.0.0.9000.tar.gz && R -e 'runme::main()'

will do the second if the first succeeds (Unix shells).

You could write a script for $RHOME/bin/RUN which would be a two-liner and
that could mandate the use of "main" as an entry point. But good luck
getting anything into base R.

Barry





Re: Runnable R packages

David Lindelof-3
Would you care to share how your package installs its own dependencies? I
assume this is done during the call to `main()`? (Last time I checked, R
CMD INSTALL would not install a package's dependencies...)



Re: Runnable R packages

Jan Gorecki
In reply to this post by barry rowlingson
Quoting:

"In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` header or Python's `__main__()` function. A new R CMD command
would take a package, install its dependencies, and run its "main"
function."

This kind of increases the scope of your idea. A new command in R CMD to
redirect to "main" is an interesting idea. On the other hand, it would
impose a limitation on the user compared to how you can do it
now: Rscript -e 'mypkg::mymain("myparam")' (or littler, which should be
shipped with R IMO).
For a production system one doesn't want to just "install its
dependencies". First, the dependencies have to be mirrored and their versions
frozen. Then you test your package against that set of dependencies. Once
that succeeds, the same set of packages should be used for
production deployment. For those processes you might find the tools4pkgs
branch of base R useful (packages.dcf, mirror.packages functions);
unfortunately it was not merged:
https://github.com/wch/r-source/compare/tools4pkgs

Jan Gorecki
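
To illustrate the "frozen versions" step, here is a small sketch that compares the installed versions of packages against a pinned list before a deployment. The helper `check_pins()` and the shape of the pin list are made up for this example; this is not how the tools4pkgs branch works, just one way the check could be expressed with `system.file()` and `packageVersion()`.

```r
# Hypothetical helper: 'pins' is a named character vector, where names are
# package names and values are the exact versions that were tested.
check_pins <- function(pins) {
  installed <- vapply(names(pins), function(pkg) {
    # system.file() returns "" when the package is not installed.
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  installed <- unname(installed)
  wanted <- unname(pins)
  data.frame(package = names(pins),
             pinned = wanted,
             installed = installed,
             # TRUE only when the package is present at exactly the pinned version
             ok = !is.na(installed) & installed == wanted,
             stringsAsFactors = FALSE)
}

# Example: one package that ships with R, one that does not exist.
pins <- c(utils = as.character(packageVersion("utils")),
          notARealPkg123 = "1.0.0")
check_pins(pins)
```

A deployment script could refuse to start the application unless every row has `ok` equal to TRUE.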


Re: Runnable R packages

David Lindelof-3
In reply to this post by barry rowlingson
@Barry I'm not sure your proposal would work, since `R CMD INSTALL` won't
install a package's dependencies. Indeed it will fail with an error unless
all the dependencies are met before calling it.

Speaking of which, why doesn't R CMD INSTALL install a package's
dependencies? Would it make sense to submit this as a desirable feature?

Cheers,

David


Re: Runnable R packages

barry rowlingson
In reply to this post by barry rowlingson
Ummm oops. Magic pixies? It assumed all of CRAN was installed?

Maybe I'll write something that could go in /usr/lib/R/bin/RUN that
checks and gets deps, installs the package, and runs package::main,
which I think is what the OP wants - you could do R CMD RUN
foo_1.0.0.tar.gz and away it goes...
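
A minimal sketch of what such a RUN script might contain, written as two shell functions. It is only an illustration under several assumptions: the remotes package is already installed, a package repository is configured so that dependencies can be fetched non-interactively, and the package exports a function called main. The names pkg_name and run_package are made up for this example.

```shell
# Derive the package name from a source tarball name such as foo_1.0.0.tar.gz.
# R package names cannot contain "_", so everything before the first "_" is the name.
pkg_name() {
  base=$(basename "$1")
  case $base in
    *_*.tar.gz) printf '%s\n' "${base%%_*}" ;;
    *) echo "not a source tarball: $1" >&2; return 1 ;;
  esac
}

# Install a tarball together with its missing dependencies, then call <pkg>::main().
# Assumption: the remotes package is available in the default library.
run_package() {
  tarball=$1
  pkg=$(pkg_name "$tarball") || return 1
  # remotes::install_local() accepts a path to a source tarball and resolves
  # the dependencies declared in its DESCRIPTION file.
  Rscript -e 'remotes::install_local(commandArgs(trailingOnly = TRUE)[1])' "$tarball" || return 1
  # "main" as the entry point is a convention assumed here, not something R enforces.
  Rscript -e "${pkg}::main()"
}
```

With these functions sourced, `run_package foo_1.0.0.tar.gz` would do roughly what `R CMD RUN foo_1.0.0.tar.gz` is meant to do.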

B


On Thu, Jan 31, 2019 at 3:56 PM David Lindelof <[hidden email]> wrote:

>
> Would you care to share how your package installs its own dependencies? I assume this is done during the call to `main()`? (Last time I checked, R CMD INSTALL would not install a package's dependencies...)

Re: Runnable R packages

R devel mailing list
In reply to this post by David Lindelof-3
To download a package with all its dependencies and install it, use the
install.packages() function instead of 'R CMD INSTALL'.  E.g., in bash:

mkdir /tmp/libJunk
env R_LIBS_SITE=/tmp/libJunk R --quiet -e 'if (!requireNamespace("purrr",quietly=TRUE)) install.packages("purrr")'

For corporate "production use" you probably want to set up your own
repository containing fixed versions of packages instead of using CRAN.
Then add repos="..." to the install.packages() call.  Of course you can put
this into a package and somehow deal with the bootstrapping issue.
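
The same idea can be wrapped in a small shell helper that takes the library directory, the package name and the repository as arguments. This is only a sketch: the function names r_install_expr and install_if_missing are invented for illustration, and any repository URL passed in is a placeholder for your own. It sets R_LIBS rather than R_LIBS_SITE on the assumption that the R_LIBS directory comes first in the library search path, so that install.packages() installs into it by default.

```shell
# Build the R expression that installs a package only when it is missing.
# $1 = package name, $2 = repository URL (e.g. an internal CRAN-like repo).
r_install_expr() {
  printf 'if (!requireNamespace("%s", quietly = TRUE)) install.packages("%s", repos = "%s")\n' \
    "$1" "$1" "$2"
}

# Install package $2 from repository $3 into library directory $1 if needed.
install_if_missing() {
  lib=$1
  mkdir -p "$lib" || return 1
  # R_LIBS puts the directory at the front of .libPaths() for this one call.
  R_LIBS="$lib" Rscript -e "$(r_install_expr "$2" "$3")"
}
```

For example, `install_if_missing /tmp/libJunk purrr https://cloud.r-project.org` would install purrr and its dependencies into /tmp/libJunk unless it is already visible there.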

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Jan 31, 2019 at 8:04 AM David Lindelof <[hidden email]> wrote:

> Would you care to share how your package installs its own dependencies? I
> assume this is done during the call to `main()`? (Last time I checked, R
> CMD INSTALL would not install a package's dependencies...)

Re: Runnable R packages

Dirk Eddelbuettel

On 1 February 2019 at 13:31, William Dunlap via R-devel wrote:
| To download a package with all its dependencies and install it, use the
| install.packages() functions instead of 'R CMD INSTALL'.  E.g., in bash:
|
| mkdir /tmp/libJunk
| env R_LIBS_SITE=libJunk R --quiet -e 'if
| (!requireNamespace("purrr",quietly=TRUE)) install.packages("purrr")'

Or one could use 'littler' and install some of its examples on the $PATH
(which I tend to do via softlinks to get updates easily).

Then it is simply

   $ install.r purrr

and there is also install2.r with docopt goodness and more options.

These have been my preferred tools for many years at home and at work, and
they found their way into the Rocker Dockerfiles as well, as install2.r was
started by Carl for added features.
 
| For corporate "production use" you probably want to set up your own
| repository containing
| fixed versions of packages instead of using CRAN.  Then add repos="..." to
| the install.packages()
| call.  Of course you can put this into a package and somehow deal with the
| bootstrapping issue.

Absolutely. But what repo to source packages from is somewhat orthogonal to
how to install from there. Also, thanks to Gergely, repos is now an argument
to install2.r.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: Runnable R packages

Abby Spurdle
In reply to this post by David Lindelof-3
This is possibly the most redundant discussion I've ever seen on the R
mailing lists.

In the original post:

> 2) It provides no way to deal with dependencies on other packages
> 3) It provides no way to "run" an application provided as an R package

Both completely false statements.

> recently been a growing interest in developing full applications

R was originally designed for interpreted use, with statistics and graphics.
However, GUI, web and other applications are possible.
And it's been around for a while.
(So, not "recently").

Maybe your organization could/should pay a programmer to do it?


Re: Runnable R packages

Abby Spurdle
In reply to this post by David Lindelof-3
Further to my previous post,
it would be possible to create an .exe file, say:

my_r_application.exe

That starts R, loads your R package(s), calls the R function of your choice
and does whatever else you want.

However, I don't think that it would add much value.
But feel free to correct me if you think that I'm wrong.


Re: Runnable R packages

barry rowlingson
In reply to this post by David Lindelof-3
I don't think anyone denies that you *could* make an EXE to do all
that. The discussion is on *how easy* it should be to create a single
file that contains an initial "main" function plus a set of bundled
code (potentially as a package) and which when run will install its
package code (which is contained in itself, it's not in a repo),
install dependencies, and run the main() function.

Now, I could build a self-executable shar file that bundled a package
together with a script to do all the above. But if there was a "RUN"
command in R, and a convention that a function called "foo::main"
would be run by `R CMD RUN foo_1.1.1.tar.gz` then it would be so much
easier to develop and test.
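
For comparison, the self-extracting route might look roughly like the sketch below: a shell stub with the package tarball appended after a marker line. Everything here is an assumption made for illustration, including the function name make_bundle, the __PAYLOAD__ marker, the BUNDLE_EXTRACT_ONLY switch (added only so that extraction can be checked without R), the use of remotes::install_local() for the dependencies, and "main" as the entry point.

```shell
# make_bundle TARBALL OUTPUT
# Write a self-extracting script: a shell stub followed by the raw tarball bytes.
make_bundle() {
  tarball=$1
  out=$2
  name=$(basename "$tarball")
  pkg=${name%%_*}          # package name is the part before the first "_"
  {
    printf '%s\n' '#!/bin/sh'
    printf "name='%s'\npkg='%s'\n" "$name" "$pkg"
    cat <<'STUB'
tmp=$(mktemp -d) || exit 1
# The payload starts on the line after the marker line.
start=$(awk '/^__PAYLOAD__$/ { print NR + 1; exit }' "$0")
tail -n +"$start" "$0" > "$tmp/$name" || exit 1
# Hypothetical switch: only unpack and print the path, do not start R.
[ -n "$BUNDLE_EXTRACT_ONLY" ] && { printf '%s\n' "$tmp/$name"; exit 0; }
# Install the package with its dependencies, then hand over to <pkg>::main().
Rscript -e 'remotes::install_local(commandArgs(trailingOnly = TRUE)[1])' "$tmp/$name" &&
  exec Rscript -e "${pkg}::main()"
exit 1
__PAYLOAD__
STUB
    cat "$tarball"
  } > "$out"
  chmod +x "$out"
}
```

Running `make_bundle foo_1.1.1.tar.gz foo.run` would produce a single file that can be started with `./foo.run`. It works, but the stub has to be regenerated on every build, which is the extra friction a shared RUN convention would avoid.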

If people think this adds value, then if they want to offer that value
to me as $ or £, I'd consider writing it if their total value was more
than my cost....

Barry


On Sat, Feb 2, 2019 at 12:54 AM Abs Spurdle <[hidden email]> wrote:

>
> Further to my previous post,
> it would be possible to create an .exe file, say:
>
> my_r_application.exe
>
> That starts R, loads your R package(s), calls the R function of your choice
> and does whatever else you want.
>
> However, I don't think that it would add much value.
> But feel free to correct me if you think that I'm wrong.

Re: Runnable R packages

Duncan Murdoch-2
On 02/02/2019 8:27 a.m., Barry Rowlingson wrote:

> I don't think anyone denies that you *could* make an EXE to do all
> that. The discussion is on *how easy* it should be to create a single
> file that contains an initial "main" function plus a set of bundled
> code (potentially as a package) and which when run will install its
> package code (which is contained in itself, its not in a repo),
> install dependencies, and run the main() function.
>
> Now, I could build a self-executable shar file that bundled a package
> together with a script to do all the above. But if there was a "RUN"
> command in R, and a convention that a function called "foo::main"
> would be run by `R CMD RUN foo_1.1.1.tar.gz` then it would be so much
> easier to develop and test.

I don't believe the "so much easier" argument that this requires a
change to base R.  If you put that functionality into a package, then
the only extra effort the user would require is to install that other
package.  After that, they could run

Rscript -e "yourpackage::run_main('foo_1.1.1.tar.gz')"

as I suggested before.  This is no harder than running

R CMD RUN foo_1.1.1.tar.gz

The advantage of this from R Core's perspective is that you would be
developing and maintaining "yourpackage"; you wouldn't be passing the
burden on to them.  The advantage from your perspective is that you
could work with whatever packages you liked.  The "remotes" package has
almost everything you need so that "yourpackage" could be nearly
trivial.  You wouldn't need to duplicate it within base R.

Duncan Murdoch

>
> If people think this adds value, then if they want to offer that value
> to me as $ or £, I'd consider writing it if their total value was more
> than my cost....
>
> Barry

Re: Runnable R packages

David Lindelof-3
I see some value in Duncan’s proposal to implement this as an extra package
instead of a change to base R, if only to see if the idea has legs. I’m
minded to do so myself using your suggestion, but is there a particular
reason why you recommend using the remotes package instead of devtools? The
latter seems to have the same functions I would need, and I believe it is
more widely installed than remotes?

Kind regards,

From: Duncan Murdoch <[hidden email]>
Date: 2 February 2019 at 15:37:16
To: Barry Rowlingson <[hidden email]>, Abs Spurdle <[hidden email]>
Cc: r-devel <[hidden email]>
Subject: Re: [Rd] Runnable R packages

On 02/02/2019 8:27 a.m., Barry Rowlingson wrote:

> I don't think anyone denies that you *could* make an EXE to do all
> that. The discussion is on *how easy* it should be to create a single
> file that contains an initial "main" function plus a set of bundled
> code (potentially as a package) and which when run will install its
> package code (which is contained in itself, its not in a repo),
> install dependencies, and run the main() function.
>
> Now, I could build a self-executable shar file that bundled a package
> together with a script to do all the above. But if there was a "RUN"
> command in R, and a convention that a function called "foo::main"
> would be run by `R CMD RUN foo_1.1.1.tar.gz` then it would be so much
> easier to develop and test.

I don't believe the "so much easier" argument that this requires a
change to base R. If you put that functionality into a package, then
the only extra effort the user would require is to install that other
package. After that, they could run

Rscript -e "yourpackage::run_main('foo_1.1.1.tar.gz')"

as I suggested before. This is no harder than running

R CMD RUN foo_1.1.1.tar.gz

The advantage of this from R Core's perspective is that you would be
developing and maintaining "yourpackage", you wouldn't be passing the
burden on to them. The advantage from your perspective is that you
could work with whatever packages you liked. The "remotes" package has
almost everything you need so that "yourpackage" could be nearly
trivial. You wouldn't need to duplicate it within base R.

Duncan Murdoch


David Lindelöf, Ph.D.
+41 (0)79 415 66 41 or skype:david.lindelof
http://computersandbuildings.com
Follow me on Twitter:
http://twitter.com/dlindelof
