Can anyone advise me on running R and RStudio on an AWS virtual machine?


Chris Evans
This is a funny one and if it's off topic here, I would be grateful if I could be guided to where it would be on topic. I have done some searching but not very successfully so far.

Situation: I am analysing data stored in a postgres database in the AWS cloud, using the RJDBC and dplyr packages to pull the data down to my own machine. These work, and worked fine when the database was in Redshift last year. However, I am now getting error messages on data transfer that I think are down to the very slow broadband where I am now. That has pushed me to think I should run the analyses on a virtual machine in the AWS cloud, so that the link from the data to the machine is fast and only the code going up and the results coming down pass through my broadband. I would like to be able to work interactively, and I have some experience of that (which is clearly not R-help business!), so I am fairly happy I can do it. If not, plain ssh terminal access to the VM would be OK, and I'm used to that too.

My suspicion that the errors are down to my broadband is the trigger, but I suspect that such a setup may be the only way for me to go quite soon, as these particular data sets are growing fast (they're not huge yet: the whole saved R session image is 28 MB). I can see that being able to upscale a VM in the cloud is the sensible way to go. However, this is currently a bit outside my experience.

I have searched and found this from Amazon:
https://aws.amazon.com/marketplace/pp/B07ZDBJ42H/ref=portal_asin_url#pdp-reviews
and an Ubuntu VM with R and RStudio sounds perfect: pretty much replicating my laptop. I can try that out for free by the look of it (can that be true?!) but I would like to get any advice I can first.

I also found:
https://techcommunity.microsoft.com/t5/educator-developer-blog/hosting-rserver-and-rstudio-on-azure/ba-p/744389
but I would like to keep things on Amazon if I can (no great fan of Amazon or M$ but sometimes I have to swallow my scruples).

Coming back to AWS I also found:
https://blog.martinez.fyi/post/cloud-computing-with-r-and-aws/
https://www.r-bloggers.com/2018/06/interacting-with-aws-from-r/
and
https://github.com/cloudyr/aws.ec2/commit/7566a353cc92082202f5646c7e1010df10c26dc5
all of which look pertinent and that last package looks as if it might be another way to go to offload work up to the VM if I can create one.

However, the first two pages are from 2018 and things in this cloud VM world look to me to change very, very fast. That aws.ec2 package looks to be in an orphaned or semi-orphaned state. In all, I thought I would ask here to see if anyone has any fairly current advice about creating an AWS VM on which to run R (with the luxury of being able to upscale it, subject to my pocket money, as my needs may grow).

TIA,

Chris

--
Small contribution in our coronavirus rigours:
https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/ 

Chris Evans <[hidden email]> Visiting Professor, University of Sheffield <[hidden email]>
I do some consultation work for the University of Roehampton <[hidden email]> and other places
but <[hidden email]> remains my main Email address. I have a work web site at:
https://www.psyctc.org/psyctc/ 
and a site I manage for CORE and CORE system trust at:
http://www.coresystemtrust.org.uk/ 
I have "semigrated" to France, see:
https://www.psyctc.org/pelerinage2016/semigrating-to-france/ 
https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/ 

If you want an Emeeting, I am trying to keep them to Thursdays and my diary is at:
https://www.psyctc.org/pelerinage2016/ceworkdiary/ 
Beware: French time, generally an hour ahead of UK.

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Can anyone advise me on running R and RStudio on an AWS virtual machine?

Robert Knight
I would recommend a CentOS 7 HVM image, taken from centos.org <http://centos.org/> as the direct vendor rather than from others in the marketplace. Start a t3.nano with 512 MB of RAM, and then follow the instructions to get RStudio Server installed. Create a new user for RStudio Server. RStudio Server doesn't require a separate web server or anything like that.

A t3a.nano would be about 10% cheaper, but slower on the CPU clock speed, which might reduce performance. If you are using dplyr and multithreading your work then you might want to move up in the t3 offerings until you get one with enough vCPUs for the job, but the cost increases pretty quickly. It's great to have an RStudio version that never changes or updates on you.

Also, make sure you have a backup plan in case the server gets hacked in some fashion, because the security implications of running RStudio Server in that situation are not well studied. You could restrict access to your own IP address in the AWS security group settings, which would drastically reduce that risk.
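As a rough illustration of the last point, here is a minimal Python sketch of what a "my IP only" rule for the security group could look like. It assumes RStudio Server is listening on its default port, 8787. The helper name single_ip_ingress_rule, the example address 203.0.113.25 (a documentation-only address) and the "sg-..." placeholder are all hypothetical. The sketch only builds the rule; the commented lines at the end show how it might be handed to boto3's authorize_security_group_ingress, assuming boto3 is installed and credentials are configured.

```python
import ipaddress

RSTUDIO_PORT = 8787  # RStudio Server's default listening port


def single_ip_ingress_rule(my_ip: str, port: int = RSTUDIO_PORT) -> dict:
    """Build one EC2 IpPermissions entry allowing TCP `port` from a single IPv4 address.

    Hypothetical helper for illustration; it does not talk to AWS itself.
    """
    addr = ipaddress.ip_address(my_ip)  # raises ValueError on a malformed address
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # /32 means exactly one address, so nobody else can reach the port
        "IpRanges": [
            {"CidrIp": f"{addr}/32", "Description": "RStudio Server from my IP"}
        ],
    }


if __name__ == "__main__":
    rule = single_ip_ingress_rule("203.0.113.25")  # example address only
    print(rule)
    # Applying it (assumption: boto3 installed, credentials configured,
    # and "sg-..." replaced by your own security group id):
    #   import boto3
    #   boto3.client("ec2").authorize_security_group_ingress(
    #       GroupId="sg-...", IpPermissions=[rule])
```

If your home IP address changes (as it does on many broadband connections), the old rule would need to be revoked and a new one added, so the same thing can just as well be done by hand in the AWS console.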

Robert Knight

> On Oct 14, 2020, at 12:00 PM, Chris Evans <[hidden email]> wrote:
> [...]

