Why Use Docker with R? A DevOps Perspective

October 16, 2017

There have been several blog posts going around about why one would use Docker with R. In this post I’ll try to add a DevOps point of view and explain how containerizing R is used in the context of the OpenCPU system for building and deploying R servers.

1: Easy Development

The flagship of the OpenCPU system is the OpenCPU server: a mature and powerful Linux stack for embedding R in systems and applications. Because OpenCPU is completely open source, we can build and ship it on DockerHub. A ready-to-go Linux server with both OpenCPU and RStudio can be started using the following command (publish either port 8004 or port 80):

docker run -t -p 8004:8004 opencpu/rstudio

Now simply open http://localhost:8004/ocpu/ and http://localhost:8004/rstudio/ in your browser! Log in to RStudio with user opencpu (password: opencpu) to build or install apps. See the readme for more info.

Docker makes it easy to get started with OpenCPU. The container gives you the full flexibility of a Linux box, without the need to install anything on your system. You can install packages or apps via RStudio Server, or use docker exec to open a root shell on the running server:

# Lookup the container ID
docker ps

# Drop a shell
docker exec -i -t eec1cdae3228 /bin/bash

From the shell you can install additional software in the server, customize the apache2 httpd config (auth, proxies, etc), tweak R options, optimize performance by preloading data or packages, etc.
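The same kind of customization can also be scripted without an interactive shell. The following is a minimal sketch, not part of OpenCPU itself: the helper names run and customize, the system package libxml2-dev, and the R package xml2 are illustrative assumptions, and it assumes that apt-get is available inside the image. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: run one-off commands inside the running server via docker exec.
# ASSUMPTIONS: libxml2-dev and xml2 are placeholder packages; apt-get is
# assumed to exist in the image.

# Print commands when DRY_RUN=1 (the default); execute them when DRY_RUN=0
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

customize() {
  cid=$1   # container ID as shown by `docker ps`
  # Install an extra system library
  run docker exec "$cid" apt-get update
  run docker exec "$cid" apt-get install -y libxml2-dev
  # Install an R package from CRAN
  run docker exec "$cid" R -e 'install.packages("xml2", repos = "https://cloud.r-project.org")'
}

# eec1cdae3228 is the example ID from above; substitute your own
customize eec1cdae3228
```

Changes made this way live only in that one container, so they are best suited to experimenting.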

2: Shipping and Deployment via DockerHub

The most powerful use of Docker is shipping and deploying applications via DockerHub. To create a fully standalone application container, simply use a standard opencpu image and add your app.

For the purpose of this blog post I have wrapped up some of the example apps as docker containers by adding a very simple Dockerfile to each repository. For example the nabel app has a Dockerfile that contains the following:

FROM opencpu/base

RUN R -e 'devtools::install_github("rwebapps/nabel")'

It takes the standard opencpu/base image and then installs the nabel app from the GitHub repository. The result is a completely isolated, standalone application. The application can be started by anyone using e.g.:

docker run -d -p 8004:8004 rwebapps/nabel

The -d flag runs the container in the background and -p publishes port 8004. Now open the app at http://localhost:8004/ocpu/library/nabel. You can of course tweak the Dockerfile to install whatever extra software or settings your application needs.
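As a minimal sketch of such a tweak, the script below writes a variant of the nabel Dockerfile that installs an extra system library before the app. The library libxml2-dev, the image tag my/nabel-custom, the function name write_dockerfile, and the RUN_BUILD switch are illustrative assumptions and not part of the nabel repository; it also assumes that apt-get is available in opencpu/base.

```shell
#!/bin/sh
# Sketch: extend the nabel Dockerfile with an extra system package.
# ASSUMPTIONS: libxml2-dev and the tag my/nabel-custom are placeholders;
# apt-get is assumed to exist in opencpu/base.

write_dockerfile() {
  # $1: directory in which to write the Dockerfile
  cat > "$1/Dockerfile" <<'EOF'
FROM opencpu/base

# Extra system dependency (placeholder)
RUN apt-get update && apt-get install -y libxml2-dev

RUN R -e 'devtools::install_github("rwebapps/nabel")'
EOF
}

builddir=$(mktemp -d)
write_dockerfile "$builddir"
echo "Dockerfile written to $builddir"

# Build only when explicitly requested, since this needs Docker and network access
if [ "${RUN_BUILD:-no}" = "yes" ]; then
  docker build -t my/nabel-custom "$builddir"
fi
```

Running it with RUN_BUILD=yes builds the image, which can then be started just like rwebapps/nabel above.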

Containerized deployment shows the true power of Docker: it allows for shipping fully self-contained applications that work out of the box, without installing any software besides Docker or relying on paid hosting services. If you do prefer professional hosting, there are many companies that will gladly host Docker applications for you on scalable infrastructure.

3: Cross Platform Building

There is a third way Docker is used for OpenCPU. At each release we build the opencpu-server installation package for half a dozen operating systems, which are published on https://archive.opencpu.org. This process has been fully automated using DockerHub. The following images automatically build the entire stack from source:

DockerHub automatically rebuilds these images when a new release is published on GitHub. All that is left to do is run a script which pulls down the images and copies the opencpu-server binaries to the archive server.
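The actual script is not shown here, but the general pattern might look like the sketch below. Everything specific in it is a hypothetical placeholder: the image names, the in-image path /output, the DEST directory, and the helper names run and fetch_binaries. It copies files to a local directory only; uploading them to the archive server is left out. By default it prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: pull each build image and copy the packages it contains to a local directory.
# ASSUMPTIONS: the image names, the path /output and the destination
# directory are hypothetical placeholders, not the real OpenCPU setup.

IMAGES="example/build-debian example/build-fedora"   # hypothetical image names
SRC_PATH="/output"                                    # hypothetical path inside the image
DEST="${DEST:-./archive}"

# Print commands when DRY_RUN=1 (the default); execute them when DRY_RUN=0
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

fetch_binaries() {
  image=$1
  # Derive a valid container name from the image name
  name=$(echo "$image" | tr '/:' '__')
  run docker pull "$image"
  # docker cp operates on containers, so create a stopped one first
  run docker create --name "$name" "$image"
  run docker cp "$name:$SRC_PATH" "$DEST/$name"
  run docker rm "$name"
}

run mkdir -p "$DEST"
for image in $IMAGES; do
  fetch_binaries "$image"
done
```

Creating a stopped container with docker create avoids running the image just to read files from it.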