OpenCPU release 1.6

May 20, 2016

Following a few weeks of testing, OpenCPU 1.6 has been released. OpenCPU is a production-ready system for embedded statistical computing with R. It provides a neat API for remotely calling R functions over HTTP via e.g. JSON or Protocol Buffers. The OpenCPU server implementation is stable and has been thorougly tested. It runs on all major Linux distributions and plays nicely with the RStudio server IDE (demo).

Similarly to shiny, OpenCPU can run as a single-user development server within the interactive R session, and as a multi-user (cloud) server for deployments on Linux. Unlinke shiny however, the cloud server comes at no extra cost. On the contrary: you are encouraged to take advantage of the cloud server which is much faster and includes cool features like user libraries, concurrent sessions, continuous integration, customizable security policies, etc.

Improvements: protolite and feather

The OpenCPU API has not changed from the 1.4 and 1.5 branch. The version bump indicates that this version targets the R 3.3 and supports the new Ubuntu 16.04. Furthermore the underlying stack of bundled R packages has been upgraded. Navigate to /ocpu/info on your OpenCPU server to inspect the exact versions of all packages used by the system.

This version introduces two major improvements for binary data interchange. First the RProtoBuf dependency has been replaced by the much smaller protolite package, which has an optimized version of protobuf object serialization. The OpenCPU already had an API for exporting data to Protocol Buffers, it’s just much faster now.

req <- GET("")
mydiamonds <- unserialize_pb(content(req))

New in this version is the feather output format which can be parsed/generated with the new feather package.

curl_download("", "diamonds.feather")
mydiamonds <- read_feather("diamonds.feather")

Both pb and feather are a binary alternative to the text based json format:

con <- curl("")
mydiamonds <- fromJSON(con)

Installation and upgrading

The download page has instructions for installing the opencpu server on various distributions, either from source or using precompiled binaries. To upgrade an existing installation of opencpu on ubuntu, simply run:

sudo add-apt-repository ppa:opencpu/opencpu-1.6
sudo apt-get update
sudo apt-get dist-upgrade

Note that this will also upgrade the version of R to 3.3.0 (if you have not already done so) which might require that you reinstall some of your R packages.

You can also install opencpu-server on any version of Debian/Ubuntu/Fedora/CentOS/RHEL by building the deb/rpm installation package from source. This is really easy, see the readme for deb or rpm.

Getting started

For those completely new to OpenCPU there several resources to get started. The presentation from last year’s useR conference gives a broad overview of the system including some basic demo’s. The example apps and jsfiddle scripts show how to use the opencpu.js JavaScript client. The server manual has contains documentation on configuring your opencpu cloud server (although installation should work out of the box).

Finally this paper from my thesis describes more generally the challenges of embedded scientific computing, and the benefits (both technical and human) of decoupling your statistical computing from your front-end or application layer.

The public demo server

To deploy your OpenCPU apps on the public server, simply push your R package to Github and configure the webhook in your repository. Whenever you push an update to Github the package will be reinstalled on the server and can directly be used remotely by anyone on the internet. You can either use the full url or the shorthand url:

  • https://{username}{package}/

These urls are fully equivalent. Simply replace {username} with your github username, and {package} with your package name. Note that the package name must be identical to the github repository name (as is usually the case).

On writing packages

One prerequisite for using OpenCPU is knowing how to create an R package. There is no way around this; packages are the natural container format for shipping and deploying code/data/manuals in R, and the OpenCPU API assumes this format. Luckily, writing R packages is super easy these days and can be done in less than (10 seconds) using for example RStudio.

The good thing is that once you passed this little hurdle, the full power and flexibility of R and it’s packaging become available to your applications and APIs. Hadley’s latest book on writing R packages gives a nice overview of the R packaging system, and the OpenCPU API provides an easy HTTP interface to all of these features.