Our Docker Journey

by Daniel Craigmile (@x110dc)

We’re fans of Docker at The Texas Tribune. We’ve been playing with it since well before the 1.0 release and immediately started incorporating it into our infrastructure when the first production-ready code was released in June 2014. Today Docker is the foundation for a large part of our platform. This is the story of how we got there.

We’re a small nonprofit news organization that reports primarily on state-level politics and policy. We have a main news site (www.texastribune.org), an op-ed site (tribtalk.org) and several other sites developed by our very capable News Apps team. Our environment is mostly a mix of legacy applications, static sites and frequently created new apps. The main site is based on Django but also uses Ruby, Node and a slew of dependencies for each. How do you get the correct versions of all that software installed in a consistent, repeatable way? Docker. As a new developer, if you install the project by following the detailed instructions in the README, you’ll be operational in a matter of hours or days. Using Docker, it’s a matter of minutes.

We started by just using Docker as a nicely isolated development environment, thinking of it as version management (think virtualenv or rvm) on steroids.

Gradually we started incorporating Docker in more ways. We use Jenkins for continuous integration and Rundeck for operational tasks like database backups, database refreshes and Django maintenance, among other things. For both Rundeck and Jenkins, the hosts are doing lots of distinct types of work for disparate projects, and each project has different requirements and software dependencies. Previously, we had to install software from every project on the boxes and hope there were no conflicts. That’s a false hope; there are always conflicts. Now, with Docker, everything is isolated, and nothing special has to be installed on the Jenkins or Rundeck hosts. Every task is run as an isolated Docker container. The hosts themselves can do one thing (and do it well): run our CI and our operations.
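The one-container-per-task pattern can be sketched roughly as follows. This is a minimal illustration, not our actual Jenkins or Rundeck configuration: the helper `docker_run_args`, the image name, the database host and the paths are all hypothetical placeholders. The code only builds the `docker run` argument list; handing that list to `subprocess.run` on a host with Docker installed would actually start the container.

```python
import shlex


def docker_run_args(image, command, env=None, volumes=None):
    """Build an argv list for running one task in a throwaway container.

    All inputs are illustrative; nothing here reflects a real job definition.
    """
    # --rm removes the container once the task exits, so nothing lingers on the host.
    args = ["docker", "run", "--rm"]
    for key, value in sorted((env or {}).items()):
        # Pass task settings into the container as environment variables.
        args += ["-e", f"{key}={value}"]
    for host_path, container_path in sorted((volumes or {}).items()):
        # Bind-mount a host directory so the task can leave output behind.
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    args += list(command)
    return args


# Hypothetical database-backup task: the image name, host and paths are made up.
backup = docker_run_args(
    "example/db-backup",
    ["pg_dump", "--file=/backups/site.sql", "site"],
    env={"PGHOST": "db.internal"},
    volumes={"/srv/backups": "/backups"},
)

# Show the equivalent shell command line.
print(shlex.join(backup))
```

Because everything the task needs lives in the image, the only software the CI or operations host itself requires in this sketch is Docker.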

The Docker Hub is part of our workflow and has been since the service launched. We use its automated build feature: as soon as a new commit is pushed to GitHub, Docker Hub begins building the images, so we don’t need a current copy of the Git repository on our build boxes. Those images get pulled by Rundeck and Jenkins. We also link to parent images on the Hub. When a parent gets updated, all child images get updated as well. That’s all done with no intervention on our part. We publish all of our open source Docker projects on the Hub so others can easily find, use and fork our work. (That’s 28 projects and counting!)

We ran a series of workshops on Docker to get our platform and News Apps teams acquainted with it and to get them started using it for their own projects. Even the presentations were powered by Docker!

We also started using it for ancillary services. We have a Docker image that does mail delivery. We plug in our Mandrill key and go. There’s no need to go through the arduous task of setting up Postfix, Sendmail, or anything else. Simply start this container and linked containers or hosts have instant mail delivery capability with very little fuss. Configuring email delivery with Jenkins can be a pain; now it’s not.
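From an application’s point of view, using a linked mail container might look something like the sketch below. It rests on assumptions that are not spelled out above: that the mail container accepts plain SMTP on port 25 and that it is reachable under the link alias `mail`. The helper names and the addresses are hypothetical too.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Compose a plain-text email using only the standard library."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, host="mail", port=25):
    """Hand the message to the mail container.

    "mail" is an assumed link alias and 25 an assumed port; adjust both
    to match however the container is actually linked and exposed.
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


# Hypothetical notification; the addresses are placeholders.
note = build_message(
    "jenkins@example.org",
    "team@example.org",
    "Build finished",
    "The nightly build completed successfully.",
)
# send(note)  # uncomment inside a container that is linked to the mail container
```

The application only needs a host name and a port. The Mandrill key stays inside the mail container, so nothing about the delivery provider leaks into application code.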

We have another image that just serves as an OAuth2 proxy. We can link it to any other container and — bam! — instant authentication. Docker runs high-performance applications, too, like our video aggregation and streaming service. Another image does Elasticsearch for us. It’s so much easier to run that container for applications that need Elasticsearch than going through the process of installing Java and Elasticsearch on top of it.

Our popular Government Salaries Explorer and our Texas Legislative Guide both run entirely on Docker. That means our News Apps team can deploy and maintain the applications without having to worry about the intricacies of Amazon Web Services autoscaling groups and elastic load balancers.

We containerize any greenfield or overhauled applications with Docker and run them that way in production.

Our primary site isn’t on Docker yet, but we’re moving in that direction. It’s a bit more difficult because there are many moving parts, and as our longest-running service, it’s harder and riskier to change. We’re also waiting to see how the quickly changing and very competitive landscape pans out with services offering orchestration, deployment and management. We’re looking closely at Amazon’s Elastic Container Service (and products that use it like Empire and Convox) as well as Kubernetes, Tutum, and Rancher.

Docker is already pervasive in our shop and has provided very tangible benefits in time saved and ease of use. We look forward to using it for even more!