Deletions Sprint

Deletion celebration tweet

Over 15,000 and still counting. That’s how many lines of code we removed from the Tribune codebase in June of 2015. That’s about 3% of the codebase, which remains over 455,000 lines strong, according to a Sublime Text regex search.

How did we get here? Our team works in two-week sprints, and we sometimes give our sprints themes, which means all three of us work on tasks focused on a single goal, like performance. Deleting legacy code had been hanging around on our list for a while. The Tribune has been around for over five years now, so there was quite a bit of code that the project had evolved beyond, all detailed in GitHub issues and Basecamp lists. Some of this code had been part of the site since as early as October 2009!

When you tell non-programmer colleagues that you want to spend a few weeks focusing on deleting code and lessening your technical debt, be prepared for a blank look. Then you’ll have to translate “code deletion” and “less technical debt” into an explanation of why it’s important to you (and to them, and the rest of the organization) that you delete that code.

I like to use the analogy of a house; if your rooms are cluttered with unused items, it’ll take you a lot longer to navigate through and get things done. If you have a nice clean house, however, you’ll be able to, for example, easily find your way to the kitchen and cook up a delicious dinner. Same with a codebase. If there’s unused code hanging around, it’ll take you longer to wade through it to see what it’s really doing and add a new feature or improve the existing ones. But when the codebase is clean and shiny, the new feature is easier to add in and can be built faster.

So recently, after explaining why cleaning up legacy code is beneficial to everyone within earshot of our corner of the newsroom, and reminding everyone multiple times that’s what we’d be focusing on for our sprint, we finally got to roll up our sleeves and get down to the business of deletion.

This summer was the perfect time for us to clean up the codebase, too, before our newest team member, Liam Andrew, joined us. This way, Liam wasn’t distracted by unused code, which eased his orientation to the codebase. We’re also gearing up for a few big projects on the horizon, so the cleanup helps us get ready to attack those. Our codebase feels fresh and clean now; although we removed only about three percent, it feels like more. Immersing ourselves in code deletion has also made us better programmers: once you’ve been in the weeds deleting unused code, you approach adding new code with a more critical eye and refactor as you work.

This is how light it feels

Much of the code that we deleted was replaced with something else that provided similar, but improved, functionality. For example, we removed the code powering our previous donation page when we went live with our new donations app, where people can become members and support our work. We also removed our homegrown Twitter app, which filled a large number of database tables to power the Twitter widgets on our site, and replaced it with the widgets Twitter provides, shifting that burden from our database and codebase onto theirs. In addition, we removed the paywall we had been housing inside our code for our premium Texas Weekly product and replaced it with a third-party paywall, Tinypass.

Some projects we archived and then deleted the code powering them. This process of deletion has deepened our appreciation of how important it is to think through how a project will be sunset once it’s no longer relevant. This is especially important for news organizations, where most stories are at least somewhat time-sensitive. Otherwise, projects hang around forever and ever and ever, and the code behind them breaks. Sunset plans allow you to create a pristine archive of your organization’s work, and they ensure old projects don’t keep you reliant on requirements and legacy code that could hold you back.

The beauty of having a sunset plan

When you’re removing code from a large project, especially code that was written by a previous developer, it can be scary to delete a significant number of lines. What if you forget to delete something related and leave behind orphaned code? Most frighteningly, what if you miss something dependent on the code you’re deleting and break parts of your site?
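One low-tech safeguard is to sweep the codebase for lingering references before a module is deleted. Here’s a minimal sketch of that idea in Python (a hypothetical helper, not the tooling we actually used; the module name and file extensions are placeholders):

```python
# Hypothetical pre-deletion check: before removing a module, list every
# remaining line in the codebase that still mentions it, so nothing is
# left orphaned or silently broken.
import os
import re

def find_references(root, module_name, extensions=(".py", ".html")):
    """Return (path, line_number, line) for each line mentioning module_name."""
    pattern = re.compile(r"\b" + re.escape(module_name) + r"\b")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="ignore") as f:
                for lineno, line in enumerate(f, start=1):
                    if pattern.search(line):
                        hits.append((path, lineno, line.rstrip()))
    return hits
```

A script like this won’t catch dynamic imports or template magic, which is exactly why the test suite and staging deploys described below still matter.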

Our test suite gave us some comfort when it came to removing this code. So did deploying and testing the changes in our staging environment, which closely mimics our production site. We also followed our policy of always having another team member comb through our code changes; another pair of eyes on your work is invaluable for catching the details you’d otherwise miss.

We’ve also stepped up our documentation game. We’d already started fattening up our internal wiki prior to all these deletions, and the deletions definitely kept the momentum going. I think of looking at legacy code as an archaeological dig: the more of the bones and tools you have, the better you can understand what you’re looking at. And if you understand how code was used, you’ll know what you can safely remove when that code’s no longer needed.

Don’t let your codebase be as mysterious as Machu Picchu

Have you recently deleted some code from your organization’s codebase, or another project you’re working on? Do you have some code you’ve been meaning to remove but haven’t found the time and space to get around to it yet? We’d love to hear the story of your deletions, any challenges you came across, and how it feels now if you’ve already deleted the code and lessened your technical debt!

Our Docker Journey

by Daniel Craigmile (@x110dc)

We’re fans of Docker at The Texas Tribune. We’ve been playing with it since well before the 1.0 release and immediately started incorporating it into our infrastructure when the first production-ready code was released in June of last year. Today Docker is the foundation for a large part of our platform. This is the story of how we got there.

We’re a small nonprofit news organization that reports primarily on state-level politics and policy. We have a main news site (www.texastribune.org), an op-ed site (tribtalk.org) and several other sites developed by our very capable News Apps team. Our environment is mostly a mix of legacy applications, static sites and frequently-created new apps. The main site is based on Django but also uses Ruby, Node and a slew of dependencies for each. How do you get the correct versions of all that software installed in a consistent, repeatable way? Docker. As a new developer, if you install the project by following the detailed instructions in the README, you’ll be operational in a matter of hours or days. Using Docker, it’s a matter of minutes.
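A Dockerfile for that kind of polyglot setup might look roughly like this (a sketch, not our actual file; the base image, package names, and versions are illustrative):

```dockerfile
# Illustrative only -- not the Tribune's actual Dockerfile.
FROM python:2.7

# System-level dependencies for the asset pipeline:
# Ruby for Sass, Node for Grunt.
RUN apt-get update && apt-get install -y ruby ruby-dev nodejs npm

# Each language's dependencies are pinned in its own manifest,
# so every developer gets identical versions.
WORKDIR /app
COPY requirements.txt Gemfile package.json /app/
RUN pip install -r requirements.txt && \
    gem install bundler && bundle install && \
    npm install

COPY . /app
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
```

The point is that the README’s multi-page install instructions collapse into `docker build` and `docker run`, and they produce the same environment on every machine.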

We started by just using Docker as a nicely isolated development environment, thinking of it as version management (think virtualenv or rvm) on steroids.

Gradually we started incorporating Docker in more ways. We use Jenkins for continuous integration and Rundeck for operational tasks like database backups, database refreshes and Django maintenance, among other things. For both Rundeck and Jenkins, the hosts are doing lots of distinct types of work for disparate projects, and each project has different requirements and software dependencies. Previously, we had to install software from every project on the boxes and hope there were no conflicts. That’s a false hope; there are always conflicts. Now, with Docker, everything is isolated, and nothing special has to be installed on the Jenkins or Rundeck hosts. Every task is run as an isolated Docker container. The hosts themselves can do one thing (and do it well): run our CI and our operations.
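In practice, a Rundeck or Jenkins job in this setup boils down to a one-off container run, something like the following (the image name and environment variable are hypothetical; this requires a Docker daemon on the host):

```shell
# Illustrative: run a database backup as a throwaway container.
# Nothing project-specific is installed on the Rundeck host itself;
# --rm discards the container when the task finishes.
docker run --rm \
  -e DATABASE_URL="$DATABASE_URL" \
  example/db-backup:latest
```

Each task brings its own dependencies inside its image, which is what makes the conflicts between projects disappear.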

The Docker Hub is part of our workflow and has been since the service was offered. We use the automated build feature. This means that as soon as a new commit is pushed to GitHub, Docker Hub begins building the images. We don’t need a current copy of the Git repository on our build boxes. Those images get pulled by Rundeck and Jenkins. We also link to parent images on the Hub. When a parent gets updated all child images get updated as well. That’s all done with no intervention on our part. We publish all of our open source Docker projects on the Hub so others can easily find, use and fork our work. (That’s 28 projects and counting!)

We did a series of workshops on Docker to get our platform and News Apps teams acquainted with it and started using it for their own projects. Even the presentations were powered by Docker!

We also started using it for ancillary services. We have a Docker image that does mail delivery. We plug in our Mandrill key and go. There’s no need to go through the arduous task of setting up Postfix, Sendmail, or anything else. Simply start this container and linked containers or hosts have instant mail delivery capability with very little fuss. Configuring email delivery with Jenkins can be a pain; now it’s not.

We have another image that just serves as an OAuth2 proxy. We can link it to any other container and — bam! — instant authentication. Docker runs high-performance applications, too, like our video aggregation and streaming service. Another image does Elasticsearch for us. It’s so much easier to run that container for applications that need Elasticsearch than going through the process of installing Java and Elasticsearch on top of it.
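In docker-compose terms (using the 2015-era v1 file format; the service and image names here are hypothetical), wiring an app to the mail and OAuth2 proxy containers looks something like this:

```yaml
# Illustrative docker-compose snippet -- image names are hypothetical.
app:
  build: .
  links:
    - mail        # app sees the relay at hostname "mail"
mail:
  image: example/mandrill-relay
  environment:
    - MANDRILL_KEY
oauth2proxy:
  image: example/oauth2-proxy
  links:
    - app         # proxy forwards authenticated requests to the app
  ports:
    - "443:4180"
```

The linked containers get instant mail delivery and authentication without the app image knowing anything about Postfix or OAuth internals.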

Our popular Government Salaries Explorer and our Texas Legislative Guide are both run entirely on Docker. It means that our News Apps team can deploy and maintain the applications without having to worry about the intricacies of Amazon Web Services autoscaling groups and elastic load balancers.

We containerize any greenfield or overhauled applications with Docker and run them that way in production.

Our primary site isn’t on Docker yet, but we’re moving in that direction. It’s a bit more difficult because there are many moving parts, and as our longest-running service, it’s harder and riskier to change. We’re also waiting to see how the quickly changing and very competitive landscape pans out with services offering orchestration, deployment and management. We’re looking closely at Amazon’s Elastic Container Service (and products that use it like Empire and Convox) as well as Kubernetes, Tutum, and Rancher.

Docker is already pervasive in our shop and has provided very tangible benefits in time saved and ease of use. We look forward to using it for even more!

Hello world

Bees on honeycomb made of hexagons

Welcome to The Texas Tribune platforms team tech blog! We’re a lean, four-person team made up of Director of Technology Amanda Krauss, System Architect Daniel Craigmile, Developer Kathryn Beaty, and Developer Liam Andrew.

We support the efforts of the fantastic Texas Tribune editorial, events, marketing, membership, and sponsorship teams. We work on the CMS and websites for The Texas Tribune and TribTalk, both of which are Django apps. We integrate third-party services such as Eventbrite and Mailchimp into the site. Since we have a small team, we all wear many hats and have a hand in everything from maintaining the servers and databases on the back end to building with tools like Sass, JavaScript, and Grunt on the front end.

Why is our blog called Notes from the Hexagon? A hexagonal desk sits in the center of our corner of the newsroom, and when we’re not typing away at our standing desks, we’re sitting around its six sides with our laptops. A few of our colleagues started calling our space The Hexagon, and we embraced the shape as part of our team identity. We recently got hexagonal power plugins. And we like that hexagons are found in nature in bees’ honeycombs, the shape of Petoskey stones, and more. It’s safe to say that we love hexagons, and so it was only natural to name our blog after them.

Our hexagonal power plugins

In our blog, we’ll take turns writing posts sharing insight into the projects we’re working on, the challenges we’re facing, the successes we’ve had, and more. We figure that if we’ve come across a challenge, chances are someone else is also trying to find their way through that same challenge. Plus, it only makes successes sweeter to share them!

We’d love to hear your feedback on our blog, so please leave a comment, tweet at us, or send an email. Thanks, and we hope you enjoy Notes from the Hexagon.