When I first joined dnsimple, almost three years ago now, we had a few internal systems and services, all controlled by Chef. For those who don't know, Chef is an automation framework written in Ruby that helps you automate your server infrastructure and software configuration. Because it's written in Ruby, it can do virtually anything you desire, which gives it a lot of power and flexibility. That same power and flexibility make it easy to succumb to anti-patterns. Suddenly Chef is a hammer that makes every problem with your infrastructure look like a nail.

Where it started

Back then, as today, our internal systems were written in Go, Ruby, and Erlang, and deployed with Chef. Releasing a new version of our name server or redirector service meant following a set of compilation and deployment tasks. It was a tedious, annoying process that felt like a prime candidate for automation with Chef. Automating it in Chef was the easy part, but maintaining it created a whole new set of problems for us.

Compile all the things!

The first problems cropped up whenever any step of the compilation or code update process failed on us. Imagine you're missing a library that's not in your Chef recipe, or the source package changes slightly in its build process. You're now stuck in a new circle of hell where Chef can't run and you're debugging instead of deploying. For us this even caused some outages: services were not properly deployed, and Chef was not smart enough to detect possible issues in the compilation steps.

Compile, THEN package all the things!

Our solution was to automate our build process outside of Chef. Chef is great at handling configuration changes, but that's about it. With this in mind we started to use a tool called FPM (short for eFfing Package Management), which provides a friendly API for building compiled system packages for package managers like Apt on the Debian and Ubuntu platforms, or RPM on Red Hat and CentOS platforms. This gave us a predictable compilation step and control over where and how new releases are built. Chef can now focus on what it does best: updating configuration files from templates and signaling services when those files change. We can also test this infrastructure faster because no time is spent compiling via Chef.
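To make "updating templates and signaling services on their change" concrete, here is a minimal plain-Ruby sketch of the idea. It is not Chef code and not how Chef implements it; render_config and the template contents are hypothetical names made up for illustration. The point is only that a file is rewritten when its rendered content differs, and that difference is what would trigger a service reload.

```ruby
require 'erb'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not a Chef API): render an ERB template and write it
# only when the result differs from what is already on disk.
# Returns true when the file changed, false when it was already up to date.
def render_config(template, path, vars)
  rendered = ERB.new(template).result_with_hash(vars)
  return false if File.exist?(path) && File.read(path) == rendered

  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, rendered)
  true
end

# Usage against a throwaway directory, so nothing real is touched.
Dir.mktmpdir do |dir|
  path     = File.join(dir, 'nginx.conf')
  template = "worker_processes <%= workers %>;\n"

  puts render_config(template, path, { workers: 4 }) # => true  (file created)
  puts render_config(template, path, { workers: 4 }) # => false (nothing to signal)
  puts render_config(template, path, { workers: 8 }) # => true  (a reload would be due)
end
```

A run that changes nothing signals nothing, which is what keeps repeated configuration runs cheap once compilation is out of the picture.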

Another benefit is that you can build once, then deploy the same artifact multiple times. The time savings alone were already a major win for us. For example, we used to compile Ruby every time we deployed a new system or Ruby version. Even on a fast machine back then, that meant at least 5 minutes of compile time alone, and this no longer happens as part of a Chef run.
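As a rough worked example of that saving, the sketch below compares compiling on every deploy with building a package once. The 5 minutes comes from the figure above; the number of deploys is a hypothetical value chosen only for illustration.

```ruby
# "At least 5 minutes" of compile time per build, as mentioned above.
COMPILE_MINUTES = 5

# Total minutes spent compiling across a number of deploys.
# With build_once, the package is compiled a single time and reused.
def compile_minutes(deploys, build_once:)
  builds = build_once ? [deploys, 1].min : deploys
  builds * COMPILE_MINUTES
end

deploys = 20 # hypothetical number of systems or redeploys

puts compile_minutes(deploys, build_once: false) # => 100
puts compile_minutes(deploys, build_once: true)  # => 5
```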

To move to this new packaging framework, we used another library called fpm-cookery, which builds on FPM by providing an easy-to-write, easy-to-follow DSL for packaging our software. Here is an example recipe for nginx:

require 'mkmf'

class Nginx < FPM::Cookery::Recipe
  description 'a high performance web server and a reverse proxy server'

  name     'nginx'
  version  '1.8.1'
  revision 5
  homepage 'http://nginx.org/'
  source   "http://nginx.org/download/nginx-#{version}.tar.gz"
  sha256   '8f4b3c630966c044ec72715754334d1fdf741caa1d5795fb4646c27d09f797b7'

  section 'httpd'

  build_depends 'libpcre3-dev', 'zlib1g-dev', 'libssl-dev'
  depends       'libpcre3', 'zlib1g', 'libssl1.0.0'

  provides  'nginx-full', 'nginx-common'
  replaces  'nginx-full', 'nginx-common'
  conflicts 'nginx-full', 'nginx-common'

  config_files '/etc/nginx/nginx.conf', '/etc/nginx/mime.types'

  def build
    configure \
      '--with-http_stub_status_module',
      '--with-http_ssl_module',
      '--with-http_gzip_static_module',
      '--with-pcre-jit',
      '--with-debug',
      '--with-ipv6',

      prefix: prefix,

      user: 'www-data',
      group: 'www-data',

      pid_path: '/var/run/nginx.pid',
      lock_path: '/var/lock/nginx.lock',
      conf_path: '/etc/nginx/nginx.conf',
      http_log_path: '/var/log/nginx/access.log',
      error_log_path: '/var/log/nginx/error.log',
      http_proxy_temp_path: '/var/lib/nginx/proxy',
      http_fastcgi_temp_path: '/var/lib/nginx/fastcgi',
      http_client_body_temp_path: '/var/lib/nginx/body',
      http_uwsgi_temp_path: '/var/lib/nginx/uwsgi',
      http_scgi_temp_path: '/var/lib/nginx/scgi'

    make
  end

  def install
    # config files
    (etc/'nginx').install Dir['conf/*']

    # server
    sbin.install Dir['objs/nginx']

    # man page
    man8.install Dir['objs/nginx.8']
    gzip_path = find_executable 'gzip'
    safesystem gzip_path, man8/'nginx.8'

    # support dirs
    %w( run lock log/nginx lib/nginx ).each do |dir|
      (var/dir).mkpath
    end
  end
end

FPM Cookery elegantly lets you describe the build steps for making your package, which are then handed off to FPM itself. What would normally take a lot of custom back and forth is greatly simplified with this easy-to-follow interface in Ruby. If you want to see more recipes for some common software out there, check out this GitHub repository.
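To give a feel for what the recipe saves you from writing, here is a sketch that assembles a roughly equivalent hand-written fpm command from the same metadata. This is an illustration only: it is not how fpm-cookery works internally, fpm_command and the staging directory are hypothetical, and it assumes the software has already been compiled and installed into that staging directory.

```ruby
require 'shellwords'

# Hypothetical helper: turn package metadata into an fpm argument list
# that packages an already-built directory tree as a .deb.
def fpm_command(meta, staging_dir)
  cmd = ['fpm', '-s', 'dir', '-t', 'deb',
         '-n', meta[:name],
         '-v', meta[:version],
         '--iteration', meta[:revision].to_s,
         '--description', meta[:description],
         '--url', meta[:homepage]]

  meta[:depends].each      { |dep|  cmd.push('-d', dep) }
  meta[:provides].each     { |pkg|  cmd.push('--provides', pkg) }
  meta[:replaces].each     { |pkg|  cmd.push('--replaces', pkg) }
  meta[:conflicts].each    { |pkg|  cmd.push('--conflicts', pkg) }
  meta[:config_files].each { |file| cmd.push('--config-files', file) }

  # Package everything found under the staging directory.
  cmd.push('-C', staging_dir, '.')
end

nginx = {
  name: 'nginx', version: '1.8.1', revision: 5,
  description: 'a high performance web server and a reverse proxy server',
  homepage: 'http://nginx.org/',
  depends: %w[libpcre3 zlib1g libssl1.0.0],
  provides: %w[nginx-full nginx-common],
  replaces: %w[nginx-full nginx-common],
  conflicts: %w[nginx-full nginx-common],
  config_files: %w[/etc/nginx/nginx.conf /etc/nginx/mime.types]
}

# Print the command instead of running it; '/tmp/nginx-staging' is made up.
puts fpm_command(nginx, '/tmp/nginx-staging').shelljoin
```

Even for a single package the flag list gets long quickly, and it still leaves out downloading, checksum verification, and the configure and make steps that the recipe also covers.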

In the above example, if I run fpm-cook, it will load the recipe.rb file in my current directory and build it. Keep in mind that if you're an OS X user building Debian packages, for example, you'll need to run this from within a Linux virtual machine, since it requires the Debian package build tools. You can build a virtual machine and mount your project directory as a shared folder to get builds going quickly locally, and then run the same build in a CI service like Travis-CI.

Distributing your new packages

The missing piece of the packaging puzzle came to us in the form of a hosting company called Packagecloud. They can host and distribute packages for Apt, RPM, RubyGems, etc. Once we built the packages, we would upload them to Packagecloud, and they would then be available via native operating system package managers. We were even able to automate adding the repository they provided via their Chef cookbook.

Since we also use Travis-CI, it was really quick and simple to set up the build steps. They even have a ready-to-go integration with Packagecloud that uploads your built package to their service. We use a feature of this integration to ship a new package build only when a new git tag is pushed. This lets us tweak and try new build options across version numbers and control when we actually release a package to their service for all of our servers to consume.
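The tag-only release rule can be sketched in a few lines of Ruby. This is not the Travis-CI and Packagecloud integration itself, which handles the rule on its own; publish_package? is a hypothetical helper, and requiring the tag to match the recipe version (with an optional leading "v") is an added assumption, not something the integration is known to enforce. TRAVIS_TAG is the environment variable Travis-CI sets to the tag name on tag builds and leaves empty otherwise.

```ruby
# Hypothetical helper: decide whether this CI build should publish a package.
# env is anything hash-like, for example ENV inside a CI job.
def publish_package?(env, recipe_version)
  tag = env['TRAVIS_TAG'].to_s.strip
  return false if tag.empty? # ordinary branch build: build and test only

  # Assumed convention: tags look like "1.8.1" or "v1.8.1".
  tag.sub(/\Av/, '') == recipe_version
end

puts publish_package?({ 'TRAVIS_TAG' => '' }, '1.8.1')       # => false
puts publish_package?({ 'TRAVIS_TAG' => 'v1.8.1' }, '1.8.1') # => true
puts publish_package?({ 'TRAVIS_TAG' => 'v1.9.0' }, '1.8.1') # => false
```

Every push still exercises the build, so a broken recipe is caught early, while only deliberate tags produce something the servers can install.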

Looking ahead

Now that we have taken compilation out of the equation in our stack, we're back to focusing on deployment. A typical Chef run to deploy a brand new dnsimple web app system used to take over an hour, but we now have this down to a few minutes or less. We're even looking into the next possible evolution of this system with Chef's brand new Habitat framework. Habitat takes what we would normally do with FPM and makes the app being deployed the new center of the universe. Thanks to our efforts with packaging, moving to Habitat is easier than ever, and I look forward to giving it a try soon. Another option that might work for us is Omnibus packaging, which compiles a runtime like Ruby or Erlang together with your app, making it more of a self-contained environment. Your Ruby gems or BEAM files are already compiled and ready to go, deployable in seconds versus minutes or hours.

I hope this post has illustrated the power of packaging and how it has saved us a lot of time and headache. Speed is an obvious benefit, but consistency in deployment and configuration is a major benefit too. Predictable behavior is a large part of why we use tools like configuration management to organize and configure all of our systems. The more variables we can control, the easier things are to test, verify, and, even more importantly, maintain in the long run. This makes your customers happier because you can deliver features faster and more reliably while keeping the lights on.