DNSimple has been using Chef for years, and our testing story has evolved along with it. In that time, we've watched the community go from manual, basic testing to fully automated testing, and our own strategy has slowly evolved with it. I'd like to show you where we started, where we are now, and where we're headed with our testing in Chef.

In the beginning…

At least five years ago, in the early days of Chef, testing was a manual and very painful process. You wrote your cookbook against the resource references in the wiki (way before the nice docs we have now) or the Ruby docs, and hoped you got it right. There might have been a syntax error or a bug in a resource, but you wouldn't really know. You found out either live on a server as you ran Chef interactively, or on a special virtual machine you built with a copy of Chef and a shared folder with your workstation, where you could test your changes and cross your fingers.

Back then, DNSimple didn't really have a testing story. I wasn't there when the first lines of Chef code were written for our infrastructure, but testing was manual. We had our own Packer template to build an approximation of our software installs, which at least automated building a test VM for running Chef. Around the time I joined, the invention of Test Kitchen and its merge with Jamie-CI gave Chef a mostly automated method of testing your Chef code.

Test Kitchen wasn't the only game in town for testing. Test Kitchen was great for integration-level testing: making sure what you wrote actually ran and consistently built out what you expected on your system. When you wanted even faster feedback, there was ChefSpec, a framework built on top of Ruby's RSpec library for testing your Chef code as Ruby. You created fake Chef runs that compiled your code, then asserted that the resources would be changed to the state you expected. It didn't actually run your Chef code on a system; that was what Test Kitchen was for. We, like many others in the Chef community, fell into the trap of writing too much ChefSpec code. When things broke, ChefSpec wouldn't catch it. Maybe Test Kitchen did when a run would crash, but it clearly didn't do us a lot of favors.
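As a sketch of what that looked like, here's a minimal ChefSpec test, assuming a hypothetical postgres cookbook whose default recipe installs the postgresql package:

describe 'postgres::default' do
  let(:chef_run) { ChefSpec::SoloRunner.new.converge(described_recipe) }

  it 'installs postgresql' do
    expect(chef_run).to install_package('postgresql')
  end
end

This compiles the recipe in memory and asserts against the resource collection; nothing is actually installed anywhere, which is exactly why it's fast and exactly why it can miss real breakage.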

Running Test Kitchen was just the beginning. Back then, you had a few options for verifying that what you created in Test Kitchen actually did the thing you expected. There were several frameworks for this, including BATS and ServerSpec, and at one point we had test suites written in both. I favored ServerSpec because it was very close to ChefSpec and had easier-to-understand output, even when a failure happened. However, maintaining two styles of tests in our Chef code wasn't working well for us: BATS was all written in Bash, while ServerSpec was all Ruby.
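To give a feel for the split, here's roughly what the same hypothetical check looked like in each. In BATS, as a Bash test:

@test "postgres is listening on port 5432" {
  run netstat -tln
  [ "$status" -eq 0 ]
  [[ "$output" == *":5432"* ]]
}

And in ServerSpec, as Ruby:

describe port(5432) do
  it { should be_listening }
end

Two languages, two styles of output, one team maintaining both.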

And then there was InSpec

A few years ago, Chef acquired VulcanoSec and introduced a brand new tool called InSpec. While it was originally billed as a tool to automate compliance testing, it morphed into a replacement for ServerSpec. Chef finally had a way to write automated tests to verify that what you ran in Chef worked as expected. As an added bonus, you could take the same tests from your kitchen and run them against live servers to verify that they, too, were running what you expected.
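For example, the same profile you run under Test Kitchen can be pointed at a live machine over SSH (the host, user, and key here are placeholders):

inspec exec test/integration/default -t ssh://deploy@db1.example.com -i ~/.ssh/id_rsa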

Slowly, we started migrating our BATS and ServerSpec tests over to InSpec. It helped that the InSpec syntax was remarkably close to ServerSpec; in many cases we could simply rename the ServerSpec tests and we'd be good to go. While not perfect, it made it quick to get going. However, there was some initial confusion about where exactly to place the test files for Test Kitchen. Most of the time in Test Kitchen, everything goes under the test/integration folder to denote that these are integration-level tests for Test Kitchen to run after everything converges. At one point the convention was going to be test/smoke or test/inspec, but ultimately it settled on test/integration.

There are a few options when it comes to how you write your InSpec. We settled on the "should"-based syntax, which flies in the face of modern RSpec convention, even though InSpec is an extension of the RSpec testing framework. While it's possible to use the RSpec "expect" syntax, it doesn't seem to be widely used in the community. While we aren't compliance auditors, Chef notes that the "should" syntax is easier for auditors to read, and is therefore the better-supported syntax. The "expect" syntax is aimed more at developers who are already familiar with RSpec.
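The difference is easiest to see side by side. Here's the same hypothetical port check in each style:

# "should" syntax
describe port(5432) do
  it { should be_listening }
end

# "expect" syntax
describe port(5432) do
  it 'is listening' do
    expect(subject).to be_listening
  end
end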

Another option is whether to write your InSpec tests as a simple set of tests or as "controls", as they're called. One is a simple file with _spec in the filename, while the other comes with a special folder structure and a YAML file describing the profile. We started with the simple approach, since you can more or less move your ServerSpec tests to this format immediately. However, we eventually migrated to the controls-based approach. Here's an example of one versus the other:

A basic spec-based test would look like this:

describe port(5432) do
  it { should be_listening }
  its('processes') { should include 'postgres' }
end

You would put it in your cookbook's test directory, like so: test/integration/default/default_spec.rb. In your kitchen.yml file, you'd make sure you have:

verifier:
  name: inspec

This tells Test Kitchen to run InSpec against your nodes after they have successfully converged, instead of BATS or ServerSpec, even if those files are present in your test directory.
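With that in place, a typical cycle looks something like this (the instance name depends on your suite and platform names):

kitchen converge default-ubuntu-1804   # build the instance and run Chef
kitchen verify default-ubuntu-1804     # run the InSpec tests against it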

To evolve your setup into a control, you can simply move that file into a new directory inside the test folder, in a path that matches the name of the suite in your Test Kitchen configuration. Every Test Kitchen configuration is generated with "default" as the suite name, so I'll keep using that in the example, as above.

You'll make a new folder under test/integration/default called "controls" and move your default_spec.rb file in there. Inside the test/integration/default folder, you'll put an inspec.yml file with something like the following:

name: default
title: Default DNSimple Postgres Server
summary: Verify that postgres is setup and configured for DNSimple
supports:
  - os-family: linux

This is a pretty basic profile file, but you have a lot of options here for further describing your suite of controls, which InSpec calls a "profile". It'll help document your test suite as a whole once you migrate these profiles to an Automate server, but more on that later.
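Put together, the layout for the default suite ends up looking like this:

test/
└── integration/
    └── default/
        ├── inspec.yml
        └── controls/
            └── default_spec.rb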

Inside your moved default_spec.rb file, you can now wrap your describe block in a control block:

control 'postgresql-dnsimple-baseline' do
  impact 1.0
  title 'Postgres for DNSimple'
  desc '
    This test ensures the baseline postgresql installation for DNSimple is correct.
  '

  describe port(5432) do
    it { should be_listening }
    its('processes') { should include 'postgres' }
  end
end

A control describes a test or group of tests centered around compliance. We leverage this one to verify that our Postgres server is running and listening on the expected port. You can go further in this control, or another one, by connecting to the instance and verifying that a user exists, for example. If an InSpec resource exists for it, you can test for it in a control, just like with ServerSpec or BATS. It also has the benefit of clear output when things pass or fail. Self-documenting code is always a benefit, and it's easy to make that happen for your tests, too.
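As a sketch of that idea, a follow-on control could use InSpec's postgres_session resource to check that a role exists; the credentials and role name here are placeholders:

control 'postgresql-dnsimple-roles' do
  impact 0.7
  title 'Postgres roles for DNSimple'

  describe postgres_session('postgres', 'postgres').query("SELECT 1 FROM pg_roles WHERE rolname='dnsimple';") do
    its('output') { should eq '1' }
  end
end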

The future of our InSpec

We're a few years into the InSpec migration. Everything's running beautifully, but there's still more work to be done. I mentioned earlier that we leverage Test Kitchen to verify our test systems are configured correctly, but what about our currently running systems? That's where we can leverage the compliance side of things. The next step for us is to move the InSpec profiles from the cookbooks themselves to a central folder in our Chef repo, upload them as profiles to our Automate instance, and run them against our infrastructure continuously. We haven't started this yet, but we believe it'll provide more value by letting us build the tests first for what we expect to be configured on a system, then build the cookbook change that makes the compliance test pass. It'll de-couple the testing cycle from the development and deployment cycle, and let us apply these tests against running systems as well as test ones.
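The rough shape of that workflow, assuming the inspec compliance plugin and placeholder values for the server URL, token, and profile path, would be:

inspec compliance login https://automate.example.com --user admin --token TOKEN
inspec compliance upload compliance/profiles/postgres-baseline

Once uploaded, Automate can scan nodes against those profiles on an ongoing schedule rather than only at converge time.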

Start your Inspec journey

Testing in Chef is still pretty awesome. It's evolved over the years with better and better features and documentation. It's not without its pain points, but luckily there's a renewed effort to make ChefSpec more useful for custom resources in cookbooks with its recent release. With ChefSpec and InSpec, you and your team will have much better confidence in your infrastructure changes.

If you aren't testing yet, it's incredibly easy to get started. Chef has a free tutorial to get you going. If you already test, you may benefit from switching your integration test suite to InSpec: begin by checking the basics, and build from there. If you want a more advanced example of an InSpec control, complete with a custom matcher, check out our official Chef cookbook and look in the test folder.