At DNSimple we're always looking to improve and make better use of what we have. Doing more with less is an unwritten motto among the team, whether it's something technical or a conference booth on a budget. I'd love to share the technical side of a change we recently made to how we test our Chef cookbooks.
When I first arrived at DNSimple, we had no tests for our infrastructure, which at the time was quite complex and spread out. Over the years we slowly added the first of many Test Kitchen test suites, which vastly improved the overall quality of our testing. One problem you quickly face, though, is: how do you test this in a CI environment like Travis-CI? Well, the answer was complicated.
Right out of the box, Test Kitchen supports testing via Vagrant and VirtualBox, which work fairly well. In tandem they provide about as good a simulation as you're going to get of starting up a server, converging it, and testing the result. On a modern system this is still relatively fast, since the OS is already installed and the images are optimized for testing with as minimal an OS install as possible. However, as you add test coverage and additional test cases, this time grows quickly. At one point our web app cookbook was taking close to 2 hours to complete. Can you imagine a feedback loop that slow when you want to test something?
Over the years we got smarter about our testing by cutting down dependencies, sticking to one platform, and removing complex things like deploying code via Chef. Don't do this, by the way; Chef has already removed it. Another optimization we started to see was the Chef community experimenting with Docker, which uses containers instead of full virtual machines for testing.
A couple of years ago Sean OMeara, a Chef employee at the time and now a Chef Community Award winner, wrote a plugin for Test Kitchen called Dokken. It took advantage of Docker in a way that allowed Test Kitchen to skip certain repetitive parts of the testing process. That process usually consisted of:
What Dokken shortcuts is the first two steps there. Dokken provides a pre-built Docker image with all the basics you need for most Chef testing, minus Chef itself. Chef is left out because, to make parallel testing better, Dokken first creates a Docker image with Chef Client installed that is shared between all the test suites on that platform. This way future builds bypass those first two steps and go right to the Converge step. When you factor in Docker's speed, this makes for a ridiculous speedup in your test suites. After we converted some of our projects over to it, we saw testing take half the time it did before.
Now, this new testing style doesn't come without its caveats, because of the assumptions it makes about how you test your Chef code. I'll break down each issue we hit in its own section.
We've written a custom Ohai plugin internally to detect our private networking setup, and it works really well for how small and simple it is. However, if you remember my earlier comment about Dokken mounting that shared Chef Client Docker image, you might see the problem. When Chef Client runs, it installs the Ohai plugin and executes it, but when you go to verify afterwards that the plugin was installed, you won't find it there. This is because the shared Chef installation also shares the Chef directories, which don't stick around in the image when you re-run it via the kitchen verify or kitchen login commands.
To help test this we picked up a trick from the ohai cookbook, which uses the Ohai data during the Chef run to write a file only if the node data exists. This won't exactly test the Ohai plugin itself, but it will verify that it runs and returns some of the data you expect.
Another thing we found very difficult to test is firewalls. Since Docker likes to mess with iptables and has its own ways of routing around the networking, you'll need to either disable these recipes or carefully review which rules your cookbook adds or removes. It can wreak havoc on your testing if you block the wrong thing.
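The guard behind that trick can be sketched in plain Ruby. This is a minimal illustration, not the ohai cookbook's actual code: the `node_data` hash stands in for Chef's node object, and the attribute key `private_network` and the marker file name are hypothetical. In a real recipe the same check would typically be expressed as a guard on a file resource.

```ruby
require 'tmpdir'

# Write a marker file only when the Ohai plugin populated the expected key.
# `node_data` is a plain Hash standing in for Chef's node object (assumption).
def write_marker_if_present(node_data, key, marker_path)
  return false unless node_data.key?(key)

  File.write(marker_path, "#{key}=#{node_data[key]}\n")
  true
end

# Example: simulate a run where the hypothetical plugin reported data.
Dir.mktmpdir do |dir|
  marker = File.join(dir, 'ohai_plugin_ran')
  wrote = write_marker_if_present({ 'private_network' => '10.0.0.5' }, 'private_network', marker)
  puts "marker written: #{wrote}"
end
```

A verify step can then assert that the marker file exists, which shows the plugin ran and produced data even though the plugin file itself is gone by the time verification happens.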
One trick we depended on is an environment variable usually provided by Test Kitchen called TEST_KITCHEN, which is true during a Test Kitchen run. Because of the way Dokken invokes Chef and Docker, this variable isn't available right now, and my attempt to fix that didn't succeed. Our workaround was to change all those checks to look for the _default Chef environment instead, which is what you get unless you specify a different environment in Test Kitchen.
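A check along these lines can be sketched as a small Ruby helper. This is an illustration under assumptions: the helper name `kitchen_run?` is hypothetical, and in a real recipe the environment name would come from `node.chef_environment` rather than a literal string.

```ruby
# Hypothetical helper: treat a run in the `_default` Chef environment as a
# Test Kitchen run, since `_default` applies unless another one is specified.
def kitchen_run?(chef_environment)
  chef_environment.to_s == '_default'
end

# In a recipe the argument would be node.chef_environment (assumption about
# how the helper is wired in); plain strings are used here so it runs anywhere.
%w[_default production].each do |env|
  action = kitchen_run?(env) ? 'skip production-only resources' : 'run everything'
  puts "#{env}: #{action}"
end
```

One trade-off worth noting: any real node left in `_default` would also be treated as a test run, so this approach assumes production nodes are always assigned a named environment.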
IPv6 networking works in a weird way in the Dokken images, and we have not fully investigated how to test it, so our IPv6 verifications simply don't work. The commands run just fine, but the ip command, at least on Linux, won't actually create the IPv6 networking you configure in Chef.
Chef ships Dokken in their new Chef Workstation product, which bundles the Chef Development Kit along with a lot of handy plugins and tools to get you started in the Chef world quickly and efficiently. I'd highly recommend you give it a shot. It's a significant speed boost, and even when you factor in the issues I mentioned earlier, it's generally a win for most test cases. The README for Dokken is a great place to start. It has a lot of information about how to configure Test Kitchen to use it, and what it's doing under the hood to test everything. We also use Dokken on our public Chef cookbook, which you can use to automate your domain and SSL certificate management with Chef.