
Why aren't you using the Chef Accumulator Pattern?

Aaron Kalin on Oct 3, 2017

A somewhat recent development in the Chef Community is the idea of accumulators. Have you ever had a resource that would be the computed end result of a bunch of other cookbooks calling upon it? Say you had a single network interface file, but a lot of other cookbooks want to all modify this same file. This is where the accumulator pattern comes in handy to "accumulate" the changes before outputting the resulting file. The means for expressing this in Chef has been around for a little while, but it recently got a lot better and I'd like to show you how we used it to wrap another cookbook's custom resource.

A cloning problem

An internal library cookbook we have deals exclusively with how we gather and collect metrics for DataDog by calling to the DataDog Chef Cookbook. We use their provided monitor resource and what it does, in a nutshell, is write a template file to the configuration directory for a given check for their agent to report data. How this resource works is actually part of the problem, so let me explain with a concrete example of the issue.

Let's say we want to have our website HTTP response time checked by DataDog. A way we can write this in, for example, a wrapper cookbook called dnsimple-web would be:

# In dnsimple-web::default recipe
datadog_monitor 'http_check' do
  instances(
    [
      {
        'name' => 'DNSimple Web',
        'url' => 'https://dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

The above sets up an HTTP check configuration for the DataDog agent, which picks it up and restarts accordingly. All seems well, right? Now let's add another HTTP service that handles API traffic alongside the main site, via a dnsimple-api cookbook. You might write it like so:

# In dnsimple-api::default recipe
datadog_monitor 'http_check' do
  instances(
    [
      {
        'name' => 'DNSimple API',
        'url' => 'https://api.dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

This cookbook is in the same run list, on the same server, as the dnsimple-web cookbook. So to Chef, the resources to run look like this:

# From dnsimple-web::default recipe
datadog_monitor 'http_check' do
  instances(
    [
      {
        'name' => 'DNSimple Web',
        'url' => 'https://dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

# From dnsimple-api::default recipe
datadog_monitor 'http_check' do
  instances(
    [
      {
        'name' => 'DNSimple API',
        'url' => 'https://api.dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

What looks wrong about this picture? If you guessed that we have two datadog_monitor resources named "http_check", you are correct! This behavior is called resource cloning, and depending on which version of Chef Client you are running, the above code causes issues in different ways. If you're on Chef 12, the Chef Client will attempt to merge the two resources by cloning the attributes of the first into the second. In Chef 13, this behavior has been removed, so we end up with one HTTP check configuration that contains only the API check, because it was the last resource to be evaluated in the Chef run.
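To make the Chef 13 outcome concrete, here is a minimal plain-Ruby sketch. It does not use Chef at all: the `rendered_files` hash and the `declarations` list are hypothetical stand-ins for "each declaration renders the whole file for its check", which is an assumption about how the overwrite plays out rather than a description of Chef internals.

```ruby
# Hypothetical model: every declaration renders the complete file for its
# check name, so a later declaration replaces what an earlier one wrote.
rendered_files = {}

declarations = [
  { check: 'http_check',
    instances: [{ 'name' => 'DNSimple Web', 'url' => 'https://dnsimple.com' }] },
  { check: 'http_check',
    instances: [{ 'name' => 'DNSimple API', 'url' => 'https://api.dnsimple.com' }] }
]

declarations.each do |declaration|
  # Same check name means same target file, so this assignment overwrites.
  rendered_files["#{declaration[:check]}.yaml"] = declaration[:instances]
end

surviving_names = rendered_files['http_check.yaml'].map { |instance| instance['name'] }
puts surviving_names.inspect # prints ["DNSimple API"]
```

Only the API entry survives, which is the symptom described above: the web check silently disappears from the agent configuration.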

Accumulating checks

The above issue is quite the headache for anyone hoping to get a consistent state of configuration files for similar resources. So how can we solve this problem? Enter the Accumulator Pattern. Accumulators are nothing new in computer science. Have you ever written or seen a loop like this in Ruby?

x = 0
while x < 5 do
  x += 1
end

The x in this case is the accumulator. We "accumulate" a value into it via the loop. While this is a trivial example, it illustrates the technique we are going for to prevent the resource cloning issue. The approach we took was to write a new Chef 12.10+ resource that "wraps" the DataDog resource and accumulates the instances property of the datadog_monitor resource instead of overwriting it.

property :check_name, String, name_property: true
property :init_config, Hash, required: true, default: {}
property :instances, Array, required: true, default: []
property :generic, [true, false], default: true

action :create do
  with_run_context :root do
    check_property = new_resource
    edit_resource(:datadog_monitor, check_property.check_name) do
      instances(instances + check_property.instances)
      init_config(init_config.merge(check_property.init_config))
      use_integration_template(check_property.generic)
      action :nothing
      delayed_action :add
    end
  end
end

The above code leverages edit_resource, a method introduced in Chef Client 12.10 that lets you modify or create a resource based on its name in the resource collection. We also use with_run_context to modify the final collection of resources accumulated during the compile phase of Chef. Every time you write out a file or directory in Chef, that resource is added to a list of resources to execute during the second phase of the Chef Client run, called the run phase. With edit_resource and with_run_context we're able to directly manipulate this collection and add or modify values before the run phase occurs. This is where the accumulator pattern comes in, so let me walk through how we use it to our advantage.

with_run_context :root do

We start by calling up to the "root" run context. By default, the resources you declare inside a custom resource execute in their own "context" to prevent the resource cloning issue. This is why we need to manipulate the root context, which is where the finalized collection lives.

  check_property = new_resource
  edit_resource(:datadog_monitor, check_property.check_name) do

Here we keep a reference to new_resource, the method that gives us the values Chef is currently compiling for our wrapper resource. This way we can merge the new resource's values with all of the existing ones we find, no matter which cookbook they happen to be defined in. This is why we called up to the "root" earlier: it lets us edit or create the global copy of this resource. The edit_resource call looks in the collection for a datadog_monitor resource whose name matches the check name given in our own wrapper resource. This lets us avoid creating a resource cloning issue ourselves. Say we had the two HTTP check resources mentioned earlier: we'd name them website_http and api_http, but both would set http_check as the check_name property, so they reference the same DataDog monitor resource in the end.

  instances(instances + check_property.instances)
  init_config(init_config.merge(check_property.init_config))

Here is where the other magic happens. We directly manipulate the instances and init_config properties of the DataDog cookbook's datadog_monitor resource, performing an array merge and a hash merge respectively. The instances property in this case is an array of hashes, so we "accumulate" the new resource data onto the existing one. If one does not exist, then it is initialized with these new values because the property default is an empty array.
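The find-or-create and merge steps can be tried outside of Chef with a small plain-Ruby model. Everything here is a hypothetical stand-in: `MonitorStub`, `RootCollectionStub`, `edit_monitor` and `accumulate_check` are illustrative names, not part of Chef or the DataDog cookbook, and the `'timeout'` setting is made up for the example.

```ruby
# Plain-Ruby stand-in for a datadog_monitor resource (not Chef's real class).
MonitorStub = Struct.new(:name, :instances, :init_config)

# Stand-in for the root resource collection.
class RootCollectionStub
  def initialize
    @monitors = {}
  end

  # Mimics the find-or-create behavior described for edit_resource:
  # reuse the monitor with this name, or create an empty one first.
  def edit_monitor(name)
    monitor = (@monitors[name] ||= MonitorStub.new(name, [], {}))
    yield monitor
    monitor
  end
end

# Hypothetical helper playing the role of the wrapper's :create action.
def accumulate_check(root, check_name, instances: [], init_config: {})
  root.edit_monitor(check_name) do |monitor|
    # Array concatenation keeps earlier instances and appends the new ones.
    monitor.instances = monitor.instances + instances
    # Hash merge keeps earlier keys; new values win on a key collision.
    monitor.init_config = monitor.init_config.merge(init_config)
  end
end

root = RootCollectionStub.new
accumulate_check(root, 'http_check', instances: [{ 'name' => 'DNSimple Web' }])
monitor = accumulate_check(root, 'http_check',
                           instances: [{ 'name' => 'DNSimple API' }],
                           init_config: { 'timeout' => 5 })

puts monitor.instances.map { |instance| instance['name'] }.inspect
# prints ["DNSimple Web", "DNSimple API"]
```

Note that `merge` lets a later caller override an earlier `init_config` key, so in this model the order in which cookbooks are evaluated still matters for init_config even though instances simply pile up.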

So now, we can do this:

dnsimple_metrics_check 'dnsimple_web_https_check' do
  # Both checks share this check_name so they accumulate into one monitor
  check_name 'http_check'
  instances(
    [
      {
        'name' => 'DNSimple Web',
        'url' => 'https://dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

dnsimple_metrics_check 'dnsimple_api_https_check' do
  check_name 'http_check'
  instances(
    [
      {
        'name' => 'DNSimple API',
        'url' => 'https://api.dnsimple.com',
        'collect_response_time' => true
      }
    ]
  )
end

And the result will be a single http_check.yaml file with both of those checks defined inside, rather than just the API check that happened to be evaluated last in the Chef run.
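As a rough picture of what that file might contain, the sketch below dumps the accumulated data with Ruby's standard YAML library. This is only an approximation: the real file is rendered by the DataDog cookbook's own template, so the exact layout and any extra keys are assumptions here, and the `accumulated` hash is a hypothetical stand-in for the merged resource properties.

```ruby
require 'yaml'

# Hypothetical stand-in for the merged init_config and instances properties.
accumulated = {
  'init_config' => {},
  'instances' => [
    { 'name' => 'DNSimple Web', 'url' => 'https://dnsimple.com',
      'collect_response_time' => true },
    { 'name' => 'DNSimple API', 'url' => 'https://api.dnsimple.com',
      'collect_response_time' => true }
  ]
}

# Serialize to YAML text, roughly what an agent check file would hold.
yaml_text = YAML.dump(accumulated)
puts yaml_text

# Parse it back to confirm both instances made it into the document.
round_trip = YAML.safe_load(yaml_text)
puts round_trip['instances'].length
```

Reading the generated file back like this is also a cheap way to verify on a test node that both checks actually landed in the configuration.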

Conclusion

The accumulator pattern is a little weird, but it works. We used it for monitoring, but you can also see this pattern out in the wild with HAProxy and Samba, where it is used to manipulate the same resource across multiple cookbooks. The pattern has actually existed in Chef in other, cruder forms, which have led to very weird converge issues. Those approaches leveraged internal Chef APIs or hidden features like runtime attributes, which aren't the most predictable things. Thanks to a lot of effort from Chef, the pattern continues to improve as they push forward with newer releases and explore it further.
