One of the services that we provide at DNSimple is URL redirection. You enter a special record in your DNSimple DNS for a domain, and when we receive a DNS query for the A record of that name we return the IP address of our redirection service. If the next request is an HTTP request, it goes through our redirection service, which looks up the URL to redirect to, does a bit of URL manipulation to ensure that the path and query string are retained during the redirect, and then returns a 302 response to the client.
From Ruby to Clojure
From the beginning of DNSimple we were using a simple Ruby Rack application on Unicorn to provide the redirection. For several years it has worked very well, but as redirection traffic has grown so has the resource usage of the service. During the last few months we decided it was time to move from Ruby to something else. Initially I rewrote the application in Clojure, which was not too difficult; however, we waited to deploy the service because other tasks had higher priority. Recently we attempted to deploy the new Clojure application, but ran into some snags along the way. Most of the issues were likely due to our own deficiencies, but they stopped our progress nonetheless. More importantly, after re-evaluating the code that I had previously written for the Clojure application, I decided that the resulting service code was more difficult to understand than the original Ruby application. This was likely due to our relative inexperience with Clojure when compared to Ruby, but regardless, I could not ignore this fact. Ultimately I decided that I didn’t really want to put this Clojure service in production. Once it was in production we’d want to see it through to completion and live with it for a while.
From Clojure to Go
At this point I had been itching to try my hand at Go. Given that this redirection service was a well understood problem it seemed like a good first project to get Go into production. After about 3 hours of work I had a working Go version of the redirector. Translating from the Ruby version was not too difficult in this case because the Ruby version was already fairly procedural. The Go code turned out to be similar in size to the Ruby code, coming in at 184 lines compared to Ruby’s 119. This is likely due to the simple and singular purpose of the app (which is something to think about in general).
With this new redirector in hand and working locally, Darrin and I agreed to take it to production quickly.
I’ve found operating Go processes in production to be relatively painless. We use our own flavor of continuous delivery. Opscode’s Chef deploys our redirector service. When Chef runs, it syncs the source repository from GitHub and compiles a Go binary if there’s a code change. For process supervision we rely on the runit service resource. We use Foreman to generate a runit configuration from a Procfile and set several environment variables. Standard output/error from the Go process is logged to a rotating file by runit and ingested into Papertrail via remote_syslog. We also use monit to monitor things like HTTP responses and confirm resource utilization is within bounds. Should monit detect trouble, it’ll restart the process and alert an operator.
Bottom line: deployment was a breeze, and we had the application running in production shortly after deciding to do so.
To be clear, our first foray into Golang was not without issues. When we first deployed we found that the application memory usage was growing steadily and after a few hours the application would die. Naturally this was pretty disappointing, but given it was the first time we’d put Go into production, not too surprising. Initially I had used panic() in a couple of places, which I removed. The application was still dying, so after digging a bit I found that I had used log.Fatal without fully understanding what it does. I assumed it was a log level; however, it actually logs the message and then calls os.Exit(1). Whoops.
After fixing these basic issues, the application was still dying. The problem looked to be around opening syslog connections. After removing the syslog code we were still seeing issues. At this point we had narrowed the issue down to too many open file handles. A bit more digging and I found that the issue was around the default timeout values set by the Go HTTP server. The default settings allowed clients to use HTTP keep-alive to keep their connections open indefinitely, essentially resulting in an ever-growing collection of open file handles. Setting the read and write timeouts to a sufficiently low value stopped the problem completely. Since tracking down these issues, our Go code has been rock solid. It’s been running steadily in production with no issues for the last month. Both memory and CPU usage are at least an order of magnitude lower than the Ruby application.
The Bottom Line
Does this mean we’re switching everything over to Go? Not by a long shot. At DNSimple we use Ruby, Python, Erlang and Go for different purposes. We’ll likely continue doing this since it’s both fun and lets us have a wide range of tools at our disposal. I will definitely say that the resource usage of Go in comparison with something like Erlang or Ruby makes it attractive as a tool for certain classes of applications, and that you should give it a try if you haven’t already. We’ve already implemented another internal tool in Go and will likely use it for more internal services where memory and CPU usage are critical.