Incident Report - DNS changes blocked (May 20th, 2013)
On May 20th, beginning at about 11:30am Pacific time (18:30 UTC), we experienced a partial service outage. A component failure blocked our customers from updating their DNS records. We apologize for any trouble this caused DNSimple customers.
- Detected time: 11:40 PDT
- Resolved time: 23:43 PDT
- Severity: Partial outage. All customers who tried to change DNS records were affected.
Pros: Our alerting system mostly worked.
Cons: Our redirector service was out of sync and we were not alerted directly.
Long story short: we have a single point of failure around a MySQL database. Our name servers rely on MySQL replication to receive changes. At first it looked as though side effects of maintenance were causing a MySQL database to fail slowly. In fact, the physical server hosting the write version of this database was itself failing slowly. The failures first surfaced as high-latency responses and eventually turned into connection timeouts.
When I attempted to reboot the physical machine, it would not boot normally. Next I migrated the container to our Tokyo data center, restored it from a backup database, and then synced all the unicast name servers.
Unfortunately, I forgot that our redirection service also relies on working MySQL replication. It wasn't repaired until after a DNSimple customer reported the issue.
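The step that got missed during recovery can be written down as a checklist that is evaluated rather than remembered. The following is only a minimal sketch: the consumer names are taken from this post, while `REPLICATION_CONSUMERS` and `unverified_consumers` are hypothetical names made up for illustration, not part of our tooling.

```python
from typing import Iterable, List

# Hypothetical list: every service that depends on MySQL replication,
# as named in this post.
REPLICATION_CONSUMERS = ["unicast name servers", "redirection service"]


def unverified_consumers(
    verified: Iterable[str],
    consumers: Iterable[str] = REPLICATION_CONSUMERS,
) -> List[str]:
    """Return the consumers not yet confirmed in sync after a restore."""
    done = set(verified)
    return [name for name in consumers if name not in done]


# After the restore, only the name servers had been resynced.
print(unverified_consumers(["unicast name servers"]))
# prints ['redirection service']
```

A recovery would only count as finished once this list comes back empty.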
We will add alerts around our redirector service, and we will migrate MySQL to a high-availability configuration. Customers may also want to consider migrating to DNSimple's anycast network, which is currently in testing. It does not have this single point of failure.
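As a rough idea of what a direct alert on a replication consumer could check, here is a minimal sketch. It assumes the output of MySQL's `SHOW SLAVE STATUS` has already been fetched into a dictionary; the field names (`Slave_IO_Running`, `Slave_SQL_Running`, `Seconds_Behind_Master`) are MySQL's own, while the function name and the 60-second threshold are assumptions chosen for illustration and do not describe our actual monitoring.

```python
from typing import Dict, List, Optional


def replication_problems(
    status: Optional[Dict[str, object]],
    max_lag_seconds: int = 60,  # assumed threshold, tune per service
) -> List[str]:
    """Return human-readable problems for one replica; empty means healthy."""
    if not status:
        # SHOW SLAVE STATUS returns no row when replication is not set up.
        return ["replication is not configured"]

    problems: List[str] = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("I/O thread is not running")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread is not running")

    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MySQL reports NULL here when the lag cannot be determined.
        problems.append("replication lag is unknown")
    elif int(lag) > max_lag_seconds:
        problems.append(f"replica is {int(lag)}s behind (limit {max_lag_seconds}s)")
    return problems
```

Running a check like this against each replica the redirector reads from, and paging when the list is non-empty, would have flagged the stale redirector before a customer did.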
I like shiny things (it says so on my blog). I keep the machines humming at DNSimple.