Post Incident Report for Partial DNS Outage on November 1st, 2020
On Sunday, November 1st, 2020, a zone managed by DNSimple was targeted with a Distributed Denial of Service (DDoS) attack. The attack disrupted several of our data centers, leaving some servers unable to respond to DNS queries. It resembled other attacks we have been seeing more frequently, which use randomized subdomain traffic. Unlike the attack we recently experienced, this one was aimed at a zone that we host. The volume and form of the attack exhausted worker resources on some name servers: queries took long enough to process that they exceeded worker timeout limits.
We have already deployed updates to our name server software to better handle the type of attack we saw during this incident. We are also performing additional analyses to determine other potential attack vectors to address before they result in a future incident.
Following our previous Post Incident Response (PIR), we had already added monitoring, and it notified on-call personnel during this incident. We were still not fast enough to publish an incident page, because alerts kept resolving themselves automatically as the incident unfolded: some servers were still able to return responses. After analyzing our monitors, we have made further adjustments to improve our monitoring and alerting.
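One general way to keep an alert open while only some servers recover is to require a streak of healthy checks before resolving it. The sketch below illustrates that idea only. The class name, the threshold, and the single pass/fail probe model are hypothetical and do not describe DNSimple's actual monitoring.

```python
from dataclasses import dataclass


@dataclass
class FlapResistantAlert:
    """Opens on the first failed check; resolves only after a healthy streak."""

    healthy_checks_to_resolve: int = 5  # hypothetical threshold, tune per monitor
    firing: bool = False
    _healthy_streak: int = 0

    def observe(self, check_passed: bool) -> bool:
        """Record one probe result and return whether the alert is firing."""
        if not check_passed:
            # Any failure (re)opens the alert and resets the recovery streak.
            self.firing = True
            self._healthy_streak = 0
        elif self.firing:
            # A single good probe is not enough to resolve an open alert.
            self._healthy_streak += 1
            if self._healthy_streak >= self.healthy_checks_to_resolve:
                self.firing = False
                self._healthy_streak = 0
        return self.firing
```

With a threshold of 3, a probe sequence that alternates between success and failure keeps the alert open until three consecutive probes succeed, so intermittent good responses no longer close it mid-incident.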
Finally, as previously mentioned, we are working on improving observability of network traffic into and out of our systems. This project is ongoing, but we hope to have something in place shortly that will let us identify the target of attacks like this much faster, so that we can make better mitigation decisions more quickly.
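To make the idea of identifying an attack's target concrete, here is a minimal sketch of one possible heuristic for randomized subdomain traffic: group observed query names by parent zone and flag zones where almost every name is seen only once. It is an illustration under stated assumptions, not a description of our tooling. The function names, the thresholds, and the "last two labels" zone guess are all hypothetical.

```python
from collections import defaultdict


def zone_of(qname: str, zone_labels: int = 2) -> str:
    """Naive parent-zone guess: keep the last `zone_labels` labels.

    Assumption: real code would match against the list of hosted zones,
    since suffixes such as co.uk break this shortcut.
    """
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[-zone_labels:])


def rank_suspected_targets(qnames, min_queries=100, min_unique_ratio=0.9):
    """Return (zone, query_count, unique_ratio) tuples, busiest zone first.

    A zone is listed only if it received at least `min_queries` queries and
    the share of distinct names among them is at least `min_unique_ratio`.
    """
    totals = defaultdict(int)
    unique_names = defaultdict(set)
    for qname in qnames:
        zone = zone_of(qname)
        totals[zone] += 1
        unique_names[zone].add(qname.rstrip(".").lower())

    suspects = []
    for zone, count in totals.items():
        ratio = len(unique_names[zone]) / count
        if count >= min_queries and ratio >= min_unique_ratio:
            suspects.append((zone, count, ratio))
    return sorted(suspects, key=lambda item: item[1], reverse=True)
```

Normal traffic tends to repeat a small set of names, so its unique ratio stays low, while randomized subdomain traffic pushes the ratio toward 1. The thresholds here are placeholders and would need tuning against real query logs.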
The entire DNSimple team continues to work diligently to make our systems more resilient to these types of attacks. I am grateful to all our customers for being patient with us during these attacks, providing valuable feedback, and continuing to trust us with their DNS services.