Cloudflare Configuration Error Caused Global Internet Outage

Cloudflare has taken responsibility for the global internet outage on July 14. A configuration error affecting its DNS resolver caused a worldwide disruption that lasted 62 minutes.

On the evening of July 14, the lights suddenly went out for many websites worldwide. The cause was soon traced to Cloudflare, one of the largest providers of internet services. In a blog post, Cloudflare admits that it made a mistake and dismisses rumors of a cyberattack.

The outage was caused by an internal configuration error affecting Cloudflare’s public DNS resolver, 1.1.1.1. As a result, users who rely on that resolver were temporarily unable to look up domain names, which made websites unreachable for them. Because Cloudflare serves customers worldwide, a disruption at the company has an immediate and large impact, even when it is short-lived.

62 Minutes

The outage began around midnight Central European time (21:52 UTC) and ended 62 minutes later. During that window, users of the resolver could not perform DNS queries, so almost all internet services were inaccessible to them. The root cause was an earlier misconfiguration in the systems that advertise Cloudflare’s IP addresses to the internet.

On June 6, a configuration change was made for a new service that was not yet in production, and that change accidentally included the prefixes of the 1.1.1.1 resolver. Because the service was not live, the error had no immediate effect. On July 14, a second change to the same service triggered a global update of network settings. That update unintentionally withdrew the resolver’s IP prefixes from Cloudflare’s data centers, making the resolver unreachable.
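
The kind of mistake described above, a pre-production service configured with prefixes that already belong to a live one, can in principle be caught by a check that runs before a change is deployed. The following is only a minimal sketch of that idea and not Cloudflare’s actual tooling: the service names, the dictionary layout, and the function `find_overlapping_prefixes` are all hypothetical, and the two 1.1.1.1 resolver prefixes are used purely as sample data.

```python
import ipaddress

def find_overlapping_prefixes(services):
    """Return a list of (service_a, service_b, prefix_a, prefix_b) tuples,
    one for every pair of prefixes that overlap across two different services.

    `services` maps a (hypothetical) service name to a list of CIDR strings.
    """
    conflicts = []
    names = sorted(services)
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            for prefix_a in services[name_a]:
                for prefix_b in services[name_b]:
                    net_a = ipaddress.ip_network(prefix_a)
                    net_b = ipaddress.ip_network(prefix_b)
                    # Only compare networks of the same IP version.
                    if net_a.version == net_b.version and net_a.overlaps(net_b):
                        conflicts.append((name_a, name_b, prefix_a, prefix_b))
    return conflicts

# Illustrative data only: a live resolver and a pre-production service
# that mistakenly lists one of the resolver's prefixes.
example_services = {
    "public-resolver": ["1.1.1.0/24", "1.0.0.0/24"],
    "preprod-service": ["203.0.113.0/24", "1.1.1.0/24"],
}

for conflict in find_overlapping_prefixes(example_services):
    print("overlap:", conflict)
```

A check like this only flags the overlap. Whether a flagged change is blocked or merely reviewed is a separate policy decision.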

The impact on DNS traffic was immediately visible to Cloudflare and set off alarm bells. Fortunately, DNS-over-HTTPS (DoH) traffic remained largely unaffected, because DoH clients typically reach the resolver through the domain name cloudflare-dns.com rather than through a fixed IP address. Thirty minutes after the problem was discovered, 77 percent of the affected traffic had been restored, and after 62 minutes the error was completely resolved.
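
To illustrate why DNS over HTTPS was less exposed, the sketch below sends a lookup to the cloudflare-dns.com hostname instead of to the 1.1.1.1 address. It assumes Cloudflare’s public JSON interface at `https://cloudflare-dns.com/dns-query` with the `application/dns-json` media type. The helper names `build_doh_url` and `resolve` are made up for this example, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request

# Hostname-based DoH endpoint (assumed here: Cloudflare's JSON interface).
DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"

def build_doh_url(name, record_type="A"):
    """Build the query URL for a DoH JSON lookup of `name`."""
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return f"{DOH_ENDPOINT}?{query}"

def resolve(name, record_type="A", timeout=5.0):
    """Look up `name` over HTTPS and return the data field of each answer.

    Performs a real network request when called.
    """
    request = urllib.request.Request(
        build_doh_url(name, record_type),
        headers={"Accept": "application/dns-json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # The "Answer" key is absent when the name has no matching records.
    return [answer["data"] for answer in payload.get("Answer", [])]
```

Calling `resolve("example.com")` on a machine with internet access should return a list of address strings. The client still needs some other resolver to look up cloudflare-dns.com itself, which is one reason such traffic was "largely" rather than completely unaffected.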

Mea Culpa

Cloudflare takes the blame and says it has implemented several measures to prevent a recurrence. The company plans to phase out legacy systems and to roll out changes gradually and in controlled stages. This should improve the stability of its network infrastructure and minimize future disruptions.