Cloudflare Outage: Bug Takes Down Huge Parts of Web
- A major Cloudflare outage on November 18, 2025, caused widespread internet disruptions, presenting users with HTTP 5xx error pages.
- The company confirmed the incident was not a cyber attack but was triggered by an internal bug in its Bot Management system that caused a critical configuration file to double in size.
- The oversized file exceeded memory limits in Cloudflare's core proxy software, leading to system-wide failures and service unavailability.
- Key services including the Core CDN, Workers KV, Access, and Turnstile were heavily impacted, with full recovery taking several hours.
Widespread Outage Hits Cloudflare, Affecting Major Internet Services
On November 18, 2025, at approximately 11:20 UTC, large portions of the internet became inaccessible as Cloudflare, a major web infrastructure and security company, experienced a significant service outage. Users attempting to access a multitude of websites and online services were greeted with error pages indicating a failure within Cloudflare's network. The company, which handles a substantial portion of global internet traffic, acknowledged the disruption, calling it its worst outage since 2019 and apologizing for the widespread impact on its customers and the internet community.
What Caused the Internet Disruption?
In a detailed post-mortem, Cloudflare explained that the outage was not the result of a malicious cyber attack, a scenario the team initially investigated. Instead, the failure stemmed from a seemingly minor internal change that had catastrophic consequences. The root cause was a bug triggered by a permissions change in one of its database systems.
The Bug in the Bot Management System
The issue originated within Cloudflare’s Bot Management system. A query used to generate a “feature file”—a configuration file that helps the system identify and manage bot traffic—began to output duplicate entries after the database change. This unforeseen behavior caused the feature file to double in size.
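To make that failure mode concrete, here is a minimal sketch in Rust, assuming a hypothetical pipeline (the schema and feature names are invented, not Cloudflare's actual code): when the same query suddenly returns each feature once per schema it can read, and nothing deduplicates the rows, the generated file doubles.

```rust
// Hypothetical sketch of a feature-file generator: after a permissions change,
// the metadata query sees each feature once per visible schema, and the output
// doubles because the rows are not deduplicated.

use std::collections::HashSet;

#[derive(Debug, Clone)]
struct FeatureRow {
    schema: String, // an extra schema becomes visible after the change
    name: String,   // feature identifier consumed by the bot-scoring model
}

/// Builds the feature list exactly as the query returned it, with no
/// deduplication on the feature name.
fn build_feature_file(rows: &[FeatureRow]) -> Vec<String> {
    rows.iter().map(|r| r.name.clone()).collect()
}

fn main() {
    // Before the change: one schema is visible, each feature appears once.
    let before: Vec<FeatureRow> = (0..3)
        .map(|i| FeatureRow { schema: "default".into(), name: format!("feature_{i}") })
        .collect();

    // After the change: the same features are also returned through a second
    // schema, so every row comes back twice.
    let mut after = before.clone();
    after.extend(before.iter().map(|r| FeatureRow { schema: "extra".into(), name: r.name.clone() }));

    let schemas: HashSet<&str> = after.iter().map(|r| r.schema.as_str()).collect();
    println!("schemas visible after the change: {}", schemas.len()); // 2
    println!("entries before: {}", build_feature_file(&before).len()); // 3
    println!("entries after:  {}", build_feature_file(&after).len());  // 6 (file doubles)
}
```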
This larger-than-expected file was then distributed across Cloudflare's global network. The core proxy software, which processes all traffic, had a pre-allocated memory limit for this specific file. When the oversized file was loaded, it exceeded this limit, causing the software to crash and resulting in the widespread 5xx errors seen by users.
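The crash itself follows from a hard limit in the consumer. As a rough illustration (the limit value and names below are assumptions, not Cloudflare's proxy code), a module that preallocates space for a fixed number of features and rejects anything larger turns one oversized configuration file into request-level failures:

```rust
// Assumed design: a proxy module preallocates room for a fixed number of
// features and rejects any configuration file that exceeds the limit. Once the
// config is rejected, every request the module handles fails with a 5xx.

const MAX_FEATURES: usize = 200; // illustrative preallocated limit

enum LoadError {
    TooManyFeatures { got: usize, limit: usize },
}

fn load_features(entries: &[String]) -> Result<Vec<String>, LoadError> {
    if entries.len() > MAX_FEATURES {
        // The oversized file is rejected outright rather than truncated or
        // replaced with a previous good version.
        return Err(LoadError::TooManyFeatures { got: entries.len(), limit: MAX_FEATURES });
    }
    let mut features = Vec::with_capacity(MAX_FEATURES);
    features.extend_from_slice(entries);
    Ok(features)
}

fn handle_request(features: &Result<Vec<String>, LoadError>) -> u16 {
    match features {
        Ok(_) => 200,  // bot score computed, request proxied normally
        Err(_) => 500, // module failed to load its config: request gets a 5xx
    }
}

fn main() {
    // A feature file that doubled past the limit, as in the sketch above.
    let oversized: Vec<String> = (0..400).map(|i| format!("feature_{i}")).collect();
    let loaded = load_features(&oversized);

    if let Err(LoadError::TooManyFeatures { got, limit }) = &loaded {
        eprintln!("feature file rejected: {got} entries exceeds limit of {limit}");
    }
    println!("HTTP status returned to users: {}", handle_request(&loaded)); // 500
}
```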
The Ripple Effect: Services Impacted
The failure of the core proxy system had a cascading effect across Cloudflare’s product suite. The most significant impacts included:
- Core CDN and Security: The primary services for content delivery and security failed, making websites unavailable.
- Workers KV: A key data storage service experienced a high rate of errors, affecting applications built on Cloudflare's platform.
- Access and Turnstile: Authentication services failed, preventing users from logging into the Cloudflare Dashboard and other protected applications.
The fluctuating nature of the problem in its early stages, during which the system would periodically recover, made diagnosis difficult and initially led engineers to suspect a large-scale DDoS attack.
Cloudflare's Apology and Future Safeguards
After identifying the root cause, Cloudflare’s engineering team stopped the propagation of the corrupted file and deployed a last-known-good version, with core traffic beginning to flow normally by 14:30 UTC. All systems were reported as fully functional by 17:06 UTC.
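That recovery step, rolling back to a last-known-good file, maps onto a familiar pattern for configuration loaders. The sketch below is an assumed design, not Cloudflare's tooling: the loader keeps the last configuration that validated and continues serving it whenever a newly propagated file is rejected.

```rust
// Assumed design: keep the last configuration that passed validation and fall
// back to it when a newly propagated file fails to load, instead of taking the
// data plane down with the bad config.

struct FeatureConfig {
    entries: Vec<String>,
}

struct ConfigManager {
    max_entries: usize,
    last_known_good: Option<FeatureConfig>,
}

impl ConfigManager {
    fn new(max_entries: usize) -> Self {
        Self { max_entries, last_known_good: None }
    }

    /// Validates a freshly propagated file; on success it becomes the new
    /// last-known-good, on failure the previous good config keeps serving.
    fn apply(&mut self, entries: Vec<String>) -> bool {
        if entries.len() > self.max_entries {
            eprintln!("rejecting config with {} entries, keeping last known good", entries.len());
            return false;
        }
        self.last_known_good = Some(FeatureConfig { entries });
        true
    }

    fn active(&self) -> Option<&FeatureConfig> {
        self.last_known_good.as_ref()
    }
}

fn main() {
    let mut mgr = ConfigManager::new(200);
    mgr.apply((0..60).map(|i| format!("feature_{i}")).collect());  // accepted
    mgr.apply((0..400).map(|i| format!("feature_{i}")).collect()); // rejected
    // Traffic continues to be served with the 60-entry configuration.
    println!("active entries: {}", mgr.active().map_or(0, |c| c.entries.len()));
}
```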
Acknowledging the severity of the incident, Cloudflare stated, "An outage like today is unacceptable." The company has already begun implementing new safeguards to prevent a recurrence, including hardening how its systems handle internal configuration files, improving global kill-switches for features, and reviewing failure modes across all core modules to build a more resilient network.