The GitHub Pages HTTPS Meltdown: How a 2018 SSL Outage Crippled Custom Domains

In late May 2018, GitHub Pages, GitHub’s popular static site hosting service, suffered a significant and highly visible failure: thousands of websites using custom domains abruptly lost HTTPS support due to a misconfiguration in the platform’s SSL provisioning system. What followed was a wave of user confusion, security warnings across browsers, and a temporary loss of trust for countless personal blogs, documentation hubs, developer portfolios, and even small businesses that depended on GitHub Pages as their primary web host. The incident, though resolved within days, highlighted the fragile architecture behind seemingly simple hosting services and exposed the challenges of managing certificates at scale in a distributed developer ecosystem.

At the heart of the failure was GitHub Pages’ integration with Let’s Encrypt, the free, automated Certificate Authority (CA) that issues the TLS certificates behind HTTPS. Before 2018, GitHub Pages offered no HTTPS for custom domains and no way to upload a certificate, so users who wanted it had to front their sites with a third-party proxy or CDN, a complex and error-prone setup for the average developer. In early May 2018, GitHub rolled out automatic HTTPS for custom domains using Let’s Encrypt, promising that once a domain was pointed to GitHub’s servers via CNAME or A records, the platform would automatically provision and renew certificates, removing the burden from users and ensuring HTTPS was the default.
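The DNS side of that setup can be sanity-checked in a few lines. The Python sketch below assumes the four apex A-record addresses that GitHub documents for Pages (185.199.108.153 through 185.199.111.153); treat that list as an assumption to verify against GitHub’s current documentation. The function name is hypothetical, and the caller is expected to supply addresses it has already resolved.

```python
import ipaddress

# Assumption: the A-record addresses GitHub documents for apex domains on Pages.
# Verify against GitHub's documentation before relying on this list.
GITHUB_PAGES_IPS = frozenset({
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
})


def points_to_github_pages(resolved_addresses):
    """Return True if every resolved A record is one of the expected Pages IPs.

    `resolved_addresses` is an iterable of IPv4 strings the caller obtained
    from a DNS lookup. An empty result counts as "not pointed at Pages".
    """
    # ip_address() normalizes the text form and raises ValueError on garbage.
    addresses = {str(ipaddress.ip_address(a)) for a in resolved_addresses}
    return bool(addresses) and addresses <= GITHUB_PAGES_IPS
```

A domain that still mixes in an older or unrelated address fails the check, which is the kind of configuration that can leave automated issuance stuck.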

The rollout was widely praised, until it broke. Starting on or around May 22, 2018, users began reporting that their previously secure custom-domain GitHub Pages sites were now serving insecure HTTP or, worse, failing to load entirely due to browser security errors. On Chrome and Firefox, users were met with warnings that their connection was not private, raising alarms that the sites might be compromised. On mobile browsers, many pages failed to load altogether. For users unaware of the internal workings of TLS, the effect was immediate and damaging: their domains, which had served HTTPS flawlessly for weeks, suddenly looked untrustworthy.

The root cause was traced to a failure in GitHub’s internal Let’s Encrypt provisioning logic. For certain custom domains, particularly apex (naked) domains pointed to GitHub’s IPs via A records rather than subdomains with CNAMEs, the certificate issuance process silently failed. This was further complicated by rate limits imposed by Let’s Encrypt, which cap how many certificates can be issued for a registered domain, and how many failed validations an account can accumulate, within a given time window. When GitHub’s backend began retrying failed certificate requests en masse, it tripped those limits, effectively locking itself out of issuance and renewal for a large swath of user domains. Even users who had recently added new domains, and whose DNS records were technically valid, saw their SSL provisioning stuck in limbo.
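The retry storm described above is the textbook case for capped exponential backoff with jitter. The following Python sketch illustrates that general technique; it is not GitHub’s actual code. `request_certificate` is a hypothetical callable that returns True on success, and the base and cap values are illustrative assumptions rather than Let’s Encrypt’s real limits.

```python
import random
import time


def backoff_delays(retries, base=60.0, cap=3600.0, rng=None):
    """Yield one wait time in seconds per retry (exponential, capped, full jitter)."""
    rng = rng or random.Random()
    for attempt in range(retries):
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly in [0, ceiling] so that many clients
        # that failed together do not all retry at the same moment.
        yield rng.uniform(0, ceiling)


def issue_with_backoff(request_certificate, domain, max_attempts=5,
                       sleep=time.sleep, rng=None):
    """Try to obtain a certificate, waiting longer after each failure.

    `request_certificate` is a hypothetical callable taking a domain and
    returning True on success. Returns the number of attempts used, or None
    if every attempt failed.
    """
    delays = list(backoff_delays(max_attempts - 1, rng=rng))
    for attempt in range(max_attempts):
        if request_certificate(domain):
            return attempt + 1
        if attempt < len(delays):  # no point sleeping after the final attempt
            sleep(delays[attempt])
    return None
```

Passing `sleep` and `rng` as parameters keeps the function testable without real waiting. A hard limit on attempts matters as much as the delay itself, since each failed validation counts against the CA’s limits.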

What made the outage especially difficult to diagnose was its inconsistency. Some domains with identical configurations continued to serve HTTPS without issue, while others failed. The GitHub Pages status page, at first, showed no signs of trouble, and for several hours, GitHub support staff were unaware of the full scope of the problem. Users turned to Twitter, Stack Overflow, and GitHub’s community forums to piece together what was happening. Screenshots of browser warnings circulated widely, and web performance and security monitoring tools flagged the broken certificates, causing downstream effects for sites relying on uptime checks, SEO indexing, and analytics collection.

Within GitHub, engineers worked to untangle the interdependent systems at play. The provisioning failure wasn’t just an issue with Let’s Encrypt: GitHub’s automation layer had no effective mechanism for alerting users to SSL provisioning failures, nor for retrying them in a controlled manner. Worse, once the CA rate limits were triggered, even valid fix attempts could not immediately succeed. Engineers had to work with Let’s Encrypt to raise rate limits temporarily while also introducing new throttling mechanisms on their own platform to avoid re-triggering the problem en masse.

The broader impact of the outage was felt acutely in the developer community. GitHub Pages is often used by first-time developers and students to host simple projects, portfolios, and documentation sites. For many, the incident was their first encounter with the complex ecosystem behind HTTPS. Some attempted to generate their own certificates using tools like Certbot, only to find that GitHub Pages didn’t support manual certificate upload for custom domains. Others migrated their sites temporarily to Netlify or ZEIT Now (since renamed Vercel), which offered more resilient and transparent SSL provisioning. A few high-profile users, including open-source projects with large audiences, posted public apologies as their documentation sites went offline for nearly 48 hours.

GitHub’s eventual fix involved revamping its SSL provisioning pipeline to more intelligently queue and stagger renewal requests, adding internal monitoring for stalled certificates, and improving the user-facing error messages when a domain failed to provision. Let’s Encrypt, for its part, cooperated closely and issued temporary exceptions to rate limits in order to clear the backlog. Within a few days, most affected domains were reissued valid certificates, and normalcy returned. But the incident was a wake-up call to many who had assumed that HTTPS “just worked” in the era of automated certificate management.
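Two of those ideas, staggering renewals and watching for stalled certificates, are easy to illustrate. The Python sketch below shows one common approach, not GitHub’s implementation: each domain gets a deterministic offset inside a renewal window derived from a hash of its name, and a separate helper reports provisioning jobs that have been pending too long. The function names, the 24-hour window, and the one-hour threshold are all assumptions chosen for illustration.

```python
import hashlib


def renewal_offset(domain, window_seconds=24 * 3600):
    """Map a domain to a stable offset in [0, window_seconds).

    Hashing spreads domains roughly evenly across the window, so renewals
    do not all fire at once, and the same domain always lands on the same slot.
    """
    digest = hashlib.sha256(domain.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_seconds


def stalled_domains(pending, now, max_age_seconds=3600):
    """Return, sorted, the domains whose provisioning has been pending too long.

    `pending` maps each domain to the timestamp (in seconds) at which its
    provisioning started; `now` uses the same clock.
    """
    return sorted(
        domain for domain, started in pending.items()
        if now - started > max_age_seconds
    )
```

The output of `stalled_domains` is the sort of signal that can drive both an internal alert and a clearer message to the site owner.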

In the postmortem, GitHub acknowledged the weakness in its observability around certificate issuance and made changes to provide greater transparency in the domain linking process. It also added support for CAA DNS records to pre-authorize Let’s Encrypt as a CA, which reduced DNS validation failure rates. Importantly, the platform began more clearly recommending the use of subdomain CNAME records over apex A records, which carried more complexity due to their interaction with DNS and certificate validation systems.
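For readers unfamiliar with CAA, the records let a domain owner state which CAs may issue certificates for the domain. The Python sketch below is a deliberately simplified take on the `issue` rule from the CAA specification: it ignores `issuewild`, critical flags, and the climb up to parent domains. It assumes the records have already been fetched and parsed into `(flags, tag, value)` tuples; the function name is hypothetical.

```python
def ca_may_issue(caa_records, ca_domain="letsencrypt.org"):
    """Return True if the given CAA record set permits `ca_domain` to issue.

    `caa_records` is a list of (flags, tag, value) tuples for the relevant
    record set. Simplified: only `issue` tags are considered.
    """
    issue_values = [
        value for _flags, tag, value in caa_records
        if tag.lower() == "issue"  # tag matching is case-insensitive
    ]
    if not issue_values:
        # No `issue` restriction published: any CA may issue.
        return True
    for value in issue_values:
        # The issuer name precedes any ";"-separated parameters.
        # A bare ";" yields an empty issuer, which forbids all issuance.
        issuer = value.split(";", 1)[0].strip().lower()
        if issuer == ca_domain.lower():
            return True
    return False
```

A domain with a single record such as `0 issue "digicert.com"` would block Let’s Encrypt entirely, which is one way automated provisioning can fail even when the A or CNAME records are correct.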

The GitHub Pages SSL outage of 2018 serves as a clear reminder that automation, while powerful, does not absolve infrastructure teams of the need for robust fallback mechanisms, rate-limit awareness, and cross-system coordination. As more web infrastructure moves toward “zero-configuration” convenience, the systems behind that convenience must be hardened against cascading failures. For thousands of developers caught in the crossfire, the incident turned a promise of seamless hosting into a sudden education in the brittle realities of internet trust.
