How a Single Expired SSL Certificate Took Down Major Services — and Keeps Doing It — WebsiteDown Blog

In May 2020, Sectigo's legacy AddTrust External CA Root certificate expired, breaking validation chains for thousands of websites simultaneously. Companies discovered their monitoring only checked the leaf certificate, not the entire chain. The outage lasted hours because nothing about the fix was automated; someone had to intervene manually. This wasn't a freak accident. It's a recurring pattern. Every year, major services go dark because a certificate somewhere in the chain expired, often in infrastructure so old that automation was never added. The truly unsettling part: many organizations still don't have alerting set up properly.

Why the Chain Matters More Than You Think

SSL certificates work as a chain of trust. Your website's leaf certificate is signed by an intermediate, which is signed by a root. Browsers validate every link. Most teams monitor only the leaf certificate, the one they renewed last. They miss that intermediates and roots expire too, sometimes years after issuance. When an intermediate expires, browsers reject the entire chain, even if your leaf is valid for another year. The non-obvious part: some intermediates were issued with 10-year lifespans, before CAs began moving to shorter ones. Legacy infrastructure is sitting on certificates that expire soon but were never flagged as urgent. Unless you configured it to, your monitoring tool probably doesn't check every certificate in the chain.
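To make the weakest-link idea concrete, here is a minimal Python sketch. It assumes you have already extracted each certificate's expiry date from the chain your server presents; the `ChainCert` record and the function names are hypothetical, not part of any library. The point it illustrates is that a chain's real expiry date is the earliest expiry of any certificate in it, not the leaf's.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ChainCert:
    """One certificate in a chain (hypothetical record, not a library type)."""
    role: str            # "leaf", "intermediate", or "root"
    subject: str
    not_after: datetime  # expiry time, timezone-aware UTC


def weakest_link(chain: list[ChainCert]) -> ChainCert:
    """Return the certificate that expires first; the chain fails when it does."""
    if not chain:
        raise ValueError("chain is empty")
    return min(chain, key=lambda cert: cert.not_after)


def days_left(cert: ChainCert, now: datetime) -> int:
    """Days until expiry, rounded down (negative once the certificate has expired)."""
    return (cert.not_after - now).days


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Illustrative values only: a fresh leaf sitting on an aging intermediate.
    chain = [
        ChainCert("leaf", "example.com", now + timedelta(days=365)),
        ChainCert("intermediate", "Example Issuing CA", now + timedelta(days=10)),
        ChainCert("root", "Example Root CA", now + timedelta(days=3000)),
    ]
    first = weakest_link(chain)
    print(f"Chain effectively expires with the {first.role} "
          f"({first.subject}) in {days_left(first, now)} days")
```

A monitor that reads only the leaf would report 365 days here. One that walks the chain would report 10.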

The Automation Gap That Kills Uptime

Large organizations often have certificate management scattered across teams. Infrastructure owns some certificates, application teams own others, and security owns the renewal process. When automation exists, it frequently has a single point of failure: one renewal service, one person's email, one Slack channel. If that person is on vacation or that service is down, nothing renews. We've seen companies with $100M+ revenue still doing manual certificate uploads. The real risk isn't forgetting. It's that the person who knows the renewal process leaves, and documentation never existed. Smaller companies sometimes have better automation simply because they can't afford the overhead of manual management.

Monitoring Only Catches What You Ask It To Check

Most uptime monitoring services, including status dashboards, only check whether a website responds to HTTPS requests. They don't validate the full certificate chain or alert days before expiration. A certificate can be 24 hours from expiry and still pass a basic connectivity check. Some monitoring tools alert at 30 days, others at 14. The gap between alert and action varies wildly: some teams have SLAs to renew within 48 hours, others within a week. If alerts fire at 14 days, the team takes a week to pick up the ticket, and the renewal process itself takes 5 days, you're left with 2 days of slack. The companies that avoid these outages have monitoring that checks every certificate in the chain, alerts at 60+ days, and treats automated renewal as the default path with manual approval as the exception, not the reverse.
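The margin arithmetic is worth writing down. The sketch below uses only Python's standard library: it reads the leaf certificate's expiry from a live TLS connection, converts it to days remaining, and computes how much slack an alert threshold leaves. The function names are hypothetical, and the seven-day response time is an assumed reading of "renew within a week". Note that `getpeercert()` only exposes the leaf, which is exactly the blind spot described above.

```python
import socket
import ssl
from datetime import datetime, timezone


def leaf_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Fetch the leaf certificate's notAfter string over a verified TLS connection.

    Raises ssl.SSLCertVerificationError if the chain is already invalid.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until(not_after: str, now: datetime) -> float:
    """Days between `now` and a notAfter string like 'Jan  5 09:34:43 2018 GMT'."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now.timestamp()) / 86400


def slack_days(alert_lead_days: int, response_days: int, renewal_days: int) -> int:
    """Days left over after the team reacts to the alert and the renewal completes."""
    return alert_lead_days - response_days - renewal_days


if __name__ == "__main__":
    # Assumed figures: a week to pick up the alert, 5 days to complete renewal.
    print("14-day alert leaves", slack_days(14, 7, 5), "days of slack")
    print("60-day alert leaves", slack_days(60, 7, 5), "days of slack")
    # Network call; "example.com" is a placeholder host.
    # remaining = days_until(leaf_not_after("example.com"), datetime.now(timezone.utc))
```

With those assumed figures, a 14-day alert leaves 2 days of slack and a 60-day alert leaves 48.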

What Actually Works: The Checklist

First: audit every certificate your organization uses, including those in CDNs, load balancers, and internal services. Most teams find certificates they didn't know existed. Second: enable full-chain validation in your monitoring. Check every certificate from leaf to root, not just the leaf. Third: set alerts at 60 days before expiration and again at 14 days. The 60-day alert is for planning; the 14-day alert is for panic. Fourth: automate renewal using ACME (the protocol used by Let's Encrypt and supported by other CAs) where possible, or ensure your CA's renewal API is integrated into your deployment pipeline. Fifth: test your renewal process quarterly by forcing an early renewal in staging. Most failures happen during the first real renewal after automation is set up, not years later.
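The first and third steps combine naturally: once you have an inventory, triage it against the two thresholds. Here is a minimal Python sketch, assuming the inventory is a plain mapping from a certificate's name to its expiry date. The names, the level labels, and the inventory shape are all hypothetical; a real audit would populate this from your CDN, load balancer, and internal service configs.

```python
from datetime import date

PLANNING_DAYS = 60  # first alert: schedule the renewal
URGENT_DAYS = 14    # second alert: drop everything


def alert_level(not_after: date, today: date) -> str:
    """Classify a certificate as 'expired', 'urgent', 'plan', or 'ok'."""
    remaining = (not_after - today).days
    if remaining < 0:
        return "expired"
    if remaining <= URGENT_DAYS:
        return "urgent"
    if remaining <= PLANNING_DAYS:
        return "plan"
    return "ok"


def triage(inventory: dict[str, date], today: date) -> list[tuple[str, str]]:
    """Return (name, level) for every certificate needing attention, soonest expiry first."""
    by_expiry = sorted(inventory.items(), key=lambda item: item[1])
    levels = [(name, alert_level(expiry, today)) for name, expiry in by_expiry]
    return [(name, level) for name, level in levels if level != "ok"]


if __name__ == "__main__":
    # Illustrative inventory; names and dates are made up.
    inventory = {
        "www": date(2025, 1, 1),
        "cdn-edge": date(2024, 6, 10),
        "internal-api": date(2024, 7, 15),
        "old-load-balancer": date(2024, 5, 20),
    }
    for name, level in triage(inventory, today=date(2024, 6, 1)):
        print(f"{level:8} {name}")
```

Sorting by expiry puts the most urgent item at the top of the list, which is usually the forgotten certificate nobody knew was there.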

Why Companies Still Skip This

Certificate management feels like it should be handled by someone else: security, DevOps, infrastructure. It's not glamorous. Budgets don't get allocated to it. Until a certificate expires and takes down production, it's invisible. Then there's organizational debt: legacy systems on aging intermediates, manual processes nobody documented, renewal knowledge in one person's head. Fixing it requires time and cross-team coordination, which is hard to justify when the system technically still works today. The companies that treat this seriously tend to be those that have had an outage caused by certificate expiration. It's a painful teacher, but it sticks.