98.21% uptime sounds acceptable until you sit down and do the maths. That figure, Claude API's measured availability for March 2025, translates to 13 hours and 4 minutes of downtime in a single month.
A 99.9% SLA, the baseline most teams expect from a critical vendor, allows 43 minutes of monthly downtime. Claude was down 18 times longer than that threshold.
If your product depends on Claude for inference, search, summarisation, or any other feature, you felt every one of those 13 hours. Your users noticed. Your incident channel lit up. And there was nothing you could do except wait.
This isn't about picking on Anthropic. OpenAI had its own 34-hour global outage in June 2025. The pattern across LLM providers is consistent: outages are frequent, global, and long.
The question isn't whether your LLM provider will have a bad month. It's whether your architecture can survive it. If you don't have a circuit breaker, a fallback model, or a degraded-mode plan, that gap is worth closing now.