There is a joke, old enough now that everyone has heard it twice, about there being two hard things in computer science. The first is naming. The second is cache invalidation. The joke survives because both halves keep being true, and because people keep shipping systems that treat the second as an optimization rather than a design decision.
This is a short tour of three strategies I reach for, in roughly the order I reach for them. None of them are new. The point is not the strategies; the point is the shape of the problems they fit.
1. Time to live
The simplest approach, and the one most teams arrive at by accident, is to give every cached artifact a lifespan. After the TTL expires, the next read pays the full cost of regeneration; every read before that gets whatever copy is already in the cache, stale or not.
TTL is wonderful when two conditions hold: the underlying data changes on a predictable cadence, and a short window of staleness is tolerable. Weather forecasts. Currency quotes. A homepage that the editorial team touches twice a day.
```javascript
// Cloudflare Workers KV, 5-minute window
const hit = await cache.get(key, {
  type: "json",
  cacheTtl: 300,
});
```
It fails, predictably, in two ways. First, thundering herds: every worker in every region expires within the same second, and the origin takes the full weight of regeneration at once. Second, correctness drift: a change happens at minute zero and consumers see yesterday’s answer until minute five. For content that matters, five minutes is an eternity.
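The usual patch for the thundering herd, not shown in the KV example above, is to randomize each entry's lifespan so expirations spread out instead of landing in the same second. A minimal sketch (the function name and the 10% spread are my own choices):

```javascript
// Return a TTL in [base * (1 - spread), base * (1 + spread)], so that
// entries written at the same moment do not all expire at the same moment.
function jitteredTtl(baseSeconds, spread = 0.1) {
  const factor = 1 + (Math.random() * 2 - 1) * spread;
  return Math.round(baseSeconds * factor);
}
```

With `jitteredTtl(300)`, a fleet of workers that filled the cache together will expire anywhere across a one-minute window rather than in the same second. It softens the herd; it does nothing for correctness drift.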
2. Generational keys
The second strategy trades staleness for a small amount of write-side discipline. Each cacheable artifact gets a version number; when the underlying data mutates, the version increments. Readers always ask for the current version, so old entries age out on their own without any explicit delete.
You don’t invalidate the cache. You retire the question it was answering.
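A minimal in-memory sketch of the idea; the key format and the two maps are illustrative, not any particular framework's API:

```javascript
const versions = new Map(); // entity id -> current version
const cache = new Map();    // versioned key -> cached artifact

function currentKey(id) {
  return `article:${id}:v${versions.get(id) ?? 0}`;
}

function readThrough(id, regenerate) {
  const key = currentKey(id);
  if (!cache.has(key)) cache.set(key, regenerate(id)); // miss: pay full cost
  return cache.get(key);
}

function mutate(id) {
  // The write path bumps the version; nothing is ever deleted.
  versions.set(id, (versions.get(id) ?? 0) + 1);
}
```

After `mutate(1)`, the next `readThrough(1, …)` asks for a key that has never been filled, so it regenerates; the old entry simply stops being asked for and waits for eviction.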
Generational keys shine when the authorship of mutations is narrow — one or two services own the write path — and when the cache is large enough that leaving stale entries around costs little. Rails’ cache_key_with_version is the canonical example; every ORM has an equivalent now.
The weakness is coupling. The version has to be visible at read time, which usually means plumbing it through the request. If a third service shows up and forgets to bump the version on its own mutations, the system fails silently. I have seen this go wrong in production more times than I’d like to admit.
3. Explicit purge by surrogate key
The third strategy is the most expensive and the most correct. When content changes, the writer broadcasts a list of surrogate keys; every CDN edge, every cache layer, drops every entry tagged with any of those keys.
Fastly built a whole business on this primitive. The elegance is that a single purge can invalidate a non-obvious set of pages — the article itself, the tag pages it appears on, the home page, the RSS feed — without the writer needing to enumerate them. The cache tags form an implicit dependency graph.
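The primitive is easy to sketch in-process, even though the real version lives at the CDN edge. Everything here (the two maps, the function names) is illustrative:

```javascript
const store = new Map(); // cache key -> cached value
const byTag = new Map(); // surrogate key -> Set of cache keys tagged with it

function put(key, value, tags) {
  store.set(key, value);
  for (const tag of tags) {
    if (!byTag.has(tag)) byTag.set(tag, new Set());
    byTag.get(tag).add(key);
  }
}

function purge(tag) {
  // One purge drops every entry tagged with the surrogate key: the
  // article, the tag pages, the home page, whatever was attached at
  // fill time. The writer never enumerates the affected pages.
  for (const key of byTag.get(tag) ?? []) store.delete(key);
  byTag.delete(tag);
}
```

Fill time is where the dependency graph gets built: `put("/", homepage, ["home", "article-42"])` records that the home page depends on article 42, so a later `purge("article-42")` takes it down along with the article itself.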
The cost is complexity. The writer has to know which surrogate keys to attach; the cache layer has to be one that supports them; purges are eventually consistent and must be ordered correctly relative to the underlying write. It is the most powerful tool in the box and the one I pick up last.
A small matrix
If I had to compress three weeks of hallway arguments into one table:
- Predictable change, tolerable staleness — TTL.
- Narrow write path, large cache — generational keys.
- Wide surface area, correctness-critical — surrogate keys with explicit purge.
Everything else is hybrid: a short TTL as a safety net under a generational scheme; a purge system with a long fallback TTL; a CDN layer keyed on version strings that also accepts purges. The combinations are where the interesting design happens.
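The first hybrid in that list is worth sketching: generational keys with a short TTL underneath, so a service that forgets to bump the version can only serve stale data for the length of the safety net. The names and the explicit `now` parameter are mine:

```javascript
const hybrid = new Map(); // versioned key -> { value, expiresAt }

function getOrFill(versionedKey, regenerate, ttlMs, now = Date.now()) {
  const entry = hybrid.get(versionedKey);
  if (entry && entry.expiresAt > now) return entry.value; // fresh enough
  // Either a version bump produced a never-seen key, or the safety-net
  // TTL ran out on an existing one; both paths regenerate.
  const value = regenerate();
  hybrid.set(versionedKey, { value, expiresAt: now + ttlMs });
  return value;
}
```

A missed version bump now costs at most `ttlMs` of staleness instead of forever, which is the whole point of layering the two schemes.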
Naming is still harder.