The model cadence is breaking my reading list

Published

May 18, 2026

Reading time

1 minute

I used to read every release post. As of this month, I've given up.

In the last sixty days the frontier shipped: OpenAI's GPT-5.5 (six weeks after 5.4, now the default Instant), Google's Gemini 3.1 Ultra (2M-token native multimodal), and four major Claude updates in roughly fifty days. The pace is no longer "every quarter, with anticipation" — it's "every Tuesday, with fatigue."

The more interesting move is architectural. SubQ released the first commercial subquadratic LLM with a 12M-token context — and claims to run that context at one-fifth the inference cost. If that holds under load, transformers stop being the only game.

§ 01 What this means for building

For the kind of work I do, shipping product fast, the takeaway isn't "switch models." It's "stop coupling your code to any one of them." Capabilities converge within months; cost asymmetries are the only durable difference. Build through a router, log every call, and let the cheap-enough model win on the eval that matters to your product.
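Here is a minimal sketch of that router idea, not a production design. Everything in it is an assumption made for illustration: the `Model` and `Router` names, the flat `cost_per_call` figure, the stub models, and the pass/fail eval checks. Real adapters would wrap whatever SDK you use, and real pricing is per token, not per call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Model:
    name: str                       # hypothetical label, not a real model ID
    cost_per_call: float            # assumed flat cost, for illustration only
    complete: Callable[[str], str]  # adapter around whichever SDK you use


# One eval case: a prompt plus a check that returns True if the output passes.
EvalCase = Tuple[str, Callable[[str], bool]]


@dataclass
class Router:
    models: List[Model]
    log: List[dict] = field(default_factory=list)

    def call(self, model: Model, prompt: str) -> str:
        """Run one completion and record it, so swaps can be compared later."""
        start = time.perf_counter()
        output = model.complete(prompt)
        self.log.append({
            "model": model.name,
            "prompt": prompt,
            "output": output,
            "cost": model.cost_per_call,
            "latency_s": time.perf_counter() - start,
        })
        return output

    def score(self, model: Model, eval_set: List[EvalCase]) -> float:
        """Fraction of eval cases this model passes."""
        passed = sum(1 for prompt, check in eval_set
                     if check(self.call(model, prompt)))
        return passed / len(eval_set)

    def pick(self, eval_set: List[EvalCase], threshold: float) -> Model:
        """Return the cheapest model whose eval score meets the threshold."""
        for model in sorted(self.models, key=lambda m: m.cost_per_call):
            if self.score(model, eval_set) >= threshold:
                return model
        raise LookupError("no model met the eval threshold")


# Stub models stand in for real providers.
cheap = Model("cheap-stub", 0.001, lambda p: p.upper())
pricey = Model("pricey-stub", 0.01, lambda p: p.upper() + "!")

eval_set = [
    ("hello", lambda out: out == "HELLO"),
    ("ok", lambda out: out.endswith("K")),
]

router = Router([pricey, cheap])
winner = router.pick(eval_set, threshold=1.0)
print(winner.name)
```

The product code only ever talks to `Router`, so swapping a provider means changing one adapter and re-running the eval, not touching call sites.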

The frontier no longer rewards the people who read every release. It rewards the ones with a tight eval loop and the patience to swap.

◆

AMARTUVSHIN
