I used to read every release post. As of this month, I've given up.
In the last sixty days the frontier shipped: OpenAI's GPT-5.5 (six weeks after 5.4, now the default Instant), Google's Gemini 3.1 Ultra (2M-token native multimodal), and four major Claude updates in roughly fifty days. The pace is no longer "every quarter, with anticipation" — it's "every Tuesday, with fatigue."
The more interesting move is architectural. SubQ released the first commercial subquadratic LLM with a 12M-token context — and claims to run that context at one-fifth the inference cost. If that holds under load, transformers stop being the only game.
§ 01 What this means for building
For the kind of work I do — shipping product fast — the takeaway isn't "switch models." It's stop coupling your code to any one of them. Capabilities cluster within months. Cost asymmetries are the only durable difference. Build through a router, log every call, and let the cheap-enough model win on the eval that matters to your product.
The frontier no longer rewards the people who read every release. It rewards the ones with a tight eval loop and the patience to swap.