Subquadratic launches 12-million-token context window model

10 articles · Updated · The New Stack · May 5
  • The Miami startup says its first model scores 83 on MRCR v2, nine points above GPT-5.5, and 92.1% on needle-in-a-haystack retrieval at 12 million tokens.
  • It is releasing the model in beta through an API and SubQ Code, claiming its Subquadratic Selective Attention mechanism scales linearly in compute and memory and runs 52 times faster at a one-million-token context.
  • Subquadratic, backed by $29 million and 11 PhD researchers, plans a 50-million-token model in Q4, though the report notes long-context claims in the sector have often failed to gain real-world adoption.
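The announcement doesn't detail how Subquadratic Selective Attention works, but a generic cost comparison shows why linear scaling matters at these lengths. The sketch below contrasts the rough FLOP count of standard quadratic self-attention with a kernelized linear-attention-style mechanism (in the spirit of published linear-attention work, not Subquadratic's proprietary method); the head dimension `d=128` and both function names are illustrative assumptions.

```python
# Rough FLOP comparison: quadratic self-attention vs. a linear-attention-style
# mechanism. n is sequence length; d is a hypothetical head dimension.

def full_attention_flops(n: int, d: int = 128) -> int:
    # Standard attention: QK^T score matrix (~n*n*d) plus
    # attention-weighted sum of values (~n*n*d) -> O(n^2 * d).
    return 2 * n * n * d

def linear_attention_flops(n: int, d: int = 128) -> int:
    # Kernelized linear attention: accumulate a d x d summary of K^T V
    # (~n*d*d), then apply it per query (~n*d*d) -> O(n * d^2).
    return 2 * n * d * d

for n in (1_000_000, 12_000_000):
    ratio = full_attention_flops(n) / linear_attention_flops(n)
    print(f"n={n:,}: quadratic/linear cost ratio = {ratio:,.0f}x")
```

The ratio grows as n/d, so at million-token-and-beyond contexts the quadratic term dominates completely; whatever the startup's exact mechanism, any claim of linear compute and memory implies a structure like this rather than a full n-by-n score matrix.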
Is this startup's 12M-token AI a true leap in reasoning or another overhyped memory trick destined to fail?
Will a 95% cost cut for AI finally make massive context windows accessible to everyone, not just big tech?