Updated · MIT Technology Review · Jun 19
Subquadratic Publishes Appen Tests Backing SubQ, Claiming 56x Speed and 12 Million-Token Context

Summary

  • Appen’s independent evaluation backed key parts of Subquadratic’s once-contested SubQ pitch, finding the model dramatically faster while still delivering frontier-level coding performance.
  • In baseline speed tests, SubQ ran 56 times faster than FlashAttention-based models; on LiveCodeBench it scored 89.7%, and Appen reported 98% retrieval accuracy at 6 million- and 12 million-token context lengths.
  • Subquadratic says its architecture could sharply cut operating costs for long-context work, citing an $8 run on Nvidia’s RULER 128 test versus $2,600 for Anthropic’s Opus 4.6, though outside verification remains limited because access is still restricted.
  • The company argues its sparse-attention design overcomes the quadratic compute burden built into transformers, potentially enabling tasks like analyzing 400 documents at once without the power and cost penalties of dense attention.
  • Skepticism still lingers because benchmarks are narrow, few outsiders can use SubQ yet, and the model was bootstrapped from Qwen weights, leaving unproven the broader claim that Subquadratic has fully solved the long-standing attention bottleneck.
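
To put the figures above in proportion, here is a minimal Python sketch of the arithmetic. It assumes only the numbers quoted in the summary (the $8 and $2,600 run costs, and the 6 million- and 12 million-token context lengths) plus the textbook property that dense self-attention scores every token against every other token. The function names are hypothetical, and nothing here models how SubQ itself works.

```python
# Illustrative arithmetic only. The dollar figures are the ones Subquadratic
# cites in the summary; they have not been independently verified here.

def cost_ratio(baseline_usd: float, candidate_usd: float) -> float:
    """How many times more expensive the baseline run is than the candidate run."""
    return baseline_usd / candidate_usd


def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key score entries in full (dense) self-attention: n * n."""
    return n_tokens * n_tokens


def dense_growth_factor(short_ctx: int, long_ctx: int) -> float:
    """Factor by which dense-attention score entries grow between two context lengths."""
    return dense_attention_pairs(long_ctx) / dense_attention_pairs(short_ctx)


if __name__ == "__main__":
    # $2,600 (quoted for Opus 4.6) versus $8 (quoted for SubQ).
    print(cost_ratio(2600, 8))  # 325.0
    # Doubling context from 6M to 12M tokens quadruples dense-attention work.
    print(dense_growth_factor(6_000_000, 12_000_000))  # 4.0
```

On the quoted numbers, the claimed cost gap is 325 to 1, and under dense attention a doubling of context from 6 million to 12 million tokens would quadruple the pairwise work, which is the scaling penalty the sparse-attention design is said to avoid.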

Insights

Is Subquadratic’s AI a true transformer killer or a clever hack on another company’s model?
Could 12-million-token context windows finally allow AI to reason over entire legal archives?