Multiverse Computing Cuts Llama 3.1 8B Perplexity 1.4% With 156-Qubit IBM Quantum System Two
Updated · Quantum Zeitgeist · May 11
A 1.4% perplexity reduction on Llama 3.1 8B underpins Multiverse Computing’s claim of measurable language-model gains on real quantum hardware rather than simulation.
The result came from inserting Cayley-parameterised unitary adapters into the model’s frozen projection layers, adding only 6,000 parameters, with the adapters executed on IBM’s 156-qubit Quantum System Two.
In SmolLM2 tests, perplexity improved as the unitary block dimension increased, and the adapters recovered 83% of compression-induced degradation, with some answers beating classical baselines.
Researchers said the experiments exposed a sharp noise-expressivity phase transition, suggesting modest advances in qubit scale and hardware quality could unlock larger AI gains beyond classical memory limits.
Is a 1.4% LLM accuracy boost worth the price of a quantum computer?
Can classical AI achieve the same performance boost without needing a quantum processor?
Quantum Computing Boosts Llama 3.1 8B: 83% Compression Recovery and CompactifAI’s Leap Toward Efficient AI
Overview
Multiverse Computing has achieved a major breakthrough by integrating quantum computing with artificial intelligence, demonstrating a functional quantum-enhanced Llama 3.1 8B large language model. Led by Borja Aizpurua and his team, this was accomplished using a 156-qubit IBM Quantum System Two processor. By inserting Cayley-parameterised unitary adapters into the model, they recovered 83% of the performance lost due to compression, resulting in a clear improvement in the model’s quality and efficiency. This advancement marks a significant step toward using quantum technology to meet the growing resource demands of advanced AI systems.
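The core idea behind a Cayley-parameterised adapter can be sketched classically. The Cayley transform maps any skew-symmetric matrix A to an orthogonal (real unitary) matrix W = (I − A)(I + A)⁻¹, so an adapter needs only the n(n−1)/2 free entries of A, far fewer parameters than the frozen layer it augments. The sketch below is a minimal classical illustration of that parameterisation, not Multiverse Computing’s implementation; the class name, dimensions, and initialisation are illustrative assumptions.

```python
import numpy as np

def cayley_unitary(theta: np.ndarray) -> np.ndarray:
    """Build an orthogonal matrix from free parameters via the Cayley transform.

    theta holds the strictly-upper-triangular entries of a skew-symmetric
    matrix A; W = (I - A) @ inv(I + A) is then orthogonal.
    """
    # Infer dimension n from the parameter count n(n-1)/2.
    n = int((1 + np.sqrt(1 + 8 * len(theta))) / 2)
    A = np.zeros((n, n))
    A[np.triu_indices(n, k=1)] = theta
    A -= A.T  # enforce skew-symmetry: A^T = -A
    I = np.eye(n)
    return (I - A) @ np.linalg.inv(I + A)

class CayleyAdapter:
    """Hypothetical adapter: rotate a frozen projection's output
    by a small trainable orthogonal map."""

    def __init__(self, dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Only dim*(dim-1)/2 trainable parameters, tiny next to the frozen layer.
        self.theta = 0.01 * rng.standard_normal(dim * (dim - 1) // 2)

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Apply the orthogonal map to the frozen layer's output features.
        return x @ cayley_unitary(self.theta)
```

Because W is orthogonal, the adapter preserves vector norms while reorienting the feature space, which is why such adapters can nudge a compressed model’s outputs without destabilising it. On quantum hardware the analogous unitary would be realised as a parameterised circuit rather than a dense matrix.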