Updated · Livescience.com · May 25

Multiverse Computing Cuts Llama 3.1 8B Perplexity 1.4% on 156-Qubit IBM Quantum System Two

A quantum-classical version of Meta’s Llama 3.1 8B lowered perplexity by 1.4% while adding just 6,000 parameters, which the researchers call the first end-to-end quantum enhancement of a production-scale pretrained LLM on real superconducting hardware.
The method inserts classically trained Cayley-parameterized unitary adapters into one model layer, freezes the original weights, and runs inference on IBM’s 156-qubit Quantum System Two to improve next-word prediction without materially expanding model size.
Noise was the main obstacle, because larger quantum circuits are more error-prone; the team kept the adapters small to limit interference from qubit interactions and other disturbances that can corrupt outputs.
On test questions, the hybrid model corrected errors the base model made in astronomy and biology, suggesting quantum blocks could improve answer accuracy as well as reduce predictive uncertainty.
The researchers say the result points to a way around classical AI infrastructure scaling limits, with future work aimed at encoding more of the circuit directly in quantum hardware to pursue broader quantum advantage.
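
For readers curious what "Cayley-parameterized" means, the sketch below shows the Cayley transform, a standard way to turn unconstrained trainable numbers into a matrix that is guaranteed to be unitary (the kind of operation a quantum circuit can implement). It is a minimal illustration, not the researchers' code: the function name `cayley_unitary`, the 4×4 size, and the random parameters are all assumptions chosen for the example, and the article does not say which form of the transform the team used.

```python
import numpy as np


def cayley_unitary(params: np.ndarray) -> np.ndarray:
    """Map any square complex matrix to a unitary via the Cayley transform.

    Hypothetical helper for illustration only; not taken from the article.
    """
    # Keep only the skew-Hermitian part, so that A^H = -A.
    a = 0.5 * (params - params.conj().T)
    eye = np.eye(a.shape[0], dtype=complex)
    # U = (I + A)^{-1} (I - A). I + A is always invertible because the
    # eigenvalues of a skew-Hermitian matrix are purely imaginary.
    return np.linalg.solve(eye + a, eye - a)


# Illustrative parameters (random, not from the article).
rng = np.random.default_rng(0)
n = 4
params = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

u = cayley_unitary(params)

# A unitary matrix satisfies U^H U = I; this measures how far we are from that.
unitarity_error = np.linalg.norm(u.conj().T @ u - np.eye(n))
print(f"unitarity error: {unitarity_error:.2e}")
```

Because the output is unitary for any parameter values, such a block can in principle be trained with ordinary gradient methods on classical hardware and then executed as a quantum operation, which matches the classically trained, quantum-executed setup described above.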

Quantum Leap: Llama 3.1 8B Achieves 1.4% Perplexity Reduction with IBM Quantum Hardware Integration

Overview

On May 25, 2026, Multiverse Computing, in collaboration with IBM, announced that it had integrated quantum hardware with Meta’s Llama 3.1 8B large language model. The team inserted Cayley-parameterized unitary adapters into the model and ran inference on IBM’s 156-qubit Quantum System Two, lowering perplexity by 1.4% while adding only 6,000 parameters. The adapters were trained classically and the original weights were frozen, so the quantum hardware contributed at inference time rather than during training. The researchers present the result as a step forward in the convergence of quantum computing and artificial intelligence.
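
To put the headline metric in context: perplexity is the exponential of a model's average negative log-likelihood per token, so lower values mean the model is less "surprised" by the text. The sketch below computes it and the relative reduction between two models. All numbers are made up for illustration; the article reports only the 1.4% relative figure, not the underlying perplexity values, and the function names are hypothetical.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp(mean negative natural-log probability per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl: float, new_ppl: float) -> float:
    """Fractional drop in perplexity relative to the baseline."""
    return (baseline_ppl - new_ppl) / baseline_ppl


# Hypothetical baseline: every token predicted with probability 0.25,
# which gives a perplexity of 4.0.
base_ppl = perplexity([math.log(0.25)] * 4)

# Hypothetical hybrid model whose perplexity is 1.4% lower.
hybrid_ppl = base_ppl * (1 - 0.014)

print(f"baseline perplexity: {base_ppl:.3f}")
print(f"hybrid perplexity:   {hybrid_ppl:.3f}")
print(f"relative reduction:  {relative_reduction(base_ppl, hybrid_ppl):.1%}")
```

In loss terms, a 1.4% drop in perplexity corresponds to a reduction of about 0.0141 nats in average per-token cross-entropy, since -ln(0.986) ≈ 0.0141.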

...
