Intel, AMD Release ACE x86 AI Extensions With 16x More Matrix Ops Than AVX10
Updated
Updated · Tom's Hardware · Jun 20
Intel, AMD Release ACE x86 AI Extensions With 16x More Matrix Ops Than AVX10
3 articles · Updated · Tom's Hardware · Jun 20
Summary
Intel and AMD published the full ACE specification, a joint x86 extension aimed at running smaller or latency-sensitive AI workloads more efficiently on CPUs.
ACE adds dedicated matrix-multiplication silicon while reusing AVX10’s 512-bit registers, letting developers target one standard path instead of tuning separately for varying x86 AI features.
For the same number of input vectors, ACE can execute 16x as many operations as AVX10, cutting instruction overhead and potentially improving power use and memory-bandwidth efficiency.
The extension supports common ML formats including INT8, FP8, FP16, FP32 and BF16, plus Open Compute Project MX block-scaled formats, broadening CPU support where GPUs or NPUs are unavailable or less practical.
The move positions x86 CPUs to take on more local AI inference as Intel and AMD seek a vendor-agnostic software target for frameworks such as PyTorch and TensorFlow.
Will ACE finally let CPUs run powerful local AI, making dedicated GPUs obsolete for most users?
Is the Intel-AMD truce on ACE a genuine leap for AI or a defensive wall against the rising ARM ecosystem?
Intel and AMD Launch ACE: A Unified x86 Matrix Acceleration Standard for Next-Gen AI Performance
Overview
AI Compute Extensions (ACE), announced in June 2026, marks a major step for x86 AI computing through a unique collaboration between Intel and AMD. ACE addresses the growing need for efficient matrix multiplication, a core operation in AI, by introducing a unified instruction set that overcomes the limitations and high power use of current x86 methods like AVX10. Previously, x86 CPUs struggled with matrix operations because AVX was not designed for this purpose, leading to fragmented solutions across platforms. With ACE, Intel and AMD aim to standardize and optimize AI acceleration, transforming the x86 ecosystem for future AI workloads.