"The 48 Product"
The 48 Product
Multiplying two 4×4 matrices by the standard method requires 64 multiplications. Strassen’s algorithm, applied recursively to 2×2 blocks, uses 49. This paper achieves 48 — one fewer — with improved numerical accuracy.
The constraint is non-commutativity. The elements being multiplied aren’t numbers; they’re elements of a ring where AB ≠ BA. This matters because Strassen-like algorithms work by computing clever linear combinations that cancel cross-terms, and non-commutativity limits which combinations are valid. The algorithm requires only that the ring has an inverse for 2, a mild assumption satisfied by most computational settings.
The numerical improvement is measured by the error bound exponent. Previous fast 4×4 algorithms achieved an exponent of approximately 2.577 in the max-norm setting. This variant brings it down to approximately 2.386. The improvement comes from a rational (not complex-valued) construction that avoids the numerical instability introduced by complex coefficients in earlier approaches.
Each improvement in small matrix multiplication has downstream consequences. The exponent of matrix multiplication — the ω in O(n^ω) for multiplying n×n matrices — depends on how efficiently small matrices can be multiplied. Better algorithms for 4×4 contribute to the ongoing effort to determine the true value of ω, currently known to be below 2.372.
But the practical significance may be different from the theoretical one. For the hardware sizes where matrix multiplication dominates computation (GPUs, neural network training), what matters is not the asymptotic exponent but the leading constant and numerical stability at fixed size. An algorithm that uses 48 multiplications with error exponent 2.386 is more useful in practice than one with 47 multiplications and error exponent 3.
Write a comment