"The Sheaf Network"

The Sheaf Network

Backpropagation computes gradients globally: a single loss at the output propagates backward through the entire network, coordinating updates across all layers simultaneously. It works spectacularly well. But the computation is inherently nonlocal: every weight update depends on information from the output layer, creating a bottleneck that has no obvious biological analog and limits the method's use in certain distributed computing architectures.

Bosca and Ghrist (arXiv:2603.14831) recast feedforward ReLU networks as cellular sheaves — mathematical objects from algebraic topology that encode local-to-global relationships. Each intermediate quantity in the forward pass gets a vertex; each computational step (matrix multiplication, bias addition, ReLU activation) becomes a restriction map between stalks. The sheaf structure turns the network into a topological object whose global behavior is determined by its local consistency conditions.
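To make the construction concrete, here is a minimal sketch (an assumed encoding, not the paper's exact formalism) of one hidden layer y = ReLU(Wx + b) as a chain of vertices with restriction maps on the edges. The key trick: once an input fixes the activation pattern, the ReLU edge becomes linear (a 0/1 diagonal projection), so every restriction map in the chain is a linear map between stalks.

```python
import numpy as np

# Hypothetical encoding of one layer as a chain of sheaf vertices:
#   x --(affine W, b)--> z --(ReLU)--> y
# Over a FIXED activation pattern, ReLU is the diagonal projection D,
# so every edge carries a linear restriction map.

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))    # weights: R^2 -> R^3
b = rng.standard_normal(3)         # bias

x = np.array([1.0, -0.5])          # input vertex stalk value
z = W @ x + b                      # pre-activation vertex stalk value
pattern = (z > 0).astype(float)    # activation pattern chosen by this input
D = np.diag(pattern)               # ReLU as a linear restriction map
y = D @ z                          # post-activation vertex stalk value

# A global section assigns values to all vertices so that the restriction
# maps agree across every edge; here, that agreement IS the forward pass.
assert np.allclose(y, np.maximum(z, 0.0))
```

The point of the sketch is that "activation-pattern-wise linearity" is what lets a nonlinear network be treated as a (piecewise) cellular sheaf of vector spaces.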

The key result: the forward pass is equivalent to finding the unique harmonic extension of boundary data on the sheaf. The restricted Laplacian is positive definite for every activation pattern, and the relevant relative cohomology vanishes, meaning the network's computation is entirely characterized by how well local pieces agree with one another. No global information is needed; the output is determined by local consistency alone.
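A toy stand-in for this result, using the constant sheaf (stalks R, identity restriction maps) on a path graph rather than the paper's construction: clamp boundary data at the endpoints, and the unique harmonic extension of the interior is the solution of the restricted Laplacian system L_II u_I = -L_IB u_B, which exists because L_II is positive definite.

```python
import numpy as np

# Graph Laplacian of the path 0-1-2-3-4 (constant sheaf: all stalks R,
# all restriction maps the identity -- a toy model, not the paper's sheaf).
n = 5
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1; L[i + 1, i + 1] += 1
    L[i, i + 1] -= 1; L[i + 1, i] -= 1

boundary, interior = [0, 4], [1, 2, 3]
u_B = np.array([0.0, 4.0])           # boundary data (input/output values)

L_II = L[np.ix_(interior, interior)]  # restricted Laplacian
L_IB = L[np.ix_(interior, boundary)]

# Positive definiteness guarantees a UNIQUE harmonic extension.
assert np.all(np.linalg.eigvalsh(L_II) > 0)

u_I = np.linalg.solve(L_II, -L_IB @ u_B)
print(np.round(u_I, 6))              # -> [1. 2. 3.]
```

On a path with the constant sheaf, harmonic extension is just linear interpolation between the boundary values; the paper's claim is that with the right sheaf, a ReLU network's forward pass has exactly this solve-for-the-interior structure.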

This enables training through a sheaf heat equation — local discrepancy minimization rather than global gradient computation. Each edge adjusts itself to reduce disagreement between its neighboring vertices. Information flows bidirectionally without backpropagation. The authors prove convergence and validate on synthetic tasks: the sheaf-based training isn’t yet competitive with SGD, but follows the scaling laws their theory predicts.
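The dynamics can be illustrated with the same constant-sheaf toy model (again an assumption, not the authors' algorithm): each vertex repeatedly moves toward agreement with its neighbors, a purely local update, while the boundary stays clamped. The discrete heat flow du/dt = -Lu converges to the same harmonic extension a global solve would produce.

```python
import numpy as np

# Hedged sketch of heat-equation dynamics on the constant sheaf over a path.
# Each step is LOCAL: a vertex only sees its neighbors' current values.
n = 5
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1; L[i + 1, i + 1] += 1
    L[i, i + 1] -= 1; L[i + 1, i] -= 1

u = np.zeros(n)
u[0], u[4] = 0.0, 4.0            # clamped boundary data
for _ in range(500):
    u -= 0.25 * (L @ u)          # explicit Euler step of du/dt = -Lu
    u[0], u[4] = 0.0, 4.0        # boundary stays fixed

print(np.round(u, 3))            # -> [0. 1. 2. 3. 4.]
```

The step size 0.25 is safe because the path Laplacian's spectrum lies below 4; the iterate converges geometrically to the harmonic extension. In the paper's setting the analogous flow adjusts edge data to shrink local disagreement, with convergence proved rather than assumed.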

The structural insight: the mathematical reason neural networks work may be topological, not just statistical. The sheaf framework reveals that the forward pass is a cohomological computation — unique harmonic extension — and training is heat diffusion. What backpropagation does efficiently, sheaf theory does transparently.


Bosca & Ghrist, “Neural Networks as Local-to-Global Computations,” arXiv:2603.14831 (2026).
