"The Sparse Spike"

The Sparse Spike

Spiking neural networks are supposed to be efficient — they communicate through sparse binary events rather than continuous activations. But training them with standard optimizers produces dense weight matrices. The sparsity of the spikes doesn’t propagate to sparsity of the parameters. You get efficient inference but wasteful storage.

Windhager, Moser, and Lunglmayr enforce sparsity during training using Linearized Bregman Iterations — a convex optimization technique that naturally produces sparse solutions through iterative soft thresholding. The optimizer doesn’t just find good weights; it finds sparse good weights, reducing active parameters by about 50% while maintaining accuracy on three neuromorphic benchmarks.
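
To make the mechanism concrete, here is a minimal sketch of a linearized Bregman iteration on a toy sparse regression problem. The structure is the standard one from the convex optimization literature, not the authors' code: a dual variable accumulates the gradient steps, and the weights are read out through a soft threshold.

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of lam * ||.||_1: exact zeros wherever |v_i| <= lam.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def linearized_bregman(grad_fn, dim, lam=0.1, lr=0.2, steps=2000):
    # Sketch of a linearized Bregman iteration: the dual variable v takes
    # the gradient steps, and the primal weights w are its soft-thresholded
    # image, so w is exactly sparse at every iteration.
    v = np.zeros(dim)
    w = np.zeros(dim)
    for _ in range(steps):
        v -= lr * grad_fn(w)
        w = soft_threshold(v, lam)
    return w

# Toy usage: recover a 5-sparse signal from random linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
w_true = np.zeros(100)
w_true[:5] = rng.standard_normal(5)
b = A @ w_true
grad = lambda w: A.T @ (A @ w - b) / len(b)
w_hat = linearized_bregman(grad, dim=100)
print("nonzero weights:", np.count_nonzero(w_hat))
```

Compare this with plain gradient descent from a zero initialization, which for the same underdetermined problem drifts toward the minimum-norm solution: small but nonzero values spread across all coordinates.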

The through-claim: the sparsity and the optimization are the same operation, not separate concerns. Standard training finds dense solutions and then prunes. Bregman iteration finds sparse solutions directly — the proximal operator enforces sparsity at every step, not as a post-processing stage. The weights are sparse because the optimization landscape is shaped to prefer sparse solutions, not because small weights were deleted after the fact.
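
In standard notation (the usual definitions, not quoted from the paper), the step that creates the zeros is the soft-thresholding proximal map, applied at every update of a dual variable v:

```latex
v^{k+1} = v^{k} - \eta\,\nabla L(w^{k}), \qquad
w^{k+1} = \operatorname{prox}_{\lambda\|\cdot\|_1}\!\bigl(v^{k+1}\bigr), \qquad
\operatorname{prox}_{\lambda\|\cdot\|_1}(v)_i = \operatorname{sign}(v_i)\,\max\bigl(|v_i| - \lambda,\ 0\bigr)
```

A weight is zero whenever its accumulated dual coordinate has not yet cleared the threshold: a property of the iterate itself, not a deletion applied afterwards.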

The momentum-corrected variant (AdaBreg) extends Adam with Bregman distance minimization, making it drop-in compatible with existing training pipelines. The convex optimization theory guarantees convergence to a sparse solution. The practical result: half the parameters, same accuracy, on standard neuromorphic benchmarks (Spiking Heidelberg Digits, Spiking Speech Commands).
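
A rough sketch of what one Adam-style Bregman step can look like, under the assumption that Adam's bias-corrected moments drive the dual variable and the weights are read out through the same soft threshold; the names and details below are illustrative, not the authors' implementation.

```python
import numpy as np

def adabreg_step(v, m, s, grad, t, lr=1e-3, lam=1e-2,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam-style Bregman step (illustrative sketch).
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    s = beta2 * s + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    s_hat = s / (1 - beta2 ** t)
    v = v - lr * m_hat / (np.sqrt(s_hat) + eps)   # Adam-style update on the dual variable
    w = np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)  # weights stay exactly sparse
    return w, v, m, s
```

Because the state and the update look like Adam's, swapping this in for Adam in an existing training loop is mostly bookkeeping: the dual variable v is the only extra state to carry.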

If spiking networks are supposed to be sparse, the optimizer should be too.

