The Noise Sensitivity Exponent
Some learning problems are statistically easy but computationally hard: enough data exists to recover the signal, but no polynomial-time algorithm can find it. The gap between what is information-theoretically possible and what is computationally tractable — the statistical-to-computational gap — appears in planted clique, sparse PCA, tensor completion, and many other high-dimensional problems. Each problem has been analyzed individually, with gap existence proved case by case.
A single number controls the gap across a broad class of models. The Noise Sensitivity Exponent (NSE) is determined entirely by the activation function, the nonlinear function that defines how features combine with inputs. For a given activation, the NSE measures how much correlation the function's output retains when its input is perturbed by noise. A high NSE means the output stays correlated with the clean signal (the function is robust to noise); a low NSE means the output decorrelates quickly (the function is sensitive).
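The quantity the NSE is built from can be probed directly by Monte Carlo. The sketch below is illustrative, not the source's definition: it estimates the correlation between an activation's output on a clean Gaussian input and on a correlated noisy copy, and compares two activations that behave very differently under the same perturbation (the choice of activations and the ρ values are assumptions for the demo).

```python
import numpy as np

rng = np.random.default_rng(0)

def correlation_under_noise(f, rho, n=200_000):
    """Estimate Corr[f(x), f(x')] where x, x' are standard Gaussians
    with correlation rho, i.e. x' = rho*x + sqrt(1 - rho^2)*z."""
    x = rng.standard_normal(n)
    z = rng.standard_normal(n)
    x_noisy = rho * x + np.sqrt(1 - rho**2) * z
    return np.corrcoef(f(x), f(x_noisy))[0, 1]

# Two example activations: ReLU retains most of its correlation under a
# small perturbation, while the third Hermite polynomial x^3 - 3x (whose
# correlation decays like rho^3) loses it much faster.
for name, f in [("relu", lambda x: np.maximum(x, 0)),
                ("hermite3", lambda x: x**3 - 3 * x)]:
    corrs = [correlation_under_noise(f, rho) for rho in (0.9, 0.99)]
    print(name, [round(c, 3) for c in corrs])
```

Under this correlation-decay view, an activation dominated by high-degree components decorrelates fast (sensitive), while one with strong low-degree components decorrelates slowly (robust).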
The NSE governs three distinct phenomena simultaneously. In single-index models with significant noise, the NSE determines whether a computational gap exists and how large it is — high NSE means no gap (the problem is computationally easy), low NSE means a gap (the problem is statistically solvable but computationally intractable). In multi-index models, the NSE controls the trade-off between specialization (learning individual features) and aggregation (learning their combination). In hierarchical models, the NSE determines the learning rate at each level of the hierarchy.
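The model classes named above can be made concrete. A minimal sketch of single-index and multi-index data generation, assuming standard Gaussian inputs and additive label noise (the dimensions, noise level, and function choices here are illustrative, not taken from the source):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 50  # ambient dimension

# Single-index model: labels depend on one hidden direction w,
# passed through an activation sigma, plus observation noise.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)

def single_index_sample(n, sigma, noise_std=0.5):
    X = rng.standard_normal((n, d))
    y = sigma(X @ w) + noise_std * rng.standard_normal(n)
    return X, y

# Multi-index model: labels depend on a k-dimensional projection W,
# combined by some link function g. Learning can either specialize
# (recover columns of W) or aggregate (fit g of the projection directly).
k = 3
W = np.linalg.qr(rng.standard_normal((d, k)))[0]  # orthonormal feature directions

def multi_index_sample(n, g, noise_std=0.5):
    X = rng.standard_normal((n, d))
    y = g(X @ W) + noise_std * rng.standard_normal(n)
    return X, y

X, y = single_index_sample(1000, np.tanh)
Xm, ym = multi_index_sample(1000, lambda Z: np.prod(Z, axis=1))
```

In this framing, the claim is that the activation (σ for single-index, and the per-feature nonlinearities in the multi-index and hierarchical cases) fixes the NSE, and with it the presence or absence of a computational gap.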
The activation function that defines the problem also defines its computational difficulty. The NSE is not a property of the algorithm or the data distribution but of the function class being learned. Change the activation, and the gap appears or disappears.
One number. Three phenomena. The difficulty was always in the function, not the data.