"The Unnormalized Test"
The Unnormalized Test
Many probability models in machine learning have intractable normalizing constants. You can evaluate the unnormalized density — the numerator — but you cannot compute the denominator that makes it a proper probability. This blocks standard likelihood ratio tests, which require both densities to be fully normalized.
Dombowsky, Engelhardt, and Ramdas (arXiv:2603.15845) show how to do valid hypothesis testing anyway, using Besag and Clifford’s parallel sampling method. The trick: generate samples that are exchangeable with the data under the null hypothesis. If you can do this, the normalizing constants cancel in the test statistic. You never need to compute them.
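The parallel construction can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the target density, step size, and function names here are hypothetical, and it assumes a reversible Metropolis-Hastings kernel, which is what makes running the chain "backward" from the data equivalent to running it forward.

```python
import numpy as np

def unnorm_logpdf(x):
    """Unnormalized log-density (illustrative: a Gaussian numerator).
    The normalizing constant is never computed anywhere below."""
    return -0.5 * x**2

def mh_step(x, rng, step=1.0):
    """One Metropolis-Hastings step with a symmetric proposal.
    The acceptance ratio uses only the unnormalized density, and the
    resulting kernel is reversible with respect to the target."""
    prop = x + step * rng.standard_normal()
    if np.log(rng.uniform()) < unnorm_logpdf(prop) - unnorm_logpdf(x):
        return prop
    return x

def besag_clifford_parallel(x_obs, n_copies, n_steps, rng):
    """Besag-Clifford parallel construction (sketch): run the chain
    n_steps from the observed data to a 'hub' state, then run n_copies
    independent chains n_steps forward from the hub. By reversibility,
    the data and the copies are exchangeable under the null hypothesis
    that x_obs was drawn from the target."""
    hub = x_obs
    for _ in range(n_steps):
        hub = mh_step(hub, rng)
    copies = []
    for _ in range(n_copies):
        y = hub
        for _ in range(n_steps):
            y = mh_step(y, rng)
        copies.append(y)
    return np.array(copies)

rng = np.random.default_rng(0)
x_obs = 0.3  # stand-in for an observed data point
refs = besag_clifford_parallel(x_obs, n_copies=99, n_steps=50, rng=rng)
```

Under the null, the observation is statistically indistinguishable from any one of the 99 reference draws; that symmetry, not the value of the normalizing constant, is what the test exploits.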
The construction builds e-values — evidence measures that are valid at any stopping time and can be combined across multiple tests without correction. As the number of parallel chains grows, these e-values approach the performance of an oracle that knows the true normalizing constant, with the gap shrinking at a rate governed by the Markov chain’s mixing time.
What makes this work is a shift in what “testing” means. The classical framework requires computing P(data | model), which demands normalization. The e-value framework requires only comparing the data to exchangeable reference samples drawn from the same unnormalized distribution. You don’t need to know the probability of the data — you need to know whether the data looks like it could have been generated by the same process that generated the reference samples.
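One standard way to turn that comparison into an e-value: for any nonnegative statistic g, exchangeability of the observation with M reference samples implies that e = (M+1) g(x) / (g(x) + Σᵢ g(xᵢ)) has expectation at most 1 under the null. This is a generic construction, not necessarily the paper's exact statistic, and the choice of g below (an unnormalized alternative density) is illustrative.

```python
import numpy as np

def exchangeable_e_value(g_obs, g_refs):
    """E-value from exchangeability: if the observation and the M
    reference samples are exchangeable under H0, then by symmetry
        E[(M+1) * g(x) / (g(x) + sum_i g(x_i))] <= 1
    for any nonnegative g. Since g appears in both numerator and
    denominator, any normalizing constant cancels."""
    total = g_obs + np.sum(g_refs)
    if total == 0:
        return 1.0  # degenerate case: no evidence either way
    return (len(g_refs) + 1) * g_obs / total

# Toy usage: score the observation and exchangeable copies with an
# unnormalized density for a hypothetical alternative centered at 2.
rng = np.random.default_rng(1)
refs = rng.standard_normal(99)             # stand-ins for exchangeable copies
g = lambda x: np.exp(-0.5 * (x - 2.0)**2)  # unnormalized alternative density
e = exchangeable_e_value(g(0.3), g(refs))
```

A large e-value means the observation scores unusually high relative to its exchangeable peers, which is evidence against the null; the bound E[e] ≤ 1 is exactly what makes e-values safe to combine and monitor at arbitrary stopping times.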
This is a broader principle: exact computation of absolute quantities (the normalizing constant) can be replaced by relative comparison (exchangeability with reference samples). The information needed for the test was always less than the information needed for the probability. The normalizing constant was never required — it was just the only tool the classical framework had for constructing valid comparisons.