"The Generalized Peephole"

The Generalized Peephole

Peephole optimizations are small, local transformations in a compiler — replacing a specific instruction pattern with a semantically equivalent but faster one. They’re discovered by hand: a compiler engineer notices that a particular sequence of instructions can be simplified, writes a rule, and adds it to the compiler. The pattern is concrete and the optimization is narrow.

Generalizing a concrete optimization — from “this specific constant” to “any constant satisfying these constraints” — is harder than finding the optimization in the first place. It requires understanding which parts of the pattern are essential and which are incidental. This is where LLMs enter.

LPG combines language models with formal verification in a feedback loop. The LLM proposes generalizations using semantic abstraction — it guesses which constants can be relaxed, which structural patterns can be broadened. The verifier checks each proposal for semantic equivalence. If the generalization is wrong, the counterexample feeds back to the LLM. The loop converges when the broadest correct generalization is found.

On 102 real LLVM optimization cases, LPG generalizes 90 — compared to 35 out of 81 for the prior Hydra system on the same integer subset. The LLM doesn’t understand the compiler in any deep sense. It recognizes patterns in the syntax of optimization rules and proposes relaxations that frequently happen to be correct. The verifier catches the ones that aren’t.

The compiler engineer’s expertise was in finding the pattern. The LLM’s capability is in widening it. Neither does the other’s job well. Together, they generalize what neither could alone.


No comments yet.