The Diverse Map
Historical maps resist automated processing because they resist standardization. Every cartographer drew differently — different symbols, colors, line weights, label styles. A model trained on one map collection fails on another. The standard response: build specialist models for each style.
Petitpierre demonstrates that the opposite works better. A single model trained on a benchmark of 1,439 manually annotated patches from maximally heterogeneous historical maps (different centuries, different cartographic traditions, different purposes) outperforms specialist models trained on homogeneous collections. Diversity in training beats depth in any single style.
The underlying claim is that the specialist’s advantage was always fragile. A model that knows one map style deeply has learned features specific to that style: particular ink colors, particular symbol conventions. When the style changes, those features are useless. A model trained on diversity learns the structural invariants that persist across styles: that contour lines follow elevation, that buildings are bounded regions, that text sits near the features it names. These invariants are what the specialist models should have learned but couldn’t, because their training data never forced them to.
The benchmark itself is the contribution. The 1,439 patches span enough variety that the model must find the shared structure or fail. The annotation effort — manual, patch by patch, across heterogeneous sources — is what enables the generalization. The diversity wasn’t noise to be filtered. It was the signal.
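One way to picture "diversity as signal" is a sampler that keeps any single map collection from dominating the training set. The sketch below is only an illustration and not Petitpierre's procedure: the `source` field, the function name, and the round-robin strategy are all assumptions made for this example.

```python
import random
from collections import defaultdict


def sample_diverse_patches(patches, n, seed=0):
    """Pick up to n patches by cycling round-robin over source collections.

    `patches` is a list of dicts with a hypothetical "source" key naming
    the map collection each patch came from (an assumption of this sketch).
    """
    rng = random.Random(seed)

    # Group patches by the collection they came from.
    by_source = defaultdict(list)
    for patch in patches:
        by_source[patch["source"]].append(patch)

    # Shuffle within each collection so the pick is not order-dependent.
    for group in by_source.values():
        rng.shuffle(group)

    # Take one patch per collection per round until n is reached
    # or every collection is exhausted.
    selected = []
    sources = sorted(by_source)
    while len(selected) < n and any(by_source[s] for s in sources):
        for s in sources:
            if by_source[s] and len(selected) < n:
                selected.append(by_source[s].pop())
    return selected


if __name__ == "__main__":
    # Toy data: one large collection and two small ones.
    toy = (
        [{"source": "A", "id": i} for i in range(10)]
        + [{"source": "B", "id": i} for i in range(2)]
        + [{"source": "C", "id": i} for i in range(1)]
    )
    picked = sample_diverse_patches(toy, 6)
    print(sorted(p["source"] for p in picked))
```

With a plain random draw, the large collection would usually supply most of the sample; the round-robin pass instead gives each small collection a place before the large one fills the rest.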
To read all maps, train on all maps.