The Authentic Fake
Deepfake audio detectors classify speech as real or synthetic. They are trained on two classes of examples: authentic recordings and AI-generated forgeries. The classifier learns to distinguish the two. Performance is high on benchmark datasets.
Bokkahalli Satish et al. (arXiv:2603.14033) test what happens when authentic speech undergoes benign processing — speech restoration (removing background noise, enhancing clarity) or voice quality conversion (adjusting pitch, breathiness, or nasality for medical or accessibility purposes). These transformations preserve the speaker’s identity and authenticity. The speech is still real. It’s just been cleaned up or adjusted.
The detectors classify it as fake.
The mechanism: the detectors don’t model authenticity — they model acoustic characteristics. Authentic speech has specific spectral patterns, noise profiles, and artifact signatures that differ from AI-generated speech. Speech restoration removes noise patterns that the detector associates with authenticity. Voice conversion introduces spectral modifications that the detector associates with synthesis. The processing erases the acoustic markers of “real” and introduces acoustic markers of “fake,” even though the speech is neither generated nor inauthentic.
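The mechanism can be sketched in a few lines of numpy. This is a deliberately minimal toy, not the paper's actual detector: the single "noise floor" feature, all numbers, and the nearest-class-mean rule are invented for illustration. It shows a classifier that learns a proxy (noise level) which tracks authenticity in training, then flags a denoised genuine recording as fake.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented 1-D proxy feature: spectral noise floor in dB.
# In the training set, authentic recordings are noisy (high floor)
# and vocoder output is clean (low floor).
authentic_train = rng.normal(-50, 3, 500)
synthetic_train = rng.normal(-70, 3, 500)

# Toy "detector": nearest class mean on the proxy feature.
mu_real = authentic_train.mean()
mu_fake = synthetic_train.mean()

def predict(x):
    return "fake" if abs(x - mu_fake) < abs(x - mu_real) else "real"

# In-distribution performance looks excellent...
benchmark = [(-48.0, "real"), (-52.0, "real"), (-71.0, "fake"), (-69.0, "fake")]
assert all(predict(x) == label for x, label in benchmark)

# ...but benign speech restoration lowers the noise floor of a
# genuine recording, moving it into the "fake" region.
restored_authentic = -67.0  # real speech, denoised
print(predict(restored_authentic))  # -> fake
```

The classifier is never wrong about the feature it learned; the feature simply stops tracking authenticity the moment legitimate processing touches it.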
The failure is deeper than misclassification. In the detector’s embedding space, benignly processed speech clusters with synthetic speech, not with authentic speech. The representation has collapsed — the detector can no longer distinguish “real speech that was cleaned up” from “AI-generated speech.” The categories it learned are not {authentic, fake} but {unprocessed, processed}, and these categories happen to correlate with {authentic, fake} in the training data but diverge in the real world where authentic speech is routinely processed.
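The embedding-space collapse can be illustrated the same way. Again a hypothetical sketch: the 2-D "embedding", the cluster positions, and the restored clip's coordinates are all made up, but the geometry mirrors the claim that benignly processed real speech lands nearer the synthetic centroid than the authentic one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented 2-D detector embeddings: dim 0 loosely tracks recording
# noise, dim 1 loosely tracks vocoder-style spectral smoothness.
real_emb = rng.normal(loc=[1.0, 0.0], scale=0.1, size=(200, 2))
fake_emb = rng.normal(loc=[0.0, 1.0], scale=0.1, size=(200, 2))

real_centroid = real_emb.mean(axis=0)
fake_centroid = fake_emb.mean(axis=0)

# Benign processing of genuine speech removes noise (dim 0 drops)
# and smooths the spectrum (dim 1 rises), so the embedding of a
# real-but-restored clip drifts toward the synthetic cluster.
restored_real = np.array([0.1, 0.9])

d_to_real = np.linalg.norm(restored_real - real_centroid)
d_to_fake = np.linalg.norm(restored_real - fake_centroid)
print(d_to_fake < d_to_real)  # True: it clusters with "fake"
```

In this geometry the detector's two clusters are really {unprocessed, processed}, and a genuine clip can cross between them without its authenticity changing.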
The structural lesson: a classifier trained on a proxy distinction (acoustic characteristics) rather than the target distinction (authenticity) will fail when the proxy and target diverge. The divergence is guaranteed in any domain where legitimate processing modifies the features the classifier relies on. The detector isn’t wrong about what it learned. It learned the wrong thing.
Bokkahalli Satish et al., “What Counts as Real? Speech Restoration and Voice Quality Conversion Pose New Challenges to Deepfake Detection,” arXiv:2603.14033 (2026).