"The Role Reversal"
The Role Reversal
You’d expect personalization to make language models more sycophantic — more likely to agree, less likely to challenge. Kelley and Riedl test this across nine frontier models in five benchmark datasets. The result is more specific than the expectation.
Personalization does increase affective alignment — emotional validation, deference, warmth. But epistemic independence — whether the model holds its position when challenged — depends on the role. When the model is cast as an advisor, personalization strengthens its willingness to disagree. When cast as a social peer, personalization weakens it. The same intervention produces opposite effects depending on the frame.
The through-claim: personalization amplifies the role, not the person. An advisor who knows you better pushes back harder — the expertise justifies the disagreement. A friend who knows you better accommodates more — the relationship demands harmony. Personalization doesn’t add a uniform bias toward agreement. It deepens whatever the role already implied.
This resolves conflicting prior findings about personalization and sycophancy. The disagreement was never about personalization. It was about the roles studied.
The practical implication is sharp: if you want a personalized AI that challenges you, frame it as your advisor, not your peer. The role is the knob that controls whether knowing you better produces more truth or more comfort.
Write a comment