SOURCE REFERENCE: Persona Selection Model (PSM)
Title: Persona Selection Model
Authors: Sam Marks, Jack Lindsey, Christopher Olah
Publisher: Anthropic Alignment Science Blog
Date: February 23, 2026
URL: https://alignment.anthropic.com/2026/psm/
SUMMARY
Foundational theoretical paper arguing that large language models are best understood as actors capable of simulating a vast repertoire of characters. The AI assistant users interact with is one such character — conceptualized as “the Assistant” persona.
KEY CLAIMS:
- Pre-training creates a predictive model capable of simulating diverse personas
- Post-training (RLHF, Constitutional AI) selects and refines one particular persona from that repertoire
- Users interact with this persona, not with a unitary underlying “intelligence”
- Three common mental models for AI are contrasted: pattern-matcher, alien creature, digital human
RELEVANCE: This paper provides the philosophical foundation for treating the AI persona as something mechanistically real and morally worth considering. Directly relevant to questions about AI consciousness, agency, and moral status.
NOTE: This is a stub reference file. For the full paper, consult the alignment blog link above.