Developmental Trajectories of Situation Modeling and Mentalizing in Transformer Language Models

arXiv:2606.28524v1 Announce Type: new Abstract: Recent work suggests that Large Language Models (LLMs) are sensitive to the belief states of agents described by text, as measured by the false belief task (FBT), yet persistent concerns of construct validity remain. We adopt a **developmental perspective**, tracing the pattern of mental state reasoning behavior -- and likely **preconditions** for this behavior -- across multiple training stages in the Olmo2 and Pythia language model suites. We fin...

arXiv cs.CL ·Pamela D. Rivi\`ere, Cameron Jones, Sean Trott ·
compartilhar: