What Helps Language Models Predict Human Beliefs: Demographics or Prior Stances?
Abstract
Beliefs shape how people reason, communicate, and behave. Rather than existing in isolation, they exhibit a rich correlational structure: some are connected through logical dependencies, others through indirect associations or social processes. As large language models (LLMs) become ubiquitous in society, their ability to understand and reason about human beliefs has implications ranging from privacy to personalized persuasion and the potential for stereotyping. Yet how LLMs capture this interrelated landscape of beliefs remains unclear. For instance, when predicting someone's beliefs, which information affects the prediction most: who they are (demographics), what else they believe (prior stances), or a combination of both? We address these questions using data from an online debate platform, evaluating the ability of off-the-shelf open-weight LLMs to predict individuals' stances under four conditions: no context, demographics only, prior beliefs only, and both combined. We find that both types of information improve predictions over the no-context (blind) baseline, with their combination yielding the best performance in most cases. However, the relative value of each varies substantially across belief domains. These findings reveal how current LLMs leverage different types of social information when reasoning about human beliefs, highlighting both their capabilities and limitations.