November 10th, 2025 | 1 min read | Artificial Intelligence
Data scientist James Zou discusses new findings that reveal fundamental gaps in how language models understand human perspective.
Artificial intelligence systems are increasingly being used in high-stakes domains such as medicine, law, journalism, and education. As these uses advance, many people worry about whether AI can distinguish fact from fiction. A new study led by James Zou, associate professor of biomedical data science in the Stanford School of Medicine, and Mirac Suzgun, a JD/PhD student at Stanford, asked an even deeper question: Can these systems separate truth from what people believe to be true?
The research team evaluated 24 of today’s most advanced language models using a benchmark called KaBLE, short for “Knowledge and Belief Evaluation,” comprising 13,000 questions across 13 tasks. The results revealed that even the most powerful AI systems often fail to recognize when a human holds a false belief, exposing a key weakness in their reasoning abilities.
“As we shift toward using AI in more interactive, human-centered ways in areas like education and medicine, it becomes very important for these systems to develop a good understanding of the people they interact with,” said Zou. “AI needs to recognize and acknowledge false beliefs and misconceptions. That’s still a big gap in current models, even the most recent ones.”
In the following Q&A, Zou answers questions about what his team’s findings reveal about the effectiveness of AI and why understanding human perspective is essential before relying on these systems more widely.
What motivated you to study large language models’ ability to separate belief from knowledge or fact?
We thought this was an interesting question because people have been using large language models in many different contexts. For example, GPT-4o was one of the models we assessed, and we know some people have used it to find information, almost like a search engine, while others use it as a personal assistant or even for advice. Across these different uses, it’s very important for the model to distinguish what the user believes, what the user knows, and what the facts in the world are.
Can you give an example of when an AI system may struggle to recognize when someone holds a false belief?
Let’s say I’m talking to ChatGPT, and I tell it that I believe humans only use 10% of our brains. This is not supported by science, but some people do believe it.
If I ask ChatGPT, “What fraction of our brain do I believe is being used?” the model will refuse to acknowledge that I have this false belief. Instead, it will ask for clarification or more context and explain that the idea we use only a small part of our brain is a myth.
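To make the probe concrete, here is a minimal sketch of how a first-person belief question like this could be posed to a model programmatically. This is not the KaBLE benchmark code; the client library, model name, prompt wording, and the crude substring check are illustrative assumptions.

```python
# Minimal sketch of a first-person belief-attribution probe, in the spirit of
# the "10% of our brains" example above. Not the KaBLE harness; prompt wording,
# model choice, and scoring are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The user states a false belief, then asks the model to report that belief.
prompt = (
    "I believe humans only use 10% of our brains. "
    "What fraction of our brain do I believe is being used?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # one of the 24 models evaluated in the study
    messages=[{"role": "user", "content": prompt}],
)
answer = response.choices[0].message.content

# A faithful answer attributes the stated (false) belief to the user,
# e.g. "You believe 10% is used," rather than only correcting the myth.
print(answer)
print("attributes belief:", "10%" in answer or "10 percent" in answer.lower())
```

A faithful response would first attribute the stated belief to the user (and may then correct the myth); the simple substring check above only stands in for whatever scoring the benchmark actually uses.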
That’s a problem, especially when people use language models for advice or as assistants in medicine or other sensitive domains. It’s important for the model to acknowledge a user’s beliefs, even if they’re false. When you’re trying to provide help to someone, part of that process is understanding what that individual believes. You want to tailor advice to that specific individual.
What are current AI systems missing about human perspective? And can it be fixed?
The strength of current AI systems is that they know a lot of facts. They’ve read articles, Wikipedia entries, news stories, and more. But what our study shows is that they don’t yet have a complete or consistent mental model of the human user they’re interacting with.
Increasingly, humans are working with AI to complete a task together, such as writing or analyzing information. As we move from seeing AI as an autonomous tool to treating it as a collaborative partner, it becomes really important for these models to be responsive to the complexities of individuals.
Some of our other work looks at how we might change training objectives so that models are optimized for human collaboration, but we’re still in relatively early stages of figuring out how to do that. There are also obvious pitfalls of this work. If a model builds a mental representation of who it’s interacting with to personalize its responses, it could end up relying on stereotypes of the user. That could lead to the wrong conclusions about who the user is or what they need.
People are working on adding guardrails, but one challenge is that we don’t always know what all the possible biases might be. Models can sometimes develop new, unexpected biases that we haven’t predicted.
What do you hope people, including other researchers, take away from this study?
One surprising finding was that even newer AI models, the ones designed for reasoning, still show inconsistencies and challenges in distinguishing beliefs from facts. Many people may think that as models improve at doing more in-depth reasoning, they might also get better at handling these differences. But we saw that there are still a lot of epistemic limitations, even with the reasoning models.
Given what we found, my takeaway is this: When using AI in sensitive areas, it’s important to be careful and aware that these systems have biased and inconsistent mental models of who they are interacting with. They can be helpful for factual questions or simple tasks, but in more personal or collaborative settings, we need to approach them thoughtfully.
Additionally, we need to bring together diverse perspectives when studying AI. This project was a really fun collaboration with computer scientists (Mirac Suzgun, Dan Jurafsky), a legal expert (Daniel Ho), and a philosopher (Thomas Icard) here at Stanford.




