Tech

AI overconfidence mirrors a human language disorder

LovabledanielsMay 15, 202563 Views

AI overconfidence mirrors human brain condition — The nature of the dynamics of signals in both the brains of people with aphasia and in large language models, or LLMs, proved strikingly similar when represented visually. Credit: 2025 Watanabe et al. CC-BY-ND

Agents, chatbots and other tools based on artificial intelligence (AI) are increasingly used in everyday life by many. So-called large language model (LLM)-based agents, such as ChatGPT and Llama, have become impressively fluent in the responses they form, but quite often provide convincing yet incorrect information.

Researchers at the University of Tokyo draw parallels between this issue and a human language disorder known as aphasia, where sufferers may speak fluently but make meaningless or hard-to-understand statements. This similarity could point toward better forms of diagnosis for aphasia, and even provide insight to AI engineers seeking to improve LLM-based agents.

This article was written by a human being, but the use of text-generating AI is on the rise in many areas. As more and more people come to use and rely on such things, there’s an ever-increasing need to make sure that these tools deliver correct and coherent responses and information to their users.

Many familiar tools, including ChatGPT and others, appear very fluent in whatever they deliver. But their responses cannot always be relied upon due to the amount of essentially made-up content they produce. If the user is not sufficiently knowledgeable about the subject area in question, they can easily fall foul of assuming this information is right, especially given the high degree of confidence ChatGPT and others show.

“You can’t fail to notice how some AI systems can appear articulate while still producing often significant errors,” said Professor Takamitsu Watanabe from the International Research Center for Neurointelligence (WPI-IRCN) at the University of Tokyo.

“But what struck my team and me was a similarity between this behavior and that of people with Wernicke’s aphasia, where such people speak fluently but don’t always make much sense. That prompted us to wonder if the internal mechanisms of these AI systems could be similar to those of the human brain affected by aphasia, and if so, what the implications might be.”

To explore this idea, the team used a method called energy landscape analysis, a technique originally developed by physicists seeking to visualize energy states in magnetic metal, but which was recently adapted for neuroscience. The paper is published in the journal Advanced Science.

They examined patterns in resting brain activity from people with different types of aphasia and compared them to internal data from several publicly available LLMs. And in their analysis, the team did discover some striking similarities.

The way digital information or signals are moved around and manipulated within these AI models closely matched the way some brain signals behaved in the brains of people with certain types of aphasia, including Wernicke’s aphasia.

“You can imagine the energy landscape as a surface with a ball on it. When there’s a curve, the ball may roll down and come to rest, but when the curves are shallow, the ball may roll around chaotically,” said Watanabe.

“In aphasia, the ball represents the person’s brain state. In LLMs, it represents the continuing signal pattern in the model based on its instructions and internal dataset.”

The research has several implications. For neuroscience, it offers a possible new way to classify and monitor conditions like aphasia based on internal brain activity rather than just external symptoms. For AI, it could lead to better diagnostic tools that help engineers improve the architecture of AI systems from the inside out. Though, despite the similarities the researchers discovered, they urge caution not to make too many assumptions.

“We’re not saying chatbots have brain damage,” said Watanabe.

“But they may be locked into a kind of rigid internal pattern that limits how flexibly they can draw on stored knowledge, just like in receptive aphasia. Whether future models can overcome this limitation remains to be seen, but understanding these internal parallels may be the first step toward smarter, more trustworthy AI too.”

More information:
Takamitsu Watanabe et al, Comparison of Large Language Model with Aphasia, Advanced Science (2025). DOI: 10.1002/advs.202414016

Provided by
University of Tokyo

Citation:
AI overconfidence mirrors a human language disorder (2025, May 15)
retrieved 15 May 2025
from

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.