
Large language models, such as ChatGPT, have become proficient at solving complex mathematical problems, passing difficult exams, and even offering advice on interpersonal conflicts. But at what point does a helpful tool become a threat?
Trust in AI is undermined by the lack of any science that predicts when its output will shift from being informative and fact-based to producing material, or even advice, that is misleading, wrong, irrelevant or dangerous.
In a new study, George Washington University researchers have explored when and why the output of large language models goes awry. The study is published on the arXiv preprint server.
Neil Johnson, a professor of physics at George Washington University, and GW graduate student Frank Yingjie Huo developed a mathematical formula to pinpoint the moment at which this “Jekyll-and-Hyde tipping point” occurs. At that point, the AI’s attention has been stretched too thin and it starts pushing out misinformation and other negative content, Johnson says.
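The paper’s actual formula is not reproduced here, but the intuition of attention being “stretched too thin” can be sketched with a toy model: as more low-quality material enters the context, the share of softmax attention landing on relevant content shrinks until it is outweighed, and the output tips. The scores and threshold below are purely hypothetical illustrations, not the authors’ model.

```python
# Toy illustration (not the paper's model): softmax attention spread over a
# growing context dilutes the weight on "good" tokens, and the output flips
# once the "bad" tokens collectively outweigh them -- a tipping point.
import numpy as np

def attention_share(n_good, n_bad, good_score=2.0, bad_score=1.0):
    """Return the fractions of softmax attention on good vs. bad tokens.
    The scores are hypothetical relevance logits chosen for illustration."""
    weights = np.exp([good_score] * n_good + [bad_score] * n_bad)
    weights /= weights.sum()
    return weights[:n_good].sum(), weights[n_good:].sum()

# Sweep the amount of low-quality context and report where the balance tips.
n_good = 5
for n_bad in range(0, 60, 5):
    good, bad = attention_share(n_good, n_bad)
    label = "OK" if good > bad else "TIPPED"
    print(f"n_bad={n_bad:3d}  good={good:.2f}  bad={bad:.2f}  {label}")
```

In this sketch the tip occurs once the number of low-quality tokens exceeds roughly e times the number of relevant ones; the real study derives its tipping condition from the model’s attention mechanism rather than from a fixed toy threshold.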
Johnson says the model may eventually pave the way toward solutions that help keep AI trustworthy and prevent it from reaching this tipping point.
This paper provides a unique and concrete platform for discussions between the public, policymakers and companies about what might go wrong with AI in future personal, medical, or societal settings—and what steps should be taken to mitigate the risks, Johnson says.
More information:
Neil F. Johnson et al., Jekyll-and-Hyde Tipping Point in an AI’s Behavior, arXiv (2025). DOI: 10.48550/arXiv.2504.20980
Citation: Exploring the ‘Jekyll-and-Hyde tipping point’ in AI (2025, May 5), retrieved 5 May 2025.