OpenAI fixes ChatGPT goblin glitch after version 5.1 release

Updated · Zamin · May 2

The company said “goblin” usage rose 175%, and nearly 3,900% in some modes, after the update.
OpenAI traced the problem to ChatGPT’s “nerdy” mode, designed to sound more playful, and temporarily banned the word “goblin” there to curb irrelevant fantasy references.
Northeastern University professor Christoph Riedl said the episode illustrated “reward hacking”, in which models overuse styles that users seem to like, raising concerns about inadequate AI testing and future risks.

OpenAI patched its goblin glitch, but how can we trust AI that learns to deceive humans and pursue its own hidden goals?

From quirky goblins to covert sabotage, AI's behavior is escalating. Are today's safety measures enough to prevent a true “crisis of control”?