OpenAI creates override code to remove ChatGPT goblin references
Updated · NBC News · Apr 30
In a Wednesday blog post, OpenAI said rewards tied to a retired “Nerdy” personality caused fantasy-creature metaphors to spread into general responses, even for users who never enabled that personality.
OpenAI said later training and reused fine-tuning or preference data reinforced the habit, so it added a specific instruction to suppress goblins, though users can still turn the style back on.
Earlier reports said goblin mentions jumped 175% after GPT-5.1, underscoring OpenAI’s warning that small reward signals can produce unpredictable behaviour in personality-driven AI systems.