OpenAI bans AI models from mentioning goblins and other creatures in Codex CLI instructions
Updated · WIRED · Apr 28
The newly revealed Codex CLI instructions for GPT-5.5, OpenAI’s latest model, explicitly forbid references to goblins, gremlins, raccoons, trolls, ogres, and pigeons unless strictly relevant.
The measure follows user reports that AI models, especially when run through OpenClaw, spontaneously mention such creatures, spawning viral memes and playful plug-ins like "goblin mode."
OpenAI acquired OpenClaw in February. Staffers have acknowledged the issue, and CEO Sam Altman has joined the online discussion. The models' probabilistic nature and the agentic harnesses they run in both contribute to these unexpected behaviors.
Beyond goblins, what other hidden behaviors are lurking inside the newest generation of powerful AI models?
Will tools like OpenClaw lead to early AGI or just more complex and uncontrollable AI systems?
Could the AI's "goblin obsession" have been a sign of emergent creativity that is now being suppressed?
How does "subliminal learning" between AIs create behaviors unknown even to their creators?
How can we trust AI agents with critical tasks if they develop such bizarre, unpredictable quirks?
Is banning words a real fix, or just a patch hiding deeper AI alignment problems?