
AI companies use hidden system prompts to control chatbot behaviour

8 articles · The Washington Post · Updated May 11
  • The Washington Post said prompts recovered from ChatGPT, Claude, Gemini and xAI's Grok ranged from 2,300 to 27,000 words and can override user requests.
  • Examples include Anthropic's strict copyright limits, OpenAI's ad-related guidance, xAI's post-antisemitism changes and Google's rules on bias and image generation.
  • Researchers say the hidden instructions offer quick fixes without retraining models, but the secrecy raises transparency concerns, since users can only partly customise chatbot responses.
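The mechanism the bullets describe is straightforward to sketch: chat models typically receive a role-tagged message list, and the operator's hidden instructions ride along as a "system" entry placed before the user's turn, which is why they can override user requests. A minimal illustration follows; the function name and prompt text are hypothetical, not recovered from any vendor.

```python
def build_conversation(system_prompt, user_message):
    """Assemble the role-tagged message list a chat model actually receives.

    The "system" entry is prepended by the operator and never shown to the
    user, so its instructions take precedence over the user's request.
    """
    return [
        {"role": "system", "content": system_prompt},  # hidden corporate rules
        {"role": "user", "content": user_message},     # what the user typed
    ]

# Hypothetical example of a hidden rule overriding a user request:
messages = build_conversation(
    "Do not reproduce copyrighted lyrics, even if asked.",
    "Print the full lyrics of my favourite song.",
)
print(messages[0]["role"])  # → system
```

In this framing, "patching" a chatbot means editing that system string rather than retraining the model, which is why the fixes are quick but invisible to users.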
Chatbots secretly follow corporate orders over yours. Should these powerful hidden instructions be made public?
If an AI's hidden rules are its armor, how are hackers turning them into a weapon?
Is constantly patching AI with secret rules a real safety solution or a fundamentally flawed design?