Updated · Futura · Jun 25
GPT-4.5 Fools 73% in Turing Test as Human-Mimic Prompts Lift Hit Rate From 36%

Summary

  • UC San Diego researchers found GPT-4.5 was judged human in 73% of text conversations when given prompts to mimic slang, typos and emotional variation.
  • That rate was up from 36% under GPT-4.5's default settings, suggesting that prompt design, not just raw model capability, drove the gain in conversational imitation.
  • LLaMa-3.1-405B reached 56% with the same human-imitation setup, while GPT-4o scored 21% and even trailed ELIZA, which fooled participants 23% of the time.
  • 307 participants across student and online groups completed eight rounds each in a two-phase imitation game, with judges trying to distinguish humans from machines.
  • The study said the finding reflects human perception rather than machine sentience, and argued that the Turing test now depends heavily on the behavioral cues users associate with AI.

Insights

If AI can perfectly fake human flaws, what does it truly mean to be human in a digital world?
As AI masters deception, how can we prevent the collapse of online trust and our shared reality?
Will AI companions offering 'friction-free' intimacy ultimately destroy our capacity for real human connection?

GPT-4.5 Surpasses Humans in Turing Test with 73% Success Rate: Redefining Intelligence and Raising Societal Risks

Overview

In 2025, researchers at UC San Diego reported that OpenAI’s GPT-4.5 had become the first AI to pass the original Turing Test. In the three-party test, each participant held a conversation with both a human and the AI, then tried to identify which was which. With a human-imitation prompt, GPT-4.5 was judged to be human in 73% of cases, more often than the real people it was compared against, providing strong evidence of human-like conversational ability. The result marks a milestone in AI development and has renewed debate about what it means for a machine to imitate, or to understand, human intelligence.

...