GPT-4.5 and LLaMa-3.1 pass the Turing test:
A milestone for AI and companies
Conversations with AI can no longer be distinguished from those with human dialogue partners
A significant milestone in artificial intelligence was reached in March 2025: GPT-4.5 and LLaMa-3.1 were the first systems to successfully pass the classic three-party Turing test. In this test format, human participants simultaneously conducted five-minute text conversations with another human and an AI system without knowing who was who. Remarkably, GPT-4.5 was identified as the human dialogue partner in 73% of cases - more often than the actual human participants.
The Turing test, named after the British mathematician Alan Turing, was proposed in 1950 as a measure of machine intelligence. In its original form, the test involves a human examiner communicating via a text-based interface with two dialogue partners - a human and a machine - without knowing who is who. If the examiner cannot reliably distinguish the machine from the human, the machine is considered intelligent.
The study by Cameron R. Jones and Benjamin K. Bergen of the University of California, San Diego, evaluated four systems: ELIZA, GPT-4o, LLaMa-3.1-405B and GPT-4.5. Participants had simultaneous five-minute conversations with another human and one of the AI systems.
Notably, when instructed to adopt a human-like persona, GPT-4.5 was judged to be the human 73% of the time - significantly more often than the actual human participants.
Under the same conditions, LLaMa-3.1 was judged human 56% of the time - not statistically distinguishable from the human comparison group, which also counts as passing.
The baseline systems ELIZA and GPT-4o, by contrast, achieved win rates well below chance (23% and 21%, respectively).
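Whether a rate like 56% really differs from the 50% chance level depends on the sample size. As a minimal sketch (using a hypothetical sample of 100 trials, not the study's actual per-condition counts), an exact binomial tail probability shows why 73% is clearly above chance while 56% need not be:

```python
from math import comb

def binom_tail_p(k: int, n: int, p: float = 0.5) -> float:
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, p),
    i.e. the chance of seeing at least k 'judged human' verdicts
    if judges were guessing at random."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical n = 100 trials, chosen only to illustrate the calculation.
n = 100
p_gpt45 = binom_tail_p(73, n)  # 73% win rate: far below any usual threshold
p_llama = binom_tail_p(56, n)  # 56% win rate: consistent with chance
```

With these numbers, the 73% result yields a vanishingly small p-value, while 56% out of 100 would not be significant at the usual 5% level - matching the study's finding that LLaMa-3.1's rate was statistically indistinguishable from the human baseline.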
An interesting caveat, however, is that the language models had to be prompted to slip into a particular role. Without this prompt, the AI remained recognisable as such. This raises a question that the study's lead author, Cameron Jones, poses himself in a thread on X: "Did the LLMs really pass if they needed a prompt?" However, he goes on to say: "Without a prompt, LLMs would fail for trivial reasons (e.g. admitting they are AI) ... so I think it's fair to say the LLMs pass."
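In practice, such a role instruction is typically supplied as a system message that precedes the conversation. The following sketch shows that structure in the common chat-message format; the persona text is a hypothetical paraphrase, not the study's actual prompt:

```python
# Hypothetical persona instruction (illustrative paraphrase only).
PERSONA = (
    "Adopt the persona of a young, slightly introverted person who uses "
    "casual language, short sentences, and the occasional typo."
)

def build_messages(history: list[tuple[str, str]]) -> list[dict]:
    """Prepend the persona as a system message to the chat history."""
    messages = [{"role": "system", "content": PERSONA}]
    messages += [{"role": role, "content": text} for role, text in history]
    return messages

msgs = build_messages([("user", "hey, what do you do for fun?")])
```

Without that leading system message, the model's default behaviour (such as openly admitting it is an AI) gives it away - which is exactly the "trivial failure" Jones describes.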
Ultimately, this raises the key question: can we really speak of intelligence in these language models, or have the models simply become perfect at simulating human behaviour? And what implications does this have for their practical use?
The successful completion of the Turing test has far-reaching implications for the use of AI in companies. If the models can now display human-like behaviour - at least in time-limited interactions - they could be used even more intensively than today in numerous areas, such as:
- Natural customer interactions: AI agents can now communicate with customers in a more convincing and human-like way, which can increase customer satisfaction.
- Increased efficiency: Routine tasks such as scheduling appointments or handling simple enquiries can be taken over by AI systems, reducing the workload of employees.
- Scalability: Companies can offer their services around the clock without additional human resources.
- Personalisation: AI systems can respond individually to customer needs by understanding contexts and reacting accordingly.
Companies will follow this development - if only because of the competitive pressure to act that they feel from their rivals.
As the example of Shopify's CEO shows, companies are already making clear trade-offs: AI or new hires? Tobias Lütke's message is unambiguous:
- The efficient use of AI is a fundamental expectation for all Shopify employees.
- Teams requesting additional staff must first demonstrate that the tasks cannot be performed by AI.
"This study marks a turning point that, in my opinion, is not currently receiving the attention it deserves.
But perhaps it will convince some notorious sceptics to take a closer look at the new opportunities and challenges of AI.
I would be happy to talk to you personally."
Despite the impressive progress, there are also challenges that need to be addressed.
- Transparency: It is becoming increasingly difficult to distinguish between human and AI-generated content, which raises questions about authenticity.
- Responsibility: Who is responsible if AI systems make wrong decisions or are misused?
- Data protection: The processing of large amounts of data by AI systems must comply with data protection regulations.
The European Union's AI Act has already laid important foundations. Its concrete design and implementation have only just begun, however, and will still give us some tough nuts to crack.
The successful completion of the Turing test marks a turning point in the development of artificial intelligence. New opportunities are opening up for companies to utilise AI effectively and responsibly. It is now up to companies to utilise these technologies for the benefit of their customers and employees, while always keeping ethical and legal aspects in mind.