ChatGPT Surprises Scientists By Solving Plato’s Ancient Math Test In A New Way


When two researchers at the University of Cambridge challenged ChatGPT with a classic puzzle from ancient Greece, they found that the model sometimes behaved less like a search engine and more like a learner. It took time testing approaches, reconsidered when prompted, and even resisted wrong suggestions.

The study suggests that artificial intelligence may do more than retrieve memorized answers. In certain settings, it can appear to work through problems in a way that resembles student reasoning.

This finding does not mean ChatGPT “thinks” like a human. The authors emphasize their study is exploratory and based on a single conversation. Still, the results raise questions about how AI might support education if guided well.

How Researchers Gave ChatGPT Plato’s Famous Math Test

Nadav Marco, who’s now at Hebrew University, and Andreas Stylianides revisited Plato’s dialogue “Meno.” In that text, Socrates shows an uneducated slave boy how to double the area of a square through guided questions. Socrates used this exchange to argue that knowledge already exists in the mind and can be drawn out through teaching.

The researchers posed the same 2,400-year-old puzzle to ChatGPT-4. Instead of repeating the well-known geometric solution from Plato’s dialogue, ChatGPT used algebra, which wasn’t invented until centuries later.

What made this notable is that the AI later showed it did know the geometric method. If it were simply recalling from training data, the obvious move would have been to cite Plato’s approach immediately. Instead, it appeared to construct a different solution pathway.

The researchers also tried to mislead ChatGPT into the same mistake the slave boy makes in Plato’s dialogue: assuming that doubling the sides of a square would double its area. ChatGPT refused to accept this wrong answer, carefully explaining that doubling the sides actually quadruples the area rather than doubling it.
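The arithmetic behind the boy’s error, and the fix, is easy to verify. A minimal Python sketch (the function name and the side length are illustrative choices, not from the study):

```python
import math

def square_area(side: float) -> float:
    """Area of a square with the given side length."""
    return side ** 2

side = 3.0

# Doubling the side quadruples the area: (2s)^2 = 4s^2, not 2s^2.
assert square_area(2 * side) == 4 * square_area(side)

# To double the area, the side must instead grow by a factor of sqrt(2),
# since (s * sqrt(2))^2 = 2s^2. This is the algebraic route ChatGPT took.
doubled_side = side * math.sqrt(2)
assert math.isclose(square_area(doubled_side), 2 * square_area(side))
```

The geometric construction in the Meno reaches the same √2 scaling without algebra: the diagonal of the original square becomes the side of the doubled one.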

When ChatGPT Faced Variations on the Problem

The researchers then changed the puzzle, asking ChatGPT how to double the area of a rectangle. Here, the model showed surprising awareness of the problem’s limitations. Rather than incorrectly applying the square’s diagonal method, ChatGPT explained that “the diagonal does not offer a straightforward new dimension” for rectangles.
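ChatGPT’s caution here has a simple justification. By the Pythagorean theorem, a square built on the diagonal of an a × b rectangle has area a² + b², which equals double the rectangle’s area 2ab only when a = b (because a² + b² − 2ab = (a − b)² ≥ 0). A quick Python check of both cases (the dimensions and function name are arbitrary illustrations):

```python
def diagonal_square_area(a: float, b: float) -> float:
    """Area of a square built on the diagonal of an a-by-b rectangle.

    The diagonal has length sqrt(a^2 + b^2), so the square on it
    has area a^2 + b^2.
    """
    return a ** 2 + b ** 2

# For an actual square (a == b), the construction doubles the area exactly.
assert diagonal_square_area(4, 4) == 2 * (4 * 4)

# For a non-square rectangle it overshoots, since (a - b)^2 > 0 when a != b.
assert diagonal_square_area(2, 8) > 2 * (2 * 8)
```

This is the sense in which the square’s diagonal method fails to transfer to rectangles, as the model noted.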

This response demonstrated something resembling mathematical reasoning. The AI seemed to understand that techniques working for one shape don’t automatically apply to others—a distinction that often challenges human students learning geometry.

When prompted for more practical solutions, ChatGPT initially focused on algebraic approaches, similar to its first response about squares. But the AI’s explanations of how it was reasoning were inconsistent. At times it described generating answers in real time; at other points it implied the responses were not spontaneous.

The authors noted that these reflections may not accurately represent how the system works. They cautioned against taking the AI’s own words at face value, since language models are not reliable guides to their inner processes.

The “Chat’s ZPD”: Where AI Learns with Guidance

Drawing on psychologist Lev Vygotsky, the researchers described a “Chat’s Zone of Proximal Development.” These are problems ChatGPT could not solve independently but managed when guided with timely prompts.

Vygotsky’s original concept describes the gap between what a child can do alone versus what they can accomplish with help from a teacher or more skilled peer. The researchers found a similar pattern with ChatGPT: certain problems remained out of reach until the right kind of guidance appeared.

Some answers looked like retrieval from training data. Others, especially those involving resistance to incorrect suggestions or adaptation to new prompts, resembled the problem-solving steps of students. While this does not prove that the model truly “understands,” it does suggest that, under the right conditions, AI output can mirror aspects of human learning.

When the researchers asked for an “elegant and exact” solution to the original square problem, ChatGPT provided the geometric construction method. The AI itself admitted that “there [was] indeed a more straightforward and mathematically precise approach … which [it] should have emphasised directly in response to [our] initial inquiry.”

This self-correction suggested the model could reflect on and improve its responses when given appropriate prompts, much like a student who realizes they took a harder path than necessary.

What This Means for Students and Teachers

If AI tools can sometimes behave like learners, they could become useful educational partners. Instead of treating ChatGPT as an answer machine, students and teachers might experiment with prompts that invite collaboration and exploration.

The type of prompt matters significantly. The researchers found that asking for exploration and collaboration yielded different responses than requesting summaries based on reliable sources. Knowing how to phrase prompts could shape whether the model retrieves or attempts to generate new approaches.

Teachers could use this approach to model problem-solving strategies. Rather than asking AI for the final answer, they might guide it through the same thinking process they want students to follow. This could help students see that even sophisticated systems sometimes struggle, reconsider approaches, and need guidance to reach better solutions.

Students, meanwhile, could practice their own reasoning by working alongside AI that shows its thinking process. When ChatGPT resists incorrect suggestions or explains why certain approaches won’t work, students get opportunities to understand mathematical reasoning rather than just memorize procedures.

The authors stress that their study, published in the International Journal of Mathematical Education in Science and Technology, involved only one conversation with one model (ChatGPT-4 in February 2024). Results may differ with newer versions or different systems. Still, the findings invite educators to consider how AI might support exploration, not just provide ready-made answers.

As the researchers put it, users should “pay attention to the type of knowledge they wish to get from an LLM and try to communicate it clearly in their prompts.” Guidance can help AI attempt solutions it would not manage on its own.

Building Mathematical Understanding Through AI Collaboration

The study reveals potential for AI to serve as more than an information source. When ChatGPT resisted incorrect suggestions and explained its reasoning, it demonstrated behaviors that could help students develop critical thinking skills.

Rather than simply accepting or rejecting AI outputs, students could learn to evaluate mathematical reasoning, whether from artificial systems or human sources. This skill becomes increasingly valuable as AI tools become more prevalent in academic and professional settings.

The researchers’ approach also highlights how questioning techniques can reveal different aspects of AI behavior. By varying their prompts and challenging the system’s responses, they uncovered evidence of both retrieval and generation processes within the same conversation.

A Tentative Step, Not a Final Word

The study opens questions about how we understand machine intelligence. If AI can engage in something resembling reasoning, complete with self-correction and resistance to errors, the line between retrieval and generation becomes blurred. This doesn’t mean AI has achieved consciousness, but it suggests these systems might be more sophisticated thinking partners than previously imagined.

For teachers and students, the lesson is not that machines replace human reasoning, but that they could help learners explore strategies, confront mistakes, and practice persistence in problem-solving. The key lies in knowing how to prompt and guide these systems effectively.

Source: https://studyfinds.org/chatgpt-solves-platos-ancient-math-test/
