Apple Research Reveals AI Still Lacks True Humanlike Reasoning, Far from AGI

June 9, 2025 – Despite major leaps in artificial intelligence, current large language models (LLMs) still fall short of achieving true artificial general intelligence (AGI), according to a new study from Apple researchers.
In a thought-provoking paper titled “The Illusion of Thinking”, published in June, Apple’s AI research team argues that even the most advanced AI models still struggle with general reasoning, one of the core requirements for AGI. The finding comes amid growing industry claims that AGI is just around the corner.
Evaluating the Illusion of AI “Thinking”
Apple researchers examined the reasoning capabilities of leading AI models, including OpenAI’s o3-mini and o1, Anthropic’s Claude Sonnet, and DeepSeek’s R1 and V3. These models were put through a series of puzzle-based challenges across four environments (Tower of Hanoi, Checker Jumping, River Crossing, and Blocks World), going beyond conventional benchmarks like math and coding accuracy.
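The puzzle setting matters because each environment can be simulated, so every intermediate move a model proposes can be checked, not just its final answer. As a rough illustration (not Apple’s actual evaluation harness), here is how a Tower of Hanoi solution could be verified step by step; the function name and move format are ours:

```python
# Illustrative sketch, not Apple's code: a Tower of Hanoi simulator that
# validates every move in a proposed solution, not only the end state.

def verify_hanoi(n_disks: int, moves: list[tuple[int, int]]) -> bool:
    """Return True if `moves` legally transfers all disks from peg 0 to peg 2."""
    pegs = [list(range(n_disks, 0, -1)), [], []]  # peg 0 holds disks n..1, largest at bottom
    for src, dst in moves:
        if not pegs[src]:
            return False                      # illegal: moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                      # illegal: larger disk onto smaller
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n_disks, 0, -1))  # all disks on the goal peg

# A model's answer, parsed into (source_peg, target_peg) pairs, can be checked:
solution = [(0, 2), (0, 1), (2, 1), (0, 2), (1, 0), (1, 2), (0, 2)]  # optimal for 3 disks
print(verify_hanoi(3, solution))  # True
```

Because the verifier is exact, a single illegal or wasted move anywhere in the chain is detectable, which is what lets researchers inspect the reasoning process rather than only grading outcomes.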
While most evaluations focus on whether a model produces the right answer, Apple’s study focused on how the models arrived at their conclusions. The findings were concerning:
“Frontier large reasoning models (LRMs) experience complete accuracy collapse as task complexity increases. They fail to generalize and often overthink, leading to incorrect answers even after arriving at the correct one early in the process,” the researchers reported.
Verification of final answers and intermediate reasoning traces (top chart), and charts showing non-thinking models are more accurate at low complexity (bottom charts). Source: Apple Machine Learning Research
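One reason these puzzles expose the collapse so cleanly is that difficulty is a single dial. In Tower of Hanoi, the standard result is that the minimum solution requires 2^n − 1 moves, so each added disk roughly doubles the length of the exact reasoning chain the model must sustain while the rules stay fixed. A quick calculation (our illustration, not the paper’s code) shows how fast that grows:

```python
# Why "task complexity" is a clean experimental dial: in Tower of Hanoi the
# minimum number of moves is 2^n - 1, so each extra disk doubles the exact
# reasoning chain required, while the puzzle's rules never change.
for n in range(3, 11):
    print(f"{n} disks -> minimum {2**n - 1} moves")
```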
AI Overthinks But Doesn’t Truly Understand
Interestingly, the study observed that the models often generated correct answers early in their reasoning but then abandoned them through overthinking or inconsistent logic chains. This reveals a crucial flaw: while LLMs can simulate reasoning, they do not internalize or generalize reasoning patterns the way humans do.
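To make the “overthinking” failure mode concrete, here is a minimal sketch of the kind of trace analysis this implies: compare where a correct candidate answer first appears in a reasoning trace with what the model finally commits to. The function and the pre-parsed candidate list are hypothetical illustrations, not the paper’s method:

```python
# Illustrative sketch, not the paper's code: quantify "overthinking" by
# comparing where a correct candidate first appears in a reasoning trace
# with the answer the model finally commits to. `candidates` is assumed
# to be a pre-parsed list of (trace_position, answer) pairs; extracting
# them from raw text is a separate (hypothetical) step.

def overthinking_report(candidates: list[tuple[float, str]], correct: str) -> dict:
    first_correct = next((pos for pos, ans in candidates if ans == correct), None)
    final_answer = candidates[-1][1] if candidates else None
    return {
        "found_correct_early": first_correct is not None,
        "first_correct_at": first_correct,   # e.g. 0.3 = 30% into the trace
        "kept_it": final_answer == correct,  # False: the model reasoned past it
    }

# Example: the model hits the right answer 30% into its trace, then drops it.
trace_candidates = [(0.3, "7 moves"), (0.7, "9 moves"), (1.0, "9 moves")]
print(overthinking_report(trace_candidates, correct="7 moves"))
# {'found_correct_early': True, 'first_correct_at': 0.3, 'kept_it': False}
```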
“These LRMs are mimicking intelligent thought, not demonstrating it,” the study noted. “They don’t use explicit algorithms and show inconsistent logic—falling far short of what AGI-level reasoning would require.”
AGI Still a Distant Goal Despite Industry Optimism
The research stands in stark contrast to recent statements from leading AI figures. In January, OpenAI CEO Sam Altman claimed the company was closer than ever to building AGI, stating they were confident in knowing “how to build AGI as traditionally understood.”
Meanwhile, Anthropic CEO Dario Amodei predicted in late 2024 that AGI might surpass human intelligence by 2026 or 2027, citing rapid progress in AI capabilities.
However, Apple’s findings suggest otherwise. Rather than nearing human-level reasoning, today’s AI may be hitting foundational roadblocks that need entirely new paradigms to overcome.
Illustration of the four puzzle environments. Source: Apple
Conclusion: The AGI Dream Needs a Reality Check
Apple’s paper adds a critical layer of skepticism to the AGI conversation. While current AI models show remarkable performance in pattern recognition and natural language generation, they still lack the deep reasoning skills crucial for human-level general intelligence.
As the race to AGI accelerates, this study serves as a reminder that simulating intelligence is not the same as possessing it. More transparency, research, and honest evaluation of AI capabilities are needed before the industry can truly claim to have achieved AGI.