Apple Challenges Reasoning Abilities of AI Models Ahead of WWDC 2025
Apple Inc. (NASDAQ: AAPL) has cast doubt on the reasoning abilities of today’s leading AI models in a new research paper titled “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity.” The study evaluated large reasoning models (LRMs) such as OpenAI’s o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking, revealing significant performance declines as task complexity increased.
Using controlled algorithmic puzzle environments, Apple researchers demonstrated that these state-of-the-art models consistently failed to solve complex problems and lacked scalable reasoning capabilities. They noted that beyond a certain threshold of difficulty, the models’ accuracy dropped to zero, exposing critical limitations in general problem-solving and adaptability.
The paper also criticized current AI evaluation benchmarks, suggesting they overestimate the true capabilities of modern LRMs. Apple instead proposed more rigorous testing environments to better assess how models handle abstract, non-standard tasks. Researchers concluded that despite their size, these models exhibit fundamental inefficiencies and cannot yet emulate the flexible reasoning seen in human cognition.
This research adds to growing skepticism about the proximity of artificial general intelligence (AGI), a hypothetical form of AI capable of human-like understanding and reasoning. Current large language models primarily rely on pattern recognition and predictive algorithms, making them prone to logical errors and inconsistent reasoning.
The paper’s release comes just ahead of Apple’s Worldwide Developers Conference (WWDC) 2025, where anticipation remains subdued amid criticism that the company has lagged rivals in AI development. Despite a partnership with OpenAI, Apple’s much-hyped “Apple Intelligence” features have faced delays, raising questions about its readiness to compete in the AI race.
This study underscores Apple’s critical view of the industry’s AGI ambitions while signaling a renewed focus on foundational AI research.