Based on the search results, here is a comparison of xAI’s Grok 4 and OpenAI’s GPT-5 across multiple benchmarks and capabilities.
Current Status and Claims
The two models launched within weeks of each other: Grok 4 in July 2025 and GPT-5 in August 2025. When GPT-5 was unveiled, Elon Musk challenged it directly, claiming that “Grok 4 Heavy was smarter 2 weeks ago than GPT5 is now.”
Academic Performance Benchmarks
Humanity’s Last Exam (HLE)
This comprehensive 2,500-question benchmark spans over 100 disciplines from humanities to quantum chemistry:
- Grok 4: 25.4% without tools, 38.6% with tools
- Grok 4 Heavy: 44.4% with tools
- GPT-5: HLE scores are not given in the search results; for reference, OpenAI’s earlier o3 model achieved 21% without tools and 24.9% with tools
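To make the percentages above more concrete, the short sketch below converts a few of them into approximate question counts. It assumes each score was computed over the full 2,500-question set, which the search results do not confirm; the names `hle_scores` and `approx_correct` are illustrative only.

```python
# Benchmark size as quoted above.
HLE_TOTAL_QUESTIONS = 2500

# A subset of the scores (in percent) quoted in the list above.
hle_scores = {
    "Grok 4 (no tools)": 25.4,
    "Grok 4 (with tools)": 38.6,
    "Grok 4 Heavy (with tools)": 44.4,
    "o3 (no tools)": 21.0,
}


def approx_correct(percent, total=HLE_TOTAL_QUESTIONS):
    """Approximate number of correct answers, ASSUMING the score
    covers all `total` questions (an assumption, not a sourced fact)."""
    return round(percent / 100 * total)


for name, pct in hle_scores.items():
    print(f"{name}: about {approx_correct(pct)} of {HLE_TOTAL_QUESTIONS} questions")

# Gain from tool use for Grok 4, in percentage points.
tool_gain = round(
    hle_scores["Grok 4 (with tools)"] - hle_scores["Grok 4 (no tools)"], 1
)
print(f"Grok 4 tool-use gain: {tool_gain} percentage points")
```

Under that assumption, the gap between Grok 4 with and without tools amounts to a few hundred questions, which is why the tool setting should always be stated alongside an HLE score.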
Graduate-Level Science Questions (GPQA)
Testing PhD-level scientific understanding:
- Grok 4: 87% accuracy
- GPT-5 Pro (with Python): 89.4% accuracy
- GPT-5 (with Python): 87.3% accuracy
Mathematics Performance
American Invitational Mathematics Examination (AIME):
- Grok 4: 100% (a perfect score)
- GPT-5: Specific AIME scores not provided in available data
Coding and Software Engineering
SWE-bench Verified (Real-world GitHub Issues)
- GPT-5: 74.9% accuracy
- Grok 4: Specific performance data not available in search results
General Coding Capabilities
- GPT-5: Described as “state-of-the-art across key coding benchmarks” with 88% on Aider polyglot
- Grok 4: Reported as strong, but the search results give no specific coding benchmark scores
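The headline figures quoted so far can be collected into one small table to show where a direct comparison is possible. This is only an illustrative sketch: the `SCORES` layout and the helper names are assumptions, `None` marks a figure the search results did not report, and the GPQA row compares Grok 4 against GPT-5 with Python, so the test conditions may not match.

```python
from typing import Optional

# Scores in percent, as quoted in the sections above.
# None means the search results gave no figure for that model.
SCORES: dict[str, dict[str, Optional[float]]] = {
    "HLE (no tools)": {"Grok 4": 25.4, "GPT-5": None},
    "GPQA": {"Grok 4": 87.0, "GPT-5": 87.3},  # GPT-5 figure is "with Python"
    "AIME": {"Grok 4": 100.0, "GPT-5": None},
    "SWE-bench Verified": {"Grok 4": None, "GPT-5": 74.9},
    "Aider polyglot": {"Grok 4": None, "GPT-5": 88.0},
}
MODELS = ["Grok 4", "GPT-5"]


def cell(value: Optional[float]) -> str:
    """Render one score, or 'n/a' when it was not reported."""
    return "n/a" if value is None else f"{value:.1f}%"


def comparable(scores: dict[str, dict[str, Optional[float]]]) -> list[str]:
    """Benchmarks for which every model has a reported score."""
    return [
        name
        for name, row in scores.items()
        if all(row[m] is not None for m in MODELS)
    ]


def format_table(scores: dict[str, dict[str, Optional[float]]]) -> str:
    """Build a fixed-width text table of the scores."""
    lines = [f"{'Benchmark':<20}" + "".join(f"{m:>10}" for m in MODELS)]
    for name, row in scores.items():
        lines.append(f"{name:<20}" + "".join(f"{cell(row[m]):>10}" for m in MODELS))
    return "\n".join(lines)


print(format_table(SCORES))
print("Directly comparable:", comparable(SCORES))
```

With the figures available here, GPQA is the only benchmark on which both models have a reported score, so most of the comparison rests on one-sided numbers.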
Key Differentiating Features
Grok 4 Advantages
- Real-time data access through X (Twitter) integration
- A perfect score on competition-level mathematics problems (AIME)
- Collaborative AI agents in Grok 4 Heavy variant for enhanced problem-solving
- Cultural fluency and less restrictive content policies
GPT-5 Advantages
- Reduced hallucinations by 78-84% compared to previous versions
- Superior coding performance on real-world software engineering tasks
- Thinking mode for enhanced reasoning capabilities
- Better safety and factual accuracy, with a reported error rate of only 4.8% on production traffic, lower than that of previous models
Pricing and Accessibility
- Grok 4: Requires X Premium+ subscription ($16/month), with SuperGrok Heavy at $300/month
- GPT-5: Available to all ChatGPT users, including those on the free tier
Expert and User Reception
The performance comparison remains contested, with xAI claiming Grok 4 as the “most intelligent model in the world” while OpenAI positions GPT-5 as offering “PhD-level expertise”. Independent verification of some benchmark claims is still pending, particularly for Grok 4’s Humanity’s Last Exam results.
Both models represent significant advances in AI capabilities, with Grok 4 excelling in mathematics and real-time information processing, while GPT-5 shows superior performance in coding tasks and maintains better safety standards with reduced error rates.
Sources
- https://www.foxbusiness.com/technology/musk-jabs-openai-says-grok-4-heavy-smarter-2-weeks-ago-than-newly-launched-gpt-5
- https://www.scientificamerican.com/article/elon-musks-new-grok-4-takes-on-humanitys-last-exam-as-the-ai-race-heats-up/
- https://timesofindia.indiatimes.com/technology/tech-news/grok-4-vs-grok-3-what-makes-elon-musks-newest-ai-model-the-worlds-most-powerful-ai/articleshow/122364407.cms
- https://www.getpassionfruit.com/blog/chatgpt-5-vs-gpt-5-pro-vs-gpt-4o-vs-o3-performance-benchmark-comparison-recommendation-of-openai-s-2025-models
- https://openai.com/index/introducing-gpt-5-for-developers/
- https://www.indiatoday.in/technology/news/story/elon-musk-says-grok-4-can-solve-real-world-engineering-problems-books-and-internet-cant-answer-2754728-2025-07-12
- https://www.vktr.com/ai-market/chatgpt-gemini-or-grok-we-tested-all-3-heres-what-you-should-know/
- https://indianexpress.com/article/technology/elon-musk-unveils-grok-4-grok-4-heavy-and-premium-300-supergrok-heavymodel-10117742/
- https://x.ai/news/grok-4
- https://www.youtube.com/watch?v=dbgL00a7_xs
- https://www.interconnects.ai/p/grok-4-an-o3-look-alike-in-search
- https://www.hindustantimes.com/trending/us/elon-musks-string-of-warnings-for-openai-satya-nadella-after-gpt-5-release-grok-will-101754596632162.html
- https://openai.com/index/introducing-gpt-5/
- https://www.bbc.com/news/articles/cy5prvgw0r1o
- https://www.reddit.com/r/cscareerquestions/comments/1mk8zj6/the_fact_that_chatgpt_5_is_barely_an_improvement/
- https://www.vibecoding.com/2025/07/18/gpt-5-vs-grok-4-the-ai-showdown-of-2025/
- https://www.technologyreview.com/2025/08/07/1121308/gpt-5-is-here-now-what/
- https://www.reddit.com/r/artificial/comments/1m6qdic/can_someone_tell_me_what_makes_people_think_grok/
- https://www.youtube.com/watch?v=WLdBimUS1IE
- https://techcrunch.com/2025/08/07/openais-gpt-5-is-here/