Grok 3 outperforms GPT-4o, Claude 3.5 Sonnet, and Gemini 2.0 Pro on AIME 2024, GPQA Science, LiveCodeBench, and Chatbot Arena. The Grok 3 Reasoning model delivers even stronger performance, ...