🏆 LLM Leaderboard
Live AI model rankings based on Chatbot Arena Elo scores and benchmark results.
📅 Last updated: June 6, 2026
| # | Model | Organization | Elo | GPQA Diamond | HumanEval | MATH | MMLU | MT-Bench | License |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic: Claude Opus 4.6 (Fast) | Anthropic | 1503 | — | — | — | — | — | Unknown |
| 2 | Anthropic: Claude Opus 4.7 (Fast) | Anthropic | 1500 | — | — | — | — | — | Unknown |
| 3 | Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | Google | 1485 | — | — | — | — | — | Unknown |
| 4 | GPT-4o | OpenAI | — | — | — | — | 86.4% | 9.32 | Proprietary |
| 5 | Qwen: Qwen2.5 VL 72B Instruct | Qwen | — | — | — | — | 84.2% | 9.12 | Unknown |
| 6 | Claude 4 Sonnet | Anthropic | — | — | — | — | 77% | 7.9 | Proprietary |
| 7 | Anthropic: Claude 3 Haiku | Anthropic | — | — | — | — | 75.2% | — | Unknown |
| 8 | Gemini 2.5 Pro | Google DeepMind | — | — | — | — | 71.8% | — | Proprietary |
| 9 | Google Gemini Pro Latest | Google | — | — | — | — | 71.8% | — | Unknown |
| 10 | DeepSeek: DeepSeek V4 Pro | DeepSeek | — | — | — | — | 71.3% | — | Unknown |
| 11 | Mistral: Mixtral 8x22B Instruct | Mistral | — | — | — | — | 70.6% | 8.3 | Unknown |
| 12 | Qwen: Qwen3.7 Plus | Qwen | — | — | — | — | 66.5% | 6.96 | Unknown |
| 13 | Google: Gemma 4 31B | Google | — | — | — | — | 64.3% | — | Unknown |
| 14 | Meta: Llama Guard 4 12B | Meta | — | — | — | — | 63% | 6.86 | Unknown |
| 15 | Mistral: Mistral Medium 3.5 | Mistral | — | — | — | — | 55.4% | 6.84 | Unknown |
📊 Data source: LMSYS Chatbot Arena, Hugging Face Open LLM Leaderboard
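
The row order in the table is consistent with ranking by Elo first and falling back to MMLU for models without an Elo score. The page does not document its ranking rule, so the following is only a minimal sketch under that assumption. The `Row` type and `rank` function are hypothetical names, and the sample rows are copied from the table, with a missing score represented as `None`.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    model: str
    elo: Optional[int]     # None when no Arena Elo score is listed
    mmlu: Optional[float]  # percent; None when not reported


def rank(rows: List[Row]) -> List[Row]:
    """Order rows by Elo (descending), then MMLU (descending).

    Assumed rule, inferred from the table: rows with a missing value
    sort after rows that have one.
    """
    def key(r: Row):
        return (
            r.elo is None,     # False sorts before True, so scored rows come first
            -(r.elo or 0),     # higher Elo first
            r.mmlu is None,    # among the rest, rows with an MMLU score come first
            -(r.mmlu or 0.0),  # higher MMLU first
        )

    # sorted() is stable, so exact ties keep their input order.
    return sorted(rows, key=key)


ROWS = [
    Row("GPT-4o", None, 86.4),
    Row("Anthropic: Claude Opus 4.7 (Fast)", 1500, None),
    Row("Claude 4 Sonnet", None, 77.0),
    Row("Anthropic: Claude Opus 4.6 (Fast)", 1503, None),
    Row("Google: Nano Banana Pro (Gemini 3 Pro Image Preview)", 1485, None),
]

if __name__ == "__main__":
    for position, row in enumerate(rank(ROWS), start=1):
        print(position, row.model, row.elo, row.mmlu)
```

Under this rule a model with any Elo score outranks every model without one, regardless of benchmark results, which matches the three Elo-scored models at the top of the table.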