Which AI draws the best Pokémon card from pure code? Every model gets the same prompt. No pre-built assets. Pure SVG artistry. AI-judged on a 100-point rubric.
| # | Model | Score | Δ | Effort | Duration | Tokens | Date |
|---|---|---|---|---|---|---|---|
| 1 | claude-sonnet-4-5 max 4× | ★ 90 | ▼2 | 🧠 High | 67s | 5.2K | Mar 01, 2026 |
| 2 | claude-opus-4-6 max 4× | ★ 90 | ▼2 | 🧠 High | 285s | 19.5K | Mar 01, 2026 |
| 3 | minimax-m2.1 67× | ★ 88 | ▲3 | 🧠 Low | 119s | 7.5K | Apr 28, 2026 |
| 4 | glm-5 3× | ★ 88 | ▼2 | 🧠 Medium | 93s | 5.8K | Mar 01, 2026 |
| 5 | claude-sonnet-4-6 max 4× | ★ 87 | ▼5 | 🧠 High | 626s | 50.6K | Mar 01, 2026 |
| 6 | kimi-k2.5 78× | ★ 86 | ▼6 | 🧠 Low | 139s | 4.9K | Apr 30, 2026 |
| 7 | minimax-m2.5 75× | ★ 86 | ▼4 | 🧠 Low | 62s | 5.4K | Apr 30, 2026 |
| 8 | gemini-2.5-pro-preview 79× | ★ 86 | ▼9 | 🧠 Low | 87s | 12.1K | Apr 30, 2026 |
| 9 | gemini-2.5-flash 65× | ★ 86 | ▼6 | 🧠 Low | 38s | 7.8K | Apr 30, 2026 |
| 10 | gemini-3.1-pro-preview 81× | ★ 85 | ▼7 | 🧠 Low | 51s | 6.5K | Apr 30, 2026 |
| 11 | gpt-5.2 83× | ★ 83 | ▼9 | 🧠 Low | 97s | 7.6K | Apr 30, 2026 |
| 12 | deepseek-v3.2 66× | ★ 82 | ▼8 | 🧠 Low | 129s | 4.1K | Apr 30, 2026 |
| 13 | gemini-3-flash-preview 83× | ★ 81 | ▼4 | 🧠 Low | 17s | 3.8K | Apr 30, 2026 |
| 14 | gemini-2.5-flash-lite 81× | ★ 81 | ▼4 | 🧠 Low | 32s | 10.6K | Apr 30, 2026 |
| 15 | step-3.5-flash:free 41× | ★ 81 | — | 🧠 Low | 125s | 11.5K | Apr 07, 2026 |
| 16 | claude-haiku-4.5 67× | ★ 79 | ▼13 | 🧠 Low | 44s | 8.8K | Apr 30, 2026 |
| 17 | grok-4.1-fast 82× | ★ 77 | ▼13 | 🧠 Low | 39s | 5.0K | Apr 30, 2026 |
| 18 | gpt-5-nano 80× | ★ 75 | ▼17 | 🧠 Low | 39s | 3.9K | Apr 30, 2026 |
| 19 | trinity-large-preview:free 57× | ★ 75 | ▼10 | 🧠 Low | 36s | 2.9K | Apr 22, 2026 |
| 20 | gpt-oss-120b 82× | ★ 74 | ▼11 | 🧠 Low | 55s | 2.9K | Apr 30, 2026 |
| 21 | gemini-2.0-flash-lite-001 64× | ★ 73 | ▼19 | 🧠 Low | 17s | 2.9K | Apr 30, 2026 |
PokéBench is an open-source visual coding benchmark that tests how well AI models generate complex, structured visual output. Unlike text-based benchmarks, it measures spatial reasoning, color theory, typography, and artistic ability, all from a single prompt. The Pokémon card format was chosen because it demands all of these skills at once: illustration, layout, typography, and coherent design.
Show your model’s PokéBench score. Copy the badge code below and paste it into your README, blog, or docs.
- [claude-sonnet-4-5](https://pokebench.info?model=claude-sonnet-4-5)
- [claude-opus-4-6](https://pokebench.info?model=claude-opus-4-6)
- [minimax/minimax-m2.1](https://pokebench.info?model=minimax/minimax-m2.1)
- [z-ai/glm-5](https://pokebench.info?model=z-ai/glm-5)
- [claude-sonnet-4-6](https://pokebench.info?model=claude-sonnet-4-6)
- [moonshotai/kimi-k2.5](https://pokebench.info?model=moonshotai/kimi-k2.5)
- [minimax/minimax-m2.5](https://pokebench.info?model=minimax/minimax-m2.5)
- [google/gemini-2.5-pro-preview](https://pokebench.info?model=google/gemini-2.5-pro-preview)
- [google/gemini-2.5-flash](https://pokebench.info?model=google/gemini-2.5-flash)
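
The links above all follow the pattern `https://pokebench.info?model=<model-id>`. If you would rather generate the Markdown for several models than copy each line by hand, a minimal Python sketch is shown below. The helper names `leaderboard_url` and `badge_markdown` are hypothetical, not part of PokéBench. The badge image address is not shown on this page, so `image_url` is an assumed optional parameter that you would fill in with the real badge image address.

```python
from typing import Optional
from urllib.parse import urlencode

# Base address taken from the links listed above.
BASE_URL = "https://pokebench.info"


def leaderboard_url(model_id: str) -> str:
    """Return the PokéBench link for a model ID such as 'z-ai/glm-5'."""
    # safe="/" keeps the provider/model slash unescaped, matching the links above.
    return f"{BASE_URL}?{urlencode({'model': model_id}, safe='/')}"


def badge_markdown(model_id: str, image_url: Optional[str] = None) -> str:
    """Build a Markdown link to the model's PokéBench page.

    image_url is an assumption: pass the badge image address if you have one,
    otherwise a plain text link is produced.
    """
    label = f"PokéBench: {model_id}"
    target = leaderboard_url(model_id)
    if image_url:
        # Image wrapped in a link, the usual README badge shape.
        return f"[![{label}]({image_url})]({target})"
    return f"[{label}]({target})"


if __name__ == "__main__":
    for model in ("claude-sonnet-4-5", "z-ai/glm-5", "google/gemini-2.5-flash"):
        print(badge_markdown(model))
```

Called without an image address, `badge_markdown` falls back to a plain text link, so the output is still usable in a README.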