Frequently Asked Questions

How many comparisons can I run per day?

To keep this service free, we limit usage to 3 comparisons per IP address per day. The counter resets at midnight UTC.
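
As a rough illustration of how a per-IP daily limit with a midnight-UTC reset can work, here is a minimal in-memory sketch. It is not the arena's actual implementation. The class name `DailyIpLimiter` and the method `try_consume` are hypothetical, and a real service would likely keep its counters in a shared store rather than in process memory.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import datetime, timezone
from typing import Optional

DAILY_LIMIT = 3  # from the FAQ: 3 comparisons per IP address per day


class DailyIpLimiter:
    """Hypothetical in-memory limiter; counters are keyed by (IP, UTC date)."""

    def __init__(self, limit: int = DAILY_LIMIT) -> None:
        self.limit = limit
        self._counts: dict[tuple[str, str], int] = defaultdict(int)

    def try_consume(self, ip: str, now: Optional[datetime] = None) -> bool:
        """Record one comparison for `ip`; return False once the limit is hit.

        `now` should be a timezone-aware datetime; it defaults to the current
        UTC time. Keying on the UTC calendar date means a new day starts a
        fresh counter, so the reset happens implicitly at midnight UTC.
        """
        now = now or datetime.now(timezone.utc)
        utc_day = now.astimezone(timezone.utc).date().isoformat()
        key = (ip, utc_day)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

With this sketch, the fourth call for the same IP on the same UTC date returns `False`, while the first call after midnight UTC returns `True` again. Counters for past dates are never cleaned up here, which a long-running service would need to handle.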

Why do different models give different answers?

Models are trained on different datasets and use different architectures. Some might excel at creative writing, while others are better at strict logical reasoning or coding.

Can I use custom system prompts?

Not in the public arena. The arena is designed for quick, zero-shot comparisons using standard default configurations.

Is my prompt data stored or shared?

Your prompts are sent to AI providers for processing but are not stored by AI Days. However, each model provider has its own data policy, which you should review.

Which model is best for coding tasks?

Gemini 2.5 Flash-Lite excels at efficiency and speed across a wide range of tasks.

Why is one model faster than the other?

Response speed depends on model size, server load, and architecture. Gemini 2.5 Flash-Lite is optimized for extreme speed and low latency, while 'Opus' models prioritize quality.

Can I compare more than two models at once?

Currently, the arena supports only head-to-head (two-model) comparisons for clarity. Multi-model comparisons may be added in a future update.

What happens if a model returns an error?

If a model fails (due to rate limits, content filters, or API issues), you will see a clear error message in that panel. The other model's response will still display normally.
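
One way to get this behavior is to run both model calls concurrently and capture each failure separately, so that one exception cannot discard the other result. The sketch below assumes each model is exposed as an async function that takes a prompt and returns text. The names `compare`, `fake_ok`, and `fake_rate_limited` are hypothetical stand-ins, not the arena's code or any provider's API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Assumed shape of a model call: takes a prompt, returns the response text.
ModelCall = Callable[[str], Awaitable[str]]


async def compare(prompt: str, left: ModelCall, right: ModelCall) -> List[Dict[str, Any]]:
    """Run both models at once and return one result dict per panel."""
    # return_exceptions=True makes gather return exceptions as values
    # instead of raising, so a failure on one side leaves the other intact.
    results = await asyncio.gather(left(prompt), right(prompt), return_exceptions=True)
    panels: List[Dict[str, Any]] = []
    for result in results:
        if isinstance(result, BaseException):
            panels.append({"ok": False, "error": f"{type(result).__name__}: {result}"})
        else:
            panels.append({"ok": True, "text": result})
    return panels


async def fake_ok(prompt: str) -> str:
    # Stand-in for a model that answers normally.
    return f"echo: {prompt}"


async def fake_rate_limited(prompt: str) -> str:
    # Stand-in for a model that fails, e.g. because of a rate limit.
    raise RuntimeError("rate limit exceeded")
```

Calling `asyncio.run(compare("hi", fake_ok, fake_rate_limited))` yields a successful first panel and an error entry for the second, matching the per-panel behavior described above.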

Are the responses rendered as Markdown?

Yes! If a model returns formatted text with headers, lists, or code blocks, the arena renders it as rich Markdown using the marked.js library.

How do I interpret the results?

Focus on accuracy, completeness, tone, and formatting. A good comparison prompt is specific enough to reveal meaningful differences — try coding tasks, creative writing, or factual questions.