Pay for LLM usage as you go

Our LLM pricing is usage-based: you pay a fixed rate per 1,000 tokens, and the rate depends on the model.

Written by Katja Gersdorf
Updated over 2 months ago

Model Performance and Pricing

Note: All prices are in CHF per 1,000 tokens. MMLU scores indicate model performance on the Massive Multitask Language Understanding benchmark.
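As a quick illustration of how per-token billing adds up, here is a minimal sketch of the per-request arithmetic. The token counts are hypothetical, and the rates are gpt-4o's from the table below; this is an illustration, not an official billing calculator.

```python
# Minimal cost sketch: prices are CHF per 1,000 tokens, charged
# separately for input and output. Rates are gpt-4o's from the table
# below; the token counts are hypothetical.
INPUT_RATE_CHF = 0.0038   # CHF per 1,000 input tokens
OUTPUT_RATE_CHF = 0.0150  # CHF per 1,000 output tokens

def request_cost_chf(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request in CHF."""
    return (input_tokens / 1000) * INPUT_RATE_CHF \
         + (output_tokens / 1000) * OUTPUT_RATE_CHF

# 2,000 input tokens and 500 output tokens:
# 2.0 * 0.0038 + 0.5 * 0.0150 = CHF 0.0151
print(f"CHF {request_cost_chf(2000, 500):.4f}")
```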

High Performance Models

| Model | Input (CHF) | Output (CHF) | MMLU | Details |
| --- | --- | --- | --- | --- |
| llama3-swiss πŸ‡¨πŸ‡­ | 0.0150 | 0.0450 | 85.2% | Advanced, Efficient, Recommended |
| gpt-4o | 0.0038 | 0.0150 | 92.3% | Leading Performance |
| claude-sonnet | 0.0045 | 0.0225 | 88.7% | Multilingual, Writing, Coding |

Balanced Models

| Model | Input (CHF) | Output (CHF) | MMLU | Details |
| --- | --- | --- | --- | --- |
| llama-swiss-medium πŸ‡¨πŸ‡­ | 0.0075 | 0.0150 | 79.2% | Strong for Size |
| mixtral-swiss-big πŸ‡¨πŸ‡­ | 0.0015 | 0.0045 | N/A | Advanced Multilingual |
| mistral-medium | 0.0041 | 0.0122 | 77.3% | Efficient, Multilingual |
| mixtral-swiss-medium πŸ‡¨πŸ‡­ | 0.0045 | 0.0150 | 77.3% | Efficient, Multilingual |
| gpt-4 | 0.0450 | 0.0900 | 86.5% | Consistent |
| claude-opus | 0.0225 | 0.1125 | 86.8% | Strong Reasoning |

Efficient Models

| Model | Input (CHF) | Output (CHF) | MMLU | Details |
| --- | --- | --- | --- | --- |
| gpt-3.5-turbo-1106 | 0.0015 | 0.0030 | ~70% | Legacy |
| mistral-tiny | 0.0004 | 0.0013 | 60.1% | Compact, Fast, Cost-Effective |
| mistral-small | 0.0002 | 0.0005 | 70.6% | Balanced Speed/Quality |

Other Models

| Model | Input (CHF) | Output (CHF) |
| --- | --- | --- |
| gpt-4o-mini | 0.0002 | 0.0009 |
| gpt-4.1 | 0.0030 | 0.0120 |
| gpt-4-1106-preview | 0.0150 | 0.0450 |
| gpt-4-0125-preview | 0.0150 | 0.0450 |
| gpt-4-turbo | 0.0150 | 0.0450 |
| gpt-4.5 | 0.1125 | 0.2250 |
| gpt-5 | 0.0019 | 0.0150 |
| gpt-swiss πŸ‡¨πŸ‡­ | 0.0200 | 0.0600 |
| o1-mini | 0.0017 | 0.0017 |
| o1-pro | 0.2250 | 0.9000 |
| o1-review | 0.0225 | 0.0900 |
| o3-mini | 0.0017 | 0.0066 |
| mistral-swiss πŸ‡¨πŸ‡­ | 0.0450 | 0.0900 |
| mistral-small-swiss πŸ‡¨πŸ‡­ | 0.0002 | 0.0005 |
| gemini-2-5-pro-preview | 0.0038 | 0.0225 |
| deepseek-r1 | 0.0300 | 0.0900 |
| deepseek-reasoner | 0.0008 | 0.0033 |
| deepseek-chat | 0.0004 | 0.0017 |
| gemma-3 | 0.0170 | 0.0400 |
| gemma-3-swiss πŸ‡¨πŸ‡­ | 0.0200 | 0.0600 |
| qwen-3-fast | 0.0005 | 0.0014 |
| qwen-3 | 0.0003 | 0.0009 |

Performance Notes

  • MMLU scores marked with (1) indicate single-shot performance

  • Scores marked with (5-shot) use few-shot learning

  • N/A indicates pending benchmark data

MMLU scores

While MMLU scores provide a useful metric for comparing language model capabilities, they represent only one dimension of performance. These scores primarily measure how well models handle a standardized set of tasks, but they do not fully capture broader skills such as multilingual comprehension, information retention, domain-specific usage, programming proficiency, or complex reasoning abilities. In practice, different tasks place different demands on a model's underlying architecture and training data, so performance varies considerably across these domains. As a result, MMLU should be seen as a helpful indicator rather than a definitive measure of a model's overall quality or suitability for a given application. Source: LLM Leaderboard


Rate Limiting

All endpoints have a combined limit of CHF 50 per month. If you would like to increase it, please contact support with your estimated usage and use case.
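If you want to sanity-check planned usage against the cap before contacting support, the same per-token arithmetic extends to a monthly estimate. This is a rough sketch: the request volumes and averages below are hypothetical, and the rates are mistral-small's from the table above.

```python
# Rough monthly budget check against the default CHF 50 cap.
# Volumes and the mistral-small rates (from the table above) are
# assumptions for illustration only.
MONTHLY_LIMIT_CHF = 50.0

requests_per_day = 5000          # hypothetical volume
avg_input_tokens = 1200          # hypothetical average per request
avg_output_tokens = 300          # hypothetical average per request
input_rate, output_rate = 0.0002, 0.0005  # mistral-small, CHF per 1K tokens

daily_cost = requests_per_day * (
    avg_input_tokens / 1000 * input_rate
    + avg_output_tokens / 1000 * output_rate
)
monthly_cost = daily_cost * 30

print(f"Estimated monthly cost: CHF {monthly_cost:.2f}")
if monthly_cost > MONTHLY_LIMIT_CHF:
    print("Over the CHF 50 default cap - contact support to raise the limit.")
```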
