Model pricing
When using Octavus-managed API keys, the provider cost for each request is passed through at the rates below. These prices are automatically synced from provider APIs. With BYOK (Bring Your Own Keys), provider costs are not charged.
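Because every rate is quoted per 1M tokens, the pass-through cost of a single request can be estimated from its token counts. The sketch below is illustrative only: `request_cost` is a hypothetical helper, not an Octavus API, and it assumes that cached tokens are a subset of the input tokens and are billed at the Cache Read rate when one is listed (otherwise at the normal input rate). Actual billing may differ.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_rate, output_rate,
                 cached_tokens=0, cache_read_rate=None):
    """Estimate the provider cost in USD for one request.

    Rates are the per-1M-token prices from the tables, passed as strings
    (e.g. "2.00") so Decimal keeps them exact.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    in_rate = Decimal(input_rate)
    out_rate = Decimal(output_rate)
    # Assumption: with no listed Cache Read rate ("-" in the table),
    # cached tokens are billed like ordinary input tokens.
    cache_rate = Decimal(cache_read_rate) if cache_read_rate is not None else in_rate

    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * in_rate
             + cached_tokens * cache_rate
             + output_tokens * out_rate)
    return total / TOKENS_PER_MILLION


# GPT-4.1 rates from the table: $2.00 input, $8.00 output, $0.50 cache read.
# 10,000 input tokens (4,000 of them cached) and 2,000 output tokens.
print(request_cost(10_000, 2_000, "2.00", "8.00",
                   cached_tokens=4_000, cache_read_rate="0.50"))
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request costs are summed.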
OpenAI (55 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
OpenAI: GPT-3.5 Turbo | $0.50 | $1.50 | - | - | 16K |
OpenAI: GPT-3.5 Turbo (older v0613) | $1.00 | $2.00 | - | - | 4K |
OpenAI: GPT-3.5 Turbo 16k | $3.00 | $4.00 | - | - | 16K |
OpenAI: GPT-3.5 Turbo Instruct | $1.50 | $2.00 | - | - | 4K |
OpenAI: GPT-4 | $30.00 | $60.00 | - | - | 8K |
OpenAI: GPT-4 (older v0314) | $30.00 | $60.00 | - | - | 8K |
OpenAI: GPT-4 Turbo (older v1106) | $10.00 | $30.00 | - | - | 128K |
OpenAI: GPT-4 Turbo | $10.00 | $30.00 | - | - | 128K |
OpenAI: GPT-4 Turbo Preview | $10.00 | $30.00 | - | - | 128K |
OpenAI: GPT-4.1 | $2.00 | $8.00 | $0.50 | - | 1M |
OpenAI: GPT-4.1 Mini | $0.40 | $1.60 | $0.10 | - | 1M |
OpenAI: GPT-4.1 Nano | $0.10 | $0.40 | $0.025 | - | 1M |
OpenAI: GPT-4o | $2.50 | $10.00 | $1.25 | - | 128K |
OpenAI: GPT-4o (2024-05-13) | $5.00 | $15.00 | - | - | 128K |
OpenAI: GPT-4o (2024-08-06) | $2.50 | $10.00 | $1.25 | - | 128K |
OpenAI: GPT-4o (2024-11-20) | $2.50 | $10.00 | $1.25 | - | 128K |
OpenAI: GPT-4o Audio | $2.50 | $10.00 | - | - | 128K |
OpenAI: GPT-4o-mini | $0.15 | $0.60 | $0.075 | - | 128K |
OpenAI: GPT-4o-mini (2024-07-18) | $0.15 | $0.60 | $0.075 | - | 128K |
OpenAI: GPT-4o-mini Search Preview | $0.15 | $0.60 | - | - | 128K |
OpenAI: GPT-4o Search Preview | $2.50 | $10.00 | - | - | 128K |
OpenAI: GPT-4o (extended) | $6.00 | $18.00 | - | - | 128K |
OpenAI: GPT-5 | $1.25 | $10.00 | $0.125 | - | 400K |
OpenAI: GPT-5 Chat | $1.25 | $10.00 | $0.125 | - | 128K |
OpenAI: GPT-5 Codex | $1.25 | $10.00 | $0.125 | - | 400K |
OpenAI: GPT-5 Image | $10.00 | $10.00 | $1.25 | - | 400K |
OpenAI: GPT-5 Image Mini | $2.50 | $2.00 | $0.25 | - | 400K |
OpenAI: GPT-5 Mini | $0.25 | $2.00 | $0.025 | - | 400K |
OpenAI: GPT-5 Nano | $0.05 | $0.40 | $0.005 | - | 400K |
OpenAI: GPT-5 Pro | $15.00 | $120.00 | - | - | 400K |
OpenAI: GPT-5.1 | $1.25 | $10.00 | $0.125 | - | 400K |
OpenAI: GPT-5.1 Chat | $1.25 | $10.00 | $0.125 | - | 128K |
OpenAI: GPT-5.1-Codex | $1.25 | $10.00 | $0.125 | - | 400K |
OpenAI: GPT-5.1-Codex-Max | $1.25 | $10.00 | $0.125 | - | 400K |
OpenAI: GPT-5.1-Codex-Mini | $0.25 | $2.00 | $0.025 | - | 400K |
OpenAI: GPT-5.2 | $1.75 | $14.00 | $0.175 | - | 400K |
OpenAI: GPT-5.2 Chat | $1.75 | $14.00 | $0.175 | - | 128K |
OpenAI: GPT-5.2-Codex | $1.75 | $14.00 | $0.175 | - | 400K |
OpenAI: GPT-5.2 Pro | $21.00 | $168.00 | - | - | 400K |
OpenAI: GPT Audio | $2.50 | $10.00 | - | - | 128K |
OpenAI: GPT Audio Mini | $0.60 | $2.40 | - | - | 128K |
OpenAI: gpt-oss-120b | $0.039 | $0.19 | - | - | 131K |
OpenAI: gpt-oss-120b (exacto) | $0.039 | $0.19 | - | - | 131K |
OpenAI: gpt-oss-20b | $0.03 | $0.14 | - | - | 131K |
OpenAI: gpt-oss-safeguard-20b | $0.075 | $0.30 | $0.037 | - | 131K |
OpenAI: o1 | $15.00 | $60.00 | $7.50 | - | 200K |
OpenAI: o1-pro | $150.00 | $600.00 | - | - | 200K |
OpenAI: o3 | $2.00 | $8.00 | $0.50 | - | 200K |
OpenAI: o3 Deep Research | $10.00 | $40.00 | $2.50 | - | 200K |
OpenAI: o3 Mini | $1.10 | $4.40 | $0.55 | - | 200K |
OpenAI: o3 Mini High | $1.10 | $4.40 | $0.55 | - | 200K |
OpenAI: o3 Pro | $20.00 | $80.00 | - | - | 200K |
OpenAI: o4 Mini | $1.10 | $4.40 | $0.275 | - | 200K |
OpenAI: o4 Mini Deep Research | $2.00 | $8.00 | $0.50 | - | 200K |
OpenAI: o4 Mini High | $1.10 | $4.40 | $0.275 | - | 200K |
Anthropic (13 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Anthropic: Claude 3 Haiku | $0.25 | $1.25 | $0.03 | - | 200K |
Anthropic: Claude 3.5 Haiku | $0.80 | $4.00 | $0.08 | - | 200K |
Anthropic: Claude 3.5 Sonnet | $6.00 | $30.00 | $0.60 | - | 200K |
Anthropic: Claude 3.7 Sonnet | $3.00 | $15.00 | $0.30 | - | 200K |
Anthropic: Claude 3.7 Sonnet (thinking) | $3.00 | $15.00 | $0.30 | - | 200K |
Anthropic: Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 | - | 200K |
Anthropic: Claude Opus 4 | $15.00 | $75.00 | $1.50 | - | 200K |
Anthropic: Claude Opus 4.1 | $15.00 | $75.00 | $1.50 | - | 200K |
Anthropic: Claude Opus 4.5 | $5.00 | $25.00 | $0.50 | - | 200K |
Anthropic: Claude Opus 4.6 | $5.00 | $25.00 | $0.50 | - | 1M |
Anthropic: Claude Sonnet 4 | $3.00 | $15.00 | $0.30 | - | 1M |
Anthropic: Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 | - | 1M |
Anthropic: Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | - | 1M |
Google (19 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Google: Gemini 2.0 Flash | $0.10 | $0.40 | $0.025 | $0.40 | 1M |
Google: Gemini 2.0 Flash Lite | $0.075 | $0.30 | - | $0.30 | 1M |
Google: Gemini 2.5 Flash | $0.30 | $2.50 | $0.03 | $2.50 | 1M |
Google: Gemini 2.5 Flash Image (Nano Banana) | $0.30 | $2.50 | $0.03 | $2.50 | 33K |
Google: Gemini 2.5 Flash Lite | $0.10 | $0.40 | $0.01 | $0.40 | 1M |
Google: Gemini 2.5 Flash Lite Preview 09-2025 | $0.10 | $0.40 | $0.01 | $0.40 | 1M |
Google: Gemini 2.5 Pro | $1.25 | $10.00 | $0.125 | $10.00 | 1M |
Google: Gemini 2.5 Pro Preview 06-05 | $1.25 | $10.00 | $0.125 | $10.00 | 1M |
Google: Gemini 2.5 Pro Preview 05-06 | $1.25 | $10.00 | $0.125 | $10.00 | 1M |
Google: Gemini 3 Flash Preview | $0.50 | $3.00 | $0.05 | $3.00 | 1M |
Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | $2.00 | $12.00 | $0.20 | $12.00 | 66K |
Google: Gemini 3 Pro Preview | $2.00 | $12.00 | $0.20 | $12.00 | 1M |
Google: Gemini 3.1 Pro Preview | $2.00 | $12.00 | $0.20 | $12.00 | 1M |
Google: Gemma 2 27B | $0.65 | $0.65 | - | - | 8K |
Google: Gemma 2 9B | $0.03 | $0.09 | - | - | 8K |
Google: Gemma 3 12B | $0.04 | $0.13 | - | - | 131K |
Google: Gemma 3 27B | $0.04 | $0.15 | $0.02 | - | 128K |
Google: Gemma 3 4B | $0.04 | $0.08 | - | - | 131K |
Google: Gemma 3n 4B | $0.02 | $0.04 | - | - | 33K |
ai21 (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
AI21: Jamba Large 1.7 | $2.00 | $8.00 | - | - | 256K |
aion-labs (3 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
AionLabs: Aion-1.0 | $4.00 | $8.00 | - | - | 131K |
AionLabs: Aion-1.0-Mini | $0.70 | $1.40 | - | - | 131K |
AionLabs: Aion-RP 1.0 (8B) | $0.80 | $1.60 | - | - | 33K |
alfredpros (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
AlfredPros: CodeLLaMa 7B Instruct Solidity | $0.80 | $1.20 | - | - | 4K |
Alibaba (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Tongyi DeepResearch 30B A3B | $0.09 | $0.45 | $0.09 | - | 131K |
allenai (7 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
AllenAI: Molmo2 8B | $0.20 | $0.20 | - | - | 37K |
AllenAI: Olmo 2 32B Instruct | $0.05 | $0.20 | - | - | 128K |
AllenAI: Olmo 3 32B Think | $0.15 | $0.50 | - | - | 66K |
AllenAI: Olmo 3 7B Instruct | $0.10 | $0.20 | - | - | 66K |
AllenAI: Olmo 3 7B Think | $0.12 | $0.20 | - | - | 66K |
AllenAI: Olmo 3.1 32B Instruct | $0.20 | $0.60 | - | - | 66K |
AllenAI: Olmo 3.1 32B Think | $0.15 | $0.50 | - | - | 66K |
alpindale (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Goliath 120B | $3.75 | $7.50 | - | - | 6K |
Amazon (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Amazon: Nova 2 Lite | $0.30 | $2.50 | - | - | 1M |
Amazon: Nova Lite 1.0 | $0.06 | $0.24 | - | - | 300K |
Amazon: Nova Micro 1.0 | $0.035 | $0.14 | - | - | 128K |
Amazon: Nova Premier 1.0 | $2.50 | $12.50 | $0.625 | - | 1M |
Amazon: Nova Pro 1.0 | $0.80 | $3.20 | - | - | 300K |
anthracite-org (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Magnum v4 72B | $3.00 | $5.00 | - | - | 16K |
arcee-ai (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Arcee AI: Coder Large | $0.50 | $0.80 | - | - | 33K |
Arcee AI: Maestro Reasoning | $0.90 | $3.30 | - | - | 131K |
Arcee AI: Spotlight | $0.18 | $0.18 | - | - | 131K |
Arcee AI: Trinity Mini | $0.045 | $0.15 | - | - | 131K |
Arcee AI: Virtuoso Large | $0.75 | $1.20 | - | - | 131K |
baidu (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Baidu: ERNIE 4.5 21B A3B | $0.07 | $0.28 | - | - | 120K |
Baidu: ERNIE 4.5 21B A3B Thinking | $0.07 | $0.28 | - | - | 131K |
Baidu: ERNIE 4.5 300B A47B | $0.28 | $1.10 | - | - | 123K |
Baidu: ERNIE 4.5 VL 28B A3B | $0.14 | $0.56 | - | - | 30K |
Baidu: ERNIE 4.5 VL 424B A47B | $0.42 | $1.25 | - | - | 123K |
bytedance (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
ByteDance: UI-TARS 7B | $0.10 | $0.20 | - | - | 128K |
bytedance-seed (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
ByteDance Seed: Seed 1.6 | $0.25 | $2.00 | - | - | 262K |
ByteDance Seed: Seed 1.6 Flash | $0.075 | $0.30 | - | - | 262K |
Cohere (4 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Cohere: Command A | $2.50 | $10.00 | - | - | 256K |
Cohere: Command R (08-2024) | $0.15 | $0.60 | - | - | 128K |
Cohere: Command R+ (08-2024) | $2.50 | $10.00 | - | - | 128K |
Cohere: Command R7B (12-2024) | $0.0375 | $0.15 | - | - | 128K |
deepcogito (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Deep Cogito: Cogito v2.1 671B | $1.25 | $1.25 | - | - | 128K |
DeepSeek (12 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
DeepSeek: DeepSeek V3 | $0.32 | $0.89 | - | - | 164K |
DeepSeek: DeepSeek V3 0324 | $0.19 | $0.87 | $0.095 | - | 164K |
DeepSeek: DeepSeek V3.1 | $0.15 | $0.75 | - | - | 33K |
DeepSeek: R1 | $0.70 | $2.50 | - | - | 64K |
DeepSeek: R1 0528 | $0.40 | $1.75 | $0.20 | - | 164K |
DeepSeek: R1 Distill Llama 70B | $0.70 | $0.80 | - | - | 131K |
DeepSeek: R1 Distill Qwen 32B | $0.29 | $0.29 | - | - | 33K |
DeepSeek: DeepSeek V3.1 Terminus | $0.21 | $0.79 | $0.13 | - | 164K |
DeepSeek: DeepSeek V3.1 Terminus (exacto) | $0.21 | $0.79 | $0.168 | - | 164K |
DeepSeek: DeepSeek V3.2 | $0.26 | $0.38 | $0.13 | - | 164K |
DeepSeek: DeepSeek V3.2 Exp | $0.27 | $0.41 | - | - | 164K |
DeepSeek: DeepSeek V3.2 Speciale | $0.40 | $1.20 | $0.20 | - | 164K |
eleutherai (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
EleutherAI: Llemma 7b | $0.80 | $1.20 | - | - | 4K |
essentialai (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
EssentialAI: Rnj 1 Instruct | $0.15 | $0.15 | - | - | 33K |
gryphe (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
MythoMax 13B | $0.06 | $0.06 | - | - | 4K |
ibm-granite (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
IBM: Granite 4.0 Micro | $0.017 | $0.11 | - | - | 131K |
inception (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Inception: Mercury | $0.25 | $1.00 | - | - | 128K |
Inception: Mercury Coder | $0.25 | $1.00 | - | - | 128K |
inflection (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Inflection: Inflection 3 Pi | $2.50 | $10.00 | - | - | 8K |
Inflection: Inflection 3 Productivity | $2.50 | $10.00 | - | - | 8K |
kwaipilot (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Kwaipilot: KAT-Coder-Pro V1 | $0.207 | $0.828 | $0.0414 | - | 256K |
liquid (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
LiquidAI: LFM2-2.6B | $0.01 | $0.02 | - | - | 33K |
LiquidAI: LFM2-8B-A1B | $0.01 | $0.02 | - | - | 33K |
mancer (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Mancer: Weaver (alpha) | $0.75 | $1.00 | - | - | 8K |
meituan (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Meituan: LongCat Flash Chat | $0.20 | $0.80 | $0.20 | - | 131K |
Meta (15 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Meta: Llama 3 70B Instruct | $0.51 | $0.74 | - | - | 8K |
Meta: Llama 3 8B Instruct | $0.03 | $0.04 | - | - | 8K |
Meta: Llama 3.1 405B (base) | $4.00 | $4.00 | - | - | 33K |
Meta: Llama 3.1 405B Instruct | $4.00 | $4.00 | - | - | 131K |
Meta: Llama 3.1 70B Instruct | $0.40 | $0.40 | - | - | 131K |
Meta: Llama 3.1 8B Instruct | $0.02 | $0.05 | - | - | 16K |
Meta: Llama 3.2 11B Vision Instruct | $0.049 | $0.049 | - | - | 131K |
Meta: Llama 3.2 1B Instruct | $0.027 | $0.20 | - | - | 60K |
Meta: Llama 3.2 3B Instruct | $0.02 | $0.02 | - | - | 131K |
Meta: Llama 3.3 70B Instruct | $0.10 | $0.32 | - | - | 131K |
Meta: Llama 4 Maverick | $0.15 | $0.60 | - | - | 1M |
Meta: Llama 4 Scout | $0.08 | $0.30 | - | - | 328K |
Meta: LlamaGuard 2 8B | $0.20 | $0.20 | - | - | 8K |
Llama Guard 3 8B | $0.02 | $0.06 | - | - | 131K |
Meta: Llama Guard 4 12B | $0.18 | $0.18 | - | - | 164K |
microsoft (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Microsoft: Phi 4 | $0.06 | $0.14 | - | - | 16K |
WizardLM-2 8x22B | $0.62 | $0.62 | - | - | 66K |
MiniMax (6 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
MiniMax: MiniMax-01 | $0.20 | $1.10 | - | - | 1M |
MiniMax: MiniMax M1 | $0.40 | $2.20 | - | - | 1M |
MiniMax: MiniMax M2 | $0.255 | $1.00 | $0.03 | - | 197K |
MiniMax: MiniMax M2-her | $0.30 | $1.20 | $0.03 | - | 66K |
MiniMax: MiniMax M2.1 | $0.27 | $0.95 | $0.03 | - | 197K |
MiniMax: MiniMax M2.5 | $0.30 | $1.10 | $0.15 | - | 197K |
Mistral (27 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Mistral: Codestral 2508 | $0.30 | $0.90 | - | - | 256K |
Mistral: Devstral 2 2512 | $0.40 | $2.00 | - | - | 262K |
Mistral: Devstral Medium | $0.40 | $2.00 | - | - | 131K |
Mistral: Devstral Small 1.1 | $0.10 | $0.30 | - | - | 131K |
Mistral: Ministral 3 14B 2512 | $0.20 | $0.20 | - | - | 262K |
Mistral: Ministral 3 3B 2512 | $0.10 | $0.10 | - | - | 131K |
Mistral: Ministral 3 8B 2512 | $0.15 | $0.15 | - | - | 262K |
Mistral: Mistral 7B Instruct | $0.20 | $0.20 | - | - | 33K |
Mistral: Mistral 7B Instruct v0.1 | $0.11 | $0.19 | - | - | 3K |
Mistral: Mistral 7B Instruct v0.2 | $0.20 | $0.20 | - | - | 33K |
Mistral: Mistral 7B Instruct v0.3 | $0.20 | $0.20 | - | - | 33K |
Mistral Large | $2.00 | $6.00 | - | - | 128K |
Mistral Large 2407 | $2.00 | $6.00 | - | - | 131K |
Mistral Large 2411 | $2.00 | $6.00 | - | - | 131K |
Mistral: Mistral Large 3 2512 | $0.50 | $1.50 | - | - | 262K |
Mistral: Mistral Medium 3 | $0.40 | $2.00 | - | - | 131K |
Mistral: Mistral Medium 3.1 | $0.40 | $2.00 | - | - | 131K |
Mistral: Mistral Nemo | $0.02 | $0.04 | - | - | 131K |
Mistral: Saba | $0.20 | $0.60 | - | - | 33K |
Mistral: Mistral Small 3 | $0.05 | $0.08 | - | - | 33K |
Mistral: Mistral Small 3.1 24B | $0.35 | $0.56 | - | - | 128K |
Mistral: Mistral Small 3.2 24B | $0.06 | $0.18 | $0.03 | - | 131K |
Mistral: Mistral Small Creative | $0.10 | $0.30 | - | - | 33K |
Mistral: Mixtral 8x22B Instruct | $2.00 | $6.00 | - | - | 66K |
Mistral: Mixtral 8x7B Instruct | $0.54 | $0.54 | - | - | 33K |
Mistral: Pixtral Large 2411 | $2.00 | $6.00 | - | - | 131K |
Mistral: Voxtral Small 24B 2507 | $0.10 | $0.30 | - | - | 32K |
Moonshot (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
MoonshotAI: Kimi K2 0711 | $0.50 | $2.40 | - | - | 131K |
MoonshotAI: Kimi K2 0905 | $0.40 | $2.00 | $0.15 | - | 131K |
MoonshotAI: Kimi K2 0905 (exacto) | $0.60 | $2.50 | - | - | 262K |
MoonshotAI: Kimi K2 Thinking | $0.47 | $2.00 | $0.141 | - | 131K |
MoonshotAI: Kimi K2.5 | $0.45 | $2.20 | $0.225 | - | 262K |
morph (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Morph: Morph V3 Fast | $0.80 | $1.20 | - | - | 82K |
Morph: Morph V3 Large | $0.90 | $1.90 | - | - | 262K |
neversleep (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
NeverSleep: Lumimaid v0.2 8B | $0.09 | $0.60 | - | - | 33K |
Noromaid 20B | $1.00 | $1.75 | - | - | 4K |
nex-agi (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Nex AGI: DeepSeek V3.1 Nex N1 | $0.27 | $1.00 | - | - | 131K |
nousresearch (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
NousResearch: Hermes 2 Pro - Llama-3 8B | $0.14 | $0.14 | - | - | 8K |
Nous: Hermes 3 405B Instruct | $1.00 | $1.00 | - | - | 131K |
Nous: Hermes 3 70B Instruct | $0.30 | $0.30 | - | - | 66K |
Nous: Hermes 4 405B | $1.00 | $3.00 | - | - | 131K |
Nous: Hermes 4 70B | $0.13 | $0.40 | - | - | 131K |
NVIDIA (6 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
NVIDIA: Llama 3.1 Nemotron 70B Instruct | $1.20 | $1.20 | - | - | 131K |
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | $0.60 | $1.80 | - | - | 131K |
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | $0.10 | $0.40 | - | - | 131K |
NVIDIA: Nemotron 3 Nano 30B A3B | $0.05 | $0.20 | - | - | 262K |
NVIDIA: Nemotron Nano 12B 2 VL | $0.07 | $0.20 | - | - | 131K |
NVIDIA: Nemotron Nano 9B V2 | $0.04 | $0.16 | - | - | 131K |
opengvlab (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
OpenGVLab: InternVL3 78B | $0.15 | $0.60 | $0.075 | - | 33K |
openrouter (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Auto Router | Varies | Varies | - | - | 2M |
Body Builder (beta) | Varies | Varies | - | - | 128K |
perplexity (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Perplexity: Sonar | $1.00 | $1.00 | - | - | 127K |
Perplexity: Sonar Deep Research | $2.00 | $8.00 | - | $3.00 | 128K |
Perplexity: Sonar Pro | $3.00 | $15.00 | - | - | 200K |
Perplexity: Sonar Pro Search | $3.00 | $15.00 | - | - | 200K |
Perplexity: Sonar Reasoning Pro | $2.00 | $8.00 | - | - | 128K |
prime-intellect (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Prime Intellect: INTELLECT-3 | $0.20 | $1.10 | - | - | 131K |
Qwen (40 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Qwen2.5 72B Instruct | $0.12 | $0.39 | - | - | 33K |
Qwen: Qwen2.5 7B Instruct | $0.04 | $0.10 | - | - | 33K |
Qwen2.5 Coder 32B Instruct | $0.20 | $0.20 | - | - | 33K |
Qwen: Qwen2.5-VL 7B Instruct | $0.20 | $0.20 | - | - | 33K |
Qwen: Qwen-Max | $1.60 | $6.40 | $0.32 | - | 33K |
Qwen: Qwen-Plus | $0.40 | $1.20 | $0.08 | - | 1M |
Qwen: Qwen Plus 0728 | $0.40 | $1.20 | - | - | 1M |
Qwen: Qwen Plus 0728 (thinking) | $0.40 | $1.20 | - | - | 1M |
Qwen: Qwen-Turbo | $0.05 | $0.20 | $0.01 | - | 131K |
Qwen: Qwen VL Max | $0.80 | $3.20 | - | - | 131K |
Qwen: Qwen VL Plus | $0.21 | $0.63 | $0.042 | - | 131K |
Qwen: Qwen2.5 Coder 7B Instruct | $0.03 | $0.09 | - | - | 33K |
Qwen: Qwen2.5 VL 32B Instruct | $0.20 | $0.60 | - | - | 128K |
Qwen: Qwen2.5 VL 72B Instruct | $0.80 | $0.80 | - | - | 33K |
Qwen: Qwen3 14B | $0.06 | $0.24 | - | - | 41K |
Qwen: Qwen3 235B A22B | $0.455 | $1.82 | - | - | 131K |
Qwen: Qwen3 235B A22B Instruct 2507 | $0.071 | $0.10 | - | - | 262K |
Qwen: Qwen3 30B A3B | $0.08 | $0.28 | - | - | 41K |
Qwen: Qwen3 30B A3B Instruct 2507 | $0.09 | $0.30 | - | - | 262K |
Qwen: Qwen3 30B A3B Thinking 2507 | $0.051 | $0.34 | - | - | 33K |
Qwen: Qwen3 32B | $0.08 | $0.24 | $0.04 | - | 41K |
Qwen: Qwen3 8B | $0.05 | $0.40 | $0.05 | - | 32K |
Qwen: Qwen3 Coder 480B A35B | $0.22 | $1.00 | $0.022 | - | 262K |
Qwen: Qwen3 Coder 30B A3B Instruct | $0.07 | $0.27 | - | - | 160K |
Qwen: Qwen3 Coder Flash | $0.30 | $1.50 | $0.06 | - | 1M |
Qwen: Qwen3 Coder Next | $0.12 | $0.75 | $0.06 | - | 262K |
Qwen: Qwen3 Coder Plus | $1.00 | $5.00 | $0.20 | - | 1M |
Qwen: Qwen3 Coder 480B A35B (exacto) | $0.22 | $1.80 | $0.022 | - | 262K |
Qwen: Qwen3 Max | $1.20 | $6.00 | $0.24 | - | 262K |
Qwen: Qwen3 Max Thinking | $1.20 | $6.00 | - | - | 262K |
Qwen: Qwen3 Next 80B A3B Instruct | $0.09 | $1.10 | - | - | 262K |
Qwen: Qwen3 Next 80B A3B Thinking | $0.15 | $1.20 | - | - | 128K |
Qwen: Qwen3 VL 235B A22B Instruct | $0.20 | $0.88 | $0.11 | - | 262K |
Qwen: Qwen3 VL 30B A3B Instruct | $0.13 | $0.52 | - | - | 131K |
Qwen: Qwen3 VL 32B Instruct | $0.104 | $0.416 | - | - | 131K |
Qwen: Qwen3 VL 8B Instruct | $0.08 | $0.50 | - | - | 131K |
Qwen: Qwen3 VL 8B Thinking | $0.117 | $1.36 | - | - | 131K |
Qwen: Qwen3.5 397B A17B | $0.55 | $3.50 | $0.55 | - | 262K |
Qwen: Qwen3.5 Plus 2026-02-15 | $0.40 | $2.40 | - | - | 1M |
Qwen: QwQ 32B | $0.15 | $0.40 | - | - | 33K |
raifle (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
SorcererLM 8x22B | $4.50 | $4.50 | - | - | 16K |
relace (2 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Relace: Relace Apply 3 | $0.85 | $1.25 | - | - | 256K |
Relace: Relace Search | $1.00 | $3.00 | - | - | 256K |
sao10k (5 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Sao10k: Llama 3 Euryale 70B v2.1 | $1.48 | $1.48 | - | - | 8K |
Sao10K: Llama 3 8B Lunaris | $0.04 | $0.05 | - | - | 8K |
Sao10K: Llama 3.1 70B Hanami x1 | $3.00 | $3.00 | - | - | 16K |
Sao10K: Llama 3.1 Euryale 70B v2.2 | $0.65 | $0.75 | - | - | 33K |
Sao10K: Llama 3.3 Euryale 70B | $0.65 | $0.75 | - | - | 131K |
stepfun (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
StepFun: Step 3.5 Flash | $0.10 | $0.30 | $0.02 | - | 256K |
switchpoint (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Switchpoint Router | $0.85 | $3.40 | - | - | 131K |
tencent (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Tencent: Hunyuan A13B Instruct | $0.14 | $0.57 | - | - | 131K |
thedrummer (4 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
TheDrummer: Cydonia 24B V4.1 | $0.30 | $0.50 | - | - | 131K |
TheDrummer: Rocinante 12B | $0.17 | $0.43 | - | - | 33K |
TheDrummer: Skyfall 36B V2 | $0.55 | $0.80 | - | - | 33K |
TheDrummer: UnslopNemo 12B | $0.40 | $0.40 | - | - | 33K |
tngtech (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
TNG: DeepSeek R1T2 Chimera | $0.25 | $0.85 | $0.125 | - | 164K |
undi95 (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
ReMM SLERP 13B | $0.45 | $0.65 | - | - | 6K |
writer (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Writer: Palmyra X5 | $0.60 | $6.00 | - | - | 1M |
xAI (8 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
xAI: Grok 3 | $3.00 | $15.00 | $0.75 | - | 131K |
xAI: Grok 3 Beta | $3.00 | $15.00 | $0.75 | - | 131K |
xAI: Grok 3 Mini | $0.30 | $0.50 | $0.075 | - | 131K |
xAI: Grok 3 Mini Beta | $0.30 | $0.50 | $0.075 | - | 131K |
xAI: Grok 4 | $3.00 | $15.00 | $0.75 | - | 256K |
xAI: Grok 4 Fast | $0.20 | $0.50 | $0.05 | - | 2M |
xAI: Grok 4.1 Fast | $0.20 | $0.50 | $0.05 | - | 2M |
xAI: Grok Code Fast 1 | $0.20 | $1.50 | $0.02 | - | 256K |
xiaomi (1 model)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Xiaomi: MiMo-V2-Flash | $0.09 | $0.29 | $0.045 | - | 262K |
Z.ai (10 models)
| Model | Input / 1M tokens | Output / 1M tokens | Cache Read / 1M | Reasoning / 1M | Context |
|---|---|---|---|---|---|
Z.ai: GLM 4 32B | $0.10 | $0.10 | - | - | 128K |
Z.ai: GLM 4.5 | $0.55 | $2.00 | - | - | 131K |
Z.ai: GLM 4.5 Air | $0.13 | $0.85 | $0.025 | - | 131K |
Z.ai: GLM 4.5V | $0.60 | $1.80 | $0.11 | - | 66K |
Z.ai: GLM 4.6 | $0.35 | $1.71 | - | - | 203K |
Z.ai: GLM 4.6 (exacto) | $0.44 | $1.76 | $0.11 | - | 205K |
Z.ai: GLM 4.6V | $0.30 | $0.90 | - | - | 131K |
Z.ai: GLM 4.7 | $0.38 | $1.70 | $0.19 | - | 203K |
Z.ai: GLM 4.7 Flash | $0.06 | $0.40 | $0.01 | - | 203K |
Z.ai: GLM 5 | $0.95 | $2.55 | $0.20 | - | 205K |