Model Pricing
This page lists pricing and model limits (maximum output and context window) for each supported model type, so you can compare integration cost and capabilities. Treat the model marketplace as the source of truth, because this documentation may lag behind.
- Chat Models
- Image Generation
- Video Generation
- Text to Speech
- Realtime Audio
- Translation
- Image Recognition
- Embeddings

Chat Models

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OpenAI | GPT-5.4 (`gpt-5.4`) | Input Tokens < 272K | Input: $2.5/Million Tokens; Output: $15/Million Tokens; Cache Read: $0.25/Million Tokens | 128K | 1.05M |
| OpenAI | GPT-5.4 (`gpt-5.4`) | Input Tokens >= 272K | Input: $5/Million Tokens; Output: $22.5/Million Tokens; Cache Read: $0.5/Million Tokens | 128K | 1.05M |
| OpenAI | GPT-5.3-Codex (`gpt-5.3-codex`) | Default Tier | Input: $1.75/Million Tokens; Output: $14/Million Tokens; Cache Read: $0.175/Million Tokens | 128K | 400K |
| OpenAI | GPT-5.4-Pro | Input Tokens < 272K | Input: $30/Million Tokens; Output: $180/Million Tokens | 128K | 1.05M |
| OpenAI | GPT-5.4-Pro | Input Tokens >= 272K | Input: $60/Million Tokens; Output: $270/Million Tokens | 128K | 1.05M |
| Anthropic | Claude-Opus-4.6 (`claude-opus-4.6`) | Default Tier | Input: $5/Million Tokens; Output: $25/Million Tokens; Cache Write (5m): $6.25/Million Tokens; Cache Write (1h): $10/Million Tokens; Cache Read: $0.5/Million Tokens | 128K | 200K |
| Anthropic | Claude-Sonnet-4.6 (`claude-sonnet-4.6`) | Default Tier | Input: $3/Million Tokens; Output: $15/Million Tokens; Cache Write (5m): $3.75/Million Tokens; Cache Write (1h): $6/Million Tokens; Cache Read: $0.3/Million Tokens | 128K | 200K |
| Anthropic | Claude-Opus-4.5 (`claude-opus-4.5`) | Default Tier | Input: $5/Million Tokens; Output: $25/Million Tokens; Cache Write (5m): $6.25/Million Tokens; Cache Write (1h): $10/Million Tokens; Cache Read: $0.5/Million Tokens | 64K | 200K |
| Anthropic | Claude-Sonnet-4.5 (`claude-sonnet-4.5`) | Input Tokens <= 200K | Input: $3/Million Tokens; Output: $15/Million Tokens; Cache Write (5m): $3.75/Million Tokens; Cache Write (1h): $6/Million Tokens; Cache Read: $0.3/Million Tokens | 64K | 200K |
| Anthropic | Claude-Sonnet-4.5 (`claude-sonnet-4.5`) | Input Tokens > 200K | Input: $6/Million Tokens; Output: $22.5/Million Tokens; Cache Write (5m): $7.5/Million Tokens; Cache Write (1h): $12/Million Tokens; Cache Read: $0.6/Million Tokens | 64K | 200K |
| Anthropic | Claude-Haiku-4.5 (`claude-haiku-4.5`) | Default Tier | Input: $1/Million Tokens; Output: $5/Million Tokens; Cache Write (5m): $1.25/Million Tokens; Cache Write (1h): $2/Million Tokens; Cache Read: $0.1/Million Tokens | - | - |
| Google | Gemini-3.1-Pro (`gemini-3.1-pro`) | Input Tokens <= 200K | Input: $2/Million Tokens; Output: $12/Million Tokens; Cache Read: $0.2/Million Tokens | 64K | 1M |
| Google | Gemini-3.1-Pro (`gemini-3.1-pro`) | Input Tokens > 200K | Input: $4/Million Tokens; Output: $18/Million Tokens; Cache Read: $0.4/Million Tokens | 64K | 1M |
| Google | Gemini-3-Flash (`gemini-3-flash`) | Default Tier | Input: $0.5/Million Tokens; Output: $3/Million Tokens; Cache Read: $0.05/Million Tokens | 64K | 1M |
| DeepSeek | DeepSeek-V3.2 | Default Tier | Input: $0.3/Million Tokens; Output: $0.5/Million Tokens | 32K | 128K |
| DeepSeek | DeepSeek-V3.2-Thinking | Default Tier | Input: $0.3/Million Tokens; Output: $0.5/Million Tokens | 32K | 128K |
| DeepSeek | DeepSeek-R1 | Default Tier | Input: $0.6/Million Tokens; Output: $2.4/Million Tokens | 28K | 128K |
| Alibaba | Qwen3-32B (`qwen3-32b`) | Default Tier | Input: $0.284/Million Tokens; Output: $1.136/Million Tokens | 32K | 32K |
| Alibaba | Qwen3-32B-Thinking (`qwen3-32b-thinking`) | Default Tier | Input: $0.284/Million Tokens; Output: $2.84/Million Tokens | 32K | 32K |
| Alibaba | Qwen3-coder-plus | Input Tokens <= 32K | Input: $0.574/Million Tokens; Output: $2.294/Million Tokens; Cache Read: $0.115/Million Tokens | 63K | 1M |
| Alibaba | Qwen3-coder-plus | 32K < Input Tokens <= 128K | Input: $0.861/Million Tokens; Output: $3.441/Million Tokens; Cache Read: $0.173/Million Tokens | 63K | 1M |
| Alibaba | Qwen3-coder-plus | 128K < Input Tokens <= 256K | Input: $1.434/Million Tokens; Output: $5.735/Million Tokens; Cache Read: $0.287/Million Tokens | 63K | 1M |
| Alibaba | Qwen3-coder-plus | Input Tokens > 256K | Input: $2.868/Million Tokens; Output: $28.671/Million Tokens; Cache Read: $0.574/Million Tokens | 63K | 1M |
| MiniMax | MiniMax-M2.5 | Default Tier | Input: $0.304/Million Tokens; Output: $1.213/Million Tokens; Cache Read: $0.061/Million Tokens | - | - |
| Moonshot | Kimi-K2.5 | Default Tier | Input: $0.574/Million Tokens; Output: $3.011/Million Tokens; Cache Read: $0.115/Million Tokens | 32K | 256K |
| xAI | Grok-3 | Default Tier | Input: $3/Million Tokens; Output: $15/Million Tokens | - | 131K |
| Zhipu | GLM-5 | Input Tokens <= 32K | Input: $0.573/Million Tokens; Output: $2.58/Million Tokens; Cache Read: $0.115/Million Tokens | 128K | 200K |
| Zhipu | GLM-5 | Input Tokens > 32K | Input: $0.86/Million Tokens; Output: $3.154/Million Tokens; Cache Read: $0.172/Million Tokens | 128K | 200K |
| ByteDance | bytedance-seed-2.0-lite | Input Tokens < 128K | Input: $0.25/Million Tokens; Output: $2/Million Tokens; Cache Read: $0.05/Million Tokens | 128K | 256K |
| ByteDance | bytedance-seed-2.0-lite | Input Tokens > 128K | Input: $0.5/Million Tokens; Output: $4/Million Tokens; Cache Read: $0.1/Million Tokens | 128K | 256K |
| OPEAI | gemma-4 | Default Tier | Input: $0.85/Million Tokens; Output: $0.85/Million Tokens | - | - |
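
The tiered rates in the chat model table can be turned into a per-request estimate. The sketch below is illustrative only and uses the Gemini-3.1-Pro rates listed above. The names `Tier` and `estimate_cost` are hypothetical, not part of any platform SDK. It also rests on three assumptions this page does not confirm: "200K" means 200,000 tokens, the tier selected by the input token count applies to the whole request, and cached tokens are a subset of the input tokens billed at the Cache Read rate instead of the Input rate. Check the model marketplace for the actual billing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One pricing tier; rates are in USD per million tokens."""
    max_input_tokens: float      # inclusive upper bound on input tokens (assumed)
    input_rate: float
    output_rate: float
    cache_read_rate: float = 0.0


# Rates copied from the Gemini-3.1-Pro rows of the table above.
# Assumption: "200K" is read as 200,000 tokens.
GEMINI_3_1_PRO = [
    Tier(200_000, input_rate=2.0, output_rate=12.0, cache_read_rate=0.2),
    Tier(float("inf"), input_rate=4.0, output_rate=18.0, cache_read_rate=0.4),
]


def estimate_cost(tiers, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: the first tier whose bound covers `input_tokens` prices the
    whole request, and `cached_tokens` (part of `input_tokens`) are billed
    at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    tier = next(t for t in tiers if input_tokens <= t.max_input_tokens)
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * tier.input_rate
        + cached_tokens * tier.cache_read_rate
        + output_tokens * tier.output_rate
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens fall in the lower tier.
    print(round(estimate_cost(GEMINI_3_1_PRO, 100_000, 2_000), 4))
    # 250K input tokens (50K of them cached) fall in the higher tier.
    print(round(estimate_cost(GEMINI_3_1_PRO, 250_000, 2_000, cached_tokens=50_000), 4))
```

Under these assumptions, the first call comes to about $0.224 and the second to about $0.856. Models whose tiers use a strict `<` boundary (for example `Input Tokens < 272K`) would need the comparison in `estimate_cost` adjusted accordingly.
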

Image Generation

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| Alibaba | Qwen-Image | Default Tier | Image: $1/image | - | - |
| Alibaba | wan2.6-image | Default Tier | Image: $0.03/image | - | - |
| Alibaba | wan2.6-t2i | Default Tier | Image: $0.03/image | - | - |
| ByteDance | Doubao-Seedream-4.5 | Default Tier | Image: $0.05/image | - | - |
| ByteDance | Doubao-Seedream-3.0 | Default Tier | Image: $0.03/image | - | - |
| ByteDance | bytedance-seedream-5.0 | Default Tier | Image: $0.035/image | - | - |
| Google | Nano Banana 2 | Default Tier | Image: $0.16/image | - | - |

Video Generation

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| ByteDance | bytedance-seedance-1.5 | Default Tier | Video (Audio): $2.4/Million Tokens; Video (Silent): $1.2/Million Tokens | - | - |
| OpenAI | sora-2 | Default Tier | Video: $0.1/second | - | - |
| Alibaba | wan2.6-i2v | Default Tier | Video: $0.15/second | - | - |
| Alibaba | wan2.7-i2v | Default Tier | Video: $0.15/second | - | - |

Text to Speech

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OPEAI | AudioLLM/Spark | Default Tier | Input: $1.6/Million Tokens; Output: $1.6/Million Tokens | - | - |
| OPEAI | AudioLLM/Voice1.0 | Default Tier | Input: $1.8/Million Tokens; Output: $3.6/Million Tokens | - | - |

Realtime Audio

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OPEAI | AudioLLM/Voice2.0 | Default Tier | Input: $1.5/Million Tokens; Output: $3/Million Tokens | - | - |

Translation

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OPEAI | MLModel1.5 | Default Tier | Input: $1/Million Tokens; Output: $2/Million Tokens | - | - |
| OPEAI | MTModel1.0 | Default Tier | Input: $1.4/Million Tokens; Output: $1.4/Million Tokens | - | - |
| OPEAI | Tencent/MT-Hunyuan-7B | Default Tier | Input: $1/Million Tokens; Output: $1/Million Tokens | - | - |

Image Recognition

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OPEAI | PaddleOCR-VL-0.9B | Default Tier | Input: $0.6/Million Tokens; Output: $1.2/Million Tokens | - | - |
| OPEAI | Image-Recognition | Default Tier | Input: $1.35/Million Tokens; Output: $3.5/Million Tokens | - | - |

Embeddings

| Vendor | Model | Tier | Pricing Details | Max Output | Context Window |
|---|---|---|---|---|---|
| OPEAI | bge-m3 | Default Tier | Input: $0.1/Million Tokens; Output: $0.1/Million Tokens | - | - |