Model catalog
53 models available across 19 providers. One API, consistent format.
Orpheus v1
Canopy Labs
Directorial TTS with bracketed emotional cues. 100 chars/sec.
orpheus-v1
Eleven Multilingual v3
ElevenLabs
Expressive TTS with emotional control. Sighs, whispers, cues.
eleven-multilingual-v3
GPT Realtime
OpenAI
Native bidirectional audio streaming. Sub-200ms latency.
gpt-realtime
Whisper Large v3 Turbo
OpenAI
228x speed transcription on Groq. Ultra-low cost STT.
whisper-large-v3-turbo
Qwen 3 235B
Alibaba
Large MoE (22B active) for complex multilingual tasks.
qwen3-235b-a22b
Qwen 3 32B
Alibaba
Dual-mode thinking/non-thinking. 662 TPS on Groq hardware.
qwen3-32b
Claude Haiku 4.5
Anthropic
Ultra-fast responses at low cost. Ideal for high-throughput.
claude-haiku-4-5
Claude Opus 4.6
Anthropic
Most powerful Claude. Extended thinking, 1M beta context window.
claude-opus-4-6
Claude Sonnet 4.6
Anthropic
Balanced performance and speed for enterprise workloads.
claude-sonnet-4-6
DeepSeek V3.1
DeepSeek
671B MoE (37B active). Extreme efficiency with sparse attention.
deepseek-v3.1
Gemini 2.5 Flash Lite
Fastest Gemini variant at near-zero cost.
gemini-2.5-flash-lite
Gemini 2.5 Pro
2M context with native Google Search grounding.
gemini-2.5-pro
Gemini 3.1 Pro
Latest Gemini with advanced vibe coding and multimodality.
gemini-3.1-pro-preview
Llama 4 Maverick
Meta
400B MoE (17B active). Native multimodal, 562 TPS on Groq.
llama-4-maverick-17b-128e
Llama 4 Scout
Meta
109B MoE (17B active). Lean multimodal, near 600 TPS.
llama-4-scout-17b-16e
Mistral Large 3
Mistral AI
Enterprise-grade with 256K context. EU data sovereignty.
mistral-large-latest
Mistral Small 3
Mistral AI
Efficient model for routine tasks and high volume.
mistral-small-3
Kimi K2
Moonshot AI
1T MoE (32B active). Excels at frontend dev and tool calling.
kimi-k2-instruct
GPT-4.1
OpenAI
Reliable general-purpose model with function calling and vision.
gpt-4.1
GPT-5 Mini
OpenAI
Cost-efficient model for high-volume production tasks.
gpt-5-mini
GPT-5 Nano
OpenAI
Ultra-low cost for triage, extraction, and metadata tasks.
gpt-5-nano
GPT-5.2
OpenAI
Latest OpenAI flagship. 400K context with prompt caching support.
gpt-5.2
GPT-OSS 120B
OpenAI
Open-weight MoE (5.1B active). Optimized for agentic workflows.
gpt-oss-120b
GPT-OSS 20B
OpenAI
Compact 20B model. Over 1,000 TPS on Groq LPU hardware.
gpt-oss-20b
Grok 4
xAI
2M context with fast reasoning and competitive output pricing.
grok-4
Qwen 3 Coder 480B
Alibaba
Massive 480B coding MoE (35B active). Top benchmark scores.
qwen3-coder-480b-a35b
Codestral
Mistral AI
Specialized for code generation, completion, and refactoring.
codestral
Devstral 2
Mistral AI
Open model tailored for code agents and automation.
devstral-2
Grok Code Fast
xAI
Optimized for fast code generation and debugging.
grok-code-fast-1
Embed English v3
Cohere
1024-dim English embeddings. Native image embedding support.
embed-english-v3
Jina Embeddings v5
Jina AI
Task-specific LoRA adapters. 1024 dims, truncatable to 32.
jina-embeddings-v5
Text Embedding 3 Large
OpenAI
3072-dimensional high-accuracy embeddings.
text-embedding-3-large
Text Embedding 3 Small
OpenAI
Cost-efficient 1536-dimensional embeddings.
text-embedding-3-small
Voyage 4 Large
Voyage AI
First MoE embedding model. 1024 dims, Matryoshka support.
voyage-4-large
Voyage 4 Lite
Voyage AI
Cost-optimized embeddings with flexible dimensions.
voyage-4-lite
FLUX.1 Pro
Black Forest Labs
Professional quality generation with high detail.
flux-1-pro
FLUX.1 Schnell
Black Forest Labs
Ultra-fast 4-step generation at minimal cost.
flux-1-schnell
FLUX.2 Max
Black Forest Labs
Premium high-fidelity synthesis. 50 diffusion steps.
flux-2-max
Ideogram 3.0
Ideogram
Best-in-class text rendering accuracy in generated images.
ideogram-3.0
Kling 2.1 Image
Kling AI
Text-to-image and multi-image style transfers.
kling-v2-1-image
GPT Image 1.5
OpenAI
Tokenized image generation. DALL-E successor with precise prompting.
gpt-image-1.5
Stable Diffusion 3.5 Flash
Stability AI
Fast distilled generation. 2.5 credits per generation.
sd3.5-flash
Stable Diffusion 3.5 Large
Stability AI
High-fidelity diffusion model. 6.5 credits per generation.
sd3.5-large
Stable Diffusion 3.5 Medium
Stability AI
Balanced quality and speed. 3.5 credits per generation.
sd3.5-medium
DeepSeek R1
DeepSeek
Reasoning specialist with 23K internal thinking tokens. AIME SOTA.
deepseek-r1
O3
OpenAI
Advanced reasoning for complex multi-step deduction.
o3
O3 Pro
OpenAI
Premium reasoning with highest accuracy. Math and crypto focus.
o3-pro
O4 Mini
OpenAI
Cost-effective reasoning model for everyday logic tasks.
o4-mini
Veo 3.0
Advanced video generation architecture via Runway API.
veo-3.0
Kling 2.1 Pro
Kling AI
Cinematic realism with complex camera motions. 1080p.
kling-2.1-pro
Sora 2
OpenAI
State-of-the-art physics simulation. 720p with synced audio.
sora-2
Sora 2 Pro
OpenAI
Premium 1080p video generation with native audio.
sora-2-pro
Gen-4 Turbo
Runway
Fast video generation. 5 credits per second.
gen-4-turbo