We host 100+ open-source models on our serverless inference platform and even more on dedicated endpoints. This guide helps you choose the right model for your specific use case. For a complete list of all available models with detailed specifications, visit our Serverless and Dedicated Models pages.
| Use Case | Recommended Model | Model String | Alternatives | Learn More |
| --- | --- | --- | --- | --- |
| Chat | Kimi K2 Instruct 0905 | moonshotai/Kimi-K2-Instruct-0905 | deepseek-ai/DeepSeek-V3.1, Qwen/Qwen3-235B-A22B-Instruct-2507-tput | Chat |
| Reasoning | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1 | Qwen/Qwen3-235B-A22B-Thinking-2507, openai/gpt-oss-120b | Reasoning Guide, DeepSeek R1 |
| Coding Agents | Qwen3-Coder 480B-A35B | Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | moonshotai/Kimi-K2-Instruct-0905, deepseek-ai/DeepSeek-V3.1 | Building Agents |
| Small & Fast | GPT-OSS 20B | openai/gpt-oss-20b | Qwen/Qwen2.5-7B-Instruct-Turbo, meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | - |
| Medium General Purpose | GLM 4.5 Air | zai-org/GLM-4.5-Air-FP8 | Qwen/Qwen3-Next-80B-A3B-Instruct, openai/gpt-oss-120b | - |
| Function Calling | GLM 4.5 Air | zai-org/GLM-4.5-Air-FP8 | Qwen/Qwen3-Next-80B-A3B-Instruct, moonshotai/Kimi-K2-Instruct-0905 | Function Calling |
| Vision | Llama 4 Maverick | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | Qwen/Qwen2.5-VL-72B-Instruct | Vision, OCR |
| Image Generation | Qwen Image | Qwen/Qwen-Image | google/flash-image-2.5, ByteDance-Seed/Seedream-4.0 | Images |
| Image-to-Image | Flash Image 2.5 (Nano Banana) | google/flash-image-2.5 | black-forest-labs/FLUX.1-kontext-pro | Flux Kontext |
| Text-to-Video | Sora 2 | openai/sora-2-pro | google/veo-3.0, ByteDance/Seedance-1.0-pro | Video Generation |
| Image-to-Video | Veo 3.0 | google/veo-3.0 | ByteDance/Seedance-1.0-pro, kwaivgI/kling-2.1-master | Video Generation |
| Text-to-Speech | Cartesia Sonic 2 | cartesia/sonic-2 | cartesia/sonic | Text-to-Speech |
| Speech-to-Text | Whisper Large v3 | openai/whisper-large-v3 | - | Speech-to-Text |
| Embeddings | GTE-ModernBERT Base | Alibaba-NLP/gte-modernbert-base | intfloat/multilingual-e5-large-instruct | Embeddings |
| Rerank | MixedBread Rerank Large | mixedbread-ai/Mxbai-Rerank-Large-V2 | Salesforce/Llama-Rank-v1 | Rerank, Guide |
| Moderation | Virtue Guard | VirtueAI/VirtueGuard-Text-Lite | meta-llama/Llama-Guard-4-12B | - |
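Each model string in the table is what you pass as the `model` parameter in an API call. Below is a minimal sketch for the chat use case, assuming an OpenAI-compatible chat completions endpoint; the base URL and environment variable name are placeholders to replace with the values from your account settings.

```python
# Minimal sketch: call a chat model by its model string from the table above.
# Assumptions: an OpenAI-compatible endpoint and an API key stored in an
# environment variable -- adjust both for your platform.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",   # assumed OpenAI-compatible base URL
    api_key=os.environ["TOGETHER_API_KEY"],   # assumed environment variable name
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",  # recommended chat model from the table
    messages=[
        {"role": "user", "content": "Summarize the benefits of serverless inference."}
    ],
)

print(response.choices[0].message.content)
```

Switching use cases is usually just a matter of swapping the `model` string, for example `openai/gpt-oss-20b` when you need a small, fast model.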

Need Help Choosing? For high-volume production workloads, consider Dedicated Inference for guaranteed capacity and predictable performance.