Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.together.ai/llms.txt

Use this file to discover all available pages before exploring further.

Together hosts 100+ open-source models across text, image, video, and audio. Most of the models below are for instant serverless inference, or reserved hardware deployments on dedicated endpoints. Both options use the same inference API.

Chat & text

Use caseRecommended modelModel stringAlternativesLearn more
ChatKimi K2.5 (instant mode)moonshotai/Kimi-K2.5openai/gpt-oss-120bChat completions
ReasoningKimi K2.5 (reasoning mode)moonshotai/Kimi-K2.5deepseek-ai/DeepSeek-R1, Qwen/Qwen3-235B-A22B-Instruct-2507-tputReasoning
Coding agentsGLM-5.1zai-org/GLM-5.1Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8Build coding agents
Small and fastGemma 4 31B ITgoogle/gemma-4-31B-itopenai/gpt-oss-20b, Qwen/Qwen3.5-9B-
Mid-size general purposeGPT-OSS 120Bopenai/gpt-oss-120bMiniMaxAI/MiniMax-M2.7, meta-llama/Llama-3.3-70B-Instruct-Turbo-
Function callingGLM-5.1zai-org/GLM-5.1moonshotai/Kimi-K2.5Function calling

Vision

Use caseRecommended modelModel stringAlternativesLearn more
VisionKimi K2.5moonshotai/Kimi-K2.5google/gemma-4-31B-it, Qwen/Qwen3.5-397B-A17B, Qwen/Qwen3.5-9BVision, OCR quickstart

Image generation

Use caseRecommended modelModel stringAlternativesLearn more
Text-to-imageFlash Image 2.5google/flash-image-2.5black-forest-labs/FLUX.2-pro, ByteDance-Seed/Seedream-4.0Text-to-image
Image-to-imageFlash Image 2.5google/flash-image-2.5black-forest-labs/FLUX.1-kontext-max, google/gemini-3-pro-imageImage-to-image

Video generation

Use caseRecommended modelModel stringAlternativesLearn more
Text-to-videoSora 2 Proopenai/sora-2-progoogle/veo-3.0, ByteDance/Seedance-1.0-proVideo generation
Image-to-videoVeo 3.0google/veo-3.0ByteDance/Seedance-1.0-pro, kwaivgI/kling-2.1-masterVideo generation

Audio

Use caseRecommended modelModel stringAlternativesLearn more
Text-to-speechCartesia Sonic 3cartesia/sonic-3canopylabs/orpheus-3b-0.1-ft, hexgrad/Kokoro-82MText-to-speech
Speech-to-textWhisper Large v3openai/whisper-large-v3nvidia/parakeet-tdt-0.6b-v3, deepgram/nova-3-en, mistralai/Voxtral-Mini-3B-2507Speech-to-text

Embeddings, rerank, and moderation

Use caseRecommended modelModel stringNotesLearn more
EmbeddingsMultilingual E5 Largeintfloat/multilingual-e5-large-instruct-Embeddings
RerankMixedBread Rerank Largemixedbread-ai/Mxbai-Rerank-Large-V2Only on dedicated endpointsRerank, Improve search with rerankers
ModerationLlama Guard 4 12Bmeta-llama/Llama-Guard-4-12B--

Serverless models

Full catalog with context windows, pricing, and capabilities.

Dedicated endpoint models

Models available on reserved hardware.

WhichLLM

Categorical benchmarks to compare models across use cases.

Pricing

Per-token and per-output pricing for all models.