Together hosts 100+ open-source models across text, image, video, and audio. Most of the models below are available for instant serverless inference or as reserved-hardware deployments on dedicated endpoints; both options use the same inference API.
Documentation Index
Fetch the complete documentation index at: https://docs.together.ai/llms.txt
Use this file to discover all available pages before exploring further.
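The index can be fetched and parsed with the standard library alone. A minimal sketch, assuming the llms.txt follows the usual convention of one markdown-style link per documented page:

```python
import re
import urllib.request

INDEX_URL = "https://docs.together.ai/llms.txt"

def extract_links(text: str) -> list[str]:
    """Pull the (url) targets out of markdown-style links in an llms.txt index."""
    return re.findall(r"\((https?://[^)\s]+)\)", text)

def fetch_index(url: str = INDEX_URL) -> list[str]:
    """Download the index and return the page URLs it lists."""
    with urllib.request.urlopen(url) as resp:
        return extract_links(resp.read().decode("utf-8"))
```

`fetch_index()` gives you the list of page URLs to explore before drilling into any one topic.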
Chat & text
| Use case | Recommended model | Model string | Alternatives | Learn more |
|---|---|---|---|---|
| Chat | Kimi K2.5 (instant mode) | moonshotai/Kimi-K2.5 | openai/gpt-oss-120b | Chat completions |
| Reasoning | Kimi K2.5 (reasoning mode) | moonshotai/Kimi-K2.5 | deepseek-ai/DeepSeek-R1, Qwen/Qwen3-235B-A22B-Instruct-2507-tput | Reasoning |
| Coding agents | GLM-5.1 | zai-org/GLM-5.1 | Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | Build coding agents |
| Small and fast | Gemma 4 31B IT | google/gemma-4-31B-it | openai/gpt-oss-20b, Qwen/Qwen3.5-9B | - |
| Mid-size general purpose | GPT-OSS 120B | openai/gpt-oss-120b | MiniMaxAI/MiniMax-M2.7, meta-llama/Llama-3.3-70B-Instruct-Turbo | - |
| Function calling | GLM-5.1 | zai-org/GLM-5.1 | moonshotai/Kimi-K2.5 | Function calling |
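All of the chat and text models above are served through Together's OpenAI-compatible chat completions endpoint, so switching models is just a matter of changing the model string. A minimal sketch, assuming an API key in the `TOGETHER_API_KEY` environment variable:

```python
import json
import os
import urllib.request

CHAT_URL = "https://api.together.xyz/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "moonshotai/Kimi-K2.5") -> dict:
    """OpenAI-style chat payload; swap `model` for any string in the table."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict, url: str = CHAT_URL) -> dict:
    """POST the payload with a bearer token read from TOGETHER_API_KEY."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `send(build_chat_request("Hello", model="openai/gpt-oss-120b"))` targets the mid-size alternative instead of the default.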
Vision
| Use case | Recommended model | Model string | Alternatives | Learn more |
|---|---|---|---|---|
| Vision | Kimi K2.5 | moonshotai/Kimi-K2.5 | google/gemma-4-31B-it, Qwen/Qwen3.5-397B-A17B, Qwen/Qwen3.5-9B | Vision, OCR quickstart |
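Vision requests use the same chat completions endpoint with OpenAI-style multi-part message content, pairing a text part with an `image_url` part. A sketch of the request body (the content-part shape follows the OpenAI convention; check the Vision docs for model-specific limits):

```python
def build_vision_request(question: str, image_url: str,
                         model: str = "moonshotai/Kimi-K2.5") -> dict:
    """Chat payload with a text part and an image_url part in one user message."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }
```

The same payload shape covers OCR-style prompts ("transcribe the text in this image") against the alternatives in the table.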
Image generation
| Use case | Recommended model | Model string | Alternatives | Learn more |
|---|---|---|---|---|
| Text-to-image | Flash Image 2.5 | google/flash-image-2.5 | black-forest-labs/FLUX.2-pro, ByteDance-Seed/Seedream-4.0 | Text-to-image |
| Image-to-image | Flash Image 2.5 | google/flash-image-2.5 | black-forest-labs/FLUX.1-kontext-max, google/gemini-3-pro-image | Image-to-image |
Video generation
| Use case | Recommended model | Model string | Alternatives | Learn more |
|---|---|---|---|---|
| Text-to-video | Sora 2 Pro | openai/sora-2-pro | google/veo-3.0, ByteDance/Seedance-1.0-pro | Video generation |
| Image-to-video | Veo 3.0 | google/veo-3.0 | ByteDance/Seedance-1.0-pro, kwaivgI/kling-2.1-master | Video generation |
Audio
| Use case | Recommended model | Model string | Alternatives | Learn more |
|---|---|---|---|---|
| Text-to-speech | Cartesia Sonic 3 | cartesia/sonic-3 | canopylabs/orpheus-3b-0.1-ft, hexgrad/Kokoro-82M | Text-to-speech |
| Speech-to-text | Whisper Large v3 | openai/whisper-large-v3 | nvidia/parakeet-tdt-0.6b-v3, deepgram/nova-3-en, mistralai/Voxtral-Mini-3B-2507 | Speech-to-text |
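Text-to-speech pairs a model string with the input text. This sketch assumes an OpenAI-style `/v1/audio/speech` payload; the `voice` value is a placeholder rather than a documented voice ID, so see the Text-to-speech docs for real options:

```python
def build_speech_request(text: str, model: str = "cartesia/sonic-3",
                         voice: str = "default") -> dict:
    """Assumed OpenAI-compatible speech payload; `voice` is a placeholder."""
    return {"model": model, "input": text, "voice": voice}
```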
Embeddings, rerank, and moderation
| Use case | Recommended model | Model string | Notes | Learn more |
|---|---|---|---|---|
| Embeddings | Multilingual E5 Large | intfloat/multilingual-e5-large-instruct | - | Embeddings |
| Rerank | MixedBread Rerank Large | mixedbread-ai/Mxbai-Rerank-Large-V2 | Only on dedicated endpoints | Rerank, Improve search with rerankers |
| Moderation | Llama Guard 4 12B | meta-llama/Llama-Guard-4-12B | - | - |
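Embeddings use the OpenAI-compatible `input` field, while rerank takes a query plus candidate documents. A sketch of both request bodies (the `/v1/rerank` field names are assumptions; see the Rerank docs, and note the table's caveat that this model requires a dedicated endpoint):

```python
def build_embedding_request(
    texts: list[str],
    model: str = "intfloat/multilingual-e5-large-instruct",
) -> dict:
    """Payload for POST /v1/embeddings (OpenAI-compatible `input` field)."""
    return {"model": model, "input": texts}

def build_rerank_request(
    query: str,
    documents: list[str],
    model: str = "mixedbread-ai/Mxbai-Rerank-Large-V2",
) -> dict:
    """Assumed payload for POST /v1/rerank: score `documents` against `query`."""
    return {"model": model, "query": query, "documents": documents}
```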
Related resources
Serverless models: Full catalog with context windows, pricing, and capabilities.
Dedicated endpoint models: Models available on reserved hardware.
WhichLLM: Categorical benchmarks to compare models across use cases.
Pricing: Per-token and per-output pricing for all models.