Evaluations Supported Models

The following models are supported for use as both judge models and models to be evaluated in the Together AI Evaluations API. You can specify any of these models in the model_name field of your evaluation configuration.

Judge models: For best results, we recommend using a larger, more capable model (such as DeepSeek-V3-0324 or DeepSeek-R1-0528) as the judge.
Evaluated models: You may use any of these models as the subject of your evaluation, either by referencing a column in your dataset or by providing a model configuration object.
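
The two ways of specifying an evaluated model can be sketched as a small helper that builds the model portion of an evaluation configuration. This is a minimal illustration, not the actual API schema: only the model_name field comes from this page, while the keys "judge" and "model_to_evaluate" and the function names are hypothetical, and the model set is a subset of the table below.

```python
from typing import Optional

# Subset of API model strings copied from the table on this page.
SUPPORTED_MODELS = {
    "deepseek-ai/DeepSeek-V3",
    "deepseek-ai/DeepSeek-R1-0528",
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    "Qwen/Qwen2.5-72B-Instruct-Turbo",
}


def model_config(model_name: str) -> dict:
    """Return a minimal config object after checking the model string."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model for evaluations: {model_name!r}")
    return {"model_name": model_name}


def build_settings(
    judge_model: str,
    *,
    response_column: Optional[str] = None,
    evaluated_model: Optional[str] = None,
) -> dict:
    """Pair a judge with either a dataset column or a model config object.

    The keys "judge" and "model_to_evaluate" are hypothetical names used
    for illustration only.
    """
    # Exactly one of the two ways of naming the evaluated model is allowed.
    if (response_column is None) == (evaluated_model is None):
        raise ValueError("Provide exactly one of response_column or evaluated_model")
    settings = {"judge": model_config(judge_model)}
    if response_column is not None:
        # Responses already exist in the dataset; reference the column by name.
        settings["model_to_evaluate"] = response_column
    else:
        # Responses are generated by a supported model.
        settings["model_to_evaluate"] = model_config(evaluated_model)
    return settings


settings = build_settings(
    "deepseek-ai/DeepSeek-V3",
    evaluated_model="Qwen/Qwen2.5-72B-Instruct-Turbo",
)
```

Checking the model string before building the configuration surfaces a typo locally rather than after a job has been submitted.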

Note: The list below is updated regularly as new models become available.

Organization	Model Name	API Model String	Context Length (tokens)
Moonshot	Kimi K2 Instruct	moonshotai/Kimi-K2-Instruct	128,000
DeepSeek	DeepSeek-V3-0324	deepseek-ai/DeepSeek-V3	163,839
DeepSeek	DeepSeek-R1-0528	deepseek-ai/DeepSeek-R1-0528	163,839
Meta	Llama 3.1 405B Instruct Turbo	meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo	130,815
Qwen	Qwen3 235B A22B Throughput	Qwen/Qwen3-235B-A22B-fp8-tput	40,960
Qwen	Qwen 2.5 72B Instruct Turbo	Qwen/Qwen2.5-72B-Instruct-Turbo	32,768
DeepSeek	DeepSeek R1 Distill Llama 70B	deepseek-ai/DeepSeek-R1-Distill-Llama-70B	131,072
Meta	Llama 3.3 70B Instruct Turbo	meta-llama/Llama-3.3-70B-Instruct-Turbo	131,072
Meta	Llama 3.1 70B Instruct Turbo	meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo	131,072
Qwen	QwQ-32B	Qwen/QwQ-32B	32,768
Qwen	Qwen 2.5 Coder 32B Instruct	Qwen/Qwen2.5-Coder-32B-Instruct	32,768
Google	Gemma 2 27B	google/gemma-2-27b-it	8,192
Mistral AI	Mistral Small 3 Instruct (24B)	mistralai/Mistral-Small-24B-Instruct-2501	32,768
DeepSeek	DeepSeek R1 Distill Qwen 14B	deepseek-ai/DeepSeek-R1-Distill-Qwen-14B	131,072
Marin Community	Marin 8B Instruct	marin-community/marin-8b-instruct	4,096
Meta	Llama 3.1 8B Instruct Turbo	meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo	131,072
Qwen	Qwen 2.5 7B Instruct Turbo	Qwen/Qwen2.5-7B-Instruct-Turbo	32,768
Meta	Llama 3.2 3B Instruct Turbo	meta-llama/Llama-3.2-3B-Instruct-Turbo	131,072
Meta	Llama 4 Maverick (17Bx128E)	meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	1,048,576
Meta	Llama 4 Scout (17Bx16E)	meta-llama/Llama-4-Scout-17B-16E-Instruct	1,048,576
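
Context length varies widely across the table (from 4,096 to 1,048,576), so it can be worth checking that evaluation prompts fit before choosing a model. The sketch below assumes the listed context length is a budget shared by prompt and generated tokens, which is the usual convention but is not stated on this page. The function names are hypothetical and the dictionary holds only a few rows from the table.

```python
# A few rows from the table above: API model string -> context length.
CONTEXT_LENGTH = {
    "moonshotai/Kimi-K2-Instruct": 128_000,
    "deepseek-ai/DeepSeek-V3": 163_839,
    "google/gemma-2-27b-it": 8_192,
    "marin-community/marin-8b-instruct": 4_096,
    "meta-llama/Llama-4-Scout-17B-16E-Instruct": 1_048_576,
}


def fits_context(model: str, prompt_tokens: int, max_output_tokens: int = 0) -> bool:
    """True if prompt plus requested output fits in the model's context.

    Assumes prompt and output tokens share one context budget.
    """
    return prompt_tokens + max_output_tokens <= CONTEXT_LENGTH[model]


def models_that_fit(prompt_tokens: int, max_output_tokens: int = 0) -> list:
    """Return the listed models (sorted by name) that can hold the request."""
    return sorted(
        model
        for model in CONTEXT_LENGTH
        if fits_context(model, prompt_tokens, max_output_tokens)
    )


candidates = models_that_fit(prompt_tokens=150_000)
```

Token counts depend on each model's tokenizer, so the same text may fit one model and not another even when their context lengths are equal.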
