Chat Models

See which open-source chat models we currently host, or learn how to configure and host your own.

Our Chat API has built-in support for many popular models we host via our serverless endpoints, as well as any model that you configure and host yourself using our dedicated GPU infrastructure.

When using one of our hosted serverless models, you'll be charged based on the amount of tokens you use in your queries. For dedicated models you configure and run yourself, you'll be charged per minute as long as your endpoint is running. You can start or stop your endpoint at any time using our online playground.

To learn more about the pricing for our serverless endoints, check out our pricing page.

Hosted models

If you're not sure which Chat model to use, we currently recommend Llama 3.1 8B Turbo (meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo) or Llama 3.1 70B Turbo (meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo)

OrganizationModel NameModel String for APIContext length
MetaLlama 3.1 8B Instruct Turbometa-llama/Meta-Llama-3.1-8B-Instruct-Turbo8192
MetaLlama 3.1 70B Instruct Turbometa-llama/Meta-Llama-3.1-70B-Instruct-Turbo8192
MetaLlama 3.1 405B Instruct Turbometa-llama/Meta-Llama-3.1-405B-Instruct-Turbo4096
MetaLlama 3 8B Instruct Turbometa-llama/Meta-Llama-3-8B-Instruct-Turbo8192
MetaLlama 3 70B Instruct Turbometa-llama/Meta-Llama-3-70B-Instruct-Turbo8192
MetaLlama 3 8B Instruct Litemeta-llama/Meta-Llama-3-8B-Instruct-Lite8192
MetaLlama 3 70B Instruct Litemeta-llama/Meta-Llama-3-70B-Instruct-Lite8192
01.AI01-ai Yi Chat (34B)zero-one-ai/Yi-34B-Chat4096
Allen AIOLMo Instruct (7B)allenai/OLMo-7B-Instruct2048
Allen AIOLMo Twin-2T (7B)allenai/OLMo-7B-Twin-2T2048
Allen AIOLMo (7B)allenai/OLMo-7B2048
AustismChronos Hermes (13B)Austism/chronos-hermes-13b2048
cognitivecomputationsDolphin 2.5 Mixtral 8x7bcognitivecomputations/dolphin-2.5-mixtral-8x7b32768
databricksDBRX Instructdatabricks/dbrx-instruct32768
DeepSeekDeepseek Coder Instruct (33B)deepseek-ai/deepseek-coder-33b-instruct16384
DeepSeekDeepSeek LLM Chat (67B)deepseek-ai/deepseek-llm-67b-chat4096
garage-bAIndPlatypus2 Instruct (70B)garage-bAInd/Platypus2-70B-instruct4096
GoogleGemma Instruct (2B)google/gemma-2b-it8192
GoogleGemma Instruct (7B)google/gemma-7b-it8192
GrypheMythoMax-L2 (13B)Gryphe/MythoMax-L2-13b4096
LM SysVicuna v1.5 (13B)lmsys/vicuna-13b-v1.54096
LM SysVicuna v1.5 (7B)lmsys/vicuna-7b-v1.54096
MetaCode Llama Instruct (13B)codellama/CodeLlama-13b-Instruct-hf16384
MetaCode Llama Instruct (34B)codellama/CodeLlama-34b-Instruct-hf16384
MetaCode Llama Instruct (70B)codellama/CodeLlama-70b-Instruct-hf4096
MetaCode Llama Instruct (7B)codellama/CodeLlama-7b-Instruct-hf16384
MetaLLaMA-2 Chat (70B)meta-llama/Llama-2-70b-chat-hf4096
MetaLLaMA-2 Chat (13B)meta-llama/Llama-2-13b-chat-hf4096
MetaLLaMA-2 Chat (7B)meta-llama/Llama-2-7b-chat-hf4096
MetaLLaMA-3 Chat (8B)meta-llama/Llama-3-8b-chat-hf8192
MetaLLaMA-3 Chat (70B)meta-llama/Llama-3-70b-chat-hf8192
mistralaiMistral (7B) Instructmistralai/Mistral-7B-Instruct-v0.18192
mistralaiMistral (7B) Instruct v0.2mistralai/Mistral-7B-Instruct-v0.232768
mistralaiMistral (7B) Instruct v0.3mistralai/Mistral-7B-Instruct-v0.332768
mistralaiMixtral-8x7B Instruct (46.7B)mistralai/Mixtral-8x7B-Instruct-v0.132768
mistralaiMixtral-8x22B Instruct (141B)mistralai/Mixtral-8x22B-Instruct-v0.165536
NousResearchNous Capybara v1.9 (7B)NousResearch/Nous-Capybara-7B-V1p98192
NousResearchNous Hermes 2 - Mistral DPO (7B)NousResearch/Nous-Hermes-2-Mistral-7B-DPO32768
NousResearchNous Hermes 2 - Mixtral 8x7B-DPO (46.7B)NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO32768
NousResearchNous Hermes 2 - Mixtral 8x7B-SFT (46.7B)NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT32768
NousResearchNous Hermes LLaMA-2 (7B)NousResearch/Nous-Hermes-llama-2-7b4096
NousResearchNous Hermes Llama-2 (13B)NousResearch/Nous-Hermes-Llama2-13b4096
NousResearchNous Hermes-2 Yi (34B)NousResearch/Nous-Hermes-2-Yi-34B4096
OpenChatOpenChat 3.5 (7B)openchat/openchat-3.5-12108192
OpenOrcaOpenOrca Mistral (7B) 8KOpen-Orca/Mistral-7B-OpenOrca8192
QwenQwen 1.5 Chat (0.5B)Qwen/Qwen1.5-0.5B-Chat32768
QwenQwen 1.5 Chat (1.8B)Qwen/Qwen1.5-1.8B-Chat32768
QwenQwen 1.5 Chat (4B)Qwen/Qwen1.5-4B-Chat32768
QwenQwen 1.5 Chat (7B)Qwen/Qwen1.5-7B-Chat32768
QwenQwen 1.5 Chat (14B)Qwen/Qwen1.5-14B-Chat32768
QwenQwen 1.5 Chat (32B)Qwen/Qwen1.5-32B-Chat32768
QwenQwen 1.5 Chat (72B)Qwen/Qwen1.5-72B-Chat32768
QwenQwen 1.5 Chat (110B)Qwen/Qwen1.5-110B-Chat32768
QwenQwen 2 Instruct (72B)Qwen/Qwen2-72B-Instruct32768
Snorkel AISnorkel Mistral PairRM DPO (7B)snorkelai/Snorkel-Mistral-PairRM-DPO32768
SnowflakeSnowflake Arctic InstructSnowflake/snowflake-arctic-instruct4096
StanfordAlpaca (7B)togethercomputer/alpaca-7b2048
TekniumOpenHermes-2-Mistral (7B)teknium/OpenHermes-2-Mistral-7B8192
TekniumOpenHermes-2.5-Mistral (7B)teknium/OpenHermes-2p5-Mistral-7B8192
TogetherLLaMA-2-7B-32K-Instruct (7B)togethercomputer/Llama-2-7B-32K-Instruct32768
TogetherRedPajama-INCITE Chat (3B)togethercomputer/RedPajama-INCITE-Chat-3B-v12048
TogetherRedPajama-INCITE Chat (7B)togethercomputer/RedPajama-INCITE-7B-Chat2048
TogetherStripedHyena Nous (7B)togethercomputer/StripedHyena-Nous-7B32768
Undi95ReMM SLERP L2 (13B)Undi95/ReMM-SLERP-L2-13B4096
Undi95Toppy M (7B)Undi95/Toppy-M-7B4096
WizardLMWizardLM v1.2 (13B)WizardLM/WizardLM-13B-V1.24096
upstageUpstage SOLAR Instruct v1 (11B)upstage/SOLAR-10.7B-Instruct-v1.04096

Dedicated Instances

Customizable on-demand deployable model instances, priced by hour hosted. All models in the serverless endpoints are available for hosting as private dedicated instances. Additionally, the below models are also available for hosting as private dedicated instances.

OrganizationModel NameModel String for APIContext length
DatabricksDolly v2 (12B)databricks/dolly-v2-12b2048
DatabricksDolly v2 (3B)databricks/dolly-v2-3b2048
DatabricksDolly v2 (7B)databricks/dolly-v2-7b2048
DiscoResearchDiscoLM Mixtral 8x7b (46.7B)DiscoResearch/DiscoLM-mixtral-8x7b-v232768
HuggingFaceZephyr-7B-ßHuggingFaceH4/zephyr-7b-beta32768
HuggingFaceH4StarCoderChat Alpha (16B)HuggingFaceH4/starchat-alpha8192
LAIONOpen-Assistant Pythia SFT-4 (12B)OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.52048
LAIONOpen-Assistant StableLM SFT-7 (7B)OpenAssistant/stablelm-7b-sft-v7-epoch-34096
LM SysKoala (13B)togethercomputer/Koala-13B2048
LM SysKoala (7B)togethercomputer/Koala-7B2048
LM SysVicuna v1.3 (13B)lmsys/vicuna-13b-v1.32048
LM SysVicuna v1.3 (7B)lmsys/vicuna-7b-v1.32048
LM SysVicuna-FastChat-T5 (3B)lmsys/fastchat-t5-3b-v1.0512
Mosaic MLMPT-Chat (30B)togethercomputer/mpt-30b-chat2048
Mosaic MLMPT-Chat (7B)togethercomputer/mpt-7b-chat2048
NousResearchNous Hermes LLaMA-2 (70B)NousResearch/Nous-Hermes-Llama2-70b4096
QwenQwen Chat (7B)Qwen/Qwen-7B-Chat2048
QwenQwen Chat (14B)Qwen/Qwen-14B-Chat2048
TIIFalcon Instruct (7B)tiiuae/falcon-7b-instruct2048
TIIFalcon Instruct (40B)tiiuae/falcon-40b-instruct2048
Tim DettmersGuanaco (13B)togethercomputer/guanaco-13b2048
Tim DettmersGuanaco (33B)togethercomputer/guanaco-33b2048
Tim DettmersGuanaco (65B)togethercomputer/guanaco-65b2048
Tim DettmersGuanaco (7B)togethercomputer/guanaco-7b2048
TogetherGPT-NeoXT-Chat-Base (20B)togethercomputer/GPT-NeoXT-Chat-Base-20B2048
TogetherPythia-Chat-Base (7B)togethercomputer/Pythia-Chat-Base-7B-v0.162048

Request a model

Don't see a model you want to use?

Send us a Model Request here →