Code/Language Models

See which open-source language and code models we currently host, or learn how to configure and host your own.

Our Completions API has built-in support for many popular models we host via our serverless endpoints, as well as any model that you configure and host yourself using our dedicated GPU infrastructure.

When using one of our hosted serverless models, you'll be charged based on the number of tokens you use in your queries. For dedicated models you configure and run yourself, you'll be charged per minute for as long as your endpoint is running. You can start or stop your endpoint at any time using our online playground.

To learn more about the pricing for both our serverless and dedicated endpoints, check out our pricing page.
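
As a rough illustration of the two billing modes described above, the sketch below estimates a bill for each. The function names are hypothetical and the rates are arguments you supply yourself; no actual prices are assumed here, so take the real figures from the pricing page.

```python
def serverless_cost(total_tokens: int, price_per_million_tokens: float) -> float:
    """Estimate serverless cost: billed by the number of tokens used.

    price_per_million_tokens is a caller-supplied rate (an assumption of this
    sketch, including the per-million unit), not a published price.
    """
    if total_tokens < 0 or price_per_million_tokens < 0:
        raise ValueError("token count and rate must be non-negative")
    return total_tokens / 1_000_000 * price_per_million_tokens


def dedicated_cost(minutes_running: float, price_per_minute: float) -> float:
    """Estimate dedicated-endpoint cost: billed per minute while it runs.

    price_per_minute is a caller-supplied rate, not a published price.
    """
    if minutes_running < 0 or price_per_minute < 0:
        raise ValueError("minutes and rate must be non-negative")
    return minutes_running * price_per_minute
```

Because a dedicated endpoint is billed for every minute it runs regardless of traffic, stopping it when idle is what keeps the second figure down.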

Hosted models

Language Models

Use our Completions endpoint for Language Models.

Organization | Model Name | Model String for API | Context length
Meta | LLaMA-2 (70B) | meta-llama/Llama-2-70b-hf | 4096
mistralai | Mistral (7B) | mistralai/Mistral-7B-v0.1 | 8192
mistralai | Mixtral-8x7B (46.7B) | mistralai/Mixtral-8x7B-v0.1 | 32768
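
The model strings listed above are what you pass to the Completions endpoint. The following is a minimal sketch of assembling a request body for one of them. Only the model strings and context lengths come from the table; the helper name and the "prompt" and "max_tokens" field names are assumptions, so check the API reference for the exact request schema.

```python
import json

# Model strings and context lengths copied from the Language Models table.
CONTEXT_LENGTHS = {
    "meta-llama/Llama-2-70b-hf": 4096,
    "mistralai/Mistral-7B-v0.1": 8192,
    "mistralai/Mixtral-8x7B-v0.1": 32768,
}


def build_completion_payload(model: str, prompt: str, max_tokens: int) -> dict:
    """Build a request body for a hosted language model (hypothetical helper).

    The "prompt" and "max_tokens" keys are assumed field names.
    """
    if model not in CONTEXT_LENGTHS:
        raise ValueError(f"unknown model string: {model}")
    # Coarse sanity check only: the context length covers prompt and
    # generated tokens together, so the real limit is tighter than this.
    if not 0 < max_tokens <= CONTEXT_LENGTHS[model]:
        raise ValueError("max_tokens must be positive and fit the context length")
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


if __name__ == "__main__":
    payload = build_completion_payload(
        "mistralai/Mistral-7B-v0.1", "Once upon a time", 64
    )
    print(json.dumps(payload, indent=2))
```

Validating the model string locally catches typos before a request is sent, which matters because the strings are case-sensitive identifiers.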

Code Models

Use our Completions endpoint for Code Models.

Organization | Model Name | Model String for API | Context length

Moderation Models

Use our Completions endpoint to run a moderation model as a standalone classifier, or use it alongside any of the other models above as a filter to safeguard responses from 100+ models. To enable the filter, specify the parameter "safety_model": "MODEL_API_STRING" in your request.

Organization | Model Name | Model String for API | Context length
Meta | Llama Guard (7B) | Meta-Llama/Llama-Guard-7b | 4096
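
To show where the "safety_model" parameter sits, here is a small sketch that attaches it to an existing request body. The "safety_model" key and the Llama Guard model string come from this page; the helper name and the other fields in the example body are assumptions for illustration.

```python
def with_safety_model(
    payload: dict, safety_model: str = "Meta-Llama/Llama-Guard-7b"
) -> dict:
    """Return a copy of a request body with a moderation filter attached.

    Hypothetical helper: it only sets the "safety_model" key described above
    and leaves the caller's original dict unmodified.
    """
    guarded = dict(payload)  # shallow copy so the input is not mutated
    guarded["safety_model"] = safety_model
    return guarded


if __name__ == "__main__":
    # "prompt" and "max_tokens" are assumed field names for this sketch.
    base = {
        "model": "mistralai/Mixtral-8x7B-v0.1",
        "prompt": "Tell me a story.",
        "max_tokens": 128,
    }
    print(with_safety_model(base))
```

Returning a copy lets the same base request be sent both with and without the filter.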

Genomic Models

Use our Completions endpoint for Genomic Models.

Organization | Model Name | Model String for API | Context length

Request a model

Don't see a model you want to use?

Send us a Model Request here →