Overview

We regularly update our platform with the latest and most powerful open-source models. This document outlines our model lifecycle policy, including how we handle model upgrades, redirects, and deprecations.

Model Lifecycle Policy

To ensure customers get predictable behavior while we maintain a high-quality model catalog, we follow a structured approach to introducing new models, upgrading existing models, and deprecating older versions.

Model Upgrades (Redirects)

An upgrade is a release within the same model lineage, with targeted improvements and no fundamental changes to how developers use or reason about the model. A model qualifies as an upgrade when one or more of the following are true (and none of the “New Model” criteria apply):
  • Same modality and task profile (e.g., instruct → instruct, reasoning → reasoning)
  • Same architecture family (e.g., DeepSeek-V3 → DeepSeek-V3-0324)
  • Post-training/fine-tuning improvements, bug fixes, safety tuning, or small data refresh
  • Behavior is strongly compatible (prompting patterns and evals are similar)
  • Pricing change is none or small (≤10% increase)
Outcome: The current endpoint redirects to the upgraded version after a 3-day notice. The old version remains available via Dedicated Endpoints.

New Models (No Redirect)

A new model is a release with materially different capabilities, costs, or operating characteristics—such that a silent redirect would be misleading. Any of the following triggers classification as a new model:
  • Modality shift (e.g., reasoning-only ↔ instruct/hybrid, text → multimodal)
  • Architecture shift (e.g., Qwen3 → Qwen3-Next, Llama 3 → Llama 4)
  • Large behavior shift (prompting patterns, output style/verbosity materially different)
  • Experimental flag by provider (e.g., DeepSeek-V3-Exp)
  • Large price change (>10% increase or pricing structure change)
  • Benchmark deltas that meaningfully change task positioning
  • Safety policy or system prompt changes that noticeably affect outputs
Outcome: No automatic redirect. We announce the new model and deprecate the old one on a 2-week timeline (both are available during this window). Customers must explicitly switch model IDs.
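As a rough illustration, the upgrade-vs-new-model criteria above can be expressed as a small decision helper. This is a hypothetical sketch only (the function name, fields, and structure are ours, not part of any platform API), and the real classification also weighs qualitative signals such as benchmark deltas, behavior shifts, and safety changes:

```python
from dataclasses import dataclass


@dataclass
class ReleaseDelta:
    """Hypothetical summary of how a release differs from its predecessor."""
    modality_shift: bool = False        # e.g. reasoning-only -> instruct/hybrid
    architecture_shift: bool = False    # e.g. Qwen3 -> Qwen3-Next
    large_behavior_shift: bool = False  # prompting patterns / output style differ
    experimental: bool = False          # provider flags the release experimental
    pricing_structure_change: bool = False
    price_increase_pct: float = 0.0     # 0.10 means a 10% price increase


def classify_release(delta: ReleaseDelta) -> str:
    """Return 'new_model' (no redirect) or 'upgrade' (redirect after notice)."""
    if (delta.modality_shift
            or delta.architecture_shift
            or delta.large_behavior_shift
            or delta.experimental
            or delta.pricing_structure_change
            or delta.price_increase_pct > 0.10):
        return "new_model"
    return "upgrade"
```

Under these rules, a post-training refresh with a 5% price increase would classify as an upgrade, while any architecture shift or a >10% price increase would classify as a new model.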

Active Model Redirects

The following models are currently being redirected to newer versions. Requests to the original model ID are automatically routed to the upgraded version:
| Original Model | Redirects To | Notes |
|---|---|---|
| Kimi-K2 | Kimi-K2-0905 | Same architecture, improved post-training |
| DeepSeek-V3 | DeepSeek-V3-0324 | Same architecture, targeted improvements |
| DeepSeek-R1 | DeepSeek-R1-0528 | Same architecture, targeted improvements |
If you need to use the original model version, you can always deploy it as a Dedicated Endpoint.
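If you log or audit which model version actually serves a request, the redirect table above can be mirrored locally. A minimal sketch (the mapping is copied from the table; the helper function is ours, not a platform API):

```python
# Current serverless redirects (mirrors the redirect table above).
MODEL_REDIRECTS = {
    "Kimi-K2": "Kimi-K2-0905",
    "DeepSeek-V3": "DeepSeek-V3-0324",
    "DeepSeek-R1": "DeepSeek-R1-0528",
}


def resolve_model(model_id: str) -> str:
    """Return the model ID that will actually serve a request.

    IDs without an active redirect pass through unchanged.
    """
    return MODEL_REDIRECTS.get(model_id, model_id)
```

For example, `resolve_model("DeepSeek-V3")` returns `"DeepSeek-V3-0324"`, while a model ID with no active redirect is returned as-is.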

Deprecation Policy

| Model Type | Deprecation Notice | Notes |
|---|---|---|
| Preview Model | <24 hrs of notice, after 30 days | Clearly marked in docs and playground with “Preview” tag |
| Serverless Endpoint | 2 or 3 weeks* | |
| On Demand Dedicated Endpoint | 2 or 3 weeks* | |
*Depends on usage and whether there’s an available newer version of the model.
  • Users of models scheduled for deprecation will be notified by email.
  • All changes will be reflected on this page.
  • Each deprecated model will have a specified removal date.
  • After the removal date, the model will no longer be queryable via its serverless endpoint but options to migrate will be available as described below.
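The last point reduces to a simple date check: a deprecated model stays queryable through its removal date, and not after. A hypothetical helper (the names and the inclusive-date assumption are ours):

```python
from datetime import date


def is_queryable(removal_date: date, today: date) -> bool:
    """Whether a deprecated serverless model is still queryable.

    Assumes the model remains queryable through the removal date itself,
    since the policy says it stops being queryable *after* that date.
    """
    return today <= removal_date
```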

Migration Options

When a model is deprecated on our serverless platform, users have three options:
  1. On-demand Dedicated Endpoint (if supported):
    • Reserved solely for the user; users choose the underlying hardware.
    • Charged on a per-minute basis.
    • Endpoints can be dynamically spun up and down.
  2. Monthly Reserved Dedicated Endpoint:
    • Reserved solely for the user.
    • Charged on a month-by-month basis.
    • Can be requested via this form.
  3. Migrate to a newer serverless model:
    • Switch to an updated model on the serverless platform.

Migration Steps

  1. Review the deprecation table below to find your current model.
  2. Check if on-demand dedicated endpoints are supported for your model.
  3. Decide on your preferred migration option.
  4. If choosing a new serverless model, test your application thoroughly with the new model before fully migrating.
  5. Update your API calls to use the new model or dedicated endpoint.
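Step 5 usually amounts to changing a single model identifier in each request. A hedged sketch assuming an OpenAI-style request payload with a `model` field (the mapping entries below are placeholders, not recommendations; pick real replacements from the deprecation table after testing):

```python
# Illustrative old-ID -> replacement-ID mapping. The entries are
# placeholders; choose real targets from the deprecation table and
# validate them against your workload first.
MIGRATIONS = {
    "old-org/old-model": "new-org/new-model",
}


def migrate_payload(payload: dict) -> dict:
    """Return a copy of a request payload with a deprecated model ID replaced."""
    updated = dict(payload)
    model_id = payload.get("model")
    updated["model"] = MIGRATIONS.get(model_id, model_id)
    return updated
```

Applying this once at the edge of your request path keeps the model switch in a single place instead of scattered across call sites.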

Deprecation History

All deprecations are listed below, with the most recent deprecations at the top.
| Removal Date | Model | Supported by on-demand dedicated endpoints |
|---|---|---|
| 2026-01-05 | Qwen/Qwen2.5-VL-72B-Instruct | Yes |
| 2025-12-23 | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Yes |
| 2025-12-23 | meta-llama/Meta-Llama-3-70B-Instruct-Turbo | Yes |
| 2025-12-23 | black-forest-labs/FLUX.1-schnell-free | No |
| 2025-12-23 | meta-llama/Meta-Llama-Guard-3-8B | No |
| 2025-11-19 | deepcogito/cogito-v2-preview-deepseek-671b | No |
| 2025-11-17 | arcee-ai/virtuoso-large | No |
| 2025-11-17 | arcee-ai/maestro-reasoning | No |
| 2025-11-17 | arcee_ai/arcee-spotlight | No |
| 2025-11-17 | arcee-ai/coder-large | No |
| 2025-11-13 | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Yes |
| 2025-11-13 | mistralai/Mistral-7B-Instruct-v0.1 | Yes |
| 2025-11-13 | Qwen/Qwen2.5-Coder-32B-Instruct | Yes |
| 2025-11-13 | Qwen/QwQ-32B | Yes |
| 2025-11-13 | deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free | No |
| 2025-11-13 | meta-llama/Llama-3.3-70B-Instruct-Turbo-Free | No |
| 2025-08-28 | Qwen/Qwen2-VL-72B-Instruct | Yes |
| 2025-08-28 | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | Yes |
| 2025-08-28 | perplexity-ai/r1-1776 | No (coming soon!) |
| 2025-08-28 | meta-llama/Meta-Llama-3-8B-Instruct | Yes |
| 2025-08-28 | google/gemma-2-27b-it | Yes |
| 2025-08-28 | Qwen/Qwen2-72B-Instruct | Yes |
| 2025-08-28 | meta-llama/Llama-Vision-Free | No |
| 2025-08-28 | Qwen/Qwen2.5-14B | Yes |
| 2025-08-28 | meta-llama-llama-3-3-70b-instruct-lora | No (coming soon!) |
| 2025-08-28 | meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo | No (coming soon!) |
| 2025-08-28 | NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | Yes |
| 2025-08-28 | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Yes |
| 2025-08-28 | black-forest-labs/FLUX.1-depth | No (coming soon!) |
| 2025-08-28 | black-forest-labs/FLUX.1-redux | No (coming soon!) |
| 2025-08-28 | meta-llama/Llama-3-8b-chat-hf | Yes |
| 2025-08-28 | black-forest-labs/FLUX.1-canny | No (coming soon!) |
| 2025-08-28 | meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo | No (coming soon!) |
| 2025-07-25 | arcee-ai/caller | No |
| 2025-07-25 | arcee-ai/arcee-blitz | No |
| 2025-07-25 | arcee-ai/virtuoso-medium-v2 | No |
| 2025-06-13 | gryphe-mythomax-l2-13b | No (coming soon!) |
| 2025-06-13 | mistralai-mixtral-8x22b-instruct-v0-1 | No (coming soon!) |
| 2025-06-13 | mistralai-mixtral-8x7b-v0-1 | No (coming soon!) |
| 2025-06-13 | togethercomputer-m2-bert-80m-2k-retrieval | No (coming soon!) |
| 2025-06-13 | togethercomputer-m2-bert-80m-8k-retrieval | No (coming soon!) |
| 2025-06-13 | whereisai-uae-large-v1 | No (coming soon!) |
| 2025-06-13 | google-gemma-2-9b-it | No (coming soon!) |
| 2025-06-13 | google-gemma-2b-it | No (coming soon!) |
| 2025-06-13 | gryphe-mythomax-l2-13b-lite | No (coming soon!) |
| 2025-05-16 | meta-llama-llama-3-2-3b-instruct-turbo-lora | No (coming soon!) |
| 2025-05-16 | meta-llama-meta-llama-3-8b-instruct-turbo | No (coming soon!) |
| 2025-04-24 | meta-llama/Llama-2-13b-chat-hf | No (coming soon!) |
| 2025-04-24 | meta-llama-meta-llama-3-70b-instruct-turbo | No (coming soon!) |
| 2025-04-24 | meta-llama-meta-llama-3-1-8b-instruct-turbo-lora | No (coming soon!) |
| 2025-04-24 | meta-llama-meta-llama-3-1-70b-instruct-turbo-lora | No (coming soon!) |
| 2025-04-24 | meta-llama-llama-3-2-1b-instruct-lora | No (coming soon!) |
| 2025-04-24 | microsoft-wizardlm-2-8x22b | No (coming soon!) |
| 2025-04-24 | upstage-solar-10-7b-instruct-v1 | No (coming soon!) |
| 2025-04-14 | stabilityai/stable-diffusion-xl-base-1.0 | No (coming soon!) |
| 2025-04-04 | meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo-lora | No (coming soon!) |
| 2025-03-27 | mistralai/Mistral-7B-v0.1 | No |
| 2025-03-25 | Qwen/QwQ-32B-Preview | No |
| 2025-03-13 | databricks-dbrx-instruct | No |
| 2025-03-11 | meta-llama/Meta-Llama-3-70B-Instruct-Lite | No |
| 2025-03-08 | Meta-Llama/Llama-Guard-7b | No |
| 2025-02-06 | sentence-transformers/msmarco-bert-base-dot-v5 | No |
| 2025-02-06 | bert-base-uncased | No |
| 2024-10-29 | Qwen/Qwen1.5-72B-Chat | No |
| 2024-10-29 | Qwen/Qwen1.5-110B-Chat | No |
| 2024-10-07 | NousResearch/Nous-Hermes-2-Yi-34B | No |
| 2024-10-07 | NousResearch/Hermes-3-Llama-3.1-405B-Turbo | No |
| 2024-08-22 | NousResearch/Nous-Hermes-2-Mistral-7B-DPO | Yes |
| 2024-08-22 | SG161222/Realistic_Vision_V3.0_VAE | No |
| 2024-08-22 | meta-llama/Llama-2-70b-chat-hf | No |
| 2024-08-22 | mistralai/Mixtral-8x22B | No |
| 2024-08-22 | Phind/Phind-CodeLlama-34B-v2 | No |
| 2024-08-22 | meta-llama/Meta-Llama-3-70B | Yes |
| 2024-08-22 | teknium/OpenHermes-2p5-Mistral-7B | Yes |
| 2024-08-22 | openchat/openchat-3.5-1210 | Yes |
| 2024-08-22 | WizardLM/WizardCoder-Python-34B-V1.0 | No |
| 2024-08-22 | NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT | Yes |
| 2024-08-22 | NousResearch/Nous-Hermes-Llama2-13b | Yes |
| 2024-08-22 | zero-one-ai/Yi-34B-Chat | No |
| 2024-08-22 | codellama/CodeLlama-34b-Instruct-hf | No |
| 2024-08-22 | codellama/CodeLlama-34b-Python-hf | No |
| 2024-08-22 | teknium/OpenHermes-2-Mistral-7B | Yes |
| 2024-08-22 | Qwen/Qwen1.5-14B-Chat | Yes |
| 2024-08-22 | stabilityai/stable-diffusion-2-1 | No |
| 2024-08-22 | meta-llama/Llama-3-8b-hf | Yes |
| 2024-08-22 | prompthero/openjourney | No |
| 2024-08-22 | runwayml/stable-diffusion-v1-5 | No |
| 2024-08-22 | wavymulder/Analog-Diffusion | No |
| 2024-08-22 | Snowflake/snowflake-arctic-instruct | No |
| 2024-08-22 | deepseek-ai/deepseek-coder-33b-instruct | No |
| 2024-08-22 | Qwen/Qwen1.5-7B-Chat | Yes |
| 2024-08-22 | Qwen/Qwen1.5-32B-Chat | No |
| 2024-08-22 | cognitivecomputations/dolphin-2.5-mixtral-8x7b | No |
| 2024-08-22 | garage-bAInd/Platypus2-70B-instruct | No |
| 2024-08-22 | google/gemma-7b-it | Yes |
| 2024-08-22 | meta-llama/Llama-2-7b-chat-hf | Yes |
| 2024-08-22 | Qwen/Qwen1.5-32B | No |
| 2024-08-22 | Open-Orca/Mistral-7B-OpenOrca | Yes |
| 2024-08-22 | codellama/CodeLlama-13b-Instruct-hf | Yes |
| 2024-08-22 | NousResearch/Nous-Capybara-7B-V1p9 | Yes |
| 2024-08-22 | lmsys/vicuna-13b-v1.5 | Yes |
| 2024-08-22 | Undi95/ReMM-SLERP-L2-13B | Yes |
| 2024-08-22 | Undi95/Toppy-M-7B | Yes |
| 2024-08-22 | meta-llama/Llama-2-13b-hf | No |
| 2024-08-22 | codellama/CodeLlama-70b-Instruct-hf | No |
| 2024-08-22 | snorkelai/Snorkel-Mistral-PairRM-DPO | Yes |
| 2024-08-22 | togethercomputer/LLaMA-2-7B-32K-Instruct | Yes |
| 2024-08-22 | Austism/chronos-hermes-13b | Yes |
| 2024-08-22 | Qwen/Qwen1.5-72B | No |
| 2024-08-22 | zero-one-ai/Yi-34B | No |
| 2024-08-22 | codellama/CodeLlama-7b-Instruct-hf | Yes |
| 2024-08-22 | togethercomputer/evo-1-131k-base | No |
| 2024-08-22 | codellama/CodeLlama-70b-hf | No |
| 2024-08-22 | WizardLM/WizardLM-13B-V1.2 | Yes |
| 2024-08-22 | meta-llama/Llama-2-7b-hf | No |
| 2024-08-22 | google/gemma-7b | Yes |
| 2024-08-22 | Qwen/Qwen1.5-1.8B-Chat | Yes |
| 2024-08-22 | Qwen/Qwen1.5-4B-Chat | Yes |
| 2024-08-22 | lmsys/vicuna-7b-v1.5 | Yes |
| 2024-08-22 | zero-one-ai/Yi-6B | Yes |
| 2024-08-22 | Nexusflow/NexusRaven-V2-13B | Yes |
| 2024-08-22 | google/gemma-2b | Yes |
| 2024-08-22 | Qwen/Qwen1.5-7B | Yes |
| 2024-08-22 | NousResearch/Nous-Hermes-llama-2-7b | Yes |
| 2024-08-22 | togethercomputer/alpaca-7b | Yes |
| 2024-08-22 | Qwen/Qwen1.5-14B | Yes |
| 2024-08-22 | codellama/CodeLlama-70b-Python-hf | No |
| 2024-08-22 | Qwen/Qwen1.5-4B | Yes |
| 2024-08-22 | togethercomputer/StripedHyena-Hessian-7B | No |
| 2024-08-22 | allenai/OLMo-7B-Instruct | No |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-7B-Instruct | No |
| 2024-08-22 | togethercomputer/LLaMA-2-7B-32K | Yes |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-7B-Base | No |
| 2024-08-22 | Qwen/Qwen1.5-0.5B-Chat | Yes |
| 2024-08-22 | microsoft/phi-2 | Yes |
| 2024-08-22 | Qwen/Qwen1.5-0.5B | Yes |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-7B-Chat | No |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-Chat-3B-v1 | No |
| 2024-08-22 | togethercomputer/GPT-JT-Moderation-6B | No |
| 2024-08-22 | Qwen/Qwen1.5-1.8B | Yes |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-Instruct-3B-v1 | No |
| 2024-08-22 | togethercomputer/RedPajama-INCITE-Base-3B-v1 | No |
| 2024-08-22 | WhereIsAI/UAE-Large-V1 | No |
| 2024-08-22 | allenai/OLMo-7B | No |
| 2024-08-22 | togethercomputer/evo-1-8k-base | No |
| 2024-08-22 | WizardLM/WizardCoder-15B-V1.0 | No |
| 2024-08-22 | codellama/CodeLlama-13b-Python-hf | Yes |
| 2024-08-22 | allenai-olmo-7b-twin-2t | No |
| 2024-08-22 | sentence-transformers/msmarco-bert-base-dot-v5 | No |
| 2024-08-22 | codellama/CodeLlama-7b-Python-hf | Yes |
| 2024-08-22 | hazyresearch/M2-BERT-2k-Retrieval-Encoder-V1 | No |
| 2024-08-22 | bert-base-uncased | No |
| 2024-08-22 | mistralai/Mistral-7B-Instruct-v0.1-json | No |
| 2024-08-22 | mistralai/Mistral-7B-Instruct-v0.1-tools | No |
| 2024-08-22 | togethercomputer-codellama-34b-instruct-json | No |
| 2024-08-22 | togethercomputer-codellama-34b-instruct-tools | No |
Notes on model support:
  • Models marked “Yes” in the on-demand dedicated endpoint support column can be spun up as dedicated endpoints with customizable hardware.
  • Models marked “No” are not available as on-demand endpoints and will require migration to a different model or a monthly reserved dedicated endpoint.
  • Regularly check this page for updates on model deprecations.
  • Plan your migration well in advance of the removal date to ensure a smooth transition.
  • If you have any questions or need assistance with migration, please contact our support team.
For the most up-to-date information on model availability, support, and recommended alternatives, please check our API documentation or contact our support team.