The tables below list every model available through the fine-tuning API. Context lengths are the maximum for that model in SFT and DPO modes. Batch sizes refer to packed batches for text formats. See data preparation for details on packing.
Some models can be fine-tuned but cannot be deployed as dedicated endpoints. To verify deployability before training, run client.endpoints.list_hardware(model="<BASE_MODEL>"). A 404 response means that base model can’t host a fine-tuned model on a dedicated endpoint.
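A minimal sketch of that check, wrapped in a helper. The helper name can_host_fine_tune is hypothetical, and the way a 404 surfaces is an assumption: the code looks for a status_code or http_status attribute on the raised exception, which may not match the SDK version you use.

```python
def can_host_fine_tune(client, base_model: str) -> bool:
    """Return True if list_hardware succeeds, False if it fails with a 404."""
    try:
        client.endpoints.list_hardware(model=base_model)
    except Exception as exc:
        # Assumption: the SDK error exposes the HTTP status under one of
        # these attribute names. Adjust to the exception type you observe.
        status = getattr(exc, "status_code", None) or getattr(exc, "http_status", None)
        if status == 404:
            return False
        # Any other failure (auth, network, ...) is not a deployability answer.
        raise
    return True
```

With a configured client, can_host_fine_tune(client, "<BASE_MODEL>") returning False suggests picking a different base before starting the training job.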
Fill out this form to request a model that isn’t in the list.

LoRA fine-tuning

| Organization | Model | API ID | Context (SFT) | Context (DPO) | Max batch (SFT) | Max batch (DPO) | Min batch | Grad accum | Max LoRA rank |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen | Qwen3.5 397B A17B | Qwen/Qwen3.5-397B-A17B | 32768 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen3.5 122B A10B | Qwen/Qwen3.5-122B-A10B | 65536 | 32768 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen3.5 35B A3B | Qwen/Qwen3.5-35B-A3B | 65536 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.6 35B A3B | Qwen/Qwen3.6-35B-A3B | 65536 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.5 35B A3B Base | Qwen/Qwen3.5-35B-A3B-Base | 65536 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.5 27B | Qwen/Qwen3.5-27B | 32768 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen3.5 9B | Qwen/Qwen3.5-9B | 65536 | 49152 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.5 4B | Qwen/Qwen3.5-4B | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.5 2B | Qwen/Qwen3.5-2B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3.5 0.8B | Qwen/Qwen3.5-0.8B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | 32768 | 16384 | 4 | 4 | 4 | 8 | 16 |
| Moonshot AI | Kimi K2 Thinking | moonshotai/Kimi-K2-Thinking | 32768 | 16384 | 4 | 4 | 4 | 8 | 16 |
| Moonshot AI | Kimi K2 Instruct 0905 | moonshotai/Kimi-K2-Instruct-0905 | 32768 | 16384 | 4 | 4 | 4 | 8 | 16 |
| Moonshot AI | Kimi K2 Instruct | moonshotai/Kimi-K2-Instruct | 32768 | 16384 | 4 | 4 | 4 | 8 | 16 |
| Moonshot AI | Kimi K2 Base | moonshotai/Kimi-K2-Base | 32768 | 16384 | 4 | 4 | 4 | 8 | 16 |
| Z.ai | GLM 5.1 | zai-org/GLM-5.1 | 50688 | 25344 | 1 | 1 | 1 | 1 | 16 |
| Z.ai | GLM 5 | zai-org/GLM-5 | 50688 | 25344 | 1 | 1 | 1 | 1 | 16 |
| Z.ai | GLM 4.7 | zai-org/GLM-4.7 | 128000 | 64000 | 1 | 1 | 1 | 8 | 64 |
| Z.ai | GLM 4.6 | zai-org/GLM-4.6 | 128000 | 64000 | 1 | 1 | 1 | 8 | 64 |
| OpenAI | GPT-OSS 20B | openai/gpt-oss-20b | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| OpenAI | GPT-OSS 120B | openai/gpt-oss-120b | 65536 | 32768 | 16 | 16 | 16 | 1 | 64 |
| DeepSeek | DeepSeek R1 0528 | deepseek-ai/DeepSeek-R1-0528 | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek R1 | deepseek-ai/DeepSeek-R1 | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek V3 | deepseek-ai/DeepSeek-V3 | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek V3.1 Base | deepseek-ai/DeepSeek-V3.1-Base | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek V3 Base | deepseek-ai/DeepSeek-V3-Base | 65536 | 32768 | 2 | 2 | 2 | 8 | 16 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| DeepSeek | DeepSeek R1 Distill Llama 70B 32k | deepseek-ai/DeepSeek-R1-Distill-Llama-70B-32k | 32768 | 32768 | 1 | 1 | 1 | 8 | 64 |
| DeepSeek | DeepSeek R1 Distill Llama 70B 131k | deepseek-ai/DeepSeek-R1-Distill-Llama-70B-131k | 131072 | 32768 | 1 | 1 | 1 | 8 | 64 |
| DeepSeek | DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 65536 | 32768 | 8 | 8 | 8 | 1 | 64 |
| DeepSeek | DeepSeek R1 Distill Qwen 1.5B | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 4 Scout 17B 16E | meta-llama/Llama-4-Scout-17B-16E | 65536 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 4 Scout 17B 16E Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | 65536 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 4 Scout 17B 16E Instruct VLM | meta-llama/Llama-4-Scout-17B-16E-Instruct-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 4 Maverick 17B 128E | meta-llama/Llama-4-Maverick-17B-128E | 16384 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Meta | Llama 4 Maverick 17B 128E Instruct | meta-llama/Llama-4-Maverick-17B-128E-Instruct | 16384 | 24576 | 16 | 16 | 16 | 1 | 64 |
| Meta | Llama 4 Maverick 17B 128E Instruct VLM | meta-llama/Llama-4-Maverick-17B-128E-Instruct-VLM | 16384 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Google | Gemma 3 270M | google/gemma-3-270m | 32768 | 32768 | 128 | 128 | 8 | 1 | 64 |
| Google | Gemma 3 270M IT | google/gemma-3-270m-it | 32768 | 32768 | 128 | 128 | 8 | 1 | 64 |
| Google | Gemma 3 1B IT | google/gemma-3-1b-it | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Google | Gemma 3 1B PT | google/gemma-3-1b-pt | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Google | Gemma 3 4B IT | google/gemma-3-4b-it | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 4B IT VLM | google/gemma-3-4b-it-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 4B PT | google/gemma-3-4b-pt | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 12B IT | google/gemma-3-12b-it | 65536 | 49152 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 12B IT VLM | google/gemma-3-12b-it-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 12B PT | google/gemma-3-12b-pt | 65536 | 49152 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 27B IT | google/gemma-3-27b-it | 49152 | 24576 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 27B IT VLM | google/gemma-3-27b-it-VLM | 32768 | 24576 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 3 27B PT | google/gemma-3-27b-pt | 49152 | 24576 | 8 | 8 | 8 | 1 | 64 |
| Google | Gemma 4 31B IT | google/gemma-4-31B-it | 49152 | 24576 | 4 | 4 | 4 | 2 | 64 |
| Google | Gemma 4 26B A4B IT | google/gemma-4-26B-A4B-it | 49152 | 24576 | 4 | 4 | 4 | 2 | 64 |
| Qwen | Qwen3 Next 80B A3B Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | 16384 | 24576 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen3 Next 80B A3B Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 16384 | 24576 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen3 0.6B | Qwen/Qwen3-0.6B | 40960 | 40960 | 64 | 64 | 8 | 1 | 64 |
| Qwen | Qwen3 0.6B Base | Qwen/Qwen3-0.6B-Base | 32768 | 32768 | 64 | 64 | 8 | 1 | 64 |
| Qwen | Qwen3 1.7B | Qwen/Qwen3-1.7B | 40960 | 40960 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen3 1.7B Base | Qwen/Qwen3-1.7B-Base | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen3 4B | Qwen/Qwen3-4B | 40960 | 40960 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 4B Base | Qwen/Qwen3-4B-Base | 32768 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 8B | Qwen/Qwen3-8B | 40960 | 40960 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 8B Base | Qwen/Qwen3-8B-Base | 32768 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 14B | Qwen/Qwen3-14B | 40960 | 40960 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 14B Base | Qwen/Qwen3-14B-Base | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 32B | Qwen/Qwen3-32B | 40960 | 24576 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 30B A3B Base | Qwen/Qwen3-30B-A3B-Base | 49152 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 30B A3B | Qwen/Qwen3-30B-A3B | 49152 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 30B A3B Instruct 2507 | Qwen/Qwen3-30B-A3B-Instruct-2507 | 49152 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen3 235B A22B | Qwen/Qwen3-235B-A22B | 40960 | 32768 | 8 | 8 | 8 | 2 | 64 |
| Qwen | Qwen3 235B A22B Instruct 2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | 49152 | 32768 | 8 | 8 | 8 | 2 | 64 |
| Qwen | Qwen3 Coder 30B A3B Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 262144 | 262144 | 2 | 2 | 2 | 4 | 64 |
| Qwen | Qwen3 Coder 480B A35B Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | 262144 | 65536 | 2 | 2 | 2 | 8 | 64 |
| Qwen | Qwen3 VL 8B Instruct | Qwen/Qwen3-VL-8B-Instruct | 24576 | 16384 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 VL 32B Instruct | Qwen/Qwen3-VL-32B-Instruct | 16384 | 16384 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | 12288 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen3 VL 235B A22B Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | 12288 | 12288 | 16 | 16 | 16 | 1 | 64 |
| NVIDIA | NVIDIA Nemotron Nano 9B v2 | nvidia/NVIDIA-Nemotron-Nano-9B-v2 | 32768 | 16384 | 8 | 8 | 8 | 1 | 64 |
| NVIDIA | NVIDIA Nemotron 3 Super 120B A12B BF16 | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | 49152 | 24576 | 4 | 4 | 4 | 2 | 64 |
| Meta | Llama 3.3 70B Instruct Reference | meta-llama/Llama-3.3-70B-Instruct-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 3.3 70B 32k Instruct Reference | meta-llama/Llama-3.3-70B-32k-Instruct-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 | 64 |
| Meta | Llama 3.3 70B 131k Instruct Reference | meta-llama/Llama-3.3-70B-131k-Instruct-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 | 64 |
| Meta | Llama 3.2 3B Instruct | meta-llama/Llama-3.2-3B-Instruct | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 3.2 3B | meta-llama/Llama-3.2-3B | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 3.2 1B Instruct | meta-llama/Llama-3.2-1B-Instruct | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Meta | Llama 3.2 1B | meta-llama/Llama-3.2-1B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Meta | Meta Llama 3.1 8B Instruct Reference | meta-llama/Meta-Llama-3.1-8B-Instruct-Reference | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Meta | Meta Llama 3.1 8B 131k Instruct Reference | meta-llama/Meta-Llama-3.1-8B-131k-Instruct-Reference | 131072 | 131072 | 4 | 4 | 1 | 1 | 64 |
| Meta | Meta Llama 3.1 8B Reference | meta-llama/Meta-Llama-3.1-8B-Reference | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Meta | Meta Llama 3.1 8B 131k Reference | meta-llama/Meta-Llama-3.1-8B-131k-Reference | 131072 | 131072 | 4 | 4 | 1 | 1 | 64 |
| Meta | Meta Llama 3.1 70B Instruct Reference | meta-llama/Meta-Llama-3.1-70B-Instruct-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Meta | Meta Llama 3.1 70B 32k Instruct Reference | meta-llama/Meta-Llama-3.1-70B-32k-Instruct-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 | 64 |
| Meta | Meta Llama 3.1 70B 131k Instruct Reference | meta-llama/Meta-Llama-3.1-70B-131k-Instruct-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 | 64 |
| Meta | Meta Llama 3.1 70B Reference | meta-llama/Meta-Llama-3.1-70B-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Meta | Meta Llama 3.1 70B 32k Reference | meta-llama/Meta-Llama-3.1-70B-32k-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 | 64 |
| Meta | Meta Llama 3.1 70B 131k Reference | meta-llama/Meta-Llama-3.1-70B-131k-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 | 64 |
| Meta | Meta Llama 3 8B Instruct | meta-llama/Meta-Llama-3-8B-Instruct | 8192 | 8192 | 64 | 64 | 8 | 1 | 64 |
| Meta | Meta Llama 3 8B | meta-llama/Meta-Llama-3-8B | 8192 | 8192 | 64 | 64 | 8 | 1 | 64 |
| Meta | Meta Llama 3 70B Instruct | meta-llama/Meta-Llama-3-70B-Instruct | 8192 | 8192 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 72B Instruct | Qwen/Qwen2.5-72B-Instruct | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 72B | Qwen/Qwen2.5-72B | 24576 | 12288 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 32B Instruct | Qwen/Qwen2.5-32B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 32B | Qwen/Qwen2.5-32B | 49152 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 14B Instruct | Qwen/Qwen2.5-14B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 14B | Qwen/Qwen2.5-14B | 65536 | 49152 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 7B Instruct | Qwen/Qwen2.5-7B-Instruct | 32768 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Qwen | Qwen2.5 7B | Qwen/Qwen2.5-7B | 131072 | 65536 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2.5 3B Instruct | Qwen/Qwen2.5-3B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen2.5 3B | Qwen/Qwen2.5-3B | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen2.5 1.5B Instruct | Qwen/Qwen2.5-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen2.5 1.5B | Qwen/Qwen2.5-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2 72B Instruct | Qwen/Qwen2-72B-Instruct | 32768 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen2 72B | Qwen/Qwen2-72B | 32768 | 16384 | 16 | 16 | 16 | 1 | 64 |
| Qwen | Qwen2 7B Instruct | Qwen/Qwen2-7B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2 7B | Qwen/Qwen2-7B | 131072 | 24576 | 8 | 8 | 8 | 1 | 64 |
| Qwen | Qwen2 1.5B Instruct | Qwen/Qwen2-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 | 64 |
| Qwen | Qwen2 1.5B | Qwen/Qwen2-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 | 64 |
| Mistral | Mixtral 8x7B Instruct v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Mistral | Mixtral 8x7B v0.1 | mistralai/Mixtral-8x7B-v0.1 | 32768 | 32768 | 8 | 8 | 8 | 1 | 64 |
| Mistral | Mistral 7B Instruct v0.2 | mistralai/Mistral-7B-Instruct-v0.2 | 32768 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Mistral | Mistral 7B v0.1 | mistralai/Mistral-7B-v0.1 | 32768 | 32768 | 16 | 16 | 8 | 1 | 64 |
| Together | Llama 2 7B Chat | togethercomputer/llama-2-7b-chat | 4096 | 4096 | 128 | 128 | 8 | 1 | 64 |
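The limits in the LoRA table can be checked client-side before a job is submitted. The following is a rough sketch, not part of the API: LoraLimits, LORA_LIMITS, and check_lora_job are hypothetical names, and only two rows of the table are copied in for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraLimits:
    context_sft: int
    context_dpo: int
    max_batch_sft: int
    max_batch_dpo: int
    min_batch: int
    max_lora_rank: int


# Two rows copied from the LoRA table; extend with the models you use.
LORA_LIMITS = {
    "Qwen/Qwen3-8B": LoraLimits(40960, 40960, 8, 8, 8, 64),
    "moonshotai/Kimi-K2.5": LoraLimits(32768, 16384, 4, 4, 4, 16),
}


def check_lora_job(model, method, batch_size, lora_rank, max_seq_len):
    """Return a list of problems; an empty list means the job fits the limits."""
    limits = LORA_LIMITS[model]
    if method == "sft":
        max_ctx, max_batch = limits.context_sft, limits.max_batch_sft
    elif method == "dpo":
        max_ctx, max_batch = limits.context_dpo, limits.max_batch_dpo
    else:
        raise ValueError(f"unknown method: {method!r}")

    problems = []
    if max_seq_len > max_ctx:
        problems.append(f"sequence length {max_seq_len} exceeds context {max_ctx}")
    if not limits.min_batch <= batch_size <= max_batch:
        problems.append(
            f"batch size {batch_size} outside [{limits.min_batch}, {max_batch}]"
        )
    if lora_rank > limits.max_lora_rank:
        problems.append(f"LoRA rank {lora_rank} exceeds {limits.max_lora_rank}")
    return problems
```

For example, check_lora_job("moonshotai/Kimi-K2.5", "dpo", 4, 64, 16384) reports only the rank, because that model's maximum LoRA rank is 16.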

Full fine-tuning

| Organization | Model | API ID | Context (SFT) | Context (DPO) | Max batch (SFT) | Max batch (DPO) | Min batch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen | Qwen3.5 27B | Qwen/Qwen3.5-27B | 32768 | 16384 | 16 | 16 | 16 |
| Qwen | Qwen3.5 9B | Qwen/Qwen3.5-9B | 65536 | 49152 | 8 | 8 | 8 |
| Qwen | Qwen3.5 4B | Qwen/Qwen3.5-4B | 131072 | 65536 | 8 | 8 | 8 |
| Qwen | Qwen3.5 2B | Qwen/Qwen3.5-2B | 131072 | 131072 | 8 | 8 | 8 |
| Qwen | Qwen3.5 0.8B | Qwen/Qwen3.5-0.8B | 131072 | 131072 | 8 | 8 | 8 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 24576 | 12288 | 32 | 32 | 32 |
| DeepSeek | DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 65536 | 32768 | 8 | 8 | 8 |
| DeepSeek | DeepSeek R1 Distill Qwen 1.5B | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131072 | 131072 | 8 | 8 | 8 |
| Google | Gemma 3 270M | google/gemma-3-270m | 32768 | 32768 | 128 | 128 | 8 |
| Google | Gemma 3 270M IT | google/gemma-3-270m-it | 32768 | 32768 | 128 | 128 | 8 |
| Google | Gemma 3 1B IT | google/gemma-3-1b-it | 32768 | 32768 | 64 | 64 | 8 |
| Google | Gemma 3 1B PT | google/gemma-3-1b-pt | 32768 | 32768 | 64 | 64 | 8 |
| Google | Gemma 3 4B IT | google/gemma-3-4b-it | 131072 | 65536 | 8 | 8 | 8 |
| Google | Gemma 3 4B IT VLM | google/gemma-3-4b-it-VLM | 32768 | 32768 | 8 | 8 | 8 |
| Google | Gemma 3 4B PT | google/gemma-3-4b-pt | 131072 | 65536 | 8 | 8 | 8 |
| Google | Gemma 3 12B IT | google/gemma-3-12b-it | 65536 | 49152 | 8 | 8 | 8 |
| Google | Gemma 3 12B IT VLM | google/gemma-3-12b-it-VLM | 32768 | 32768 | 8 | 8 | 8 |
| Google | Gemma 3 12B PT | google/gemma-3-12b-pt | 65536 | 49152 | 8 | 8 | 8 |
| Google | Gemma 3 27B IT | google/gemma-3-27b-it | 49152 | 24576 | 16 | 16 | 16 |
| Google | Gemma 3 27B IT VLM | google/gemma-3-27b-it-VLM | 32768 | 24576 | 16 | 16 | 16 |
| Google | Gemma 3 27B PT | google/gemma-3-27b-pt | 49152 | 24576 | 16 | 16 | 16 |
| Google | Gemma 4 31B IT | google/gemma-4-31B-it | 49152 | 24576 | 8 | 8 | 8 |
| Google | Gemma 4 26B A4B IT | google/gemma-4-26B-A4B-it | 49152 | 24576 | 8 | 8 | 8 |
| Qwen | Qwen3 0.6B | Qwen/Qwen3-0.6B | 40960 | 40960 | 64 | 64 | 8 |
| Qwen | Qwen3 0.6B Base | Qwen/Qwen3-0.6B-Base | 32768 | 32768 | 64 | 64 | 8 |
| Qwen | Qwen3 1.7B | Qwen/Qwen3-1.7B | 40960 | 40960 | 32 | 32 | 8 |
| Qwen | Qwen3 1.7B Base | Qwen/Qwen3-1.7B-Base | 32768 | 32768 | 32 | 32 | 8 |
| Qwen | Qwen3 4B | Qwen/Qwen3-4B | 40960 | 40960 | 16 | 16 | 8 |
| Qwen | Qwen3 4B Base | Qwen/Qwen3-4B-Base | 32768 | 32768 | 16 | 16 | 8 |
| Qwen | Qwen3 8B | Qwen/Qwen3-8B | 40960 | 40960 | 8 | 8 | 8 |
| Qwen | Qwen3 8B Base | Qwen/Qwen3-8B-Base | 32768 | 32768 | 16 | 16 | 8 |
| Qwen | Qwen3 14B | Qwen/Qwen3-14B | 40960 | 40960 | 8 | 8 | 8 |
| Qwen | Qwen3 14B Base | Qwen/Qwen3-14B-Base | 32768 | 32768 | 8 | 8 | 8 |
| Qwen | Qwen3 32B | Qwen/Qwen3-32B | 40960 | 24576 | 16 | 16 | 16 |
| Qwen | Qwen3 VL 8B Instruct | Qwen/Qwen3-VL-8B-Instruct | 24576 | 16384 | 8 | 8 | 8 |
| Qwen | Qwen3 VL 32B Instruct | Qwen/Qwen3-VL-32B-Instruct | 16384 | 16384 | 16 | 16 | 16 |
| Qwen | Qwen3 VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | 12288 | 12288 | 8 | 8 | 8 |
| NVIDIA | NVIDIA Nemotron Nano 9B v2 | nvidia/NVIDIA-Nemotron-Nano-9B-v2 | 32768 | 16384 | 8 | 8 | 8 |
| Meta | Llama 3.3 70B Instruct Reference | meta-llama/Llama-3.3-70B-Instruct-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Llama 3.2 3B Instruct | meta-llama/Llama-3.2-3B-Instruct | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Llama 3.2 3B | meta-llama/Llama-3.2-3B | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Llama 3.2 1B Instruct | meta-llama/Llama-3.2-1B-Instruct | 131072 | 131072 | 8 | 8 | 8 |
| Meta | Llama 3.2 1B | meta-llama/Llama-3.2-1B | 131072 | 131072 | 8 | 8 | 8 |
| Meta | Meta Llama 3.1 8B Instruct Reference | meta-llama/Meta-Llama-3.1-8B-Instruct-Reference | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Meta Llama 3.1 8B Reference | meta-llama/Meta-Llama-3.1-8B-Reference | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Meta Llama 3.1 70B Instruct Reference | meta-llama/Meta-Llama-3.1-70B-Instruct-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Meta Llama 3.1 70B Reference | meta-llama/Meta-Llama-3.1-70B-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Meta Llama 3 8B Instruct | meta-llama/Meta-Llama-3-8B-Instruct | 8192 | 8192 | 64 | 64 | 8 |
| Meta | Meta Llama 3 8B | meta-llama/Meta-Llama-3-8B | 8192 | 8192 | 64 | 64 | 8 |
| Meta | Meta Llama 3 70B Instruct | meta-llama/Meta-Llama-3-70B-Instruct | 8192 | 8192 | 32 | 32 | 32 |
| Qwen | Qwen2 7B Instruct | Qwen/Qwen2-7B-Instruct | 32768 | 32768 | 8 | 8 | 8 |
| Qwen | Qwen2 7B | Qwen/Qwen2-7B | 131072 | 24576 | 8 | 8 | 8 |
| Qwen | Qwen2 1.5B Instruct | Qwen/Qwen2-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 |
| Qwen | Qwen2 1.5B | Qwen/Qwen2-1.5B | 131072 | 131072 | 8 | 8 | 8 |
| Mistral | Mixtral 8x7B Instruct v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | 32768 | 32768 | 16 | 16 | 16 |
| Mistral | Mixtral 8x7B v0.1 | mistralai/Mixtral-8x7B-v0.1 | 32768 | 32768 | 16 | 16 | 16 |
| Mistral | Mistral 7B Instruct v0.2 | mistralai/Mistral-7B-Instruct-v0.2 | 32768 | 32768 | 16 | 16 | 8 |
| Mistral | Mistral 7B v0.1 | mistralai/Mistral-7B-v0.1 | 32768 | 32768 | 16 | 16 | 8 |
| Together | Llama 2 7B Chat | togethercomputer/llama-2-7b-chat | 4096 | 4096 | 128 | 128 | 8 |

Vision-language models

The following models support vision-language fine-tuning on image and text data. See vision fine-tuning for the dataset schema and the train_vision parameter.
| Model | Full fine-tuning | LoRA fine-tuning |
| --- | --- | --- |
| Qwen/Qwen3-VL-8B-Instruct | | |
| Qwen/Qwen3-VL-30B-A3B-Instruct | | |
| Qwen/Qwen3-VL-235B-A22B-Instruct | | |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-VLM | | |
| meta-llama/Llama-4-Scout-17B-16E-Instruct-VLM | | |
| google/gemma-3-4b-it-VLM | | |
| google/gemma-3-12b-it-VLM | | |
| google/gemma-3-27b-it-VLM | | |

LoRA target modules

See LoRA vs. full fine-tuning for the default target modules per model. Pass lora_trainable_modules="all-linear" to train every linear layer.
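As an illustration, the sketch below assembles keyword arguments for a LoRA job that trains every linear layer. The helper lora_job_kwargs and the file ID are hypothetical. Apart from lora_trainable_modules, the parameter names (model, training_file, lora) and the final call to client.fine_tuning.create are assumptions to check against the SDK version you use.

```python
def lora_job_kwargs(model, training_file, train_all_linear=True):
    """Build keyword arguments for a LoRA fine-tuning job (illustrative only)."""
    kwargs = {
        "model": model,                  # API ID from the LoRA table
        "training_file": training_file,  # ID of an uploaded dataset (assumed name)
        "lora": True,                    # assumed flag selecting LoRA
    }
    if train_all_linear:
        # Train every linear layer instead of the per-model default modules.
        kwargs["lora_trainable_modules"] = "all-linear"
    return kwargs


job_kwargs = lora_job_kwargs("Qwen/Qwen3-8B", "file-hypothetical-id")
# With a configured client, this would be passed on as:
#   client.fine_tuning.create(**job_kwargs)
```

Leaving lora_trainable_modules unset keeps the default target modules for the chosen model.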