The following models are available to use with our fine-tuning API. Get started with fine-tuning a model!

Note: This list is different from the models that support Serverless LoRA inference, which allows you to perform LoRA fine-tuning and run inference immediately. See the LoRA inference page for the list of supported base models for serverless LoRA.

Note: The batch sizes listed below refer to packed batch sizes for text formats. For more details on packing behavior and data formats, see the Data Preparation page.

Important: When uploading LoRA adapters for serverless inference, you must use base models from the serverless LoRA list, not the fine-tuning models list. Using an incompatible base model (such as a Turbo variant) will result in a "No lora_model specified" error during upload. For example, use Qwen/Qwen3-235B-A22B-Instruct-2507-tput instead of Qwen/Qwen3-235B-A22B-Instruct-2507 for serverless LoRA adapters.

Request a model
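The base-model distinction in the note above can be made explicit in code. Below is a hypothetical helper (the function name and mapping are illustrative, not part of any client library) that resolves a fine-tuning model string to the matching serverless LoRA base model string before an adapter upload; the one mapping shown is the example from the note, and a real mapping would need to cover the full serverless LoRA list.

```python
# Illustrative mapping: fine-tuning model string -> serverless LoRA base
# model string. Only the documented example pair is included here; consult
# the LoRA inference page for the complete list of supported base models.
SERVERLESS_LORA_BASE = {
    "Qwen/Qwen3-235B-A22B-Instruct-2507":
        "Qwen/Qwen3-235B-A22B-Instruct-2507-tput",
}


def serverless_base_for(fine_tuning_model: str) -> str:
    """Return the base model string to use when uploading a LoRA adapter
    for serverless inference. Raises ValueError when no serverless-
    compatible base is known, mirroring the "No lora_model specified"
    failure mode described above."""
    try:
        return SERVERLESS_LORA_BASE[fine_tuning_model]
    except KeyError:
        raise ValueError(
            f"{fine_tuning_model!r} has no known serverless LoRA base; "
            "check the LoRA inference page for supported models"
        )


print(serverless_base_for("Qwen/Qwen3-235B-A22B-Instruct-2507"))
```

Resolving the base model string up front, rather than at upload time, surfaces the incompatibility as an immediate error instead of a failed upload.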

LoRA Fine-tuning

| Organization | Model Name | Model String for API | Context Length (SFT) | Context Length (DPO) | Max Batch Size (SFT) | Max Batch Size (DPO) | Min Batch Size | Gradient Accumulation Steps |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen | Qwen3.5-397B-A17B | Qwen/Qwen3.5-397B-A17B | 32768 | 16384 | 16 | 16 | 16 | 1 |
| Qwen | Qwen3.5-122B-A10B | Qwen/Qwen3.5-122B-A10B | 65536 | 32768 | 16 | 16 | 16 | 1 |
| Qwen | Qwen3.5-35B-A3B | Qwen/Qwen3.5-35B-A3B | 65536 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3.5-35B-A3B-Base | Qwen/Qwen3.5-35B-A3B-Base | 65536 | 32768 | 8 | 8 | 8 | 1 |
| Moonshot AI | Kimi-K2.5 | moonshotai/Kimi-K2.5 | 32768 | 16384 | 4 | 4 | 4 | 8 |
| Moonshot AI | Kimi-K2-Thinking | moonshotai/Kimi-K2-Thinking | 32768 | 16384 | 4 | 4 | 4 | 8 |
| Moonshot AI | Kimi-K2-Instruct-0905 | moonshotai/Kimi-K2-Instruct-0905 | 32768 | 16384 | 4 | 4 | 4 | 8 |
| Moonshot AI | Kimi-K2-Instruct | moonshotai/Kimi-K2-Instruct | 32768 | 16384 | 4 | 4 | 4 | 8 |
| Moonshot AI | Kimi-K2-Base | moonshotai/Kimi-K2-Base | 32768 | 16384 | 4 | 4 | 4 | 8 |
| Z.ai | GLM-4.7 | zai-org/GLM-4.7 | 128000 | 64000 | 1 | 1 | 1 | 8 |
| Z.ai | GLM-4.6 | zai-org/GLM-4.6 | 128000 | 64000 | 1 | 1 | 1 | 8 |
| OpenAI | gpt-oss-20b | openai/gpt-oss-20b | 24576 | 24576 | 8 | 8 | 8 | 1 |
| OpenAI | gpt-oss-120b | openai/gpt-oss-120b | 16384 | 16384 | 16 | 16 | 16 | 1 |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-R1 | deepseek-ai/DeepSeek-R1 | 131072 | 49152 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3.1 | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3-0324 | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-V3 | deepseek-ai/DeepSeek-V3 | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-V3.1-Base | deepseek-ai/DeepSeek-V3.1-Base | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-V3-Base | deepseek-ai/DeepSeek-V3-Base | 131072 | 32768 | 2 | 2 | 2 | 8 |
| DeepSeek | DeepSeek-R1-Distill-Llama-70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 24576 | 12288 | 8 | 8 | 8 | 1 |
| DeepSeek | DeepSeek-R1-Distill-Llama-70B-32k | deepseek-ai/DeepSeek-R1-Distill-Llama-70B-32k | 32768 | 32768 | 1 | 1 | 1 | 8 |
| DeepSeek | DeepSeek-R1-Distill-Llama-70B-131k | deepseek-ai/DeepSeek-R1-Distill-Llama-70B-131k | 131072 | 32768 | 1 | 1 | 1 | 8 |
| DeepSeek | DeepSeek-R1-Distill-Qwen-14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 65536 | 32768 | 8 | 8 | 8 | 1 |
| DeepSeek | DeepSeek-R1-Distill-Qwen-1.5B | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 |
| Meta | Llama-4-Scout-17B-16E | meta-llama/Llama-4-Scout-17B-16E | 65536 | 12288 | 8 | 8 | 8 | 1 |
| Meta | Llama-4-Scout-17B-16E-Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct | 65536 | 12288 | 8 | 8 | 8 | 1 |
| Meta | Llama-4-Scout-17B-16E-Instruct-VLM | meta-llama/Llama-4-Scout-17B-16E-Instruct-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Meta | Llama-4-Maverick-17B-128E | meta-llama/Llama-4-Maverick-17B-128E | 16384 | 16384 | 16 | 16 | 16 | 1 |
| Meta | Llama-4-Maverick-17B-128E-Instruct | meta-llama/Llama-4-Maverick-17B-128E-Instruct | 16384 | 24576 | 16 | 16 | 16 | 1 |
| Meta | Llama-4-Maverick-17B-128E-Instruct-VLM | meta-llama/Llama-4-Maverick-17B-128E-Instruct-VLM | 16384 | 16384 | 16 | 16 | 16 | 1 |
| Google | gemma-3-270m | google/gemma-3-270m | 32768 | 32768 | 128 | 128 | 8 | 1 |
| Google | gemma-3-270m-it | google/gemma-3-270m-it | 32768 | 32768 | 128 | 128 | 8 | 1 |
| Google | gemma-3-1b-it | google/gemma-3-1b-it | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Google | gemma-3-1b-pt | google/gemma-3-1b-pt | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Google | gemma-3-4b-it | google/gemma-3-4b-it | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Google | gemma-3-4b-it-VLM | google/gemma-3-4b-it-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Google | gemma-3-4b-pt | google/gemma-3-4b-pt | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Google | gemma-3-12b-it | google/gemma-3-12b-it | 65536 | 49152 | 8 | 8 | 8 | 1 |
| Google | gemma-3-12b-it-VLM | google/gemma-3-12b-it-VLM | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Google | gemma-3-12b-pt | google/gemma-3-12b-pt | 65536 | 49152 | 8 | 8 | 8 | 1 |
| Google | gemma-3-27b-it | google/gemma-3-27b-it | 49152 | 24576 | 8 | 8 | 8 | 1 |
| Google | gemma-3-27b-it-VLM | google/gemma-3-27b-it-VLM | 32768 | 24576 | 8 | 8 | 8 | 1 |
| Google | gemma-3-27b-pt | google/gemma-3-27b-pt | 49152 | 24576 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-Next-80B-A3B-Instruct | Qwen/Qwen3-Next-80B-A3B-Instruct | 16384 | 24576 | 16 | 16 | 16 | 1 |
| Qwen | Qwen3-Next-80B-A3B-Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 16384 | 24576 | 16 | 16 | 16 | 1 |
| Qwen | Qwen3-0.6B | Qwen/Qwen3-0.6B | 40960 | 40960 | 64 | 64 | 8 | 1 |
| Qwen | Qwen3-0.6B-Base | Qwen/Qwen3-0.6B-Base | 32768 | 32768 | 64 | 64 | 8 | 1 |
| Qwen | Qwen3-1.7B | Qwen/Qwen3-1.7B | 40960 | 40960 | 32 | 32 | 8 | 1 |
| Qwen | Qwen3-1.7B-Base | Qwen/Qwen3-1.7B-Base | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Qwen | Qwen3-4B | Qwen/Qwen3-4B | 40960 | 40960 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-4B-Base | Qwen/Qwen3-4B-Base | 32768 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-8B | Qwen/Qwen3-8B | 40960 | 40960 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-8B-Base | Qwen/Qwen3-8B-Base | 32768 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-14B | Qwen/Qwen3-14B | 40960 | 40960 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-14B-Base | Qwen/Qwen3-14B-Base | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-32B | Qwen/Qwen3-32B | 40960 | 24576 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-30B-A3B-Base | Qwen/Qwen3-30B-A3B-Base | 8192 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B | 8192 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-30B-A3B-Instruct-2507 | Qwen/Qwen3-30B-A3B-Instruct-2507 | 8192 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen3-235B-A22B | Qwen/Qwen3-235B-A22B | 40960 | 32768 | 8 | 8 | 8 | 2 |
| Qwen | Qwen3-235B-A22B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct-2507 | 49152 | 32768 | 8 | 8 | 8 | 2 |
| Qwen | Qwen3-Coder-30B-A3B-Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 262144 | 262144 | 2 | 2 | 2 | 4 |
| Qwen | Qwen3-Coder-480B-A35B-Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | 262144 | 65536 | 2 | 2 | 2 | 8 |
| Qwen | Qwen3-VL-8B-Instruct | Qwen/Qwen3-VL-8B-Instruct | 24576 | 16384 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-VL-32B-Instruct | Qwen/Qwen3-VL-32B-Instruct | 16384 | 16384 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-VL-30B-A3B-Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | 16384 | 16384 | 8 | 8 | 8 | 1 |
| Qwen | Qwen3-VL-235B-A22B-Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | 16384 | 12288 | 16 | 16 | 16 | 1 |
| Meta | Llama-3.3-70B-Instruct-Reference | meta-llama/Llama-3.3-70B-Instruct-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 |
| Meta | Llama-3.3-70B-32k-Instruct-Reference | meta-llama/Llama-3.3-70B-32k-Instruct-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 |
| Meta | Llama-3.3-70B-131k-Instruct-Reference | meta-llama/Llama-3.3-70B-131k-Instruct-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 |
| Meta | Llama-3.2-3B-Instruct | meta-llama/Llama-3.2-3B-Instruct | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Meta | Llama-3.2-3B | meta-llama/Llama-3.2-3B | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Meta | Llama-3.2-1B-Instruct | meta-llama/Llama-3.2-1B-Instruct | 131072 | 131072 | 8 | 8 | 8 | 1 |
| Meta | Llama-3.2-1B | meta-llama/Llama-3.2-1B | 131072 | 131072 | 8 | 8 | 8 | 1 |
| Meta | Meta-Llama-3.1-8B-Instruct-Reference | meta-llama/Meta-Llama-3.1-8B-Instruct-Reference | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Meta | Meta-Llama-3.1-8B-131k-Instruct-Reference | meta-llama/Meta-Llama-3.1-8B-131k-Instruct-Reference | 131072 | 131072 | 4 | 4 | 1 | 1 |
| Meta | Meta-Llama-3.1-8B-Reference | meta-llama/Meta-Llama-3.1-8B-Reference | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Meta | Meta-Llama-3.1-8B-131k-Reference | meta-llama/Meta-Llama-3.1-8B-131k-Reference | 131072 | 131072 | 4 | 4 | 1 | 1 |
| Meta | Meta-Llama-3.1-70B-Instruct-Reference | meta-llama/Meta-Llama-3.1-70B-Instruct-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 |
| Meta | Meta-Llama-3.1-70B-32k-Instruct-Reference | meta-llama/Meta-Llama-3.1-70B-32k-Instruct-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 |
| Meta | Meta-Llama-3.1-70B-131k-Instruct-Reference | meta-llama/Meta-Llama-3.1-70B-131k-Instruct-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 |
| Meta | Meta-Llama-3.1-70B-Reference | meta-llama/Meta-Llama-3.1-70B-Reference | 24576 | 12288 | 8 | 8 | 8 | 1 |
| Meta | Meta-Llama-3.1-70B-32k-Reference | meta-llama/Meta-Llama-3.1-70B-32k-Reference | 32768 | 32768 | 1 | 1 | 1 | 8 |
| Meta | Meta-Llama-3.1-70B-131k-Reference | meta-llama/Meta-Llama-3.1-70B-131k-Reference | 131072 | 65536 | 1 | 1 | 1 | 8 |
| Meta | Meta-Llama-3-8B-Instruct | meta-llama/Meta-Llama-3-8B-Instruct | 8192 | 8192 | 64 | 64 | 8 | 1 |
| Meta | Meta-Llama-3-8B | meta-llama/Meta-Llama-3-8B | 8192 | 8192 | 64 | 64 | 8 | 1 |
| Meta | Meta-Llama-3-70B-Instruct | meta-llama/Meta-Llama-3-70B-Instruct | 8192 | 8192 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-72B-Instruct | Qwen/Qwen2.5-72B-Instruct | 24576 | 12288 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-72B | Qwen/Qwen2.5-72B | 24576 | 12288 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-32B-Instruct | Qwen/Qwen2.5-32B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-32B | Qwen/Qwen2.5-32B | 49152 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-14B-Instruct | Qwen/Qwen2.5-14B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-14B | Qwen/Qwen2.5-14B | 65536 | 49152 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-7B-Instruct | Qwen/Qwen2.5-7B-Instruct | 32768 | 32768 | 16 | 16 | 8 | 1 |
| Qwen | Qwen2.5-7B | Qwen/Qwen2.5-7B | 131072 | 65536 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2.5-3B-Instruct | Qwen/Qwen2.5-3B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Qwen | Qwen2.5-3B | Qwen/Qwen2.5-3B | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Qwen | Qwen2.5-1.5B-Instruct | Qwen/Qwen2.5-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Qwen | Qwen2.5-1.5B | Qwen/Qwen2.5-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2-72B-Instruct | Qwen/Qwen2-72B-Instruct | 32768 | 16384 | 16 | 16 | 16 | 1 |
| Qwen | Qwen2-72B | Qwen/Qwen2-72B | 32768 | 16384 | 16 | 16 | 16 | 1 |
| Qwen | Qwen2-7B-Instruct | Qwen/Qwen2-7B-Instruct | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2-7B | Qwen/Qwen2-7B | 131072 | 24576 | 8 | 8 | 8 | 1 |
| Qwen | Qwen2-1.5B-Instruct | Qwen/Qwen2-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 | 1 |
| Qwen | Qwen2-1.5B | Qwen/Qwen2-1.5B | 131072 | 131072 | 8 | 8 | 8 | 1 |
| Mistral | Mixtral-8x7B-Instruct-v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Mistral | Mixtral-8x7B-v0.1 | mistralai/Mixtral-8x7B-v0.1 | 32768 | 32768 | 8 | 8 | 8 | 1 |
| Mistral | Mistral-7B-Instruct-v0.2 | mistralai/Mistral-7B-Instruct-v0.2 | 32768 | 32768 | 16 | 16 | 8 | 1 |
| Mistral | Mistral-7B-v0.1 | mistralai/Mistral-7B-v0.1 | 32768 | 32768 | 16 | 16 | 8 | 1 |
| Together | llama-2-7b-chat | togethercomputer/llama-2-7b-chat | 4096 | 4096 | 128 | 128 | 8 | 1 |
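The batch-size columns above can be read as a per-model constraint: the requested (packed) batch size must fall between Min Batch Size and the Max Batch Size for the chosen method. A minimal sketch, assuming the usual definition of effective batch size as batch size times gradient accumulation steps (the helper names are illustrative, not part of any client library):

```python
def check_batch_size(batch_size: int, min_bs: int, max_bs: int) -> int:
    """Validate a requested packed batch size against a model's table
    limits; raises instead of silently adjusting."""
    if not (min_bs <= batch_size <= max_bs):
        raise ValueError(
            f"batch size {batch_size} outside allowed range [{min_bs}, {max_bs}]"
        )
    return batch_size


def effective_batch(batch_size: int, grad_accum_steps: int) -> int:
    # Sequences contributing to one optimizer update, assuming the
    # conventional meaning of gradient accumulation.
    return batch_size * grad_accum_steps


# Example row: DeepSeek-R1 under SFT has min 2, max 2, and
# 8 gradient accumulation steps, giving an effective batch of 16.
bs = check_batch_size(2, min_bs=2, max_bs=2)
print(effective_batch(bs, grad_accum_steps=8))  # 16
```

Checking the batch size against the table before submitting a job turns an out-of-range configuration into an immediate local error.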

Full Fine-tuning

| Organization | Model Name | Model String for API | Context Length (SFT) | Context Length (DPO) | Max Batch Size (SFT) | Max Batch Size (DPO) | Min Batch Size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DeepSeek | DeepSeek-R1-Distill-Llama-70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 24576 | 12288 | 32 | 32 | 32 |
| DeepSeek | DeepSeek-R1-Distill-Qwen-14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 65536 | 32768 | 8 | 8 | 8 |
| DeepSeek | DeepSeek-R1-Distill-Qwen-1.5B | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 131072 | 131072 | 8 | 8 | 8 |
| Google | gemma-3-270m | google/gemma-3-270m | 32768 | 32768 | 128 | 128 | 8 |
| Google | gemma-3-270m-it | google/gemma-3-270m-it | 32768 | 32768 | 128 | 128 | 8 |
| Google | gemma-3-1b-it | google/gemma-3-1b-it | 32768 | 32768 | 64 | 64 | 8 |
| Google | gemma-3-1b-pt | google/gemma-3-1b-pt | 32768 | 32768 | 64 | 64 | 8 |
| Google | gemma-3-4b-it | google/gemma-3-4b-it | 131072 | 65536 | 8 | 8 | 8 |
| Google | gemma-3-4b-it-VLM | google/gemma-3-4b-it-VLM | 32768 | 32768 | 8 | 8 | 8 |
| Google | gemma-3-4b-pt | google/gemma-3-4b-pt | 131072 | 65536 | 8 | 8 | 8 |
| Google | gemma-3-12b-it | google/gemma-3-12b-it | 65536 | 49152 | 8 | 8 | 8 |
| Google | gemma-3-12b-it-VLM | google/gemma-3-12b-it-VLM | 32768 | 32768 | 8 | 8 | 8 |
| Google | gemma-3-12b-pt | google/gemma-3-12b-pt | 65536 | 49152 | 8 | 8 | 8 |
| Google | gemma-3-27b-it | google/gemma-3-27b-it | 49152 | 24576 | 16 | 16 | 16 |
| Google | gemma-3-27b-it-VLM | google/gemma-3-27b-it-VLM | 32768 | 24576 | 16 | 16 | 16 |
| Google | gemma-3-27b-pt | google/gemma-3-27b-pt | 49152 | 24576 | 16 | 16 | 16 |
| Qwen | Qwen3-0.6B | Qwen/Qwen3-0.6B | 40960 | 40960 | 64 | 64 | 8 |
| Qwen | Qwen3-0.6B-Base | Qwen/Qwen3-0.6B-Base | 32768 | 32768 | 64 | 64 | 8 |
| Qwen | Qwen3-1.7B | Qwen/Qwen3-1.7B | 40960 | 40960 | 32 | 32 | 8 |
| Qwen | Qwen3-1.7B-Base | Qwen/Qwen3-1.7B-Base | 32768 | 32768 | 32 | 32 | 8 |
| Qwen | Qwen3-4B | Qwen/Qwen3-4B | 40960 | 40960 | 16 | 16 | 8 |
| Qwen | Qwen3-4B-Base | Qwen/Qwen3-4B-Base | 32768 | 32768 | 16 | 16 | 8 |
| Qwen | Qwen3-8B | Qwen/Qwen3-8B | 40960 | 40960 | 8 | 8 | 8 |
| Qwen | Qwen3-8B-Base | Qwen/Qwen3-8B-Base | 32768 | 32768 | 16 | 16 | 8 |
| Qwen | Qwen3-14B | Qwen/Qwen3-14B | 40960 | 40960 | 8 | 8 | 8 |
| Qwen | Qwen3-14B-Base | Qwen/Qwen3-14B-Base | 32768 | 32768 | 8 | 8 | 8 |
| Qwen | Qwen3-32B | Qwen/Qwen3-32B | 40960 | 24576 | 16 | 16 | 16 |
| Qwen | Qwen3-VL-8B-Instruct | Qwen/Qwen3-VL-8B-Instruct | 24576 | 16384 | 8 | 8 | 8 |
| Qwen | Qwen3-VL-32B-Instruct | Qwen/Qwen3-VL-32B-Instruct | 16384 | 16384 | 16 | 16 | 16 |
| Qwen | Qwen3-VL-30B-A3B-Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | 16384 | 16384 | 8 | 8 | 8 |
| Meta | Llama-3.3-70B-Instruct-Reference | meta-llama/Llama-3.3-70B-Instruct-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Llama-3.2-3B-Instruct | meta-llama/Llama-3.2-3B-Instruct | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Llama-3.2-3B | meta-llama/Llama-3.2-3B | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Llama-3.2-1B-Instruct | meta-llama/Llama-3.2-1B-Instruct | 131072 | 131072 | 8 | 8 | 8 |
| Meta | Llama-3.2-1B | meta-llama/Llama-3.2-1B | 131072 | 131072 | 8 | 8 | 8 |
| Meta | Meta-Llama-3.1-8B-Instruct-Reference | meta-llama/Meta-Llama-3.1-8B-Instruct-Reference | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Meta-Llama-3.1-8B-Reference | meta-llama/Meta-Llama-3.1-8B-Reference | 131072 | 65536 | 8 | 8 | 8 |
| Meta | Meta-Llama-3.1-70B-Instruct-Reference | meta-llama/Meta-Llama-3.1-70B-Instruct-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Meta-Llama-3.1-70B-Reference | meta-llama/Meta-Llama-3.1-70B-Reference | 24576 | 12288 | 32 | 32 | 32 |
| Meta | Meta-Llama-3-8B-Instruct | meta-llama/Meta-Llama-3-8B-Instruct | 8192 | 8192 | 64 | 64 | 8 |
| Meta | Meta-Llama-3-8B | meta-llama/Meta-Llama-3-8B | 8192 | 8192 | 64 | 64 | 8 |
| Meta | Meta-Llama-3-70B-Instruct | meta-llama/Meta-Llama-3-70B-Instruct | 8192 | 8192 | 32 | 32 | 32 |
| Qwen | Qwen2-7B-Instruct | Qwen/Qwen2-7B-Instruct | 32768 | 32768 | 8 | 8 | 8 |
| Qwen | Qwen2-7B | Qwen/Qwen2-7B | 131072 | 24576 | 8 | 8 | 8 |
| Qwen | Qwen2-1.5B-Instruct | Qwen/Qwen2-1.5B-Instruct | 32768 | 32768 | 32 | 32 | 8 |
| Qwen | Qwen2-1.5B | Qwen/Qwen2-1.5B | 131072 | 131072 | 8 | 8 | 8 |
| Mistral | Mixtral-8x7B-Instruct-v0.1 | mistralai/Mixtral-8x7B-Instruct-v0.1 | 32768 | 32768 | 16 | 16 | 16 |
| Mistral | Mixtral-8x7B-v0.1 | mistralai/Mixtral-8x7B-v0.1 | 32768 | 32768 | 16 | 16 | 16 |
| Mistral | Mistral-7B-Instruct-v0.2 | mistralai/Mistral-7B-Instruct-v0.2 | 32768 | 32768 | 16 | 16 | 8 |
| Mistral | Mistral-7B-v0.1 | mistralai/Mistral-7B-v0.1 | 32768 | 32768 | 16 | 16 | 8 |
| Together | llama-2-7b-chat | togethercomputer/llama-2-7b-chat | 4096 | 4096 | 128 | 128 | 8 |