Create
To start a new fine-tuning job, pass either --model (to start from a base model) or --from-checkpoint (to resume from a previous job). Before the job is submitted, the CLI prints an estimated price and asks for confirmation. Pass --confirm (or -y) to skip the prompt in scripts and CI.
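
The rule for --model and --from-checkpoint can be sketched as a small helper that assembles the argument list for a script or CI job. This is a minimal sketch, not the CLI's own code: the function name `build_create_args` is hypothetical, and the `together fine-tuning create` prefix is an assumption, since this section does not show the command itself. Only the flags come from the documentation.

```python
from typing import List, Optional, Sequence


def build_create_args(
    training_file: str,
    model: Optional[str] = None,
    from_checkpoint: Optional[str] = None,
    confirm: bool = False,
    # Assumed command prefix; adjust it to match your installed CLI.
    prefix: Sequence[str] = ("together", "fine-tuning", "create"),
) -> List[str]:
    """Assemble an argv list for a fine-tuning create call."""
    # Exactly one of --model / --from-checkpoint must be given.
    if (model is None) == (from_checkpoint is None):
        raise ValueError("pass exactly one of model or from_checkpoint")

    args = list(prefix) + ["--training-file", training_file]
    if model is not None:
        args += ["--model", model]
    else:
        args += ["--from-checkpoint", str(from_checkpoint)]
    if confirm:
        # Skips the price-confirmation prompt, e.g. in CI.
        args.append("--confirm")
    return args


if __name__ == "__main__":
    # "file-123" and "my-base-model" are placeholder values.
    print(build_create_args("file-123", model="my-base-model", confirm=True))
```

The returned list can be passed to `subprocess.run(args, check=True)`. Building a list rather than a single string avoids shell quoting problems with file paths.
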
If --training-file (or --validation-file) is a local path, the CLI uploads the file to the Files API automatically before kicking off the job.
Parameters
| Flag | Description |
|---|---|
| --training-file/-t [string \| Path] | Required. Training file ID from the Files API or a local path to upload. The maximum allowed file size is 25 GB. |
| --model [string] | Base model to fine-tune. See supported models. Required unless --from-checkpoint is set. |
| --from-checkpoint [string] | Continue training from a previous fine-tuning job. Format: JOB_ID/OUTPUT_MODEL_NAME:STEP. The step is optional; the final checkpoint is used when omitted. Mutually exclusive with --model. |
| --validation-file/-v [string] | Validation file ID from the Files API or a local path to upload. Required when --n-evals > 0. The maximum allowed file size is 25 GB. |
| --suffix [string] | Up to 40 characters appended to the fine-tuned model name. Recommended to differentiate fine-tuned models. |
| --packing/--no-packing | Whether to use sequence packing for training. Default: enabled. |
| --max-seq-length [integer] | Maximum sequence length to use for training. Required when --no-packing is set. Defaults to the maximum allowed for the model and training type. |
| --n-epochs/-ne [integer] | Number of epochs to fine-tune on the dataset. Default: 1. Min: 1. Max: 20. |
| --n-evals [integer] | Number of evaluation loops to run on the validation set. Default: 0. Min: 0. Max: 100. |
| --n-checkpoints/-c [integer] | The number of checkpoints to save during training. Default: 1. One checkpoint is always saved on the last epoch. Must be 1 ≤ n-checkpoints ≤ n-epochs. |
| --batch-size/-b [integer \| max] | Batch size for each training iteration. See supported models for min and max batch sizes per model. Default: max. |
| --learning-rate/-lr [float] | Learning rate multiplier. Default: 0.00001. Min: 0.00000001. Max: 0.01. |
| --lr-scheduler-type [linear \| cosine] | Learning rate scheduler type. Default: cosine. |
| --min-lr-ratio [float] | Ratio of the final learning rate to the peak learning rate. Default: 0.0. Min: 0.0. Max: 1.0. |
| --scheduler-num-cycles [float] | Number or fraction of cycles for the cosine learning rate scheduler. Must be non-negative. Default: 0.5. |
| --warmup-ratio [float] | Fraction of steps at the start of training to linearly warm up the learning rate. Default: 0.0. Min: 0.0. Max: 1.0. |
| --max-grad-norm [float] | Max gradient norm for gradient clipping. Set to 0 to disable. Default: 1.0. Min: 0.0. |
| --weight-decay [float] | Weight decay for the optimizer. Default: 0.0. Min: 0.0. |
| --random-seed [integer] | Random seed for reproducible training. Uses the server default if unset. |
| --confirm/-y | Skip the price-confirmation prompt. Useful in scripts and CI. |
| --train-on-inputs [true \| false \| auto] | Whether to mask user messages in conversational data or prompts in instruction data. auto infers the setting from the data format. Default: auto. |
| --train-vision/--no-train-vision | Update the vision encoder parameters. Default: false. Only available for vision-language models. |
| --from-hf-model [string] | Hugging Face Hub repository to start training from. Should match the base model’s architecture and size. When --lora is set with --lora-trainable-modules all-linear, the modules k_proj, o_proj, q_proj, v_proj are targeted for adapter training. |
| --hf-model-revision [string] | Revision (branch name or commit hash) of the Hugging Face Hub model. |
| --hf-api-token [string] | Hugging Face API token for downloading from a private repo or uploading the output model. |
| --hf-output-repo-name [string] | Hugging Face repo to upload the fine-tuned model to. |
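
Several rows in the table above constrain each other. The sketch below checks a few of those constraints before a job is submitted and splits the optional step off a --from-checkpoint value. It is illustrative only: `split_checkpoint_step` and `check_create_options` are hypothetical helpers, not part of the CLI, and the server remains the authority on what it accepts. The sketch assumes the step is a non-negative integer and treats everything before the final colon as an opaque identifier.

```python
from typing import List, NamedTuple, Optional


class CheckpointRef(NamedTuple):
    identifier: str       # everything before the optional ":STEP"
    step: Optional[int]   # None means "use the final checkpoint"


def split_checkpoint_step(spec: str) -> CheckpointRef:
    """Split the optional ':STEP' suffix off a --from-checkpoint value."""
    head, colon, tail = spec.rpartition(":")
    if not colon:
        # No ':' present, so no step was given.
        if not spec:
            raise ValueError("empty checkpoint specification")
        return CheckpointRef(spec, None)
    if not head or not tail.isdigit():
        raise ValueError(f"malformed checkpoint specification: {spec!r}")
    return CheckpointRef(head, int(tail))


def check_create_options(
    n_epochs: int = 1,
    n_checkpoints: int = 1,
    n_evals: int = 0,
    validation_file: Optional[str] = None,
    packing: bool = True,
    max_seq_length: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems: List[str] = []
    if not 1 <= n_epochs <= 20:
        problems.append("--n-epochs must be between 1 and 20")
    if not 1 <= n_checkpoints <= n_epochs:
        problems.append("--n-checkpoints must be between 1 and --n-epochs")
    if not 0 <= n_evals <= 100:
        problems.append("--n-evals must be between 0 and 100")
    if n_evals > 0 and validation_file is None:
        problems.append("--validation-file is required when --n-evals > 0")
    if not packing and max_seq_length is None:
        problems.append("--max-seq-length is required with --no-packing")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a wrapper script report every problem in one pass.
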
Weights & Biases
| Flag | Description |
|---|---|
| --wandb-api-key [string] | Your Weights & Biases API key. Falls back to the WANDB_API_KEY environment variable. |
| --wandb-base-url [string] | Base URL of a dedicated Weights & Biases instance. Leave empty if you are not using a self-hosted instance. |
| --wandb-project-name [string] | Weights & Biases project for your run. Defaults to together. |
| --wandb-name [string] | Weights & Biases run name. |
| --wandb-entity [string] | Weights & Biases entity (team or user). |
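
The fallback described for --wandb-api-key can be pictured as a simple lookup order. This is a sketch of that order only; `resolve_wandb_key` is a hypothetical name, and the CLI's actual implementation is not shown in this section.

```python
import os
from typing import Mapping, Optional


def resolve_wandb_key(
    flag_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the explicit flag, then the WANDB_API_KEY environment variable."""
    if flag_value:
        return flag_value
    # Returns None when neither source provides a key.
    return env.get("WANDB_API_KEY")
```

Relying on the environment variable keeps the key out of shell history and CI logs, which is usually preferable to passing it on the command line.
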
LoRA
| Flag | Description |
|---|---|
| --lora/--no-lora | Force LoRA fine-tuning (--lora) or full fine-tuning (--no-lora). When omitted, the API auto-detects: it defaults to LoRA on most base models, and inherits the parent job’s training type when --from-checkpoint is set. |
| --lora-r [integer] | Rank for LoRA adapter weights. Default: 8. Min: 1. Max: 64. |
| --lora-alpha [integer] | Alpha for LoRA adapter training. Default: 8. Min: 1. |
| --lora-dropout [float] | Dropout probability for LoRA layers. Default: 0.0. Min: 0.0. Max: 1.0. |
| --lora-trainable-modules [string] | Comma-separated list of LoRA trainable modules. Default: all-linear. See supported modules for LoRA training. |
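
For intuition about --lora-r and --lora-alpha: in the standard LoRA formulation, a frozen weight matrix of shape d_out × d_in receives a trainable low-rank update B·A scaled by alpha / r, where B is d_out × r and A is r × d_in. The sketch below computes that scale and the number of trainable parameters added for one matrix. It assumes the service follows this standard formulation, which this section does not confirm; the function names and the 4096 layer size are illustrative.

```python
def lora_scale(r: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    if r < 1:
        raise ValueError("rank must be at least 1")
    return alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in A (r x d_in) plus B (d_out x r) for one adapted matrix."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    # The documented defaults are r=8 and alpha=8.
    print(lora_scale(8, 8))
    # 4096 x 4096 is an example layer size, not taken from any specific model.
    print(lora_trainable_params(4096, 4096, 8))
```

Under this formulation, raising the rank without raising alpha shrinks the scale, so the two flags are often adjusted together.
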
Preference fine-tuning (DPO, RPO, SimPO)
| Flag | Description |
|---|---|
| --training-method [sft \| dpo] | Training method. sft is supervised fine-tuning; dpo is Direct Preference Optimization. Default: sft. The DPO method also accepts the RPO and SimPO loss modifiers below. |
| --dpo-beta [float] | Beta parameter for DPO training. Only used when --training-method dpo. |
| --dpo-normalize-logratios-by-length | Normalize logratios by sample length. Only used when --training-method dpo. Default: false. |
| --rpo-alpha [float] | RPO alpha parameter (adds NLL term to the DPO loss). Only used when --training-method dpo. |
| --simpo-gamma [float] | SimPO gamma parameter. Only used when --training-method dpo. |
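
To show where --dpo-beta and --rpo-alpha enter the objective, the sketch below evaluates the standard DPO loss for a single preference pair, with an optional NLL term on the chosen response weighted by alpha. It is a textbook-style illustration under stated assumptions, not the service's training code: the exact form of the RPO term is assumed, and --simpo-gamma and length normalization are left out because this section does not specify how they are applied. The function name `dpo_pair_loss` is hypothetical.

```python
import math


def dpo_pair_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float,
    rpo_alpha: float = 0.0,
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # Log-ratios of the policy against the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) equals softplus(-margin); computed in a stable form.
    loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

    # Assumed RPO form: add alpha times the NLL of the chosen response.
    loss += rpo_alpha * (-policy_chosen_logp)
    return loss
```

When the policy and the reference agree exactly, the margin is zero and the DPO term equals log 2; it falls below that as the policy shifts probability toward the chosen response.
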
List
Lists past and running fine-tuning jobs.
Retrieve
Retrieves metadata for a job, including its current status.
List events
Lists events of a past or running job.
Cancel
Cancels a running job.
List checkpoints
Lists the saved checkpoints of a job.
Download model weights
Downloads the weights of a fine-tuned model as a compressed (.zst) archive. To extract the weights, run tar -xf filename.
Parameters
| Flag | Description |
|---|---|
| --output-dir/-o [Path] | Output directory. |
| --checkpoint-step/-s [integer] | Download a specific checkpoint’s weights. Defaults to the latest checkpoint. |
| --checkpoint-type/-c [merged \| adapter \| default] | Checkpoint type. merged and adapter apply to LoRA jobs only; default resolves to merged for LoRA jobs and to the full model for non-LoRA jobs. Default: merged. |
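
The resolution rule for --checkpoint-type can be written out as a small function. This is a sketch of the rule as described in the table, not the CLI's implementation; `resolve_checkpoint_type` is a hypothetical name, and the string "full" is only an illustrative label for the full model weights of a non-LoRA job.

```python
def resolve_checkpoint_type(requested: str, is_lora_job: bool) -> str:
    """Map a requested checkpoint type to what would be downloaded."""
    if requested not in {"merged", "adapter", "default"}:
        raise ValueError(f"unknown checkpoint type: {requested!r}")
    if requested == "default":
        # "default" depends on how the job was trained.
        return "merged" if is_lora_job else "full"
    if not is_lora_job:
        # "merged" and "adapter" only make sense for LoRA jobs.
        raise ValueError(f"{requested!r} applies to LoRA jobs only")
    return requested
```
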
Delete
Deletes a fine-tuning job.
Parameters
| Flag | Description |
|---|---|
| --force | Bypass confirmation prompt. |