The Finetune function of the Together Python Library is used to create, manage, and monitor fine-tune jobs.

Help

See all commands with:

together finetune --help

Create

To start a new fine-tune job:

together finetune create -t <FILE-ID> -m <MODEL>

Arguments:

  • --training-file,-t (file-id, required) -- Specifies a training file with the file-id of a previously uploaded file (See Files).
  • --model,-m (model name, optional) -- Specifies the base model to fine-tune. Default: togethercomputer/RedPajama-INCITE-7B-Chat.
  • --suffix,-s (string, optional) -- Up to 40 characters that will be added to your fine-tuned model name. It is recommended to add this to differentiate fine-tuned models. Default: None.
  • --n-epochs,-ne (integer, optional) -- Number of epochs to fine-tune on the dataset. Default: 4, Min: 1, Max: 20.
  • --n-checkpoints,-c (integer, optional) -- The number of checkpoints to save during training. Default: 1. One checkpoint is always saved on the last epoch for the trained model. The number of checkpoints must be less than the number of epochs. If a larger number is given, the number of epochs is used as the number of checkpoints.
  • --batch-size,-b (integer, optional) -- The batch size to use for training. Default: 32, Min: 4, Max: 128 (For CodeLlama-7b, the max batch size is 16. For CodeLlama-13b, the max batch size is 8.)
  • --learning-rate,-lr (float, optional) -- The learning rate multiplier to use for training. Default: 0.00001, Min: 0.00000001, Max: 0.01.
  • --wandb-api-key -- Your own Weights & Biases API key. If you provide the key, you can monitor your job progress on your Weights & Biases page. If it is not set, the WANDB_API_KEY environment variable is used.
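
The limits in the argument list can be checked before a job is submitted. The following is a minimal Python sketch, not part of the Together library: build_create_command is a hypothetical helper that validates values against the ranges listed above and assembles the command line. It assumes the CLI accepts the long flag names exactly as documented and does not cover the lower per-model batch size caps for CodeLlama.

```python
from typing import List, Optional

# Limits copied from the argument list above.
EPOCHS_MIN, EPOCHS_MAX = 1, 20
BATCH_MIN, BATCH_MAX = 4, 128
LR_MIN, LR_MAX = 0.00000001, 0.01
SUFFIX_MAX_LEN = 40


def build_create_command(
    training_file: str,
    model: Optional[str] = None,
    suffix: Optional[str] = None,
    n_epochs: int = 4,
    n_checkpoints: int = 1,
    batch_size: int = 32,
    learning_rate: float = 0.00001,
) -> List[str]:
    """Hypothetical helper: return the argv list for `together finetune create`."""
    if not EPOCHS_MIN <= n_epochs <= EPOCHS_MAX:
        raise ValueError(f"n_epochs must be in [{EPOCHS_MIN}, {EPOCHS_MAX}]")
    if not BATCH_MIN <= batch_size <= BATCH_MAX:
        raise ValueError(f"batch_size must be in [{BATCH_MIN}, {BATCH_MAX}]")
    if not LR_MIN <= learning_rate <= LR_MAX:
        raise ValueError(f"learning_rate must be in [{LR_MIN}, {LR_MAX}]")
    if suffix is not None and len(suffix) > SUFFIX_MAX_LEN:
        raise ValueError(f"suffix is limited to {SUFFIX_MAX_LEN} characters")

    # Mirror the documented fallback: a checkpoint count larger than the
    # number of epochs is replaced by the number of epochs.
    n_checkpoints = min(n_checkpoints, n_epochs)

    cmd = ["together", "finetune", "create", "--training-file", training_file]
    if model is not None:
        cmd += ["--model", model]
    if suffix is not None:
        cmd += ["--suffix", suffix]
    cmd += [
        "--n-epochs", str(n_epochs),
        "--n-checkpoints", str(n_checkpoints),
        "--batch-size", str(batch_size),
        # Fixed-point formatting (10 decimals) avoids exponent notation.
        "--learning-rate", f"{learning_rate:.10f}".rstrip("0"),
    ]
    return cmd


# "file-123" is a placeholder file-id, not a real upload.
print(" ".join(build_create_command("file-123", n_epochs=3, n_checkpoints=5)))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where the together command is installed.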

The id field in the JSON response contains the value for the fine-tune job ID (ft-id) that can be used to get the status, retrieve logs, cancel the job, and download weights.
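
As an illustration of reading the job ID, the sketch below parses a JSON response and returns its id field. extract_job_id is a hypothetical helper, and the sample response is invented for the example: only the id key is documented here, and the sketch assumes the response text is a single JSON object.

```python
import json


def extract_job_id(response_text: str) -> str:
    """Hypothetical helper: return the fine-tune job ID from a JSON response."""
    response = json.loads(response_text)
    if not isinstance(response, dict):
        raise ValueError("expected a JSON object")
    job_id = response.get("id")
    if not isinstance(job_id, str) or not job_id:
        raise ValueError("response has no usable 'id' field")
    return job_id


# Invented response body; real responses may contain additional fields.
sample_response = '{"id": "ft-example-id"}'
ft_id = extract_job_id(sample_response)
print(ft_id)
```

The resulting value is what the commands below refer to as <FT-ID>.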

List

To list past and running fine-tune jobs:

together finetune list

The jobs are sorted oldest to newest, so the newest jobs appear at the bottom of the list.

Retrieve

To retrieve metadata on a job:

together finetune retrieve <FT-ID>

Monitor Events

To list events of a past or running job:

together finetune list-events <FT-ID>

Cancel

To cancel a running job:

together finetune cancel <FT-ID>

Status

To get the status of a job:

together finetune status <FT-ID>

Checkpoints

To list the saved checkpoints of a job:

together finetune checkpoints <FT-ID>

Download Model and Checkpoint Weights

To download the weights of a fine-tuned model, run:

together finetune download <FT-ID>

This command downloads the ZSTD-compressed weights of the model. To extract the weights, run tar -xf filename, where filename is the downloaded archive.

Other arguments:

  • --output,-o (filename, optional) -- Specify the output filename. Default: <MODEL-NAME>.tar.zst
  • --step,-s (integer, optional) -- Download the weights of a specific checkpoint. Default: -1, which downloads the latest weights.
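
The download and extraction steps can be chained from a script. The following Python sketch is one possible approach, not an official workflow: build_download_command and download_and_extract are hypothetical helpers that use only the flags listed above, and the extraction step assumes the local tar can read ZSTD-compressed archives.

```python
import subprocess
from typing import List, Optional


def build_download_command(
    ft_id: str,
    output: Optional[str] = None,
    step: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: return the argv list for `together finetune download`."""
    cmd = ["together", "finetune", "download", ft_id]
    if output is not None:
        cmd += ["--output", output]
    if step is not None:
        cmd += ["--step", str(step)]
    return cmd


def download_and_extract(ft_id: str, output: str, step: Optional[int] = None) -> None:
    """Download the weights to `output`, then unpack them with tar."""
    # An explicit output filename is passed so the archive path is known
    # for the extraction step.
    subprocess.run(build_download_command(ft_id, output, step), check=True)
    subprocess.run(["tar", "-xf", output], check=True)


# "ft-example-id" is a placeholder job ID; this only prints the command.
print(" ".join(build_download_command("ft-example-id", "weights.tar.zst", step=2)))
```

Calling download_and_extract requires the together command and a valid job ID, so the example above only prints the command it would run.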