Fine-Tune a Model with CLI

Use this guide to start fine-tuning a model using the command-line interface (CLI).

Fine-tuning builds on the foundational knowledge of a pre-trained model, specializing it for tasks tailored to your requirements. With Together, you can fine-tune and run your model in the cloud without having to set up any GPUs. The Together API lets you enhance state-of-the-art open-source models with your own data. You can tune all model layers, control hyperparameters, and run the resulting model with Together’s inference service, or download the weights to run it wherever you’d like.

In this guide, we’ll show you how to fine-tune one of Together's models using a dataset we provide.

Install the Library

To get started, install the together Python library:

pip install --upgrade together

Authenticate

Configure your API key by setting the TOGETHER_API_KEY environment variable:

export TOGETHER_API_KEY=xxxxx

You can find your API key in your account settings.
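If you script around the CLI, it helps to fail fast when the key is missing. Here is a minimal, stdlib-only sanity check; the variable name TOGETHER_API_KEY comes from the step above, everything else is illustrative:

```python
import os

def require_api_key(env_var: str = "TOGETHER_API_KEY") -> str:
    """Return the Together API key from the environment, or raise with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; run `export {env_var}=...` before using the together CLI"
        )
    return key
```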

Prepare your Data

Prepare your dataset as a .jsonl file, where each line is a JSON object with a text field:

{"text": "..."}
{"text": "..."}

For more details and examples, check out this page.

To confirm that your dataset has the right format, run the following command:

$ together files check PATH_TO_DATA_FILE
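If you want a similar quick sanity check inside your own scripts, a stdlib-only sketch like the one below catches the two most common problems, unparseable lines and records missing the text field. This is an illustrative approximation, not a reimplementation of what `together files check` verifies:

```python
import json

def check_jsonl(path):
    """Return a list of (line_number, problem) for lines that aren't valid {"text": ...} records."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append((i, f"invalid JSON: {e}"))
                continue
            if not isinstance(obj, dict) or "text" not in obj:
                problems.append((i, 'missing "text" field'))
    return problems
```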

Upload your Data

Replace PATH_TO_DATA_FILE with the path to your dataset.

$ together files upload PATH_TO_DATA_FILE

The output should be similar to the following (in this case using an example dataset from Hugging Face):

$ together files upload ~/data/joke_explanations/2023-06-27/unified_joke_explanations.jsonl
{
    "filename": "unified_joke_explanations.jsonl",
    "bytes": 150047,
    "created_at": 1687982638,
    "id": "file-d88343a5-3ba5-4b42-809a-9f1ee2b83861",
    "purpose": "fine-tune",
    "object": "file",
    "LineCount": 356
}

Start Fine-Tuning

Submit your fine-tuning job using the CLI:

$ together finetune create --training-file $FILE_ID --model $MODEL_NAME --wandb-api-key $WANDB_API_KEY

Replace FILE_ID with the ID of your uploaded training file.
Replace MODEL_NAME with the API name of the base model you want to fine-tune (refer to the models list).
Optionally, replace WANDB_API_KEY with your Weights & Biases API key to log training metrics.

You can also use the --suffix parameter to customize your model name. To see all input arguments and their details, visit this page.

Here's a sample output:

$ together finetune create --training-file file-d88343a5-3ba5-4b42-809a-9f1ee2b83861 --model togethercomputer/RedPajama-INCITE-7B-Chat
{
    "training_file": "file-d88343a5-3ba5-4b42-809a-9f1ee2b83861",
    "model_output_name": "username/togethercomputer/RedPajama-INCITE-7B-Chat",
    "model_output_path": "s3://together-dev/finetune/63e2b89da6382c4d75d5ef22/csris/togethercomputer/RedPajama-INCITE-7B-Chat",
    "Suffix": "",
    "model": "togethercomputer/RedPajama-INCITE-7B-Chat",
    "n_epochs": 4,
    "batch_size": 128,
    "learning_rate": 1e-06,
    "checkpoint_steps": 2,
    "created_at": 1687982945,
    "updated_at": 1687982945,
    "status": "pending",
    "id": "ft-5bf8990b-841d-4d63-a8a3-5248d73e045f",
    "job_id": "",
    "token_count": 0,
    "param_count": 0,
    "total_price": 0,
    "epochs_completed": 0,
    "events": [
        {
            "object": "fine-tune-event",
            "created_at": 1687982945,
            "level": "",
            "message": "Fine tune request created",
            "type": "JOB_PENDING",
            "param_count": 0,
            "token_count": 0,
            "checkpoint_path": "",
            "model_path": ""
        }
    ],
    "queue_depth": 0,
    "wandb_project_name": ""
}

Take note of the job ID ("id" in the output), as you'll need it to track progress and download model weights. For example, in the sample output above, the job ID is ft-5bf8990b-841d-4d63-a8a3-5248d73e045f.
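When scripting, you can capture the job ID directly from the JSON the CLI prints instead of copying it by hand. A sketch, assuming the create command emits JSON shaped like the sample above:

```python
import json

def job_id_from_output(output: str) -> str:
    """Extract the job ID ("id") from the CLI's JSON output."""
    return json.loads(output)["id"]

# Usage: capture the create command's stdout and parse it, e.g.
#   import subprocess
#   out = subprocess.check_output(
#       ["together", "finetune", "create", "--training-file", FILE_ID, "--model", MODEL_NAME],
#       text=True)
#   print(job_id_from_output(out))
```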

A fine-tuning job can take anywhere from a couple of minutes to several hours, depending on the base model, dataset size, number of epochs, and job queue.

Monitor Progress

View progress by navigating to the Jobs tab in the Playground. You can also monitor progress using the CLI:

$ together finetune list-events FINETUNE_ID

Replace FINETUNE_ID with the ID of the fine-tuning job.

The output should be similar to:

$ together finetune list-events ft-5bf8990b-841d-4d63-a8a3-5248d73e045f
{
    "data": [
        {
            "object": "fine-tune-event",
            "created_at": 1687982945,
            "level": "",
            "message": "Fine tune request created",
            "type": "JOB_PENDING",
            "param_count": 0,
            "token_count": 0,
            "checkpoint_path": "",
            "model_path": ""
        },
        {
            "object": "fine-tune-event",
            "created_at": 1687982993,
            "level": "info",
            "message": "Training started at Wed Jun 28 13:09:51 PDT 2023",
            "type": "JOB_START",
            "param_count": 0,
            "token_count": 0,
            "checkpoint_path": "",
            "model_path": ""
        },
        {
            "object": "fine-tune-event",
            "created_at": 1687983122,
            "level": "info",
            "message": "Model data downloaded for togethercomputer/RedPajama-INCITE-7B-Chat at Wed Jun 28 13:12:01 PDT 2023",
            "type": "MODEL_DOWNLOAD_COMPLETE",
            "param_count": 0,
            "token_count": 0,
            "checkpoint_path": "",
            "model_path": ""
        },
        {
            "object": "fine-tune-event",
            "created_at": 1687983124,
            "level": "info",
            "message": "Training data downloaded for togethercomputer/RedPajama-INCITE-7B-Chat at Wed Jun 28 13:12:03 PDT 2023",
            "type": "TRAINING_DATA_DOWNLOAD_COMPLETE",
            "param_count": 0,
            "token_count": 0,
            "checkpoint_path": "",
            "model_path": ""
        }
    ],
    "object": "list"
}
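For unattended runs, you can poll the job until it reaches a terminal state. The sketch below separates the polling logic from the CLI call so it is easy to test; the terminal status names are assumptions extrapolated from the "status": "pending" field in the sample output, so adjust them to what your jobs actually report:

```python
import json
import time

def wait_for_job(fetch_json, poll_seconds=60,
                 terminal=("completed", "error", "cancelled")):
    """Call fetch_json() repeatedly until the job's status is terminal; return the final status."""
    while True:
        status = json.loads(fetch_json()).get("status", "")
        if status in terminal:
            return status
        time.sleep(poll_seconds)

# Usage with the CLI (assumes `together finetune retrieve` prints JSON with a "status" field):
#   import subprocess
#   final = wait_for_job(lambda: subprocess.check_output(
#       ["together", "finetune", "retrieve", FINETUNE_ID], text=True))
```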

Other Commands

  1. List all of your current jobs

    $ together finetune list
    
  2. Cancel a job

    $ together finetune cancel FINETUNE_ID
    

Replace FINETUNE_ID with the ID of the fine-tuning job.

Commands

Here are all the commands available through the CLI:

# list commands
together --help

# list available models
together models list

# start a model
together models start togethercomputer/RedPajama-INCITE-7B-Base

# create completion
together complete "Space robots are" -m togethercomputer/RedPajama-INCITE-7B-Base

# check which models are running
together models instances

# stop a model
together models stop togethercomputer/RedPajama-INCITE-7B-Base

# check your jsonl file
together files check jokes.jsonl

# upload your jsonl file
together files upload jokes.jsonl

# upload your jsonl file and disable file checking
together files upload jokes.jsonl --no-check

# list your uploaded files
together files list

# start fine-tuning a model on your jsonl file (use the file id returned after upload or from together files list)
together finetune create -t file-9263d6b7-736f-43fc-8d14-b7f0efae9079 -m togethercomputer/RedPajama-INCITE-Chat-3B-v1

# check the status of the finetune job
together finetune status ft-dd93c727-f35e-41c2-a370-7d55b54128fa

# retrieve progress updates about the finetune job
together finetune retrieve ft-dd93c727-f35e-41c2-a370-7d55b54128fa

# download your finetuned model (with your fine_tune_id from the id key given during create or from together finetune list)
together finetune download ft-dd93c727-f35e-41c2-a370-7d55b54128fa 

# check if your newly started finetuned model is ready for inference
together models ready yourname/ft-dd93c727-f35e-41c2-a370-7d55b54128fa-2023-08-16-10-15-09

# inference using your new finetuned model (with new finetuned model name from together models list)
together complete "Space robots are" -m yourname/ft-dd93c727-f35e-41c2-a370-7d55b54128fa-2023-08-16-10-15-09

Resources

See the list of base models available to fine-tune with the Together API.

Estimate fine-tuning pricing with our calculator. Pricing is based on model size, dataset size, and the number of epochs.

For more detailed options, see the Python library's docs for Files and Fine-tuning.