Quick Links
- Install the Library
- Prepare your Data
- Check and Upload your Data
- Start Fine-Tuning
- Monitor Progress
- Using a Downloaded Model
- Deploy your Fine-Tuned Model
- Colab Notebook Finetuning Project Tutorial
Install the Library
To get started, install the `together` Python library and set your API key in the `TOGETHER_API_KEY` environment variable:
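A minimal setup sketch (the package name `together` and the `TOGETHER_API_KEY` variable come from the text above; the key value is a placeholder):

```shell
# Install the Together Python library
pip install --upgrade together

# Make your API key available to the library
export TOGETHER_API_KEY="your-api-key-here"
```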
Upload your Data
See the data preparation instructions to understand the formatting requirements and check that your data is ready. To upload your data, run the following code:
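A minimal upload sketch, assuming the `together` Python SDK is installed and your prepared training file is named `data.jsonl` (the filename is illustrative):

```python
from together import Together

# The client reads TOGETHER_API_KEY from the environment
client = Together()

# Upload the prepared training file; the response includes the file id
resp = client.files.upload(file="data.jsonl")
print(resp.id)  # an id starting with file-
```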
The response includes the `id` of the file you just uploaded. If you forget it, you can retrieve the `id`s of all the files you have uploaded with `files.list()`. You'll need these `id`s, which start with `file-960be810-4d...`, in order to start a fine-tuning job.
Start Fine-Tuning
Once you've uploaded your dataset, copy your file id from the output above and select a base model to fine-tune. Check out the full list of models available for fine-tuning. Run the following command to start your fine-tuning job using `fine_tuning.create`:
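A minimal sketch of starting the job with the `together` Python SDK (the file id and base-model name below are illustrative placeholders; substitute your own):

```python
from together import Together

client = Together()

# Start a fine-tuning job from an uploaded file id and a base model.
resp = client.fine_tuning.create(
    training_file="file-960be810-4d...",       # placeholder: your uploaded file id
    model="meta-llama/Meta-Llama-3-8B",        # placeholder: a model from the fine-tuning list
)
print(resp)
```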
We print the `resp` response to highlight some of the useful information about your fine-tune job.
You can check on your job at any time with the `fine_tuning.retrieve()` method, using the job ID provided above. For example, in the sample output above, `ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f` is your Job ID.
You can also list all the events for a specific fine-tuning job to check the progress or cancel your job with the commands below.
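The retrieve, list-events, and cancel calls described above can be sketched as follows with the `together` Python SDK (the job ID is the sample one from the text; the exact response attributes are assumptions):

```python
from together import Together

client = Together()

job_id = "ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f"  # your Job ID from the create response

# Check the current status of the job
print(client.fine_tuning.retrieve(job_id).status)

# List all events recorded for the job so far
for event in client.fine_tuning.list_events(id=job_id).data:
    print(event.message)

# Cancel the job if needed
# client.fine_tuning.cancel(job_id)
```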
Monitor Progress
You can check the completion progress of your fine-tuning job in the Jobs tab of the playground. If you provided your Weights & Biases API key, you can also check the learning progress of your fine-tuning job at wandb.ai. For example, with my wandb user configuration, I would go to `https://wandb.ai/<username>/together?workspace=user-<username>`, where `<username>` is your unique Weights & Biases username, like `mama-llama-88`.
Congratulations! You’ve just fine-tuned a model with the Together API. Now it’s time to deploy your model.
Deploy your Fine-Tuned Model
Host your Model
Once the fine-tuning job completes and you host your new model, you will be able to see your model on the Playground Models page. You can deploy the model directly through the web UI by clicking on the model, selecting your hardware, and clicking play! Available hardware includes RTX6000, L40, L40S, A100 PCIe, A100 SXM, and H100. The hardware options displayed depend on model constraints and overall hardware availability. Once the model is deployed, you can use it through the playground or through our inference API. For the inference API, follow the instructions in the inference documentation. For the model above, we can run inference on it with the following code:
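A minimal inference sketch using the chat completions endpoint of the `together` Python SDK (the model name below is a hypothetical placeholder; use the name of your own deployed fine-tuned model from the Playground Models page):

```python
from together import Together

client = Together()

# Placeholder model name: substitute your deployed fine-tuned model's name
response = client.chat.completions.create(
    model="your-account/Meta-Llama-3-8B-ft-3b883474",
    messages=[{"role": "user", "content": "What is fine-tuning?"}],
)
print(response.choices[0].message.content)
```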
Pricing
Pricing for fine-tuning is based on model size, the number of tokens, and the number of epochs. You can estimate fine-tuning pricing with our calculator. The tokenization step is part of the fine-tuning process on our API, and the exact number of tokens and the price of your job will be available after the tokenization step is done. You can find this information on the “JOBS” page or retrieve it by running `together fine-tuning retrieve $JOB_ID` in your CLI.
Q: Is there a minimum price? The minimum price for a fine-tuning job is 366. If you fine-tune this model for 1M tokens for 1 epoch, the price is 5.
Q: What happens if I cancel my job? The final price will be determined based on the number of tokens used to train your model up to the point of cancellation. For example, if your fine-tuning job uses Llama-3-8B with a batch size of 8 and you cancel the job after 1000 training steps, the total number of tokens used for training is 8192 [context length] x 8 [batch size] x 1000 [steps] = 65,536,000. This comes out to $27.21, as you can check on the pricing page.
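The token arithmetic in the cancellation example above can be reproduced directly (the context length, batch size, and step count are the figures from the example):

```python
# Tokens used up to cancellation = context length x batch size x training steps
context_length = 8192
batch_size = 8
steps = 1000

tokens_used = context_length * batch_size * steps
print(tokens_used)  # 65536000, matching the 65,536,000 tokens in the example
```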
Using a Downloaded Model
If you want to download your model locally, you can do so by following the steps below. The model will download as a `tar.zst` file.
After extracting the archive, run `ls my-model` to inspect its contents. Use the `.bin` and `.json` files to load your model.
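A sketch of the download-and-extract steps, assuming the `together` CLI is installed and the sample Job ID from earlier; the archive filename and the `my-model` directory name are illustrative, and extraction assumes a `tar` build with zstd support:

```shell
# Download the fine-tuned model artifact (arrives as a .tar.zst archive)
together fine-tuning download ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f

# Extract the zstd-compressed tarball into ./my-model
mkdir -p my-model
tar --zstd -xf ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f.tar.zst -C my-model

# Inspect the extracted .bin and .json files
ls my-model
```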