Python library
Reference this guide to start fine-tuning a model using the Python library.
Quick Links
- Install the Library
- Prepare your Data
- Check and Upload your Data
- Start Fine-Tuning
- Monitor Progress
- Using a Downloaded Model
- Deploy your Fine-Tuned Model
- Colab Notebook Finetuning Project Tutorial
Install the Library
To get started, install the together Python library:
pip install --upgrade together
Then, configure your API key by setting the TOGETHER_API_KEY environment variable:
export TOGETHER_API_KEY=xxxxx
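Before running the examples that follow, it can help to confirm that the variable is actually visible to Python. The sketch below is one way to do that; `require_api_key` is a hypothetical helper written for this guide, not part of the together library.

```python
import os


def require_api_key(env=None, name="TOGETHER_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Illustrative helper only; it is not part of the together library.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running this script")
    return key
```

You could then pass the result to the client, for example `Together(api_key=require_api_key())`.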
Upload your Data
See the data preparation instructions for the required file format and for how to check that your file is ready.
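As a rough local pre-check, you could scan the file yourself before uploading. The sketch below uses only the standard library. It assumes each line is a JSON object with a string `text` field, which is an assumption made for illustration; the data preparation instructions remain the authority on the required format. `check_jsonl` is a hypothetical helper, not part of the together library.

```python
import json


def check_jsonl(path, required_field="text"):
    """Return a list of (line_number, problem) tuples; an empty list means no problems found.

    Assumption: every line is a JSON object with a string `required_field`.
    """
    problems = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((lineno, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((lineno, "not a JSON object"))
            elif not isinstance(record.get(required_field), str):
                problems.append((lineno, f"missing or non-string '{required_field}' field"))
    return problems
```

For example, `check_jsonl("joke_explanations.jsonl")` returns an empty list when every line passes these checks.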
To upload your data, run the following code:
import os
from together import Together
client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))
resp = client.files.upload(file="joke_explanations.jsonl") # uploads a file
print(resp.dict())
Here is the output:
{
"id": "file-f6d02dc8-c9f9-4e38-ae63-7899fa603a86",
"object": "file",
"created_at": 1713481731,
"type": null,
"purpose": "fine-tune",
"filename": "joke_explanations.jsonl",
"bytes": 0,
"line_count": 0,
"processed": false
}
The response includes the id of the file you just uploaded. If you lose it, files.list() returns the ids of all the files you have uploaded. You will need one of these ids, which look like file-960be810-4d...., to start a fine-tuning job.
import os
from together import Together
client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))
filesUploaded = client.files.list() # lists all uploaded files
print(filesUploaded)
[{'filename': 'joke_explanations.jsonl',
'bytes': 40805,
'created_at': 1691710036,
'id': 'file-960be810-4d33-449a-885a-9f69bd8fd0e2',
'purpose': 'fine-tune',
'object': 'file',
'LineCount': 0,
'Processed': True},
{'filename': 'sample_jsonl.jsonl',
'bytes': 1235,
'created_at': 1692190883,
'id': 'file-d0d318cb-b7d9-493a-bd70-1cfe089d3815',
'purpose': 'fine-tune',
'object': 'file',
'LineCount': 0,
'Processed': True}]
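If you have uploaded many files, picking the right id out of the listing by eye gets tedious. The sketch below looks an id up by filename. It assumes the listing is a list of dicts shaped like the sample output above; depending on your library version, files.list() may return an object that you need to convert first. `find_file_ids` is a hypothetical helper.

```python
def find_file_ids(files, filename):
    """Return the ids of all uploaded files whose name matches `filename`.

    Assumes `files` is a list of dicts shaped like the sample listing above.
    """
    return [f["id"] for f in files if f.get("filename") == filename]


# Trimmed copy of the sample listing above
files = [
    {"filename": "joke_explanations.jsonl", "id": "file-960be810-4d33-449a-885a-9f69bd8fd0e2"},
    {"filename": "sample_jsonl.jsonl", "id": "file-d0d318cb-b7d9-493a-bd70-1cfe089d3815"},
]
print(find_file_ids(files, "sample_jsonl.jsonl"))
# prints: ['file-d0d318cb-b7d9-493a-bd70-1cfe089d3815']
```

The function returns a list because the same filename can be uploaded more than once.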
Start Fine-Tuning
Once you've uploaded your dataset, copy your file id from the output above and select a base model to fine-tune. Check out the full models list available for fine-tuning.
Run the following code to start your fine-tuning job using fine_tuning.create:
import os
from together import Together
client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))
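resp = client.fine_tuning.create(
    training_file = 'file-d0d318cb-b7d9-493a-bd70-1cfe089d3815',  # id of the uploaded file
    model = 'meta-llama/Meta-Llama-3-8B',  # base model to fine-tune
    n_epochs = 3,
    n_checkpoints = 1,
    batch_size = 4,
    learning_rate = 1e-5,
    wandb_api_key = '1a2b3c4d5e.......',  # your Weights & Biases key, for progress tracking
)
fine_tune_id = resp.id  # job ID, used by retrieve, list_events and cancel
print(resp)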
Here is part of an example resp response, highlighting some of the useful information about your fine-tuning job.
{
"id": "ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f",
"training_file": "file-2490a204-16e2-481e-a3d5-5636a6f3a4ea",
"model": "meta-llama/Meta-Llama-3-8B",
"output_name": "[email protected]/Meta-Llama-3-8B-2024-04-18-19-37-52",
"n_epochs": 1,
"n_checkpoints": 1,
"batch_size": 32,
"learning_rate": 3e-05,
"created_at": "2024-04-18T19:37:52.611Z",
"updated_at": "2024-04-18T19:37:52.611Z",
"status": "pending",
"events": [
{
"object": "fine-tune-event",
"created_at": "2024-04-18T19:37:52.611Z",
"message": "Fine tune request created",
"type": "JOB_PENDING",
...
}
],
"training_file_size": 150047,
"model_output_path": "s3://together-dev/finetune/65987df6752090cead0c9056/[email protected]/Meta-Llama-3-8B-2024-04-18-19-37-52/ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f",
"user_id": "65987df6752090cead0c9056",
"owner_address": "0xf42ea9df7377257571fb0aae8799b6a357ba1bfb",
"enable_checkpoints": false,
...
}
You can retrieve all of this information again by calling the fine_tuning.retrieve() method with the job ID provided above. In the sample output above, ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f is the job ID.
You can also list all the events for a specific fine-tuning job to check the progress or cancel your job with the commands below.
print(client.fine_tuning.retrieve(id="ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f")) # Retrieves information about a fine-tuning job
print(client.fine_tuning.list_events(id="ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f")) # Lists the events of a fine-tuning job
print(client.fine_tuning.cancel(id="ft-3b883474-f39c-40d9-9d5a-7f97ba9eeb9f")) # Cancels a fine-tuning job
A fine-tuning job can take anywhere from a couple of minutes to several hours, depending on the base model, dataset size, number of epochs, and job queue.
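Because run times vary this much, a script that needs the finished model usually has to poll. The sketch below is one minimal way to do that. It takes a callable that returns the current status, so it does not depend on any particular client. The terminal status names are assumptions (only "pending" appears in the sample response above), and `wait_for_job` is a hypothetical helper.

```python
import time


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=6 * 60 * 60,
                 terminal=("completed", "error", "cancelled"), sleep=time.sleep):
    """Call `get_status()` until it returns a terminal status or the timeout passes.

    The names in `terminal` are assumed; adjust them to the statuses your jobs report.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{status}' after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

A call might look like `wait_for_job(lambda: client.fine_tuning.retrieve(id=fine_tune_id).status)`, assuming the retrieved job exposes a `status` attribute that compares equal to these strings. The `sleep` argument exists so the loop can be tested without real delays.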
Monitor Progress
You can check the completion progress of your fine-tuning job in the Jobs tab of the playground.
If you provided your Weights & Biases API key, you can also follow the learning progress of your fine-tuning job at wandb.ai. The workspace URL has the form https://wandb.ai/<username>/together?workspace=user-<username>, where <username> is your unique Weights & Biases username, such as mama-llama-88.
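If you want to print or log that link from a script, the URL pattern above can be filled in as in the sketch below. `wandb_workspace_url` is a hypothetical helper; the `together` project name comes from the URL shown above.

```python
from urllib.parse import quote


def wandb_workspace_url(username, project="together"):
    """Build the workspace URL following the pattern shown above."""
    user = quote(username, safe="")
    return f"https://wandb.ai/{user}/{quote(project, safe='')}?workspace=user-{user}"


print(wandb_workspace_url("mama-llama-88"))
# prints: https://wandb.ai/mama-llama-88/together?workspace=user-mama-llama-88
```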
Congratulations! You've just fine-tuned a model with the Together API. Now it's time to deploy your model.
Deploy your Fine-Tuned Model
Host your Model
Once the fine-tune job completes, you will be able to see your model in the Playground. You can directly deploy the model through the web UI by clicking the play button on your job pop-up. Alternatively, you can deploy the model through our start API. Once the model is deployed, you can use the model through the playground or through our inference API. For the inference API, follow the instructions in the inference documentation.
For our model above, we can run inference on it with the following code:
import os
from together import Together
client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))
response = client.chat.completions.create(
model="[email protected]/Meta-Llama-3-8B-2024-04-18-19-37-52",
messages=[{"role": "user", "content": "tell me about new york"}],
)
print(response.choices[0].message.content)
To download the model weights directly, see Using a Downloaded Model below.
Pricing
Please note that hosting your fine-tuned model is charged hourly. See the hourly pricing for fine-tuned model inference in the pricing table. When you are not using the model, be sure to stop the instance, either through the web UI or through the stop API. Keep in mind that frequent starting and stopping may delay your deployment.
Pricing for fine-tuning is based on model size, the number of tokens, and the number of epochs. You can estimate fine-tuning pricing with our calculator.
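For a back-of-the-envelope figure before you open the calculator, the arithmetic can be sketched as below. Both formulas are assumptions made for illustration: the first multiplies tokens processed (dataset tokens times epochs) by a per-million-token rate that you look up for your model size, and the second multiplies hours hosted by the hourly rate. No real prices are built in, and the calculator and pricing table remain authoritative.

```python
def estimate_fine_tuning_cost(dataset_tokens, n_epochs, price_per_million_tokens):
    """Assumed formula: (dataset tokens x epochs) / 1M x rate.

    `price_per_million_tokens` is a value you supply from the pricing page;
    it is expected to depend on the size of the base model.
    """
    return dataset_tokens * n_epochs / 1_000_000 * price_per_million_tokens


def estimate_hosting_cost(hours_running, price_per_hour):
    """Assumed formula: hours the instance is running x hourly rate."""
    return hours_running * price_per_hour
```

For instance, with a placeholder rate of 1.0 per million tokens, a 2,000,000-token dataset trained for 3 epochs gives `estimate_fine_tuning_cost(2_000_000, 3, 1.0)`, which evaluates to 6.0.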
Using a Downloaded Model
If you want to download your model locally, follow the steps below. The model downloads as a tar.zst file.
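import os
from together import Together
client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))
client.fine_tuning.download(
    id="ft-eb167402-98ed-4ac5-b6f5-8140c4ba146e",  # job ID of the finished fine-tune
    output="my-model/model.tar.zst",  # local path for the compressed weights
)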
To uncompress this file type on a Mac, you need to install zstd:
brew install zstd
cd my-model
zstd -d model.tar.zst
tar -xvf model.tar
cd ..
In the folder where you uncompressed the file, you will find a set of files like this:
ls my-model
tokenizer_config.json
special_tokens_map.json
pytorch_model.bin
generation_config.json
tokenizer.json
config.json
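Before loading the model, a quick check that the extraction produced the files you expect can save a confusing error later. The sketch below compares a folder against a subset of the listing above. Which files are strictly required is an assumption here and may differ by model, and `missing_model_files` is a hypothetical helper.

```python
from pathlib import Path

# Subset of the listing above; treat it as an assumption, not a specification.
EXPECTED_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json", "pytorch_model.bin")


def missing_model_files(folder, expected=EXPECTED_FILES):
    """Return the names in `expected` that are not regular files inside `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]
```

An empty result from `missing_model_files("my-model")` means all of the expected files are present.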
Use the path of the folder that contains these .bin and .json files to load your model:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("./my-model")
model = AutoModelForCausalLM.from_pretrained(
    "./my-model",
    trust_remote_code=True,
).to(device)
input_context = "Space Robots are"
input_ids = tokenizer.encode(input_context, return_tensors="pt")
# do_sample=True is needed for temperature to take effect; greedy decoding ignores it
output = model.generate(input_ids.to(device), max_length=128, do_sample=True, temperature=0.7).cpu()
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)
Space Robots are a great way to get your kids interested in science. After all, they are the future!