You can run inference on your own custom or fine-tuned models by uploading them to Together AI and deploying them on a dedicated endpoint. Models can come from the Hugging Face Hub or from an archive in S3.
Prerequisites
A model is eligible for upload if it meets all of the following:
- Source: Hugging Face Hub or an S3 presigned URL.
- Type: text generation or embedding model.
- Scale: fits on a single node. Multi-node models aren’t supported.
- Format: loadable with Hugging Face from_pretrained. A valid model directory contains files like those in the listing below.
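For illustration, a typical Transformers model directory looks like this. File names vary by architecture and tokenizer; this is an example, not an exhaustive requirement:

```shell
config.json              # model architecture and hyperparameters
generation_config.json   # default generation settings (if present)
tokenizer_config.json    # tokenizer settings
tokenizer.json           # tokenizer vocabulary and merges
special_tokens_map.json  # special token definitions
model.safetensors        # model weights (large models may instead ship
                         # sharded files plus model.safetensors.index.json)
```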
S3 archive requirements
If you’re uploading from S3, package the files in a single archive (.zip or .tar.gz) with the model files at the root of the archive. Don’t nest them inside an extra top-level directory.
Correct (files at root):
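For example, creating the archive from inside the model directory keeps the files at the root. The paths below are illustrative:

```shell
# Correct: archive created from inside the model directory,
# so the files sit at the archive root.
cd my-model/
tar -czf ../model.tar.gz .
tar -tzf ../model.tar.gz
# ./config.json
# ./tokenizer.json
# ./model.safetensors

# Incorrect: archiving the directory itself nests everything
# under a top-level "my-model/" folder.
tar -czf model.tar.gz my-model/
```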
Upload the model
The steps below use the CLI / SDK; the same upload can be performed from the UI.
Upload from Hugging Face by passing the repo path as model_source; include your Hugging Face token for private or gated repos. To upload from S3, pass the presigned archive URL as model_source instead. In either case, the response includes a job_id. Use it to poll for upload status. A sketch follows the options table below.
CLI options
| Option | Required | Description |
|---|---|---|
| --model-name | Yes | The name to give the uploaded model. |
| --model-source | Yes | A Hugging Face repo path or an S3 presigned URL. |
| --hf-token | For Hugging Face | Your Hugging Face token. Required for private or gated repos. |
| --model-type | No | model (default) or adapter. |
| --description | No | A description of the model. |
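Putting it together, an upload call might look like the following sketch. The flags are the documented options above; the "together models upload" subcommand name and the example values are assumptions, so check together --help for the exact invocation:

```shell
# Sketch: upload from the Hugging Face Hub. The subcommand name is an
# assumption; the flags are the documented options.
together models upload \
  --model-name my-org/my-finetuned-llama \
  --model-source my-org/my-finetuned-llama \
  --hf-token "$HF_TOKEN" \
  --description "Llama fine-tuned on support tickets"

# Sketch: upload from S3 instead, passing a presigned archive URL.
together models upload \
  --model-name my-custom-model \
  --model-source "https://my-bucket.s3.amazonaws.com/model.tar.gz?X-Amz-Signature=..."
```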
Check upload status
Poll the upload job until its status field is Complete. The model is ready to deploy at that point.
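A polling loop might look like this sketch. The status endpoint path is an assumption, so confirm it against the API reference; jq is used only to extract the status field:

```shell
# Hypothetical status endpoint -- confirm the real path in the API reference.
# $JOB_ID is the job_id returned by the upload call.
while true; do
  STATUS=$(curl -s "https://api.together.xyz/v1/models/upload/$JOB_ID" \
    -H "Authorization: Bearer $TOGETHER_API_KEY" | jq -r '.status')
  echo "upload status: $STATUS"
  [ "$STATUS" = "Complete" ] && break
  sleep 30
done
```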
Deploy the model
Uploaded models deploy as dedicated endpoints, the same way as any other model. The steps below use the CLI / SDK; the same deployment can be performed from the UI.
List hardware available for the uploaded model, then create the endpoint using a hardware ID from the list; a sketch of both steps follows. See Manage dedicated endpoints for the full endpoint lifecycle, including autoscaling, listing, and deletion.
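As a sketch, the two steps might look like this with curl. The /v1/hardware and /v1/endpoints paths, the request fields, and the hardware ID are assumptions and placeholders; confirm them against the Manage dedicated endpoints guide:

```shell
# Sketch only: paths, fields, and the hardware ID are assumptions/placeholders.
# 1) List hardware available for the uploaded model.
curl -s "https://api.together.xyz/v1/hardware?model=my-org/my-finetuned-llama" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"

# 2) Create the dedicated endpoint with a hardware ID from the list.
curl -s -X POST "https://api.together.xyz/v1/endpoints" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-org/my-finetuned-llama",
        "hardware": "1x_nvidia_h100_80gb_sxm",
        "autoscaling": {"min_replicas": 1, "max_replicas": 1}
      }'
```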