# Upload a model

> Upload a custom or fine-tuned model from Hugging Face or S3 and serve it on a dedicated endpoint.

You can run inference on your own custom or fine-tuned models by uploading them to Together AI and deploying them on a [dedicated endpoint](/docs/dedicated-endpoints/overview). Models can come from the Hugging Face Hub or from an archive in S3.

## Prerequisites

A model is eligible for upload if it meets all of the following:

* **Source:** Hugging Face Hub or an S3 presigned URL.
* **Type:** text generation or embedding model.
* **Scale:** fits on a single node. Multi-node models aren't supported.

The model files must be in standard Hugging Face repository format, compatible with `from_pretrained`. A valid model directory contains files like:

```
config.json
generation_config.json
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
```

To upload a LoRA adapter instead of a full model, see [Deploy a fine-tuned adapter](/docs/dedicated-endpoints/adapter).

### S3 archive requirements

If you're uploading from S3, package the files in a single archive (`.zip` or `.tar.gz`) with the model files at the root of the archive. Don't nest them inside an extra top-level directory.

Correct (files at root):

```
config.json
model.safetensors
tokenizer.json
...
```

Incorrect (files nested in a directory):

```
my-model/
  config.json
  model.safetensors
  tokenizer.json
  ...
```

To create the archive from within the model directory:

```bash Shell
cd /path/to/your/model
tar -czvf ../model.tar.gz .
```

The presigned URL must point to the archive file in S3 and have an expiration of at least 100 minutes.
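The archive layout rule above can also be checked in code before you generate the presigned URL. The following is a minimal sketch using only the Python standard library. The helper names `pack_model_dir` and `has_config_at_root` are hypothetical and are not part of the Together SDK. It assumes the archive is written outside the model directory, as in the `tar` command above.

```python
import tarfile
from pathlib import Path


def pack_model_dir(model_dir: str, archive_path: str) -> list[str]:
    """Pack every file under model_dir into a .tar.gz, with paths relative to model_dir.

    Hypothetical helper. archive_path should live outside model_dir,
    otherwise the archive would try to include itself.
    """
    root = Path(model_dir)
    names = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname drops the parent directory, so files land at the archive root.
                arcname = path.relative_to(root).as_posix()
                tar.add(path, arcname=arcname)
                names.append(arcname)
    return names


def has_config_at_root(archive_path: str) -> bool:
    """Return True if config.json is at the archive root, not inside a subdirectory."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # Archives created with `tar -czvf ... .` prefix member names with "./".
        names = [n[2:] if n.startswith("./") else n for n in tar.getnames()]
    return "config.json" in names


# Example usage (paths are placeholders):
# pack_model_dir("/path/to/your/model", "/path/to/your/model.tar.gz")
# has_config_at_root("/path/to/your/model.tar.gz")
```

An archive built from the parent directory (so that entries start with `my-model/`) makes `has_config_at_root` return `False`, which matches the "Incorrect" layout shown above.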
## Upload the model

Upload from Hugging Face by passing the repo path as `model_source`. Include your Hugging Face token for private or gated repos. In the examples below, values in angle brackets are placeholders to replace with your own.

```shell Shell
together models upload \
  --model-name <MODEL_NAME> \
  --model-source <HF_REPO_PATH> \
  --hf-token "$HUGGINGFACE_TOKEN"
```

```python Python
from together import Together

client = Together()

response = client.models.upload(
    model_name="<MODEL_NAME>",
    model_source="<HF_REPO_PATH>",
    hf_token="<HF_TOKEN>",
)
print(response.data.job_id)
```

```typescript TypeScript
import Together from "together-ai";

const client = new Together();

const response = await client.models.upload({
  model_name: "<MODEL_NAME>",
  model_source: "<HF_REPO_PATH>",
  hf_token: "<HF_TOKEN>",
});
console.log(response.data.job_id);
```

Upload from S3 by passing the presigned archive URL as `model_source`:

```shell Shell
together models upload \
  --model-name <MODEL_NAME> \
  --model-source <S3_PRESIGNED_URL>
```

```python Python
from together import Together

client = Together()

response = client.models.upload(
    model_name="<MODEL_NAME>",
    model_source="<S3_PRESIGNED_URL>",
)
print(response.data.job_id)
```

```typescript TypeScript
import Together from "together-ai";

const client = new Together();

const response = await client.models.upload({
  model_name: "<MODEL_NAME>",
  model_source: "<S3_PRESIGNED_URL>",
});
console.log(response.data.job_id);
```

The response includes a `job_id`. Use it to poll for upload status.

### CLI options

| Option           | Required         | Description                                                   |
| ---------------- | ---------------- | ------------------------------------------------------------- |
| `--model-name`   | Yes              | The name to give the uploaded model.                          |
| `--model-source` | Yes              | A Hugging Face repo path or an S3 presigned URL.              |
| `--hf-token`     | For Hugging Face | Your Hugging Face token. Required for private or gated repos. |
| `--model-type`   | No               | `model` (default) or `adapter`.                               |
| `--description`  | No               | A description of the model.                                   |

To upload from the dashboard instead, sign in and go to [Models > Upload a model](https://api.together.ai/models/upload).

In **Source URL**, enter your Hugging Face repo path (for example, `meta-llama/Llama-3.3-70B-Instruct`) or an S3 presigned URL pointing to your model archive.

Enter a model name and an optional description. Both appear in your Together AI account once the upload completes.

Select **Upload**. Together AI returns a job ID and starts the upload.

## Check upload status

Poll the upload job until its `status` field is `Complete`. The model is ready to deploy at that point.

```shell Shell
curl -X GET "https://api.together.ai/v1/jobs/<JOB_ID>" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
```

```python Python
from together import Together

client = Together()

response = client.models.uploads.status("<JOB_ID>")
print(response.status)
```

You can also see uploaded models on the [My models](https://api.together.ai/models?category=my-models) page in the dashboard.

## Deploy the model

Uploaded models deploy as dedicated endpoints, the same way as any other model.

List hardware available for the uploaded model:

```shell Shell
together endpoints hardware --model <MODEL_NAME>
```

```python Python
from together import Together

client = Together()

response = client.endpoints.list_hardware(model="<MODEL_NAME>")
for hw in response.data:
    print(hw.id)
```

```typescript TypeScript
import Together from "together-ai";

const client = new Together();

const response = await client.endpoints.listHardware({
  model: "<MODEL_NAME>",
});
for (const hw of response.data) {
  console.log(hw.id);
}
```

Create the endpoint, using the hardware ID from the list:

```shell Shell
together endpoints create \
  --display-name <DISPLAY_NAME> \
  --model <MODEL_NAME> \
  --hardware <HARDWARE_ID>
```

```python Python
from together import Together

client = Together()

endpoint = client.endpoints.create(
    model="<MODEL_NAME>",
    hardware="<HARDWARE_ID>",
    display_name="<DISPLAY_NAME>",
    autoscaling={"min_replicas": 1, "max_replicas": 1},
)
print(endpoint.id, endpoint.name)
```

```typescript TypeScript
import Together from "together-ai";

const client = new Together();

const endpoint = await client.endpoints.create({
  model: "<MODEL_NAME>",
  hardware: "<HARDWARE_ID>",
  display_name: "<DISPLAY_NAME>",
  autoscaling: { min_replicas: 1, max_replicas: 1 },
});
console.log(endpoint.id, endpoint.name);
```

See [Manage dedicated endpoints](/docs/dedicated-endpoints/manage) for the full endpoint lifecycle, including autoscaling, listing, and deletion.

To deploy from the dashboard instead, go to [My models](https://api.together.ai/models?category=my-models) and select your uploaded model to open its detail page.

Select **Create dedicated endpoint** and pick the hardware and scaling configuration you want.

Confirm to deploy. Once the endpoint state is `STARTED`, call it from the playground or via the API.
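
Both waiting steps on this page (the upload job reaching `Complete`, and the endpoint reaching `STARTED`) follow the same poll-and-sleep pattern. Below is a minimal sketch of that pattern. The `wait_for_status` helper is hypothetical and not part of the Together SDK, and the interval and timeout defaults are arbitrary assumptions. It takes a callable so it doesn't depend on any particular client method. This page doesn't list failure statuses, so the sketch only stops on the target value or on timeout.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    interval_s: float = 10.0,   # assumed polling interval, tune as needed
    timeout_s: float = 3600.0,  # assumed upper bound on total wait
) -> str:
    """Call get_status() repeatedly until it returns target, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"last status was {status!r}, expected {target!r}")
        time.sleep(interval_s)


# Example usage with the status call shown under "Check upload status"
# (job_id comes from the upload response):
#
# from together import Together
# client = Together()
# wait_for_status(lambda: client.models.uploads.status(job_id).status, "Complete")
```

The same helper can wait for an endpoint to reach `STARTED` if you pass a callable that returns the endpoint's current state.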