Endpoint ID
Many commands require an `ENDPOINT_ID` to identify which endpoint to operate on. The endpoint ID is a unique identifier assigned when an endpoint is created, in the format:
endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462
The endpoint ID is different from the model name (e.g., `mistralai/Mixtral-8x7B-Instruct-v0.1`) or the display name you set with `--display-name`.
How to find your endpoint ID
You can find your endpoint ID in the following ways:
- From the create command output: The endpoint ID is returned when you create an endpoint.
- Using the list command: Run `together endpoints list --mine true` to see all your endpoints with their IDs.
- From the web interface: The endpoint ID is shown in the endpoint details page on the Together AI console.
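Because endpoint IDs, model names, and display names are easy to mix up, a script can check an argument before passing it to a command. The following is a minimal sketch, not part of the CLI: `is_endpoint_id` is a hypothetical helper, and the pattern (`endpoint-` followed by a lowercase hexadecimal UUID) is an assumption inferred from the single example ID above, so the real format may be broader.

```shell
# Hypothetical helper: succeed (exit 0) if the argument looks like an endpoint ID.
# Assumption: IDs are "endpoint-" plus a lowercase hex UUID, as in the example above.
is_endpoint_id() {
  printf '%s\n' "$1" | grep -Eq \
    '^endpoint-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# The example ID from this page matches the pattern.
if is_endpoint_id "endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462"; then
  echo "looks like an endpoint ID"
fi

# A model name does not match.
if ! is_endpoint_id "mistralai/Mixtral-8x7B-Instruct-v0.1"; then
  echo "not an endpoint ID"
fi
```

A check like this only catches obvious mix-ups; it cannot tell whether the endpoint actually exists.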
Create
Create a new dedicated inference endpoint.
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--model` TEXT | (required) The model to deploy |
| `--gpu` [h100 \| a100 \| l40 \| l40s \| rtx-6000] | (required) GPU type to use for inference |
| `--min-replicas` INTEGER | Minimum number of replicas to deploy |
| `--max-replicas` INTEGER | Maximum number of replicas to deploy |
| `--gpu-count` INTEGER | Number of GPUs to use per replica |
| `--display-name` TEXT | A human-readable name for the endpoint |
| `--no-speculative-decoding` | Disable speculative decoding for this endpoint |
| `--no-auto-start` | Create the endpoint in STOPPED state instead of auto-starting it |
| `--wait` | Wait for the endpoint to be ready after creation |
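To show how the options in the table might combine, here is a hedged sketch that assembles a `create` invocation and, by default, only prints it. The subcommand spelling `together endpoints create` is assumed by analogy with `together endpoints list`; `create_endpoint` and the `DRY_RUN` variable are hypothetical and exist only in this sketch, and the replica and GPU counts are arbitrary illustration values.

```shell
# Hypothetical wrapper: build a `create` command from the options in the table.
# DRY_RUN is a variable of this sketch, not a CLI feature; it defaults to 1 (print only).
create_endpoint() {
  # $1 = model, $2 = GPU type, $3 = display name
  set -- together endpoints create \
    --model "$1" \
    --gpu "$2" \
    --gpu-count 2 \
    --min-replicas 1 \
    --max-replicas 2 \
    --display-name "$3" \
    --wait
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # show the command without running it
  else
    "$@"                 # run the CLI with the arguments built above
  fi
}

create_endpoint mistralai/Mixtral-8x7B-Instruct-v0.1 h100 "My Endpoint"
```

In dry-run mode the printed line drops the quotes around the display name; when the command is actually executed, `"$@"` passes the name through as a single argument.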
Hardware
List all the hardware options, optionally filtered by model.
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--model` TEXT | Filter hardware options by model |
| `--json` | Print output in JSON format |
| `--available` | Print only available hardware options (can only be used if `--model` is passed) |
Get
Print details for a specific endpoint.
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--json` | Print output in JSON format |
Update
Update an existing endpoint by listing the changes followed by the endpoint ID. You can find the endpoint ID by listing your dedicated endpoints.
Usage
Shell
Example
Shell
Options
Note: `--min-replicas` and `--max-replicas` must be specified together.
| Options | Description |
|---|---|
| `--display-name` TEXT | A new human-readable name for the endpoint |
| `--min-replicas` INTEGER | New minimum number of replicas to maintain |
| `--max-replicas` INTEGER | New maximum number of replicas to scale up to |
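The rule that both replica bounds must be given together can be enforced before the CLI is ever called. The sketch below is an illustration under assumptions: `update_replicas_cmd` is a hypothetical helper, and the argument order (changes first, endpoint ID last) follows the description of the update command above. It prints the command instead of running it.

```shell
# Hypothetical helper: print an `update` command that changes the replica range,
# or fail if either bound is missing (the two options must be set together).
update_replicas_cmd() {
  endpoint_id=$1
  min=$2
  max=$3
  if [ -z "$min" ] || [ -z "$max" ]; then
    echo "error: --min-replicas and --max-replicas must be set together" >&2
    return 1
  fi
  # Changes first, endpoint ID last.
  printf '%s\n' "together endpoints update --min-replicas $min --max-replicas $max $endpoint_id"
}

update_replicas_cmd endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462 2 4
```

Calling the helper with an empty minimum or maximum returns a non-zero status and prints the error to standard error, so a calling script can stop before issuing a partial update.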
Start
Start a dedicated inference endpoint.
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--wait` | Wait for the endpoint to start |
Stop
Stop a dedicated inference endpoint.
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--wait` | Wait for the endpoint to stop |
Delete
Delete a dedicated inference endpoint.
Usage
Shell
Example
Shell
List
Usage
Shell
Example
Shell
Options
| Options | Description |
|---|---|
| `--json` | Print output in JSON format |
| `--type` [dedicated \| serverless] | Filter by endpoint type |
Help
See all commands with:
Shell