Endpoint ID

Many commands require an ENDPOINT_ID to identify which endpoint to operate on. The endpoint ID is a unique identifier assigned when an endpoint is created, in the format:
endpoint-<uuid>
For example: endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462
The endpoint ID is different from the model name (e.g., mistralai/Mixtral-8x7B-Instruct-v0.1) or the display name you set with --display-name.
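Because a model name or display name is easy to pass by mistake, a script can check the format before calling the CLI. This is a minimal sketch: the helper name is_endpoint_id is hypothetical, and the 8-4-4-4-12 hexadecimal UUID layout is inferred from the example ID above rather than taken from a published specification.

```shell
# Hypothetical helper: succeed only if $1 looks like endpoint-<uuid>.
# Assumption: <uuid> is the usual 8-4-4-4-12 hexadecimal layout.
is_endpoint_id() {
  printf '%s\n' "$1" |
    grep -Eq '^endpoint-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}
```

For example, is_endpoint_id "$ENDPOINT_ID" || echo "not an endpoint ID" >&2 rejects a model name such as mistralai/Mixtral-8x7B-Instruct-v0.1.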

How to find your endpoint ID

You can find your endpoint ID in the following ways:
  1. From the create command output: The endpoint ID is returned when you create an endpoint.
  2. Using the list command: Run together endpoints list --mine true to see all your endpoints with their IDs.
  3. From the web interface: The endpoint ID is shown in the endpoint details page on the Together AI console.
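For scripting, the IDs can be pulled out of the list output with a text filter. The sketch below assumes only that each ID appears verbatim somewhere in the output; the exact column layout of together endpoints list is not assumed, and the helper name extract_endpoint_ids is hypothetical.

```shell
# Hypothetical helper: read text on stdin, print each endpoint ID found once.
# Assumption: IDs appear verbatim in the input as endpoint-<uuid>.
extract_endpoint_ids() {
  grep -oE 'endpoint-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' |
    sort -u
}
```

A typical use would be together endpoints list --mine true | extract_endpoint_ids.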

Create

Create a new dedicated inference endpoint.

Usage

Shell
together endpoints create [OPTIONS]

Example

Shell
together endpoints create \
--model mistralai/Mixtral-8x7B-Instruct-v0.1 \
--gpu h100 \
--gpu-count 2 \
--display-name "My Endpoint" \
--wait

Options

Options                                       Description
--model TEXT                                  (required) The model to deploy
--gpu [h100 | a100 | l40 | l40s | rtx-6000]   (required) GPU type to use for inference
--min-replicas INTEGER                        Minimum number of replicas to deploy
--max-replicas INTEGER                        Maximum number of replicas to deploy
--gpu-count INTEGER                           Number of GPUs to use per replica
--display-name TEXT                           A human-readable name for the endpoint
--no-speculative-decoding                     Disable speculative decoding for this endpoint
--no-auto-start                               Create the endpoint in STOPPED state instead of auto-starting it
--wait                                        Wait for the endpoint to be ready after creation
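The options can be combined in a small wrapper. As a rough sketch, the function below creates an autoscaling endpoint that stays stopped until you start it. The function name create_stopped_endpoint and the replica counts 1 and 2 are illustrative assumptions; every flag comes from the options listed above.

```shell
# Hypothetical helper: create an endpoint in STOPPED state.
# $1 = model, $2 = GPU type, $3 = display name
create_stopped_endpoint() {
  together endpoints create \
    --model "$1" \
    --gpu "$2" \
    --min-replicas 1 \
    --max-replicas 2 \
    --display-name "$3" \
    --no-auto-start
}
```

For example: create_stopped_endpoint mistralai/Mixtral-8x7B-Instruct-v0.1 h100 "My Endpoint". The endpoint can be brought up later with together endpoints start.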

Hardware

List all the hardware options, optionally filtered by model.

Usage

Shell
together endpoints hardware [OPTIONS]

Example

Shell
together endpoints hardware --model mistralai/Mixtral-8x7B-Instruct-v0.1

Options

Options        Description
--model TEXT   Filter hardware options by model
--json         Print output in JSON format
--available    Print only available hardware options (can only be used if --model is passed)
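Since --available is only valid together with --model, a wrapper can add it conditionally. This is a sketch under that stated constraint; the function name list_hardware is hypothetical.

```shell
# Hypothetical helper: list hardware options.
# $1 = optional model name; --available is added only when a model is given.
list_hardware() {
  if [ -n "${1:-}" ]; then
    together endpoints hardware --model "$1" --available
  else
    together endpoints hardware
  fi
}
```

Calling list_hardware with no argument lists every hardware option; calling it with a model name narrows the list to hardware currently available for that model.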

Get

Print details for a specific endpoint.

Usage

Shell
together endpoints get [OPTIONS] ENDPOINT_ID

Example

Shell
together endpoints get endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Options

Options   Description
--json    Print output in JSON format

Update

Update an existing endpoint by listing the changes followed by the endpoint ID. You can find the endpoint ID by listing your dedicated endpoints.

Usage

Shell
together endpoints update [OPTIONS] ENDPOINT_ID

Example

Shell
together endpoints update --min-replicas 2 --max-replicas 4 endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Options

Note: Both --min-replicas and --max-replicas must be specified together.

Options                  Description
--display-name TEXT      A new human-readable name for the endpoint
--min-replicas INTEGER   New minimum number of replicas to maintain
--max-replicas INTEGER   New maximum number of replicas to scale up to
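One way to respect the rule that both replica flags go together is to wrap the update in a function that always takes both values. The sketch below is an illustration, not part of the CLI: the name scale_endpoint is hypothetical, and the check that the minimum does not exceed the maximum is an added sanity check that the documentation does not state.

```shell
# Hypothetical helper: change both replica bounds in one call.
# $1 = endpoint ID, $2 = min replicas, $3 = max replicas
scale_endpoint() {
  if [ "$#" -ne 3 ]; then
    echo "usage: scale_endpoint ENDPOINT_ID MIN MAX" >&2
    return 2
  fi
  # Reject empty or non-numeric replica counts.
  for n in "$2" "$3"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "replica counts must be non-negative integers" >&2
        return 2
        ;;
    esac
  done
  # Assumed sanity check: min must not exceed max.
  if [ "$2" -gt "$3" ]; then
    echo "min replicas ($2) cannot exceed max replicas ($3)" >&2
    return 2
  fi
  together endpoints update --min-replicas "$2" --max-replicas "$3" "$1"
}
```

For example, scale_endpoint endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462 2 4 runs the same update as the example above.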

Start

Start a dedicated inference endpoint.

Usage

Shell
together endpoints start [OPTIONS] ENDPOINT_ID

Example

Shell
together endpoints start endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Options

Options   Description
--wait    Wait for the endpoint to start

Stop

Stop a dedicated inference endpoint.

Usage

Shell
together endpoints stop [OPTIONS] ENDPOINT_ID

Example

Shell
together endpoints stop endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Options

Options   Description
--wait    Wait for the endpoint to stop
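Because both start and stop accept --wait, the two commands can be chained into a restart. This is a minimal sketch that assumes the CLI exits with a non-zero status on failure; the function name restart_endpoint is hypothetical.

```shell
# Hypothetical helper: stop an endpoint, then start it again.
# $1 = endpoint ID
# The start only runs if the stop succeeded (assumes a non-zero exit on failure).
restart_endpoint() {
  together endpoints stop --wait "$1" &&
    together endpoints start --wait "$1"
}
```

For example: restart_endpoint endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462.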

Delete

Delete a dedicated inference endpoint.

Usage

Shell
together endpoints delete [OPTIONS] ENDPOINT_ID

Example

Shell
together endpoints delete endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

List

List endpoints, optionally filtered by type.

Usage

Shell
together endpoints list [OPTIONS]

Example

Shell
together endpoints list --type dedicated

Options

Options                           Description
--json                            Print output in JSON format
--type [dedicated | serverless]   Filter by endpoint type

Help

See all commands with:
Shell
together endpoints --help