POST /endpoints
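import os

from together import Together

# Read the API key from the environment rather than hard-coding it.
client = Together(
    api_key=os.environ.get("TOGETHER_API_KEY"),
)

# Create a dedicated endpoint that autoscales between 2 and 5 replicas.
endpoint = client.endpoints.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    hardware="1x_nvidia_a100_80gb_sxm",
    min_replicas=2,
    max_replicas=5,
)

# The returned ID identifies this endpoint in later API calls.
print(endpoint.id)
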
{
  "object": "endpoint",
  "id": "endpoint-d23901de-ef8f-44bf-b3e7-de9c1ca8f2d7",
  "name": "devuser/meta-llama/Llama-3-8b-chat-hf-a32b82a1",
  "display_name": "My Llama3 8b endpoint",
  "model": "meta-llama/Llama-3-8b-chat-hf",
  "hardware": "1x_nvidia_a100_80gb_sxm",
  "type": "dedicated",
  "owner": "devuser",
  "state": "STARTED",
  "autoscaling": {
    "min_replicas": 2,
    "max_replicas": 5
  },
  "created_at": "2025-02-04T10:43:55.405Z"
}

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
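
For callers not using the SDK, the header can be set by hand. The following is a minimal sketch using only the Python standard library. The base URL `https://api.together.xyz/v1` and the body field names (assumed here to mirror the response object shown above) are assumptions, and `build_create_request` is a hypothetical helper, not part of any library.

```python
import json
import os
import urllib.request

# Assumed base URL; confirm it against the API documentation for your account.
API_URL = "https://api.together.xyz/v1/endpoints"


def build_create_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /endpoints request with a Bearer header."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # form: Bearer <token>
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Body field names are an assumption: they mirror the response object.
payload = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    "hardware": "1x_nvidia_a100_80gb_sxm",
    "autoscaling": {"min_replicas": 2, "max_replicas": 5},
}

request = build_create_request(
    os.environ.get("TOGETHER_API_KEY", "<token>"), payload
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the headers before any network call is made.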

Body

application/json

Response

200
application/json

Details about a dedicated endpoint deployment
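
As an illustration of consuming the 200 response, the sketch below parses the JSON body into a small typed record. `EndpointSummary` and `parse_endpoint` are hypothetical names introduced here; the fields read are only those that appear in the sample response on this page, and the sample body is a trimmed copy of it.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EndpointSummary:
    id: str
    model: str
    state: str
    min_replicas: int
    max_replicas: int
    created_at: datetime


def parse_endpoint(body: str) -> EndpointSummary:
    """Parse a dedicated-endpoint JSON body into an EndpointSummary."""
    data = json.loads(body)
    scaling = data["autoscaling"]
    return EndpointSummary(
        id=data["id"],
        model=data["model"],
        state=data["state"],
        min_replicas=scaling["min_replicas"],
        max_replicas=scaling["max_replicas"],
        # Older Python versions reject a trailing "Z", so normalize it to an offset.
        created_at=datetime.fromisoformat(
            data["created_at"].replace("Z", "+00:00")
        ),
    )


# Trimmed copy of the sample response shown on this page.
sample = """
{
  "object": "endpoint",
  "id": "endpoint-d23901de-ef8f-44bf-b3e7-de9c1ca8f2d7",
  "model": "meta-llama/Llama-3-8b-chat-hf",
  "state": "STARTED",
  "autoscaling": {"min_replicas": 2, "max_replicas": 5},
  "created_at": "2025-02-04T10:43:55.405Z"
}
"""

summary = parse_endpoint(sample)
print(summary.id, summary.state)
```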