Update or Scale GPU Cluster

Together AI SDK (v2)

from together import Together
client = Together()

# Scale the cluster to 24 GPUs; num_gpus must be a multiple of 8.
cluster = client.beta.clusters.update("cluster_id", cluster_type="KUBERNETES", num_gpus=24)
print(cluster)

{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "duration_hours": 123,
  "driver_version": "CUDA_12_5_555",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ]
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123
}

PUT /compute/clusters/{cluster_id}

Authorizations

Authorization (string, header, required): Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
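As a rough illustration of the header format, the sketch below builds (but does not send) a PUT request with Python's standard library. The `build_update_request` helper and the `base_url` value are hypothetical; this reference does not state the API host, so substitute the one for your account.

```python
import json
import urllib.request


def build_update_request(base_url, cluster_id, token, **body):
    """Build an unsent PUT request for /compute/clusters/{cluster_id}.

    base_url is an assumption: the host is not given in this reference.
    """
    url = f"{base_url.rstrip('/')}/compute/clusters/{cluster_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Bearer authentication header of the form "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and token, for illustration only.
req = build_update_request(
    "https://api.example.invalid",
    "cluster_id",
    "my-token",
    cluster_type="KUBERNETES",
    num_gpus=24,
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it; the SDK call shown at the top of the page is the documented route and handles this for you.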

Path Parameters

cluster_id (string, required): The ID of the cluster to update.

Body

application/json

cluster_type (enum<string>): Type of cluster to update. Available options: KUBERNETES, SLURM.

num_gpus (integer): Number of GPUs to allocate in the cluster. Must be a multiple of 8, for example 8, 16, or 24.
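The body constraints above can be checked client-side before calling the API. The following is a minimal sketch, not part of the SDK: `validate_update_body` is a hypothetical helper, and it assumes both fields are optional (neither is marked required here) and that `num_gpus` must be positive.

```python
VALID_CLUSTER_TYPES = {"KUBERNETES", "SLURM"}


def validate_update_body(cluster_type=None, num_gpus=None):
    """Return a request body dict, raising ValueError on invalid input."""
    body = {}
    if cluster_type is not None:
        if cluster_type not in VALID_CLUSTER_TYPES:
            raise ValueError(f"cluster_type must be one of {sorted(VALID_CLUSTER_TYPES)}")
        body["cluster_type"] = cluster_type
    if num_gpus is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(num_gpus, bool) or not isinstance(num_gpus, int):
            raise ValueError("num_gpus must be an integer")
        # Assumption: zero or negative counts are rejected.
        if num_gpus <= 0 or num_gpus % 8 != 0:
            raise ValueError("num_gpus must be a positive multiple of 8")
        body["num_gpus"] = num_gpus
    return body


body = validate_update_body(cluster_type="KUBERNETES", num_gpus=24)
print(body)
```

The returned dict can be unpacked into the SDK call, for example `client.beta.clusters.update("cluster_id", **body)`.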

Response

200 - application/json

cluster_id (string, required)

cluster_type (enum<string>, required): Type of cluster. Available options: KUBERNETES, SLURM.

region (string, required)

gpu_type (enum<string>, required): Available options: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.

cluster_name (string, required)

duration_hours (integer, required)

driver_version (enum<string>, required): Available options: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570.

volumes (object[], required): Each entry carries volume_id, volume_name, size_tib, and status, as in the example response.

status (enum<string>, required): Current status of the GPU cluster. Available options: WaitingForControlPlaneNodes, WaitingForDataPlaneNodes, WaitingForSubnet, WaitingForSharedVolume, InstallingDrivers, RunningAcceptanceTests, Paused, OnDemandComputePaused, Ready, Degraded, Deleting.
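After an update, a caller will typically wait for the cluster to report Ready. The sketch below shows one way to poll using the status values listed above. `wait_until_ready` and its `fetch_status` callable are hypothetical (supply your own function that returns the current status string), and treating Degraded and Deleting as reasons to stop polling is an assumption, not documented behavior.

```python
import time

CLUSTER_STATUSES = (
    "WaitingForControlPlaneNodes", "WaitingForDataPlaneNodes",
    "WaitingForSubnet", "WaitingForSharedVolume", "InstallingDrivers",
    "RunningAcceptanceTests", "Paused", "OnDemandComputePaused",
    "Ready", "Degraded", "Deleting",
)
# Assumption: polling should stop early on these statuses.
STOP_STATUSES = {"Degraded", "Deleting"}


def wait_until_ready(fetch_status, timeout_s=1800, interval_s=15, sleep=time.sleep):
    """Poll fetch_status() until it returns "Ready" or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in CLUSTER_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        if status == "Ready":
            return status
        if status in STOP_STATUSES:
            raise RuntimeError(f"cluster entered status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still {status} after {timeout_s}s")
        sleep(interval_s)


# Simulated status sequence so the example runs without any network access.
fake = iter(["WaitingForDataPlaneNodes", "InstallingDrivers", "Ready"])
result = wait_until_ready(lambda: next(fake), interval_s=0, sleep=lambda s: None)
print(result)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular client and easy to exercise without waiting.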

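control_plane_nodes (object[], required): Each entry carries node_id, node_name, status, host_name, num_cpu_cores, memory_gib, and network, as in the example response.

gpu_worker_nodes (object[], required): Each entry carries node_id, node_name, status, host_name, num_cpu_cores, num_gpus, memory_gib, and networks, as in the example response.

kube_config (string, required)

num_gpus (integer, required)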
