POST /compute/clusters
Python
from together import Together

client = Together()

response = client.beta.clusters.create(
  cluster_name="my-gpu-cluster",
  region="us-central-8",
  gpu_type="H100_SXM",
  num_gpus=8,
  driver_version="CUDA_12_6_560",
  billing_type="ON_DEMAND",
)

print(response.cluster_id)
{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ],
      "instance_id": "<string>"
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123,
  "cuda_version": "<string>",
  "nvidia_driver_version": "<string>",
  "duration_hours": 123,
  "slurm_shm_size_gib": 123,
  "capacity_pool_id": "<string>",
  "reservation_start_time": "2023-11-07T05:31:56Z",
  "reservation_end_time": "2023-11-07T05:31:56Z",
  "install_traefik": true,
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization
string
header
default:default
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
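
As a minimal sketch of the header format described above, the helper below builds the headers for a JSON request. The function name `build_auth_headers` is hypothetical, and reading the token from a `TOGETHER_API_KEY` environment variable is an assumption rather than something this page states.

```python
import os


def build_auth_headers(token: str) -> dict:
    """Return headers for a JSON request authenticated with a bearer token."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        # Header form described above: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# TOGETHER_API_KEY is an assumed variable name; the placeholder fallback
# only keeps this sketch runnable without credentials.
headers = build_auth_headers(os.environ.get("TOGETHER_API_KEY", "example-token"))
```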

Body

application/json

GPU Cluster create request

region
string
required

Region in which to create the GPU cluster. List the available regions with client.clusters.list_regions()

gpu_type
enum<string>
required

Type of GPU to use in the cluster

Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF
num_gpus
integer
required

Number of GPUs to allocate in the cluster. Must be a multiple of 8, for example 8, 16, or 24.
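
A client-side check can catch an invalid GPU count before the request is sent. The sketch below is illustrative only: `validate_num_gpus` is a hypothetical helper, and rejecting zero or negative values is an assumption on top of the documented multiple-of-8 rule.

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Return num_gpus unchanged if it is a positive multiple of 8."""
    # The positivity check is an assumption; the reference only states
    # that the value must be a multiple of 8.
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError(f"num_gpus must be a positive multiple of 8, got {num_gpus}")
    return num_gpus


validate_num_gpus(16)    # accepted
# validate_num_gpus(12)  # would raise ValueError
```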

cluster_name
string
required

Name of the GPU cluster.

billing_type
enum<string>
required

RESERVED billing lets you set the length of the cluster reservation with the duration_days field. ON_DEMAND billing gives you ownership of the cluster until you delete it. SCHEDULED_CAPACITY billing reserves capacity for a scheduled time window and requires reservation_start_time and reservation_end_time in the request.

Available options:
RESERVED,
ON_DEMAND,
SCHEDULED_CAPACITY
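
The relationship between billing_type and its companion fields can be sketched as a small helper that assembles the billing portion of the request body. `billing_fields` is a hypothetical name, and treating duration_days as optional for RESERVED billing is an assumption, since this page does not say whether it is required.

```python
from typing import Optional

BILLING_TYPES = {"RESERVED", "ON_DEMAND", "SCHEDULED_CAPACITY"}


def billing_fields(
    billing_type: str,
    duration_days: Optional[int] = None,
    reservation_start_time: Optional[str] = None,
    reservation_end_time: Optional[str] = None,
) -> dict:
    """Return the billing-related part of a cluster create request body."""
    if billing_type not in BILLING_TYPES:
        raise ValueError(f"unknown billing_type: {billing_type!r}")
    fields = {"billing_type": billing_type}
    # Assumption: duration_days is sent only for RESERVED billing, when given.
    if billing_type == "RESERVED" and duration_days is not None:
        fields["duration_days"] = duration_days
    if billing_type == "SCHEDULED_CAPACITY":
        # Both reservation times are documented as required for this type.
        if not (reservation_start_time and reservation_end_time):
            raise ValueError(
                "SCHEDULED_CAPACITY requires reservation_start_time "
                "and reservation_end_time"
            )
        fields["reservation_start_time"] = reservation_start_time
        fields["reservation_end_time"] = reservation_end_time
    return fields
```
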
cuda_version
string
required

CUDA version for this cluster. For example, 12.5

nvidia_driver_version
string
required

NVIDIA driver version for this cluster. For example, 550. Only certain combinations of cuda_version and nvidia_driver_version are supported.

cluster_type
enum<string>

Type of cluster to create.

Available options:
KUBERNETES,
SLURM
duration_days
integer

Duration in days to keep the cluster running.

shared_volume
object

Inline configuration to create a shared volume with the cluster creation.

volume_id
string

ID of an existing volume to use with the cluster creation.

gpu_node_failover_enabled
boolean
default:false

Whether automated GPU node failover should be enabled for this cluster. By default, it is disabled.

auto_scaled
boolean
default:false

Whether the GPU cluster should be auto-scaled based on the workload. By default, it is not auto-scaled.

auto_scale_max_gpus
integer

Maximum number of GPUs to which the cluster can be auto-scaled up. This field is required if auto_scaled is true.

slurm_shm_size_gib
integer

Shared memory size in GiB for a Slurm cluster. This field is required if cluster_type is SLURM.
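
Two fields in this request are required only under certain conditions: auto_scale_max_gpus when auto_scaled is true, and slurm_shm_size_gib when cluster_type is SLURM. A rough pre-flight check might look like the following; `missing_conditional_fields` is a hypothetical helper and not part of any SDK.

```python
def missing_conditional_fields(body: dict) -> list:
    """Return the names of conditionally required fields absent from body."""
    missing = []
    # auto_scale_max_gpus is required when auto_scaled is true.
    if body.get("auto_scaled") and body.get("auto_scale_max_gpus") is None:
        missing.append("auto_scale_max_gpus")
    # slurm_shm_size_gib is required when cluster_type is SLURM.
    if body.get("cluster_type") == "SLURM" and body.get("slurm_shm_size_gib") is None:
        missing.append("slurm_shm_size_gib")
    return missing


request_body = {"cluster_type": "SLURM", "auto_scaled": True, "auto_scale_max_gpus": 32}
print(missing_conditional_fields(request_body))  # slurm_shm_size_gib is reported
```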

capacity_pool_id
string

ID of the capacity pool to use for the cluster. This field is optional and only applicable if the cluster is created from a capacity pool.

reservation_start_time
string<date-time>

Reservation start time of the cluster. Required for SCHEDULED_CAPACITY billing. If not provided, the cluster provisions immediately.

reservation_end_time
string<date-time>

Reservation end time of the cluster. Required for SCHEDULED_CAPACITY billing.
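
Both reservation fields are date-time strings. The sketch below formats timezone-aware datetimes in the same shape as the example response (for instance 2023-11-07T05:31:56Z). The helper names `to_rfc3339` and `reservation_window` are hypothetical, and normalizing to UTC is an assumption based on the trailing Z in that example.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp ending in Z."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def reservation_window(start: datetime, hours: int) -> dict:
    """Return reservation_start_time and reservation_end_time for a window."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    end = start + timedelta(hours=hours)
    return {
        "reservation_start_time": to_rfc3339(start),
        "reservation_end_time": to_rfc3339(end),
    }


start = datetime(2023, 11, 7, 5, 31, 56, tzinfo=timezone.utc)
window = reservation_window(start, hours=24)
```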

install_traefik
boolean
default:false

Whether to install Traefik ingress controller in the cluster. This field is only applicable for Kubernetes clusters and is false by default.

slurm_image
string

Custom Slurm image for Slurm clusters.

Response

200 - application/json

OK

cluster_id
string
required
cluster_type
enum<string>
required

Type of cluster.

Available options:
KUBERNETES,
SLURM
region
string
required
gpu_type
enum<string>
required
Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF
cluster_name
string
required
volumes
object[]
required
status
enum<string>
required

Current status of the GPU cluster.

Available options:
WaitingForControlPlaneNodes,
WaitingForDataPlaneNodes,
WaitingForSubnet,
WaitingForSharedVolume,
InstallingDrivers,
RunningAcceptanceTests,
Paused,
OnDemandComputePaused,
Ready,
Degraded,
Deleting
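
When polling a newly created cluster, it can help to sort the status values above into a few buckets. The grouping below is an assumption inferred from the status names and is not documented behavior; `classify_status` is a hypothetical helper.

```python
# Assumed grouping: statuses that look like intermediate provisioning steps.
PROVISIONING = frozenset({
    "WaitingForControlPlaneNodes",
    "WaitingForDataPlaneNodes",
    "WaitingForSubnet",
    "WaitingForSharedVolume",
    "InstallingDrivers",
    "RunningAcceptanceTests",
})
ALL_STATUSES = PROVISIONING | {
    "Paused", "OnDemandComputePaused", "Ready", "Degraded", "Deleting",
}


def classify_status(status: str) -> str:
    """Map a cluster status to 'ready', 'provisioning', or 'other'."""
    if status not in ALL_STATUSES:
        raise ValueError(f"unknown cluster status: {status!r}")
    if status == "Ready":
        return "ready"
    if status in PROVISIONING:
        return "provisioning"
    # Paused, OnDemandComputePaused, Degraded, and Deleting land here.
    return "other"
```
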
control_plane_nodes
object[]
required
gpu_worker_nodes
object[]
required
kube_config
string
required
num_gpus
integer
required
cuda_version
string
required
nvidia_driver_version
string
required
duration_hours
integer
slurm_shm_size_gib
integer
capacity_pool_id
string
reservation_start_time
string<date-time>
reservation_end_time
string<date-time>
install_traefik
boolean
created_at
string<date-time>
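
To illustrate how the response fields fit together, the sketch below condenses a response that has already been parsed into a dict. `summarize_cluster` is a hypothetical helper, and the sample values are invented placeholders rather than real API output.

```python
def summarize_cluster(cluster: dict) -> dict:
    """Return a compact summary of a cluster response parsed from JSON."""
    workers = cluster.get("gpu_worker_nodes", [])
    return {
        "cluster_id": cluster["cluster_id"],
        "status": cluster["status"],
        "control_plane_nodes": len(cluster.get("control_plane_nodes", [])),
        "gpu_worker_nodes": len(workers),
        # Sum the per-node GPU counts reported under gpu_worker_nodes.
        "worker_gpus": sum(node.get("num_gpus", 0) for node in workers),
    }


# Placeholder data shaped like the example response above.
sample = {
    "cluster_id": "example-id",
    "status": "Ready",
    "control_plane_nodes": [{"node_id": "cp-0"}],
    "gpu_worker_nodes": [
        {"node_id": "w-0", "num_gpus": 8},
        {"node_id": "w-1", "num_gpus": 8},
    ],
}
summary = summarize_cluster(sample)
```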