POST /compute/clusters
Python
from together import Together

client = Together()

response = client.beta.clusters.create(
  cluster_name="my-gpu-cluster",
  region="us-central-8",
  gpu_type="H100_SXM",
  num_gpus=8,
  driver_version="CUDA_12_6_560",
  billing_type="ON_DEMAND",
)

print(response.cluster_id)
{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ],
      "instance_id": "<string>"
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123,
  "cuda_version": "<string>",
  "nvidia_driver_version": "<string>",
  "duration_hours": 123,
  "slurm_shm_size_gib": 123,
  "capacity_pool_id": "<string>",
  "reservation_start_time": "2023-11-07T05:31:56Z",
  "reservation_end_time": "2023-11-07T05:31:56Z",
  "install_traefik": true,
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization
string
header
default:default
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
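
As a rough sketch of how the header fits into a raw HTTP call, the snippet below builds (but does not send) a POST request with the Python standard library. The base URL `https://api.together.xyz/v1`, the `TOGETHER_API_KEY` environment variable, and the helper name `build_create_request` are assumptions not stated on this page; the request body shown is a placeholder, not a complete create request.

```python
import json
import os
import urllib.request

# Assumption: the API base URL is not given on this page.
BASE_URL = "https://api.together.xyz/v1"


def build_create_request(token: str, body: dict) -> urllib.request.Request:
    """Build (without sending) a POST /compute/clusters request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/compute/clusters",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Header form described above: "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumption: the token is kept in the TOGETHER_API_KEY environment variable.
request = build_create_request(
    os.environ.get("TOGETHER_API_KEY", "<token>"),
    {"cluster_name": "my-gpu-cluster"},  # placeholder body, required fields omitted
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here so the sketch has no side effects.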

Body

application/json

GPU Cluster create request

region
string
required

Region in which to create the GPU cluster. Usable regions can be listed with client.clusters.list_regions().

gpu_type
enum<string>
required

Type of GPU to use in the cluster

Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF

num_gpus
integer
required

Number of GPUs to allocate in the cluster. This must be a multiple of 8, for example 8, 16, or 24.
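
A client-side check can catch an invalid count before the request is sent. The following is a minimal sketch; both helper names are hypothetical, and rejecting zero or negative values is an assumption (the page only states the multiple-of-8 rule).

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Return num_gpus unchanged if it is a positive multiple of 8, else raise."""
    # Assumption: zero and negative counts are treated as invalid.
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError(f"num_gpus must be a positive multiple of 8, got {num_gpus}")
    return num_gpus


def round_up_to_multiple_of_8(wanted: int) -> int:
    """Smallest multiple of 8 that is greater than or equal to wanted."""
    # Ceiling division written with floor division on the negated value.
    return -(-wanted // 8) * 8


print(round_up_to_multiple_of_8(20))  # 24
```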

cluster_name
string
required

Name of the GPU cluster.

billing_type
enum<string>
required

Billing type for the cluster. RESERVED billing lets you specify the duration of the cluster reservation via the duration_days field. ON_DEMAND billing gives you ownership of the cluster until you delete it. SCHEDULED_CAPACITY billing lets you reserve capacity for a scheduled time window; you must specify reservation_start_time and reservation_end_time with this request.

Available options:
RESERVED,
ON_DEMAND,
SCHEDULED_CAPACITY

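
The pairing between billing types and their companion fields can be sketched as a lookup table. The helper name `missing_billing_fields` is hypothetical, and treating duration_days as expected for RESERVED is an assumption: the description above says the field may be specified, not that it must be.

```python
# Companion fields per billing type, taken from the description above.
# Assumption: duration_days is treated as expected for RESERVED.
FIELDS_BY_BILLING_TYPE = {
    "RESERVED": ("duration_days",),
    "ON_DEMAND": (),
    "SCHEDULED_CAPACITY": ("reservation_start_time", "reservation_end_time"),
}


def missing_billing_fields(body: dict) -> list:
    """List the companion fields that the chosen billing_type expects but body lacks."""
    billing_type = body.get("billing_type")
    if billing_type not in FIELDS_BY_BILLING_TYPE:
        raise ValueError(f"unknown billing_type: {billing_type!r}")
    return [
        field
        for field in FIELDS_BY_BILLING_TYPE[billing_type]
        if body.get(field) is None
    ]


print(missing_billing_fields({"billing_type": "SCHEDULED_CAPACITY"}))
```
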
cuda_version
string
required

CUDA version for this cluster. For example, 12.5

nvidia_driver_version
string
required

NVIDIA driver version for this cluster. For example, 550. Only some combinations of cuda_version and nvidia_driver_version are supported.

cluster_type
enum<string>

Type of cluster to create.

Available options:
KUBERNETES,
SLURM

duration_days
integer

Duration in days to keep the cluster running.

shared_volume
object

Inline configuration to create a shared volume with the cluster creation.

volume_id
string

ID of an existing volume to use with the cluster creation.

gpu_node_failover_enabled
boolean
default:false

Whether automated GPU node failover should be enabled for this cluster. By default, it is disabled.

auto_scaled
boolean
default:false

Whether the GPU cluster should be auto-scaled based on the workload. By default, it is not auto-scaled.

auto_scale_max_gpus
integer

Maximum number of GPUs the cluster can be auto-scaled up to. This field is required if auto_scaled is true.

slurm_shm_size_gib
integer

Shared memory size in GiB for a Slurm cluster. This field is required if cluster_type is SLURM.
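
The two conditional requirements stated so far (auto_scale_max_gpus when auto_scaled is true, slurm_shm_size_gib when cluster_type is SLURM) can be checked before sending a request. This is a sketch only; `check_conditional_fields` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def check_conditional_fields(body: dict) -> list:
    """Return human-readable problems for conditionally required fields."""
    problems = []
    # auto_scaled defaults to false, so a missing key means no auto-scaling.
    if body.get("auto_scaled", False) and body.get("auto_scale_max_gpus") is None:
        problems.append("auto_scale_max_gpus is required when auto_scaled is true")
    if body.get("cluster_type") == "SLURM" and body.get("slurm_shm_size_gib") is None:
        problems.append("slurm_shm_size_gib is required when cluster_type is SLURM")
    return problems


for problem in check_conditional_fields({"cluster_type": "SLURM", "auto_scaled": True}):
    print(problem)
```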

capacity_pool_id
string

ID of the capacity pool to use for the cluster. This field is optional and only applicable if the cluster is created from a capacity pool.

reservation_start_time
string<date-time>

Reservation start time of the cluster. This field is required for SCHEDULED_CAPACITY billing to specify the reservation start time for the cluster. If not provided, the cluster will be provisioned immediately.

reservation_end_time
string<date-time>

Reservation end time of the cluster. This field is required for SCHEDULED_CAPACITY billing to specify the reservation end time for the cluster.
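
Both reservation fields are date-time strings. A minimal sketch of producing a start/end pair in the same UTC format as the sample response (`2023-11-07T05:31:56Z`) follows; the helper name `reservation_window` and its `hours` parameter are hypothetical conveniences, not part of the API.

```python
from datetime import datetime, timedelta, timezone

# Same shape as the timestamps in the sample response.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def reservation_window(start: datetime, hours: int) -> dict:
    """Return reservation_start_time / reservation_end_time strings in UTC."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    if hours <= 0:
        raise ValueError("hours must be positive")
    end = start + timedelta(hours=hours)
    return {
        "reservation_start_time": start.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT),
        "reservation_end_time": end.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT),
    }


window = reservation_window(
    datetime(2023, 11, 7, 5, 31, 56, tzinfo=timezone.utc), hours=24
)
print(window)
```

The returned dictionary can be merged into the request body alongside billing_type set to SCHEDULED_CAPACITY.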

install_traefik
boolean
default:false

Whether to install the Traefik ingress controller in the cluster. This field is only applicable for Kubernetes clusters and is false by default.

slurm_image
string

Custom Slurm image for Slurm clusters.

Response

200 - application/json

OK

cluster_id
string
required

cluster_type
enum<string>
required

Type of cluster.

Available options:
KUBERNETES,
SLURM

region
string
required

gpu_type
enum<string>
required

Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF

cluster_name
string
required

volumes
object[]
required

status
enum<string>
required

Current status of the GPU cluster.

Available options:
WaitingForControlPlaneNodes,
WaitingForDataPlaneNodes,
WaitingForSubnet,
WaitingForSharedVolume,
InstallingDrivers,
RunningAcceptanceTests,
Paused,
OnDemandComputePaused,
Ready,
Degraded,
Deleting
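
Because a newly created cluster starts in a provisioning status, callers typically poll until the status settles. The sketch below is one way to do that under stated assumptions: the grouping of statuses into "still provisioning" is inferred from their names and is not documented here, and `fetch_status` is a hypothetical callable you supply (for example, a wrapper around whatever call returns the cluster's current status).

```python
import time
from typing import Callable

# Assumption: these statuses mean provisioning is still in progress.
# The grouping is inferred from the names, not stated on this page.
PROVISIONING_STATUSES = {
    "WaitingForControlPlaneNodes",
    "WaitingForDataPlaneNodes",
    "WaitingForSubnet",
    "WaitingForSharedVolume",
    "InstallingDrivers",
    "RunningAcceptanceTests",
}


def wait_until_settled(
    fetch_status: Callable[[], str],
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it returns a non-provisioning status."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in PROVISIONING_STATUSES:
            return status  # e.g. Ready, Degraded, Paused, Deleting
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still in {status} after {timeout_s} seconds")
        sleep(interval_s)
```

The caller should still inspect the returned value, since a settled status is not necessarily Ready.
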
control_plane_nodes
object[]
required

gpu_worker_nodes
object[]
required

kube_config
string
required

num_gpus
integer
required

cuda_version
string
required

nvidia_driver_version
string
required

duration_hours
integer

slurm_shm_size_gib
integer

capacity_pool_id
string

reservation_start_time
string<date-time>

reservation_end_time
string<date-time>

install_traefik
boolean

created_at
string<date-time>