from together import Together

client = Together()

# Provision an 8-GPU H100 cluster billed on demand.
response = client.beta.clusters.create(
    cluster_name="my-gpu-cluster",
    region="us-central-8",
    gpu_type="H100_SXM",
    num_gpus=8,
    driver_version="CUDA_12_6_560",
    billing_type="ON_DEMAND",
)
print(response.cluster_id)

{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ],
      "instance_id": "<string>"
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123,
  "cuda_version": "<string>",
  "nvidia_driver_version": "<string>",
  "duration_hours": 123,
  "slurm_shm_size_gib": 123,
  "capacity_pool_id": "<string>",
  "reservation_start_time": "2023-11-07T05:31:56Z",
  "reservation_end_time": "2023-11-07T05:31:56Z",
  "install_traefik": true,
  "created_at": "2023-11-07T05:31:56Z"
}

Create an Instant Cluster on Together's high-performance GPU clusters. With features like on-demand scaling; long-lived, resizable, high-bandwidth, DC-local shared storage; Kubernetes and Slurm cluster flavors; a REST API; and Terraform support, you can run workloads flexibly without complex infrastructure management.
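As a sketch of working with the response schema above, the snippet below sums GPU counts across the documented gpu_worker_nodes list. The sample values are illustrative stand-ins, not real API output:

```python
# Illustrative response shaped like the documented schema above.
sample_response = {
    "cluster_id": "cl-123",
    "status": "Ready",
    "gpu_worker_nodes": [
        {"node_id": "n-1", "num_gpus": 8, "networks": ["net-a"]},
        {"node_id": "n-2", "num_gpus": 8, "networks": ["net-a"]},
    ],
}

# Sum GPU counts across all worker nodes.
total_gpus = sum(node["num_gpus"] for node in sample_response["gpu_worker_nodes"])
print(total_gpus)  # 16
```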
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
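A minimal sketch of building that header for raw REST calls; the environment-variable name is an illustrative assumption, not specified on this page:

```python
import os

# The API key location is an assumption; substitute however you store your token.
api_key = os.environ.get("TOGETHER_API_KEY", "example-token")

# Bearer authentication header of the form described above.
headers = {"Authorization": f"Bearer {api_key}"}
print(headers["Authorization"])
```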
GPU Cluster create request

region: Region to create the GPU cluster in. Usable regions can be found from client.clusters.list_regions().

gpu_type: Type of GPU to use in the cluster. One of H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.

num_gpus: Number of GPUs to allocate in the cluster. This must be a multiple of 8 (for example, 8, 16, or 24).

cluster_name: Name of the GPU cluster.

billing_type: One of RESERVED, ON_DEMAND, SCHEDULED_CAPACITY. RESERVED billing lets you specify the duration of the cluster reservation via the duration_days field. ON_DEMAND billing gives you ownership of the cluster until you delete it. SCHEDULED_CAPACITY billing lets you reserve capacity for a scheduled time window; you must specify reservation_start_time and reservation_end_time with this request.

cuda_version: CUDA version for this cluster. For example, 12.5.

nvidia_driver_version: Nvidia driver version for this cluster. For example, 550. Only some combinations of cuda_version and nvidia_driver_version are supported.

cluster_type: Type of cluster to create. One of KUBERNETES, SLURM.

duration_days: Duration in days to keep the cluster running.

Volume (inline): Inline configuration to create a shared volume with the cluster creation.

Volume (existing): ID of an existing volume to use with the cluster creation.

Failover: Whether automated GPU node failover should be enabled for this cluster. Disabled by default.

auto_scaled: Whether the GPU cluster should be auto-scaled based on the workload. Not auto-scaled by default.

Maximum auto-scale size: Maximum number of GPUs to which the cluster can be auto-scaled up. Required if auto_scaled is true.

slurm_shm_size_gib: Shared memory size in GiB for Slurm clusters. Required if cluster_type is SLURM.

capacity_pool_id: ID of the capacity pool to use for the cluster. Optional; only applicable if the cluster is created from a capacity pool.

reservation_start_time: Reservation start time of the cluster. Required for SCHEDULED_CAPACITY billing to specify the reservation start time. If not provided, the cluster will be provisioned immediately.

reservation_end_time: Reservation end time of the cluster. Required for SCHEDULED_CAPACITY billing to specify the reservation end time.

install_traefik: Whether to install the Traefik ingress controller in the cluster. Only applicable to Kubernetes clusters; false by default.

Slurm image: Custom Slurm image for Slurm clusters.
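Putting several of these parameters together, here is a sketch of a RESERVED Slurm cluster request built as a plain dict so the multiple-of-8 constraint can be checked before sending. The keyword names mirror the fields documented above; passing them as SDK keyword arguments is an assumption about the signature:

```python
# Request parameters for a reserved Slurm cluster (illustrative values).
params = {
    "cluster_name": "my-slurm-cluster",
    "region": "us-central-8",
    "gpu_type": "H100_SXM",
    "num_gpus": 16,             # must be a multiple of 8
    "cluster_type": "SLURM",
    "billing_type": "RESERVED",
    "duration_days": 7,         # reservation length for RESERVED billing
    "slurm_shm_size_gib": 32,   # required when cluster_type is SLURM
}

# Validate the documented multiple-of-8 constraint client-side.
assert params["num_gpus"] % 8 == 0

# response = client.beta.clusters.create(**params)  # requires an API key
print(sorted(params))
```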
Response: 200 OK
cluster_type: Type of cluster. One of KUBERNETES, SLURM.

gpu_type: One of H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.

status: Current status of the GPU cluster. One of WaitingForControlPlaneNodes, WaitingForDataPlaneNodes, WaitingForSubnet, WaitingForSharedVolume, InstallingDrivers, RunningAcceptanceTests, Paused, OnDemandComputePaused, Ready, Degraded, Deleting.
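A client might group these documented status values when deciding whether a cluster is usable; a minimal sketch follows, where the grouping itself is an illustrative assumption rather than something the API defines:

```python
# Documented status values, grouped for a simple readiness check.
# The grouping is an illustrative assumption, not part of the API.
PROVISIONING = {
    "WaitingForControlPlaneNodes", "WaitingForDataPlaneNodes",
    "WaitingForSubnet", "WaitingForSharedVolume",
    "InstallingDrivers", "RunningAcceptanceTests",
}
PAUSED = {"Paused", "OnDemandComputePaused"}

def is_usable(status: str) -> bool:
    """Return True only when the cluster is fully ready for workloads."""
    return status == "Ready"

print(is_usable("Ready"), is_usable("InstallingDrivers"))  # True False
```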