Example request (Python):

from together import Together

client = Together()

response = client.beta.clusters.create(
    cluster_name="my-gpu-cluster",
    region="us-central-8",
    gpu_type="H100_SXM",
    num_gpus=8,
    driver_version="CUDA_12_6_560",
    billing_type="ON_DEMAND",
)
print(response.cluster_id)

Example response:

{
"cluster_id": "<string>",
"cluster_type": "KUBERNETES",
"region": "<string>",
"gpu_type": "H100_SXM",
"cluster_name": "<string>",
"duration_hours": 123,
"driver_version": "CUDA_12_5_555",
"volumes": [
{
"volume_id": "<string>",
"volume_name": "<string>",
"size_tib": 123,
"status": "<string>"
}
],
"status": "WaitingForControlPlaneNodes",
"control_plane_nodes": [
{
"node_id": "<string>",
"node_name": "<string>",
"status": "<string>",
"host_name": "<string>",
"num_cpu_cores": 123,
"memory_gib": 123,
"network": "<string>"
}
],
"gpu_worker_nodes": [
{
"node_id": "<string>",
"node_name": "<string>",
"status": "<string>",
"host_name": "<string>",
"num_cpu_cores": 123,
"num_gpus": 123,
"memory_gib": 123,
"networks": [
"<string>"
]
}
],
"kube_config": "<string>",
"num_gpus": 123
}

Create an Instant Cluster on Together's high-performance GPU clusters. With features like on-demand scaling; long-lived, resizable, high-bandwidth shared DC-local storage; Kubernetes and Slurm cluster flavors; a REST API; and Terraform support, you can run workloads flexibly without complex infrastructure management.
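Once a cluster reports status Ready, the kube_config string in the response can be written to disk for use with kubectl. A minimal sketch, using a stand-in dict in place of the real response object (field names match the schema above; the file path and kubeconfig contents are illustrative):

```python
import os
import tempfile

# Stand-in for the API response shown above; in practice this would be
# the object returned by client.beta.clusters.create(...).
response = {
    "cluster_id": "cl-123",
    "status": "Ready",
    "kube_config": "apiVersion: v1\nkind: Config\n",
}

# Write the kubeconfig somewhere kubectl can find it (path is illustrative;
# you might instead merge it into ~/.kube/config or point KUBECONFIG at it).
path = os.path.join(tempfile.mkdtemp(), "kubeconfig")
with open(path, "w") as f:
    f.write(response["kube_config"])

print(path)
```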
Authorization: Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
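The same header can be attached when calling the REST API directly. A sketch using Python's standard library that builds (but does not send) a request carrying the bearer token; the endpoint URL here is a hypothetical placeholder, not taken from this page:

```python
import urllib.request

token = "TOGETHER_API_KEY"  # replace with your actual auth token

# Build the request with the Authorization header in the documented form.
req = urllib.request.Request(
    "https://api.together.example/v1/clusters",  # hypothetical endpoint URL
    method="POST",
    headers={"Authorization": f"Bearer {token}"},
)

print(req.get_header("Authorization"))  # Bearer TOGETHER_API_KEY
```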
Body (GPU cluster create request):

cluster_name: Name of the GPU cluster.
region: Region to create the GPU cluster in. Valid values: us-central-8, us-central-4.
gpu_type: Type of GPU to use in the cluster. Valid values: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.
num_gpus: Number of GPUs to allocate in the cluster. Must be a multiple of 8 (for example, 8, 16, or 24).
driver_version: NVIDIA driver version to use in the cluster. Valid values: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570.
billing_type: Billing type for the cluster. Valid values: RESERVED, ON_DEMAND.
cluster_type: Type of cluster to create. Valid values: KUBERNETES, SLURM.
duration_hours: Duration in days to keep the cluster running.
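The multiple-of-8 constraint on num_gpus is easy to check client-side before issuing a create request. A small sketch; the helper name is ours, not part of the SDK:

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Reject values that the API would refuse: num_gpus must be a
    positive multiple of 8 (e.g. 8, 16, 24), per the parameter docs."""
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError(f"num_gpus must be a positive multiple of 8, got {num_gpus}")
    return num_gpus

print(validate_num_gpus(16))  # 16
```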
Response 200 (OK):

cluster_type: Valid values: KUBERNETES, SLURM.
gpu_type: Valid values: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.
driver_version: Valid values: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570.
status: Current status of the GPU cluster. Valid values: WaitingForControlPlaneNodes, WaitingForDataPlaneNodes, WaitingForSubnet, WaitingForSharedVolume, InstallingDrivers, RunningAcceptanceTests, Paused, OnDemandComputePaused, Ready, Degraded, Deleting.
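A create call returns before the cluster is usable, so a caller typically polls the status field until it reaches Ready. A sketch with the status lookup stubbed out (the page does not show a retrieval call, so `fetch_status` is a placeholder; treating Degraded and Deleting as terminal failures is our assumption):

```python
import time

def wait_for_ready(fetch_status, poll_seconds=1.0, max_polls=100):
    """Poll fetch_status() until the cluster reports Ready.

    fetch_status is a placeholder for however you retrieve the cluster's
    current status; failure states raise instead of looping forever.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status == "Ready":
            return status
        if status in ("Degraded", "Deleting"):
            raise RuntimeError(f"cluster entered terminal state: {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("cluster did not become Ready in time")

# Usage with a fake status sequence mirroring the enum above:
states = iter(["WaitingForControlPlaneNodes", "InstallingDrivers", "Ready"])
print(wait_for_ready(lambda: next(states), poll_seconds=0))  # Ready
```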