Create an Instant Cluster on Together's high-performance GPU clusters. With features like on-demand scaling, long-lived resizable high-bandwidth shared DC-local storage, Kubernetes and Slurm cluster flavors, a REST API, and Terraform support, you can run workloads flexibly without complex infrastructure management.

Example request:

from together import Together

client = Together()

response = client.beta.clusters.create(
    cluster_name="my-gpu-cluster",
    region="us-central-8",
    gpu_type="H100_SXM",
    num_gpus=8,
    driver_version="CUDA_12_6_560",
    billing_type="ON_DEMAND",
)
print(response.cluster_id)

Example response:

{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "duration_hours": 123,
  "driver_version": "CUDA_12_5_555",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ]
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123
}
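For the KUBERNETES cluster flavor, the response's kube_config field contains a kubeconfig you can hand to kubectl. A minimal sketch of persisting it to disk, assuming only the response schema above (the kubeconfig string here is a stand-in for response.kube_config, and the helper name is illustrative, not part of the SDK):

```python
import os
import stat

def save_kube_config(kube_config: str, path: str) -> str:
    """Write the cluster's kubeconfig to disk with owner-only permissions."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(kube_config)
    # kubectl warns when a kubeconfig is group- or world-readable
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path

# Stand-in for response.kube_config from the create call above
kube_config = "apiVersion: v1\nkind: Config\n"
path = save_kube_config(kube_config, "/tmp/my-gpu-cluster/kubeconfig")
# Then, for example: KUBECONFIG=/tmp/my-gpu-cluster/kubeconfig kubectl get nodes
```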
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
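When calling the REST API directly, the same Bearer token goes in the Authorization header. A hedged sketch with urllib that only builds the request (the endpoint URL below is a placeholder, not the documented path; the payload fields mirror the create call above):

```python
import json
import urllib.request

API_KEY = "your-api-key"  # your Together auth token

# Placeholder path: consult the REST reference for the actual clusters endpoint
URL = "https://api.together.xyz/v1/clusters"

payload = {
    "cluster_name": "my-gpu-cluster",
    "region": "us-central-8",
    "gpu_type": "H100_SXM",
    "num_gpus": 8,
    "driver_version": "CUDA_12_6_560",
    "billing_type": "ON_DEMAND",
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; omitted here.
```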
GPU Cluster create request

Request body:

region: Region to create the GPU cluster in. Usable regions can be found from client.clusters.list_regions().

gpu_type: Type of GPU to use in the cluster. One of: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.

num_gpus: Number of GPUs to allocate in the cluster. This must be a multiple of 8, for example 8, 16, or 24.

cluster_name: Name of the GPU cluster.

driver_version: NVIDIA driver version to use in the cluster. One of: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570.

billing_type: Billing type for the cluster. RESERVED billing lets you specify the duration of the cluster reservation via the duration_days field; ON_DEMAND billing gives you ownership of the cluster until you delete it. One of: RESERVED, ON_DEMAND.

cluster_type: Type of cluster to create. One of: KUBERNETES, SLURM.

duration_days: Duration in days to keep the cluster running.

volumes: Inline configuration to create a shared volume with the cluster creation.

volume_id: ID of an existing volume to use with the cluster creation.

Response (200 OK):

cluster_type: Type of cluster. One of: KUBERNETES, SLURM.

gpu_type: One of: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF.

driver_version: One of: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570.

status: Current status of the GPU cluster. One of: WaitingForControlPlaneNodes, WaitingForDataPlaneNodes, WaitingForSubnet, WaitingForSharedVolume, InstallingDrivers, RunningAcceptanceTests, Paused, OnDemandComputePaused, Ready, Degraded, Deleting.
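The constraints documented above (num_gpus a multiple of 8, enum-valued fields) can be checked client-side before calling the API. A small validation sketch using only the values listed here (the helper name is illustrative, not part of the SDK):

```python
# Enum values taken from the parameter documentation above
GPU_TYPES = {"H100_SXM", "H200_SXM", "RTX_6000_PCI", "L40_PCIE",
             "B200_SXM", "H100_SXM_INF"}
DRIVER_VERSIONS = {"CUDA_12_5_555", "CUDA_12_6_560",
                   "CUDA_12_6_565", "CUDA_12_8_570"}
BILLING_TYPES = {"RESERVED", "ON_DEMAND"}

def validate_create_request(gpu_type: str, num_gpus: int,
                            driver_version: str, billing_type: str) -> None:
    """Raise ValueError if a documented constraint is violated.

    Illustrative client-side check, not part of the Together SDK.
    """
    if gpu_type not in GPU_TYPES:
        raise ValueError(f"unknown gpu_type: {gpu_type}")
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError("num_gpus must be a positive multiple of 8")
    if driver_version not in DRIVER_VERSIONS:
        raise ValueError(f"unknown driver_version: {driver_version}")
    if billing_type not in BILLING_TYPES:
        raise ValueError(f"unknown billing_type: {billing_type}")

validate_create_request("H100_SXM", 16, "CUDA_12_6_560", "ON_DEMAND")  # passes
```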