POST /api/v1/gpu_cluster
curl -X POST \
-H "Authorization: Bearer $TOGETHER_API_KEY" \
-H "Content-Type: application/json" \
--data '{ "region": "us-west-2", "gpu_type": "H100_SXM", "num_gpus": 8, "cluster_name": "my-gpu-cluster", "duration_days": 7, "driver_version": "CUDA_12_6_560" }' \
https://manager.cloud.together.ai/api/v1/gpu_cluster
{
  "cluster_id": "<string>",
  "status": "UNKNOWN_STATUS"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
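As an illustration of the header format described above, here is a minimal Python sketch that builds (but does not send) the request using only the standard library. The helper name `build_create_request` is hypothetical; the URL is the one from the curl sample, and the `Content-Type` header is an assumption based on the `application/json` body type.

```python
import json
import urllib.request

# Endpoint taken from the curl sample in this page.
API_URL = "https://manager.cloud.together.ai/api/v1/gpu_cluster"


def build_create_request(token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the bearer token (hypothetical helper).

    The request is only constructed here; pass it to
    urllib.request.urlopen() to actually send it.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Header form: "Bearer <token>"
            "Authorization": f"Bearer {token}",
            # Assumed: the body is JSON, so declare it as such.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In practice the token would typically come from an environment variable such as `TOGETHER_API_KEY`, as in the curl sample.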

Body

application/json

GPU Cluster create request

region
string
required

Region to create the GPU cluster in. Valid values are us-central-8 and us-central-4.

gpu_type
enum<string>
required

Type of GPU to use in the cluster

Available options:
UNKNOWN_GPU_TYPE,
H100_SXM,
H200_SXM,
RTX_6000_PCI

num_gpus
integer
required

Number of GPUs to allocate in the cluster. Must be a multiple of 8, for example 8, 16, or 24.
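The multiple-of-8 rule can be checked on the client before sending a request. The following is a small sketch; the function name `validate_num_gpus` is hypothetical, and rejecting zero and negative values is an assumption not stated in this reference.

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Return num_gpus if it is a positive multiple of 8, else raise.

    Assumption: zero and negative counts are rejected, although the
    reference only states the multiple-of-8 requirement.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(num_gpus, bool) or not isinstance(num_gpus, int):
        raise TypeError("num_gpus must be an integer")
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError("num_gpus must be a positive multiple of 8")
    return num_gpus
```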

cluster_name
string
required

Name of the GPU cluster.

duration_days
integer
required

Duration in days to keep the cluster running.

driver_version
enum<string>
required

NVIDIA driver version to use in the cluster.

Available options:
UNKNOWN_DRIVER,
CUDA_12_5_555,
CUDA_12_6_560,
CUDA_12_6_565,
CUDA_12_8_570

billing_type
enum<string>
required

Available options:
UNSPECIFIED,
RESERVED,
ON_DEMAND

cluster_type
enum<string>

Type of cluster to create.

Available options:
UNKNOWN_TYPE,
KUBERNETES,
SLURM

shared_volume
object

volume_id
string
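The body fields above can be sanity-checked locally before a request is sent. This is a rough sketch, not the server's validation logic: the helper name `check_payload` is hypothetical, and the required fields and enum options are copied from the listing above.

```python
# Fields marked "required" in the body listing.
REQUIRED_FIELDS = (
    "region", "gpu_type", "num_gpus", "cluster_name",
    "duration_days", "driver_version", "billing_type",
)

# Enum options as documented, including the UNKNOWN/UNSPECIFIED entries.
ENUM_OPTIONS = {
    "gpu_type": {"UNKNOWN_GPU_TYPE", "H100_SXM", "H200_SXM", "RTX_6000_PCI"},
    "driver_version": {
        "UNKNOWN_DRIVER", "CUDA_12_5_555", "CUDA_12_6_560",
        "CUDA_12_6_565", "CUDA_12_8_570",
    },
    "billing_type": {"UNSPECIFIED", "RESERVED", "ON_DEMAND"},
    "cluster_type": {"UNKNOWN_TYPE", "KUBERNETES", "SLURM"},
}


def check_payload(payload: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, options in ENUM_OPTIONS.items():
        if field in payload and payload[field] not in options:
            problems.append(f"invalid value for {field}: {payload[field]!r}")
    return problems
```

A payload that omits `billing_type`, for instance, would be reported as missing a required field. The check does not cover `region` values or the `num_gpus` multiple-of-8 rule.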

Response

200 - application/json

OK

cluster_id
string

status
enum<string>

Available options:
UNKNOWN_STATUS,
PENDING,
QUEUED,
WAITING_CONTROL_PLANE,
WAITING_DATA_PLANE,
READY,
FAILED
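A client might parse the 200 response into typed values along these lines. This is a sketch under assumptions: the names `ClusterStatus`, `parse_create_response`, and `is_terminal` are hypothetical, and treating `READY` and `FAILED` as the only terminal states is an inference from their names, not something this reference states.

```python
import json
from enum import Enum


class ClusterStatus(str, Enum):
    """Status values as listed in the response schema."""
    UNKNOWN_STATUS = "UNKNOWN_STATUS"
    PENDING = "PENDING"
    QUEUED = "QUEUED"
    WAITING_CONTROL_PLANE = "WAITING_CONTROL_PLANE"
    WAITING_DATA_PLANE = "WAITING_DATA_PLANE"
    READY = "READY"
    FAILED = "FAILED"


def parse_create_response(body: str) -> tuple[str, ClusterStatus]:
    """Extract cluster_id and status from a JSON response body."""
    data = json.loads(body)
    # ClusterStatus(...) raises ValueError for an undocumented status.
    return data["cluster_id"], ClusterStatus(data["status"])


def is_terminal(status: ClusterStatus) -> bool:
    """Assumption: only READY and FAILED are final states."""
    return status in (ClusterStatus.READY, ClusterStatus.FAILED)
```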