Quickstart: Create Your First Cluster

Create a Cluster

Follow these steps to create your first GPU cluster:

1. Access the Cluster Console

Log in to api.together.ai
Click GPU Clusters in the top navigation menu
Click Create Cluster

2. Choose Capacity Type

Select the billing mode that fits your needs:

Reserved – Pay upfront to reserve capacity for 1-90 days with discounted pricing
On-demand – Pay hourly with no commitment; terminate anytime

Learn more about capacity types →

3. Configure Your Cluster

Cluster Size

Select the number and type of GPUs (e.g., 8xH100)
Available options: H100, H200, B200

Cluster Name

Enter a descriptive name for easy identification

Cluster Type

Kubernetes – For containerized workloads and K8s-native tools
Slurm – For HPC-style batch scheduling and traditional workflows

Region

Select the datacenter region closest to your data or team

Duration (Reserved only)

Choose reservation length: 1-90 days

Shared Volume

Create and name your persistent storage volume
Minimum size: 1 TiB
Can be resized later as needed

Optional Settings

Select NVIDIA driver version
Select CUDA version

4. Create and Verify

Click Proceed to create your cluster
Monitor the cluster status in the UI as it provisions
Wait for status to transition to Ready

Your cluster is now ready to use!

Next Steps

For Kubernetes Clusters

Install kubectl
- macOS installation guide
- Or use your preferred method for your OS
Download kubeconfig
- From the cluster UI, download the kubeconfig file
- Save it on your local machine as ~/.kube/together_cluster.kubeconfig, then point kubectl at it:

export KUBECONFIG=$HOME/.kube/together_cluster.kubeconfig

Verify connectivity

kubectl get nodes

You should see all worker and control plane nodes listed.

Start using your cluster
- Deploy workloads
- Access the K8s Dashboard
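
Nodes can take a little time to register after provisioning, so it may help to block until they all report Ready before deploying anything. The following is a minimal sketch, assuming the kubeconfig was saved at the path used above; the 300-second timeout is an arbitrary choice, not a documented value.

```shell
# Fall back to the path used in this guide if KUBECONFIG is not already set.
export KUBECONFIG="${KUBECONFIG:-$HOME/.kube/together_cluster.kubeconfig}"

if command -v kubectl >/dev/null 2>&1 && [ -r "$KUBECONFIG" ]; then
  # Block until every node reports the Ready condition (timeout is an assumption).
  kubectl wait --for=condition=Ready nodes --all --timeout=300s
  # Show node roles, versions, and internal IPs.
  kubectl get nodes -o wide
else
  echo "kubectl is not installed or $KUBECONFIG is not readable" >&2
fi
```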

For Slurm Clusters

Add SSH key (if not already done)
- Ensure your SSH key is added to your account at api.together.ai/settings/ssh-key
- Keys must be added before cluster creation
Connect via SSH
- Use the connection command shown in the cluster UI
- SSH directly to the Slurm login node
Verify Slurm

sinfo          # View node status
squeue         # View job queue

Start submitting jobs
- Learn about Slurm commands
- Submit batch jobs with sbatch
- Run interactive jobs with srun
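
As an illustration of the sbatch workflow, the sketch below writes a small batch script and submits it. The job name, time limit, and output file name are hypothetical choices; `--gpus=1` mirrors the srun example in this guide and assumes the cluster's Slurm configuration exposes GPUs that way.

```shell
# Write a minimal batch script. All #SBATCH values here are illustrative.
cat > gpu-test.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --gpus=1
#SBATCH --time=00:05:00
#SBATCH --output=gpu-test-%j.out

# Report which node ran the job and which GPU it was given.
hostname
nvidia-smi
EOF

# Submit only where Slurm is available (that is, on the login node).
if command -v sbatch >/dev/null 2>&1; then
  sbatch gpu-test.sbatch
else
  echo "sbatch not found; run this on the Slurm login node" >&2
fi
```

Once the job finishes, its output should appear in gpu-test-<jobid>.out in the directory you submitted from; `squeue` shows the job while it is pending or running.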

Common First Tasks

Upload Data

For small datasets:

# Create a pod with your shared volume mounted
# Then copy files directly
kubectl cp local_file.tar.gz pod-name:/mnt/shared/

For large datasets, create a pod that downloads from S3 or your data source.
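
The "pod with your shared volume mounted" step could look roughly like the sketch below. It assumes the shared volume is exposed in the cluster as a PersistentVolumeClaim; the claim name `shared-volume` and the pod name `data-loader` are hypothetical, so run `kubectl get pvc` to find the real claim name first.

```shell
# Write a pod manifest that mounts the shared volume at /mnt/shared.
cat > data-loader.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: data-loader              # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: loader
    image: ubuntu
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: shared
      mountPath: /mnt/shared
  volumes:
  - name: shared
    persistentVolumeClaim:
      claimName: shared-volume   # hypothetical; check `kubectl get pvc`
EOF

# Only talk to the cluster if kubectl is installed and the cluster is reachable.
if command -v kubectl >/dev/null 2>&1 && kubectl get nodes >/dev/null 2>&1; then
  kubectl apply -f data-loader.yaml
  kubectl wait --for=condition=Ready pod/data-loader --timeout=120s
  kubectl cp local_file.tar.gz data-loader:/mnt/shared/
else
  echo "cluster not reachable; manifest written to data-loader.yaml" >&2
fi
```

For a large dataset, the same manifest can be adapted by replacing the `sleep infinity` command with a download command for your data source.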

Run a Test Job

Kubernetes example:

kubectl run test --image=ubuntu --command -- sleep infinity
kubectl exec -it test -- bash

Slurm example:

srun --gpus=1 --pty bash
nvidia-smi
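
The Kubernetes example above starts a pod without requesting a GPU. To check GPU scheduling as well, a pod can request one explicitly. This sketch assumes the cluster advertises GPUs under the standard `nvidia.com/gpu` resource name (as the NVIDIA device plugin does); the image tag and the pod name `gpu-test` are assumptions, and any CUDA base image should work.

```shell
# Write a one-shot pod that requests a single GPU and runs nvidia-smi.
cat > gpu-test.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                 # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: gpu-test
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# Only talk to the cluster if kubectl is installed and the cluster is reachable.
if command -v kubectl >/dev/null 2>&1 && kubectl get nodes >/dev/null 2>&1; then
  kubectl apply -f gpu-test.yaml
  # Wait for the pod to run to completion, then print the nvidia-smi output.
  kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/gpu-test --timeout=300s
  kubectl logs gpu-test
  kubectl delete pod gpu-test
else
  echo "cluster not reachable; manifest written to gpu-test.yaml" >&2
fi
```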

Troubleshooting

Can’t see my nodes

Verify your kubeconfig is set: echo $KUBECONFIG
Check cluster status in the UI (should be “Ready”)
Ensure you downloaded the latest kubeconfig
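
The first check can be scripted. The helper below is a small sketch (the function name `check_kubeconfig` is hypothetical) that assumes KUBECONFIG holds a single file path rather than a colon-separated list.

```shell
# Return 0 if KUBECONFIG points at a readable file, 1 otherwise.
check_kubeconfig() {
  if [ -z "${KUBECONFIG:-}" ]; then
    echo "KUBECONFIG is not set" >&2
    return 1
  fi
  if [ ! -r "$KUBECONFIG" ]; then
    echo "kubeconfig not readable: $KUBECONFIG" >&2
    return 1
  fi
  echo "Using kubeconfig: $KUBECONFIG"
}

# If the file looks fine, show which context kubectl targets and list nodes.
if check_kubeconfig && command -v kubectl >/dev/null 2>&1; then
  kubectl config current-context
  kubectl get nodes
fi
```

If the context shown is not your cluster, the shell is probably picking up a different kubeconfig than the one you downloaded.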

SSH connection refused

Verify your SSH key was added before cluster creation
Check the connection command in the cluster UI
Ensure you’re using the correct hostname
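
To confirm which key your machine offers, it can help to list the fingerprints of your local public keys and compare them with the key registered in your account. This sketch assumes your keys live in ~/.ssh with a .pub extension; how the account page displays registered keys is not covered here.

```shell
# Print the fingerprint of every public key found in ~/.ssh.
found=0
for pub in "$HOME"/.ssh/*.pub; do
  [ -e "$pub" ] || continue   # the glob did not match any file
  found=1
  ssh-keygen -lf "$pub"
done

if [ "$found" -eq 0 ]; then
  echo "No public keys found in ~/.ssh; create one with: ssh-keygen -t ed25519" >&2
fi
```

Adding `-v` to the connection command from the cluster UI makes ssh print which keys it tries and where the connection fails.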

Capacity unavailable

Use the “Notify Me” option to get alerts when capacity is available
Try a different region
Contact Together AI support for custom requirements
