Dedicated Containers let you run your own Dockerized inference workloads on Together's managed GPU infrastructure. You bring the container; Together handles compute provisioning, autoscaling, networking, and observability. You build and push a Docker image with the Jig CLI, and inside your container the Sprocket SDK connects your inference code to Together's managed job queue. Once deployed, your workers can receive requests.
  • Wrap and deploy your model in 20 minutes
  • Boost conversion and margins with fair priority queueing
  • Bottomless capacity, provisioned just before you need it
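The worker lifecycle described above (one-time setup at container start, then a prediction per queued request) can be sketched in plain Python. The class and method names here mirror the `setup()`/`predict()` convention used by the Sprocket SDK, but this is an illustrative stand-in, not the actual SDK API:

```python
# Illustrative sketch of the setup()/predict() worker pattern.
# This is NOT the Sprocket SDK -- the class, methods, and request
# shape are assumptions standing in for the real API.

class Worker:
    def setup(self):
        """Runs once at container start: load weights, warm caches."""
        # Stand-in for loading a real model onto the GPU.
        self.model = lambda prompt: f"generated:{prompt}"

    def predict(self, request: dict) -> dict:
        """Runs once per job pulled from the managed queue."""
        return {"output": self.model(request["prompt"])}

# Simulated job loop; in production the platform's queue drives this.
worker = Worker()
worker.setup()
result = worker.predict({"prompt": "a red bicycle"})
print(result["output"])  # → generated:a red bicycle
```

The key property this pattern gives you is that expensive initialization happens once per container, not once per request.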
Dedicated Containers Architecture

Quickstart

Deploy Your First Container

Deploy your first container from the command line

Concepts

Platform Overview

Architecture, deployment lifecycle, autoscaling, and troubleshooting

Jig CLI

Build, deploy, secrets, and volumes

Sprocket SDK

Inference workers with setup() and predict()

Queue API

Async jobs with priority and progress
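As a rough mental model for the priority and progress semantics mentioned above (not the real Queue API, whose endpoints and fields may differ), a priority queue serves lower priority numbers first while each worker reports progress as it runs:

```python
import heapq

# Toy model of a priority job queue -- not the actual Queue API.
# Lower priority number = served first; the sequence counter keeps
# ordering stable for jobs with equal priority.

queue: list[tuple[int, int, str]] = []  # (priority, seq, job_id)
heapq.heappush(queue, (2, 0, "job-low"))
heapq.heappush(queue, (1, 1, "job-high"))

completed = []
while queue:
    priority, _, job_id = heapq.heappop(queue)
    for pct in (50, 100):  # worker reports progress as it runs
        print(f"{job_id}: {pct}%")
    completed.append(job_id)

print(completed)  # → ['job-high', 'job-low']
```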

Guides

Image Generation

Single-GPU Flux2 model

Video Generation

Multi-GPU Wan 2.1 with torchrun

Reference

Jig CLI

CLI commands and pyproject.toml configuration

Sprocket SDK

Base classes, file handling, and error reference

REST API

Deployments, secrets, storage, and queue

Get Access

Contact your account representative or [email protected] to enable Dedicated Containers for your organization.