This guide walks you through deploying a sample inference worker to Together’s managed GPU infrastructure.

Prerequisites

  • Together API Key – Required for all operations. Get one from together.ai.
  • Dedicated Containers access – Contact your account representative or [email protected] to enable Dedicated Containers for your organization.
  • Docker – For building and pushing container images. Install it from docker.com.
  • uv (optional) – For Python/package management. Install from astral-sh/uv.

Step 1: Install the Together CLI

uv tool install together
Set your API key:
export TOGETHER_API_KEY=your_key_here

Step 2: Clone the Sprocket Examples

git clone [email protected]:togethercomputer/sprocket.git
cd sprocket
The hello-world worker is a minimal Sprocket that returns a greeting:
import os
import sprocket


class HelloWorld(sprocket.Sprocket):
    def setup(self) -> None:
        self.greeting = "Hello"

    def predict(self, args: dict) -> dict:
        name = args.get("name", "world")
        return {"message": f"{self.greeting}, {name}!"}


if __name__ == "__main__":
    queue_name = os.environ.get("TOGETHER_DEPLOYMENT_NAME", "hello-world")
    sprocket.run(HelloWorld(), queue_name)
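Before deploying, you can sanity-check the worker's predict logic locally. A minimal sketch that stubs out the sprocket base class (the stub is an assumption for illustration only, since the real package lives on Together's private index):

```python
# Stand-in for sprocket.Sprocket so the worker logic can be exercised
# without the private `sprocket` package (assumption for local testing).
class Sprocket:
    pass


class HelloWorld(Sprocket):
    def setup(self) -> None:
        self.greeting = "Hello"

    def predict(self, args: dict) -> dict:
        name = args.get("name", "world")
        return {"message": f"{self.greeting}, {name}!"}


worker = HelloWorld()
worker.setup()
print(worker.predict({"name": "Together"}))  # {'message': 'Hello, Together!'}
```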

Step 3: Build and Deploy

Navigate to the example worker and deploy:
cd examples/hello-world
together beta jig deploy
This command:
  1. Builds the Docker image from the example
  2. Pushes it to Together’s private registry
  3. Creates a deployment on Together’s GPU infrastructure
Note the deployment name, which is the name field in pyproject.toml and also appears in the command output; you'll need it in the next steps. The example worker uses this pyproject.toml configuration:
[project]
name = "hello-world"
version = "0.1.0"
dependencies = ["sprocket"]

[[tool.uv.index]]
name = "together-pypi"
url = "https://pypi.together.ai/"

[tool.uv.sources]
sprocket = { index = "together-pypi" }

[tool.jig.image]
python_version = "3.11"
cmd = "python3 hello_world.py --queue"
copy = ["hello_world.py"]

[tool.jig.deploy]
gpu_type = "none"
gpu_count = 0
cpu = 1
memory = 2
storage = 10
port = 8000
min_replicas = 1
max_replicas = 1

Step 4: Watch Deployment Status

watch 'together beta jig status'
Wait until the deployment shows running and replicas are ready, then press Ctrl+C to stop watching. Note that watch is not installed by default on macOS; install it with brew install watch or your package manager of choice.

Step 5: Test the Health Endpoint

curl https://api.together.ai/v1/deployments/hello-world/health \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
Expected response:
{"status": "healthy"}

Step 6: Submit a Job

curl -X POST "https://api.together.ai/v1/queue/submit" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hello-world",
    "payload": {"name": "Together"},
    "priority": 1
  }'
Response:
{
  "request_id": "req_abc123",
  "status": "pending"
}
Copy the request_id for the next step.
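The submit request can also be made from Python. A standard-library sketch of the same POST (helper names are illustrative; the body fields mirror the curl example exactly):

```python
import json
import os
import urllib.request

API_BASE = "https://api.together.ai/v1"


def queue_body(model: str, payload: dict, priority: int = 1) -> bytes:
    """Encode the submit payload with the same fields as the curl example."""
    return json.dumps(
        {"model": model, "payload": payload, "priority": priority}
    ).encode()


def submit_job(model: str, payload: dict, priority: int = 1) -> dict:
    """POST a job to the queue; the response carries request_id and status."""
    req = urllib.request.Request(
        f"{API_BASE}/queue/submit",
        data=queue_body(model, payload, priority),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```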

Step 7: Get the Job Result

curl "https://api.together.ai/v1/queue/status?model=hello-world&request_id=req_abc123" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
Real request IDs use UUIDv7 format (e.g., 019ba379-92da-71e4-ac40-d98059fd67c7). Replace req_abc123 with your actual request ID from the submit response.
Response (when complete):
{
  "request_id": "req_abc123",
  "model": "hello-world",
  "status": "done",
  "outputs": {"message": "Hello, Together!"}
}
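Putting Steps 6 and 7 together, a small polling loop can wait for the result. This is a sketch assuming jobs eventually report the done status shown above; intermediate status values other than pending are not documented here:

```python
import json
import os
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.together.ai/v1"


def status_url(model: str, request_id: str) -> str:
    """Build the queue status URL, matching the curl example's query string."""
    query = urllib.parse.urlencode({"model": model, "request_id": request_id})
    return f"{API_BASE}/queue/status?{query}"


def wait_for_result(model: str, request_id: str,
                    interval: float = 2.0, timeout: float = 120.0) -> dict:
    """Poll the status endpoint until the job reports `done`, then return it."""
    headers = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(status_url(model, request_id), headers=headers)
        with urllib.request.urlopen(req, timeout=10) as resp:
            job = json.load(resp)
        if job.get("status") == "done":
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {request_id} did not finish within {timeout}s")
```

For the hello-world job submitted above, the returned dict's outputs field would contain the greeting message.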

Step 8: View Logs

Stream logs from your deployment:
together beta jig logs --follow

Step 9: Clean Up

When you’re done, delete the deployment:
together beta jig destroy

Next Steps

Now that you’ve deployed your first container, explore the full platform:

Example Guides