This guide walks you through deploying a sample inference worker to Together’s managed GPU infrastructure.

Prerequisites

  • Together API Key – Required for all operations. Get one from together.ai.
  • Dedicated Containers access – Contact your account representative or [email protected] to enable Dedicated Containers for your organization.
  • Docker – For building and pushing container images. Install it from docker.com.
  • uv (optional) – For Python package management. Install from astral-sh/uv.

Step 1: Install the Together CLI

uv tool install together
Set your API key:
export TOGETHER_API_KEY=your_key_here

Step 2: Clone the Sprocket Examples

git clone [email protected]:togethercomputer/sprocket.git
cd sprocket
The hello-world worker, included in sprocket/examples/hello_world, is a minimal Sprocket that returns a greeting:
import os
import sprocket


class HelloWorld(sprocket.Sprocket):
    def setup(self) -> None:
        self.greeting = "Hello"

    def predict(self, args: dict) -> dict:
        name = args.get("name", "world")
        return {"message": f"{self.greeting}, {name}!"}


if __name__ == "__main__":
    queue_name = os.environ.get("TOGETHER_DEPLOYMENT_NAME", "hello-world")
    sprocket.run(HelloWorld(), queue_name)
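Before deploying, you can exercise the worker's request/response contract locally. The sketch below is an assumption, not part of the example repo: it reimplements setup and predict as a plain class so it runs without the sprocket package installed, just to show the payload in and message out:

```python
class HelloWorldLogic:
    """Stand-in for the worker above, minus the sprocket.Sprocket base class."""

    def setup(self) -> None:
        self.greeting = "Hello"

    def predict(self, args: dict) -> dict:
        name = args.get("name", "world")
        return {"message": f"{self.greeting}, {name}!"}


worker = HelloWorldLogic()
worker.setup()
print(worker.predict({"name": "Together"}))  # {'message': 'Hello, Together!'}
```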

Step 3: Build and Deploy

Deployments can be configured with a pyproject.toml file. The deployment name, set by the configuration, must be globally unique. The example worker uses this pyproject.toml configuration:
[project]
name = "hello-world"
version = "0.1.0"
dependencies = ["sprocket"]

[[tool.uv.index]]
name = "together-pypi"
url = "https://pypi.together.ai/"

[tool.uv.sources]
sprocket = { index = "together-pypi" }

[tool.jig.image]
python_version = "3.11"
cmd = "python3 hello_world.py --queue"
copy = ["hello_world.py"]

[tool.jig.deploy]
gpu_type = "none"
gpu_count = 0
cpu = 1
memory = 2
storage = 10
port = 8000
min_replicas = 1
max_replicas = 1
Because deployment names must be globally unique, change the project name in pyproject.toml to one of your own and substitute it for hello-world throughout the rest of the tutorial.
Navigate to the example worker and deploy:
cd examples/hello_world
together beta jig deploy
This command:
  1. Builds the Docker image from the example
  2. Pushes it to Together’s private registry
  3. Creates a deployment on Together’s GPU infrastructure

Step 4: Watch Deployment Status

watch 'together beta jig status'
Wait until the deployment shows running and replicas are ready, then press Ctrl+C to stop watching. Note that watch is not installed by default on macOS; install it with brew install watch or your package manager of choice.
You can also view the status of your deployments from the Together AI web console.

Step 5: Test the Health Endpoint

curl https://api.together.ai/v1/deployment-request/hello-world/health \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
Expected response:
{"status": "healthy"}
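The same health check can be issued from Python. This is a sketch rather than an official client; it assumes only the endpoint and response shape shown above:

```python
import json
import os
import urllib.request

BASE = "https://api.together.ai/v1"


def health_url(deployment: str) -> str:
    """URL of the per-deployment health endpoint."""
    return f"{BASE}/deployment-request/{deployment}/health"


def is_healthy(deployment: str) -> bool:
    """Return True if the deployment reports {"status": "healthy"}."""
    req = urllib.request.Request(
        health_url(deployment),
        headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("status") == "healthy"
```

Once Step 4 shows the deployment running, `is_healthy("hello-world")` should return True.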

Step 6: Submit a Job

curl -X POST "https://api.together.ai/v1/queue/submit" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hello-world",
    "payload": {"name": "Together"},
    "priority": 1
  }'
Response:
{
  "request_id": "req_abc123"
}
Copy the request_id for the next step.

Step 7: Get the Job Result

curl "https://api.together.ai/v1/queue/status?model=hello-world&request_id=req_abc123" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
Real request IDs use UUIDv7 format (e.g., 019ba379-92da-71e4-ac40-d98059fd67c7). Replace req_abc123 with your actual request ID from the submit response.
Response (when complete):
{
  "request_id": "req_abc123",
  "model": "hello-world",
  "status": "done",
  "outputs": {"message": "Hello, Together!"}
}
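Steps 6 and 7 combine naturally into a submit-then-poll loop. A minimal sketch, assuming only the endpoints and JSON fields shown above (model, payload, priority, request_id, status, outputs); failure statuses are not documented here, so the loop only handles "done":

```python
import json
import os
import time
import urllib.request

BASE = "https://api.together.ai/v1"
HEADERS = {
    "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
    "Content-Type": "application/json",
}


def status_url(model: str, request_id: str) -> str:
    """URL of the queue status endpoint for one job."""
    return f"{BASE}/queue/status?model={model}&request_id={request_id}"


def submit(model: str, payload: dict) -> str:
    """Enqueue a job and return its request_id."""
    body = json.dumps({"model": model, "payload": payload, "priority": 1}).encode()
    req = urllib.request.Request(f"{BASE}/queue/submit", data=body, headers=HEADERS)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["request_id"]


def wait_for_outputs(model: str, request_id: str, interval: float = 2.0) -> dict:
    """Poll the status endpoint until the job reports done, then return outputs."""
    while True:
        req = urllib.request.Request(status_url(model, request_id), headers=HEADERS)
        with urllib.request.urlopen(req) as resp:
            job = json.load(resp)
        if job["status"] == "done":
            return job["outputs"]
        time.sleep(interval)
```

Usage: `request_id = submit("hello-world", {"name": "Together"})` followed by `wait_for_outputs("hello-world", request_id)`, which should return the same outputs object shown above.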

Step 8: View Logs

Stream logs from your deployment:
together beta jig logs --follow

Step 9: Clean Up

When you’re done, delete the deployment:
together beta jig destroy

Next Steps

Now that you’ve deployed your first container, explore the full platform:

Example Guides