Environment Variables
| Variable | Default | Description |
|---|---|---|
| TOGETHER_API_KEY | Required | Your Together API key |
| TOGETHER_DEBUG | "" | Enable debug logging ("1" or "true") |
| WARMUP_ENV_NAME | TORCHINDUCTOR_CACHE_DIR | Environment variable for cache location |
| WARMUP_DEST | torch_cache | Cache directory path in container |
Jig commands are run via together beta jig. Use --config <path> to specify a custom config file (default: pyproject.toml).
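For example (assuming the Together CLI is installed and TOGETHER_API_KEY is exported), a subcommand and a custom config file are passed like this; the flag placement shown is an assumption:

```sh
# Run a jig subcommand through the Together CLI
together beta jig status

# Point jig at a per-environment config file
together beta jig status --config staging_jig.toml
```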
Build
jig init
Create a starter pyproject.toml with sensible defaults.
jig dockerfile
Generate a Dockerfile from your pyproject.toml configuration. Useful for debugging the build.
jig build
Build the Docker image locally.
| Flag | Description |
|---|---|
| --tag <tag> | Image tag (default: content-hash) |
| --warmup | Pre-generate compile caches after build (requires GPU, see Cache Warmup) |
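A typical first build, end to end (the tag is illustrative):

```sh
together beta jig init                           # scaffold a starter pyproject.toml
together beta jig dockerfile                     # inspect the generated Dockerfile
together beta jig build --tag myapp-v1 --warmup  # --warmup requires a GPU
together beta jig push --tag myapp-v1            # push to registry.together.xyz
```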
jig push
Push the built image to Together’s registry at registry.together.xyz.
| Flag | Description |
|---|---|
| --tag <tag> | Image tag to push |
Deployments
jig deploy
Build, push, and create or update the deployment. Combines build, push, and deployment creation into one step.
| Flag | Description |
|---|---|
| --tag <tag> | Image tag |
| --warmup | Pre-generate compile caches (requires GPU) |
| --build-only | Build and push only, skip deployment creation |
| --image <ref> | Deploy an existing image, skip build and push |
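For example, the usual one-step path plus the two escape hatches (the tag and image reference are illustrative):

```sh
# Build, push, and deploy in one step
together beta jig deploy --tag myapp-v1 --warmup

# Build and push now; create the deployment later
together beta jig deploy --tag myapp-v1 --build-only

# Deploy an image that was already built and pushed
together beta jig deploy --image "$EXISTING_IMAGE_REF"
```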
jig status
Show deployment status and configuration.
jig list
List all deployments in your organization.
jig logs
Retrieve deployment logs.
| Flag | Description |
|---|---|
| --follow | Stream logs in real-time |
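For example:

```sh
together beta jig status         # one deployment's state and configuration
together beta jig list           # every deployment in the organization
together beta jig logs --follow  # stream logs until interrupted
```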
jig destroy
Delete the deployment.
jig endpoint
Print the deployment’s endpoint URL.
Queue
jig submit
Submit a job to the deployment’s queue.
| Flag | Description |
|---|---|
| --prompt <text> | Shorthand for --payload '{"prompt": "..."}' |
| --payload <json> | Full JSON payload |
| --watch | Wait for the job to complete and print the result |
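The two submissions below are equivalent; --watch blocks until the job completes and prints the result (prompt text illustrative):

```sh
together beta jig submit --prompt "Hello, world"
together beta jig submit --payload '{"prompt": "Hello, world"}' --watch
```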
jig job_status
Get the status of a submitted job.
| Flag | Description |
|---|---|
| --request-id <id> | The job’s request ID (required) |
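For example, using a request ID returned by a previous submit (the ID below is made up):

```sh
together beta jig job_status --request-id req-1234abcd
```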
jig queue_status
Show queue backlog and worker status.
Secrets
Secrets are encrypted environment variables injected at runtime. Manage them with the secrets subcommand.
jig secrets set
Create or update a secret.
| Flag | Description |
|---|---|
| --name <name> | Secret name (required) |
| --value <value> | Secret value (required) |
| --description <text> | Human-readable description |
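For example, storing a token for use at runtime (the secret name and description are illustrative):

```sh
together beta jig secrets set \
  --name HF_TOKEN \
  --value "$HF_TOKEN" \
  --description "Hugging Face access token"
together beta jig secrets list
```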
jig secrets list
List all secrets for the deployment.
jig secrets unset
Remove a secret.
Volumes
Volumes mount read-only data, such as model weights, into your container without baking them into the image.
jig volumes create
Create a volume and upload files.
| Flag | Description |
|---|---|
| --name <name> | Volume name (required) |
| --source <path> | Local directory to upload (required) |
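For example, uploading a local directory of model weights (the name and path are illustrative):

```sh
together beta jig volumes create --name llama-weights --source ./weights
together beta jig volumes list
```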
jig volumes update
Update a volume with new files.
jig volumes describe
Show volume details and contents.
jig volumes list
List all volumes.
jig volumes delete
Delete a volume.
Configuration Reference
Jig reads configuration from your pyproject.toml file or a standalone jig.toml file. You can also specify a custom config file explicitly with --config, which is handy for keeping one file per environment (e.g., staging_jig.toml, production_jig.toml).
The configuration is split into three sections: build settings, deployment settings, and autoscaling.
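A sketch showing all three sections together (every value here is illustrative, not required):

```toml
[tool.jig.image]
python_version = "3.11"
system_packages = ["ffmpeg"]
cmd = "python app.py"

[tool.jig.deploy]
description = "Example inference service"
gpu_type = "h100-80gb"
gpu_count = 1
memory = 64.0
min_replicas = 0
max_replicas = 4

[tool.jig.autoscaling]
profile = "QueueBacklogPerWorker"
targetValue = "1.05"
```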
The [tool.jig.image] section
The [tool.jig.image] section controls how your container image is built.
python_version
Sets the Python version for the container. Jig uses this to select the appropriate base image.
Default: "3.11"
system_packages
A list of APT packages to install in the container. Useful for libraries that require system dependencies, like FFmpeg for video processing or OpenGL for graphics.
Default: []
environment
Environment variables baked into the image (as ENV directives). These are available during the Docker build, the warmup step, and at runtime. Use this for build configuration like CUDA architecture targets.
For values that should only be injected at runtime, use [tool.jig.deploy.environment_variables] instead. This is useful for values that can change without changing the image.
Default: {}
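Assuming environment is a table of string key/value pairs, a CUDA build target might be pinned like this (the variable name is illustrative):

```toml
[tool.jig.image.environment]
TORCH_CUDA_ARCH_LIST = "9.0"  # visible at build time, during warmup, and at runtime
```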
run
Additional shell commands to run during the Docker build. Each command becomes a separate RUN instruction. Use this for custom installation steps that can’t be expressed as Python dependencies.
Default: []
cmd
The default command to run when the container starts. This becomes the Docker CMD instruction.
Queue-based deployments typically start the worker by passing the --queue flag here.
Default: "python app.py"
copy
A list of files and directories to copy into the container. Paths are relative to your project root.
Default: []
auto_include_git
When enabled, automatically includes all git-tracked files in the container, in addition to files specified in copy. Requires a clean git repository (no uncommitted changes).
Use copy to include additional untracked files.
Default: false
The [tool.jig.deploy] section
The [tool.jig.deploy] section controls how your container runs on Together’s infrastructure.
description
A human-readable description of your deployment. This appears in the Together dashboard and API responses.
Default: ""
gpu_type
The type of GPU to allocate for each replica. Together supports NVIDIA H100 GPUs or CPU-only deployments.
- "h100-80gb": NVIDIA H100 with 80 GB memory (recommended for large models)
- "none": CPU-only deployment
Default: "h100-80gb"
Other hardware is also available by request; please reach out to sales.
gpu_count
The number of GPUs to allocate per replica. For multi-GPU inference with tensor parallelism, set this higher and use use_torchrun=True in your Sprocket. See Multi-GPU / Distributed Inference.
Default: 1
cpu
CPU cores to allocate per replica. Supports fractional values for smaller workloads: 0.1 = 100 millicores, 1 = 1 core, 8 = 8 cores.
Default: 1.0
memory
Memory to allocate per replica, in gigabytes. Supports fractional values. Set this high enough for your model weights plus inference overhead: 0.5 = 512 MB, 8 = 8 GB, 64 = 64 GB.
Default: 8.0
storage
Ephemeral storage to allocate per replica, in gigabytes. This is the disk space available to your container at runtime for temporary files, caches, and model artifacts.
Default: 100
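Putting the per-replica resource knobs together (sizes illustrative for a single-GPU service):

```toml
[tool.jig.deploy]
gpu_type = "h100-80gb"
gpu_count = 1
cpu = 8        # cores
memory = 64.0  # GB: model weights plus inference overhead
storage = 200  # GB of ephemeral disk
```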
min_replicas
The minimum number of replicas to keep running. Set to 0 to allow scaling to zero when idle (saves costs but adds cold-start latency).
Default: 1
max_replicas
The maximum number of replicas the autoscaler can create. Set this based on your expected peak load and budget.
Default: 1
port
The port your container listens on. Sprocket uses port 8000 by default.
Default: 8000
health_check_path
The endpoint Together uses to check whether your container is ready to accept traffic. The endpoint must return a 200 status when healthy.
Default: "/health"
termination_grace_period_seconds
How long to wait for a worker to finish its current job before it is forcefully terminated during shutdown or scale-down. Set this higher for long-running inference jobs.
Default: 300
command
Override the container’s startup command at deploy time. This takes precedence over the cmd setting in [tool.jig.image].
Default: null (uses the image’s CMD)
environment_variables
Runtime environment variables injected into your container. For sensitive values like API keys, use secrets instead.
Default: {}
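Runtime-only values live under the deploy section (the keys below are illustrative); anything sensitive belongs in secrets instead:

```toml
[tool.jig.deploy.environment_variables]
LOG_LEVEL = "info"
MODEL_ID = "my-model"
```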
The [tool.jig.autoscaling] section
The [tool.jig.autoscaling] section controls how your deployment scales based on demand.
profile
The autoscaling strategy to use. Currently, QueueBacklogPerWorker is the recommended profile for queue-based workloads.
When the queue is empty, the deployment scales down to the configured minimum (min_replicas).
targetValue
The target ratio for the autoscaler. This controls how aggressively the system scales:
desired_replicas = queue_depth / targetValue
For example, if there are 100 jobs in the pending or running state, here’s what would happen with each setting:
"1.0"- Exact match, 100 workers."1.05"- 5% underprovisioning, 95 workers (slightly less than needed, recommended)."0.9"- 10% overprovisoning, 105 workers (more than strictly needed, lower latency).