Together AI Skills are instruction files that give AI coding agents domain-specific knowledge about the Together AI platform. When your agent detects a relevant task, it automatically loads the right skill and uses it to write correct code with proper model IDs, SDK patterns, and best practices, with no manual lookup required. Together AI publishes 12 skills covering the full platform. They work with Claude Code, Cursor, Codex, Gemini CLI, and any other coding agent you might be using.

Installation

npx skills add togethercomputer/skills

Verify installation

ls your-project/.claude/skills/together-*/SKILL.md

You should see one SKILL.md per installed skill.

Available skills

Once installed, skills activate automatically when the agent detects a relevant task. You can also invoke an individual skill explicitly from your coding agent, but this is optional: the agent loads the relevant skills on its own when a task requires them.
together-chat-completions: Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models
together-images: Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance
together-video: Text-to-video and image-to-video generation, keyframe control, model and dimension selection, async job polling
together-audio: Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps)
together-embeddings: Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints
together-fine-tuning: LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning plus BYOM uploads
together-batch-inference: Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings
together-evaluations: LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support
together-sandboxes: Remote sandboxed Python execution with session reuse, file uploads, and chart outputs
together-dedicated-endpoints: Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment
together-dedicated-containers: Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API
together-gpu-clusters: On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage

How skills are structured

Each skill is a self-contained directory:
skills/together-<product>/
├── SKILL.md           # Core instructions (loaded when the skill triggers)
├── references/        # Detailed docs: model lists, API parameters, CLI commands
└── scripts/           # Runnable Python and TypeScript examples
When a skill triggers, the agent first loads SKILL.md for high-level routing and rules. If it needs deeper detail (model tables, full API specs, or data format docs) it pulls from references/. For complete working code, it uses the scripts/ directory.

Using skills individually

Each skill works on its own for focused tasks. Just describe what you want and the right skill activates, or invoke a particular skill explicitly with /<skill-name>, such as /together-fine-tuning.

Chat with streaming and tool use:
> Build a multi-turn chatbot using Together AI with Kimi-K2.5
> that can call a weather API and return structured JSON
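For a prompt like this, the request the agent ends up assembling follows the familiar OpenAI-compatible chat-completions shape. Here is a hedged sketch of that payload as a plain dict; the model ID and the `get_weather` tool schema are illustrative placeholders, not values from Together's docs:

```python
import json

def build_chat_request(user_message, history=None):
    """Assemble an OpenAI-compatible chat-completions payload with
    streaming and a single weather tool. The model ID and tool schema
    below are illustrative, not canonical."""
    messages = (history or []) + [{"role": "user", "content": user_message}]
    return {
        "model": "moonshotai/Kimi-K2.5",  # illustrative model ID; check the models list
        "messages": messages,
        "stream": True,
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_chat_request("What's the weather in Oslo?")
```

In a real multi-turn loop, each assistant tool call is appended to `history` along with a `tool` role message carrying the API result before the next request.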
The agent uses together-chat-completions to generate correct v2 SDK code with the right model ID, streaming setup, tool definitions, and the complete tool call loop.

Generate and edit images:
> Generate a product hero image with FLUX.2, then use Kontext
> to change the background to a rainy cyberpunk alley
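Image responses are typically returned base64-encoded when you ask for inline data rather than a URL, so the generated code needs a small decode-and-save step. A minimal sketch of that helper (the function name is ours, not the SDK's):

```python
import base64

def save_b64_image(b64_data, path):
    """Decode a base64-encoded image payload and write it to disk.
    Returns the number of bytes written."""
    raw = base64.b64decode(b64_data)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)
```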
The agent uses together-images for both the initial generation and the Kontext editing call, handling base64 decoding and file saving.

Fine-tune a model:
> Fine-tune Llama 3.3 70B on my support conversations using LoRA,
> then deploy the result to a dedicated endpoint
The agent uses together-fine-tuning for data preparation, upload, training configuration, and monitoring, then hands off to together-dedicated-endpoints for deployment.
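The data-preparation step usually means serializing conversations into JSONL, one chat-format example per line. A minimal sketch of that serialization, assuming the common `{"messages": [...]}` record shape (verify field names against the fine-tuning data format docs):

```python
import json

def to_jsonl(conversations):
    """Serialize chat-format training examples to JSONL: one JSON
    object per line, each holding a full messages array."""
    return "\n".join(json.dumps({"messages": conv}) for conv in conversations)
```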

Combining skills for complex workflows

Skills explicitly define hand-off boundaries between products, so the agent can chain them together for multi-step workflows. Here are four examples that span multiple skills.

Build a RAG pipeline with evaluation
> Embed my document corpus with Together AI, build a retrieval pipeline
> with reranking, then evaluate the answer quality with an LLM judge
The agent chains three skills:
  1. together-embeddings: generates dense vectors for your documents and builds a cosine-similarity retriever with reranking
  2. together-chat-completions: generates answers from the retrieved context using a chat model
  3. together-evaluations: sets up a score evaluation to grade answer quality with an LLM judge, polls for results, and downloads the per-row scores
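The retrieval core of step 1 reduces to cosine similarity over the embedding vectors. A self-contained sketch of that ranking step, with hand-written vectors standing in for what the embeddings endpoint would return:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k documents most similar to the query;
    a reranker would then reorder this shortlist."""
    scored = sorted(enumerate(doc_vecs),
                    key=lambda p: cosine(query_vec, p[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]
```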
Fine-tune, deploy, and benchmark
> Fine-tune Qwen on my preference data with DPO, deploy the result,
> then compare it against the base model using Together evaluations
The agent chains three skills:
  1. together-fine-tuning: prepares preference pairs, runs SFT first then DPO training, and monitors the job
  2. together-dedicated-endpoints: deploys the fine-tuned checkpoint to a dedicated endpoint with hardware sizing and autoscaling
  3. together-evaluations: runs a compare evaluation between the base model and your fine-tuned model, downloads the results
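Preparing the preference pairs in step 1 again comes down to writing JSONL. The sketch below uses the common prompt/chosen/rejected convention for DPO data; this is an assumption for illustration, not necessarily Together's exact schema, so check the fine-tuning data format reference:

```python
import json

def to_preference_jsonl(pairs):
    """Serialize (prompt, chosen, rejected) DPO preference examples to
    JSONL. Field names follow one common convention and are illustrative."""
    return "\n".join(
        json.dumps({"prompt": p, "chosen": c, "rejected": r})
        for p, c, r in pairs
    )
```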
Generate product media from a single prompt
> Generate a product photo with FLUX.2, edit it with Kontext to add
> studio lighting, then animate the final image into a 5-second video
The agent chains two skills:
  1. together-images: generates the initial image, then edits it with Kontext for studio lighting
  2. together-video: takes the edited image as a first-frame keyframe, submits an image-to-video job, polls until completion, and downloads the MP4
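The async polling in step 2 follows the usual submit-then-poll pattern. A generic sketch with an injected status function (the function and status names are assumptions for illustration, not the SDK's API):

```python
import time

def poll_until_done(get_status, interval=0.0, max_attempts=10):
    """Poll an async job until it reaches a terminal state. get_status
    returns a dict with a 'status' field; 'completed' and 'failed' are
    assumed to be the terminal values."""
    for _ in range(max_attempts):
        job = get_status()
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")
```

In practice `get_status` would wrap the SDK's job-retrieval call, and `interval` would be a few seconds rather than zero.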
Batch-process and analyze results
> Classify 50,000 support tickets overnight with the Batch API,
> then run the results through Together Sandboxes to generate
> a breakdown chart by category
The agent chains two skills:
  1. together-batch-inference: prepares the JSONL input, uploads it, creates the batch job, and polls until the results are ready
  2. together-sandboxes: uploads the results file to a sandboxed Python session, runs pandas analysis, and generates a matplotlib chart
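The JSONL input in step 1 holds one request per line, each tagged with a unique ID so results can be matched back to tickets. A hedged sketch of that builder; the `custom_id`/`body` field names follow the widely used OpenAI-style batch convention and should be checked against the Batch API reference:

```python
import json

def build_batch_lines(prompts, model):
    """Construct JSONL batch-input lines: one request per line, each
    with a unique custom_id. Field names are illustrative."""
    return "\n".join(
        json.dumps({
            "custom_id": f"ticket-{i}",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        })
        for i, prompt in enumerate(prompts)
    )
```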

SDK compatibility

All code generated by these skills targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai). If you are upgrading from v1, see the migration guide for breaking changes in method names, argument styles, and response shapes.

Resources