Installation
Verify installation
You should see one SKILL.md per installed skill.
Available skills
Once installed, skills activate automatically when the agent detects a relevant task. You can also invoke an individual skill explicitly from your coding agent, but this is rarely necessary: the agent loads the relevant skills on its own when a task calls for them.

| Skill | What it covers |
|---|---|
| together-chat-completions | Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models |
| together-images | Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance |
| together-video | Text-to-video and image-to-video generation, keyframe control, model and dimension selection, async job polling |
| together-audio | Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps) |
| together-embeddings | Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints |
| together-fine-tuning | LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning plus BYOM uploads |
| together-batch-inference | Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings |
| together-evaluations | LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support |
| together-sandboxes | Remote sandboxed Python execution with session reuse, file uploads, and chart outputs |
| together-dedicated-endpoints | Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment |
| together-dedicated-containers | Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API |
| together-gpu-clusters | On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage |
How skills are structured
Each skill is a self-contained directory:

- SKILL.md: high-level routing and rules
- references/: deeper detail such as model tables, full API specs, and data format docs
- scripts/: complete working code
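Put together, a skill directory looks roughly like this (the file names beyond SKILL.md, references/, and scripts/ are illustrative):

```
together-chat-completions/
├── SKILL.md        # routing rules and high-level guidance
├── references/     # model tables, API specs, data formats
└── scripts/        # complete working code examples
```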
Using skills individually
Each skill works on its own for focused tasks. Just describe what you want and the right skill activates, or invoke a particular skill directly with /<skill-name>, such as /together-fine-tuning.
Chat with streaming and tool use:
Uses together-chat-completions to generate correct v2 SDK code with the right model ID, streaming setup, tool definitions, and the complete tool-call loop.
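The tool-call loop itself is plain bookkeeping regardless of SDK version. A minimal sketch with a stubbed assistant turn (the get_weather tool and the call payload are hypothetical; in real code, tool_calls comes from the SDK's chat response, which follows the OpenAI-compatible schema):

```python
import json

# Hypothetical local tool the model may call
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def run_tool_calls(tool_calls, messages):
    """Execute each requested tool and append one 'tool' message per call,
    which is the shape the chat API expects on the next turn."""
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return messages

# Simulated assistant turn containing one tool call
assistant_tool_calls = [{
    "id": "call_0",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}]
messages = [{"role": "user", "content": "Weather in Paris?"}]
messages = run_tool_calls(assistant_tool_calls, messages)
print(messages[-1]["content"])  # -> Sunny in Paris
```

After appending the tool results, you send the full messages list back to the model for its final answer.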
Generate and edit images:
Uses together-images for both the initial generation and the Kontext editing call, handling base64 decoding and file saving.
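The decode-and-save step is the same whichever model you pick. A sketch with a stand-in payload (the base64 field stands in for the image data returned by the API; check the SDK response object for the exact field name):

```python
import base64
import pathlib

# Stand-in for the base64 image field returned by the images API
b64_payload = base64.b64encode(b"\x89PNG\r\n\x1a\n...").decode()

# Decode and write the raw bytes to disk
image_bytes = base64.b64decode(b64_payload)
pathlib.Path("output.png").write_bytes(image_bytes)
```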
Fine-tune a model:
Uses together-fine-tuning for data preparation, upload, training configuration, and monitoring, then hands off to together-dedicated-endpoints for deployment.
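Training data for chat fine-tunes is typically JSONL, one conversation per line. A minimal sketch of the preparation step (the example conversation is made up; consult the skill's data-format reference for the exact schema your job type expects):

```python
import json

# Hypothetical training example in the common conversational format
conversations = [
    {"messages": [
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "4"},
    ]},
]

# JSONL: one JSON object per line, no trailing commas or arrays
with open("train.jsonl", "w") as f:
    for convo in conversations:
        f.write(json.dumps(convo) + "\n")

# Round-trip the first line to confirm it parses cleanly
record = json.loads(open("train.jsonl").readline())
```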
Combining skills for complex workflows
Skills explicitly define hand-off boundaries between products so the agent can chain them together for multi-step workflows. Here are four examples that span multiple skills.

Build a RAG pipeline with evaluation

- together-embeddings: generates dense vectors for your documents and builds a cosine-similarity retriever with reranking
- together-chat-completions: generates answers from the retrieved context using a chat model
- together-evaluations: sets up a score evaluation to grade answer quality with an LLM judge, polls for results, and downloads the per-row scores
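The retrieval step in the workflow above reduces to cosine similarity over embedding vectors. A sketch with tiny hand-made vectors standing in for real embedding-model outputs:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-ins for vectors returned by the embeddings endpoint
doc_vectors = {
    "pricing page": [0.9, 0.1, 0.0],
    "api reference": [0.1, 0.9, 0.2],
}
query_vector = [0.85, 0.15, 0.05]

# Retrieve the highest-similarity document for the query
best = max(doc_vectors, key=lambda d: cosine(query_vector, doc_vectors[d]))
print(best)  # -> pricing page
```

In a real pipeline the top-k retrieved documents would then go to a reranker before being passed as context to the chat model.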
Fine-tune, deploy, and compare a model

- together-fine-tuning: prepares preference pairs, runs SFT first then DPO training, and monitors the job
- together-dedicated-endpoints: deploys the fine-tuned checkpoint to a dedicated endpoint with hardware sizing and autoscaling
- together-evaluations: runs a compare evaluation between the base model and your fine-tuned model, downloads the results
Generate an image and turn it into a video

- together-images: generates the initial image, then edits it with Kontext for studio lighting
- together-video: takes the edited image as a first-frame keyframe, submits an image-to-video job, polls until completion, and downloads the MP4
Batch-process requests and analyze the results

- together-batch-inference: prepares the JSONL input, uploads it, creates the batch job, and polls until the results are ready
- together-sandboxes: uploads the results file to a sandboxed Python session, runs pandas analysis, and generates a matplotlib chart
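Batch jobs and video jobs alike follow the same submit-then-poll pattern. A generic sketch with a stubbed status function (status names, intervals, and the stub are illustrative; real job objects and their status fields come from the SDK):

```python
import time

def poll_until_done(get_status, interval=0.01, timeout=5.0):
    """Poll a job-status callable until it reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")

# Stub that reaches a terminal state on the third check
states = iter(["PENDING", "RUNNING", "COMPLETED"])
final = poll_until_done(lambda: next(states))
print(final)  # -> COMPLETED
```

Once the job reports COMPLETED, you download the results file and hand it to the next step in the chain.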
SDK compatibility
All code generated by these skills targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai).
If you are upgrading from v1, see the migration guide for breaking changes in method names, argument styles, and response shapes.
Resources
- Skills repository on GitHub: source code, full reference docs, and runnable scripts for all 12 skills
- Agent Skills specification: the open standard these skills follow
- Together AI MCP Server: connect your coding agent to the Together AI documentation via MCP
- Together AI Quickstart: get your API key and run your first query
- Together AI Cookbook: end-to-end examples and tutorials