Together AI publishes two complementary tools for coding agents:
- Skills: 12 domain-specific skills that load on demand and teach your agent how to write correct Together AI code (right model IDs, SDK patterns, best practices).
- Docs MCP server: Gives your agent live access to this documentation site so it can look up current information without leaving your editor.
Documentation Index
Fetch the complete documentation index at: https://docs.together.ai/llms.txt
Use this file to discover all available pages before exploring further.
Skills
When your agent detects a relevant task, it automatically loads the right skill. You can also invoke a skill explicitly with `/<skill_name>`.
Install skills
After installation, each skill is a directory containing a SKILL.md (for example, `ls your-project/.claude/skills/together-*/SKILL.md` lists them).
Available skills
| Skill | What it covers |
|---|---|
| together-chat-completions | Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models. |
| together-images | Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance. |
| together-video | Text-to-video and image-to-video generation, keyframe control, model and dimension selection, and async job polling. |
| together-audio | Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps). |
| together-embeddings | Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints. |
| together-fine-tuning | LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning, plus BYOM uploads. |
| together-batch-inference | Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings. |
| together-evaluations | LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support. |
| together-sandboxes | Remote sandboxed Python execution with session reuse, file uploads, and chart outputs. |
| together-dedicated-endpoints | Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment. |
| together-dedicated-containers | Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API. |
| together-gpu-clusters | On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage. |
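As one concrete example from the table, batch inference consumes a JSONL file in which each line is a standalone request. The sketch below assembles such a file; the `custom_id`/`body` field names and the model ID are assumptions for illustration, and the together-batch-inference skill carries the authoritative schema:

```python
import json
import os
import tempfile

# One request per line; field names and model ID are assumptions --
# consult the together-batch-inference skill for the exact schema.
questions = ["What is RAG?", "What is LoRA?"]
requests = [
    {"custom_id": f"req-{i}",
     "body": {"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
              "messages": [{"role": "user", "content": q}]}}
    for i, q in enumerate(questions)
]

path = os.path.join(tempfile.mkdtemp(), "batch_input.jsonl")
with open(path, "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")  # one JSON object per line
```

The resulting file is what you would upload when creating the batch job, then poll until results are ready for download.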
Use a single skill
Each skill works on its own for focused tasks. Describe what you want and the right skill activates, or invoke a specific skill with `/<skill_name>`.
If you prompt your agent with a chat-inference task, it loads together-chat-completions to generate correct SDK code with the right model ID, streaming setup, tool definitions, and the complete tool-call loop.
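The tool-call loop itself follows a standard shape, regardless of SDK. Here is a hedged, SDK-agnostic sketch; the dict shapes are simplified stand-ins for illustration, not the Together SDK's exact response types:

```python
import json

def run_tool_loop(call_model, tools, messages, max_rounds=5):
    """Send messages, execute any tool calls the model requests,
    append the results, and repeat until the model answers in text."""
    for _ in range(max_rounds):
        reply = call_model(messages)          # one chat-completion request
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:                    # plain answer: loop is done
            return reply["content"]
        for call in tool_calls:               # run each requested tool
            result = tools[call["name"]](**json.loads(call["arguments"]))
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": json.dumps(result)})
    raise RuntimeError("model kept requesting tools")
```

In generated code, `call_model` would wrap the SDK's chat-completion call and `tools` would map each declared function name to its implementation.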
Chain skills together
Skills define hand-off boundaries between products, so the agent can chain them for tasks that span multiple Together AI services. A prompt that asks for a full RAG pipeline with evaluation, for example, chains:
- together-embeddings: Generates dense vectors and builds a cosine-similarity retriever with reranking.
- together-chat-completions: Generates answers from the retrieved context.
- together-evaluations: Scores answer quality with an LLM judge and downloads the per-row results.
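The retrieval step in a chain like this reduces to cosine similarity over embedding vectors. A minimal sketch with toy vectors (in real generated code the vectors would come from the embeddings endpoint; the hard-coded ones here are just stand-ins):

```python
from math import sqrt

def cosine(u, v):
    # cosine similarity = dot(u, v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

def top_k(query_vec, doc_vecs, k=2):
    # rank document indices by similarity to the query, best first
    order = sorted(range(len(doc_vecs)),
                   key=lambda i: cosine(query_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]

docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]  # toy "embeddings"
print(top_k([0.9, 0.1], docs, k=2))           # → [0, 1]
```

A reranker would then reorder these top-k candidates before the chat-completion step generates an answer from them.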
SDK compatibility
All generated code targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai). If you’re upgrading from v1, see the Python v2 SDK migration guide.
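Getting onto the targeted SDK versions is a one-line install per language; the version pin below simply mirrors the requirement stated above:

```shell
pip install --upgrade "together>=2.0.0"   # Python v2 SDK
npm install together-ai                   # TypeScript SDK
```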
Docs MCP server
Model Context Protocol (MCP) lets AI coding agents call external tools and pull in external data. The Together AI docs MCP server gives your agent direct access to this documentation site without leaving your editor.
Install
The fastest install is the universal `npx add-mcp` shortcut, which detects your active client and configures the server in one step. The other tabs cover client-specific install commands and manual configuration.
- Universal
- Claude Code
- Cursor
- VS Code
- Codex
- OpenCode
Prompt examples
Once installed, your agent can answer prompts like:
- “Write a script to process data with batch inference.”
- “Build a simple chat app with Together AI’s chat completions API.”
- “Find the best open-source model for frontier coding.”
- “How do I fine-tune a model on my own data?”
Resources
- Skills repository on GitHub: Source code, full reference docs, and runnable scripts for all 12 skills.
- Together AI cookbook: End-to-end examples and tutorials.
- Python v2 SDK migration guide: Breaking changes between the v1 and v2 SDKs.
- Agent Skills specification: The open standard these skills follow.