
Documentation Index

Fetch the complete documentation index at: https://docs.together.ai/llms.txt

Use this file to discover all available pages before exploring further.

Together AI publishes two complementary tools for coding agents:
  • Skills: 12 domain-specific skills that load on demand and teach your agent how to write correct Together AI code (right model IDs, SDK patterns, best practices).
  • Docs MCP server: Gives your agent live access to this documentation site so it can look up current information without leaving your editor.
Install both for the best experience: skills for code generation and MCP for documentation lookup.

Skills

When your agent detects a relevant task, it automatically loads the right skill. You can also call a skill explicitly with /<skill_name>.

Install skills

npx skills add togethercomputer/skills
To verify the install, you should see one SKILL.md per installed skill (for example, ls your-project/.claude/skills/together-*/SKILL.md).

Available skills

  • together-chat-completions: Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models.
  • together-images: Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance.
  • together-video: Text-to-video and image-to-video generation, keyframe control, model and dimension selection, and async job polling.
  • together-audio: Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps).
  • together-embeddings: Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints.
  • together-fine-tuning: LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning, plus BYOM uploads.
  • together-batch-inference: Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings.
  • together-evaluations: LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support.
  • together-sandboxes: Remote sandboxed Python execution with session reuse, file uploads, and chart outputs.
  • together-dedicated-endpoints: Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment.
  • together-dedicated-containers: Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API.
  • together-gpu-clusters: On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage.

Use a single skill

Each skill works on its own for focused tasks. Describe what you want and the right skill activates, or invoke a specific skill with /<skill_name>. For example, if you prompt your agent with:

  "Build a multi-turn chatbot using Together AI with Kimi-K2.5 that can call a weather API and return structured JSON."

the agent uses together-chat-completions to generate correct SDK code with the right model ID, streaming setup, tool definitions, and the complete tool-call loop.
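The heart of what the skill teaches, the tool-call loop, can be sketched in a few lines. This is an illustration rather than the skill's actual output: the get_weather tool is a stand-in, and the client is assumed to expose the OpenAI-style chat.completions.create interface the Together Python SDK follows.

```python
import json

def get_weather(city: str) -> str:
    # Illustrative stand-in for a real weather API call.
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def run_tool_loop(client, model, messages, tools, max_rounds=5):
    """Call the model, run any tools it requests, feed the results
    back, and repeat until the model returns a plain text answer."""
    for _ in range(max_rounds):
        resp = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        msg = resp.choices[0].message
        if not getattr(msg, "tool_calls", None):
            return msg.content  # final answer, no more tool requests
        # In real code, serialize the tool calls back into plain dicts
        # before appending them to the conversation.
        messages.append({"role": "assistant", "tool_calls": msg.tool_calls})
        for call in msg.tool_calls:
            fn = TOOLS[call.function.name]
            result = fn(**json.loads(call.function.arguments))
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    raise RuntimeError("tool loop did not finish")
```

The loop terminates when a response carries no tool calls; bounding it with max_rounds guards against a model that keeps requesting tools.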

Chain skills together

Skills define hand-off boundaries between products, so the agent can chain them for tasks that span multiple Together AI services. For example, if you prompt your agent with:

  "Embed my document corpus with Together AI, build a retrieval pipeline with reranking, then evaluate the answer quality with an LLM judge."

the agent chains three skills:
  1. together-embeddings: Generates dense vectors and builds a cosine-similarity retriever with reranking.
  2. together-chat-completions: Generates answers from the retrieved context.
  3. together-evaluations: Scores answer quality with an LLM judge and downloads the per-row results.
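The retriever in step 1 boils down to embedding the corpus and ranking by cosine similarity. A minimal sketch, with the embedding call abstracted behind an embed function (in a real pipeline this would call the Together embeddings endpoint, and a reranking pass would follow):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def build_index(docs, embed):
    # embed(text) -> list[float]; in a real pipeline this calls the
    # Together embeddings endpoint for each document.
    return [(doc, embed(doc)) for doc in docs]

def retrieve(query, index, embed, top_k=3):
    # Rank all documents against the query vector; a reranker would
    # then reorder these top_k candidates.
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

Exhaustive ranking like this is fine for small corpora; at scale the same interface sits in front of an approximate-nearest-neighbor index.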
See the skills repository for more workflow examples.

SDK compatibility

All generated code targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai). If you’re upgrading from v1, see the Python v2 SDK migration guide.

Docs MCP server

Model Context Protocol (MCP) lets AI coding agents call external tools and pull in external data. The Together AI docs MCP server gives your agent direct access to this documentation site without leaving your editor.

Install

The fastest install is the universal npx add-mcp shortcut, which detects your active client and configures the server in one step:
npx add-mcp https://docs.together.ai/mcp
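For clients configured through a JSON file rather than the shortcut, a remote-server entry typically follows the common mcpServers convention. The exact key names vary by client, so treat this shape as an assumption rather than a documented format:

```json
{
  "mcpServers": {
    "together-docs": {
      "url": "https://docs.together.ai/mcp"
    }
  }
}
```

Check your client's MCP documentation for whether remote servers need an explicit transport field alongside the URL.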

Prompt examples

Once installed, your agent can answer prompts like:
  • “Write a script to process data with batch inference.”
  • “Build a simple chat app with Together AI’s chat completions API.”
  • “Find the best open-source model for frontier coding.”
  • “How do I fine-tune a model on my own data?”
The MCP server provides tools to search and retrieve documentation content, so your agent gets accurate answers without leaving your coding environment.

Resources