Examples

Use these examples to learn best practices for using the inference API. A minimal request against the API is sketched after the list below.

  • Together Mixture-Of-Agents (MoA)
  • Building an AI data analyst
  • How to build a Claude Artifacts Clone with Llama 3.1 405B
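Before working through the examples above, it can help to see the smallest possible request. The sketch below assumes the together Python SDK is installed (pip install together) and that TOGETHER_API_KEY is set in your environment; the model name is illustrative and can be swapped for any serverless chat model you have access to.

from together import Together

# The client reads TOGETHER_API_KEY from the environment by default.
client = Together()

# Send a single-turn chat completion request to a serverless chat model.
# The model name below is illustrative; substitute any model available to your account.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[
        {"role": "user", "content": "Explain Mixture-of-Agents in one sentence."},
    ],
)

print(response.choices[0].message.content)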
