# Together AI Docs

## Docs

- [Upload a LoRA Adapter](https://docs.together.ai/docs/adapter-upload.md): Bring Your Own Adapter: Upload your own LoRA adapter and run inference on Together AI
- [Agent Integrations](https://docs.together.ai/docs/agent-integrations.md): Using OSS agent frameworks with Together AI
- [Agno](https://docs.together.ai/docs/agno.md): Using Agno with Together AI
- [LLM Evaluations](https://docs.together.ai/docs/ai-evaluations.md): Learn how to run LLM-as-a-Judge evaluations
- [AI Evaluations UI](https://docs.together.ai/docs/ai-evaluations-ui.md): Guide to using the AI Evaluations UI for model assessment
- [How To Build An AI Search Engine (OSS Perplexity Clone)](https://docs.together.ai/docs/ai-search-engine.md): How to build an AI search engine inspired by Perplexity with Next.js and Together AI
- [How To Build An Interactive AI Tutor With Llama 3.1](https://docs.together.ai/docs/ai-tutor.md): Learn how we built LlamaTutor from scratch – an open source AI tutor with 90k users.
- [API Keys & Authentication](https://docs.together.ai/docs/api-keys-authentication.md): Create, manage, and authenticate with Project-scoped API keys
- [AutoGen(AG2)](https://docs.together.ai/docs/autogen.md): Using AutoGen(AG2) with Together AI
- [Batch](https://docs.together.ai/docs/batch-inference.md): Process jobs asynchronously with the Batch API.
- [Credits](https://docs.together.ai/docs/billing-credits.md): Understanding credits and billing basics on Together AI.
- [Payment Methods & Invoices](https://docs.together.ai/docs/billing-payment-methods.md): Managing payment cards, ACH transfers, viewing invoices, and updating billing details.
- [Billing Troubleshooting](https://docs.together.ai/docs/billing-troubleshooting.md): Resolving payment issues, understanding charges, and managing billing problems.
- [Usage Limits & Analytics](https://docs.together.ai/docs/billing-usage-limits.md): Understanding account tiers, rate limits, model access, and cost analytics on Together AI.
- [Building a RAG Workflow](https://docs.together.ai/docs/building-a-rag-workflow.md): Learn how to build a RAG workflow with Together AI embedding and chat endpoints!
- [Changelog](https://docs.together.ai/docs/changelog.md)
- [Chat](https://docs.together.ai/docs/chat-overview.md): Learn how to query our open-source chat models.
- [Cluster Storage](https://docs.together.ai/docs/cluster-storage.md)
- [Composio](https://docs.together.ai/docs/composio.md): Using Composio With Together AI
- [Conditional Workflow](https://docs.together.ai/docs/conditional-workflows.md): Adapt to different tasks by conditionally navigating to various LLMs and tools.
- [Quickstart](https://docs.together.ai/docs/containers-quickstart.md): Deploy your first container in 20 minutes.
- [Create Tickets In Slack](https://docs.together.ai/docs/create-tickets-in-slack.md): For customers who have a shared Slack channel with us
- [CrewAI](https://docs.together.ai/docs/crewai.md): Using CrewAI with Together
- [Upload a Model](https://docs.together.ai/docs/custom-models.md): Run inference on your custom or fine-tuned models
- [Building An AI Data Analyst](https://docs.together.ai/docs/data-analyst-agent.md): Learn how to use code interpreter to build an AI data analyst with E2B and Together AI.
- [Introduction](https://docs.together.ai/docs/dedicated-container-inference.md): Deploy custom containers on Together's managed GPU infrastructure with automatic scaling, job queues, and built-in observability.
- [Dedicated Endpoints FAQs](https://docs.together.ai/docs/dedicated-endpoints.md)
- [Deploying Dedicated Endpoints](https://docs.together.ai/docs/dedicated-endpoints-ui.md): Guide to creating dedicated endpoints via the web UI.
- [Dedicated Inference](https://docs.together.ai/docs/dedicated-inference.md): Deploy models on your own custom endpoints for improved reliability at scale
- [Dedicated Models](https://docs.together.ai/docs/dedicated-models.md)
- [Image Generation with Flux2](https://docs.together.ai/docs/dedicated_containers_image.md): Deploy a Flux2 image generation model on Together's managed GPU infrastructure using Dedicated Containers.
- [Video Generation with Wan 2.1](https://docs.together.ai/docs/dedicated_containers_video.md): Deploy a multi-GPU video generation model on Together's managed GPU infrastructure using Dedicated Containers.
- [DeepSeek V3.1 QuickStart](https://docs.together.ai/docs/deepseek-3-1-quickstart.md): How to get started with DeepSeek V3.1
- [DeepSeek FAQs](https://docs.together.ai/docs/deepseek-faqs.md)
- [DeepSeek R1 Quickstart](https://docs.together.ai/docs/deepseek-r1.md): How to get the most out of reasoning models like DeepSeek-R1.
- [Deploying a Fine-tuned Model](https://docs.together.ai/docs/deploying-a-fine-tuned-model.md): Once your fine-tune job completes, you should see your new model in [your models dashboard](https://api.together.xyz/models).
- [Deployment Options Overview](https://docs.together.ai/docs/deployment-options.md): Compare Together AI's deployment options: fully-managed cloud service vs. secure VPC deployment for enterprises.
- [Jig CLI](https://docs.together.ai/docs/deployments-jig.md): Build, push, and deploy containers to Together's managed GPU infrastructure.
- [Queue API](https://docs.together.ai/docs/deployments-queue.md): Submit, monitor, and manage asynchronous jobs for your Dedicated Container deployments.
- [Sprocket SDK](https://docs.together.ai/docs/deployments-sprocket.md): A Python SDK for building inference workers that support both synchronous and asynchronous requests via Together's platform.
- [Deprecations](https://docs.together.ai/docs/deprecations.md)
- [DSPy](https://docs.together.ai/docs/dspy.md): Using DSPy with Together AI
- [Embeddings](https://docs.together.ai/docs/embeddings-overview.md): Learn how to get an embedding vector for a given text input.
- [RAG Integrations](https://docs.together.ai/docs/embeddings-rag.md)
- [Error Codes](https://docs.together.ai/docs/error-codes.md): An overview on error status codes, causes, and quick fix solutions
- [Supported Models](https://docs.together.ai/docs/evaluations-supported-models.md): Supported models for Evaluations
- [Fine-tuning BYOM](https://docs.together.ai/docs/fine-tuning-byom.md): Bring Your Own Model: Fine-tune Custom Models from the Hugging Face Hub
- [Data Preparation](https://docs.together.ai/docs/fine-tuning-data-preparation.md): Together Fine-tuning API accepts two data formats for training dataset files: text data and tokenized data (in the form of Parquet files). Below, you can learn about different types of those formats and the scenarios in which they can be most useful.
- [Fine Tuning FAQs](https://docs.together.ai/docs/fine-tuning-faqs.md)
- [Function Calling Fine-tuning](https://docs.together.ai/docs/fine-tuning-function-calling.md): Learn how to fine-tune models with function calling capabilities using Together AI.
- [Supported Models](https://docs.together.ai/docs/fine-tuning-models.md): A list of all the models available for fine-tuning.
- [Pricing](https://docs.together.ai/docs/fine-tuning-pricing.md): Fine-tuning pricing at Together AI is based on the total number of tokens processed during your job.
- [Fine-tuning Guide](https://docs.together.ai/docs/fine-tuning-quickstart.md): Learn the basics and best practices of fine-tuning large language models.
- [Reasoning Fine-tuning](https://docs.together.ai/docs/fine-tuning-reasoning.md): Learn how to fine-tune reasoning models with chain-of-thought data using Together AI.
- [Vision-Language Fine-tuning](https://docs.together.ai/docs/fine-tuning-vlm.md): Learn how to fine-tune Vision-Language Models (VLMs) on image+text data using Together AI.
- [Function Calling](https://docs.together.ai/docs/function-calling.md): Learn how to get LLMs to respond to queries with named functions and structured arguments.
- [GLM-5 Quickstart](https://docs.together.ai/docs/glm-5-quickstart.md): How to get the most out of GLM-5 for reasoning and agentic tasks.
- [OpenAI GPT-OSS Quickstart](https://docs.together.ai/docs/gpt-oss.md): Get started with OpenAI's GPT-OSS, open-source reasoning model duo.
- [API & Integrations](https://docs.together.ai/docs/gpu-clusters-api.md): Manage clusters programmatically with CLI, REST API, Terraform, and third-party tools
- [Billing & Pricing](https://docs.together.ai/docs/gpu-clusters-billing.md): Understand billing, pricing, and lifecycle policies for GPU Clusters
- [Cluster Management](https://docs.together.ai/docs/gpu-clusters-management.md): Manage, scale, and operate your GPU clusters
- [GPU Clusters Overview](https://docs.together.ai/docs/gpu-clusters-overview.md): High-performance GPU clusters for training, fine-tuning, and large-scale AI workloads
- [Quickstart: Create Your First Cluster](https://docs.together.ai/docs/gpu-clusters-quickstart.md): Get started with GPU Clusters in minutes
- [Guides Homepage](https://docs.together.ai/docs/guides.md): Quickstarts and step-by-step guides for building with Together AI.
- [Health Checks and Node Repair](https://docs.together.ai/docs/health-checks.md): Proactively validate GPU node health and trigger repair actions for issues
- [How to build a Lovable clone with Kimi K2](https://docs.together.ai/docs/how-to-build-a-lovable-clone-with-kimi-k2.md): Learn how to build a full-stack Next.js app that can generate React apps with a single prompt.
- [How to Build Coding Agents](https://docs.together.ai/docs/how-to-build-coding-agents.md): How to build your own simple code editing agent from scratch in 400 lines of code!
- [Build a Phone Voice Agent with Together AI](https://docs.together.ai/docs/how-to-build-phone-voice-agent.md): Build a real-time phone voice agent from scratch with Twilio Media Streams, Together AI realtime STT, chat completions, realtime TTS, and local voice activity detection.
- [How to build an AI audio transcription app with Whisper](https://docs.together.ai/docs/how-to-build-real-time-audio-transcription-app.md): Learn how to build a real-time AI audio transcription app with Whisper, Next.js, and Together AI.
- [How To Implement Contextual RAG From Anthropic](https://docs.together.ai/docs/how-to-implement-contextual-rag-from-anthropic.md): An open source line-by-line implementation and explanation of Contextual RAG from Anthropic!
- [How To Improve Search With Rerankers](https://docs.together.ai/docs/how-to-improve-search-with-rerankers.md): Learn how you can improve semantic search quality with reranker models!
- [How to use Cline with DeepSeek V3 to build faster](https://docs.together.ai/docs/how-to-use-cline.md): Use Cline (an AI coding agent) with DeepSeek V3 (a powerful open source model) to code faster.
- [Quickstart: How to Use OpenClaw with Together AI](https://docs.together.ai/docs/how-to-use-openclaw.md): Learn how to pair OpenClaw, a powerful autonomous agent, with frontier OSS models on Together AI like Kimi K2.5 and GLM 4.7.
- [How to use OpenCode with Together AI to build faster](https://docs.together.ai/docs/how-to-use-opencode.md): Learn how to combine OpenCode, a powerful terminal-based AI coding agent, with Together AI models like DeepSeek V3 to supercharge your development workflow.
- [How to use Qwen Code with Together AI for enhanced development workflow](https://docs.together.ai/docs/how-to-use-qwen-code.md): Learn how to configure Qwen Code, a powerful AI-powered command-line workflow tool, with Together AI models to supercharge your coding workflow with advanced code understanding and automation.
- [Together's IAM Model](https://docs.together.ai/docs/identity-access-management.md): How users, credentials, and resources are organized across the Together platform
- [Image Generation](https://docs.together.ai/docs/images-overview.md): Generate high-quality images from text + image prompts.
- [Inference FAQs](https://docs.together.ai/docs/inference-faqs.md)
- [Playground](https://docs.together.ai/docs/inference-web-interface.md): Guide to using Together AI's web playground for interactive AI model inference across chat, image, video, audio, and transcribe models.
- [Integrations](https://docs.together.ai/docs/integrations.md): Use Together AI models through partner integrations.
- [Iterative Workflow](https://docs.together.ai/docs/iterative-workflow.md): Iteratively call LLMs to optimize task performance.
- [Structured Outputs](https://docs.together.ai/docs/json-mode.md): Learn how to use JSON mode to get structured outputs from LLMs like DeepSeek V3 & Llama 3.3.
- [Kimi K2 QuickStart](https://docs.together.ai/docs/kimi-k2-quickstart.md): How to get the most out of models like Kimi K2.
- [Kimi K2 Thinking QuickStart](https://docs.together.ai/docs/kimi-k2-thinking-quickstart.md): How to get the most out of reasoning models like Kimi K2 Thinking.
- [Kimi K2.5 Quickstart](https://docs.together.ai/docs/kimi-k2.5-quickstart.md): How to get the most out of Kimi's new K2.5 model.
- [LangGraph](https://docs.together.ai/docs/langgraph.md): Using LangGraph with Together AI
- [Llama 4 Quickstart](https://docs.together.ai/docs/llama4-quickstart.md): How to get the most out of the new Llama 4 models.
- [Getting Started with Logprobs](https://docs.together.ai/docs/logprobs.md): Learn how to return log probabilities for your output tokens & build better classifiers.
- [LoRA Fine-Tuning and Inference](https://docs.together.ai/docs/lora-training-and-inference.md): Fine-tune and run inference for a model with LoRA adapters
- [Together AI MCP Server](https://docs.together.ai/docs/mcp.md): Install our MCP server in Cursor, Claude Code, or OpenCode in 1 click.
- [Together Mixture Of Agents (MoA)](https://docs.together.ai/docs/mixture-of-agents.md)
- [How to run nanochat on Instant Clusters⚡️](https://docs.together.ai/docs/nanochat-on-instant-clusters.md): Learn how to train Andrej Karpathy's end-to-end ChatGPT clone on Together's on-demand GPU clusters
- [Quickstart: Next.Js](https://docs.together.ai/docs/nextjs-chat-quickstart.md): Build an app that can ask a single question or chat with an LLM using Next.js and Together AI.
- [How To Build An Open Source NotebookLM: PDF To Podcast](https://docs.together.ai/docs/open-notebooklm-pdf-to-podcast.md): In this guide we will see how to create a podcast like the one below from a PDF input!
- [OpenAI Compatibility](https://docs.together.ai/docs/openai-api-compatibility.md): Together's API is compatible with OpenAI's libraries, making it easy to try out our open-source models on existing applications.
- [Organizations](https://docs.together.ai/docs/organizations.md): Create and manage your Together Organization, invite Members, and configure billing
- [Parallel Workflow](https://docs.together.ai/docs/parallel-workflows.md): Execute multiple LLM calls in parallel and aggregate afterwards.
- [Preference Fine-Tuning](https://docs.together.ai/docs/preference-fine-tuning.md): Learn how to use preference fine-tuning on Together Fine-Tuning Platform
- [Projects](https://docs.together.ai/docs/projects.md): Create isolated workspaces to organize resources, manage team access, and scope API keys
- [Prompting DeepSeek R1](https://docs.together.ai/docs/prompting-deepseek-r1.md): Prompt engineering for DeepSeek-R1.
- [PydanticAI](https://docs.together.ai/docs/pydanticai.md): Using PydanticAI with Together
- [Python v2 SDK Migration Guide](https://docs.together.ai/docs/pythonv2-migration-guide.md): Migrate from Together Python v1 to v2 - the new Together AI Python SDK with improved type safety and modern architecture.
- [Quickstart](https://docs.together.ai/docs/quickstart.md): Get up to speed with our API in one minute.
- [Quickstart: FLUX.2](https://docs.together.ai/docs/quickstart-flux.md): Learn how to use FLUX.2, the next generation image model with advanced prompting capabilities
- [Quickstart: Flux Kontext](https://docs.together.ai/docs/quickstart-flux-kontext.md): Learn how to use Flux's new in-context image generation models
- [Quickstart: Flux LoRA Inference](https://docs.together.ai/docs/quickstart-flux-lora.md)
- [Quickstart: How to do OCR](https://docs.together.ai/docs/quickstart-how-to-do-ocr.md): A step by step guide on how to do OCR with Together AI's vision models with structured outputs
- [Quickstart: Retrieval Augmented Generation (RAG)](https://docs.together.ai/docs/quickstart-retrieval-augmented-generation-rag.md): How to build a RAG workflow in under 5 mins!
- [Quickstart: Using Hugging Face Inference With Together](https://docs.together.ai/docs/quickstart-using-hugging-face-inference.md): This guide will walk you through how to use Together models with Hugging Face Inference.
- [Inference Rate Limits](https://docs.together.ai/docs/rate-limits.md): Rate limits restrict how often a user or client can access our API within a set timeframe.
- [Reasoning Models Guide](https://docs.together.ai/docs/reasoning-models-guide.md): How reasoning models like DeepSeek-R1 work.
- [Reasoning](https://docs.together.ai/docs/reasoning-overview.md): Learn how to use reasoning models that think step-by-step before answering.
- [Recommended Models](https://docs.together.ai/docs/recommended-models.md): Find the right models for your use case
- [Rerank](https://docs.together.ai/docs/rerank-overview.md): Learn how to improve the relevance of your search and RAG systems with reranking.
- [Roles & Permissions (RBAC)](https://docs.together.ai/docs/roles-permissions.md): Understand Organization and Project role-based access control (RBAC) including Admin and Member roles and what each can do across the Together platform
- [Sequential Workflow](https://docs.together.ai/docs/sequential-agent-workflow.md): Coordinating a chain of LLM calls to solve a complex task.
- [Serverless Models](https://docs.together.ai/docs/serverless-models.md)
- [Slurm Management System](https://docs.together.ai/docs/slurm.md)
- [Slurm Configuration](https://docs.together.ai/docs/slurm-configuration.md): Customize Slurm cluster settings to match your workload requirements
- [Speech-to-Text](https://docs.together.ai/docs/speech-to-text.md): Learn how to transcribe and translate audio into text!
- [Single Sign-On (SSO)](https://docs.together.ai/docs/sso.md): Connect your Identity Provider for secure, automated team access to Together
- [Customer Ticket Portal](https://docs.together.ai/docs/support-ticket-portal.md)
- [Text-to-Speech](https://docs.together.ai/docs/text-to-speech.md): Learn how to use the text-to-speech functionality supported by Together AI.
- [Together Code Interpreter](https://docs.together.ai/docs/together-code-interpreter.md): Execute LLM-generated code seamlessly with a simple API call.
- [Together Code Sandbox](https://docs.together.ai/docs/together-code-sandbox.md): Level-up generative code tooling with fast, secure code sandboxes at scale
- [Platform Overview](https://docs.together.ai/docs/together-deployments.md): Architecture, deployment lifecycle, and core concepts for Dedicated Container Inference.
- [Quickstart: Using Mastra with Together AI](https://docs.together.ai/docs/using-together-with-mastra.md): This guide will walk you through how to use Together models with Mastra.
- [Quickstart: Using Vercel AI SDK With Together AI](https://docs.together.ai/docs/using-together-with-vercels-ai-sdk.md): This guide will walk you through how to use Together models with the Vercel AI SDK.
- [Video Generation](https://docs.together.ai/docs/videos-overview.md): Generate high-quality videos from text and image prompts.
- [Vision LLMs](https://docs.together.ai/docs/vision-overview.md): Learn how to use the vision models supported by Together AI.
- [Agent Workflows](https://docs.together.ai/docs/workflows.md): Orchestrating together multiple language model calls to solve complex tasks.
- [Together Cookbooks & Example Apps](https://docs.together.ai/examples.md): Explore our vast library of open-source cookbooks & example apps
- [How to build a real-time image generator with Flux and Together AI](https://docs.together.ai/external-link-02.md)
- [Overview](https://docs.together.ai/intro.md): Welcome to Together AI’s docs! Together makes it easy to run, finetune, and train open source AI models with transparency and privacy.
- [Python Library](https://docs.together.ai/python-library.md)
- [Create Audio Generation Request](https://docs.together.ai/reference/audio-speech.md): Generate audio from input text
- [Create realtime text-to-speech](https://docs.together.ai/reference/audio-speech-websocket.md): Establishes a WebSocket connection for real-time text-to-speech generation. This endpoint uses WebSocket protocol (wss://api.together.ai/v1/audio/speech/websocket) for bidirectional streaming communication.
- [Create an Audio Transcription](https://docs.together.ai/reference/audio-transcriptions.md): Transcribes audio into text
- [Create a realtime audio transcription](https://docs.together.ai/reference/audio-transcriptions-realtime.md): Establishes a WebSocket connection for real-time audio transcription. This endpoint uses WebSocket protocol (wss://api.together.ai/v1/realtime) for bidirectional streaming communication.
- [Create an Audio Translation](https://docs.together.ai/reference/audio-translations.md): Translates audio into English
- [Authentication](https://docs.together.ai/reference/authentication.md)
- [Cancel a batch job](https://docs.together.ai/reference/batch-cancel.md): Cancel a batch job by ID
- [Create a batch job](https://docs.together.ai/reference/batch-create.md): Create a new batch job with the given input file and endpoint
- [Get a batch job](https://docs.together.ai/reference/batch-get.md): Get details of a batch job by ID
- [List all batch jobs](https://docs.together.ai/reference/batch-list.md): List all batch jobs for the authenticated user
- [Create Chat Completion](https://docs.together.ai/reference/chat-completions.md): Generate a model response for a given chat conversation. Supports single queries and multi-turn conversations with system, user, and assistant messages.
- [Introduction](https://docs.together.ai/reference/cli/beta-intro.md): Documentation for using beta features with the Together Python SDK/CLI.
- [Clusters](https://docs.together.ai/reference/cli/clusters.md)
- [Endpoints](https://docs.together.ai/reference/cli/endpoints.md): Create, update and delete endpoints via the CLI
- [Evals](https://docs.together.ai/reference/cli/evals.md): Manage model evaluation jobs
- [Files](https://docs.together.ai/reference/cli/files.md)
- [Fine Tuning](https://docs.together.ai/reference/cli/finetune.md)
- [Getting Started](https://docs.together.ai/reference/cli/getting-started.md): Get started with Together's Python CLI (`together`).
- [Containers (Jig)](https://docs.together.ai/reference/cli/jig-redirect-stub.md): CLI commands and configuration for Dedicated Containers.
- [Models](https://docs.together.ai/reference/cli/models.md)
- [Create a Cluster](https://docs.together.ai/reference/clusters-create.md): Create an Instant Cluster on Together's high-performance GPU clusters. With features like on-demand scaling, long-lived resizable high-bandwidth shared DC-local storage, Kubernetes and Slurm cluster flavors, a REST API, and Terraform support, you can run workloads flexibly without complex infrastruc…
- [Delete a Cluster](https://docs.together.ai/reference/clusters-delete.md): Delete a GPU cluster by cluster ID.
- [Retrieve Cluster](https://docs.together.ai/reference/clusters-get.md): Retrieve information about a specific GPU cluster.
- [List all Clusters](https://docs.together.ai/reference/clusters-list.md): List all GPU clusters.
- [List compute region capabilities](https://docs.together.ai/reference/clusters-list-regions.md)
- [Update or Scale GPU Cluster](https://docs.together.ai/reference/clusters-update.md): Update the configuration of an existing GPU cluster.
- [Create a shared volume](https://docs.together.ai/reference/clusters_storages-create.md): Instant Clusters supports long-lived, resizable in-DC shared storage with user data persistence. You can dynamically create and attach volumes to your cluster at cluster creation time, and resize as your data grows. All shared storage is backed by multi-NIC bare metal paths, ensuring high-throughput…
- [Delete a shared volume](https://docs.together.ai/reference/clusters_storages-delete.md): Delete a shared volume. Note that if this volume is attached to a cluster, deleting will fail.
- [Retrieve a shared volume](https://docs.together.ai/reference/clusters_storages-get.md): Retrieve information about a specific shared volume.
- [List shared volumes](https://docs.together.ai/reference/clusters_storages-list.md): List all shared volumes.
- [Update a shared volume](https://docs.together.ai/reference/clusters_storages-update.md): Update the configuration of an existing shared volume.
- [Create Completion](https://docs.together.ai/reference/completions.md): Generate text completions for a given prompt using a language, code, or image model.
- [Create Evaluation](https://docs.together.ai/reference/create-evaluation.md)
- [Create Video](https://docs.together.ai/reference/create-videos.md): Create a video
- [Create A Dedicated Endpoint](https://docs.together.ai/reference/createendpoint.md): Creates a new dedicated endpoint for serving models. The endpoint will automatically start after creation. You can deploy any supported model on hardware configurations that meet the model's requirements.
- [Jig CLI](https://docs.together.ai/reference/dci-reference-jig.md): CLI commands, pyproject.toml configuration, environment variables, and Python SDK for Dedicated Containers.
- [Sprocket SDK](https://docs.together.ai/reference/dci-reference-sprocket.md): API reference for Sprocket classes, functions, and configuration.
- [Delete A File](https://docs.together.ai/reference/delete-files-id.md): Delete a previously uploaded data file.
- [Delete A Fine-tuning Event](https://docs.together.ai/reference/delete-fine-tunes-id.md): Delete a fine-tuning job.
- [Delete Endpoint](https://docs.together.ai/reference/deleteendpoint.md): Permanently deletes an endpoint. This action cannot be undone.
- [Create Deployment](https://docs.together.ai/reference/deployments-create.md): Create a new deployment with specified configuration
- [Delete Deployment](https://docs.together.ai/reference/deployments-delete.md): Delete an existing deployment
- [Get Deployment](https://docs.together.ai/reference/deployments-get.md): Retrieve details of a specific deployment by its ID or name
- [List Deployments](https://docs.together.ai/reference/deployments-list.md): Get a list of all deployments in your project
- [Get Deployment Logs](https://docs.together.ai/reference/deployments-logs.md): Retrieve logs from a deployment, optionally filtered by replica ID.
- [Create Secret](https://docs.together.ai/reference/deployments-secrets-create.md): Create a new secret to store sensitive configuration values
- [Delete Secret](https://docs.together.ai/reference/deployments-secrets-delete.md): Delete an existing secret
- [Get Secret](https://docs.together.ai/reference/deployments-secrets-get.md): Retrieve details of a specific secret by its ID or name
- [List Secrets](https://docs.together.ai/reference/deployments-secrets-list.md): Retrieve all secrets in your project
- [Update Secret](https://docs.together.ai/reference/deployments-secrets-update.md): Update an existing secret's value or metadata
- [Get Storage File](https://docs.together.ai/reference/deployments-storage-get.md): Download a file by redirecting to a signed URL
- [Create Storage Volume](https://docs.together.ai/reference/deployments-storage-volumes-create.md): Create a new volume to preload files in deployments
- [Delete Storage Volume](https://docs.together.ai/reference/deployments-storage-volumes-delete.md): Delete an existing volume
- [Get Storage Volume](https://docs.together.ai/reference/deployments-storage-volumes-get.md): Retrieve details of a specific volume by its ID or name
- [List Storage Volumes](https://docs.together.ai/reference/deployments-storage-volumes-list.md): Retrieve all volumes in your project
- [Update Storage Volume](https://docs.together.ai/reference/deployments-storage-volumes-update.md): Update an existing volume's configuration or contents
- [Update Deployment](https://docs.together.ai/reference/deployments-update.md): Update an existing deployment configuration
- [Create Embedding](https://docs.together.ai/reference/embeddings.md): Generate vector embeddings for one or more text inputs. Returns numerical arrays representing semantic meaning, useful for search, classification, and retrieval.
- [Get Evaluation](https://docs.together.ai/reference/get-evaluation.md)
- [Get Evaluation Status](https://docs.together.ai/reference/get-evaluation-status.md)
- [List All Files](https://docs.together.ai/reference/get-files.md): List the metadata for all uploaded data files.
- [List File](https://docs.together.ai/reference/get-files-id.md): Retrieve the metadata for a single uploaded data file.
- [Get File Contents](https://docs.together.ai/reference/get-files-id-content.md): Get the contents of a single uploaded data file.
- [List All Jobs](https://docs.together.ai/reference/get-fine-tunes.md): List the metadata for all fine-tuning jobs. Returns a list of FinetuneResponseTruncated objects.
- [List Job](https://docs.together.ai/reference/get-fine-tunes-id.md): List the metadata for a single fine-tuning job.
- [List checkpoints](https://docs.together.ai/reference/get-fine-tunes-id-checkpoint.md): List the checkpoints for a single fine-tuning job.
- [List Job Events](https://docs.together.ai/reference/get-fine-tunes-id-events.md): List the events for a single fine-tuning job.
- [Download Model](https://docs.together.ai/reference/get-finetune-download.md): Receive a compressed fine-tuned model or checkpoint.
- [Get Video](https://docs.together.ai/reference/get-videos-id.md): Fetch video metadata
- [Get Endpoint By ID](https://docs.together.ai/reference/getendpoint.md): Retrieves details about a specific endpoint, including its current state, configuration, and scaling settings.
- [List Evaluation Models](https://docs.together.ai/reference/list-evaluation-models.md)
- [List All Evaluations](https://docs.together.ai/reference/list-evaluations.md)
- [List All Endpoints](https://docs.together.ai/reference/listendpoints.md): Returns a list of all endpoints associated with your account. You can filter the results by type (dedicated or serverless).
- [List Available Hardware Configurations](https://docs.together.ai/reference/listhardware.md): Returns a list of available hardware configurations for deploying models. When a model parameter is provided, it returns only hardware configurations compatible with that model, including their current availability status.
- [List All Models](https://docs.together.ai/reference/models.md): Lists all of Together's open-source models
- [Create Job](https://docs.together.ai/reference/post-fine-tunes.md): Create a fine-tuning job with the provided model and training data.
- [Cancel Job](https://docs.together.ai/reference/post-fine-tunes-id-cancel.md): Cancel a currently running fine-tuning job. Returns a FinetuneResponseTruncated object.
- [Create Image](https://docs.together.ai/reference/post-images-generations.md): Use an image model to generate an image for a given prompt.
- [Cancel Queue Job](https://docs.together.ai/reference/queue-cancel.md): Cancel a pending job. Only jobs in pending status can be canceled. Running jobs cannot be stopped. Returns the job status after the attempt. If the job is not pending, returns 409 with the current status unchanged.
- [Get Queue Metrics](https://docs.together.ai/reference/queue-metrics.md): Get the current queue statistics for a model, including pending and running job counts.
- [Get Queue Status](https://docs.together.ai/reference/queue-status.md): Poll the current status of a previously submitted job. Provide the request_id and model as query parameters.
- [Submit Queue Job](https://docs.together.ai/reference/queue-submit.md): Submit a new job to the queue for asynchronous processing. Jobs are processed in strict priority order (higher priority first, FIFO within the same priority). Returns a request ID that can be used to poll status or cancel the job.
- [Create A Rerank Request](https://docs.together.ai/reference/rerank.md): Rerank a list of documents by relevance to a query. Returns a relevance score and ordering index for each document.
- [/tci/execute](https://docs.together.ai/reference/tci-execute.md): Executes the given code snippet and returns the output. Without a session_id, a new session will be created to run the code. If you do pass in a valid session_id, the code will be run in that session. This is useful for running multiple code snippets in the same environment, because dependencies and…
- [/tci/sessions](https://docs.together.ai/reference/tci-sessions.md): Lists all your currently active sessions.
- [Update, Start or Stop Endpoint](https://docs.together.ai/reference/updateendpoint.md): Updates an existing endpoint's configuration. You can modify the display name, autoscaling settings, or change the endpoint's state (start/stop).
- [Upload a file](https://docs.together.ai/reference/upload-file.md): Upload a file with specified purpose, file name, and file type.
- [Upload a custom model or adapter](https://docs.together.ai/reference/upload-model.md): Upload a custom model or adapter from Hugging Face or S3
- [TypeScript Library](https://docs.together.ai/typescript-library.md)

## OpenAPI Specs

- [openapi](https://docs.together.ai/openapi.yaml)
- [tcloud](https://docs.together.ai/tcloud.yaml)
- [deprecated-spec](https://docs.together.ai/deprecated-spec.json)