Documentation Index

Fetch the complete documentation index at: https://docs.together.ai/llms.txt

Use this file to discover all available pages before exploring further.

The Together AI API is OpenAI-compatible, so most third-party SDKs work by pointing them at https://api.together.ai/v1 and supplying your Together API key. The integrations below ship dedicated Together support, with first-class clients, helpers, or providers. For agent frameworks (CrewAI, LangGraph, DSPy, PydanticAI, AutoGen, Agno, Composio), see the dedicated pages under Framework integrations.

Hugging Face

Use Together AI as a provider for Hugging Face Inference clients:
Shell
pip install "huggingface_hub>=0.29.0"
Python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="<your_api_key>",  # HF token or Together API key
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=500,
)

print(completion.choices[0].message)
See the Together AI Hugging Face guide for more details.

Vercel AI SDK

The Vercel AI SDK is a TypeScript library for building AI-powered applications. The @ai-sdk/togetherai provider gives you native access to Together AI models.
Shell
npm i ai @ai-sdk/togetherai
TypeScript
import { togetherai } from "@ai-sdk/togetherai";
import { generateText } from "ai";

const { text } = await generateText({
  model: togetherai("moonshotai/Kimi-K2.5"),
  prompt: "Write a vegetarian lasagna recipe for 4 people.",
});

console.log(text);
See the Together AI Vercel AI SDK guide for details on streaming, tool use, and structured outputs.

LangChain

LangChain is a framework for building context-aware, reasoning applications powered by LLMs. The langchain-together package provides chat models and embeddings.
Shell
pip install --upgrade langchain-together
Python
from langchain_together import ChatTogether

chat = ChatTogether(model="meta-llama/Llama-3.3-70B-Instruct-Turbo")

for chunk in chat.stream("Tell me fun things to do in NYC"):
    print(chunk.content, end="", flush=True)
For RAG patterns with LangChain plus Together embeddings, see RAG integrations and the LangChain provider docs.

LlamaIndex

LlamaIndex is a data framework for connecting custom data sources to LLMs. Together AI works with LlamaIndex through the OpenAILike LLM and dedicated embedding classes.
Shell
pip install llama-index llama-index-llms-openai-like
Python
import os
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base="https://api.together.ai/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    is_chat_model=True,
    is_function_calling_model=True,
    temperature=0.1,
)

response = llm.complete("Explain large language models in 500 words.")
print(response)
For RAG patterns, see RAG integrations, the LlamaIndex Together LLM docs, and the LlamaIndex Together embeddings docs.

Helicone

Helicone is an open-source LLM observability platform. Route Together requests through Helicone’s gateway by overriding base_url and adding the auth header.
Python
import os
from together import Together

client = Together(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://together.hconeai.com/v1",
    default_headers={
        "Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}",
    },
)

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?",
        }
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Agent frameworks

Each framework below has a dedicated guide with installation, model selection, and runnable examples:
  • CrewAI: Open-source orchestration for multi-agent workflows.
  • LangGraph: Stateful, multi-actor applications built on LangChain.
  • DSPy: Modular AI systems written in code instead of prompt strings.
  • PydanticAI: Typed agent framework from the Pydantic team.
  • AutoGen (AG2): Conversational multi-agent systems.
  • Agno: Open-source library for multimodal agents.
  • Composio: Tool-use platform for connecting agents to external services.

Vector stores and RAG

For Pinecone, MongoDB, Pixeltable, and other vector-store integrations, see RAG integrations.