Documentation Index

Fetch the complete documentation index at: https://docs.together.ai/llms.txt

Use this file to discover all available pages before exploring further.

The Together AI API is OpenAI-compatible, so most third-party SDKs work by pointing them at https://api.together.ai/v1 and supplying your Together API key. The integrations below ship dedicated Together support, with first-class clients, helpers, or providers. For agent frameworks (CrewAI, LangGraph, DSPy, PydanticAI, AutoGen, Agno, Composio), see the dedicated pages under Framework integrations.

Hugging Face

Use Together AI as a provider for Hugging Face Inference clients:
Shell
pip install "huggingface_hub>=0.29.0"
Python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="<your_api_key>",  # HF token or Together API key
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=500,
)

print(completion.choices[0].message)
See the Together AI Hugging Face guide for more details.

Vercel AI SDK

The Vercel AI SDK is a TypeScript library for building AI-powered applications. The @ai-sdk/togetherai provider gives you native access to Together AI models.
Shell
npm i ai @ai-sdk/togetherai
TypeScript
import { togetherai } from "@ai-sdk/togetherai";
import { generateText } from "ai";

const { text } = await generateText({
  model: togetherai("moonshotai/Kimi-K2.5"),
  prompt: "Write a vegetarian lasagna recipe for 4 people.",
});

console.log(text);
See the Together AI Vercel AI SDK guide for details on streaming, tool use, and structured outputs.

LangChain

LangChain is a framework for building context-aware, reasoning applications powered by LLMs. The langchain-together package provides chat models and embeddings.
Shell
pip install --upgrade langchain-together
Python
from langchain_together import ChatTogether

chat = ChatTogether(model="meta-llama/Llama-3.3-70B-Instruct-Turbo")

for chunk in chat.stream("Tell me fun things to do in NYC"):
    print(chunk.content, end="", flush=True)
For RAG patterns with LangChain plus Together embeddings, see RAG integrations and the LangChain provider docs.

LlamaIndex

LlamaIndex is a data framework for connecting custom data sources to LLMs. Together AI works with LlamaIndex through the OpenAILike LLM and dedicated embedding classes.
Shell
pip install llama-index llama-index-llms-openai-like
Python
import os
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base="https://api.together.ai/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    is_chat_model=True,
    is_function_calling_model=True,
    temperature=0.1,
)

response = llm.complete("Explain large language models in 500 words.")
print(response)
For RAG patterns, see RAG integrations, the LlamaIndex Together LLM docs, and the LlamaIndex Together embeddings docs.

Helicone

Helicone is an open-source LLM observability platform. Route Together requests through Helicone’s gateway by overriding base_url and adding the auth header.
Python
import os
from together import Together

client = Together(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://together.hconeai.com/v1",
    default_headers={
        "Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}",
    },
)

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?",
        }
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Agent frameworks

Each framework below has a dedicated guide with installation, model selection, and runnable examples:
  • CrewAI: Open-source orchestration for multi-agent workflows.
  • LangGraph: Stateful, multi-actor applications built on LangChain.
  • DSPy: Modular AI systems written in code instead of prompt strings.
  • PydanticAI: Typed agent framework from the Pydantic team.
  • AutoGen (AG2): Conversational multi-agent systems.
  • Agno: Open-source library for multimodal agents.
  • Composio: Tool-use platform for connecting agents to external services.

Vector stores and RAG

For Pinecone, MongoDB, Pixeltable, and other vector-store integrations, see RAG integrations.