Documentation Index
Fetch the complete documentation index at: https://docs.together.ai/llms.txt
Use this file to discover all available pages before exploring further.
The Together AI API is OpenAI-compatible, so most third-party SDKs work by pointing them at https://api.together.ai/v1 and supplying your Together API key. The integrations below ship dedicated Together support, with first-class clients, helpers, or providers.
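To illustrate that compatibility, the sketch below assembles the URL, headers, and JSON body that any OpenAI-compatible client ultimately sends to the chat completions endpoint. The helper name build_chat_request is illustrative, the model name is one used elsewhere on this page, and no network request is actually made.

```python
import json
import os

# Base URL that OpenAI-compatible SDKs are pointed at for Together AI.
BASE_URL = "https://api.together.ai/v1"

def build_chat_request(model, messages, **params):
    """Assemble the URL, headers, and JSON body of an OpenAI-style chat call."""
    # Falls back to a placeholder so the sketch runs without credentials.
    api_key = os.environ.get("TOGETHER_API_KEY", "<your_api_key>")
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"model": model, "messages": messages, **params}),
    }

req = build_chat_request(
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    [{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=500,
)
print(req["url"])
```

Any SDK that lets you override its base URL and API key produces a request of this shape, which is why the clients below interoperate with Together AI.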
For agent frameworks (CrewAI, LangGraph, DSPy, PydanticAI, AutoGen, Agno, Composio), see the dedicated pages under Framework integrations.
Hugging Face
Use Together AI as a provider for Hugging Face Inference clients:
pip install "huggingface_hub>=0.29.0"
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="<your_api_key>",  # HF token or Together API key
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=500,
)

print(completion.choices[0].message)
See the Together AI Hugging Face guide for more details.
Vercel AI SDK
The Vercel AI SDK is a TypeScript library for building AI-powered applications. The @ai-sdk/togetherai provider gives you native access to Together AI models.
npm i ai @ai-sdk/togetherai
import { togetherai } from "@ai-sdk/togetherai";
import { generateText } from "ai";

const { text } = await generateText({
  model: togetherai("moonshotai/Kimi-K2.5"),
  prompt: "Write a vegetarian lasagna recipe for 4 people.",
});

console.log(text);
See the Together AI Vercel AI SDK guide for details on streaming, tool use, and structured outputs.
LangChain
LangChain is a framework for building context-aware, reasoning applications powered by LLMs. The langchain-together package provides chat models and embeddings.
pip install --upgrade langchain-together
from langchain_together import ChatTogether

chat = ChatTogether(model="meta-llama/Llama-3.3-70B-Instruct-Turbo")

for chunk in chat.stream("Tell me fun things to do in NYC"):
    print(chunk.content, end="", flush=True)
For RAG patterns with LangChain plus Together embeddings, see RAG integrations and the LangChain provider docs.
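At its core, the retrieval step in those RAG patterns is nearest-neighbor search over embedding vectors. The stdlib sketch below shows that step with toy three-dimensional vectors standing in for real embedding-model output (which has hundreds of dimensions); cosine_sim, retrieve, and the toy corpus are illustrative, not part of any library.

```python
import math

# Toy stand-ins for vectors an embeddings model would return; real
# embeddings are much higher-dimensional, but the retrieval math is the same.
corpus = {
    "Pizza places in Brooklyn": [0.9, 0.1, 0.0],
    "Jazz clubs in Harlem": [0.1, 0.9, 0.1],
    "Hiking trails upstate": [0.0, 0.2, 0.9],
}

def cosine_sim(a, b):
    """Cosine similarity: dot product divided by the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, corpus, k=1):
    """Return the k corpus entries most similar to the query vector."""
    ranked = sorted(
        corpus,
        key=lambda doc: cosine_sim(query_vec, corpus[doc]),
        reverse=True,
    )
    return ranked[:k]

# A query vector close to the music document retrieves it first.
print(retrieve([0.2, 0.8, 0.1], corpus))  # → ['Jazz clubs in Harlem']
```

A vector store replaces the brute-force sort with an approximate index, but the query flow (embed, rank by similarity, return top-k) is the same.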
LlamaIndex
LlamaIndex is a data framework for connecting custom data sources to LLMs. Together AI works with LlamaIndex through the OpenAILike LLM and dedicated embedding classes.
import os

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base="https://api.together.ai/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    is_chat_model=True,
    is_function_calling_model=True,
    temperature=0.1,
)

response = llm.complete("Explain large language models in 500 words.")
print(response)
For RAG patterns, see RAG integrations, the LlamaIndex Together LLM docs, and the LlamaIndex Together embeddings docs.
Helicone
Helicone is an open-source LLM observability platform. Route Together requests through Helicone’s gateway by overriding base_url and adding the auth header.
import os

from together import Together

client = Together(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://together.hconeai.com/v1",
    default_headers={
        "Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}",
    },
)

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "What are some fun things to do in New York?",
        }
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
Agent frameworks
Each framework below has a dedicated guide with installation, model selection, and runnable examples:
- CrewAI: Open-source orchestration for multi-agent workflows.
- LangGraph: Stateful, multi-actor applications built on LangChain.
- DSPy: Modular AI systems written in code instead of prompt strings.
- PydanticAI: Typed agent framework from the Pydantic team.
- AutoGen (AG2): Conversational multi-agent systems.
- Agno: Open-source library for multimodal agents.
- Composio: Tool-use platform for connecting agents to external services.
Vector stores and RAG
For Pinecone, MongoDB, Pixeltable, and other vector-store integrations, see RAG integrations.