You can use Together AI models with Hugging Face Inference. Install the huggingface_hub library:
pip install "huggingface_hub>=0.29.0"
Chat completion with the Hugging Face Hub library
from huggingface_hub import InferenceClient

# Initialize the InferenceClient with Together as the provider
client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # Replace with your API key (HF or custom)
)

# Define the chat messages
messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

# Generate a chat completion
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
)

# Print the response
print(completion.choices[0].message)
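If you want tokens as they are generated, the same client also supports streaming through its OpenAI-compatible interface. A minimal sketch, reusing the client and messages from above:
Python
# Stream the response chunk by chunk instead of waiting for the full completion
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)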
The Vercel AI SDK is a powerful TypeScript library designed to help developers build AI-powered applications. Install both the Vercel AI SDK and OpenAI’s Vercel package.
Shell
npm i ai @ai-sdk/openai
Instantiate the Together client and call the generateText function with Llama 3.1 8B to generate some text.
TypeScript
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const together = createOpenAI({
  apiKey: process.env.TOGETHER_API_KEY ?? "",
  baseURL: "https://api.together.xyz/v1",
});

async function main() {
  const { text } = await generateText({
    model: together("meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"),
    prompt: "Write a vegetarian lasagna recipe for 4 people.",
  });
  console.log(text);
}

main();
LangChain is a framework for developing context-aware, reasoning applications powered by language models. To install the LangChain x Together library, run:
Shell
pip install --upgrade langchain-together
Here’s sample code to get you started with LangChain + Together AI:
Python
from langchain_together import ChatTogether

chat = ChatTogether(model="meta-llama/Llama-3-70b-chat-hf")

for m in chat.stream("Tell me fun things to do in NYC"):
    print(m.content, end="", flush=True)
See this tutorial blog for the RAG implementation details using Together and LangChain.
LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models (LLMs). Install llama-index:
Shell
pip install llama-index
Here’s sample code to get you started with LlamaIndex + Together AI:
Python
from llama_index.llms import OpenAILike

llm = OpenAILike(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    api_base="https://api.together.xyz/v1",
    api_key="TOGETHER_API_KEY",  # Replace with your Together API key
    is_chat_model=True,
    is_function_calling_model=True,
    temperature=0.1,
)

response = llm.complete("Write an essay of up to 500 words explaining Large Language Models")
print(response)
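For multi-turn conversations, OpenAILike also exposes a chat interface. A minimal sketch, assuming the same legacy llama_index.llms import path used above:
Python
from llama_index.llms import ChatMessage

# Send a chat-style conversation instead of a single completion
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What are Large Language Models?"),
]
chat_response = llm.chat(messages)
print(chat_response)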
See this tutorial blog for the RAG implementation details using Together and LlamaIndex.
CrewAI is an open source framework for orchestrating AI agent systems. Install crewai and set your Together AI API key:
Shell
pip install crewai
export TOGETHER_API_KEY=***
Build a multi-agent workflow:
Python
import os

from crewai import LLM, Task, Agent, Crew

llm = LLM(
    model="together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_key=os.environ.get("TOGETHER_API_KEY"),
    base_url="https://api.together.xyz/v1",
)

research_agent = Agent(
    llm=llm,
    role="Research Analyst",
    goal="Find and summarize information about specific topics",
    backstory="You are an experienced researcher with attention to detail",
    verbose=True,  # Enable logging for debugging
)

research_task = Task(
    description="Conduct a thorough research about AI Agents.",
    expected_output="A list with 10 bullet points of the most relevant information about AI Agents",
    agent=research_agent,
)

# Execute the crew
crew = Crew(
    agents=[research_agent],
    tasks=[research_task],
    verbose=True,
)
result = crew.kickoff()

# Accessing the task output
task_output = research_task.output
print(task_output)
LangChain’s tool calling also works with Together AI models. Define a tool, bind it to the model, and inspect the resulting tool calls:
Python
import os

from langchain_together import ChatTogether

llm = ChatTogether(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_key=os.getenv("TOGETHER_API_KEY"),
)

# Define a tool
def multiply(a: int, b: int) -> int:
    return a * b

# Augment the LLM with tools
llm_with_tools = llm.bind_tools([multiply])

# Invoke the LLM with input that triggers the tool call
msg = llm_with_tools.invoke("What is 2 times 3?")

# Get the tool calls
print(msg.tool_calls)
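Each entry in msg.tool_calls is a dict with name, args, and id keys. A minimal sketch of dispatching the call back to the multiply function defined above:
Python
# Execute each requested tool call and print the result
for tool_call in msg.tool_calls:
    if tool_call["name"] == "multiply":
        result = multiply(**tool_call["args"])
        print(f"multiply({tool_call['args']}) = {result}")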
Arcade is a platform that lets AI securely use tools like email, files, and APIs to take real action, not just chat. Build powerful assistants in minutes with ready-to-use integrations or a custom SDK. Our guide demonstrates how to integrate Together AI’s language models with Arcade’s tools to create an AI agent that can send emails. Prerequisites:
# Install the required packages
!pip install -qU together arcadepy
Gmail Configuration:
import os

from arcadepy import Arcade
from together import Together

# Set environment variables
os.environ["TOGETHER_API_KEY"] = "XXXXXXXXXXXXX"  # Replace with your actual Together API key
os.environ["ARCADE_API_KEY"] = "arc_XXXXXXXXXXX"  # Replace with your actual Arcade API key

# Initialize clients
together_client = Together(api_key=os.getenv("TOGETHER_API_KEY"))
arcade_client = Arcade()  # Automatically finds the ARCADE_API_KEY env variable

# Set up user ID (your email)
USER_ID = "[email protected]"  # Change this to your email

# Authorize Gmail access
auth_response = arcade_client.tools.authorize(
    tool_name="Google.SendEmail",
    user_id=USER_ID,
)

if auth_response.status != "completed":
    print(f"Click this link to authorize: {auth_response.url}")
    # Wait for the authorization to complete
    arcade_client.auth.wait_for_completion(auth_response)

print("Authorization completed!")
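Once authorization completes, the agent can execute the tool. A minimal sketch using Arcade’s tools.execute call; the exact input field names (recipient, subject, body) are assumptions and should be verified against Arcade’s Google.SendEmail documentation:
Python
# Execute the Gmail tool through Arcade (input field names are assumptions; verify in Arcade's docs)
email_response = arcade_client.tools.execute(
    tool_name="Google.SendEmail",
    input={
        "recipient": USER_ID,
        "subject": "Hello from Together AI + Arcade",
        "body": "This email was sent by an AI agent.",
    },
    user_id=USER_ID,
)
print(email_response)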
DSPy is a framework that enables you to build modular AI systems with code instead of hand-crafted prompting. Install dspy and set your Together AI API key:
Shell
pip install -U dspy
export TOGETHER_API_KEY=***
Build a question-answering agent:
Python
import os

import dspy

# Configure dspy with an LLM from Together AI
lm = dspy.LM(
    'together_ai/togethercomputer/llama-2-70b-chat',
    api_key=os.environ.get("TOGETHER_API_KEY"),
    api_base="https://api.together.xyz/v1",
)

# Configure dspy to use the LLM
dspy.configure(lm=lm)

# Gives the agent access to a Python interpreter
def evaluate_math(expression: str):
    return dspy.PythonInterpreter({}).execute(expression)

# Gives the agent access to a Wikipedia search tool
def search_wikipedia(query: str):
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

# Set up a ReAct module with a question -> float answer signature
react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])

pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")
print(pred.answer)
Learn more in our DSPy Guide, including code and a notebook.
AG2 (formerly AutoGen) is an open-source framework for building and orchestrating AI agents. Install autogen and set your Together AI API key:
Shell
pip install autogen
export TOGETHER_API_KEY=***
Build a coding agent:
Python
import os
from pathlib import Path

from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import LocalCommandLineCodeExecutor

config_list = [
    {
        # Let's choose the Mixtral 8x7B model
        "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        # Provide your Together AI API key here or put it into the TOGETHER_API_KEY environment variable.
        "api_key": os.environ.get("TOGETHER_API_KEY"),
        # We specify the API type as 'together' so it uses the Together AI client class
        "api_type": "together",
        "stream": False,
    }
]

# Setting up the code executor
workdir = Path("coding")
workdir.mkdir(exist_ok=True)
code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)

# Setting up the agents
# The UserProxyAgent will execute the code that the AssistantAgent provides
user_proxy_agent = UserProxyAgent(
    name="User",
    code_execution_config={"executor": code_executor},
    is_termination_msg=lambda msg: "FINISH" in msg.get("content"),
)

system_message = """You are a helpful AI assistant who writes code and the user executes it.
Solve tasks using your coding and language skills."""

# The AssistantAgent, using the Mixtral 8x7B model on Together AI, will take the coding request and return code
assistant_agent = AssistantAgent(
    name="Together Assistant",
    system_message=system_message,
    llm_config={"config_list": config_list},
)

# Start the chat, with the UserProxyAgent asking the AssistantAgent the message
chat_result = user_proxy_agent.initiate_chat(
    assistant_agent,
    message="Provide code to count the number of prime numbers from 1 to 10000.",
)
Learn more in our AutoGen Guide, including code and a notebook.
Pinecone is a vector database that helps companies build RAG applications. Here’s some sample code to get you started with Pinecone + Together AI:
Python
from pinecone import Pinecone, ServerlessSpec
from together import Together

pc = Pinecone(
    api_key="PINECONE_API_KEY",  # Replace with your Pinecone API key
    source_tag="TOGETHER_AI",
)
client = Together()

# Create an index in Pinecone; the dimension must match the embedding
# model's output size (m2-bert-80M-8k-retrieval returns 768-dim vectors)
pc.create_index(
    name="serverless-index",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-west-2"),
)
index = pc.Index("serverless-index")

# Create an embedding on Together AI
textToEmbed = "Our solar system orbits the Milky Way galaxy at about 515,000 mph"
embeddings = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=textToEmbed,
)

# Use index.upsert() to insert embeddings and index.query() to query for similar vectors, as sketched below
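To close the loop, here is a minimal sketch of upserting the embedding and querying for its nearest neighbors; the vector ID and metadata are illustrative:
Python
# Upsert the embedding into the index (the ID and metadata are illustrative)
vector = embeddings.data[0].embedding
index.upsert(vectors=[{"id": "vec1", "values": vector, "metadata": {"text": textToEmbed}}])

# Query the index for the 3 most similar vectors
results = index.query(vector=vector, top_k=3, include_metadata=True)
for match in results.matches:
    print(match.id, match.score, match.metadata)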
Helicone is an open source LLM observability platform. Here’s some sample code to get started using Helicone + Together AI:
Python
import os

from together import Together

client = Together(
    api_key=os.environ.get("TOGETHER_API_KEY"),
    base_url="https://together.hconeai.com/v1",
    supplied_headers={
        "Helicone-Auth": f"Bearer {os.environ.get('HELICONE_API_KEY')}",
    },
)

stream = client.chat.completions.create(
    model="meta-llama/Llama-3-8b-chat-hf",
    messages=[
        {"role": "user", "content": "What are some fun things to do in New York?"}
    ],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)