Function calling

Learn how to get LLMs to respond to queries with named functions and structured arguments.

Introduction

Certain models support function calling (also called tool calling), which gives them the ability to respond to queries with function names and arguments that you can then invoke in your own application code.

To use it, pass an array of function descriptions to the tools key. If the LLM decides that one or more of the available functions should be used to answer a query, it will respond with an array of function names and arguments in the tool_calls key of its response.

You can then use the data from tool_calls to invoke the named functions and get their results, which you can either provide directly to the user or pass back into subsequent LLM queries for further processing.
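One detail to note: each entry in tool_calls carries its arguments as a JSON-encoded string rather than a parsed object, so you'll generally want to decode it before invoking anything. Here's a minimal sketch of that decoding step (the helper name is ours, not part of the SDK):

import json

def decode_tool_call(tool_call):
    """Return the function name and its parsed keyword arguments."""
    return tool_call.function.name, json.loads(tool_call.function.arguments)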

Supported models

The following models currently support function calling:

  • meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
  • meta-llama/Llama-4-Scout-17B-16E-Instruct
  • meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
  • meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
  • meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
  • meta-llama/Llama-3.3-70B-Instruct-Turbo
  • meta-llama/Llama-3.2-3B-Instruct-Turbo
  • Qwen/Qwen2.5-7B-Instruct-Turbo
  • Qwen/Qwen2.5-72B-Instruct-Turbo
  • Qwen/Qwen3-235B-A22B-fp8-tput
  • deepseek-ai/DeepSeek-V3
  • mistralai/Mistral-Small-24B-Instruct-2501

Basic example

Let's say our application has access to a get_current_weather function, which takes in two named arguments, location and unit:

Python

# Hypothetical function that exists in our app
get_current_weather(
  location="San Francisco, CA",
  unit="fahrenheit"
)

TypeScript

// Hypothetical function that exists in our app
getCurrentWeather({
  location: "San Francisco, CA",
  unit: "fahrenheit"
})

We can make this function available to our LLM by passing its description to the tools key alongside the user's query. Let's suppose the user asks, "What is the current temperature of New York, San Francisco and Chicago?"

Python

import json
from together import Together

client = Together()

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
      {"role": "system", "content": "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls."},
      {"role": "user", "content": "What is the current temperature of New York, San Francisco and Chicago?"},
    ],
    tools=[
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": [
                  "celsius",
                  "fahrenheit"
                ]
              }
            }
          }
        }
      }
    ]
)

print(json.dumps(response.choices[0].message.model_dump()['tool_calls'], indent=2))

TypeScript

import Together from 'together-ai';

const together = new Together();

const response = await together.chat.completions.create({
  model: "Qwen/Qwen2.5-7B-Instruct-Turbo",
  messages: [
    {
      role: "system",
      content:
      "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls.",
    },
    {
      role: "user",
      content:
      "What is the current temperature of New York, San Francisco and Chicago?",
    },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "getCurrentWeather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA",
            },
            unit: {
              type: "string",
              enum: ["celsius", "fahrenheit"],
            },
          },
        },
      },
    },
  ],
});

console.log(
  JSON.stringify(response.choices[0].message?.tool_calls, null, 2),
);

The tool_calls key of the LLM's response will look like this:

[
  {
    "index": 0,
    "id": "call_aisak3q1px3m2lzb41ay6rwf",
    "type": "function",
    "function": {
      "arguments": "{\"location\":\"New York, NY\",\"unit\":\"fahrenheit\"}",
      "name": "get_current_weather"
    }
  },
  {
    "index": 1,
    "id": "call_agrjihqjcb0r499vrclwrgdj",
    "type": "function",
    "function": {
      "arguments": "{\"location\":\"San Francisco, CA\",\"unit\":\"fahrenheit\"}",
      "name": "get_current_weather"
    }
  },
  {
    "index": 2,
    "id": "call_17s148ekr4hk8m5liicpwzkk",
    "type": "function",
    "function": {
      "arguments": "{\"location\":\"Chicago, IL\",\"unit\":\"fahrenheit\"}",
      "name": "get_current_weather"
    }
  }
]

As we can see, the LLM has given us three function calls that we can programmatically execute to answer the user's question.
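For example, a small driver over this response might look like the following sketch, which assumes response is the completion we just received and get_current_weather is the application function from earlier:

import json

results = []
for tool_call in response.choices[0].message.tool_calls or []:
    if tool_call.function.name == "get_current_weather":
        args = json.loads(tool_call.function.arguments)  # decode the JSON-encoded arguments
        results.append(get_current_weather(**args))

The multi-turn example below builds on this pattern to feed the results back to the model.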

Selecting a specific tool

By default, an LLM that's been provided with tools will automatically attempt to use the most appropriate one when generating responses.

If you'd like to manually select a specific tool for a completion, identify it by name in the tool_choice parameter:

Python

import json
from together import Together

client = Together()

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      # ...
    }
  },
  {
    "type": "function",
    "function": {
      "name": "get_current_stock_price",
      # ...
    }
  }
]

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
      {"role": "user", "content": "What's the current price of Apple's stock?"},
    ],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_current_stock_price"}}
)

print(json.dumps(response.choices[0].message.model_dump()['tool_calls'], indent=2))

TypeScript

import Together from "together-ai";

const together = new Together();

const tools = [
  {
    type: "function",
    function: {
      name: "getCurrentWeather",
      // ...
    },
  },
  {
    type: "function",
    function: {
      name: "getCurrentStockPrice",
      // ...
    },
  },
];

const response = await together.chat.completions.create({
  model: "Qwen/Qwen2.5-7B-Instruct-Turbo",
  messages: [
    {
      role: "user",
      content: "What's the current price of Apple's stock?",
    },
  ],
  tools,
  tool_choice: "getCurrentStockPrice",
});

console.log(
  JSON.stringify(response.choices[0].message?.tool_calls, null, 2),
);

This ensures the model will use the provided function when generating its response:

[
  {
    "index": 0,
    "id": "call_jxo8ybor16ju34abq552jymn",
    "type": "function",
    "function": {
      "arguments": "{\"ticker\":\"APPL\"}",
      "name": "get_current_stock_price"
    }
  }
]
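Besides the object format above, tool_choice also accepts string modes in the OpenAI-compatible format; "auto" restores the default behavior of letting the model decide whether (and which) tool to call:

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[
      {"role": "user", "content": "What's the current price of Apple's stock?"},
    ],
    tools=tools,
    tool_choice="auto",  # default: the model picks a tool, or none
)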

Multi-turn example

Here's an example of passing the result of a tool call from one completion into a second follow-up completion:

Python

import json
from together import Together

client = Together()

# Example function to make available to model
def get_current_weather(location, unit="fahrenheit"):
    """Get the weather for some location"""
    if "chicago" in location.lower():
        return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
    elif "new york" in location.lower():
        return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          },
          "unit": {
            "type": "string",
            "enum": [
              "celsius",
              "fahrenheit"
            ]
          }
        }
      }
    }
  }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls."},
    {"role": "user", "content": "What is the current temperature of New York, San Francisco and Chicago?"}
]
    
# Completion #1: Get the appropriate tool calls
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=messages,
    tools=tools,
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    # Append the assistant's tool-call message so each tool result has a matching turn
    messages.append(response.choices[0].message.model_dump())

    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)

        if function_name == "get_current_weather":
            function_response = get_current_weather(
                location=function_args.get("location"),
                unit=function_args.get("unit"),
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )

    # Completion #2: Provide the results to get the final answer
    function_enriched_response = client.chat.completions.create(
        model="Qwen/Qwen2.5-7B-Instruct-Turbo",
        messages=messages,
    )
    print(json.dumps(function_enriched_response.choices[0].message.model_dump(), indent=2))

TypeScript

import Together from "together-ai";
import { CompletionCreateParams } from "together-ai/resources/chat/completions.mjs";

const together = new Together();

// Example function to make available to model
function getCurrentWeather({
  location,
  unit = "fahrenheit",
}: {
  location: string;
  unit: "fahrenheit" | "celsius";
}) {
  let result: { location: string; temperature: number | null; unit: string };
  if (location.toLowerCase().includes("chicago")) {
    result = {
      location: "Chicago",
      temperature: 13,
      unit,
    };
  } else if (location.toLowerCase().includes("san francisco")) {
    result = {
      location: "San Francisco",
      temperature: 55,
      unit,
    };
  } else if (location.toLowerCase().includes("new york")) {
    result = {
      location: "New York",
      temperature: 11,
      unit,
    };
  } else {
    result = {
      location,
      temperature: null,
      unit,
    };
  }

  return JSON.stringify(result);
}

const tools = [
  {
    type: "function",
    function: {
      name: "getCurrentWeather",
      description: "Get the current weather in a given location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA",
          },
          unit: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
          },
        },
      },
    },
  },
];

const messages: CompletionCreateParams.Message[] = [
  {
    role: "system",
    content:
      "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls.",
  },
  {
    role: "user",
    content:
      "What is the current temperature of New York, San Francisco and Chicago?",
  },
];

const response = await together.chat.completions.create({
  model: "Qwen/Qwen2.5-7B-Instruct-Turbo",
  messages,
  tools,
});

if (response.choices[0].message?.tool_calls) {
  // Keep the assistant's tool-call turn in the transcript before the tool results
  messages.push(response.choices[0].message);

  for (const toolCall of response.choices[0].message.tool_calls) {
    if (toolCall.function.name === "getCurrentWeather") {
      const args = JSON.parse(toolCall.function.arguments);
      const functionResponse = getCurrentWeather(args);

      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: functionResponse,
      });
    }
  }

  const functionEnrichedResponse = await together.chat.completions.create({
    model: "Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages,
    tools,
  });

  console.log(
    JSON.stringify(functionEnrichedResponse.choices[0].message, null, 2),
  );
}

And here's the final output from the second call:

{
  "content": "The current temperature in New York is 11 degrees Fahrenheit, in San Francisco it is 55 degrees Fahrenheit, and in Chicago it is 13 degrees Fahrenheit.",
  "role": "assistant"
}

We've successfully used our LLM to generate three tool call descriptions, iterated over those descriptions to execute each one, and passed the results into a follow-up message to get the LLM to produce a final answer!
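If a single round of tool calls isn't enough, the same pattern generalizes to a loop: keep calling the model, executing whatever tools it requests, and appending the results until it replies with plain content. Here's a minimal sketch of that loop (execute_tool is a hypothetical dispatcher like the one in the multi-turn example above):

def run_until_answer(client, model, messages, tools, execute_tool, max_rounds=5):
    """Call the model repeatedly, running requested tools, until it gives a final answer."""
    for _ in range(max_rounds):
        response = client.chat.completions.create(model=model, messages=messages, tools=tools)
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # no more tool calls: this is the final answer
        messages.append(message.model_dump())  # keep the assistant's tool-call turn
        for tool_call in message.tool_calls:
            messages.append({
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": tool_call.function.name,
                "content": execute_tool(tool_call),
            })
    raise RuntimeError("No final answer within max_rounds")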