Llama 3.1

Learn how to do function calling for Llama 3.1 models!

Llama 3.1 shipped with native function calling support, but instead of accepting a tools or tool_choice parameter like traditional function calling APIs, it works through a special prompt syntax. Let's take a look at how to do function calling with Llama 3.1 models – strictly with a custom prompt!

According to Meta, if you want to hold a full conversation that includes tool calling, Llama 3.1 70B and Llama 3.1 405B are the two recommended options. Llama 3.1 8B is good for zero-shot tool calling, but can't hold a full conversation at the same time.

Function calling w/ Llama 3.1 70B

Say we have a function called weatherTool and want Llama 3.1 to be able to call it when it sees fit. We'll define the function's attributes, use the special prompt from llama-agentic-system (from Meta) to pass the function to the model, and send in a prompt asking what the weather is in Tokyo.

Python

import json
from together import Together

together = Together()

weatherTool = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
        "required": ["location"],
    },
}

toolPrompt = f"""
You have access to the following functions:

Use the function '{weatherTool["name"]}' to '{weatherTool["description"]}':
{json.dumps(weatherTool)}

If you choose to call a function ONLY reply in the following format with no prefix or suffix:

<function=example_function_name>{{"example_name": "example_value"}}</function>

Reminder:
- If looking for real time information use relevant functions before falling back to brave_search
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line

"""


response = together.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "What is the weather in Tokyo?",
        },
        {
            "role": "user",
            "content": toolPrompt,
        },
    ],
    max_tokens=1024,
    temperature=0,
)

print(response.choices[0].message.content)
TypeScript

import Together from 'together-ai';

const together = new Together();

const weatherTool = {
  name: 'get_current_weather',
  description: 'Get the current weather in a given location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'The city and state, e.g. San Francisco, CA',
      },
    },
    required: ['location'],
  },
};

const toolPrompt = `You have access to the following functions:

Use the function '${weatherTool.name}' to '${weatherTool.description}':
${JSON.stringify(weatherTool)}

If you choose to call a function ONLY reply in the following format with no prefix or suffix:

<function=example_function_name>{"example_name": "example_value"}</function>

Reminder:
- If looking for real time information use relevant functions before falling back to brave_search
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line

`;

async function main() {
  const response = await together.chat.completions.create({
    model: 'meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
    messages: [
      {
        role: 'user',
        content: 'What is the weather in Tokyo?',
      },
      {
        role: 'user',
        content: toolPrompt,
      },
    ],
    max_tokens: 1024,
    temperature: 0,
  });

  console.log(response.choices?.[0]?.message?.content);
}

main();

Output

<function=get_current_weather>{"location": "Tokyo, Japan"}</function>

Calling the model above gets us the string shown in the output. We'll then need to parse this string before we can call our function.

Parsing

Note: Function calling with Llama 3.1 doesn't actually call the function for us – it simply tells us which function(s) to call and with what parameters.

To parse the function call above, we can write some code that turns it into a nice JSON object.

Python

import re
import json

def parse_tool_response(response: str):
    function_regex = r"<function=(\w+)>(.*?)</function>"
    match = re.search(function_regex, response)

    if match:
        function_name, args_string = match.groups()
        try:
            args = json.loads(args_string)
            return {
                "function": function_name,
                "arguments": args,
            }
        except json.JSONDecodeError as error:
            print(f"Error parsing function arguments: {error}")
            return None
    return None

print(parse_tool_response(response.choices[0].message.content))
TypeScript

function parseToolResponse(response: string) {
  const functionRegex = /<function=(\w+)>(.*?)<\/function>/;
  const match = response.match(functionRegex);

  if (match) {
    const [, functionName, argsString] = match;
    try {
      // `arguments` is a reserved name in strict mode, so use `args` instead
      const args = JSON.parse(argsString);
      return {
        function: functionName,
        arguments: args,
      };
      };
    } catch (error) {
      console.error('Error parsing function arguments:', error);
      return null;
    }
  }

  return null;
}

const parsedResponse = parseToolResponse(response.choices[0].message.content);
console.log(parsedResponse);
Output

{
  "function": "get_current_weather",
  "arguments": { "location": "Tokyo, Japan" }
}
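Although the prompt instructs the model to call only one function at a time, a slightly more defensive parser – a sketch, not part of the original guide – can collect every <function=...> span with re.findall and skip malformed argument JSON instead of failing outright:

```python
import re
import json

def parse_all_tool_calls(response: str):
    """Return a parsed entry for every <function=...>...</function> span."""
    calls = []
    for name, args_string in re.findall(r"<function=(\w+)>(.*?)</function>", response):
        try:
            calls.append({"function": name, "arguments": json.loads(args_string)})
        except json.JSONDecodeError:
            continue  # skip spans whose arguments aren't valid JSON
    return calls

text = '<function=get_current_weather>{"location": "Tokyo, Japan"}</function>'
print(parse_all_tool_calls(text))
# [{'function': 'get_current_weather', 'arguments': {'location': 'Tokyo, Japan'}}]
```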

Now that we have the parsed function name and arguments in JSON, we can call our actual weather function with those arguments, pass the result back to the LLM, and have it respond to the user.
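To sketch what closing the loop could look like: the get_current_weather implementation below is a hypothetical stub (a real one would call a weather API), and sending the tool result back as a plain user message is just one simple convention – adjust both to your setup.

```python
import json

# Hypothetical stub; a real implementation would call a weather API.
def get_current_weather(location: str) -> str:
    return json.dumps({"location": location, "temperature_c": 22, "condition": "sunny"})

# Map tool names to implementations so parsed calls can be dispatched.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def run_tool_call(parsed: dict) -> str:
    """Execute the function named in a parsed tool call."""
    fn = TOOL_REGISTRY[parsed["function"]]
    return fn(**parsed["arguments"])

parsed = {"function": "get_current_weather", "arguments": {"location": "Tokyo, Japan"}}
tool_result = run_tool_call(parsed)

# Feed the result back so the model can answer in natural language.
# These messages would go into another chat.completions.create call.
followup_messages = [
    {"role": "user", "content": "What is the weather in Tokyo?"},
    {"role": "assistant", "content": '<function=get_current_weather>{"location": "Tokyo, Japan"}</function>'},
    {"role": "user", "content": f"The get_current_weather function returned: {tool_result}"},
]
```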

How good is Llama 3.1 at tool calling?

Llama 3.1 70B and Llama 3.1 405B are both excellent models for tool calling since they were trained with this functionality in mind. In fact, according to some evals, these models even perform better than GPT-4o.