Llama 3.1 Native FC
Learn how to do function calling for Llama 3.1 models!
Llama 3.1 shipped with native function calling support, but instead of accepting a tools parameter like traditional function calling APIs, it works with a special prompt syntax. Let's take a look at how to do function calling with Llama 3.1 models – strictly with a custom prompt!
Note: OpenAI-compatible function calling with Llama 3.1 is also supported! This is usually what you want, especially when working with agent frameworks. Learn how to do that here.
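For comparison, here is a minimal sketch of what that OpenAI-compatible route looks like with the Together Python SDK: the function schema goes in a tools parameter and the model's requested call comes back as structured tool_calls rather than raw text. (The schema fields here follow the OpenAI spec; see the linked guide for the authoritative version.)

from together import Together

client = Together()

# The same weather function, passed via the OpenAI-style `tools` parameter
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city and state"},
                },
                "required": ["location"],
            },
        },
    }],
)

# The requested call arrives as structured data, not prompt text
print(response.choices[0].message.tool_calls)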
According to Meta, if you want to do a full conversation with tool calling, Llama 3.1 70B and Llama 3.1 405B are the two recommended options. Llama 3.1 8B is good for zero-shot tool calling, but can't hold a full conversation at the same time.
Function calling w/ Llama 3.1 70B
Say we have a function called weatherTool and want to pass it to Llama 3.1 so it can call it when it sees fit. We'll define the function's attributes, use the special prompt from llama-agentic-system (from Meta) to pass the function to the model in the system prompt, and send in a prompt asking how the weather is in Tokyo.
import json

from together import Together

together = Together()

weatherTool = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
        "required": ["location"],
    },
}

# Build Meta's special tool-calling prompt around our function schema
toolPrompt = f"""
You have access to the following functions:
Use the function '{weatherTool["name"]}' to '{weatherTool["description"]}':
{json.dumps(weatherTool)}
If you choose to call a function ONLY reply in the following format with no prefix or suffix:
<function=example_function_name>{{"example_name": "example_value"}}</function>
Reminder:
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls
"""

messages = [
    {
        "role": "system",
        "content": toolPrompt,
    },
    {
        "role": "user",
        "content": "What is the weather in Tokyo?",
    },
]

response = together.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=messages,
    max_tokens=1024,
    temperature=0,
)

messages.append(response.choices[0].message)
print(response.choices[0].message.content)
import Together from 'together-ai';

const together = new Together();

const weatherTool = {
  name: 'get_current_weather',
  description: 'Get the current weather in a given location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'The city and state, e.g. San Francisco, CA',
      },
    },
    required: ['location'],
  },
};

// Build Meta's special tool-calling prompt around our function schema
const toolPrompt = `You have access to the following functions:
Use the function '${weatherTool.name}' to '${weatherTool.description}':
${JSON.stringify(weatherTool)}
If you choose to call a function ONLY reply in the following format with no prefix or suffix:
<function=example_function_name>{"example_name": "example_value"}</function>
Reminder:
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls
`;

let messages = [
  {
    role: 'system',
    content: toolPrompt,
  },
  {
    role: 'user',
    content: 'What is the weather in Casablanca?',
  },
];

async function main() {
  const response = await together.chat.completions.create({
    model: 'meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
    messages: messages,
    max_tokens: 1024,
    temperature: 0,
  });

  if (response.choices?.[0]?.message) {
    messages.push({
      role: response.choices[0].message.role,
      content: response.choices[0].message.content!,
    });
    console.log(response.choices[0].message.content);
  }
}

main();
Running the code above gets us the following string back from the model. We'll then need to parse this string, after which we can call our function:

<function=get_current_weather>{"location": "Tokyo, Japan"}</function>
Parsing
Note: Function calling with Llama 3.1 does not call the function for us; it simply tells us which function(s) to call and with what parameters.
To parse the function call above, we can write some code that turns it into a clean JSON object.
import re
import json

def parse_tool_response(response: str):
    # Extract the first <function=...>...</function> call from the reply
    function_regex = r"<function=(\w+)>(.*?)</function>"
    match = re.search(function_regex, response)

    if match:
        function_name, args_string = match.groups()
        try:
            args = json.loads(args_string)
            return {
                "function": function_name,
                "arguments": args,
            }
        except json.JSONDecodeError as error:
            print(f"Error parsing function arguments: {error}")
            return None
    return None

parsed_response = parse_tool_response(response.choices[0].message.content)
print(parsed_response)
function parseToolResponse(response: string) {
  // Extract the first <function=...>...</function> call from the reply
  const functionRegex = /<function=(\w+)>(.*?)<\/function>/;
  const match = response.match(functionRegex);

  if (match) {
    const [, functionName, argsString] = match;
    try {
      return {
        function: functionName,
        arguments: JSON.parse(argsString),
      };
    } catch (error) {
      console.error('Error parsing function arguments:', error);
      return null;
    }
  }
  return null;
}

const parsedResponse = parseToolResponse(response.choices[0].message.content);
console.log(parsedResponse);
{
  "function": "get_current_weather",
  "arguments": { "location": "Tokyo, Japan" }
}
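The prompt above tells the model to call only one function at a time, and parse_tool_response accordingly grabs just the first match. If you relax that instruction (or the model ignores it), a hypothetical variant built on re.finditer can collect every call in the reply:

import re
import json

def parse_all_tool_calls(response: str):
    # Collect every <function=...>...</function> span, not just the first
    calls = []
    for match in re.finditer(r"<function=(\w+)>(.*?)</function>", response):
        function_name, args_string = match.groups()
        try:
            calls.append({"function": function_name, "arguments": json.loads(args_string)})
        except json.JSONDecodeError:
            continue  # Skip calls whose arguments aren't valid JSON
    return calls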
Now that we have the parsed function name and arguments as JSON, we can call our actual weather function with them, pass the result back to the LLM, and have it respond to the user.
Calling the function
Now that Llama 3.1 has told us what function(s) to call and with what parameters, we can execute the call ourselves and pass the result back to the LLM so it can respond to the user.
def get_current_weather(location: str) -> str:
    # This would be replaced by a call to a real weather API
    if location == "San Francisco, CA":
        return "62 degrees and cloudy"
    elif location == "Philadelphia, PA":
        return "83 degrees and sunny"
    return "Weather is unknown"

if parsed_response:
    available_functions = {"get_current_weather": get_current_weather}
    function_to_call = available_functions[parsed_response["function"]]
    weather = function_to_call(parsed_response["arguments"]["location"])

    # Feed the tool result back to the model as a tool message
    messages.append(
        {
            "role": "tool",
            "content": weather,
        }
    )
    print("Weather answer is: ", weather)

    res = together.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        messages=messages,
        max_tokens=1000,
        temperature=0,
    )
    print("Answer from the LLM: ", res.choices[0].message)
else:
    print("No function call found in the response")
function get_current_weather(location: string) {
  // This would be replaced by a call to a real weather API
  if (location === 'San Francisco, CA') {
    return '62 degrees and cloudy';
  } else if (location === 'Philadelphia, PA') {
    return '87 degrees and sunny';
  }
  return 'Weather is unknown';
}

if (parsedResponse) {
  const availableFunctions = {
    get_current_weather: get_current_weather,
  };

  if (parsedResponse.function in availableFunctions) {
    const functionToCall = availableFunctions[parsedResponse.function];
    let weather = functionToCall(parsedResponse.arguments.location);

    // Feed the tool result back to the model as a tool message
    messages.push({
      role: 'tool',
      content: weather,
    });
    console.log('Weather answer is: ', weather);

    let res = await together.chat.completions.create({
      model: 'meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
      messages: messages as any,
      max_tokens: 1000,
      temperature: 0,
    });
    console.log('Answer from the LLM: ', res.choices[0].message);
  } else {
    console.log(`Function ${parsedResponse.function} not found`);
  }
} else {
  console.log('No function call found in the response');
}
Weather answer is: 87 degrees and sunny
Answer from the LLM: {
  role: 'assistant',
  content: 'The current weather in Casablanca is 87 degrees and sunny.'
}
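Putting the pieces together, the whole exchange can be wrapped in a simple loop that keeps querying the model, executing any tool call it requests, until it answers in plain text. This is just a sketch built on the Python snippets above (toolPrompt, parse_tool_response, and the available_functions mapping); run_conversation is a hypothetical helper name, not part of the Together SDK, and in production you'd want to cap the number of iterations.

def run_conversation(user_prompt: str) -> str:
    # Query the model, execute any tool call it requests, and repeat
    # until the reply contains no function call
    messages = [
        {"role": "system", "content": toolPrompt},
        {"role": "user", "content": user_prompt},
    ]
    while True:
        response = together.chat.completions.create(
            model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
            messages=messages,
            max_tokens=1024,
            temperature=0,
        )
        content = response.choices[0].message.content
        parsed = parse_tool_response(content)
        if parsed is None:
            return content  # Plain-text answer: we're done

        # Record the model's tool call, execute it, and feed the result back
        messages.append({"role": "assistant", "content": content})
        result = available_functions[parsed["function"]](**parsed["arguments"])
        messages.append({"role": "tool", "content": result})

print(run_conversation("What is the weather in Philadelphia, PA?"))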
How good is Llama 3.1 at tool calling?
Llama 3.1 70B and Llama 3.1 405B are both excellent models for tool calling since they were trained with this functionality in mind. In fact, on some function-calling evals, these models even outperform GPT-4o.