Introduction
Function calling (also called tool calling) enables LLMs to respond with structured function names and arguments that you can execute in your application. This allows models to interact with external systems, retrieve real-time data, and power agentic AI workflows. Pass function descriptions to the `tools` parameter, and the model will return `tool_calls` when it determines a function should be used. You can then execute these functions and optionally pass the results back to the model for further processing.
Basic Function Calling
Let’s say our application has access to a `get_current_weather` function which takes in two named arguments, `location` and `unit`:
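A definition for this function might be sketched as follows, using the OpenAI-style tool schema. The exact descriptions and the `required` field are illustrative:

```python
# OpenAI-style tool definition for the get_current_weather function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["location"],
            },
        },
    }
]
```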
We pass this function definition in the `tools` key alongside the user’s query. Let’s suppose the user asks, “What is the current temperature of New York?”
The model responds with a `tool_calls` array, specifying the function name and arguments needed to get the weather for New York.
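The relevant part of the response typically looks like this (the call ID and exact argument values are illustrative):

```json
{
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"New York, NY\", \"unit\": \"fahrenheit\"}"
      }
    }
  ]
}
```

Note that `arguments` is a JSON-encoded string, so you parse it before invoking your actual function.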
Streaming
Function calling also works with streaming responses. When streaming is enabled, tool calls are returned incrementally and can be accessed from the `delta.tool_calls` object in each chunk.
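Because the arguments arrive in fragments, they need to be accumulated across chunks. The sketch below simulates `delta.tool_calls` entries with plain dicts so the accumulation logic is self-contained; a real client would yield these deltas from the streaming response:

```python
import json

# Simulated delta.tool_calls fragments, as an OpenAI-style stream might emit them.
chunks = [
    {"index": 0, "id": "call_1", "function": {"name": "get_current_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": "{\"location\": "}},
    {"index": 0, "function": {"arguments": "\"New York, NY\"}"}},
]

# Accumulate fragments by index into complete tool calls.
calls = {}
for delta in chunks:
    call = calls.setdefault(delta["index"], {"id": None, "name": None, "arguments": ""})
    if delta.get("id"):
        call["id"] = delta["id"]
    fn = delta.get("function", {})
    if fn.get("name"):
        call["name"] = fn["name"]
    call["arguments"] += fn.get("arguments", "")

args = json.loads(calls[0]["arguments"])
print(calls[0]["name"], args)  # get_current_weather {'location': 'New York, NY'}
```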
Supported models
The following models currently support function calling:
- openai/gpt-oss-120b
- openai/gpt-oss-20b
- moonshotai/Kimi-K2-Thinking
- moonshotai/Kimi-K2-Instruct-0905
- zai-org/GLM-4.5-Air-FP8
- Qwen/Qwen3-Next-80B-A3B-Instruct
- Qwen/Qwen3-Next-80B-A3B-Thinking
- Qwen/Qwen3-235B-A22B-Thinking-2507
- Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
- Qwen/Qwen3-235B-A22B-fp8-tput
- deepseek-ai/DeepSeek-R1
- deepseek-ai/DeepSeek-V3
- meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- meta-llama/Llama-4-Scout-17B-16E-Instruct
- meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
- meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
- meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
- meta-llama/Llama-3.3-70B-Instruct-Turbo
- meta-llama/Llama-3.2-3B-Instruct-Turbo
- Qwen/Qwen2.5-7B-Instruct-Turbo
- Qwen/Qwen2.5-72B-Instruct-Turbo
- mistralai/Mistral-Small-24B-Instruct-2501
- arcee-ai/virtuoso-large
Types of Function Calling
Function calling can be implemented in six different patterns, each serving different use cases:

| Type | Description | Use Cases |
|---|---|---|
| Simple | One function, one call | Basic utilities, simple queries |
| Multiple | Choose from many functions | Many tools, LLM has to choose |
| Parallel | Same function, multiple calls | Complex prompts, multiple tools called |
| Parallel Multiple | Multiple functions, parallel calls | Complex single requests with many tools |
| Multi-Step | Sequential function calling in one turn | Data processing workflows |
| Multi-Turn | Conversational context + functions | AI Agents with humans in the loop |
1. Simple Function Calling
This is the most basic type of function calling: one function is defined, and one user prompt triggers one function call. The model identifies the need to call the function and extracts the right parameters. This is the pattern shown in the code above: only one tool is provided to the model, and it responds with a single invocation of that tool.

2. Multiple Function Calling
Multiple function calling involves having several different functions available, with the model choosing the best function to call based on the user’s intent. The model must understand the request and select the appropriate tool from the available options. In the example below we provide two tools to the model, and it responds with one tool invocation: a call to the `get_current_stock_price` function.
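A two-tool setup might look like the sketch below. The `get_current_stock_price` schema (its description and `symbol` parameter) is illustrative:

```python
# Two tools; the model picks whichever matches the user's intent.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_stock_price",
            "description": "Get the current price of a stock by ticker symbol",
            "parameters": {
                "type": "object",
                "properties": {"symbol": {"type": "string"}},
                "required": ["symbol"],
            },
        },
    },
]
```

A question about a stock price would be routed to the second tool; a weather question to the first.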
Selecting a specific tool
If you’d like to manually select a specific tool to use for a completion, pass in the tool’s name to the `tool_choice` parameter:
Understanding tool_choice options
The `tool_choice` parameter controls how the model uses functions. It accepts:
String values:
"auto"(default) - Model decides whether to call a function or generate a text response"none"- Model will never call functions, only generates text"required"- Model must call at least one function
3. Parallel Function Calling
In parallel function calling, the same function is called multiple times simultaneously with different parameters. This is more efficient than making sequential calls for similar operations. The `tool_calls` key of the LLM’s response will look like this:
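For example, asking for the weather in two cities might yield two calls to the same function (IDs and argument values are illustrative):

```json
{
  "tool_calls": [
    {
      "id": "call_1",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"New York, NY\"}"
      }
    },
    {
      "id": "call_2",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"San Francisco, CA\"}"
      }
    }
  ]
}
```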
4. Parallel Multiple Function Calling
This pattern combines parallel and multiple function calling: multiple different functions are available, and one user prompt triggers multiple different function calls simultaneously. The model chooses which functions to call and calls them in parallel.
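With the weather and stock tools defined earlier, a single request touching both topics might produce a response like this (IDs and argument values are illustrative):

```json
{
  "tool_calls": [
    {
      "id": "call_1",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"New York, NY\"}"
      }
    },
    {
      "id": "call_2",
      "type": "function",
      "function": {
        "name": "get_current_stock_price",
        "arguments": "{\"symbol\": \"AAPL\"}"
      }
    }
  ]
}
```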
5. Multi-Step Function Calling
Multi-step function calling involves sequential function calls within one conversation turn. Functions are called, results are processed, then used to inform the final response. This demonstrates the complete flow from initial function calls to processing function results to final response incorporating all the data. Here’s an example of passing the result of a tool call from one completion into a second follow-up completion:
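The flow can be sketched as follows, assuming OpenAI-style chat messages. The tool call and the stubbed weather result are illustrative:

```python
import json

# A tool call as returned by a first completion (illustrative values).
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"New York, NY\"}",
    },
}

# 1. Execute the function our application actually owns (stubbed here).
def get_current_weather(location):
    return {"location": location, "temperature": 72, "unit": "fahrenheit"}

args = json.loads(tool_call["function"]["arguments"])
result = get_current_weather(**args)

# 2. Append the assistant's tool call and a role-"tool" result message,
#    then send these messages in a follow-up completion so the model can
#    produce a final natural-language answer.
messages = [
    {"role": "user", "content": "What is the current temperature of New York?"},
    {"role": "assistant", "tool_calls": [tool_call]},
    {"role": "tool", "tool_call_id": tool_call["id"], "content": json.dumps(result)},
]
```

The `tool_call_id` links each result back to the specific call the model made, which matters once multiple calls appear in one turn.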
6. Multi-Turn Function Calling
Multi-turn function calling represents the most sophisticated form of function calling, where context is maintained across multiple conversation turns and functions can be called at any point in the conversation. Previous function results inform future decisions, enabling truly agentic behavior. For example:
- Turn 1: Calls weather functions for three cities and provides temperature information
- Turn 2: Remembers the previous weather data, analyzes which city is best for outdoor activities (San Francisco with 65°F), and automatically calls the restaurant recommendation function for that city
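The key mechanic is that the same `messages` list persists across turns, so earlier tool results stay visible to the model. A minimal sketch, with a placeholder standing in for the real chat-completions call:

```python
# Multi-turn sketch: one messages list carries context across turns.
# fake_model is a stand-in for client.chat.completions.create(...).
def fake_model(messages):
    return {"role": "assistant", "content": f"reply after {len(messages)} messages"}

messages = []
for user_turn in [
    "What's the weather in San Francisco, New York, and Seattle?",
    "Which city is best for outdoor activities? Find a restaurant there.",
]:
    messages.append({"role": "user", "content": user_turn})
    # In a real loop, any tool_calls in the reply would be executed here and
    # their results appended as role "tool" messages before the final answer.
    reply = fake_model(messages)
    messages.append(reply)
```

Because turn 2 sends the full history, the model can reason over turn 1’s weather results when picking the city for the restaurant lookup.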