Structured outputs
Learn how to use JSON mode to get structured outputs from LLMs like DeepSeek V3 & Llama 3.3.
Introduction
Standard large language models respond to user queries by generating plain text. This is great for many applications like chatbots, but if you want to programmatically access details in the response, plain text is hard to work with.
Some models have the ability to respond with structured JSON instead, making it easy to work with data from the LLM's output directly in your application code.
If you're using a supported model, you can enable structured responses by providing your desired schema details to the response_format key of the Chat Completions API.
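For example, a minimal request might look like the following sketch in Python. The hand-written schema and one-off prompt here are just for illustration; the walkthrough below generates the schema with a library instead:
import together

client = together.Together()

# A hand-written JSON Schema asking for a single "answer" string
schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
}

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France? Only answer in JSON.",
        }
    ],
    response_format={"type": "json_object", "schema": schema},
)

print(response.choices[0].message.content)  # e.g. {"answer": "Paris"}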
Supported models
The following models currently support JSON mode:
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo (32K context)
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
meta-llama/Llama-3.2-3B-Instruct-Turbo
meta-llama/Llama-3.3-70B-Instruct-Turbo
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
meta-llama/Llama-4-Scout-17B-16E-Instruct
deepseek-ai/DeepSeek-V3
Qwen/Qwen3-235B-A22B-fp8-tput
Qwen/Qwen2.5-VL-72B-Instruct
Basic example
Let's look at a simple example, where we pass a transcript of a voice note to a model and ask it to summarize it.
We want the summary to have the following structure:
{
title: "A title for the voice note",
summary: "A short one-sentence summary of the voice note",
actionItems: [
"Action item 1",
"Action item 2",
]
}
We can tell our model to use this structure by giving it a JSON Schema definition. Since writing JSON Schema by hand is a bit tedious, we'll use a library to help – Pydantic in Python, and Zod in TypeScript.
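To see how little work this is, here's a tiny sketch of how Pydantic turns a class into JSON Schema (zodToJsonSchema plays the same role for Zod; the raw schema generated for the full VoiceNote model below is visible in the cURL example):
from pydantic import BaseModel, Field

class Example(BaseModel):
    title: str = Field(description="A title")

# Pydantic v2 generates the JSON Schema dict for us
print(Example.model_json_schema())
# {'properties': {'title': {'description': 'A title', 'title': 'Title',
#   'type': 'string'}}, 'required': ['title'], 'title': 'Example', 'type': 'object'}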
Once we have the schema, we can give it to our model using the response_format key.
Finally – and this is important – we need to make sure to instruct our model to only respond in JSON format. This ensures it will actually use the schema we provide when generating its response.
Important: You must always instruct your model to only respond in JSON format, either in the system prompt or a user message, in addition to passing your schema to the response_format key.
Let's see what this looks like:
import json

import together
from pydantic import BaseModel, Field

client = together.Together()


# Define the schema for the output
class VoiceNote(BaseModel):
    title: str = Field(description="A title for the voice note")
    summary: str = Field(description="A short one sentence summary of the voice note.")
    actionItems: list[str] = Field(
        description="A list of action items from the voice note"
    )


def main():
    transcript = (
        "Good morning! It's 7:00 AM, and I'm just waking up. Today is going to be a busy day, "
        "so let's get started. First, I need to make a quick breakfast. I think I'll have some "
        "scrambled eggs and toast with a cup of coffee. While I'm cooking, I'll also check my "
        "emails to see if there's anything urgent."
    )

    # Call the LLM with the JSON schema
    extract = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "The following is a voice message transcript. Only answer in JSON.",
            },
            {
                "role": "user",
                "content": transcript,
            },
        ],
        model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        response_format={
            "type": "json_object",
            "schema": VoiceNote.model_json_schema(),
        },
    )

    output = json.loads(extract.choices[0].message.content)
    print(json.dumps(output, indent=2))
    return output


main()
import Together from 'together-ai';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const together = new Together();

// Defining the schema we want our data in
const voiceNoteSchema = z.object({
  title: z.string().describe('A title for the voice note'),
  summary: z
    .string()
    .describe('A short one sentence summary of the voice note.'),
  actionItems: z
    .array(z.string())
    .describe('A list of action items from the voice note'),
});
const jsonSchema = zodToJsonSchema(voiceNoteSchema, { target: 'openAi' });

async function main() {
  const transcript =
    "Good morning! It's 7:00 AM, and I'm just waking up. Today is going to be a busy day, so let's get started. First, I need to make a quick breakfast. I think I'll have some scrambled eggs and toast with a cup of coffee. While I'm cooking, I'll also check my emails to see if there's anything urgent.";

  const extract = await together.chat.completions.create({
    messages: [
      {
        role: 'system',
        content:
          'The following is a voice message transcript. Only answer in JSON.',
      },
      {
        role: 'user',
        content: transcript,
      },
    ],
    model: 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo',
    response_format: { type: 'json_object', schema: jsonSchema },
  });

  if (extract?.choices?.[0]?.message?.content) {
    const output = JSON.parse(extract.choices[0].message.content);
    console.log(output);
    return output;
  }

  return 'No output.';
}

main();
curl -X POST https://api.together.xyz/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOGETHER_API_KEY" \
-d '{
"messages": [
{
"role": "system",
"content": "The following is a voice message transcript. Only answer in JSON."
},
{
"role": "user",
"content": "Good morning! It'"'"'s 7:00 AM, and I'"'"'m just waking up. Today is going to be a busy day, so let'"'"'s get started. First, I need to make a quick breakfast. I think I'"'"'ll have some scrambled eggs and toast with a cup of coffee. While I'"'"'m cooking, I'"'"'ll also check my emails to see if there'"'"'s anything urgent."
}
],
"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
"response_format": {
"type": "json_object",
"schema": {
"properties": {
"title": {
"description": "A title for the voice note",
"title": "Title",
"type": "string"
},
"summary": {
"description": "A short one sentence summary of the voice note.",
"title": "Summary",
"type": "string"
},
"actionItems": {
"description": "A list of action items from the voice note",
"items": { "type": "string" },
"title": "Actionitems",
"type": "array"
}
},
"required": ["title", "summary", "actionItems"],
"title": "VoiceNote",
"type": "object"
}
}
}'
If we try it out, our model responds with the following:
{
"title": "Morning Routine",
"summary": "Starting the day with a quick breakfast and checking emails",
"actionItems": [
"Cook scrambled eggs and toast",
"Brew a cup of coffee",
"Check emails for urgent messages"
]
}
Pretty neat! Our model has generated a summary of the user's transcript using the schema we gave it.
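Because we defined the structure as a Pydantic model, we can also validate the parsed output before using it. Here's a short sketch continuing the Python example above (with Zod, voiceNoteSchema.parse(output) plays the same role):
# Validate the parsed dict against the VoiceNote model; this raises a
# pydantic.ValidationError if the response doesn't match the schema
note = VoiceNote.model_validate(output)
print(note.title)        # "Morning Routine"
print(note.actionItems)  # ["Cook scrambled eggs and toast", ...]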
Vision model example
Let's look at another example, this time using a vision model.
We want our LLM to extract text from the following screenshot of a Trello board:

[Screenshot: a Trello board titled "Project A" with four columns]

In particular, we want to know the name of the project (Project A) and the number of columns in the board (4).
Let's try it out:
import json

import together
from pydantic import BaseModel, Field

client = together.Together()


# Define the schema for the output
class ImageDescription(BaseModel):
    project_name: str = Field(description="The name of the project shown in the image")
    col_num: int = Field(description="The number of columns in the board")


def main():
    image_url = "https://napkinsdev.s3.us-east-1.amazonaws.com/next-s3-uploads/d96a3145-472d-423a-8b79-bca3ad7978dd/trello-board.png"

    # Call the LLM with the JSON schema
    extract = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Extract a JSON object from the image."},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image_url,
                        },
                    },
                ],
            },
        ],
        model="Qwen/Qwen2.5-VL-72B-Instruct",
        response_format={
            "type": "json_object",
            "schema": ImageDescription.model_json_schema(),
        },
    )

    output = json.loads(extract.choices[0].message.content)
    print(json.dumps(output, indent=2))
    return output


main()
import Together from "together-ai";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const together = new Together();

// Define the shape of our data
const schema = z.object({
  projectName: z
    .string()
    .describe("The name of the project shown in the image"),
  columnCount: z.number().describe("The number of columns in the board"),
});
const jsonSchema = zodToJsonSchema(schema, { target: "openAi" });

const imageUrl =
  "https://napkinsdev.s3.us-east-1.amazonaws.com/next-s3-uploads/d96a3145-472d-423a-8b79-bca3ad7978dd/trello-board.png";

async function main() {
  const extract = await together.chat.completions.create({
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Extract a JSON object from the image." },
          {
            type: "image_url",
            image_url: { url: imageUrl },
          },
        ],
      },
    ],
    model: "Qwen/Qwen2.5-VL-72B-Instruct",
    response_format: {
      type: "json_object",
      schema: jsonSchema,
    },
  });

  if (extract?.choices?.[0]?.message?.content) {
    const output = JSON.parse(extract.choices[0].message.content);
    console.log(output);
    return output;
  }

  return "No output.";
}

main();
If we run the TypeScript version, we get the following output:
{
projectName: 'Project A',
columnCount: 4
}
JSON mode has worked perfectly alongside Qwen's vision model to help us extract structured text from an image!
Try out your code in the Together Playground
You can try JSON mode in the Together Playground to test variations on your schema and prompt:
Just click the RESPONSE FORMAT dropdown in the right-hand sidebar, choose JSON, and upload your schema!