Logprobs

Learn how to return log probabilities for your output and prompt.

Logprobs, short for log probabilities, are the logarithms of the probabilities the model assigns to each token, given the previous tokens in the context. They let you gauge the model's confidence in its output and explore the alternative responses it considered, which makes them useful for applications such as classification tasks, retrieval evaluations, and autocomplete suggestions.
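Because a logprob is the natural logarithm of a probability, you can convert it back with exp. Here's a minimal sketch in Python, using a logprob value taken from the sample response below:

import math

# A logprob is the natural log of a token's probability, so exp()
# recovers the probability itself.
token_logprob = -0.07861328  # the "." token from the sample response below
probability = math.exp(token_logprob)
print(f"{probability:.2%}")  # ~92.44%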

Returning logprobs

To return logprobs from our API, add logprobs: 1 to your API call, as shown below.

curl -X POST https://api.together.xyz/v1/chat/completions \
     -H 'Content-Type: application/json' \
     -H "Authorization: Bearer $TOGETHER_API_KEY"\
     -d '{
     "model": "codellama/CodeLlama-70b-Instruct-hf",
     "stream": false,
     "max_tokens": 10,
     "messages": [
       {"role": "user", "content": "write an async function in python"}
     ],
     "logprobs": 1
 }'
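If you prefer Python, here's a rough equivalent of the curl call above using the requests library (a sketch, assuming TOGETHER_API_KEY is set in your environment):

import os
import requests

# Same request as the curl example above, sent from Python.
response = requests.post(
    "https://api.together.xyz/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
    },
    json={
        "model": "codellama/CodeLlama-70b-Instruct-hf",
        "stream": False,
        "max_tokens": 10,
        "messages": [
            {"role": "user", "content": "write an async function in python"}
        ],
        "logprobs": 1,
    },
)
print(response.json())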

Response of returning logprobs

Here's the response you can expect. Notice that both the tokens and the log probability of each token are returned.

{
  "id": "85d3d3ff1a0d8c57-EWR",
  "object": "chat.completion",
  "created": 1709240335,
  "model": "codellama/CodeLlama-70b-Instruct-hf",
  "prompt": [],
  "choices": [
    {
      "finish_reason": "length",
      "logprobs": {
        "tokens": [
          "1",
          ".",
          " Define",
          " the",
          " function",
          " with",
          " the",
          " async",
          " keyword",
          "."
        ],
        "token_logprobs": [
          -1.0029297,
          -0.07861328,
          -2.1367188,
          -0.921875,
          -0.6479492,
          -1.1318359,
          -0.35668945,
          -0.6743164,
          -0.023162842,
          -1.1484375
        ]
      },
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1. Define the function with the async keyword."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 10,
    "total_tokens": 43
  }
}
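To work with these values programmatically, you can pair each token with its log probability. Continuing from the Python request sketch above (data is a hypothetical name for the parsed JSON):

import math

data = response.json()  # parsed response from the request example above

# Pair each generated token with its logprob and the derived probability.
logprobs = data["choices"][0]["logprobs"]
for token, logprob in zip(logprobs["tokens"], logprobs["token_logprobs"]):
    print(f"{token!r}: logprob={logprob:.4f}, prob={math.exp(logprob):.2%}")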

Returning your prompt logprobs

To return your prompt along with its logprobs, add echo: true as a parameter in your API request.

curl -X POST https://api.together.xyz/v1/chat/completions \
     -H 'Content-Type: application/json' \
     -H "Authorization: Bearer $TOGETHER_API_KEY"\
     -d '{
     "model": "codellama/CodeLlama-70b-Instruct-hf",
     "stream": false,
     "max_tokens": 10,
     "messages": [
       {"role": "user", "content": "write an async function in python"}
     ],
     "logprobs": 1,
     "echo": true
 }'

Response of returning your prompt logprobs

You'll notice that, in addition to the logprobs of the model output, the response now includes the logprobs of every prompt token. The first prompt token's logprob is null, since there are no preceding tokens for the model to condition on.

{
  "id": "85d3cf2ae98c0f73-EWR",
  "object": "chat.completion",
  "created": 1709240137,
  "model": "codellama/CodeLlama-70b-Instruct-hf",
  "prompt": [
    {
      "text": "<s> Source: user\nDestination: assistant\n\n write an async function in python<step> Source: assistant\nDestination: user\n\n ",
      "logprobs": {
        "tokens": [
          "<s>",
          "",
          "Source",
          ":",
          "user",
          "\n",
          "Dest",
          "ination",
          ":",
          "assistant",
          "\n",
          "\n",
          "write",
          "an",
          "async",
          "function",
          "in",
          "python",
          "<",
          "step",
          ">",
          "Source",
          ":",
          "assistant",
          "\n",
          "Dest",
          "ination",
          ":",
          "user",
          "\n",
          "\n",
          ""
        ],
        "token_logprobs": [
          null,
          -3.6757812,
          -12.984375,
          -0.025985718,
          -14.7109375,
          -0.7006836,
          -1.5390625,
          -0.00027370453,
          -0.00031781197,
          -3.3457031,
          -0.00097322464,
          -0.049835205,
          -14,
          -3.6210938,
          -11.109375,
          -0.3190918,
          -4.3242188,
          -2.3574219,
          -15.890625,
          -10.1953125,
          -4.171875,
          -8.0859375,
          -0.0010204315,
          -0.000026106834,
          -0.0019760132,
          -0.0027122498,
          -0.00000333786,
          -0.0000021457672,
          -0.79589844,
          -0.0015296936,
          -0.0010738373,
          -6.2734375
        ]
      }
    }
  ],
  "choices": [
    {
      "finish_reason": "length",
      "logprobs": {
        "tokens": [
          "1",
          ".",
          " Define",
          " the",
          " function",
          " with",
          " the",
          " async",
          " keyword",
          "."
        ],
        "token_logprobs": [
          -1.0009766,
          -0.07861328,
          -2.1367188,
          -0.921875,
          -0.6479492,
          -1.1318359,
          -0.35668945,
          -0.6743164,
          -0.023162842,
          -1.1484375
        ]
      },
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1. Define the function with the async keyword."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 10,
    "total_tokens": 43
  }
}
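The prompt tokens and their logprobs live under prompt[0].logprobs. Here's a sketch of reading them, again assuming data holds the parsed response; note the None check for that first null entry:

import math

# Prompt tokens and their logprobs; the first logprob is None (JSON null)
# because the first token has no preceding context.
prompt_logprobs = data["prompt"][0]["logprobs"]
for token, logprob in zip(prompt_logprobs["tokens"],
                          prompt_logprobs["token_logprobs"]):
    if logprob is None:
        print(f"{token!r}: no logprob (first token)")
    else:
        print(f"{token!r}: logprob={logprob:.4f}, prob={math.exp(logprob):.2%}")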