Create
Create a new model evaluation job. For the full list of supported models, see Supported Models.
Parameters
| Flag | Description |
|---|---|
| --type [classify\|score\|compare] | Type of evaluation to create. Required. |
| --judge-model [string] | Name or URL of the judge model to use for evaluation. Required. |
| --judge-model-source [serverless\|dedicated\|external] | Source of the judge model. Required. |
| --judge-system-template [string] | System template for the judge model. Required. |
| --input-data-file-path [string] | Path to the input data file. Required. |
| --judge-external-api-token [string] | API token for an external judge model. Pass an empty string ("") when --judge-model-source is serverless or dedicated. Required. |
| --judge-external-base-url [string] | Base URL for an external judge model. Pass an empty string ("") when --judge-model-source is serverless or dedicated. Required. |
| --model-field [string] | Name of the field in the input file containing text generated by the model. Mutually exclusive with --model-to-evaluate and the other detailed-config flags below. |
| --model-to-evaluate [string] | Model name when using the detailed config. |
| --model-to-evaluate-source [serverless\|dedicated\|external] | Source of the model to evaluate. |
| --model-to-evaluate-external-api-token [string] | Optional external API token for the model to evaluate. |
| --model-to-evaluate-external-base-url [string] | Optional external base URL for the model to evaluate. |
| --model-to-evaluate-max-tokens [integer] | Max tokens for the model to evaluate. |
| --model-to-evaluate-temperature [float] | Temperature for the model to evaluate. |
| --model-to-evaluate-system-template [string] | System template for the model to evaluate. |
| --model-to-evaluate-input-template [string] | Input template for the model to evaluate. |
| --labels [string] | Comma-separated list of classification labels. |
| --pass-labels [string] | Comma-separated list of labels considered as passing. Required for the classify type. |
| --min-score [float] | Minimum score value. Required for the score type. |
| --max-score [float] | Maximum score value. Required for the score type. |
| --pass-threshold [float] | Threshold score for passing. Required for the score type. |
| --model-a-field [string] | Name of the field in the input file containing text generated by model A. Mutually exclusive with --model-a and the other model-A flags below. |
| --model-a [string] | Model name or URL for model A when using the detailed config. |
| --model-a-source [serverless\|dedicated\|external] | Source of model A. |
| --model-a-external-api-token [string] | Optional external API token for model A. |
| --model-a-external-base-url [string] | Optional external base URL for model A. |
| --model-a-max-tokens [integer] | Max tokens for model A. |
| --model-a-temperature [float] | Temperature for model A. |
| --model-a-system-template [string] | System template for model A. |
| --model-a-input-template [string] | Input template for model A. |
| --model-b-field [string] | Name of the field in the input file containing text generated by model B. Mutually exclusive with --model-b and the other model-B flags below. |
| --model-b [string] | Model name or URL for model B when using the detailed config. |
| --model-b-source [serverless\|dedicated\|external] | Source of model B. |
| --model-b-external-api-token [string] | Optional external API token for model B. |
| --model-b-external-base-url [string] | Optional external base URL for model B. |
| --model-b-max-tokens [integer] | Max tokens for model B. |
| --model-b-temperature [float] | Temperature for model B. |
| --model-b-system-template [string] | System template for model B. |
| --model-b-input-template [string] | Input template for model B. |
| --disable-position-bias-correction | Skip the flipped-order judge pass and run only a single judge pass (original order). Halves judge cost and latency at the expense of position-bias correction. Default: off (two-pass mode). |
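
The Create flags carry a few cross-flag rules: seven flags are always required, some are required only for a given `--type`, and each `*-field` flag is mutually exclusive with its detailed-config group. The sketch below checks a set of flags against those rules before a job is submitted. It is only an illustration: `check_create_flags` and `to_argv` are hypothetical helper names, the model name and file path are placeholders, and the rules are transcribed from the table above rather than taken from the CLI's own validation, which may differ.

```python
from typing import Dict, List, Optional

# Flags the table marks as required for every evaluation type.
REQUIRED = [
    "--type", "--judge-model", "--judge-model-source",
    "--judge-system-template", "--input-data-file-path",
    "--judge-external-api-token", "--judge-external-base-url",
]

# Extra flags the table marks as required for a specific --type.
REQUIRED_BY_TYPE = {
    "classify": ["--pass-labels"],
    "score": ["--min-score", "--max-score", "--pass-threshold"],
    "compare": [],
}

# (field flag, prefix shared by the detailed-config flags it excludes)
EXCLUSIVE = [
    ("--model-field", "--model-to-evaluate"),
    ("--model-a-field", "--model-a"),
    ("--model-b-field", "--model-b"),
]


def check_create_flags(flags: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems: List[str] = []
    for name in REQUIRED:
        if name not in flags:
            problems.append(f"missing required flag {name}")
    eval_type = flags.get("--type")
    if eval_type is not None and eval_type not in REQUIRED_BY_TYPE:
        problems.append(f"unknown --type {eval_type!r}")
    for name in REQUIRED_BY_TYPE.get(eval_type, []):
        if name not in flags:
            problems.append(f"{name} is required for --type {eval_type}")
    for field_flag, prefix in EXCLUSIVE:
        detailed = sorted(
            f for f in flags if f.startswith(prefix) and f != field_flag
        )
        if field_flag in flags and detailed:
            problems.append(
                f"{field_flag} cannot be combined with {', '.join(detailed)}"
            )
    return problems


def to_argv(flags: Dict[str, Optional[str]]) -> List[str]:
    """Flatten flags into an argument list; a None value marks a bare switch."""
    argv: List[str] = []
    for name, value in flags.items():
        argv.append(name)
        if value is not None:
            argv.append(value)
    return argv


# A classify job that reads pre-generated text from the input file.
flags = {
    "--type": "classify",
    "--judge-model": "example-judge-model",   # placeholder name
    "--judge-model-source": "serverless",
    "--judge-system-template": "Classify the answer as good or bad.",
    "--input-data-file-path": "data.jsonl",   # placeholder path
    "--judge-external-api-token": "",         # empty for a serverless judge
    "--judge-external-base-url": "",
    "--model-field": "response",
    "--labels": "good,bad",
    "--pass-labels": "good",
}
problems = check_create_flags(flags)
argv = to_argv(flags)
```

The resulting `argv` list can be appended to whatever create command your CLI installation exposes. Passing the arguments as a list rather than a joined string keeps the empty-string values for the external judge flags intact.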
List
List all evaluation jobs.
Parameters
| Flag | Description |
|---|---|
| --status [pending\|queued\|running\|completed\|error\|user_error] | Filter by job status. |
| --limit [integer] | Limit number of results (max 100). |
| --after [string] | Pagination cursor. |
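
As a rough sketch of how the List flags fit together, the helper below assembles the arguments and rejects values the table rules out. `build_list_args` is a hypothetical name, the lower bound of 1 on `--limit` is an assumption (the table states only the maximum of 100), and how the `--after` cursor is obtained is not described here, so it is passed through unchanged.

```python
from typing import List, Optional

# Status values listed in the table above.
STATUSES = {"pending", "queued", "running", "completed", "error", "user_error"}
MAX_LIMIT = 100  # documented maximum for --limit


def build_list_args(
    status: Optional[str] = None,
    limit: Optional[int] = None,
    after: Optional[str] = None,
) -> List[str]:
    """Build the argument list for the List command from optional filters."""
    argv: List[str] = []
    if status is not None:
        if status not in STATUSES:
            raise ValueError(f"unsupported status: {status!r}")
        argv += ["--status", status]
    if limit is not None:
        # Lower bound of 1 is an assumption; only the maximum is documented.
        if not 1 <= limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
        argv += ["--limit", str(limit)]
    if after is not None:
        argv += ["--after", after]
    return argv


# Ask for up to 50 completed jobs.
list_args = build_list_args(status="completed", limit=50)
```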