Training metrics

The Together platform records metrics at every training and evaluation step. You can retrieve them at any point during or after a job, which is useful for tracking loss curves, diagnosing runs, and comparing experiments.

Retrieve metrics

import os
from together import Together

client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

response = client.fine_tuning.list_metrics("ft-your-job-id")
for step in response.metrics:
    print(step)

Output format

By default, the CLI renders ASCII charts. Use --json to get raw JSON output instead.

Shell

# ASCII charts (default)
together fine-tuning list-metrics "ft-your-job-id"

# JSON output
together fine-tuning list-metrics "ft-your-job-id" --json

# Save JSON to a file
together fine-tuning list-metrics "ft-your-job-id" --json > metrics.json

# Save ASCII charts to a file
together fine-tuning list-metrics "ft-your-job-id" > plots.txt

The JSON output is a list of metric objects. Training and eval steps are returned as separate objects: training steps contain train/* keys, while eval steps contain eval/* keys along with the train/global_step and train/epoch at which the evaluation ran. When an eval checkpoint occurs at the same step as a training step, both objects appear for that step:

[
  {"timestamp": 1779196193564587000, "train/global_step": 1, "train/epoch": 0.1, "train/loss": 2.43, "train/grad_norm": 1.21, "train/learning_rate": 1e-5 },
  {"timestamp": 1779196253564587000, "train/global_step": 2, "train/epoch": 0.2, "train/loss": 2.11, "train/grad_norm": 0.94, "train/learning_rate": 9e-6 },
  {"timestamp": 1779196313564587000, "train/global_step": 3, "train/epoch": 0.3, "train/loss": 1.98, "train/grad_norm": 0.87, "train/learning_rate": 8e-6 },
  {"timestamp": 1779196314564587000, "train/global_step": 3, "train/epoch": 0.3, "eval/loss": 2.05 },
  ...
]
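Once the JSON is saved (for example to metrics.json with the --json flag), the two kinds of objects can be separated client-side. The following is a minimal sketch, assuming every record is a dict shaped like the example above and that eval records carry train/global_step as the sample shows. The helper names split_metrics and loss_curve are hypothetical and not part of the Together SDK.

```python
def split_metrics(records):
    """Separate metric objects into training and eval lists.

    Assumption: a record is an eval record if it has at least one
    key starting with "eval/"; everything else is a training record.
    """
    train, evals = [], []
    for record in records:
        if any(key.startswith("eval/") for key in record):
            evals.append(record)
        else:
            train.append(record)
    return train, evals


def loss_curve(records, key):
    """Return (global_step, value) pairs for every record that has `key`."""
    return [(r["train/global_step"], r[key]) for r in records if key in r]


# Records copied from the example output above.
sample = [
    {"timestamp": 1779196193564587000, "train/global_step": 1, "train/epoch": 0.1,
     "train/loss": 2.43, "train/grad_norm": 1.21, "train/learning_rate": 1e-5},
    {"timestamp": 1779196253564587000, "train/global_step": 2, "train/epoch": 0.2,
     "train/loss": 2.11, "train/grad_norm": 0.94, "train/learning_rate": 9e-6},
    {"timestamp": 1779196313564587000, "train/global_step": 3, "train/epoch": 0.3,
     "train/loss": 1.98, "train/grad_norm": 0.87, "train/learning_rate": 8e-6},
    {"timestamp": 1779196314564587000, "train/global_step": 3, "train/epoch": 0.3,
     "eval/loss": 2.05},
]

train, evals = split_metrics(sample)
print(loss_curve(train, "train/loss"))  # [(1, 2.43), (2, 2.11), (3, 1.98)]
print(loss_curve(evals, "eval/loss"))   # [(3, 2.05)]
```

For a saved file, the same helpers can be applied to the list returned by json.load on metrics.json.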

Filter by step or time

All filter parameters are optional. Omit them to retrieve all recorded metrics.

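from datetime import datetime

# Reuses the `client` created above. Step and time filters can be combined;
# each bound is inclusive, and any of them can be left out.
response = client.fine_tuning.list_metrics(
    "ft-your-job-id",
    global_step_from=100,
    global_step_to=500,
    logged_at_from=datetime.fromisoformat("2024-01-01T00:00:00+00:00"),
    logged_at_to=datetime.fromisoformat("2024-01-02T00:00:00+00:00"),
)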

Downsample with resolution

For long training runs, use resolution to cap the response at a fixed number of uniformly sampled training steps. Eval metrics are always returned in full regardless of this setting.

# Return at most 50 uniformly sampled training steps; eval metrics are unaffected.
response = client.fine_tuning.list_metrics(
    "ft-your-job-id",
    resolution=50,
)
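To build intuition for what uniform sampling means here, the sketch below picks evenly spaced records from a list of training records on the client side. It is an illustration only: the platform's actual sampling scheme is not described in this page, and the downsample function is a hypothetical helper, not an SDK call. Eval records would simply be passed through untouched.

```python
def downsample(train_records, resolution):
    """Keep at most `resolution` records, evenly spaced by index.

    Illustrative only: this is one simple way to sample uniformly,
    not necessarily what the server does.
    """
    if resolution <= 0:
        raise ValueError("resolution must be a positive integer")
    n = len(train_records)
    if n <= resolution:
        return list(train_records)  # nothing to drop
    if resolution == 1:
        return [train_records[0]]
    # Integer arithmetic gives evenly spaced, strictly increasing indices
    # that always include the first (0) and last (n - 1) record.
    indices = [i * (n - 1) // (resolution - 1) for i in range(resolution)]
    return [train_records[i] for i in indices]


# Placeholder training records for steps 1..10 (values are not real metrics).
train = [{"train/global_step": step} for step in range(1, 11)]
sampled = downsample(train, 4)
print([r["train/global_step"] for r in sampled])  # [1, 4, 7, 10]
```

Keeping the first and last record is a design choice in this sketch so that the start and end of the curve are always visible.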

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `global_step_from` | integer | Return only metrics with `global_step` ≥ this value. |
| `global_step_to` | integer | Return only metrics with `global_step` ≤ this value. |
| `logged_at_from` | string or datetime | Return only metrics logged at or after this ISO 8601 timestamp. |
| `logged_at_to` | string or datetime | Return only metrics logged at or before this ISO 8601 timestamp. |
| `resolution` | integer | Maximum number of uniformly sampled training metric points to return. Does not affect eval metrics. |
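The inclusive bounds in the table can be mirrored on records that were already downloaded, which is handy when re-slicing a saved metrics.json without another request. This is a sketch under stated assumptions: filter_metrics and to_ns are hypothetical helpers, and the timestamp field is assumed to be nanoseconds since the Unix epoch (inferred from the magnitude of the sample values, not confirmed by this page). Datetimes passed in must be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ns(dt):
    """Convert a timezone-aware datetime to integer nanoseconds since the epoch."""
    return (dt - EPOCH) // timedelta(microseconds=1) * 1000


def filter_metrics(records, global_step_from=None, global_step_to=None,
                   logged_at_from=None, logged_at_to=None):
    """Client-side analogue of the documented filters; all bounds inclusive.

    ASSUMPTION: record["timestamp"] is nanoseconds since the Unix epoch.
    """
    from_ns = to_ns(logged_at_from) if logged_at_from is not None else None
    to_ns_bound = to_ns(logged_at_to) if logged_at_to is not None else None
    kept = []
    for record in records:
        step = record["train/global_step"]
        ts = record["timestamp"]
        if global_step_from is not None and step < global_step_from:
            continue
        if global_step_to is not None and step > global_step_to:
            continue
        if from_ns is not None and ts < from_ns:
            continue
        if to_ns_bound is not None and ts > to_ns_bound:
            continue
        kept.append(record)
    return kept


# Trimmed copies of the records from the example output.
records = [
    {"timestamp": 1779196193564587000, "train/global_step": 1, "train/loss": 2.43},
    {"timestamp": 1779196253564587000, "train/global_step": 2, "train/loss": 2.11},
    {"timestamp": 1779196313564587000, "train/global_step": 3, "train/loss": 1.98},
    {"timestamp": 1779196314564587000, "train/global_step": 3, "eval/loss": 2.05},
]
selected = filter_metrics(records, global_step_from=2, global_step_to=3)
print([r["train/global_step"] for r in selected])  # [2, 3, 3]
```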

Available metrics

Every job reports train/global_step, train/loss, timestamp, and related training metrics. When you supply a validation_file and set n_evals > 0, the response also includes eval/loss and other validation metrics. Preference fine-tuning jobs additionally report reward, accuracy, KL divergence, and log-probability metrics for both preferred and non-preferred responses.
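Because the exact set of keys depends on the job type, it can help to list which metric names a given response actually contains. The sketch below groups keys by their prefix; metric_names is a hypothetical helper, and the sample records are trimmed copies of the example output shown earlier on this page.

```python
def metric_names(records):
    """Group the keys found in `records` by prefix.

    Keys like "train/loss" go under "train"; keys without a slash
    (such as "timestamp") go under "other".
    """
    groups = {}
    for record in records:
        for key in record:
            prefix = key.split("/", 1)[0] if "/" in key else "other"
            groups.setdefault(prefix, set()).add(key)
    # Sort for stable, readable output.
    return {prefix: sorted(keys) for prefix, keys in groups.items()}


records = [
    {"timestamp": 1779196313564587000, "train/global_step": 3, "train/loss": 1.98},
    {"timestamp": 1779196314564587000, "train/global_step": 3, "eval/loss": 2.05},
]
names = metric_names(records)
print(names["train"])  # ['train/global_step', 'train/loss']
print(names["eval"])   # ['eval/loss']
print(names["other"])  # ['timestamp']
```

Running this against a preference fine-tuning job's output would surface whatever additional keys that job reports, without needing to know their names in advance.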
