# Reasoning fine-tuning

> Learn how to fine-tune reasoning models with chain-of-thought data using Together AI.

## Introduction

Reasoning fine-tuning allows you to adapt models that support chain-of-thought reasoning. By providing `reasoning` or `reasoning_content` fields alongside assistant responses, you can shape how a model thinks through problems before producing an answer.

This guide covers the specific steps for reasoning fine-tuning. For general fine-tuning concepts, environment setup, and hyperparameter details, see the [fine-tuning overview](/docs/fine-tuning-overview).

## Quick Links

* [Dataset Requirements](#reasoning-dataset)
* [Supported Models](#supported-models)
* [Check and Upload Dataset](#check-and-upload-dataset)
* [Start a Fine-tuning Job](#starting-a-fine-tuning-job)
* [Monitor Progress](#monitoring-your-fine-tuning-job)
* [Deploy Your Model](#using-your-fine-tuned-model)

## Reasoning Dataset

**Dataset Requirements:**

* **Format**: `.jsonl` file
* **Supported types**: Conversational and Preference. See [data preparation](/docs/fine-tuning-data-preparation#text-data) for the purpose of each type.
* Assistant messages support a `reasoning` or `reasoning_content` field containing the model's chain of thought
* The `content` field contains the final response shown to the user

<Warning>
  Reasoning models should always be fine-tuned with reasoning data. Training without it can degrade the model's reasoning ability. If your dataset doesn't include reasoning, use an instruct model instead.
</Warning>

### Conversation Reasoning Format

Each line of the dataset is one example. In conversational format, a single row looks like this:

```json theme={null}
{
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {
      "role": "assistant",
      "reasoning": "The user is asking about the capital of France. France is a country in Western Europe. Its capital city is Paris, which has been the capital since the 10th century.",
      "content": "The capital of France is Paris."
    }
  ]
}
```
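Before uploading, it can help to scan the file locally for rows that are missing reasoning. The sketch below is a minimal pre-check based only on the requirements listed above; the function names `check_reasoning_row` and `check_reasoning_file` are hypothetical helpers, not part of the Together library, and this does not replace the official file check.

```python
import json


def check_reasoning_row(row) -> list[str]:
    """Return the problems found in one conversational example (empty if none)."""
    messages = row.get("messages") if isinstance(row, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["row has no non-empty 'messages' list"]

    last = messages[-1]
    if not isinstance(last, dict) or last.get("role") != "assistant":
        return ["last message is not an assistant message"]

    problems = []
    # Either field name may carry the chain of thought.
    if not (last.get("reasoning") or last.get("reasoning_content")):
        problems.append("final assistant message has no reasoning")
    if not last.get("content"):
        problems.append("final assistant message has no content")
    return problems


def check_reasoning_file(path: str) -> dict[int, list[str]]:
    """Map 1-based line numbers to the problems found on that line."""
    report = {}
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                row = json.loads(line)
            except json.JSONDecodeError as exc:
                report[line_no] = [f"invalid JSON: {exc.msg}"]
                continue
            problems = check_reasoning_row(row)
            if problems:
                report[line_no] = problems
    return report
```

Calling `check_reasoning_file("reasoning_dataset.jsonl")` returns an empty dictionary when every row ends with an assistant message that has both reasoning and content.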

<Info>
  When fine-tuning reasoning models on conversational data, only the last assistant message is trained on by default. For multi-turn reasoning, split the conversation so each assistant message is the final message in its own conversation.
</Info>
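One way to do the split described above is sketched here. `split_multi_turn` is a hypothetical helper, and it assumes earlier turns should be kept unchanged as context; adjust that if your data needs different handling.

```python
def split_multi_turn(conversation: dict) -> list[dict]:
    """Produce one example per assistant turn, each ending on that turn."""
    messages = conversation["messages"]
    examples = []
    for i, message in enumerate(messages):
        if message.get("role") == "assistant":
            # Keep everything up to and including this assistant message,
            # so it becomes the final (trained-on) message of its own example.
            examples.append({"messages": messages[: i + 1]})
    return examples
```

A conversation with two assistant turns yields two rows: the first ends at the first assistant message, and the second contains the full conversation.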

### Preference Reasoning Format

```json theme={null}
{
  "input": {
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  },
  "preferred_output": [
    {
      "role": "assistant",
      "reasoning": "The user is asking about the capital of France. France is a country in Western Europe. Its capital city is Paris.",
      "content": "The capital of France is Paris."
    }
  ],
  "non_preferred_output": [
    {
      "role": "assistant",
      "reasoning": "Hmm, let me think about European capitals.",
      "content": "The capital of France is Berlin."
    }
  ]
}
```
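If your preference pairs live in another structure, a small builder keeps rows consistent with the layout above. This is only a sketch: `make_preference_row` and `to_jsonl_line` are hypothetical names, and the code assumes each output is given as a `(reasoning, content)` pair.

```python
import json


def make_preference_row(prompt_messages, preferred, non_preferred) -> dict:
    """Build one preference example from a prompt and two (reasoning, content) pairs."""

    def assistant(reasoning: str, content: str) -> dict:
        return {"role": "assistant", "reasoning": reasoning, "content": content}

    return {
        "input": {"messages": list(prompt_messages)},
        "preferred_output": [assistant(*preferred)],
        "non_preferred_output": [assistant(*non_preferred)],
    }


def to_jsonl_line(row: dict) -> str:
    """Serialize one example as a single JSONL line (no trailing newline)."""
    return json.dumps(row, ensure_ascii=False)
```

Write one `to_jsonl_line(...)` result per line to produce the `.jsonl` file.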

## Supported Models

The following models support reasoning fine-tuning:

| Organization | Model Name                   | Model String for API               |
| :----------- | :--------------------------- | :--------------------------------- |
| Qwen         | Qwen 3 0.6B Base             | `Qwen/Qwen3-0.6B-Base`             |
| Qwen         | Qwen 3 0.6B                  | `Qwen/Qwen3-0.6B`                  |
| Qwen         | Qwen 3 1.7B Base             | `Qwen/Qwen3-1.7B-Base`             |
| Qwen         | Qwen 3 1.7B                  | `Qwen/Qwen3-1.7B`                  |
| Qwen         | Qwen 3 4B Base               | `Qwen/Qwen3-4B-Base`               |
| Qwen         | Qwen 3 4B                    | `Qwen/Qwen3-4B`                    |
| Qwen         | Qwen 3 8B Base               | `Qwen/Qwen3-8B-Base`               |
| Qwen         | Qwen 3 8B                    | `Qwen/Qwen3-8B`                    |
| Qwen         | Qwen 3 14B Base              | `Qwen/Qwen3-14B-Base`              |
| Qwen         | Qwen 3 14B                   | `Qwen/Qwen3-14B`                   |
| Qwen         | Qwen 3 32B                   | `Qwen/Qwen3-32B`                   |
| Qwen         | Qwen 3 32B 16k               | `Qwen/Qwen3-32B-16k`               |
| Qwen         | Qwen 3 30B A3B Base          | `Qwen/Qwen3-30B-A3B-Base`          |
| Qwen         | Qwen 3 30B A3B               | `Qwen/Qwen3-30B-A3B`               |
| Qwen         | Qwen 3 235B A22B             | `Qwen/Qwen3-235B-A22B`             |
| Qwen         | Qwen 3 Next 80B A3B Thinking | `Qwen/Qwen3-Next-80B-A3B-Thinking` |
| Z.ai         | GLM 4.6                      | `zai-org/GLM-4.6`                  |
| Z.ai         | GLM 4.7                      | `zai-org/GLM-4.7`                  |

## Check and Upload Dataset

To check and upload your data, use the CLI or the Python library:

<CodeGroup>
  ```sh CLI theme={null}
  tg files check "reasoning_dataset.jsonl"

  tg files upload "reasoning_dataset.jsonl"
  ```

  ```python Python theme={null}
  import os
  from together import Together

  client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

  file_resp = client.files.upload(file="reasoning_dataset.jsonl", check=True)

  print(file_resp.model_dump())
  ```
</CodeGroup>

You'll see the following output once the upload finishes:

```json theme={null}
{
  "id": "file-629e58b4-ff73-438c-b2cc-f69542b27980",
  "object": "file",
  "created_at": 1732573871,
  "type": null,
  "purpose": "fine-tune",
  "filename": "reasoning_dataset.jsonl",
  "bytes": 0,
  "line_count": 0,
  "processed": false,
  "FileType": "jsonl"
}
```

You'll be using your file's ID (the string that begins with `file-`) to start your fine-tuning job, so store it somewhere before moving on.

## Starting a Fine-tuning Job

We support both LoRA and full fine-tuning for reasoning models.

For an exhaustive list of all the available fine-tuning parameters, refer to the [Together AI Fine-tuning API Reference](/reference/cli/finetune).

### LoRA Fine-tuning (Recommended)

<CodeGroup>
  ```sh CLI theme={null}
  tg fine-tuning create \
    --training-file "file-629e58b4-ff73-438c-b2cc-f69542b27980" \
    --model "Qwen/Qwen3-8B" \
    --lora
  ```

  ```python Python theme={null}
  import os
  from together import Together

  client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

  response = client.fine_tuning.create(
      training_file="file-629e58b4-ff73-438c-b2cc-f69542b27980",  # file ID from the upload step
      model="Qwen/Qwen3-8B",
      lora=True,
  )

  print(response)
  ```
</CodeGroup>

### Full Fine-tuning

<CodeGroup>
  ```sh CLI theme={null}
  tg fine-tuning create \
    --training-file "file-629e58b4-ff73-438c-b2cc-f69542b27980" \
    --model "Qwen/Qwen3-8B" \
    --no-lora
  ```

  ```python Python theme={null}
  import os
  from together import Together

  client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

  response = client.fine_tuning.create(
      training_file="file-629e58b4-ff73-438c-b2cc-f69542b27980",
      model="Qwen/Qwen3-8B",
      lora=False,
  )

  print(response)
  ```
</CodeGroup>

Both job types accept additional parameters. The [Fine-tuning API Reference](/reference/cli/finetune) lists every hyperparameter with its definition.

## Monitoring Your Fine-tuning Job

How long a job takes depends on the model size, dataset size, and hyperparameters. A job progresses through these states: Pending, Queued, Running, Uploading, and Completed.

**Dashboard Monitoring**

You can monitor your job on the [Together AI jobs dashboard](https://api.together.ai/jobs).

**Check Status via API**

<CodeGroup>
  ```sh CLI theme={null}
  tg fine-tuning retrieve "your-job-id"

  tg fine-tuning list-events "your-job-id"
  ```

  ```python Python theme={null}
  import os
  from together import Together

  client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

  # Check status of the job
  resp = client.fine_tuning.retrieve("your-job-id")
  print(resp.status)

  # List events for the job
  for event in client.fine_tuning.list_events(id="your-job-id").data:
      print(event.message)
  ```
</CodeGroup>
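To wait for a job from a script, a polling loop over the status check can be sketched as follows. `wait_for_job` is a hypothetical helper that takes any callable returning the current status, so it makes no assumptions about the client beyond the states listed above. It treats every status outside the in-progress set as final; the exact strings for failure or cancellation are not listed here, so check the returned value yourself.

```python
import time
from typing import Callable

# In-progress states from the list above, compared case-insensitively.
IN_PROGRESS = {"pending", "queued", "running", "uploading"}


def wait_for_job(
    get_status: Callable[[], object],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job leaves the in-progress states; return the final status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        # Accept plain strings or enum-like objects exposing `.value`.
        status = str(getattr(status, "value", status)).lower()
        if status not in IN_PROGRESS:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

With the client from the example above, this could be called as `wait_for_job(lambda: client.fine_tuning.retrieve("your-job-id").status)`, assuming `status` is a string or an enum with a string value.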

## Using Your Fine-tuned Model

Once your fine-tuning job completes, your model will be available for use. You can view your fine-tuned models in [your models dashboard](https://api.together.ai/models).

### Dedicated Endpoint Deployment

To serve your fine-tuned model in production, deploy it on a dedicated endpoint:

1. Visit [your models dashboard](https://api.together.ai/models)
2. Find your fine-tuned model and click **"+ CREATE DEDICATED ENDPOINT"**
3. Select your hardware configuration and scaling options
4. Click **"DEPLOY"**

You can also deploy programmatically:

```python theme={null}
import os
from together import Together

client = Together(api_key=os.environ.get("TOGETHER_API_KEY"))

response = client.endpoints.create(
    display_name="Fine-tuned Qwen3-8B Reasoning",
    model="your-username/Qwen3-8B-your-suffix",
    hardware="4x_nvidia_h100_80gb_sxm",
    autoscaling={"min_replicas": 1, "max_replicas": 1},
)

print(response)
```

Running this code deploys a dedicated endpoint, which incurs charges. For details on deploying, deleting, and modifying endpoints, see the [Endpoints API Reference](/reference/createendpoint).

For a complete walkthrough, see [How-to: Fine-tuning](/docs/finetuning).
