For real-time applications where time-to-first-byte (TTFB) is critical, use streaming mode. Streaming returns a sequence of server-sent events containing base64-encoded audio chunks, so playback can start before generation finishes. For the lowest possible interactive latency (and bidirectional text input), see the WebSocket API.

Streaming audio

from together import Together

client = Together()

# Save the streamed audio to a file
with client.audio.speech.with_streaming_response.create(
    model="canopylabs/orpheus-3b-0.1-ft",
    input="The quick brown fox jumps over the lazy dog",
    voice="tara",
    stream=True,
    response_format="raw",  # Required for streaming
    response_encoding="pcm_s16le",  # 16-bit PCM for clean audio
) as response:
    response.stream_to_file("speech_streaming.pcm")

Streaming response format

When stream is set to true, the API returns a stream of server-sent events, one JSON payload per data: line.
Audio chunk:
data: {"type":"conversation.item.audio_output.delta","item_id":"tts_1","delta":"<base64_encoded_audio>"}
Word timestamps (when alignment=word):
data: {"type":"conversation.item.word_timestamps","words":["Hello","world"],"start_seconds":[0.0,0.4],"end_seconds":[0.4,0.8]}
Stream end:
data: [DONE]
When streaming is enabled, only raw (PCM) format is supported. For non-streaming requests, you can use mp3, wav, or raw.
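The event types above can be consumed with a small parser. The following is a minimal sketch, not the SDK's own implementation: `parse_tts_events` is a hypothetical helper name, and it assumes each event arrives as a single `data:` line shaped like the examples shown above. It decodes each base64 `delta` into PCM bytes, collects word-timestamp events, and stops at `[DONE]`.

```python
import base64
import json
from typing import Iterable, List, Tuple


def parse_tts_events(lines: Iterable[str]) -> Tuple[bytes, List[dict]]:
    """Collect PCM bytes and word-timestamp events from SSE lines.

    Hypothetical helper: assumes one JSON payload per `data:` line.
    """
    audio = bytearray()
    timestamps: List[dict] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # End-of-stream sentinel
        event = json.loads(payload)
        kind = event.get("type")
        if kind == "conversation.item.audio_output.delta":
            # Each delta is a base64-encoded chunk of raw audio
            audio.extend(base64.b64decode(event["delta"]))
        elif kind == "conversation.item.word_timestamps":
            timestamps.append(event)
    return bytes(audio), timestamps
```

In a real client, `lines` could come from any line iterator over the HTTP response body; for playback with minimal latency you would hand each decoded chunk to the audio device as it arrives instead of accumulating it.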

Output raw bytes

If you want to extract raw audio bytes (for example, to feed into a custom audio pipeline), use the settings below.
import requests
import os

url = "https://api.together.ai/v1/audio/speech"
api_key = os.environ.get("TOGETHER_API_KEY")

headers = {"Authorization": f"Bearer {api_key}"}

data = {
    "input": "This is a test of raw PCM audio output.",
    "voice": "tara",
    "response_format": "raw",
    "response_encoding": "pcm_s16le",
    "sample_rate": 24000,
    "stream": False,
    "model": "canopylabs/orpheus-3b-0.1-ft",
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Fail early instead of writing an error body to disk

with open("output_raw.pcm", "wb") as f:
    f.write(response.content)

print("Raw PCM audio saved to output_raw.pcm")
print(f"   Size: {len(response.content)} bytes")
This writes the raw bytes to a file named output_raw.pcm.
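A raw PCM file has no header, so most audio players cannot open it directly. As a rough sketch, the bytes can be wrapped in a WAV container with Python's standard `wave` module. `pcm_to_wav` is a hypothetical helper name; the 2-byte sample width follows from `pcm_s16le` and the 24000 Hz default matches the `sample_rate` in the request above, while mono output is an assumption that should be checked against the model you use.

```python
import wave


def pcm_to_wav(pcm_bytes: bytes, wav_path: str,
               sample_rate: int = 24000, channels: int = 1) -> float:
    """Wrap 16-bit little-endian PCM in a WAV container.

    Returns the clip duration in seconds. Mono (channels=1) is an assumption.
    """
    sample_width = 2  # pcm_s16le: 16 bits = 2 bytes per sample
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return len(pcm_bytes) / (sample_width * channels * sample_rate)
```

For example, reading `output_raw.pcm` back in binary mode and passing its contents to `pcm_to_wav(data, "output.wav")` yields a file that standard players can open.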

See also