Quick Start
Here’s how to get started with basic text-to-speech. A basic request returns audio that you can save to a speech.mp3 file.
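As a minimal sketch of that first request, the code below POSTs to the REST endpoint using only the Python standard library. The endpoint path https://api.together.ai/v1/audio/speech is an assumption (it mirrors the WebSocket host used elsewhere in this guide); the model and voice names are taken from this guide's tables and examples. Check the API Reference for the authoritative request shape.

```python
import json
import os
import urllib.request

# Assumed REST endpoint path, mirroring the WebSocket host in this guide.
API_URL = "https://api.together.ai/v1/audio/speech"

def build_tts_payload(text: str) -> dict:
    """Build the JSON body for a basic TTS request."""
    return {
        "model": "hexgrad/Kokoro-82M",  # see the Available Models table
        "input": text,
        "voice": "af_alloy",            # voice name from this guide's WebSocket URL example
        "response_format": "mp3",
    }

def synthesize(text: str, out_path: str = "speech.mp3") -> None:
    """POST the request and save the returned audio bytes to disk."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_tts_payload(text)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

# synthesize("Hello from Together AI!")  # writes speech.mp3
```

Set TOGETHER_API_KEY in your environment before calling synthesize.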
Available Models
Together AI supports multiple text-to-speech models:

| Organization | Model Name | Model String for API | API Endpoint Support |
|---|---|---|---|
| Canopy Labs | Orpheus 3B | canopylabs/orpheus-3b-0.1-ft | REST, Streaming, WebSocket |
| Kokoro | Kokoro | hexgrad/Kokoro-82M | REST, Streaming, WebSocket |
| Cartesia | Cartesia Sonic 3 | cartesia/sonic-3 | REST, Streaming, WebSocket |
| Cartesia | Cartesia Sonic 2 | cartesia/sonic-2 | REST, Streaming, WebSocket |
| Cartesia | Cartesia Sonic | cartesia/sonic | REST, Streaming, WebSocket |
| Deepgram | Aura 2 (Dedicated Endpoint only) | deepgram/deepgram-aura-2 | REST, Streaming, WebSocket |
| Rime | Arcana v3 Turbo (Dedicated Endpoint only) | rime-labs/rime-arcana-v3-turbo | REST, Streaming, WebSocket |
| Rime | Arcana v3 (Dedicated Endpoint only) | rime-labs/rime-arcana-v3 | REST, Streaming, WebSocket |
| Rime | Arcana v2 (Dedicated Endpoint only) | rime-labs/rime-arcana-v2 | REST, Streaming, WebSocket |
| Rime | Mist v3 (Beta) (Dedicated Endpoint only) | rime-labs/rime-mist-v3 | REST, Streaming, WebSocket |
| Rime | Mist v2 (Dedicated Endpoint only) | rime-labs/rime-mist-v2 | REST, Streaming, WebSocket |
| Minimax | Speech 2.6 Turbo (Dedicated Endpoint only) | minimax/speech-2.6-turbo | REST, Streaming, WebSocket |
- Orpheus, Kokoro, and Cartesia models support real-time WebSocket streaming for the lowest-latency applications.
- To use Cartesia models, you need to be at Build Tier 2 or higher. Cartesia Sonic 2 and Sonic 3 are also available on Dedicated and Reserved Endpoints.
- Deepgram Aura 2 is available on Dedicated and Reserved Endpoints only.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The TTS model to use |
| input | string | Yes | The text to generate audio for |
| voice | string | Yes | The voice to use for generation. See Voices section |
| response_format | string | No | Output format: mp3, wav, raw (PCM), mulaw (μ-law). The Minimax model also supports opus, aac, and flac. Default: wav |
| sample_rate | integer | No | The sample rate of the output audio in Hz (e.g., 24000, 44100) |
| language | string | No | The language code for speech synthesis (e.g., en, fr, es) |
| alignment | string | No | Controls word-level timestamp generation. Set to word to receive word timestamps, or none to disable (default: none) |
| segment | string | No | Controls how text is segmented before synthesis. Options: sentence (default), immediate, never |
Word alignment (alignment=word) is only supported for streaming requests.
Streaming Audio
For real-time applications where Time-To-First-Byte (TTFB) is critical, use streaming mode. With stream: true, the API returns a stream of server-sent events:
Each server-sent event carries either an audio chunk or, when alignment=word is set, word-level timestamps.
For streaming, the raw (PCM) format is supported. For non-streaming, you can use mp3, wav, or raw.
WebSocket API
For the lowest latency and most interactive applications, use the WebSocket API. This allows you to stream text input and receive audio chunks in real time.
Establishing a Connection
Connect to: wss://api.together.ai/v1/audio/speech/websocket
Authentication:
- Include your API key as a query parameter: ?api_key=YOUR_API_KEY
- Or use the Authorization header when establishing the WebSocket connection
conversation.item.word_timestamps: sent when alignment=word is set. Contains word-level timing information for the generated audio.
| Parameter | Type | Description |
|---|---|---|
| model | string | The TTS model to use |
| voice | string | The voice for generation |
| response_format | string | Audio format: mp3, opus, aac, flac, wav, or pcm |
| speed | float | Playback speed (default: 1.0) |
| max_partial_length | integer | Character buffer length before triggering TTS generation |
| sample_rate | integer | The sample rate of the output audio in Hz (e.g., 24000, 44100) |
| language | string | The language code for speech synthesis (e.g., en, fr, es) |
| alignment | string | Controls word-level timestamp generation. Set to word to receive conversation.item.word_timestamps events, or none to disable (default: none) |
| segment | string | Controls how text is segmented before synthesis. Options: sentence (default) splits on sentence boundaries, immediate processes text as soon as it arrives, never waits until buffer is committed |
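A small sketch of assembling the connection URL from those session parameters. Only URL construction is shown, since the message protocol after connecting is not fully detailed here; actually connecting would use a WebSocket client library of your choice (an assumption, not a requirement of the API).

```python
import urllib.parse

# WebSocket endpoint from this guide.
WS_BASE = "wss://api.together.ai/v1/audio/speech/websocket"

def build_ws_url(api_key: str, **session_params: str) -> str:
    """Assemble the WebSocket URL with api_key and session parameters
    (model, voice, sample_rate, alignment, ...) as query arguments."""
    query = urllib.parse.urlencode({"api_key": api_key, **session_params})
    return f"{WS_BASE}?{query}"

# Example:
# url = build_ws_url("YOUR_API_KEY", model="hexgrad/Kokoro-82M",
#                    voice="af_alloy", sample_rate="24000", alignment="word")
# ...then open the connection with your WebSocket client of choice.
```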
You can pass these query parameters either in the WebSocket URL (e.g., wss://api.together.ai/v1/audio/speech/websocket?model=hexgrad/Kokoro-82M&voice=af_alloy&sample_rate=24000&alignment=word) or dynamically via the tts_session.updated event after the connection is established.
Output Raw Bytes
If you want to extract raw audio bytes, use the settings below; the output is saved to a test2.pcm file.
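A hedged sketch of that raw-bytes request, again using only the standard library: the endpoint path is assumed as elsewhere in this guide, and the 16-bit sample-width used by the duration helper is our assumption, so verify it against the model's output spec.

```python
import json
import os
import urllib.request

API_URL = "https://api.together.ai/v1/audio/speech"  # assumed REST path

def fetch_raw_pcm(text: str, out_path: str = "test2.pcm") -> None:
    """Request raw (headerless) PCM audio and write the bytes to disk."""
    body = {
        "model": "hexgrad/Kokoro-82M",
        "input": text,
        "voice": "af_alloy",
        "response_format": "raw",   # raw PCM, no container
        "sample_rate": 24000,
    }
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

def pcm_duration_seconds(num_bytes: int, sample_rate: int = 24000,
                         sample_width: int = 2) -> float:
    """Duration of mono PCM audio, assuming 16-bit (2-byte) samples."""
    return num_bytes / (sample_rate * sample_width)
```

pcm_duration_seconds is handy for sanity-checking that the saved file matches the expected length of the spoken text.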
Response Formats
Together AI supports multiple audio formats:

| Format | Extension | Description | Streaming Support |
|---|---|---|---|
| wav | .wav | Uncompressed audio (larger file size) | No |
| mp3 | .mp3 | Compressed audio (smaller file size) | No |
| raw | .pcm | Raw PCM audio data | Yes |
| mulaw | .ulaw | Uses logarithmic compression to optimize speech quality for telephony | Yes |
Best Practices
Choosing the Right Delivery Method
- Basic HTTP API: Best for batch processing or when you need complete audio files
- Streaming HTTP API: Best for real-time applications where TTFB matters
- WebSocket API: Best for interactive applications requiring the lowest latency (chatbots, live assistants)
- Use streaming when you need the fastest time-to-first-byte
- Use WebSocket API for conversational applications
- Buffer text appropriately - sentence boundaries work best for natural speech
- Use the max_partial_length parameter in WebSocket to control buffer behavior
- Consider using raw (PCM) format for lowest latency, then encode client-side if needed
- Test different voices to find the best match for your application
- Some voices are better suited for specific content types (narration vs conversation)
- Use the Voices API to discover all available options
- Explore our API Reference for detailed parameter documentation
- Learn about Speech-to-Text for the reverse operation
- Check out our PDF to Podcast guide for a complete example
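The "use raw (PCM), then encode client-side" tip above can be sketched with the standard-library wave module: wrap the headerless PCM bytes in a WAV container for playback. Mono 16-bit samples are our assumption here; check the model's output spec.

```python
import io
import wave

def pcm_to_wav(pcm: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)           # mono (assumed)
        w.setsampwidth(2)           # 16-bit samples (assumed)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()

# wav_bytes = pcm_to_wav(open("test2.pcm", "rb").read())
# open("test2.wav", "wb").write(wav_bytes)
```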
Supported Voices
Some of the supported voices for each model are shown below. For the full list of available voices, query the /v1/voices endpoint.
Voices API
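A hedged sketch of querying the /v1/voices endpoint: the host is assumed to match the rest of this guide, and since the response schema isn't shown here, the filter helper assumes a hypothetical per-voice model_id field for illustration.

```python
import json
import os
import urllib.request

VOICES_URL = "https://api.together.ai/v1/voices"  # host assumed from this guide

def list_voices() -> list:
    """Fetch the full voice list from the Voices API."""
    req = urllib.request.Request(
        VOICES_URL,
        headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def voices_for_model(voices: list, model: str) -> list:
    """Filter voice entries by a hypothetical 'model_id' field."""
    return [v for v in voices if v.get("model_id") == model]
```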
Rime Arcana v3 and Arcana v3 Turbo are multilingual models.
Pricing
| Model | Price |
|---|---|
| Orpheus 3B | $15 per 1 Million characters |
| Kokoro | $4 per 1 Million characters |
| Cartesia Sonic 2 | $65 per 1 Million characters |