Text-to-Speech
Learn how to use the text-to-speech functionality supported by Together AI.
Quickstart
1. Register for an account
First, register for an account to get an API key. New accounts come with $5 to get started.
Once you've registered, set your account's API key to an environment variable named TOGETHER_API_KEY:
export TOGETHER_API_KEY=xxxxx
2. Query the models via our API
In this example, we send the model a string and ask it to return a .wav file.
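# Python
import os
import requests

# Read the API key set in step 1.
TOGETHER_API_KEY = os.environ["TOGETHER_API_KEY"]

url = "https://api.together.ai/v1/audio/generations"
headers = {"Authorization": f"Bearer {TOGETHER_API_KEY}"}
data = {
    "input": "Hello, how are you today?",
    "voice": "laidback woman",
    "response_format": "wav",
    "sample_rate": 44100,
    "stream": False,
    "model": "cartesia/sonic",
}
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Fail loudly on HTTP errors.

# Save the returned audio bytes to a playable file.
with open("test2.wav", "wb") as f:
    f.write(response.content)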
// TypeScript
import { createWriteStream } from 'fs';
import { Readable } from 'stream';
import Together from 'together-ai';

// No key is passed here; the client is expected to pick up
// TOGETHER_API_KEY from the environment.
const together = new Together();

async function generateAudio() {
  const res = await together.audio.create({
    input: 'Hello, how are you today?',
    voice: 'laidback woman',
    response_format: 'wav',
    sample_rate: 44100,
    stream: false,
    model: 'cartesia/sonic',
  });

  if (res.body) {
    console.log(res.body);
    // Pipe the response stream into a .wav file on disk.
    const nodeStream = Readable.from(res.body as ReadableStream);
    const fileStream = createWriteStream('./test2.wav');
    nodeStream.pipe(fileStream);
  }
}

generateAudio();
# Shell (cURL)
# The Authorization header uses double quotes so the shell expands $TOGETHER_API_KEY.
curl --location 'https://api.together.ai/v1/audio/generations' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $TOGETHER_API_KEY" \
  --output test2.wav \
  --data '{
    "input": "hello, how are you?",
    "voice": "laidback woman",
    "response_format": "wav",
    "sample_rate": 44100,
    "stream": false,
    "model": "cartesia/sonic"
  }'
This will output a test2.wav file that can be played.
For a complete example that uses this API to generate podcasts, refer to our notebook.
Output Raw Bytes
To get raw audio bytes instead of a .wav file, use the settings below:
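# Python
import os
import requests

# Read the API key from the environment.
TOGETHER_API_KEY = os.environ["TOGETHER_API_KEY"]

url = "https://api.together.ai/v1/audio/generations"
headers = {"Authorization": f"Bearer {TOGETHER_API_KEY}"}
data = {
    "input": "Hello, how are you today?",
    "voice": "laidback woman",
    "response_format": "raw",
    "response_encoding": "pcm_f32le",
    "sample_rate": 44100,
    "stream": False,
    "model": "cartesia/sonic",
}
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Fail loudly on HTTP errors.

# Write the raw PCM samples to disk unchanged.
with open("test2.pcm", "wb") as f:
    f.write(response.content)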
// TypeScript
import Together from 'together-ai';

const together = new Together();

async function generateRawBytes() {
  const res = await together.audio.create({
    input: 'Hello, how are you today?',
    voice: 'laidback woman',
    response_format: 'raw',
    response_encoding: 'pcm_f32le',
    sample_rate: 44100,
    stream: false,
    model: 'cartesia/sonic',
  });

  console.log(res.body);
}

generateRawBytes();
# Shell (cURL)
# The Authorization header uses double quotes so the shell expands $TOGETHER_API_KEY.
curl --location 'https://api.together.ai/v1/audio/generations' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $TOGETHER_API_KEY" \
  --output test2.pcm \
  --data '{
    "input": "hello, how are you?",
    "voice": "laidback woman",
    "response_format": "raw",
    "response_encoding": "pcm_f32le",
    "sample_rate": 44100,
    "stream": false,
    "model": "cartesia/sonic"
  }'
This will output a test2.pcm file containing the raw audio bytes.
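A raw .pcm file has no header, so most players cannot open it directly. The sketch below shows one way to wrap pcm_f32le samples in a 16-bit WAV container using only the Python standard library. The helper name pcm_f32le_to_wav is hypothetical, and the default of one channel (mono) is an assumption; the sample rate should match the sample_rate sent in the request.

```python
import array
import sys
import wave


def pcm_f32le_to_wav(pcm_bytes: bytes, wav_path: str,
                     sample_rate: int = 44100, channels: int = 1) -> int:
    """Convert little-endian float32 PCM bytes to a 16-bit WAV file.

    Returns the number of frames written. `channels=1` is an assumption.
    """
    samples = array.array("f")
    # Drop any trailing partial sample so frombytes() does not raise.
    usable = len(pcm_bytes) - (len(pcm_bytes) % samples.itemsize)
    samples.frombytes(pcm_bytes[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # The input is little-endian; array uses native order.

    # Clamp each sample to [-1.0, 1.0] and scale to a signed 16-bit integer.
    ints = array.array(
        "h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    )

    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(ints.tobytes())
    return len(ints) // channels
```

For example, after saving the response as test2.pcm, calling `pcm_f32le_to_wav(open("test2.pcm", "rb").read(), "test2_converted.wav")` would produce a file that ordinary audio players can open. Converting to 16-bit integers loses some precision; it is used here because the standard wave module writes integer PCM only.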
Voices Available
All valid voice model strings:
'german conversational woman',
'nonfiction man',
'friendly sidekick',
'french conversational lady',
'french narrator lady',
'german reporter woman',
'indian lady',
'british reading lady',
'british narration lady',
'japanese children book',
'japanese woman conversational',
'japanese male conversational',
'reading lady',
'newsman',
'child',
'meditation lady',
'maria',
"1920's radioman",
'newslady',
'calm lady',
'helpful woman',
'mexican woman',
'korean narrator woman',
'russian calm lady',
'russian narrator man 1',
'russian narrator man 2',
'russian narrator woman',
'hinglish speaking lady',
'italian narrator woman',
'polish narrator woman',
'chinese female conversational',
'pilot over intercom',
'chinese commercial man',
'french narrator man',
'spanish narrator man',
'reading man',
'new york man',
'friendly french man',
'barbershop man',
'indian man',
'australian customer support man',
'friendly australian man',
'wise man',
'friendly reading man',
'customer support man',
'dutch confident man',
'dutch man',
'hindi reporter man',
'italian calm man',
'italian narrator man',
'swedish narrator man',
'polish confident man',
'spanish-speaking storyteller man',
'kentucky woman',
'chinese commercial woman',
'middle eastern woman',
'hindi narrator woman',
'sarah',
'sarah curious',
'laidback woman',
'reflective woman',
'helpful french lady',
'pleasant brazilian lady',
'customer support lady',
'british lady',
'wise lady',
'australian narrator lady',
'indian customer support lady',
'swedish calm lady',
'spanish narrator lady',
'salesman',
'yogaman',
'movieman',
'wizardman',
'australian woman',
'korean calm woman',
'friendly german man',
'announcer man',
'wise guide man',
'midwestern man',
'kentucky man',
'brazilian young man',
'chinese call center man',
'german reporter man',
'confident british man',
'southern man',
'classy british man',
'polite man',
'mexican man',
'korean narrator man',
'turkish narrator man',
'turkish calm man',
'hindi calm man',
'hindi narrator man',
'polish narrator man',
'polish young man',
'alabama male',
'australian male',
'anime girl',
'japanese man book',
'sweet lady',
'commercial lady',
'teacher lady',
'princess',
'commercial man',
'asmr lady',
'professional woman',
'tutorial man',
'calm french woman',
'new york woman',
'spanish-speaking lady',
'midwestern woman',
'sportsman',
'storyteller lady',
'spanish-speaking man',
'doctor mischief',
'spanish-speaking reporter man',
'young spanish-speaking woman',
'the merchant',
'stern french man',
'madame mischief',
'german storyteller man',
'female nurse',
'german conversation man',
'friendly brazilian man',
'german woman',
'southern woman',
'british customer support lady',
'chinese woman narrator',
'pleasant man',
'california girl',
'john',
'anna'
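Because the voice parameter must match one of the strings above, it can help to check a voice name locally before sending a request. The following is a minimal sketch, not part of the Together AI SDK: resolve_voice and KNOWN_VOICES are hypothetical names, the set holds only a few entries from the list for brevity, and lowercasing the input is an assumption based on every listed voice being lowercase.

```python
import difflib

# A small subset of the voice strings listed above, for illustration only.
KNOWN_VOICES = {
    "laidback woman",
    "helpful woman",
    "newsman",
    "reading lady",
    "customer support man",
}


def resolve_voice(name: str, known_voices=KNOWN_VOICES) -> str:
    """Return the normalized voice name, or raise ValueError with suggestions."""
    # Normalize case and collapse repeated whitespace (an assumption, see above).
    candidate = " ".join(name.lower().split())
    if candidate in known_voices:
        return candidate
    # Offer near matches to help catch typos.
    suggestions = difflib.get_close_matches(candidate, sorted(known_voices), n=3)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown voice {name!r}.{hint}")
```

The returned string can then be passed as the "voice" field of the request body.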
Pricing
| Model | Price |
|---|---|
| Cartesia Sonic | $65 per 1 million characters |
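To get a rough sense of cost before generating audio, the rate in the table can be applied to the length of the input text. This is an illustrative sketch only: estimate_cost_usd is a hypothetical helper, and it assumes every character of the input, including spaces and punctuation, counts toward billing.

```python
from decimal import Decimal

# Cartesia Sonic rate from the pricing table: $65 per 1 million characters.
PRICE_PER_MILLION_CHARS_USD = Decimal("65")


def estimate_cost_usd(text: str) -> Decimal:
    """Estimate the cost of synthesizing `text`, assuming every character is billed."""
    return PRICE_PER_MILLION_CHARS_USD * len(text) / Decimal(1_000_000)
```

Under these assumptions, a 1,000-character script comes to about $0.065.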