Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
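A minimal sketch of building this header in Python. The helper name `bearer_header` is hypothetical, and the header name `Authorization` is assumed from the usual bearer-token convention rather than stated on this page.

```python
from typing import Dict


def bearer_header(token: str) -> Dict[str, str]:
    """Build a header of the form 'Bearer <token>'.

    Assumption: the header is sent under the standard 'Authorization' name.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.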
Body
multipart/form-data
Audio file to transcribe, provided as a file upload or a public HTTP/HTTPS URL. Supported formats: .wav, .mp3, .m4a, .webm, .flac.
Model to use for transcription
Available options:
openai/whisper-large-v3
Optional ISO 639-1 language code. If auto is provided, the language is auto-detected.
Example:
"en"
Optional text to bias decoding.
The format of the response
Available options:
json, verbose_json
Sampling temperature between 0.0 and 1.0
Required range:
0 <= x <= 1
Controls the level of timestamp detail in verbose_json output. Only used when response_format is verbose_json. Can be a single granularity or an array to request multiple levels.
Available options:
segment, word
Example:
["word", "segment"]
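The constraints above can be checked on the client before sending a request. The sketch below assembles the non-file form fields and validates them against the documented options and ranges. Only `response_format` is named on this page; the other field names (`model`, `language`, `prompt`, `temperature`, `timestamp_granularities`) and the helper name `build_transcription_fields` are assumptions for illustration, and how an array value is encoded in multipart/form-data is left to the HTTP client.

```python
from typing import Any, Dict, Optional, Sequence, Union

RESPONSE_FORMATS = ("json", "verbose_json")
GRANULARITIES = ("segment", "word")


def build_transcription_fields(
    model: str = "openai/whisper-large-v3",
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: str = "json",
    temperature: Optional[float] = None,
    timestamp_granularities: Union[str, Sequence[str], None] = None,
) -> Dict[str, Any]:
    """Validate and collect form fields (field names are assumed, see lead-in)."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"response_format must be one of {RESPONSE_FORMATS}")
    fields: Dict[str, Any] = {"model": model, "response_format": response_format}

    if language is not None:
        # Shape check only: "auto" or a two-letter code such as "en".
        if language != "auto" and not (len(language) == 2 and language.isalpha()):
            raise ValueError("language must be 'auto' or a two-letter ISO 639-1 code")
        fields["language"] = language

    if prompt is not None:
        fields["prompt"] = prompt  # optional text to bias decoding

    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must satisfy 0 <= x <= 1")
        fields["temperature"] = temperature

    # Granularity only applies to verbose_json, so it is dropped otherwise.
    if timestamp_granularities is not None and response_format == "verbose_json":
        if isinstance(timestamp_granularities, str):
            levels = [timestamp_granularities]
        else:
            levels = list(timestamp_granularities)
        for level in levels:
            if level not in GRANULARITIES:
                raise ValueError(f"granularity must be one of {GRANULARITIES}")
        fields["timestamp_granularities"] = levels

    return fields
```

A single granularity is normalized to a one-element list here purely for convenience; the endpoint itself is documented as accepting either form.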
Response
OK
The transcribed text
Example:
"Hello, world!"