Legacy /inference endpoint. We recommend using the newer completions or chat completions endpoints.
curl --request POST \
--url https://api.together.xyz/inference \
--header 'Authorization: Bearer <api-key>' \
--header 'Content-Type: application/json' \
--data '{
"model": "togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1",
"prompt": "The capital of France is",
"max_tokens": 1,
"temperature": 0.7,
"top_p": 0.7,
"top_k": 50,
"repetition_penalty": 1
}'
"{\n \"status\": \"finished\",\n \"prompt\": [\n \"The capital of France is \"\n ],\n \"model\": \"togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1\",\n \"model_owner\": \"\",\n \"tags\": {},\n \"num_returns\": 1,\n \"args\": {\n \"model\": \"togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1\",\n \"prompt\": \"The capital of France is \",\n \"temperature\": 0.8,\n \"top_p\": 0.7,\n \"top_k\": 50,\n \"max_tokens\": 1\n },\n \"subjobs\": [],\n \"output\": {\n \"choices\": [\n {\n \"finish_reason\": \"length\",\n \"index\": 0,\n \"text\": \" Paris\"\n }\n ],\n \"raw_compute_time\": 0.06382315792143345,\n \"result_type\": \"language-model-inference\"\n }\n}"
200
The response is of type object.
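
The same request can be made from a script. Below is a minimal Python sketch, assuming the requests package is installed and the API key is exported in a TOGETHER_API_KEY environment variable (the variable name and error handling are illustrative, not part of the API). It sends the request shown above and reads the generated text from output.choices, matching the sample response.

import os

import requests

# Minimal sketch of the legacy /inference call shown in the curl example.
# TOGETHER_API_KEY is an assumed environment variable name.
url = "https://api.together.xyz/inference"
headers = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}
payload = {
    "model": "togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1",
    "prompt": "The capital of France is",
    "max_tokens": 1,
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "repetition_penalty": 1,
}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()
result = resp.json()

# The generated completion sits under output.choices[].text,
# " Paris" in the sample response above.
print(result["output"]["choices"][0]["text"])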