We’ve aggregated questions frequently asked by our customers on this page. If you have a question that is not answered anywhere in our documentation, contact our support team. We will respond promptly!
This error is returned by the server when an Inference VM has not been started for the model being queried. To resolve it, navigate to the Playground and start an instance of the model by clicking the "play" button on its model card.
- See 50+ models hosted for inference.
- For technical details, see the API Reference.
- Check out our web-based Chat, Language, Code, and Image Playgrounds.
- Inference pricing varies by model and is based on tokens used. See inference pricing.
- Learn best practices and prompt engineering techniques through examples.
- Learn other ways to run inference with the REST API or the Python API.
Complete can be made to behave similarly to Chat, but it requires formatting the prompts manually. Here's an example of a Complete command that produces the same result as a Chat session:
```
$ together chat
Loading togethercomputer/RedPajama-INCITE-7B-Chat
Type /quit to quit, /help, or /? to list commands.
>>> List the best restaurants in SF
```

```
$ together complete "<human>: List the best restaurants in SF\n<bot>: "
```
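If you are calling Complete programmatically, the manual `<human>`/`<bot>` formatting can be wrapped in a small helper. This is an illustrative sketch, not part of any library: the function name is ours, and it only builds the prompt string in the tag format the RedPajama-INCITE chat model expects.

```python
def to_complete_prompt(user_message: str) -> str:
    """Format a single chat turn into the <human>/<bot> tagged prompt
    that the RedPajama-INCITE-7B-Chat model expects from Complete."""
    return f"<human>: {user_message}\n<bot>: "

# Produces the same prompt string passed to `together complete` above.
print(to_complete_prompt("List the best restaurants in SF"))
```

For multi-turn conversations you would concatenate prior turns in the same tagged format before appending the final `<bot>: ` marker, since Complete, unlike Chat, keeps no history for you.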
The key difference is that Chat is a purely interactive experience with features like chat history, while Complete offers a more customizable, manual experience that is useful for implementations in custom use cases.
If you want to run inference:
- Choose from the available models list.
- For Featured models, you can run inference directly without starting a virtual machine (VM).
- If a model is marked as Offline, start its VM instance either:
- Directly from the model's page on api.together.xyz, or
- Using our start and stop instance APIs.
- If your desired model isn't listed, feel free to request a model.
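The start/stop step above can be sketched against the HTTP API. Note the assumptions here: the `/instances/start` and `/instances/stop` paths, the `model` query parameter, and the Bearer auth header are our guesses at the shape of these endpoints, not confirmed by this page; consult the API Reference for the authoritative routes.

```python
import urllib.request

BASE_URL = "https://api.together.xyz"  # base host taken from the model-page bullet above


def instance_request(action: str, model: str, api_key: str) -> urllib.request.Request:
    """Build a start/stop request for a model's Inference VM.

    The endpoint path and auth header shown here are assumptions;
    check the API Reference for the exact routes before using this.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    url = f"{BASE_URL}/instances/{action}?model={model}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Build (but don't send) a request to start the chat model used above.
req = instance_request("start", "togethercomputer/RedPajama-INCITE-7B-Chat", "MY_API_KEY")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would start the instance; a matching `instance_request("stop", ...)` call would shut it down when you are done, which stops billing for the VM.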