The Chat function of the Together Python Library lets you chat with any model available through the Together API with a single command. It handles prompt formatting and conversation history for you, so you can hold a back-and-forth conversation with a model directly from the command line.

To get started, first start an Inference VM from the Playground (starting and stopping instances through an API call or the Python library is coming soon).

See all commands with:

$ together chat --help

Quickstart

To start an interactive chat session, simply run:

$ together chat

This will start a chat session with the default model togethercomputer/RedPajama-INCITE-7B-Chat.

You can specify a different model with --model or -m. See the List Models section for how to list all available models. Here's an example:

$ together chat -m togethercomputer/RedPajama-INCITE-Chat-3B-v1

Chat format

The prompt format for chat looks like this:

"{HISTORY}{user_id}: {prompt}\n{bot_id}:"

The default value is <human> for {user_id} and <bot> for {bot_id}, which follows the prompt template of the laion/OIG dataset. These values can be set with --user_id and --bot_id. For best results, change them to match the prompt format the chosen model was trained on.
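The template above can be sketched in a few lines of Python. This is an illustrative helper, not the library's own code: the function name build_chat_prompt, the representation of history as (user text, bot text) pairs, and the way earlier turns are joined are all assumptions made for the example.

```python
def build_chat_prompt(history, prompt, user_id="<human>", bot_id="<bot>"):
    """Assemble a chat prompt following the template shown above."""
    # ASSUMPTION: history is a list of (user_text, bot_text) pairs, and each
    # past turn is rendered on its own lines. The real library may store and
    # serialize history differently.
    past_turns = "".join(
        f"{user_id}: {user_text}\n{bot_id}: {bot_text}\n"
        for user_text, bot_text in history
    )
    # The prompt ends with the bot marker so the model continues as the bot.
    return f"{past_turns}{user_id}: {prompt}\n{bot_id}:"


if __name__ == "__main__":
    first_turn = build_chat_prompt([], "List the best restaurants in SF")
    print(repr(first_turn))
```

With an empty history the result is just the current user turn followed by the bot marker. Passing custom user_id and bot_id values mirrors what --user_id and --bot_id do on the command line.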

Other arguments

  • max-tokens: Maximum number of tokens to generate per request. Defaults to 128.
  • stop: Tokens to stop generation at. Accepts multiple values. Defaults to [""] and adds user_id if a custom one is provided.
  • temperature: Determines the degree of randomness in the response. Defaults to 0.7.
  • top-p: Used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities. Defaults to 0.7.
  • top-k: Used to limit the number of choices for the next predicted word or token. Defaults to 50.
  • repetition-penalty: Controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
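To make the sampling arguments more concrete, here is a minimal pure-Python sketch of how temperature, top-k, and top-p are commonly combined to narrow the set of candidate tokens. It is not the Together server's implementation: the function name filter_token_probs, the order of the filtering steps, and the toy logits are assumptions for illustration, and repetition penalty is left out.

```python
import math


def filter_token_probs(logits, temperature=0.7, top_k=50, top_p=0.7):
    """Return renormalized token probabilities after common sampling filters.

    logits maps each candidate token to its raw score (hypothetical input).
    """
    # Temperature: divide scores before the softmax. Values below 1 sharpen
    # the distribution; values above 1 flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    # top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


if __name__ == "__main__":
    toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}
    print(filter_token_probs(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

Lower top-k or top-p values leave fewer candidates, which makes output more predictable. Higher values leave more candidates and more variety.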

Chat vs Complete

Complete can be made to behave similarly to Chat, but it requires formatting the prompts manually. Here is an example of Chat and Complete commands that produce the same results:

$ together chat
Loading togethercomputer/RedPajama-INCITE-7B-Chat
Type /quit to quit, /help, or /? to list commands.

>>> List the best restaurants in SF
$ together complete "<human>: List the best restaurants in SF\n<bot>: "

The key difference is that Chat is a purely interactive experience with features like chat history, while Complete is more customizable and manual, which makes it better suited to custom use cases.

FAQ

Why do I see a 429 status code error - "No instance started"?

This error is returned by the server if an Inference VM has not been started for the model being queried. To resolve this error, simply navigate to the Playground and start an instance of a model by hitting the "play" button in the model card.

What models are available to chat with?

See the Models page for a list of models that can be queried. You can also see these models by navigating to the Playground.