Python library

Use this guide to learn how to run inference with our Python API.

In this tutorial, you will use the Python API to query a language model. We will ask the togethercomputer/RedPajama-INCITE-7B-Instruct model the question "What are Isaac Asimov's Three Laws of Robotics?"




For the full API reference, go to API Reference.



Prerequisites

  • Ensure you have Python installed on your machine.
  • Have an app, notebook, or evaluation script where you want to use the Python Library.
  • Create a free account with together.ai to obtain a Together API Key.
  • For your reference, here is a link to the Together Python Library.

Install the Library

Install or update the Together library by executing the following command in your terminal:

pip install --upgrade together

Then import the library in your Python code:

import together

Authenticate

Configure the API key by setting it in your Python code:

together.api_key = "xxxxx"

Find your API key in your account settings.
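
To keep the key out of your source code, you can read it from an environment variable instead of pasting it in. The following is a minimal sketch; the helper name load_api_key is hypothetical and the variable name TOGETHER_API_KEY is an assumption, so adjust both to your setup.

```python
import os


def load_api_key(env_var: str = "TOGETHER_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name your environment sets.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage (requires the together package and a real key):
#   import together
#   together.api_key = load_api_key()
```

Failing early with a clear error makes a missing key easier to diagnose than a rejected request later on.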

Select your Model

The model we are using for this guide is togethercomputer/RedPajama-INCITE-7B-Instruct. You can browse all available models on this list or by running the following code:

# see available models
model_list = together.Models.list()

print(f"{len(model_list)} models available")

# print the names of the first 10 models
model_names = [model_dict['name'] for model_dict in model_list]
print(model_names[:10])
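
The full list can be long, so it may help to search it by keyword. Here is a small sketch that assumes each entry is a dictionary with a 'name' key, as in the code above. The function filter_model_names and the sample data are illustrative stand-ins, not part of the library.

```python
def filter_model_names(model_list, keyword):
    """Return sorted model names containing keyword (case-insensitive)."""
    keyword = keyword.lower()
    # Skip entries without a 'name' key rather than failing on them.
    names = [m["name"] for m in model_list if "name" in m]
    return sorted(name for name in names if keyword in name.lower())


# Stand-in for the result of together.Models.list(); real entries
# are expected to carry more fields than just 'name'.
sample_models = [
    {"name": "togethercomputer/RedPajama-INCITE-7B-Instruct"},
    {"name": "togethercomputer/RedPajama-INCITE-7B-Base"},
    {"name": "togethercomputer/LLaMA-2-7B-32K"},
]

print(filter_model_names(sample_models, "redpajama"))
```

In real use, you would pass model_list from the code above in place of sample_models.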

Start Querying

Once you've selected a model, you can start querying it. Note the input parameters you can adjust to shape the output, and that the generated text is returned in the choices list.

output = together.Complete.create(
    prompt="<human>: What are Isaac Asimov's Three Laws of Robotics?\n<bot>:",
    model="togethercomputer/RedPajama-INCITE-7B-Instruct",
    max_tokens=256,
    temperature=0.8,
    top_k=60,
    top_p=0.6,
    repetition_penalty=1.1,
    stop=['<human>', '\n\n']
)

# print generated text
print(output['output']['choices'][0]['text'])
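
If you send several questions, you may want to build the prompt and parameters in one place. The sketch below reuses the <human>/<bot> format and the parameter values from the call above. The helpers build_prompt and build_request are hypothetical names introduced for illustration, not library functions.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the <human>/<bot> turn format used in this guide."""
    return f"<human>: {question.strip()}\n<bot>:"


def build_request(question: str, **overrides) -> dict:
    """Return keyword arguments shaped like the Complete.create call above."""
    request = {
        "prompt": build_prompt(question),
        "model": "togethercomputer/RedPajama-INCITE-7B-Instruct",
        "max_tokens": 256,
        "temperature": 0.8,
        "top_k": 60,
        "top_p": 0.6,
        "repetition_penalty": 1.1,
        "stop": ["<human>", "\n\n"],
    }
    request.update(overrides)  # e.g. max_tokens=128 for a shorter reply
    return request


request = build_request(
    "What are Isaac Asimov's Three Laws of Robotics?", max_tokens=128
)
print(request["prompt"])

# With the library installed and a key set, the call would be:
#   output = together.Complete.create(**request)
```

Keeping the defaults in one dictionary makes it easy to vary a single parameter per request while leaving the rest unchanged.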

We update these models and our API regularly, so the exact fields may change. Here is one example that shows the components of the output available to you:

# print the entire output to see its components
print(output)

Example output:

{
    "status": "finished",
    "prompt": [
        "<human>: What are Isaac Asimov's Three Laws of Robotics?\n<bot>:"
    ],
    "model": "togethercomputer/RedPajama-INCITE-7B-Instruct",
    "model_owner": "",
    "tags": {},
    "num_returns": 1,
    "args": {
        "model": "togethercomputer/RedPajama-INCITE-7B-Instruct",
        "prompt": "<human>: What are Isaac Asimov's Three Laws of Robotics?\n<bot>:",
        "top_p": 0.6,
        "top_k": 60,
        "temperature": 0.8,
        "max_tokens": 256,
        "stop": [
            "<human>",
            "\n\n"
        ],
        "repetition_penalty": 1.1,
        "logprobs": null
    },
    "subjobs": [],
    "output": {
        "choices": [
            {
                "finish_reason": "length",
                "index": 0,
                "text": " The three laws were written by Isaac Asimov. They are: 1) A robot may not injure a human being or, through inaction, allow a human being to come to harm 2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law 3) A robot must protect its own existence as long as such protection does not conflict with the First or Second law"
            }
        ],
        "raw_compute_time": 3.1673731608316302,
        "result_type": "language-model-inference"
    }
}
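
To pull out only the fields you usually need, you can walk the nested structure shown above. This is a minimal sketch that assumes the response has the same shape as the example. The helper summarize_response is a hypothetical name, and the sample dictionary is a shortened copy of the example output rather than a live response.

```python
def summarize_response(output: dict) -> dict:
    """Extract the generated text and a few metadata fields."""
    result = output["output"]
    choice = result["choices"][0]  # first (and here, only) completion
    return {
        "text": choice["text"].strip(),
        "finish_reason": choice.get("finish_reason"),
        "raw_compute_time": result.get("raw_compute_time"),
    }


# Shortened copy of the example output above, for illustration only.
sample_output = {
    "status": "finished",
    "output": {
        "choices": [
            {
                "finish_reason": "length",
                "index": 0,
                "text": " The three laws were written by Isaac Asimov.",
            }
        ],
        "raw_compute_time": 3.1673731608316302,
        "result_type": "language-model-inference",
    },
}

summary = summarize_response(sample_output)
print(summary["text"])
print(summary["finish_reason"])
```

Using .get for the metadata fields keeps the helper working if a field is absent, while a missing choices list still raises an error that you can handle explicitly.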