Overview

Learn how to improve the relevance of your search and RAG systems with reranking.

What is a reranker?

A reranker is a specialized model that improves search relevancy by reassessing and reordering a set of retrieved documents based on their relevance to a given query. It takes a query and a set of text inputs (called 'documents'), and returns a relevancy score for each document relative to the given query. This process helps filter and prioritize the most pertinent information, enhancing the quality of search results.

In Retrieval Augmented Generation (RAG) pipelines, the reranking step sits between the initial retrieval step and the final generation phase. It acts as a quality filter, refining the selection of documents that will be used as context for language models. By ensuring that only the most relevant information is passed to the generation phase, rerankers play a crucial role in improving the accuracy of generated responses while potentially reducing processing costs.
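
To make the position of the reranking step concrete, here is a minimal sketch of a retrieve, rerank, generate flow. The function names (`retrieve`, `toy_rerank`, `build_prompt`) and the word-overlap scorer are hypothetical stand-ins for illustration only; a real pipeline would call a retriever and a rerank model instead.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[str], k: int) -> list[str]:
    """Stand-in first-stage retriever: keep up to k documents sharing a word with the query."""
    q = tokenize(query)
    return [doc for doc in corpus if q & tokenize(doc)][:k]


def toy_rerank(query: str, documents: list[str], top_n: int) -> list[tuple[int, float]]:
    """Stand-in reranker (NOT a real model): score is the fraction of query words in the document.

    Returns (index, score) pairs sorted from most to least relevant, cut to top_n.
    """
    q = tokenize(query)
    scored = [
        (i, len(q & tokenize(doc)) / max(len(q), 1))
        for i, doc in enumerate(documents)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be handed to the generation phase."""
    joined = "\n".join(f"- {doc}" for doc in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


corpus = [
    "Carson City is the capital of Nevada.",
    "Capital letters start a sentence.",
    "Llamas live in the Andes.",
]
query = "What is the capital of Nevada?"

candidates = retrieve(query, corpus, k=3)           # broad first pass
ranked = toy_rerank(query, candidates, top_n=1)     # quality filter
prompt = build_prompt(query, [candidates[i] for i, _ in ranked])
```

Only the documents that survive the rerank step reach the prompt, which is how reranking can shrink the context passed to the language model.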

How does Together's Rerank API work?

Together's serverless Rerank API allows you to seamlessly integrate supported rerank models into your enterprise applications. It takes in a query and a number of documents, and outputs a relevancy score and ordering index for each document. It can also filter its response to the n most relevant documents.

Together's Rerank API is also compatible with Cohere Rerank, making it easy to try out our reranker models on your existing applications.

Key features of Together's Rerank API include:

  • Flagship support for LlamaRank, Salesforce’s reranker model
  • Support for JSON and tabular data
  • Long 8K context per document
  • Low latency for fast search queries
  • Full compatibility with Cohere's Rerank API


Cohere Rerank compatibility

The Together Rerank endpoint is compatible with Cohere Rerank, making it easy to test models like LlamaRank in your existing applications. To switch, update the base URL, API key, and model name in your Cohere client code.

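import os

import cohere

# Point the Cohere client at Together's endpoint and authenticate with a Together API key.
co = cohere.Client(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)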
docs = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
    "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
]
response = co.rerank(
    model="Salesforce/Llama-Rank-V1",
    query="What is the capital of the United States?",
    documents=docs,
    top_n=3,
)

Get Started

Example with text

In the example below, we use the Rerank API endpoint to rank a list of documents from most to least relevant to the query "What animals can I find near Peru?".

Request

In this example, the documents being passed in are a list of strings:

from together import Together

client = Together()

query = "What animals can I find near Peru?"

documents = [
  "The giant panda (Ailuropoda melanoleuca), also known as the panda bear or simply panda, is a bear species endemic to China.",
  "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era.",
  "The wild Bactrian camel (Camelus ferus) is an endangered species of camel endemic to Northwest China and southwestern Mongolia.",
  "The guanaco is a camelid native to South America, closely related to the llama. Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations.",
]

# Rerank the documents and keep only the two most relevant ones.
response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=documents,
    top_n=2,
)

for result in response.results:
    print(f"Document Index: {result.index}")
    print(f"Document: {documents[result.index]}")
    print(f"Relevance Score: {result.relevance_score}")
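
Because each result carries the `index` of the document in the original list, the reranked order can be mapped back to the document text, for instance to assemble context for a RAG prompt. The sketch below relies only on the `index` and `relevance_score` attributes used in the loop above; the helper name `select_context` and the `min_score` threshold are hypothetical and not part of the Together SDK, and the scores are made up.

```python
from types import SimpleNamespace


def select_context(results, documents, min_score=0.0):
    """Return document texts in reranked order, dropping entries below min_score."""
    return [
        documents[r.index]
        for r in results
        if r.relevance_score >= min_score
    ]


# Stand-in objects shaped like the entries of response.results (scores are invented).
fake_results = [
    SimpleNamespace(index=1, relevance_score=0.9),
    SimpleNamespace(index=3, relevance_score=0.4),
]
docs = ["panda", "llama", "camel", "guanaco"]

context = select_context(fake_results, docs, min_score=0.5)
```

A score threshold like this is one possible design choice; a suitable value depends on the model and data, so it is best tuned on your own queries.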

Example with JSON Data

Alternatively, you can pass in a list of JSON objects and specify the fields you'd like to rank over, along with the order in which they should be considered. If you do not pass any rank_fields, the API defaults to the text key.

The example below passes in some emails with the query "Which pricing did we get from Oracle?".

Request

from together import Together

client = Together()

query = "Which pricing did we get from Oracle?"

documents = [
    {
        "from": "Paul Doe <[email protected]>",
        "to": ["Steve <[email protected]>", "[email protected]"],
        "date": "2024-03-27",
        "subject": "Follow-up",
        "text": "We are happy to give you the following pricing for your project.",
    },
    {
        "from": "John McGill <[email protected]>",
        "to": ["Steve <[email protected]>"],
        "date": "2024-03-28",
        "subject": "Missing Information",
        "text": "Sorry, but here is the pricing you asked for for the newest line of your models.",
    },
    {
        "from": "John McGill <[email protected]>",
        "to": ["Steve <[email protected]>"],
        "date": "2024-02-15",
        "subject": "Commited Pricing Strategy",
        "text": "I know we went back and forth on this during the call but the pricing for now should follow the agreement at hand.",
    },
    {
        "from": "Generic Airline Company<no_reply@generic_airline_email.com>",
        "to": ["Steve <[email protected]>"],
        "date": "2023-07-25",
        "subject": "Your latest flight travel plans",
        "text": "Thank you for choose to fly Generic Airline Company. Your booking status is confirmed.",
    },
    {
        "from": "Generic SaaS Company<marketing@generic_saas_email.com>",
        "to": ["Steve <[email protected]>"],
        "date": "2024-01-26",
        "subject": "How to build generative AI applications using Generic Company Name",
        "text": "Hey Steve! Generative AI is growing so quickly and we know you want to build fast!",
    },
    {
        "from": "Paul Doe <[email protected]>",
        "to": ["Steve <[email protected]>", "[email protected]"],
        "date": "2024-04-09",
        "subject": "Price Adjustment",
        "text": "Re: our previous correspondence on 3/27 we'd like to make an amendment on our pricing proposal. We'll have to decrease the expected base price by 5%.",
    },
]

response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=documents,
    return_documents=True,
    rank_fields=["from", "to", "date", "subject", "text"],
)

print(response)

In the documents parameter, we pass in a list of objects that each have the keys ['from', 'to', 'date', 'subject', 'text']. In the Rerank call, rank_fields specifies which keys to rank over, as well as the order in which the key-value pairs should be considered. Finally, we set return_documents to True because we want the documents included in the response.
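
Since `rank_fields` names keys that each document is expected to contain, it can help to check the documents locally before sending the request. The helper below is a hypothetical pre-flight check, not part of the Together SDK, and it makes no claim about how the API itself treats a missing key.

```python
def missing_rank_fields(documents, rank_fields):
    """Map each document's position to the rank_fields keys it lacks."""
    problems = {}
    for position, doc in enumerate(documents):
        missing = [field for field in rank_fields if field not in doc]
        if missing:
            problems[position] = missing
    return problems


# Abbreviated placeholder documents; the second one has no "text" key.
sample_documents = [
    {"subject": "Follow-up", "text": "We are happy to give you the following pricing."},
    {"subject": "Price Adjustment"},
]

problems = missing_rank_fields(sample_documents, ["subject", "text"])
```

An empty dictionary means every document contains every requested field.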

Response

{
  "model": "Salesforce/Llama-Rank-v1",
  "results": [
    {
      "index": 0,
      "document": {
        "text": "{\"from\":\"Paul Doe <[email protected]>\",\"to\":[\"Steve <[email protected]>\",\"[email protected]\"],\"date\":\"2024-03-27\",\"subject\":\"Follow-up\",\"text\":\"We are happy to give you the following pricing for your project.\"}"
      },
      "relevance_score": 0.606349439153678
    },
    {
      "index": 5,
      "document": {
        "text": "{\"from\":\"Paul Doe <[email protected]>\",\"to\":[\"Steve <[email protected]>\",\"[email protected]\"],\"date\":\"2024-04-09\",\"subject\":\"Price Adjustment\",\"text\":\"Re: our previous correspondence on 3/27 we'd like to make an amendment on our pricing proposal. We'll have to decrease the expected base price by 5%.\"}"
      },
      "relevance_score": 0.5059948716207964
    },
    {
      "index": 1,
      "document": {
        "text": "{\"from\":\"John McGill <[email protected]>\",\"to\":[\"Steve <[email protected]>\"],\"date\":\"2024-03-28\",\"subject\":\"Missing Information\",\"text\":\"Sorry, but here is the pricing you asked for for the newest line of your models.\"}"
      },
      "relevance_score": 0.2271930688841643
    },
    {
      "index": 2,
      "document": {
        "text": "{\"from\":\"John McGill <[email protected]>\",\"to\":[\"Steve <[email protected]>\"],\"date\":\"2024-02-15\",\"subject\":\"Commited Pricing Strategy\",\"text\":\"I know we went back and forth on this during the call but the pricing for now should follow the agreement at hand.\"}"
      },
      "relevance_score": 0.2229844295907072
    },
    {
      "index": 4,
      "document": {
        "text": "{\"from\":\"Generic SaaS Company<marketing@generic_saas_email.com>\",\"to\":[\"Steve <[email protected]>\"],\"date\":\"2024-01-26\",\"subject\":\"How to build generative AI applications using Generic Company Name\",\"text\":\"Hey Steve! Generative AI is growing so quickly and we know you want to build fast!\"}"
      },
      "relevance_score": 0.0021253144747196517
    },
    {
      "index": 3,
      "document": {
        "text": "{\"from\":\"Generic Airline Company<no_reply@generic_airline_email.com>\",\"to\":[\"Steve <[email protected]>\"],\"date\":\"2023-07-25\",\"subject\":\"Your latest flight travel plans\",\"text\":\"Thank you for choose to fly Generic Airline Company. Your booking status is confirmed.\"}"
      },
      "relevance_score": 0.0010322494264659
    }
  ]
}
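
In the response above, each `document.text` value is a JSON-encoded string of the original object rather than a nested object. A minimal sketch of decoding it with the standard library follows; the sample entry is abbreviated with placeholder values, and `decode_documents` is a hypothetical helper name.

```python
import json


def decode_documents(entries):
    """Yield (index, relevance_score, decoded_document) for each result entry."""
    for entry in entries:
        document = json.loads(entry["document"]["text"])
        yield entry["index"], entry["relevance_score"], document


# One abbreviated entry in the same shape as the response shown above.
sample_entries = [
    {
        "index": 0,
        "document": {"text": "{\"subject\":\"Follow-up\",\"date\":\"2024-03-27\"}"},
        "relevance_score": 0.6,
    }
]

decoded = list(decode_documents(sample_entries))
```

After decoding, fields such as `subject` or `date` can be read directly from the resulting dictionary.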