- If you have a high volume of steady traffic and good payment history for this traffic, you can request a higher limit by emailing [email protected].
- If you are interested in our Enterprise package, with custom requests per minute (RPM) and unlimited tokens per minute (TPM), please reach out to sales here.
Best Practice
To maximize successful requests:
- Stay within your cap rate limit.
- Prefer steady, consistent traffic and avoid bursts.
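One way to keep traffic steady is to pace requests on the client. The sketch below is illustrative only: `paced_map` and its `send` callable are hypothetical names, not part of any SDK. It assumes a per-minute limit is best consumed as evenly spaced requests (60 RPM as roughly one request per second).

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def paced_map(
    send: Callable[[T], R],
    items: Iterable[T],
    rpm_limit: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[R]:
    """Call send(item) for each item, spacing request starts evenly."""
    interval = 60.0 / rpm_limit  # seconds between request starts
    results: List[R] = []
    next_start = clock()
    for item in items:
        delay = next_start - clock()
        if delay > 0:
            sleep(delay)  # wait for this request's slot instead of bursting
        results.append(send(item))
        # Schedule the next slot; after a slow call, do not "catch up" in a burst.
        next_start = max(next_start + interval, clock())
    return results
```

For example, `paced_map(call_model, prompts, rpm_limit=60)` would start about one request per second, where `call_model` stands in for whatever function issues your request.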

How we measure Rate limits
If your limit is 60 RPM, plan for roughly RPM/60 = 1 RPS (i.e., spread requests evenly across the minute). We enforce rate limits per second internally (RPS/TPS). We display limits per minute (RPM/TPM) to align with common industry conventions.

Earned Rate Limits
This is a new feature we will roll out to help us handle bursty traffic, and help users steadily increase their success rate.

How do we handle bursty traffic?
To ensure fair use of a model across all users, we buffer sudden surges in traffic and apply a fairness mechanism so everyone continues to receive timely service. We also make a best-effort attempt upfront to absorb and smooth bursts via our leading inference speed and capacity management, before any limiting behavior is applied.

If a burst still results in failed requests despite these protections, we apply response attribution using an Earned Rate threshold.
Earned Rate
We track an Earned Rate per user and per model:

Earned Rate ≈ 2 × past_hour_successful_request_rate

We constrain Earned Rate as:

base_rate ≤ earned_rate ≤ cap_rate

- Default base_rate is 60 RPM.
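The formula and its bounds can be sketched as a small function. This is only an approximation of the description above, not the platform's implementation. It assumes the past-hour successful rate is expressed in RPM and that `cap_rate` is your tier's RPM limit; the function name is hypothetical.

```python
def earned_rate(past_hour_success_rpm: float, cap_rate: float,
                base_rate: float = 60.0) -> float:
    """Approximate Earned Rate in RPM: 2x the past-hour successful rate, clamped."""
    if cap_rate < base_rate:
        raise ValueError("cap_rate must be at least base_rate")
    raw = 2.0 * past_hour_success_rpm
    # Clamp so that base_rate <= earned_rate <= cap_rate.
    return max(base_rate, min(raw, cap_rate))
```

With a cap of 600 RPM, a sustained 100 RPM gives an Earned Rate of about 200 RPM, while a sustained 10 RPM still gives the 60 RPM base rate.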
Behavior during burst failures
When bursty requests fail:
- Requests at or below your Earned Rate (≤ Earned Rate) receive 503 Service Unavailable. These failures are attributed to platform capacity under burst conditions; we take responsibility.
- Requests above your Earned Rate (> Earned Rate) receive 429 Too Many Requests, with:
  - error_type: "earned_request_limited" (request-based limiting), or
  - error_type: "earned_token_limited" (token-based limiting)
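A client can branch on these two outcomes. The sketch below is a hedged illustration: the action labels and the `classify_failure` helper are hypothetical, and retrying a 503 with backoff is a common client-side convention, not something this page prescribes. How `error_type` is extracted from the response is left to the caller, since its exact location in the payload is not specified here.

```python
from typing import Optional

REQUEST_LIMITED = "earned_request_limited"
TOKEN_LIMITED = "earned_token_limited"


def classify_failure(status_code: int, error_type: Optional[str] = None) -> str:
    """Map a failed response to a suggested client action (illustrative labels)."""
    if status_code == 503:
        # At or below Earned Rate: attributed to platform capacity under burst.
        return "retry_with_backoff"
    if status_code == 429 and error_type == REQUEST_LIMITED:
        # Above Earned Rate, limited by request count.
        return "reduce_request_rate"
    if status_code == 429 and error_type == TOKEN_LIMITED:
        # Above Earned Rate, limited by token volume.
        return "reduce_token_rate"
    if status_code == 429:
        return "slow_down"
    return "not_rate_related"
```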
Recommendation
We strongly recommend avoiding bursty traffic. If your traffic spikes to roughly 2× (or more) of what you’ve successfully sustained over the past hour, requests beyond your Earned Rate may be limited even after our best-effort buffering.
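Because the Earned Rate is roughly twice the past hour's successful rate, one way to reach a higher volume is to ramp up in hourly steps that each stay below 2× the previous step. The sketch below is a simplified planning aid under that assumption. `ramp_plan` and its `growth` safety factor are hypothetical, and real Earned Rate depends on traffic that actually succeeded.

```python
from typing import List


def ramp_plan(current_rpm: float, target_rpm: float, cap_rate: float,
              base_rate: float = 60.0, growth: float = 1.8) -> List[float]:
    """Hourly RPM targets that grow by less than 2x per hour up to the goal."""
    if not 1.0 < growth < 2.0:
        raise ValueError("growth must be between 1 and 2 (exclusive)")
    goal = min(target_rpm, cap_rate)      # never plan beyond the cap rate
    rate = max(current_rpm, base_rate)    # the base rate is the Earned Rate floor
    if rate <= 0:
        raise ValueError("starting rate must be positive")
    plan = [min(rate, goal)]
    while plan[-1] < goal:
        plan.append(min(plan[-1] * growth, goal))
    return plan
```

For instance, `ramp_plan(100, 1000, cap_rate=3000, growth=1.5)` yields a sequence that starts at 100 RPM and reaches 1,000 RPM after several hourly steps, each at most 1.5× the one before.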
Rewards of sustained traffic.
Steady Traffic Improves Success Rates and Increases Earned Rate

A Virtuous Cycle: Consistency Builds Capacity

Rate limit tiers
You can view your rate limit by navigating to Settings > Billing. As your usage of the Together API and your spend on our API increase, we will automatically increase your rate limits.

Chat, language & code models

| Tier | Qualification criteria | RPM | TPM |
|---|---|---|---|
| Tier 1 | Credit card added, $5 paid | 600 | 180,000 |
| Tier 2 | $50 paid | 1,800 | 250,000 |
| Tier 3 | $100 paid | 3,000 | 500,000 |
| Tier 4 | $250 paid | 4,500 | 1,000,000 |
| Tier 5 | $1,000 paid | 6,000 | 2,000,000 |

Due to high demand on the platform, DeepSeek R1 has these special rate limits. We are actively increasing them.

| Tier | RPM |
|---|---|
| Tier 1 | 3 |
| Tier 2 | 60 |
| Tier 3 | ~400+ |
| Tier 4 | ~400+ |
| Tier 5 | ~1200+ |
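The chat, language & code tier table can be expressed as a simple lookup. This is a sketch only: it assumes "$X paid" means cumulative amount paid, it does not model the credit-card requirement for Tier 1, and the names `Tier`, `CHAT_TIERS` and `chat_tier_for` are hypothetical. The numbers are copied from the table and may change.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    min_paid_usd: float
    rpm: int
    tpm: int


# Values copied from the chat, language & code table, ascending by threshold.
CHAT_TIERS = [
    Tier("Tier 1", 5, 600, 180_000),
    Tier("Tier 2", 50, 1_800, 250_000),
    Tier("Tier 3", 100, 3_000, 500_000),
    Tier("Tier 4", 250, 4_500, 1_000_000),
    Tier("Tier 5", 1_000, 6_000, 2_000_000),
]


def chat_tier_for(total_paid_usd: float) -> Optional[Tier]:
    """Return the highest tier whose payment threshold is met, or None."""
    best = None
    for tier in CHAT_TIERS:
        if total_paid_usd >= tier.min_paid_usd:
            best = tier
    return best
```

Dividing a tier's RPM by 60 gives the approximate per-second budget to plan around; for Tier 1 that is 600 / 60 = 10 requests per second.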
| Tier | Qualification criteria | RPM | TPM |
|---|---|---|---|
| Tier 1 | Credit card added, $5 paid | 3,000 | 2,000,000 |
| Tier 2 | $50 paid | 5,000 | 2,000,000 |
| Tier 3 | $100 paid | 5,000 | 10,000,000 |
| Tier 4 | $250 paid | 10,000 | 10,000,000 |
| Tier 5 | $1,000 paid | 10,000 | 20,000,000 |

| Tier | Qualification criteria | RPM | TPM |
|---|---|---|---|
| Tier 1 | Credit card added, $5 paid | 2,500 | 500,000 |
| Tier 2 | $50 paid | 3,500 | 1,500,000 |
| Tier 3 | $100 paid | 4,000 | 2,000,000 |
| Tier 4 | $250 paid | 7,500 | 3,000,000 |
| Tier 5 | $1,000 paid | 9,000 | 5,000,000 |

| Tier | Qualification criteria | Img/min |
|---|---|---|
| Tier 1 | Credit card added, $5 paid | 240 |
| Tier 2 | $50 paid | 480 |
| Tier 3 | $100 paid | 600 |
| Tier 4 | $250 paid | 960 |
| Tier 5 | $1,000 paid | 1,200 |

- FLUX.1 [schnell] Free has a model-specific rate limit of 6 img/min.
- FLUX.1 Kontext [pro] has a model-specific rate limit of 57 img/min.

| Tier | Qualification criteria | RPM |
|---|---|---|
| Tier 1 | Credit card added, $5 paid | 60 |
| Tier 2 | $50 paid | 60 |
| Tier 3 | $100 paid | 60 |
| Tier 4 | $250 paid | 60 |
| Tier 5 | $1,000 paid | 100 |

| Field | Description |
|---|---|
| x-ratelimit-limit | The maximum number of requests per sec that are permitted before exhausting the rate limit. |
| x-ratelimit-remaining | The remaining number of requests per sec that are permitted before exhausting the rate limit. |
| x-ratelimit-reset | The time until the rate limit (based on requests per sec) resets to its initial state. |
| x-tokenlimit-limit | The maximum number of tokens per sec that are permitted before exhausting the rate limit. |
| x-tokenlimit-remaining | The remaining number of tokens per sec that are permitted before exhausting the rate limit. |
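These fields can guide client-side pacing. The sketch below is a hedged example: it assumes the fields arrive as HTTP response headers, that `x-ratelimit-reset` is a duration in seconds (the unit is not stated in the table), and that `seconds_to_wait` is a hypothetical helper name.

```python
from typing import Mapping


def seconds_to_wait(headers: Mapping[str, str]) -> float:
    """Suggest a pause based on the request rate-limit headers."""
    # HTTP header names are case-insensitive; normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        remaining = float(lowered["x-ratelimit-remaining"])
        reset = float(lowered["x-ratelimit-reset"])
    except (KeyError, ValueError):
        return 0.0  # headers missing or unparsable: no guidance available
    # Assumption: x-ratelimit-reset is a duration in seconds.
    return reset if remaining <= 0 else 0.0
```

A caller could sleep for the returned number of seconds before sending the next request; the same pattern could be applied to the `x-tokenlimit-*` fields for token budgets.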