DeepSeek FAQs

How can I access DeepSeek R1 and V3?

Together AI hosts DeepSeek R1 and V3 models on Serverless. Find them in our playground: DeepSeek R1 / DeepSeek V3.

Why is R1 more expensive than V3 if they share the same architecture and are the same size?

R1 produces more tokens in the form of long reasoning chains, which significantly increase memory and compute requirements per query. Each user request locks more of the GPU for a longer period, limiting the number of simultaneous requests the hardware can handle and leading to higher per-query costs compared to V3.

Have you changed the DeepSeek model in any way? Is it quantized, distilled or modified?

  • No quantization – Full-precision versions are hosted.
  • No distillation — we do offer distilled models but as separate endpoints (e.g. deepseek-ai/DeepSeek-R1-Distill-Llama-70B)
  • No modifications — no forced system prompt or censorship.

Do you send data to China or DeepSeek?

No. We host DeepSeek models on secure, private (North America-based) data centers. DeepSeek does not have access to user's requests or API calls. We provide full opt-out privacy controls for our users. Learn more about our privacy policy here.

Can I deploy DeepSeek in Dedicated Endpoints? What speed and costs can I expect?

We recently launched Together Reasoning Clusters, which allows users to get dedicated, high-performance compute built for large-scale, low-latency inference.

Together Reasoning Clusters include:

✅ Speeds up to 110 tokens/sec with no rate limits or resource sharing
✅ Custom optimizations fine-tuned for your traffic profile
✅ Predictable pricing for cost-effective scaling
✅ Enterprise SLAs with 99.9% uptime
✅ Secure deployments with full control over your data

Looking to deploy DeepSeek-R1 in production? Contact us!

What are the rate limits for DeepSeek R1?

Due to high demand, DeepSeek R1 has model specific rate limits that are based on load. For Free and Tier 1 users the rate limits can range from 1.2 RPM to 3 RPM at this time. All other users (Build Tier 1+) have a rate limit of 240 RPM. Contact sales if you need higher limits for BT 5/Enterprise/Scale.