Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model developed by Moonshot AI. It has 1 trillion total parameters (32B activated) and is currently the best non-reasoning open-source model available. It was trained on 15.5 trillion tokens, supports a 256K context window, and excels at agentic tasks, coding, reasoning, and tool use. Although it is a 1T-parameter model, only 32B parameters are active at inference time, giving it near-frontier quality at a fraction of the compute cost of dense peers. In this quick guide, we'll cover the main use cases for Kimi K2, how to get started with it, when to use it, and prompting tips for getting the most out of this incredible model.
How to use Kimi K2
Get started with this model in 10 lines of code! The model ID is moonshotai/Kimi-K2-Instruct-0905, and pricing is $1.00 per 1M input tokens and $3.00 per 1M output tokens.
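As a minimal sketch, here is a chat completion request using only the Python standard library. It assumes Together's OpenAI-compatible endpoint at `https://api.together.xyz/v1/chat/completions` and a `TOGETHER_API_KEY` environment variable; the user prompt is purely illustrative.

```python
import json
import os
import urllib.request

MODEL_ID = "moonshotai/Kimi-K2-Instruct-0905"

# Request body following the OpenAI-compatible chat completions format.
payload = {
    "model": MODEL_ID,
    "messages": [
        # Recommended default system prompt (see prompting tips below)
        {"role": "system",
         "content": "You are Kimi, an AI assistant created by Moonshot AI."},
        {"role": "user", "content": "Summarize the benefits of MoE models."},
    ],
    "temperature": 0.6,  # recommended default for this model
}

def complete(body: dict) -> dict:
    """POST the request body and return the parsed JSON response."""
    req = urllib.request.Request(
        "https://api.together.xyz/v1/chat/completions",  # assumed endpoint
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Only call the API when a key is actually configured.
if os.environ.get("TOGETHER_API_KEY"):
    print(complete(payload)["choices"][0]["message"]["content"])
```

If you already use an OpenAI-compatible client library, pointing its base URL at the same endpoint with this model ID works equally well.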
Use cases
Kimi K2 shines in scenarios requiring autonomous problem-solving, specifically coding and tool use:
- Agentic Workflows: Automate multi-step tasks like booking flights, research, or data analysis using tools/APIs
- Coding & Debugging: Solve software engineering tasks (e.g., SWE-bench), generate patches, or debug code
- Research & Report Generation: Summarize technical documents, analyze trends, or draft reports using long-context capabilities
- STEM Problem-Solving: Tackle advanced math (AIME, MATH), logic puzzles (ZebraLogic), or scientific reasoning
- Tool Integration: Build AI agents that interact with APIs (e.g., weather data, databases)
Prompting tips
| Tip | Rationale |
|---|---|
| Keep the system prompt simple: "You are Kimi, an AI assistant created by Moonshot AI." is the recommended default. | Matches the prompt used during instruction tuning. |
| Temperature ≈ 0.6 | Calibrated to Kimi-K2-Instruct's RLHF alignment curve; higher values tend to produce verbose output. |
| Leverage native tool calling | Pass a JSON schema in tools=[...]; set tool_choice="auto". Kimi decides when/what to call. |
| Think in goals, not steps | Because the model is “agentic”, give a high-level objective (“Analyse this CSV and write a report”), letting it orchestrate sub-tasks. |
| Chunk very long contexts | 256K is huge, but response speed drops on >100K inputs; supply a short executive summary in the final user message to focus the model. |
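The native tool-calling tip above can be sketched as a request body. This example builds the JSON schema for a hypothetical `get_weather` function (the tool name and parameters are illustrative, not part of any real API) and sets `tool_choice="auto"` so Kimi decides when and what to call.

```python
import json

# Hypothetical weather tool, declared in the OpenAI-compatible
# function-calling schema that Together's API accepts.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

request_body = {
    "model": "moonshotai/Kimi-K2-Instruct-0905",
    "messages": [
        {"role": "user", "content": "What's the weather in Tokyo?"},
    ],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide when/what to call
}

# Inspect the exact payload that would be sent.
print(json.dumps(request_body, indent=2))
```

When the model elects to call the tool, the response contains a `tool_calls` entry with the function name and JSON-encoded arguments; your agent loop executes the function and sends the result back as a `tool` role message.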