Kimi K2.7
Moonshot AI's open-weight 1T-parameter MoE coding model — 32B active, 256K context, Modified MIT, metered at $0.95 in / $4.00 out per million tokens.
Quick Summary
Kimi K2.7 (Kimi K2.7-Code) is Moonshot AI's open-weight coding-focused large language model announced June 12, 2026. It is a 1 trillion parameter Mixture-of-Experts model with 32 billion active parameters and a 256K context window, released under a Modified MIT license with weights on HuggingFace. Self-reported scores include 62.0 on Kimi Code Bench v2 and 81.1 on MCP Mark Verified; Moonshot did not submit independent public benchmarks. Metered API: $0.95 per million input tokens cache miss, $0.19 cache hit, $4.00 output.
Kimi K2.7 is Moonshot AI's open-weight coding model: a 1 trillion parameter Mixture-of-Experts system with 32 billion active parameters, a 256K-token context window, and a native MoonViT vision encoder, released June 12, 2026 under a Modified MIT license. It is built for agentic coding and tool use, sold as a metered pay-as-you-go API at $0.95 per million input tokens with no subscription, and its weights are downloadable on day one from HuggingFace.
Verdict: 8.4 out of 10
Kimi K2.7 is one of the most practical open-weight coding models we have tested this year. It pairs frontier-scale architecture with genuinely cheap metered input pricing and same-day downloadable weights, and it shines on agentic tool-use workloads. The catch is honesty around evidence: every benchmark Moonshot published is from its own in-house harness, with no independent public-suite results yet. We score it 8.4 out of 10 — excellent value for teams who want to self-host or pay only for what they use, with eyes open about the unverified scores.
Who it's for: developers and teams who want a cheap, self-hostable, MCP-friendly coding model and can tolerate vendor-reported benchmarks.
Overview: what Kimi K2.7 actually is
Kimi K2.7, also marketed as Kimi K2.7-Code, is the latest open-weight coding model from Moonshot AI, the Beijing-based lab behind the Kimi family. It landed on June 12, 2026 — exactly one day before Zhipu's GLM-5.2 — in what has become a crowded month for Chinese open-weight launches. Where the previous generation, Kimi K2.6, was a strong but token-hungry coder, K2.7 reframes the pitch around efficiency: Moonshot says it reaches higher scores while spending roughly 30 percent fewer reasoning tokens, which directly lowers the cost of every coding task you run through it.
The model is positioned squarely at agentic coding and tool use rather than chat. It ships with an OpenAI-compatible API, native support for tool calls and JSON mode, and a vision encoder so it can read screenshots and UI mockups inside a coding loop. Crucially, the weights are open and available to download from HuggingFace the same day the API went live — there was no staggered release or waiting list, which is not always the case with frontier-scale Chinese models.
In our testing we treated K2.7 the way most teams will use it: as a drop-in coding model behind an OpenAI-compatible endpoint, wired into an agent that calls tools, edits files, and occasionally looks at an image. That is the workload it was clearly built for, and it is where the model is at its most convincing.
Key features
K2.7 is a frontier-scale model that behaves like a cheap one. The headline numbers are large, but the active footprint per token is small, which is what keeps inference affordable.
- 1 trillion total parameters, 32 billion active. A sparse Mixture-of-Experts design with 384 experts and 8 selected per token (plus 1 shared expert) means only a fraction of the network fires for any given token.
- 256K context window. 262,144 tokens of context, with automatic context caching that makes repeated long-context calls dramatically cheaper.
- Native MoonViT vision. A 400 million parameter vision encoder lets the model read images, screenshots, and UI mockups — useful inside coding workflows where you want it to look at a design and produce markup or fix a layout.
- Open weights, Modified MIT license. You can download and self-host the model today; the license is effectively MIT with an added attribution clause for very large commercial deployments above a user threshold.
- Agentic, tool-first orientation. OpenAI-compatible API with tool calls and JSON mode, and Moonshot's strongest self-reported category is MCP tool use.
- Token efficiency. Roughly 30 percent fewer reasoning tokens than Kimi K2.6 to reach a comparable or higher score, which compounds into real savings at scale.
Architecture and specs
K2.7 keeps the same architectural family as Kimi K2.6 but tunes it for efficiency. It is a 61-layer model — 60 Mixture-of-Experts layers plus 1 dense layer — using MLA attention. The MoE router selects 8 of 384 experts per token, with a single shared expert always active, which is how a 1 trillion parameter model runs with only 32 billion parameters active at inference time.
| Spec | Kimi K2.7 |
|---|---|
| Total parameters | 1 trillion |
| Active parameters per token | 32 billion |
| Experts | 384 (8 selected per token, 1 shared) |
| Layers | 61 (60 MoE + 1 dense), MLA attention |
| Context window | 256K (262,144 tokens) |
| Vision encoder | MoonViT, 400 million parameters |
| License | Modified MIT (open weights) |
| Release date | June 12, 2026 |
| API compatibility | OpenAI-compatible (tool calls, JSON mode) |
The practical upshot of the sparse design is that you get frontier-scale capability without frontier-scale inference cost, whether you are renting the metered API or running the weights on your own GPU cluster. The 256K context is generous for most day-to-day coding agents, though it is smaller than some rivals, as we discuss below.
Pricing
Kimi K2.7 is metered, pay-as-you-go, with no subscription tier to commit to. You pay per token, and the rates split between input and output. Input costs $0.95 per million tokens on a cache miss, and automatic context caching drops repeated input to $0.19 per million cached tokens. Output costs $4.00 per million tokens.
The cache-hit economics matter more than they first appear. Agentic coding sessions reuse a lot of the same context — the same files, the same system prompt, the same tool definitions — over and over. With automatic caching, that repeated input is billed at roughly a fifth of the cache-miss rate, so the effective input cost of a long agent run drifts toward $0.19 per million tokens rather than $0.95.
Where the model is less aggressive is output. At $4.00 per million output tokens, K2.7 is far more expensive on generation than the ultra-cheap open rivals — DeepSeek's V4-Flash, for example, sits around $0.28 per million output tokens. If your workload generates enormous volumes of code, that gap adds up. K2.7's pitch is cheap input plus token efficiency, not the lowest output rate on the market.
Benchmarks (self-reported)
Here is the important caveat first: every benchmark number Moonshot published for K2.7 comes from its own in-house harness. As of release there were no independent third-party results on standard public suites — no SWE-bench Verified, no LiveCodeBench, no Terminal-Bench, no Aider Polyglot, no GPQA Diamond. So K2.7 is not a model "without benchmarks," but the benchmarks it has are vendor-reported and run in Moonshot's own evaluation environment. Treat them as directional, not independently verified.
With that framing, the self-reported table is genuinely informative because Moonshot published comparison columns against its own predecessor and two frontier closed models. The pattern is clear: K2.7 is strongest on agentic and MCP tool-use categories, and trails the closed frontier models on raw coding.
| Benchmark (in-house) | Kimi K2.7-Code | Kimi K2.6 | Claude Opus 4.8 | GPT-5.5 |
|---|---|---|---|---|
| Kimi Code Bench v2 | 62.0 | 50.9 | 67.4 | 69.0 |
| MCP Mark Verified | 81.1 | 72.8 | 76.4 | 92.9 |
| MCP Atlas | 76.0 | 69.4 | 81.3 | 79.4 |
| Program Bench | 53.6 | 48.3 | 63.8 | 69.1 |
| MLS Bench Lite | 35.1 | 26.7 | 42.8 | 35.5 |
| Kimi Claw 24/7 Bench | 46.9 | 42.9 | 50.4 | 52.8 |
The most notable result is MCP Mark Verified, where K2.7's self-reported 81.1 edges past Claude Opus 4.8 at 76.4 — though GPT-5.5 still leads the category at 92.9. On the headline Kimi Code Bench v2, K2.7 jumps from K2.6's 50.9 to 62.0, a 21.8 percent improvement, but it sits behind both Opus 4.8 (67.4) and GPT-5.5 (69.0). The honest read: K2.7 narrows the gap to the frontier on its own metrics and leads on tool use, but you should not assume it matches closed models on independent coding suites until someone runs them.
How we tested it
We ran Kimi K2.7 the way a team would in production: behind its OpenAI-compatible endpoint, wired into an agent that reads a repository, calls tools, edits files, and runs commands. We focused on three things — agentic coding loops, MCP-style tool use, and the vision path — over roughly a week of daily use.
On agentic tool use, the model lived up to its strongest self-reported category. In our testing it was reliable about calling the right tool with well-formed arguments, recovering from a failed tool call, and chaining several steps without losing the plot. JSON mode held up; we rarely had to retry for malformed output. This is the workload where K2.7 felt most like a frontier model rather than a budget one.
On raw code generation, it was strong but not best-in-class. For well-scoped tasks — implement this function, refactor this module, write tests for this file — it produced clean, idiomatic code on the first or second try. On gnarlier multi-file changes it occasionally needed more steering than the top closed models, which tracks with the in-house benchmark gap on Program Bench.
The vision path is a genuine differentiator. We handed it screenshots of UI mockups and asked for markup, and MoonViT did the reading well enough to produce usable starting components. It is not a design tool, but for a coding model the ability to look at a screenshot inside the same loop is a real convenience.
The token-efficiency claim also held in practice. Compared with how we remembered K2.6 behaving, K2.7 reached answers with noticeably less back-and-forth reasoning, and with cache hits on repeated context, the per-task bill stayed low. In our production workflow, the combination of cheap cached input and fewer reasoning tokens was the single biggest reason to keep it in rotation.
Use cases
- Cheap metered agentic coding where you pay only for what you consume instead of a flat monthly subscription.
- Self-hosted, on-premises coding deployments that need downloadable weights from day one for privacy or compliance reasons.
- Tool-use and MCP-heavy agent workflows — the model's strongest self-reported category and, in our testing, its most convincing one.
- Multimodal coding tasks that involve reading screenshots, UI mockups, or diagrams via the MoonViT vision encoder.
- Long-context work that benefits from automatic caching, where repeated context (the same files and prompts) makes the effective input cost very low.
- Teams migrating from Kimi K2.6 who want the token-efficiency gain within the same license family and architecture.
- Cost-sensitive startups that want frontier-scale capability on input-heavy workloads without frontier-scale bills.
Pros and cons
Pros
- Open weights available to download and self-host today on HuggingFace under a Modified MIT license — no waiting period, unlike some rival Chinese launches.
- Roughly 30 percent fewer reasoning tokens than its predecessor Kimi K2.6 to reach a higher score, which lowers the effective cost per coding task.
- Strong agentic tool-use orientation — Moonshot self-reports 81.1 on MCP Mark Verified and 76.0 on MCP Atlas, its best published category.
- Cheap metered, pay-as-you-go API at $0.95 per million input tokens on a cache miss, with automatic context caching dropping repeated input to $0.19 per million.
- 1 trillion total parameters with only 32 billion active per token (384 experts, 8 selected) keeps inference cost low for a frontier-scale model.
- Includes a 400M-parameter MoonViT vision encoder, so it can read screenshots and UI mockups inside coding workflows.
- OpenAI-compatible API that plugs into agents and editors supporting custom model endpoints.
Cons
- No independent third-party benchmarks on standard public suites — Moonshot skipped SWE-bench Verified, LiveCodeBench, Terminal-Bench, and Aider Polyglot, so every published number is self-reported in its own harness.
- 256K context window is smaller than DeepSeek V4's 1M, which matters for whole-repository prompts and very long autonomous sessions.
- On the lab's own table, Kimi K2.7-Code (62.0 on Kimi Code Bench v2) trails Claude Opus 4.8 and GPT-5.5 on its headline coding metric.
- Modified MIT license adds an attribution clause for very large commercial deployments above a user threshold — irrelevant for most teams but not pure MIT.
- Output tokens are expensive relative to ultra-cheap open rivals — $4.00 per million output tokens is far above DeepSeek V4-Flash at roughly $0.28 per million.
Who it's for, and who should skip it
Buy in if you run agentic coding agents, care about MCP and tool use, want to self-host on your own hardware, or have input-heavy workloads where cheap cached input and token efficiency dominate your bill. K2.7 is one of the best value-per-token open-weight coding models available right now.
Skip it if your workload is output-heavy code generation where DeepSeek V4-Flash's far cheaper output wins, if you need a 1M-token context for whole-repository prompts, or if you cannot ship on vendor-reported benchmarks and require independent public-suite verification before adopting a model.
Alternatives
DeepSeek V4. The most direct open-weight rival. It offers a 1M-token context — four times K2.7's window — and its V4-Flash tier is dramatically cheaper on output at roughly $0.28 per million tokens. If your bottleneck is long-repository context or high output volume, DeepSeek V4 is the more economical pick; K2.7 counters with stronger self-reported tool use and cheaper cached input.
GLM-5.2. Zhipu's model, launched a single day after K2.7, and unlike Moonshot it published results on standard public suites, which gives it an evidence advantage for buyers who refuse to rely on in-house numbers. It is the natural cross-shop if independent benchmarks are a hard requirement.
Qwen 3.6. Alibaba's open-weight family remains a broad, well-supported alternative with a large ecosystem of fine-tunes and tooling. It is worth evaluating alongside K2.7 if ecosystem maturity and community support matter as much as raw scores.
Closed frontier models. For pure capability ceiling, Claude Opus 4.8 and GPT-5.5 still lead K2.7 on most of Moonshot's own coding columns. The trade is obvious: they are not open weight, not self-hostable, and not metered at K2.7's input prices.
Frequently asked questions
What is Kimi K2.7?
Kimi K2.7 (Kimi K2.7-Code) is Moonshot AI's open-weight coding model, released June 12, 2026. It is a 1 trillion parameter Mixture-of-Experts model with 32 billion active parameters, a 256K-token context window, and a native MoonViT vision encoder, distributed under a Modified MIT license with downloadable weights on HuggingFace.
How much does Kimi K2.7 cost?
It is metered, pay-as-you-go, with no subscription. Input costs $0.95 per million tokens on a cache miss, $0.19 per million cached tokens on a cache hit, and output costs $4.00 per million tokens. You can also self-host the open weights and pay only your own compute.
Can I self-host Kimi K2.7?
Yes. The weights are open and were available to download from HuggingFace on the day of release under a Modified MIT license. The modified clause adds an attribution requirement only for very large commercial deployments above a user threshold; for most teams it behaves like a standard MIT license.
What is the context window of Kimi K2.7?
256K tokens (262,144), with automatic context caching that makes repeated long-context calls much cheaper. That is smaller than DeepSeek V4's 1M-token window but generous for most day-to-day coding agents.
Are Kimi K2.7's benchmarks independently verified?
No. Every published benchmark — Kimi Code Bench v2, MCP Mark Verified, MCP Atlas, Program Bench, MLS Bench Lite, and Kimi Claw 24/7 Bench — comes from Moonshot's own in-house harness. As of release there were no independent results on standard public suites such as SWE-bench Verified or LiveCodeBench, so the scores should be read as vendor-reported and directional.
Is Kimi K2.7 better than DeepSeek V4?
It depends on the workload. K2.7 reports stronger agentic tool use and offers cheaper cached input, while DeepSeek V4 has a far larger 1M-token context and much cheaper output via its V4-Flash tier at roughly $0.28 per million output tokens. Pick K2.7 for input-heavy, MCP-driven agents; pick DeepSeek V4 for long-repository context or output-heavy generation.
Does Kimi K2.7 support vision?
Yes. It includes a 400 million parameter MoonViT vision encoder, so it can read images, screenshots, and UI mockups directly inside coding workflows — for example, turning a screenshot of a design into starting markup.
How does Kimi K2.7 compare to Claude Opus 4.8 and GPT-5.5?
On Moonshot's own table, K2.7 leads Claude Opus 4.8 on MCP Mark Verified (81.1 versus 76.4) but trails both Opus 4.8 and GPT-5.5 on the headline Kimi Code Bench v2 (62.0 versus 67.4 and 69.0). Since these are in-house scores, treat the comparison as directional rather than definitive.
Our verdict
Kimi K2.7 earns 8.4 out of 10. It is a frontier-scale, open-weight coding model that behaves like a budget one: cheap metered input, automatic caching, 30 percent better token efficiency than its predecessor, and same-day downloadable weights under a near-MIT license. For agentic, MCP-heavy, self-hosted, or input-heavy coding work, it is excellent value and was a pleasure to run in our testing.
The reservation is evidence, not capability. Moonshot shipped only in-house benchmarks, output pricing is expensive next to the cheapest open rivals, and the 256K context is smaller than some competitors. None of that is disqualifying — it just means you should adopt K2.7 for what it provably does well (cheap, efficient, tool-using, self-hostable coding) and verify the rest against your own workload before betting a critical pipeline on the headline scores.
Key Features
Pros & Cons
Pros
- Open weights available to download and self-host today on HuggingFace under a Modified MIT license — no waiting period, unlike some rival Chinese launches
- Roughly 30 percent fewer reasoning tokens than its predecessor Kimi K2.6 to reach a higher score, which lowers the effective cost per coding task
- Strong agentic tool-use orientation — Moonshot self-reports 81.1 on MCP Mark Verified and 76.0 on MCP Atlas, its best published category
- Cheap metered, pay-as-you-go API at $0.95 per million input tokens on a cache miss, with automatic context caching dropping repeated input to $0.19 per million
- 1 trillion total parameters with only 32 billion active per token (384 experts, 8 selected) keeps inference cost low for a frontier-scale model
- Includes a 400M-parameter MoonViT vision encoder, so it can read screenshots and UI mockups inside coding workflows
- OpenAI-compatible API that plugs into agents and editors supporting custom model endpoints
Cons
- No independent third-party benchmarks on standard public suites — Moonshot skipped SWE-bench Verified, LiveCodeBench, Terminal-Bench, and Aider Polyglot, so every published number is self-reported in its own harness
- 256K context window is smaller than DeepSeek V4's 1M, which matters for whole-repository prompts and very long autonomous sessions
- On the lab's own table, Kimi K2.7-Code (62.0 on Kimi Code Bench v2) trails Claude Opus 4.8 and GPT-5.5 on its headline coding metric
- Modified MIT license adds an attribution clause for very large commercial deployments above a user threshold — irrelevant for most teams but not pure MIT
- Output tokens are expensive relative to ultra-cheap open rivals — $4.00 per million output is far above DeepSeek V4-Flash at $0.28
Best Use Cases
Platforms & Integrations
Available On
Integrations
Compare Kimi K2.7

We're developers and SaaS builders who use these tools daily in production. Every review comes from hands-on experience building real products — DealPropFirm, ThePlanetIndicator, PropFirmsCodes, and many more. We don't just review tools — we build and ship with them every day.
Written and tested by developers who build with these tools daily.
Frequently Asked Questions
What is Kimi K2.7?
Moonshot AI's open-weight 1T-parameter MoE coding model — 32B active, 256K context, Modified MIT, metered at $0.95 in / $4.00 out per million tokens.
How much does Kimi K2.7 cost?
Kimi K2.7 has a free tier. Premium plans start at $0.95/month.
Is Kimi K2.7 free?
Yes, Kimi K2.7 offers a free plan. Paid plans start at $0.95/month.
What are the best alternatives to Kimi K2.7?
Top-rated alternatives to Kimi K2.7 can be found in our WebApplication category, where we've reviewed and scored every tool on ThePlanetTools.ai.
Is Kimi K2.7 good for beginners?
Kimi K2.7 is rated 8.2/10 for ease of use.
What platforms does Kimi K2.7 support?
Kimi K2.7 is available on API, Web, Self-hosted (open weights).
Does Kimi K2.7 offer a free trial?
Yes, Kimi K2.7 offers a free trial.
Is Kimi K2.7 worth the price?
Kimi K2.7 scores 8.3/10 for value. We consider it excellent value.
Who should use Kimi K2.7?
Kimi K2.7 is ideal for: Cheap metered agentic coding where you pay only for what you consume rather than a flat subscription, Self-hosted on-premises coding deployments that need downloadable weights on day one, Tool-use and MCP-heavy agent workflows, the model's strongest self-reported category, Multimodal coding tasks that involve reading screenshots, UI mockups, or diagrams, Teams switching from Kimi K2.6 that want the token-efficiency gain at the same license family.
What are the main limitations of Kimi K2.7?
Some limitations of Kimi K2.7 include: No independent third-party benchmarks on standard public suites — Moonshot skipped SWE-bench Verified, LiveCodeBench, Terminal-Bench, and Aider Polyglot, so every published number is self-reported in its own harness; 256K context window is smaller than DeepSeek V4's 1M, which matters for whole-repository prompts and very long autonomous sessions; On the lab's own table, Kimi K2.7-Code (62.0 on Kimi Code Bench v2) trails Claude Opus 4.8 and GPT-5.5 on its headline coding metric; Modified MIT license adds an attribution clause for very large commercial deployments above a user threshold — irrelevant for most teams but not pure MIT; Output tokens are expensive relative to ultra-cheap open rivals — $4.00 per million output is far above DeepSeek V4-Flash at $0.28.
Ready to try Kimi K2.7?
Start with the free plan
Try Kimi K2.7 Free →