GLM-5.2 vs Kimi K2.7-Code: 2 Chinese Open Coders

GLM-5.2 and Kimi K2.7-Code are two Chinese open-weight coding models launched in the same week of June 2026. GLM-5.2 (Zhipu) has 744B total parameters with about 40B active and a 1M token context; Kimi K2.7-Code (Moonshot) has 1T total with 32B active and a 256K context. Zhipu's GLM Coding Plan starts at $18 per month; Kimi is metered at $0.95 input and $4.00 output per million tokens. Zhipu published no GLM-5.2 benchmarks, so a head-to-head score comparison is not possible yet.

Within roughly 24 hours of each other in June 2026, two Chinese labs shipped coding-focused frontier models under open-ish licenses. On June 12, Moonshot AI announced Kimi K2.7-Code. On June 13, Zhipu (which now goes to market as Z.ai internationally) announced GLM-5.2. Both are huge Mixture-of-Experts models, both are built specifically for writing code and driving coding agents, and both are positioned as something you can run for a small fraction of a premium Western coding subscription.

The temptation is to crown a winner. Resist it. The honest version of this comparison is not a scoreboard, because only one of these two labs published numbers. Zhipu shipped GLM-5.2 with no published benchmark scores, so no "GLM beats Kimi" or "Kimi beats GLM" claim on raw performance can be backed up right now. What we can compare honestly is architecture, license, how you pay, what is actually available to use today, and which coding tools each one plugs into. That is where the real decision sits.

The specs side by side

Here is what each lab has confirmed about the two models. Note the asymmetry that runs through the whole piece: GLM-5.2 has a larger context window but a smaller total parameter count, while Kimi K2.7-Code is the larger model on paper but with a shorter context. And only Kimi is fully available to use today.

Attribute	GLM-5.2 (Zhipu / Z.ai)	Kimi K2.7-Code (Moonshot AI)
Total parameters	744B (Mixture-of-Experts)	1T (Mixture-of-Experts)
Active parameters	~40B per token	32B per token
Context window	1M tokens	256K (262,144) tokens
License	MIT (weights promised, not yet released at launch)	Modified MIT (available now)
Pricing model	Flat subscription (GLM Coding Plan)	Metered pay-as-you-go API
Entry price	From $18 per month	$0.95 input / $4.00 output per million tokens
Announced	June 13, 2026	June 12, 2026
Published benchmarks	None at launch	Self-reported (Moonshot)

[Figure: Two Mixture-of-Experts designs. GLM-5.2 (744B total, about 40B active) trades total size for a 1M context; Kimi K2.7-Code (1T total, 32B active) trades context for raw parameter count.]
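The sparsity behind the parameter rows can be made concrete with a little arithmetic. The sketch below computes what share of each model's weights is active per token, using only the figures in the table. The dictionary layout and function name are illustrative, and GLM-5.2's 40B is the approximate figure Zhipu gave, so treat its result as a rough estimate.

```python
# Parameter counts in billions, taken from the spec table above.
# GLM-5.2's active count is approximate ("~40B"), so its ratio is a rough estimate.
MODELS = {
    "GLM-5.2": {"total_b": 744, "active_b": 40},
    "Kimi K2.7-Code": {"total_b": 1000, "active_b": 32},
}


def active_fraction(spec):
    """Share of total parameters that are active for each token."""
    return spec["active_b"] / spec["total_b"]


for name, spec in MODELS.items():
    print(f"{name}: {active_fraction(spec):.1%} of weights active per token")
```

On these numbers GLM-5.2 activates roughly 5.4% of its weights per token and Kimi K2.7-Code roughly 3.2%, which is why the "larger" model on paper is not necessarily the more expensive one to run per token.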

The pricing claims here are taken directly from each vendor. Zhipu's coding subscription language is published in its developer docs at docs.z.ai, and Kimi K2.7-Code's per-token rates are listed on platform.kimi.ai. Kimi's model card, including the license and parameter counts, sits on its HuggingFace page.

Two open-weight licenses, two different catches

Both models are sold on the open-weight story, but with different fine print. GLM-5.2 is going out under a straight MIT license, which is about as permissive as licenses get. The catch: at announcement, Zhipu said the weights would land "next week." So at launch you could call the API, but you could not yet download the model and self-host it. The accurate framing is MIT weights promised, but not downloadable on day one.

Kimi K2.7-Code ships under a Modified MIT license, and crucially the weights are available now on HuggingFace. "Modified MIT" keeps the same baseline permissiveness and adds a clause requiring large-scale commercial deployments above a certain user threshold to display Kimi attribution, a condition that does not affect the vast majority of developers and teams. This is the same family of license Moonshot used for its predecessor, Kimi K2.6.

So on availability, Kimi wins the moment of launch outright: you can grab the weights, run them, or hit the API today. GLM-5.2 asks you to wait a few days for the download, or use the hosted endpoint in the meantime. For anyone whose whole reason to choose a Chinese open-weight model is the ability to self-host on-premises, that timing difference is not a footnote — it is the difference between deploying this week and deploying later.

The real story is how you pay

This is the headline insight of the whole comparison. GLM-5.2 and Kimi K2.7-Code do not just differ on specs — they ask you to pay in fundamentally different ways, and that shapes who each one suits.

Zhipu sells GLM-5.2 through the GLM Coding Plan, a flat monthly subscription. Z.ai's own docs describe it as "Plans from $18 per month," with the entry tier (Lite) at that price and the plan covering both GLM-5.2 and the lighter GLM-5-Turbo. Z.ai lists higher Pro and Max tiers, but it has not published their exact monthly prices at the time of writing, so treat anything above the $18 entry point as unconfirmed. The appeal here is predictability: you pay a fixed amount and you stop watching the meter.

Moonshot sells Kimi K2.7-Code as metered, pay-as-you-go API access. The published rates are $0.95 per million input tokens on a cache miss, just $0.19 per million input tokens on a cache hit, and $4.00 per million output tokens. There is no flat monthly floor — you pay for exactly what you consume, and the automatic context caching can pull effective input costs down sharply on repetitive, long-context coding sessions.
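To see how much the cache-hit rate moves the bill, here is a minimal sketch of the metered arithmetic using the three published rates. The function name, the `cache_hit_ratio` parameter, and the example session sizes are assumptions for illustration; real billing may round or bucket tokens differently.

```python
# Published Kimi K2.7-Code rates in USD per million tokens (from the article).
INPUT_MISS = 0.95   # input tokens on a cache miss
INPUT_HIT = 0.19    # input tokens on a cache hit
OUTPUT = 4.00       # output tokens


def metered_cost(input_tokens, output_tokens, cache_hit_ratio=0.0):
    """Estimate the USD cost of a workload.

    cache_hit_ratio is the assumed share of input tokens served from cache.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = miss_tokens * INPUT_MISS + hit_tokens * INPUT_HIT + output_tokens * OUTPUT
    return total / 1_000_000


# Hypothetical session: 2M input tokens and 200K output tokens.
cold = metered_cost(2_000_000, 200_000)        # no cache hits at all
warm = metered_cost(2_000_000, 200_000, 0.8)   # assume 80% of input hits the cache
print(f"cold: ${cold:.2f}, warm: ${warm:.2f}")
```

For that hypothetical session the cost drops from $2.70 with no cache hits to about $1.48 at an 80% hit rate. The output charge is identical in both cases, so caching only helps on the input side.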

[Figure: Two ways to pay for Chinese open-weight code, a flat subscription from $18 per month or metered tokens at $0.95 input and $4.00 output per million. Both undercut a $200 per month Claude Code Max plan.]

Now the framing from the headline: both of these are cheap next to a premium Western coding subscription. A Claude Code Max plan runs $200 per month. GLM's coding subscription starts at $18 per month. Kimi is metered and, for many workflows, genuinely inexpensive per token. So yes — you can run a capable Chinese open-weight coding model for a small fraction of a top-tier Claude subscription.
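One way to make the flat-versus-metered choice concrete is to ask how many tokens per month you would need to burn before a flat price becomes the cheaper option. The sketch below is a rough estimate under loud assumptions: it ignores cache hits, it assumes a hypothetical 10% output share, and it ignores any usage caps on the subscriptions, which are not described here.

```python
# Prices quoted in the article (USD).
FLAT_GLM_LITE = 18.00     # GLM Coding Plan entry tier, per month
FLAT_CLAUDE_MAX = 200.00  # Claude Code Max, per month
INPUT_MISS = 0.95         # Kimi input, per million tokens, cache miss
OUTPUT = 4.00             # Kimi output, per million tokens


def blended_rate(output_share):
    """USD per million tokens for a given output share, ignoring cache hits."""
    return (1 - output_share) * INPUT_MISS + output_share * OUTPUT


def breakeven_million_tokens(flat_price, output_share=0.1):
    """Monthly token volume (in millions) at which metered spend equals flat_price.

    output_share=0.1 is a hypothetical mix, not a measured one.
    """
    return flat_price / blended_rate(output_share)


print(f"vs $18 plan:  {breakeven_million_tokens(FLAT_GLM_LITE):.1f}M tokens/month")
print(f"vs $200 plan: {breakeven_million_tokens(FLAT_CLAUDE_MAX):.1f}M tokens/month")
```

With that mix the blended metered rate is $1.255 per million tokens, so metered Kimi usage stays under $18 up to roughly 14.3M tokens a month and under $200 up to roughly 159M. Cache hits would push both thresholds higher, and a more output-heavy workload would pull them lower.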

But do not read "cheaper" as "equivalent." These are different tiers of model, and the only published benchmarks in this matchup (Kimi's) put it behind the Western frontier on most tasks, as we cover below. The cost story is real and it is the strongest reason to look at these models. It is not a claim of capability parity with Claude Opus 4.8.

What Moonshot claims about Kimi K2.7-Code's performance

Moonshot did publish numbers for Kimi K2.7-Code, and they are worth reading carefully — both for what they show and for what they cannot show. Every figure below is self-reported by Moonshot, drawn from its own published table, and run in its own evaluation harness. Treat them as vendor claims, not independent results.

Moonshot's main pitch is improvement over its own predecessor: it reports Kimi K2.7-Code scoring 62.0 on its Kimi Code Bench v2, which it frames as a 21.8 percent jump over Kimi K2.6, alongside roughly 30 percent fewer reasoning tokens used to get there. The efficiency gain is the genuinely interesting part: cheaper per task because it thinks in fewer tokens.
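The link between fewer reasoning tokens and lower cost per task is simple arithmetic, sketched below. It assumes reasoning tokens are billed at the published output rate, which is not stated here, and the baseline token counts are hypothetical numbers chosen only to illustrate the effect.

```python
OUTPUT_RATE = 4.00          # USD per million output tokens (published Kimi rate)
REASONING_REDUCTION = 0.30  # "roughly 30 percent fewer reasoning tokens" (Moonshot's claim)


def output_cost(reasoning_tokens, answer_tokens, rate=OUTPUT_RATE):
    """USD cost of one task's output, assuming reasoning is billed as output."""
    return (reasoning_tokens + answer_tokens) * rate / 1_000_000


# Hypothetical task: 20,000 reasoning tokens and a 2,000-token final answer.
baseline_reasoning = 20_000
answer = 2_000

before = output_cost(baseline_reasoning, answer)
after = output_cost(baseline_reasoning * (1 - REASONING_REDUCTION), answer)
saving = 1 - after / before
print(f"before: ${before:.3f}, after: ${after:.3f}, saving: {saving:.1%}")
```

In this example the output cost falls from $0.088 to $0.064 per task, a saving of about 27%. The saving is smaller than 30% because the final answer tokens are unchanged; the more a task is dominated by reasoning, the closer the saving gets to the full 30%.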

Benchmark (Moonshot's table)	Kimi K2.7-Code	Claude Opus 4.8	GPT-5.5
Kimi Code Bench v2	62.0	67.4	69.0
MCP Mark Verified	81.1	76.4	92.9

Source: Moonshot AI, self-reported. Run in Moonshot's own harness; not independently verified.

Read honestly, the table is mixed. On Kimi Code Bench v2, K2.7-Code (62.0) trails both Claude Opus 4.8 (67.4) and GPT-5.5 (69.0). On MCP Mark Verified — a test of tool-use and agentic workflows — it scores 81.1, which beats Opus 4.8 (76.4) but still sits well below GPT-5.5 (92.9). Across most of the benchmarks Moonshot itself published, K2.7-Code trails the Western frontier; the one clear lead it shows over Opus 4.8 is on MCP Mark Verified.

Two cautions. First, these are single-vendor numbers: a lab benchmarking its own model against competitors in its own harness is not running an apples-to-apples evaluation, and the competitor scores may not match what those competitors report. Second, and most important for this comparison: GLM-5.2 published nothing comparable. There is no GLM column in that table because Zhipu released no benchmarks at all. Anyone telling you GLM-5.2 out-codes Kimi K2.7-Code is guessing.

[Figure: Kimi K2.7-Code's self-reported scores (Kimi Code Bench v2 62.0, MCP Mark Verified 81.1) next to an empty GLM-5.2 column, which is what makes a head-to-head impossible.]

Context windows: 1M vs 256K

One spec where the gap is real and measurable is context length. GLM-5.2 advertises a 1M token window; Kimi K2.7-Code offers 256K. In practical coding terms, both are large enough to hold a substantial repository, multiple files, and a long agent conversation. The 1M window on GLM-5.2 matters most for the heaviest cases: dropping an entire mid-sized codebase plus its docs and dependency tree into a single prompt, or running very long autonomous sessions without aggressive retrieval and chunking.

For most day-to-day coding agent work — a handful of files, a task, iterative edits — 256K is already comfortable, and Kimi's automatic context caching is designed to make repeated long-context calls cheaper rather than just bigger. So treat the 1M figure as a genuine edge for whole-repository workflows, not as a blanket advantage for everyday use.
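A quick way to judge which window you actually need is to estimate the token count of the code you plan to send. The sketch below uses a rough four-characters-per-token heuristic, which is an assumption; each model's real tokenizer will give different counts, and the file suffixes and the reserve for conversation and output are hypothetical defaults.

```python
from pathlib import Path

# Advertised context windows, in tokens (from the article).
CONTEXT_LIMITS = {"GLM-5.2": 1_000_000, "Kimi K2.7-Code": 262_144}
CHARS_PER_TOKEN = 4  # rough heuristic, NOT either model's real tokenizer


def repo_chars(root, suffixes=(".py", ".md", ".ts", ".go")):
    """Total characters in files under root whose suffix is in suffixes."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += len(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def estimate_tokens(num_chars):
    """Ceiling of num_chars / CHARS_PER_TOKEN."""
    return -(-num_chars // CHARS_PER_TOKEN)


def fits(num_chars, reserve_tokens=32_000):
    """Map each model to whether the text plus a reserve fits in its window."""
    needed = estimate_tokens(num_chars) + reserve_tokens
    return {model: needed <= limit for model, limit in CONTEXT_LIMITS.items()}


# Example with a hypothetical 2,000,000-character codebase.
print(fits(2_000_000))
```

Under these assumptions a 2,000,000-character codebase needs about 532,000 tokens including the reserve, which fits the 1M window but not the 256K one. Calling `fits(repo_chars("path/to/repo"))` on a real checkout gives a first-pass answer before you commit to either model.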

Where each one fits

Given that we cannot rank them on raw scores, the honest "who is this for" comes down to how you work and how you want to pay.

Lean toward GLM-5.2 if you want predictable, flat monthly cost; you need the 1M context window for whole-repository or long-session work; and you want a drop-in model for existing coding agents. Zhipu positions GLM-5.2 as compatible with Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo — point the agent at the endpoint and go. The trade-off: you may have to wait a few days past launch for downloadable weights, and you are buying on architecture and the vendor's reputation rather than any published benchmark.

Lean toward Kimi K2.7-Code if you want cheap metered tokens rather than a subscription; you value the roughly 30 percent reasoning-token efficiency gain on cost; and you want weights you can download and self-host today. You also get the only published benchmarks in this matchup, even if they show the model trailing Opus 4.8 on most tasks. The trade-off: a smaller 256K context and a Modified-MIT attribution clause at very large commercial scale.

For a lot of teams the real answer is "try both," because the switching cost is low: both speak standard APIs, both plug into the same agent tooling, and neither locks you in the way a proprietary model does.

The bigger picture

These two launches are not isolated. They are the latest beats in a sustained Chinese push to own the open-weight coding tier. In the same stretch of 2026 we have covered DeepSeek V4, Qwen 3.6, and MiniMax's open-weight coding frontier in MiniMax M3 — a steady cadence of large MoE models released under permissive-ish terms.

The contrast with the United States is sharp. As we wrote in our piece on NVIDIA's Nemotron 3 Ultra, the strongest American open-weights model still trails the Chinese frontier on independent indices. GLM-5.2 and Kimi K2.7-Code arriving within a day of each other, both aimed squarely at coding, both undercutting Western subscription pricing, is exactly the dynamic that keeps China ahead in this specific lane.

The bottom line

There is no scoreboard winner to declare here, and anyone who declares one is inventing data. GLM-5.2 shipped without benchmarks, so the only published numbers in this matchup belong to Kimi K2.7-Code — and they show it trailing Claude Opus 4.8 and GPT-5.5 on most tasks while leading Opus 4.8 on MCP tool-use, all self-reported.

The decision that actually matters is not "which scores higher" but "which fits how you build." GLM-5.2 is the pick for flat, predictable cost and a 1M context, with a short wait for downloadable weights. Kimi K2.7-Code is the pick for cheap metered tokens, a real token-efficiency gain, and weights you can self-host today. Both let you run capable Chinese open-weight code generation for a fraction of a $200 per month Claude Code subscription — just not at the same capability tier as the Western frontier, on the evidence we have.

We will revisit this the moment Zhipu publishes GLM-5.2 benchmarks and ships the weights. Until then, the comparison is about license, availability, context, and price — and on those terms, both of these models earn a look.

Frequently asked questions

What is the difference between GLM-5.2 and Kimi K2.7-Code?

Both are Chinese open-weight Mixture-of-Experts coding models launched in June 2026. GLM-5.2 (Zhipu) has 744B total parameters with about 40B active and a 1M token context, sold as a flat subscription from $18 per month. Kimi K2.7-Code (Moonshot) has 1T total parameters with 32B active and a 256K context, sold as metered API access. GLM has a larger context; Kimi is the larger model and is available to download today.

How much do GLM-5.2 and Kimi K2.7-Code cost?

Zhipu's GLM Coding Plan starts at $18 per month for the entry tier, and also covers GLM-5-Turbo; higher Pro and Max tiers exist but their exact prices are not published. Kimi K2.7-Code is metered at $0.95 per million input tokens on a cache miss, $0.19 per million input tokens on a cache hit, and $4.00 per million output tokens, with no flat monthly fee.

Are GLM-5.2 and Kimi K2.7-Code open source?

Both are open-weight. GLM-5.2 uses a permissive MIT license, but at launch Zhipu said the downloadable weights would arrive "next week," so they were not available on day one. Kimi K2.7-Code uses a Modified MIT license and its weights are available now on HuggingFace, with an attribution clause that only applies to very large commercial deployments.

Is GLM-5.2 better than Kimi K2.7-Code?

There is no honest way to rank them on performance right now, because Zhipu published no benchmark scores for GLM-5.2. Only Kimi K2.7-Code has published numbers. The defensible comparison is on architecture, license, pricing model, context window, and availability — not on raw benchmark results, which do not exist for GLM-5.2.

How does Kimi K2.7-Code compare to Claude Opus 4.8 and GPT-5.5?

By Moonshot's own self-reported table, Kimi K2.7-Code scores 62.0 on Kimi Code Bench v2 versus 67.4 for Claude Opus 4.8 and 69.0 for GPT-5.5, so it trails both there. On MCP Mark Verified it scores 81.1, beating Opus 4.8 (76.4) but trailing GPT-5.5 (92.9). These figures are self-reported and run in Moonshot's own harness, so they are not independently verified.

Can I use GLM-5.2 and Kimi K2.7-Code with Claude Code?

Yes for both, via their APIs. Zhipu advertises GLM-5.2 as a drop-in for Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo. Kimi K2.7-Code is used through Moonshot's API, which works with agents and editors that support custom model endpoints.

What context window do GLM-5.2 and Kimi K2.7-Code support?

GLM-5.2 supports a 1M token context window. Kimi K2.7-Code supports 256K tokens (262,144). The larger GLM window matters most for whole-repository prompts and very long agent sessions; 256K is already comfortable for typical multi-file coding tasks, and Kimi adds automatic context caching to lower the cost of repeated long-context calls.

Are these Chinese models cheaper than a Claude Code subscription?

Yes, substantially. A Claude Code Max plan is $200 per month. Zhipu's GLM Coding Plan starts at $18 per month, and Kimi K2.7-Code is metered and cheap per token. The caveat is capability: these are not equivalent to the Western frontier, and the only published benchmarks here (Kimi's) show it trailing Claude Opus 4.8 on most tasks.
