GLM-5.2

Zhipu AI open-weight coding flagship: 753B MoE (~40B active), 1M context, MIT license, headline SWE-bench Pro 62.1 (vendor self-reported); GLM Coding Plan from around $18 per month or $1.40 in / $4.40 out per million tokens.

8.5/10
Last updated June 19, 2026
Author
Anthony M.
27 min read | Verified June 19, 2026 | Tested hands-on

Quick Summary

GLM-5.2 is Zhipu AI's (Z.ai) open-weight flagship coding model: 753B MoE (~40B active), 1M context, 131K output, MIT weights. Vendor-reported SWE-bench Pro 62.1, AIME 2026 99.2. GLM Coding Plan from ~$18/mo or $1.40 in / $4.40 out per million tokens.

GLM-5.2 is Zhipu AI's open-weight flagship coding model, released June 13, 2026 under the international Z.ai brand. It is a Mixture-of-Experts model with roughly 753 billion total parameters and about 40 billion active per token, paired with a 1 million token context window and up to 131,072 tokens of output. The weights ship under a permissive MIT license. Zhipu reports a headline score of 62.1 on SWE-bench Pro, a vendor self-reported figure that we did not reproduce ourselves, and sells the model through a dual-rate structure: the GLM Coding Plan, a flat subscription from around $18 per month, or metered API access at $1.40 per million input tokens and $4.40 per million output tokens.

Our Verdict

GLM-5.2 scores 8.5 out of 10 in our testing. It is the most compelling open-weight coding model we have run this year: a sparse 753B Mixture-of-Experts with a genuinely usable 1 million token context, MIT weights you can self-host, and a flat GLM Coding Plan from around $18 per month that undercuts every Western coding subscription we compared it against. The two caveats we keep flagging are that the standout benchmark numbers are vendor self-reported and not yet independently verified, and that the production API is hosted in China. If both are acceptable to you, this is frontier-class coding at a fraction of the cost.

What is GLM-5.2?

GLM-5.2 is the latest generation of Zhipu AI's GLM (General Language Model) family, aimed squarely at coding and agentic workflows. Internationally the lab goes to market as Z.ai. The model was announced June 13, 2026, in a tight cluster of Chinese open-weight coding releases, and the MIT-licensed weights landed on HuggingFace under the zai-org organization around June 17. It is a Mixture-of-Experts design with about 753 billion total parameters and roughly 40 billion active per token, which is large enough to compete at the frontier while keeping inference cost down through sparse activation. That sparse activation is the whole trick: only a fraction of the network fires for any given token, so you get the capacity of a very large model at the serving cost of a far smaller one. It is the architectural choice that makes the pricing story below possible.
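To put the sparse-activation claim in numbers, here is a minimal sketch that uses only the approximate figures quoted above (about 753B total, about 40B active). The function name is ours, and the result is back-of-envelope arithmetic, not a measurement of serving cost.

```python
# Approximate figures from the review, in billions of parameters.
TOTAL_PARAMS_B = 753
ACTIVE_PARAMS_B = 40


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the network's parameters used for a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%} of total parameters")
print(f"Total-to-active ratio: about {TOTAL_PARAMS_B / ACTIVE_PARAMS_B:.1f}x")
```

Per-token compute scales roughly with the active share, but a self-hosted Mixture-of-Experts deployment generally still has to keep all of the weights available, so the saving shows up in compute rather than in memory footprint.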

The spec that pulled us in is the context window: a genuinely usable 1 million tokens, roughly five times the 200K-class limit of GLM-5.1, with output up to 131,072 tokens. That makes GLM-5.2 a credible choice for whole-repository prompts and long autonomous coding sessions, which is exactly how we stress-tested it. The phrase "genuinely usable" matters, because plenty of models advertise a large context window and then degrade badly once you fill it. In our testing GLM-5.2 stayed coherent deep into the window rather than losing the thread halfway through a large prompt, which is the difference between a marketing number and a feature you can build a workflow on.

The model exposes two reasoning modes, High and Max, that let you tune reasoning depth against latency and cost. In practice we left it in High for day-to-day edits and switched to Max only when a refactor needed the model to hold a lot of cross-file state at once. High mode is the right default for the bulk of a coding session: it is responsive and rarely overthinks a simple change. Max mode trades latency for deliberation, and it is where the model earns the reasoning benchmark numbers it claims, so we reserved it for the genuinely hard planning problems where the extra thinking paid for itself.
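The rule of thumb we followed can be written down as a small helper. This is a sketch of our own habit, not part of any Z.ai API: the `Task` fields, the five-file threshold, and the lowercase mode strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    files_touched: int    # how many files the change is expected to span
    needs_planning: bool  # True for multi-step refactors that need up-front planning


def pick_reasoning_mode(task: Task, cross_file_threshold: int = 5) -> str:
    """Return "max" for hard, cross-file work and "high" for everything else.

    The threshold is a hypothetical default; tune it against your own latency
    and cost tolerance.
    """
    if task.needs_planning or task.files_touched >= cross_file_threshold:
        return "max"
    return "high"


print(pick_reasoning_mode(Task(files_touched=1, needs_planning=False)))
print(pick_reasoning_mode(Task(files_touched=12, needs_planning=True)))
```

Defaulting to High and escalating only on explicit signals keeps the slower mode off the routine edits that make up most of a session.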

It is worth being precise about what "open" means here. The weights are MIT-licensed and downloadable, so you can self-host, fine-tune, and ship commercial products on top of them. The training code and data recipe are not released, so GLM-5.2 is open-weight, not fully open-source. That is the same posture most frontier labs that release weights have adopted, and it is the right mental model when you plan a deployment around it. For most teams the distinction is academic, because what you actually want is the right to run, modify, and commercialize the model, and the MIT weights give you all three. It only becomes a real constraint if your goal is to reproduce the training run, which the released artifacts do not let you do.

Specifications

| Specification | GLM-5.2 |
| --- | --- |
| Vendor | Zhipu AI (international brand Z.ai) |
| Release date | June 13, 2026 (MIT weights on HuggingFace around June 17) |
| Architecture | Open-weight Mixture-of-Experts |
| Total parameters | About 753 billion |
| Active parameters per token | About 40 billion |
| Context window | 1,000,000 tokens |
| Maximum output | 131,072 tokens (about 128K) |
| Reasoning modes | High and Max |
| License | MIT (weights only; training code and data not released) |
| Metered API pricing | $1.40 per million input tokens; $4.40 per million output tokens; $0.26 per million cached input |
| Subscription | GLM Coding Plan from around $18 per month (entry tier) |
| Agent compatibility | Claude Code, Cline, Kilo Code, OpenClaw, Goose, Roo |

GLM-5.2 Pricing

GLM-5.2 dual-rate pricing: metered API tokens versus the flat GLM Coding Plan. Illustration.

Zhipu sells GLM-5.2 two ways, and the dual-rate structure is central to its value story. The first route is the GLM Coding Plan, a flat monthly subscription that starts from around $18 per month for the entry tier. Pro, Max, and Team tiers sit above that, but Zhipu has not published their exact monthly prices, so we treat anything above the entry point as unconfirmed for now. One thing to plan around: the plan consumes quota at up to three times the normal rate during peak hours, so a subscription that feels generous off-peak can throttle faster than expected when load is high.
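The peak-hour multiplier is easy to underestimate, so here is a rough sketch of how it compounds. It assumes one request costs one quota unit off-peak, which is our simplification (the review does not state how Zhipu counts quota), and it applies the full 3x figure, so treat the result as a worst case for an "up to three times" draw.

```python
def effective_quota_draw(requests: int, peak_share: float,
                         peak_multiplier: float = 3.0) -> float:
    """Quota units consumed when a share of requests lands in peak hours.

    Assumes 1 unit per off-peak request (hypothetical accounting) and
    `peak_multiplier` units per peak-hour request.
    """
    if not 0.0 <= peak_share <= 1.0:
        raise ValueError("peak_share must be between 0 and 1")
    peak_requests = requests * peak_share
    off_peak_requests = requests - peak_requests
    return off_peak_requests + peak_requests * peak_multiplier


# 1,000 requests with 40% of them in peak hours.
print(effective_quota_draw(1000, 0.4))  # 1800.0
```

Under these assumptions, a team doing 40% of its work at peak burns quota 1.8 times faster than the same workload run entirely off-peak.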

The second route is metered API access. The Z.ai developer docs list it at $1.40 per million input tokens and $4.40 per million output tokens, with cached input at $0.26 per million. For teams that prefer pay-as-you-go over a subscription, those token rates are aggressive against frontier Western models, and the cached-input rate makes repeated long-context prompts noticeably cheaper. We captured these figures from docs.z.ai in June 2026; given how fast this market moves, confirm them on the vendor pricing page before you commit budget.
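As a worked example of those rates, the sketch below prices a long-context call before and after caching. The function and the token counts are illustrative, and it assumes cached input tokens are billed at the cached rate instead of the standard input rate, which is how we read the listed prices.

```python
# Listed rates in USD per million tokens (captured from the review).
INPUT_USD_PER_M = 1.40
OUTPUT_USD_PER_M = 4.40
CACHED_INPUT_USD_PER_M = 0.26


def metered_cost_usd(fresh_input_tokens: int, output_tokens: int,
                     cached_input_tokens: int = 0) -> float:
    """Cost of one call; cached tokens are assumed to replace fresh input tokens."""
    total = (
        fresh_input_tokens * INPUT_USD_PER_M
        + cached_input_tokens * CACHED_INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000


# A 500K-token prompt with 20K tokens of output, first uncached...
first_call = metered_cost_usd(500_000, 20_000)
# ...then repeated with 480K of the prompt served from cache.
repeat_call = metered_cost_usd(20_000, 20_000, cached_input_tokens=480_000)

print(f"First call:  ${first_call:.2f}")   # about $0.79
print(f"Repeat call: ${repeat_call:.2f}")  # about $0.24
```

On these illustrative numbers the cached repeat costs less than a third of the first call, which is why the cached rate matters for whole-repository prompts that are resent across a session.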

The practical takeaway from our cost testing: if your workload is steady and agent-driven, the flat GLM Coding Plan is the cheaper and more predictable option, with the peak-hour 3x quota draw as the main thing to watch. If your usage is spiky or you only need the model occasionally, the metered API at $1.40 per million input tokens and $4.40 per million output tokens keeps you from paying for idle capacity. Either way, the headline is the same one we kept coming back to: this is frontier-class coding capability at a price point that the major Western coding subscriptions do not come close to matching, and for a team that codes with an agent all day, that gap compounds into real money over a billing cycle.
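One way to frame the plan-versus-metered choice is the monthly token volume at which a metered bill would reach the entry plan price. The sketch below is a simplification under stated assumptions: it ignores cached input, it uses a hypothetical 20% output share, and it says nothing about whether the entry tier's quota (which is not published in token terms) would actually cover that volume.

```python
INPUT_USD_PER_M = 1.40
OUTPUT_USD_PER_M = 4.40
ENTRY_PLAN_USD = 18.0  # "from around $18 per month"


def blended_rate_per_m(output_share: float) -> float:
    """Average USD per million tokens for a given output share of total tokens."""
    return (1 - output_share) * INPUT_USD_PER_M + output_share * OUTPUT_USD_PER_M


def breakeven_tokens(plan_price: float = ENTRY_PLAN_USD,
                     output_share: float = 0.2) -> float:
    """Monthly tokens at which metered spend equals the plan price."""
    return plan_price / blended_rate_per_m(output_share) * 1_000_000


print(f"Blended rate: ${blended_rate_per_m(0.2):.2f} per million tokens")
print(f"Break-even:   about {breakeven_tokens() / 1_000_000:.1f}M tokens per month")
```

With a 20% output share the blended rate works out to about $2.00 per million tokens, so the metered bill reaches $18 at roughly 9 million tokens a month. An agent that rereads a large repository all day can pass that quickly, which is the arithmetic behind preferring the flat plan for steady workloads.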

One caveat on the comparison shopping: because the higher Coding Plan tiers are unpublished, you cannot fully model the cost of heavy, sustained usage before you start. We would advise starting on the entry tier, watching how fast the peak-hour quota draws down against your real traffic, and only then deciding whether a higher tier or a switch to metered API billing makes more sense. The self-host route via the MIT weights is the third lever: if your volume is high enough and you have the infrastructure, running the model yourself removes per-token cost entirely and turns the spend into a fixed compute bill.

Benchmarks

GLM-5.2 headline benchmark scores as published by Zhipu (vendor self-reported, not yet independently verified). Illustration.

Unlike some recent open-weight launches, GLM-5.2 did ship with an official benchmark table. Zhipu published the numbers below on June 17, 2026. We reproduce them here exactly as the vendor presented them, and we label every score the same way: these are vendor self-reported figures, not yet independently verified by a neutral harness. Treat them as a credible vendor claim that still needs third-party confirmation, not as settled fact.

| Benchmark | GLM-5.2 score | Status |
| --- | --- | --- |
| SWE-bench Pro | 62.1 (up from GLM-5.1 at 58.4; ahead of GPT-5.5 at about 58.6) | Vendor self-reported, not yet independently verified |
| Terminal-Bench 2.1 | 81.0 (82.7 with the best harness) | Vendor self-reported, not yet independently verified |
| AIME 2026 | 99.2 | Vendor self-reported, not yet independently verified |
| GPQA Diamond | 91.2 | Vendor self-reported, not yet independently verified |
| HLE (Humanity's Last Exam, with tools) | 54.7 | Vendor self-reported, not yet independently verified |
| MCP-Atlas | 76.8 | Vendor self-reported, not yet independently verified |

The headline that matters for a coding model is SWE-bench Pro at 62.1. That is a real step up from the GLM-5.1 predecessor at 58.4, and it edges ahead of the GPT-5.5 figure of about 58.6 that Zhipu cites for comparison, a race we break down in our GLM-5.2 vs GPT-5.5 comparison. The Terminal-Bench 2.1 result of 81.0, rising to 82.7 with the best harness, lands within a few points of where Claude Opus 4.8 sits (around 85), which is a closer race than we expected from an open-weight model at this price. AIME 2026 at 99.2 and GPQA Diamond at 91.2 round out a strong reasoning profile, and the HLE result of 54.7 with tools plus MCP-Atlas at 76.8 suggest the model holds up on harder, more open-ended evaluation than the headline coding suites alone.
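For readers who want the gaps spelled out, this short sketch computes the point differences from the figures quoted above. All inputs are the vendor-reported or vendor-cited numbers, and the Claude Opus 4.8 value is the approximate "around 85" from the text, so the last gap is only as precise as that estimate.

```python
# Scores as quoted in the review (vendor self-reported or vendor-cited).
SWE_BENCH_PRO = {"GLM-5.2": 62.1, "GLM-5.1": 58.4, "GPT-5.5": 58.6}
TERMINAL_BENCH = {
    "GLM-5.2": 81.0,
    "GLM-5.2 (best harness)": 82.7,
    "Claude Opus 4.8": 85.0,  # approximate figure
}


def gap(a: float, b: float) -> float:
    """Point difference a - b, rounded to one decimal place."""
    return round(a - b, 1)


print("vs GLM-5.1:", gap(SWE_BENCH_PRO["GLM-5.2"], SWE_BENCH_PRO["GLM-5.1"]))
print("vs GPT-5.5:", gap(SWE_BENCH_PRO["GLM-5.2"], SWE_BENCH_PRO["GPT-5.5"]))
print("behind Opus 4.8 (best harness):",
      gap(TERMINAL_BENCH["Claude Opus 4.8"], TERMINAL_BENCH["GLM-5.2 (best harness)"]))
```

That is a 3.7-point gain over GLM-5.1, a 3.5-point lead over the cited GPT-5.5 figure, and a deficit of roughly 2.3 points to Claude Opus 4.8 with the best harness.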

We want to be careful about how much weight to put on these numbers. We did not attempt to reproduce them in a controlled harness ourselves, so we report them strictly as the vendor's claims. Self-reported leaderboards have a long history of looking better than the model performs once a neutral party reruns the evaluation, and the gap is rarely in the user's favor. That does not mean the figures are wrong; our hands-on coding experience was consistent with a model in this class, and the SWE-bench Pro jump over GLM-5.1 tracks the improvement we felt. It does mean the responsible posture is to treat the table as a credible vendor claim pending third-party confirmation rather than as settled fact. Independent leaderboard verification is the single thing we are watching most closely over the coming weeks, and we would weight an external rerun far more heavily than the launch table when it lands.

Agent Ecosystem

The agent ecosystem is, in our view, the strongest practical selling point. Zhipu positions GLM-5.2 as drop-in compatible with the coding agents developers already run: Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo. This is not marketing fluff. Because the model speaks the same API conventions these tools expect, you point the agent at the GLM endpoint, swap the model name, and keep your existing workflow. There is no rewrite of your harness, no custom adapter, no relearning a new interface.

That compatibility is what turns a cheap model into a cheap workflow. A team running Claude Code or Cline today can route a portion of its traffic to GLM-5.2 to cut token spend without changing how anyone works. For self-hosters, the same compatibility holds against the downloadable weights, so you can run a private endpoint behind the same agents. In our testing this was the lowest-friction model swap we have done all year: the change was a config edit, not a project.
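Routing "a portion of traffic" can be as simple as a deterministic hash split in whatever layer dispatches requests. The sketch below is one generic way to do that and is not taken from any of the agents named above; the backend labels `"glm"` and `"default"` and the `request_id` scheme are hypothetical.

```python
import hashlib


def route_request(request_id: str, glm_share: float) -> str:
    """Deterministically send roughly `glm_share` of requests to the GLM backend.

    The same request_id always maps to the same backend, which keeps a
    session from flipping between models mid-task.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "glm" if bucket < glm_share else "default"


# Send about 30% of sessions to GLM-5.2 and compare cost and quality.
routed = [route_request(f"session-{i}", 0.3) for i in range(1000)]
print(routed.count("glm"), "of", len(routed), "sessions routed to GLM")
```

Hashing on a stable session or repository identifier, instead of picking at random per request, makes the split reproducible and lets you attribute cost and quality differences to one backend.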

The breadth of the supported agent list also matters more than it first appears. Claude Code and Cline cover most of the mainstream agentic coding crowd, but the inclusion of Kilo Code, OpenClaw, Goose, and Roo means the model meets developers wherever they already are rather than forcing them onto a single first-party client. That is a deliberate ecosystem play, and it is the right one for an open-weight model trying to win adoption against incumbents. The more places GLM-5.2 slots in cleanly, the lower the switching cost for any individual team, and low switching cost is exactly how a challenger model builds a user base.

Hands-On Testing

We ran GLM-5.2 the way we run every coding model we review: against real repositories and real agentic tasks, not toy prompts. We pointed Claude Code at the GLM endpoint, swapped the model, and let it work. The first thing that stood out was how little we had to change. The handoff was clean, the tool-calling behaved, and within minutes the agent was reading files, proposing diffs, and running commands exactly as it does with its native model.

For long-context work, the 1 million token window earned its headline. We fed it a mid-sized repository in a single prompt and asked for a cross-cutting refactor, and the model held context across files well enough that we did not need to chunk the work or build a retrieval layer just to keep it oriented. In Max reasoning mode it planned multi-step edits coherently; in High mode it was faster and still reliable for the bread-and-butter edits that make up most of a coding session. We kept it in High for routine work and reserved Max for the genuinely hard refactors.
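Before sending a whole repository in one prompt, it helps to estimate whether it fits. The sketch below uses a crude four-characters-per-token heuristic, which is an assumption and not GLM-5.2's actual tokenizer, and it conservatively reserves the full 131,072-token output ceiling inside the 1,000,000-token window. Whether output actually counts against the window is not something the review establishes.

```python
import math
import os

CONTEXT_WINDOW = 1_000_000   # tokens, from the spec table
MAX_OUTPUT = 131_072         # tokens, from the spec table
CHARS_PER_TOKEN = 4          # rough heuristic, not the model's real tokenizer


def estimate_repo_tokens(root: str,
                         extensions: tuple = (".py", ".ts", ".go", ".md")) -> int:
    """Roughly estimate the token count of matching source files under `root`."""
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git in place.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return math.ceil(total_chars / CHARS_PER_TOKEN)


def fits_in_context(prompt_tokens: int, reserve_for_output: int = MAX_OUTPUT,
                    window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus reserved output stays within the window."""
    return prompt_tokens + reserve_for_output <= window
```

Calling `fits_in_context(estimate_repo_tokens("path/to/repo"))` gives a quick go/no-go. Because the heuristic can be off by a wide margin for dense code or non-English text, leave headroom instead of packing the window to the limit.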

Output quality on coding tasks tracked the strong SWE-bench Pro claim more closely than we expected from a budget option. The diffs were sensible, it followed instructions about scope, and it recovered reasonably when a command failed mid-session. We did hit the limits you would expect from any model at this price point: it is not as consistently sharp as the very top Western frontier model on the gnarliest debugging problems, and we occasionally had to nudge it back on track during very long autonomous runs. None of that undermined the core finding. For the money, the coding throughput we got was excellent, and the cost difference versus our usual stack was large enough to matter at team scale.

We also spent time on the long-output behavior, since the 131,072 token output ceiling is unusual and we wanted to see whether it held up. Asking for a large generated file or an extensive multi-file scaffold, the model produced long, structured output without trailing off into repetition or losing formatting discipline the way some models do as they approach their output limit. For code generation tasks where the deliverable is genuinely large, that ceiling is a practical advantage, not just a spec-sheet number.

The honest reservation from our testing is operational, not technical. The production API is hosted in China, which is a real consideration for anything touching regulated or sensitive code, and the higher Coding Plan tiers are unpublished, so modeling cost for heavy usage upfront takes some guesswork. We also kept in mind that the benchmark figures we were implicitly measuring against are vendor self-reported, so we tried to judge the model on what it actually did in front of us rather than on whether it matched a published score. Neither the hosting nor the pricing opacity is a dealbreaker, but both belong in your evaluation, and for some buyers the China-hosted API will be the deciding factor regardless of how good the model is.

Use Cases

GLM-5.2 fits a specific and increasingly common shape of work. Based on our testing, these are the scenarios where it shines:

  • Whole-repository refactors. The 1 million token context lets you hand the model an entire mid-sized codebase and ask for cross-cutting changes without building a retrieval pipeline first.
  • Long autonomous coding sessions. High and Max reasoning modes plus the large context make it comfortable on multi-step tasks that run for a while without supervision.
  • Agentic coding inside existing tools. Drop-in compatibility with Claude Code, Cline, and the rest means you can adopt it without changing your workflow.
  • Self-hosted private coding assistants. MIT weights let regulated teams run a fully private endpoint instead of sending code to a hosted API.
  • Cost-controlled team coding. The flat GLM Coding Plan gives predictable spend for a team that codes with an agent all day.
  • Fine-tuning on proprietary codebases. The permissive license makes it legitimate to fine-tune on internal code and ship the result.
  • Math and reasoning-heavy tasks. The AIME 2026 and GPQA Diamond claims, while vendor-reported, point to a model that is strong beyond pure coding.

Pros and Cons

After our hands-on time, here is the honest balance sheet. The strengths are real and the weaknesses are mostly about verification and operations rather than raw capability.

What we liked: the vendor-reported SWE-bench Pro of 62.1 (up from 58.4 and ahead of GPT-5.5); a usable 1 million token context that is roughly five times GLM-5.1; MIT weights you can use commercially, fine-tune, and self-host; a flat GLM Coding Plan from around $18 per month that undercuts Western subscriptions; sparse MoE economics that keep inference cheap; drop-in compatibility with the major coding agents; and two reasoning modes for tuning depth against cost.

What gave us pause: all the standout scores are vendor self-reported and not yet independently confirmed; the weights are open but the model is not open source, because the training code and data are not released; the higher Coding Plan tiers are unpublished, and the plan can draw quota at up to three times the normal rate during peak hours; the production API is hosted in China, which raises data-residency questions; and the downloadable weights arrived a few days after launch rather than on day one.

Who It's For

GLM-5.2 is for developers and teams who want frontier-class coding capability at open-weight economics and are comfortable with a vendor-reported benchmark profile. If you run an agent-driven workflow in Claude Code, Cline, or similar, and your priority is cutting token cost without retooling, this is one of the easiest wins available right now. Self-hosters and labs that need a private, fine-tunable coding model get even more out of it, because the MIT weights remove the licensing friction that usually blocks that path.

It is a weaker fit for regulated Western enterprises with strict data-residency requirements. The production API is hosted in China, and absent a Western reseller or a self-hosted deployment, that alone can rule it out for legal, healthcare, finance, or government code. Compliance and procurement teams in those sectors will often have a hard policy against sending source code to an API hosted in that jurisdiction, and no amount of benchmark performance overrides a policy like that. For those buyers, the self-host route via the MIT weights is the way to capture the upside while keeping data in your own environment, and it is genuinely viable here precisely because the license is permissive enough to allow it. And if your purchasing process requires independently verified benchmarks before adoption, you may want to wait for third-party leaderboard confirmation of the vendor's numbers before committing, since the current figures, strong as they look, are still self-reported.

Alternatives

GLM-5.2 sits in a crowded and fast-moving field of open-weight and Chinese-lab coding models. The closest direct alternative is DeepSeek V4, which competes hard on coding and reasoning and is the model most teams will cross-shop against GLM-5.2. If your decision hinges on independently scrutinized numbers, DeepSeek's results have had more time in front of the community, which can tip the balance for a cautious buyer, and our GLM-5.2 vs DeepSeek V4 breakdown walks through where each one pulls ahead. Kimi K2.7 from Moonshot is the other major contender in this cluster, with a strong agentic and coding focus and a release timed within a day of GLM-5.2, so the two are natural head-to-head picks for anyone shopping the latest open-weight coding models.

Qwen 3.6 from Alibaba rounds out the set as a broad, well-supported open-weight family with its own coding-tuned variants, and its ecosystem maturity and tooling support make it a safe default for teams that value stability over chasing the newest release. We would evaluate all three against your specific stack, since the gaps between them are small and shifting month to month, and the right pick often comes down to which one's hosting, license terms, and tool compatibility fit your constraints best. Our advice is to run your own representative tasks through each rather than choosing on published benchmarks, because in this tier the practical fit for your workflow matters more than a point or two on a leaderboard, and several of those leaderboard numbers are still vendor-reported across the board.
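Running your own tasks through each candidate does not need heavy tooling; a pass-rate tally per model is enough to start. The sketch below assumes you already have a way to run a task and judge it passed or failed. The tuple layout is our own convention, and the sample rows are made-up placeholders to show the shape of the data, not results from any real model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pass_rates(results: Iterable[Tuple[str, str, bool]]) -> Dict[str, float]:
    """Compute the pass rate per model from (model, task_id, passed) records."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for model, _task_id, passed in results:
        totals[model] += 1
        passes[model] += int(passed)
    return {model: passes[model] / totals[model] for model in totals}


# Placeholder records only; replace with outcomes from your own task suite.
sample = [
    ("model_a", "fix-flaky-test", True),
    ("model_a", "rename-module", False),
    ("model_b", "fix-flaky-test", True),
    ("model_b", "rename-module", True),
]
print(pass_rates(sample))
```

Keeping the task list identical across models and drawn from your own repositories is what makes the comparison meaningful, since it measures fit for your workflow instead of a leaderboard's.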

Frequently Asked Questions

Is GLM-5.2 free or open source?

GLM-5.2's weights are released under a permissive MIT license and are free to download, use commercially, fine-tune, and self-host. It is open-weight rather than fully open-source, because Zhipu has not released the training code or the data recipe. Using the hosted API is not free; it is billed per token or through the GLM Coding Plan subscription.

What does the MIT license cover exactly?

The MIT license applies to the model weights. That gives you broad rights: commercial use, redistribution, modification, fine-tuning, and self-hosting, with minimal restrictions. It does not cover the training code or the dataset, which Zhipu has kept private. In practice you can build and ship products on the weights, but you cannot reproduce the model from scratch.

How much does GLM-5.2 cost?

There are two pricing routes. The metered API costs $1.40 per million input tokens and $4.40 per million output tokens, with cached input at $0.26 per million. Alternatively, the GLM Coding Plan is a flat subscription starting from around $18 per month for the entry tier. Higher Pro, Max, and Team tiers exist, but Zhipu has not published their prices, and the plan can draw quota at up to three times the normal rate during peak hours.

What are GLM-5.2's benchmark scores?

Zhipu published an official benchmark table on June 17, 2026. The headline figures are SWE-bench Pro 62.1, Terminal-Bench 2.1 at 81.0 (82.7 with the best harness), AIME 2026 99.2, GPQA Diamond 91.2, HLE with tools 54.7, and MCP-Atlas 76.8. Every one of these is vendor self-reported and not yet independently verified by a neutral harness, so treat them as a credible vendor claim pending third-party confirmation.

How does GLM-5.2 compare to DeepSeek V4 and Kimi K2.7?

All three are open-weight coding models in the same competitive cluster, and the gaps between them are small and shifting month to month. DeepSeek V4 is the most direct cross-shop on coding and reasoning, while Kimi K2.7 leans into agentic and coding work. GLM-5.2's vendor-reported SWE-bench Pro of 62.1 is strong, but because the scores are not yet independently verified, we would test all three against your own stack rather than pick on published numbers alone.

Can I use GLM-5.2 with Claude Code?

Yes. Zhipu advertises drop-in compatibility with Claude Code, and in our testing the swap was clean: point Claude Code at the GLM endpoint, change the model name, and your existing workflow keeps working. It is also compatible with Cline, Kilo Code, OpenClaw, Goose, and Roo, so most agent-based setups can adopt it without rebuilding their harness.

What context window does GLM-5.2 have?

GLM-5.2 has a 1 million token context window, roughly five times the 200K-class limit of GLM-5.1, with a maximum output of 131,072 tokens (about 128K). In our testing that was enough to feed a mid-sized repository in a single prompt and run a cross-cutting refactor without chunking the work or building a retrieval layer.

Is the China-hosted API a data-privacy concern?

It can be. The production API is hosted in China, which raises data-residency and privacy questions for regulated Western buyers in finance, healthcare, legal, or government work, especially absent a Western reseller. The mitigation is the MIT-licensed weights: you can self-host a fully private endpoint and keep your code in your own environment, capturing the model's upside without sending data to the hosted API.

Key Features

Mixture-of-Experts architecture, about 753 billion total parameters with about 40 billion active per token
1,000,000 token context window
131,072 token maximum output (about 128K)
MIT-licensed open weights (commercial use, fine-tuning, and self-hosting allowed)
Two reasoning modes: High and Max
Drop-in compatibility with Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo
Cached input billed at $0.26 per million tokens

Pros & Cons

Pros

  • Vendor-reported SWE-bench Pro of 62.1 is a real step up from GLM-5.1 at 58.4 and edges ahead of GPT-5.5 at about 58.6 (self-reported, not yet independently verified)
  • Genuinely usable 1 million token context window, roughly five times the 200K-class limit of GLM-5.1, ideal for whole-repository prompts and long autonomous coding sessions
  • Permissive MIT license on the weights: free commercial use, redistribution, fine-tuning, and self-hosting
  • Flat-rate GLM Coding Plan from around $18 per month gives predictable cost and undercuts premium Western coding subscriptions by a wide margin
  • Sparse Mixture-of-Experts design (about 753B total, about 40B active) keeps inference cost down while competing at the frontier on long-horizon coding suites
  • Drop-in compatibility with the major coding agents: Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo
  • Two reasoning modes, High and Max, let developers tune reasoning depth against latency and cost

Cons

  • All published scores are vendor self-reported and not yet independently confirmed by third-party harnesses, so treat the leaderboard as a vendor claim until external evaluation lands
  • Open weights, not open source: the MIT license covers the model weights, but the training code and data recipe are not released
  • Exact pricing above the entry GLM Coding Plan tier (Pro, Max, Team) is not published, and the plan consumes quota at up to three times the normal rate during peak hours, so heavier usage is hard to model upfront
  • The production API is hosted in China, which raises data-residency questions for regulated Western enterprise buyers absent a Western reseller
  • Downloadable MIT weights arrived the week after launch rather than on day one, so the self-host path was briefly gated behind the hosted endpoint

Best Use Cases

Whole-repository refactors using the 1 million token context
Long autonomous coding sessions
Agentic coding inside Claude Code, Cline, and other compatible tools
Self-hosted private coding assistant via the MIT weights
Cost-controlled team coding through the flat GLM Coding Plan
Fine-tuning on proprietary codebases
Math and reasoning-heavy tasks

Anthony M., Founder & Lead Reviewer, Verified Builder

We're developers and SaaS builders who use these tools daily in production. Every review comes from hands-on experience building real products — DealPropFirm, ThePlanetIndicator, PropFirmsCodes, and many more. We don't just review tools — we build and ship with them every day.

Written and tested by developers who build with these tools daily.

Frequently Asked Questions

How much does GLM-5.2 cost?

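The metered API costs $1.40 per million input tokens and $4.40 per million output tokens, and the GLM Coding Plan starts from around $18 per month.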

Is GLM-5.2 free?

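Partly. The MIT-licensed weights are free to download and self-host, but the hosted API is paid: from $1.40 per million input tokens on metered billing, or from around $18 per month on the GLM Coding Plan.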

What are the best alternatives to GLM-5.2?

The closest alternatives covered in this review are DeepSeek V4, Kimi K2.7 from Moonshot, and Qwen 3.6 from Alibaba.

Is GLM-5.2 good for beginners?

GLM-5.2 is rated 8/10 for ease of use.

What platforms does GLM-5.2 support?

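GLM-5.2 is available through the hosted Z.ai API, through the GLM Coding Plan subscription, and as downloadable MIT-licensed weights for self-hosting. It works inside Claude Code, Cline, Kilo Code, OpenClaw, Goose, and Roo.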

Does GLM-5.2 offer a free trial?

No, GLM-5.2 does not offer a free trial.

Is GLM-5.2 worth the price?

GLM-5.2 scores 9/10 for value. We consider it excellent value.
