The GLM Coding Plan Went Viral in North America: Then the Price Doubled

A $3-per-month Chinese AI coding subscription quietly appeared in developer circles in late 2025. There was no product launch event and no press release, just a link shared in Discord servers and Slack channels by developers who had stumbled onto something that seemed too cheap to be real. The GLM Coding Plan from Zhipu AI, a Beijing-based company spun out of Tsinghua University, offered API access to a competitive large language model that plugged directly into Cursor, Cline, Claude Code, and over 20 other AI coding tools. For North American developers paying $10 to $20 a month for GitHub Copilot or Claude Pro, the math was obvious.

What happened next is a compressed case study in Chinese AI's global expansion strategy: explosive demand, an infrastructure crisis, a pricing reset that erased the original value proposition, and a "passport tax" controversy that has since become a recurring flashpoint in Chinese AI developer communities.

What Is the GLM Coding Plan and Why Did Developers Sign Up

Zhipu AI, now operating its international products under the Z.ai brand, completed an IPO on the Hong Kong Stock Exchange on January 8, 2026. It is widely cited as the first publicly traded foundation model company in the world, with a market valuation of approximately $31.3 billion. The company's flagship product line, the GLM series, has been in commercial development since 2023, and the coding plan is its subscription offering specifically designed for software development workflows.

The design of the plan is straightforward in a way that most AI tool subscriptions are not. One API key, one monthly quota, shared across all supported tools. A developer using Cursor for one project and Claude Code for another can draw from the same subscription pool. At launch, that pool was priced at $3 per month for the Lite tier under a first-purchase promotional rate.
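The shared-pool idea can be illustrated with a minimal sketch. Everything here is hypothetical: the `SharedQuota` class, the limit of 1,000 requests, and the per-tool figures are invented for illustration, since the article does not state the real quota size or how Z.ai meters usage.

```python
from dataclasses import dataclass, field


@dataclass
class SharedQuota:
    """One monthly request pool drawn on by every connected tool (illustrative only)."""

    monthly_limit: int
    used_by_tool: dict = field(default_factory=dict)

    @property
    def remaining(self) -> int:
        # The pool is shared, so remaining headroom depends on usage across all tools.
        return self.monthly_limit - sum(self.used_by_tool.values())

    def spend(self, tool: str, requests: int) -> bool:
        """Record usage for a tool; refuse if the shared pool would be exceeded."""
        if requests < 0:
            raise ValueError("requests must be non-negative")
        if requests > self.remaining:
            return False
        self.used_by_tool[tool] = self.used_by_tool.get(tool, 0) + requests
        return True


# Hypothetical limit: the real quota size is not given in the article.
pool = SharedQuota(monthly_limit=1000)
pool.spend("Cursor", 400)
pool.spend("Claude Code", 500)
print(pool.remaining)            # 100
print(pool.spend("Cline", 200))  # False: only 100 requests are left in the pool
```

The point of the sketch is that no tool has its own allowance: heavy use in one tool reduces what is left for every other tool on the same key.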

The performance story supported the price. According to Z.ai, GLM-5.1, the model underlying the current plan, reaches 94.6% of Claude Opus 4.6's score on coding benchmarks. That figure is self-reported and has not been formally verified by a third party, though informal community testing broadly supported it. On routine coding tasks, including debugging, refactoring, documentation, and boilerplate generation, the model performed comparably to models that cost significantly more per request.

Roo Code, one of the major AI coding tools, announced its integration with Z.ai on LinkedIn with language about "low-cost coding models for everyone." For an independent developer or a small team without enterprise contracts, the subscription offered access to near-frontier coding capability at a fraction of the cost of any Western alternative.

Why It Matters: Chinese AI's Price-First Playbook

The pricing story is striking, but the technical story underneath it is more significant. GLM-5.1, released and open-sourced on April 7, 2026, sits at the top of SWE-Bench Pro, the most rigorous publicly available coding evaluation, with a score of 58.4 compared to Claude Opus 4.6's 57.3. It is the first open-weight model to reach the top three on that benchmark.

The model architecture is a 744 billion parameter Mixture-of-Experts design with 40 billion active parameters, a 200,000 token context window, and MIT license open-source weights available on HuggingFace. It was trained on approximately 100,000 Huawei Ascend 910B chips using the MindSpore framework. No Nvidia GPUs were used at any point in the training process.

That last fact matters beyond the technical. The GLM-5 family is among the first frontier-level models to reach performance parity with US labs' flagship models while operating entirely outside the Nvidia supply chain, and therefore outside the scope of US chip export controls. When a Chinese AI model built on non-Nvidia chips starts outperforming Claude Opus on coding benchmarks, the price story becomes a geopolitical story.

For North American developers, the immediate implication is practical: there is now a coding model available at competitive or lower cost that was built on infrastructure the US government cannot restrict. Whether that fact influences purchasing decisions depends on the developer, but it is no longer a theoretical scenario.

The broader trend this expansion represents is consistent with how Chinese AI products have entered international markets: aggressive entry pricing to build user base and workflow integration, followed by normalization once usage patterns are established. DeepSeek followed a similar path with its API pricing. Kimi has done the same. The pattern is recognizable, and Zhipu ran it faster than most.

The Rush, the Cap, and the Price Hike That Followed

The demand that followed the $3-per-month promotional pricing exceeded what Zhipu AI had planned for. At one point, new sign-ups were capped at 20% of normal capacity because the volume of GLM-4.7 inference requests was overwhelming the available infrastructure. This is notable: Zhipu had just gone public, had raised over $700 million, and was still caught short by the speed of international developer adoption.

On February 11, 2026, Z.ai announced the end of the promotional pricing via its official X account, citing growing demand and rising compute costs. The announcement confirmed a price adjustment for new subscribers: first-purchase discounts were removed, and the overseas pricing moved to approximately $10 per month for the Lite tier on a quarterly billing cycle.

Two months later, on April 11, 2026, a second price increase went into effect for international users. The overseas Lite tier moved to $18 per month. Pro went to $72. Max went to $160. The domestic Chinese pricing remained unchanged: Lite at approximately $7, Pro at $21, Max at approximately $68.

The "passport tax" controversy arose directly from this divergence. At the Max tier, an overseas subscriber pays $160 per month for the same quota allocation as a Chinese subscriber paying $68, a ratio of approximately 2.35 to one. The term "passport tax" is used specifically in Chinese AI communities to describe situations where the only variable determining price is the payment method or national identity of the buyer, with no corresponding difference in service.
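The same arithmetic can be run across every tier. The short sketch below uses only the prices quoted in this article; the domestic figures are approximate dollar conversions, so the ratios should be read as rough. The variable and function names are illustrative.

```python
# Monthly prices in USD as quoted in this article (domestic figures are approximate).
OVERSEAS = {"Lite": 18, "Pro": 72, "Max": 160}
DOMESTIC = {"Lite": 7, "Pro": 21, "Max": 68}


def regional_ratio(tier: str) -> float:
    """How many times the domestic price an overseas subscriber pays for a tier."""
    return OVERSEAS[tier] / DOMESTIC[tier]


for tier in OVERSEAS:
    print(f"{tier}: {regional_ratio(tier):.2f}x")
# Lite: 2.57x
# Pro: 3.43x
# Max: 2.35x

# Overseas Lite price over time: promotional rate, February reset, April increase.
LITE_HISTORY = [3, 10, 18]
print(f"Lite since launch: {LITE_HISTORY[-1] / LITE_HISTORY[0]:.1f}x")  # 6.0x
```

On these figures the Max tier actually carries the smallest regional markup and Pro the largest, and the overseas Lite price stands at six times its promotional launch rate.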

The community response generated its own engineering solutions. A GitHub repository called copilot-proxy, which wraps the Z.ai API in a format that mimics GitHub Copilot's API interface, appeared shortly after the pricing changes, enabling developers to use GLM models with tools that don't natively support custom API endpoints. Separately, developers who could access Chinese payment systems discovered that purchasing the domestic plan and using it internationally worked without technical restriction, effectively enabling international users to pay domestic prices if they had the payment infrastructure.

A secondary controversy emerged within developer communities: some users reported that the GLM-5.1 upgrade had reduced per-session quota limits compared to previous plans, meaning subscribers were getting a more capable model with less headroom to use it. The tradeoff drew criticism as a quality-of-life regression even for users who approved of the underlying model improvements.

OpenAI, for comparison, maintains global unified pricing on its API endpoints. The contrast with Zhipu's regional dual-track model is explicit and frequently cited in developer discussions comparing the two ecosystems. Anthropic similarly does not maintain separate pricing tiers for different geographies on its Claude API. Among major AI providers, Zhipu's dual-pricing approach is unusual enough that it has become the reference example when developers discuss how Chinese AI companies approach international market entry differently from their US counterparts. The SCMP reported Zhipu's price increases as a deliberate strategy to close the gap with US rivals, framing it as competitive normalization rather than pure cost recovery.

How It Fits Into the AI Coding Tool Landscape

The AI coding tool market in 2026 has split into two distinct layers: tool interfaces (Cursor, Cline, Kilo Code, Roo Code, Claude Code) and model backends (OpenAI, Anthropic, Z.ai, DeepSeek). The coding plan competes primarily in the backend layer, where the question is cost per token rather than user experience.

GitHub Copilot remains the dominant choice for enterprise developers, primarily because of its deep integration with VS Code and GitHub, tools that most enterprise engineering teams already use. Its model backend is not fixed to any single provider; GitHub Copilot uses GPT-4o, Claude, and Gemini interchangeably depending on the task. The switching cost away from Copilot is workflow friction, not model quality.

Claude Pro at $20 per month offers broader capabilities beyond coding, but its coding performance ceiling is higher than most individual developers will regularly reach in daily use. For developers who are already paying for Claude Pro, the marginal value of the GLM plan is reduced. For those not on Claude, the GLM plan remains a lower-cost entry point to comparable coding capability.

DeepSeek's API operates on a per-token model rather than a monthly subscription, which makes direct comparison difficult. At high usage volumes, DeepSeek's API pricing is competitive; at moderate volumes, a flat subscription rate can be more predictable.
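One way to make the comparison concrete is a break-even calculation: the monthly token volume at which per-token billing costs the same as a flat fee. The sketch below is an assumption-laden illustration. The rate of $1.00 per million tokens is a placeholder, not DeepSeek's actual price, and a single blended rate ignores the usual split between input and output tokens.

```python
def break_even_tokens(monthly_fee_usd: float, usd_per_million_tokens: float) -> float:
    """Tokens per month at which per-token billing equals the flat monthly fee."""
    if usd_per_million_tokens <= 0:
        raise ValueError("rate must be positive")
    return monthly_fee_usd / usd_per_million_tokens * 1_000_000


# Placeholder rate for illustration only; not a real published price.
HYPOTHETICAL_RATE = 1.00  # USD per million tokens

tokens = break_even_tokens(monthly_fee_usd=18, usd_per_million_tokens=HYPOTHETICAL_RATE)
print(f"{tokens:,.0f} tokens/month")  # 18,000,000 tokens/month
```

Below the break-even volume, per-token billing is cheaper; above it, the flat fee wins, provided the subscription's quota actually covers that much usage.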

The practical advantage of the Z.ai subscription is clearest for independent developers doing significant coding volume across multiple tools. The unified quota model means a single subscription covers Cursor sessions in the morning, Cline agentic tasks in the afternoon, and Claude Code reviews in the evening. That flexibility is not available from most competing subscriptions, which either bind to a specific tool or charge per model.

What's Next for GLM and Chinese AI in Developer Tools

The question for Zhipu AI is whether the current overseas pricing settles or continues moving upward. The stated rationale, compute costs and service quality, applies with equal force to both domestic and international users, but only international users have received price increases. If the gap narrows in the direction of the domestic price, it signals Zhipu is prioritizing international user retention. If it continues widening, it signals that the international market is being treated as premium extraction rather than strategic expansion.

GLM-5.1's open-source release introduces a third path: developers who want the model without the subscription can download the weights and run it locally or on their own cloud infrastructure. At 744 billion parameters with 40 billion active, self-hosting is not trivial, but it is within reach for engineering teams with existing GPU infrastructure.
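A back-of-envelope estimate shows why self-hosting is not trivial. The sketch below computes memory for the weights alone from the parameter counts quoted in this article. The precisions listed are common serving formats and are assumptions; the article does not say which format the released weights use. KV cache, activations, and runtime overhead come on top.

```python
TOTAL_PARAMS = 744e9   # total parameters, as quoted in the article
ACTIVE_PARAMS = 40e9   # parameters active per token, as quoted in the article

# Bytes per parameter for common serving precisions (assumed, not from the article).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, precision):,.0f} GB")
# bf16: 1,488 GB
# int8: 744 GB
# int4: 372 GB

print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")  # 5.4%
```

In a Mixture-of-Experts model, the active parameters drive per-token compute, but all experts typically need to stay resident in memory, so the total count is what sizes the hardware. Even at 4-bit precision, that points to a multi-GPU server rather than a single card.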

The longer-term dynamic is whether Chinese AI coding tools can maintain a price advantage as they mature. DeepSeek's API pricing has already shown that near-frontier coding capability can be commoditized quickly. If GLM-5.1 is outperformed by the next open-source release in six months, the subscription model's value proposition needs to be something other than model quality.

For North American developers currently evaluating AI coding subscriptions: the Lite tier at $18 per month (overseas) remains slightly below the cost of GitHub Copilot Pro ($19) and Claude Pro ($20), with comparable coding performance on benchmark tasks. The open-source option is available for teams that prefer infrastructure control over subscription convenience.

For solo developers and small teams without enterprise procurement requirements, the calculus is straightforward. The GLM Coding Plan at its current pricing still undercuts the Western alternatives, if only narrowly, and the unified quota model across 20+ tools reduces the per-tool overhead of managing multiple API keys and billing relationships. Whether that advantage persists through the next round of pricing adjustments is the open question.

The harder question this story raises isn't which AI coding tool to use. It's how your team manages knowledge across all of them. When developers switch between Cursor, Cline, and Claude Code across different projects, the context that makes those tools useful doesn't automatically transfer. If your engineering team is building workflows around multiple AI coding tools, remio's knowledge base for engineering teams can help consolidate the documentation, decisions, and context that make AI coding assistance actually useful. The teams that extract the most value from AI coding tools aren't the ones picking the cheapest model; they're the ones maintaining the institutional knowledge that makes any model more effective.
