
In a research lab in Beijing, a team of engineers at Zhipu AI did something the Western AI establishment considered impractical: they trained a 744-billion parameter frontier model entirely on Huawei Ascend 910C chips — without a single NVIDIA GPU in the training cluster. The resulting model, GLM-5, proceeded to score #1 on the Humanity's Last Exam (HLE) benchmark with 50.4%, surpassing Claude Opus 4.5, GPT-5.2, and Gemini 2.5 Pro.

GLM-5 is not just a technical achievement. It is a geopolitical statement that rewrites assumptions about the AI hardware supply chain.

What Is GLM-5?

GLM-5 is Zhipu AI's fifth-generation general language model, released on February 11, 2026, under an MIT open-weight license. The model represents the largest and most capable AI system ever trained exclusively on non-NVIDIA hardware, demonstrating that competitive frontier AI development is achievable outside the NVIDIA-dominated ecosystem.

Key specifications:

  • 744 billion parameters in a Mixture-of-Experts (MoE) architecture
  • 256 active experts per forward pass
  • Trained on 28.5 trillion tokens of multilingual data
  • 128K context window with efficient attention
  • Available via API at $0.11 per million input tokens

Zhipu AI, founded in 2019 as a spin-off from Tsinghua University, has raised over $700 million in funding and was valued at approximately $3 billion prior to the GLM-5 launch. The company announced plans for a potential IPO in Hong Kong in the second half of 2026.

Trained on Huawei Ascend: The Geopolitical Significance

The most consequential aspect of GLM-5 is not its performance metrics but its training infrastructure. Since October 2022, the US government has imposed escalating export controls on advanced AI chips, restricting China's access to NVIDIA's A100, H100, and H200 GPUs. These restrictions were intended to slow China's frontier AI development by 3-5 years.

That GLM-5 was trained entirely on Huawei's Ascend 910C accelerators suggests that timeline assumption may have been wrong.

Ascend 910C specifications:

  • Custom Da Vinci architecture optimized for transformer workloads
  • Competitive memory bandwidth for large-scale MoE training
  • Manufactured by SMIC using advanced process nodes
  • Deployed in clusters of 10,000+ chips for GLM-5 training

Zhipu AI reported that training GLM-5 required approximately 15% more compute time than an equivalent NVIDIA-based run at similar scale, but that the differential was offset by the lower price of Ascend chips and government subsidies for domestic AI infrastructure.

The implications extend beyond Zhipu AI. If Chinese labs can train frontier models on domestic hardware at near-parity efficiency, the strategic rationale for chip export controls weakens significantly. GLM-5 demonstrates that export restrictions may have accelerated rather than prevented China's development of an independent AI hardware ecosystem.

Benchmark Dominance

GLM-5 achieved state-of-the-art or near-state-of-the-art results across multiple benchmark categories at the time of its release.

| Benchmark | GLM-5 | Claude Opus 4.5 | GPT-5.2 | Gemini 2.5 Pro |
|---|---|---|---|---|
| HLE (Humanity's Last Exam) | 50.4% | 26.4% | 32.2% | 22.2% |
| BrowseComp | 75.9% | 56.5% | 58.0% | 50.0% |
| MMLU-Pro | 82.1% | 82.8% | 85.6% | 79.1% |
| AIME 2025 | 88.3% | 75.3% | 100% | 92.0% |
| SWE-bench Verified | 72.1% | 80.9% | 65.4% | 63.8% |
| LiveCodeBench | 68.7% | 70.2% | 71.0% | 66.3% |

GLM-5's HLE score of 50.4% is particularly notable. Humanity's Last Exam was designed by experts across dozens of academic disciplines to resist AI saturation — its questions require genuine expert-level reasoning across mathematics, science, law, and philosophy. GLM-5 is the first model to exceed 50% on this benchmark, a threshold many researchers did not expect to be crossed until late 2026 or 2027.

The BrowseComp score of 75.9% indicates strong performance on web browsing and information retrieval tasks, a practical capability increasingly important for agent-based applications where models must navigate complex information environments.

The Slime RL Technique: Record-Low Hallucination

One of GLM-5's most significant contributions is its implementation of a novel reinforcement learning technique that Zhipu AI calls Slime RL (Self-Learning through Iterative Model Enhancement).

Traditional RLHF (Reinforcement Learning from Human Feedback) optimizes for human preference, which can inadvertently encourage models to generate confident-sounding but incorrect outputs — a phenomenon known as "sycophantic hallucination." Slime RL addresses this by incorporating a factual grounding reward signal alongside the preference reward.

The technique works in three stages:

  1. Self-interrogation: The model generates multiple candidate responses to factual queries and cross-references them against retrieved source material
  2. Consistency scoring: Responses are scored not just on human preference but on internal consistency and source alignment
  3. Iterative refinement: The reward model is updated to penalize confident claims that lack source support, creating a training pressure toward calibrated uncertainty
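The reward shaping described above can be sketched in a few lines. This is a toy illustration only: the function names, the word-overlap grounding proxy, and the weights are our assumptions, not Zhipu AI's published implementation.

```python
# Toy sketch of Slime-RL-style reward shaping: blend a preference
# reward with a grounding term, and penalize confident claims that
# lack source support. All names and weights are illustrative.

def support_score(claim: str, sources: list[str]) -> float:
    """Fraction of the claim's words found in any retrieved source
    (a crude stand-in for real source-alignment scoring)."""
    claim_words = set(claim.lower().split())
    if not claim_words:
        return 0.0
    source_words = set()
    for s in sources:
        source_words |= set(s.lower().split())
    return len(claim_words & source_words) / len(claim_words)

def slime_reward(preference: float, claim: str, sources: list[str],
                 confidence: float, grounding_weight: float = 0.5) -> float:
    """Combine human-preference reward with a grounding reward;
    the penalty grows when the model is confident but ungrounded."""
    grounding = support_score(claim, sources)
    penalty = confidence * (1.0 - grounding)
    return (1 - grounding_weight) * preference + grounding_weight * (grounding - penalty)
```

Under this shaping, two responses with identical preference scores diverge in reward as soon as one of them asserts unsupported facts with high confidence — the training pressure toward calibrated uncertainty that the article describes.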

The result is measurable: GLM-5 achieved a hallucination rate of 1.2% on the HaluEval benchmark, compared to 3.8% for Claude Opus 4.5 and 4.1% for GPT-5.2. In practical terms, GLM-5 is approximately 3x less likely to hallucinate than the leading proprietary models — a critical advantage for enterprise applications where factual accuracy is non-negotiable.

Open-Weight MIT License

GLM-5 is released under the MIT license, one of the most permissive open-source licenses available. This means enterprises can:

  • Fine-tune the model on proprietary data without licensing fees
  • Deploy the model on-premises or in private cloud environments
  • Modify the architecture and redistribute derivative works
  • Use commercially without revenue-sharing or usage-based licensing

This licensing strategy contrasts with Meta's Llama models (custom license with commercial restrictions above certain user thresholds) and with proprietary models that can only be accessed via API.

For enterprises with data sovereignty requirements — healthcare, financial services, government, defense — the combination of open weights and competitive performance eliminates the trade-off between capability and control that has historically forced organizations toward proprietary API providers. Our analysis of the broader open-source AI landscape provides additional context on how open models are closing the gap with proprietary alternatives.

Pricing That Rewrites the Economics

GLM-5's API pricing reflects Zhipu AI's aggressive market positioning strategy.

| Model | Input (per M tokens) | Output (per M tokens) | Ratio vs GLM-5 |
|---|---|---|---|
| GLM-5 | $0.11 | $0.44 | 1x |
| Claude Opus 4.5 | $15.00 | $75.00 | 136x / 170x |
| GPT-5.2 | $10.00 | $30.00 | 91x / 68x |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 27x / 34x |
| DeepSeek-V3 | $0.07 | $0.28 | 0.6x / 0.6x |

At $0.11 per million input tokens, GLM-5 is priced at approximately 1/136th the cost of Claude Opus 4.5 — while outperforming it on HLE and BrowseComp benchmarks. Even compared to mid-tier models like Claude Sonnet 4.5, GLM-5 is 27x cheaper on input. For a deeper analysis of how these pricing dynamics are reshaping enterprise AI budgets, see our AI API pricing trends report.

For high-volume enterprise applications — document processing, customer service, code generation — the cost differential can represent hundreds of thousands of dollars annually. A company processing 10 million input tokens per day would spend approximately $33/month with GLM-5 versus $4,500/month with Claude Opus 4.5.
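As a sanity check on that arithmetic, a minimal sketch (input prices taken from the table above; the helper name is ours):

```python
# Back-of-envelope monthly API cost at list prices, input tokens only.
PRICES = {  # USD per million input tokens, from the pricing table above
    "glm-5": 0.11,
    "claude-opus-4.5": 15.00,
}

def monthly_input_cost(tokens_per_day: float, model: str, days: int = 30) -> float:
    """Monthly spend on input tokens at list price."""
    return tokens_per_day / 1e6 * PRICES[model] * days

print(round(monthly_input_cost(10_000_000, "glm-5"), 2))            # 33.0
print(round(monthly_input_cost(10_000_000, "claude-opus-4.5"), 2))  # 4500.0
```

Output tokens (at $0.44 vs. $75.00 per million) widen the gap further for generation-heavy workloads.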

The pricing is subsidized by Zhipu AI's strategic priority of market share acquisition ahead of its planned IPO, and may not be sustainable long-term at current levels. However, it establishes a new reference point for what frontier AI capabilities should cost. Kimi K2, another open model released in the same period, follows a similar aggressive pricing strategy at $0.15/M tokens.

Zhipu AI IPO and Strategic Outlook

Zhipu AI's announcement of a potential Hong Kong IPO in H2 2026 signals the company's confidence in its competitive position. Key factors driving the IPO timing include:

  • Revenue growth: API consumption grew 400% year-over-year following the GLM-5 launch
  • Enterprise adoption: Over 10,000 enterprise customers across China and Southeast Asia
  • Government contracts: Designated preferred AI provider for multiple provincial government digitization programs
  • Hardware independence: Demonstrated ability to train frontier models without access to Western chips

The IPO would make Zhipu AI the first Chinese AI foundation model company to go public, potentially establishing a $10-15 billion valuation based on comparable AI company multiples.

For the global AI industry, Zhipu AI's trajectory illustrates a broader trend: the center of gravity in open-source AI is shifting. Between GLM-5, DeepSeek, and Alibaba's Qwen series, Chinese labs are producing open models that match or exceed proprietary Western alternatives at a fraction of the cost.

What GLM-5 Means for Enterprise AI Strategy

GLM-5's combination of frontier performance, open weights, domestic hardware training, and aggressive pricing has practical implications for enterprise AI strategy:

Multi-model architectures become essential: With frontier-quality models available at dramatically different price points, the optimal strategy is routing tasks to the most cost-effective model capable of handling them. Simple tasks go to GLM-5 at $0.11/M tokens; complex reasoning goes to Claude Opus 4.6 or GPT-5.3. Our February 2026 AI landscape roundup covers the full competitive picture.
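A cost-aware router of the kind described above can be as simple as a threshold table. This is a sketch under stated assumptions: the model names, the capability bands, and the scalar complexity score are illustrative, not any vendor's product behavior.

```python
# Minimal cost-aware model routing: pick the cheapest model whose
# capability band covers the task. Bands and names are illustrative.
ROUTES = [
    # (max_complexity, model)
    (0.3, "glm-5"),              # cheap bulk processing
    (0.7, "claude-sonnet-4.5"),  # mid-tier reasoning
    (1.0, "claude-opus-4.6"),    # hardest tasks
]

def route(complexity: float) -> str:
    """Return the first (cheapest) model whose band covers the task."""
    for threshold, model in ROUTES:
        if complexity <= threshold:
            return model
    return ROUTES[-1][1]  # fall back to the most capable model
```

In production the complexity score would come from a classifier or heuristics (prompt length, task type, required accuracy), but the routing logic itself stays this simple.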

Self-hosting becomes viable for frontier models: Open weights mean organizations can deploy GLM-5 on their own infrastructure, eliminating per-token API costs entirely for high-volume use cases. The calculus shifts from "can we afford the API" to "can we afford the GPUs." For a framework on evaluating this decision, see our cloud vs. on-prem AI TCO analysis.
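The "API vs. GPUs" calculus reduces to a break-even volume. In this sketch the $8,000/month hosting figure is a made-up placeholder — substitute your own hardware or cloud quote:

```python
# Illustrative break-even: token volume at which self-hosting a model
# matches API spend. The hosting cost below is an assumed placeholder.
API_PRICE_PER_M = 0.11      # GLM-5 list price, USD per million input tokens
HOSTING_PER_MONTH = 8000.0  # assumed all-in cost of a self-hosted node

def breakeven_tokens_per_month(hosting: float = HOSTING_PER_MONTH,
                               api_price: float = API_PRICE_PER_M) -> float:
    """Monthly token volume where self-hosting equals API cost."""
    return hosting / api_price * 1e6

print(f"{breakeven_tokens_per_month():.2e}")  # ≈ 7.3e10 tokens/month
```

Notably, at GLM-5's pricing the break-even sits in the tens of billions of tokens per month — so for many workloads the driver for self-hosting is data sovereignty rather than cost.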

Supply chain diversification matters: GLM-5 proves that competitive AI is achievable on non-NVIDIA hardware. For organizations concerned about hardware supply chain concentration, this opens new options for AI infrastructure planning.

Swfte helps enterprises build multi-model AI architectures that route requests to the optimal model — whether that's GLM-5 for cost-efficient processing, Claude for complex reasoning, or self-hosted models for data-sensitive workloads. Route between models with Swfte Connect, build workflows in Swfte Studio, or see our pricing.

