
Meituan's LongCat AI Just Trained a Trillion-Parameter Model on Chinese Chips. NVIDIA Was Not Invited.

Meituan, China's largest food delivery platform, opened testing on April 24, 2026 for LongCat-2.0-Preview, a model with over a trillion parameters trained entirely on domestic Chinese chips. The training run used between 50,000 and 60,000 domestic compute cards, making it the largest model training run completed on non-NVIDIA hardware in China to date.

The model is built for agent workflows: tool calling, multi-step reasoning, code generation, enterprise automation. It supports a 1 million token context window with up to 64,000 tokens of output. It's currently in invited beta, with developers receiving a quota of 10 million tokens every two hours.
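To make those numbers concrete, here is a back-of-the-envelope sketch of what the beta quota allows. The figures come from the article; how input and output tokens are counted against the quota is an assumption for illustration, not a documented API rule.

```python
# Back-of-the-envelope budget for the LongCat-2.0 beta quota.
# Figures from the article; how input and output tokens count
# against the quota is an assumption made for illustration.

CONTEXT_WINDOW = 1_000_000   # max input tokens per request
MAX_OUTPUT = 64_000          # max output tokens per request
QUOTA = 10_000_000           # tokens granted every two hours

# Worst case: every request uses the full context plus the full output.
tokens_per_max_request = CONTEXT_WINDOW + MAX_OUTPUT   # 1,064,000
max_requests_per_window = QUOTA // tokens_per_max_request

print(max_requests_per_window)   # 9 full-context requests per two hours
```

In other words, the quota is generous for ordinary agent calls but supports only a handful of full-million-token requests per window.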

What makes LongCat AI's timing striking is that Meituan wasn't alone. On the same day, DeepSeek released V4, a 1.6 trillion-parameter model also trained on domestic Huawei chips. Two trillion-parameter models, two companies, same day, zero NVIDIA. That's not a coincidence. It's a pattern.

What Meituan Built and What It's Actually For

LongCat-2.0-Preview is the latest generation of Meituan's LongCat model family, which began with LongCat-Flash, a 560-billion-parameter open-source MoE model released in August 2025. The 2.0 generation roughly doubles the parameter scale to over one trillion, adds the 1M-token context window, and shifts away from open-source to invited-access.

The model's deepest optimizations are in agent scenarios: multi-step task planning, tool invocation, and production-grade code generation. Meituan describes the model as suited for "complex task planning and enterprise automation" rather than general-purpose chat. The difference matters: a model optimized for agents operates differently from a model optimized for Q&A benchmarks.

Meituan CEO Wang Xing has outlined a three-layer AI strategy: AI at Work (improving internal operations), AI in Products (improving customer-facing features), and Building LLMs (the foundation layer). LongCat-2.0 is the third layer. Without a foundation model they control, the first two layers depend on someone else's infrastructure, which in Meituan's competitive landscape is not an acceptable position.

The Meituan LongCat model family is already in active deployment. Xiaomei, Meituan's AI ordering agent, runs on LongCat to handle voice-based meal ordering and restaurant booking. The company's food safety monitoring system, called Star Eye, uses AI to scan kitchen footage from vendor restaurants around the clock. These are not research prototypes. They're production workloads that run on LongCat every day at scale.

Why a Food Delivery Company Is Training Trillion-Parameter Models

The obvious question: why is a food delivery platform building one of the world's largest language models?

The answer isn't that Meituan is trying to become an AI lab. It's that Meituan is trying not to become irrelevant.

Every interaction on Meituan's platform, whether a user ordering lunch, a merchant managing a menu, or a delivery rider navigating a city block, is a decision loop that AI can optimize or disrupt. Meituan currently handles those loops with its own systems. But if an AI agent from a competitor could understand a user's food preferences, negotiate with restaurants, and dispatch orders more accurately than Meituan's platform, Meituan would lose its role as the intermediary.

When AI agents become the primary interface between consumers and services, every platform that lacks its own model becomes dependent on someone else's.

Meituan's scale makes this concrete: the company processes tens of millions of delivery orders daily, manages hundreds of thousands of restaurants, and routes delivery riders across dozens of Chinese cities in real time. The AI decisions embedded in that operation are proprietary. If LongCat can make those decisions faster and more accurately than a general-purpose model it licenses from a third party, the investment pays for itself.

The Meituan AI bet is also defensive against the competitive dynamics of China's super-app market. Alibaba, Tencent, and ByteDance each have their own foundation models. For Meituan not to have one is to be structurally disadvantaged in a market where AI-driven recommendation, pricing, and logistics are becoming primary competitive variables.

Meituan's previous open-source release, LongCat-Flash, was described by VentureBeat as rivaling GPT-5 on certain tasks when it launched in 2025. LongCat-2.0 nearly doubles the parameter count and moves to invited-access, suggesting Meituan is shifting from community-building to commercial deployment.

The Real Story Is the Chips, Not the Model

LongCat-2.0's parameter count matters. But the more consequential fact about this model is what it was trained on.

Since 2022, the United States has progressively restricted AI chip exports to China: first A100s, then H100s and H800s, then the H20. The policy logic was consistent: advanced AI training requires advanced GPU hardware; restrict the hardware and you restrict the capability. The implicit assumption was that China could not develop frontier-level AI without access to NVIDIA's most capable chips.

LongCat-2.0 used 50,000 to 60,000 domestic compute cards to train a trillion-parameter model. On April 24, DeepSeek independently did the same at 1.6 trillion parameters, also on Huawei Ascend chips. These are not the same event: two separate companies, two separate training runs, two separate architectures. That's independent verification that the gap between domestic and NVIDIA-grade training hardware has narrowed enough to train frontier-class models.

The hardware context matters here. Huawei's Ascend 950PR delivers 1.56 petaflops per card at FP4 precision, roughly 2.8 times the FP4 performance of NVIDIA's H20, which is the chip US policy currently allows into China. ByteDance reportedly committed $5.6 billion to Ascend chip orders. Alibaba Cloud and Tencent have placed significant orders as well. Industry estimates suggest total Chinese demand for Huawei Ascend chips could reach $12 to $15 billion in 2026.

The caveat is real: aggregate flops and efficient flops are different things. Training efficiency, measured by metrics like MFU (Model FLOP Utilization), for domestic Chinese hardware clusters still lags behind NVIDIA H100 configurations. Neither Meituan nor DeepSeek has published detailed efficiency metrics, which makes direct comparison difficult. Larger chip counts can partially compensate for lower per-chip efficiency, but not infinitely.
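Neither company has published efficiency figures, but MFU itself is simple to state: achieved training FLOPs divided by the cluster's theoretical peak. A hedged sketch using the standard ~6·N·D approximation for training FLOPs; every input below is hypothetical, not a published LongCat or DeepSeek figure.

```python
def mfu(active_params, tokens_per_second, num_chips, peak_flops_per_chip):
    """Model FLOP Utilization: achieved training FLOPs / cluster peak FLOPs.

    Uses the standard ~6 * N * D approximation for training FLOPs
    (forward + backward pass) with N active parameters per token.
    """
    achieved = 6 * active_params * tokens_per_second
    peak = num_chips * peak_flops_per_chip
    return achieved / peak

# Hypothetical illustration only -- none of these are published figures.
example = mfu(
    active_params=30e9,          # ~30B active params per token (MoE, assumed)
    tokens_per_second=40e6,      # cluster-wide training throughput (assumed)
    num_chips=55_000,            # midpoint of the reported 50-60k cards
    peak_flops_per_chip=3e14,    # 300 TFLOPs usable per card (assumed)
)
print(f"{example:.0%}")   # ~44% MFU under these assumed numbers
```

The point of the sketch is the denominator: with tens of thousands of chips, even a modest per-chip efficiency gap compounds into a large absolute cost, which is why MFU comparisons matter more than headline chip counts.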

LongCat-2.0 is also in invited beta with no public benchmark results yet. DeepSeek V4 published a technical report; LongCat-2.0 has not. Without independent performance evaluation on standard benchmarks, the trillion-parameter headline is a hardware claim, not a capability claim.

That said, the historical trajectory of China's approach to technology gaps is worth keeping in mind. China's solar panel industry began significantly behind Western manufacturers; by 2020, Chinese manufacturers controlled roughly 75% of global solar production capacity. The path was not matching Western technology; it was scaling past the constraint. Whether AI chips follow a similar arc is one of the more consequential open questions in tech policy.

How Meituan Got Here: The LongCat Model Timeline

LongCat-2.0 didn't appear from nowhere. Meituan built toward it through a rapid sequence of public releases over the past eight months, each one more capable and more publicly tested than the last.

The series began on September 1, 2025, with LongCat-Flash-Chat, an open-source 560-billion-parameter Mixture-of-Experts model released simultaneously on GitHub, Hugging Face, and Meituan's own model platform. The model activated roughly 27 billion parameters per token on average, keeping inference costs low despite the large total parameter count. On agent benchmarks, LongCat-Flash-Chat ranked first in IFEval (instruction-following evaluation at 89.65) and first in VitaBench, the multi-tool agent benchmark. LMSYS published a deployment guide for running it with SGLang the same day, signaling the model was production-ready at launch.

Three weeks later, on September 23, 2025, Meituan released LongCat-Flash-Thinking, a reasoning-focused variant on the same 560-billion MoE backbone. Where Flash-Chat was optimized for agent task completion, Flash-Thinking was built for extended chain-of-thought reasoning on harder problems. VentureBeat's coverage at the time described it as rivaling GPT-5 on certain reasoning tasks, which helped establish Meituan's credibility as a serious LLM developer rather than a company dabbling in AI as a side project.

The January 2026 release added more depth. LongCat-Flash-Thinking-2601 (technical report: arxiv 2601.16725, published January 23, 2026) became the first fully open-source model with a "Re-thinking Mode," activating eight parallel reasoning paths simultaneously before committing to an answer. On agentic benchmarks, it achieved 73.1% on BrowseComp, 77.7% on RWSearch, and 88.2% on tau-squared-Bench, reaching state-of-the-art among open-source models on all three. Meituan released a companion variant, LongCat-Flash-Thinking-ZigZag, at the same time, introducing LongCat ZigZag Attention (LoZA), a sparse attention mechanism designed to speed up inference on long-context inputs without degrading accuracy.
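Meituan hasn't published how Re-thinking Mode aggregates its eight paths, but the general pattern it describes, sampling several independent reasoning traces and committing to the answer they agree on, resembles self-consistency decoding, which can be sketched as follows. The `generate` callable and the stub model are placeholders, not a real LongCat API.

```python
from collections import Counter
import random

def rethink(generate, prompt, num_paths=8):
    """Run several independent reasoning paths and majority-vote the answer.

    `generate` stands in for a model call returning a final answer string.
    This mirrors self-consistency decoding; whether LongCat's Re-thinking
    Mode aggregates its eight paths this way has not been published.
    """
    answers = [generate(prompt) for _ in range(num_paths)]
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / num_paths

# Stub model: most sampled paths land on the same answer.
random.seed(1)
def stub_model(prompt):
    return random.choice(["42", "42", "42", "41", "42", "43"])

answer, agreement = rethink(stub_model, "What is 6 * 7?")
print(answer, agreement)
```

The agreement score is the useful byproduct: low agreement across paths is a cheap signal that the problem deserves more compute or a human check.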

From this foundation, LongCat-2.0-Preview represents a significant architectural step: the total parameter count jumps from 560 billion to over one trillion, the context window extends to one million tokens, and the model is no longer open-source. The closed-access shift suggests Meituan views 2.0 as a commercial product rather than a community contribution.

LongCat-2.0 vs. DeepSeek V4 and What This Means for Chinese AI

The comparison between LongCat-2.0 and DeepSeek V4 reveals something about how China's AI ecosystem is stratifying.

DeepSeek V4 is a lab product built for global reach: 1.6 trillion total parameters, fully open-source, with a published technical report and benchmark results being evaluated by developers worldwide. DeepSeek's strategy is maximum transparency, letting the model's performance speak to everyone.

LongCat-2.0 is a platform product built for domestic deployment: invited-access, agent-optimized, benchmark results not yet public. Meituan isn't competing for the title of "China's strongest LLM." It's building the model it needs to power its own business at scale, and potentially a model it can license to other Chinese platforms facing similar infrastructure decisions.

That differentiation tells a cleaner story about where China's AI ecosystem is heading than any single benchmark. DeepSeek is the open-source frontier challenger. Meituan LongCat is the platform operator's foundation. Baidu's ERNIE is the search company's enterprise backbone. These aren't competing for the same prize.

What they share is the chip story. Each of these models has a training infrastructure that doesn't depend on NVIDIA. As recently as eighteen months ago, that was broadly considered impossible for frontier-class training. In one week in April 2026, two independent trillion-parameter training runs on domestic chips made that assumption look outdated.

As AI tools like these become part of everyday enterprise workflows, professionals across industries are building systems to capture, connect, and retrieve knowledge from multiple AI sources. The growing split between Western and Chinese AI ecosystems means that relevant information increasingly appears across different model families, languages, and platforms, making cross-source synthesis more important than ever.

What's Next for LongCat and the AI Export Control Debate

In the near term, the most important signal will be whether LongCat-2.0 opens to broader testing and publishes benchmark comparisons. Meituan's track record with LongCat-Flash suggests real capability behind the invite-only curtain. But "invite-only" means that verification is on Meituan's terms for now.

On the business side, the Xiaomei agent update, if it incorporates LongCat-2.0's enhanced reasoning, will be the first real-world performance test. Whether voice-based ordering and recommendation improves noticeably on Meituan's platform is a more practical measure of the model's value than any academic benchmark.

For US policymakers, the week of April 24 presents a harder question: if the goal of chip export controls was to prevent China from training frontier-class AI models, and two independent companies did that in the same week using domestic hardware, what does the next phase of the policy look like? The remaining options sit at the software layer, from CUDA and its alternatives to compiler toolchains to the model weights themselves, and each carries different tradeoffs.

For developers and companies tracking AI infrastructure, the LongCat-2.0 release alongside DeepSeek V4's Huawei chip training marks something worth noting: the period when frontier AI model training required access to one specific vendor's hardware may be ending. What replaces it is a more fragmented but also more competitive global AI infrastructure, and the implications of that fragmentation are still unfolding.

The fragmentation matters beyond geopolitics. When AI infrastructure splits across two distinct hardware ecosystems, organizations that depend on AI for critical decisions face a new planning question: which ecosystem will the best models run on in two years? Meituan's approach, building a proprietary foundation model on domestic hardware, is one answer. The bet is that vertical integration on the infrastructure side provides more long-term stability than relying on any single external provider.

As AI tools multiply across ecosystems, professionals who work with information-intensive workflows increasingly need ways to connect knowledge across platforms. Tools that support AI knowledge management across different sources become more useful precisely when the AI landscape itself is diversifying.

FAQ: Common Questions About LongCat AI and Meituan's Model

What is LongCat AI?

LongCat AI is Meituan's large language model series. The current generation, LongCat-2.0-Preview, is a trillion-parameter-plus model optimized for agent workflows, released for invited testing in April 2026.

Is LongCat-2.0 open source?

No. Unlike the earlier LongCat-Flash, which was released as an open-source model, LongCat-2.0-Preview is in invited beta with controlled access. Developers can apply for testing via the LongCat API platform.

How does LongCat compare to DeepSeek V4?

Both were released the same day (April 24) and both trained on domestic Chinese hardware. DeepSeek V4 is larger (1.6T vs. 1T+ parameters) and fully open-source with published benchmarks. LongCat-2.0 is focused on platform and agent use cases, with benchmarks not yet public.

Why did Meituan train such a large model?

Meituan's core business, food delivery and local services, is increasingly driven by AI decisions around logistics, recommendations, and merchant management. Building a proprietary model gives Meituan control over its AI infrastructure in a market where competitors like Alibaba, Tencent, and ByteDance all have their own foundation models.

What domestic chips did Meituan use?

Meituan confirmed the use of 50,000-60,000 domestic compute cards (per Phemex reporting) but has not publicly specified the chip manufacturer or model. Industry analysts point to Huawei Ascend or Cambricon as the most likely hardware, based on what's available at training scale in China following US export controls. The fact that Meituan completed the run without disclosing chip specifics suggests the company views its hardware stack as a competitive asset, not just a logistical detail worth sharing publicly.
