Analysis of Domestic GPU Four Dragons' Breakthrough of CUDA Ecosystem Barriers and Commercialization Prospects

This report analyzes the prospects for China’s four leading domestic GPU players (the ‘Four Dragons’) to break through CUDA ecosystem barriers and achieve commercialization. The question spans technology, ecosystem, capital, and geopolitics.

I. Differentiated Positioning and Latest Progress of the Four Leading Domestic GPU Players

1.1 Moore Threads: Ecosystem Compatibility Strategy for Full-Function Route

Core Positioning:

Emulate NVIDIA’s full-function GPU route, covering both consumer and cloud markets

Technical Route:

Self-developed MUSA full-function architecture, with the MUSIFY toolchain providing compatibility with NVIDIA’s CUDA ecosystem
7nm process, FP32 computing power up to 32 TFLOPS (about 47.8% of H100)
Covering full ‘cloud-edge-end’ scenarios, including AI computing and graphics rendering
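
The 47.8% figure above can be reproduced with a quick ratio check. The sketch below assumes the baseline is the H100 SXM datasheet FP32 rate of 67 TFLOPS. The source does not state which H100 variant was used, and other variants carry different ratings, so the baseline is an assumption.

```python
# Assumption: baseline is the NVIDIA H100 SXM datasheet FP32 rate (not stated in the source).
H100_SXM_FP32_TFLOPS = 67.0
MOORE_THREADS_FP32_TFLOPS = 32.0  # figure quoted in the text


def relative_throughput(chip_tflops: float, baseline_tflops: float) -> float:
    """Return chip peak throughput as a percentage of the baseline."""
    if baseline_tflops <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * chip_tflops / baseline_tflops


ratio_pct = relative_throughput(MOORE_THREADS_FP32_TFLOPS, H100_SXM_FP32_TFLOPS)
print(f"{ratio_pct:.1f}% of H100 peak FP32")
```

Peak FP32 is only one axis of comparison. It says nothing about low-precision tensor throughput, memory bandwidth, or interconnect, which matter more for large-model training.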

Commercialization Progress:

Listed on the STAR Market on December 5, 2025, becoming the ‘first domestic GPU stock’
First-day stock price soared over 425%, with a market value peak exceeding 359.5 billion yuan
Consumer-grade game graphics card MTT S80 (benchmarked against RTX 3060) sold over 150,000 units in Q3 2025
Obtained Microsoft WHQL certification, making it the only domestic gaming graphics card that supports the Windows ecosystem
Consumer (C-end) repurchase rate reached 28%, and graphics rendering accounted for 35% of revenue in H1 2025

Core Advantage:

Reduce user migration costs through CUDA compatibility, and take the lead in achieving large-scale shipments in the consumer market

1.2 Muxi Semiconductor: Cost-Effective Full-Stack Solution Route

Core Positioning:

Focus on vertical domains such as government-enterprise, medical care, and finance, providing cost-effective full-stack solutions

Technical Route:

Independently develop GPU IP, build MXMACA software stack, with architecture highly compatible with CUDA ecosystem
Adopt ‘design-manufacturing-packaging’ full-chain autonomy

Product Matrix:

Xisi N Series: Real-time inference for small and medium models, already deployed in financial risk control (processing 1 billion transactions daily)
Xiyun C600: 7nm + 144GB HBM3e memory, supporting full training of 128B MoE large models
Xicai G Series: For industrial design and film rendering (mass production expected in 2026)

Commercialization Progress:

Listed on the STAR Market on December 17, 2025, with a first-day increase of 692.95% and a single-lot profit of 362,600 yuan
Delivered over 28,000 chips cumulatively, covering 32 intelligent computing centers (including H3C clusters)
Revenue in the first half of 2025 exceeded the full year of 2024, with outstanding orders amounting to 1.43 billion yuan
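
The first-day gain and the single-lot profit quoted above together imply an issue price. This back-of-the-envelope sketch assumes one lot is 500 shares (the STAR Market subscription unit) and that the profit is measured at the price corresponding to the 692.95% gain. The issue price is not stated in this report, so the result is an inference, not a quoted figure.

```python
SHARES_PER_LOT = 500        # assumption: STAR Market subscription unit
FIRST_DAY_GAIN = 6.9295     # 692.95% first-day increase, from the text
LOT_PROFIT_YUAN = 362_600   # single-lot profit, from the text


def implied_issue_price(lot_profit: float, gain: float, shares: int) -> float:
    """Invert profit = issue_price * gain * shares to recover the issue price."""
    if gain <= 0 or shares <= 0:
        raise ValueError("gain and shares must be positive")
    return lot_profit / (gain * shares)


issue_price = implied_issue_price(LOT_PROFIT_YUAN, FIRST_DAY_GAIN, SHARES_PER_LOT)
print(f"implied issue price: about {issue_price:.2f} yuan per share")
```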

Core Advantage:

Strong mass production capability, good ecosystem compatibility, and deep binding with government Xinchuang procurement

1.3 Biren Technology: Extreme Performance Route for High-End Training

Core Positioning:

Focus on high-end training, target supercomputing centers, and benchmark against international top levels

Technical Route:

One of the first domestic GPU companies to commercialize Chiplet technology
Adopt a progressive development path, focusing on cloud-side general intelligent computing fields

Commercialization Progress:

Published its Hong Kong IPO prospectus on December 22, 2025, with listing planned for January 2, 2026
Cumulative losses have exceeded 6.3 billion yuan in the 6 years since its founding; the IPO is expected to raise 4.21-4.85 billion Hong Kong dollars
According to estimates, the proceeds can sustain operations only until Q2 2028, leaving the company under heavy financial pressure
Over 1200 public patents globally, ranking first among Chinese general-purpose GPU companies
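
The funding figures above can be turned into a rough burn-rate check. The sketch below assumes the runway runs from the planned January 2026 listing through the end of June 2028, and that the IPO proceeds are the only source of funding. It ignores existing cash and revenue, so it is an illustration of the arithmetic rather than a forecast.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start (inclusive) to end (exclusive)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def implied_monthly_burn(cash: float, months: int) -> float:
    """Monthly spend that would exhaust `cash` in exactly `months` months."""
    if months <= 0:
        raise ValueError("months must be positive")
    return cash / months


CUMULATIVE_LOSS_BN_YUAN = 6.3      # from the text
YEARS_SINCE_FOUNDING = 6           # from the text
RAISE_RANGE_BN_HKD = (4.21, 4.85)  # from the text

# Assumption: "until Q2 2028" means through the end of June 2028.
runway_months = months_between(date(2026, 1, 1), date(2028, 7, 1))
avg_annual_loss = CUMULATIVE_LOSS_BN_YUAN / YEARS_SINCE_FOUNDING
burn_low, burn_high = (
    implied_monthly_burn(raised, runway_months) for raised in RAISE_RANGE_BN_HKD
)

print(f"runway: {runway_months} months")
print(f"average annual loss since founding: {avg_annual_loss:.2f} billion yuan")
print(f"implied burn: {burn_low * 1000:.0f}-{burn_high * 1000:.0f} million HKD per month")
```

Under these assumptions the implied burn is roughly 140-162 million Hong Kong dollars per month. The loss figure is in yuan and the fundraising figure in Hong Kong dollars, so the two should not be compared directly without an exchange rate.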

Core Challenge:

BR20X commercialization has fallen short of expectations; the company faces refinancing pressure, and its capital runway is in doubt

1.4 Suiyuan Technology: Inference Route Bound to Cloud Services

Core Positioning:

Bind to leading cloud vendors such as Tencent, and deeply cultivate cloud service inference scenarios

Technical Route:

Suisi L600 computing card focuses on optimization for inference scenarios
Deeply bound to Tencent Cloud, with inference card revenue accounting for over 60%

Commercialization Progress:

Restarted STAR Market listing counseling on November 1, 2025
Its computing cards are produced exclusively by Xiamen Suihong, a joint venture of Hongxin Electronics, with related revenue reaching 750 million yuan in H1 2025
Plans to double production capacity in 2026, with Tencent Cloud orders accounting for 70% of production capacity

Core Advantage:

Deeply bound to cloud vendors, forming a stable order source in inference scenarios

II. In-Depth Analysis of NVIDIA’s CUDA Ecosystem Barriers

2.1 Components of CUDA Moat

Technical Aspect:

Global base of 5 million developers
Over 1 million adapted applications
Formed a strong network effect of ‘developers-applications-hardware’

Commercial Aspect:

Holds over 80% of the global high-end GPU market
Market value was about 4.5 trillion US dollars in November 2025, making it the world’s most valuable company
Q3 revenue increased by about 62% year-on-year, maintaining strong growth

Essence of Ecosystem Barriers:

CUDA’s moat extends well beyond hardware technology. Even if competitors match or exceed NVIDIA on hardware performance, developer familiarity, the breadth of adapted applications, and user migration costs still leave a gap that is very hard to close.

2.2 The First Crack in CUDA

Breakthrough of Google TPU:

Google’s Gemini 3 large models were trained entirely on TPUs, proving the feasibility of non-GPU routes
TPUs can deliver 15-30 times the performance of contemporaneous CPUs and GPUs on tensor operations
Energy efficiency is 30-80 times higher, a significant advantage in specific scenarios

Significance of TPU Route:

No need to compete head-on with CUDA ecosystem
Trade generality for computing efficiency through the ASIC specialization route
Provide ideas for domestic manufacturers for differentiated competition

III. Comparison of Technical Routes and Pros and Cons Analysis

3.1 Characteristics of Different Technical Architectures

Technology Type	Definition	Advantages	Disadvantages	Representatives
GPU	General-purpose Graphics Processing Unit	Strong flexibility, mature ecosystem	High power consumption, high cost	NVIDIA
GPGPU	General-purpose Computing GPU	Compatible with graphics and computing	Still restricted by CUDA ecosystem	Moore Threads, Muxi
ASIC	Application-Specific Integrated Circuit	Extremely high efficiency, low cost	Poor flexibility, long development cycle	Google TPU
NPU	Neural Processing Unit	Optimized specifically for AI	Limited application scenarios	Huawei Ascend
TPU	Tensor Processing Unit	High efficiency in matrix operations	Overly specialized	Google, Zhonghao Xinying
DSA	Domain-Specific Architecture	Optimal in specific domains	Poor generality	Self-developed by major cloud vendors

3.2 Differentiation of Technical Routes of Domestic Manufacturers

CUDA Compatible Camp (Moore Threads, Muxi):

Advantages: Reduce user migration costs, quickly gain market recognition
Disadvantages: Always in a follower position, difficult to build their own moat
Risk: If NVIDIA changes or extends the CUDA ecosystem, compatibility layers may lag behind or break

Independent Architecture Camp (Biren, Suiyuan):

Advantages: More controllable in the long run, with the opportunity to build an independent ecosystem
Disadvantages: High migration cost, steep learning curve for developers
Risk: Long ecosystem construction cycle, huge financial pressure

IV. Core Challenges and Opportunities for Commercialization

4.1 Short-Term Challenges (2025-2027)

Supply Chain Risk:

TSMC advanced-process capacity is backlogged; Moore Threads faces the risk of 6-12 month tape-out delays
Muxi Semiconductor’s strategic stockpiling in 2024 pushed inventory up to 777 million yuan, increasing financial pressure
Biren Technology’s chip tape-out cycle was extended due to US Entity List restrictions in 2024

Financial Pressure:

The four companies have accumulated losses exceeding 10 billion yuan, still in the stage of ‘burning money for technology’
Biren Technology’s fundraising can only support until Q2 2028
Moore Threads plans to achieve profitability in 2027; if the target is not met, the market value bubble may burst

Ecosystem Maturity Gap:

Domestic ecosystems still lag far behind CUDA
Developer community size and the number of adapted applications are far smaller than NVIDIA’s
User learning costs and frequent ‘pitfalls’ during adoption hurt word of mouth

4.2 Mid-to-Long-Term Opportunities (2025-2030)

Policy and Market Dividends:

‘Made in China 2025’ requires 70% self-sufficiency rate of integrated circuits by 2025
Ministry of Industry and Information Technology clearly states that domestic GPUs account for 40% of AI server chips
‘East-West Computing Resource Allocation Project’ provides application scenarios for domestic GPUs

Market Size Forecast:

Domestic GPU market size is expected to exceed 80 billion yuan in 2025, with an annual growth rate of over 60%
Global GPU market will reach 3.6 trillion yuan in 2029, with China accounting for 1.36 trillion yuan (37.8%)
Domestic GPU supply share in AI training scenarios has risen to 40%
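
The share and growth figures above can be cross-checked with simple arithmetic. The sketch assumes the 80 billion yuan (2025) and 1.36 trillion yuan (2029) figures describe the same market, which the sources do not confirm. If they do, the implied compound growth is roughly a doubling every year, well above the quoted 60%, so the two forecasts likely rest on different market definitions.

```python
def share_pct(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate taking start to end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding start at the given annual rate."""
    return start * (1.0 + rate) ** years


# All figures in billions of yuan, taken from the text.
GLOBAL_2029 = 3600.0
CHINA_2029 = 1360.0
DOMESTIC_2025 = 80.0
YEARS = 2029 - 2025

china_share = share_pct(CHINA_2029, GLOBAL_2029)
# Assumption: the 2025 and 2029 figures measure the same market.
implied_growth = cagr(DOMESTIC_2025, CHINA_2029, YEARS)
at_60_pct = project(DOMESTIC_2025, 0.60, YEARS)

print(f"China share of 2029 global market: {china_share:.1f}%")
print(f"implied annual growth 2025-2029: {implied_growth:.0%}")
print(f"80 billion yuan at 60% per year reaches {at_60_pct:.0f} billion yuan by 2029")
```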

Geopolitical ‘Protection’ Effect:

US chip bans objectively provide a protective market for domestic manufacturers
Chinese large model vendors and cloud vendors are forced to switch to domestic chips
Developers are beginning to learn platforms such as CANN, so migration costs are gradually being absorbed
Once a local network effect forms, users may be unwilling to switch back even if the ban is lifted

4.3 Key Success Factors

Technical Breakthrough:
Key technologies such as advanced processes, Chiplet, and advanced packaging
Ecosystem Construction:
Developer community, application adaptation, toolchain improvement
Capital Runway:
Sustained financing capability to support until break-even
Scenario Landing:
Establish benchmark cases in specific vertical domains
Supply Chain Security:
Ensure stable supply of advanced processes

V. Commercialization Prospect Evaluation

5.1 Comprehensive Judgment

Feasibility of Breaking CUDA Barriers:

Time Dimension	Feasibility	Key Path
Short-term (1-2 years)	Low	Gain market share through CUDA-compatible solutions
Mid-term (3-5 years)	Medium	Establish independent ecosystems in specific vertical domains
Long-term (5-10 years)	High	Geopolitics plus technological breakthroughs reshape the global landscape

Specific Evaluation:

Moore Threads
(Feasibility ★★★★☆): With full-function route and consumer market breakthrough, it is most likely to achieve large-scale commercialization in the short term. The 28% C-end repurchase rate proves that the product has been recognized by the market, but financial pressure and advanced process supply are major bottlenecks.
Muxi Semiconductor
(Feasibility ★★★★☆): Full-stack solutions, deep cultivation of vertical domains, and strong mass production capability, backed by 1.43 billion yuan of outstanding orders that provide stable cash flow support. It is the most robust of the four in commercialization progress.
Biren Technology
(Feasibility ★★★☆☆): Strongest technical strength (over 1200 patents), but the greatest financial pressure (cumulative loss of 6.3 billion yuan), and high-end training scenario competition is the most fierce, with high commercialization risk.
Suiyuan Technology
(Feasibility ★★★★☆): The strategy of binding to Tencent Cloud has been verified in inference scenarios, with relatively small financial pressure, but over-reliance on a single customer is a potential risk.

5.2 Key Conclusions

Cannot Break Through CUDA Ecosystem Head-On:
In an open market environment, domestic manufacturers will find it difficult to break through CUDA’s moat head-on. Google TPU’s case suggests that the best strategy is not to compete head-on in the GPU field, but to take a differentiated route (ASIC/TPU/NPU).
Geopolitics Create ‘Protective Market’:
US chip bans objectively create a protected market within China, where domestic manufacturers can avoid direct competition with NVIDIA and gradually build their own ecosystems.
Vertical Domains Are Breakthrough Points:
Domestic manufacturers should not pursue full replacement of NVIDIA, but should deeply cultivate vertical domains such as government-enterprise, medical care, finance, and cloud service inference, and gradually expand after establishing benchmark cases.
2025-2027 Is a Critical Window Period:
According to estimates, Biren Technology’s fundraising can only support it until Q2 2028, and Moore Threads plans to reach profitability in 2027. The next 2-3 years will be a critical window for domestic GPU commercialization, and an industry shakeout is approaching.
Long-Term Optimistic, But Pain Is Inevitable:
The rise of Chinese AI chips is an inevitable trend, but in the short term it will bring pain in the form of reduced computing efficiency, immature ecosystems, and supply chain restrictions. This is a ‘marathon’, not a ‘sprint’.

VI. Investment and Industry Recommendations

6.1 For Investors

Focus on companies with obvious differentiated technical routes and small financial pressure
Prioritize enterprises with benchmark cases in vertical domains
Pay close attention to supply chain risks and cash flow status
Understand that this is a long-term track, and short-term fluctuations are inevitable

6.2 For Industry Players

Should not pursue full replacement of CUDA, but build differentiated advantages
Deeply cultivate specific scenarios (government-enterprise, finance, medical care, cloud inference)
Strengthen developer community construction and reduce user migration costs
Deeply bind with cloud vendors and ISVs to form ecosystem synergy

6.3 For Policymakers

Continue to support the development of domestic GPU industry, but avoid excessive protection leading to insufficient competitiveness
Allow necessary NVIDIA chip procurement in core AI projects to balance short-term pain and long-term development
Support breakthroughs in key technologies such as advanced processes and Chiplet
Promote projects like ‘East-West Computing Resource Allocation’ to provide application scenarios for domestic GPUs

Final Conclusion:

The four leading domestic GPU players are unlikely to fully break through CUDA ecosystem barriers in the short term. Through differentiated technical routes, deep cultivation of vertical domains, and the protected market created by geopolitics, however, they have the opportunity to achieve partial breakthroughs and commercialization. 2025-2027 is a critical window, and whoever reaches break-even and establishes benchmark cases first will occupy a favorable position in this marathon.

