
The AI memory chip shortage is quietly imposing a hidden tax on the entire economy

Sha Rabii
2026-05-05

The AI boom has triggered what insiders are calling “RAMageddon.”


Sha Rabii, President and Co-founder of Majestic Labs. Image credit: Courtesy of Majestic Labs



Your next laptop, smartphone, or even refrigerator is going to cost more — and you can thank AI for that. The AI boom has triggered what insiders are calling “RAMageddon”: a gold rush on high-bandwidth memory chips that is squeezing out nearly every other buyer in the global market, driving up prices across consumer electronics and straining industries from automotive to healthcare. Even Apple CEO Tim Cook has warned about the pressure AI infrastructure costs are placing on hardware margins.

The biggest AI players have effectively imposed a tax on the entire economy — and most people have no idea it’s happening.

How the once-affordable memory chip became a luxury good

Modern computing relies on several types of memory. SRAM is the fastest and most expensive; it’s used in small amounts inside processors. DRAM is the workhorse of the group: cheap, abundant, found in everything from laptops to cars to refrigerators. Then there’s High Bandwidth Memory, or HBM: a specialized, premium form of DRAM that stacks chips die-to-die to achieve dramatically faster data transfer speeds. The cost of this premium memory is steep: a single silicon wafer yields 3x as much commodity DRAM as HBM. Fab processing time for HBM is significantly longer too, which makes the supply problem worse. As a result, producing more HBM means producing fewer memory chips overall.
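
To make that trade-off concrete, here is a back-of-the-envelope sketch in Python. The only figure taken from this article is the 3-to-1 ratio of commodity DRAM to HBM per wafer; the per-wafer output and the 30% allocation shift are illustrative assumptions, not manufacturer data.

```python
# Back-of-the-envelope model of the wafer allocation trade-off.
# From the article: one wafer yields ~3x as much commodity DRAM
# capacity as HBM capacity. The absolute output per wafer is a
# hypothetical figure chosen only to make the arithmetic readable.

DRAM_GB_PER_WAFER = 3000   # hypothetical commodity DRAM output per wafer
HBM_TO_DRAM_RATIO = 1 / 3  # article: 3x as much DRAM as HBM per wafer

def capacity_mix(total_wafers: int, hbm_share: float) -> tuple[float, float]:
    """Return (commodity DRAM GB, HBM GB) for a given wafer split."""
    hbm_wafers = total_wafers * hbm_share
    dram_wafers = total_wafers - hbm_wafers
    dram_gb = dram_wafers * DRAM_GB_PER_WAFER
    hbm_gb = hbm_wafers * DRAM_GB_PER_WAFER * HBM_TO_DRAM_RATIO
    return dram_gb, hbm_gb

# Shift 30% of a hypothetical 10,000-wafer fab from DRAM to HBM:
before_dram, _ = capacity_mix(10_000, 0.0)
after_dram, after_hbm = capacity_mix(10_000, 0.3)
print(f"Commodity DRAM lost: {before_dram - after_dram:,.0f} GB")
print(f"HBM gained:          {after_hbm:,.0f} GB")
# Every gigabyte of HBM gained removes three gigabytes of commodity
# DRAM from the market, which is the squeeze described above.
```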

For AI training and inference, HBM has become the essential ingredient. It’s the jet fuel that powers the GPUs running today’s largest and most advanced models.

Memory manufacturers have a limited number of wafers they can produce from each fab, or silicon factory. The same production lines that churn out commodity DRAM for the devices consumers use every day are being allocated to building HBM. It’s a rational business decision: HBM commands premium prices in a volatile industry and comes with massive guaranteed purchase orders. In fact, AI firms and their peers have already locked up HBM supply well into 2027. The result is a tightening of commodity memory supply, rising prices, and longer lead times with ripple effects that touch almost every industry. And right now, the industry’s biggest players have cornered the supply, creating the core tension driving the memory shortage. The AI industry is effectively taxing the entire economy in order to build its own.

The memory wall explained

The scale of AI’s memory appetite is staggering. As model sizes have grown from millions to billions to trillions of parameters and context windows have grown from thousands of tokens to tens of millions of tokens, memory requirements have increased in step, and the architecture of data centers has struggled to keep pace. This is the industry’s “memory wall”: a fundamental bottleneck where memory bandwidth and capacity can’t keep up with the processors demanding them.

HBM was an elegant solution to an earlier version of this problem. When models were smaller (such as GPT-2 and GPT-3), placing memory adjacent to the processors and delivering data at extreme speeds worked well. But we’ve since blown past that era. Today’s frontier models exceed two trillion parameters and the next generation will be over five trillion. A single HBM stack holds about 24 gigabytes. That’s roughly one percent of what today’s workloads actually need and far less than that for the next generation of workloads.
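
A rough weights-only calculation shows where that “roughly one percent” figure plausibly comes from. The two bytes per parameter (FP16) is our assumption, and activations and KV-cache state are ignored, which would only widen the gap:

```python
# Weights-only estimate for a two-trillion-parameter model, assuming
# 2 bytes per parameter (FP16). Activations and KV cache would push
# the real requirement higher, not lower.

PARAMS = 2_000_000_000_000   # two trillion parameters (from the article)
BYTES_PER_PARAM = 2          # FP16 assumption
HBM_STACK_GB = 24            # per-stack capacity cited in the article

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"Weights alone: {weights_gb:,.0f} GB")                    # 4,000 GB
print(f"One HBM stack covers {HBM_STACK_GB / weights_gb:.1%}")   # ~0.6%
```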

The result is that data centers must now scale out exponentially. They chain together hundreds of processors across servers and racks. At that point, HBM’s killer feature of extreme local bandwidth gets strangled by the comparatively slow links connecting all of these machines. The industry has built a gold-plated solution to the problem: AI companies pay the HBM premium while realizing only diminishing returns on performance.
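
The mismatch is straightforward to quantify. In the sketch below, both bandwidth figures are round-number, order-of-magnitude assumptions rather than vendor specifications; the point is the ratio, not the absolute values.

```python
# Order-of-magnitude comparison: reading data from local HBM versus
# pulling the same data across the links that join machines in a
# scaled-out cluster. Both figures are assumptions for illustration.

HBM_LOCAL_GB_S = 3000    # ~3 TB/s of local HBM bandwidth (assumed)
INTER_NODE_GB_S = 50     # ~400 Gb/s network link = 50 GB/s (assumed)

transfer_gb = 100        # hypothetical slice of model state to move

local_ms = transfer_gb / HBM_LOCAL_GB_S * 1000
network_ms = transfer_gb / INTER_NODE_GB_S * 1000
print(f"Local HBM read:     {local_ms:7.1f} ms")    # ~33 ms
print(f"Across the network: {network_ms:7.1f} ms")  # ~2,000 ms
# Once data has to cross machines, the premium local bandwidth idles.
```

At a ratio like this, the expensive on-package bandwidth spends most of its time waiting on the network, which is the diminishing return described above.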

The AI gold rush leaves most behind

The prevailing narrative frames AI infrastructure investment as broadly good: better for memory makers, better for chip companies, better for innovation everywhere. The reality is more lopsided.

Memory manufacturers may profit in the short term. But the true winner is concentration itself. When the HBM supply is locked up by a handful of hyperscalers, it functions as a moat. Startups, enterprises, and established industries all face higher hardware costs and more limited access to the advanced AI capabilities they need to compete. The companies that can afford to stockpile chips don’t just win today; they entrench advantages that could become impossible to dislodge.

Everyone else is caught in the crossfire. Consumers will pay more for devices with less capability. Businesses face a hardware cost environment that has become more taxing and volatile. And the broader technology ecosystem is competing for memory resources against an industry that has essentially unlimited capital to outbid them.

Charting a more sustainable path for AI and memory

The AI industry loves to talk about democratization: open models, accessible tools, intelligence for everyone. That story is increasingly disconnected from the hardware reality being constructed underneath it.

The current trajectory isn’t sustainable. Pouring more investment into HBM capacity addresses a symptom while ignoring the underlying disease. The industry needs to move beyond its fixation on a single memory architecture designed for an earlier era of AI and invest seriously in new approaches—ones that can meet AI’s demands today, and as they grow a hundredfold in the next few years.

Solving this requires more than ramping up additional HBM fabs. It requires a fundamental rethinking of how memory is architected for AI. What’s needed are memory systems that are smart, fast, and compact — architectures that can scale alongside model growth without requiring brute-force resource consumption. Most importantly, the emerging architecture must make AI accessible to more than just a handful of companies.

The memory wall isn’t just an engineering footnote in AI’s rise. It’s the defining infrastructure challenge of this era. The industry’s current answer of “more HBM, faster, at any cost” is a perilous road that risks eroding competition, innovation, and consumer trust. As an industry, we must find a way to do better, and quickly, before it’s too late. The most immediate relief available is a pivot away from HBM dependency toward commodity DRAM architectures engineered specifically for AI’s requirements. The window to act — before the gap between AI haves and have-nots becomes unbridgeable — is closing fast.

Sha Rabii is President and Co-founder of Majestic Labs, a semiconductor memory architecture company.

The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.
