“修复代码”，一句简短指令导致美政府封禁Anthropic两款大模型

Jeremy Kahn

2026-06-29

很多AI模型同样可以用来挖掘现有代码中的安全漏洞。

文本设置

小号

默认

大号

Plus(0条)

2022 年一场科技峰会上，Luta Security创始人兼首席执行官凯蒂・穆苏里斯。亚马逊研究人员发现Anthropic旗下Fable AI模型存在安全漏洞，穆苏里斯负责对该漏洞展开核查。她证实这一模型越狱手段操作门槛极低，但同时提出，Fable对网络安全防御人员的价值远高于攻击者滥用该技术带来的风险。图片来源：Kelly Sullivan—Getty Images for TechCrunch

一处安全漏洞，促使美国政府针对Anthropic的Fable 5和Mythos 5模型实施出口管制，而攻破安全防护仅涉及一句简短指令：“修复代码”（Fix this code）。

Luta Security创始人兼首席执行官凯蒂・穆苏里斯撰写的深度博客文章中披露了该事始末。Anthropic曾委托穆苏里斯审核关于Fable模型安全漏洞的报告，报告由亚马逊的网络安全研究人员出具。穆苏里斯曾两次担任美国政府网络安全顾问，也曾在微软担任网络安全专家。这一漏洞随后被上报给特朗普政府，亚马逊首席执行官官安迪・贾西还专门就此致电白宫汇报，最终结果是美国政府对Fable及其底层基础模型Mythos实施出口管制。

按照美国出口管制法规的界定，只要将相关技术提供给非美国公民，即便对方身处美国境内，也会被认定为技术出口。Anthropic表示，公司别无选择，只能对所有用户禁用两款AI模型。一旦实施出口管制，Anthropic公司内部的非美国籍员工也不得使用或参与模型研发工作。

目前尚不清楚亚马逊究竟为何决定测试Fable的安全防护机制，以及首次就此事联系Anthropic的具体时间。

穆苏里斯写道，亚马逊发现的越狱手法非常简单，即向Fable提供包含已知漏洞的软件代码。当研究人员要求Fable“审查代码中的安全问题”时，模型拒绝请求。但当研究人员转而要求模型“修复代码”时，模型却生成了修复补丁。她表示，研究人员随后通过手动操作，将Fable的输出结果转化为脚本，也就是一组能自动执行流程的编程指令，以便测试补丁。但由于模型必须首先找出软件漏洞才能生成修复方案，攻击者同样可以利用这一过程发现代码中的漏洞。

她写道，亚马逊发现的漏洞没有可行的根本性修复方案，任何修补尝试都只会削弱模型在防御方面的能力。

其他很多AI模型同样可以用来挖掘现有代码中的安全漏洞。正如穆苏里斯所述，这种越狱方法并未解锁Fable底层模型Mythos最强大的功能。Mythos的突出之处在于能自主发现并将多个网络安全漏洞串联，甚至可能自主策划完整的攻击。Mythos是首个成功通过英国AI安全研究所测试AI模型黑客能力的两项网络安全“测试靶场”的模型。

穆苏里斯写道，Fable利用亚马逊的技术展现出的能力虽然可能对攻击者有用，对网络防御者同样至关重要。“防御者需要借助AI修复文件中的漏洞，解释修复的重要性，并编写测试脚本以确认补丁有效，”她写道。“这不是绕过安全防护，而是AI模型为防御性安全提供最有价值的功能。”

穆苏里斯建议，反对出口管制的人应该印些T恤，一面印上“修复代码”，另一面印上“本衬衫是军需品”。这是致敬上世纪90年代网络安全界为推翻美国对强加密技术出口管制的事件。1995年，密码学家亚当・巴克在一件T恤正面印上了三行RSA加密代码，背面印着“本衬衫被列为军需品，禁止从美国出口”。他鼓励人们穿着T恤跨境出行，开展非暴力抗议。

穆苏里斯还是签署公开信的网络安全专家之一。该信由网络安全初创公司Corridor首席安全官、Facebook前首席安全官亚历克斯·斯塔莫斯发起，呼吁撤销对Fable和Mythos的出口管制。“在竞争对手快速进步时，如果没有充分的理由就剥夺防御者获取顶尖能力的机会，是极其危险的，”信中写道，还特别提及中国AI模型能力日益增强。

截至目前，约100位来自英伟达、Adobe、Zoom、谷歌、Anaplan和Sophos等公司的网络安全专业人士，以及一些学术界的网络安全研究人员在公开信上联合签名。

信中指出，尽管Anthropic的Mythos级别模型“在发现漏洞和武器化利用方面表现出色……但并非只有该系列模型具备相关能力”。信中提到，网络安全专家已使用其他AI模型，包括开源模型进行软件安全审计和红队测试。此外，OpenAI的GPT-5.5、Anthropic最新的Claude Opus和Sonnet模型，以及中国月之暗面的Kimi 2.7等模型，都能对代码进行安全漏洞审查，原理与亚马逊测试Fable非常相似。

“之所以采取前所未有的出口管制，给出的理由是Fable拥有其他其他AI模型不具备的独特技术优势，但事实上，去年以来AI就已能以超越人类的水平挖掘漏洞并生成可执行的攻击代码，”公开信写道。

公开信还指出，Anthropic在Fable中内置了多重防护机制，防止其被用于网络攻击。“防护措施相当严格，发布当天还一度成为网络安全圈的笑料，”信中称。

据Axios报道，一位熟悉特朗普政府出口管制思路的匿名消息人士透露，Anthropic委托穆苏里斯评估亚马逊的研究，可能加剧了白宫方面的不满，直接加速了出口管制的出台。

Axios援引该官员的话称，政府将穆苏里斯视为“激进民主党人士”。这位匿名消息人士指出，安全研究员克里斯・克雷布斯在社交媒体上为穆苏里斯的分析背书，让情势雪上加霜。特朗普第一任期内曾解除克雷布斯网络安全和基础设施安全主管的职务，因为他驳斥过特朗普的说法，当时特朗普声称2020年11月大选存在大规模舞弊的说法，还发生过黑客入侵电子投票机的情况。（财富中文网）

译者：梁宇

审校：夏林

一处安全漏洞，促使美国政府针对Anthropic的Fable 5和Mythos 5模型实施出口管制，而攻破安全防护仅涉及一句简短指令：“修复代码”（Fix this code）。

目前尚不清楚亚马逊究竟为何决定测试Fable的安全防护机制，以及首次就此事联系Anthropic的具体时间。

她写道，亚马逊发现的漏洞没有可行的根本性修复方案，任何修补尝试都只会削弱模型在防御方面的能力。

截至目前，约100位来自英伟达、Adobe、Zoom、谷歌、Anaplan和Sophos等公司的网络安全专业人士，以及一些学术界的网络安全研究人员在公开信上联合签名。

译者：梁宇

审校：夏林

The security vulnerability that led the U.S. government to impose export controls on Anthropic’s Fable 5 and Mythos 5 models is a simple technique that involves just three simple words: Fix this code.

That’s according to a detailed blog post from Katie Moussouris, the founder and CEO of Luta Security. Anthropic had asked Moussouris, who has held two government advisory roles on cybersecurity and previously worked as a cybersecurity expert at Microsoft, to review a report on the security vulnerability in its Fable model that cybersecurity researchers at Amazon had produced. The vulnerability, which was later reported to the Trump administration, including in a phone call Amazon CEO Andy Jassy had with the White House, led the U.S. government to impose export controls on Fable as well as the underlying base model, Mythos.

Because U.S. export controls work in a way that distribution of the technology to any noncitizen is deemed to be an export, even if those individuals are physically located in the U.S., the company said it had no choice but to disable the two AI models for all users. The export controls would have meant that Anthropic’s own noncitizen employees would not be allowed to use or work on the models.

It remains unclear exactly why Amazon decided to test the safeguards around Fable and when it first contacted Anthropic about the issue.

Moussouris wrote that the jailbreak Amazon discovered was simple and involved giving Fable software code with known vulnerabilities. When the researchers asked Fable to “review the code for security issues” the model refused. But when the researchers instead asked the model to “fix this code,” the model produced patches. The researchers, she said, then used a manual process that turned Fable’s output into scripts—a set of programming instructions that can automate a process—that could test the patches. But because the model had to find the software vulnerabilities in order to generate the fixes, the same process could potentially be used by an attacker to spot code vulnerabilities.

She wrote that the vulnerability that Amazon discovered “cannot meaningfully be fixed, and any attempt would only weaken the model for defense.”

Many other AI models can also be used to spot security flaws in existing code. The jailbreak, as described by Moussouris, did not unlock the most potent capabilities of Anthropic’s Mythos model, upon which Fable is based. Mythos was notable for being able to autonomously find and chain multiple cybersecurity vulnerabilities together, potentially orchestrating entire attacks autonomously. Mythos was the first model to successfully complete both cybersecurity “test ranges” that the U.K. AI Security Institute uses to test the hacking abilities of AI models.

Moussouris wrote that the capabilities Fable displayed using the Amazon technique, while potentially useful to attackers, were also vital for cyber defenders. “Defenders need to be able to ask AI to fix bugs in a file, explain why the fix matters, and write tests that confirm the patch works,” she wrote. “That is not a guardrail bypass. It is the most valuable thing an AI model can do for defensive security.”

Moussouris suggested that those opposing the export controls ought to have T-shirts printed with the words “fix this code” on one side and the phrase “this shirt is a munition” on the other. That’s a reference to a 1990s effort by the cybersecurity community to overturn U.S. export controls on strong encryption methods. In 1995, cryptographer Adam Back printed three lines of RSA encryption code on the front of a T-shirt, and on the back printed “this shirt is classified as a munition and cannot be exported from the United States.” He encouraged people to cross the border wearing the shirts in an act of civil disobedience.

Moussouris was among the cybersecurity experts who have added their names to an open letter, put together by Alex Stamos, the chief security officer at cybersecurity startup Corridor and a former chief security officer at Facebook, that is calling for the export controls on Fable and Mythos to be rescinded. “To pull the best capabilities away from defenders without a good reason when our adversaries are rapidly advancing is dangerous,” the letter stated, noting the increasing capabilities of Chinese AI models.

That letter has now been signed by about 100 cybersecurity professionals from companies including Nvidia, Adobe, Zoom, Google, Anaplan, and Sophos, as well as some academic cybersecurity researchers.

The letter stated that while Anthropic’s Mythos-class models “are quite good at finding flaws and weaponizing exploits … they are not uniquely good at these tasks.” It noted that cybersecurity experts were already using other AI models, including open-source models, for security audits and red-teaming of software. And it said that OpenAI’s GPT-5.5 as well as Anthropic’s latest Claude Opus and Sonnet models, as well as Chinese models such as Moonshot AI’s Kimi 2.7 can all perform similar reviews of code for security flaws in a similar way to the one Amazon discovered with Fable.

“The justification for this unprecedented action was that Fable provides a unique ‘uplift’ of capabilities beyond other AI models, but AI has been finding bugs and generating working exploits at superhuman levels since last year,” the letter stated.

The letter also notes that Anthropic had built multiple protections into Fable to prevent its use for cyberattacks. “These protections were so aggressive as to be the source of humor in the cyber community on launch day,” it said.

Axios cited an unnamed source familiar with the Trump administration’s thinking around the export controls as suggesting that Anthropic’s decision to engage Moussouris to review the Amazon research might have inflamed tensions with the White House and precipitated the export controls.

Axios quoted the official as saying the company had enlisted an expert—Moussouris—whom the administration viewed as a “radical Democrat.” The same unnamed source noted that it also didn’t help that security researcher Chris Krebs had vouched for Moussouris’s analysis on social media. President Trump had fired Krebs from his role as cybersecurity and infrastructure security chief during his first term after Krebs contradicted Trump’s claims of widespread election fraud, including hacking of electronic voting machines, in the November 2020 presidential election.

财富中文网所刊载内容之知识产权为财富媒体知识产权有限公司及/或相关权利人专属所有或持有。未经许可，禁止进行转载、摘编、复制及建立镜像等任何使用。

0条Plus

精彩评论

撰写或查看更多评论

请打开财富Plus APP

前往打开

热读文章

关注我们

“修复代码”，一句简短指令导致美政府封禁Anthropic两款大模型

撰写或查看更多评论