What we know about US stress tests of Google, xAI and Microsoft AI models
2026-05-05 22:07 UTC / Reuters
By Courtney Rozen and Jody Godoy
May 5, 2026 10:07 PM UTC Updated 1 hour ago
The Google logo is pictured at the entrance to the Google offices in London, Britain January 18, 2019. REUTERS/Hannah McKay/File Photo
WASHINGTON, May 5 (Reuters) – The Trump administration on Tuesday announced it had expanded a program that gives U.S. government scientists access to unreleased artificial intelligence models for risk assessments to include Google DeepMind, xAI and Microsoft.
ChatGPT maker OpenAI and Claude owner Anthropic had already been voluntarily working with the U.S. Center for AI Standards and Innovation (CAISI), the team of U.S. government scientists, to test unreleased models for vulnerabilities, according to the companies.
Here is what we know about the reviews:
WHAT RISKS ARE THE U.S. FOCUSED ON?
U.S. government scientists are focused on “demonstrable risks,” such as the risk that advanced models can be used to launch cyberattacks on American infrastructure, according to the CAISI website. They want to limit opportunities for U.S. adversaries to use AI to develop chemical or biological weapons, or corrupt the data used to train American AI models.
WHAT WILL COMPANIES HAND OVER?
OpenAI is working with the group to test GPT-5.5-Cyber, said Chris Lehane, head of global affairs at OpenAI, in a LinkedIn post on Tuesday. GPT-5.5-Cyber is a variant of its latest model designed for defensive cybersecurity work.
Microsoft will work with the scientists to build shared datasets and workflows to assess advanced AI models, the company said in a statement. Microsoft did not specify which models.
Anthropic gave CAISI access to both publicly available and unreleased models, allowing researchers to probe for vulnerabilities in a process known as “red-teaming,” or simulating the behavior of malicious actors, the company said in September. The company also gave CAISI detailed documentation on known vulnerabilities and safety mechanisms.
Google DeepMind, Alphabet’s AI research arm, will provide access to its “proprietary models” and data, a spokesperson said.
xAI did not immediately respond to a request for comment from Reuters.
WHAT HAS THE U.S. FOUND SO FAR?
Anthropic’s work with CAISI revealed that tricks such as claiming that human review had occurred, or substituting characters, could get around safety mechanisms, the company said, adding that it had patched the vulnerabilities.
OpenAI said in September that it worked with CAISI to probe vulnerabilities in its ChatGPT Agent that could have allowed sophisticated actors to bypass OpenAI’s cybersecurity measures. The exploit would have allowed the attacker to “remotely control the computer systems the agent could access for that session and successfully impersonate the user for other websites they’d logged into,” the company said.
The companies, along with Meta, Amazon and Inflection AI, agreed in 2023 to allow independent experts to check their models for biosecurity and cybersecurity risks.
The U.S. government scientists, organized under a different name during former U.S. President Joe Biden’s tenure, also released voluntary guidelines to protect against the risk of AI models leaking private health information or producing incorrect answers.
The scientists are now working on guidelines for critical infrastructure providers, such as the communications and emergency services sectors, to test their AI systems, according to their website.
Reporting by Courtney Rozen; Editing by Stephen Coates
Our Standards: The Thomson Reuters Trust Principles.