Google, Microsoft, and xAI Agree to Government Pre-Release Review of AI Models
- From Mythos panic to a government beachhead
- CAISI: the underpowered office that became the de facto gatekeeper
- The big three join the club
- A White House that wanted speed, now talking safety
- Industry motives: safety, signaling, and self‑preservation
- A voluntary system that’s already outgrowing “voluntary”
- The China variable and the road ahead
Coverage of the deal frames it as a voluntary but fragile expansion of CAISI's pre-release access program, prompted by incidents like the Mythos crisis and growing concern over frontier AI risks, and situates the move within Trump-era debates over FDA-style approvals, Pentagon vetting for government models, and US–China jockeying to manage an AI arms race.

Google, Microsoft, and Elon Musk's xAI have quietly agreed to something once unthinkable in Silicon Valley: letting the U.S. government inspect their most advanced AI models before the public ever touches them.
What looks like a voluntary “partnership” today is in fact the latest move in a rapid, crisis-driven pivot from AI acceleration at all costs to something that looks a lot more like regulation by stealth.
From Mythos panic to a government beachhead
The turning point came with Anthropic’s Mythos, a frontier model whose reported hacking capabilities sent the Trump White House scrambling to understand what had been unleashed and what, if anything, the government could do about it.1 The administration, which had spent years mocking formal AI safety regimes, suddenly began convening tech companies and trade groups to talk about cyber risks from models like OpenAI’s GPT‑5.5 and Mythos itself.1
Inside those meetings, one idea crystallized: the Pentagon should be tasked with safety‑testing AI models before they’re deployed across federal, state, and local governments — an extra layer of vetting to catch security vulnerabilities before they’re embedded in public‑sector systems.1 Parallel conversations, first reported as “fairly far along,” explored an executive order to force safety testing of new models across multiple agencies.1
In other words, what started as damage control over a single scary model quickly became the blueprint for a much broader oversight regime.
CAISI: the underpowered office that became the de facto gatekeeper
The institutional vehicle for this pivot was already hiding in plain sight. Established in 2023 under Biden as the AI Safety Institute, the office was renamed and refocused by Trump as the Center for AI Standards and Innovation (CAISI), housed inside the Commerce Department’s National Institute of Standards and Technology.2 Its new brief: standards and national security, not abstract “AI ethics.”
Despite having no statutory power and fewer than 200 staff, CAISI has quietly become the closest thing the U.S. has to an AI regulator. Since 2024 it has conducted more than 40 evaluations of cutting‑edge models from OpenAI and Anthropic — including systems never released to the public and, crucially, versions with many safety guardrails deliberately stripped away.2
That access lets government evaluators probe directly for national‑security nightmares: biological weapon synthesis, automated cyberattacks, and autonomous agent behaviors that might be hard to control at scale.2 CAISI’s new director, Chris Fall, frames this as basic measurement science in the public interest. “Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications,” he said, adding that expanded partnerships “help us scale our work in the public interest at a critical moment.”3
Still, the arrangement is voluntary. CAISI can’t actually block a model’s release. Its influence derives entirely from political pressure, corporate anxiety, and the looming threat that if industry doesn’t play ball, Congress or the White House might decide to give it teeth.
The big three join the club
On May 5, that informal regime snapped into a new phase. The Commerce Department announced that Google DeepMind, Microsoft, and xAI would now allow CAISI to conduct “pre‑deployment evaluations and targeted research to better assess frontier AI capabilities.”3 The government explicitly framed this as deepening oversight of “cutting‑edge AI” just as systems are becoming “more powerful and potentially risky.”3
With the three newcomers, all five major frontier labs — OpenAI, Anthropic, Google, Microsoft, and xAI — now hand over their latest models for government testing before launch.2 As one account put it: “Five companies now account for the vast majority of frontier AI development worldwide, and all five have agreed to let a single government office test their systems before deployment.”2
Previously announced deals with Anthropic and OpenAI were quietly “renegotiated” to align with CAISI’s updated directives and President Trump’s AI Action Plan, according to Commerce.3 As The Verge summarized, “Google DeepMind, Microsoft, and Elon Musk’s xAI have agreed to allow the US government to review new AI models before they’re released to the public,” under agreements designed to “better assess frontier AI capabilities.”4
The symbolism is striking: an office with no clear mandate now enjoys pre‑release access to nearly every model that could plausibly reshape the world economy or destabilize global security.
A White House that wanted speed, now talking safety
The politics of this shift are as dramatic as the technical implications. For most of Trump’s tenure, the White House line on AI was simple: move fast, scale faster, beat China. Safety guardrails were usually described as job‑killing bureaucracy.
Yet the same administration is now “deepening its oversight of cutting‑edge AI” by signing new agreements for pre‑deployment testing — and officials openly concede that this marks “a sharp change from the White House’s approach of prioritizing rapid innovation without guardrails in a bid to beat China.”3
National Economic Council director Kevin Hassett has started floating a far more interventionist model: an oversight process for new AI that resembles FDA drug approvals. “We’re studying, possibly an executive order to give a clear roadmap to everybody about how this is going to go and how future AIs that also potentially create vulnerabilities should go through a process so that they’re released to the wild after they’ve been proven safe, just like an FDA drug,” he said on Fox Business.5
White House chief of staff Susie Wiles is simultaneously trying to reassure both hawks and the tech sector. “When it comes to AI and cyber security, President Trump and his administration are not in the business of picking winners and losers. This administration has one goal; ensure the best and safest tech is deployed rapidly to defeat any and all threats,” she said in a statement, adding that the White House “appreciate[s] the effort being made by the frontier labs to ensure that goal is met.”5
Translation: we still want to win the AI race — but we now admit there should be a track marshal.
Industry motives: safety, signaling, and self‑preservation
For the labs, agreeing to pre‑release reviews is a calculated gamble.
On one hand, it offers a powerful political shield. By working with CAISI, companies can argue they’re acting responsibly while avoiding harder, binding regulation. It also gives them a direct channel to shape the metrics, tests, and threat models policymakers end up using.
On the other hand, they’re voluntarily letting the federal government peek under the hood of their crown‑jewel models — including, in some cases, raw capabilities that haven’t been locked down with user‑facing safeguards.2 In a competitive landscape increasingly framed as U.S. vs. China, that’s not a trivial concession.
Still, with leaks about Mythos’ offensive cyber potential generating “a new cybersecurity panic” in Washington,1 the alternative — being painted as the company that shipped a model that crashed critical infrastructure — looks worse. For now, pre‑release review is the least bad option.
A voluntary system that’s already outgrowing “voluntary”
The oversight framework now emerging has three distinct layers:
- CAISI pre‑release and post‑deployment evaluations of frontier models from the big five labs, currently voluntary and non‑binding but expanding in scope.23
- A proposed Pentagon‑led safety testing framework for models used across federal, state, and local governments — a more formal requirement for public‑sector deployments.1
- A possible executive order to create a cross‑agency safety regime for powerful models, which senior officials say could resemble FDA‑style pre‑clearance.15
Behind the scenes, the White House is also exploring ways to bypass an existing ban on federal agencies using Anthropic systems so they can adopt Mythos for defense and cybersecurity work,1 underscoring the central contradiction: the same capabilities that terrify them are the ones they most want in their own toolbox.
The China variable and the road ahead
All of this is unfolding against a geopolitical backdrop the administration can’t ignore. As Axios notes, the AI safety pivot comes just as the U.S. and China weigh “official discussions about AI” ahead of Trump’s upcoming trip to Beijing — a signal that “neither side wants a dangerous arms race.”5
If Washington genuinely wants to avoid that arms race, a domestic oversight regime becomes a bargaining chip as much as a security measure. You can’t credibly ask Beijing for guardrails if your own frontier labs are shipping models without any.
For now, the system rests on a thin reed: corporate goodwill toward an under‑resourced office with no legal power to say no. But with each new crisis, each leaked capability demo, and each headline about “frontier AI testing” ramping up,3 the gravitational pull toward something more formal grows stronger.
Today it’s voluntary review. Tomorrow it may look a lot more like licensing.