Google, Microsoft, and xAI Must Now Pass a US Government Safety Test Before Releasing AI Models

The AI industry just got a new referee — and it’s sitting in Washington, not Silicon Valley.

On May 5, 2026, the US Department of Commerce announced that Google, Microsoft, and xAI had all agreed to voluntarily submit their latest AI models for safety evaluation through the Center for AI Standards and Innovation (CAISI) before those models go public. This isn’t a gentle suggestion. It’s the most significant shift in US AI governance in two years, and it came from an administration that spent its first year practically cheering from the sidelines.

Here’s how the new system works, why it matters, and what it means for the future of AI development in America.

How the System Works

Think of it like the FAA testing a new aircraft before letting it take off. Before these three companies can ship their next-gen models to the public, they hand over the blueprints — or rather, the models themselves — to CAISI for evaluation. The center tests for capabilities and security risks, covering “testing, collaborative research and best practice development related to commercial AI systems,” according to CAISI’s director Chris Fall.

The number is telling: CAISI has already conducted 40 prior evaluations of AI tools. That includes “state-of-the-art models that remain unreleased.” So this isn’t a brand-new bureaucracy churning through paperwork. It’s an existing operation that’s been quietly reviewing unreleased models, and now its scope has expanded dramatically.

The Shift Nobody Expected

Here’s what makes this story genuinely surprising: this administration has been the most hands-off on AI regulation of any administration in recent memory. When Trump signed his “AI Action Plan” last year, the language was clear — “remove red tape and onerous regulation” and ensure the US would “win” through advancement and control. The framing was innovation-first, regulation-later.

So what changed? Two things, both involving Anthropic.

First, Anthropic’s CEO Dario Amodei publicly revealed a model called “Mythos” that the company said was too powerful to release. The implication was blunt: a company had built something so capable that keeping it offline was the safest option. That kind of capability announcement makes policymakers nervous in ways that incremental model improvements never do.

Second, the Pentagon sued Anthropic for refusing to remove safety guardrails from models intended for government use. The DoD wanted uncensored access. Anthropic said no. The lawsuit is ongoing, but the message was clear: the military-industrial complex was pushing hard for fewer constraints, and someone had to push back.

According to the BBC, senior members of Trump’s staff met with Amodei last month. Something happened in that meeting that shifted the calculus. Whether it was the Mythos revelation, the DoD pressure, or a combination of both, the result is the same: the White House is now actively engaging with AI safety in a way that was unthinkable a year ago.

Why This Actually Matters

Most AI news is about benchmark scores or parameter counts. This is different because it’s structural. It changes the rules of the game.

If every major AI company has to clear a government safety hurdle before shipping, that’s a gatekeeper. It means the US government now has informal but very real power to delay the deployment of models it considers risky, or to request changes before they ship. It won’t be able to ban models outright (there’s no law yet for that), but the ability to say “we’re still reviewing this” creates a de facto holding pattern.

Microsoft put out its own blog post framing this as a partnership: “testing for national security and large-scale public safety risks necessarily must be a collaborative endeavour with governments.” That’s corporate-speak for “we agree to this because the alternative is worse.” Google’s DeepMind declined to comment. xAI’s parent company, SpaceX, did not respond. Neither silence is particularly encouraging for those who believe this is a voluntary, low-friction process.

What to Watch Next

Three things:

1. Will OpenAI and Anthropic formally join the expanded agreement? They were part of the original Biden-era framework. Formalizing their inclusion under the CAISI umbrella would complete the major-player coverage.

2. How aggressive will CAISI’s reviews get? Forty evaluations so far is a modest number. But as the volume of model submissions increases and capabilities advance, the center’s role could expand from safety-testing to actual capability-limiting.

3. What does this mean for the global AI race? If US companies are the only ones required to pre-clear models, that’s a competitive disadvantage against companies in China, the EU, or anywhere without equivalent requirements. Watch for pushback from the industry on exactly this point.

The Bottom Line

The hands-off era of US AI policy is over, at least for now. Whether this is the start of a sustained regulatory framework or a temporary response to specific incidents (Mythos, the DoD lawsuit) remains to be seen. But one thing is clear: the government now sits at the table, and it’s not just observing anymore.

Quick Quiz

1. Which center at the US Department of Commerce is running the new AI model safety evaluations? Answer: The Center for AI Standards and Innovation (CAISI).

2. What two events involving Anthropic likely triggered this policy shift? Answer: The “Mythos” model announcement (too powerful to release) and the Pentagon lawsuit demanding removal of safety guardrails.

3. What power does CAISI currently have — and what can’t it do? Answer: CAISI can evaluate models and effectively delay their public release, but it cannot outright ban models — there’s no legal authority for that yet.