Anthropic Abandons Core Safety Pledge as Pentagon Ultimatum Looms

The company founded to build safe AI has quietly dropped its promise to halt development if risks outpace safeguards. The timing - two days before a Pentagon deadline - raises uncomfortable questions.

Anthropic was founded on one central promise: we will slow down if the risks get too high.

That promise is now gone.

On Wednesday, the company quietly rewrote its Responsible Scaling Policy, removing the core pledge that had defined it since 2023 - the commitment not to train AI systems unless safety measures were first proven adequate.

The timing is hard to ignore. Tomorrow, Friday, is the deadline Defense Secretary Pete Hegseth gave Anthropic: allow unrestricted military use of Claude or face the consequences.

What Changed

The original 2023 Responsible Scaling Policy contained a clear line: Anthropic would halt development if capabilities outpaced its ability to guarantee safety. This wasn’t a suggestion. It was a hard limit.

The new policy eliminates that limit. Anthropic will now pause development only if two conditions hold: it believes it has a “significant lead” over competitors, and leadership determines the risks are catastrophic enough to warrant stopping.

In practice, this means Anthropic may never stop. There’s always a competitor. There’s always pressure.

Chief Science Officer Jared Kaplan was blunt about the reasoning: “We felt that it wouldn’t actually help anyone for us to stop training AI models.”

He added: “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments… if competitors are blazing ahead.”

The Pentagon Connection

Anthropic insists the timing is coincidental. The policy update was “always planned” and unrelated to the Pentagon standoff.

But consider the sequence:

February 11: Safety researcher Mrinank Sharma resigns from Anthropic, warning that the team “[constantly faces] pressures to set aside what matters most.” He said the “world is in peril.”

February 24: Defense Secretary Hegseth meets with Dario Amodei and delivers an ultimatum: drop restrictions on military use by 5pm Friday, or face contract cancellation, supply chain blacklisting, and possible Defense Production Act invocation.

February 25: Anthropic publishes its revised Responsible Scaling Policy, removing hard safety limits.

February 27: The deadline.

Anthropic denies any connection. But the company that built its reputation on principled resistance softened its principles just two days before a showdown with the Pentagon.

What Anthropic Still Claims to Refuse

Two red lines reportedly remain:

  1. Mass surveillance of Americans - Claude cannot be used for large-scale domestic monitoring
  2. Fully autonomous weapons - A human must remain in the loop for lethal force decisions

Anthropic has offered carveouts for missile defense and cyber defense. An Anthropic spokesperson told NBC News: “Every iteration of our proposed contract language would enable our models to support missile defense and similar uses.”

But the Pentagon wants more. It wants “all lawful purposes” with no company oversight. Pentagon officials were reportedly frustrated by the idea that they might need to “reach out and check with Anthropic” during an active crisis.

The Industry Response

Nvidia CEO Jensen Huang, whose company has pledged to invest up to $10 billion in Anthropic, offered a cold assessment on Wednesday:

“I hope that they can work it out, but if it doesn’t get worked out, it’s also not the end of the world. Anthropic is not the only AI company in the world.”

He added that the Pentagon “has the right to use the technology and use the products that they procure in a way that serves their interests.”

When one of your largest investors publicly acknowledges that you might be replaceable, the pressure to comply intensifies.

The Broader Pattern

OpenAI removed its military restrictions in 2024. Google cooperates fully with the Pentagon. xAI operates across all classification levels.

Anthropic was the last major AI company maintaining hard limits on government use. Now it, too, has softened them.

Chris Painter, policy director at METR, acknowledged the logic but raised concerns about what he called “frog-boiling effects.” Without hard binary thresholds, incremental compromises become easier to justify.

“Society is not prepared for the potential catastrophic risks posed by AI,” Painter said.

The Real Test

The new Responsible Scaling Policy trades binding commitments for transparency. Anthropic promises to publish “Frontier Safety Roadmaps” detailing its safety goals across security, alignment, safeguards, and policy.

But roadmaps are not speed limits. Publishing intentions is not the same as honoring constraints.

The company that left OpenAI specifically because it wanted harder safety commitments has now adopted a framework that sounds remarkably like what it criticized: aspirational, flexible, and ultimately nonbinding.

What Happens Tomorrow

The Pentagon’s deadline arrives at 5pm Friday.

Anthropic still publicly maintains its red lines on autonomous weapons and mass surveillance. But the company has already demonstrated that its core commitments can change when the pressure is sufficient.

If Anthropic holds firm tomorrow, it faces contract cancellation, supply chain blacklisting, and possible invocation of the Defense Production Act - a Cold War-era law that could compel compliance.

If Anthropic bends, it survives as a company but dies as the safety-first alternative it claimed to be.

Either outcome answers the same question: Was AI safety ever a principle, or just a positioning strategy that worked until it didn’t?

The Bottom Line

Anthropic dropped its core safety pledge two days before a Pentagon deadline. The company claims the timing is coincidental. Its critics see a pattern: principles that hold until they become expensive.