The AI Governance Paradox: Why Companies Are Building Their Own Arms Control While Waiting for Rules

AI companies are building internal arms control infrastructure on their own, with no external accountability and no legal requirement to do so. Anthropic and OpenAI have posted job listings for policy managers focused on preventing their language models from assisting with chemical weapons synthesis, biological threats, and explosives design. The people in these roles function, in effect, as verification officers, a responsibility traditionally held by international bodies and defense ministries.

Why Are AI Companies Hiring Weapons Experts?

The job listings signal a critical gap in AI governance. The Organisation for the Prohibition of Chemical Weapons (OPCW), which oversees the Chemical Weapons Convention across 193 member states, has documented how AI-enabled tools are accelerating chemical research. Molecular modeling and AI-assisted synthesis planning now allow researchers to identify chemical pathways faster and with less expertise than previously required.

The OPCW released a report in March 2026 highlighting these risks and calling for AI companies to engage with international bodies. However, the organization has no mechanism to compel private companies in San Francisco to submit their model evaluations for external review, disclose red-teaming results, or coordinate before deploying new model versions. That gap between international expertise and international authority is where the current governance crisis lives.

"We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments if competitors are blazing ahead," stated Jared Kaplan, chief science officer at Anthropic.

Anthropic's February 2026 update to its Responsible Scaling Policy (RSP 3.0) made this logic explicit. The company removed a binding commitment to pause development if adequate safety measures could not be demonstrated before crossing capability thresholds. In its place are voluntary public goals that are explicitly non-binding. The reasoning was candid: unilateral safety commitments don't work if competitors aren't making equivalent ones.

What's Missing From the Current Governance Framework?

The voluntary safety infrastructure that exists today has significant blind spots. Frontier AI safety work has concentrated heavily on pandemic-scale biological risks, with labs publishing evaluations of whether their models could enable the creation of a pathogen capable of mass casualties. Chemical weapons, improvised explosive attacks, and radiological devices receive substantially less systematic attention, despite being considerably more accessible to a motivated actor.

The hiring of dedicated chemical weapons policy managers at two major labs suggests the companies have registered this concern. What remains unclear is whether the evaluations those managers conduct will ever be visible to anyone outside the companies. Most risk-management practices at frontier labs remain voluntary, with only a handful of jurisdictions beginning to formalize limited requirements.

  • Algorithmic Accountability: The Trump administration's National Policy Framework for Artificial Intelligence contains no requirements for AI systems to be audited for discriminatory outputs, yet it threatens to preempt state laws like those in New York and Colorado that mandate bias assessments for AI systems used in hiring, lending, and benefit eligibility.
  • Data Privacy Beyond Children: While the framework includes guidelines for federal regulation to protect children's data, it is silent on privacy protections for adults, leaving it at odds with state laws such as the California Consumer Privacy Act.
  • Transparency and Explainability: The framework says nothing about ensuring transparency around public agencies' use of AI in consequential decisions such as tax assessments and permit approvals, which experts describe as fundamental to due process.
  • Enforcement Mechanisms: The framework contains no compliance deadlines and directs no federal agency to take specific actions, making it more aspirational than operational.
  • Weapons-Domain Evaluations: No existing law requires mandatory disclosure of model evaluations related to chemical, biological, or radiological threats to external bodies with real authority.

How Can Organizations Navigate the Governance Gap?

For companies and policymakers trying to move beyond this voluntary-only approach, several practical steps can help bridge the gap between internal safeguards and external accountability. The Centre for the Governance of AI at Oxford drew a logical conclusion from Anthropic's own reasoning: if the core problem is collective action, companies should push for stronger regulation.

  • Establish Clear Ownership: Organizations deploying AI at scale need to assign specific owners for specific outcomes before deployment, not after problems appear. When accountability is distributed across IT, data science, operations, legal, and business leadership without explicit ownership at each decision point, the result is organizational paralysis.
  • Redesign Workflows Around AI: The roughly 6% of organizations classified as AI high performers (those generating more than 5% of earnings before interest and taxes from AI) are nearly three times more likely than typical organizations to have redesigned their workflows around AI. AI should change how decisions are made, who makes them, and what performance looks like across a system.
  • Demand External Verification: Organizations should advocate for mandatory disclosure of model evaluations to external bodies with real authority, similar to how international arms control regimes operate. This requires legislative action at both federal and state levels to create enforceable standards.
  • Monitor Legislative Developments: AI legislative tracking software using natural language processing can help organizations stay ahead of emerging regulations across multiple jurisdictions, reducing time spent on routine monitoring and freeing capacity for strategic advocacy. A minimal sketch of the kind of keyword filter these tools build on follows this list.
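
To make that last point concrete, the sketch below (in Python) shows the sort of keyword-based filter that legislative tracking tools build on. It is a minimal illustration that assumes bill summaries have already been pulled from each jurisdiction's public feed; the Bill structure, the watch-term list, and the sample data are assumptions for illustration, not any particular vendor's API.

    # Minimal sketch of keyword-based legislative monitoring.
    # The Bill structure, watch-term list, and sample data are illustrative
    # assumptions, not any particular vendor's API.
    import re
    from dataclasses import dataclass

    @dataclass
    class Bill:
        jurisdiction: str   # e.g. "CA", "NY", "EU"
        identifier: str     # e.g. "SB 53"
        summary: str        # plain-text bill summary or digest

    # Terms that tend to mark frontier-AI or weapons-domain obligations.
    WATCH_TERMS = [
        r"frontier model", r"foundation model", r"risk framework",
        r"model evaluation", r"red[- ]team", r"chemical|biological|radiological",
        r"bias assessment", r"automated decision",
    ]
    PATTERN = re.compile("|".join(WATCH_TERMS), re.IGNORECASE)

    def flag_relevant_bills(bills: list[Bill]) -> list[tuple[Bill, list[str]]]:
        """Return bills whose summaries mention any watched term,
        plus the specific terms that matched."""
        flagged = []
        for bill in bills:
            hits = sorted({m.group(0).lower() for m in PATTERN.finditer(bill.summary)})
            if hits:
                flagged.append((bill, hits))
        return flagged

    if __name__ == "__main__":
        sample = [
            Bill("CA", "SB 53", "Requires frontier model developers to publish a risk framework."),
            Bill("NY", "A-1234", "Relates to highway maintenance funding."),
        ]
        for bill, hits in flag_relevant_bills(sample):
            print(f"{bill.jurisdiction} {bill.identifier}: matched {hits}")

In practice, such tools layer real data feeds, deduplication, and classification models on top of a filter like this, but the basic task is the same: pattern matching over a large and constantly changing corpus of legal text.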

The architecture for external oversight isn't difficult to sketch. It would require mandatory disclosure of model evaluations related to chemical, biological, and radiological threats to international bodies with technical expertise and enforcement authority. California's SB 53 and elements of the EU AI Act now require frontier developers to publish risk frameworks, and New York's RAISE Act will add similar obligations when it takes effect in 2027. However, none specifically addresses the weapons-domain evaluations that Anthropic and OpenAI are now staffing internally.
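
What such disclosure might look like in practice is easy to picture. The sketch below shows one hypothetical shape for a disclosure record, along with a regulator-side completeness check; the EvaluationDisclosure fields and the coverage rule are assumptions for illustration only, since no such reporting standard exists today.

    # Purely illustrative sketch of what a weapons-domain evaluation
    # disclosure record might contain if disclosure were mandatory.
    # The EvaluationDisclosure fields and the coverage check are assumptions;
    # no such reporting standard exists today.
    from dataclasses import dataclass, field
    from datetime import date

    @dataclass
    class EvaluationDisclosure:
        developer: str                      # e.g. "Anthropic", "OpenAI"
        model_version: str                  # identifier of the evaluated model
        threat_domain: str                  # "chemical", "biological", or "radiological"
        evaluation_method: str              # e.g. "expert red-teaming", "automated benchmark"
        capability_threshold_crossed: bool  # did the model cross a defined risk threshold?
        mitigations_deployed: list[str] = field(default_factory=list)
        reviewing_body: str = ""            # external body with authority to verify the result
        submitted_on: date = field(default_factory=date.today)

    def release_is_covered(records: list[EvaluationDisclosure]) -> bool:
        """A regulator-side check: a frontier release is covered only if every
        threat domain has at least one externally reviewed disclosure."""
        required = {"chemical", "biological", "radiological"}
        reviewed = {r.threat_domain for r in records if r.reviewing_body}
        return required <= reviewed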

The fundamental problem is that the companies best positioned to design workable AI governance frameworks are the ones that understand the technology most deeply. They are also the companies with the strongest competitive incentive to avoid any framework that constrains them more than their rivals. They can diagnose the problem clearly. Acting on the diagnosis is harder.

Until external rules apply to everyone, the voluntary safety infrastructure built by frontier AI labs will remain fragile. The hiring of weapons experts is a sign that companies recognize the risks. Whether those internal safeguards survive the competitive pressures the companies themselves have identified remains an open question.