Anthropic Escalates AI Safety to ASL-3 as Claude Opus 4 Launches
The new safeguards target CBRN risks—but stop short of doomsday protocols
Anthropic just activated its most stringent safety measures yet for Claude Opus 4, implementing AI Safety Level 3 (ASL-3) protocols designed to prevent the model from being misused to develop chemical, biological, radiological, and nuclear (CBRN) weapons. The move marks a first for the company: previous models, including Claude 3.7 Sonnet, operated under ASL-2. It also reflects growing industry unease about advanced AI's dual-use potential.
“We can’t rule out ASL-3 risks with Opus 4’s improved CBRN capabilities,” an Anthropic spokesperson told WIRED, “but we believe ASL-4 would be premature.”
The safeguards focus narrowly on blocking workflows that could aid CBRN weaponization, not individual queries about hazardous materials. More than 100 new controls are in place, including egress bandwidth restrictions that slow any attempt to exfiltrate model weights and two-party authorization for sensitive operations, measures aimed at thwarting sophisticated non-state actors. Early tests suggest minimal disruption to legitimate research.
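Anthropic hasn't published implementation details, but the two-party authorization pattern it cites is a standard security control. The sketch below is a minimal, hypothetical illustration (the class and operator names are invented, not Anthropic's code): a sensitive operation is allowed to proceed only after sign-off from two distinct operators.

```python
from dataclasses import dataclass, field


@dataclass
class SensitiveAction:
    """A request that requires sign-off from two distinct operators."""
    description: str
    approvals: set[str] = field(default_factory=set)

    def approve(self, operator_id: str) -> None:
        # Record which operator signed off; a set deduplicates repeat approvals.
        self.approvals.add(operator_id)

    def authorized(self) -> bool:
        # Two-party rule: at least two *different* operators must approve.
        return len(self.approvals) >= 2


action = SensitiveAction("export model checkpoint")
action.approve("alice")
action.approve("alice")          # the same operator approving twice doesn't count
assert not action.authorized()
action.approve("bob")            # a second, distinct operator completes the rule
assert action.authorized()
```

The point of the control is that no single insider or compromised account can trigger the operation alone; real deployments layer this on top of audit logging and access restrictions.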
The provisional ASL-3 deployment follows Anthropic's internal assessment that Opus 4's CBRN abilities might cross a critical capability threshold. If further evaluation shows the model falls short of that threshold, some restrictions could be rolled back. For now, the company emphasizes iterative collaboration: "We're working with government and civil society to refine these safeguards," its statement noted.
The decision highlights the AI sector's tightening self-regulation as model capabilities escalate. Rather than withhold the model outright, Anthropic is choosing caution over abstinence, betting that targeted constraints can mitigate risks without stifling progress. Whether that balance holds may define the next era of frontier AI deployment.