Anthropic Releases Claude Opus 4.6: New Frontier in AI Safety and Performance

Anthropic has unveiled Claude Opus 4.6, the latest iteration of its flagship AI model, demonstrating significant performance improvements while maintaining strict safety standards. The new model represents a major step forward in both capability and reliability for enterprise and consumer AI applications.

Performance Breakthroughs

Claude Opus 4.6 achieved the highest BigLaw Bench score of any Claude model at 90.2%, with 40% of test cases receiving perfect scores and 84% scoring above 0.8. The model also leads all other frontier models on Humanity’s Last Exam, a complex multidisciplinary reasoning test, scoring 62.7% when run at high effort levels.

The model excels in agentic coding tasks, achieving the highest score on Terminal-Bench 2.0 and demonstrating state-of-the-art performance on economically valuable knowledge work evaluations spanning finance, legal, and other domains.

New Developer Features

Anthropic introduced several capabilities designed to expand Claude’s utility:

  • Adaptive thinking: The model can now automatically determine when extended reasoning would be beneficial, eliminating the need for a binary on/off setting. Developers can adjust effort levels to balance intelligence, speed, and cost.
  • Compaction: Claude can now summarize its own context to perform longer-running tasks without hitting token limits, enabling more extended workflows.
  • Agent teams: Multiple Claude instances can collaborate on complex tasks.
  • Effort controls: Developers gain granular control over model behavior and resource allocation (see the sketch after this list).
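
To make the effort controls concrete, here is a minimal sketch of what such a call might look like using the Anthropic Python SDK. The announcement does not spell out the exact interface, so the effort field, its value, and the model identifier below are illustrative assumptions rather than confirmed parameter names; the official API documentation is the authoritative reference.

    # Minimal sketch of a request with an explicit effort setting, assuming
    # the Anthropic Python SDK. The "effort" field and the model ID are
    # illustrative assumptions, not confirmed parameter names.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.messages.create(
        model="claude-opus-4-6",  # hypothetical model identifier
        max_tokens=2048,
        # Hypothetical effort control: trade depth of reasoning for speed and cost.
        extra_body={"effort": "medium"},
        messages=[
            {
                "role": "user",
                "content": "Summarize the termination clauses in the attached contract.",
            }
        ],
    )

    print(response.content[0].text)

In this sketch, lowering the effort value would favor speed and cost over depth of reasoning, matching the trade-off described above.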

Security and Vulnerability Detection

In a significant development for cybersecurity, Anthropic’s frontier red team reported that Claude Opus 4.6 identified over 500 previously unknown high-severity security flaws in open-source libraries. Each vulnerability was confirmed by security experts. According to Logan Graham, who leads Anthropic’s frontier red team, “There’s a competition between defenders and attackers, and our goal is to equip defenders with the necessary tools as swiftly as possible.”

Safety Aligned with Power

The intelligence gains do not compromise safety. On Anthropic’s automated behavioral audit, Opus 4.6 showed a low rate of misaligned behaviors including deception, sycophancy, and cooperation with misuse. The model maintains the same safety alignment level as its predecessor, Claude Opus 4.5, while demonstrating the lowest rate of over-refusals of any recent Claude model—meaning it’s less likely to refuse benign queries unnecessarily.

Real-World Usage Patterns

Analysis of Claude usage patterns reveals important trends about how the AI is being deployed. According to Anthropic’s Economic Index report based on November 2025 data, augmented use—where users collaborate iteratively with Claude—has rebounded to 52% of conversations, up from 45% in the previous period. This suggests that as Claude capabilities improve, users are increasingly leveraging the model as a collaborative partner rather than a pure automation tool.

Computer and mathematical tasks, particularly coding-related work, continue to dominate Claude usage, accounting for about a third of Claude.ai conversations and nearly half of first-party API traffic.

Ad-Free Commitment

Anthropic has reaffirmed its commitment to keeping Claude ad-free, arguing that advertising incentives are fundamentally incompatible with building a genuinely helpful AI assistant. The company indicated interest in supporting agentic commerce capabilities where appropriate, but only where users explicitly choose to engage with such features.

Claude Opus 4.6 is available now through Anthropic’s API and Claude.ai platform, representing the company’s latest step toward building more capable, safer, and more economically useful AI systems.

Photo by hakelbudel on Pixabay