Anthropic has officially launched Claude Opus 4.6, its most capable large language model yet, now readily available on claude.ai, via the Claude API, and through major cloud providers. Identified as claude-opus-4-6, this release marks a significant focus on advanced long-context reasoning, sophisticated agentic coding, and high-impact knowledge work applications, aiming to redefine AI's role in complex enterprise environments.
Agentic Intelligence for Complex Tasks
Opus 4.6 is meticulously designed for multi-step tasks that demand planning, execution, and iterative refinement, moving beyond single-answer responses. Internally, Anthropic reports its use in Claude Code highlights an improved ability to tackle challenging problems, demonstrating better judgment and sustained productivity. The model prioritizes deeper deliberation before formulating a response. To offer developers flexibility, Anthropic has introduced an '/effort' parameter with four levels—low, medium, high (default), and max—allowing a direct trade-off between reasoning depth, speed, and cost. Beyond coding, Opus 4.6 excels in practical knowledge work, including financial analyses, research, and the creation or modification of documents. Within 'Cowork,' Anthropic’s autonomous work surface, the model can orchestrate multi-step workflows across various artifacts without continuous human prompting.
Unprecedented Context and Fine-Grained Controls
A pivotal feature of Opus 4.6 is the beta release of its 1 million token context window, a first for an Opus-class model. This substantial capacity supports up to 128,000 output tokens, suitable for generating extensive reports or complex multi-file edits. Prompts exceeding 200,000 tokens in this mode are subject to adjusted pricing. To streamline the management of long-running agentic processes, Anthropic has implemented several new platform features:
- Adaptive Thinking: The model intelligently decides when to employ extended reasoning.
- Effort Controls: Four discrete levels balance latency against reasoning quality.
- Context Compaction (Beta): Automatic summarization of older conversation parts.
- US-only Inference: For strict US regional processing, with slight token pricing adjustment.
These controls are engineered to address the complexities of agentic workflows that accumulate vast token counts through interactions with tools, documents, and code.
Seamless Product Integrations for Enterprise
Anthropic has bolstered its product ecosystem, enabling Opus 4.6 to drive more realistic workflows for engineers and analysts. Claude Code's new 'agent teams' mode (research preview) facilitates multiple agents working in parallel for tasks like comprehensive codebase reviews. Claude in Excel now incorporates pre-planning capabilities, infers data structure, and performs multi-step transformations. Paired with Claude in PowerPoint (research preview), it converts raw data into structured, on-brand slide decks, intelligently interpreting layouts and master slides.
Leading Benchmark Performance
Opus 4.6 establishes new performance standards across several external benchmarks crucial for coding agents, search agents, and professional decision support. It significantly outperforms competitors on GDPval-AA for economically valuable knowledge work, Terminal-Bench 2.0 for agentic coding, and Humanity’s Last Exam for multidisciplinary reasoning. A major advancement in long-context retrieval is evident in its 76% score on the 1 million token variant of MRCR v2—a 'needle-in-a-haystack' benchmark—representing a substantial improvement over previous models and a qualitative shift in effective context utilization. Further performance gains have been observed in root cause analysis for software failures, multilingual coding, and specific life sciences applications.
This article is a rewritten summary based on publicly available reporting. For the original story, visit the source.
Source: MarkTechPost