Claude Opus 4.8 is here, and the headline is restraint: Anthropic shipped its most capable generally available model on May 28, 2026 at exactly the same price as Opus 4.7 — $5 per million input tokens and $25 per million output tokens. The literal model ID is claude-opus-4-8, it builds on Opus 4.7, defaults to a 1M-token context window on the Claude API, Amazon Bedrock and Vertex AI, and carries a January 2026 knowledge cutoff. The genuinely new story sits one layer down, inside Claude Code: dynamic workflows (a research preview that lets Claude orchestrate subagents from a JavaScript script) and ultracode, a session setting nearly every mainstream outlet missed.
What Anthropic Actually Shipped
Three things landed together, and they are easy to conflate. So let us separate them cleanly, because the press cycle did not.
Pillar one is the model. Opus 4.8 is generally available — not a preview, not waitlisted. It builds directly on Opus 4.7, keeps the 1M-token default context window on the Claude API, Bedrock and Vertex AI (Microsoft Foundry stays at 200K), tops out at 128K tokens of output, and uses a January 2026 knowledge and training cutoff. Adaptive thinking is on: the model decides how much to reason per step instead of burning a fixed budget.
Pillar two is dynamic workflows in Claude Code — a research preview that turns a sprawling task into a script-driven swarm of subagents.
Pillar three is ultracode, a single Claude Code setting that quietly bundles maximum reasoning with automatic workflow orchestration. TechCrunch, Reuters and Axios all covered the model and skipped this. The developers who live in Claude Code will care about it more than the benchmark.
The one benchmark worth quoting
Anthropic anchors the launch to a single clean number: 84% on Online-Mind2Web, which the company describes as a meaningful jump over both Opus 4.7 and GPT-5.5. That benchmark measures whether an agent can complete real, open-ended web tasks end to end — exactly the kind of work that breaks brittle automations. Anthropic also says Opus 4.8 is the first model to break the 10% all-pass standard on its internal Legal Agent Benchmark, a qualitative threshold rather than a headline percentage.
One coding number is worth naming precisely because it is circulating widely and remains unconfirmed. Several third-party write-ups — VentureBeat among them — have attached a SWE-bench Verified figure of roughly 88.6% to Opus 4.8. We could not find that score anywhere on Anthropic’s own announcement page or in its primary documentation, and we could not trace it to a verifiable source. So we are flagging it, not endorsing it: the 88.6% may well prove accurate, but until Anthropic publishes it, it is a circulating claim, not a confirmed benchmark — treat it as unverified. The clean, traceable number from the launch is the 84% web-task result on Online-Mind2Web, and that is the one we will hold Anthropic to.
Same Price, Smarter Model — Read That Twice
The pricing table is where the strategy hides in plain sight. A capability bump usually comes with a price bump. This one did not.
Opus 4.8 pricing at a glance
| Tier | Input (per million tokens) | Output (per million tokens) |
|---|---|---|
| Standard | $5 | $25 |
| Fast mode (research preview) | $10 | $50 |
| Batch | $2.50 | $12.50 |
Standard pricing is identical to Opus 4.7. The Fast mode research preview is the real cut: $10 per million input and $50 per million output, down from the $30 and $150 that fast tiers carried on earlier Opus generations — roughly three times cheaper for low-latency work. Batch stays the cheapest lane at $2.50 per million input and $12.50 per million output for jobs that can wait.
Holding the line on price while raising capability is a positioning move, and we will say plainly what we think it signals in the analysis below. For now, the practical takeaway is simple: if your bill was sized for Opus 4.7, it does not move when you switch the opus alias to 4.8 on the Claude API.
Dynamic Workflows: Claude Writes the Orchestration
Here is Anthropic’s own definition, and it is worth reading slowly: “A dynamic workflow is a JavaScript script that orchestrates subagents at scale. Claude writes the script for the task you describe, and a runtime executes it in the background while your session stays responsive.”
That last clause is the unlock. With ordinary subagents, Claude is the orchestrator — it decides turn by turn what to spawn next, and every intermediate result lands back in its context window. A workflow moves that plan into code. The script holds the loop, the branching and the intermediate results, so your conversation only ever sees the final answer. The context stays clean even when hundreds of agents have run underneath.
The limits, exactly as documented
- Up to 16 concurrent agents, fewer on machines with limited CPU cores, to bound local resource use.
- 1,000 agents total per run, a hard ceiling that prevents runaway loops.
- Requires Claude Code v2.1.154 or later. Run
claude updateif you are behind. - Research preview, available on all paid plans, with Anthropic API access, and on Bedrock, Vertex AI and Microsoft Foundry.
You trigger one of two ways. Drop the word workflow anywhere in a prompt — “Run a workflow to audit every API endpoint under src/routes/ for missing auth checks” — and Claude Code highlights the keyword and writes a script instead of grinding through the task turn by turn. Or run a bundled one: /deep-research fans web searches across several angles, fetches and cross-checks sources, votes on each claim, and returns a cited report with the claims that did not survive cross-checking already filtered out. The /workflows command lists running and completed runs so you can watch agent counts, token totals and elapsed time per phase, or pause and resume inside the same session.
The subtler value is not raw scale — it is the repeatable quality pattern. A workflow can have independent agents adversarially review each other’s findings before anything is reported, or draft a plan from several angles and weigh them against each other. That is a different shape of work than “spawn more subagents,” and it is closer to how Anthropic frames its own agentic security push in the Project Glasswing zero-day research.
Ultracode: The Detail the Headlines Missed
This is the part we would build a workflow around ourselves, and it is buried in the model-config docs. Straight from Anthropic: “Ultracode is a Claude Code setting rather than a model effort level: it sends xhigh to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. It applies to the current session only.”
Read carefully, because the distinction matters. Ultracode is not a new API effort level. The effort levels for Opus 4.8 are low, medium, high, xhigh and max — the same set as Opus 4.7. What changed is the default: Opus 4.8 defaults to high, where Opus 4.7 defaulted to xhigh. Ultracode sits on top of that scale as a Claude Code behavior you switch on with /effort ultracode.
With ultracode on, Claude decides when a task warrants a workflow instead of waiting for you to type the word. A single request can fan into several workflows in a row — one to understand the code, one to make the change, one to verify it. It lasts for the current session and resets when you start a new one, and you drop back to routine work with /effort high. Because every substantive task plans a workflow, each request spends more tokens and takes longer. That is the trade: you are paying for a default-on swarm.
Why does this matter more than a benchmark? Because it changes the unit of work in Claude Code from “a turn” to “an orchestrated run” without you having to design the orchestration. For anyone running Claude Code as a daily driver — and we run it on the Max plan every day — ultracode is the line in this release that actually changes the workflow. The press covered the model. The setting is what developers will feel.
What Changed in the API (and What Didn’t)
If you have production code on Opus 4.7, the reassuring headline is: nothing breaks. Anthropic shipped the new capabilities as additive, not as a migration.
- Mid-conversation system messages. You can now place a
systemrole entry after a user turn in the messages array, with no beta header required — useful for injecting fresh instructions mid-session without rebuilding the prompt. - Fast mode research preview. The lower-latency lane at $10 per million input and $50 per million output, three times cheaper than the previous fast tier.
- Lower prompt cache minimum. The cache threshold drops to 1,024 tokens, so smaller repeated prefixes now qualify for caching.
On the Claude API, the opus alias resolves to Opus 4.8. On Bedrock, Vertex and Foundry the aliases still point to older versions until you pin claude-opus-4-8 explicitly — pin your model IDs before rolling out so you control when users move.
Our Analysis: The Frontier Price Floor Is Holding
Here is our read, scoped to what the data shows. The most interesting decision in this release is not technical — it is the unchanged price tag. When a lab raises capability and holds price flat, it is defending a floor, not capturing more value per token. That is a commoditization signal at the frontier, and it lines up with the competitive map: Google’s Gemini 3.5 Flash is pushing cost-efficient agentic performance hard, and OpenAI is iterating fast on its own stack. In that environment, charging more for Opus 4.8 would have handed price-sensitive agentic workloads to the cheaper tier.
So Anthropic is doing something we find strategically coherent: keep the per-token sticker still, cut the fast lane by 3x, and move the actual differentiation up into the harness. Dynamic workflows and ultracode are not API line items you pay extra for — they are reasons to stay inside Claude Code rather than wiring your own orchestration against a raw model endpoint. That is the same playbook we traced in Anthropic’s 2028 doctrine: compete on the system around the model, not just the model.
What would prove us wrong? If Anthropic’s next release breaks the flat-price pattern, or if independent evaluations show Opus 4.8’s coding edge is narrow enough that the cheaper tiers absorb most real workloads, the “price floor” framing weakens. We are also watching whether ultracode’s default-on token burn proves worth it in practice — a swarm that plans a workflow for every substantive task is powerful, but it is not free, and the diminishing-returns warning Anthropic attaches to its own max effort level is a fair caution here too. As Demis Hassabis recently reframed the trajectory, the interesting progress now is less about a single score and more about whether these systems reliably finish real, multi-step work — which is exactly what the 84% web-task number, and the workflow harness around it, are trying to address.
Who Should Care, and What to Do
If you build on the API: switch the opus alias to 4.8 — your cost model does not change — and look at the Fast mode preview for latency-sensitive paths. Adopt mid-conversation system messages where you were previously rebuilding prompts.
If you live in Claude Code: update to v2.1.154+, try a single task with the workflow keyword before flipping ultracode on globally, and keep /effort high as your routine default. Reserve ultracode for genuinely substantive sessions where the extra token spend earns its keep.
If you are comparing vendors: do not let one benchmark decide it. Opus 4.8’s 84% on Online-Mind2Web is a strong agentic-web number, but Gemini 3.5 Flash competes on cost-efficient throughput, a different axis. Pick the model that matches your workload’s real constraint — frontier reasoning or cheap volume — not the bigger headline.
Frequently Asked Questions
What is Claude Opus 4.8?
Claude Opus 4.8 is Anthropic’s most capable generally available model, released on May 28, 2026, with the literal model ID claude-opus-4-8. It builds on Opus 4.7, keeps a 1M-token context window by default on the Claude API, Amazon Bedrock and Vertex AI (200K on Microsoft Foundry), a 128K-token max output, and a January 2026 knowledge cutoff.
How much does Claude Opus 4.8 cost?
Pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. The Fast mode research preview is $10 per million input and $50 per million output — down from the $30 and $150 fast pricing on earlier models. Batch processing is $2.50 per million input and $12.50 per million output.
What are dynamic workflows in Claude Code?
A dynamic workflow is a JavaScript script that orchestrates subagents at scale. Claude writes the script for the task you describe, and a runtime executes it in the background while your session stays responsive. It is a research preview, requires Claude Code v2.1.154 or later, and runs up to 16 concurrent agents and 1,000 agents total per run.
How do you trigger a dynamic workflow?
Include the word “workflow” anywhere in your prompt and Claude Code writes a workflow script instead of working turn by turn. The /workflows command lists running and completed runs, and /deep-research is a bundled workflow that fans out web searches, cross-checks sources and returns a cited report.
What is ultracode in Claude Code?
Ultracode is a Claude Code setting rather than a model effort level. It sends xhigh to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. It applies to the current session only and is set with /effort ultracode. It is not an extra API effort level.
What effort levels does Opus 4.8 support?
In Claude Code, Opus 4.8 and Opus 4.7 support low, medium, high, xhigh and max. The default is high on Opus 4.8, versus xhigh on Opus 4.7. The /effort menu also offers ultracode, which is a session-only setting that combines xhigh with automatic workflow orchestration rather than a separate effort value.
Is Claude Opus 4.8 better than Gemini 3.5 Flash or GPT-5.5?
On Online-Mind2Web, Anthropic reports Opus 4.8 at 84%, which it calls a meaningful jump over both Opus 4.7 and GPT-5.5. That is the one clean numeric benchmark on the primary page. Gemini 3.5 Flash competes on a different axis — cost-efficient speed — so a single score does not settle the comparison; pick based on whether you need frontier reasoning or cheap throughput.
Where can you use Claude Opus 4.8?
Opus 4.8 is available on the Claude API, Amazon Bedrock, Google Cloud Vertex AI and Microsoft Foundry, and it is the Claude Code default on Max, Team Premium and Enterprise pay-as-you-go plans. On the API, the opus alias now resolves to Opus 4.8. Consumer Pro and Max surfaces list a generic “Opus” rather than a version number.
Do I need to change my code that runs on Opus 4.7?
No. Anthropic ships no breaking changes for code running on Opus 4.7. New API capabilities are additive: mid-conversation system messages (a system role after a user turn, with no beta header), the Fast mode research preview, and a lower prompt cache minimum of 1,024 tokens.
Does flat pricing mean frontier models are becoming a commodity?
It is a signal worth watching. Holding $5 per million input and $25 per million output across a capability bump, while cutting fast-mode pricing, suggests Anthropic is defending a price floor against Gemini 3.5 Flash and OpenAI rather than charging more for more capability. The differentiation is moving into the harness — dynamic workflows and ultracode — not the per-token sticker.
Sources: Anthropic’s official announcement, the Claude platform documentation, and the Claude Code workflows and model-config documentation. This is editorial coverage of a public launch — we report what Anthropic shipped and add our analysis; we have not independently benchmarked the model. External references: anthropic.com/news/claude-opus-4-8, platform.claude.com — what’s new in Claude 4.8, code.claude.com — dynamic workflows.



