Opus 4.8 and workflows: multi-agent AI development
Most people use Claude Code the same way: one prompt, one task, wait for it to finish, type the next thing. That works fine until it doesn't. Here's what to do when one agent isn't enough.
The model matters more than you think
Claude Code supports multiple models. You can switch between them mid-conversation with /model. The one you probably want for serious work is Opus 4.8 (claude-opus-4-8), the most capable model in the Claude family right now. Select it with /model opus.
The opus alias always resolves to the latest Opus model. You can also pin it as your default in ~/.claude/settings.json so it sticks across sessions.
Where Opus 4.8 earns its keep is the kind of work where reasoning depth actually matters. A 40-file refactor where the type system needs to stay consistent. Debugging something that spans three services. Architecture decisions where you need the model to hold a lot of context and think carefully about tradeoffs. On Max, Team, and Enterprise plans, Opus 4.8 can run with a 1M-token context window, which is useful when you're working across a large codebase.
For smaller tasks, sure, Sonnet is fine. But I've found that the difference between Opus and Sonnet on hard problems isn't incremental. It's the difference between getting a correct answer and getting something that looks right but falls apart when you test it.
What /fast actually does
There's a common misconception about /fast. People assume it downgrades you to a smaller, cheaper model. It doesn't. Fast mode runs the same Opus weights with faster token output. That's it.
This makes /fast genuinely useful for the boring stretches. You're doing a complex refactor with Opus, and partway through you need to rename a variable in twelve files or add the same import to a bunch of modules. Toggle /fast, blast through the mechanical edits, toggle it off for the next tricky part. Same brain, just faster hands.
Note: /fast is still a research preview, so you need a recent version of Claude Code. If it's not available for you yet, update and try again.
When one agent hits the ceiling
The single-agent model breaks down in a predictable way. You give Claude Code a big task, it works through it serially, and you wait. Maybe it takes twenty minutes. Maybe it misses something because by file thirty-seven it's lost track of what it decided on file four.
You've probably tried working around this by breaking the task into smaller prompts yourself. That helps, but you're still the bottleneck. You wait, read the output, type the next thing. It's fine for a five-step job. For a fifty-step job, you spend more time managing the process than the work itself.
Claude Code has three ways to break out of single-agent work, and they sit on a maturity ladder:
- Subagents (generally available): spawn a helper agent inside your session with the Task tool. It does one focused job and reports back. This is the everyday workhorse and you're probably already using it without realizing.
- Agent Teams (experimental): multiple full Claude Code sessions that coordinate with each other. Enable with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. Heavier, more suited to complex collaboration between peers.
- Workflows (frontier preview): you write the orchestration script. Dozens of agents, fan-out and fan-in, structured phases. This is maximum control.
Workflows: you write the plan
A workflow is a script you write that tells Claude Code how to coordinate multiple agents. You decide what fans out, what runs in parallel, and what synthesizes. The structure is in your code, not left to the model's improvisation.
Here's a minimal example. Say you have three files that changed and you want them reviewed simultaneously, then a summary:
  name: 'review-changes',
  description: 'Review changed files in parallel, then summarize',
  phases: [{ title: 'Review' }, { title: 'Summarize' }],
}

phase('Review')
const files = ['auth.ts', 'api.ts', 'db.ts']
const reviews = await parallel(
  files.map(f => () => agent(`Review ${f} for bugs.`))
)

phase('Summarize')
const summary = await agent(
  `Summarize these reviews:\n${reviews.join('\n')}`
)
return summary
agent(prompt) spawns one subagent and returns its text. parallel([...]) runs many agents at the same time and waits for all of them. phase('...') groups the progress so you can follow what's happening. You can watch it live with /workflows.
The important distinction is between parallel() and pipeline(). With parallel(), all agents launch at once and nothing proceeds until they all finish. With pipeline(), each item flows through stages independently. If you have twenty files that each need a review followed by a verification, pipeline() lets file one's verification start as soon as its review is done, without waiting for the other nineteen reviews.
Pick based on whether you need a barrier or not. Code review with a final summary? Parallel, then synthesize. Independent multi-stage processing? Pipeline.
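To make the barrier-versus-flow difference concrete, here is a minimal sketch in plain TypeScript. Everything in it is a stand-in: fakeAgent, parallelSketch, pipelineSketch and the timings are hypothetical names and numbers made up for illustration, not the Claude Code workflow runtime, and the real pipeline() signature may well differ.

```typescript
// Stand-ins for illustration only. None of this is the Claude Code runtime.
type Task<T> = () => Promise<T>;

// Records the order in which simulated agents finish.
const log: string[] = [];

// Simulates an agent call that takes `ms` milliseconds.
function fakeAgent(label: string, ms: number): Promise<string> {
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(label);
      resolve(label);
    }, ms),
  );
}

// Barrier semantics: start every task, continue only when all have finished.
function parallelSketch<T>(tasks: Task<T>[]): Promise<T[]> {
  return Promise.all(tasks.map((task) => task()));
}

// Per-item semantics: each item walks through the stages on its own,
// without waiting for the other items.
function pipelineSketch<T>(
  items: T[],
  stages: Array<(input: T) => Promise<T>>,
): Promise<T[]> {
  return Promise.all(
    items.map(async (item) => {
      let value = item;
      for (const stage of stages) value = await stage(value);
      return value;
    }),
  );
}

// Hypothetical review durations: a.ts is quick, b.ts is slow.
const reviewMs: Record<string, number> = { "a.ts": 10, "b.ts": 200 };
const files = ["a.ts", "b.ts"];

// Review then verify, with a barrier between the two steps.
async function demoBarrier(): Promise<string[]> {
  log.length = 0;
  await parallelSketch(
    files.map((f) => () => fakeAgent(`review ${f}`, reviewMs[f] ?? 10)),
  );
  await parallelSketch(files.map((f) => () => fakeAgent(`verify ${f}`, 10)));
  return [...log];
}

// Review then verify, with each file flowing through independently.
async function demoPipeline(): Promise<string[]> {
  log.length = 0;
  await pipelineSketch(files, [
    (f) => fakeAgent(`review ${f}`, reviewMs[f] ?? 10).then(() => f),
    (f) => fakeAgent(`verify ${f}`, 10).then(() => f),
  ]);
  return [...log];
}
```

In demoBarrier, the verification of a.ts cannot start until the slow review of b.ts has finished. In demoPipeline, a.ts is reviewed and verified while b.ts is still under review.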
Patterns worth stealing
Once you have multi-agent orchestration, certain patterns open up that a single agent can't pull off well.
Adversarial verification. After your agents produce findings, spawn a second wave of "skeptic" agents whose only job is to refute each finding. Whatever survives the skeptics is probably real. I've used this for security reviews where false positives waste more time than the review itself.
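A rough sketch of the skeptic wave, assuming an agent is just a function from a prompt to text. The Agent type, surviveSkeptics, the STANDS/REFUTED reply convention and the stub skeptic are all hypothetical, invented here to show the shape of the filter rather than any real API.

```typescript
// Hypothetical: `Agent` stands in for whatever call spawns an agent and
// returns its text. Nothing here is the Claude Code runtime.
type Agent = (prompt: string) => Promise<string>;

// Run one skeptic per finding, in parallel, and keep only the findings the
// skeptic explicitly lets stand. Any other reply drops the finding.
async function surviveSkeptics(
  findings: string[],
  skeptic: Agent,
): Promise<string[]> {
  const verdicts = await Promise.all(
    findings.map((finding) =>
      skeptic(`Try to refute this finding. Reply STANDS or REFUTED.\n${finding}`),
    ),
  );
  return findings.filter((_, i) =>
    (verdicts[i] ?? "").trim().toUpperCase().startsWith("STANDS"),
  );
}

// Stub skeptic for the demo: refutes anything that hedges with "maybe".
const stubSkeptic: Agent = async (prompt) =>
  prompt.includes("maybe") ? "REFUTED: no evidence" : "STANDS";

function demoSkeptics(): Promise<string[]> {
  return surviveSkeptics(
    ["SQL injection in login()", "maybe a race in the cache layer"],
    stubSkeptic,
  );
}
```

The filter drops a finding whenever the skeptic's reply is anything other than an explicit STANDS. That bias suits reviews where false positives are the expensive failure; flip it if a missed finding costs you more.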
Judge panel. Generate several independent attempts at the same problem from different angles, then have a final agent score them and synthesize from the best. Good for architecture decisions where you want to explore multiple approaches before committing.
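One possible shape for a judge panel, again treating an agent as a plain prompt-to-text function. The names (Agent, Scored, judgePanel), the 0 to 10 scale and the stubs are assumptions for illustration. This sketch scores each attempt with its own judge call and returns a ranking; the synthesis step would be one more agent call over the top entries.

```typescript
// Hypothetical types: not the Claude Code runtime.
type Agent = (prompt: string) => Promise<string>;

interface Scored {
  angle: string;
  attempt: string;
  score: number;
}

// Generate one attempt per angle in parallel, score each attempt,
// and return them ranked best-first.
async function judgePanel(
  problem: string,
  angles: string[],
  solver: Agent,
  judge: Agent,
): Promise<Scored[]> {
  const attempts = await Promise.all(
    angles.map((angle) =>
      solver(`${problem}\nApproach it from this angle: ${angle}`),
    ),
  );
  const scored = await Promise.all(
    attempts.map(async (attempt, i): Promise<Scored> => {
      const reply = await judge(
        `Score this attempt from 0 to 10. Reply with the number only.\n${attempt}`,
      );
      const score = Number.parseFloat(reply);
      // An unparseable reply counts as the lowest score.
      return {
        angle: angles[i] ?? "",
        attempt,
        score: Number.isNaN(score) ? 0 : score,
      };
    }),
  );
  return scored.sort((a, b) => b.score - a.score);
}

// Stubs for the demo: the solver echoes its angle, the judge uses a fixed table.
const stubSolver: Agent = async (prompt) => prompt.split("angle: ")[1] ?? "";
const stubScores: Record<string, string> = {
  monolith: "6",
  microservices: "4",
  "event-driven": "8.5",
};
const stubJudge: Agent = async (prompt) => {
  const attempt = prompt.split("\n").pop() ?? "";
  return stubScores[attempt] ?? "n/a";
};

function demoJudgePanel(): Promise<Scored[]> {
  return judgePanel(
    "Design the billing system",
    ["monolith", "microservices", "event-driven"],
    stubSolver,
    stubJudge,
  );
}
```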
Loop until dry. Keep sweeping for issues until two consecutive rounds come back empty. A single pass catches the obvious stuff. The second pass catches what the first one missed. The third pass usually comes back clean, and when the fourth does too, you know you're done instead of hoping you're done.
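The stopping rule is easy to get subtly wrong, so here is a small sketch of it. The Sweep type, loopUntilDry and the scripted results are hypothetical; in a real workflow the sweep would be one or more agent calls. The maxRounds cap is an addition of this sketch so a noisy sweep cannot loop forever.

```typescript
// Hypothetical: a sweep is any async function that returns the issues it
// found in a given round.
type Sweep = (round: number) => Promise<string[]>;

interface SweepResult {
  found: string[]; // every distinct issue seen so far
  rounds: number;  // how many sweeps were run
  dry: boolean;    // true if the loop ended on enough empty rounds
}

// Keep sweeping until `dryRoundsNeeded` consecutive rounds find nothing new.
async function loopUntilDry(
  sweep: Sweep,
  dryRoundsNeeded = 2,
  maxRounds = 10,
): Promise<SweepResult> {
  const found: string[] = [];
  let dryStreak = 0;
  let rounds = 0;
  while (dryStreak < dryRoundsNeeded && rounds < maxRounds) {
    rounds += 1;
    // Ignore issues that an earlier round already reported.
    const fresh = (await sweep(rounds)).filter((issue) => !found.includes(issue));
    if (fresh.length === 0) {
      dryStreak += 1;
    } else {
      dryStreak = 0; // any new issue resets the streak
      found.push(...fresh);
    }
  }
  return { found, rounds, dry: dryStreak >= dryRoundsNeeded };
}

// Scripted sweep for the demo: two productive rounds, then nothing.
const scripted: string[][] = [["null check in parse()", "off-by-one in page()"], ["stale cache key"], [], []];
const stubSweep: Sweep = async (round) => scripted[round - 1] ?? [];

function demoLoop(): Promise<SweepResult> {
  return loopUntilDry(stubSweep);
}
```

With the scripted results above, the loop runs four rounds: two that find issues and two empty ones that confirm the well is dry.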
These patterns share a common idea: don't ask one agent and trust the answer. Make the results prove themselves.
When this is overkill
I want to be honest about this: most of the time, you don't need workflows. If you're fixing a bug, adding a feature, or doing a focused refactor, a single Opus session handles it. Subagents cover the next tier up. I reach for workflows maybe once or twice a week, and only for specific shapes of work.
The sweet spot is tasks that are embarrassingly parallel: the same operation applied independently to many targets. Reviewing twenty files. Running the same audit across five services. Translating content into eight languages. If the subtasks don't depend on each other, workflows give you a near-linear speedup, up to whatever concurrency and rate limits you run into, that single-agent work can't match.
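A back-of-the-envelope way to see the speedup, as a sketch. The function name, the three-minutes-per-file figure and the concurrency of five are made-up numbers for illustration, and the model is idealized: every task takes the same time and coordination costs nothing.

```typescript
// Idealized estimate: equal task durations, no coordination overhead.
// Real runs will be slower than this.
function estimateWallClockMinutes(
  tasks: number,
  minutesPerTask: number,
  concurrency: number,
): number {
  if (tasks < 0 || minutesPerTask < 0 || concurrency < 1) {
    throw new RangeError("tasks and minutesPerTask must be >= 0, concurrency >= 1");
  }
  // With equal durations, tasks complete in waves of `concurrency` at a time.
  const waves = Math.ceil(tasks / concurrency);
  return waves * minutesPerTask;
}

// Twenty file reviews at a hypothetical three minutes each.
const serialMinutes = estimateWallClockMinutes(20, 3, 1); // one agent
const fannedMinutes = estimateWallClockMinutes(20, 3, 5); // five at a time
```

Under these assumptions the serial run takes 60 minutes and the fanned-out run takes 12. The gain scales with how many agents can actually run at once, which is why it only shows up when the subtasks are independent.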
For tasks that are deeply sequential, where step N depends on the output of step N-1, workflows add complexity without much benefit. Just use a single Opus session and let it think.
Getting started
If you're already comfortable with Claude Code and subagents, the jump to workflows is mostly about writing that orchestration script. Start small: pick a real task you do regularly, write a workflow for it, and see if the parallel execution actually saves you time. If it does, keep it as a reusable script. If it doesn't, you've lost ten minutes and learned something.
The practical checklist: switch to Opus 4.8 with /model opus. Use /fast for the routine parts. Know when a subagent is enough and when you need a workflow. And build in verification, because the real expert move isn't speed. It's confidence.