Workflow
Workflow is the shape of agent execution: the phases, the permissions per phase, the transitions between them, and the control structure of the loop itself.
This is the scaffolding the agent runs inside. Tools and memory are the verbs and nouns; workflow is the grammar.
The core problem
Agents that jump straight to editing make changes based on incomplete understanding. They edit the wrong files, miss dependencies, and “fix” symptoms rather than causes. The symmetric failure is agents that plan forever and never act — paralyzed by exploration.
The workflow must force a reading phase before a writing phase, a planning phase before an acting phase, and a verification phase before completion. Each phase has different capabilities. Phase transitions are permission boundaries.
Patterns observed in Claude Code
Explore-plan-act loop
Three-phase workflow with increasing write permissions:
| Phase | Tools | Goal |
|---|---|---|
| Explore | Read, Grep, Glob (read-only) | Map the problem. Load context. Understand what exists. |
| Plan | Read + discussion (no writes) | Propose an approach. Get user alignment. Identify risks. |
| Act | Full tool access | Execute the plan. Write code. Run tests. |
The system prompt steers the agent away from editing before understanding. Plan mode is a workflow phase that is ALSO a permission mode — the agent physically cannot Write while in plan.
Why it matters: agents jumping straight to editing miss dependencies. The exploration phase is not optional lazy research — it’s a required pre-condition for any change that’s not a one-liner.
Trade-off: additional turns slow progress for trivial tasks. Claude Code mitigates this with “safe tool allowlist” fast paths — trivial reads bypass the plan gate.
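The phase table above can be sketched as a per-phase tool allowlist. This is a minimal illustration, not Claude Code's implementation: `Phase`, `PHASE_TOOLS`, `toolsFor`, and `canUse` are hypothetical names, and the `Edit`, `Write`, and `Bash` entries stand in for "full tool access".

```typescript
// Hypothetical names throughout; tool names mirror the phase table.
type Phase = "explore" | "plan" | "act";

const READ_ONLY_TOOLS = ["Read", "Grep", "Glob"] as const;
const ALL_TOOLS = [...READ_ONLY_TOOLS, "Edit", "Write", "Bash"] as const;
type ToolName = (typeof ALL_TOOLS)[number];

// Each phase exposes only the tools it is allowed to call.
const PHASE_TOOLS: Record<Phase, readonly ToolName[]> = {
  explore: READ_ONLY_TOOLS, // map the problem, no writes
  plan: ["Read"],           // discussion plus reads, no writes
  act: ALL_TOOLS,           // full access
};

function toolsFor(phase: Phase): readonly ToolName[] {
  return PHASE_TOOLS[phase];
}

function canUse(phase: Phase, tool: ToolName): boolean {
  return toolsFor(phase).indexOf(tool) !== -1;
}
```

Because the model is only ever offered `toolsFor(phase)`, a write during Explore or Plan is not a rule to be followed but a call that does not exist.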
Coordinator restriction
The coordinator agent is banned from execution tools. It only has 4 tools:
- `TeamCreate`
- `TeamDelete`
- `SendMessage`
- `SyntheticOutput`
No Bash, no Read, no Edit. The coordinator can only coordinate — it cannot do the work directly.
Why: if the coordinator has Edit, it will edit directly instead of delegating. LLMs take the shortest path. Removing the capability forces delegation, which creates natural parallelism and specialization.
Enforced by code (the tool list), not only by instruction (the prompt). This is defense-in-depth: the prompt tells the coordinator to delegate, and the tool list prevents it from doing otherwise.
This is the single most important insight in multi-agent workflow design: remove capabilities rather than asking agents not to use them. Asking doesn’t scale. Removing does.
See multi-agent.md for the coordinator pattern in depth.
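A minimal sketch of removing capabilities rather than asking, assuming a simple in-memory tool registry. The `Tool` interface, `coordinatorTools`, and `dispatchAsCoordinator` are hypothetical; only the four coordinator tool names come from the text above.

```typescript
// Hypothetical registry shape; real tools would take structured input.
interface Tool {
  name: string;
  run: (input: string) => string;
}

// The four coordination tools named in this section.
const COORDINATOR_TOOLS: readonly string[] = [
  "TeamCreate",
  "TeamDelete",
  "SendMessage",
  "SyntheticOutput",
];

// The coordinator is only ever handed this filtered list.
function coordinatorTools(registry: Tool[]): Tool[] {
  return registry.filter((t) => COORDINATOR_TOOLS.indexOf(t.name) !== -1);
}

// Dispatch resolves names against the filtered list, so an Edit call
// fails at lookup time regardless of what the prompt says.
function dispatchAsCoordinator(registry: Tool[], name: string, input: string): string {
  const tool = coordinatorTools(registry).find((t) => t.name === name);
  if (tool === undefined) {
    throw new Error(`coordinator has no tool named ${name}`);
  }
  return tool.run(input);
}
```

The check lives in the dispatch path, so it holds even if the model emits a call to a tool it was never offered.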
Workflow phases as permission modes
Claude Code has 7 permission modes:
- default — normal operation, approve risky actions
- plan — read-only, no writes
- auto — low-risk actions auto-approved, high-risk still gated
- bypass — skip approval for approved tools (not dangerous patterns)
- yolo — full autonomy (dangerous)
- readonly — strict read-only
- workspace-write — write only inside workspace
Each mode is a workflow phase wearing a permission hat. The phase decides which tools the agent has; the permission system enforces it.
Modes can be composed with phases:
- Explore = plan mode (phase) + readonly (permission)
- Plan = plan mode + minimal tools
- Act = default mode (phase) + whatever the user approved for this session
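One way to read the mode list is as a decision function from (mode, action) to allow, ask, or deny. The sketch below is an interpretation of the one-line descriptions above, not Claude Code's actual logic: the `Action` fields and the exact rule for each mode are assumptions made for illustration.

```typescript
type Mode =
  | "default" | "plan" | "auto" | "bypass"
  | "yolo" | "readonly" | "workspace-write";
type Decision = "allow" | "ask" | "deny";

// Hypothetical action descriptor; a real harness would classify tool calls itself.
interface Action {
  writes: boolean;          // does the call modify anything?
  highRisk: boolean;        // matches a dangerous pattern
  insideWorkspace: boolean; // target path is inside the workspace
  toolApproved: boolean;    // the user pre-approved this tool
}

function decide(mode: Mode, a: Action): Decision {
  switch (mode) {
    case "plan":
    case "readonly":
      return a.writes ? "deny" : "allow";
    case "workspace-write":
      return !a.writes || a.insideWorkspace ? "allow" : "deny";
    case "auto":
      return a.highRisk ? "ask" : "allow";
    case "bypass":
      return a.toolApproved && !a.highRisk ? "allow" : "ask";
    case "yolo":
      return "allow";
    case "default":
      // Assumption: "risky" means any write or any dangerous pattern.
      return a.writes || a.highRisk ? "ask" : "allow";
  }
}
```

Keeping the decision in one pure function makes the phase-to-permission mapping testable without running an agent.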
Main query loop
The agent’s entire control flow lives in an `async function*` running `while (true)`. Each iteration has 4 phases:
- Context Assembly: build the system prompt (static + dynamic boundary + dynamic sections), assemble conversation history, apply compaction if needed
- Stream API Call: send to the model, stream tokens back
- Tool Execution: for each `tool_use` block, run the tool (possibly in parallel), inject results as user messages
- Stop or Continue: derived flag. If any `tool_use` blocks exist → continue; else → stop
See patterns.md for the async generator mechanics. The key workflow insight is: the loop is the workflow. Everything else hangs off it.
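The four phases of an iteration can be sketched as a small async generator. This is a simplified illustration under stated assumptions, not the actual loop: `Block`, `Message`, `Model`, `ToolRunner`, and `queryLoop` are hypothetical, the model call returns whole responses instead of streaming, and system-prompt assembly and compaction are omitted.

```typescript
// All types here are hypothetical stand-ins, not a real SDK's API.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: string };
type ToolUse = Extract<Block, { type: "tool_use" }>;
interface Message { role: "user" | "assistant"; content: Block[] }
type Model = (history: Message[]) => Promise<Block[]>;
type ToolRunner = (name: string, input: string) => Promise<string>;

async function* queryLoop(
  prompt: string,
  callModel: Model,
  runTool: ToolRunner,
): AsyncGenerator<Message> {
  // 1. Context assembly (reduced to the conversation history).
  const history: Message[] = [
    { role: "user", content: [{ type: "text", text: prompt }] },
  ];
  while (true) {
    // 2. Model call (a real loop would stream tokens back).
    const assistant: Message = { role: "assistant", content: await callModel(history) };
    history.push(assistant);
    yield assistant;

    // 4. Stop or continue, derived from the content itself.
    const calls = assistant.content.filter(
      (b): b is ToolUse => b.type === "tool_use",
    );
    if (calls.length === 0) return;

    // 3. Tool execution in parallel; results go back as a user message.
    const outputs = await Promise.all(calls.map((c) => runTool(c.name, c.input)));
    const results: Message = {
      role: "user",
      content: outputs.map((text) => ({ type: "text" as const, text })),
    };
    history.push(results);
    yield results;
  }
}
```

A caller consumes the loop with `for await (const message of queryLoop(...))`, which is what lets a UI render each turn as it is produced.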
Derived continuation flag
The agent does NOT trust `stop_reason === 'tool_use'` from the API. Quote from the leaked code:
> `stop_reason === 'tool_use'` is unreliable - it’s not always set correctly.
Instead, it observes actual content: if any tool_use blocks exist in the assistant’s response, needsFollowUp = true.
Lesson for workflow design: derive control flow from content, not metadata. Metadata can lie; content cannot. This generalizes: any control signal should be derivable from observable state, not from upstream flags.
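A minimal sketch of the derived flag. The `AssistantResponse` shape is a simplified assumption modeled on a response with a `stop_reason` field and a list of typed content blocks; `needsFollowUp` is the name used in the text above.

```typescript
// Simplified, assumed response shape.
interface ContentBlock { type: string }
interface AssistantResponse {
  stop_reason: string | null;
  content: ContentBlock[];
}

// Continue if and only if the response actually contains a tool_use block.
// response.stop_reason is deliberately never read.
function needsFollowUp(response: AssistantResponse): boolean {
  return response.content.some((block) => block.type === "tool_use");
}
```

The function gives the same answer whether `stop_reason` is correct, wrong, or missing, which is the point of deriving the flag from content.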
Workflow composition
Claude Code’s workflows are built by composing primitive phases:
| Phase | Enforcement | Example |
|---|---|---|
| Explore | Tool subset (Read/Grep/Glob only) | /plan mode |
| Plan | Tool subset + prompt (“discuss before acting”) | /plan output |
| Act | Default tools | Default mode |
| Verify | Dedicated subagent (“Verification agent”) | After act phase |
| Review | Tool subset + prompt (“find problems”) | /review skill |
| Ship | Linear pipeline (test → bump → commit → push) | /ship skill |
Skills in this repo compose phases:
- `decompose` = explore + plan (for feature specs)
- `plan-*-review` = plan + verify
- `qa` = act + verify loop
- `ship` = verify → act → observe → act (pipeline)
- `review` = explore + verify (no action)
- `investigate` = explore (iron law: no act without root cause)
The composition lives in the skill’s SKILL.md; the enforcement lives in the tool list + permission mode + prompt.
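The skill compositions listed above can be written down as plain data, which makes questions like "can this skill ever write?" mechanical. This is an illustrative encoding only: `SKILL_PHASES` and `mayAct` are hypothetical, and the repo's skills are not defined this way.

```typescript
type Phase = "explore" | "plan" | "act" | "verify" | "observe";

// Phase sequences copied from the skill list in this section.
const SKILL_PHASES: Record<string, Phase[]> = {
  decompose: ["explore", "plan"],
  "plan-*-review": ["plan", "verify"],
  qa: ["act", "verify"],
  ship: ["verify", "act", "observe", "act"],
  review: ["explore", "verify"],
  investigate: ["explore"],
};

// A skill needs write-capable tools only if it contains an act phase.
function mayAct(skill: string): boolean {
  const phases = SKILL_PHASES[skill] || [];
  return phases.indexOf("act") !== -1;
}
```

With the composition as data, a harness could pick the tool list for a skill from its phases instead of trusting the skill's prompt.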
Anti-patterns
- Single-phase workflows. “Just do the thing.” No planning, no verification, no phase transitions. Works for one-line tasks, fails for everything else.
- Phase restriction by prompt only. The prompt says “don’t edit yet” but the agent has `Edit` in its tool list. It edits anyway. LLMs take the shortest path.
- Permanent exploration phase. Plan mode with no exit criterion. Agent explores forever, never commits.
- No verify phase. Agent declares “done” without running tests, checking lint, or reading the diff. This is “verification avoidance”; see `multi-agent.md` on the Verification agent.
- Coordinator with execution tools. Coordinator edits code directly, bypassing its workers. Parallelism collapses to sequential.
- Metadata-driven control flow. Using `stop_reason` or turn counts to decide continuation. Fragile: the API changes, the metadata drifts.
- No main loop. Agent runs one turn and exits. Can’t handle follow-ups, can’t recover from errors, can’t iterate.
Takeaways for harness engineering
- Separate phases by permission, not by prompt. Remove capabilities, don’t ask the agent nicely.
- Explore-plan-act is the default scaffold. Deviations need justification, not the other way around.
- Derive control flow from content. Never trust upstream metadata to drive the loop.
- One main loop, many phases. The loop is the workflow. Phases are state transitions inside it.
- Coordinator ≠ worker. If you have multi-agent, the coordinator has no execution tools. This is the single highest-leverage rule in multi-agent design.
- Phase transitions are logged. You need to know which phase was active when a thing went wrong. Observability lives at the phase boundary.
- Fast-path trivial tasks. Safe-tool allowlist bypasses the plan gate. Otherwise plan-mode becomes user-hostile for one-line questions.
- Every phase has an exit criterion. “Plan mode ends when the user approves the plan or types `/act`.” Without exit criteria, phases become black holes.
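Two of the takeaways above, logged transitions and explicit exit criteria, fit naturally into a small state machine. The sketch below is one possible shape under assumptions: the `State` fields, the `EXIT` table, and the specific criteria are hypothetical examples, not rules taken from Claude Code.

```typescript
type Phase = "explore" | "plan" | "act" | "verify" | "done";

// Hypothetical observable state the exit criteria are derived from.
interface State {
  phase: Phase;
  filesRead: number;
  planApproved: boolean;
  testsPassed: boolean;
}
interface Transition { from: Phase; to: Phase; at: number }

// One exit criterion per phase: the next phase once met, or null to stay.
const EXIT: Record<Phase, (s: State) => Phase | null> = {
  explore: (s) => (s.filesRead > 0 ? "plan" : null),
  plan: (s) => (s.planApproved ? "act" : null),
  act: () => "verify", // acting always hands off to verification
  verify: (s) => (s.testsPassed ? "done" : "act"),
  done: () => null,
};

// Advance at most one phase and record the transition at the boundary.
function step(s: State, log: Transition[]): State {
  const next = EXIT[s.phase](s);
  if (next === null) return s;
  log.push({ from: s.phase, to: next, at: Date.now() });
  return { ...s, phase: next };
}
```

Because every phase must have an entry in `EXIT`, the type checker rejects a phase that was added without an exit criterion.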
What this repo does
- `rules/primary-workflow.md`: auto-loaded rule describing the plan → implement → test → review pipeline. Every session starts with this in context.
- `skills/plan-ceo-review/`, `skills/plan-eng-review/`, `skills/plan-design-review/`: plan-phase skills. Each is a structured review at a different layer (scope, architecture, design).
- `skills/investigate/`: enforces the iron law (no fixes without root cause). An explore-phase skill that proactively claims bug reports.
- `skills/verify/`: a verify phase that runs before any “done” claim. Pre-completion checklist.
- `skills/ship/`: a linear pipeline phase (test → version → changelog → PR). Zero discretion once started.
- `hooks/stop-verify.cjs`: enforces the verify phase at the harness level, not just the skill level. Stop hook checks lint and intent-vs-diff before letting the agent finish.
- `hooks/guard-task.cjs`: forces approval before spawning subagents. Prevents accidental multi-agent fan-out from within a single-phase workflow.
- `hooks/loop-detection.cjs`: detects when the agent is stuck iterating on the same file (doom loop) and injects a nudge to step back. A workflow-level intervention when the agent loses its phase discipline.
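The doom-loop check described for `hooks/loop-detection.cjs` can be illustrated with a run-length test over recent edits. This is not the hook's actual code, which is not shown here: `detectDoomLoop`, its threshold, and the nudge wording are hypothetical.

```typescript
// Hypothetical sketch: returns a nudge message when the last `threshold`
// (or more) edits all touched the same file, otherwise null.
function detectDoomLoop(editedFiles: string[], threshold: number = 5): string | null {
  if (editedFiles.length === 0) return null;
  const last = editedFiles[editedFiles.length - 1];

  // Count how many consecutive trailing edits hit the same file.
  let run = 0;
  for (let i = editedFiles.length - 1; i >= 0 && editedFiles[i] === last; i--) {
    run++;
  }

  return run >= threshold
    ? `You have edited ${last} ${run} times in a row. Step back and re-check the root cause.`
    : null;
}
```

Counting only the trailing run means the nudge clears itself as soon as the agent touches a different file.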
Open problems
- Phase transitions are implicit in this repo. The `investigate → verify → ship` chain isn’t formally modeled; it emerges from skill invocations. A formal phase state machine would let hooks observe transitions and enforce invariants.
- No fast-path allowlist for trivial tasks. The full rule-reminder pipeline runs on every prompt, even for “what time is it?”. Wasted context.
- No exit criteria for plan mode. Agent can loop in plan mode indefinitely. Claude Code handles this at the UI level; this repo doesn’t model it.
- Coordinator pattern is not implemented in this repo. If the user moves to multi-agent work, a coordinator skill with restricted tools would be the next addition.