[PREP] Key Comparisons — With Real Examples
Table of Contents
- Key Comparisons — With Real Examples
- 1. Skills vs CLAUDE.md
- 2. Skills vs Subagents
- 3. context: fork vs Subagent (Task tool)
- 4. Hooks vs Prompt Instructions
- 5. Plan Mode vs Direct Execution
- 6. Self-review vs Independent Review
- 7. Error: Access Failure vs Valid Empty Result
- 8. Batch API vs Real-time API
- 9. Planning (Plan Mode) vs Thinking (Extended Thinking)
- Quick Reference Card
Concrete examples from this repo (cc-distribution) to visualize the distinctions that appear on the certification exam.
1. Skills vs CLAUDE.md
Core difference: CLAUDE.md is loaded in every conversation (an always-on context cost). Skills load their full content only when invoked (on demand, near-zero cost until triggered).
When to use CLAUDE.md
Short facts, conventions, constraints that apply to ALL work in the project.
Real example — rules/git-safety.md (6 lines, loaded every conversation):
# Git Safety
- NEVER use: git add, git commit, git push, git reset --hard
- When user asks to commit: just write the commit message
- Only run read-only git commands
This is a 6-line guardrail. It MUST be loaded every time because any conversation could involve git. The cost is tiny (6 lines) and the risk of not loading it (destructive git commands) is high.
When to use a Skill
Multi-step procedures, checklists, reference material that only matter for specific tasks.
Real example — skills/verify/SKILL.md (78 lines, loaded only on /verify):
# /verify — Pre-Completion Verification
## Steps
### 1. Detect Project Lint Command (check package.json, pyproject.toml, Makefile...)
### 2. Run Lint
### 3. Diff Intent Check (git diff, review for unrelated changes)
### 4. Report (structured output with PASS/FAIL)
### 5. If NOT READY, fix and re-run
This is a 78-line procedure. Loading it in every conversation wastes ~78 lines of context on turns that have nothing to do with verification. As a skill, it costs almost nothing (only its name and description sit in context) until you type /verify.
Decision rule
| Question | CLAUDE.md | Skill |
|---|---|---|
| Is it a fact or a constraint? | Yes | |
| Is it a multi-step procedure? | | Yes |
| Must it apply to every conversation? | Yes | |
| Is it longer than ~20 lines? | Probably not | Yes |
| Could it be a slash command? | | Yes |
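The rule of thumb in this section can be sketched as a small function. This is only an illustration: `choose_location` and its parameters are hypothetical names, and the 20-line threshold is the rough figure used here, not a hard limit.

```python
def choose_location(applies_to_every_conversation: bool,
                    is_multi_step_procedure: bool,
                    line_count: int) -> str:
    """Suggest where an instruction should live (hypothetical helper)."""
    # Always-on guardrails belong in CLAUDE.md even if that costs a few lines.
    if applies_to_every_conversation and not is_multi_step_procedure:
        return "CLAUDE.md"
    # Procedures and long reference material only pay off on demand.
    if is_multi_step_procedure or line_count > 20:
        return "skill"
    return "CLAUDE.md"


print(choose_location(True, False, 6))    # git-safety style guardrail
print(choose_location(False, True, 78))   # /verify style procedure
```

The first call mirrors the 6-line git guardrail, the second the 78-line /verify procedure.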
2. Skills vs Subagents
Core difference: A skill adds instructions to Claude’s current context. A subagent is a separate Claude instance with its own isolated context.
Skill (inline — shares your conversation context)
When you invoke /verify, the SKILL.md content is injected into your current conversation. Claude sees your full history + the skill instructions, then acts.
[Your conversation: 50 messages about building a feature]
+ [/verify SKILL.md injected: 78 lines]
= Claude runs verification WITH knowledge of what you've been doing
Pros: Claude knows what you were working on and can reference earlier decisions.
Cons: Takes space in your context window, and verbose skill output stays in context.
Subagent (forked — isolated context)
When you invoke a skill with context: fork, or when Claude spawns a Task, a NEW Claude instance starts with only the instructions you give it. It has no memory of your conversation.
[Your conversation: 50 messages]
--> spawns subagent with: "Review the diff on branch feature-x for SQL safety"
--> subagent has ONLY that prompt + CLAUDE.md. No conversation history.
--> subagent returns a summary. Only the summary enters your context.
Pros: Verbose output stays isolated, so the main context stays clean.
Cons: The subagent doesn’t know what you discussed, so you must pass explicit context.
Real example — when the wrong choice costs you
Bad: Running /qa (QA testing) inline without context: fork. QA generates dozens of screenshots, test logs, assertion results. All of it floods your main context. After QA finishes, your context window is 80% test output and you can’t continue coding without /compact.
Good: Running /qa with context: fork. QA runs in isolation, generates all that output in its own context, then returns a clean summary: “Found 3 bugs: X, Y, Z. Health score: 7/10.” Your main context stays lean.
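To make the cost difference concrete, here is a rough back-of-the-envelope model. The function name and all the numbers are made up for illustration; real context usage is measured in tokens and depends on the actual output.

```python
def main_context_lines(history: int, skill: int, output: int,
                       summary: int, fork: bool) -> int:
    """Lines that end up in the main conversation after the skill runs."""
    if fork:
        # Forked: the skill text and its verbose output stay in the subagent;
        # only the returned summary enters the main context.
        return history + summary
    # Inline: the skill text and everything it prints stay in the conversation.
    return history + skill + output


# Illustrative numbers only.
inline = main_context_lines(history=500, skill=78, output=4000, summary=5, fork=False)
forked = main_context_lines(history=500, skill=78, output=4000, summary=5, fork=True)
print(inline, forked)  # 4578 505
```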
Decision rule
| Situation | Use Skill (inline) | Use Subagent (fork) |
|---|---|---|
| Needs conversation history | Yes | |
| Generates verbose output | | Yes |
| Quick reference/checklist | Yes | |
| Independent investigation | | Yes |
| Modifies files you’re discussing | Yes | |
| Read-only analysis | | Yes |
3. context: fork vs Subagent (Task tool)
Both create isolated contexts. The difference is who defines the task and how.
Skill with context: fork
The SKILL.md IS the task. You pre-wrote the instructions. The user just types /skill-name.
Example — a hypothetical forked research skill:
---
name: deep-research
context: fork
agent: Explore
---
Research $ARGUMENTS thoroughly:
1. Find relevant files using Glob and Grep
2. Read and analyze the code
3. Summarize findings with file references
Flow: User types /deep-research auth system -> subagent receives the SKILL.md content as its prompt -> runs in Explore mode (read-only tools) -> returns summary.
Task tool (Claude-initiated)
Claude dynamically decides to spawn a subagent and writes the prompt on the fly.
Example — Claude decides to delegate:
User: "Refactor the auth module and update all the tests"
Claude thinks: "This is complex. I'll do the refactor myself but delegate test updates to a subagent."
Claude calls Task tool: {
  prompt: "Update all test files in tests/auth/ to match the new AuthService interface.
           The method `authenticate(user, pass)` was renamed to `verifyCredentials(credentials)`.
           The return type changed from boolean to AuthResult.",
  description: "Update auth tests"
}
Flow: Claude writes a custom prompt based on the current conversation -> subagent runs with that prompt -> returns results to Claude.
Decision rule
| Question | context: fork skill | Task tool |
|---|---|---|
| Is the task repeatable and pre-defined? | Yes | |
| Does the task depend on conversation context? | | Yes |
| Do you want a slash command for it? | Yes | |
| Is Claude dynamically decomposing work? | | Yes |
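Either way, the subagent starts without the conversation history, so whoever writes the prompt has to pack the needed facts into it. A minimal sketch of that packing step follows; `build_subagent_task` is a hypothetical helper, and the `prompt` and `description` keys simply mirror the example call in this section.

```python
def build_subagent_task(description: str, goal: str, facts: list[str]) -> dict:
    """Assemble a self-contained task payload (hypothetical helper)."""
    lines = [goal, "", "Context you need (you cannot see the main conversation):"]
    lines += [f"- {fact}" for fact in facts]
    return {"description": description, "prompt": "\n".join(lines)}


task = build_subagent_task(
    description="Update auth tests",
    goal="Update all test files in tests/auth/ to match the new AuthService interface.",
    facts=[
        "authenticate(user, pass) was renamed to verifyCredentials(credentials).",
        "The return type changed from boolean to AuthResult.",
    ],
)
print(task["prompt"])
```

Listing the facts explicitly makes it easy to spot what the subagent would otherwise have to guess.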
4. Hooks vs Prompt Instructions
Core difference: Hooks are deterministic (code runs, guaranteed outcome). Prompts are probabilistic (Claude usually follows them, but has a non-zero failure rate).
Prompt instruction (probabilistic)
Real example — rules/git-safety.md:
- NEVER use: git add, git commit, git push
Claude reads this and usually complies. But under pressure (complex multi-step task, user says “just ship it”), Claude might still run git push. The failure rate is low but non-zero.
Hook (deterministic)
Real example — skills/careful/SKILL.md with PreToolUse hook:
hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: "bash ${CLAUDE_SKILL_DIR}/bin/check-careful.sh"
The hook script checks every Bash command for destructive patterns (rm -rf, DROP TABLE, git push --force). If it matches, the hook returns permissionDecision: "deny" and the command is blocked at the code level. Claude cannot bypass this — it’s not a suggestion, it’s enforcement.
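The repo's script is a shell script that is not reproduced here. As a rough Python sketch of the same idea, the function below assumes the hook receives a JSON payload with the command under `tool_input.command` and answers with a `hookSpecificOutput` object; the pattern list and the name `check_command` are illustrative, not the actual contents of check-careful.sh.

```python
import json
import re

# Patterns named in the text; a real script would likely cover more cases.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-rf\b"),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    re.compile(r"\bgit\s+push\b.*--force"),
]


def check_command(hook_input_json: str) -> str:
    """Return the hook's stdout for one PreToolUse payload ('' means no opinion)."""
    payload = json.loads(hook_input_json)
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in DESTRUCTIVE:
        if pattern.search(command):
            return json.dumps({
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"Blocked destructive pattern: {pattern.pattern}",
                }
            })
    return ""


sample = json.dumps({"tool_name": "Bash", "tool_input": {"command": "rm -rf build/"}})
print(check_command(sample))
```

In a real hook the payload would be read from standard input and the returned string printed to standard output; the function form is used here so the logic can be exercised offline.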
Another real example — skills/freeze/SKILL.md
hooks:
  PreToolUse:
    - matcher: "Edit"
      hooks:
        - type: command
          command: "bash ${CLAUDE_SKILL_DIR}/bin/check-freeze.sh"
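When /freeze is active, the hook checks each matched edit operation against the freeze boundary. If the file is outside the allowed directory, the edit is blocked. Note that the matcher shown here names only Edit; covering Write operations as well would need a matcher that includes that tool, such as "Edit|Write". Claude doesn’t need to “remember” the constraint; the hook enforces it mechanically.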
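The core of a boundary check like this is a path comparison. The sketch below is an assumption about how such a check could work, not the contents of check-freeze.sh; `is_inside_boundary` is a hypothetical name. It resolves both paths first so that `..` segments cannot escape the boundary, and it compares path components, not string prefixes.

```python
from pathlib import Path


def is_inside_boundary(file_path: str, boundary: str) -> bool:
    """True if file_path resolves to a location under boundary."""
    target = Path(file_path).resolve()
    root = Path(boundary).resolve()
    try:
        target.relative_to(root)  # raises ValueError when target is outside root
        return True
    except ValueError:
        return False


print(is_inside_boundary("/repo/src/app.py", "/repo/src"))          # inside
print(is_inside_boundary("/repo/src/../secrets.env", "/repo/src"))  # escapes via ..
```

A plain `startswith` test would wrongly accept a sibling such as `/repo/src-old/x.py`, which is why the comparison is done on resolved path components.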
When to use which
| Scenario | Prompt | Hook |
|---|---|---|
| “Prefer tabs over spaces” | Yes | |
| “Never delete production data” | | Yes |
| “Use descriptive variable names” | Yes | |
| “Block git push without review” | | Yes |
| Style preferences | Yes | |
| Security/compliance gates | | Yes |
Exam rule: when the question says “reliability is critical” or “guaranteed compliance” -> the answer is hooks (programmatic enforcement), never prompts.
5. Plan Mode vs Direct Execution
Direct execution
Single-file, well-scoped change. No ambiguity about what to do.
Example: “Add a timeout field to the hook config schema.”
-> One file, one change, clear spec. Just do it.
Plan mode
Complex, multi-file, multiple valid approaches. Need to think before acting.
Example: “Restructure the lessons/ directory into course subfolders with study notes.” -> Touches 15+ files, needs to decide on folder structure, requires moving existing files, creating new ones, updating cross-references. Plan first, then execute.
Decision rule
| Signal | Direct | Plan mode |
|---|---|---|
| Single file change | Yes | |
| Multiple files, unclear scope | | Yes |
| Clear spec, no ambiguity | Yes | |
| Multiple valid approaches | | Yes |
| “Add X to Y” | Yes | |
| “Refactor/migrate/restructure” | | Yes |
6. Self-review vs Independent Review
Self-review (weak)
Same Claude session reviews its own code. It retains all the reasoning context from when it wrote the code — the same assumptions, the same blind spots.
Session A: writes auth code
Session A: reviews auth code <-- biased, retains reasoning context
Like proofreading your own essay — you read what you meant to write, not what’s actually there.
Independent review (strong)
A separate Claude instance (subagent or new session) reviews the code. It has no prior context about why decisions were made, so it evaluates the code on its own merits.
Session A: writes auth code
Session B: reviews auth code <-- no prior context, unbiased
Real example from this repo: The /review skill runs as a separate review process that analyzes the diff. It doesn’t know why you made certain choices; it just reads the code and finds issues. This is by design.
Exam rule: when the question mentions “effective review” or “catching subtle issues” -> independent instance, always.
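In API terms, the difference comes down to what goes into the reviewer's message list. The sketch below uses plain role/content dictionaries; `review_messages` is a hypothetical helper and the sample history is invented for illustration.

```python
def review_messages(diff: str, author_history=None) -> list:
    """Build the message list for a review request (hypothetical helper).

    Passing author_history reproduces self-review: the reviewer inherits the
    author's reasoning. Leaving it out gives the reviewer only the diff.
    """
    request = {"role": "user", "content": f"Review this diff for bugs:\n{diff}"}
    return list(author_history or []) + [request]


history = [
    {"role": "user", "content": "Write the auth code"},
    {"role": "assistant", "content": "Done. Rate limiting is handled elsewhere."},
]
self_review = review_messages("+ def login(): ...", author_history=history)
independent = review_messages("+ def login(): ...")
print(len(self_review), len(independent))  # 3 1
```

The independent reviewer never sees the claim that rate limiting is handled elsewhere, so it has to verify that from the code itself.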
7. Error: Access Failure vs Valid Empty Result
This trips people up on the exam. Both look like “no results” but require different handling.
Access failure
The database is down. The API timed out. The file doesn’t exist. The query couldn’t run.
{"isError": true, "errorCategory": "transient", "isRetryable": true,
"message": "Connection to flights API timed out after 30s"}
Agent should: retry, try alternative source, report the failure.
Valid empty result
The query ran successfully. There are simply no matches.
{"isError": false, "results": [],
"message": "No flights found between A and B on that date"}
Agent should: report to user that no results exist, suggest alternatives.
Why this matters
If you return a generic "Operation failed" for both, the agent can’t decide whether to retry (transient) or move on (valid empty). Structured error responses (errorCategory + isRetryable) give the agent the information to make the right recovery decision.
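A sketch of how an agent loop might branch on these fields. The field names come from the two example payloads; `next_action`, the action labels, and the three-attempt limit are assumptions made for illustration.

```python
def next_action(result: dict, attempts: int, max_attempts: int = 3) -> str:
    """Pick a recovery step from a structured tool result."""
    if not result.get("isError", False):
        # The call succeeded; an empty list is an answer, not a failure.
        return "report_results" if result.get("results") else "report_no_matches"
    if result.get("isRetryable") and attempts < max_attempts:
        return "retry"
    return "report_failure"


timeout = {"isError": True, "errorCategory": "transient", "isRetryable": True,
           "message": "Connection to flights API timed out after 30s"}
empty = {"isError": False, "results": [],
         "message": "No flights found between A and B on that date"}
print(next_action(timeout, attempts=1))  # retry
print(next_action(empty, attempts=1))    # report_no_matches
```

With a generic "Operation failed" string, neither branch could be chosen reliably.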
8. Batch API vs Real-time API
Real-time (Messages API)
Synchronous, immediate response. Use for interactive workflows.
response = client.messages.create(...) # blocks until done
Use for: pre-merge checks, user-facing features, anything time-sensitive.
Batch (Message Batches API)
Asynchronous, 50% cheaper, up to 24-hour processing window. No latency guarantee.
batch = client.messages.batches.create(requests=[...]) # returns immediately
# check status later, results within 24 hours
Use for: overnight reports, weekly audits, bulk document processing.
Exam rule: if the question mentions “blocking workflow” or “pre-merge” -> never batch. If it mentions “cost savings” + “latency-tolerant” -> batch.
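For the batch side, most of the work is building the request list. The sketch below assumes the Python SDK's `client.messages.batches.create` and the `custom_id`/`params` request shape; the helper names, the summarization prompt, and the placeholder model ID are illustrative.

```python
def build_batch_requests(documents: dict, model: str, max_tokens: int = 1024) -> list:
    """Turn {custom_id: text} into the request list a batch submission expects."""
    return [
        {
            "custom_id": custom_id,  # used later to match results to inputs
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": f"Summarize:\n{text}"}],
            },
        }
        for custom_id, text in documents.items()
    ]


def submit_batch(client, batch_requests: list):
    """Submit without waiting; `client` is assumed to be an anthropic.Anthropic instance."""
    return client.messages.batches.create(requests=batch_requests)


batch_requests = build_batch_requests(
    {"doc-1": "first report", "doc-2": "second report"},
    model="your-model-id",  # placeholder, substitute a real model ID
)
print([r["custom_id"] for r in batch_requests])  # ['doc-1', 'doc-2']
```

`submit_batch` is not called here because it needs credentials and network access; results are fetched later, once the batch has finished processing.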
9. Planning (Plan Mode) vs Thinking (Extended Thinking)
Both are “think before you act” mechanisms, but they operate at completely different levels.
Planning (Plan Mode in Claude Code)
Claude Code’s plan mode is a workflow phase — Claude outlines the steps it will take, you review and approve, then it executes. This is a conversation-level feature that restricts Claude to read-only tools during planning.
Real example — user asks to restructure lessons/:
User: /plan Restructure the lessons/ directory into course subfolders
Claude (plan mode):
Phase 1: Create directory structure
- lessons/cert-prep/
- lessons/courses/{agent-skills,claude-api,mcp,claude-code-in-action}/
Phase 2: Move existing exam files to cert-prep/
Phase 3: Write study-notes.md for each course
Phase 4: Update README.md with new structure
User: Looks good, proceed.
Claude (execution mode): [starts making changes]
Key properties:
- Happens in the conversation between you and Claude
- Claude proposes steps, you approve or redirect
- Read-only during planning (can’t edit files)
- Execute after approval
- Helps with complex, multi-file, ambiguous tasks
- The plan is visible to you — you can catch wrong approaches early
Thinking (Extended Thinking in the API)
Extended thinking is a model-level feature — Claude reasons internally before producing output. You don’t see the thinking in Claude Code (it happens inside the model), but you can enable and budget for it via the API.
Real example — API call with extended thinking:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Analyze this complex algorithm..."}]
)
# response.content may include thinking blocks + text blocks
Key properties:
- Happens inside the model’s reasoning process
- You don’t control or review the steps
- Costs extra tokens (budget_tokens)
- Improves quality on complex reasoning tasks (math, logic, code analysis)
- Not a workflow — no approval step
- In Claude Code: triggered by “ultrathink” keyword in skill content or /think command
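Two practical details are worth sketching: the thinking budget has to fit inside `max_tokens`, and the response mixes thinking blocks with text blocks. The helpers below are hypothetical, and the stand-in objects only imitate the `type`, `thinking`, and `text` attributes that SDK content blocks are assumed to expose.

```python
from types import SimpleNamespace


def thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Build the `thinking` parameter, checking the budget fits under max_tokens."""
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


def split_blocks(content) -> tuple:
    """Separate reasoning from the final answer in response.content."""
    thinking = [b.thinking for b in content if b.type == "thinking"]
    text = [b.text for b in content if b.type == "text"]
    return thinking, text


# Stand-in objects shaped like SDK content blocks, so the sketch runs offline.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Compare the recurrence to merge sort."),
    SimpleNamespace(type="text", text="The algorithm is O(n log n)."),
]
print(thinking_config(10000, 16000))
print(split_blocks(fake_content))
```

Blocks of any other type are ignored by `split_blocks`, so the sketch does not break if the response contains kinds of content it does not know about.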
When to use which
Planning is for you — it lets you review and approve an approach before Claude acts. Use it when the approach matters more than the individual reasoning steps.
Thinking is for Claude — it gives the model more internal reasoning capacity. Use it when the task requires deeper analysis, not necessarily more steps.
| Scenario | Planning | Thinking |
|---|---|---|
| “Refactor auth module across 12 files” | Yes — review approach first | Maybe — complex but approach matters more |
| “Prove this algorithm is O(n log n)” | No — one-shot reasoning | Yes — needs deep analysis |
| “Migrate database schema” | Yes — many valid approaches, review needed | No — straightforward once approach chosen |
| “Find the subtle bug in this race condition” | No — let Claude investigate | Yes — needs careful reasoning |
| “Design the API for a new feature” | Yes — you want to approve the design | Yes — benefits from deeper thought |
| “Explain why this regex fails on edge case X” | No | Yes — reasoning-heavy |
Can you combine them?
Yes. In Claude Code, plan mode + thinking mode gives you the best of both:
- Claude thinks deeply about each planning step (better plan quality)
- You review the plan before execution (control)
- Claude thinks deeply during execution (better implementation)
In the API, you can enable extended thinking and then implement a plan-execute loop with tool use. The planning is your application logic; the thinking is the model’s reasoning.
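A minimal sketch of such a loop, with the approval gate as ordinary application code. All names are hypothetical, and stub callables stand in for the model and tool calls so the example runs offline.

```python
from typing import Callable, List


def plan_then_execute(make_plan: Callable[[str], List[str]],
                      approve: Callable[[List[str]], bool],
                      run_step: Callable[[str], str],
                      task: str) -> List[str]:
    """Plan first, execute only after approval (application-level control)."""
    plan = make_plan(task)   # in practice: a model call, possibly with thinking enabled
    if not approve(plan):    # a human (or a policy) reviews the plan
        return []            # nothing runs without approval
    return [run_step(step) for step in plan]


# Stub callables stand in for model and tool calls.
results = plan_then_execute(
    make_plan=lambda task: ["create folders", "move files", "update README"],
    approve=lambda plan: len(plan) <= 5,
    run_step=lambda step: f"done: {step}",
    task="Restructure lessons/",
)
print(results)
```

The thinking budget would be configured inside `make_plan` and `run_step`; the approval step stays outside the model entirely.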
The exam gotcha
The exam may ask: “For a complex multi-step task, should you use plan mode or extended thinking?”
The answer is usually plan mode — because the question is about workflow control, not reasoning depth. Extended thinking helps Claude think better; plan mode helps you validate the approach. For “complex multi-step” work, the risk is wrong approach, not wrong reasoning.
But if the question mentions “mathematical proof”, “subtle bug”, “logical analysis”, or “complex reasoning” — the answer is extended thinking.
Quick Reference Card
| Comparison | Left | Right | Key signal |
|---|---|---|---|
| CLAUDE.md vs Skill | Always loaded, short facts | On-demand, long procedures | Length + frequency of need |
| Skill (inline) vs Subagent (fork) | Shares context | Isolated context | Does it need conversation history? |
| context: fork vs Task tool | Pre-defined task | Dynamic delegation | Repeatable vs situational |
| Hook vs Prompt | Deterministic, guaranteed | Probabilistic, may fail | Is compliance critical? |
| Plan mode vs Direct | Think first | Act now | Scope and ambiguity |
| Planning vs Thinking | Workflow control (you review) | Reasoning depth (model thinks) | Approach risk vs reasoning difficulty |
| Self-review vs Independent | Same session, biased | New instance, unbiased | Always prefer independent |
| Access failure vs Empty result | Error, retry | Success, no matches | Need structured error info |
| Batch vs Real-time | Cheap, slow, async | Full price, immediate | Is latency acceptable? |