[PREP] Key Comparisons — With Real Examples

Concrete examples from this repo (cc-distribution) to visualize the distinctions that appear on the certification exam.

1. Skills vs CLAUDE.md

Core difference: CLAUDE.md is loaded into every conversation (always-on context cost). A skill's full content loads only when the skill is invoked (on-demand, near-zero cost until triggered).

When to use CLAUDE.md

Short facts, conventions, constraints that apply to ALL work in the project.

Real example — rules/git-safety.md (6 lines, loaded every conversation):

# Git Safety
- NEVER use: git add, git commit, git push, git reset --hard
- When user asks to commit: just write the commit message
- Only run read-only git commands

This is a 6-line guardrail. It MUST be loaded every time because any conversation could involve git. The cost is tiny (6 lines) and the risk of not loading it (destructive git commands) is high.

When to use a Skill

Multi-step procedures, checklists, reference material that only matter for specific tasks.

Real example — skills/verify/SKILL.md (78 lines, loaded only on /verify):

# /verify — Pre-Completion Verification
## Steps
### 1. Detect Project Lint Command (check package.json, pyproject.toml, Makefile...)
### 2. Run Lint
### 3. Diff Intent Check (git diff, review for unrelated changes)
### 4. Report (structured output with PASS/FAIL)
### 5. If NOT READY, fix and re-run

This is a 78-line procedure. Loading it in every conversation would waste ~78 lines of context on turns that have nothing to do with verification. As a skill, it costs almost nothing (only its short description stays in context) until you type /verify.
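To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It counts lines as a crude proxy for tokens, and the workload (100 conversations, /verify invoked in 5 of them) is invented for illustration.

```python
def always_on_cost(lines: int, conversations: int) -> int:
    """Lines of context spent when content is loaded in every conversation."""
    return lines * conversations


def on_demand_cost(lines: int, invocations: int) -> int:
    """Lines of context spent when content loads only when invoked."""
    return lines * invocations


# Hypothetical workload, not measured from the repo.
CONVERSATIONS = 100
VERIFY_INVOCATIONS = 5

guardrail_always = always_on_cost(6, CONVERSATIONS)        # git-safety rule
verify_always = always_on_cost(78, CONVERSATIONS)          # /verify in CLAUDE.md
verify_as_skill = on_demand_cost(78, VERIFY_INVOCATIONS)   # /verify as a skill

print(guardrail_always, verify_always, verify_as_skill)  # 600 7800 390
```

Under these assumptions the 6-line guardrail stays cheap even when always loaded, while the 78-line procedure is 20 times cheaper as a skill.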

Decision rule

Question	CLAUDE.md	Skill
Is it a fact or a constraint?	Yes
Is it a multi-step procedure?		Yes
Must it apply to every conversation?	Yes
Is it longer than ~20 lines?	Probably not	Yes
Could it be a slash command?		Yes

2. Skills vs Subagents

Core difference: A skill adds instructions to Claude’s current context. A subagent is a separate Claude instance with its own isolated context.

Skill (inline — shares your conversation context)

When you invoke /verify, the SKILL.md content is injected into your current conversation. Claude sees your full history + the skill instructions, then acts.

[Your conversation: 50 messages about building a feature]
  + [/verify SKILL.md injected: 78 lines]
  = Claude runs verification WITH knowledge of what you've been doing

Pros: Claude knows what you were working on. Can reference earlier decisions.

Cons: Takes space in your context window. Verbose skill output stays in context.

Subagent (forked — isolated context)

When you invoke a skill with context: fork, or when Claude spawns a Task, a NEW Claude instance starts with only the instructions you give it. It has no memory of your conversation.

[Your conversation: 50 messages]
  --> spawns subagent with: "Review the diff on branch feature-x for SQL safety"
  --> subagent has ONLY that prompt + CLAUDE.md. No conversation history.
  --> subagent returns a summary. Only the summary enters your context.

Pros: Verbose output stays isolated. Main context stays clean.

Cons: Subagent doesn’t know what you discussed. You must pass explicit context.

Real example — when the wrong choice costs you

Bad: Running /qa (QA testing) inline without context: fork. QA generates dozens of screenshots, test logs, assertion results. All of it floods your main context. After QA finishes, your context window is 80% test output and you can’t continue coding without /compact.

Good: Running /qa with context: fork. QA runs in isolation, generates all that output in its own context, then returns a clean summary: “Found 3 bugs: X, Y, Z. Health score: 7/10.” Your main context stays lean.
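The difference between the two runs can be sketched with a toy model of context accounting. The `Context` class and every token count below are hypothetical, chosen only to show the shape of the trade-off; this is not how Claude Code tracks context internally.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window that only counts the tokens added to it."""
    name: str
    entries: list = field(default_factory=list)

    def add(self, label: str, tokens: int) -> None:
        self.entries.append((label, tokens))

    @property
    def total(self) -> int:
        return sum(tokens for _, tokens in self.entries)


def run_inline(main: Context, skill_tokens: int, output_tokens: int) -> None:
    """Inline skill: instructions and all output land in the main context."""
    main.add("skill instructions", skill_tokens)
    main.add("skill output", output_tokens)


def run_forked(main: Context, skill_tokens: int, output_tokens: int,
               summary_tokens: int) -> Context:
    """Forked skill: work happens in a child context; only the summary returns."""
    child = Context("subagent")
    child.add("skill instructions", skill_tokens)
    child.add("skill output", output_tokens)
    main.add("subagent summary", summary_tokens)
    return child


# Made-up sizes for a QA run on top of an existing conversation.
inline_main = Context("main")
inline_main.add("conversation", 20_000)
run_inline(inline_main, skill_tokens=1_000, output_tokens=80_000)

forked_main = Context("main")
forked_main.add("conversation", 20_000)
child = run_forked(forked_main, skill_tokens=1_000, output_tokens=80_000,
                   summary_tokens=200)

print(inline_main.total)  # 101000
print(forked_main.total)  # 20200
print(child.total)        # 81000
```

The same work is done in both cases. What changes is where the 80,000 tokens of output end up.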

Decision rule

Situation	Use Skill (inline)	Use Subagent (fork)
Needs conversation history	Yes
Generates verbose output		Yes
Quick reference/checklist	Yes
Independent investigation		Yes
Modifies files you’re discussing	Yes
Read-only analysis		Yes

3. context: fork vs Subagent (Task tool)

Both create isolated contexts. The difference is who defines the task and how.

Skill with `context: fork`

The SKILL.md IS the task. You pre-wrote the instructions. The user just types /skill-name.

Example — a hypothetical forked research skill:

---
name: deep-research
context: fork
agent: Explore
---
Research $ARGUMENTS thoroughly:
1. Find relevant files using Glob and Grep
2. Read and analyze the code
3. Summarize findings with file references

Flow: User types /deep-research auth system -> subagent receives the SKILL.md content as its prompt, with $ARGUMENTS replaced by "auth system" -> runs as the Explore agent (read-only tools) -> returns summary.

Task tool (Claude-initiated)

Claude dynamically decides to spawn a subagent and writes the prompt on the fly.

Example — Claude decides to delegate:

User: "Refactor the auth module and update all the tests"

Claude thinks: "This is complex. I'll do the refactor myself but delegate test updates to a subagent."

Claude calls Task tool: {
  prompt: "Update all test files in tests/auth/ to match the new AuthService interface.
           The method `authenticate(user, pass)` was renamed to `verifyCredentials(credentials)`.
           The return type changed from boolean to AuthResult.",
  description: "Update auth tests"
}

Flow: Claude writes a custom prompt based on the current conversation -> subagent runs with that prompt -> returns results to Claude.

Decision rule

Question	context: fork skill	Task tool
Is the task repeatable and pre-defined?	Yes
Does the task depend on conversation context?		Yes
Do you want a slash command for it?	Yes
Is Claude dynamically decomposing work?		Yes

4. Hooks vs Prompt Instructions

Core difference: Hooks are deterministic (code runs, guaranteed outcome). Prompts are probabilistic (Claude usually follows them, but compliance has a non-zero failure rate).

Prompt instruction (probabilistic)

Real example — rules/git-safety.md:

- NEVER use: git add, git commit, git push

Claude reads this and usually complies. But under pressure (complex multi-step task, user says “just ship it”), Claude might still run git push. The failure rate is low but non-zero.

Hook (deterministic)

Real example — skills/careful/SKILL.md with PreToolUse hook:

hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: "bash ${CLAUDE_SKILL_DIR}/bin/check-careful.sh"

The hook script checks every Bash command for destructive patterns (rm -rf, DROP TABLE, git push --force). If a command matches, the hook returns permissionDecision: "deny" and the command is blocked before it runs. Claude cannot bypass this: it’s not a suggestion, it’s enforcement.
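The repo's check-careful.sh is a shell script and is not reproduced here. As a rough Python sketch of the same idea, the code below assumes the hook receives a JSON payload with `tool_name` and `tool_input.command` on stdin and blocks by printing a `hookSpecificOutput` object. The pattern list and the function names `decide` and `handle` are illustrative, so check the field names against the hooks reference before relying on them.

```python
import json
import re
from typing import Optional

# Illustrative patterns only; a real deny-list would need to be broader.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(?:rf|fr)\b",
    r"\bdrop\s+table\b",
    r"\bgit\s+push\b.*--force",
]


def decide(payload: dict) -> Optional[dict]:
    """Return a deny decision for destructive Bash commands, else None."""
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"Blocked destructive command: {command}",
                }
            }
    return None


def handle(raw_stdin: str) -> str:
    """Map the hook's stdin JSON to the text the hook should print on stdout."""
    decision = decide(json.loads(raw_stdin))
    return json.dumps(decision) if decision is not None else ""


# Sample payload standing in for what the hook would read from stdin.
sample = json.dumps({"tool_name": "Bash",
                     "tool_input": {"command": "git push --force origin main"}})
print(handle(sample))
```

A real hook script would end with something like `print(handle(sys.stdin.read()))`. Keeping the decision logic in a pure function makes the deny-list easy to unit test.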

Another real example — `skills/freeze/SKILL.md`

hooks:
  PreToolUse:
    - matcher: "Edit"
      hooks:
        - type: command
          command: "bash ${CLAUDE_SKILL_DIR}/bin/check-freeze.sh"

When /freeze is active, the hook checks each file edit against the freeze boundary. If the file is outside the allowed directory, the edit is blocked. The excerpt shows only the "Edit" matcher; covering Write calls as well would need a matcher such as "Edit|Write". Claude doesn’t need to “remember” the constraint; the hook enforces it mechanically.
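The boundary check itself can be sketched in a few lines of Python. This is not the repo's check-freeze.sh; the function names and the `/repo/src` boundary are hypothetical, and the payload shape (`tool_input.file_path`) is an assumption about the hook input.

```python
from pathlib import Path


def is_inside_boundary(file_path: str, boundary: str) -> bool:
    """True if file_path resolves to a location inside the boundary directory."""
    target = Path(file_path).resolve()
    root = Path(boundary).resolve()
    return target == root or root in target.parents


def check_edit(payload: dict, boundary: str) -> str:
    """Return "deny" for edits outside the boundary, otherwise "allow"."""
    file_path = payload.get("tool_input", {}).get("file_path", "")
    if not file_path:
        return "deny"  # fail closed when the path is missing
    return "allow" if is_inside_boundary(file_path, boundary) else "deny"


print(check_edit({"tool_input": {"file_path": "/repo/src/app.py"}}, "/repo/src"))
print(check_edit({"tool_input": {"file_path": "/repo/src/../secrets/key.pem"}}, "/repo/src"))
```

Resolving both paths before comparing them is the important design choice. A plain string-prefix test would let `/repo/src/../secrets` and `/repo/srcfoo` slip through.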

When to use which

Scenario	Prompt	Hook
“Prefer tabs over spaces”	Yes
“Never delete production data”		Yes
“Use descriptive variable names”	Yes
“Block git push without review”		Yes
Style preferences	Yes
Security/compliance gates		Yes

Exam rule: when the question says “reliability is critical” or “guaranteed compliance” -> the answer is hooks (programmatic enforcement), never prompts.

5. Plan Mode vs Direct Execution

Direct execution

Single-file, well-scoped change. No ambiguity about what to do.

Example: “Add a timeout field to the hook config schema.” -> One file, one change, clear spec. Just do it.

Plan mode

Complex, multi-file, multiple valid approaches. Need to think before acting.

Example: “Restructure the lessons/ directory into course subfolders with study notes.” -> Touches 15+ files, needs to decide on folder structure, requires moving existing files, creating new ones, updating cross-references. Plan first, then execute.

Decision rule

Signal	Direct	Plan mode
Single file change	Yes
Multiple files, unclear scope		Yes
Clear spec, no ambiguity	Yes
Multiple valid approaches		Yes
“Add X to Y”	Yes
“Refactor/migrate/restructure”		Yes

6. Self-review vs Independent Review

Self-review (weak)

Same Claude session reviews its own code. It retains all the reasoning context from when it wrote the code — the same assumptions, the same blind spots.

Session A: writes auth code
Session A: reviews auth code  <-- biased, retains reasoning context

Like proofreading your own essay — you read what you meant to write, not what’s actually there.

Independent review (strong)

A separate Claude instance (subagent or new session) reviews the code. It has no prior context about why decisions were made, so it evaluates the code on its own merits.

Session A: writes auth code
Session B: reviews auth code  <-- no prior context, unbiased

Real example from this repo: The /review skill runs as a separate review process that analyzes the diff. It doesn’t know why you made certain choices; it just reads the code and finds issues. This is by design.

Exam rule: when the question mentions “effective review” or “catching subtle issues” -> independent instance, always.

7. Error: Access Failure vs Valid Empty Result

This trips people up on the exam. Both look like “no results” but require different handling.

Access failure

The database is down. The API timed out. The file doesn’t exist. The query couldn’t run.

{"isError": true, "errorCategory": "transient", "isRetryable": true,
 "message": "Connection to flights API timed out after 30s"}

Agent should: retry, try an alternative source, or report the failure if neither works.

Valid empty result

The query ran successfully. There are simply no matches.

{"isError": false, "results": [],
 "message": "No flights found between A and B on that date"}

Agent should: report to user that no results exist, suggest alternatives.

Why this matters

If you return a generic "Operation failed" for both, the agent can’t decide whether to retry (transient) or move on (valid empty). Structured error responses (errorCategory + isRetryable) give the agent the information to make the right recovery decision.
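A minimal sketch of how an agent loop might branch on those fields follows. The field names come from the two JSON examples above; the function `handle_tool_result`, the retry count, and the stub tool are hypothetical.

```python
import time
from typing import Callable


def handle_tool_result(call_tool: Callable[[], dict], max_retries: int = 2,
                       backoff_seconds: float = 0.0) -> str:
    """Call a tool and pick a recovery path from its structured result."""
    result: dict = {}
    for attempt in range(max_retries + 1):
        result = call_tool()
        if not result.get("isError", False):
            if result.get("results"):
                return f"Found {len(result['results'])} result(s)."
            # Valid empty result: the query ran, so retrying would not help.
            return "No matches exist. Suggest alternatives."
        if not result.get("isRetryable", False):
            return f"Failed permanently: {result.get('message', 'unknown error')}"
        if attempt < max_retries:
            time.sleep(backoff_seconds * (attempt + 1))
    return (f"Still failing after {max_retries + 1} attempts: "
            f"{result.get('message', 'unknown error')}")


def make_flaky_tool() -> Callable[[], dict]:
    """Stub tool: times out once, then returns a valid empty result."""
    responses = iter([
        {"isError": True, "errorCategory": "transient", "isRetryable": True,
         "message": "Connection to flights API timed out after 30s"},
        {"isError": False, "results": [],
         "message": "No flights found between A and B on that date"},
    ])
    return lambda: next(responses)


print(handle_tool_result(make_flaky_tool()))  # No matches exist. Suggest alternatives.
```

With a generic "Operation failed" response, the first and second replies would be indistinguishable and the loop could not tell when to stop retrying.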

8. Batch API vs Real-time API

Real-time (Messages API)

Synchronous, immediate response. Use for interactive workflows.

response = client.messages.create(...)  # blocks until done

Use for: pre-merge checks, user-facing features, anything time-sensitive.

Batch (Message Batches API)

Asynchronous, 50% cheaper, up to 24-hour processing window. No latency guarantee.

batch = client.messages.batches.create(requests=[...])  # returns immediately
# poll batch.processing_status later; results arrive within 24 hours

Use for: overnight reports, weekly audits, bulk document processing.
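For bulk document processing, the `requests=[...]` list is typically built from your inputs. The sketch below assumes each entry is a dict with a `custom_id` and a `params` object that mirrors a normal Messages request. The helper name, the prompt wording, and the placeholder model id are hypothetical.

```python
def build_batch_requests(documents: dict, model: str,
                         max_tokens: int = 1024) -> list:
    """Turn {doc_id: text} into one batch entry per document."""
    return [
        {
            "custom_id": doc_id,  # used later to match each result to its input
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [
                    {"role": "user",
                     "content": f"Summarize this document:\n\n{text}"}
                ],
            },
        }
        for doc_id, text in documents.items()
    ]


# The model id is a placeholder; supply whichever model you actually use.
requests = build_batch_requests(
    {"report-001": "Q1 revenue grew...", "report-002": "Q2 churn fell..."},
    model="<your-model-id>",
)
print([r["custom_id"] for r in requests])  # ['report-001', 'report-002']
```

The resulting list is what you would pass as `requests` in the batch creation call shown above. Results are not guaranteed to come back in submission order, which is why each entry carries its own `custom_id`.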

Exam rule: if the question mentions “blocking workflow” or “pre-merge” -> never batch. If it mentions “cost savings” + “latency-tolerant” -> batch.

9. Planning (Plan Mode) vs Thinking (Extended Thinking)

Both are “think before you act” mechanisms, but they operate at completely different levels.

Planning (Plan Mode in Claude Code)

Claude Code’s plan mode is a workflow phase — Claude outlines the steps it will take, you review and approve, then it executes. This is a conversation-level feature that restricts Claude to read-only tools during planning.

Real example — user asks to restructure lessons/:

User: /plan Restructure the lessons/ directory into course subfolders

Claude (plan mode):
  Phase 1: Create directory structure
    - lessons/cert-prep/
    - lessons/courses/{agent-skills,claude-api,mcp,claude-code-in-action}/
  Phase 2: Move existing exam files to cert-prep/
  Phase 3: Write study-notes.md for each course
  Phase 4: Update README.md with new structure

User: Looks good, proceed.

Claude (execution mode): [starts making changes]

Key properties:

Happens in the conversation between you and Claude
Claude proposes steps, you approve or redirect
Read-only during planning (can’t edit files)
Execute after approval
Helps with complex, multi-file, ambiguous tasks
The plan is visible to you — you can catch wrong approaches early

Thinking (Extended Thinking in the API)

Extended thinking is a model-level feature — Claude reasons internally before producing output. You don’t see the thinking in Claude Code (it happens inside the model), but you can enable and budget for it via the API.

Real example — API call with extended thinking:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Analyze this complex algorithm..."}]
)
# response.content may include thinking blocks + text blocks
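Reading such a response usually means separating the reasoning from the final answer. The sketch below assumes each content block exposes a `type` attribute, with `thinking` blocks carrying a `.thinking` string and `text` blocks carrying `.text`. It uses stand-in objects so it runs without an API call, and the helper name is hypothetical.

```python
from types import SimpleNamespace


def split_thinking_and_text(content_blocks) -> tuple:
    """Separate a response's reasoning from its final answer."""
    thinking_parts, text_parts = [], []
    for block in content_blocks:
        if block.type == "thinking":
            thinking_parts.append(block.thinking)
        elif block.type == "text":
            text_parts.append(block.text)
        # Any other block types are skipped in this sketch.
    return "\n".join(thinking_parts), "\n".join(text_parts)


# Stand-in blocks; with a real response you would pass response.content.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Compare the two nested loops..."),
    SimpleNamespace(type="text", text="The algorithm is O(n log n)."),
]
reasoning, answer = split_thinking_and_text(fake_content)
print(answer)  # The algorithm is O(n log n).
```

Note that in the request above `budget_tokens` (10000) is smaller than `max_tokens` (16000); the thinking budget has to fit inside the overall output limit.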

Key properties:

Happens inside the model’s reasoning process
You don’t control or review the steps
Costs extra tokens (budget_tokens)
Improves quality on complex reasoning tasks (math, logic, code analysis)
Not a workflow — no approval step
In Claude Code: triggered by “ultrathink” keyword in skill content or /think command

When to use which

Planning is for you — it lets you review and approve an approach before Claude acts. Use it when the approach matters more than the individual reasoning steps.

Thinking is for Claude — it gives the model more internal reasoning capacity. Use it when the task requires deeper analysis, not necessarily more steps.

Scenario	Planning	Thinking
“Refactor auth module across 12 files”	Yes — review approach first	Maybe — complex but approach matters more
“Prove this algorithm is O(n log n)”	No — one-shot reasoning	Yes — needs deep analysis
“Migrate database schema”	Yes — many valid approaches, review needed	No — straightforward once approach chosen
“Find the subtle bug in this race condition”	No — let Claude investigate	Yes — needs careful reasoning
“Design the API for a new feature”	Yes — you want to approve the design	Yes — benefits from deeper thought
“Explain why this regex fails on edge case X”	No	Yes — reasoning-heavy

Can you combine them?

Yes. In Claude Code, plan mode + thinking mode gives you the best of both:

Claude thinks deeply about each planning step (better plan quality)
You review the plan before execution (control)
Claude thinks deeply during execution (better implementation)

In the API, you can enable extended thinking and then implement a plan-execute loop with tool use. The planning is your application logic; the thinking is the model’s reasoning.
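A plan-execute loop in application code might look like the sketch below. All names are hypothetical. The three callables stand in for a model call that proposes steps, a human approval check, and a model or tool call that carries out one step; stubs are used here so the example runs on its own.

```python
from typing import Callable, List


def plan_then_execute(task: str,
                      propose_plan: Callable[[str], List[str]],
                      approve: Callable[[List[str]], bool],
                      execute_step: Callable[[str], str]) -> List[str]:
    """Run a task in two phases: get a plan approved, then execute each step."""
    plan = propose_plan(task)   # planning phase: no side effects yet
    if not approve(plan):       # checkpoint where a human can redirect
        return []
    return [execute_step(step) for step in plan]


# Stubs in place of real model calls.
def stub_plan(task: str) -> List[str]:
    return ["create directories", "move exam files", "update README"]


def stub_execute(step: str) -> str:
    return f"done: {step}"


results = plan_then_execute("Restructure lessons/", stub_plan,
                            approve=lambda plan: len(plan) <= 5,
                            execute_step=stub_execute)
print(results)
```

Extended thinking would be enabled inside the calls behind `propose_plan` and `execute_step`. The loop itself only supplies the approval gate, which is the part the model cannot provide on its own.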

The exam gotcha

The exam may ask: “For a complex multi-step task, should you use plan mode or extended thinking?”

The answer is usually plan mode — because the question is about workflow control, not reasoning depth. Extended thinking helps Claude think better; plan mode helps you validate the approach. For “complex multi-step” work, the risk is wrong approach, not wrong reasoning.

But if the question mentions “mathematical proof”, “subtle bug”, “logical analysis”, or “complex reasoning” — the answer is extended thinking.

Quick Reference Card

Comparison	Left	Right	Key signal
CLAUDE.md vs Skill	Always loaded, short facts	On-demand, long procedures	Length + frequency of need
Skill (inline) vs Subagent (fork)	Shares context	Isolated context	Does it need conversation history?
context: fork vs Task tool	Pre-defined task	Dynamic delegation	Repeatable vs situational
Hook vs Prompt	Deterministic, guaranteed	Probabilistic, may fail	Is compliance critical?
Plan mode vs Direct	Think first	Act now	Scope and ambiguity
Planning vs Thinking	Workflow control (you review)	Reasoning depth (model thinks)	Approach risk vs reasoning difficulty
Self-review vs Independent	Same session, biased	New instance, unbiased	Always prefer independent
Access failure vs Empty result	Error, retry	Success, no matches	Need structured error info
Batch vs Real-time	Cheap, slow, async	Full price, immediate	Is latency acceptable?
