[PREP] Claude Certified Architect - Foundations Exam Notes

9 min read 1796 words

Table of Contents

Claude Certified Architect - Foundations Exam Notes

Claude Certified Architect - Foundations Exam Notes

Exam Overview

Format: Multiple choice (1 correct, 3 distractors), no penalty for guessing
Passing score: 720/1000 (scaled)
Scenarios: 4 of 6 picked randomly per exam
Domains: 5 weighted areas

Domain Weights

Domain	Weight
1. Agentic Architecture & Orchestration	27%
2. Tool Design & MCP Integration	18%
3. Claude Code Configuration & Workflows	20%
4. Prompt Engineering & Structured Output	20%
5. Context Management & Reliability	15%

Exam Scenarios

Customer Support Resolution Agent - SDK agent with MCP tools, 80% first-contact resolution target
Code Generation with Claude Code - Team workflows, CLAUDE.md, plan mode
Multi-Agent Research System - Coordinator + subagents (search, analyze, synthesize, report)
Developer Productivity - Codebase exploration, built-in tools, MCP integration
Claude Code for CI/CD - Automated reviews, test generation, PR feedback
Structured Data Extraction - JSON schemas, validation-retry, batch processing

Domain 1: Agentic Architecture & Orchestration (27%)

MUST KNOW

Agentic Loop Lifecycle:

Send request to Claude -> inspect stop_reason -> if “tool_use”: execute tool, append result, loop -> if “end_turn”: done
Tool results MUST be appended to conversation history for next iteration
Model-driven decisions (Claude reasons about next tool) NOT pre-configured tool sequences

Anti-patterns to avoid:

Parsing natural language to determine loop termination
Setting arbitrary iteration caps as PRIMARY stopping mechanism
Checking for assistant text content as completion indicator

Hub-and-spoke (Coordinator-Subagent):

Coordinator manages ALL inter-subagent communication
Subagents have ISOLATED context - do NOT inherit coordinator’s conversation history
Coordinator does: task decomposition, delegation, result aggregation, deciding which subagents to invoke
Risk: overly narrow task decomposition -> incomplete topic coverage

Subagent Spawning:

Task tool = mechanism for spawning subagents
allowedTools must include “Task” for coordinator to invoke subagents
Context must be EXPLICITLY provided in prompt - no automatic inheritance
Parallel subagents: emit MULTIPLE Task tool calls in a SINGLE coordinator response
Pass complete findings from prior agents directly in subagent’s prompt

Programmatic vs Prompt-based Enforcement:

When deterministic compliance required (identity verification before financial ops) -> programmatic hooks
Prompt instructions have NON-ZERO failure rate for critical sequences
Hooks provide guaranteed compliance; prompts provide probabilistic compliance

Agent SDK Hooks:

PostToolUse: intercept tool results for transformation/normalization before model processes them
Tool call interception: block policy-violating actions, redirect to alternative workflows
Choose hooks over prompts when business rules require GUARANTEED compliance

Task Decomposition:

Fixed sequential (prompt chaining): predictable multi-aspect reviews
Dynamic adaptive: open-ended investigation where subtasks depend on discoveries
Large code reviews: per-file local analysis + separate cross-file integration pass

Session Management:

--resume <session-name>: continue specific prior conversation
fork_session: create independent branches from shared analysis baseline
New session with structured summary > resuming with stale tool results
Inform resumed sessions about specific file changes for targeted re-analysis

KEY DISTINCTIONS

Programmatic enforcement = deterministic = hooks = for critical business logic
Prompt-based guidance = probabilistic = for soft preferences
Self-review in same session = weak (retains reasoning bias)
Independent review instance = strong (no prior context bias)

Domain 2: Tool Design & MCP Integration (18%)

MUST KNOW

Tool Descriptions:

Primary mechanism LLMs use for tool selection
Minimal descriptions -> unreliable selection among similar tools
Must include: input formats, example queries, edge cases, boundary explanations
Ambiguous/overlapping descriptions cause misrouting

Fix tool confusion by:

Expanding descriptions (first step, highest leverage)
Renaming tools to eliminate overlap
Splitting generic tools into purpose-specific ones

Structured Error Responses (MCP isError flag):

Return: errorCategory (transient/validation/permission), isRetryable boolean, human-readable description
Uniform “Operation failed” prevents agent from making recovery decisions
Transient = retry; Validation = fix input; Permission = escalate; Business = explain to user
Distinguish access failures (need retry) from valid empty results (successful query, no matches)

Tool Distribution:

Too many tools (18 instead of 4-5) DEGRADES selection reliability
Agents with tools outside specialization tend to MISUSE them
Give agents ONLY tools needed for their role
Scoped cross-role tools for high-frequency needs (e.g., verify_fact for synthesis agent)

tool_choice options:

"auto": model may return text OR call a tool
"any": model MUST call a tool (but can choose which)
{"type": "tool", "name": "..."}: FORCE specific tool call

MCP Server Scoping:

Project-level: .mcp.json (shared via version control)
User-level: ~/.claude.json (personal/experimental)
Environment variable expansion: ${GITHUB_TOKEN} in .mcp.json
Tools from ALL configured servers available simultaneously
MCP resources: expose content catalogs to reduce exploratory tool calls

Built-in Tools:

Grep = content search (function names, error messages, patterns IN files)
Glob = file path pattern matching (find files BY name/extension)
Read/Write = full file operations; Edit = targeted modifications using unique text matching
If Edit fails (non-unique match) -> Read + Write as fallback
Build understanding incrementally: Grep entry points -> Read to trace flows

Domain 3: Claude Code Configuration & Workflows (20%)

MUST KNOW

CLAUDE.md Hierarchy:

User-level: ~/.claude/CLAUDE.md (personal, NOT shared via version control)
Project-level: .claude/CLAUDE.md or root CLAUDE.md (shared with team)
Directory-level: subdirectory CLAUDE.md files (scoped to that area)
@import syntax: reference external files for modularity

.claude/rules/ Directory:

Alternative to monolithic CLAUDE.md
Topic-specific rule files
YAML frontmatter paths field with glob patterns for conditional activation
Rules load ONLY when editing matching files -> reduces irrelevant context
Better than directory-level CLAUDE.md for conventions spanning multiple dirs

Custom Slash Commands:

Project-scoped: .claude/commands/ (shared via VCS)
User-scoped: ~/.claude/commands/ (personal)

Skills (.claude/skills/):

SKILL.md with frontmatter: context: fork, allowed-tools, argument-hint
context: fork: runs in isolated sub-agent context, prevents polluting main conversation
allowed-tools: restricts tool access during skill execution
Skills = on-demand invocation; CLAUDE.md = always-loaded

Plan Mode vs Direct Execution:

Plan mode: complex tasks, large-scale changes, multiple valid approaches, architectural decisions, multi-file modifications
Direct execution: simple, well-scoped changes (single validation check, one function)
Explore subagent: isolates verbose discovery output, returns summaries to preserve main context
Combine: plan mode for investigation, direct execution for implementation

CI/CD Integration:

-p (or --print) flag: non-interactive mode for automated pipelines
--output-format json + --json-schema: structured output for CI
Same session that generated code is LESS effective at reviewing its own changes
Include prior review findings to avoid duplicate comments on re-review

Iterative Refinement:

Concrete input/output examples > prose descriptions for transformation specs
Test-driven iteration: write tests first, iterate by sharing failures
Interview pattern: Claude asks questions to surface unconsidered aspects
Multiple interacting issues -> single message; independent issues -> sequential

Domain 4: Prompt Engineering & Structured Output (20%)

MUST KNOW

Explicit Criteria:

“Flag comments only when claimed behavior contradicts actual code” > “check comments are accurate”
General instructions (“be conservative”) DON’T improve precision
High false positive rates undermine confidence in ALL categories
Temporarily disable high-FP categories to restore trust while improving

Few-shot Prompting:

Most effective for consistently formatted, actionable output
Show reasoning for WHY one action chosen over alternatives
Enable generalization to novel patterns (not just matching pre-specified cases)
Reduce hallucination in extraction tasks
2-4 targeted examples for ambiguous scenarios

Structured Output via tool_use:

tool_use + JSON schemas = most reliable for guaranteed schema compliance
Eliminates JSON SYNTAX errors but NOT SEMANTIC errors (items don’t sum, wrong fields)
tool_choice: "any" = guarantee structured output when doc type unknown
Schema design: optional/nullable fields prevent model fabricating values for absent info
Enum with “other” + detail string = extensible categories
“unclear” enum value for ambiguous cases

Validation-Retry Loops:

Append specific validation errors to prompt on retry
Retries INEFFECTIVE when info simply absent from source (vs format/structural errors)
Track detected_pattern field for systematic analysis of false positive dismissals
Self-correction: extract “calculated_total” alongside “stated_total” to flag discrepancies

Message Batches API:

50% cost savings, up to 24-hour processing window
NO guaranteed latency SLA
Appropriate: non-blocking, latency-tolerant (overnight reports, weekly audits)
Inappropriate: blocking workflows (pre-merge checks)
Does NOT support multi-turn tool calling within a single request
custom_id for correlating request/response pairs
Handle failures: resubmit only failed docs by custom_id

Multi-instance Review:

Self-review = weak (retains reasoning context from generation)
Independent instance (no prior reasoning) = more effective at catching subtle issues
Split large reviews: per-file local analysis + cross-file integration passes
Confidence alongside findings enables calibrated routing

Domain 5: Context Management & Reliability (15%)

MUST KNOW

Context Preservation:

Progressive summarization LOSES: numerical values, percentages, dates, customer expectations
“Lost in the middle” effect: models reliably process beginning/end, may omit middle sections
Tool results accumulate disproportionately (40+ fields when only 5 relevant)
MUST pass complete conversation history in subsequent API requests

Techniques:

Extract transactional facts into persistent “case facts” block OUTSIDE summarized history
Trim verbose tool outputs to only relevant fields BEFORE accumulation
Place key findings at BEGINNING; detailed results with explicit section headers
Subagents return structured data (key facts, citations, relevance) not verbose reasoning chains
/compact to reduce context during extended sessions

Escalation Patterns:

Triggers: customer requests human, policy exceptions/gaps, inability to progress
Honor explicit human requests IMMEDIATELY (don’t try to resolve first)
Sentiment-based escalation = UNRELIABLE proxy for case complexity
Self-reported confidence = POORLY CALIBRATED
Multiple customer matches -> ask for additional identifiers, NOT heuristic selection
Policy is ambiguous -> escalate (e.g., competitor price match when policy silent)

Error Propagation (Multi-Agent):

Return: failure type, attempted query, partial results, alternative approaches
Generic “search unavailable” HIDES valuable context
Silently suppressing errors (empty as success) = anti-pattern
Terminating entire workflow on single failure = anti-pattern
Subagents: local recovery for transient failures, escalate only unresolvable errors + partial results

Large Codebase Exploration:

Context degrades in extended sessions -> inconsistent answers, “typical patterns” instead of specifics
Scratchpad files: persist key findings across context boundaries
Subagent delegation: isolate verbose exploration, main agent coordinates high-level
Crash recovery: structured agent state exports (manifests), coordinator loads on resume

Human Review & Confidence:

Aggregate accuracy (97%) may MASK poor performance on specific doc types/fields
Stratified random sampling for measuring error rates in high-confidence extractions
Field-level confidence calibrated using labeled validation sets
Validate accuracy BY document type AND field BEFORE automating

Information Provenance:

Source attribution lost during summarization without claim-source mappings
Conflicting stats: annotate with source attribution, don’t arbitrarily select one
Require publication/collection dates to prevent temporal misinterpretation
Render content type-appropriately (financial=tables, news=prose, technical=structured lists)

Out of Scope (DO NOT study)

Fine-tuning, training custom models
API authentication, billing, account management
Deploying/hosting MCP servers (infrastructure)
Claude’s internal architecture, training process
Constitutional AI, RLHF, safety training
Embeddings, vector databases
Computer use (browser automation)
Vision/image analysis
Streaming API, SSE
Rate limiting, quotas, pricing
OAuth, API key rotation
Cloud provider configs (AWS, GCP, Azure)
Performance benchmarking, model comparison
Prompt caching implementation details
Token counting algorithms

Key Exam Patterns

When reliability is critical -> programmatic enforcement (hooks), not prompts
Tool selection confused -> expand descriptions first (lowest effort, highest leverage)
Self-review -> always use independent instance (no shared reasoning context)
Batch API -> only for latency-tolerant workloads, never blocking workflows
Multiple matches -> ask for clarification, never heuristic selection
Errors -> structured context (type + attempted + partial + alternatives), never generic
Context growing -> trim tool outputs, extract facts, use subagents for isolation
Large code review -> split into per-file + cross-file passes
Few-shot examples -> for ambiguous scenarios where instructions produce inconsistent results
Schema fields -> make optional/nullable when info might be absent (prevents hallucination)

Claude Certified Architect - Foundations Exam Notes#

Exam Overview#

Domain Weights#

Exam Scenarios#

Domain 1: Agentic Architecture & Orchestration (27%)#

MUST KNOW#

KEY DISTINCTIONS#

Domain 2: Tool Design & MCP Integration (18%)#

MUST KNOW#

Domain 3: Claude Code Configuration & Workflows (20%)#

MUST KNOW#

Domain 4: Prompt Engineering & Structured Output (20%)#

MUST KNOW#

Domain 5: Context Management & Reliability (15%)#

MUST KNOW#

Out of Scope (DO NOT study)#

Key Exam Patterns#

Claude Certified Architect - Foundations Exam Notes

Exam Overview

Domain Weights

Exam Scenarios

Domain 1: Agentic Architecture & Orchestration (27%)

MUST KNOW

KEY DISTINCTIONS

Domain 2: Tool Design & MCP Integration (18%)

MUST KNOW

Domain 3: Claude Code Configuration & Workflows (20%)

MUST KNOW

Domain 4: Prompt Engineering & Structured Output (20%)

MUST KNOW

Domain 5: Context Management & Reliability (15%)

MUST KNOW

Out of Scope (DO NOT study)

Key Exam Patterns