[PREP] Claude Certified Architect - Foundations Exam Notes
Table of Contents
- Claude Certified Architect - Foundations Exam Notes
- Exam Overview
- Domain Weights
- Exam Scenarios
- Domain 1: Agentic Architecture & Orchestration (27%)
- Domain 2: Tool Design & MCP Integration (18%)
- Domain 3: Claude Code Configuration & Workflows (20%)
- Domain 4: Prompt Engineering & Structured Output (20%)
- Domain 5: Context Management & Reliability (15%)
- Out of Scope (DO NOT study)
- Key Exam Patterns
Claude Certified Architect - Foundations Exam Notes
Exam Overview
- Format: Multiple choice (1 correct, 3 distractors), no penalty for guessing
- Passing score: 720/1000 (scaled)
- Scenarios: 4 of 6 picked randomly per exam
- Domains: 5 weighted areas
Domain Weights
| Domain | Weight |
|---|---|
| 1. Agentic Architecture & Orchestration | 27% |
| 2. Tool Design & MCP Integration | 18% |
| 3. Claude Code Configuration & Workflows | 20% |
| 4. Prompt Engineering & Structured Output | 20% |
| 5. Context Management & Reliability | 15% |
Exam Scenarios
- Customer Support Resolution Agent - SDK agent with MCP tools, 80% first-contact resolution target
- Code Generation with Claude Code - Team workflows, CLAUDE.md, plan mode
- Multi-Agent Research System - Coordinator + subagents (search, analyze, synthesize, report)
- Developer Productivity - Codebase exploration, built-in tools, MCP integration
- Claude Code for CI/CD - Automated reviews, test generation, PR feedback
- Structured Data Extraction - JSON schemas, validation-retry, batch processing
Domain 1: Agentic Architecture & Orchestration (27%)
MUST KNOW
Agentic Loop Lifecycle:
- Send request to Claude -> inspect
stop_reason-> if “tool_use”: execute tool, append result, loop -> if “end_turn”: done - Tool results MUST be appended to conversation history for next iteration
- Model-driven decisions (Claude reasons about next tool) NOT pre-configured tool sequences
Anti-patterns to avoid:
- Parsing natural language to determine loop termination
- Setting arbitrary iteration caps as PRIMARY stopping mechanism
- Checking for assistant text content as completion indicator
Hub-and-spoke (Coordinator-Subagent):
- Coordinator manages ALL inter-subagent communication
- Subagents have ISOLATED context - do NOT inherit coordinator’s conversation history
- Coordinator does: task decomposition, delegation, result aggregation, deciding which subagents to invoke
- Risk: overly narrow task decomposition -> incomplete topic coverage
Subagent Spawning:
- Task tool = mechanism for spawning subagents
allowedToolsmust include “Task” for coordinator to invoke subagents- Context must be EXPLICITLY provided in prompt - no automatic inheritance
- Parallel subagents: emit MULTIPLE Task tool calls in a SINGLE coordinator response
- Pass complete findings from prior agents directly in subagent’s prompt
Programmatic vs Prompt-based Enforcement:
- When deterministic compliance required (identity verification before financial ops) -> programmatic hooks
- Prompt instructions have NON-ZERO failure rate for critical sequences
- Hooks provide guaranteed compliance; prompts provide probabilistic compliance
Agent SDK Hooks:
PostToolUse: intercept tool results for transformation/normalization before model processes them- Tool call interception: block policy-violating actions, redirect to alternative workflows
- Choose hooks over prompts when business rules require GUARANTEED compliance
Task Decomposition:
- Fixed sequential (prompt chaining): predictable multi-aspect reviews
- Dynamic adaptive: open-ended investigation where subtasks depend on discoveries
- Large code reviews: per-file local analysis + separate cross-file integration pass
Session Management:
--resume <session-name>: continue specific prior conversationfork_session: create independent branches from shared analysis baseline- New session with structured summary > resuming with stale tool results
- Inform resumed sessions about specific file changes for targeted re-analysis
KEY DISTINCTIONS
- Programmatic enforcement = deterministic = hooks = for critical business logic
- Prompt-based guidance = probabilistic = for soft preferences
- Self-review in same session = weak (retains reasoning bias)
- Independent review instance = strong (no prior context bias)
Domain 2: Tool Design & MCP Integration (18%)
MUST KNOW
Tool Descriptions:
- Primary mechanism LLMs use for tool selection
- Minimal descriptions -> unreliable selection among similar tools
- Must include: input formats, example queries, edge cases, boundary explanations
- Ambiguous/overlapping descriptions cause misrouting
Fix tool confusion by:
- Expanding descriptions (first step, highest leverage)
- Renaming tools to eliminate overlap
- Splitting generic tools into purpose-specific ones
Structured Error Responses (MCP isError flag):
- Return:
errorCategory(transient/validation/permission),isRetryableboolean, human-readable description - Uniform “Operation failed” prevents agent from making recovery decisions
- Transient = retry; Validation = fix input; Permission = escalate; Business = explain to user
- Distinguish access failures (need retry) from valid empty results (successful query, no matches)
Tool Distribution:
- Too many tools (18 instead of 4-5) DEGRADES selection reliability
- Agents with tools outside specialization tend to MISUSE them
- Give agents ONLY tools needed for their role
- Scoped cross-role tools for high-frequency needs (e.g., verify_fact for synthesis agent)
tool_choice options:
"auto": model may return text OR call a tool"any": model MUST call a tool (but can choose which){"type": "tool", "name": "..."}: FORCE specific tool call
MCP Server Scoping:
- Project-level:
.mcp.json(shared via version control) - User-level:
~/.claude.json(personal/experimental) - Environment variable expansion:
${GITHUB_TOKEN}in .mcp.json - Tools from ALL configured servers available simultaneously
- MCP resources: expose content catalogs to reduce exploratory tool calls
Built-in Tools:
- Grep = content search (function names, error messages, patterns IN files)
- Glob = file path pattern matching (find files BY name/extension)
- Read/Write = full file operations; Edit = targeted modifications using unique text matching
- If Edit fails (non-unique match) -> Read + Write as fallback
- Build understanding incrementally: Grep entry points -> Read to trace flows
Domain 3: Claude Code Configuration & Workflows (20%)
MUST KNOW
CLAUDE.md Hierarchy:
- User-level:
~/.claude/CLAUDE.md(personal, NOT shared via version control) - Project-level:
.claude/CLAUDE.mdor rootCLAUDE.md(shared with team) - Directory-level: subdirectory CLAUDE.md files (scoped to that area)
@importsyntax: reference external files for modularity
.claude/rules/ Directory:
- Alternative to monolithic CLAUDE.md
- Topic-specific rule files
- YAML frontmatter
pathsfield with glob patterns for conditional activation - Rules load ONLY when editing matching files -> reduces irrelevant context
- Better than directory-level CLAUDE.md for conventions spanning multiple dirs
Custom Slash Commands:
- Project-scoped:
.claude/commands/(shared via VCS) - User-scoped:
~/.claude/commands/(personal)
Skills (.claude/skills/):
SKILL.mdwith frontmatter:context: fork,allowed-tools,argument-hintcontext: fork: runs in isolated sub-agent context, prevents polluting main conversationallowed-tools: restricts tool access during skill execution- Skills = on-demand invocation; CLAUDE.md = always-loaded
Plan Mode vs Direct Execution:
- Plan mode: complex tasks, large-scale changes, multiple valid approaches, architectural decisions, multi-file modifications
- Direct execution: simple, well-scoped changes (single validation check, one function)
- Explore subagent: isolates verbose discovery output, returns summaries to preserve main context
- Combine: plan mode for investigation, direct execution for implementation
CI/CD Integration:
-p(or--print) flag: non-interactive mode for automated pipelines--output-format json+--json-schema: structured output for CI- Same session that generated code is LESS effective at reviewing its own changes
- Include prior review findings to avoid duplicate comments on re-review
Iterative Refinement:
- Concrete input/output examples > prose descriptions for transformation specs
- Test-driven iteration: write tests first, iterate by sharing failures
- Interview pattern: Claude asks questions to surface unconsidered aspects
- Multiple interacting issues -> single message; independent issues -> sequential
Domain 4: Prompt Engineering & Structured Output (20%)
MUST KNOW
Explicit Criteria:
- “Flag comments only when claimed behavior contradicts actual code” > “check comments are accurate”
- General instructions (“be conservative”) DON’T improve precision
- High false positive rates undermine confidence in ALL categories
- Temporarily disable high-FP categories to restore trust while improving
Few-shot Prompting:
- Most effective for consistently formatted, actionable output
- Show reasoning for WHY one action chosen over alternatives
- Enable generalization to novel patterns (not just matching pre-specified cases)
- Reduce hallucination in extraction tasks
- 2-4 targeted examples for ambiguous scenarios
Structured Output via tool_use:
- tool_use + JSON schemas = most reliable for guaranteed schema compliance
- Eliminates JSON SYNTAX errors but NOT SEMANTIC errors (items don’t sum, wrong fields)
tool_choice: "any"= guarantee structured output when doc type unknown- Schema design: optional/nullable fields prevent model fabricating values for absent info
- Enum with “other” + detail string = extensible categories
- “unclear” enum value for ambiguous cases
Validation-Retry Loops:
- Append specific validation errors to prompt on retry
- Retries INEFFECTIVE when info simply absent from source (vs format/structural errors)
- Track
detected_patternfield for systematic analysis of false positive dismissals - Self-correction: extract “calculated_total” alongside “stated_total” to flag discrepancies
Message Batches API:
- 50% cost savings, up to 24-hour processing window
- NO guaranteed latency SLA
- Appropriate: non-blocking, latency-tolerant (overnight reports, weekly audits)
- Inappropriate: blocking workflows (pre-merge checks)
- Does NOT support multi-turn tool calling within a single request
custom_idfor correlating request/response pairs- Handle failures: resubmit only failed docs by custom_id
Multi-instance Review:
- Self-review = weak (retains reasoning context from generation)
- Independent instance (no prior reasoning) = more effective at catching subtle issues
- Split large reviews: per-file local analysis + cross-file integration passes
- Confidence alongside findings enables calibrated routing
Domain 5: Context Management & Reliability (15%)
MUST KNOW
Context Preservation:
- Progressive summarization LOSES: numerical values, percentages, dates, customer expectations
- “Lost in the middle” effect: models reliably process beginning/end, may omit middle sections
- Tool results accumulate disproportionately (40+ fields when only 5 relevant)
- MUST pass complete conversation history in subsequent API requests
Techniques:
- Extract transactional facts into persistent “case facts” block OUTSIDE summarized history
- Trim verbose tool outputs to only relevant fields BEFORE accumulation
- Place key findings at BEGINNING; detailed results with explicit section headers
- Subagents return structured data (key facts, citations, relevance) not verbose reasoning chains
/compactto reduce context during extended sessions
Escalation Patterns:
- Triggers: customer requests human, policy exceptions/gaps, inability to progress
- Honor explicit human requests IMMEDIATELY (don’t try to resolve first)
- Sentiment-based escalation = UNRELIABLE proxy for case complexity
- Self-reported confidence = POORLY CALIBRATED
- Multiple customer matches -> ask for additional identifiers, NOT heuristic selection
- Policy is ambiguous -> escalate (e.g., competitor price match when policy silent)
Error Propagation (Multi-Agent):
- Return: failure type, attempted query, partial results, alternative approaches
- Generic “search unavailable” HIDES valuable context
- Silently suppressing errors (empty as success) = anti-pattern
- Terminating entire workflow on single failure = anti-pattern
- Subagents: local recovery for transient failures, escalate only unresolvable errors + partial results
Large Codebase Exploration:
- Context degrades in extended sessions -> inconsistent answers, “typical patterns” instead of specifics
- Scratchpad files: persist key findings across context boundaries
- Subagent delegation: isolate verbose exploration, main agent coordinates high-level
- Crash recovery: structured agent state exports (manifests), coordinator loads on resume
Human Review & Confidence:
- Aggregate accuracy (97%) may MASK poor performance on specific doc types/fields
- Stratified random sampling for measuring error rates in high-confidence extractions
- Field-level confidence calibrated using labeled validation sets
- Validate accuracy BY document type AND field BEFORE automating
Information Provenance:
- Source attribution lost during summarization without claim-source mappings
- Conflicting stats: annotate with source attribution, don’t arbitrarily select one
- Require publication/collection dates to prevent temporal misinterpretation
- Render content type-appropriately (financial=tables, news=prose, technical=structured lists)
Out of Scope (DO NOT study)
- Fine-tuning, training custom models
- API authentication, billing, account management
- Deploying/hosting MCP servers (infrastructure)
- Claude’s internal architecture, training process
- Constitutional AI, RLHF, safety training
- Embeddings, vector databases
- Computer use (browser automation)
- Vision/image analysis
- Streaming API, SSE
- Rate limiting, quotas, pricing
- OAuth, API key rotation
- Cloud provider configs (AWS, GCP, Azure)
- Performance benchmarking, model comparison
- Prompt caching implementation details
- Token counting algorithms
Key Exam Patterns
- When reliability is critical -> programmatic enforcement (hooks), not prompts
- Tool selection confused -> expand descriptions first (lowest effort, highest leverage)
- Self-review -> always use independent instance (no shared reasoning context)
- Batch API -> only for latency-tolerant workloads, never blocking workflows
- Multiple matches -> ask for clarification, never heuristic selection
- Errors -> structured context (type + attempted + partial + alternatives), never generic
- Context growing -> trim tool outputs, extract facts, use subagents for isolation
- Large code review -> split into per-file + cross-file passes
- Few-shot examples -> for ambiguous scenarios where instructions produce inconsistent results
- Schema fields -> make optional/nullable when info might be absent (prevents hallucination)