Extensions


Extensions are how third parties (or you, later) add capabilities to the agent without modifying the core. In Claude Code: skills, plugins, MCP servers, hooks. In your harness: whatever pattern you expose to users who want to extend without forking.

Extension design is where supply-chain security and capability boundaries meet.

The core problem

A closed agent is limited to what its creators shipped. An open agent can be extended by anyone — users, teams, third parties, the community — but every extension point is also an attack surface. A malicious skill can inject instructions; a malicious plugin can escalate privileges; a malicious MCP server can exfiltrate data.

The right design enables rich extensions while sandboxing their trust level. Trust is declared at install time and enforced at runtime. Extensions never gain capabilities they weren’t granted when the user installed them.
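
A minimal sketch of "declared at install time, enforced at runtime", assuming a hypothetical `Grant` record and capability names such as `read_files` (none of these names come from Claude Code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Capabilities the user approved when installing an extension."""
    extension: str
    capabilities: frozenset[str]


class CapabilityDenied(PermissionError):
    """Raised when an extension uses a capability it was never granted."""


def require(grant: Grant, capability: str) -> None:
    # Runtime enforcement: the install-time grant is the only source of truth.
    if capability not in grant.capabilities:
        raise CapabilityDenied(f"{grant.extension} was not granted {capability!r}")


# Declared once, at install time. The frozen dataclass and the frozenset mean
# the extension cannot widen its own grant afterwards.
grant = Grant("my-plugin", frozenset({"read_files"}))
require(grant, "read_files")  # allowed, returns None
```

Calling `require(grant, "shell")` raises `CapabilityDenied`, because `shell` was never part of the grant.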

Skills vs tools

The foundational distinction:

| | Tools | Skills |
| --- | --- | --- |
| What they change | Capability (what the agent CAN do) | Instruction (what the agent KNOWS) |
| Form | Code with typed input schema | Markdown with frontmatter |
| Activation | Always available in the tool list | On-demand (invoked, auto-triggered, path-matched) |
| Sandboxing | Permission rules, input validation | Gitignore check, shell-in-prompt restrictions |
| Example | `FileEditTool` | `/qa`, `/investigate`, `/plan-eng-review` |

Skills and tools are both needed:

Tools extend what’s physically possible
Skills extend what the agent decides to do, and in what order

A skill composes existing tools into a higher-level workflow. A tool provides a primitive the skill can use.
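
The distinction can be sketched as two types. This is an illustration only: the `Tool` and `Skill` classes, the dict-based schema, and the `Grep` handler below are assumptions, not Claude Code's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """A capability: code the agent can call, with a typed input schema."""
    name: str
    input_schema: dict[str, type]
    run: Callable[..., str]

    def call(self, **kwargs) -> str:
        # Validate the input against the schema before running any code.
        for key, expected in self.input_schema.items():
            if not isinstance(kwargs.get(key), expected):
                raise TypeError(f"{self.name}: {key} must be {expected.__name__}")
        return self.run(**kwargs)


@dataclass
class Skill:
    """An instruction: markdown telling the agent how to use tools it has."""
    name: str
    body: str  # markdown added to the prompt; it is never executed as code


# Hypothetical tool and skill for illustration.
grep = Tool("Grep", {"pattern": str}, lambda pattern: f"searched for {pattern}")
investigate = Skill("investigate", "Use the Grep tool to find call sites before editing.")
```

The skill only mentions `Grep` by name; removing the tool would leave the instruction intact but impossible to follow.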

Patterns observed in Claude Code

Skill discovery modes

Three ways skills get loaded:

Static — discovered at startup by walking skill directories
Dynamic — rediscovered when a file is touched (hot reload)
Conditional — path-based activation (only active when CWD matches the skill’s target paths)

Why path-based activation matters: in a monorepo, packages/api/ needs different skills than packages/web/. Conditional activation makes skills appear/disappear as the agent navigates.
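
A minimal sketch of conditional activation, assuming each skill declares its target directories relative to the repo root. The `target_paths` field and the skill names are hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Skill:
    name: str
    # Directories (relative to the repo root) where the skill applies.
    # An empty tuple means "active everywhere".
    target_paths: tuple[str, ...] = ()


def active_skills(skills: list[Skill], cwd: str) -> list[str]:
    """Return the names of skills whose target paths contain cwd."""
    here = PurePosixPath(cwd)
    active = []
    for skill in skills:
        if not skill.target_paths or any(
            here.is_relative_to(PurePosixPath(target))
            for target in skill.target_paths
        ):
            active.append(skill.name)
    return active


skills = [
    Skill("api-review", ("packages/api",)),
    Skill("web-a11y", ("packages/web",)),
    Skill("qa"),
]
```

Matching is done per path component, so a sibling directory such as `packages/api-v2` does not activate a skill scoped to `packages/api`.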

Skill sources (5)

Claude Code loads skills from 5 sources:

Bundled — compiled into the binary, always available
User — ~/.claude/skills/, globally available for this user
Project — ./.claude/skills/, specific to this project
MCP — remote server provides skills via the MCP protocol
Managed — enterprise-controlled, pushed from a central server

Each source has a different trust level. Bundled skills run with full capability; MCP skills are blocked from shell execution.
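
One way to express per-source trust is a lookup table consulted at runtime. In this sketch only two rows come from the text above (bundled gets full capability, MCP never gets shell); the remaining rows are placeholder assumptions.

```python
from enum import Enum


class Source(Enum):
    BUNDLED = "bundled"
    USER = "user"
    PROJECT = "project"
    MCP = "mcp"
    MANAGED = "managed"


# Whether skills from each source may run shell commands.
SHELL_ALLOWED = {
    Source.BUNDLED: True,   # from the text: full capability
    Source.USER: True,      # assumption for this sketch
    Source.PROJECT: True,   # assumption for this sketch
    Source.MANAGED: True,   # assumption for this sketch
    Source.MCP: False,      # from the text: blocked from shell execution
}


def may_run_shell(source: Source) -> bool:
    # Fail closed: a source missing from the table gets no shell access.
    return SHELL_ALLOWED.get(source, False)
```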

Shell-in-prompt + gitignore check

Skills can contain inline shell commands:

## Context

Current branch: !`git branch --show-current`
Staged files: !`git diff --cached --name-only`

The !`…` syntax runs the command at prompt-build time and embeds the output. This is powerful: skills can include live repo state without asking the model to run tools.
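
The expansion step could look roughly like the following. This is a sketch, not Claude Code's implementation: the regex, the `allow_shell` flag, and the injectable `runner` are all assumptions made for illustration.

```python
import re
import subprocess
from typing import Callable

# Matches !`command` and captures the command text.
INLINE_SHELL = re.compile(r"!`([^`]+)`")


def run_shell(command: str) -> str:
    """Run a command and return its trimmed stdout."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=10
    )
    return result.stdout.strip()


def expand_prompt(
    body: str, allow_shell: bool, runner: Callable[[str], str] = run_shell
) -> str:
    """Replace each !`cmd` with the command's output at prompt-build time."""
    if not allow_shell:
        # Untrusted sources keep the literal text; nothing is executed.
        return body
    return INLINE_SHELL.sub(lambda match: runner(match.group(1)), body)
```

Passing the runner in as a parameter keeps the expansion testable without touching a real shell, and makes the "no shell for this source" decision a single boolean at the call site.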

Security constraints:

MCP skills are blocked from shell execution. Letting remotely supplied skill content run shell commands would amount to handing the server a remote shell, a catastrophic vulnerability.
Gitignore check before loading. Skills in node_modules/, .venv/, or any gitignored directory are skipped. This prevents supply-chain injection via an npm package that sneaks a skill file into the project.

Lesson: every content loader must have a gitignore check. Without it, any package you install can inject skills.
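
A sketch of such a check for your own loader, delegating the decision to `git check-ignore`. The function names and the fail-closed policy are assumptions; how Claude Code implements its check is not shown in the source.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def is_gitignored(path: Path, repo_root: Path) -> bool:
    """Ask git whether path is ignored.

    `git check-ignore -q` exits 0 if the path is ignored and 1 if it is not.
    Any other status is an error (for example, not a git repository), which
    this sketch treats as "ignored" so the loader fails closed.
    """
    result = subprocess.run(
        ["git", "check-ignore", "-q", str(path)],
        cwd=repo_root,
        capture_output=True,
    )
    return result.returncode != 1


def loadable_skills(
    candidates: Iterable[Path], is_ignored: Callable[[Path], bool]
) -> list[Path]:
    """Keep only skill files that do not live under an ignored path."""
    return [path for path in candidates if not is_ignored(path)]
```

In use, the predicate would be something like `lambda p: is_gitignored(p, repo_root)`; keeping it injectable lets the filter be tested without a real repository.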

Plugin system (4 extension points)

Plugins are larger-scope extensions — they bundle multiple artifacts. The Claude Code leak showed 4 extension points:

Commands — slash commands (like /review)
Agents — subagent definitions with custom system prompts and tool subsets
Hooks — lifecycle hooks (like session-init.cjs)
Servers — MCP server definitions

A single plugin can register across all 4 points. Installing “my-plugin” might add a slash command, a subagent, a hook, and an MCP server — all at once.
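
For a harness of your own, the four extension points can be modeled as four registries that one manifest writes into. The `PluginManifest` and `Registry` shapes below are hypothetical, as are the entry names other than `/review` and `session-init.cjs`.

```python
from dataclasses import dataclass, field

EXTENSION_POINTS = ("commands", "agents", "hooks", "servers")


@dataclass
class PluginManifest:
    """What one plugin contributes at each of the four extension points."""
    name: str
    commands: list[str] = field(default_factory=list)
    agents: list[str] = field(default_factory=list)
    hooks: list[str] = field(default_factory=list)
    servers: list[str] = field(default_factory=list)


@dataclass
class Registry:
    """Maps each registered item to the plugin that owns it."""
    commands: dict[str, str] = field(default_factory=dict)
    agents: dict[str, str] = field(default_factory=dict)
    hooks: dict[str, str] = field(default_factory=dict)
    servers: dict[str, str] = field(default_factory=dict)

    def install(self, plugin: PluginManifest) -> None:
        for point in EXTENSION_POINTS:
            table = getattr(self, point)
            for item in getattr(plugin, point):
                table[item] = plugin.name

    def uninstall(self, plugin_name: str) -> None:
        # Ownership tracking is what makes a clean removal possible.
        for point in EXTENSION_POINTS:
            table = getattr(self, point)
            for item in [k for k, owner in table.items() if owner == plugin_name]:
                del table[item]


registry = Registry()
registry.install(PluginManifest(
    "my-plugin",
    commands=["/review"],
    agents=["reviewer"],
    hooks=["session-init.cjs"],
    servers=["example-server"],
))
```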

Security sandbox for plugin agents

Plugin-defined agents have restricted capabilities:

Cannot set permissionMode per-agent
Cannot define hooks per-agent
Cannot declare new mcpServers per-agent

Only user-defined agents (in .claude/agents/) can set these. Plugin agents run in a sandbox that cannot escalate beyond what the user granted at install time.

Why: plugins come from third parties. A malicious plugin that could add hooks would gain arbitrary code execution. A malicious plugin that could change permission modes could bypass safety rules. The sandbox says “nothing runtime-dangerous” at the extension boundary.
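
The restriction can be enforced with a check at load time. A minimal sketch, assuming agent definitions arrive as dicts and that the loader knows whether each one came from a plugin or from the user (the `origin` labels are made up for this example):

```python
FORBIDDEN_FOR_PLUGIN_AGENTS = ("permissionMode", "hooks", "mcpServers")


def validate_agent(definition: dict, origin: str) -> dict:
    """Reject plugin-defined agents that set restricted fields.

    origin is "plugin" or "user"; only user-defined agents may set the
    fields listed in FORBIDDEN_FOR_PLUGIN_AGENTS.
    """
    if origin == "plugin":
        present = [key for key in FORBIDDEN_FOR_PLUGIN_AGENTS if key in definition]
        if present:
            raise PermissionError(
                f"plugin agents may not set: {', '.join(present)}"
            )
    return definition
```

Rejecting the whole definition, rather than silently dropping the forbidden fields, makes an escalation attempt visible to the user instead of hiding it.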

Reconciliation-based install

Plugin install/update uses Kubernetes-style reconciliation:

Desired state — list of plugins the user wants enabled
Actual state — what’s currently installed
Diff — compute the delta
Apply — install/update/remove in background, non-blocking

Why: declarative install is idempotent, auditable, and safer than imperative install. You can re-run setup 100 times and the result is the same. You can diff the desired state against a previous snapshot to see what changed.
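
The four steps can be sketched as a pure diff followed by an apply. State is modeled here as a plugin-name-to-version mapping, which is an assumption, and the background, non-blocking execution is left out; plugin names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    install: dict[str, str]
    update: dict[str, str]
    remove: list[str]


def diff(desired: dict[str, str], actual: dict[str, str]) -> Plan:
    """Compute the delta between desired and installed plugin versions."""
    return Plan(
        install={n: v for n, v in desired.items() if n not in actual},
        update={n: v for n, v in desired.items() if n in actual and actual[n] != v},
        remove=sorted(n for n in actual if n not in desired),
    )


def apply(plan: Plan, actual: dict[str, str]) -> dict[str, str]:
    """Return the new actual state; a real installer would do I/O here."""
    result = {n: v for n, v in actual.items() if n not in plan.remove}
    result.update(plan.install)
    result.update(plan.update)
    return result


desired = {"lsp-python": "1.2.0", "review": "2.0.0"}
actual = {"lsp-python": "1.1.0", "old-plugin": "0.9.0"}
plan = diff(desired, actual)
converged = apply(plan, actual)
```

Running `diff(desired, converged)` again yields an empty plan, which is the idempotence property: once actual matches desired, re-running setup does nothing.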

Skills change instructions; tools change capabilities

This is worth stating again because it’s easy to confuse:

If the agent can’t DO something, add a tool
If the agent doesn’t KNOW to do something, add a skill

A skill that says “search with ripgrep” doesn’t give the agent ripgrep — it tells the agent when to use the Grep tool that already exists. A tool that adds ripgrep support gives the agent a new capability regardless of whether any skill uses it.

Confusing them produces failures:

Skill with no tool support → agent can’t actually do the thing
Tool with no skill guidance → agent doesn’t know when to use it
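
Both failure modes can be detected mechanically if skills declare the tools they rely on. A small sketch, assuming a hypothetical mapping from skill name to required tool names:

```python
def audit(skills: dict[str, set[str]], tools: set[str]) -> dict[str, list[str]]:
    """Report both kinds of mismatch between skills and tools."""
    referenced = set().union(*skills.values())
    return {
        # Skills that name a tool the agent does not have.
        "skills_missing_tools": sorted(
            name for name, needs in skills.items() if not needs <= tools
        ),
        # Tools that no skill ever tells the agent to use.
        "tools_without_guidance": sorted(tools - referenced),
    }


report = audit(
    skills={"investigate": {"Grep"}, "research": {"WebFetch"}},
    tools={"Grep", "FileEdit"},
)
```

Here `research` would be flagged because `WebFetch` is not installed, and `FileEdit` would be flagged because no skill mentions it. The second finding is a prompt for review rather than an error, since many tools are useful without skill guidance.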

Official plugin landscape

Claude Code ships with 32 official plugins, organized by purpose:

12 LSP — language-server integrations (TypeScript, Python, Rust, Go, etc.)
10 workflow — slash commands for common tasks
6 meta — plugin management, debugging, diagnostics
4 output — formatters, exporters

Plus 16 bundled skills compiled into the binary.

Lesson: a mature agent system ships with a substantial extension library even before third parties contribute. Being extensible is not enough; you also need a seed of high-quality extensions to demonstrate the patterns and provide immediate value.

Anti-patterns

Extensions with install-time capability escalation. A plugin that grants itself new permissions during install. Compromise.
Skills without gitignore check. Every npm install becomes a potential prompt injection.
MCP skills allowed to execute shell. Remote shell execution. Full RCE.
Plugin agents that can add hooks. Plugin can hook arbitrary code into every session. Persistence.
No trust tiering. Bundled = user = project = MCP = managed, all treated the same. Every source then gets the capabilities of the most trusted one, so the system is only as safe as its least trustworthy source.
Imperative install with side effects. npm install my-plugin runs arbitrary postinstall. Not reconcilable, not auditable, not idempotent.
Hot reload without signature verification. Plugin updates itself at runtime. No integrity check.
Extensions modify core files. Plugins that patch the binary or core scripts. Impossible to reason about state.
No extension uninstall. You can add a plugin but never remove it cleanly.
Skills with implicit tool dependencies. Skill assumes WebFetch exists; on systems without it, silent failure.
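
For the hot-reload anti-pattern, the cheapest mitigation is an integrity check against a digest recorded at install time. Note that this is a pinned hash, not a signature: it detects that content changed but does not authenticate who published it. The function name is an assumption.

```python
import hashlib
import hmac


def verify_digest(content: bytes, pinned_sha256: str) -> bool:
    """Return True only if content matches the digest recorded at install time."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(actual, pinned_sha256.lower())
```

A reload path would call `verify_digest` on the new file contents and refuse to swap the extension in when it returns `False`.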

Takeaways for harness engineering

Skills ≠ tools. Know which you’re adding. Don’t confuse instruction extension with capability extension.
4 extension points: commands, agents, hooks, servers. Anything you want third parties to add should fall into one of these categories. Don’t invent new extension surfaces.
Trust tiers by source. Bundled > user > project > MCP > managed. Enforce the tier at runtime.
Sandbox plugin-defined agents. No permission mode changes, no hooks, no MCP server declarations.
Gitignore check every content loader. Prevents supply-chain injection via package directories.
MCP extensions cannot execute shell. Content from a remote server is already untrusted; letting it run shell commands would give that server a remote shell.
Reconciliation install. Declarative desired state. Diff against actual. Apply delta. Idempotent.
Conditional activation. Path-based skill activation for monorepos. Skills appear/disappear based on CWD.
Ship a seed library. An extensible system with no extensions looks broken. Seed it with high-quality examples.
Extensions have uninstall stories. “Remove this plugin” should leave no traces.
Version every extension. Skills, plugins, hooks — all versioned. Rollback is a first-class operation.

What this repo does

Skills as markdown-first extensions. Each skill in skills/<name>/ has a SKILL.md with frontmatter defining name, description, allowed-tools, and optional hook registrations. This matches Claude Code’s skill format.
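
To make the format concrete, here is a sketch of reading such a file. The parser handles only flat `key: value` lines and is not the loader Claude Code uses; the sample description, the tool names, and the comma-separated `allowed-tools` value are invented for illustration.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md into (frontmatter fields, markdown body).

    Handles only flat `key: value` lines between two `---` fences; a real
    loader would use a YAML parser.
    """
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        return {}, text
    try:
        end = stripped.index("---", 1)
    except ValueError:
        return {}, text  # no closing fence, treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


# Hypothetical file contents; only the field names come from the text above.
SAMPLE = """---
name: careful
description: Ask before destructive commands
allowed-tools: Bash, Read
---
Confirm with the user before running destructive commands.
"""

meta, body = parse_skill(SAMPLE)
```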
22 skills seeded. See main README.md for the full catalog.
Path-based scope (implicit). Skills are loaded from .claude/skills/ in the target project. When this distribution is symlinked as .claude/, the symlinked skills become project-scoped.
Hook-registering skills. skills/careful/ and skills/freeze/ both declare PreToolUse hooks in their SKILL.md frontmatter. The skill brings its own enforcement.
Venv for hook dependencies. setup-hooks installs lint tools into .venv/. The same pattern could hold future Python-based extensions.
Per-hook disable flag. .ckconfig.json lets users disable individual hooks — a form of extension control.

Gaps in this repo

No MCP server wiring. The distribution doesn’t ship or reference any MCP servers. When the user adds one (e.g., an Atlassian or Slack connector), it’s machine-specific in settings.local.json.
No trust tiering. All skills in the repo are treated the same. No bundled vs user vs project distinction.
No gitignore check on skill loading. If a skill were added in a gitignored directory, it would still load. (Claude Code itself handles this at the core level.)
No skill versioning discipline. SKILL.md frontmatter has a version field but we don’t track rollbacks or changelogs per skill.
No plugin system. The repo is a distribution, not a plugin marketplace. If the user wanted to share individual skills with others, a real plugin format would be the next step.
No reconciliation-based install. setup-hooks is imperative bash. Fine for a single-repo distribution; wouldn’t scale to plugin management.

Open problems

Skill versioning and rollback. If a skill update breaks something, how do you revert just that skill without reverting the whole distribution?
Plugin marketplace story. If the user wants to share skills with others (e.g., “the harness engineer starter pack”), what’s the packaging + install format?
MCP security posture. When users add MCP servers, how are they vetted? Who decides which servers are safe?
Conditional skill activation. Path-based activation isn’t implemented in this repo — all skills are always available. In a monorepo, some skills should only activate in certain subdirectories.
Extension telemetry. Which skills get used most? Which fire but fail? harness-tune could surface this but doesn’t yet.
