memex/notes/comparative-extensibility.org

#+TITLE: Comparative Extension Architecture Study
#+FILETAGS: :notes:comparative-study:extensibility:skills:plugins:hooks:mcp:

* Purpose

Compare extension/skill/plugin/hook architectures across Claude Code, OpenCode, OpenClaw, and Hermes Agent. Inform Passepartout's skill system and planned MCP integration (v0.10.0). Identify whether Passepartout needs hooks or plugins in addition to skills.

* Findings Summary

| Dimension | Claude Code | OpenCode | OpenClaw | Hermes | Passepartout |
|-----------+-------------+----------+----------+--------+--------------|
| Extension mechanisms | 4 (MCP, plugins, skills, hooks) | 2 (plugins + skills) | 1 (plugins — everything) | 2 (skills + plugins) | 1 (skills) + planned MCP |
| Skills format | SKILL.md with YAML frontmatter | SKILL.md with YAML frontmatter | None (plugins serve role) | SKILL.md with YAML frontmatter | .org files with defskill |
| Skill security | Path-conditional, symlink-safe, shell injection disabled for MCP-provided skills | None | N/A | Static analysis + trust levels + quarantine | Jail-loaded packages + sandbox check |
| Plugins | npm packages, manifest JSON, 3 scopes | npm packages, Effect functions | npm packages, manifest JSON | Bundled/user/project/pip chain | Not implemented |
| Hooks | 27 events, 4 types (cmd/prompt/agent/http) | ~10 events via plugin functions | 36 events via plugin hooks | 20 events via plugin callbacks | defskill triggers only |
| Lifecycle | PreToolUse, PostToolUse, Session*, etc. | tool.definition, session.*, etc. | before_*, after_*, session_*, etc. | pre_tool_call, post_tool_call, etc. | None beyond trigger |
| MCP | Deepest: stdio/SSE/WS, OAuth, MCP skills, enterprise | Minimal: MCP web search only | Bidirectional: serve + consume MCP | Deep: stdio/HTTP, auto-reload, OSV check | Planned v0.10.0 |
| Tool registration | TypeScript Tool objects, Zod schemas | Effect Tool.Def, plugin + file glob | Plugin tool factories | Self-registration at import, AST discovery | def-cognitive-tool (never called) |

* Claude Code — Four Mechanisms, Clear Boundaries

**Skills**: Two types — Bundled (TypeScript, compiled in, feature-flagged) and File-based (SKILL.md in .claude/skills/<name>/). YAML frontmatter with: name, description, when_to_use, allowed-tools, model, user-invocable, paths (conditional activation), hooks, context, agent, effort. Path-conditional: skills activate only when matching files are touched (gitignore-style matching). Dynamic discovery walking up from file paths.
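
Path-conditional activation can be pictured with a minimal sketch. It assumes each skill carries a list of glob patterns taken from its =paths= frontmatter field, and that a skill without patterns is always active (an assumption, not confirmed by the source). =fnmatch= stands in for gitignore-style matching, which it only approximates (no negation or =**= semantics). The names =Skill= and =active_skills= are hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Skill:
    name: str
    # Glob patterns from the `paths` frontmatter field.
    # Assumption: an empty list means "always active".
    paths: list = field(default_factory=list)


def active_skills(skills, touched_files):
    """Return names of skills whose patterns match at least one touched file."""
    active = []
    for skill in skills:
        unconditional = not skill.paths
        matched = any(
            fnmatch(path, pattern)
            for path in touched_files
            for pattern in skill.paths
        )
        if unconditional or matched:
            active.append(skill.name)
    return active
```

With a =migrations= skill restricted to =db/migrations/*.sql= and an unrestricted =general= skill, touching only =src/app.py= would activate just =general=.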

**Plugins**: package.json manifest + Bun plugin ecosystem. PluginDefinition unifies skills, hooks, MCP servers, LSP servers, agents, output styles. Three scopes (user/project/local) with precedence. Settings-first installation: intent written before materialization. Enterprise policy blocking.

**Hooks**: 27 lifecycle events in HOOK_EVENTS. Four types: command (shell), prompt (LLM), agent (subagent), http (callback). JSON protocol: stdin→stdout structured communication. Permission decisions (allow/deny/ask) with updatedInput. Trust dialog gate. 10-minute timeout. Async backgrounding support.
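
The stdin→stdout protocol for a command hook can be sketched as follows. This is an illustration of the shape only: the field names (=tool_input=, =decision=, =reason=, =updatedInput=) and the policy rules are assumptions made for this note, not Claude Code's documented schema.

```python
import json
import sys

# Illustrative policy only; these substrings are not taken from any real config.
BLOCKED_SUBSTRINGS = ("rm -rf /", "mkfs")


def decide(event):
    """Map one hook event (a dict parsed from stdin) to a permission decision."""
    tool_input = event.get("tool_input", {})
    command = tool_input.get("command", "")
    if any(s in command for s in BLOCKED_SUBSTRINGS):
        return {"decision": "deny", "reason": "destructive command"}
    if command.startswith("git push"):
        return {"decision": "ask", "reason": "confirm before pushing"}
    if command.startswith("npm install") and "--ignore-scripts" not in command:
        # Allow, but rewrite the input before the tool runs.
        rewritten = dict(tool_input, command=command + " --ignore-scripts")
        return {"decision": "allow", "updatedInput": rewritten}
    return {"decision": "allow"}


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON event from stdin and write one JSON decision to stdout."""
    json.dump(decide(json.load(stdin)), stdout)
```

A script used as a command hook would end with a call to =main()=. Keeping =decide= pure makes the policy testable without spawning a process.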

**MCP**: stdio, SSE, WebSocket, in-process transports. Tool naming: mcp__<server>__<tool>. MCP servers can provide skills (loaded from MCP, shell injection disabled). OAuth 2.1. Connection management with background reconnection.
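
The =mcp__<server>__<tool>= naming scheme is easy to build and parse. The sketch below assumes server names never contain a double underscore, so the first =__= after the prefix is taken as the server/tool boundary; both function names are hypothetical.

```python
def mcp_tool_name(server, tool):
    """Build the qualified tool name for a tool exposed by an MCP server."""
    return f"mcp__{server}__{tool}"


def parse_mcp_tool_name(qualified):
    """Return (server, tool), or None if the name is not an MCP tool name.

    Assumption: server names contain no "__", so the split is unambiguous.
    """
    prefix, sep, rest = qualified.partition("__")
    if prefix != "mcp" or not sep:
        return None
    server, sep, tool = rest.partition("__")
    if not sep or not server or not tool:
        return None
    return server, tool
```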

**Why 4 mechanisms?** The paper (arXiv:2604.14228v1) explains that each mechanism makes a different trade-off. MCP = process isolation, highest safety. Plugins = npm ecosystem, maximum capability. Skills = markdown files, zero-install, lowest context cost. Hooks = intercept behavior without defining full tools. Each addresses a different point on the safety↔capability spectrum.

* OpenCode — Simple, Functional

**Plugins**: Effect-TS functions of the shape (PluginInput) → Hooks; the returned hooks fire on bus events. Tool provisioning: plugins contribute Tool.Def entries. File-based tool discovery: Glob.scanSync("{tool,tools}/*.{js,ts}").

**TUI plugins (separate system)**: Slot-based JSX rendering. 10+ named slots. SolidJS reactive. Full API: route, theme, keymap, sdk, dialog, toast, slot, sync, lifecycle, kv.

**Skills**: SKILL.md files from .opencode/skills/, .claude/skills/, .agents/skills/. Simple YAML frontmatter (name, description). Loaded lazily, cached. Content injected as <skill_content> blocks.

**Hooks**: ~10 hook events. Tool definition mutation (plugin.trigger("tool.definition")). Message transform. Session lifecycle.

**Unique**: Effect-TS DI makes cancellation/safety composable. TUI slot system enables rich terminal UI extensions. Tool definition mutation at registration time.

* OpenClaw — Everything is a Plugin

**Unified plugin system**: Everything is a plugin — providers, tools, memory, context engines, channels, web search, speech, TTS, video, image, transcription, migration, CLI backends. One mechanism to rule them all.

**Plugin entry**: definePluginEntry({id, name, description, kind?, configSchema?, register}). The register(api) callback receives the full API surface. Kind declaration in manifest.

**36 lifecycle hooks**: Most extensive hook surface of all agents. before_model_resolve, agent_turn_prepare, before_prompt_build, before_tool_call, after_tool_call, session_start/end, subagent_spawning/spawned/ended, gateway_start/stop, heartbeat_prompt_contribution, before_install, and more.

**Tool factories**: (ctx: OpenClawPluginToolContext) => AnyAgentTool[]. Tools auto-exposed as MCP server. Plugin tool MCP for other agents to consume.

**Isolation**: SQLite-backed plugin state store. Path traversal protection. Activation gating. Memory slot competition (only one memory plugin). Security audit collector interface.

**Auto-enable**: Complex detection rules (e.g., "auto-enable this plugin when using the OpenAI provider"). Manifest-first: capabilities declared in openclaw.plugin.json.

**Unique**: Universal plugin mechanism eliminates concept count. Auto-enable rules. Plugin tools available as MCP to external agents. Most granular isolation.

* Hermes Agent — Skills with Security Scanning

**Skills**: SKILL.md with rich YAML frontmatter (name, description, version, license, platforms, prerequisites, compatibility, metadata.tags). Progressive disclosure (list metadata → view content → load references). Agentskills.io compatible. Skill Hub with multiple sources (bundled, GitHub, extensible SkillSource ABC). Hub directory with lock file, quarantine, audit log, taps.

**Skill security (most sophisticated of all)**: Static analysis scanner detecting data exfiltration, prompt injection, destructive commands, persistence, network access, obfuscation. Trust level matrix (builtin/trusted/community/agent-created × safe/caution/dangerous → allow/block/ask). Quarantine before install.
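
The trust level matrix amounts to a lookup from (trust level, scan verdict) to an action. The sketch below shows only that shape. The row and column labels come from the description above, but every cell value is a hypothetical placeholder, since these notes do not record the actual policy Hermes uses.

```python
TRUST_LEVELS = ("builtin", "trusted", "community", "agent-created")
VERDICTS = ("safe", "caution", "dangerous")

# Hypothetical cell values for illustration; not Hermes's real policy.
POLICY = {
    "builtin":       {"safe": "allow", "caution": "allow", "dangerous": "ask"},
    "trusted":       {"safe": "allow", "caution": "ask",   "dangerous": "block"},
    "community":     {"safe": "allow", "caution": "ask",   "dangerous": "block"},
    "agent-created": {"safe": "ask",   "caution": "ask",   "dangerous": "block"},
}


def install_action(trust, verdict):
    """Look up what to do with a scanned skill: allow, block, or ask."""
    if trust not in POLICY or verdict not in POLICY[trust]:
        # Fail closed on anything the matrix does not cover.
        return "block"
    return POLICY[trust][verdict]
```

Failing closed on unknown inputs is a design choice of this sketch: a new trust level or verdict added later cannot silently bypass the policy.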

**Plugins**: Four-source chain (bundled → user → project → pip). Manifest: plugin.yaml + __init__.py with register(ctx). 20 hooks (pre/post tool call, pre/post LLM call, session start/end, subagent, transform hooks, gateway hooks, config loaded).

**Toolsets**: Composable tool groups (define → include → resolve). Probability-based distribution for batch runs. AST-based auto-discovery of tool registrations.
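
The define → include → resolve flow can be illustrated with a small resolver. The dictionary layout (=tools= and =includes= keys) and the function name are assumptions for this sketch, not Hermes's actual data model.

```python
def resolve_toolset(name, toolsets):
    """Flatten a toolset and everything it includes into a sorted tool list."""
    seen = set()

    def walk(current):
        if current in seen:  # guards against include cycles and repeats
            return set()
        seen.add(current)
        spec = toolsets[current]
        tools = set(spec.get("tools", ()))
        for included in spec.get("includes", ()):
            tools |= walk(included)
        return tools

    return sorted(walk(name))


# Define: three toolsets, one of which includes the other two.
TOOLSETS = {
    "web": {"tools": ["search", "fetch"]},
    "files": {"tools": ["read", "write"]},
    "research": {"tools": ["summarize"], "includes": ["web", "files"]},
}
```

Resolving =research= against =TOOLSETS= yields the union of its own tool and the tools of =web= and =files=.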

**MCP**: stdio + HTTP transports. Background asyncio event loop. Auto-reconnect with exponential backoff. Auto-reload on config.yaml change. OSV malware check for npx/uvx packages.
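
Auto-reconnect with exponential backoff follows a standard pattern, sketched below. The base delay, growth factor, cap, and attempt count are illustrative assumptions; these notes do not record the values Hermes uses.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Return the wait before each retry: base * factor**n, capped at `cap`."""
    return [min(cap, base * factor**n) for n in range(attempts)]


def reconnect(connect, delays, sleep=time.sleep):
    """Call `connect` until it succeeds, sleeping between failed attempts."""
    for delay in delays:
        try:
            return connect()
        except ConnectionError:
            sleep(delay)
    raise ConnectionError("gave up after %d attempts" % len(delays))
```

Passing =sleep= as a parameter keeps the loop testable without real waiting. A production version would usually add random jitter so many clients do not retry in lockstep.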

**Unique**: Skill security scanner. Skill hub with quarantine. Toolset composability. MCP auto-reload.

* Skills vs Hooks — The Distinction

| Agent | Skills (Instruction Injection) | Hooks (Behavior Interception) | Unified? |
|-------+-------------------------------+------------------------------+----------|
| Claude Code | Command objects, inject markdown into context | 27 lifecycle events, 4 types, JSON protocol | Blurred — skills CAN have hooks in frontmatter |
| OpenCode | Markdown files loaded as <skill_content> | Plugin functions on lifecycle names | Separate — skills are content, hooks are code |
| OpenClaw | No concept — everything is a plugin | 36 events, everything is a hook | Unified — no distinction exists |
| Hermes | SKILL.md with security scanning | 20 events via plugin callbacks | Separate — different formats |
| **Passepartout** | defskill with trigger + deterministic | **None** | **Skills only — missing hooks entirely** |

* Passepartout Blindspot Assessment

1. **No hooks** — Passepartout's skills fire on triggers but can't intercept tool execution, model calls, or session lifecycle. All 4 competitors have hooks. This is the biggest extension architecture gap. Claude Code's PreToolUse hook pattern is the cleanest: a registered function that can inspect a tool call before execution and return allow/deny/ask with optional input modification. [Action: add Hook system to v0.8.0+]

2. **def-cognitive-tool never called** — The macro and registry exist, but the macro is never invoked, so the registry is empty. Claude Code's Tool interface with 10+ methods (call, isEnabled, isReadOnly, isConcurrencySafe, checkPermissions, validateInput) is a richer model. [Action: fill registry in v0.4.1, as already planned]

3. **No skill security scanning** — Hermes's static analysis scanner is the gold standard. Passepartout's jail-loading + sandbox check is good but could be augmented with regex-based content scanning. [Action: add to security study]

4. **No plugin system** — OpenClaw's universal plugin system proves one mechanism can serve all extensibility needs. Passepartout should consider whether skills + hooks + MCP is sufficient, or whether a plugin manifest is needed. [Action: decision point for v0.10.0]

5. **Skills are Lisp-specific** — All competitors use markdown/JSON for skills (language-agnostic). Passepartout's .org files with defskill macros require knowing Common Lisp. For the Skill Creator (v0.11.0), this is fine (LLM writes Lisp). For user-authored skills, markdown frontmatter with Lisp code blocks would lower the barrier. [Action: consider dual format]

6. **MCP integration planned but not prioritized** — Claude Code and Hermes both have deep MCP. Passepartout's v0.10.0 placement is correct. No action needed, just schedule awareness.
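
The regex-based content scanning suggested in item 3 could start as small as the sketch below. The categories echo those listed for the Hermes scanner, but the patterns themselves are hypothetical examples and far from a complete rule set. Python is used here only to show the logic; a Passepartout version would be written in Common Lisp.

```python
import re

# Hypothetical starter rules; a real scanner would need many more.
RULES = {
    "destructive": re.compile(r"\brm\s+-rf\b"),
    "network": re.compile(r"https?://|\bcurl\b|\bwget\b"),
    "prompt-injection": re.compile(r"ignore (all )?previous instructions", re.I),
}


def scan_skill(text):
    """Return (line number, category) for every rule hit in a skill file."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, category))
    return findings
```

An empty result means only that no rule fired, not that the skill is safe, which is why this would augment rather than replace jail-loading and the sandbox check.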

* Recommended Architecture for Passepartout

Based on competitive analysis, Passepartout should converge on 3 mechanisms (not Claude Code's 4, not OpenClaw's 1):

1. **Skills** (defskill, .org format) — Instruction injection. Current state: working, needs security scanning + dual format option.
2. **Hooks** (new, PreToolUse/PostToolUse/Session*) — Behavior interception. Missing entirely. Add as slots on the defskill struct: :pre-tool-hook, :post-tool-hook, :on-session-start, :on-heartbeat.
3. **MCP** (planned v0.10.0) — External tool ecosystem. Process-isolated, community-maintained.

This avoids the complexity of plugins (Claude Code/OpenClaw) while covering the essential extension surface. The key addition is hooks — without them, Passepartout can't intercept tool execution or respond to lifecycle events.
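
The control flow of a pre-tool hook slot can be sketched independently of the implementation language. The version below is Python for illustration only (Passepartout itself would express this in Common Lisp on the defskill struct), and all names are hypothetical. It assumes hooks run in skill order, the first non-allow decision short-circuits, and input rewrites accumulate.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HookResult:
    decision: str = "allow"  # "allow" | "deny" | "ask"
    updated_input: Optional[dict] = None


@dataclass
class HookedSkill:
    name: str
    # Mirrors a :pre-tool-hook slot; None means the skill defines no hook.
    pre_tool_hook: Optional[Callable[[str, dict], HookResult]] = None


def run_pre_tool_hooks(skills, tool, tool_input):
    """Return (decision, effective input) after running every pre-tool hook."""
    current = dict(tool_input)
    for skill in skills:
        if skill.pre_tool_hook is None:
            continue
        result = skill.pre_tool_hook(tool, current)
        if result.decision != "allow":
            return result.decision, current  # first deny/ask wins
        if result.updated_input is not None:
            current = result.updated_input  # later hooks see the rewrite
    return "allow", current
```

Letting later hooks see earlier rewrites is one possible ordering policy. The alternative, giving every hook the original input, avoids order dependence but makes conflicting rewrites harder to reconcile.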