memex/notes/comparative-extensibility.org
Amr Gharbeia 4e9431ec1d memex: update passepartout submodule → v0.7.2, add notes
passepartout v0.7.2 (Gate Trace + HITL + Search + 11 more features):
- Gate trace visualization with Ctrl+G toggle
- HITL inline panels with styled collapse on approve/deny
- Agent identity file + /identity command
- Safe-tool read-only allowlist
- Message search mode with Up/Down nav and highlights
- Context budget visibility with section breakdown
- Session rewind /sessions /resume /rewind
- Undo/redo per operation
- Context debugging /context why /context dropped
- Tool hardening (timeouts, write verify, read-only cache)
- Tag stack severity tiers + trigger counts
- Merkle provenance audit + audit-verify
- Self-help /help <topic> reads USER_MANUAL.org
- Live CONFIG section in system prompts
- Pads: Page Up/Down scroll by 10 lines

Core 92/92  TUI Main 104/104  TUI View 29/29  Neuro 13/13
2026-05-08 21:56:11 -04:00

#+TITLE: Comparative Extension Architecture Study
#+FILETAGS: :notes:comparative-study:extensibility:skills:plugins:hooks:mcp:
* Purpose
Compare extension/skill/plugin/hook architectures across Claude Code, OpenCode, OpenClaw, and Hermes Agent. Inform Passepartout's skill system and planned MCP integration (v0.10.0). Identify whether Passepartout needs hooks or plugins in addition to skills.
* Findings Summary
| Dimension | Claude Code | OpenCode | OpenClaw | Hermes | Passepartout |
|-----------+-------------+----------+----------+--------+--------------|
| Extension mechanisms | 4 (MCP, plugins, skills, hooks) | 2 (plugins + skills) | 1 (plugins — everything) | 2 (skills + plugins) | 1 (skills) + planned MCP |
| Skills format | SKILL.md with YAML frontmatter | SKILL.md with YAML frontmatter | None (plugins serve role) | SKILL.md with YAML frontmatter | .org files with defskill |
| Skill security | Path-conditional, symlink-safe, MCP shell injection disabled | None | N/A | Static analysis + trust levels + quarantine | Jail-loaded packages + sandbox check |
| Plugins | npm packages, manifest JSON, 3 scopes | npm packages, Effect functions | npm packages, manifest JSON | Bundled/user/project/pip chain | Not implemented |
| Hooks | 27 events, 4 types (cmd/prompt/agent/http) | ~10 events via plugin functions | 36 events via plugin hooks | 20 events via plugin callbacks | defskill triggers only |
| Lifecycle | PreToolUse, PostToolUse, Session*, etc. | tool.definition, session.*, etc. | before_*, after_*, session_*, etc. | pre_tool_call, post_tool_call, etc. | None beyond trigger |
| MCP | Deepest: stdio/SSE/WS, OAuth, MCP skills, enterprise | Minimal: MCP web search only | Bidirectional: serve + consume MCP | Deep: stdio/HTTP, auto-reload, OSV check | Planned v0.10.0 |
| Tool registration | TypeScript Tool objects, Zod schemas | Effect Tool.Def, plugin + file glob | Plugin tool factories | Self-registration at import, AST discovery | def-cognitive-tool (never called) |
* Claude Code — Four Mechanisms, Clear Boundaries
**Skills**: Two types — Bundled (TypeScript, compiled in, feature-flagged) and File-based (SKILL.md in .claude/skills/<name>/). YAML frontmatter with: name, description, when_to_use, allowed-tools, model, user-invocable, paths (conditional activation), hooks, context, agent, effort. Path-conditional: skills activate only when matching files are touched (gitignore-style matching). Dynamic discovery walking up from file paths.
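Path-conditional activation can be sketched in a few lines of Python. This is an illustration only: the `Skill` record and `active_skills` helper are hypothetical, treating an empty `paths` list as "always active" is an assumption, and `fnmatch` only approximates gitignore-style matching (no negation, no directory-anchored rules).

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Skill:
    """Hypothetical in-memory view of a file-based skill."""
    name: str
    paths: list[str] = field(default_factory=list)  # assumption: empty = always active


def active_skills(skills: list[Skill], touched: list[str]) -> list[str]:
    """Return the names of skills whose patterns match any touched file."""
    result = []
    for skill in skills:
        if not skill.paths or any(
            fnmatch(path, pattern) for path in touched for pattern in skill.paths
        ):
            result.append(skill.name)
    return result


SKILLS = [
    Skill("commit-style"),  # unconditional
    Skill("sql-review", paths=["*.sql", "migrations/*"]),
    Skill("react-lint", paths=["src/*.tsx"]),
]
```

Calling `active_skills(SKILLS, ["src/App.tsx"])` selects the unconditional skill plus `react-lint`, so the SQL guidance never enters the context for a frontend edit.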
**Plugins**: package.json manifest + Bun plugin ecosystem. PluginDefinition unifies skills, hooks, MCP servers, LSP servers, agents, output styles. Three scopes (user/project/local) with precedence. Settings-first installation: intent written before materialization. Enterprise policy blocking.
**Hooks**: 27 lifecycle events in HOOK_EVENTS. Four types: command (shell), prompt (LLM), agent (subagent), http (callback). JSON protocol: stdin→stdout structured communication. Permission decisions (allow/deny/ask) with updatedInput. Trust dialog gate. 10-minute timeout. Async backgrounding support.
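The command-hook protocol described above (one JSON event in on stdin, one JSON decision out on stdout) might look roughly like the following Python sketch. The field names `tool_name`, `tool_input`, `decision`, and `reason` are simplified assumptions and not Claude Code's exact schema; only the allow/deny/ask decisions and `updatedInput` come from the notes above. The policy itself is invented for illustration.

```python
import json

BLOCKED_PREFIXES = ("rm -rf", "git push --force")  # illustrative policy only


def decide(event: dict) -> dict:
    """Map a pre-tool-use event to an allow/deny/ask decision."""
    tool = event.get("tool_name", "")
    tool_input = dict(event.get("tool_input", {}))
    command = tool_input.get("command", "")

    if tool == "Bash" and command.startswith(BLOCKED_PREFIXES):
        return {"decision": "deny", "reason": f"blocked command: {command}"}
    if tool == "Write" and not tool_input.get("path", "").startswith("/workspace/"):
        return {"decision": "ask", "reason": "write outside /workspace"}
    if tool == "Bash" and command.startswith("curl "):
        # Rewrite the input instead of refusing the call.
        tool_input["command"] = command + " --max-time 30"
        return {"decision": "allow", "updatedInput": tool_input}
    return {"decision": "allow"}


def main(stdin, stdout) -> None:
    """Read one JSON event from stdin and write one JSON decision to stdout."""
    json.dump(decide(json.load(stdin)), stdout)
```

A script wrapping this would call `main(sys.stdin, sys.stdout)`. Keeping `decide` a pure function makes the policy testable without spawning a process.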
**MCP**: stdio, SSE, WebSocket, in-process transports. Tool naming: mcp__<server>__<tool>. MCP servers can provide skills (loaded from MCP, shell injection disabled). OAuth 2.1. Connection management with background reconnection.
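The `mcp__<server>__<tool>` naming scheme is easy to build and parse. The helpers below are a hypothetical sketch; in particular, splitting on the first `__` after the server name (so tool names may contain `__` but server names may not) is an assumption, not documented behavior.

```python
def mcp_tool_name(server: str, tool: str) -> str:
    """Build a qualified tool name of the form mcp__<server>__<tool>."""
    if not server or not tool or "__" in server:
        raise ValueError("server and tool must be non-empty; server must not contain '__'")
    return f"mcp__{server}__{tool}"


def parse_mcp_tool_name(qualified: str) -> tuple[str, str]:
    """Split a qualified name back into (server, tool)."""
    prefix, sep, rest = qualified.partition("__")
    if prefix != "mcp" or not sep:
        raise ValueError(f"not an MCP tool name: {qualified!r}")
    server, sep, tool = rest.partition("__")
    if not sep or not server or not tool:
        raise ValueError(f"malformed MCP tool name: {qualified!r}")
    return server, tool
```

The prefix keeps MCP tools from colliding with built-in tool names and lets permission rules target a whole server at once.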
**Why 4 mechanisms?** According to the paper (arXiv:2604.14228v1), each mechanism makes a different trade-off. MCP = process isolation, highest safety. Plugins = npm ecosystem, maximum capability. Skills = markdown files, zero-install, lowest context cost. Hooks = intercept behavior without defining full tools. Each addresses a different point on the safety↔capability spectrum.
* OpenCode — Simple, Functional
**Plugins**: Effect-TS functions. (PluginInput) → Hooks. Fires on bus events. Tool provisioning: plugins contribute Tool.Def entries. File-based tool discovery: Glob.scanSync("{tool,tools}/*.{js,ts}").
**TUI plugins (separate system)**: Slot-based JSX rendering. 10+ named slots. SolidJS reactive. Full API: route, theme, keymap, sdk, dialog, toast, slot, sync, lifecycle, kv.
**Skills**: SKILL.md files from .opencode/skills/, .claude/skills/, .agents/skills/. Simple YAML frontmatter (name, description). Loaded lazily, cached. Content injected as <skill_content> blocks.
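A minimal Python sketch of the load path described above: split the frontmatter, cache the parsed file, and wrap the body for injection. All function names are hypothetical, and the frontmatter parser handles only flat `key: value` pairs; real YAML would need a proper YAML parser.

```python
from functools import lru_cache
from pathlib import Path


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split '---'-delimited frontmatter from the body (flat key: value only)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


@lru_cache(maxsize=None)
def load_skill(path: str) -> tuple[str, str, str]:
    """Read and parse a SKILL.md once; later calls are served from the cache."""
    meta, body = split_frontmatter(Path(path).read_text(encoding="utf-8"))
    return meta.get("name", ""), meta.get("description", ""), body


def render_skill(body: str) -> str:
    """Wrap a skill body in the block that is injected into the context."""
    return f"<skill_content>\n{body}\n</skill_content>"
```

Only `name` and `description` need to sit in the prompt up front; the body is read and rendered when the skill is actually invoked.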
**Hooks**: ~10 hook events. Tool definition mutation (plugin.trigger("tool.definition")). Message transform. Session lifecycle.
**Unique**: Effect-TS DI makes cancellation/safety composable. TUI slot system enables rich terminal UI extensions. Tool definition mutation at registration time.
* OpenClaw — Everything is a Plugin
**Unified plugin system**: Everything is a plugin — providers, tools, memory, context engines, channels, web search, speech, TTS, video, image, transcription, migration, CLI backends. One mechanism to rule them all.
**Plugin entry**: definePluginEntry({id, name, description, kind?, configSchema?, register}). The register(api) callback receives the full API surface. Kind declaration in manifest.
**36 lifecycle hooks**: Most extensive hook surface of all agents. before_model_resolve, agent_turn_prepare, before_prompt_build, before_tool_call, after_tool_call, session_start/end, subagent_spawning/spawned/ended, gateway_start/stop, heartbeat_prompt_contribution, before_install, and more.
**Tool factories**: (ctx: OpenClawPluginToolContext) => AnyAgentTool[]. Plugin tools are auto-exposed through an MCP server so that other agents can consume them.
**Isolation**: SQLite-backed plugin state store. Path traversal protection. Activation gating. Memory slot competition (only one memory plugin). Security audit collector interface.
**Auto-enable**: Complex detection rules (e.g., "auto-enable this plugin when using the OpenAI provider"). Manifest-first: capabilities declared in openclaw.plugin.json.
**Unique**: A universal plugin mechanism collapses the concept count to one. Auto-enable rules. Plugin tools available as MCP to external agents. Most granular isolation.
* Hermes Agent — Skills with Security Scanning
**Skills**: SKILL.md with rich YAML frontmatter (name, description, version, license, platforms, prerequisites, compatibility, metadata.tags). Progressive disclosure (list metadata → view content → load references). Agentskills.io compatible. Skill Hub with multiple sources (bundled, GitHub, extensible SkillSource ABC). Hub directory with lock file, quarantine, audit log, taps.
**Skill security (most sophisticated of all)**: Static analysis scanner detecting data exfiltration, prompt injection, destructive commands, persistence, network access, obfuscation. Trust level matrix (builtin/trusted/community/agent-created × safe/caution/dangerous → allow/block/ask). Quarantine before install.
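The trust level matrix can be modeled as a lookup table. The four trust levels, three verdicts, and three outcomes below come from the description above; the individual cell values are assumptions made for illustration, since these notes do not record Hermes's actual policy.

```python
TRUST_LEVELS = ("builtin", "trusted", "community", "agent-created")
VERDICTS = ("safe", "caution", "dangerous")

# Assumed policy: rows are trust levels, columns are scanner verdicts.
POLICY = {
    "builtin":       {"safe": "allow", "caution": "allow", "dangerous": "ask"},
    "trusted":       {"safe": "allow", "caution": "ask",   "dangerous": "block"},
    "community":     {"safe": "allow", "caution": "ask",   "dangerous": "block"},
    "agent-created": {"safe": "ask",   "caution": "block", "dangerous": "block"},
}


def install_decision(trust: str, verdict: str) -> str:
    """Return allow/block/ask for a skill, failing closed on unknown inputs."""
    try:
        return POLICY[trust][verdict]
    except KeyError:
        return "block"
```

Keeping the policy as data rather than branching code makes it auditable at a glance and easy to tighten per deployment.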
**Plugins**: Four-source chain (bundled → user → project → pip). Manifest: plugin.yaml + __init__.py with register(ctx). 20 hooks (pre/post tool call, pre/post LLM call, session start/end, subagent, transform hooks, gateway hooks, config loaded).
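The `register(ctx)` pattern and the source chain can be sketched as follows. `PluginContext`, `register_hook`, `fire`, and `resolve_plugins` are hypothetical names rather than Hermes's API, and the rule that the earliest source in the chain wins on a name clash is an assumption; only the `pre_tool_call` event name and the source order come from the notes above.

```python
from collections import defaultdict
from typing import Callable

SOURCE_ORDER = ("bundled", "user", "project", "pip")


class PluginContext:
    """Hypothetical context object handed to each plugin's register()."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable]] = defaultdict(list)

    def register_hook(self, event: str, callback: Callable) -> None:
        self._hooks[event].append(callback)

    def fire(self, event: str, **payload) -> list:
        """Call every callback registered for the event, in registration order."""
        return [callback(**payload) for callback in self._hooks[event]]


def resolve_plugins(found: dict[str, dict[str, Callable]]) -> dict[str, Callable]:
    """Merge plugins by name across sources (assumption: earliest source wins)."""
    resolved: dict[str, Callable] = {}
    for source in SOURCE_ORDER:
        for name, register in found.get(source, {}).items():
            resolved.setdefault(name, register)
    return resolved


def register(ctx: PluginContext) -> None:
    """Example plugin entry point, as it might appear in __init__.py."""
    def audit(tool_name: str, args: dict) -> str:
        return f"about to call {tool_name}"

    ctx.register_hook("pre_tool_call", audit)
```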
**Toolsets**: Composable tool groups (define → include → resolve). Probability-based distribution for batch runs. AST-based auto-discovery of tool registrations.
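The define → include → resolve flow amounts to a recursive expansion with cycle protection. The data layout and the `resolve_toolset` name are assumptions made for this sketch; it does not cover the probability-based distribution.

```python
from typing import Optional

TOOLSETS = {
    "files": {"tools": ["read_file", "write_file"], "includes": []},
    "web": {"tools": ["web_search"], "includes": []},
    "research": {"tools": ["summarize"], "includes": ["files", "web"]},
}


def resolve_toolset(name: str, toolsets: dict, _seen: Optional[set] = None) -> list[str]:
    """Expand a toolset into a flat, de-duplicated, ordered list of tool names."""
    seen = set() if _seen is None else _seen
    if name in seen:
        return []  # already expanded; this also breaks include cycles
    seen.add(name)
    spec = toolsets[name]
    ordered: list[str] = []
    for included in spec.get("includes", []):
        ordered.extend(resolve_toolset(included, toolsets, seen))
    ordered.extend(spec.get("tools", []))
    return list(dict.fromkeys(ordered))  # drop duplicates, keep first occurrence
```

Included toolsets are expanded before a set's own tools, so `resolve_toolset("research", TOOLSETS)` lists the file and web tools ahead of `summarize`.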
**MCP**: stdio + HTTP transports. Background asyncio event loop. Auto-reconnect with exponential backoff. Auto-reload on config.yaml change. OSV malware check for npx/uvx packages.
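Auto-reconnect with exponential backoff follows a standard pattern, sketched here in synchronous Python for brevity. The base delay, factor, cap, and attempt count are assumed values, and jitter is omitted; none of this reflects Hermes's actual settings or its asyncio implementation.

```python
from typing import Callable


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 8) -> list[float]:
    """Return the wait before each retry: base, base*factor, ... capped at cap."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


def reconnect(connect: Callable[[], object], sleep: Callable[[float], None], **kwargs):
    """Try connect() until it succeeds, sleeping with growing delays in between."""
    for delay in backoff_delays(**kwargs):
        try:
            return connect()
        except ConnectionError:
            sleep(delay)
    raise ConnectionError("gave up after exhausting all retry attempts")
```

Passing `sleep` in as a parameter (for example `time.sleep`) keeps the retry loop testable without real waiting.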
**Unique**: Skill security scanner. Skill hub with quarantine. Toolset composability. MCP auto-reload.
* Skills vs Hooks — The Distinction
| Agent | Skills (Instruction Injection) | Hooks (Behavior Interception) | Unified? |
|-------+-------------------------------+------------------------------+----------|
| Claude Code | Command objects, inject markdown into context | 27 lifecycle events, 4 types, JSON protocol | Blurred — skills CAN have hooks in frontmatter |
| OpenCode | Markdown files loaded as <skill_content> | Plugin functions on lifecycle names | Separate — skills are content, hooks are code |
| OpenClaw | No concept — everything is a plugin | 36 events, everything is a hook | Unified — no distinction exists |
| Hermes | SKILL.md with security scanning | 20 events via plugin callbacks | Separate — different formats |
| **Passepartout** | defskill with trigger + deterministic | **None** | **Skills only — missing hooks entirely** |
* Passepartout Blindspot Assessment
1. **No hooks** — Passepartout's skills fire on triggers but can't intercept tool execution, model calls, or session lifecycle. All 4 competitors have hooks. This is the biggest extension architecture gap. Claude Code's PreToolUse hook pattern is the cleanest: a registered function that can inspect a tool call before execution and return allow/deny/ask with optional input modification. [Action: add Hook system to v0.8.0+]
2. **def-cognitive-tool never called** — The macro and its registry exist, but the registry is empty because nothing calls the macro. Claude Code's Tool interface, with 10+ methods (including call, isEnabled, isReadOnly, isConcurrencySafe, checkPermissions, validateInput), is a richer model. [Action: fill the registry, already planned for v0.4.1]
3. **No skill security scanning** — Hermes's static analysis scanner is the gold standard. Passepartout's jail-loading + sandbox check is good but could be augmented with regex-based content scanning. [Action: add to security study]
4. **No plugin system** — OpenClaw's universal plugin system proves one mechanism can serve all extensibility needs. Passepartout should consider whether skills + hooks + MCP is sufficient, or whether a plugin manifest is needed. [Action: decision point for v0.10.0]
5. **Skills are Lisp-specific** — All competitors use markdown/JSON for skills (language-agnostic). Passepartout's .org files with defskill macros require knowing Common Lisp. For the Skill Creator (v0.11.0), this is fine (LLM writes Lisp). For user-authored skills, markdown frontmatter with Lisp code blocks would lower the barrier. [Action: consider dual format]
6. **MCP integration planned but not prioritized** — Claude Code and Hermes both have deep MCP. Passepartout's v0.10.0 placement is correct. No action needed, just schedule awareness.
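The regex-based content scanning suggested in item 3 could start as small as the Python sketch below (Python is used for illustration; Passepartout itself is Common Lisp). The rule set, category names, and verdict thresholds are assumptions chosen for the example, not a vetted security policy, and a line-by-line regex scan is far weaker than Hermes's static analysis.

```python
import re

# Illustrative rules only; a real scanner needs a much broader, reviewed set.
RULES = {
    "destructive": re.compile(r"\brm\s+-rf\b|\bmkfs\b|\bdd\s+if="),
    "exfiltration": re.compile(r"\bcurl\b[^\n]*\s(-d|--data)\b|\bnc\s+-e\b"),
    "prompt-injection": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    "obfuscation": re.compile(r"base64\s+(-d|--decode)|\beval\b"),
}


def scan_skill(text: str) -> list[tuple[int, str]]:
    """Return (line number, category) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, category))
    return findings


def verdict(findings: list[tuple[int, str]]) -> str:
    """Collapse findings into safe/caution/dangerous (assumed thresholds)."""
    categories = {category for _, category in findings}
    if categories & {"destructive", "exfiltration"}:
        return "dangerous"
    return "caution" if categories else "safe"
```

Run before jail-loading, a scan like this gives a cheap first filter; the existing sandbox check remains the real enforcement layer.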
* Recommended Architecture for Passepartout
Based on competitive analysis, Passepartout should converge on 3 mechanisms (not Claude Code's 4, not OpenClaw's 1):
1. **Skills** (defskill, .org format) — Instruction injection. Current state: working, needs security scanning + dual format option.
2. **Hooks** (new, PreToolUse/PostToolUse/Session*) — Behavior interception. Missing entirely. Add as slots on the defskill struct: :pre-tool-hook, :post-tool-hook, :on-session-start, :on-heartbeat.
3. **MCP** (planned v0.10.0) — External tool ecosystem. Process-isolated, community-maintained.
This avoids the complexity of plugins (Claude Code/OpenClaw) while covering the essential extension surface. The key addition is hooks — without them, Passepartout can't intercept tool execution or respond to lifecycle events.
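To make the proposed hook slots concrete, here is a Python sketch of the dispatch logic (the real implementation would be Common Lisp slots on the defskill struct). The `Skill` record, `run_tool` dispatcher, and `jail_writes` hook are hypothetical; the allow/deny/ask decisions and optional input modification follow the PreToolUse pattern discussed in the blindspot assessment.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Skill:
    """Stand-in for a defskill struct with two of the proposed hook slots."""
    name: str
    pre_tool_hook: Optional[Callable[[str, dict], tuple[str, dict]]] = None
    post_tool_hook: Optional[Callable[[str, dict, object], None]] = None


def run_tool(skills: list[Skill], tools: dict, tool_name: str, args: dict):
    """Run pre-hooks, then the tool, then post-hooks; return (decision, result)."""
    for skill in skills:
        if skill.pre_tool_hook is None:
            continue
        decision, args = skill.pre_tool_hook(tool_name, args)  # hooks may rewrite args
        if decision != "allow":
            return decision, None  # deny or ask: the tool is not executed
    result = tools[tool_name](**args)
    for skill in skills:
        if skill.post_tool_hook is not None:
            skill.post_tool_hook(tool_name, args, result)
    return "allow", result


def jail_writes(tool_name: str, args: dict) -> tuple[str, dict]:
    """Example pre-tool hook: refuse writes outside notes/."""
    if tool_name == "write_file" and not args["path"].startswith("notes/"):
        return "deny", args
    return "allow", args
```

Because the hooks live on the skill itself, no separate plugin manifest is needed: loading a skill is what registers its interception points.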