#+TITLE: Comparative Safety/Permission Study — Agent Authorization Architectures
#+FILETAGS: :notes:comparative-study:safety:permissions:security:
* Purpose
Compare the safety architectures of Claude Code, OpenCode, OpenClaw, and Hermes Agent against Passepartout's 10-vector dispatcher. The findings inform the HITL regime, auto-approve learning, and permission dialogs in v0.7.2, and dispatcher-learn in v0.9.0.
* Findings Summary
| Dimension | Claude Code | OpenCode | OpenClaw | Hermes | Passepartout |
|-----------+-------------+----------+----------+--------+--------------|
| Modes | 7 (default, acceptEdits, plan, bypass, dontAsk, auto, bubble) | 3 per-tool (allow/deny/ask) | 3-tier (config/gateway/tool) | 3 (manual/smart/off) | Single HITL flow |
| Auto-approve | AI classifier (Opus 2-stage XML) | Rule evaluation (last-match-wins) | Exec approval manager (promise-based) | Smart approval via aux LLM | Planned: dispatcher-learn |
| Non-bypassable floor | Path safety checks immune to bypass | None | Gateway HTTP tool deny + config drift | 12-item hardline blocklist | 10 deterministic vectors |
| Remember/always | Session-level settings files | In-memory (session-only) | allow-always + config persistence | Session + permanent config.yaml | Not implemented |
| Pre-tool hooks | PreToolUse (27 lifecycle events) | None | None | pre/post approval hooks | defskill triggers only |
| Anti-misclick | Not literal 200ms; denial tracking + mode stripping | No explicit protection | No explicit protection | Countdown timer on prompt | Not implemented |
| Sandbox | bubblewrap+Seatbelt | None | Docker with 30+ blocked paths | None (container bypass) | Regex + planned bwrap |
* Claude Code — Authorization Pipeline
**7 Permission Modes:**
| Mode | Description |
|------+-------------|
| default | Interactive prompting for every tool use |
| acceptEdits | Auto-allow edits within cwd; prompt for bash/outside |
| plan | Plan-only; can plan but not execute |
| bypassPermissions | Full bypass (can be disabled via GrowthBook) |
| dontAsk | All ask results → deny; only rule-allowed tools |
| auto | AI classifier (Opus) decides yes/no on every invocation |
| bubble | Internal (not user-facing) |
**Authorization Pipeline** (12 steps in hasPermissionsToUseToolInner):
- 1a. Tool denied by rule → DENY immediately
- 1b. Tool has ask rule → ASK
- 1c-1g. Checks: permissions, implementation deny, user interaction required, content-specific rules, safety checks (.git/, .claude/, shell configs; bypass-immune)
- 2a. BypassPermissions or Plan → ALLOW
- 2b. Always-allow rule → ALLOW
- 3. Passthrough → ASK
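
The step ordering above can be sketched as a short-circuiting function. This is a minimal illustration, not Claude Code's actual code: the =Request= fields and the name =decide= are hypothetical, only one of the 1c-1g checks (the safety check) is modelled, and returning ASK for that check is an assumption. What the sketch does show is why the safety check is bypass-immune: it runs before the mode test in step 2a.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


@dataclass
class Request:
    """Hypothetical, pre-computed facts about one tool call."""
    denied_by_rule: bool = False          # step 1a
    has_ask_rule: bool = False            # step 1b
    touches_protected_path: bool = False  # safety check (.git/, .claude/, shell configs)
    mode: str = "default"                 # permission mode name
    always_allowed: bool = False          # step 2b


def decide(req: Request) -> Decision:
    # 1a: an explicit deny rule wins immediately.
    if req.denied_by_rule:
        return Decision.DENY
    # 1b: an explicit ask rule forces a prompt.
    if req.has_ask_rule:
        return Decision.ASK
    # 1c-1g (only the safety check is modelled): evaluated before the
    # mode test below, so no mode can skip it. ASK is an assumption.
    if req.touches_protected_path:
        return Decision.ASK
    # 2a: bypass and plan modes allow whatever survived the checks above.
    if req.mode in ("bypassPermissions", "plan"):
        return Decision.ALLOW
    # 2b: an always-allow rule.
    if req.always_allowed:
        return Decision.ALLOW
    # 3: nothing matched, so fall through to a prompt.
    return Decision.ASK
```

With these definitions, a request that touches a protected path still yields =Decision.ASK= under =bypassPermissions=, while an otherwise unremarkable request in the same mode yields =Decision.ALLOW=.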
**Auto-Mode Classifier** (yoloClassifier.ts):
- 2-stage Opus XML: Stage 1 (fast, max_tokens=64, <block>yes/no</block>), Stage 2 (thinking, chain-of-thought)
- Input: serialized transcript of user messages + assistant tool_use blocks (assistant text is excluded because it could manipulate the classifier)
- Fail-closed by default (GrowthBook iron_gate), but can be set to fail-open for development
- Transcript too long → fall back to manual prompting
- Denial tracking: 3 consecutive or 20 total blocks → force human review
- Safe-tool allowlist: FileRead, Grep, Glob, LSP, Task*, TodoWrite, AskUserQuestion are auto-allowed without calling the classifier
- acceptEdits fast-path: re-checks the tool under acceptEdits mode before calling the classifier
- Mode stripping: dangerous rules (Bash(*), PowerShell(*), Agent) are removed automatically when entering auto mode
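
The denial-tracking thresholds lend themselves to a small counter. The sketch below is an assumption-laden illustration: the class name =DenialTracker= is hypothetical, and the choice that an allowed call resets the consecutive streak while leaving the running total untouched is a guess, not something the notes above confirm.

```python
from dataclasses import dataclass


@dataclass
class DenialTracker:
    """Counts classifier blocks; thresholds taken from the notes above."""
    max_consecutive: int = 3
    max_total: int = 20
    consecutive: int = 0
    total: int = 0

    def record(self, blocked: bool) -> bool:
        """Record one classifier verdict; return True when a human must review."""
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # Assumption: an allowed call breaks the streak but keeps the total.
            self.consecutive = 0
        return (self.consecutive >= self.max_consecutive
                or self.total >= self.max_total)
```

A counter of this shape is deterministic and cheap, which is also the property Passepartout's planned dispatcher-learn relies on.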
* OpenCode — Permission Architecture
Simpler and based on rule evaluation: each permission type takes one of 3 actions (allow/deny/ask), optionally keyed by glob pattern:

```yaml
permissions:
  bash: "ask"
  edit:
    "*.md": "allow"
    "*.env": "deny"
```

Pipeline: evaluate each pattern. If any resolves to deny → DeniedError; if all resolve to allow → auto-allow; otherwise → needAsk → publish Event.Asked → block until reply.
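
A minimal sketch of that pipeline, assuming glob matching via Python's =fnmatch= and the last-match-wins ordering noted in the summary table. The function names =resolve= and =check= and the default action of "ask" are hypothetical; only =DeniedError= and the three actions come from the notes. Event publishing and the blocking wait are left out.

```python
from fnmatch import fnmatch


class DeniedError(Exception):
    """Raised when any requested target resolves to "deny"."""


def resolve(rules: dict[str, str], target: str, default: str = "ask") -> str:
    """Return the action of the last rule whose glob matches `target`."""
    action = default
    for pattern, rule_action in rules.items():  # dicts keep insertion order
        if fnmatch(target, pattern):
            action = rule_action  # later matches override earlier ones
    return action


def check(rules: dict[str, str], targets: list[str]) -> str:
    """Return "allow" or "ask" for a batch of targets; raise on any deny."""
    actions = [resolve(rules, t) for t in targets]
    if "deny" in actions:
        raise DeniedError(targets[actions.index("deny")])
    return "allow" if all(a == "allow" for a in actions) else "ask"
```

With the edit rules from the YAML example, =check(rules, ["README.md"])= auto-allows, a batch that also contains an unmatched file falls back to "ask", and any =*.env= target raises =DeniedError=.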

**Always Allow**: When the user clicks "Always allow", the patterns are added to PermissionTable (SQLite/Drizzle), and any pending permissions covered by the new rules auto-resolve. The rules are session-level only (runtime, not written to config).

**Question prompts**: Vim-style (hjkl) navigation. Numbers 1-9 for direct selection. Multi-tab for multi-question. Custom answer textarea.

**Cascading rejection**: Denying one permission auto-rejects all pending permissions for that session.
* OpenClaw — Gateway-Centric Security
**Gateway-level access control**: all tool invocations flow through a central auth/approval server.
- Auth modes: none/token/password/tailscale/device-token/bootstrap-token/trusted-proxy
- Rate limiting, browser origin checking, timing-safe secret comparison
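
Timing-safe comparison means the check takes the same time wherever the two secrets first differ, so an attacker cannot recover a token byte by byte from response latency. The sketch below shows the idea with Python's standard-library =hmac.compare_digest=; it is not OpenClaw's code, and the helper name =token_matches= is hypothetical.

```python
import hmac


def token_matches(presented: str, expected: str) -> bool:
    """Compare two secrets without leaking where they first differ."""
    # Encoding to bytes lets compare_digest handle non-ASCII input too.
    return hmac.compare_digest(presented.encode("utf-8"),
                               expected.encode("utf-8"))
```

A plain ~==~ on strings can return as soon as one character differs, which is the behaviour this avoids.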

**Exec Approval Manager**: Promise-based approval queue with allow-once (one-shot consumption), allow-always, and a 15s grace period for late awaiters.
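
A rough sketch of such a queue, using =asyncio= futures in place of promises. Everything here is illustrative: the class and method names are hypothetical, allow-always is keyed on the exact command string as a simplification, and the 15s grace period for late awaiters is not modelled.

```python
import asyncio


class ApprovalManager:
    """Toy approval queue: one pending future per request id."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}
        self._always: set[str] = set()  # commands approved with allow-always

    async def request(self, request_id: str, command: str) -> bool:
        """Wait for a decision on `command`; return True if it may run."""
        if command in self._always:
            return True  # covered by an earlier allow-always
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        decision = await future
        if decision == "allow-always":
            self._always.add(command)
        return decision in ("allow-once", "allow-always")

    def resolve(self, request_id: str, decision: str) -> bool:
        """Deliver a decision; return False if nothing waits on that id."""
        future = self._pending.pop(request_id, None)  # pop = one-shot consumption
        if future is None or future.done():
            return False
        future.set_result(decision)
        return True


async def demo() -> list[bool]:
    mgr = ApprovalManager()
    results = []
    for req_id, decision in [("r1", "allow-once"), ("r2", "allow-always")]:
        task = asyncio.create_task(mgr.request(req_id, "git status"))
        await asyncio.sleep(0)  # let the request register its future
        mgr.resolve(req_id, decision)
        results.append(await task)
    # The third call needs no prompt: allow-always was recorded for the command.
    results.append(await mgr.request("r3", "git status"))
    return results
```

Running =asyncio.run(demo())= approves all three requests; only the first two required an explicit decision.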

**Dangerous config flags audit**: Scans configuration at startup for security risks (SSRF, unrestricted filesystem access, Docker privilege escalation). Plugins declare their own dangerous flags.

**Sandbox**: Docker-based with 30+ blocked host paths, blocked home subpaths (.ssh, .aws, .docker, .gnupg), blocked network modes, and blocked seccomp/apparmor profiles. Sandboxed agents get a separate tool policy.

**Node invoke approval**: Device-identity-bound, client-identity-bound, backend-bridge for web UI, allowlist for forwarded params.
* Hermes Agent — Three-Tier Detection
**HARDLINE blocklist** (12 patterns, unconditional; not even --yolo can bypass them):
rm -rf /, mkfs, dd of=/dev/sd*, fork bombs, kill -1, shutdown, reboot, etc.

**DANGEROUS patterns** (47 patterns, approvable):
rm -r, chmod 777, chown -R root, curl|bash, systemctl, git push --force, sensitive file writes, self-termination, etc.

**Approval flow** (check_all_command_guards):
1. Container environment → skip checks, auto-approve
2. HARDLINE check → block
3. YOLO/mode=off → auto-approve (except hardline)

Commands that pass these early exits go through three phases:
- Phase 1: Tirith security scanner (Rust binary, cryptographically verified) + pattern detection
- Phase 2: Smart approval via aux LLM (if mode=smart)
- Phase 3: Manual approval with timeout (default 60s)
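
The early exits and the pattern tiers can be sketched as follows. This is an illustration under stated assumptions, not Hermes code: the regexes are a hand-written subset (the real lists have 12 and 47 patterns), the function name =guard= is hypothetical, the Tirith scanner is omitted, Phases 2 and 3 are collapsed into a single "ask", and approving commands that match no pattern is an assumption.

```python
import re

# Illustrative subsets only; the regexes themselves are assumptions.
HARDLINE = [r"\brm\s+-rf\s+/(\s|$)", r"\bmkfs\b", r"\bdd\b.*\bof=/dev/sd"]
DANGEROUS = [r"\brm\s+-r", r"\bchmod\s+777\b", r"curl[^|]*\|\s*bash",
             r"\bgit\s+push\s+--force\b"]


def matches(patterns: list[str], command: str) -> bool:
    """True if any pattern is found anywhere in the command."""
    return any(re.search(p, command) for p in patterns)


def guard(command: str, mode: str = "manual", in_container: bool = False) -> str:
    """Return "approve", "block", or "ask" for one shell command."""
    if in_container:
        return "approve"  # step 1: containers skip every check
    if matches(HARDLINE, command):
        return "block"    # step 2: never approvable
    if mode == "off":
        return "approve"  # step 3: YOLO / mode=off
    if not matches(DANGEROUS, command):
        return "approve"  # assumption: unflagged commands pass
    # Phases 2-3 (aux LLM, then manual prompt with timeout) collapsed.
    return "ask"
```

Following the step order literally, the container check runs before the HARDLINE check, so =guard("rm -rf /", in_container=True)= returns "approve". That matches the "container bypass" entry in the summary table and is worth keeping in mind when borrowing this design.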

**Busy mode**: interrupt (immediate), queue (next turn), steer (inject after next tool call)

**SUDO injection**: Transparent sudo -S -p '' rewrite, with the password piped via stdin from .env.
* Passepartout Blindspot Assessment
1. **No "remember" mechanism.** Every HITL prompt requires a decision. Claude Code auto-approves read-only tools via an allowlist. Hermes saves "always" approvals to config.yaml. Passepartout should add per-session and permanent allow options. [Action: v0.7.2 HITL inline]

2. **No non-bypassable floor.** Passepartout could add a hardline blocklist for catastrophic commands (rm -rf /, mkfs, dd to devices) that cannot be approved. [Action: v0.7.2, add to dispatcher-check-shell-safety]

3. **No classifier.** Claude Code and Hermes both use auxiliary LLMs to reduce HITL frequency. Passepartout's dispatcher-learn (v0.9.0) uses deterministic counting instead. This is architecturally cleaner but will ask more questions initially. Consider adding a safe-tool allowlist for read-only tools. [Action: v0.7.2, add tool-safety classification]

4. **No anti-misclick.** Claude Code tracks denials; Hermes has countdown timers. Passepartout could add a 500ms input block after HITL prompts appear. [Action: v0.7.2 HITL inline]

5. **Sandbox superiority.** Passepartout's planned bwrap is better than OpenCode (none) and Hermes (none), and comparable to Claude Code (same bwrap approach). OpenClaw uses Docker, which is heavier. [Architecture confirmed]

6. **Hook system missing.** Claude Code (PreToolUse, 27 lifecycle events) and Hermes (pre/post approval hooks) can intercept tool execution; the summary table records none for OpenCode or OpenClaw. Passepartout's skills fire only on definable triggers and cannot intercept tool execution. [Action: add to Extension Architecture Study]
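
The 500ms input block proposed in item 4 could look roughly like this. It is a sketch only, written in Python for illustration rather than in Passepartout's own implementation language; the class name =PromptGuard= and its methods are hypothetical. The clock is injectable so the behaviour can be tested without sleeping.

```python
import time


class PromptGuard:
    """Ignores keypresses that arrive too soon after a prompt is shown."""

    def __init__(self, block_seconds: float = 0.5, clock=time.monotonic) -> None:
        self.block_seconds = block_seconds
        self._clock = clock
        self._shown_at = None

    def show(self) -> None:
        """Call when the HITL prompt becomes visible."""
        self._shown_at = self._clock()

    def accept(self, key: str):
        """Return the key if it counts as a decision, else None."""
        if self._shown_at is None:
            return None  # no prompt on screen
        if self._clock() - self._shown_at < self.block_seconds:
            return None  # too early: likely a keystroke meant for something else
        return key
```

Using a monotonic clock keeps the guard immune to wall-clock adjustments; a keypress dropped by the guard is simply ignored, so the user can press the key again once the window has passed.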