amr/memex

Amr Gharbeia 4e9431ec1d memex: update passepartout submodule → v0.7.2, add notes

passepartout v0.7.2 (Gate Trace + HITL + Search + 11 more features):
- Gate trace visualization with Ctrl+G toggle
- HITL inline panels with styled collapse on approve/deny
- Agent identity file + /identity command
- Safe-tool read-only allowlist
- Message search mode with Up/Down nav and highlights
- Context budget visibility with section breakdown
- Session rewind /sessions /resume /rewind
- Undo/redo per operation
- Context debugging /context why /context dropped
- Tool hardening (timeouts, write verify, read-only cache)
- Tag stack severity tiers + trigger counts
- Merkle provenance audit + audit-verify
- Self-help /help <topic> reads USER_MANUAL.org
- Live CONFIG section in system prompts
- Pads: Page Up/Down scroll by 10 lines

Core 92/92  TUI Main 104/104  TUI View 29/29  Neuro 13/13

2026-05-08 21:56:11 -04:00

Comparative Safety/Permission Study — Agent Authorization Architectures

Purpose
Findings Summary
Claude Code — Authorization Pipeline
OpenCode — Permission Architecture
OpenClaw — Gateway-Centric Security
Hermes Agent — Three-Tier Detection
Passepartout Blindspot Assessment

Purpose

Compare safety architectures across Claude Code, OpenCode, OpenClaw, and Hermes Agent against Passepartout's 10-vector dispatcher. Inform the HITL regime, auto-approve learning, and permission dialogs in v0.7.2 and dispatcher-learn in v0.9.0.

Findings Summary

| Dimension | Claude Code | OpenCode | OpenClaw | Hermes | Passepartout |
| --- | --- | --- | --- | --- | --- |
| Modes | 7 (default, acceptEdits, plan, bypass, dontAsk, auto, bubble) | 3 per-tool (allow/deny/ask) | 3-tier (config/gateway/tool) | 3 (manual/smart/off) | Single HITL flow |
| Auto-approve | AI classifier (Opus 2-stage XML) | Rule evaluation (last-match-wins) | Exec approval manager (promise-based) | Smart approval via aux LLM | Planned: dispatcher-learn |
| Non-bypassable floor | Path safety checks immune to bypass | None | Gateway HTTP tool deny + config drift | 12-item hardline blocklist | 10 deterministic vectors |
| Remember/always | Session-level settings files | In-memory (session-only) | allow-always + config persistence | Session + permanent config.yaml | Not implemented |
| Pre-tool hooks | PreToolUse (27 lifecycle events) | None | None | pre/post approval hooks | defskill triggers only |
| Anti-misclick | Not literal 200ms; denial tracking + mode stripping | No explicit protection | No explicit protection | Countdown timer on prompt | Not implemented |
| Sandbox | bubblewrap+Seatbelt | None | Docker with 30+ blocked paths | None (container bypass) | Regex + planned bwrap |

Claude Code — Authorization Pipeline

7 Permission Modes:

| Mode | Description |
| --- | --- |
| default | Interactive prompting for every tool use |
| acceptEdits | Auto-allow edits within cwd; prompt for bash/outside |
| plan | Plan-only; can plan but not execute |
| bypassPermissions | Full bypass (can be disabled via GrowthBook) |
| dontAsk | All ask results → deny; only rule-allowed tools |
| auto | AI classifier (Opus) decides yes/no on every invocation |
| bubble | Internal (not user-facing) |

Authorization Pipeline (12 steps in hasPermissionsToUseToolInner):

1a. Tool denied by rule → DENY immediately
1b. Tool has ask rule → ASK
1c-1g. Checks: permissions, implementation deny, user interaction required, content-specific rules, safety checks (.git/, .claude/, shell configs; bypass-immune)
2a. BypassPermissions or Plan → ALLOW
2b. Always-allow rule → ALLOW
3. Passthrough → ASK
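
The ordering above matters more than any single check: deny rules and safety checks run before the bypass modes get a say. A minimal Python sketch of that ordering follows. It is not Claude Code's implementation; `Rules`, `touches_protected_path`, and `decide` are hypothetical names, only the safety check from steps 1c-1g is modelled, and the sketch assumes that check resolves to ASK.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


@dataclass
class Rules:
    # Hypothetical stand-ins for the real rule sources.
    deny: set[str] = field(default_factory=set)
    ask: set[str] = field(default_factory=set)
    always_allow: set[str] = field(default_factory=set)


def touches_protected_path(path: str) -> bool:
    """Stand-in for the bypass-immune safety check on sensitive paths."""
    protected = (".git/", ".claude/", ".bashrc", ".zshrc")
    return any(marker in path for marker in protected)


def decide(tool: str, path: str, mode: str, rules: Rules) -> Decision:
    if tool in rules.deny:                     # 1a: deny rules win outright
        return Decision.DENY
    if tool in rules.ask:                      # 1b: explicit ask rule
        return Decision.ASK
    if touches_protected_path(path):           # 1c-1g: runs before any bypass
        return Decision.ASK                    # assumption: safety check asks
    if mode in ("bypassPermissions", "plan"):  # 2a
        return Decision.ALLOW
    if tool in rules.always_allow:             # 2b
        return Decision.ALLOW
    return Decision.ASK                        # passthrough
```

Because the protected-path check sits above step 2a, `decide("Edit", ".git/config", "bypassPermissions", Rules())` still returns `Decision.ASK`.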

Auto-Mode Classifier (yoloClassifier.ts):

2-stage Opus XML: Stage 1 (fast, max_tokens=64, <block>yes/no</block>), Stage 2 (thinking, chain-of-thought)
Input: serialized transcript of user messages + assistant tool_use blocks (assistant text excluded — could manipulate)
Fail-closed by default (GrowthBook iron_gate), but can fail-open for dev
Transcript-too-long → fall back to manual prompting
Denial tracking: 3 consecutive / 20 total blocks → force human review
Safe-tool allowlist: FileRead, Grep, Glob, LSP, Task*, TodoWrite, AskUserQuestion — auto-allowed without classifier
acceptEdits fast-path: re-checks tool with acceptEdits mode before calling classifier
Mode stripping: dangerous rules (Bash(), PowerShell(), Agent) auto-removed when entering auto mode
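
The denial-tracking thresholds in the list above (3 consecutive, 20 total) amount to two counters. The sketch below is one way to model them in Python; `DenialTracker` is a hypothetical name, and resetting the consecutive count on an allowed call is an assumption the notes do not confirm.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE = 3   # from the notes: 3 consecutive blocks
MAX_TOTAL = 20        # from the notes: 20 total blocks


@dataclass
class DenialTracker:
    consecutive: int = 0
    total: int = 0

    def record(self, blocked: bool) -> None:
        """Record one classifier verdict."""
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # Assumption: an allowed call resets the streak, not the total.
            self.consecutive = 0

    def needs_human_review(self) -> bool:
        """True once either threshold is reached."""
        return self.consecutive >= MAX_CONSECUTIVE or self.total >= MAX_TOTAL
```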

OpenCode — Permission Architecture

Simpler, rule-evaluation based. 3 actions per permission type (allow/deny/ask) with glob patterns:

```yaml
permissions:
  bash: "ask"
  edit:
    "*.md": "allow"
    "*.env": "deny"
```

Pipeline: evaluate each pattern → if deny → DeniedError; if all allow → auto-allow; otherwise → needAsk → publish Event.Asked → block until reply.
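
As a rough illustration of this pipeline, the Python sketch below combines last-match-wins rule evaluation with the deny / all-allow / ask aggregation. It is not OpenCode's code: `evaluate` and `check` are hypothetical names, rules are modelled as ordered `(glob, action)` pairs, and the standard-library `fnmatchcase` stands in for whatever glob matcher OpenCode actually uses.

```python
from fnmatch import fnmatchcase


class DeniedError(Exception):
    """Raised when any requested target resolves to "deny"."""


def evaluate(rules: list[tuple[str, str]], target: str, default: str = "ask") -> str:
    """Return the action of the last rule whose glob matches the target."""
    action = default
    for pattern, rule_action in rules:
        if fnmatchcase(target, pattern):
            action = rule_action  # later matches override earlier ones
    return action


def check(rules: list[tuple[str, str]], targets: list[str]) -> str:
    """Aggregate per-target actions: any deny raises, all allow passes, else ask."""
    actions = [evaluate(rules, target) for target in targets]
    if "deny" in actions:
        raise DeniedError(f"denied: {targets}")
    if all(action == "allow" for action in actions):
        return "allow"
    return "ask"  # caller would publish the ask event and block for a reply
```

With `rules = [("*", "ask"), ("*.md", "allow"), ("*.env", "deny")]`, a request touching only `README.md` is auto-allowed, while adding `main.py` to the same request downgrades it to an ask.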

Always Allow: User clicks "Always allow" → patterns added to PermissionTable (SQLite/Drizzle). Auto-resolves any pending permissions the new rules cover. Session-level only (runtime, not written to config).

Question prompts: Vim-style (hjkl) navigation. Numbers 1-9 for direct selection. Multi-tab for multi-question. Custom answer textarea.

Cascading rejection: Denying one permission auto-rejects all pending permissions for that session.
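
The two queue behaviours described above (an "Always allow" that resolves covered pending requests, and a rejection that cascades across the session) can be sketched together. This is an in-memory Python model under stated assumptions: `PermissionQueue` and `PendingRequest` are hypothetical names, and no persistence layer is modelled.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass(eq=False)  # identity comparison, so list.remove targets this object
class PendingRequest:
    session: str
    target: str
    status: str = "pending"  # "pending" | "allowed" | "rejected"


@dataclass
class PermissionQueue:
    pending: list[PendingRequest] = field(default_factory=list)
    always: dict[str, list[str]] = field(default_factory=dict)  # session -> globs

    def ask(self, session: str, target: str) -> PendingRequest:
        request = PendingRequest(session, target)
        if any(fnmatchcase(target, p) for p in self.always.get(session, [])):
            request.status = "allowed"  # covered by an earlier "Always allow"
        else:
            self.pending.append(request)
        return request

    def always_allow(self, session: str, pattern: str) -> None:
        """Store the pattern and resolve every pending request it covers."""
        self.always.setdefault(session, []).append(pattern)
        for request in self._pending_for(session):
            if fnmatchcase(request.target, pattern):
                self._resolve(request, "allowed")

    def reject(self, request: PendingRequest) -> None:
        """Cascade: reject everything still pending in the same session."""
        for other in self._pending_for(request.session):
            self._resolve(other, "rejected")

    def _pending_for(self, session: str) -> list[PendingRequest]:
        return [r for r in self.pending if r.session == session]

    def _resolve(self, request: PendingRequest, status: str) -> None:
        request.status = status
        self.pending.remove(request)
```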

OpenClaw — Gateway-Centric Security

Gateway-level access control — all tool invocations flow through central auth/approval server:

Auth modes: none/token/password/tailscale/device-token/bootstrap-token/trusted-proxy
Rate limiting, browser origin checking, timing-safe secret comparison
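
Timing-safe comparison means the check takes the same time whether the first or the last byte differs, so response latency leaks nothing about the secret. In Python the standard-library primitive is `hmac.compare_digest`; the wrapper name `token_matches` below is hypothetical, and the snippet says nothing about how OpenClaw implements its own check.

```python
import hmac


def token_matches(presented: str, expected: str) -> bool:
    """Compare two secrets without short-circuiting on the first mismatch."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A plain `presented == expected` can return as soon as one character differs, which is the behaviour this avoids.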

Exec Approval Manager: Promise-based approval queue with allow-once (one-shot consumption), allow-always, 15s grace period for late awaiters.
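
A promise-based queue of this kind maps naturally onto futures. The asyncio sketch below is an illustration only: the class and method names are hypothetical, "allow-always" is keyed on the exact command string, and the grace period is modelled as the window in which an awaiter that arrives after the decision can still collect it once.

```python
import asyncio
import time


class ExecApprovalManager:
    def __init__(self, grace_seconds: float = 15.0):
        self._grace = grace_seconds
        self._pending: dict[str, tuple[str, asyncio.Future]] = {}
        self._settled: dict[str, tuple[bool, float]] = {}  # id -> (approved, when)
        self._always: set[str] = set()

    def submit(self, request_id: str, command: str) -> None:
        """Register a request; must be called from a running event loop."""
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = (command, future)
        if command in self._always:
            self.decide(request_id, "allow-once")  # covered by allow-always

    def decide(self, request_id: str, decision: str) -> None:
        """decision is "allow-once", "allow-always", or anything else (deny)."""
        command, future = self._pending.pop(request_id)
        if decision == "allow-always":
            self._always.add(command)
        approved = decision in ("allow-once", "allow-always")
        future.set_result(approved)
        self._settled[request_id] = (approved, time.monotonic())

    async def wait(self, request_id: str) -> bool:
        entry = self._pending.get(request_id)
        if entry is not None:
            approved = await entry[1]
        else:
            # Late awaiter: KeyError (a LookupError) if unknown or consumed.
            approved, settled_at = self._settled[request_id]
            if time.monotonic() - settled_at > self._grace:
                del self._settled[request_id]
                raise LookupError(f"approval {request_id} expired")
        self._settled.pop(request_id, None)  # one-shot consumption
        return approved
```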

Dangerous config flags audit: Scans configuration at startup for security risks (SSRF, unrestricted filesystem access, Docker privilege escalation). Plugins declare their own dangerous flags.

Sandbox: Docker-based with 30+ blocked host paths, blocked home subpaths (.ssh, .aws, .docker, .gnupg), blocked network modes, blocked seccomp/apparmor profiles. Tool policy separate for sandboxed agents.
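
A blocked-path policy like this reduces to a containment test on each requested bind mount. The Python sketch below shows the shape of that test. The home subpaths come from the notes; the host-path tuple is an illustrative subset chosen for the example, not OpenClaw's actual list, and `mount_is_blocked` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Illustrative subset only; the notes mention 30+ blocked host paths.
BLOCKED_HOST_PATHS = ("/etc", "/proc", "/sys", "/var/run/docker.sock")
# Home subpaths named in the notes.
BLOCKED_HOME_SUBPATHS = (".ssh", ".aws", ".docker", ".gnupg")


def is_within(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True if path is root itself or lies anywhere beneath it."""
    return path == root or root in path.parents


def mount_is_blocked(host_path: str, home: str) -> bool:
    """Check one requested bind-mount source against the blocklists."""
    path = PurePosixPath(host_path)
    roots = [PurePosixPath(p) for p in BLOCKED_HOST_PATHS]
    roots += [PurePosixPath(home) / sub for sub in BLOCKED_HOME_SUBPATHS]
    return any(is_within(path, root) for root in roots)
```

Matching on path components rather than string prefixes keeps `/home/u/.sshkeys` from being mistaken for `/home/u/.ssh`. The sketch does not resolve symlinks or `..` segments, which a real check would have to do before comparing.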

Node invoke approval: Device-identity-bound, client-identity-bound, backend-bridge for web UI, allowlist for forwarded params.

Hermes Agent — Three-Tier Detection

HARDLINE blocklist (12 patterns, unconditional; not even --yolo can bypass): rm -rf /, mkfs, dd of=/dev/sd*, fork bombs, kill -1, shutdown, reboot, etc.

DANGEROUS patterns (47 patterns, approvable): rm -r, chmod 777, chown -R root, curl|bash, systemctl, git push --force, sensitive file writes, self-termination, etc.
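
The two tiers can be pictured as two regex lists checked in order, hardline first. The Python sketch below covers only a handful of the commands named above; the regexes are written for this example and are not Hermes's patterns, and `classify` is a hypothetical name.

```python
import re

# Illustrative patterns only; the real lists have 12 and 47 entries.
HARDLINE = [re.compile(p) for p in (
    r"\brm\s+-rf\s+/(?:\s|$)",      # rm -rf / (the root itself, not a subpath)
    r"\bmkfs\b",
    r"\bdd\b.*\bof=/dev/sd",
    r"\b(?:shutdown|reboot)\b",
)]
DANGEROUS = [re.compile(p) for p in (
    r"\brm\s+-r",
    r"\bchmod\s+777\b",
    r"\bcurl\b.*\|\s*(?:ba)?sh\b",  # curl ... | bash
    r"\bgit\s+push\b.*--force\b",
)]


def classify(command: str) -> str:
    """Return "hardline", "dangerous", or "safe"; hardline is checked first."""
    if any(p.search(command) for p in HARDLINE):
        return "hardline"
    if any(p.search(command) for p in DANGEROUS):
        return "dangerous"
    return "safe"
```

Checking the hardline list first matters because `rm -rf /` also matches the broader `rm -r` pattern in the approvable tier.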

Approval flow (check_all_command_guards):

Running inside a container → skip checks, auto-approve
HARDLINE check → block
YOLO/mode=off → auto-approve (except hardline)

Phase 1: Tirith security scanner (Rust binary, cryptographically verified) + pattern detection
Phase 2: Smart approval via aux LLM (if mode=smart)
Phase 3: Manual approval with timeout (default 60s)
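
Read top to bottom, the approval flow is a chain of early returns. The Python sketch below follows the order given above. It is not the body of `check_all_command_guards`: `guard_flow` and its parameters are hypothetical, the scanner and pattern results arrive as precomputed booleans, and two behaviours are assumptions (a smart-approval refusal falls through to the manual prompt, and a manual timeout counts as a denial).

```python
from typing import Callable, Optional


def guard_flow(
    command: str,
    *,
    in_container: bool,
    is_hardline: bool,
    is_flagged: bool,                # Phase 1 result: scanner or pattern hit
    mode: str = "manual",            # "manual" | "smart" | "off"
    yolo: bool = False,
    smart_approve: Optional[Callable[[str], bool]] = None,
    ask_user: Optional[Callable[[str, float], Optional[bool]]] = None,
    timeout_s: float = 60.0,
) -> str:
    if in_container:
        return "approve"             # containers skip the guards entirely
    if is_hardline:
        return "block"               # checked before yolo/off, so never bypassed
    if yolo or mode == "off":
        return "approve"
    if not is_flagged:               # Phase 1 found nothing
        return "approve"
    if mode == "smart" and smart_approve is not None and smart_approve(command):
        return "approve"             # Phase 2
    # Phase 3: ask_user returns True, False, or None on timeout.
    answer = ask_user(command, timeout_s) if ask_user is not None else None
    return "approve" if answer else "deny"
```

Because the container check comes first, a hardline command inside a container is approved in this ordering, which is the "container bypass" noted in the summary table.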

Busy mode: interrupt (immediate), queue (next turn), steer (inject after next tool call)

SUDO injection: Transparent sudo -S -p '' rewrite with password piped via stdin from .env.

Passepartout Blindspot Assessment

No "remember" mechanism — Every HITL prompt requires a decision. Claude Code auto-approves read-only tools via allowlist. Hermes saves "always" approvals to config.yaml. Passepartout should add per-session and permanent allow options. [Action: v0.7.2 HITL inline]
No non-bypassable floor — Passepartout could add a hardline blocklist for catastrophic commands (rm -rf /, mkfs, dd to devices) that cannot be approved. [Action: v0.7.2, add to dispatcher-check-shell-safety]
No classifier — Claude Code and Hermes both use auxiliary LLMs to reduce HITL frequency. Passepartout's dispatcher-learn (v0.9.0) uses deterministic counting instead. This is architecturally cleaner but will ask more questions initially. Consider adding a safe-tool allowlist for read-only tools. [Action: v0.7.2, add tool-safety classification]
No anti-misclick — Claude Code tracks denials; Hermes has countdown timers. Passepartout could add a 500ms input block after HITL prompts appear. [Action: v0.7.2 HITL inline]
Sandbox superiority — Passepartout's planned bwrap is better than OpenCode (none) and Hermes (none), comparable to Claude Code (same bwrap approach). OpenClaw uses Docker which is heavier. [Architecture confirmed]
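Hook system missing — Per the Findings Summary, Claude Code (PreToolUse lifecycle events) and Hermes (pre/post approval hooks) can intercept tool execution; OpenCode and OpenClaw have no pre-tool hooks. Passepartout's skills fire only on defskill triggers and cannot intercept tool execution. [Action: add to Extension Architecture Study]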
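
The anti-misclick item above proposes a 500ms input block after a HITL prompt appears. A minimal Python sketch of that guard follows; `PromptGuard` and its methods are hypothetical names rather than Passepartout code, and the clock is injectable so the behaviour can be tested without sleeping.

```python
import time
from typing import Callable, Optional


class PromptGuard:
    """Ignore keypresses that arrive too soon after a prompt is displayed."""

    def __init__(self, block_seconds: float = 0.5,
                 clock: Callable[[], float] = time.monotonic):
        self._block = block_seconds
        self._clock = clock
        self._shown_at: Optional[float] = None

    def prompt_shown(self) -> None:
        """Call when the HITL prompt is rendered."""
        self._shown_at = self._clock()

    def accept_key(self) -> bool:
        """True only if a prompt is up and the block window has elapsed."""
        if self._shown_at is None:
            return False
        return self._clock() - self._shown_at >= self._block
```

A monotonic clock is used so that wall-clock adjustments cannot shorten or extend the window.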
