Comparative Agent Loop & Recovery Study
- Purpose
- Findings Summary
- Claude Code — Loop Architecture
- OpenCode — Effect-TS Functional Pipeline
- OpenClaw — Dual-Loop with Failover
- Hermes Agent — Monolithic 14,672-line Agent
- Passepartout Blindspot Assessment
Purpose
Compare agent loop architectures and error recovery mechanisms across Claude Code, OpenCode, OpenClaw, and Hermes Agent. Inform Passepartout's signal pipeline (v0.9.0) and recovery strategies.
Findings Summary
| Dimension | Claude Code | OpenCode | OpenClaw | Hermes | Passepartout |
|---|---|---|---|---|---|
| Loop style | Async generator while(true) | Effect while(true) | Dual nested while(true) | Plain while + budget | while(process-signal) |
| Streaming tools | StreamingToolExecutor during API stream | AI SDK streamText + tool dispatch | Dual-loop attempt dispatch | Always streaming, chunk iteration | Not yet (v0.7.1) |
| Watchdog | 90s stream idle, 30s stall | Via Effect cancellation | TUI watchdog + 5-count idle breaker | 90s stale-stream, 60-120s read | Not implemented |
| Auto-retry | 10 max, 3 for 529, stream→non-stream fallback | Effect retry/fallback | Model fallback + profile rotation | 30+ error flags with specific recovery | Not implemented |
| Compaction layers | 5 (snip, micro, collapse, auto, reactive) | Auto-compaction + pruning | Post-compaction loop guard | ContextCompressor + 1-line tool pruning | Foveal-peripheral (single layer) |
| Interrupt | AbortController + signal.reason | Runner.cancel() + BusyError | AbortSignal propagation | Thread-scoped set + 3-level cascade | Single SIGINT handler |
| Cost control | Token budget + task_budget API | Per-step tracking | Idle-timeout breaker (cost runaway) | IterationBudget + subagent caps | Planned (v0.5.0 token economics) |
| Busy-mode | N/A (single REPL) | Interrupt running + queue | Interrupt + queue | interrupt/queue/steer (3 modes) | Not implemented |
Claude Code — Loop Architecture
Core loop: Async generator queryLoop() with mutable State struct tracking messages, tool context, compaction state, recovery counts, transitions. Each iteration: snip → microcompact → collapse → auto-compact → blocking limit → call model → execute tools → loop.
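The iteration order above can be sketched as an async generator. This is a minimal Python sketch, not Claude Code's implementation: `State`, `Terminal`, `query_loop`, `call_model`, and `run_tools` are hypothetical names, the compaction stages are reduced to a comment, and only two stop conditions are modeled. Python async generators cannot return a value, so the sketch yields `Terminal` as its last event.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, Awaitable, Callable


@dataclass
class State:
    # Hypothetical subset of the mutable loop state described above.
    messages: list = field(default_factory=list)
    turns: int = 0


@dataclass
class Terminal:
    reason: str  # e.g. "completed" or "max_turns"


async def query_loop(
    state: State,
    call_model: Callable[[list], Awaitable[dict]],
    run_tools: Callable[[list], Awaitable[list]],
    max_turns: int = 8,
) -> AsyncIterator[Any]:
    while True:
        # snip -> microcompact -> collapse -> auto-compact would run here.
        if state.turns >= max_turns:
            yield Terminal("max_turns")
            return
        state.turns += 1
        reply = await call_model(state.messages)
        state.messages.append(reply)
        yield reply
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:
            # No tool use requested: the turn is finished.
            yield Terminal("completed")
            return
        # Tool results feed the next iteration of the loop.
        state.messages.extend(await run_tools(tool_calls))
```

A caller consumes events with `async for` and stops when it sees a `Terminal`.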
Stop conditions (return Terminal): completed, blocking_limit, aborted_streaming, aborted_tools, max_turns, stop_hook_prevented, hook_stopped, model_error, prompt_too_long, image_error.
StreamingToolExecutor: Tools execute DURING the API stream. As tool_use blocks arrive, immediately dispatched. Concurrency model: concurrent-safe tools run in parallel; non-concurrent serialized. Bash tool errors trigger siblingAbortController.
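One way to picture this concurrency model is the asyncio sketch below. `ToolCall` and `execute_streaming` are hypothetical names. The sketch assumes that non-concurrent tools only need to exclude each other, which may be looser than what Claude Code enforces. Sibling abort on Bash errors is not modeled.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable, Dict


@dataclass
class ToolCall:
    name: str
    run: Callable[[], Awaitable[str]]
    concurrent_safe: bool


async def execute_streaming(calls: AsyncIterator[ToolCall]) -> Dict[str, str]:
    """Dispatch each call as soon as it arrives from the stream."""
    serial_lock = asyncio.Lock()
    tasks: Dict[str, asyncio.Task] = {}

    async def run_serial(call: ToolCall) -> str:
        # Non-concurrent tools take turns behind a single lock.
        async with serial_lock:
            return await call.run()

    async for call in calls:
        coro = call.run() if call.concurrent_safe else run_serial(call)
        # create_task starts the work without waiting for the stream to end.
        tasks[call.name] = asyncio.create_task(coro)

    return {name: await task for name, task in tasks.items()}
```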
5-Layer Recovery:
- Auto-retry: 10 max, 3 for 529, persistent retry mode (CLAUDE_CODE_UNATTENDED_RETRY), fallback model switch
- Stale LLM detection: 90s stream idle timeout, half-time warning at 45s, fallback to non-streaming
- Tool error recovery: user_interrupted → REJECT_MESSAGE, streaming_fallback → executor recreated, missing tool results → synthetic errors
- Compaction retry: reactive compact on 413 prompt-too-long. Two-stage: context-collapse drain → reactive compact
- Watchdog/context overflow: blocking limit pre-check before model call
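The auto-retry layer might look roughly like the following Python sketch. The limits (10 retries, 3 for HTTP 529) come from the list above. How they combine with the fallback model switch is an assumption, and `ApiError` and `call_with_retry` are hypothetical names.

```python
import time
from typing import Callable

MAX_RETRIES = 10            # from the study: "10 max"
MAX_OVERLOADED_RETRIES = 3  # from the study: "3 for 529"


class ApiError(Exception):
    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(
    call: Callable[[str], str],
    primary: str,
    fallback: str,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    model = primary
    overloaded = 0
    # One initial attempt plus up to MAX_RETRIES retries.
    for attempt in range(MAX_RETRIES + 1):
        try:
            return call(model)
        except ApiError as err:
            if attempt == MAX_RETRIES:
                raise
            if err.status == 529:
                overloaded += 1
                if overloaded >= MAX_OVERLOADED_RETRIES:
                    if model == fallback:
                        raise
                    # Assumed policy: switch models after repeated overloads.
                    model, overloaded = fallback, 0
            # Exponential backoff, capped at 30 seconds.
            sleep(min(0.5 * 2 ** attempt, 30.0))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the policy testable without real delays.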
Unique: Task budgets carried across compaction boundaries. Thinking block validation at query level. Memory prefetch concurrent with LLM stream.
OpenCode — Effect-TS Functional Pipeline
Core loop: Effect.fn(SessionPrompt.run) → while(true). Exit conditions: the last assistant message has a finish reason other than tool-calls, structured output is produced, the processor returns stop or compact, a doom loop is detected, or max steps is reached.
Doom loop detection: 3 consecutive identical tool calls → permission prompt.
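A detector for this rule fits in a few lines. The Python sketch below is illustrative only: `DoomLoopDetector` is a hypothetical name, and comparing calls by tool name plus canonicalized JSON arguments is an assumption about what "identical" means.

```python
import json
from collections import deque
from typing import Any, Deque, Tuple


class DoomLoopDetector:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # Only the last `threshold` calls matter.
        self._recent: Deque[Tuple[str, str]] = deque(maxlen=threshold)

    def record(self, tool: str, args: Any) -> bool:
        """Record a call; return True when the last `threshold` calls match."""
        key = (tool, json.dumps(args, sort_keys=True))
        self._recent.append(key)
        return len(self._recent) == self.threshold and len(set(self._recent)) == 1
```

When `record` returns True, the caller would pause the loop and raise the permission prompt instead of dispatching the tool again.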
Interrupt: SessionRunState with Runner.onInterrupt. Cancel() sets status to idle. BusyError when starting while running. New request → cancel current → interrupt work returns last assistant message.
Undo/Redo: SessionRevert with file system snapshots. Revert restores file patches in reverse. Unrevert restores pre-revert snapshot. Per-step diff tracking.
Streaming: AI SDK streamText with event handling (reasoning-start/delta/end, tool-input-start, tool-call, tool-result, tool-error, text-delta, finish-step). Snapshot diff on each step.
OpenClaw — Dual-Loop with Failover
Outer loop (runWithModelFallback): Model/provider fallback chain with result classification. Inner loop (runEmbeddedAttemptWithBackend): Attempt dispatch with auth/profile rotation.
Idle-timeout cost-runaway breaker: 5 consecutive idle timeouts with no progress → halt paid model calls. Prevents $20-30 runaway from a single code bug.
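The breaker reduces to a counter that resets on progress. A minimal Python sketch, assuming a hypothetical `IdleTimeoutBreaker` class with the limit of 5 taken from the description above:

```python
class IdleTimeoutBreaker:
    def __init__(self, limit: int = 5):
        self.limit = limit
        self.consecutive = 0

    @property
    def tripped(self) -> bool:
        return self.consecutive >= self.limit

    def on_idle_timeout(self) -> bool:
        """Count an idle timeout; return True once paid calls should halt."""
        self.consecutive += 1
        return self.tripped

    def on_progress(self) -> None:
        # Any real progress clears the streak.
        self.consecutive = 0
```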
Auth profile rotation: Multiple profiles per provider, rotated on failures. MAX_SAME_MODEL_IDLE_TIMEOUT_RETRIES=1 before model switch.
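The nesting of model fallback and profile rotation can be sketched as below. This is a simplified Python illustration, not OpenClaw's code: the exception types and the `attempt` callback are hypothetical, and the two loops are collapsed into one function. Only the constant name and its value come from the description above.

```python
from typing import Callable, List

MAX_SAME_MODEL_IDLE_TIMEOUT_RETRIES = 1


class IdleTimeout(Exception):
    pass


class AuthFailure(Exception):
    pass


def run_with_model_fallback(
    models: List[str],
    profiles: List[str],
    attempt: Callable[[str, str], str],
) -> str:
    for model in models:                 # outer loop: model fallback chain
        idle_retries = 0
        remaining = list(profiles)
        while remaining:                 # inner loop: attempts per profile
            try:
                return attempt(model, remaining[0])
            except AuthFailure:
                remaining.pop(0)         # rotate to the next auth profile
            except IdleTimeout:
                if idle_retries >= MAX_SAME_MODEL_IDLE_TIMEOUT_RETRIES:
                    break                # give up on this model
                idle_retries += 1
    raise RuntimeError("all models and auth profiles exhausted")
```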
Post-Compaction Loop Guard: Detects loops where compaction happens but produces no progress.
Streaming watchdog: Armed on every delta. TUI detects idle stream, updates status. 30s timeout.
Hermes Agent — Monolithic 14,672-line Agent
Core loop: while(api_call_count < max_iterations AND iteration_budget.remaining > 0). Checks: interrupt flag, iteration budget, steer drain, API retry loop (with rate limiting), compaction loop (max 3), process tool calls.
3-Level Ctrl+C Cascade:
- Graceful interrupt: sets _interrupt_requested, per-thread signal, propagates to children
- Clear & Queue: clear_interrupt(), busy mode queue (next turn)
- /steer: Non-disruptive injection into next tool result (thread-safe)
IterationBudget: Parent gets 90 iterations, subagents get 50. execute_code iterations refunded. _budget_grace_call for one extra.
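The budget rules can be sketched as a small Python class. The sizes (90 and 50) come from the description above. The method names and the exact grace-call semantics are assumptions, not Hermes's API.

```python
PARENT_BUDGET = 90
SUBAGENT_BUDGET = 50


class IterationBudget:
    def __init__(self, total: int):
        self.total = total
        self.used = 0
        self._grace_used = False

    @property
    def remaining(self) -> int:
        return max(self.total - self.used, 0)

    def consume(self) -> bool:
        """Spend one iteration; allow a single grace call once exhausted."""
        if self.used < self.total:
            self.used += 1
            return True
        if not self._grace_used:
            # Assumed meaning of _budget_grace_call: one extra iteration.
            self._grace_used = True
            return True
        return False

    def refund(self) -> None:
        # For example, after an execute_code iteration.
        if self.used > 0:
            self.used -= 1
```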
Streaming: Always preferred path. 90s stale detection, 60-120s read timeout. Chunk iteration with last_chunk_time tracking. Ollama-specific tool call reuse fix.
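Stale-stream detection during chunk iteration can be approximated with a per-chunk timeout. The Python sketch below swaps the `last_chunk_time` bookkeeping for `asyncio.wait_for`, so it shows the idea rather than Hermes's mechanism. `StaleStreamError` and `iterate_with_stale_detection` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator


class StaleStreamError(Exception):
    pass


async def iterate_with_stale_detection(
    stream: AsyncIterator[str],
    stale_after: float = 90.0,
) -> AsyncIterator[str]:
    iterator = stream.__aiter__()
    while True:
        try:
            # The timer restarts for every chunk, so only a gap between
            # chunks longer than `stale_after` counts as stale.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=stale_after)
        except StopAsyncIteration:
            return
        except asyncio.TimeoutError:
            raise StaleStreamError(f"no chunk for {stale_after}s") from None
        yield chunk
```

A caller that catches `StaleStreamError` can then retry or fall back to a non-streaming request.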
Recovery flags: 30+ provider-specific retry flags. JSON repair (5 passes: strict, commas, braces, control chars, unicode). Surrogate sanitization. Dead connection cleanup. Message sequence repair. Credential pool rotation.
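Multi-pass JSON repair can be illustrated as a chain of increasingly aggressive fixes, each followed by a parse attempt. The Python sketch below covers four of the five passes named above (strict, commas, braces, control chars) and omits the unicode pass. The helper names and the specific fixes are assumptions, and the trailing-comma regex is naive about commas inside string values.

```python
import json
import re
from typing import Any


def strip_trailing_commas(text: str) -> str:
    return re.sub(r",\s*([}\]])", r"\1", text)


def close_open_brackets(text: str) -> str:
    stack = []
    in_string = escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    # Append whatever closers are still owed, innermost first.
    return text + "".join(reversed(stack))


def strip_control_chars(text: str) -> str:
    return re.sub(r"[\x00-\x1f]", " ", text)


def repair_json(raw: str) -> Any:
    passes = [lambda s: s, strip_trailing_commas, close_open_brackets, strip_control_chars]
    candidate = raw
    for fix in passes:
        candidate = fix(candidate)  # fixes accumulate across passes
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not repair JSON")
```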
Passepartout Blindspot Assessment
- No streaming tool execution — Passepartout's pipeline is strictly sequential. Claude Code executes tools during the API stream (hiding ~1s latency). Passepartout's streaming work (v0.7.1) should decide whether tool calls can execute while the stream is still open. [Action: v0.7.1 protocol design]
- No recovery layers — Passepartout has handler-case around think() but no auto-retry, no stale detection, and no model fallback. All four agents studied have multi-layer recovery. [Action: add to v0.9.0 signal pipeline]
- No busy-mode — When the agent is running and user types, Passepartout has no defined behavior. Hermes has interrupt/queue/steer. Passepartout should add queue mode at minimum. [Action: v0.9.0 priority queue]
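- Single SIGINT handler — Hermes layers Ctrl+C into a 3-level cascade (graceful interrupt, clear and queue, /steer), and the other three agents propagate structured abort or cancel signals. Passepartout should match the cascade. [Action: v0.7.0 Ctrl keys]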
- No compaction layers — Passepartout's foveal-peripheral pruning is a single strategy; Claude Code has 5 layers. Open question: should Passepartout add reactive compaction (compact on a prompt-too-long error) and tool output summarization? [Action: context/compaction study needed]
- No doom loop detection — OpenCode detects 3 identical tool calls. Hermes detects budget exhaustion. Passepartout could loop forever on a stuck tool. [Action: add to v0.9.0]