Amr Gharbeia 4e9431ec1d memex: update passepartout submodule → v0.7.2, add notes


2026-05-08 21:56:11 -04:00


Comparative Agent Loop & Recovery Study

Purpose
Findings Summary
Claude Code — Loop Architecture
OpenCode — Effect-TS Functional Pipeline
OpenClaw — Dual-Loop with Failover
Hermes Agent — Monolithic 14,672-line Agent
Passepartout Blindspot Assessment

Purpose

Compare agent loop architectures and error recovery mechanisms across Claude Code, OpenCode, OpenClaw, and Hermes Agent. The findings inform Passepartout's signal pipeline (v0.9.0) and recovery strategies.

Findings Summary

Dimension	Claude Code	OpenCode	OpenClaw	Hermes	Passepartout
Loop style	Async generator while(true)	Effect while(true)	Dual nested while(true)	Plain while + budget	while(process-signal)
Streaming tools	StreamingToolExecutor during API stream	AI SDK streamText + tool dispatch	Dual-loop attempt dispatch	Always streaming, chunk iteration	Not yet (v0.7.1)
Watchdog	90s stream idle, 30s stall	Via Effect cancellation	TUI watchdog + 5-count idle breaker	90s stale-stream, 60-120s read	Not implemented
Auto-retry	10 max, 3 for 529, stream→non-stream fallback	Effect retry/fallback	Model fallback + profile rotation	30+ error flags with specific recovery	Not implemented
Compaction layers	5 (snip, micro, collapse, auto, reactive)	Auto-compaction + pruning	Post-compaction loop guard	ContextCompressor + 1-line tool pruning	Foveal-peripheral (single layer)
Interrupt	AbortController + signal.reason	Runner.cancel() + BusyError	AbortSignal propagation	Thread-scoped set + 3-level cascade	Single SIGINT handler
Cost control	Token budget + task_budget API	Per-step tracking	Idle-timeout breaker (cost runaway)	IterationBudget + subagent caps	Planned (v0.5.0 token economics)
Busy-mode	N/A (single REPL)	Interrupt running + queue	Interrupt + queue	interrupt/queue/steer (3 modes)	Not implemented

Claude Code — Loop Architecture

Core loop: Async generator queryLoop() with mutable State struct tracking messages, tool context, compaction state, recovery counts, transitions. Each iteration: snip → microcompact → collapse → auto-compact → blocking limit → call model → execute tools → loop.

Stop conditions (return Terminal): completed, blocking_limit, aborted_streaming, aborted_tools, max_turns, stop_hook_prevented, hook_stopped, model_error, prompt_too_long, image_error.

StreamingToolExecutor: Tools execute during the API stream. As tool_use blocks arrive, they are dispatched immediately. Concurrency model: concurrent-safe tools run in parallel; non-concurrent tools are serialized. Bash tool errors trigger siblingAbortController.
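
The concurrency model above can be sketched in Python with asyncio. This is a minimal illustration, not Claude Code's implementation: `ToolCall`, its `concurrent_safe` flag, and `execute_streamed` are hypothetical names, and the sibling-abort behaviour is left out.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable, Dict, List


@dataclass
class ToolCall:
    # Hypothetical shape of a tool_use block; not Claude Code's actual type.
    name: str
    run: Callable[[], Awaitable[str]]
    concurrent_safe: bool


async def execute_streamed(calls: AsyncIterator[ToolCall]) -> Dict[str, str]:
    """Dispatch tool calls as they arrive instead of waiting for the stream to end.

    Concurrent-safe calls become background tasks right away. A
    non-concurrent call first waits for everything in flight, then runs alone.
    """
    results: Dict[str, str] = {}
    in_flight: List[asyncio.Task] = []

    async def record(call: ToolCall) -> None:
        results[call.name] = await call.run()

    async for call in calls:
        if call.concurrent_safe:
            in_flight.append(asyncio.create_task(record(call)))
        else:
            if in_flight:
                await asyncio.gather(*in_flight)
                in_flight.clear()
            await record(call)

    if in_flight:
        await asyncio.gather(*in_flight)
    return results
```

Results are keyed by tool name only to keep the sketch short; a real executor would key by tool-call id so that repeated calls to the same tool do not collide.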

5-Layer Recovery:

1. Auto-retry: 10 max, 3 for 529, persistent retry mode (CLAUDE_CODE_UNATTENDED_RETRY), fallback model switch
2. Stale LLM detection: 90s stream idle timeout, half-time warning at 45s, fallback to non-streaming
3. Tool error recovery: user_interrupted → REJECT_MESSAGE, streaming_fallback → executor recreated, missing tool results → synthetic errors
4. Compaction retry: reactive compact on 413 prompt-too-long. Two-stage: context-collapse drain → reactive compact
5. Watchdog/context overflow: blocking limit pre-check before model call
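
A rough sketch of the auto-retry layer, under stated assumptions: "3 for 529" is read here as "switch to the fallback model after three 529 (overloaded) responses", every `ApiError` is treated as retryable, and the backoff schedule is invented for illustration. `ApiError` and `call_with_retry` are hypothetical names.

```python
import time

MAX_RETRIES = 10            # "10 max" from the notes
MAX_OVERLOADED_RETRIES = 3  # "3 for 529" from the notes (interpretation assumed)


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(call, primary, fallback, sleep=time.sleep):
    """Call `call(model)`, retrying on ApiError up to MAX_RETRIES attempts.

    After MAX_OVERLOADED_RETRIES responses with status 529, the remaining
    attempts go to the fallback model.
    """
    model = primary
    overloaded = 0
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return call(model)
        except ApiError as err:
            if attempt == MAX_RETRIES:
                raise
            if err.status == 529:
                overloaded += 1
                if overloaded >= MAX_OVERLOADED_RETRIES and model != fallback:
                    model = fallback
                    overloaded = 0
            # Assumed exponential backoff, capped at 5 seconds.
            sleep(min(0.1 * 2 ** attempt, 5.0))
```

Passing `sleep` as a parameter keeps the policy testable without real delays.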

Unique: Task budgets carried across compaction boundaries. Thinking block validation at query level. Memory prefetch concurrent with LLM stream.

OpenCode — Effect-TS Functional Pipeline

Core loop: Effect.fn(SessionPrompt.run) → while(true). Exit when: the last assistant message has a finish reason other than tool-calls, structured output is produced, the processor returns stop/compact, a doom loop is detected, or max steps is reached.

Doom loop detection: 3 consecutive identical tool calls → permission prompt.
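
The three-identical-calls rule is small enough to sketch directly. `DoomLoopDetector` is a hypothetical name, and comparing calls by tool name plus canonical JSON arguments is an assumption about what "identical" means.

```python
import json
from collections import deque


class DoomLoopDetector:
    """Flags N consecutive identical tool calls (N = 3 in the notes)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._recent = deque(maxlen=threshold)

    def observe(self, tool_name, arguments):
        """Record one tool call; return True when the last N calls are identical."""
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self._recent.append(key)
        return len(self._recent) == self.threshold and len(set(self._recent)) == 1
```

When `observe` returns True, the caller decides what to do; per the notes, OpenCode raises a permission prompt rather than aborting.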

Interrupt: SessionRunState with Runner.onInterrupt. Runner.cancel() sets status to idle. Starting a run while one is already running fails with BusyError. On a new request, the current run is cancelled and the interrupted work returns the last assistant message.

Undo/Redo: SessionRevert with file system snapshots. Revert restores file patches in reverse. Unrevert restores pre-revert snapshot. Per-step diff tracking.
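
The revert/unrevert idea can be illustrated with an in-memory file map. This is a sketch only: `Patch` and `RevertLog` are hypothetical names, and a dict stands in for the file system snapshots the notes describe.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Patch:
    path: str
    before: Optional[str]  # None means the file did not exist before the step
    after: str


@dataclass
class RevertLog:
    files: Dict[str, str]
    patches: List[Patch] = field(default_factory=list)
    pre_revert: Optional[Tuple[Dict[str, str], List[Patch]]] = None

    def write(self, path, content):
        """Apply one step and record its diff."""
        self.patches.append(Patch(path, self.files.get(path), content))
        self.files[path] = content

    def revert(self, to_step):
        """Undo patches in reverse order until only `to_step` remain."""
        self.pre_revert = (dict(self.files), list(self.patches))
        while len(self.patches) > to_step:
            patch = self.patches.pop()
            if patch.before is None:
                self.files.pop(patch.path, None)
            else:
                self.files[patch.path] = patch.before

    def unrevert(self):
        """Restore the snapshot taken just before the last revert."""
        if self.pre_revert is None:
            return
        files, patches = self.pre_revert
        self.files.clear()
        self.files.update(files)
        self.patches = patches
        self.pre_revert = None
```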

Streaming: AI SDK streamText with event handling (reasoning-start/delta/end, tool-input-start, tool-call, tool-result, tool-error, text-delta, finish-step). Snapshot diff on each step.

OpenClaw — Dual-Loop with Failover

Outer loop (runWithModelFallback): Model/provider fallback chain with result classification. Inner loop (runEmbeddedAttemptWithBackend): Attempt dispatch with auth/profile rotation.

Idle-timeout cost-runaway breaker: 5 consecutive idle timeouts with no progress → halt paid model calls. Prevents $20-30 runaway from a single code bug.
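
A minimal sketch of such a breaker, assuming "progress" is reported explicitly by the caller. `IdleTimeoutBreaker` and its method names are hypothetical.

```python
class IdleTimeoutBreaker:
    """Trips after `limit` consecutive idle timeouts with no progress between them."""

    def __init__(self, limit=5):
        self.limit = limit
        self.consecutive = 0

    def record_idle_timeout(self):
        self.consecutive += 1

    def record_progress(self):
        # Any progress resets the count; only an unbroken run trips the breaker.
        self.consecutive = 0

    @property
    def tripped(self):
        return self.consecutive >= self.limit
```

The loop would check `tripped` before each paid model call and halt instead of retrying.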

Auth profile rotation: Multiple profiles per provider, rotated on failures. MAX_SAME_MODEL_IDLE_TIMEOUT_RETRIES=1 before model switch.
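
The outer/inner structure can be sketched as two nested loops. The names below (`run_with_model_fallback`, `AttemptFailed`, the `attempt` callback) are illustrative Python stand-ins, not OpenClaw's API, and the sketch ignores result classification and the MAX_SAME_MODEL_IDLE_TIMEOUT_RETRIES limit.

```python
class AttemptFailed(Exception):
    """Hypothetical error raised when one model/profile attempt fails."""


def run_with_model_fallback(models, profiles, attempt):
    """Try each (provider, model) in order; within each, rotate auth profiles.

    `models` is a list of (provider, model) pairs, `profiles` maps a provider
    to its profile names, and `attempt(model, profile)` returns a result or
    raises AttemptFailed.
    """
    errors = []
    for provider, model in models:                # outer loop: model fallback
        for profile in profiles.get(provider, []):  # inner loop: profile rotation
            try:
                return attempt(model, profile)
            except AttemptFailed as err:
                errors.append((model, profile, str(err)))
    raise AttemptFailed(f"all models and profiles failed: {errors}")
```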

Post-Compaction Loop Guard: Detects loops where compaction happens but produces no progress.

Streaming watchdog: Armed on every delta. The TUI detects an idle stream and updates its status. Timeout: 30s.

Hermes Agent — Monolithic 14,672-line Agent

Core loop: while(api_call_count < max_iterations AND iteration_budget.remaining > 0). Checks: interrupt flag, iteration budget, steer drain, API retry loop (with rate limiting), compaction loop (max 3), process tool calls.

3-Level Ctrl+C Cascade:

1. Graceful interrupt: sets _interrupt_requested, per-thread signal, propagates to children
2. Clear & Queue: clear_interrupt(), busy mode queue (next turn)
3. /steer: Non-disruptive injection into next tool result (thread-safe)
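
One way to picture the three busy-mode behaviours (interrupt, queue, steer) is a small input router. This is a single-threaded sketch with hypothetical names (`BusyMode`, `BusyInputRouter`); Hermes's version is described as thread-safe, which this is not. Putting an interrupting message at the front of the queue is an assumption.

```python
from collections import deque
from enum import Enum


class BusyMode(Enum):
    INTERRUPT = "interrupt"
    QUEUE = "queue"
    STEER = "steer"


class BusyInputRouter:
    """Routes user input that arrives while the agent is mid-turn."""

    def __init__(self):
        self.interrupt_requested = False
        self.queued = deque()
        self.steer_notes = []

    def on_user_input(self, text, mode):
        if mode is BusyMode.INTERRUPT:
            # Ask the loop to stop, and run this message first afterwards.
            self.interrupt_requested = True
            self.queued.appendleft(text)
        elif mode is BusyMode.QUEUE:
            # Leave the current turn alone; handle the message next turn.
            self.queued.append(text)
        else:
            # Steer: hold the note until the next tool result is produced.
            self.steer_notes.append(text)

    def attach_steer(self, tool_result):
        """Append pending steer notes to a tool result, then clear them."""
        if not self.steer_notes:
            return tool_result
        notes = "\n".join(self.steer_notes)
        self.steer_notes.clear()
        return f"{tool_result}\n[user steer] {notes}"
```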

IterationBudget: The parent gets 90 iterations; subagents get 50. execute_code iterations are refunded. _budget_grace_call allows one extra call.
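
A sketch of a budget with refund and a one-time grace call, using the 90/50 figures from the notes. The class body and method names are assumptions; only the name IterationBudget and the numbers come from the study.

```python
PARENT_ITERATIONS = 90    # from the notes
SUBAGENT_ITERATIONS = 50  # from the notes


class IterationBudget:
    def __init__(self, total):
        self.total = total
        self.used = 0
        self.grace_used = False

    @property
    def remaining(self):
        return self.total - self.used

    def consume(self):
        """Spend one iteration; return False when the budget is exhausted."""
        if self.remaining <= 0:
            return False
        self.used += 1
        return True

    def refund(self):
        """Give one iteration back (the notes mention execute_code refunds)."""
        self.used = max(0, self.used - 1)

    def grace_call(self):
        """Allow exactly one extra call after the budget runs out."""
        if self.remaining > 0 or self.grace_used:
            return False
        self.grace_used = True
        return True
```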

Streaming: Always preferred path. 90s stale detection, 60-120s read timeout. Chunk iteration with last_chunk_time tracking. Ollama-specific tool call reuse fix.
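
Stale-stream detection can be sketched as a per-chunk timeout around an async iterator. This uses asyncio.wait_for rather than the last_chunk_time bookkeeping mentioned above, so it is a different mechanism with the same effect; `StaleStream` and `iter_with_idle_timeout` are hypothetical names.

```python
import asyncio


class StaleStream(Exception):
    """Raised when no chunk arrives within the idle timeout."""


async def iter_with_idle_timeout(stream, idle_timeout=90.0):
    """Yield chunks from `stream`; raise StaleStream if the gap between
    chunks exceeds `idle_timeout` seconds (90s in the notes)."""
    iterator = stream.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_timeout)
        except StopAsyncIteration:
            return
        except asyncio.TimeoutError:
            raise StaleStream(f"no chunk for {idle_timeout}s") from None
        yield chunk
```

A caller would catch StaleStream and decide whether to retry or fall back to a non-streaming request.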

Recovery flags: 30+ provider-specific retry flags. JSON repair (5 passes: strict, commas, braces, control chars, unicode). Surrogate sanitization. Dead connection cleanup. Message sequence repair. Credential pool rotation.
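
A multi-pass JSON repair can be sketched as a chain of increasingly aggressive fixes, each followed by a strict parse. This is not Hermes's code: the sketch covers four passes (strict, trailing commas, unclosed braces, control characters), omits the unicode pass, and its helpers are deliberately naive (they do not distinguish brackets or commas inside string literals).

```python
import json
import re


def _strip_trailing_commas(text):
    return re.sub(r",\s*([}\]])", r"\1", text)


def _close_open_brackets(text):
    # Naive: counts brackets without checking whether they sit inside strings.
    stack = []
    for ch in text:
        if ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    return text + "".join(reversed(stack))


def _replace_control_chars(text):
    return re.sub(r"[\x00-\x1f]", " ", text)


def repair_json(text):
    """Apply each fix in turn, on top of the previous ones, until a strict parse succeeds."""
    passes = [
        lambda s: s,  # pass 1: strict parse of the input as given
        _strip_trailing_commas,
        _close_open_brackets,
        _replace_control_chars,
    ]
    candidate = text
    for fix in passes:
        candidate = fix(candidate)
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not repair JSON")
```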

Passepartout Blindspot Assessment

1. No streaming tool execution: Passepartout's pipeline is strictly sequential. Claude Code executes tools during the API stream (hiding ~1s latency). Passepartout's streaming design (v0.7.1) should consider whether tool calls can execute while the stream is still open. [Action: v0.7.1 protocol design]
2. No recovery layers: Passepartout wraps think() in handler-case but has no auto-retry, no stale-stream detection, and no model fallback. All four compared agents have at least a retry or fallback layer. [Action: add to v0.9.0 signal pipeline]
3. No busy-mode: When the agent is running and the user types, Passepartout has no defined behavior. Hermes has interrupt/queue/steer. Passepartout should add queue mode at minimum. [Action: v0.9.0 priority queue]
4. Single SIGINT handler: Hermes has a 3-level Ctrl+C cascade, and the other three agents propagate cancellation through abort signals or a runner. Passepartout should match the cascade. [Action: v0.7.0 Ctrl keys]
5. Single compaction layer: Passepartout's foveal-peripheral pruning is a single strategy. Claude Code has 5. Open question: should Passepartout add reactive compaction (compact on a prompt-too-long error) and tool output summarization? [Action: context/compaction study needed]
6. No doom loop detection: OpenCode detects 3 consecutive identical tool calls. Hermes stops on budget exhaustion. Passepartout could loop forever on a stuck tool. [Action: add to v0.9.0]
