bump passepartout: v0.9.0 Warm TUI Redesign — blank slate

Complete rewrite of the TUI with a warm amber/gold color palette and a
clean three-zone layout (chat top, input bottom, status very bottom).

1. Layout restructure: input at y=h-3, hint at y=h-2, status at y=h-1
2. Warm palette: 20-key amber/gold theme, 8 warm presets
3. Readline keybindings: Ctrl+A/E/U/W/K/Y/L/D/F/G in :global keymap
4. Chat messages: user boxes (┌─└─), agent headers, collapsible tools
5. Command palette: Ctrl+P top-centered overlay, warm colors
6. Sidebar: Ctrl+B toggle, right panel with focus/rules/context/MCP
7. Keybindings: :ctrl+x, :?, mouse wheel support
8. Search: existing /search with match highlighting
9. Help overlay: ? shows keybinding and command reference
2026-05-13 19:13:20 -04:00
parent e27cffa4e0
commit 15d16fd520
7 changed files with 1045 additions and 783 deletions

@@ -125,25 +125,139 @@ The croatoan TUI is replaced entirely. cl-tty provides the widget set (box, text
~420 lines total.
** v0.9.0: Eval Harness — Safety Net First
Every subsequent release ships with automated regression protection. The eval harness is the gate that makes self-modification safe — before any neurosymbolic component modifies the system, the harness verifies nothing broke.
*** TODO Internal evaluation harness — 10 tasks, regression detection
:PROPERTIES:
:ID: id-v090-eval-harness
:CREATED: [2026-05-08 Fri]
** DONE v0.9.0: Warm TUI Redesign — Blank Slate
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-13 Wed]
:END:
- New skill: ~symbolic-evaluation.org~ → ~symbolic-evaluation.lisp~
- ~deftask~ macro: define an eval task with ~:setup~ (create test environment), ~:prompt~ (what to ask the agent), ~:verify~ (function that checks the output), ~:teardown~ (cleanup)
- ~run-eval-suite~: run all registered tasks, produce score (pass count / total), per-task diagnostics
- Initial 10 tasks: find TODOs, create Org note, search codebase, read file, query memory, list projects, run safe shell command, find definition, set TODO state, summarize session
- Regression mode: run after each version build. Fail CI if score drops.
- Task suite grows with codebase: every bug fix adds a regression task
~200 lines.
The v0.8.0 TUI has correct internal wiring but is unusable — input at the top
instead of the bottom, layout bugs where chat overwrites the status bar, and
Ctrl-key shortcuts silently fail. This version strips the TUI down to a clean
three-zone design with a warm amber/gold color palette inspired by OpenCode and
Gemini CLI. Everything in view/state/main is rewritten; only the daemon protocol
survives.
** v0.9.1: Emacs Development Environment — Secondary Client
*** Visual Mockup
#+begin_example
┌──────────────────────────────────────────────────────────────────┐
│ │
│ ┌─ you ─────────────────────────────────────────────────┐ │
│ │ Can you refactor the dispatcher pipeline? │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
│ ── passepartout ────────────────────────────────────────────── │
│ Sure. The issue is in run-gates — it calls predicates │
│ before checking type levels. Let me fix that. │
│ │
│ ┌─ shell: run tests ──── 0.3s ─────────────────────────┐ │
│ │ ✓ all 12 tests pass │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ────────────────────────────────────────────────────────────── │
│ │
│ > /focus stoa │
│ Ctrl+P palette │ Up/Dn history │ Tab complete │
├──────────────────────────────────────────────────────────────────┤
│ ● Connected stoa Rules:12 Cost:$0.42 14:30 │
└──────────────────────────────────────────────────────────────────┘
#+end_example
*** Three Zones
**** Zone 3 (bottom-most, 1 line): Status Bar (tmux-style)
Warm dark background (~#2A1F1A~), amber foreground (~#D4A574~). Always visible.
Left: ● Connected, project/focus name, rule count. Right: Session cost, clock.
No borders — background color alone defines the zone.
~30 lines.
**** Zone 2 (just above status, 2 lines): Input Area
Line 1 — ~>~ prompt (warm orange ~#FF8C42~), cursor visible. Readline keybindings
(Ctrl+A/E/U/W/K/Y), Up/Down history, Tab complete, Alt+Enter multi-line.
Line 2 — Context-sensitive hint bar (dim amber ~#A08060~):
Normal: Ctrl+P palette | Up/Dn history | Tab complete
Search: Up/Dn navigate | Enter jump | Esc exit
Dialog: Up/Dn select | Enter confirm | Esc dismiss
Slash commands appear as top-centered overlay dialogs.
~60 lines.
**** Zone 1 (scrollable, fills remaining space): Chat Area
User messages: boxed ~┌─ you ─┐~ / ~└─┘~ (bg #3A2A1A, fg #FFB347)
Agent messages: ~-- passepartout --~ header (fg #D4956A), body (fg #E8D5B7)
System: plain text (fg #C8A87C)
Tool calls: collapsible ~┌─ name -- 0.3s --┐~ (running #FF8C42, done #7CCC6C)
Gate traces: ~╎~ indented lines (pass green, block red, approval yellow)
Date separators between time blocks. Streaming inserts char by char.
~120 lines.
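The row arithmetic behind the three zones can be sketched as follows. This is a minimal illustration in Python rather than the project's Lisp, assuming the row positions given in the commit summary (input at y=h-3, hint at y=h-2, status at y=h-1); the names ~Layout~, ~compute_layout~ and ~MIN_HEIGHT~ are hypothetical and do not come from the codebase.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    chat_top: int     # first row of the scrollable chat area
    chat_height: int  # number of rows left for chat
    input_y: int      # row holding the "> " prompt
    hint_y: int       # row holding the context-sensitive hint bar
    status_y: int     # row holding the status bar


# Assumption: at least one chat row plus the three fixed rows.
MIN_HEIGHT = 4


def compute_layout(h: int) -> Layout:
    """Split a terminal of height h into chat, input, hint and status rows."""
    if h < MIN_HEIGHT:
        raise ValueError(f"terminal too short: {h} rows, need {MIN_HEIGHT}")
    return Layout(
        chat_top=0,
        chat_height=h - 3,  # everything above the three fixed rows
        input_y=h - 3,
        hint_y=h - 2,
        status_y=h - 1,
    )
```

Computing every row from ~h~ in one place means a resize only needs a single call before redrawing, which would avoid the overlap between chat and status bar described for v0.8.0.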
*** Warm Color Palette (19 keys, 8 presets)
| Token | Hex | Role |
|-------+-----+------|
| :user-fg | #FFB347 | User message text |
| :user-bg | #3A2A1A | User message background |
| :user-border | #CC8800 | User message box border |
| :agent-header | #D4956A | Agent message header |
| :agent-fg | #E8D5B7 | Agent message body |
| :system | #C8A87C | System notifications |
| :input-prompt | #FF8C42 | > prompt character |
| :input-fg | #E8D5B7 | Input text |
| :hint | #A08060 | Hint bar text |
| :status-bg | #2A1F1A | Status bar background |
| :status-fg | #D4A574 | Status bar text |
| :dot-connected | #7CCC6C | Status dot when connected |
| :dot-disconnected | #E2584A | Status dot when disconnected |
| :error | #E2584A | Error messages |
| :tool-running | #FF8C42 | In-progress tool |
| :tool-done | #7CCC6C | Completed tool |
| :separator | #4A3A2A | Horizontal rules |
| :accent | #FFB347 | Links, highlights |
| :dim | #8B7355 | Metadata, timestamps |
8 presets: amber, gold, terracotta, sepia, nord-warm, monokai-warm,
gruvbox-warm, light-amber.
~80 lines.
*** Build Plan (590 lines total)
| # | Task | Lines | Files |
|---+------+-------+-------|
| 1 | Layout restructure (+status bar) | 100 | view.org, main.org |
| 2 | Warm palette | 80 | state.org |
| 3 | Input area (readline keybindings) | 60 | main.org |
| 4 | Chat messages (boxes, headers, tools) | 120 | view.org |
| 5 | Command palette | 50 | main.org, state.org |
| 6 | Sidebar | 60 | view.org, main.org |
| 7 | Keybindings (all Ctrl in :global) | 50 | main.org |
| 8 | Search | 40 | main.org, view.org |
| 9 | Help overlay | 30 | main.org |
*** Keybinding Reference
| Key | Action |
|-----|--------|
| Enter | Send message |
| Alt+Enter | Newline |
| Up/Down | History cycle |
| Tab | Complete command/path |
| Ctrl+P | Command palette |
| Ctrl+B | Toggle sidebar |
| Ctrl+F | Search messages |
| Ctrl+L | Redraw |
| Ctrl+D | Quit prompt |
| Esc | Interrupt/dismiss |
| PageUp/Dn | Scroll chat |
| Ctrl+Q | Quit |
| ? | Help panel |
** v0.10.0: Emacs Development Environment — Secondary Client
cl-tty is the primary TUI (v0.8.0). The Emacs major mode is an optional secondary client for users who prefer Emacs-based workflows. Both clients communicate with the same daemon over the same TCP protocol — they are interchangeable frontends, not competing architectures.
@@ -211,7 +325,7 @@ Each command is a thin wrapper around ~passepartout-send~ (the existing TCP brid
Total: ~260 lines elisp, persisting through v2.0.0+.
** v0.10.0: Phase 0 — Type-Level Gates + Core Integrity (~75 lines)
** v0.11.0: Phase 0 — Type-Level Gates + Core Integrity (~75 lines)
:PROPERTIES:
:ID: id-v090-phase0
@@ -239,7 +353,7 @@ Existing FiveAM gate tests continue to pass. New test: signal at type-level 5 ta
This is Contribution 1 from ~notes/passepartout-whitehead.org~. Every type-level rejection emits a structured event that Phase 1 ingests as a fact. ~30 lines implement the seed of the ontology without any new dependencies. ~75 lines total, extends dispatcher, no new skill.
** v0.11.0: Full Markdown Rendering
** v0.12.0: Full Markdown Rendering
:PROPERTIES:
:ID: id-v071-markdown-full
:CREATED: [2026-05-08 Fri]
@@ -253,7 +367,7 @@ Extend the markdown renderer from v0.7.1:
- Syntax highlighting for code blocks: keyword/string/function colors from theme. Regex-based (no parser dependency).
- All markdown features degrade gracefully to plain text on terminals without attribute support. ~100 lines.
** v0.12.0: Phase 0b — Layered Signal Authentication, Layer 1 (~200 lines)
** v0.13.0: Phase 0b — Layered Signal Authentication, Layer 1 (~200 lines)
:PROPERTIES:
:ID: id-v090-phase0b
:CREATED: [2026-05-09 Sat]
@@ -332,7 +446,7 @@ The gate architecture is designed with all four layers from Phase 0b. Adding a l
~200 lines total. Depends on Phase 0 (type-level gates).
** v0.13.0: Tool Execution Visualization
** v0.14.0: Tool Execution Visualization
:PROPERTIES:
:ID: id-v071-tools
:CREATED: [2026-05-08 Fri]
@@ -347,7 +461,7 @@ When the agent invokes a tool:
Uses Croatoan's ~init-pair~ + ~color-pair~ for 256-color backgrounds on tool state regions. ~100 lines.
** v0.14.0: Phase 1 — Minimum Viable Fact Language (~200 lines, new skill)
** v0.15.0: Phase 1 — Minimum Viable Fact Language (~200 lines, new skill)
:PROPERTIES:
:ID: id-v090-phase1
:CREATED: [2026-05-09 Sat]
@@ -433,7 +547,7 @@ The policy table maps entity classes to ~:singular~, ~:dual~, or ~:plural~. Gate
~200 lines. New skill: ~symbolic-facts.org~. Depends on Phase 0b (auth).
** v0.15.0: Mouse Support
** v0.16.0: Mouse Support
:PROPERTIES:
:ID: id-v071-mouse
:CREATED: [2026-05-08 Fri]
@@ -448,7 +562,7 @@ Croatoan supports ncurses mouse mode via ~(setf mouse-enabled-p)~. Enable:
- Click on gate trace line to expand/collapse trace
~40 lines.
** v0.16.0: Phase 1a — Self-Preservation Mechanisms (~120 lines)
** v0.17.0: Phase 1a — Self-Preservation Mechanisms (~120 lines)
:PROPERTIES:
:ID: id-v090-phase1a
:CREATED: [2026-05-09 Sat]
@@ -494,7 +608,7 @@ The watchdog is outside the SBCL image. A dead process cannot restart itself. ~2
~120 lines. Extends existing skills. Depends on Phase 0-1.
** v0.17.0: Cost Display
** v0.18.0: Cost Display
:PROPERTIES:
:ID: id-v071-cost
:CREATED: [2026-05-08 Fri]
@@ -506,7 +620,7 @@ The watchdog is outside the SBCL image. A dead process cannot restart itself. ~2
- Color-coded: green under daily budget, yellow approaching, red exceeding
- Requires token counter infrastructure from v0.5.0. ~50 lines for display; token counting is v0.5.0 infrastructure.
** v0.18.0: Phase 2 — Screamer as Admission Gate (~200 lines, new skill)
** v0.19.0: Phase 2 — Screamer as Admission Gate (~200 lines, new skill)
:PROPERTIES:
:ID: id-v090-phase2
:CREATED: [2026-05-09 Sat]
@@ -554,7 +668,7 @@ This is the function the archivist calls before any LLM-proposed fact enters the
~200 lines. New skill: ~symbolic-screamer.org~. Depends on Phase 1 (triple store). Not an ASDF dependency — degrades gracefully.
** v0.19.0: Session Export
** v0.20.0: Session Export
:PROPERTIES:
:ID: id-v071-export
:CREATED: [2026-05-08 Fri]
@@ -568,7 +682,7 @@ Claude Code has ~/share~ (shareable URL). OpenCode has ~/export~ (Markdown). Her
- ~/export json~ outputs the session as JSON (for programmatic consumption)
~50 lines. Uses existing message vector and ~memory-object-render~ for Org formatting.
** v0.20.0: Phase 3 — Archivist as Fact Proposer (~100 lines, extends existing archivist)
** v0.21.0: Phase 3 — Archivist as Fact Proposer (~100 lines, extends existing archivist)
:PROPERTIES:
:ID: id-v090-phase3
:CREATED: [2026-05-09 Sat]
@@ -619,7 +733,7 @@ This is the safety net: if the LLM produces a bad extraction that passes Screame
~100 lines. Extends existing archivist skill. Depends on Phase 2 (Screamer).
** v0.21.0: Tool Output Spilling
** v0.22.0: Tool Output Spilling
:PROPERTIES:
:ID: id-v081-output-spill
:CREATED: [2026-05-08 Fri]
@@ -632,7 +746,7 @@ Claude Code saves tool results >30KB to ~/.claude/tool-results/ with a 200-line
- The LLM can ~read-file~ the full output if it needs to analyze it
~30 lines in ~core-loop-act.lisp~
** v0.22.0: Phase 4 — Sufficiency Criterion ("The Flip") (~50 lines)
** v0.23.0: Phase 4 — Sufficiency Criterion ("The Flip") (~50 lines)
:PROPERTIES:
:ID: id-v090-phase4
:CREATED: [2026-05-09 Sat]
@@ -686,7 +800,7 @@ Symbolic Index
~50 lines. Extends Phase 3 (archivist).
** v0.23.0: Read-Only Output Caching Within a Turn
** v0.24.0: Read-Only Output Caching Within a Turn
:PROPERTIES:
:ID: id-v081-cache-turn
:CREATED: [2026-05-08 Fri]
@@ -700,7 +814,7 @@ Claude Code caches read-only tool results within a turn. If the agent reads the
- Prevents redundant tool calls when the agent asks the same question twice within a reasoning step
~25 lines in ~programming-tools.lisp~
** v0.24.0: Skin Engine + 10 Presets
** v0.25.0: Skin Engine + 10 Presets
:PROPERTIES:
:ID: id-v072-skin-engine
:CREATED: [2026-05-08 Fri]
@@ -729,7 +843,7 @@ Claude Code caches read-only tool results within a turn. If the agent reads the
Shipped as part of the skin engine release — the engine with 0 presets is unusable. See Skin Engine TODO above for the preset definitions.
** v0.25.0: Phase 5 — VivaceGraph + Merkle DAG + Ontology Versioning (~400 lines, new skill)
** v0.26.0: Phase 5 — VivaceGraph + Merkle DAG + Ontology Versioning (~400 lines, new skill)
:PROPERTIES:
:ID: id-v090-phase5
:CREATED: [2026-05-09 Sat]
@@ -804,7 +918,7 @@ Queries accept an optional ~:ontology-version~ parameter. The default is ~:activ
~400 lines. New skill: ~symbolic-vivacegraph.org~. Depends on Phase 4 (sufficiency). Not an ASDF dependency — degrades to hash-table fallback.
** v0.26.0: Hooks on defskill — Lifecycle Interception
** v0.27.0: Hooks on defskill — Lifecycle Interception
:PROPERTIES:
:ID: id-v082-hooks
:CREATED: [2026-05-08 Fri]
@@ -819,7 +933,7 @@ Passepartout's skills can inject instructions and react to triggers but cannot i
- Hooks run in skill priority order. A ~:deny~ from any hook short-circuits the chain.
~50 lines in ~defskill~ macro + ~core-perceive.lisp~
** v0.27.0: Phase 6 — ACL2 Structural Verification (~200 lines, new skill)
** v0.28.0: Phase 6 — ACL2 Structural Verification (~200 lines, new skill)
:PROPERTIES:
:ID: id-v090-phase6
:CREATED: [2026-05-09 Sat]
@@ -853,7 +967,7 @@ ACL2 does not verify that ~rm -rf / is destructive. That is an empirical claim a
~200 lines. New skill: ~symbolic-acl2.org~. Depends on Phase 5 (VivaceGraph). Not an ASDF dependency — degrades gracefully.
** v0.28.0: Prompt Templates / Output Styles
** v0.29.0: Prompt Templates / Output Styles
:PROPERTIES:
:ID: id-v082-prompt-styles
:CREATED: [2026-05-08 Fri]
@@ -870,7 +984,7 @@ Claude Code has "output styles" (~default~, ~Explanatory~, ~Learning~). Hermes h
- Style changes are immediate (next think() call). Survive restarts via config persistence.
~100 lines (~60 prompt templates + ~40 TUI integration).
** v0.29.0: Skill Auto-Detection — File-Watch Hot-Reload
** v0.30.0: Skill Auto-Detection — File-Watch Hot-Reload
:PROPERTIES:
:ID: id-v082-auto-reload
:CREATED: [2026-05-08 Fri]
@@ -888,7 +1002,7 @@ Passepartout's image-based Lisp model enables hot-reload — redefine a function
- On compile error: keep the old version loaded, log the error, show TUI warning: ~"✗ Skill 'skill-name' failed to compile — old version retained."~
~80 lines in a new ~symbolic-file-watch.org~ skill.
** v0.30.0: Heavy Thinking Skill — Parallel Reasoning + Sequential Deliberation
** v0.31.0: Heavy Thinking Skill — Parallel Reasoning + Sequential Deliberation
:PROPERTIES:
:ID: id-v082-heavy-thinking
:CREATED: [2026-05-08 Fri]
@@ -906,7 +1020,7 @@ The HeavySkill paper (arXiv:2605.02396v1) demonstrates that a two-stage pipeline
- Cost model: 3 parallel + 1 deliberation = 4 API calls for complex tasks (vs 1 normally). ~HEAVY_THINKING_COST_MULTIPLIER~ env var for cost-aware auto-activation
~100 lines as a skill (~60 prompt template + ~40 orchestration in ~symbolic-heavy-thinking.org~).
** v0.31.0: Adaptive Layout (3 Tiers)
** v0.32.0: Adaptive Layout (3 Tiers)
:PROPERTIES:
:ID: id-v073-adaptive-layout
:CREATED: [2026-05-08 Fri]
@@ -918,7 +1032,7 @@ The HeavySkill paper (arXiv:2605.02396v1) demonstrates that a two-stage pipeline
Re-renders on terminal resize (already handled via ~KEY_RESIZE~). Content re-flows — not truncated. The layout remembers per-terminal-size preference. ~80 lines.
** v0.32.0: Spinner Personality
** v0.33.0: Spinner Personality
:PROPERTIES:
:ID: id-v073-spinner
:CREATED: [2026-05-08 Fri]
@@ -934,7 +1048,7 @@ Configurable spinner style per skin:
Stall indication: when no response for 10s, spinner color interpolates from theme color → error red (Claude Code pattern). Reduced motion preference: spinner replaced with slow-pulse ●. ~50 lines.
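One way the stall interpolation could work is a linear blend between two hex colors. The sketch below is in Python for brevity and rests on several assumptions: the blend starts at the 10 s mark, it completes over a hypothetical 20 s ramp (the text gives no duration), and the default colors are borrowed from the warm palette's ~:tool-running~ and ~:error~ tokens. The function names are illustrative only.

```python
def hex_to_rgb(color: str) -> tuple:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    digits = color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb: tuple) -> str:
    """Convert an (r, g, b) tuple back to '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def stall_color(elapsed: float,
                theme: str = "#FF8C42",   # assumed: :tool-running from the palette
                error: str = "#E2584A",   # assumed: :error from the palette
                start: float = 10.0,      # stall threshold from the text
                ramp: float = 20.0) -> str:  # hypothetical ramp duration
    """Return the spinner color after `elapsed` seconds without a response."""
    # t is 0 before the stall threshold and 1 once the ramp has finished.
    t = min(max((elapsed - start) / ramp, 0.0), 1.0)
    a, b = hex_to_rgb(theme), hex_to_rgb(error)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))
```

Clamping ~t~ keeps the color pinned at error red for arbitrarily long stalls instead of overshooting the channel range.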
** v0.33.0: Progress Bar
** v0.34.0: Progress Bar
:PROPERTIES:
:ID: id-v073-progress-bar
:CREATED: [2026-05-08 Fri]
@@ -946,7 +1060,7 @@ For measurable operations (file processing, test runs with known count, batch op
Uses 9 block characters for sub-character precision: ~[' ', '▏', '▎', '▍', '▌', '▋', '▊', '▉', '█']~ (Claude Code pattern). Color-coded by progress: red <25%, yellow 25-75%, green 75%+. ~25 lines.
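A sub-character bar of this kind might be rendered as below. This is a hedged Python sketch, not the project's implementation: ~render_bar~ and ~bar_color~ are hypothetical names, and the color names stand in for whatever theme tokens the skin provides.

```python
# The nine glyphs from the text: empty, one eighth ... seven eighths, full.
BLOCKS = [" ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█"]


def render_bar(fraction: float, width: int) -> str:
    """Render `fraction` (0.0 to 1.0) as a bar exactly `width` cells wide."""
    fraction = min(max(fraction, 0.0), 1.0)
    eighths = round(fraction * width * 8)  # progress measured in 1/8 cells
    full, remainder = divmod(eighths, 8)
    bar = BLOCKS[8] * full
    if full < width:
        bar += BLOCKS[remainder]                # partially filled cell
        bar += BLOCKS[0] * (width - full - 1)   # pad the rest with spaces
    return bar


def bar_color(fraction: float) -> str:
    """Color thresholds from the text: red <25%, yellow 25-75%, green 75%+."""
    if fraction < 0.25:
        return "red"
    if fraction < 0.75:
        return "yellow"
    return "green"
```

Working in eighths of a cell gives eight times the resolution of whole-character bars, so a 20-cell bar can show 160 distinct positions.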
** v0.34.0: Live Timestamps
** v0.35.0: Live Timestamps
:PROPERTIES:
:ID: id-v073-timestamps
:CREATED: [2026-05-08 Fri]
@@ -958,7 +1072,7 @@ Uses 9 block characters for sub-character precision: ~[' ', '▏', '▎', '▍',
- Timestamps update live (per-minute recalculation, not per-frame)
~40 lines.
** v0.35.0: Context-Sensitive Help
** v0.36.0: Context-Sensitive Help
:PROPERTIES:
:ID: id-v073-help
:CREATED: [2026-05-08 Fri]
@@ -972,7 +1086,7 @@ Press ~?~ to show available actions in current context:
Rendered as a dim help bar at the bottom of the screen (above input). Dismisses on any key or after 5 seconds. ~40 lines.
** v0.36.0: Phase 7 — 10-80-10 Planner (~500 lines, new skill, last phase)
** v0.37.0: Phase 7 — 10-80-10 Planner (~500 lines, new skill, last phase)
:PROPERTIES:
:ID: id-v090-phase7
:CREATED: [2026-05-09 Sat]
@@ -998,7 +1112,7 @@ Screamer returns a viable plan or reports unsolvability with the conflicting con
**** Plan verification
ACL2 proves that the plan contains no deadlocks (two subtasks waiting on each other), no dependency cycles (A depends on B depends on C depends on A), and no safety violations (no plan step requires a gate-blocked operation).
If verification fails, ACL2 identifies the failing subtask and the violated constraint. The planner re-decomposes the problematic branch (the existing ROADMAP's branch pruning, v0.61.0, but symbolically rather than neurally).
If verification fails, ACL2 identifies the failing subtask and the violated constraint. The planner re-decomposes the problematic branch (the existing ROADMAP's branch pruning, v0.62.0, but symbolically rather than neurally).
**** Neuro-symbolic boundary
The LLM handles the I/O boundaries:
@@ -1007,7 +1121,7 @@ The LLM handles the I/O boundaries:
- *Output* (10%): structured plan → natural language response. The verified plan plist is formatted as "I'll refactor the authentication module in 5 steps: 1) Create the OAuth2 client (depends on: nothing, modifies: auth/client.lisp) 2) Add the token store..." Small prompt, formulaic translation, ~150 tokens.
**** TUI visualization
The plan is rendered as an Org headline tree in the TUI, with each subtask as a node showing its terminal state (=todo=, =next-action=, =in-progress=, =done=, =blocked=, =stuck=), its constraints, and its verified properties. This is the same task tree visualization planned for v0.61.0, but with the addition of Screamer constraint annotations and ACL2 verification badges.
The plan is rendered as an Org headline tree in the TUI, with each subtask as a node showing its terminal state (=todo=, =next-action=, =in-progress=, =done=, =blocked=, =stuck=), its constraints, and its verified properties. This is the same task tree visualization planned for v0.62.0, but with the addition of Screamer constraint annotations and ACL2 verification badges.
*** Verification — ~6 FiveAM tests
1. ~test-goal-plist-from-natural-language~ — natural language input produces correct structured goal plist (LLM-dependent but formulaic; tested with deterministic mock).
@@ -1019,7 +1133,7 @@ The plan is rendered as an Org headline tree in the TUI, with each subtask as a
~500 lines. New skill: ~symbolic-planner.org~. Depends on Phase 6 (ACL2) + all prior phases.
** v0.36.1: Phase 8+ — Semantic Wikipedia Integration (TBD lines, optional acceleration)
** v0.37.1: Phase 8+ — Semantic Wikipedia Integration (TBD lines, optional acceleration)
:PROPERTIES:
:ID: id-v090-phase8
:CREATED: [2026-05-10 Sun]
@@ -1046,7 +1160,7 @@ How much Wikidata is the right amount? Loading entities referenced in the memex
TBD lines. New skill. Depends on Phase 5 (VivaceGraph).
** v0.37.0: Priority-Queue Signal Processing
** v0.38.0: Priority-Queue Signal Processing
:PROPERTIES:
:ID: id-v090-priority-queue
@@ -1066,7 +1180,7 @@ Replace the linear ~process-signal~ call chain with a priority-ordered signal qu
~80 lines in ~core-pipeline.lisp~ + ~30 lines TUI.
** v0.38.0: MVCC Memory Concurrency
** v0.39.0: MVCC Memory Concurrency
:PROPERTIES:
:ID: id-v090-mvcc
:CREATED: [2026-05-08 Fri]
@@ -1081,7 +1195,7 @@ Replace the linear ~process-signal~ call chain with a priority-ordered signal qu
~60 lines in ~core-memory.lisp~.
** v0.39.0: Structured Output Enforcement
** v0.40.0: Structured Output Enforcement
:PROPERTIES:
:ID: id-v090-structured-output
:CREATED: [2026-05-08 Fri]
@@ -1095,7 +1209,7 @@ Replace the linear ~process-signal~ call chain with a priority-ordered signal qu
~40 lines in ~core-reason.lisp~.
** v0.40.0: Doom-Loop Detection
** v0.41.0: Doom-Loop Detection
:PROPERTIES:
:ID: id-v090-doom-loop
@@ -1110,7 +1224,7 @@ OpenCode detects 3 consecutive identical tool calls and prompts the user. Withou
- Resets on any different tool call or successful output
~15 lines in ~core-loop-act.lisp~
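The detection rule can be sketched as a small counter keyed on the tool name and its arguments. This is an illustrative Python version under stated assumptions: ~DoomLoopDetector~ and its methods are hypothetical names, arguments are assumed to be JSON-serializable, and "identical" is taken to mean the same tool with the same arguments.

```python
import json


class DoomLoopDetector:
    """Flags a run of consecutive identical tool calls."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # 3 consecutive calls, per the text
        self._last_key = None
        self._count = 0

    def reset(self) -> None:
        self._last_key = None
        self._count = 0

    def record(self, tool: str, args: dict, succeeded: bool = False) -> bool:
        """Record one tool call; return True when the user should be prompted."""
        if succeeded:
            # A successful output clears the streak.
            self.reset()
            return False
        # Canonical JSON makes argument order irrelevant to the comparison.
        key = (tool, json.dumps(args, sort_keys=True))
        if key == self._last_key:
            self._count += 1
        else:
            # Any different call starts a new streak.
            self._last_key, self._count = key, 1
        return self._count >= self.threshold
```

A caller would invoke ~record~ once per tool call and pause for user confirmation whenever it returns true.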
** v0.41.0: Busy-Mode — Queue on Interrupt
** v0.42.0: Busy-Mode — Queue on Interrupt
:PROPERTIES:
:ID: id-v090-busy-mode
@@ -1125,7 +1239,7 @@ When the agent is processing a turn and the user types a message, the current be
- The priority queue (above) naturally supports this — user input queued during a turn has higher priority than heartbeats, lower than the active turn
~20 lines in ~core-pipeline.lisp~
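The ordering described above (active turn first, then queued user input, then heartbeats) maps naturally onto a binary heap. The following Python sketch illustrates the idea only; the numeric priorities, the signal kind names and the ~SignalQueue~ class are assumptions, not the daemon's actual data structures.

```python
import heapq
import itertools

# Lower number = served first. Only the relative order comes from the text.
PRIORITY = {"active-turn": 0, "user-input": 1, "heartbeat": 2}


class SignalQueue:
    """Priority queue that keeps FIFO order among signals of equal priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserving arrival order

    def push(self, kind: str, payload) -> None:
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), kind, payload))

    def pop(self):
        """Remove and return the (kind, payload) pair with highest priority."""
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self) -> int:
        return len(self._heap)
```

The sequence counter ensures two queued user messages are delivered in the order they were typed, and it also prevents the heap from ever comparing payloads directly.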
** v0.42.0: CLI / Non-Interactive Mode
** v0.43.0: CLI / Non-Interactive Mode
:PROPERTIES:
:ID: id-v090-cli
@@ -1141,7 +1255,7 @@ Claude Code supports ~claude -p "fix the failing test" --print~. Hermes has ~her
- Uses the existing wire protocol — no new protocol, just a CLI wrapper around the framed TCP message format
~80 lines in ~passepartout~ bash script + ~50 lines daemon handler.
** v0.43.0: Provider Health Tracking
** v0.44.0: Provider Health Tracking
:PROPERTIES:
:ID: id-v090-provider-health
@@ -1157,7 +1271,7 @@ Claude Code supports ~claude -p "fix the failing test" --print~. Hermes has ~her
- Telemetry: provider health data feeds the session telemetry system
~60 lines in ~neuro-provider.lisp~ + ~30 lines TUI.
** v0.44.0: Cost-Based Provider Routing
** v0.45.0: Cost-Based Provider Routing
:PROPERTIES:
:ID: id-v090-cost-routing
@@ -1172,7 +1286,7 @@ Claude Code supports ~claude -p "fix the failing test" --print~. Hermes has ~her
- ~/routing~ TUI command: displays current cascade order with scores and reasons
~40 lines in ~core-reason.lisp~
** v0.45.0: Intelligent Provider Fallback — Per-Task-Type Routing
** v0.46.0: Intelligent Provider Fallback — Per-Task-Type Routing
:PROPERTIES:
:ID: id-v090-intelligent-fallback
@@ -1188,7 +1302,7 @@ Current fallback is "try the next provider." But different providers excel at di
- Bootstrap from defaults: GPT-4/Claude for reasoning, DeepSeek for code, Groq for chat, local Ollama for reflex
~60 lines in ~neuro-router.lisp~
** v0.46.0: Autonomous Certification Badge
** v0.47.0: Autonomous Certification Badge
:PROPERTIES:
:ID: id-v090-certification
@@ -1204,7 +1318,7 @@ After N HITL approvals of the same pattern, the dispatcher auto-approves it. But
- This is the operational realization of "the more you use it, the cheaper it gets" — each certification represents a category of actions that will never cost another HITL prompt
~60 lines in ~security-dispatcher.lisp~ + sidebar rendering reuse.
** v0.47.0: Certification Progress Bar
** v0.48.0: Certification Progress Bar
:PROPERTIES:
:ID: id-v090-cert-progress
@@ -1218,7 +1332,7 @@ The certification badge grants permanent auto-approval. Users need to see this h
- Certification velocity: ~"+2 certified this week"~ trend indicator in sidebar
~30 lines on top of existing sidebar rendering.
** v0.48.0: Update Mechanism + Migrations
** v0.49.0: Update Mechanism + Migrations
:PROPERTIES:
:ID: id-v090-update
@@ -1231,10 +1345,10 @@ No update mechanism exists. Users must manually ~git pull~ and re-run ~passepart
- ~passepartout update~ (git-based) — ~git fetch --tags && git checkout v0.5.1~, incremental tangle (only org files changed since previous tag, via ~git diff --name-only v0.5.0..v0.5.1 -- org/*.org~), recompile changed lisp files, restart daemon
- Migration hooks: ~~/memex/system/migrations/~ — ordered Lisp scripts run after tangle, before daemon restart. ~migrate-v051.lisp~ upgrades memory format, config schema, package names. Tracked by ~*migration-version*~ in ~~/.config/passepartout/version.lisp~
- Post-update verification: run internal eval suite, verify skill count ≥ 10, smoke test daemon port 9105. On failure: ~passepartout update --rollback~ → ~git checkout v0.5.0~ → re-tangle → restart
- Binary update path (when v0.63.0 ships): download binary from GitHub Releases, verify SHA-256, replace, restart
- Binary update path (when v0.64.0 ships): download binary from GitHub Releases, verify SHA-256, replace, restart
~80 lines bash + ~50 lines Lisp.
** v0.49.0: Self-Configuration — Agent Proposes and Applies Config Changes
** v0.50.0: Self-Configuration — Agent Proposes and Applies Config Changes
:PROPERTIES:
:ID: id-v090-self-config
@@ -1251,11 +1365,11 @@ Passepartout's config is text files (`.env`, `.lisp`) — the same format the ag
Four tiers of self-configuration:
1. **Config Query** (v0.7.2) — "What providers do I have?" → answered from system prompt CONFIG section. Already implemented.
2. **Config Suggest** (v0.49.0) — "Should I use a cheaper model?" → agent analyzes telemetry, proposes specific config change with estimated savings. User decides.
3. **Config Apply** (v0.49.0) — "Add @credentials to privacy tags" → agent proposes change → HITL review → writes `.env` → daemon reloads → change takes effect within one think() cycle.
4. **Config Optimize** (v0.49.0) — "Make yourself cheaper" → agent analyzes cost patterns across all sessions, proposes multi-key optimization. User approves full batch.
2. **Config Suggest** (v0.50.0) — "Should I use a cheaper model?" → agent analyzes telemetry, proposes specific config change with estimated savings. User decides.
3. **Config Apply** (v0.50.0) — "Add @credentials to privacy tags" → agent proposes change → HITL review → writes `.env` → daemon reloads → change takes effect within one think() cycle.
4. **Config Optimize** (v0.50.0) — "Make yourself cheaper" → agent analyzes cost patterns across all sessions, proposes multi-key optimization. User approves full batch.
** v0.50.0: Self-Diagnosis Coach — ~/coach~ Command
** v0.51.0: Self-Diagnosis Coach — ~/coach~ Command
:PROPERTIES:
:ID: id-v090-coach
@@ -1267,7 +1381,7 @@ Telemetry data plus the agent's self-knowledge enables coaching: the agent detec
- ~/coach~ — analyzes telemetry from the last N sessions, produces a coaching report with 3-5 actionable tips. Coaching is opt-in (privacy-respecting — no data leaves the machine).
~50 lines in telemetry skill + ~30 lines TUI rendering.
** v0.51.0: Failure Attribution — Tag Task Failures with Probable Component
** v0.52.0: Failure Attribution — Tag Task Failures with Probable Component
:PROPERTIES:
:ID: id-v090-failure-attribution
@@ -1278,10 +1392,10 @@ AHE (arXiv:2604.25850v2) shows that evolution loops work when failures are attri
- In telemetry skill: when a session ends with a task failure, classify as: ~:tool-failure~, ~:gate-overblock~, ~:gate-underblock~, ~:reasoning-error~, ~:context-overflow~, ~:timeout~
- Classification is deterministic: if last action was blocked by dispatcher → gate-overblock. If last action was a tool error → tool-failure. If last action was a successful tool call but wrong output → reasoning-error.
- Feeds the Skill Creator (v0.57.0) — the agent knows *which* component to fix, not just *that* something went wrong
- Feeds the Skill Creator (v0.58.0) — the agent knows *which* component to fix, not just *that* something went wrong
~20 lines in telemetry skill.
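The three deterministic rules spelled out above can be sketched as a short decision chain. This Python version is illustrative only: the ~LastAction~ record and its field names are hypothetical, and the remaining tags (~:gate-underblock~, ~:context-overflow~, ~:timeout~) are left unclassified because the text does not give their rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastAction:
    """Hypothetical summary of the final action in a failed session."""
    blocked_by_dispatcher: bool = False
    tool_error: bool = False
    tool_succeeded: bool = False
    output_correct: Optional[bool] = None  # None when correctness is unknown


def classify_failure(action: LastAction) -> Optional[str]:
    """Apply the rules in the order they are listed in the text."""
    if action.blocked_by_dispatcher:
        return "gate-overblock"
    if action.tool_error:
        return "tool-failure"
    if action.tool_succeeded and action.output_correct is False:
        return "reasoning-error"
    return None  # rules for the other tags are not specified in the source
```

Because the rules are checked in a fixed order with no model call involved, the same session always yields the same tag.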
** v0.52.0: MCP Native Client
** v0.53.0: MCP Native Client
:PROPERTIES:
:ID: id-v100-mcp
@@ -1296,7 +1410,7 @@ AHE (arXiv:2604.25850v2) shows that evolution loops work when failures are attri
~200 lines as a new skill ~mcp-client.org~.
** v0.53.0: Web Search + Web Fetch Tools
** v0.54.0: Web Search + Web Fetch Tools
:PROPERTIES:
:ID: id-v100-web
:CREATED: [2026-05-08 Fri]
@@ -1309,7 +1423,7 @@ Claude Code has ~WebSearchTool~ + ~WebFetchTool~. Hermes has ~firecrawl-py~ + ~e
- Both register via ~def-cognitive-tool~ as read-only tools (auto-approve via v0.7.2 safe-tool allowlist)
~150 lines as a new skill ~programming-web.org~. No external Python/Node.js process.
** v0.54.0: LSP Integration
** v0.55.0: LSP Integration
:PROPERTIES:
:ID: id-v100-lsp
@@ -1325,7 +1439,7 @@ Claude Code uses LSP for code intelligence — find definitions, find references
- LSP servers installed by the user (e.g., ~npm install -g typescript-language-server~). Passepartout auto-discovers installed servers via PATH.
~200 lines. Register as read-only cognitive tools. No daemon protocol changes — LSP is a background process, not a rendering concern.
** v0.55.0: ~debug-inspect~ Cognitive Tool
** v0.56.0: ~debug-inspect~ Cognitive Tool
:PROPERTIES:
:ID: id-v100-debug-inspect
@@ -1340,7 +1454,7 @@ Lisp enables live state inspection that no TypeScript/Python agent can match. Cl
- The agent can introspect its own state to answer meta-questions: "How many objects are in memory?" "What skills are loaded?" "What was the last HITL decision?"
~30 lines in ~programming-repl.lisp~ (extends existing repl-eval with safety guard).
** v0.56.0: Session Transcripts — ~/memex/system/sessions/~
** v0.57.0: Session Transcripts — ~/memex/system/sessions/~
:PROPERTIES:
:ID: id-v100-transcripts
@@ -1357,7 +1471,7 @@ Passepartout has no session persistence beyond Merkle tree snapshots. Chat histo
- Survives daemon restarts. Resume via ~/resume <date-title>~ (existing session resume from v0.7.2)
~80 lines in ~core-transport.lisp~ (append on message send) + reuse existing Org rendering.
** v0.57.0: Auto-Memory Extraction — Learnings from Sessions
** v0.58.0: Auto-Memory Extraction — Learnings from Sessions
:PROPERTIES:
:ID: id-v100-auto-memory
@@ -1373,7 +1487,7 @@ Claude Code's ~extractMemories~ runs at the end of each query loop, scanning the
- Opt-out via ~AUTO_MEMORY=false~ env var. Extraction frequency capped at one per minute to prevent runaway API costs.
~80 lines in ~core-reason.lisp~ + reuse session transcript for context.
** v0.58.0: Universal Cross-Project Org Query
** v0.59.0: Universal Cross-Project Org Query
:PROPERTIES:
:ID: id-v100-org-query
@@ -1388,7 +1502,7 @@ Passepartout's entire memex is Org — one format for memory, tasks, documents,
- ~(org-query :limit 20 :sort :priority)~ — sorted, capped results.
~150 lines in ~programming-org.lisp~ (extends existing Org manipulation primitives).
** v0.59.0: Skill Creator — LLM-Drafted, Verified Skills
** v0.60.0: Skill Creator — LLM-Drafted, Verified Skills
:PROPERTIES:
:ID: id-v110-skill-creator
@@ -1401,7 +1515,7 @@ Passepartout's entire memex is Org — one format for memory, tasks, documents,
- Skills are the primary extension mechanism for users. The Skill Creator makes skill authoring accessible to non-Lisp-programmers: describe what you want in English, the LLM drafts the Org file, the system verifies it, and the skill is live.
~150 lines as a new skill ~symbolic-skill-creator.org~.
** v0.60.0: Change Manifest — Skills Ship with Falsifiable Predictions
** v0.61.0: Change Manifest — Skills Ship with Falsifiable Predictions
:PROPERTIES:
:ID: id-v110-change-manifest
@@ -1416,7 +1530,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- The change manifest persists in the skill's Org file — every skill carries its own evidence ledger.
~40 lines in Skill Creator + telemetry integration.
** v0.61.0: Long-Horizon Planning (Task Tree DAG)
** v0.62.0: Long-Horizon Planning (Task Tree DAG)
:PROPERTIES:
:ID: id-v110-planning
@@ -1431,7 +1545,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- TUI task tree visualization: a collapsible Org headline tree rendered in the chat area. Each node shows its terminal state with a colored indicator (~○~ todo, ~▶~ next-action, ~◉~ in-progress, ~✓~ done, ~✗~ blocked, ~⏸~ stuck). Nodes expand/collapse on Enter. The tree updates in real time as the agent progresses through subtasks.
~200 lines.
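
The state indicators and collapse behavior described above can be sketched as a recursive renderer over plist nodes. The node shape (~:title~, ~:state~, ~:children~, ~:collapsed~) and the function names are assumptions made for illustration; the actual task tree representation is not shown in this roadmap.

#+begin_src lisp
;; Hypothetical sketch: node layout and names are illustrative only.
(defparameter *task-state-glyphs*
  '((:todo . "○") (:next-action . "▶") (:in-progress . "◉")
    (:done . "✓") (:blocked . "✗") (:stuck . "⏸"))
  "Indicator glyph for each terminal state of a task node.")

(defun state-glyph (state)
  "Return the glyph for STATE, or \"?\" for an unknown state."
  (or (cdr (assoc state *task-state-glyphs*)) "?"))

(defun render-task-tree (node &optional (depth 0) (stream *standard-output*))
  "Write NODE and its visible descendants to STREAM, one line per node.
Children of a node whose :collapsed property is true are skipped."
  (format stream "~a~a ~a~%"
          (make-string (* 2 depth) :initial-element #\Space)
          (state-glyph (getf node :state))
          (getf node :title))
  (unless (getf node :collapsed)
    (dolist (child (getf node :children))
      (render-task-tree child (1+ depth) stream))))
#+end_src

Toggling ~:collapsed~ on a node and re-rendering is enough to model expand/collapse; colors and real-time updates are left out of the sketch.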
** v0.62.0: Tier Classifier Fix
** v0.63.0: Tier Classifier Fix
:PROPERTIES:
:ID: id-v110-tier-fix
@@ -1444,7 +1558,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- The classifier should be a skill, not core infrastructure — reloadable and replaceable without restart.
~40 lines.
** v0.63.0: SWE-Bench Harness
** v0.64.0: SWE-Bench Harness
:PROPERTIES:
:ID: id-v120-swebench
@@ -1457,7 +1571,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- Target: competitive score with Claude Code and OpenClaw on SWE-bench-verified by v1.0.0.
~200 lines.
** v0.64.0: Computer Use / Vision
** v0.65.0: Computer Use / Vision
:PROPERTIES:
:ID: id-v120-vision
@@ -1470,7 +1584,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- Use case: "open Firefox, search for the Passepartout GitHub repo, and star it."
~100 lines.
** v0.65.0: Telemetry / Observability
** v0.66.0: Telemetry / Observability
:PROPERTIES:
:ID: id-v120-telemetry
@@ -1484,7 +1598,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- Feeds the evaluation harness (SWE-bench trajectory data comes from the same telemetry system)
~200 lines as a new skill ~symbolic-telemetry.org~. No daemon protocol changes.
** v0.66.0: Consensus Loop
** v0.67.0: Consensus Loop
:PROPERTIES:
:ID: id-v130-consensus
@@ -1497,7 +1611,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- TUI consensus display: collapsible region listing each provider, its model, its proposal, and its confidence score. ~✓ 3/3 providers agree~ in green; ~✗ 2/3 agree~ in yellow.
~80 lines.
** v0.67.0: GTD Integration
** v0.68.0: GTD Integration
:PROPERTIES:
:ID: id-v130-gtd
@@ -1510,7 +1624,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- TUI agenda view: ~/agenda~ command renders Org-agenda as formatted scrollable region within the chat area.
~150 lines.
** v0.68.0: Deep Emacs Integration
** v0.69.0: Deep Emacs Integration
:PROPERTIES:
:ID: id-v130-emacs
@@ -1523,7 +1637,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- Refile and archive: agent refiles headlines between Org files and archives completed items.
~300 lines.
** v0.69.0: Save-Lisp-and-Die Binary
** v0.70.0: Save-Lisp-and-Die Binary
:PROPERTIES:
:ID: id-v140-save-lisp
@@ -1538,7 +1652,7 @@ AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit shi
- Add FiveAM test: the deterministic path succeeds on a system with all dependencies pre-installed; the LLM-assisted path correctly classifies 10 common package-manager error messages.
~200 lines Lisp + build configuration.
** v0.70.0: Channels + Providers — Match OpenClaw on Demand
** v0.71.0: Channels + Providers — Match OpenClaw on Demand
:PROPERTIES:
:ID: id-v100-channels
@@ -1553,34 +1667,34 @@ The daemon protocol is client-agnostic hex-framed plists over TCP. Every new cha
No separate releases. Done when needed, shipped when ready.
** v0.71.0: Lish Shell
** v0.72.0: Lish Shell
- plist-returning commands: ~(ls :path "~/memex/projects/")~ → structured result
- Pipe as function composition: ~(pipe (ls ...) (filter :state 'TODO))~
- Org-buffer output: shell output rendered as Org headlines
- External bash compatibility: ~(bash "npm run build")~ → plist with exit code, stdout, stderr
~500 lines CL. Useful immediately for the agent.
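
A minimal sketch of the "pipe as function composition" idea, assuming each command returns a list of plists. ~filter-by~ and ~*sample-rows*~ are hypothetical stand-ins (the roadmap's own ~ls~ and ~filter~ are not defined here), and this ~pipe~ is one possible reading of the intended semantics rather than Lish's actual definition.

#+begin_src lisp
;; Hypothetical sketch: commands yield lists of plists, and PIPE threads
;; a value through a series of one-argument functions.
(defun filter-by (key value)
  "Return a function that keeps only the plists whose KEY is EQL to VALUE."
  (lambda (rows)
    (remove-if-not (lambda (row) (eql (getf row key) value)) rows)))

(defun pipe (input &rest stages)
  "Apply each function in STAGES to the running result, left to right."
  (reduce (lambda (acc stage) (funcall stage acc))
          stages
          :initial-value input))

;; Stand-in for the structured result of a command such as (ls ...).
(defparameter *sample-rows*
  '((:title "Write docs" :state todo)
    (:title "Ship build" :state done)
    (:title "Fix parser" :state todo)))

;; Keeps the two rows whose :state is TODO, in their original order.
(pipe *sample-rows* (filter-by :state 'todo))
#+end_src

Because every stage is an ordinary function from rows to rows, new stages (sorting, limiting, projecting keys) compose without any change to ~pipe~ itself.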
** v0.72.0: Buffer-as-CLOS Prototype
** v0.73.0: Buffer-as-CLOS Prototype
- buffer class: source (file path or Org AST), content, cursor, marks, overlays
- Key editing primitives: insert, delete, move, search, replace
- Org-AST-backed: editing mutates the AST, text rendering is a view
~300 lines CL. No display dependency.
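
The slot list and editing primitives above might look roughly like the following. This is a simplified sketch under a stated assumption: ~content~ is a plain string here, whereas the plan is for an Org AST with text rendering as a view. The class name ~sketch-buffer~ and the generic function names are hypothetical.

#+begin_src lisp
;; Hypothetical sketch: string-backed for brevity; the roadmap's buffer
;; is meant to be backed by an Org AST instead.
(defclass sketch-buffer ()
  ((source   :initarg :source  :initform nil :accessor buffer-source)
   (content  :initarg :content :initform ""  :accessor buffer-content)
   (cursor   :initarg :cursor  :initform 0   :accessor buffer-cursor)
   (marks    :initform '() :accessor buffer-marks)
   (overlays :initform '() :accessor buffer-overlays)))

(defgeneric buffer-insert (buffer text)
  (:documentation "Insert TEXT at the cursor and move the cursor past it."))

(defmethod buffer-insert ((buf sketch-buffer) (text string))
  (with-accessors ((content buffer-content) (cursor buffer-cursor)) buf
    (setf content (concatenate 'string
                               (subseq content 0 cursor)
                               text
                               (subseq content cursor)))
    (incf cursor (length text))
    buf))

(defgeneric buffer-delete (buffer count)
  (:documentation "Delete up to COUNT characters forward from the cursor."))

(defmethod buffer-delete ((buf sketch-buffer) (count integer))
  (with-accessors ((content buffer-content) (cursor buffer-cursor)) buf
    ;; Clamp the end position so deleting past the end is harmless.
    (let ((end (min (length content) (+ cursor count))))
      (setf content (concatenate 'string
                                 (subseq content 0 cursor)
                                 (subseq content end)))
      buf)))
#+end_src

Generic functions leave room for an AST-backed subclass to specialize the same primitives later, which fits the "no display dependency" goal.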
** v0.73.0: EQL5 Feasibility
** v0.74.0: EQL5 Feasibility
- Add EQL5 to Quicklisp dependencies (optional, like croatoan)
- Compile and verify on Linux (primary target)
- Single QML window: "Passepartout" title, 800x600, dark background
- Verify event loop integration with SBCL threads
~100 lines QML + build config.
** v0.74.0: EQL5 TCP Client
** v0.75.0: EQL5 TCP Client
- QML window with terminal widget, input area, status bar
- Connects to daemon via existing framed TCP protocol
- Renders agent responses, gate trace, sidebar panels as QML components
- Lives alongside croatoan TUI (two clients, one daemon)
~300 lines QML + ~200 lines CL.
** v0.75.0: Minibuffer Prototype
** v0.76.0: Minibuffer Prototype
- Universal command line at bottom of Qt window
- /chat /edit /shell /eval dispatch
- Goes through same gate stack as agent actions
@@ -1592,7 +1706,7 @@ v1.0.0 is where the agent achieves symbolic-first reasoning in the 10-80-10 arch
Hallucination becomes structurally impossible because the symbolic engine will not accept a fact that contradicts its knowledge graph. Safety becomes provable because ACL2 can prove properties about the system's behavior. Self-improvement becomes stable because the agent modifies skills that are then verified before execution.
The system is benchmarked against SWE-bench (competitive score with Claude Code and OpenClaw), verified under concurrent load (MVCC from v0.38.0), and validated by the eval harness (v0.9.0). The 10-80-10 planner operates on a mature symbolic index seeded from months of gate outcomes, Screamer deductions, LLM-proposed facts with provenance, and human-authored facts.
The system is benchmarked against SWE-bench (competitive score with Claude Code and OpenClaw), verified under concurrent load (MVCC from v0.39.0), and validated by the eval harness (v0.9.0). The 10-80-10 planner operates on a mature symbolic index seeded from months of gate outcomes, Screamer deductions, LLM-proposed facts with provenance, and human-authored facts.
The TUI at v1.0.0 is competitive: streaming responses, gate trace visualization, sidebar with 10 panels, skin system with 10+ presets, adaptive layout, full markdown, mouse support, spinner personality, and progress bars. The sidebar's gate trace, focus map, rule counter, sufficiency score, and provenance breakdown are capabilities no competitor can replicate — Passepartout's permanent UX differentiator.