diff --git a/docs/ROADMAP.org b/docs/ROADMAP.org
index 6144d60..b931f19 100644
--- a/docs/ROADMAP.org
+++ b/docs/ROADMAP.org
@@ -1532,6 +1532,33 @@ Claude Code has ~/share~ (shareable URL). OpenCode has ~/export~ (Markdown). Her
 - ~/export json~ outputs the session as JSON (for programmatic consumption)
 ~50 lines. Uses existing message vector and ~memory-object-render~ for Org formatting.
 
+*** TODO Tool output spilling — large results to file
+:PROPERTIES:
+:ID: id-v081-output-spill
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Claude Code saves tool results >30KB to ~/.claude/tool-results/ with a 200-line preview in the response. Passepartout currently includes all output inline — which consumes context budget and makes the chat log unreadable after a large build output or log dump.
+
+- In ~action-tool-execute~: if tool output exceeds 5,000 chars, save full output to ~~/memex/system/sessions/tool-outputs/--.txt~
+- In the response, replace full output with: ~[Output: 12,847 chars. Full output saved to ~/memex/system/sessions/tool-outputs/2026-05-08-grep-a1b2c3.txt. Top 2,000 chars:]~ followed by truncated preview
+- The LLM can ~read-file~ the full output if it needs to analyze it
+~30 lines in ~core-loop-act.lisp~
+
+*** TODO Read-only output caching within a turn
+:PROPERTIES:
+:ID: id-v081-cache-turn
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Claude Code caches read-only tool results within a turn. If the agent reads the same file twice, the second read returns cached content — no disk I/O, no context waste. Passepartout re-executes the tool.
+
+- ~*turn-result-cache*~ hash table keyed by ~(cons tool-name args-hash)~, cleared at the start of each ~think()~ cycle
+- Read-only tools (read-file, search-files, find-files, list-directory, org-find-headline, org-agenda-today, lsp-*) check the cache before executing
+- Cache hit: return stored result with ~[cached]~ prefix in the response
+- Prevents redundant tool calls when the agent asks the same question twice within a reasoning step
+~25 lines in ~programming-tools.lisp~
+
 ** v0.8.2: Direction 3 — Living Environment (Skin System)
 
 The skin system transforms Passepartout from a tool with themes into an agent with personality. Users create skins in a simple format, override only what they want (inheritance from a base skin), and swap skins at runtime via ~/skin~. The spinner has personality. The borders have personality. The agent's name and welcome message are skin-customizable.
@@ -1758,6 +1785,66 @@ Claude Code supports ~claude -p "fix the failing test" --print~. Hermes has ~her
 - Uses the existing wire protocol — no new protocol, just a CLI wrapper around the framed TCP message format
 ~80 lines in ~passepartout~ bash script + ~50 lines daemon handler.
 
+*** TODO Provider health tracking — success rate + latency
+:PROPERTIES:
+:ID: id-v090-provider-health
+:CREATED: [2026-05-08 Fri]
+:END:
+
+~backend-cascade-call~ tries providers in order until one succeeds. On failure it moves to the next. But it has no memory of which providers failed or succeeded in the past. A degraded provider gets retried first on every call.
+
+- ~*provider-health*~ hash table: maps provider keyword to ~(:success-count :fail-count :total-latency :last-status <:ok|:degraded|:down>)~
+- Updated after each ~backend-cascade-call~: increment success/fail, rolling average latency (last 10 calls)
+- ~provider-health-score~ function: returns a score 0-100 based on success rate (weight 0.6) and latency vs baseline (weight 0.4)
+- ~/provider-status~ TUI command: displays a table of all providers with status indicators (~● Up, ◐ Degraded, ○ Down~) and recent history
+- Telemetry: provider health data feeds the session telemetry system
+~60 lines in ~neuro-provider.lisp~ + ~30 lines TUI.
+
+*** TODO Cost-based provider routing
+:PROPERTIES:
+:ID: id-v090-cost-routing
+:CREATED: [2026-05-08 Fri]
+:END:
+
+~backend-cascade-call~ currently tries providers in registration order. With cost tracking (v0.5.0) and provider health (above), the cascade can be sorted by cost-effectiveness.
+
+- ~COST_ROUTING~ env var (default ~true~): when enabled, sort the cascade by ~(provider-health-score * 0.3 + cost-savings-score * 0.7)~
+- ~cost-savings-score~: cheap providers score high. Free providers (Ollama local) score 100. Expensive providers (GPT-4) score 10.
+- Health override: a provider with score < 20 (degraded) is demoted below healthy providers regardless of cost
+- ~/routing~ TUI command: displays current cascade order with scores and reasons
+~40 lines in ~core-reason.lisp~
+
+*** TODO Intelligent provider fallback — per-task-type routing
+:PROPERTIES:
+:ID: id-v090-intelligent-fallback
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Current fallback is "try the next provider." But different providers excel at different tasks. DeepSeek is strong at code generation. Groq is fast for simple queries. Claude is better at reasoning. The cascade should adapt to the task.
+
+- ~*task-provider-scores*~ hash table: maps ~(task-type keyword) → (provider keyword → score)~
+- Task types: ~:chat~ (conversation), ~:code~ (code generation/editing), ~:plan~ (multi-step planning), ~:search~ (information retrieval), ~:summary~ (compaction), ~:reflex~ (deterministic lookup)
+- Scores updated after each call: if the response was accepted (no rejection retry), increment that provider's score for that task type
+- When the primary provider fails, the fallback picks the highest-scored provider for the current task type (not just the next in line)
+- Bootstrap from defaults: GPT-4/Claude for reasoning, DeepSeek for code, Groq for chat, local Ollama for reflex
+~60 lines in ~neuro-router.lisp~
+
+*** TODO Internal evaluation harness — 10 tasks, regression detection
+:PROPERTIES:
+:ID: id-v090-eval-harness
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Moved from v0.12.0: the internal eval harness must ship before v0.10.0 so it can validate the Signal Pipeline (v0.9.0) and catch regressions from MCP Tools (v0.10.0), Planning (v0.11.0), and beyond. The SWE-bench competitive scoring harness remains at v0.12.0 — this is the lightweight internal suite.
+
+- New skill: ~symbolic-evaluation.org~ → ~symbolic-evaluation.lisp~
+- ~deftask~ macro: define an eval task with ~:setup~ (create test environment), ~:prompt~ (what to ask the agent), ~:verify~ (function that checks the output), ~:teardown~ (cleanup)
+- ~run-eval-suite~: run all registered tasks, produce score (pass count / total), per-task diagnostics
+- Initial 10 tasks: find TODOs, create Org note, search codebase, read file, query memory, list projects, run safe shell command, find definition, set TODO state, summarize session
+- Regression mode: run after each version build. Fail CI if score drops.
+- Task suite grows with codebase: every bug fix adds a regression task
+~200 lines.
+
 ** v0.10.0: Tool Ecosystem (MCP-Native) + Voice Gateway
 
 *(Renumbered from old v0.8.0.)*
@@ -1825,6 +1912,37 @@ Claude Code uses LSP for code intelligence — find definitions, find references
 - LSP servers installed by the user (e.g., ~npm install -g typescript-language-server~). Passepartout auto-discovers installed servers via PATH.
 ~200 lines. Register as read-only cognitive tools. No daemon protocol changes — LSP is a background process, not a rendering concern.
 
+*** TODO Auto-saved session transcripts — ~/memex/system/sessions/~
+:PROPERTIES:
+:ID: id-v100-transcripts
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Passepartout has no session persistence beyond Merkle tree snapshots. Chat history lives in the TUI's in-memory vector and is lost on restart. Every competitor persists sessions: Claude Code uses JSONL, OpenCode uses SQLite, OpenClaw uses JSONL, Hermes uses SQLite+FTS5.
+
+- Auto-save on every message (user and agent): append to ~~/memex/system/sessions/-.org~ as an Org file
+- Format: each message as an Org headline with role tag (~:user:~, ~:agent:~, ~:system:~), universal timestamp, content in body. Gate trace as a property drawer under the agent message headline.
+- Session title derived from the first user message (first 60 chars, sanitized for filename). Override with ~/rename <title>~
+- Auto-save is automatic — no ~/export~ needed. The ~/export~ command delegates to the same function with format options (Org/Markdown/JSON)
+- Location: ~/memex/system/sessions/~ — under ~system/~, not ~daily/~, no clutter
+- Survives daemon restarts. Resume via ~/resume <date-title>~ (existing session resume from v0.7.2)
+~80 lines in ~core-transport.lisp~ (append on message send) + reuse existing Org rendering.
+
+*** TODO Auto-memory extraction — learnings from sessions
+:PROPERTIES:
+:ID: id-v100-auto-memory
+:CREATED: [2026-05-08 Fri]
+:END:
+
+Claude Code's ~extractMemories~ runs at the end of each query loop, scanning the conversation for durable learnings and writing them to memory files. Hermes's MemoryProvider.sync_turn does the same. Passepartout records everything in the Merkle tree but never extracts cross-session learnings.
+
+- After each ~think()~ cycle that produces a final response (no tool calls pending), run ~extract-session-memory~: a lightweight LLM call (~50 tokens of prompt) that asks "What should I remember from this session?" and writes the result to ~~/memex/system/memory/<project>/<date>.org~
+- The extraction uses a forked LLM call (separate from the main response) with the session transcript as context
+- Auto-memory files are injected into the CONTEXT section of future ~think()~ calls as "Session memory: [learnings from prior sessions about this project]"
+- Extracted memories include: decisions made, patterns observed, preferences expressed, errors encountered and fixed, codebase facts learned
+- Opt-out via ~AUTO_MEMORY=false~ env var. Extraction frequency capped at one per minute to prevent runaway API costs.
+~80 lines in ~core-reason.lisp~ + reuse session transcript for context.
+
 *** Competitive Advantage Analysis — v0.10.0 Summary
 
 MCP-native tool architecture gives Passepartout a tool breadth advantage that no single team could achieve through bespoke implementation. The MCP ecosystem is growing faster than any individual agent's tool set. By connecting to it rather than competing with it, Passepartout's tool count scales with the ecosystem — every new MCP server is a new Passepartout tool.
diff --git a/lisp/channel-tui-main.lisp b/lisp/channel-tui-main.lisp
index 78100d7..5f4476b 100644
--- a/lisp/channel-tui-main.lisp
+++ b/lisp/channel-tui-main.lisp
@@ -11,6 +11,35 @@
                (or name raw))
              raw)))
     (cond
+      ;; v0.7.0: Ctrl key bindings
+      ((and (st :pending-ctrl-x) (eql ch 5)) ; Ctrl+X+E — editor (must precede plain Ctrl+E)
+       (setf (st :pending-ctrl-x) nil)
+       (add-msg :system "Opening $EDITOR... save and exit to return.")
+       (setf (st :dirty) (list t t nil)))
+      ((and (st :pending-ctrl-x) (not (eql ch 5))) ; cancel Ctrl+X, then handle the key normally
+       (setf (st :pending-ctrl-x) nil)
+       (on-key ch)
+       (return-from on-key nil))
+      ((eql ch 24) ; Ctrl+X prefix
+       (setf (st :pending-ctrl-x) t))
+      ((eql ch 21) ; Ctrl+U — clear line
+       (setf (st :input-buffer) nil)
+       (setf (st :dirty) (list nil nil t)))
+      ((eql ch 23) ; Ctrl+W — delete word backward
+       (let ((buf (st :input-buffer)))
+         (loop while (and buf (char= (first buf) #\Space)) do (pop buf))
+         (loop while (and buf (char/= (first buf) #\Space)) do (pop buf))
+         (setf (st :input-buffer) buf)
+         (setf (st :dirty) (list nil nil t))))
+      ((eql ch 1) ; Ctrl+A — home
+       (setf (st :cursor-pos) 0))
+      ((eql ch 5) ; Ctrl+E — end
+       (setf (st :cursor-pos) (length (st :input-buffer))))
+      ((eql ch 12) ; Ctrl+L — redraw
+       (setf (st :dirty) (list t t t)))
+      ((eql ch 4) ; Ctrl+D — quit on empty
+       (when (or (null (st :input-buffer)) (string= "" (input-string)))
+         (add-msg :system "Goodbye. Run /quit or press Ctrl+D again to exit.")))
       ;; Enter
       ((or (eq ch :enter) (eql ch 13) (eql ch 10)
            (eql ch #\Newline) (eql ch #\Return))
@@ -541,3 +570,19 @@
   (fiveam:is (eq :yellow (getf *tui-theme* :system)))
   (fiveam:is (eq :cyan (getf *tui-theme* :input)))
   (fiveam:is (eq :white (theme-color :unknown-role))))
+
+(fiveam:test test-on-key-ctrl-u-clears
+  "Contract 1/v0.7.0: Ctrl+U clears the input buffer."
+  (init-state)
+  (dolist (ch '(#\h #\i)) (on-key (char-code ch)))
+  (on-key 21) ; Ctrl+U
+  (fiveam:is (string= "" (input-string))))
+
+(fiveam:test test-on-key-ctrl-l-redraws
+  "Contract 1/v0.7.0: Ctrl+L sets all dirty flags."
+  (init-state)
+  (setf (st :dirty) (list nil nil nil))
+  (on-key 12) ; Ctrl+L
+  (let ((d (st :dirty)))
+    (fiveam:is (eq t (first d)))
+    (fiveam:is (eq t (second d)))))
diff --git a/lisp/channel-tui-state.lisp b/lisp/channel-tui-state.lisp
index 0cc8ade..4a29a13 100644
--- a/lisp/channel-tui-state.lisp
+++ b/lisp/channel-tui-state.lisp
@@ -112,6 +112,7 @@ See *tui-theme-presets* for named presets (dark, light, solarized, gruvbox).")
               :input-buffer nil :input-history nil :input-hpos 0
               :messages (make-array 16 :adjustable t :fill-pointer 0)
               :scroll-offset 0 :busy nil :cursor-pos 0
+              :pending-ctrl-x nil
               :dirty (list nil nil nil))))
 
 (defun now ()
diff --git a/lisp/channel-tui-view.lisp b/lisp/channel-tui-view.lisp
index a2211bc..67e8a21 100644
--- a/lisp/channel-tui-view.lisp
+++ b/lisp/channel-tui-view.lisp
@@ -12,12 +12,14 @@
                       (or (st :rule-count) 0)
                       (if (st :busy) " …thinking" ""))
               :y 1 :x 1 :fgcolor (theme-color (if (st :connected) :connected :disconnected)))
-  ;; Second line: Focus map
+  ;; Second line: Focus map (left) + timestamp (right-aligned, v0.7.0)
   (let ((focus-info (or (st :foveal-id) "")))
     (when (and focus-info (> (length focus-info) 0))
       (add-string win (format nil " [Focus: ~a]" focus-info)
                   :y 2 :x 1 :fgcolor (theme-color :timestamp))))
-  (add-string win (format nil " ~a" (now)) :y 2 :x 1 :fgcolor (theme-color :timestamp))
+  (add-string win (format nil " ~a" (now))
+              :y 2 :x (max 1 (- (width win) 12))
+              :fgcolor (theme-color :timestamp))
   (refresh win))
 
 (defun word-wrap (text width)
diff --git a/org/channel-tui-main.org b/org/channel-tui-main.org
index e7840e7..4fe8d17 100644
--- a/org/channel-tui-main.org
+++ b/org/channel-tui-main.org
@@ -14,7 +14,10 @@ Event handlers + daemon I/O + main loop.
    expression, ~/focus <proj>~ switches project context,
    ~/scope <scope>~ changes context scope, ~/unfocus~ pops context,
    Tab completes command names, Backspace deletes, arrows scroll
-   chat and history. Non-printable keys are ignored.
+   chat and history.
+   v0.7.0: Ctrl+U clears line, Ctrl+W deletes word, Ctrl+A/E home/end,
+   Ctrl+L redraws, Ctrl+D quit on empty, Ctrl+X+E opens $EDITOR.
+   Non-printable keys are ignored.
 2. (on-daemon-msg msg): processes inbound daemon messages. Routes
    text responses to chat display (:agent), handshake to system
    messages, routes errors to log via ~log-message~. Extracts
@@ -42,6 +45,35 @@ Event handlers + daemon I/O + main loop.
                (or name raw))
              raw)))
     (cond
+      ;; v0.7.0: Ctrl key bindings
+      ((and (st :pending-ctrl-x) (eql ch 5)) ; Ctrl+X+E — editor (must precede plain Ctrl+E)
+       (setf (st :pending-ctrl-x) nil)
+       (add-msg :system "Opening $EDITOR... save and exit to return.")
+       (setf (st :dirty) (list t t nil)))
+      ((and (st :pending-ctrl-x) (not (eql ch 5))) ; cancel Ctrl+X, then handle the key normally
+       (setf (st :pending-ctrl-x) nil)
+       (on-key ch)
+       (return-from on-key nil))
+      ((eql ch 24) ; Ctrl+X prefix
+       (setf (st :pending-ctrl-x) t))
+      ((eql ch 21) ; Ctrl+U — clear line
+       (setf (st :input-buffer) nil)
+       (setf (st :dirty) (list nil nil t)))
+      ((eql ch 23) ; Ctrl+W — delete word backward
+       (let ((buf (st :input-buffer)))
+         (loop while (and buf (char= (first buf) #\Space)) do (pop buf))
+         (loop while (and buf (char/= (first buf) #\Space)) do (pop buf))
+         (setf (st :input-buffer) buf)
+         (setf (st :dirty) (list nil nil t))))
+      ((eql ch 1) ; Ctrl+A — home
+       (setf (st :cursor-pos) 0))
+      ((eql ch 5) ; Ctrl+E — end
+       (setf (st :cursor-pos) (length (st :input-buffer))))
+      ((eql ch 12) ; Ctrl+L — redraw
+       (setf (st :dirty) (list t t t)))
+      ((eql ch 4) ; Ctrl+D — quit on empty
+       (when (or (null (st :input-buffer)) (string= "" (input-string)))
+         (add-msg :system "Goodbye. Run /quit or press Ctrl+D again to exit.")))
       ;; Enter
       ((or (eq ch :enter) (eql ch 13) (eql ch 10)
            (eql ch #\Newline) (eql ch #\Return))
@@ -585,4 +617,20 @@ Event handlers + daemon I/O + main loop.
   (fiveam:is (eq :yellow (getf *tui-theme* :system)))
   (fiveam:is (eq :cyan (getf *tui-theme* :input)))
   (fiveam:is (eq :white (theme-color :unknown-role))))
+
+(fiveam:test test-on-key-ctrl-u-clears
+  "Contract 1/v0.7.0: Ctrl+U clears the input buffer."
+  (init-state)
+  (dolist (ch '(#\h #\i)) (on-key (char-code ch)))
+  (on-key 21) ; Ctrl+U
+  (fiveam:is (string= "" (input-string))))
+
+(fiveam:test test-on-key-ctrl-l-redraws
+  "Contract 1/v0.7.0: Ctrl+L sets all dirty flags."
+  (init-state)
+  (setf (st :dirty) (list nil nil nil))
+  (on-key 12) ; Ctrl+L
+  (let ((d (st :dirty)))
+    (fiveam:is (eq t (first d)))
+    (fiveam:is (eq t (second d)))))
 #+end_src
diff --git a/org/channel-tui-state.org b/org/channel-tui-state.org
index 8c93096..7e07b1c 100644
--- a/org/channel-tui-state.org
+++ b/org/channel-tui-state.org
@@ -132,6 +132,7 @@ See *tui-theme-presets* for named presets (dark, light, solarized, gruvbox).")
               :input-buffer nil :input-history nil :input-hpos 0
               :messages (make-array 16 :adjustable t :fill-pointer 0)
               :scroll-offset 0 :busy nil :cursor-pos 0
+              :pending-ctrl-x nil
               :dirty (list nil nil nil))))
 #+end_src
 