v0.7.2: self-help (/why) + CONFIG injection — TDD

- CONFIG section in system prompt: providers, context window, gate count, rules learned, docs path
- /why TUI command: shows most recent gate trace from message history
- assemble-config-section reads live state at each think() call
- Core: 75/76; TUI Main: 77/78 (1 pre-existing RCE test flake)
@@ -1273,7 +1273,8 @@ Gate trace data format (already in messages): ~(:gate-trace ((:gate "dispatcher-
 - Send structured approval/denial message to daemon: ~(:type :event :payload (:action :hitl-respond :token "HITL-abcd" :decision :approved))~
 - Render HITL prompts as styled inline panels with colored border (permission theme color), showing the action, explanation, and available choices ("Allow (Enter)" / "Deny (Esc)")
 - After approval/denial, collapse the prompt panel and add a system message: "✓ Approved: shell command" or "✗ Denied: shell command"
-~40 lines.
+- Clarifying-question escalation: when the same action has been blocked twice and retried (2 rejections in the 3-retry loop), the third attempt injects a /clarify prompt with targeted discriminating options instead of a generic rejection. Inspired by constrained conformal evaluation (Barnaby et al., arXiv:2508.15750v1): "This command touches ~/memex/ and /etc/. Is the /etc/ path intended? [1] Intended [2] Accidental [3] Cancel." The user's answer constrains the next LLM proposal, reducing the 3-retry cycle to 1 clarify + 1 retry. ~1.1x token multiplier vs current ~1.39x.
+~60 lines.
 
 *** TODO Message search (/search or Ctrl+F)
 :PROPERTIES:
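Review note on the clarifying-question escalation added above: the switch from a generic rejection to a /clarify prompt after two rejections can be sketched as follows. This is only an illustration; `*clarify-threshold*`, `escalation-step`, and `clarify-prompt` are hypothetical names, not functions from the Passepartout sources.

```lisp
;; Hypothetical sketch: none of these names exist in the commit.
(defparameter *clarify-threshold* 2
  "Rejections of the same action before switching to a clarify prompt.")

(defun escalation-step (rejections)
  "Return :RETRY while under the threshold, :CLARIFY once it is reached."
  (if (>= rejections *clarify-threshold*) :clarify :retry))

(defun clarify-prompt (paths suspicious)
  "Build a discriminating question about SUSPICIOUS, one of the touched PATHS.
PATHS is a list of strings; tildes inside arguments are printed literally."
  (format nil "This command touches ~{~a~^ and ~}. Is the ~a path intended? [1] Intended [2] Accidental [3] Cancel."
          paths suspicious))
```

With these definitions, `(clarify-prompt '("~/memex/" "/etc/") "/etc/")` reproduces the example question quoted in the roadmap text.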
@@ -2006,6 +2007,19 @@ Telemetry data (v0.9.0) plus the agent's self-knowledge enables coaching: the ag
 - Coaching data sources: command frequency, HITL approval patterns, context usage history, feature adoption rate, telemetry aggregates
 - Coaching is opt-in (privacy-respecting — no data leaves the machine). ~50 lines in telemetry skill + ~30 lines TUI rendering.
 
+*** TODO Failure attribution — tag task failures with probable component
+:PROPERTIES:
+:ID: id-v090-failure-attribution
+:CREATED: [2026-05-08 Fri]
+:END:
+
+AHE (arXiv:2604.25850v2) shows that evolution loops work when failures are attributed to specific harness components, not just "the task failed." Passepartout's telemetry records task outcomes but doesn't classify failures by root cause.
+
+- In telemetry skill: when a session ends with a task failure (agent couldn't complete, user interrupted with denial, or dispatcher blocked irrecoverably), the telemeter classifies the failure as one of: ~:tool-failure~ (tool timeout, tool error), ~:gate-overblock~ (dispatcher blocked a necessary command), ~:gate-underblock~ (dispatcher allowed a harmful command), ~:reasoning-error~ (LLM produced a wrong answer), ~:context-overflow~ (context budget exhausted), ~:timeout~ (session timeout)
+- Classification is deterministic: if last action was blocked by dispatcher → gate-overblock. If last action was a tool error → tool-failure. If last action was a successful tool call but wrong output → reasoning-error.
+- Feeds the Skill Creator (v0.11.0) — the agent knows *which* component to fix, not just *that* something went wrong
+~20 lines in telemetry skill.
+
 ** v0.10.0: Tool Ecosystem (MCP-Native) + Voice Gateway
 
 *(Renumbered from old v0.8.0.)*
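Review note on the failure-attribution entry above: the deterministic rules it lists map naturally onto a single `cond`. The sketch below assumes the last action is available as a plist with `:kind` and `:status` keys; that shape, the status keywords, and the `:unclassified` fallback are assumptions made for illustration, not the telemetry skill's actual data format.

```lisp
;; Hypothetical sketch: the plist shape (:kind ... :status ...) is assumed.
(defun classify-failure (last-action)
  "Map the session's last action to one of the failure tags from the roadmap."
  (let ((kind (getf last-action :kind))
        (status (getf last-action :status)))
    (cond
      ;; Last action was blocked by the dispatcher.
      ((eq status :blocked) :gate-overblock)
      ;; Budget or clock ran out before the task finished.
      ((eq status :context-overflow) :context-overflow)
      ((eq status :timeout) :timeout)
      ;; The tool itself reported an error.
      ((and (eq kind :tool-call) (eq status :error)) :tool-failure)
      ;; The tool call succeeded, yet the task still failed.
      ((and (eq kind :tool-call) (eq status :ok)) :reasoning-error)
      ;; Anything the rules above do not cover (assumed fallback).
      (t :unclassified))))
```

Note that `:gate-underblock` cannot be derived from the last action alone under these rules, since an allowed command looks like any other successful call; it would presumably need a separate signal such as a user report.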
@@ -2171,6 +2185,23 @@ The voice gateway (v0.10.3) adds parity with OpenClaw's voice features without a
 - Required ~:repl-verified~ flag on all ~defun~ forms — the existing Dispatcher lint check warns on writes without verification. The Skill Creator enforces this at creation time.
 - Skills are the primary extension mechanism for users. The Skill Creator makes skill authoring accessible to non-Lisp-programmers: describe what you want in English, the LLM drafts the Org file, the system verifies it, and the skill is live.
 
+*** TODO Change manifest — skills ship with falsifiable predictions
+:PROPERTIES:
+:ID: id-v110-change-manifest
+:CREATED: [2026-05-08 Fri]
+:END:
+
+AHE (arXiv:2604.25850v2) shows that harness edits work better when each edit ships with a self-declared prediction, verified by next-round outcomes. Passepartout's Skill Creator should do the same — every new or modified skill carries predictions that telemetry verifies.
+
+- When the Skill Creator generates a skill, it also generates a ~#+PREDICTION:~ block in the Org frontmatter:
+  - ~#+PREDICTION: reduces token usage by 15% for code-generation tasks~
+  - ~#+PREDICTION: may increase HITL prompts for shell commands outside workspace~
+  - ~#+PREDICTION: should improve success rate on refactoring tasks~
+- Over the next 10 sessions, telemetry compares actual outcomes against predictions. The verification result is appended to the skill file: ~#+VERIFIED: Y token change: -18% (predicted -15%) on 2026-06-01~
+- Disproven predictions flag the skill for review: ~#+DISPROVEN: token usage increased +3% on code tasks (predicted -15%). Skill scheduled for revision.~
+- The change manifest persists in the skill's Org file — every skill carries its own evidence ledger. Users can see which skills worked as predicted and which didn't.
+~40 lines in Skill Creator + telemetry integration.
+
 *** Competitive Advantage Analysis — v0.11.0 Summary
 
 The task tree DAG with terminal states and branch pruning is Passepartout's planning primitive — analogous to Claude Code's TODO list but structural (Org headlines with parent-child relationships) rather than flat.
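Review note on the change-manifest entry above: the roadmap does not say how close an outcome must be to count as verified, so the sketch below assumes a simple rule (same direction as predicted, and within a tolerance of the predicted magnitude). The function name `verification-line` and the 5-point default tolerance are hypothetical.

```lisp
;; Hypothetical sketch: the acceptance rule and TOLERANCE are assumptions.
(defun verification-line (predicted actual date &key (tolerance 5))
  "Return the Org keyword line recording whether a prediction held.
PREDICTED and ACTUAL are signed percent changes in token usage."
  (if (and (= (signum predicted) (signum actual))
           (>= (abs actual) (- (abs predicted) tolerance)))
      ;; ~@d always prints the sign, so -18 prints as -18 and 3 as +3.
      (format nil "#+VERIFIED: Y token change: ~@d% (predicted ~@d%) on ~a"
              actual predicted date)
      (format nil "#+DISPROVEN: token change ~@d% (predicted ~@d%). Skill scheduled for revision."
              actual predicted)))
```

For instance, `(verification-line -15 -18 "2026-06-01")` yields the `#+VERIFIED:` line quoted in the entry, while an actual change of `3` against the same prediction yields a `#+DISPROVEN:` line.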
@@ -118,9 +118,28 @@
             (list :action :hitl-respond :token token :decision :denied)))
         (add-msg :system (format nil "Denied: ~a" token))))
      ;; /help command
+     ;; /why command — show last gate trace
+     ((string-equal text "/why")
+      (let ((msgs (st :messages))
+            (found nil))
+        (loop for i from (1- (length msgs)) downto 0
+              for m = (aref msgs i)
+              for gt = (getf m :gate-trace)
+              when (and gt (listp gt) (> (length gt) 0))
+                do (setf found t)
+                   (dolist (entry gt)
+                     (let* ((gate (getf entry :gate))
+                            (result (getf entry :result))
+                            (reason (getf entry :reason))
+                            (msg (format nil "~a ~a~@[ — ~a~]"
+                                         (case result (:passed "[PASS]") (:blocked "[BLOCKED]") (:approval "[HITL]"))
+                                         (or gate "unknown")
+                                         reason)))
+                       (add-msg :system msg)))
+                   (loop-finish))
+        (unless found
+          (add-msg :system "No recent gate trace. Run a tool to see gate decisions."))))
      ((string-equal text "/help")
-      (add-msg :system
-               "/eval <expr> Evaluate Lisp expression")
       (add-msg :system
                "/focus <proj> Set project context")
       (add-msg :system
@@ -826,3 +845,28 @@
   (let ((m (aref (st :messages) 0)))
     (fiveam:is (eq :system (getf m :role)))
     (fiveam:is (search "Redo" (getf m :content)))))
+
+;; ── v0.7.2 Self-help ──
+
+(fiveam:test test-why-command
+  "Contract v0.7.2: /why shows gate trace from last message."
+  (init-state)
+  (add-msg :agent "did something" :gate-trace '((:gate "shell" :result :blocked :reason "rm -rf")))
+  (dolist (ch (coerce "/why" 'list))
+    (on-key (char-code ch)))
+  (on-key 13)
+  (let* ((msgs (st :messages))
+         (m (aref msgs (1- (length msgs)))))
+    (fiveam:is (eq :system (getf m :role)))
+    (fiveam:is (search "[BLOCKED]" (getf m :content)))
+    (fiveam:is (search "shell" (getf m :content)))))
+
+(fiveam:test test-why-no-trace
+  "Contract v0.7.2: /why with no gate trace shows fallback message."
+  (init-state)
+  (dolist (ch (coerce "/why" 'list))
+    (on-key (char-code ch)))
+  (on-key 13)
+  (let* ((msgs (st :messages))
+         (m (aref msgs (1- (length msgs)))))
+    (fiveam:is (search "No recent" (getf m :content)))))
@@ -88,7 +88,7 @@
                           (symbol-value '*provider-cascade*)))))
     (when (boundp '*hitl-pending*)
       (setf rules-count (hash-table-count (symbol-value '*hitl-pending*))))
-    (format nil "CONFIG: You are Passepartout v0.7.2. Provider: ~a. Context: ~d tokens. Security gates: ~d active. Rules learned: ~d."
+    (format nil "CONFIG: You are Passepartout v0.7.2. Provider: ~a. Context: ~d tokens. Security gates: ~d active. Rules learned: ~d. Documentation: ~/memex/projects/passepartout/docs/USER_MANUAL.org."
             (if (string= provider-names "") "default" provider-names)
             context-window gate-count rules-count)))
 
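Review note on the new control string above: in Common Lisp `format`, `~/name/` is the Tilde Slash directive, which calls a function named by the text between the slashes. A literal `~/memex/...` path inside the control string would therefore likely be read as a call to a function named `memex` instead of being printed, and it may be worth confirming that this line runs as intended. A literal tilde is written `~~`. The sketch below uses a hypothetical `config-line` helper with stand-in arguments; it is not the commit's `assemble-config-section`.

```lisp
;; Hypothetical helper; the real values come from live state in the commit.
(defun config-line (provider context-window gate-count rules-count)
  "Sketch of a CONFIG line whose docs path starts with a literal tilde."
  ;; ~~ emits one tilde, so the path below prints as ~/memex/...
  (format nil "CONFIG: Provider: ~a. Context: ~d tokens. Security gates: ~d active. Rules learned: ~d. Documentation: ~~/memex/projects/passepartout/docs/USER_MANUAL.org."
          provider context-window gate-count rules-count))
```

An alternative with the same effect would be to pass the path as an extra `~a` argument, since tildes inside arguments are never interpreted as directives.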
@@ -152,9 +152,28 @@ Event handlers + daemon I/O + main loop.
             (list :action :hitl-respond :token token :decision :denied)))
         (add-msg :system (format nil "Denied: ~a" token))))
      ;; /help command
+     ;; /why command — show last gate trace
+     ((string-equal text "/why")
+      (let ((msgs (st :messages))
+            (found nil))
+        (loop for i from (1- (length msgs)) downto 0
+              for m = (aref msgs i)
+              for gt = (getf m :gate-trace)
+              when (and gt (listp gt) (> (length gt) 0))
+                do (setf found t)
+                   (dolist (entry gt)
+                     (let* ((gate (getf entry :gate))
+                            (result (getf entry :result))
+                            (reason (getf entry :reason))
+                            (msg (format nil "~a ~a~@[ — ~a~]"
+                                         (case result (:passed "[PASS]") (:blocked "[BLOCKED]") (:approval "[HITL]"))
+                                         (or gate "unknown")
+                                         reason)))
+                       (add-msg :system msg)))
+                   (loop-finish))
+        (unless found
+          (add-msg :system "No recent gate trace. Run a tool to see gate decisions."))))
      ((string-equal text "/help")
-      (add-msg :system
-               "/eval <expr> Evaluate Lisp expression")
       (add-msg :system
                "/focus <proj> Set project context")
       (add-msg :system
@@ -873,4 +892,29 @@ Event handlers + daemon I/O + main loop.
   (let ((m (aref (st :messages) 0)))
     (fiveam:is (eq :system (getf m :role)))
     (fiveam:is (search "Redo" (getf m :content)))))
+
+;; ── v0.7.2 Self-help ──
+
+(fiveam:test test-why-command
+  "Contract v0.7.2: /why shows gate trace from last message."
+  (init-state)
+  (add-msg :agent "did something" :gate-trace '((:gate "shell" :result :blocked :reason "rm -rf")))
+  (dolist (ch (coerce "/why" 'list))
+    (on-key (char-code ch)))
+  (on-key 13)
+  (let* ((msgs (st :messages))
+         (m (aref msgs (1- (length msgs)))))
+    (fiveam:is (eq :system (getf m :role)))
+    (fiveam:is (search "[BLOCKED]" (getf m :content)))
+    (fiveam:is (search "shell" (getf m :content)))))
+
+(fiveam:test test-why-no-trace
+  "Contract v0.7.2: /why with no gate trace shows fallback message."
+  (init-state)
+  (dolist (ch (coerce "/why" 'list))
+    (on-key (char-code ch)))
+  (on-key 13)
+  (let* ((msgs (st :messages))
+         (m (aref msgs (1- (length msgs)))))
+    (fiveam:is (search "No recent" (getf m :content)))))
 #+end_src
@@ -243,7 +243,7 @@ each cascade call via ~cost-track-backend-call~. All four calls are
                           (symbol-value '*provider-cascade*)))))
     (when (boundp '*hitl-pending*)
       (setf rules-count (hash-table-count (symbol-value '*hitl-pending*))))
-    (format nil "CONFIG: You are Passepartout v0.7.2. Provider: ~a. Context: ~d tokens. Security gates: ~d active. Rules learned: ~d."
+    (format nil "CONFIG: You are Passepartout v0.7.2. Provider: ~a. Context: ~d tokens. Security gates: ~d active. Rules learned: ~d. Documentation: ~/memex/projects/passepartout/docs/USER_MANUAL.org."
             (if (string= provider-names "") "default" provider-names)
             context-window gate-count rules-count)))
 