passepartout: v0.4.1 Design Cleanup

- Remove system-prompt-augment mechanism, introduce *standing-mandates*
- Fix false token-overhead claims in DESIGN_DECISIONS + ROADMAP
- Update security vector count 9→10 across all docs and dispatcher docstring
- Rewrite README with agent section, soften aspirational claims
- Register 10 cognitive tools in programming-tools.org with test suite
- Enforce NO-HARDCODED-CONSTANTS in .env.example
- ROADMAP: mark v0.3.x patches DONE, add LOGBOOKs, mark releases
- AGENTS.md: rewrite compact (180 to 50 lines), move refs to CONTRIBUTING
- Normalize org tangle directives to file-level PROPERTY inheritance
2026-05-07 16:44:59 -04:00
parent d3b74f5c88
commit 639bc348d9
25 changed files with 1555 additions and 144 deletions


@@ -67,6 +67,15 @@ PROTOCOL_HMAC_SECRET="change-this-to-a-secure-random-string"
# Default: @personal
PRIVACY_FILTER_TAGS="@personal,@health,@finance"
# =============================================================================
# DISPATCHER RULE LEARNING
# =============================================================================
# Number of HITL approvals before a pattern becomes a permanent rule
DISPATCHER_RULE_THRESHOLD=3
# Where learned rules are persisted
RULES_FILE="$HOME/memex/system/rules.org"
# =============================================================================
# BOOTSTRAP
# =============================================================================


@@ -9,7 +9,7 @@
#+HTML: <img src="https://img.shields.io/badge/docs-Org--mode-darkgreen?style=flat-square">
#+HTML: </div>
Passepartout is an AI assistant that runs in your terminal. It reads and writes your Org-mode files, executes tasks through a verified safety gate, and works fully offline with local LLMs. Every action the LLM proposes is checked by nine deterministic safety gates before it touches a file, runs a command, or sends a message. The LLM suggests. The gate decides.
Passepartout is an AI assistant that runs in your terminal. It reads and writes your Org-mode files, executes tasks through a verified safety gate, and works fully offline with local LLMs. Every action the LLM proposes is checked by ten deterministic safety gates before it touches a file, runs a command, or sends a message. The LLM suggests. The gate decides.
Everything it knows is a folder of plain text files that you own.
*Install:*
@@ -20,25 +20,31 @@ curl -fsSL https://raw.githubusercontent.com/amrgharbeia/passepartout/main/passe
This installs dependencies (SBCL, Quicklisp), tangles the Org source files, and runs the setup wizard for LLM providers. Requires curl and sudo access for package installation.
* What is an AI Agent?
An AI agent is a program that can act on your behalf — reading files, running commands, sending messages — rather than just answering questions. Unlike a chatbot that only produces text, an agent has /actuators/ that let it affect the world: a shell, a file editor, a message sender. See [[https://en.wikipedia.org/wiki/Software_agent][Software agent]] on Wikipedia.
Passepartout is a /sovereign/ agent: it runs on your machine, operates on your plain-text files, and verifies every action through deterministic safety gates before execution.
* What Makes Passepartout Different
** Every action is verified, not trusted.
Most AI agents add safety checks as an afterthought — prompt-based guardrails that consume LLM tokens and can be evaded with clever phrasing. Passepartout inverts this: nine deterministic safety gates run in pure Lisp between the LLM's proposal and execution. Secret scanning checks for API key leaks. Path protection blocks reads and writes to sensitive files. Shell safety detects destructive commands and injection vectors. Network exfiltration detection flags unauthorized outbound connections. Lisp syntax validation catches malformed code before it writes to disk.
Most AI agents add safety checks as an afterthought — prompt-based guardrails that consume LLM tokens and can be evaded with clever phrasing. Passepartout inverts this: ten deterministic safety gates run in pure Lisp between the LLM's proposal and execution. Secret scanning checks for API key leaks. Path protection blocks reads and writes to sensitive files, including a self-build safety boundary that prevents the agent from modifying its own core pipeline without human review. Shell safety detects destructive commands and injection vectors. Network exfiltration detection flags unauthorized outbound connections. Lisp syntax validation catches malformed code before it writes to disk.
Every gate costs 0 LLM tokens. Every gate is a Common Lisp function, not a prompt. Every gate runs for every action, unconditionally.
If a gate blocks a proposal, the rejection feedback goes back to the LLM so it can self-correct and try again. If the deterministic Dispatcher is uncertain, it creates a Flight Plan — a human-readable Org buffer you review and approve. The human decides. The Dispatcher learns from your decision and writes a rule for next time.
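The gate stack described above can be pictured with a minimal sketch. It assumes an action is a plist with ~:path~ and ~:content~ keys; the two gate functions, ~*gates*~, and ~run-gates~ are hypothetical names for illustration and are not the project's Dispatcher code.

#+begin_src lisp
;; Hypothetical sketch: gate names and the action plist shape are assumptions.
(defun gate-secret-scan (action)
  "Return a rejection reason if :content looks like it holds an API key."
  (let ((content (getf action :content "")))
    (when (search "sk-" content)
      "secret scan: possible API key in content")))

(defun gate-path-protection (action)
  "Return a rejection reason if :path lies under a protected prefix."
  (let ((path (getf action :path "")))
    (when (and (>= (length path) 5)
               (string= "/etc/" path :end2 5))
      "path protection: /etc/ is protected")))

(defparameter *gates* (list #'gate-secret-scan #'gate-path-protection)
  "Every gate is a plain function of one action; no LLM call is involved.")

(defun run-gates (action &optional (gates *gates*))
  "Run every gate on ACTION. Return :passed, or :blocked plus all reasons."
  (let ((reasons (remove nil (mapcar (lambda (gate) (funcall gate action))
                                     gates))))
    (if reasons
        (values :blocked reasons)
        (values :passed nil))))
#+end_src

The list of reasons returned on ~:blocked~ is the kind of rejection feedback that could be handed back to the LLM for a corrected proposal.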
** The more you use it, the cheaper it gets.
** The more you use it, the cheaper it gets (architectural aspiration)
Passepartout has a downward cost curve. This runs counter to every other AI agent.
Passepartout is designed with a downward cost curve — an architectural property, not yet measured empirically. Here is the thesis.
Here is why. When you use Passepartout, the Dispatcher observes every blocked action and every human-approved exception. Each decision becomes a deterministic rule. A file write you approved once becomes an allowed path pattern. A shell command you denied becomes a permanent block. Each hardened rule means one fewer LLM call next time.
When you use Passepartout, the Dispatcher observes every blocked action and every human-approved exception. Each decision becomes a deterministic rule. A file write you approved once becomes an allowed path pattern. A shell command you denied becomes a permanent block. Each hardened rule means one fewer LLM call next time. This rule-learning system is planned for v0.5.0.
Meanwhile, the foveal-peripheral context model prunes your [[https://en.wikipedia.org/wiki/Memex][memex]] — your personal knowledge base, a term coined by Vannevar Bush in 1945 for a mechanised private library — to the relevant Org subtrees before sending anything to the LLM. The agent does not load your entire knowledge base, or even the entire file like agents that use Markdown do — it loads precisely the headlines that matter. Less context in, fewer tokens out.
Other agents grow more expensive over time (context histories accumulate, safety instructions grow). Passepartout's cost curve bends down.
These mechanisms are implemented and working today. Token cost measurement and optimization are tracked in the [[file:docs/ROADMAP.org][v0.5.0 Roadmap]]. Until empirically verified, the cost claims in [[file:docs/DESIGN_DECISIONS.org][Design Decisions]] (2-3x fewer tokens for coding, 13-24x for knowledge management) should be read as architectural projections, not measured results.
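As an illustration of the rule-hardening thesis, here is a minimal sketch of how repeated approvals might harden into a rule. The threshold of 3 mirrors ~DISPATCHER_RULE_THRESHOLD~ from =.env.example=; every function and variable name below is hypothetical, and the real rule-learning system is still planned work.

#+begin_src lisp
;; Hypothetical sketch: names and storage are assumptions, not project code.
(defparameter *rule-threshold* 3
  "Approvals needed before a pattern hardens into a permanent rule.")

(defvar *approval-counts* (make-hash-table :test #'equal)
  "Maps a pattern string to the number of human approvals seen so far.")

(defvar *learned-rules* '()
  "Patterns that no longer need human review.")

(defun record-approval (pattern)
  "Count one approval of PATTERN and promote it at the threshold."
  (let ((count (incf (gethash pattern *approval-counts* 0))))
    (when (and (>= count *rule-threshold*)
               (not (member pattern *learned-rules* :test #'equal)))
      (push pattern *learned-rules*))
    count))

(defun rule-allows-p (pattern)
  "True when PATTERN has already hardened into a rule."
  (and (member pattern *learned-rules* :test #'equal) t))
#+end_src

Once ~rule-allows-p~ returns true for a pattern, the decision is a hash lookup and a list search rather than a human prompt or an LLM call, which is the cost argument this section makes.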
** It edits its own source code. Verified before execution.
@@ -58,7 +64,7 @@ When you write a TODO in Emacs, the agent sees it immediately as a native data s
** Works offline. Works locally. The safety doesn't stop.
You can run Passepartout entirely on your hardware with a local LLM via Ollama or some other inference engine. No internet connection required. But unlike most local AI tools, offline mode does not mean safety-last. The nine deterministic safety gates are pure Common Lisp — they run identically whether you are online or off. The Merkle-tree memory with snapshot rollback is in-process, 0 milliseconds, 0 network calls. Semantic retrieval runs on in-image vectors, 0 LLM tokens per query.
You can run Passepartout entirely on your hardware with a local LLM via Ollama or some other inference engine. No internet connection required. But unlike most local AI tools, offline mode does not mean safety-last. The ten deterministic safety gates are pure Common Lisp — they run identically whether you are online or off. The Merkle-tree memory with snapshot rollback is in-process, 0 milliseconds, 0 network calls. Semantic retrieval runs on in-image vectors, 0 LLM tokens per query.
Cloud providers (OpenRouter, OpenAI, Anthropic, Groq, Gemini, DeepSeek, NVIDIA NIM...) are optional add-ons. When you use them, the model-tier router automatically selects the cheapest provider that matches your task's complexity. Privacy-tagged content stays local even when cloud providers are configured.
@@ -88,7 +94,7 @@ Features marked =Stable= ship in the current release. Features marked =Planned=
| Capability | Status | Since | Notes |
|----------------------------------+----------+---------+----------------------------------------------------------------------|
| 9-vector deterministic safety | Stable | v0.2.0 | Secrets, paths, shells, network, lisp, privacy |
| 10-vector deterministic safety | Stable | v0.2.0 | Secrets, paths, self-build, shells, network, lisp, privacy, approval |
| Foveal-peripheral context model | Stable | v0.2.0 | Sends relevant subtrees, not all files |
| Merkle-tree memory + snapshots | Stable | v0.2.0 | Integrity hashing, copy-on-write rollback |
| Self-editing + hot-reload | Stable | v0.2.0 | Agent reads, modifies, reloads its own code |


@@ -77,8 +77,9 @@ Every action the LLM proposes passes through a stack of deterministic gates befo
| 600 | security-permissions | Tool permission table (allow/ask/deny per tool) |
| 600 | security-vault | Credential storage integrity |
| 500 | security-policy | Requires :explanation on every action |
| 150 | security-dispatcher | 9-vector safety: secrets, paths, shell, lisp, network, |
| | (the Dispatcher) | privacy, high-impact approval |
| 150 | security-dispatcher | 11-check safety: lisp, secret path, self-build, |
| | (the Dispatcher) | content exposure, vault, privacy tags, privacy text, |
| | | shell safety, network exfil, high-impact approval |
| 95 | security-validator | Protocol schema validation |
| 100 | system-archivist | Scribe and Gardener maintenance on heartbeat |
| 80 | system-event-orchestrator | Cron job dispatch on heartbeat |


@@ -6,57 +6,111 @@
* Philosophy
Passepartout is built on a "Zero-Bloat" mandate. The core kernel is mathematically pure, pushing all peripheral logic, API integrations, and routing to hot-reloadable "Skills".
* TDD Discipline (Red-Green-Refactor)
* Development Workflow
All code changes MUST follow this cycle:
The full development cycle is described in ~AGENTS.md~. In summary:
1. *Write a failing test* — capture the desired behavior as a FiveAM test
in a =* Test Suite= section within the relevant =.org= file
2. *Prove it fails* — run =sbcl --eval "(asdf:test-system :passepartout)"=
and confirm the new test fails (RED) before writing implementation
3. *Write the code* — modify the implementation in the same =.org= file
4. *Prove it passes* — run the test suite again, confirm GREEN
5. *Reflect* — ensure the test and code are both in the =.org= literate source
1. *Think in org* — write reasoning and goals in the .org file
2. *Write contract* — define each function's behavior in a ~** Contract~ section
3. *TDD from contract* — each contract item becomes a ~fiveam:test~; prove RED then GREEN
4. *Reflect in org* — ensure implementation is in .org source
5. *Update literate prose* — explain the code: what, why, how it connects
For *existing code* that lacks tests: write a characterization test that
captures current behavior as the spec. Then refactor.
* Literate Programming
No test may be committed without proof it was first run to failure.
~.org~ files in ~org/~ are the source of truth. ~lisp/~ files are generated by ~org-babel-tangle~.
* Literate Granularity
We strictly adhere to Literate Programming using Org-mode.
- *Never* edit `.lisp` files in `src/` directly.
- Modify the corresponding `.org` files in the `literate/` or `skills/` directories.
- Run `org-babel-tangle` to generate the source code.
- Every architectural decision, constraint, and implementation detail must be documented alongside the code in the `.org` file.
- Never edit =lisp/= files directly — always modify the corresponding =org/= file
- All ~#+begin_src lisp~ blocks in a file inherit their tangle destination from the file-level ~#+PROPERTY: header-args:lisp :tangle ../lisp/FILE.lisp~
- Every architectural decision, constraint, and implementation detail must be documented alongside the code
* Contracts and Tests
Every code change starts with a contract and a failing test. Write a ~** Contract~ section listing each function's behavior, then create a ~fiveam:test~ in the ~* Test Suite~ section for each contract item.
To run tests for a specific file:
#+begin_src bash
sbcl --noinform \
--eval '(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))' \
--eval '(ql:quickload :passepartout :silent t)' \
--eval '(load "lisp/FILE.lisp")' \
--eval '(fiveam:run (intern "SUITE-NAME" :passepartout-TESTS))' --quit
#+end_src
No test may be committed without proof it was first run to failure (RED).
* Skill Creation Standard
Skills are the building blocks of Passepartout. They reside in the `skills/` directory.
A skill must define:
1. *Trigger*: A lambda determining if the skill should activate based on the context.
2. *Probabilistic Gate*: Optional. Generates a prompt for the LLM.
3. *Deterministic Gate*: A hardcoded Lisp function that guarantees safety or executes side-effects (the Dispatcher pattern).
A skill is a =.org= file in =org/= that defines:
Example Registration:
1. *Contract* — what the skill guarantees
2. *Implementation* — the code, tangled to ~lisp/~ via ~#+PROPERTY: header-args:lisp~
3. *Skill Registration* — a ~defskill~ form with ~:priority~, ~:trigger~, ~:probabilistic~ / ~:deterministic~
4. *Test Suite* — ~fiveam:test~ forms verifying the contract
Example:
#+begin_src lisp
(defskill :skill-example
(defskill :passepartout-example
:priority 100
:trigger (lambda (ctx) ...)
:probabilistic nil
:probabilistic (lambda (ctx) ...)
:deterministic (lambda (action ctx) ...))
#+end_src
* The Unified Envelope (Communication Protocol)
All inter-process communication occurs via the Unified Envelope. Do not use legacy specific types like `:CHAT`.
- Always use semantic types: `:REQUEST`, `:EVENT`, `:RESPONSE`, `:STATUS`, `:LOG`.
- Include routing metadata in the `:META` block (e.g., `(:SOURCE :TUI)`).
- Ensure generated `:REQUEST` messages include a mandatory `:TARGET` field.
* Project Structure
* Pull Request Process
1. Choose an Org file and write a failing test in its =* Test Suite= section.
2. Tangle and run to confirm RED (the test fails).
3. Write the implementation in the same Org file, tangle, run to confirm GREEN.
4. Ensure your working tree is clean.
5. Run the full test suite: =sbcl --eval "(asdf:test-system :passepartout)"=.
6. Submit a PR outlining the architectural intent and the specific Literate changes.
| Directory                      | Purpose                                       |
|--------------------------------+-----------------------------------------------|
| =org/=                         | Literate source files (edit these)            |
| =lisp/=                        | Tangled .lisp output (never edit)             |
| =docs/=                        | ROADMAP, ARCHITECTURE, DESIGN_DECISIONS, etc. |
| =scripts/=                     | Build and utility scripts                     |
| =~/.local/share/passepartout/= | XDG data dir — deployed lisp files            |
| =~/.config/passepartout/=      | Config (.env)                                 |
* Key Libraries
| Library | Purpose |
|------------------+----------------------------------|
| Croatoan | TUI (terminal UI) |
| usocket | TCP sockets (daemon protocol) |
| bordeaux-threads | Threading (reader thread) |
| dexador | HTTP client (LLM API calls) |
| cl-ppcre | Regex (search-files, dispatcher) |
| ironclad | SHA-256 (Merkle hashing) |
| hunchentoot | HTTP server |
| cl-json | JSON encoding/decoding |
* Protocol
All inter-process communication uses the Unified Envelope protocol over TCP (port 9105). Message types: ~:REQUEST~, ~:EVENT~, ~:RESPONSE~, ~:STATUS~, ~:LOG~. Each message includes a ~:META~ block with routing metadata.
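A minimal sketch of what an envelope and a structural check could look like. The message types and the ~:META~ block come from the text above; the ~:TYPE~ and ~:PAYLOAD~ keys, the function names, and the rule that a ~:REQUEST~ must carry a ~:TARGET~ (taken from the earlier wording of this section) are assumptions for illustration only.

#+begin_src lisp
;; Hypothetical sketch: key names other than :META and :TARGET are assumptions.
(defparameter *envelope-types* '(:request :event :response :status :log)
  "The semantic message types named by the protocol.")

(defun make-envelope (type payload &key source target)
  "Build an envelope plist with routing metadata in the :META block."
  (list :type type
        :meta (list :source source)
        :target target
        :payload payload))

(defun envelope-valid-p (envelope)
  "True when ENVELOPE has a known type, a source, and a target if a request."
  (let ((type (getf envelope :type)))
    (and (member type *envelope-types*)
         (getf (getf envelope :meta) :source)
         (or (not (eq type :request))
             (getf envelope :target))
         t)))
#+end_src

For example, ~(make-envelope :request "hello" :source :tui :target :daemon)~ passes the check, while a legacy ~:chat~ type is rejected.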
* Pre-Commit Hook
Validates staged org files by tangling + structural-checking:
#+begin_src bash
ln -sf ../../scripts/pre-commit-repl-check .git/hooks/pre-commit
#+end_src
Runs automatically on ~git commit~.
* Testing Tools
** TUI REPL (~/eval~)
The TUI has a built-in command for live evaluation:
- ~/eval (+ 1 2)~ → result displayed in chat window
- ~/eval (add-msg :system "test")~ → inject a test message
** Tmux (TUI integration testing)
#+begin_src bash
tmux new-session -d -s test "passepartout tui 2>&1 | tee /tmp/tui.log"
tmux send-keys -t test "hello world" Enter
tmux capture-pane -t test -p -S -200
tmux kill-session -t test
#+end_src
** Swank (Emacs REPL for TUI)
1. Start TUI: ~passepartout tui~
2. In Emacs: ~M-x slime-connect RET 127.0.0.1 RET 4006~
3. ~C-M-x~ any form from =org/gateway-tui.org= → evaluates in live TUI process
4. Configure port: ~export TUI_SWANK_PORT=4009~ (default: 4006)


@@ -321,7 +321,7 @@ The structural multipliers are:
1. *Sparse tree retrieval* — the foveal-peripheral model renders relevant Org subtrees (titles and properties for peripheral nodes, full content for foveal and semantically relevant nodes). Active context stays at 2,000–4,000 tokens. A "load everything" architecture serializes the entire knowledge base at 50,000–150,000 tokens. The mechanism is provably cheaper; the exact multiplier depends on memex size and complexity.
2. *Deterministic safety* — the 9-vector Dispatcher gate stack runs in pure Lisp. Every gate is a Common Lisp function. Verification costs 0 LLM tokens per action. Competitors use prompt-based guardrails consuming 100–500 LLM tokens per verification. This multiplier is mathematically infinite — a Lisp function call costs no tokens, a guardrail paragraph in a system prompt costs tokens proportional to its length.
2. *Deterministic safety* — the 10-vector Dispatcher gate stack runs in pure Lisp. Every gate is a Common Lisp function. Verification costs 0 LLM tokens per action. Competitors use prompt-based guardrails consuming 100–500 LLM tokens per verification. This multiplier is mathematically infinite — a Lisp function call costs no tokens, a guardrail paragraph in a system prompt costs tokens proportional to its length.
3. *REPL verification* — code is tested in the running image before it is committed. Errors surface in milliseconds at 0 LLM tokens. Competitors discover errors after generation and pay 500–2,000 tokens per correction round-trip. The REPL eliminates the most expensive kind of LLM call: the one that produced wrong code and needs a do-over.
@@ -432,7 +432,7 @@ The critical risk is implementation: achieving the retrieval precision, Dispatch
1. *Retrieval accuracy is the bottleneck.* If sparse tree retrieval loads the wrong subtree (low-similarity but causally relevant), the LLM makes unfixable errors. The architecture assumes embedding quality is "good enough" — this is untested at scale.
2. *System prompt overhead can consume savings.* Every =think= cycle iterates all registered skills and calls every =system-prompt-augment= function. With 20+ skills, a trivial interaction could carry 3,000-8,000 tokens of overhead before user input is even processed. This overhead is flat per-call, so it disproportionately affects short interactions.
2. *System prompt overhead can consume savings.* Every =think= cycle builds the full system prompt from IDENTITY + TOOLS + CONTEXT + LOGS. With the foveal-peripheral context model growing over time and the tool belt expanding with skills, the fixed overhead is non-trivial. However, it is driven by context and tool descriptions, not by the ~*standing-mandates*~ list (which contributes ~40 tokens when a single mandate fires, and 0 otherwise). Prefix caching (v0.5.0) is the primary mitigation for this overhead.
3. *Model size vs context quality.* A 3.8B model with perfect context cannot match a 70B model on complex multi-file refactors regardless of context quality. Model size independently determines reasoning depth. The minimum viable model is likely 7-13B parameters for engineering work.


@@ -34,8 +34,8 @@ On release:
3. If a ~CHANGELOG.md~ is needed for packaging tools, auto-generate it from ROADMAP DONE items
** v0.1.0: The Autonomous Foundation — RELEASED 2026-04-20
:PROPERTIES:
:RETROSPECTIVE: [2026-05-07 Wed]
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon 19:05]
:END:
The secure, auditable Lisp kernel. All core infrastructure in place.
@@ -129,8 +129,8 @@ The actuator registry pattern makes MCP tools (v0.7.0) possible — they registe
The test infrastructure established in v0.1.0 becomes the TDD runner (v0.7.1) and the SWE-bench harness (v0.9.0).
** v0.2.0: Interactive Refinement — RELEASED 2026-04-29
:PROPERTIES:
:RETROSPECTIVE: [2026-05-07 Wed]
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed 20:17]
:END:
The "Brain" meets the "Machine." Standardization and professionalization of the user interface and environment.
@@ -190,12 +190,14 @@ The setup wizard established the "works out of the box" constraint that the gate
Copy-on-write snapshots (deep-copying the memory hash table on every write) gave the pipeline crash recovery. The snapshot mechanism is the root of MVCC concurrency (v0.6.1).
** v0.3.0: Event Orchestration + HITL — DONE, UNRELEASED
** v0.3.0: Event Orchestration + HITL — RELEASED 2026-05-06
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-06 Wed 15:50]
:END:
Unified control plane, Human-in-the-Loop state management, and backfill remediation
for stubs and gaps from v0.1.0/v0.2.0. All features are implemented but not yet
published. The security hardening patches (v0.3.1–0.3.3) will ship as follow-up
point releases before v0.4.0 feature work begins.
for stubs and gaps from v0.1.0/v0.2.0. Security hardening followed as
v0.3.1–v0.3.3 point releases.
*** DONE Secret Exposure Gate, Shell Safety, Lisp Validation
:PROPERTIES:
@@ -384,7 +386,14 @@ CLOSED: [2026-05-03 Sun]
The Dispatcher's role has evolved beyond security guard. It is the seed of the deterministic engine — it learns to execute procedures without invoking the neural net.
*** v0.3.1 — TODO Parser RCE elimination
*** DONE Parser RCE elimination
:PROPERTIES:
:ID: id-v031-parser-rce
:CREATED: [2026-05-06 Wed]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-06 Wed 16:38]
:END:
Rationale: SBCL's default value for ~*read-eval*~ is ~t~, enabling the ~#.~ reader macro to execute arbitrary Lisp forms during parsing. Three code paths in the current codebase process untrusted input with ~read-from-string~ or ~read~ without binding ~*read-eval*~ to ~nil~. Each represents a remote code execution vector that bypasses all deterministic safety gates — the Dispatcher's shell safety check, path protection, secret scanning, and network exfiltration detection never execute because the malicious form is evaluated during parsing, before the action plist is even constructed.
@@ -393,7 +402,14 @@ Rationale: SBCL's default ~*read-eval* accessor is ~t~, enabling the ~#.~ reader
- Wrap ~read-from-string~ in ~action-system-execute~ (core-loop-act.lisp:62) with ~(let ((*read-eval* nil)) ...)~ — the ~:system :eval~ path executes untrusted payload code. Explicitly assert that this path requires the Dispatcher's approval gate.
- Add FiveAM test: inject ~"(#.(shell \"echo pwned\"))"~ into the ~think()~ pipeline and assert no shell execution occurs.
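The fix pattern in the items above can be sketched in portable Common Lisp. The wrapper name ~safe-read-from-string~ is hypothetical; the behavior it relies on is standard: with ~*read-eval*~ bound to ~nil~, the ~#.~ reader macro signals a ~reader-error~ instead of evaluating its form.

#+begin_src lisp
;; Hypothetical helper: the name is an assumption; the mechanism is ANSI CL.
(defun safe-read-from-string (string)
  "Read one form from STRING with read-time evaluation disabled.
Returns (values form nil) on success, (values nil condition) on a reader error."
  (handler-case
      (let ((*read-eval* nil))          ; disables #. for this dynamic extent
        (values (read-from-string string) nil))
    (reader-error (condition)
      (values nil condition))))
#+end_src

An input such as ~"(#.(+ 1 2))"~ comes back as a condition object rather than a form, so nothing inside the ~#.~ is ever evaluated. Note that this sketch only covers read-time evaluation; it does not restrict which symbols the reader interns.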
*** v0.3.2 — TODO Shell safety & actuator sandboxing
*** DONE Shell safety & actuator sandboxing
:PROPERTIES:
:ID: id-v032-shell-sandbox
:CREATED: [2026-05-06 Wed]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-06 Wed 16:46]
:END:
Rationale: The ~:system :eval~ actuator path is currently unchecked by the Dispatcher's approval gate — only ~:shell~ and ~:tool "shell"~ trigger HITL. The shell actuator wraps commands through double ~bash -c~ nesting (~system-actuator-shell.lisp:10~), where Lisp's ~format~ with ~s~ produces S-expression-safe strings, not shell-safe strings. A command containing quotes or substitution characters can break out. Additionally, skill files loaded via ~skill-initialize-all~ execute arbitrary Lisp in jailed packages — a skill file containing ~(uiop:run-program "dangerous")~ executes immediately on load before any gate can inspect it.
@@ -402,7 +418,14 @@ Rationale: The ~:system :eval~ actuator path is currently unchecked by the Dispa
- Add skill sandbox mode for ~skill-initialize-all~: load each skill's code into a temporary jailed package, run the registered trigger function in isolation, verify it imports no restricted symbols (from CL package: ~run-program~, ~shell~, ~run-shell-command~), then promote to the live registry on pass.
- Add FiveAM test: register a skill containing ~(uiop:run-program "echo test")~ in the body and verify the sandbox blocks its promotion.
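To illustrate the quoting problem from the rationale: printing a string as a Lisp string literal protects it for the Lisp reader, not for the shell, so ~$(...)~ inside double quotes would still expand. A minimal sketch of POSIX single-quote escaping follows; ~shell-quote~ is a hypothetical helper, not the project's actuator code.

#+begin_src lisp
;; Hypothetical helper: wraps a string so POSIX sh treats it as one literal word.
(defun shell-quote (string)
  "Wrap STRING in single quotes, escaping each embedded single quote."
  (with-output-to-string (out)
    (write-char #\' out)
    (loop for ch across string
          do (if (char= ch #\')
                 ;; close the quote, emit an escaped quote, reopen the quote
                 (write-string "'\\''" out)
                 (write-char ch out)))
    (write-char #\' out)))
#+end_src

Inside single quotes the shell performs no substitution, so a payload like ~$(rm -rf /)~ stays inert text. Passing arguments as a list to the process-spawning call, where the library supports it, avoids the shell entirely and is usually the safer design.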
*** v0.3.3 — TODO TUI Critical Fixes
*** DONE TUI Critical Fixes
:PROPERTIES:
:ID: id-v033-tui-fixes
:CREATED: [2026-05-06 Wed]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-06 Wed 17:59]
:END:
Rationale: The TUI is Passepartout's only interface. OpenClaw distributes across 25+ messaging channels with voice, Canvas, and macOS/iOS apps. Hermes Agent ships multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output in its TUI. Passepartout's Croatoan TUI must carry the product alone, and it currently lacks word wrap, cursor movement, resize handling, connection-loss feedback, a quit command, and persistent history. None of these fixes require daemon changes — they are pure client-side Croatoan work that closes the gap from "proof of concept" to "daily driver."
@@ -415,7 +438,10 @@ Rationale: The TUI is Passepartout's only interface. OpenClaw distributes across
- Message list storage: replace the O(n²) ~(nth i msgs)~ list indexing with a simple adjustable vector. ~add-msg~ appends; ~view-chat~ iterates with ~aref~. The vector is resized as needed. Same API surface, 100x speedup on message-heavy sessions.
- Add FiveAM tests: word-wrap produces correct line count for a 200-character string at 80-column width; cursor left/right wraps at buffer boundaries; SIGWINCH preserves message state; ~/quit~ saves and restores history.
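The message-storage item can be sketched with a standard adjustable vector. The names ~make-message-store~, ~store-add-msg~, and ~map-messages~ are hypothetical, as is the plist shape of a message; the TUI's real ~add-msg~ has a different signature.

#+begin_src lisp
;; Hypothetical sketch: an adjustable vector with a fill pointer as message log.
(defun make-message-store ()
  "Create an empty, growable message vector."
  (make-array 16 :adjustable t :fill-pointer 0))

(defun store-add-msg (store role text)
  "Append a message; the vector grows automatically when it is full."
  (vector-push-extend (list :role role :text text) store)
  store)

(defun map-messages (fn store)
  "Call FN on each message in order, using constant-time AREF access."
  (loop for i from 0 below (length store)
        do (funcall fn (aref store i))))
#+end_src

Indexing with ~aref~ is constant time per element, whereas ~(nth i msgs)~ walks the list from the head on every call, which is where the quadratic cost in the item above comes from.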
** v0.4.0: Production Hardening
** v0.4.0: Production Hardening — RELEASED 2026-05-06
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-06 Wed 20:56]
:END:
The features in this version were originally sequenced as v0.3.x patches but represent feature-level scope. They activate the architectural advantages designed in v0.1.0–v0.3.0, harden the self-build safety boundary, and expand Passepartout's interaction surfaces beyond the terminal TUI. Each feature depends on infrastructure already in place — the wiring, the sandbox, the gate trace — and activates it.
@@ -454,7 +480,7 @@ CLOSED: [2026-05-06 Tue]
Rationale: Three architectural elements exist today in the daemon that no competitor can render — the Dispatcher gate trace, the foveal-peripheral focus map, and the rules-learned counter. All three run in pure Lisp with 0 LLM tokens. None are visible to the user. Making them visible turns Passepartout's architecture from an internal mechanism into a trust-building UX — the user sees exactly which safety gates passed, exactly what the agent is focusing on, and exactly how many rules the Dispatcher has learned from their decisions. No competitor can ship this because none has deterministic gates to trace, foveal-peripheral context to map, or a rule-synthesizing Dispatcher to count.
- Gate trace per action: extend the daemon's response plist to include ~:gate-trace~ — a list of ~(:gate <name> :result <:passed | :blocked | :approval>)~ entries produced by ~cognitive-verify~. The TUI renders each entry as a colored line below the corresponding agent message: green ~✓ Dispatcher: path allowed~, red ~✗ Dispatcher: blocked (shell safety)~, yellow ~→ HITL required: /approve HITL-ab12~. Gate trace lines are dim and collapsible (press Tab on a message to toggle trace visibility). This turns the invisible nine-vector safety gate into the user's primary trust mechanism.
- Gate trace per action: extend the daemon's response plist to include ~:gate-trace~ — a list of ~(:gate <name> :result <:passed | :blocked | :approval>)~ entries produced by ~cognitive-verify~. The TUI renders each entry as a colored line below the corresponding agent message: green ~✓ Dispatcher: path allowed~, red ~✗ Dispatcher: blocked (shell safety)~, yellow ~→ HITL required: /approve HITL-ab12~. Gate trace lines are dim and collapsible (press Tab on a message to toggle trace visibility). This turns the invisible ten-vector safety gate into the user's primary trust mechanism.
- Focus map in status bar: add a second status bar line showing ~[Focus: core-loop.lisp:think()] [Scope: passepartout] [3 related nodes]~. The daemon already tracks ~foveal-id~ and ~*scope-resolver*~ in the signal plist; the TUI reads these from the most recent response and renders them. Related node count comes from the number of objects with cosine similarity ≥ threshold in the last context assembly. This shows the user *what the agent is looking at* — the single biggest trust gap in AI agents.
- Rule counter in status bar: ~[Rules: 47]~. The Dispatcher's ~*hitl-pending*~ hash table and approved/disallowed memory-object entries provide the count — every HITL decision that produces a rule increments it. The TUI reads the count from a new daemon response field ~:rule-count~. The user watches the counter tick up as they teach the agent their preferences.
- Expanded theme: replace the 7-flat-color ~*tui-theme*~ with a 25-color layered system organized by message category (roles, content types, tool visibility, gate states, status). See the design discussion for the full color mapping. Implement a ~/theme <name>~ command that swaps between named presets (~dark~, ~light~, ~solarized~, ~gruvbox~). Theme change persists to disk and reloads on next session.
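As a rough illustration of the gate trace item, the sketch below turns ~(:gate <name> :result <result>)~ entries into display lines. The function names are hypothetical, ASCII tags stand in for the colored glyphs, and the real TUI rendering (colors, dimming, collapsing) is not modeled.

#+begin_src lisp
;; Hypothetical sketch: renders trace entries as plain text lines.
(defun render-gate-trace-entry (entry)
  "Turn one (:gate NAME :result RESULT) plist into a display string."
  (let ((gate (getf entry :gate))
        (result (getf entry :result)))
    (ecase result
      (:passed   (format nil "[ok] ~a" gate))
      (:blocked  (format nil "[blocked] ~a" gate))
      (:approval (format nil "[approval] ~a" gate)))))

(defun render-gate-trace (trace)
  "Render every entry of TRACE, preserving gate order."
  (mapcar #'render-gate-trace-entry trace))
#+end_src

Using ~ecase~ means an unknown result keyword signals an error instead of being silently rendered, which seems appropriate for output the user relies on for trust.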
@@ -521,7 +547,14 @@ The messaging gateways and Emacs bridge expand Passepartout's interaction surfac
** v0.4.1: Design Cleanup
*** TODO Remove system-prompt-augment mechanism
*** DONE Remove system-prompt-augment mechanism
:PROPERTIES:
:ID: id-v041-augment-removal
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 13:13]
:END:
Rationale: The ~system-prompt-augment~ slot on the skill struct enables skills to inject always-on text into every LLM system prompt via a ~maphash~ over ~*skill-registry*~ in ~think()~ (core-loop-reason.lisp:83-92). Only one skill uses it — ~programming-repl~ — and it does so as a backdoor: the skill's trigger is hardcoded to ~nil~, so it never fires as an active skill. Its sole contribution is injecting a REPL-first mandate into every system prompt. The other ~24 skills have nil augments and are skipped by the ~when aug-fn~ guard. This is architecturally wrong: standing mandates (always-on rules) should live in a dedicated ~*standing-mandates*~ list, not piggyback on a skill that is never triggered. The mechanism also fuels a false claim in DESIGN_DECISIONS about 3,000-8,000 tokens of overhead — the actual overhead is ~40 tokens from the one active augment.
@@ -531,16 +564,164 @@ Rationale: The ~system-prompt-augment~ slot on the skill struct enables skills t
- Introduce ~*standing-mandates*~ (a list of function → string generators). Inject them into the IDENTITY section of the system prompt alongside ~assistant-name~. Move ~repl-mandate~ there: ~(push #'repl-mandate *standing-mandates*)~.
- Tangle the corresponding lisp/ files.
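A minimal sketch of the ~*standing-mandates*~ idea from the items above. The variable name and ~repl-mandate~ come from this entry; the mandate wording, the ~identity-section~ function, and the prompt format are placeholders. The sketch registers the mandate by symbol with ~pushnew~ so that reloading the file does not add duplicates, which differs slightly from the ~push~ form quoted above.

#+begin_src lisp
;; Hypothetical sketch: prompt wording and identity-section are assumptions.
(defvar *standing-mandates* '()
  "Function designators of no arguments; each returns a mandate string or NIL.")

(defun repl-mandate ()
  "Return the REPL-first mandate text (the wording here is a placeholder)."
  "Test code in the running image before writing it to disk.")

(pushnew 'repl-mandate *standing-mandates*)

(defun identity-section (assistant-name)
  "Build the IDENTITY text: the assistant name, then every active mandate."
  (format nil "You are ~a.~{~%~a~}"
          assistant-name
          (remove nil (mapcar #'funcall *standing-mandates*))))
#+end_src

A mandate function that returns ~nil~ contributes nothing to the prompt, which matches the claim that an inactive mandate costs 0 tokens.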
*** TODO Fix false token-overhead claims in docs
*** DONE Fix false token-overhead claims in docs
:PROPERTIES:
:ID: id-v041-doc-fix
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 13:13]
:END:
Rationale: Two documents claim the ~system-prompt-augment~ mechanism can waste 3,000-8,000 tokens per think() call (DESIGN_DECISIONS line 435, ROADMAP line 504). This conflates the ~maphash~ iteration (cheap hash walk, no token cost) with the augments actually emitted (only ~programming-repl~ emits ~40 tokens; the ~when aug-fn~ guard skips the other 24 nil-augment skills). Once issue #1 above is resolved (removing the mechanism), these claims become doubly false.
- DESIGN_DECISIONS: Rewrite or remove bullet 2 under "Open Questions and Risks" (line 435). Replace with a corrected note on standing mandates via ~*standing-mandates*~.
- ROADMAP v0.5.0 intro (line 504): Remove or rewrite the claim that "system prompt overhead alone could reach 3,000-8,000 tokens per call before user input is even processed." The fixed overhead is not from skill augments — it is from the IDENTITY, TOOLS, CONTEXT, and LOGS sections, which prefix caching addresses.
*** DONE Update security vector count 9→10 in docs
:PROPERTIES:
:ID: id-v041-vector-count
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 14:40]
:END:
Rationale: The current dispatcher runs 10 deterministic checks (11 counting the warning-only REPL lint), but the README, ARCHITECTURE.org, and the ~dispatcher-check~ docstring all say 9. The actual count: 0=REPL-lint (warn only), 1=lisp-validation, 2=secret-path, 2b=self-build-core, 3=secret-content, 4=vault-secrets, 5=privacy-tags, 6=privacy-text, 7=shell-safety, 8=network-exfil, 8b=high-impact-approval. Ten blocking/approval checks. The vector 2b (self-build safety) and the new count must be reflected accurately in all documentation.
- Update README.org "What Makes Passepartout Different" → "nine" becomes "ten".
- Update docs/ARCHITECTURE.org Dispatcher Gate Stack table — add self-build entry.
- Update security-dispatcher.lisp:196 docstring to list all 11 vectors.
*** DONE Rewrite README — add "What is an agent?" section, revise claims
:PROPERTIES:
:ID: id-v041-readme-rewrite
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 14:40]
:END:
Rationale: The current README opens with competitive claims (downward cost curve, 2-3x fewer tokens) that are architecturally sound but not yet measured in the implementation. A non-engineer reader doesn't know what an AI agent is or why they'd want one. The README should lead with a short "What is an agent?" section (3-4 sentences, Wikipedia link), then "What Makes It Different" (safety, org-mode, offline — things that actually work today), then honest status of what's implemented vs planned.
- Add "What is an AI Agent?" section at top: 3-4 sentences + link to [[https://en.wikipedia.org/wiki/Software_agent][Software agent]].
- Move competitive cost/speed claims to docs/DESIGN_DECISIONS.org.
- Revise "The more you use it, the cheaper it gets" to reflect current state — architectural aspiration, not measured implementation yet.
- The Current Capabilities table and Quick Start sections stay intact.
*** DONE Register cognitive tools — 10 tools for codebase operations
:PROPERTIES:
:ID: id-v041-cognitive-tools
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 14:40]
:END:
Rationale: The ~def-cognitive-tool~ macro and ~*cognitive-tool-registry*~ are fully implemented but the registry is empty. The LLM sees "No tools registered" in its tool belt prompt. The agent can chat and run shell commands, but cannot search codebases, find files, eval code, run tests, or manipulate Org files. Ten cognitive tools bridge this gap and are prerequisites for the TDD workflow, org-mode additions, and evaluation harness in v0.5.0.
- New skill: ~programming-tools.org~ (~programming-tools.lisp~).
- Register 10 tools via ~def-cognitive-tool~:
1. ~search-files~ — regex search in file contents (uses ~cl-ppcre:scan~). Parameters: ~pattern~, ~path~ (dir), ~include~ (glob filter).
2. ~find-files~ — glob file matching (uses SBCL ~directory~). Parameters: ~pattern~, ~path~.
3. ~read-file~ — read file contents (uses ~uiop:read-file-string~). Parameters: ~filepath~.
4. ~write-file~ — write content to file. Parameters: ~filepath~, ~content~.
5. ~list-directory~ — list directory contents. Parameters: ~path~, ~pattern~ (optional).
6. ~run-shell~ — execute shell command (through existing shell actuator). Parameters: ~cmd~.
7. ~eval-form~ — evaluate Lisp expression in running image. Parameters: ~code~, ~package~ (optional).
8. ~run-tests~ — run FiveAM tests. Parameters: ~test-name~ (optional, nil runs all).
9. ~org-find-headline~ — find Org headline by ID or title. Parameters: ~id~ or ~title~, ~filepath~ (optional, searches memory store if not given).
10. ~org-modify-file~ — surgical text replacement in Org file (reuses existing ~org-modify~). Parameters: ~filepath~, ~old-text~, ~new-text~.
- Descriptive names rather than Unix command names — the LLM reads these in a prompt, not a terminal.
- Each tool is ~20-60 lines. ~search-files~ iterates directory, reads files, scans lines.
- FiveAM tests: each tool gets a test verifying operation on a temp directory.
*** DONE Enforce NO-HARDCODED-CONSTANTS programming standard
:PROPERTIES:
:ID: id-v041-no-hardcoded
:CREATED: [2026-05-07 Thu]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-07 Thu 14:40]
:END:
Rationale: Currently, several configurable values are hardcoded in source: the Dispatcher's rule threshold (not yet configurable), similarity thresholds, timeouts, shell max output. The user should control behavior through ~.env~, not by editing source code. This is rule #6 in the ~programming-standards.org~ skill. Each new TODO that introduces a configurable value must add it to ~.env.example~ with a documented default.
- Add ~DISPATCHER_RULE_THRESHOLD=3~ to ~.env.example~ (number of HITL approvals before a pattern becomes a permanent rule).
- Add ~RULES_FILE="$HOME/memex/system/rules.org"~ to ~.env.example~.
- Scan existing source for hardcoded configurable values — add to ~.env.example~ where missing.
- Any new TODO in v0.4.2+ that introduces a configurable value MUST include its ~.env.example~ entry.
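A minimal sketch of how a value such as ~DISPATCHER_RULE_THRESHOLD~ might be read with a documented default rather than a literal in source. The helper name ~env-integer~ is hypothetical, and the sketch assumes the ~.env~ values have already been exported into the process environment; how Passepartout loads ~.env~ is not shown here.

#+begin_src lisp
;; UIOP ships with ASDF; make sure it is present before the reader sees uiop: symbols.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :asdf))

(defun env-integer (name default)
  "Return env var NAME parsed as an integer, or DEFAULT when unset or unparsable.
Hypothetical helper, not part of the existing codebase."
  (let* ((raw (uiop:getenv name))
         (parsed (and raw (parse-integer raw :junk-allowed t))))
    (or parsed default)))

;; Example: falls back to 3 when the variable is not set.
(defparameter *dispatcher-rule-threshold*
  (env-integer "DISPATCHER_RULE_THRESHOLD" 3))
#+end_src

Keeping the default next to the lookup means the value in ~.env.example~ and the fallback in code can be checked against each other in review.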
** v0.4.2: Structured Output (LLM → JSON → plist)
The current ~think()~ function asks the LLM to produce raw S-expression plists. Four pieces of defensive infrastructure (~handler-case~ around ~read-from-string~, ~markdown-strip~, ~plist-keywords-normalize~, the RCE guard test) exist because LLMs cannot reliably produce balanced, keyword-prefixed plists. The fix: use the LLM API's native function calling / tool-use feature. The LLM always returns guaranteed-valid JSON. Convert to plist deterministically at the boundary.
*** TODO Implement function-calling / tool-use API in provider requests
:PROPERTIES:
:ID: id-v042-function-calling
:CREATED: [2026-05-07 Thu]
:END:
Rationale: Every major provider API (OpenAI, Anthropic, Groq, DeepSeek, OpenRouter) supports function calling. The LLM is sent tool definitions as JSON Schema. It returns ~tool_calls~ with guaranteed-valid JSON arguments. This eliminates the fragile ~read-from-string~ plist parsing entirely — the probabilistic layer speaks JSON (what it was trained on), the deterministic layer speaks plists (what the code controls). Conversion happens at a narrow, well-defined boundary.
- Modify ~provider-openai-request~ in ~system-model-provider.lisp~: add optional ~:tools~ parameter. When tools are provided, include ~"tools": [...]~ and ~"tool_choice": "auto"~ in the request body.
- Parse ~tool_calls~ from the API response: extract ~function.name~ and ~function.arguments~ (guaranteed valid JSON).
- Return a new result shape: ~(:status :success :tool-calls ((:name "shell" :arguments (:cmd "echo hello"))))~ alongside or instead of ~:content~.
- For providers that don't support function calling (local Ollama): keep ~:content~ path as fallback. LLM can still return raw text.
- FiveAM test: send a request with a mock tool definition, verify the response shape.
*** TODO Wire structured tool calls into ~think()~ — JSON→plist at boundary
:PROPERTIES:
:ID: id-v042-wire-tool-calls
:CREATED: [2026-05-07 Thu]
:END:
Rationale: Once the provider layer returns structured ~tool-calls~, the ~think()~ function must convert them to the internal plist format that ~cognitive-verify~ and ~loop-gate-act~ expect. This is a one-way, deterministic conversion at the architectural boundary.
- Add ~json-alist-to-plist~ helper in ~core-loop-reason.lisp~ or ~core-utils.lisp~: convert JSON alist (from ~cl-json:decode-json-from-string~) to keyword-prefixed plist. String keys → keywords. Nested objects recurse. JSON null → ~nil~. ~25 lines.
- In ~think()~ after ~backend-cascade-call~: if result contains ~:tool-calls~, convert each tool call's ~:arguments~ JSON to plist via ~json-alist-to-plist~, wrap in ~(:TYPE :REQUEST :PAYLOAD (:TOOL <name> :ARGS <plist> :EXPLANATION "..."))~.
- Keep the existing ~read-from-string~ path as fallback for providers that return raw text (local Ollama, streaming).
- The ~read-from-string~ path remains guarded by ~*read-eval* nil~ from v0.3.1.
- FiveAM test: JSON ~{"action":"shell","cmd":"echo hello"}~ → plist ~(:ACTION "shell" :CMD "echo hello")~ round-trip verified.
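The conversion described above could look roughly like the following. This is a sketch under stated assumptions, not the planned implementation: it takes an already-decoded alist (string or symbol keys), and it uses a heuristic (~json-object-p~, a hypothetical helper) to tell decoded objects from arrays, which misclassifies an array whose elements are themselves lists starting with a string or symbol.

#+begin_src lisp
(defun json-key-to-keyword (key)
  "Turn a string or symbol key into an upcased keyword."
  (intern (string-upcase (string key)) :keyword))

(defun json-object-p (value)
  "Heuristic: a list whose entries are all (key . value) conses with atom keys."
  (and (consp value)
       (every (lambda (entry)
                (and (consp entry)
                     (or (stringp (car entry)) (symbolp (car entry)))))
              value)))

(defun json-value-to-lisp (value)
  "Recurse into nested objects and arrays; leave scalars (and nil) untouched."
  (cond ((json-object-p value) (json-alist-to-plist value))
        ((consp value) (mapcar #'json-value-to-lisp value))
        (t value)))

(defun json-alist-to-plist (alist)
  "Convert a decoded JSON object (alist) to a keyword-prefixed plist."
  (loop for (key . value) in alist
        collect (json-key-to-keyword key)
        collect (json-value-to-lisp value)))
#+end_src

For example, ~(json-alist-to-plist '(("action" . "shell") ("cmd" . "echo hello")))~ yields ~(:ACTION "shell" :CMD "echo hello")~, matching the round-trip named in the FiveAM bullet.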
** v0.4.3: Shell Sandboxing & Safety Classification
The current shell safety is regex-based pattern matching — a fast pre-filter that catches obvious attacks but cannot contain sophisticated or encoded payloads. This version adds actual sandbox isolation (bubblewrap Linux namespaces) as the enforcement layer, and introduces severity classification so the rule learning system in v0.5.0 can apply different thresholds to catastrophic vs harmless operations.
*** TODO Add ~bwrap~ sandbox to shell actuator
:PROPERTIES:
:ID: id-v043-bwrap-sandbox
:CREATED: [2026-05-07 Thu]
:END:
Rationale: Regex-based shell safety catches obvious patterns (~rm -rf /~, ~dd if=~, ~mkfs.~) but is fundamentally bypassable with encoding (~base64 -d | bash~), indirection (~find / -exec rm {} \;~), or interpreter-based execution (~python3 -c "import os; os.system(...)"~). Bubblewrap (~bwrap~) is a 200KB unprivileged sandbox binary available on all modern Linux distributions. It creates transient Linux namespaces without root, without Docker, without daemon processes. Combined with the regex pre-filter, it provides defense-in-depth: the regex catches obvious attacks fast (no sandbox spawn), the sandbox contains sophisticated ones.
- In ~actuator-shell-execute~ (~system-actuator-shell.lisp~): detect if ~bwrap~ binary is available (~which bwrap~).
- If available: wrap command in ~bwrap --ro-bind /usr /usr --ro-bind /lib /lib --ro-bind /bin /bin --ro-bind /etc /etc --bind ~/memex ~/memex --bind /tmp /tmp --unshare-net --unshare-ipc timeout ...~.
- ~--unshare-net~: no network access within sandbox. Makes regex-based network exfiltration check redundant for sandboxed commands.
- ~--unshare-ipc~: no shared memory, no semaphore injection.
- If ~bwrap~ is unavailable: log a warning, fall back to current behavior (regex-only safety).
- The regex checks remain as a fast pre-filter — they run before spawning the sandbox.
- FiveAM test: command that reads ~/etc/shadow~ inside sandbox fails with permission error; same command in unsandboxed fallback is at least caught by path protection.
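As an illustration of the wrapping step, the sketch below builds the ~bwrap~ argument vector as a list of strings. The function name ~bwrap-argv~ is hypothetical, and running the command through ~/bin/sh -c~ is an assumption; the ~timeout~ wrapper from the bullet above is left out because its arguments are not specified. The home directory is expanded explicitly because ~ is not expanded when arguments are passed without a shell.

#+begin_src lisp
(defun bwrap-argv (command
                   &key (memex (namestring
                                (merge-pathnames "memex/" (user-homedir-pathname)))))
  "Return the argv list that runs COMMAND inside a bubblewrap sandbox.
Hypothetical helper; flags mirror the ones listed in the roadmap item."
  (append (list "bwrap")
          ;; Read-only system directories.
          (loop for dir in '("/usr" "/lib" "/bin" "/etc")
                append (list "--ro-bind" dir dir))
          ;; Writable areas, then namespace isolation, then the command itself.
          (list "--bind" memex memex
                "--bind" "/tmp" "/tmp"
                "--unshare-net"
                "--unshare-ipc"
                "/bin/sh" "-c" command)))
#+end_src

Building a list rather than one long string avoids a second layer of shell quoting: the result can be handed to a process runner that accepts an argv list, such as ~uiop:run-program~.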
*** TODO Shell safety severity classification system
:PROPERTIES:
:ID: id-v043-severity-classification
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The current shell safety check treats all dangerous patterns equally — ~rm -rf /~ gets the same treatment as a backtick injection in ~echo~. But not all shell operations carry the same risk. A severity classification system enables the rule learning engine (v0.5.0) to apply different thresholds: catastrophic operations are always HITL regardless of approval count, moderate operations graduate to allowed after N approvals, harmless operations are allowed by default.
- Define four severity tiers as plist keywords: ~:catastrophic~ (mkfs, dd to devices, rm -rf /, shred /dev/), ~:dangerous~ (chmod -R /, writes outside ~/memex, curl to unwhitelisted domains), ~:moderate~ (npm install, pip install, git push, writes within ~/memex), ~:harmless~ (echo, ls, cat, find without exec, grep).
- Extend ~*dispatcher-shell-blocked*~ entries from simple ~(NAME REGEX)~ to ~(NAME REGEX :SEVERITY <tier>)~.
- Extend ~dispatcher-check-shell-safety~ to return the severity alongside the matched pattern name.
- ~:catastrophic~ severity always triggers HITL approval, regardless of rule count. ~:harmless~ operations are allowed by default (skip HITL and rule learning).
- The severity classification is the foundation that ~dispatcher-learn~ (v0.5.0) builds on — learning only applies to ~:dangerous~ and ~:moderate~ tiers.
- FiveAM test: ~echo hello~ returns ~:harmless~ severity and passes through; ~mkfs.ext4 /dev/sda~ returns ~:catastrophic~ and always requires HITL approval, regardless of approval count.
** v0.5.0: Token Economics & Prompt Efficiency
The architecture's single largest gap versus SOTA: Passepartout currently spends tokens like a research prototype. Every ~think()~ call rebuilds and retransmits the full system prompt (IDENTITY + TOOLS + CONTEXT + LOGS + SKILL_AUGMENTS) with no caching, no budget, and no incremental assembly. The foveal-peripheral model prunes memory content but doesn't touch the fixed overhead. With 20+ skills by v1.0.0, system prompt overhead alone could reach 3,000-8,000 tokens per call before user input is even processed.
The architecture's single largest gap versus SOTA: Passepartout currently spends tokens like a research prototype. Every ~think()~ call rebuilds and retransmits the full system prompt — IDENTITY + TOOLS + CONTEXT + LOGS — with no caching, no budget, and no incremental assembly. The foveal-peripheral model prunes memory content but doesn't touch the fixed overhead of IDENTITY, TOOLS, and LOGS sections, which together dominate the system prompt size. Standing mandates (~*standing-mandates*~) contribute negligible overhead (~40 tokens when the single active mandate fires).
Competitors (Claude Code, OpenClaw, Copilot) all implement some form of prefix caching — Anthropic's API gives 90% discount on cached tokens, OpenAI caches automatically. Passepartout's prompt structure is already naturally cacheable: IDENTITY, TOOLS, and LOGS format are static across calls. This version turns that structural property into a cost advantage.
@@ -552,7 +733,7 @@ Competitors (Claude Code, OpenClaw, Copilot) all implement some form of prefix c
- Use for three purposes: context budget enforcement (reject assembly if over limit), cost estimation (tokens × provider price), and prompt optimization (measure which sections of the system prompt consume the most budget).
*** TODO Prompt prefix caching
- Split the system prompt into a static prefix (IDENTITY string, TOOLS section, LOGS format header) and a dynamic suffix (CONTEXT render, current log entries, skill augments, user prompt).
- Split the system prompt into a static prefix (IDENTITY string, TOOLS section, LOGS format header) and a dynamic suffix (CONTEXT render, current log entries, standing mandates, user prompt).
- Track a hash of the static prefix; only retransmit when it changes (skill load/unload, identity config change). On cache hit, send the cached prefix with the dynamic suffix appended.
- Implement the Anthropic prompt-caching header protocol for providers that support it (claude-3-* models, up to 90% discount on cached tokens). For OpenAI, the automatic caching layer handles prefix detection without explicit headers.
- Log cache hit/miss rate to telemetry for cost tracking.
@@ -609,6 +790,185 @@ Rationale: The Dispatcher currently learns from blocked and approved actions —
- Induced functions are proposed, not automatically applied. The next time a similar request arrives, the agent checks: "I have an induced function for this. Use it?" The user approves the first invocation, and subsequent invocations of the same function are automatic.
- Add FiveAM test: replay a historical interaction sequence, verify the induced function produces the same outcome.
*** TODO TDD workflow skill — language-agnostic test runner
:PROPERTIES:
:ID: id-v050-programming-tdd
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The REPL-TDD-Literate workflow described in AGENTS.md lives entirely outside the agent's cognitive loop. The agent should be able to write tests, run them, observe red/green, and iterate — without the user manually managing the cycle. This is the Lisp advantage made operational: redefine a function, re-run a single test, get results in <100ms. Claude Code cannot do this — it has no REPL. The skill is language-agnostic: it dispatches to the REPL skill for Lisp, shells out to ~pytest~ for Python, ~go test~ for Go, etc.
- New skill: ~programming-tdd.org~. Depends on REPL skill for Lisp, falls back to shell for other languages.
- Cognitive tools: ~deftest~ (define a test), ~run-test~ (run a specific test), ~list-tests~ (list all defined tests).
- ~run-test~ dispatches on ~:language~ parameter:
- ~:lisp~ → ~(fiveam:run 'test-name)~ via REPL eval
- ~:python~ → shell ~python3 -m pytest test_file.py::test_name~
- ~:go~ → shell ~go test -run TestName ./...~
- ~:rust~ → shell ~cargo test test_name~
- ~:default~ → shell command template from env ~TEST_RUNNER_<LANG>~
- The TDD loop: write test → ~run-test~ (expect RED) → write implementation → ~run-test~ (expect GREEN) → report.
- ~#+DEPENDS_ON: org-skill-utils-repl~ for Lisp TDD; no dependency for other languages (shell fallback).
- FiveAM tests: ~run-test~ on a known-failing test returns RED status; ~run-test~ on a known-passing test returns GREEN.
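The non-Lisp branches of the dispatch above amount to building a shell command from a language keyword. A minimal sketch, assuming a hypothetical helper ~test-command~ and leaving out both the ~:lisp~ REPL branch and the ~TEST_RUNNER_<LANG>~ fallback:

#+begin_src lisp
(defun test-command (language test-name &key file)
  "Return the shell command string that runs TEST-NAME for LANGUAGE.
Hypothetical helper; command templates are the ones listed in the roadmap item."
  (case language
    (:python (format nil "python3 -m pytest ~a::~a" file test-name))
    (:go     (format nil "go test -run ~a ./..." test-name))
    (:rust   (format nil "cargo test ~a" test-name))
    (t (error "No test runner known for ~s" language))))
#+end_src

The test name is interpolated unescaped here, so in practice the resulting string would still need to pass through the shell actuator and its safety checks rather than being executed directly.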
*** TODO Expand literate programming skill — persist after TDD
:PROPERTIES:
:ID: id-v050-literate-persist
:CREATED: [2026-05-07 Thu]
:END:
Rationale: After the TDD loop confirms green, the agent must persist the working code into its Org source file and tangle to ~.lisp~. Currently ~self-improve-edit~ can do surgical text replacement but doesn't integrate with the TDD confirmation step. The literate skill should provide a ~persist-verified-block~ tool that takes TDD-confirmed code and writes it to the appropriate ~#+begin_src lisp~ block.
- Add ~persist-verified-block~ cognitive tool: accepts ~filepath~, ~block-name~, ~code~, ~test-result~. Only writes if ~test-result~ is GREEN.
- Verifies the written Org file passes ~literate-block-balance-check~ before tangling.
- Tangles via existing ~org-tangle-file~.
- FiveAM test: persist a verified block, verify it appears in the tangled ~.lisp~ file, verify the Org file passes balance check.
*** TODO Org-mode productivity additions — agenda, clock, checklist, table
:PROPERTIES:
:ID: id-v050-org-additions
:CREATED: [2026-05-07 Thu]
:END:
Rationale: Passepartout bets on Org-mode as the universal format for human and machine. But current Org support is thin: headlines, tags, property drawers, source blocks. Missing are the features that make Org a productivity tool: agenda views, clock-in/out, checklists, tables. Adding these turns the agent from a chat partner into a productivity assistant — it can answer "what should I work on today?" with 0 LLM tokens.
- Extend ~programming-org.lisp~ (~programming-org.org~) with five new functions:
1. ~org-agenda-today~ — walk memory (or file tree) for headlines with ~SCHEDULED~ ≤ today or ~DEADLINE~ within N days. Returns list of memory-objects. ~60 lines.
2. ~org-clock-in~ / ~org-clock-out~ — set ~:CLOCK-START~ property; on clock-out, compute duration, append to ~:LOGBOOK:~ drawer. ~80 lines.
3. ~org-checklist-toggle~ — parse ~- [ ]~ / ~- [X]~ checkboxes in headline content, toggle state, return completed/total count. ~50 lines.
4. ~org-table-parse~ / ~org-table-render~ — parse ~| a | b |~ tables into list-of-lists, render back. ~70 lines.
5. ~org-agenda-view~ — compose agenda + clock state + TODO headlines into single Org-formatted string. Used by ~/agenda~ TUI command. ~50 lines.
- ~org-agenda-today~ and ~org-agenda-view~ operate on memory store (zero file I/O, zero LLM tokens).
- FiveAM test for each function.
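To make the table functions concrete, here is one possible shape for ~org-table-parse~ and ~org-table-render~. It is a sketch only: the helper names ~org-table-row-p~ and ~org-table-split-row~ are hypothetical, separator rows (~|---+---|~) are simply dropped, and column alignment on render is not attempted.

#+begin_src lisp
(defun org-table-row-p (line)
  "True for a data row: starts with | but is not a |---+---| separator."
  (let ((trimmed (string-trim '(#\Space #\Tab) line)))
    (and (> (length trimmed) 1)
         (char= (char trimmed 0) #\|)
         (char/= (char trimmed 1) #\-))))

(defun org-table-split-row (line)
  "Split one | a | b | row into a list of trimmed cell strings."
  (let* ((trimmed (string-trim '(#\Space #\Tab) line))
         (len (length trimmed))
         ;; Drop the leading bar, and the trailing bar when present.
         (end (if (char= (char trimmed (1- len)) #\|) (1- len) len))
         (body (subseq trimmed 1 end)))
    (loop with start = 0
          for bar = (position #\| body :start start)
          collect (string-trim " " (subseq body start bar))
          while bar
          do (setf start (1+ bar)))))

(defun org-table-parse (text)
  "Parse Org table TEXT into a list of rows, each a list of cell strings."
  (with-input-from-string (in text)
    (loop for line = (read-line in nil)
          while line
          when (org-table-row-p line)
            collect (org-table-split-row line))))

(defun org-table-render (rows)
  "Render a list of rows back into | a | b | lines separated by newlines."
  (format nil "~{| ~{~a~^ | ~} |~^~%~}" rows))
#+end_src

Because both functions are pure string transformations, they need no file I/O and no LLM call, which fits the zero-token goal stated in the rationale.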
*** TODO Vault encryption — Ironclad AES + PBKDF2
:PROPERTIES:
:ID: id-v050-vault-encryption
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The vault (~*VAULT-MEMORY*~) stores API keys and credentials in plaintext in a hash table. ~VAULT-MASK-STRING~ always returns ~"[MASKED]"~ ignoring input — it's a stub. Ironclad is already a dependency (used for SHA-256 in Merkle hashing) and provides AES-256-GCM, ChaCha20, and PBKDF2. Encryption makes the vault go from security theater to actual security.
- Add ~vault-encrypt~ / ~vault-decrypt~ using Ironclad AES-256-GCM. Master key derived via PBKDF2 from ~VAULT_MASTER_PASSPHRASE~ env var or ~~/.config/passepartout/.key~ file.
- Store ciphertext instead of plaintext in ~*VAULT-MEMORY*~.
- ~VAULT-MASK-STRING~ actually masks (replaces all chars with ~*~, preserving length).
- ~dispatcher-vault-scan~ searches plaintext after decrypt (still catches leaks before they reach the LLM).
- FiveAM test: round-trip encrypt/decrypt; wrong passphrase fails; masked string has same length as original.
*** TODO Deterministic gate growth — ~dispatcher-learn~ + ~rules.org~
:PROPERTIES:
:ID: id-v050-dispatcher-learn
:CREATED: [2026-05-07 Thu]
:END:
Rationale: This is the "cheaper over time" claim made operational. Every HITL approval or denial becomes data. After N approvals of the same pattern, it becomes a permanent deterministic rule. The LLM no longer asks permission. 0 LLM tokens spent on what used to be a human decision. The user watches the rule counter tick up as they teach the agent.
- ~dispatcher-learn~ function in ~security-dispatcher.lisp~: called from ~hitl-approve~ and ~hitl-deny~. Extracts pattern (~:tool~ + ~:filepath~ glob + ~:cmd~ pattern). Tracks count per pattern in memory store.
- When the count reaches ~DISPATCHER_RULE_THRESHOLD~ (from ~.env~, default 3), writes a rule to ~RULES_FILE~ (~~/memex/system/rules.org~).
- Each rule is an Org headline with ~:EXPLANATION:~ property explaining what the rule does and why it was created.
- ~dispatcher-check~ consults ~RULES_FILE~ before its blocking vectors — allowed rules pass through, blocked rules are denied.
- Rules are loaded from ~rules.org~ at daemon startup (survive restarts).
- ~dispatcher-severity-allowed-p~: uses severity classification from v0.4.3 — ~:catastrophic~ always HITL regardless of rule count. ~:harmless~ always allowed.
- Severity thresholds: ~:dangerous~ = 5 approvals, ~:moderate~ = 3 approvals (configurable via ~.env~).
- ~DISPATCHER_RULE_THRESHOLD~ and ~RULES_FILE~ env vars already added in v0.4.1's NO-HARDCODED-CONSTANTS TODO.
- ~DISPATCHER_SEVERITY_DANGEROUS_THRESHOLD~ and ~DISPATCHER_SEVERITY_MODERATE_THRESHOLD~ in ~.env.example~.
- FiveAM test: approve same pattern 3 times → rule appears in ~rules.org~ → pattern passes through ~dispatcher-check~ without approval.
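The severity rules in this item reduce to a small decision function. The sketch below assumes a two-argument signature for ~dispatcher-severity-allowed-p~ (severity keyword plus approval count) and a hypothetical ~*severity-thresholds*~ plist holding the defaults quoted above; in the real system those numbers would come from ~.env~.

#+begin_src lisp
(defparameter *severity-thresholds* '(:dangerous 5 :moderate 3)
  "Approvals needed before a pattern of the given severity becomes a rule.
Hypothetical variable; defaults taken from the roadmap bullets.")

(defun dispatcher-severity-allowed-p (severity approvals)
  "Return true when an action of SEVERITY may run without HITL approval."
  (ecase severity
    (:catastrophic nil)                  ; always HITL, regardless of count
    (:harmless t)                        ; always allowed
    ((:dangerous :moderate)
     (>= approvals (getf *severity-thresholds* severity)))))
#+end_src

Using ~ecase~ means an unknown severity keyword signals an error instead of silently falling through to "allowed".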
*** TODO Rule visibility — TUI ~/rules~ commands
:PROPERTIES:
:ID: id-v050-rule-visibility
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The user must know what rules the Dispatcher has learned and must be able to undo bad learning. The rules live in ~~/memex/system/rules.org~ (editable in any text editor), but the TUI should provide live access.
- TUI commands:
- ~/rules~ — list all rules sorted by recency (most recent first). Shows pattern, decision (allowed/blocked), severity, approval count, explanation.
- ~/rules blocked~ — show only blocked patterns.
- ~/rules allowed~ — show only allowed patterns.
- ~/rule delete <id>~ — remove a rule (undoes the learning). Deletes the headline from ~rules.org~.
- ~/rule allow <id>~ — flip a blocked rule to allowed (user overrides the learning).
- On rule creation, daemon sends ~:rule-created~ event. TUI adds system message: ~[Rules: 47 → 48] New rule: shell commands targeting ~/memex/projects/* are now allowed. /rule delete rule-48 to undo.~
- Rules are visible in the TUI status bar via the rule counter (already implemented in v0.4.0 gate trace).
- FiveAM test: ~/rules~ returns expected rules; ~/rule delete~ removes a rule and it no longer passes through ~dispatcher-check~.
*** TODO Merkle learning — memory-find-similar, outcome recording
:PROPERTIES:
:ID: id-v050-merkle-learning
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The Merkle tree provides content-addressed storage. Combined with embedding vectors (populated at ingest time since v0.4.0), it can answer "what happened the last 3 times I asked something like this?" This is retrieval-augmented generation from the user's own history — the agent learns what approaches succeeded and failed, not from the LLM's training data but from the user's actual sessions.
- ~memory-find-similar~ in ~core-memory.lisp~: given a vector, return N memory objects with highest cosine similarity. Uses ~memory-object-vector~ (already populated via ~ingest-ast~ → ~embeddings-compute~ since v0.4.0). ~30 lines.
- ~memory-outcome-record~: store an outcome (success/failure plist) against a signal. Keyed by Merkle hash of the signal. ~25 lines.
- ~memory-find-outcomes~: given a signal (current context), find similar past signals and their outcomes. Uses ~memory-find-similar~ on the signal's foveal vector. Returns ranked list of past approaches with success/failure labels. ~40 lines.
- Outcome data feeds into ~context-assemble-global-awareness~: when the foveal node has similar past interactions, include them in the context as "Historical: last 3 times you asked this, approach X succeeded, Y failed."
- FiveAM test: record 3 outcomes for similar signals, verify ~memory-find-outcomes~ returns them ranked by similarity.
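The similarity lookup at the core of this item can be sketched independently of the memory store. Below, candidates are plain ~(id . vector)~ pairs rather than memory objects, and ~find-similar-by-vector~ is a hypothetical stand-in for ~memory-find-similar~; only the cosine ranking is illustrated.

#+begin_src lisp
(defun cosine-similarity (a b)
  "Cosine of the angle between numeric sequences A and B (0 for a zero vector)."
  (let ((dot 0.0d0) (norm-a 0.0d0) (norm-b 0.0d0))
    (map nil (lambda (x y)
               (incf dot (* x y))
               (incf norm-a (* x x))
               (incf norm-b (* y y)))
         a b)
    (if (or (zerop norm-a) (zerop norm-b))
        0.0d0
        (/ dot (sqrt (* norm-a norm-b))))))

(defun find-similar-by-vector (query candidates n)
  "Return up to N (id . score) pairs from CANDIDATES, best match first.
CANDIDATES is a list of (id . vector) pairs."
  (let ((scored (mapcar (lambda (candidate)
                          (cons (car candidate)
                                (cosine-similarity query (cdr candidate))))
                        candidates)))
    (subseq (sort scored #'> :key #'cdr)
            0 (min n (length scored)))))
#+end_src

A linear scan like this is the simplest option and is likely adequate for a personal memory store; an index would only matter at much larger scale.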
*** TODO Merkle learning documentation in Design Decisions
:PROPERTIES:
:ID: id-v050-merkle-docs
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The Merkle tree was designed for integrity, not learning. Its second life as a learning substrate — content-addressed history + vector similarity → retrospective knowledge — deserves architectural documentation explaining the data flow, the similarity gating, and how it feeds the "cheaper over time" thesis.
- New section in ~docs/DESIGN_DECISIONS.org~: "The Merkle Tree as Learning Substrate."
- Explain: Merkle hash → content identity. Memory-object-vector → content similarity. Together → "find what worked last time."
- Include data flow diagram (ASCII art) showing ingest → embed → query → retrieve → inform cycle.
- Distinguish from symbolic induction (v0.5.0): Merkle learning answers "what happened last time?" Symbolic induction answers "can I automate this next time?"
*** TODO Internal evaluation harness — ~deftask~, ~run-eval-suite~
:PROPERTIES:
:ID: id-v050-eval-harness
:CREATED: [2026-05-07 Thu]
:END:
Rationale: Without an evaluation harness, there is no way to know if the agent's capabilities improve or regress across releases. SWE-bench (v0.9.0) measures competitive ranking against other agents. The internal suite measures regression detection — it catches when v0.5.1 breaks something v0.5.0 could do. The suite starts with 10 tasks and grows with the codebase.
- New skill: ~system-evaluation.org~ (~system-evaluation.lisp~).
- ~deftask~ macro: define an eval task with ~:setup~ (create test environment), ~:prompt~ (what to ask the agent), ~:verify~ (function that checks the output), ~:teardown~ (cleanup). Similar to ~defskill~ but for agent capabilities, not code.
- ~run-eval-task~: inject ~:prompt~ as ~:user-input~ signal via ~stimulus-inject~, wait for completion (poll ~*memory-store*~ or signal status), run ~:verify~ on the result, return ~(:passed)~ or ~(:failed :reason ...)~.
- ~run-eval-suite~: run all registered eval tasks, produce score (pass count / total), per-task diagnostics, summary.
- ~eval-score~: return current score as a number. Logged to telemetry.
- Initial 10 tasks covering: find TODOs, create Org note, modify file, search codebase, run shell command (safe), list projects, query memory, find definition, run test, set TODO state.
- Task suite grows with codebase: every bug fix adds a regression task. Every new feature adds a capability task.
- FiveAM test: a task that should pass passes; a task that should fail fails with the expected reason.
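A rough sketch of what the ~deftask~ / ~run-eval-suite~ pair might look like. Everything beyond the names and the four keyword slots is an assumption: tasks are stored as plists in a hypothetical ~*eval-tasks*~ table, and the agent is represented by a plain function argument (~agent-fn~) instead of ~stimulus-inject~ and polling.

#+begin_src lisp
(defvar *eval-tasks* (make-hash-table :test 'equal)
  "Hypothetical registry of eval tasks, keyed by downcased task name.")

(defmacro deftask (name &key setup prompt verify teardown)
  "Register an eval task. SETUP, VERIFY and TEARDOWN are functions or nil."
  `(setf (gethash ,(string-downcase (string name)) *eval-tasks*)
         (list :name ,(string-downcase (string name))
               :setup ,setup :prompt ,prompt
               :verify ,verify :teardown ,teardown)))

(defun run-eval-task (task agent-fn)
  "Run one TASK against AGENT-FN (prompt -> result); return (:passed) or (:failed ...)."
  (let ((setup (getf task :setup))
        (teardown (getf task :teardown)))
    (when setup (funcall setup))
    (unwind-protect
         (let ((result (funcall agent-fn (getf task :prompt))))
           (if (funcall (getf task :verify) result)
               (list :passed)
               (list :failed :reason (format nil "verify rejected ~s" result))))
      ;; Teardown runs even when the agent or the verifier signals an error.
      (when teardown (funcall teardown)))))

(defun run-eval-suite (agent-fn)
  "Run every registered task; return a plist with pass count and total."
  (let ((passed 0) (total 0))
    (maphash (lambda (name task)
               (declare (ignore name))
               (incf total)
               (when (eq (first (run-eval-task task agent-fn)) :passed)
                 (incf passed)))
             *eval-tasks*)
    (list :passed passed :total total)))
#+end_src

Passing the agent in as a function keeps the harness itself testable with a stub, which is how the FiveAM bullet above could be exercised without a live LLM.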
*** TODO Evaluation workflow in AGENTS.md
:PROPERTIES:
:ID: id-v050-eval-agentsmd
:CREATED: [2026-05-07 Thu]
:END:
Rationale: The AGENTS.md "Development Workflow" section describes how to develop code with REPL → TDD → Literate. A parallel "Evaluation Workflow" section should describe how to verify agent capabilities with eval tasks. Together they form the full quality cycle: TDD verifies the code the agent writes, eval verifies the agent itself.
- New section in AGENTS.md: "## Evaluation Workflow (Must Follow)".
- Mirror the Development Workflow structure: define task → prove BLANK (fresh agent fails) → implement capability → prove COMPLETE → track regression.
- Include ~deftask~ example and ~run-eval-suite~ usage.
- Rule: every new cognitive tool or skill MUST include an eval task before shipping.
*** TODO TDD + Eval + Merkle learning integration into ~.env.example~
:PROPERTIES:
:ID: id-v050-env-vars
:CREATED: [2026-05-07 Thu]
:END:
Rationale: All new configurable values from v0.5.0 must be documented in ~.env.example~ per the NO-HARDCODED-CONSTANTS standard (v0.4.1). This task ensures no env var is forgotten.
- Add to ~.env.example~:
- ~DISPATCHER_RULE_THRESHOLD=3~ (if not already added in v0.4.1 cleanup)
- ~RULES_FILE="$HOME/memex/system/rules.org"~
- ~DISPATCHER_SEVERITY_DANGEROUS_THRESHOLD=5~
- ~DISPATCHER_SEVERITY_MODERATE_THRESHOLD=3~
- ~VAULT_MASTER_PASSPHRASE=""~ (empty = prompt on startup, or read from ~/.key file)
- ~EVAL_TASKS_DIR="$HOME/memex/system/eval/"~
- ~EVAL_TIMEOUT=120~ (seconds before a task is considered failed)
- ~TEST_RUNNER_PYTHON="python3 -m pytest"~
- ~TEST_RUNNER_GO="go test -run"~
- ~TEST_RUNNER_RUST="cargo test"~
- Document each with a comment explaining its purpose and default.
*** Competitive Advantage Analysis — v0.5.0 Summary
Token economics is the dimension where the architecture's theoretical advantage becomes operationally real. The foveal-peripheral model and deterministic gates reduce the tokens *needed* per task; prompt caching and incremental assembly reduce the tokens *spent* per task. Combined, the 2-3x coding savings and 13-24x knowledge management savings in the DESIGN_DECISIONS token analysis become achievable rather than aspirational. Symbolic induction extends this downward cost curve into new territory: the agent doesn't just block fewer dangerous actions; it automates away entire categories of LLM calls by learning reusable Lisp functions from successful interaction patterns.
@@ -820,7 +1180,7 @@ v1.0.0 is not a feature release — it is a verification release. Every feature
| Planning | Task tree DAG with terminal states | Multi-step integration tests |
| Tool ecosystem | 15+ MCP tools + native shell + git | MCP protocol compliance tests |
| Context window | Semantic search + foveal-peripheral + caching| Token budget vs competitor audit |
| Safety | 9-vector Dispatcher + policy + permissions | Chaos testing (v0.9.0) |
| Safety | 10-vector Dispatcher + policy + permissions | Chaos testing (v0.9.0) |
| Multi-step tasks | Task trees with terminal states | SWE-bench score (v0.9.0 harness) |
| Code editing | Full file read/write via MCP + Org | SWE-bench-verified subset |
| Memory | Vector recall + Merkle integrity + MVCC | Concurrency stress test (v0.6.1) |


@@ -80,19 +80,18 @@
(reflection-feedback (if rejection-trace
(format nil "~%~%PREVIOUS PROPOSAL REJECTED: ~a" rejection-trace)
""))
(skill-augments (let ((augments ""))
(maphash (lambda (name skill)
(declare (ignore name))
(let ((aug-fn (skill-system-prompt-augment skill)))
(when aug-fn
(let ((aug-text (ignore-errors (funcall aug-fn context))))
(when (and aug-text (stringp aug-text) (> (length aug-text) 0))
(setf augments (concatenate 'string augments aug-text (string #\Newline))))))))
*skill-registry*)
(when (> (length augments) 0) augments)))
(system-prompt (format nil "IDENTITY: ~a~a~%~%TOOLS:~%~a~%~%CONTEXT:~%~a~%~%LOGS:~%~a~%~a"
assistant-name reflection-feedback tool-belt global-context system-logs
(or skill-augments ""))))
(standing-mandates-text (let ((out ""))
(dolist (fn *standing-mandates*)
(let ((text (ignore-errors (funcall fn context))))
(when (and text (stringp text) (> (length text) 0))
(setf out (concatenate 'string out text (string #\Newline))))))
(when (> (length out) 0) out)))
(system-prompt (format nil "IDENTITY: ~a~a~a~%~%TOOLS:~%~a~%~%CONTEXT:~%~a~%~%LOGS:~%~a"
assistant-name reflection-feedback
(if standing-mandates-text
(concatenate 'string (string #\Newline) standing-mandates-text)
"")
tool-belt global-context system-logs)))
(let* ((thought (backend-cascade-call raw-prompt :system-prompt system-prompt :context context))
(cleaned (if (and (listp thought) (getf thought :type))
(format nil "~a" (getf (getf thought :payload) :text))

@@ -14,13 +14,18 @@
(defun VAULT-MASK-STRING (s) (declare (ignore s)) "[MASKED]")
(defvar *VAULT-MEMORY* (make-hash-table :test 'equal))
(defstruct skill name priority dependencies trigger-fn probabilistic-prompt deterministic-fn system-prompt-augment)
(defstruct skill name priority dependencies trigger-fn probabilistic-prompt deterministic-fn)
(defvar *skill-registry* (make-hash-table :test 'equal))
(defvar *skill-catalog* (make-hash-table :test 'equal)
"Tracks all discovered skill files and their loading state.")
(defvar *standing-mandates* nil
"List of functions (context) → string-or-nil. Each is called on every think() cycle.
When non-nil, the returned string is injected into the IDENTITY section of the system prompt.
Unlike skills (which activate on triggers), standing mandates are always consulted.")
(defstruct skill-entry filename (status :discovered) error-log (load-time 0))
;; Alias: find-triggered-skill → skill-triggered-find
@@ -38,7 +43,7 @@
*skill-registry*)
(first (sort triggered #'> :key #'skill-priority))))
(defmacro defskill (name &key priority dependencies trigger probabilistic deterministic system-prompt-augment)
(defmacro defskill (name &key priority dependencies trigger probabilistic deterministic)
"Registers a new skill. NAME is a keyword. TRIGGER is a function (context) → bool."
`(setf (gethash (string-downcase (string ,name)) *skill-registry*)
(make-skill :name (string-downcase (string ,name))
@@ -46,8 +51,7 @@
:dependencies ',dependencies
:trigger-fn ,trigger
:probabilistic-prompt ,probabilistic
:deterministic-fn ,deterministic
:system-prompt-augment ,system-prompt-augment)))
:deterministic-fn ,deterministic)))
(defun skill-dependencies-resolve (skill-name)
"Resolves transitive dependencies. Returns list of skill names in dependency order."

@@ -144,8 +144,10 @@ writes the result back through the reply-stream."
(defskill :passepartout-programming-repl
:priority 200
:trigger (lambda (ctx) (declare (ignore ctx)) nil)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil)
:system-prompt-augment #'repl-mandate)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil))
(eval-when (:load-toplevel :execute)
(push #'repl-mandate *standing-mandates*))
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

lisp/programming-tools.lisp (new file, 420 lines)

@@ -0,0 +1,420 @@
(in-package :passepartout)
(defun tools-write-file (filepath content)
"Write string CONTENT to FILEPATH, creating parent directories."
(uiop:ensure-all-directories-exist (list filepath))
(with-open-file (stream filepath :direction :output :if-exists :supersede :if-does-not-exist :create)
(write-string content stream)))
(def-cognitive-tool search-files
"Search file contents under a directory for a regex pattern."
((:name "pattern" :description "The regex pattern to search for." :type "string")
(:name "path" :description "Directory to search recursively." :type "string")
(:name "include" :description "Optional glob filter for filenames (e.g. \"*.lisp\")." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((pattern (getf args :pattern))
(path (getf args :path))
(include (getf args :include))
(results nil))
(unless (and pattern path)
(return (list :status :error :message "search-files requires :pattern and :path")))
(handler-case
(dolist (file (directory (merge-pathnames
(if include
(make-pathname :name :wild :type (subseq include 2) :defaults path)
(make-pathname :name :wild :type :wild :defaults path))
path)))
(let ((base (file-namestring file)))
(with-open-file (stream file :direction :input :if-does-not-exist nil)
(when stream
(loop for line = (read-line stream nil nil)
for line-num from 1
while line
when (cl-ppcre:scan pattern line)
do (push (format nil "~a:~d: ~a" base line-num (string-trim '(#\Space #\Tab) line))
results))))))
(t (c) (return (list :status :error :message (format nil "~a" c)))))
(list :status :success
:content (if results
(format nil "~d matches:~%~a" (length results)
(format nil "~{~a~^~%~}" (reverse results)))
(format nil "No matches for '~a' in ~a" pattern path)))))))
(def-cognitive-tool find-files
"Find files matching a glob pattern under a directory."
((:name "pattern" :description "Glob pattern (e.g. \"*.lisp\", \"core-*\")." :type "string")
(:name "path" :description "Directory to search in." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((pattern (getf args :pattern))
(path (getf args :path)))
(unless (and pattern path)
(return (list :status :error :message "find-files requires :pattern and :path")))
(let ((full (merge-pathnames pattern path)))
(handler-case
(let ((files (directory full)))
(list :status :success
:content (if files
(format nil "~d files:~%~{~a~^~%~}" (length files) files)
(format nil "No files matching '~a' in ~a" pattern path))))
(t (c) (list :status :error :message (format nil "~a" c)))))))))
(def-cognitive-tool read-file
"Read the contents of a file."
((:name "filepath" :description "Path to the file to read." :type "string")
(:name "start" :description "Optional: line number to start reading from (1-based)." :type "integer")
(:name "limit" :description "Optional: maximum number of lines to read." :type "integer"))
:guard (lambda (args) (declare (ignore args)) nil)
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(start (getf args :start))
(limit (getf args :limit)))
(unless filepath
(return (list :status :error :message "read-file requires :filepath")))
(handler-case
(let ((content (uiop:read-file-string filepath)))
(if (or start limit)
(let* ((lines (uiop:split-string content :separator '(#\Newline)))
(start-idx (max 0 (1- (or start 1))))
(end (if limit (min (length lines) (+ start-idx limit)) (length lines)))
(selected (subseq lines start-idx end)))
(list :status :success
:content (format nil "~{~a~^~%~}" selected)))
(list :status :success :content content)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool write-file
"Write string content to a file. Creates directories as needed."
((:name "filepath" :description "Path to the file to write." :type "string")
(:name "content" :description "The text content to write." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(content (getf args :content)))
(unless (and filepath content)
(return (list :status :error :message "write-file requires :filepath and :content")))
(handler-case
(progn
(tools-write-file filepath content)
(list :status :success
:content (format nil "Written ~d bytes to ~a" (length content) filepath)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool list-directory
"List the contents of a directory."
((:name "path" :description "Directory path to list." :type "string")
(:name "pattern" :description "Optional glob filter (e.g. \"*.org\")." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((path (getf args :path))
(pattern (getf args :pattern)))
(unless path
(return (list :status :error :message "list-directory requires :path")))
(let ((full-pattern (if pattern
(merge-pathnames pattern path)
(make-pathname :name :wild :type :wild :defaults path))))
(handler-case
(let ((entries (directory full-pattern)))
(list :status :success
:content (if entries
(format nil "~d entries in ~a:~%~{~a~^~%~}" (length entries) path entries)
(format nil "No entries in ~a" path))))
(t (c) (list :status :error :message (format nil "~a" c)))))))))
(def-cognitive-tool run-shell
"Execute a shell command and return stdout, stderr, and exit code."
((:name "cmd" :description "The shell command to execute." :type "string")
(:name "timeout" :description "Optional timeout in seconds (default 30)." :type "integer"))
:guard nil
:body (lambda (args)
(block nil
(let* ((cmd (getf args :cmd))
(timeout (or (getf args :timeout) 30)))
(unless cmd
(return (list :status :error :message "run-shell requires :cmd")))
(handler-case
(multiple-value-bind (out err code)
(uiop:run-program (list "timeout" (format nil "~a" timeout) "bash" "-c" cmd)
:output :string :error-output :string
:ignore-error-status t)
(list :status :success
:content (format nil "~a~@[~%~%stderr:~%~a~]~%exit: ~d"
(or out "") (when (and err (> (length err) 0)) err) code)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool eval-form
"Evaluate a Lisp expression in the running image and return the result."
((:name "code" :description "The Lisp expression to evaluate as a string." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((code (getf args :code)))
(unless code
(return (list :status :error :message "eval-form requires :code")))
(handler-case
(let* ((*read-eval* nil)
(form (read-from-string code))
(result (eval form)))
(list :status :success :content (format nil "~a" result)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool run-tests
"Run FiveAM tests. With no arguments, runs all test suites."
((:name "test-name" :description "Optional: specific test name to run. If nil, runs all tests." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((test-name (getf args :test-name)))
(handler-case
(if test-name
(let* ((sym (find-symbol (string-upcase test-name) :passepartout))
(result (when sym (fiveam:run sym))))
(list :status :success
:content (format nil "Test '~a' ~a" test-name
(if result "completed" "not found"))))
(let ((result (fiveam:run-all-tests)))
(list :status :success :content (format nil "~a" result))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool org-find-headline
"Find an Org headline by ID or title in the memory store."
((:name "id" :description "Optional: Org ID property to search for." :type "string")
(:name "title" :description "Optional: headline title to search for (case-insensitive substring)." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((id (getf args :id))
(title (getf args :title))
(results nil))
(unless (or id title)
(return (list :status :error :message "org-find-headline requires :id or :title")))
(handler-case
(let ((is-mem (find-symbol "MEMORY-OBJECT-P" :passepartout))
(get-id (find-symbol "MEMORY-OBJECT-ID" :passepartout))
(get-title (find-symbol "MEMORY-OBJECT-TITLE" :passepartout)))
(unless (and is-mem get-id get-title)
(return (list :status :error :message "Memory store not loaded")))
(maphash (lambda (k obj)
(declare (ignore k))
(when (and (funcall is-mem obj)
(or (and id (string-equal id (funcall get-id obj)))
(and title (search title (funcall get-title obj) :test #'char-equal))))
(push obj results)))
*memory-store*)
(list :status :success
:content (if results
(format nil "~d headlines found:~%~{~a~^~%~}"
(length results)
(mapcar (lambda (r) (funcall get-title r)) results))
(format nil "No headlines matching ~a" (or id title)))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(def-cognitive-tool org-modify-file
"Replace text in an Org file via exact string match. Returns error if old-text not found."
((:name "filepath" :description "Path to the Org file." :type "string")
(:name "old-text" :description "Exact text to replace." :type "string")
(:name "new-text" :description "Text to insert in its place." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(old-text (getf args :old-text))
(new-text (getf args :new-text)))
(unless (and filepath old-text new-text)
(return (list :status :error :message "org-modify-file requires :filepath, :old-text, and :new-text")))
(handler-case
(let ((content (uiop:read-file-string filepath)))
(let ((pos (search old-text content)))
(if pos
(let ((new-content (concatenate 'string
(subseq content 0 pos)
new-text
(subseq content (+ pos (length old-text))))))
(tools-write-file filepath new-content)
(list :status :success
:content (format nil "Replaced at position ~d in ~a" pos filepath)))
(list :status :error :message (format nil "Text not found in ~a" filepath)))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
(defskill :passepartout-programming-tools
:priority 50
:trigger (lambda (ctx) (declare (ignore ctx)) nil)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil))
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(defpackage :passepartout-programming-tools-tests
(:use :cl :fiveam :passepartout)
(:export #:programming-tools-suite))
(in-package :passepartout-programming-tools-tests)
(def-suite programming-tools-suite :description "Verification of programming cognitive tools")
(in-suite programming-tools-suite)
(defun tools-tmpdir ()
(let ((d (merge-pathnames "tmp/passepartout-tool-tests/" (user-homedir-pathname))))
(uiop:ensure-all-directories-exist (list d))
d))
(defun tools-cleanup ()
(let ((d (tools-tmpdir)))
(uiop:delete-directory-tree d :validate t :if-does-not-exist :ignore)))
(defun tools-write-file (filepath content)
(uiop:ensure-all-directories-exist (list filepath))
(with-open-file (stream filepath :direction :output :if-exists :supersede :if-does-not-exist :create)
(write-string content stream)))
(defun call-tool (tool-name &rest args)
(let ((tool (gethash (string-downcase (string tool-name)) *cognitive-tool-registry*)))
(unless tool (error "Tool ~a not found" tool-name))
(funcall (cognitive-tool-body tool) args)))
;; search-files
(test test-search-files-finds-matches
"Contract 1: search-files finds lines matching a regex pattern."
(let* ((dir (tools-tmpdir))
(file-a (merge-pathnames "src-a.lisp" dir))
(file-b (merge-pathnames "src-b.lisp" dir)))
(tools-write-file file-a "(defun foo () 'hello)")
(tools-write-file file-b "(defun bar () 'world)")
(let ((result (call-tool 'search-files :pattern "defun" :path (namestring dir) :include "*.lisp")))
(is (eq (getf result :status) :success))
(is (search "src-a.lisp:1:" (getf result :content)))
(is (search "src-b.lisp:1:" (getf result :content))))
(tools-cleanup)))
(test test-search-files-missing-params
"search-files returns error when required params are missing."
(let ((result (call-tool 'search-files :pattern "x")))
(is (eq (getf result :status) :error))))
;; find-files
(test test-find-files-by-extension
"Contract 5: find-files returns files matching a glob."
(let ((dir (tools-tmpdir)))
(tools-write-file (merge-pathnames "a.lisp" dir) "test")
(tools-write-file (merge-pathnames "b.lisp" dir) "test")
(tools-write-file (merge-pathnames "c.org" dir) "test")
(let ((result (call-tool 'find-files :pattern "*.lisp" :path (namestring dir))))
(is (eq (getf result :status) :success))
(is (search "a.lisp" (getf result :content)))
(is (search "b.lisp" (getf result :content)))
(is (not (search "c.org" (getf result :content)))))
(tools-cleanup)))
(test test-find-files-missing-params
"find-files returns error without required params."
(let ((result (call-tool 'find-files :pattern "*.lisp")))
(is (eq (getf result :status) :error))))
;; read-file
(test test-read-file-full
"Contract 6: read-file returns full file contents."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "readme.txt" dir)))
(tools-write-file file (format nil "line one~%line two~%line three"))
(let ((result (call-tool 'read-file :filepath (namestring file))))
(is (eq (getf result :status) :success))
(is (search "line one" (getf result :content))))
(tools-cleanup)))
(test test-read-file-missing-params
"read-file returns error without :filepath."
(let ((result (call-tool 'read-file)))
(is (eq (getf result :status) :error))))
;; write-file
(test test-write-file-creates
"Contract 7: write-file creates file with content."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "output.txt" dir)))
(let ((result (call-tool 'write-file :filepath (namestring file) :content "hello world")))
(is (eq (getf result :status) :success))
(is (search "11 bytes" (getf result :content))))
(is (string-equal "hello world" (uiop:read-file-string file)))
(tools-cleanup)))
(test test-write-file-missing-params
"write-file returns error without required params."
(let ((result (call-tool 'write-file :content "x")))
(is (eq (getf result :status) :error))))
;; list-directory
(test test-list-directory-all
"Contract 8: list-directory returns all entries."
(let ((dir (tools-tmpdir)))
(tools-write-file (merge-pathnames "alpha.txt" dir) "x")
(tools-write-file (merge-pathnames "beta.txt" dir) "y")
(let ((result (call-tool 'list-directory :path (namestring dir))))
(is (eq (getf result :status) :success))
(is (search "alpha.txt" (getf result :content)))
(is (search "beta.txt" (getf result :content))))
(tools-cleanup)))
(test test-list-directory-missing-params
"list-directory returns error without :path."
(let ((result (call-tool 'list-directory)))
(is (eq (getf result :status) :error))))
;; run-shell
(test test-run-shell-echo
"Contract 9: run-shell executes a command and returns output."
(let ((result (call-tool 'run-shell :cmd "echo hello")))
(is (eq (getf result :status) :success))
(is (search "hello" (getf result :content)))))
(test test-run-shell-missing-params
"run-shell returns error without :cmd."
(let ((result (call-tool 'run-shell)))
(is (eq (getf result :status) :error))))
;; eval-form
(test test-eval-form-arithmetic
"Contract 10: eval-form evaluates a Lisp expression."
(let ((result (call-tool 'eval-form :code "(+ 1 2)")))
(is (eq (getf result :status) :success))
(is (search "3" (getf result :content)))))
(test test-eval-form-missing-params
"eval-form returns error without :code."
(let ((result (call-tool 'eval-form)))
(is (eq (getf result :status) :error))))
;; org-modify-file
(test test-org-modify-file-replace
"Contract 13: org-modify-file replaces exact text in file."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "doc.org" dir)))
(tools-write-file file (format nil "* TODO Buy milk~%* DONE Walk dog~%"))
(let ((result (call-tool 'org-modify-file
:filepath (namestring file)
:old-text "TODO" :new-text "WAITING")))
(is (eq (getf result :status) :success))
(is (search "WAITING" (uiop:read-file-string file))))
(tools-cleanup)))
(test test-org-modify-file-not-found
"org-modify-file returns error when text not in file."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "file.org" dir)))
(tools-write-file file "some content")
(let ((result (call-tool 'org-modify-file
:filepath (namestring file)
:old-text "not-in-file" :new-text "anything")))
(is (eq (getf result :status) :error))
(is (search "not found" (getf result :message))))
(tools-cleanup)))
(test test-org-modify-file-missing-params
"org-modify-file returns error without required params."
(let ((result (call-tool 'org-modify-file :filepath "x" :old-text "y")))
(is (eq (getf result :status) :error))))

@@ -193,8 +193,9 @@ Returns a list of matched pattern names or nil if safe."
(defun dispatcher-check (action context)
"Security gate for high-risk actions.
Vectors: lisp validation, secret path, secret content, vault secrets,
privacy tags, privacy text, shell safety, network exfil, high-impact approval."
Eleven checks: 0=REPL-lint (warn-only), 1=lisp-validation, 2=secret-path,
2b=self-build-core, 3=secret-content, 4=vault-secrets, 5=privacy-tags,
6=privacy-text, 7=shell-safety, 8=network-exfil, 8b=high-impact-approval."
(declare (ignore context))
(let* ((target (proto-get action :target))
(payload (proto-get action :payload))

@@ -203,7 +203,7 @@ The first message sent to every new connection. The client can use this to verif
Validates that an incoming message has the minimum required structure: a plist with a valid ~:type~ field. Used by the protocol validator skill to reject malformed messages before they enter the cognitive loop.
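The body of ~protocol-schema-validate~ is cut off by this hunk, so the following is only a sketch of what such a structural check could look like, not the project's implementation. The names ~*demo-valid-types*~, ~plist-shaped-p~, and ~demo-schema-validate~, and the set of accepted types, are all hypothetical.

```lisp
;; Hypothetical set of accepted message types (an assumption for this sketch).
(defparameter *demo-valid-types* '(:request :response :event))

(defun plist-shaped-p (x)
  "True when X is a proper list of even length whose keys are all keywords."
  (and (listp x)
       ;; LENGTH signals an error on dotted lists; treat that as 'not a plist'.
       (handler-case (evenp (length x))
         (error () nil))
       (loop for (k) on x by #'cddr
             always (keywordp k))))

(defun demo-schema-validate (msg)
  "Return T when MSG is a plist whose :type is one of *DEMO-VALID-TYPES*."
  (and (plist-shaped-p msg)
       (member (getf msg :type) *demo-valid-types*)
       t))
```

Rejecting anything that is not a well-formed plist before looking at ~:type~ keeps the later ~getf~ call from signalling on odd-length input.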
#+begin_src lisp :tangle ../lisp/core-communication.lisp
#+begin_src lisp
(in-package :passepartout)
(defun protocol-schema-validate (msg)
@@ -258,7 +258,7 @@ Use this function to manually verify that the daemon is alive and the framing pr
* Test Suite
Verifies that the framing protocol correctly serializes and deserializes messages.
#+begin_src lisp :tangle ../lisp/core-communication.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -306,7 +306,7 @@ to ~context-awareness-assemble~.
* Test Suite
Verifies that the Foveal-Peripheral rendering correctly distinguishes between foveal (detailed) and peripheral (outline) content, and that the awareness budget includes all active projects.
#+begin_src lisp :tangle ../lisp/core-context.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -22,7 +22,7 @@ The implementation section includes:
** Package Definition and Export List
The package definition. All public symbols are exported here.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defpackage :passepartout
(:use :cl)
(:export
@@ -195,7 +195,7 @@ The package implementation section defines the low-level utilities and global st
*** Robust plist access (plist-get)
Retrieves a value from a plist, checking both upper and lowercase keyword variants. This is needed because different components use different keyword conventions.
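The definition of ~plist-get~ is truncated in this hunk. As an illustration only, a lookup that tolerates both keyword casings might be sketched as follows; ~plist-get-sketch~ is a hypothetical name and the fallback order (uppercase first) is an assumption. Note that the reader already upcases ~:type~ to ~:TYPE~, so the lowercase variant here means a keyword such as ~:|type|~, which typically comes from a case-preserving decoder.

```lisp
(defun plist-get-sketch (plist key)
  "Look up KEY in PLIST, trying the uppercase keyword first, then the lowercase one."
  (let* ((name    (symbol-name key))
         (upper   (intern (string-upcase name) :keyword))
         (lower   (intern (string-downcase name) :keyword))
         ;; Unique marker so an explicitly stored NIL is not mistaken for 'absent'.
         (missing '#:missing)
         (value   (getf plist upper missing)))
    (if (eq value missing)
        (getf plist lower)
        value)))
```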
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(in-package :passepartout)
(defun plist-get (plist key)
@@ -208,7 +208,7 @@ Retrieves a value from a plist, checking both upper and lowercase keyword varian
*** Logging state
The harness maintains a bounded ring buffer of log messages for inclusion in LLM context. Access is thread-safe via a lock.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defvar *log-buffer* nil)
(defvar *log-lock* (bordeaux-threads:make-lock "log-messages-lock"))
(defvar *log-limit* 100)
@@ -216,14 +216,14 @@ The harness maintains a bounded ring buffer of log messages for inclusion in LLM
*** Skill registry
The global registry of all loaded skills. This is the authoritative list that the deterministic engine iterates.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defvar *skill-registry* (make-hash-table :test 'equal)
"Global registry of all loaded skills.")
#+end_src
*** Skill telemetry
Tracks execution metrics per skill (count, duration, failures) for diagnostics and performance analysis.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defvar *telemetry-table* (make-hash-table :test 'equal))
(defvar *telemetry-lock* (bordeaux-threads:make-lock "harness-telemetry-lock"))
@@ -240,11 +240,11 @@ Tracks execution metrics per skill (count, duration, failures) for diagnostics a
*** Cognitive tool registry
Tools that the LLM can invoke are registered here. Each tool has a name, description, parameters, optional guard, and implementation body. The ~def-cognitive-tool~ macro handles registration. ~cognitive-tool-prompt~ serialises the registry into the LLM's system prompt.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defvar *cognitive-tool-registry* (make-hash-table :test 'equal))
#+end_src
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defstruct cognitive-tool
name
description
@@ -253,7 +253,7 @@ Tools that the LLM can invoke are registered here. Each tool has a name, descrip
body)
#+end_src
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defmacro def-cognitive-tool (name description parameters &key guard body)
"Registers a cognitive tool. PARAMETERS is a list of plists, one per parameter."
`(setf (gethash (string-downcase (string ',name)) *cognitive-tool-registry*)
@@ -264,7 +264,7 @@ Tools that the LLM can invoke are registered here. Each tool has a name, descrip
:body ,body)))
#+end_src
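To make the registration flow concrete, here is a self-contained sketch that mirrors the shape of ~def-cognitive-tool~ with a stand-in registry. Everything prefixed ~demo-~, plus ~count-words~ and the ~word-count~ tool, is hypothetical and exists only so the example runs on its own; the real struct and macro may differ in the parts this diff elides.

```lisp
(defvar *demo-tool-registry* (make-hash-table :test 'equal))

(defstruct demo-tool name description parameters guard body)

(defmacro def-demo-tool (name description parameters &key guard body)
  "Register a tool under the lowercased NAME. PARAMETERS is a list of plists."
  `(setf (gethash (string-downcase (string ',name)) *demo-tool-registry*)
         (make-demo-tool :name (string-downcase (string ',name))
                         :description ,description
                         :parameters ',parameters
                         :guard ,guard
                         :body ,body)))

(defun count-words (text)
  "Count runs of non-whitespace characters in TEXT."
  (let ((count 0) (in-word nil))
    (loop for ch across text
          do (if (member ch '(#\Space #\Tab #\Newline))
                 (setf in-word nil)
                 (unless in-word
                   (incf count)
                   (setf in-word t))))
    count))

(def-demo-tool word-count
  "Count whitespace-separated words in a string."
  ((:name "text" :description "The text to count." :type "string"))
  :guard nil
  :body (lambda (args)
          (let ((text (getf args :text)))
            (if (stringp text)
                (list :status :success :content (count-words text))
                (list :status :error :message "word-count requires :text")))))

(defun call-demo-tool (name &rest args)
  "Look up NAME in the demo registry and call its body with ARGS as a plist."
  (let ((tool (gethash (string-downcase (string name)) *demo-tool-registry*)))
    (unless tool (error "Tool ~a not found" name))
    (funcall (demo-tool-body tool) args)))
```

A call such as ~(call-demo-tool 'word-count :text "a b c")~ returns a status plist in the same ~:status~ / ~:content~ / ~:message~ convention the programming tools in this commit use.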
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defun cognitive-tool-prompt ()
"Serialises all registered tools into a prompt string for the LLM."
(let ((descriptions nil))
@@ -287,7 +287,7 @@ Tools that the LLM can invoke are registered here. Each tool has a name, descrip
*** Centralized logging (log-message)
Thread-safe logging function that writes to both the ring buffer (for LLM context) and stdout (for the user). Bounded by ~*log-limit*~.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(defun log-message (msg &rest args)
"Centralized, thread-safe logging for the harness."
(let ((formatted-msg (apply #'format nil msg args)))
@@ -301,7 +301,7 @@ Thread-safe logging function that writes to both the ring buffer (for LLM contex
*** Debugger hook
Friendly error handler that replaces the raw SBCL debugger with a diagnostic message. This prevents the agent from entering the debugger on unhandled conditions.
#+begin_src lisp :tangle ../lisp/core-defpackage.lisp
#+begin_src lisp
(setf *debugger-hook* (lambda (condition hook)
"Friendly error handler - shows diagnostic message instead of raw debugger."
(declare (ignore hook))

@@ -300,7 +300,7 @@ uses the old name can call this alias. New code should call
* Test Suite
Verifies that the act gate correctly processes an approved action and sets the signal status to ~:acted~.
#+begin_src lisp :tangle ../lisp/core-loop-act.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -247,7 +247,7 @@ uses the old name can call this alias. New code should call
* Test Suite
Verifies that the perceive gate correctly ingests AST nodes into memory and that the depth limiter prevents runaway recursive signals.
#+begin_src lisp :tangle ../lisp/core-loop-perceive.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -195,9 +195,9 @@ This is the main entry point for the probabilistic engine. Every cognitive cycle
The function handles several cases:
- If a triggered skill provides a probabilistic prompt generator, that replaces the raw user input
- If the previous proposal was rejected, the rejection trace is injected into the LLM's context so it can self-correct
- Skills can augment the system prompt with domain-specific mandates via the ~system-prompt-augment~ mechanism
- Standing mandates from ~*standing-mandates*~ are injected into the IDENTITY section of the system prompt
The system prompt assembly order — identity, tools, context, logs, mandates — is intentional: the most dynamic content (mandates from skills) comes last so it has the most influence on the LLM's output.
The system prompt assembly order — identity (including mandates), tools, context, logs — is intentional: standing mandates appear early in IDENTITY so they set the behavioral frame before the model processes tools, context, and logs.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
@@ -216,19 +216,18 @@ The system prompt assembly order — identity, tools, context, logs, mandates
(reflection-feedback (if rejection-trace
(format nil "~%~%PREVIOUS PROPOSAL REJECTED: ~a" rejection-trace)
""))
(skill-augments (let ((augments ""))
(maphash (lambda (name skill)
(declare (ignore name))
(let ((aug-fn (skill-system-prompt-augment skill)))
(when aug-fn
(let ((aug-text (ignore-errors (funcall aug-fn context))))
(when (and aug-text (stringp aug-text) (> (length aug-text) 0))
(setf augments (concatenate 'string augments aug-text (string #\Newline))))))))
*skill-registry*)
(when (> (length augments) 0) augments)))
(system-prompt (format nil "IDENTITY: ~a~a~%~%TOOLS:~%~a~%~%CONTEXT:~%~a~%~%LOGS:~%~a~%~a"
assistant-name reflection-feedback tool-belt global-context system-logs
(or skill-augments ""))))
(standing-mandates-text (let ((out ""))
(dolist (fn *standing-mandates*)
(let ((text (ignore-errors (funcall fn context))))
(when (and text (stringp text) (> (length text) 0))
(setf out (concatenate 'string out text (string #\Newline))))))
(when (> (length out) 0) out)))
(system-prompt (format nil "IDENTITY: ~a~a~a~%~%TOOLS:~%~a~%~%CONTEXT:~%~a~%~%LOGS:~%~a"
assistant-name reflection-feedback
(if standing-mandates-text
(concatenate 'string (string #\Newline) standing-mandates-text)
"")
tool-belt global-context system-logs)))
(let* ((thought (backend-cascade-call raw-prompt :system-prompt system-prompt :context context))
(cleaned (if (and (listp thought) (getf thought :type))
(format nil "~a" (getf (getf thought :payload) :text))
@@ -375,7 +374,7 @@ uses the old name can call this alias. New code should call
* Test Suite
Verifies that the deterministic engine correctly rejects unsafe actions (like ~rm -rf /~) while allowing safe ones.
#+begin_src lisp :tangle ../lisp/core-loop-reason.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -306,7 +306,7 @@ Boot sequence:
* Test Suite
Verifies that the immune system (error handling) correctly catches and reports errors from the cognitive pipeline.
#+begin_src lisp :tangle ../lisp/core-loop.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -363,7 +363,7 @@ Restores memory state from a previously saved snapshot file. Called during boot
* Test Suite
Verifies that the Merkle hash is deterministic and consistent across independent AST ingestions.
#+begin_src lisp :tangle ../lisp/core-memory.lisp
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))

@@ -72,10 +72,10 @@ Simple mask function and the vault memory hash table. Used by the Security Dispa
** Skill data structures
The ~skill~ struct holds all metadata about a loaded skill: its name, priority, dependencies, trigger function, probabilistic prompt generator, deterministic gate, and system prompt augmentor. The ~skill-entry~ struct tracks the loading state of each discovered skill file.
The ~skill~ struct holds all metadata about a loaded skill: its name, priority, dependencies, trigger function, probabilistic prompt generator, and deterministic gate. The ~skill-entry~ struct tracks the loading state of each discovered skill file.
#+begin_src lisp
(defstruct skill name priority dependencies trigger-fn probabilistic-prompt deterministic-fn system-prompt-augment)
(defstruct skill name priority dependencies trigger-fn probabilistic-prompt deterministic-fn)
#+end_src
#+begin_src lisp
@@ -87,6 +87,13 @@ The ~skill~ struct holds all metadata about a loaded skill: its name, priority,
"Tracks all discovered skill files and their loading state.")
#+end_src
#+begin_src lisp
(defvar *standing-mandates* nil
"List of functions (context) → string-or-nil. Each is called on every think() cycle.
When non-nil, the returned string is injected into the IDENTITY section of the system prompt.
Unlike skills (which activate on triggers), standing mandates are always consulted.")
#+end_src
#+begin_src lisp
(defstruct skill-entry filename (status :discovered) error-log (load-time 0))
#+end_src
@@ -114,14 +121,22 @@ This is how the system determines which skill "owns" the current user input. For
(first (sort triggered #'> :key #'skill-priority))))
#+end_src
** Standing Mandates
Standing mandates are cross-cutting instructions injected into every LLM system prompt. They live in ~*standing-mandates*~, a list of functions ~(context) → string-or-nil~. Each is called on every reasoning cycle; nil results are skipped.
This is the mechanism for always-on behavioral instructions. A skill activates only when its registered trigger function returns true for the current context; standing mandates run on every cycle and decide for themselves whether to contribute text. Use ~push~ to register:
#+begin_example
(push #'my-mandate *standing-mandates*)
#+end_example
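As a rough illustration of the calling convention, the sketch below defines one mandate and a collector that skips nil results. It assumes the context is a plist with an ~:input~ string, which may not match the real context shape; ~*demo-mandates*~, ~demo-lisp-mandate~, and ~demo-collect-mandates~ are hypothetical names used so the block runs on its own:
#+begin_src lisp :tangle no
;; Stand-in for *standing-mandates* so this sketch is self-contained.
(defvar *demo-mandates* nil)

;; Assumption: CONTEXT is a plist carrying the user input under :input.
(defun demo-lisp-mandate (context)
  "Return mandate text when the input mentions defun, else NIL."
  (let ((input (getf context :input)))
    (when (and (stringp input)
               (search "defun" input :test #'char-equal))
      "Prototype new functions in the REPL before writing them to disk.")))

;; Registering the symbol (not the function object) lets PUSHNEW
;; avoid duplicate entries when the file is reloaded.
(pushnew 'demo-lisp-mandate *demo-mandates*)

(defun demo-collect-mandates (context)
  "Call every mandate with CONTEXT and keep only the non-NIL strings."
  (remove nil (mapcar (lambda (fn) (funcall fn context))
                      *demo-mandates*)))
#+end_src
With this setup, an input containing "defun" yields one mandate string and any other input yields an empty list.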
** Skill registration macro (defskill)
The primary API for skills. Each skill file calls this once to register itself. The macro creates a ~skill~ struct and stores it in ~*skill-registry*~ keyed by the skill's name.
The ~:system-prompt-augment~ slot is optional. If provided, it's a function that receives the context and returns a string to append to the LLM's system prompt. This allows skills to inject domain-specific instructions into every reasoning cycle.
#+begin_src lisp
(defmacro defskill (name &key priority dependencies trigger probabilistic deterministic system-prompt-augment)
(defmacro defskill (name &key priority dependencies trigger probabilistic deterministic)
"Registers a new skill. NAME is a keyword. TRIGGER is a function (context) → bool."
`(setf (gethash (string-downcase (string ,name)) *skill-registry*)
(make-skill :name (string-downcase (string ,name))
@@ -129,8 +144,7 @@ The ~:system-prompt-augment~ slot is optional. If provided, it's a function that
:dependencies ',dependencies
:trigger-fn ,trigger
:probabilistic-prompt ,probabilistic
:deterministic-fn ,deterministic
:system-prompt-augment ,system-prompt-augment)))
:deterministic-fn ,deterministic)))
#+end_src
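The key normalization in the macro can be seen in a reduced, standalone form. This is a sketch under assumptions: ~demo-skill~, ~*demo-registry*~, and ~def-demo-skill~ are hypothetical miniatures, and the ~equal~ hash-table test is assumed because the keys are strings:
#+begin_src lisp :tangle no
;; Hypothetical miniature of the registry, so the sketch runs on its own.
(defstruct demo-skill name priority trigger-fn)

(defvar *demo-registry* (make-hash-table :test 'equal)
  "Keyed by lowercase name strings, so the table needs an EQUAL test.")

(defmacro def-demo-skill (name &key priority trigger)
  "Register a demo skill under the lowercased string form of NAME."
  `(setf (gethash (string-downcase (string ,name)) *demo-registry*)
         (make-demo-skill :name (string-downcase (string ,name))
                          :priority ,priority
                          :trigger-fn ,trigger)))

;; The keyword :demo-echo is stored under the string "demo-echo".
(def-demo-skill :demo-echo
  :priority 10
  :trigger (lambda (ctx) (declare (ignore ctx)) t))
#+end_src
Looking up ~"demo-echo"~ in ~*demo-registry*~ then returns the struct, regardless of how the keyword was cased in the source file.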
** Dependency resolution (skill-dependencies-resolve)

@@ -242,7 +242,10 @@ writes the result back through the reply-stream."
* Phase E: Lifecycle
The REPL skill loads at priority 200 (after diagnostics at 100, before utils-lisp at 400).
** System Prompt Augment (repl-mandate)
** Standing Mandate (repl-mandate)
The REPL-first mandate is registered as a standing mandate — it runs on every ~think()~ cycle, inspecting the user input for code-related keywords. When it matches, the mandate text is injected into the IDENTITY section of the system prompt.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun repl-mandate (context)
@@ -265,8 +268,12 @@ The REPL skill loads at priority 200 (after diagnostics at 100, before utils-lis
(defskill :passepartout-programming-repl
:priority 200
:trigger (lambda (ctx) (declare (ignore ctx)) nil)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil)
:system-prompt-augment #'repl-mandate)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil))
#+end_src
#+begin_src lisp
(eval-when (:load-toplevel :execute)
(push #'repl-mandate *standing-mandates*))
#+end_src
* Test Suite

@@ -78,6 +78,17 @@ The Diagnostics skill is the self-knowledge of Passepartout. It answers
3. Every test in ~* Test Suite~ MUST reference a specific Contract item.
4. If you change a function's signature, you MUST update its Contract item.
5. These files are excluded (no defuns): ~core-manifest.org~, ~setup.org~.
6. **NO-HARDCODED-CONSTANTS**: All configurable values (thresholds, intervals,
paths, limits, counters) MUST be read from environment variables with a
documented default in ~.env.example~. No magic numbers, no hardcoded
string literals in function bodies for any value a user might need to
change. The user owns their configuration — they change it in ~.env~, not
in the source code. Exceptions: internal implementation details that are
never user-facing (hash-table sizes, buffer capacity limits, loop
iteration caps) may live in source. But if the value controls *behavior*
(how many approvals before a rule, what similarity threshold gates
context, how long a shell command runs before timeout), it lives
in ~.env~ with a fallback default.
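One way rule 6 might look in practice is sketched below. It assumes the values from ~.env~ have already been exported into the process environment (how that happens is not shown here); ~demo-env-integer~ and ~demo-rule-threshold~ are hypothetical helper names:
#+begin_src lisp :tangle no
(require :asdf) ; provides UIOP

(defun demo-env-integer (name default)
  "Read environment variable NAME as an integer. Fall back to DEFAULT
when the variable is unset, empty, or not a valid integer."
  (let ((raw (uiop:getenv name)))
    (or (and raw (ignore-errors (parse-integer raw)))
        default)))

;; DISPATCHER_RULE_THRESHOLD is the variable documented in .env.example;
;; the fallback 3 mirrors the default listed there.
(defun demo-rule-threshold ()
  (demo-env-integer "DISPATCHER_RULE_THRESHOLD" 3))
#+end_src
Keeping the fallback next to the variable name means a missing or malformed setting degrades to the documented default instead of signalling at startup.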
** Engineering Lifecycle (Two-Track)

org/programming-tools.org Normal file

@@ -0,0 +1,520 @@
#+TITLE: SKILL: Programming Tools (programming-tools.org)
#+AUTHOR: Agent
#+FILETAGS: :programming:tools:cognitive:
#+PROPERTY: header-args:lisp :tangle ../lisp/programming-tools.lisp
* Cognitive Tools for Codebase Operations
This skill registers ten cognitive tools that let the LLM search codebases, read and write files, evaluate Lisp expressions, run tests, and manipulate Org files. Without these tools, the agent can chat and run shell commands but cannot perform the core operations of a programming assistant.
Each tool is registered via ~def-cognitive-tool~ and appears in the LLM's tool belt prompt via ~cognitive-tool-prompt~. Tools receive arguments as a plist and return a plist with ~:status~ (~:success~ or ~:error~) and either ~:content~ (success) or ~:message~ (error). The tool executor (~action-tool-execute~) normalizes nested argument lists, dispatches by name, and feeds results back into the perception pipeline.
** Contract
1. Every tool returns a plist with at least ~:status~. On success: ~(:status :success :content "...")~. On error: ~(:status :error :message "...")~.
2. Every tool guards against missing required parameters and returns a clear error message.
3. Every tool handles runtime exceptions (~handler-case~) — a tool must never crash the daemon.
4. ~search-files~: given ~:pattern~, ~:path~, optional ~:include~ (glob), returns matched lines with file:line prefixes.
5. ~find-files~: given ~:pattern~ (glob), ~:path~, returns list of matching file paths.
6. ~read-file~: given ~:filepath~, optional ~:start~, ~:limit~ (lines), returns file contents.
7. ~write-file~: given ~:filepath~, ~:content~, creates directories, writes file, returns byte count.
8. ~list-directory~: given ~:path~, optional ~:pattern~, returns sorted directory entries.
9. ~run-shell~: given ~:cmd~, optional ~:timeout~, returns stdout, stderr, and exit code.
10. ~eval-form~: given ~:code~ (Lisp expression string), returns evaluated result. Disables ~*read-eval*~.
11. ~run-tests~: given optional ~:test-name~, runs specific test or all suites via ~fiveam:run-all-tests~.
12. ~org-find-headline~: given ~:id~ or ~:title~, searches ~*memory-store*~ for matching memory objects.
13. ~org-modify-file~: given ~:filepath~, ~:old-text~, ~:new-text~, performs exact-string replacement. Returns error if text not found.
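Contract items 1 to 3 describe a shape that every tool body shares. The sketch below shows that shape as a plain function, outside ~def-cognitive-tool~; ~demo-char-count~ and its ~:text~ parameter are hypothetical and exist only to illustrate the pattern:
#+begin_src lisp :tangle no
;; Hypothetical tool body following contract items 1-3.
(defun demo-char-count (args)
  "ARGS is a plist. Returns a plist with :status and :content or :message."
  (block nil
    (let ((text (getf args :text)))
      ;; Item 2: guard against a missing required parameter.
      (unless text
        (return (list :status :error
                      :message "demo-char-count requires :text")))
      ;; Item 3: runtime errors become an error plist, never a crash.
      (handler-case
          (list :status :success
                :content (format nil "~d characters" (length text)))
        (error (c)
          (list :status :error :message (format nil "~a" c)))))))
#+end_src
Calling it with ~(:text "abc")~ gives a success plist, with no arguments it reports the missing parameter, and with a non-sequence such as ~(:text 42)~ the type error from ~length~ is converted into an error plist.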
* Implementation
** Package Context
#+begin_src lisp
(in-package :passepartout)
(defun tools-write-file (filepath content)
"Write string CONTENT to FILEPATH, creating parent directories."
(uiop:ensure-all-directories-exist (list filepath))
(with-open-file (stream filepath :direction :output :if-exists :supersede :if-does-not-exist :create)
(write-string content stream)))
#+end_src
** Tool: search-files
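Searches the contents of files in a directory using regex pattern matching. Note that the ~directory~ call below matches files directly under ~:path~ and does not descend into subdirectories, despite the "recursively" wording in the parameter description.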
#+begin_src lisp
(def-cognitive-tool search-files
"Search file contents under a directory for a regex pattern."
((:name "pattern" :description "The regex pattern to search for." :type "string")
(:name "path" :description "Directory to search recursively." :type "string")
(:name "include" :description "Optional glob filter for filenames (e.g. \"*.lisp\")." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((pattern (getf args :pattern))
(path (getf args :path))
(include (getf args :include))
(results nil))
(unless (and pattern path)
(return (list :status :error :message "search-files requires :pattern and :path")))
(handler-case
(dolist (file (directory (merge-pathnames
(if include
(make-pathname :name :wild :type (subseq include 2) :defaults path)
(make-pathname :name :wild :type :wild :defaults path))
path)))
(let ((base (file-namestring file)))
(with-open-file (stream file :direction :input :if-does-not-exist nil)
(when stream
(loop for line = (read-line stream nil nil)
for line-num from 1
while line
when (cl-ppcre:scan pattern line)
do (push (format nil "~a:~d: ~a" base line-num (string-trim '(#\Space #\Tab) line))
results))))))
(error (c) (return (list :status :error :message (format nil "~a" c)))))
(list :status :success
:content (if results
(format nil "~d matches:~%~a" (length results)
(format nil "~{~a~^~%~}" (reverse results)))
(format nil "No matches for '~a' in ~a" pattern path)))))))
#+end_src
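The ~(subseq include 2)~ step above only works when ~:include~ has the form ~"*.ext"~. A small validating helper could make that assumption explicit. This is a sketch, and ~demo-glob-extension~ is a hypothetical name rather than an existing function:
#+begin_src lisp :tangle no
;; Hypothetical helper: accept only globs of the form "*.ext".
(defun demo-glob-extension (glob)
  "Return the extension of a \"*.ext\" glob, or NIL for any other shape."
  (when (and (stringp glob)
             (> (length glob) 2)
             (string= "*." glob :end2 2)          ; must start with "*."
             (not (find #\* glob :start 2)))      ; no further wildcards
    (subseq glob 2)))
#+end_src
A caller could return an error plist when this helper yields nil, instead of passing an unexpected string as the pathname ~:type~.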
** Tool: find-files
Glob file matching using SBCL's ~directory~.
#+begin_src lisp
(def-cognitive-tool find-files
"Find files matching a glob pattern under a directory."
((:name "pattern" :description "Glob pattern (e.g. \"*.lisp\", \"core-*\")." :type "string")
(:name "path" :description "Directory to search in." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((pattern (getf args :pattern))
(path (getf args :path)))
(unless (and pattern path)
(return (list :status :error :message "find-files requires :pattern and :path")))
(let ((full (merge-pathnames pattern path)))
(handler-case
(let ((files (directory full)))
(list :status :success
:content (if files
(format nil "~d files:~%~{~a~^~%~}" (length files) files)
(format nil "No files matching '~a' in ~a" pattern path))))
(error (c) (list :status :error :message (format nil "~a" c)))))))))
#+end_src
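One detail of ~merge-pathnames~ is worth illustrating: a ~:path~ without a trailing slash is parsed as a file name, so its last component is dropped from the merged result. The sketch below assumes UIOP is available and uses a hypothetical helper, ~demo-glob-under~, to normalize the directory first:
#+begin_src lisp :tangle no
(require :asdf) ; provides UIOP

;; Hypothetical helper: treat DIR as a directory even without a trailing slash.
(defun demo-glob-under (pattern dir)
  "Merge the glob PATTERN onto DIR after coercing DIR to a directory pathname."
  (merge-pathnames pattern (uiop:ensure-directory-pathname dir)))

;; Without normalization, "/tmp/src" contributes only the directory "/tmp/".
(defparameter *demo-naive*  (merge-pathnames "*.lisp" "/tmp/src"))
(defparameter *demo-normal* (demo-glob-under "*.lisp" "/tmp/src"))
#+end_src
On SBCL, ~*demo-naive*~ ends up with the directory ~(:absolute "tmp")~ while ~*demo-normal*~ keeps ~(:absolute "tmp" "src")~.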
** Tool: read-file
Reads a file into a string. Supports optional ~:start~ and ~:limit~ for partial reads.
#+begin_src lisp
(def-cognitive-tool read-file
"Read the contents of a file."
((:name "filepath" :description "Path to the file to read." :type "string")
(:name "start" :description "Optional: line number to start reading from (1-based)." :type "integer")
(:name "limit" :description "Optional: maximum number of lines to read." :type "integer"))
:guard (lambda (args) (declare (ignore args)) nil)
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(start (getf args :start))
(limit (getf args :limit)))
(unless filepath
(return (list :status :error :message "read-file requires :filepath")))
(handler-case
(let ((content (uiop:read-file-string filepath)))
(if (or start limit)
(let* ((lines (uiop:split-string content :separator '(#\Newline)))
(start-idx (max 0 (1- (or start 1))))
(end (if limit (min (length lines) (+ start-idx limit)) (length lines)))
(selected (subseq lines start-idx end)))
(list :status :success
:content (format nil "~{~a~^~%~}" selected)))
(list :status :success :content content)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
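The ~:start~ and ~:limit~ arithmetic can be checked in isolation. The sketch below is a hypothetical variant (~demo-line-window~) that also clamps the start index to the list length, so an out-of-range ~:start~ yields an empty result instead of relying on ~handler-case~ to catch the ~subseq~ error:
#+begin_src lisp :tangle no
;; Hypothetical helper mirroring the 1-based :start and :limit handling.
(defun demo-line-window (lines start limit)
  "Return up to LIMIT items of LINES beginning at 1-based index START.
START defaults to 1; a NIL LIMIT means 'through the end'."
  (let* ((len (length lines))
         ;; Convert to a 0-based index and clamp into [0, len].
         (start-idx (min len (max 0 (1- (or start 1)))))
         (end (if limit
                  (min len (+ start-idx limit))
                  len)))
    (subseq lines start-idx end)))
#+end_src
For example, a start of 2 with a limit of 2 on four lines selects the second and third lines.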
** Tool: write-file
Writes string content to a file, creating parent directories as needed.
#+begin_src lisp
(def-cognitive-tool write-file
"Write string content to a file. Creates directories as needed."
((:name "filepath" :description "Path to the file to write." :type "string")
(:name "content" :description "The text content to write." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(content (getf args :content)))
(unless (and filepath content)
(return (list :status :error :message "write-file requires :filepath and :content")))
(handler-case
(progn
(tools-write-file filepath content)
(list :status :success
:content (format nil "Written ~d bytes to ~a" (length content) filepath)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
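The success message above reports ~(length content)~, which counts characters; that equals the byte count only for single-byte text. If a true byte count is wanted, one SBCL-specific option is sketched below. It assumes UTF-8 output, and ~demo-utf8-byte-count~ is a hypothetical name:
#+begin_src lisp :tangle no
;; SBCL-specific sketch: count the bytes TEXT occupies when encoded as UTF-8.
(defun demo-utf8-byte-count (text)
  (length (sb-ext:string-to-octets text :external-format :utf-8)))

;; A one-character string holding U+00E9 (e with acute accent).
(defparameter *demo-accented* (string (code-char 233)))
#+end_src
For plain ASCII such as "hello world" both counts are 11, while ~*demo-accented*~ is one character but two bytes in UTF-8.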
** Tool: list-directory
Lists the contents of a directory, optionally filtered by a glob pattern.
#+begin_src lisp
(def-cognitive-tool list-directory
"List the contents of a directory."
((:name "path" :description "Directory path to list." :type "string")
(:name "pattern" :description "Optional glob filter (e.g. \"*.org\")." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((path (getf args :path))
(pattern (getf args :pattern)))
(unless path
(return (list :status :error :message "list-directory requires :path")))
(let ((full-pattern (if pattern
(merge-pathnames pattern path)
(make-pathname :name :wild :type :wild :defaults path))))
(handler-case
(let ((entries (directory full-pattern)))
(list :status :success
:content (if entries
(format nil "~d entries in ~a:~%~{~a~^~%~}" (length entries) path entries)
(format nil "No entries in ~a" path))))
(error (c) (list :status :error :message (format nil "~a" c)))))))))
#+end_src
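Contract item 8 mentions sorted entries, and the order returned by ~directory~ is not something to rely on. A possible sorting step is sketched here; ~demo-sort-entries~ is a hypothetical helper, not an existing function:
#+begin_src lisp :tangle no
;; Hypothetical helper: order directory entries by their namestring.
(defun demo-sort-entries (entries)
  "Return a fresh list of ENTRIES (pathnames) sorted by namestring."
  ;; SORT is destructive, so operate on a copy.
  (sort (copy-list entries) #'string< :key #'namestring))
#+end_src
Applying it to the result of ~directory~ before formatting would make the tool output stable across runs.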
** Tool: run-shell
Executes a shell command and returns stdout, stderr, and exit code.
#+begin_src lisp
(def-cognitive-tool run-shell
"Execute a shell command and return stdout, stderr, and exit code."
((:name "cmd" :description "The shell command to execute." :type "string")
(:name "timeout" :description "Optional timeout in seconds (default 30)." :type "integer"))
:guard nil
:body (lambda (args)
(block nil
(let* ((cmd (getf args :cmd))
(timeout (or (getf args :timeout) 30)))
(unless cmd
(return (list :status :error :message "run-shell requires :cmd")))
(handler-case
(multiple-value-bind (out err code)
(uiop:run-program (list "timeout" (format nil "~a" timeout) "bash" "-c" cmd)
:output :string :error-output :string
:ignore-error-status t)
(list :status :success
:content (format nil "~a~@[~%~%stderr:~%~a~]~%exit: ~d"
(or out "") (when (and err (> (length err) 0)) err) code)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
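The report string relies on the ~~@[ ... ~]~ directive of ~format~, which emits its clause only when the next argument is non-nil. The standalone sketch below isolates that formatting step; ~demo-shell-report~ is a hypothetical name:
#+begin_src lisp :tangle no
;; Hypothetical helper isolating the report formatting used by run-shell.
(defun demo-shell-report (out err code)
  "Format stdout, an optional stderr section, and the exit code."
  (format nil "~a~@[~%~%stderr:~%~a~]~%exit: ~d"
          (or out "")
          ;; Pass NIL for empty stderr so the ~@[ ... ~] clause is skipped.
          (when (and err (plusp (length err))) err)
          code))
#+end_src
With an empty stderr the output is just the stdout text followed by the exit line; with a non-empty stderr a labelled section appears between them.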
** Tool: eval-form
Evaluates a Lisp expression in the running image. Binds ~*read-eval*~ to nil for safety.
#+begin_src lisp
(def-cognitive-tool eval-form
"Evaluate a Lisp expression in the running image and return the result."
((:name "code" :description "The Lisp expression to evaluate as a string." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((code (getf args :code)))
(unless code
(return (list :status :error :message "eval-form requires :code")))
(handler-case
(let* ((*read-eval* nil)
(form (read-from-string code))
(result (eval form)))
(list :status :success :content (format nil "~a" result)))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
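It may help to see exactly what binding ~*read-eval*~ to nil covers. It makes the reader reject the ~#.~ read-time evaluation syntax; it does not restrict the later ~eval~ call, which still runs whatever form was read. A small sketch, with the hypothetical name ~demo-safe-read~:
#+begin_src lisp :tangle no
;; Hypothetical helper: read one form with read-time evaluation disabled.
(defun demo-safe-read (code)
  "Return the form read from CODE, or :BLOCKED if the reader signals an error
(for example on #. syntax or on unbalanced input)."
  (handler-case
      (let ((*read-eval* nil))
        (values (read-from-string code)))
    (error () :blocked)))
#+end_src
An ordinary expression such as ~"(+ 1 2)"~ is returned as a list, while ~"#.(+ 1 2)"~ is refused at read time.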
** Tool: run-tests
Runs FiveAM test suites. Without arguments, runs all tests via ~fiveam:run-all-tests~.
#+begin_src lisp
(def-cognitive-tool run-tests
"Run FiveAM tests. With no arguments, runs all test suites."
((:name "test-name" :description "Optional: specific test name to run. If nil, runs all tests." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((test-name (getf args :test-name)))
(handler-case
(if test-name
(let* ((sym (find-symbol (string-upcase test-name) :passepartout))
;; Reuse the symbol found above instead of interning it a second time.
(result (when sym (fiveam:run sym))))
(list :status :success
:content (format nil "Test '~a' ~a" test-name
(if result "completed" "not found"))))
(let ((result (fiveam:run-all-tests)))
(list :status :success :content (format nil "~a" result))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
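The name lookup uses ~find-symbol~ so that an unknown test name does not create a new symbol in the package, which ~intern~ would do. A standalone sketch of that lookup, using the hypothetical helper ~demo-resolve-test~:
#+begin_src lisp :tangle no
;; Hypothetical helper: resolve a test name without interning anything.
(defun demo-resolve-test (name package)
  "Look up NAME (a string) in PACKAGE. Returns the symbol, or NIL when
no such symbol exists. Never creates a symbol."
  (values (find-symbol (string-upcase name) package)))
#+end_src
A caller can then distinguish "no such test" (nil) from "test ran" without polluting the package with misspelled names.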
** Tool: org-find-headline
Finds Org headlines in the memory store by ID property or title substring match.
#+begin_src lisp
(def-cognitive-tool org-find-headline
"Find an Org headline by ID or title in the memory store."
((:name "id" :description "Optional: Org ID property to search for." :type "string")
(:name "title" :description "Optional: headline title to search for (case-insensitive substring)." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((id (getf args :id))
(title (getf args :title))
(results nil))
(unless (or id title)
(return (list :status :error :message "org-find-headline requires :id or :title")))
(handler-case
(let ((is-mem (find-symbol "MEMORY-OBJECT-P" :passepartout))
(get-id (find-symbol "MEMORY-OBJECT-ID" :passepartout))
(get-title (find-symbol "MEMORY-OBJECT-TITLE" :passepartout)))
(unless (and is-mem get-id get-title)
(return (list :status :error :message "Memory store not loaded")))
(maphash (lambda (k obj)
(declare (ignore k))
(when (and (funcall is-mem obj)
(or (and id (string-equal id (funcall get-id obj)))
(and title (search title (funcall get-title obj) :test #'char-equal))))
(push obj results)))
*memory-store*)
(list :status :success
:content (if results
(format nil "~d headlines found:~%~{~a~^~%~}"
(length results)
(mapcar (lambda (r) (funcall get-title r)) results))
(format nil "No headlines matching ~a" (or id title)))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
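The title match is a case-insensitive substring test built from ~search~ with ~char-equal~. The sketch below applies the same test to a plain list of strings; ~demo-find-titles~ is a hypothetical helper and does not touch ~*memory-store*~:
#+begin_src lisp :tangle no
;; Hypothetical helper: case-insensitive substring filter over title strings.
(defun demo-find-titles (needle titles)
  "Return the members of TITLES that contain NEEDLE, ignoring case."
  (remove-if-not (lambda (title)
                   (search needle title :test #'char-equal))
                 titles))
#+end_src
Searching for "milk" therefore matches a headline titled "Buy MILK".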
** Tool: org-modify-file
Surgical text replacement in an Org file — matches exact text and replaces it.
#+begin_src lisp
(def-cognitive-tool org-modify-file
"Replace text in an Org file via exact string match. Returns error if old-text not found."
((:name "filepath" :description "Path to the Org file." :type "string")
(:name "old-text" :description "Exact text to replace." :type "string")
(:name "new-text" :description "Text to insert in its place." :type "string"))
:guard nil
:body (lambda (args)
(block nil
(let* ((filepath (getf args :filepath))
(old-text (getf args :old-text))
(new-text (getf args :new-text)))
(unless (and filepath old-text new-text)
(return (list :status :error :message "org-modify-file requires :filepath, :old-text, and :new-text")))
(handler-case
(let ((content (uiop:read-file-string filepath)))
(let ((pos (search old-text content)))
(if pos
(let ((new-content (concatenate 'string
(subseq content 0 pos)
new-text
(subseq content (+ pos (length old-text))))))
(tools-write-file filepath new-content)
(list :status :success
:content (format nil "Replaced at position ~d in ~a" pos filepath)))
(list :status :error :message (format nil "Text not found in ~a" filepath)))))
(error (c) (list :status :error :message (format nil "~a" c))))))))
#+end_src
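Because the tool uses a single ~search~ call, only the first occurrence of ~:old-text~ is replaced. The pure-string sketch below shows that behavior without touching the file system; ~demo-replace-first~ is a hypothetical name:
#+begin_src lisp :tangle no
;; Hypothetical helper: the string-level core of org-modify-file.
(defun demo-replace-first (old new text)
  "Replace the first occurrence of OLD in TEXT with NEW.
Returns the new string, or NIL when OLD does not occur."
  (let ((pos (search old text)))
    (when pos
      (concatenate 'string
                   (subseq text 0 pos)
                   new
                   (subseq text (+ pos (length old)))))))
#+end_src
Callers that need to change a later occurrence would have to include enough surrounding text in ~:old-text~ to make the match unique.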
** Skill Registration
#+begin_src lisp
(defskill :passepartout-programming-tools
:priority 50
:trigger (lambda (ctx) (declare (ignore ctx)) nil)
:deterministic (lambda (action ctx) (declare (ignore action ctx)) nil))
#+end_src
* Test Suite
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(defpackage :passepartout-programming-tools-tests
(:use :cl :fiveam :passepartout)
(:export #:programming-tools-suite))
(in-package :passepartout-programming-tools-tests)
(def-suite programming-tools-suite :description "Verification of programming cognitive tools")
(in-suite programming-tools-suite)
(defun tools-tmpdir ()
(let ((d (merge-pathnames "tmp/passepartout-tool-tests/" (user-homedir-pathname))))
(uiop:ensure-all-directories-exist (list d))
d))
(defun tools-cleanup ()
(let ((d (tools-tmpdir)))
(uiop:delete-directory-tree d :validate t :if-does-not-exist :ignore)))
(defun tools-write-file (filepath content)
(uiop:ensure-all-directories-exist (list filepath))
(with-open-file (stream filepath :direction :output :if-exists :supersede :if-does-not-exist :create)
(write-string content stream)))
(defun call-tool (tool-name &rest args)
(let ((tool (gethash (string-downcase (string tool-name)) *cognitive-tool-registry*)))
(unless tool (error "Tool ~a not found" tool-name))
(funcall (cognitive-tool-body tool) args)))
;; search-files
(test test-search-files-finds-matches
"Contract 4: search-files finds lines matching a regex pattern."
(let* ((dir (tools-tmpdir))
(file-a (merge-pathnames "src-a.lisp" dir))
(file-b (merge-pathnames "src-b.lisp" dir)))
(tools-write-file file-a "(defun foo () 'hello)")
(tools-write-file file-b "(defun bar () 'world)")
(let ((result (call-tool 'search-files :pattern "defun" :path (namestring dir) :include "*.lisp")))
(is (eq (getf result :status) :success))
(is (search "src-a.lisp:1:" (getf result :content)))
(is (search "src-b.lisp:1:" (getf result :content))))
(tools-cleanup)))
(test test-search-files-missing-params
"Contract 2: search-files returns error when required params are missing."
(let ((result (call-tool 'search-files :pattern "x")))
(is (eq (getf result :status) :error))))
;; find-files
(test test-find-files-by-extension
"Contract 5: find-files returns files matching a glob."
(let ((dir (tools-tmpdir)))
(tools-write-file (merge-pathnames "a.lisp" dir) "test")
(tools-write-file (merge-pathnames "b.lisp" dir) "test")
(tools-write-file (merge-pathnames "c.org" dir) "test")
(let ((result (call-tool 'find-files :pattern "*.lisp" :path (namestring dir))))
(is (eq (getf result :status) :success))
(is (search "a.lisp" (getf result :content)))
(is (search "b.lisp" (getf result :content)))
(is (not (search "c.org" (getf result :content)))))
(tools-cleanup)))
(test test-find-files-missing-params
"Contract 2: find-files returns error without required params."
(let ((result (call-tool 'find-files :pattern "*.lisp")))
(is (eq (getf result :status) :error))))
;; read-file
(test test-read-file-full
"Contract 6: read-file returns full file contents."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "readme.txt" dir)))
(tools-write-file file (format nil "line one~%line two~%line three"))
(let ((result (call-tool 'read-file :filepath (namestring file))))
(is (eq (getf result :status) :success))
(is (search "line one" (getf result :content))))
(tools-cleanup)))
(test test-read-file-missing-params
"Contract 2: read-file returns error without :filepath."
(let ((result (call-tool 'read-file)))
(is (eq (getf result :status) :error))))
;; write-file
(test test-write-file-creates
"Contract 7: write-file creates file with content."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "output.txt" dir)))
(let ((result (call-tool 'write-file :filepath (namestring file) :content "hello world")))
(is (eq (getf result :status) :success))
(is (search "11 bytes" (getf result :content))))
(is (string-equal "hello world" (uiop:read-file-string file)))
(tools-cleanup)))
(test test-write-file-missing-params
"Contract 2: write-file returns error without required params."
(let ((result (call-tool 'write-file :content "x")))
(is (eq (getf result :status) :error))))
;; list-directory
(test test-list-directory-all
"Contract 8: list-directory returns all entries."
(let ((dir (tools-tmpdir)))
(tools-write-file (merge-pathnames "alpha.txt" dir) "x")
(tools-write-file (merge-pathnames "beta.txt" dir) "y")
(let ((result (call-tool 'list-directory :path (namestring dir))))
(is (eq (getf result :status) :success))
(is (search "alpha.txt" (getf result :content)))
(is (search "beta.txt" (getf result :content))))
(tools-cleanup)))
(test test-list-directory-missing-params
"Contract 2: list-directory returns error without :path."
(let ((result (call-tool 'list-directory)))
(is (eq (getf result :status) :error))))
;; run-shell
(test test-run-shell-echo
"Contract 9: run-shell executes a command and returns output."
(let ((result (call-tool 'run-shell :cmd "echo hello")))
(is (eq (getf result :status) :success))
(is (search "hello" (getf result :content)))))
(test test-run-shell-missing-params
"Contract 2: run-shell returns error without :cmd."
(let ((result (call-tool 'run-shell)))
(is (eq (getf result :status) :error))))
;; eval-form
(test test-eval-form-arithmetic
"Contract 10: eval-form evaluates a Lisp expression."
(let ((result (call-tool 'eval-form :code "(+ 1 2)")))
(is (eq (getf result :status) :success))
(is (search "3" (getf result :content)))))
(test test-eval-form-missing-params
"Contract 2: eval-form returns error without :code."
(let ((result (call-tool 'eval-form)))
(is (eq (getf result :status) :error))))
;; org-modify-file
(test test-org-modify-file-replace
"Contract 13: org-modify-file replaces exact text in file."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "doc.org" dir)))
;; Use FORMAT so ~% becomes a real newline instead of literal characters.
(tools-write-file file (format nil "* TODO Buy milk~%* DONE Walk dog~%"))
(let ((result (call-tool 'org-modify-file
:filepath (namestring file)
:old-text "TODO" :new-text "WAITING")))
(is (eq (getf result :status) :success))
(is (search "WAITING" (uiop:read-file-string file))))
(tools-cleanup)))
(test test-org-modify-file-not-found
"Contract 13: org-modify-file returns error when text not in file."
(let* ((dir (tools-tmpdir))
(file (merge-pathnames "file.org" dir)))
(tools-write-file file "some content")
(let ((result (call-tool 'org-modify-file
:filepath (namestring file)
:old-text "not-in-file" :new-text "anything")))
(is (eq (getf result :status) :error))
(is (search "not found" (getf result :message))))
(tools-cleanup)))
(test test-org-modify-file-missing-params
"Contract 2: org-modify-file returns error without required params."
(let ((result (call-tool 'org-modify-file :filepath "x" :old-text "y")))
(is (eq (getf result :status) :error))))
#+end_src
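The tests above call ~tools-cleanup~ as their last form, so a test body that signals an error would leave the scratch directory behind. One possible alternative is a macro that cleans up in ~unwind-protect~. This is a sketch under assumptions: ~with-demo-tmpdir~ is a hypothetical name, and it uses the system temporary directory rather than the home directory used by ~tools-tmpdir~:
#+begin_src lisp :tangle no
(require :asdf) ; provides UIOP

;; Hypothetical macro: scratch directory with guaranteed cleanup.
(defmacro with-demo-tmpdir ((var) &body body)
  "Bind VAR to a fresh scratch directory, run BODY, then delete the
directory even if BODY exits non-locally."
  `(let ((,var (merge-pathnames
                (format nil "passepartout-demo-~36r/" (random (expt 36 8)))
                (uiop:temporary-directory))))
     (ensure-directories-exist ,var)
     (unwind-protect (progn ,@body)
       (uiop:delete-directory-tree ,var :validate t
                                        :if-does-not-exist :ignore))))
#+end_src
A test would wrap its body in ~(with-demo-tmpdir (dir) ...)~ and drop the trailing cleanup call.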

@@ -9,16 +9,19 @@ The Dispatcher is the physical security layer of Passepartout. While the Policy
Every action that reaches the Dispatcher has already been approved by the Reasoning pipeline. The LLM generated it, the deterministic gates verified it, and the Act stage is about to execute it. The Dispatcher is the last gate before the action touches the physical world.
The Dispatcher inspects nine vectors:
1. **REPL verification** — warns if a defun is written without REPL prototyping
The Dispatcher runs ten blocking checks (eleven including the warn-only REPL lint):
1. **REPL verification** — warns if a ~defun~ is written without REPL prototyping (warn only, doesn't block)
2. **Lisp syntax** — blocks writes with unbalanced parens
3. **Secret paths** — blocks reads to ~.env~, SSH keys, PEM files, etc.
4. **Content exposure** — scans for API keys, PGP blocks, tokens
5. **Vault secrets** — matches against stored credentials
6. **Privacy tags** — blocks ~@personal~ tagged content
7. **Privacy text** — warns if text references privacy tag names
8. **Shell safety** — blocks destructive commands and injection patterns
9. **Network exfil** — blocks unwhitelisted outbound connections
4. **Self-build safety** — blocks writes to ~core-*~ files unless HITL approved (active when ~SELF_BUILD_MODE=true~)
5. **Content exposure** — scans for API keys, PGP blocks, tokens
6. **Vault secrets** — matches against stored credentials
7. **Privacy tags** — blocks ~@personal~ tagged content
8. **Privacy text** — warns if text references privacy tag names
9. **Shell safety** — blocks destructive commands and injection patterns
10. **Network exfil** — blocks unwhitelisted outbound connections
11. **High-impact approval** — requires HITL for ~:shell~, ~:system :eval~, and ~:emacs :eval~
The Dispatcher also handles the **Flight Plan** system: when a high-risk action is blocked, it creates a Flight Plan node in the Org files that the user can manually approve.
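The distinction between warn-only and blocking checks can be pictured with a small runner. This is an illustrative sketch only, not the structure of ~dispatcher-check~: the check names, the ~(name mode predicate)~ triple, and ~demo-run-checks~ are all hypothetical, and actions are plain strings here:
#+begin_src lisp :tangle no
;; Hypothetical runner: each check is (name mode predicate), mode is :warn or :block.
(defun demo-run-checks (checks action)
  "Run CHECKS in order against ACTION. Returns two values: :ALLOW or
(:BLOCK name) for the first blocking hit, and the list of warning names."
  (let ((warnings nil))
    (dolist (check checks (values :allow (nreverse warnings)))
      (destructuring-bind (name mode predicate) check
        (when (funcall predicate action)
          (ecase mode
            (:warn (push name warnings))   ; record and keep going
            (:block (return (values (list :block name)
                                    (nreverse warnings))))))))))

(defparameter *demo-checks*
  (list (list :repl-lint    :warn  (lambda (a) (search "defun" a)))
        (list :shell-safety :block (lambda (a) (search "rm -rf" a)))))
#+end_src
In this sketch an action containing "defun" is allowed with a ~:repl-lint~ warning, while one containing "rm -rf" stops at ~:shell-safety~ and later checks never run.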
@@ -357,8 +360,9 @@ Returns a list of matched pattern names or nil if safe."
#+begin_src lisp
(defun dispatcher-check (action context)
"Security gate for high-risk actions.
Vectors: lisp validation, secret path, secret content, vault secrets,
privacy tags, privacy text, shell safety, network exfil, high-impact approval."
Eleven checks: 0=REPL-lint (warn-only), 1=lisp-validation, 2=secret-path,
2b=self-build-core, 3=secret-content, 4=vault-secrets, 5=privacy-tags,
6=privacy-text, 7=shell-safety, 8=network-exfil, 8b=high-impact-approval."
(declare (ignore context))
(let* ((target (proto-get action :target))
(payload (proto-get action :payload))

@@ -8,7 +8,7 @@
Every cognitive tool (file read, file write, shell execute, etc.) has a permission level: ~:allow~ (executed without asking), ~:ask~ (user is prompted before execution), or ~:deny~ (blocked entirely). Tool Permissions maintains the registry of these levels and provides the ~permission-gate-check~ that the Dispatcher calls before dispatching a tool action.
The complexity lives in the Dispatcher (security-dispatcher.org), which
consults this table as one of its nine scan vectors.
consults this table as one of its ten scan vectors.
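A permission table of this kind could be sketched as a hash table from tool name to level. Everything below is hypothetical (~*demo-permissions*~, ~demo-set-permission~, ~demo-permission~), including the choice to default unknown tools to ~:ask~; the real ~permission-gate-check~ may behave differently:
#+begin_src lisp :tangle no
;; Hypothetical registry: tool name (string) -> :allow, :ask, or :deny.
(defvar *demo-permissions* (make-hash-table :test 'equal))

(defun demo-set-permission (tool level)
  "Record LEVEL for TOOL, rejecting anything but the three known levels."
  (check-type level (member :allow :ask :deny))
  (setf (gethash tool *demo-permissions*) level))

(defun demo-permission (tool)
  "Return the level for TOOL. Unknown tools default to :ASK in this
sketch, which is an assumption and not the project's documented rule."
  (values (gethash tool *demo-permissions* :ask)))

(demo-set-permission "read-file" :allow)
(demo-set-permission "run-shell" :deny)
#+end_src
Defaulting to ~:ask~ is the conservative choice for a sketch: a tool nobody classified prompts the user instead of running silently.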
** Contract