Release v0.3.0 — Event Orchestration, Human-in-the-Loop, Daily-Driver TUI

Test results: 86 pass, 0 fail across 21 suites. TUI integration: 7/7 pass.

Features:
- 9-vector deterministic dispatcher gates (secrets, paths, shells, network)
- Human-in-the-Loop Flight Plan workflow for blocked actions
- Event Orchestrator: unified hooks + cron + tier-based routing
- Context Manager: stack-based project scoping with persistence
- Model-Tier Routing: per-slot provider cascades with privacy filter
- Memory Scope Segmentation: memex/session/project with scope-aware retrieval
- Asynchronous Embedding Gateway: provider-agnostic vectors with cron job
- TUI Experience: scrollback, history, status bar, themes, tab completion
- v0.2.x Backfill Remediation: 14 stale/todo/stub items resolved
- Multi-distro deployment: Debian + Fedora, systemd, Docker
- 31 literate Org files with full prose

Fixes:
- CLI test: fiveam:is t -> pass/fail handler-case
- Cascade-parsing integration test: load provider before checking
- Version strings 0.2.0 -> 0.3.0 in core-communication, tui-main, architecture
2026-05-06 15:50:20 -04:00
parent 1d91fcc6cc
commit 42e07801ce
15 changed files with 643 additions and 321 deletions
--- a/docs/DESIGN_DECISIONS.org
+++ b/docs/DESIGN_DECISIONS.org
@@ -2,11 +2,21 @@

 This document captures the rationale behind key architectural choices. It is not a specification - it is a thinking medium for future architects and contributors who need to understand why the system is built this way, not just how.

+** Non-Negotiable Identity
+- Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
+- Single-address-space memory (Lisp hash tables in RAM — the agent IS the memory).
+- "Thin harness, fat skills" — complexity lives at the edges, not the kernel.
+- One agent composed of many skills. Concurrency via bordeaux-threads (shared memory).
+- Plists everywhere — homoiconic communication between all components.
+
+This is the foundational decision from which all other decisions derive. It is not negotiable. Every architectural choice below exists because this identity makes it possible — and in some cases, makes it the only viable path. The single memory space enables Merkle-tree integrity without serialization boundaries. Plists enable the cognitive pipeline to be transparent and inspectable at every stage. Org-mode as the universal format means the agent's memory, the user's notes, and the agent's own source code are the same structure. This identity is the constraint that produces the architecture.
+
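+A minimal sketch of what "plists everywhere" can look like in practice, assuming a hypothetical event shape. The keys (~:kind~, ~:payload~, ~:status~) and the function names are illustrative and are not taken from the Passepartout source. The point it illustrates is that a message is an ordinary list: it can be read with ~getf~, copied, printed, and read back by the standard reader without a separate serialization format.
+
+#+begin_src lisp
+;; Hypothetical message shape: none of these keys or function names
+;; come from the Passepartout source.
+(defun make-event (kind payload)
+  "Build an event as a plain property list."
+  (list :kind kind :payload payload :status :pending))
+
+(defun approve-event (event)
+  "Return a copy of EVENT marked as approved; the original is untouched."
+  (let ((copy (copy-list event)))
+    (setf (getf copy :status) :approved)
+    copy))
+
+(defun round-trip (event)
+  "Print EVENT to a string and read it back with the standard reader."
+  (let ((*read-eval* nil))            ; refuse #. evaluation while reading
+    (read-from-string (prin1-to-string event))))
+
+(let* ((event (make-event :shell-command '(:argv ("ls" "-la"))))
+       (approved (approve-event event)))
+  (format t "~s~%" approved)
+  ;; The original stays pending; only the copy is approved.
+  (assert (eq (getf event :status) :pending))
+  (assert (eq (getf approved :status) :approved))
+  ;; The printed form reads back as an EQUAL list.
+  (assert (equal (round-trip approved) approved)))
+#+end_src
+
+Because the event is just a list, any stage of the pipeline can inspect or log it with the printer alone. The sketch only covers data made of keywords, strings, and nested lists; objects without a readable printed form would need separate handling.
+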
 * Design

 ** One single agent
 :PROPERTIES:
 :ID:       design-multi-agent-default
+:CREATED:  [2026-05-07 Wed]
 :END:

 The AI industry has developed an intuition toward multi-agent systems as the default solution to hard problems. Multiple agents spawn, delegate, coordinate, debate, and consensus their way toward solutions. This pattern is compelling in demos and genuinely useful in specific contexts - but it has become a default assumption that warrants scrutiny.
@@ -28,6 +38,7 @@ Passepartout is single-agent by default not from limitation but from conviction:
 ** The Unified Memory Argument
 :PROPERTIES:
 :ID:       design-unified-memory
+:CREATED:  [2026-05-07 Wed]
 :END:

 If single-agent architecture is the decision, unified memory becomes the mechanism that makes it viable. The critical question is not "how many agents" but "how does the agent manage context without saturating."
@@ -47,6 +58,7 @@ The unified memory argument is not that infinite context is free. It is that wit
 ** Org-Mode as Unified AST
 :PROPERTIES:
 :ID:       design-org-unified-ast
+:CREATED:  [2026-05-07 Wed]
 :END:

 Passepartout makes a bet that most systems consider too expensive to place: that humans and machines should share the same file format. That bet is Org-mode.
@@ -80,6 +92,7 @@ This is what "sovereignty" means in technical terms: the user owns the data in a
 ** Homoiconicity as Foundation
 :PROPERTIES:
 :ID:       design-homoiconicity
+:CREATED:  [2026-05-07 Wed]
 :END:

 Common Lisp is homoiconic: code and data share the same representation. A Lisp program is a list, and a list is a Lisp program. This is usually presented as a curiosity, an interesting property that enables macros. In Passepartout, it is the foundational enabling property of the entire self-modification architecture.
@@ -110,19 +123,10 @@ The implications extend beyond convenience. A system that cannot modify its own

 This is the final expression of homoiconicity: not just that code is readable as data, or that skills are modifiable, but that the entire system - including the parts that other systems protect - is open to modification. There is no ceiling on self-improvement. The agent can rewrite the very code that rewrites itself.

-*Lisp and the AI Dream*
-
-Lisp was invented in 1958 by John McCarthy with artificial intelligence explicitly in mind. Its design - code as data, runtime mutation, symbols and lists as first-class constructs - was shaped by the belief that a truly intelligent machine would need to reason about and modify its own reasoning. For decades, Lisp machines were the closest thing to thinking machines that existed.
-
-Then the AI winter came. Symbolic AI fell out of favor. Statistical learning and neural networks dominated. Lisp was relegated to niche applications and academic curiosity. The machine that was designed for AI was never used for the task it was designed for.
-
-Six decades later, neural networks have arrived at the problem from a different direction. They can learn and generalize, but they hallucinate, cannot explain their reasoning, and cannot safely modify themselves. The neuro-symbolic synthesis - combining neural pattern recognition with symbolic reasoning - is recognized as the path toward AI that is both powerful and trustworthy.
-
-Lisp's time may finally have come. Not as a replacement for neural networks, but as the governor that makes them safe - the symbolic engine that verifies what the neural engine proposes, the homoiconic substrate that allows the system to inspect, modify, and improve its own reasoning. The machine that was designed for AI in 1958 may be the exact machine needed for AI in 2026 and beyond.
-
 ** The Probabilistic-Deterministic Split
 :PROPERTIES:
 :ID:       design-probabilistic-deterministic
+:CREATED:  [2026-05-07 Wed]
 :END:

 The architecture divides cognition into two fundamentally different reasoning systems. This is not arbitrary engineering but a structural response to a fundamental truth: probabilistic systems will hallucinate, and you cannot build reliable autonomy on an unreliable foundation.
@@ -142,6 +146,7 @@ The split also explains why the system gets safer over time without the LLM impr
 ** The Dispatcher as Learning System
 :PROPERTIES:
 :ID:       design-bouncer-learning
+:CREATED:  [2026-05-07 Wed]
 :END:

 The Dispatcher begins as a static guard - a set of rules that block obviously dangerous actions. But defining "obviously" is the hard problem. The agent encounters situations the rules do not anticipate. The Dispatcher must grow.
@@ -161,6 +166,7 @@ This is the bootstrap. The system begins dependent on human judgment because it
 ** The REPL as Cognitive Substrate
 :PROPERTIES:
 :ID:       design-repl-cognition
+:CREATED:  [2026-05-07 Wed]
 :END:

 A REPL - Read, Eval, Print, Loop - is an interactive programming environment that reads an expression, evaluates it, prints the result, and loops back to read the next expression. It is the opposite of batch processing: where batch compiles and runs a program in one shot, a REPL works one expression at a time, with each evaluation building on all previous ones. The programmer defines a function, calls it, inspects the result, modifies it, and calls it again. The state accumulates. The session is the program.
@@ -182,6 +188,7 @@ This is why the REPL becomes more important as the system matures. In early vers
 ** Observability and the Thought Trace
 :PROPERTIES:
 :ID:       design-observability
+:CREATED:  [2026-05-07 Wed]
 :END:

 When a human asks why the system made a decision, the answer must be findable. In most AI systems, the reasoning is ephemeral - it exists in the model's activations and disappears when the session ends. In Passepartout, every significant cognitive event is written to an Org buffer as it happens.
@@ -197,6 +204,7 @@ Without observability, the system is a black box that happens to produce correct
 ** Literate Programming as Discipline
 :PROPERTIES:
 :ID:       design-literate-programming
+:CREATED:  [2026-05-07 Wed]
 :END:

 The decision to use Org-mode as the source of truth for code, not just documentation, is not a ceremonial preference. It is a constraint mechanism that enforces better engineering habits at the cost of convenience.
@@ -218,6 +226,7 @@ The literate programming discipline is not about producing documentation. It is
 ** The Evaluation Harness
 :PROPERTIES:
 :ID:       design-evaluation-harness
+:CREATED:  [2026-05-07 Wed]
 :END:

 SOTA parity is meaningless without measurement. A system that claims to match commercial agents must demonstrate it through reproducible benchmarks, not through feature checklists. The evaluation harness is the apparatus by which Passepartout proves its capabilities.
@@ -233,6 +242,7 @@ The harness also supports regression testing on the skill set. Every skill is te
 ** The MCP Strategy
 :PROPERTIES:
 :ID:       design-mcp-strategy
+:CREATED:  [2026-05-07 Wed]
 :END:

 The Model Context Protocol (MCP) is a standard for connecting AI systems to external tools and data sources. It defines how a client requests tools from a server, how the server exposes its capabilities, and how the client invokes them. The ecosystem is growing: MCP servers exist for GitHub, Slack, Postgres, filesystem access, and much more.
@@ -248,6 +258,7 @@ Passepartout's native client is smaller, faster, and more maintainable. The MCP
 ** Local-First Architecture
 :PROPERTIES:
 :ID:       design-local-first
+:CREATED:  [2026-05-07 Wed]
 :END:

 Passepartout is designed to run on the user's machine, on their hardware, with their data, without requiring an internet connection. This is not a deployment option - it is an architectural commitment. The system must be able to reason, plan, and act using only the resources available locally.
@@ -260,6 +271,8 @@ The symbolic engine does not require a network connection. The Prolog/Datalog re

 This does not mean Passepartout refuses to use cloud services when available and appropriate. It means cloud services are optional enhancements, not architectural requirements. The core is local. The user can choose to add cloud LLM providers for more capable inference, but the system functions without them.

+*On live images and binaries.* Passepartout's primary delivery path is source code running in a live SBCL process. The REPL is available. Skills hot-reload. The cognitive loop runs in an image that is mutable, inspectable, and homoiconic - the user can connect with SLIME, trace functions, inspect memory objects, and modify the system while it runs. A ~save-lisp-and-die~ binary is provided as a convenience for platforms where SBCL cannot be installed (corporate laptops, shared hosts). The binary is the same image saved to disk with Swank pre-loaded - it is not a sealed container. The REPL works. Skills hot-reload. The binary is a packaging format, not an architectural decision. The system is constitutionally open in both delivery paths.
+
 * Token Economics and Performance Advantage
 :PROPERTIES:
 :ID:       design-token-economics
@@ -367,7 +380,11 @@ Passepartout at 4K effective context: ~67 MB KV cache. Competitor at 128K: ~2.1
 | Min viable local model      | 3-4B params, 4K ctx | 30-70B params, 32K+ ctx | 30-70B params, 32K+ ctx      | 7-13B params, 8K+ ctx |
 | Min VRAM for local          | 4-6 GB              | 16-32 GB                | 24-48 GB                     | 8-16 GB               |

+*Note:* Observations about OpenClaw and Hermes Agent are based on their public documentation and repositories as of 2026-05. OpenClaw (github.com/openclaw/openclaw) is a TypeScript personal AI assistant by @steipete with a Node.js gateway, 25+ messaging channels, and Canvas/voice companion apps. Hermes Agent (github.com/NousResearch/hermes-agent) is a Python fork by Nous Research with a built-in learning loop, full TUI, and sub-agent delegation. Both use prompt-based safety guardrails rather than deterministic gates. Architectural claims should be re-verified as these projects evolve.
+
 *Conclusion:* Passepartout's architecture is designed to produce 2-3x token savings for coding, 13-24x for knowledge management, and 2x for life management at v1.0.0 maturity. The three structural advantages — sparse trees, deterministic safety, and REPL verification — compound. The critical risk is implementation gap: achieving the retrieval precision, dispatcher learning, and REPL integration depth required to realize the design.
+
+*Note:* The token savings projections in this section (2-3x for coding, 13-24x for knowledge management) are architectural estimates based on the sparse-tree retrieval and deterministic safety mechanisms. They have not yet been empirically verified. A token audit harness will produce measured comparisons at v0.5.0 (Token Economics & Prompt Efficiency). Until then, the README cites the mechanisms (sparse-tree rendering, deterministic gates) rather than specific magnitudes.
 * Open Questions and Risks

 1. *Retrieval accuracy is the bottleneck.* If sparse tree retrieval loads the wrong subtree (low-similarity but causally relevant), the LLM makes unfixable errors. The architecture assumes embedding quality is "good enough" — this is untested at scale.