From 761678bbd63ca446534eef934d854932f7588ab1 Mon Sep 17 00:00:00 2001
From: Amr Gharbeia
Date: Wed, 13 May 2026 11:48:08 -0400
Subject: [PATCH] docs: trim roadmap to v1.0.0, move v2.0.0+ to stoa
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cut v2.0.0 (Lisp Machine Emergence), v3.0.0+ (Cannibalization), v4.0.0+ (Native Inference, Hardware, True Agency) from passepartout roadmap. These belong to Stoa — the body/environment layer. Passepartout now only tracks the path to Neurosymbolic Maturity (v1.0.0).
---
 docs/ROADMAP.org | 244 -----------------------------------------------
 1 file changed, 244 deletions(-)

diff --git a/docs/ROADMAP.org b/docs/ROADMAP.org
index e340433..345f104 100644
--- a/docs/ROADMAP.org
+++ b/docs/ROADMAP.org
@@ -1591,247 +1591,3 @@ The system is benchmarked against SWE-bench (competitive score with Claude Code
 The TUI at v1.0.0 is competitive: streaming responses, gate trace visualization, sidebar with 10 panels, skin system with 10+ presets, adaptive layout, full markdown, mouse support, spinner personality, and progress bars. The sidebar's gate trace, focus map, rule counter, sufficiency score, and provenance breakdown are capabilities no competitor can replicate — Passepartout's permanent UX differentiator.
 
 v1.0.0 is the brain at maturity. The symbolic engine reasons. The probabilistic engine translates. The gate stack verifies. The Merkle tree preserves provenance. The eval harness guards against regression.
-
-* v2.0.0: Lisp Machine Emergence
-
-v2.0.0 is where Passepartout stops being a daemon with clients and becomes the environment. The agent's cognitive loop, the user's editor, the user's shell, and the user's browser run in the same Common Lisp image. The Dispatcher gate stack verifies every action regardless of who initiated it — user or agent. The distinction between "tool" and "self" dissolves.
-
-*Why this version matters for UX parity.* v0.4.0 through v1.0.0 give Passepartout four interaction surfaces (TUI, messaging apps, Emacs, voice). v2.0.0 inverts the problem: instead of building more clients, it builds a platform where the agent's environment and the user's environment are the same process, separated not by a sandbox but by the Dispatcher gate stack. The editor IS the agent's prompt. The shell IS the agent's actuator. The browser IS the agent's web research tool. There are no clients — there is one Lisp image, one address space, one Org-mode file system.
-
-*Architectural principle: Browser inside Lisp, not Lisp inside browser.* Lisp is the parent process. It owns the window, the memory, and the input loop. The rendering engine (WebKit/Blink) is a library that paints pixels inside a Lisp buffer. The user can redefine functions while browsing without restarting. Keybinding lookups happen in microseconds (SBCL machine code) — the browser cannot "steal" shortcuts.
-
-** Qt/QML via EQL5 — the rendering surface
-
-- Qt/QML (via EQL5) is the UI framework. EQL5 exposes the full Qt C++ API from Common Lisp. QML is declarative — it matches Lisp's generation model.
-- Desktop: native look and feel on Linux, macOS, and Windows.
-- Mobile: Qt runs natively on iOS and Android. Android uses F-Droid for the unrestricted version and Play Store for sandboxed. iOS uses Guideline 4.7 ("Educational/Developer Tool" loophole, no JIT compilation).
-- Safety Bridge for mobile: Lisp code can manipulate browser/files but cannot touch hardware (GPS, camera, contacts) without standard permission pop-ups.
-- The minibuffer: a universal command line at the bottom of the screen. Not an Emacs modeline. Not a VS Code command palette. A single command surface for every action — edit files, navigate web, run Lisp expressions, invoke agent commands. ~M-x~ for everything.
-
-*** Lish — the Common Lisp editor
-
-Not elisp. Not Emacs. A multi-threaded Common Lisp editor rendered via Qt/QML. The complete system prompt lives in an Org buffer — the agent's identity, its skill registry, its memory, and its reasoning are visible and editable as Org text. The user modifies the agent's prompt and the agent reflects the change immediately — the prompt is a file in memory, not a hidden string in a config.
-
-Org-babel for interactive evaluation: source blocks in Org files are executable. The user evaluates a ~#+begin_src lisp~ block and the result appears inline. The agent evaluates blocks to verify code before writing. The REPL is not a separate window — it is the Org buffer in which the agent and user both work.
-
-The editor and the agent share the same Lisp image. The editor is not a client that connects to a daemon — it IS the daemon process. The TUI from v0.x is the editor's rendering surface.
-
-*** Nyxt — the Common Lisp browser (three erosion stages)
-
-The browser is not a one-time feature. It is a multi-year erosion of the rendering stack toward pure Lisp:
-
-*Stage 1 — Qt + WebKit.* Qt provides window management and native widgets. WebKit renders web content inside a Lisp buffer. Network requests via dexador (pure Lisp). HTML parsed via Plump (pure Lisp). Layout via Yoga (C-based Flexbox, wrapped via FFI). JavaScript via embedded QuickJS. This stage delivers a working browser in months, not years.
-
-*Stage 2 — S-expression DOM.* Lisp builds its own DOM representation as native S-expressions. WebKit is reduced to pixel painting only — it receives rendered layouts from Lisp, not raw HTML. The agent can traverse and manipulate the DOM as Lisp data structures without serialization. This makes web content natively queryable and modifiable by the agent's cognitive loop.
-
-*Stage 3 — Pure Lisp layout.* WebKit turned off entirely. Lisp-native layout engine (12-18 months of focused development). CSS subset sufficient for the modern web's 95% use case. JavaScript via QuickJS remains for interactive content. The browser is now a Lisp application that happens to speak HTTP, not a web engine wrapped in a Lisp process.
-
-*** Lish — the Lisp shell
-
-Bash is a text-stream protocol. Passepartout speaks plists. The Lish shell replaces text streams with structured data — every command returns a plist, not a byte stream. Pipe becomes function composition. Scripts become Lisp functions that operate on memory objects directly.
-
-The agent and the user share the same shell. The user types ~(list-todos :tag "@urgent")~. The agent proposes ~(shell "npm run build")~. The Dispatcher verifies both. The shell is not a separate process — it is a REPL connected to the same Lisp image as the agent's cognitive loop.
-
-Org-mode buffers become the file system. The user's memex (~/memex/) is browsable as a tree of Org headlines. File operations (read, write, list, search) operate on Org AST nodes, not byte streams. A "directory listing" is a tree of headlines. A "file read" is a subtree rendered as text.
-
-Bash remains available as a backend for running external commands, but it is not the primary interface.
-
-*** Emacs migration — three phases
-
-The Emacs bridge (v0.4.0) is Phase I. The deep integration is three phases, not one:
-
-*Phase I — Parasite (v0.4.0).* Emacs is a client. The elisp TCP bridge sends text and receives responses. The agent does not control Emacs. Emacs users get a native chat experience alongside the TUI.
-
-*Phase II — Interpreter (v2.0.0).* An ELisp compatibility layer runs inside Passepartout's Common Lisp image. Key Emacs packages (Org-mode, Magit) run natively without an Emacs process. The compatibility layer does not aim for 100% coverage — it targets the packages the agent's workflows depend on.
-
-*Phase III — Successor (v2.0.0 and beyond).* Native Common Lisp implementations of Org-mode workflows and Git integration read/write the same file formats. Total independence from Emacs. Emacs users who prefer Emacs keep the bridge. New users get the native experience.
-
-*** Strategic timeline
-
-v0.4.0 Emacs bridge (Phase I Parasite) → v1.0.0 Neurosymbolic Maturity → v2.0.0 Lish editor + Nyxt browser (Stage 1) + Emacs Phase II/III + mobile. The Qt/QML surface enables gradual erosion of the rendering stack without rewriting the application logic. The three-phase Emacs migration ensures Lisp users are never abandoned — the bridge works from day one, the native experience grows under it.
-
-* v3.0.0+: Cannibalization — Eat Your Dependencies
-
-v3.0.0 begins the erosion of external dependencies — the system that was bootstrapped on Qt, WebKit, C runtime, and Linux starts replacing them piece by piece with native Lisp components. This is the realization of the Lisp Machine: not built from scratch, but arrived at through gradual replacement of a working system.
-
-*** v3.0.0: Single-Process Convergence
-- TCP bridge between daemon and EQL5 client becomes an internal function call
-- One SBCL image: daemon + editor + shell + browser share one address space
-- The wire protocol becomes nil — all communication is plist exchange in memory
-
-*** v3.1.0: Lisp-Native Layout Engine
-- Replace QML layout with Lisp layout (Yoga FFI as intermediate step)
-- CLOS-based widget tree with computed dirty regions
-- Diff-based redisplay: only changed cells re-render
-
-*** v3.2.0: Browser Stage 2 — S-Expression DOM
-- Lisp builds its own DOM as native s-expressions
-- WebKit reduced to pixel painting only
-- Agent traverses and manipulates DOM as Lisp data without serialization
-
-*** v3.3.0: Browser Stage 3 — Pure Lisp Browser
-- Lisp-native layout engine handles CSS subset
-- JavaScript via QuickJS remains
-- WebKit turned off entirely
-- The browser is now a Lisp application
-
-*** v3.4.0+: Qt/QML Erosion
-- Replace QML components with Lisp-native widgets (one at a time)
-- Window management via Lisp-native X11/Wayland bindings
-- Font rendering via HarfBuzz FFI → Lisp replacement
-- Event loop: Qt's → SBCL's native thread scheduler
-- Each replacement is verified by the eval harness; the system remains usable at every step
-
-*** v3.6.0: Stage0 Lisp Bootstrap
-- 500-byte hex bootstrap → self-hosting Lisp
-- Replace Linux bootloader
-- The Lisp machine runs on bare metal
-
-* v4.0.0: Native Inference
-
-LLM inference moves in-process. No external servers. No API keys required for inference.
-
-*Lisp as Sovereign Governor, not as Math Engine.* The weights themselves are not stored as Lisp objects — this would waste 50% memory on type tags and destroy cache locality through pointer-chasing. Instead, the entire tensor is tagged as a single Lisp object (~macro-tag~). The Lisp image holds a pointer to optimized flat binary (GPU-friendly, FPGA-compatible). The tag is checked once. After that, all math happens in the optimized backend.
-
-** Native inference (FFI binding to llama.cpp)
-
-- FFI binding to llama.cpp via CFFI: load GGUF models, run inference, manage KV cache. Single SBCL image, zero process boundaries. The agent and the model share memory.
-- Speculative safety: the Dispatcher gate stack intercepts token generation in real time. A token that would produce a blocked action is preemptively suppressed before generation. No external inference API supports this.
-- Foveal-peripheral compute: the model skips pruned context nodes during attention computation. External APIs compute full attention regardless of what you send. In-process inference makes the sparse-tree rendering pay off at the compute level, not just the token level.
-
-** Live surgery on cognition
-
-With in-process inference, the agent's internal state becomes inspectable:
-
-- Pause inference mid-stream. Inspect hidden states and activations as Lisp variables.
-- Modify a vector, change a sampling parameter, resume.
-- Detect when the agent is likely to hallucinate by comparing current activation patterns against historical baselines.
-- The REPL becomes a surgical instrument for the agent's own cognition — not just for verifying code, but for inspecting and correcting the neural process that generates it.
-
-** DSL-compiled model architectures
-
-Model architectures are described as Lisp DSL:
-
-- ~(defmodel passepartout-reasoning :type 'transformer :heads 32 :dim 4096 :layers 32)~
-- The DSL compiles to machine code for the target backend (GPU via CUDA, FPGA via VexRiscv, CPU via llama.cpp).
-- Python interprets at runtime. Lisp compiles once. Model architecture changes are treated the same as code changes — edited, verified, hot-reloaded.
-
-* v5.0.0: Hardware — Tagged Lisp Architecture
-
-The Lisp machine becomes physical. RISC-V with tagged architecture, hardware-enforced type checking, and FPGA prototype for the symbolic core.
-
-*Not a from-scratch processor.* Use RISC-V as the skeleton, add custom Lisp extensions. RISC-V provides the carrier architecture (standard instruction set, existing toolchain, LLVM support). Lisp extensions provide tagged computation (type checking in hardware, parallel garbage collection, S-expression traversal as atomic operations).
-
-** The macro-tag approach
-
-- Top 4–8 bits of every memory word = Type Tag. Hardware checks tags in parallel with ALU operations. Trap on type mismatch.
-- A tensor (70B weights) is one macro-tagged Lisp object — a pointer to flat binary. The tag is checked once. Math happens at native speed. This replaces "weights as sexps" (which wastes 50% memory on per-weight tags and destroys cache locality).
-- Custom instructions: TADD (tagged add), LISP.CAR, LISP.CDR — Lisp primitives as single-cycle hardware operations.
-
-** Phase migration: Host → Co-processor → Self-hosted
-
-1. *Parasitic.* Lisp card (FPGA) is a PCIe co-processor. Host CPU (Intel/AMD, Linux/Windows) handles "dirty" I/O — networking, display, file systems. Lisp card handles tagged computation and the agent's cognitive loop. If Lisp crashes, host survives. Reset card, reload. Memory mapping: the card can see the host's memory. The Lisp environment reaches out and inspects data.
-
-2. *Functional Hijacking.* Lisp UI runs on the card, displays through the PC's GPU. The agent indexes Linux files into Lisp objects. The host becomes an I/O server for the Lisp card.
-
-3. *Driver Cannibalization.* Point the agent at C drivers. Ask it to generate native Lisp drivers for the hardware the card controls directly. PCIe Passthrough for direct hardware access.
-
-4. *Self-Hosting.* Replace the Linux bootloader with Stage0 Lisp (a bootstrap from 500 bytes of hex to a self-hosting Lisp). Cut the umbilical cord. The Lisp machine runs on bare metal.
-
-** Concrete prototyping milestones
-
-| Stage | Hardware | Cost | What it delivers |
-|-------+----------+------+-----------------|
-| TinyTapeout | Custom silicon (130nm) | ~$500–1,000 | 8-bit tagged toy processor with Lisp primitives |
-| Shuttle | Multi-project wafer | ~$10,000–20,000 | Tagged RISC-V core at 100–300MHz |
-| FPGA | Terasic DE10-Nano / Xilinx KCU105 | ~$200–500 | VexRiscv with custom Lisp extensions, PCIe card form factor |
-| Industrial | Commercial foundry (5nm) | ~$10M–100M+ | Competes with modern CPUs on tagged workloads |
-
-Start at TinyTapeout. Validate the tagged architecture works. Move to FPGA. Validate at speed. Only then consider silicon.
-
-** Garbage collection in hardware
-
-Dedicated bus master (Scavenger) runs background garbage collection while the main CPU executes code. No "GC pause." The scavenger traverses the heap in parallel with computation, freeing unreachable objects without stopping the agent.
-
-** Persistent single-address-space memory
-
-NVRAM for the entire heap. Turn on the machine — state is exactly where you left it. No "booting." No "loading memory from disk." The agent's Merkle-tree memory, skill registry, knowledge graph, and induced functions survive restarts as a contiguous hardware state.
-
-** Why this is not "Lisp inside browser"
-
-Most Lisp-on-hardware attempts fail because they try to compete with Intel on raw math. That's the wrong axis. The tagged architecture doesn't need to beat a GPU at matrix multiplication. It needs to beat a CPU at symbolic computation — graph traversal, constraint solving, theorem proving, garbage collection. These are the v3.0.0 symbolic engine's workload. Hardware that makes them single-cycle is the differentiator, not hardware that runs matrix math faster.
-
-* v6.0.0: True Agency
-
-World models, temporal reasoning, goal persistence across restarts.
-
-- World models: Predictive models of user behavior, project dynamics, system state.
-- Temporal reasoning: Scheduling, deadlines, elapsed duration awareness.
-- Goal persistence: Goals survive restarts. Long-term projects in memory-objects.
-
-* Neurosymbolic Phase Reference
-
-Each phase has a detailed implementation spec in its version section above. Summary of what is and isn't built:
-
-| Phase | Component | Lines | Release |
-|-------+-----------------------------------------+-------+----------|
-| 0 | PM-type-level gates + core integrity | ~75 | v0.10.0 |
-| 0b | Layered auth — Layer 1 (cryptographic) | ~200 | v0.12.0 |
-| 1 | Triple fact store + abstract API | ~200 | v0.14.0 |
-| 1a | Self-preservation mechanisms | ~120 | v0.16.0 |
-| 2 | Screamer admission gate | ~200 | v0.18.0 |
-| 3 | Archivist as fact proposer | ~100 | v0.20.0 |
-| 4 | Sufficiency criterion — the flip | ~50 | v0.22.0 |
-| 5 | VivaceGraph + Merkle DAG + ontology ver | ~400 | v0.25.0 |
-| 6 | ACL2 structural verification | ~200 | v0.27.0 |
-| 7 | 10-80-10 planner | ~500 | v0.36.0 |
-| 8+ | Semantic Wikipedia integration | TBD | v0.36.1+ |
-|-------+-----------------------------------------+-------+----------|
-| Total | | ~2045 | |
-
-** What Is NOT Built by the Neurosymbolic Phases
-
-1. *A separate knowledge graph serialization format before the ephemeral phase proves what facts are useful.* Premature format commitment is the ontology problem writ small. Let use determine the format.
-
-2. *ACL2 verification of empirical claims.* Apple is red. rm -rf / is destructive. These are observations, not theorems. Screamer handles empirical consistency. ACL2 handles structural verification.
-
-3. *VivaceGraph before Screamer.* The admission gate is the critical path. The persistence layer is an optimization of a working system.
-
-4. *A per-fact ontology designed upfront.* Extract from the gate stack, extend from deductions and observations, prune through contradiction detection. The ontology is a garden, not a building.
-
-5. *New core ASDF components.* Every phase is a skill. A corrupted symbolic engine degrades reasoning but does not kill the agent. Satisfies the self-repair criterion.
-
-6. *A "complete" symbolic index for the broad domain.* The neural index is the permanent gateway to the richness of prose. The symbolic index handles what can be mechanically verified. The boundary is permanent, not transitional. The neuro is the brain. The symbolic is the education.
-
-** Competitive Advantage Analysis
-
-*** Phase 0-1: Deterministic safety, now with type-level guarantees
-The existing Dispatcher gate stack already provides 0-LLM-token safety verification. Phase 0 adds structural guarantees: no heuristic bypassing of the type hierarchy. A request to modify the dispatcher's own rules is impossible by construction, not just caught by pattern matching. No competitor has this — their equivalent of "core file protection" is a prompt instruction, not a type system.
-
-*** Phase 0b: Layered signal authentication — verified origin, not claimed origin
-No competitor verifies /who/ issued a signal. Every agent harness accepts signals from any source that speaks its protocol. A compromised dependency can impersonate any signal source. Passepartout's four-layer authentication gate makes signal source spoofing impossible at Layer 1 (cryptographic), detectable at Layers 2-3 (sensory + deterministic reasoning), and probabilistically flagged at Layer 4 (style analysis). The key registry has Merkle-hashed provenance — key creation, promotion, and revocation are auditable, versioned, and survivable across restarts.
-
-*** Phase 2-3: Verified extraction — the symbolic index grows without corruption
-No competitor verifies extracted facts against an existing knowledge base. Their memory systems (Claude Code's ~extractMemories~, Hermes's MemoryProvider, OpenClaw's session transcripts) record what the LLM /said/ happened, not what the system /proved/ happened. Passepartout's Screamer-gated admission makes the symbolic index a monotonic, verified structure. Facts are admitted because they are consistent, not because the LLM generated them.
-
-*** Phase 4-5: Self-accelerating knowledge — the downward cost curve
-The sufficiency criterion makes Passepartout's "cheaper over time" thesis measurable. As the ratio of non-lossy facts grows, LLM calls for extraction decrease. At sufficiency, extraction of known categories becomes deterministic. The downward cost curve is not a marketing claim — it is a structural property of the architecture, visible through the sufficiency score.
-
-*** Phase 6-7: Provable plan soundness
-No competitor verifies task plans against formal constraints. Claude Code plans in a single LLM call with no post-hoc verification. Hermes decomposes tasks into subtasks but does not prove them non-contradictory. Passepartout's ACL2-verified plans are structurally guaranteed to have no deadlocks, no dependency cycles, and no safety violations. The verification is a proof, not a prompt.
-
-*** Phase 0-1a: Self-preservation — the agent knows when it is wounded
-No competitor detects its own degradation. Claude Code, OpenCode, and Hermes all fail silently when a tool crashes or a dependency is missing — the agent keeps running, producing degraded output, never telling the user. Passepartout's quarantine system detects failing skills, unloads them automatically, and displays a degraded-mode indicator in the status bar. The external watchdog restarts the daemon if the process dies. The integrity monitor detects corrupted core files. The agent refuses to execute commands that would destroy its own runtime, explaining /why/ and redirecting to the safe termination path.
-
-*** Semantic Wikipedia: Entity coverage at zero marginal cost
-No competitor has a general-knowledge entity graph because no competitor has a symbolic engine to populate. Claude Code knows codebases; it doesn't know that Nabokov wrote /Pale Fire/ and lectured on Kafka. Passepartout with Wikidata loaded knows both, and the entity knowledge costs zero LLM tokens — it is loaded once as structured data and queried via VivaceGraph traversals.
-
-*** The permanent competitive advantage
-The competitive advantage is not any single feature. It is the architecture's ability to accumulate verified knowledge from four independent sources (gates, deduction, verified LLM proposals, human authoring) and to make that knowledge queryable with provenance. Competitors accumulate chat transcripts. Passepartout accumulates a provenanced, self-verifying knowledge graph. Transcripts become stale and unreliable. The knowledge graph becomes richer and more trustworthy with every session.
-
-Design rationale is in:
-- ~notes/passepartout-neurosymbolic-design-decisions-and-options.org~ — design rationale for every decision
-- ~notes/passepartout-symbolic-engine-exploration.org~ — original architecture exploration
-- ~notes/passepartout-whitehead.org~ — Whitehead's four concrete contributions
-- ~docs/ARCHITECTURE.org~ — current pipeline architecture
-- ~docs/DESIGN_DECISIONS.org~ — foundational architectural decisions