stoa: create body/environment project with v2.0.0+ roadmap
Trim passepartout roadmap to v1.0.0 (Neurosymbolic Maturity). Move v2.0.0 (Lisp Machine Emergence), v3.0.0+ (Cannibalization), v4.0.0+ (Native Inference, Hardware, True Agency) into stoa/docs/ROADMAP.org. Stoa is the porch — the infrastructure layer that hosts the agent. cl-tty is retroactively recognized as the first harvested library in the Stoa pipeline.
199
projects/stoa/docs/ROADMAP.org
Normal file
@@ -0,0 +1,199 @@
#+TITLE: Stoa Roadmap — The Porch
#+STARTUP: content
#+FILETAGS: :docs:roadmap:stoa:

* The Porch

Stoa (Στοά) is the body/environment layer of the triad:

| Logos | The mind — recorded discourse (memex + agent) |
| Stoa | The porch — editor, browser, shell, infrastructure |
| Agora | The society — identity, communication, contracts |

The name comes from the Stoa Poikile (Painted Porch) in ancient Athens,
where Zeno taught Stoic philosophy. The porch was not the philosophy
itself — it was the environment that made discourse possible. Stoa is
the same: not the agent, not the network, but the infrastructure that
hosts both.

cl-tty (projects/cl-tty/) is retroactively the first harvested library
in the Stoa pipeline — a pure-CL terminal I/O library that replaced
ncurses/croatoan. Future libraries will follow the same pattern:
identify a C/FFI dependency, write a pure CL replacement, use it.
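
A minimal sketch of what the "pure CL replacement" step can look like for terminal output, assuming an ANSI-compatible terminal. The names ~csi~, ~move-cursor~, and ~clear-screen~ are hypothetical and are not taken from cl-tty's actual API. The sketch only illustrates that cursor control can be plain string formatting rather than an FFI call into ncurses.

#+begin_src lisp
;; Hypothetical helpers for illustration; not cl-tty's real interface.
(defun csi (final &rest params)
  "Build an ANSI CSI sequence: ESC [ p1 ; p2 ; ... FINAL."
  (format nil "~C[~{~A~^;~}~A" (code-char 27) params final))

(defun move-cursor (row col &optional (stream *standard-output*))
  "Move the cursor to 1-based ROW and COL (CSI row ; col H)."
  (write-string (csi "H" row col) stream))

(defun clear-screen (&optional (stream *standard-output*))
  "Erase the whole display (CSI 2 J)."
  (write-string (csi "J" 2) stream))
#+end_src

Because the functions take a stream argument, they can be exercised against a string stream in tests without a real terminal attached.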

* v2.0.0: Lisp Machine Emergence

v2.0.0 is where Passepartout stops being a daemon with clients and becomes the environment. The agent's cognitive loop, the user's editor, the user's shell, and the user's browser run in the same Common Lisp image. The Dispatcher gate stack verifies every action regardless of who initiated it — user or agent. The distinction between "tool" and "self" dissolves.

*Why this version matters for UX parity.* v0.4.0 through v1.0.0 give Passepartout four interaction surfaces (TUI, messaging apps, Emacs, voice). v2.0.0 inverts the problem: instead of building more clients, it builds a platform where the agent's environment and the user's environment are the same process, separated not by a sandbox but by the Dispatcher gate stack. The editor IS the agent's prompt. The shell IS the agent's actuator. The browser IS the agent's web research tool. There are no clients — there is one Lisp image, one address space, one Org-mode file system.
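
One way to picture the gate stack is as an ordered list of predicates that every action must pass, whoever initiated it. The sketch below is an assumption for illustration only: the action plist keys (~:initiator~, ~:op~, ~:command~) and the names ~run-gates~ and ~no-destructive-shell-gate~ are hypothetical and do not describe the real Dispatcher.

#+begin_src lisp
;; Hypothetical sketch: actions are plists, gates are functions that
;; return (values ok reason). None of these names come from Passepartout.
(defun no-destructive-shell-gate (action)
  "Deny shell actions whose command contains \"rm -rf\"."
  (if (and (eq (getf action :op) :shell)
           (search "rm -rf" (getf action :command)))
      (values nil "destructive shell command")
      (values t nil)))

(defun run-gates (action gates)
  "Run ACTION through GATES in order.
Return :ALLOW if every gate passes, else :DENY and the failing reason."
  (dolist (gate gates (values :allow nil))
    (multiple-value-bind (ok reason) (funcall gate action)
      (unless ok
        (return (values :deny reason))))))

;; The same check applies to user and agent alike.
(run-gates '(:initiator :user  :op :shell :command "npm run build")
           (list #'no-destructive-shell-gate))
(run-gates '(:initiator :agent :op :shell :command "rm -rf /")
           (list #'no-destructive-shell-gate))
#+end_src

The ~:initiator~ key is carried along but deliberately ignored by the gate, which mirrors the claim that verification does not depend on who asked.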

*Architectural principle: Browser inside Lisp, not Lisp inside browser.* Lisp is the parent process. It owns the window, the memory, and the input loop. The rendering engine (WebKit/Blink) is a library that paints pixels inside a Lisp buffer. The user can redefine functions while browsing without restarting. Keybinding lookups happen in microseconds (SBCL machine code) — the browser cannot "steal" shortcuts.

** Qt/QML via EQL5 — the rendering surface

- Qt/QML (via EQL5) is the UI framework. EQL5 exposes the full Qt C++ API from Common Lisp. QML is declarative — it matches Lisp's generation model.
- Desktop: native look and feel on Linux, macOS, and Windows.
- Mobile: Qt runs natively on iOS and Android. Android uses F-Droid for the unrestricted version and Play Store for sandboxed. iOS uses Guideline 4.7 ("Educational/Developer Tool" loophole, no JIT compilation).
- Safety Bridge for mobile: Lisp code can manipulate browser/files but cannot touch hardware (GPS, camera, contacts) without standard permission pop-ups.
- The minibuffer: a universal command line at the bottom of the screen. Not an Emacs modeline. Not a VS Code command palette. A single command surface for every action — edit files, navigate web, run Lisp expressions, invoke agent commands. ~M-x~ for everything.

*** Lish — the Common Lisp editor

Not elisp. Not Emacs. A multi-threaded Common Lisp editor rendered via Qt/QML. The complete system prompt lives in an Org buffer — the agent's identity, its skill registry, its memory, and its reasoning are visible and editable as Org text. The user modifies the agent's prompt and the agent reflects the change immediately — the prompt is a file in memory, not a hidden string in a config.

Org-babel for interactive evaluation: source blocks in Org files are executable. The user evaluates a ~#+begin_src lisp~ block and the result appears inline. The agent evaluates blocks to verify code before writing. The REPL is not a separate window — it is the Org buffer in which the agent and user both work.

The editor and the agent share the same Lisp image. The editor is not a client that connects to a daemon — it IS the daemon process. The TUI from v0.x is the editor's rendering surface.

*** Nyxt — the Common Lisp browser (three erosion stages)

The browser is not a one-time feature. It is a multi-year erosion of the rendering stack toward pure Lisp:

*Stage 1 — Qt + WebKit.* Qt provides window management and native widgets. WebKit renders web content inside a Lisp buffer. Network requests via dexador (pure Lisp). HTML parsed via Plump (pure Lisp). Layout via Yoga (C-based Flexbox, wrapped via FFI). JavaScript via embedded QuickJS. This stage delivers a working browser in months, not years.

*Stage 2 — S-expression DOM.* Lisp builds its own DOM representation as native S-expressions. WebKit is reduced to pixel painting only — it receives rendered layouts from Lisp, not raw HTML. The agent can traverse and manipulate the DOM as Lisp data structures without serialization. This makes web content natively queryable and modifiable by the agent's cognitive loop.

*Stage 3 — Pure Lisp layout.* WebKit turned off entirely. Lisp-native layout engine (12-18 months of focused development). CSS subset sufficient for the modern web's 95% use case. JavaScript via QuickJS remains for interactive content. The browser is now a Lisp application that happens to speak HTTP, not a web engine wrapped in a Lisp process.
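
To make the Stage 2 idea concrete, here is a small sketch of querying a DOM held as plain S-expressions. The node shape ~(tag attribute-plist child...)~ and the function ~collect-attribute~ are assumptions made for this example; the roadmap does not specify the actual representation.

#+begin_src lisp
;; Assumed node shape: (tag (attr-key attr-value ...) child ...).
;; Strings are text nodes. This layout is illustrative, not the real one.
(defparameter *page*
  '(:html ()
    (:body ()
     (:h1 () "Stoa")
     (:p () "See " (:a (:href "https://example.org/roadmap") "the roadmap") ".")
     (:a (:href "https://example.org/source") "source"))))

(defun collect-attribute (node tag attribute)
  "Return the ATTRIBUTE value of every TAG element under NODE, in document order."
  (if (atom node)
      '()                                   ; text nodes carry no attributes
      (destructuring-bind (node-tag attrs &rest children) node
        (append (when (eq node-tag tag)
                  (let ((value (getf attrs attribute)))
                    (when value (list value))))
                (mapcan (lambda (child)
                          (collect-attribute child tag attribute))
                        children)))))

(collect-attribute *page* :a :href)
;; => ("https://example.org/roadmap" "https://example.org/source")
#+end_src

No parsing or serialization step appears between the page and the query, which is the property Stage 2 is after.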

*** Lish — the Lisp shell

Bash is a text-stream protocol. Passepartout speaks plists. The Lish shell replaces text streams with structured data — every command returns a plist, not a byte stream. Pipe becomes function composition. Scripts become Lisp functions that operate on memory objects directly.

The agent and the user share the same shell. The user types ~(list-todos :tag "@urgent")~. The agent proposes ~(shell "npm run build")~. The Dispatcher verifies both. The shell is not a separate process — it is a REPL connected to the same Lisp image as the agent's cognitive loop.

Org-mode buffers become the file system. The user's memex (~/memex/) is browsable as a tree of Org headlines. File operations (read, write, list, search) operate on Org AST nodes, not byte streams. A "directory listing" is a tree of headlines. A "file read" is a subtree rendered as text.

Bash remains available as a backend for running external commands, but it is not the primary interface.
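
"Pipe becomes function composition" can be sketched in a few lines. Everything named here (~pipe~, ~with-tag~, ~select~, and the sample ~*todos*~ records) is hypothetical and only illustrates the idea of stages that pass lists of plists instead of text.

#+begin_src lisp
;; Hypothetical sample data: each record is a plist, as a command might return.
(defparameter *todos*
  '((:title "Ship cl-tty"    :tags ("@urgent" "stoa"))
    (:title "Read Zeno"      :tags ("someday"))
    (:title "Fix gate stack" :tags ("@urgent"))))

(defun pipe (&rest stages)
  "Compose STAGES left to right, like a | b | c in a text shell."
  (lambda (input)
    (reduce (lambda (value stage) (funcall stage value))
            stages
            :initial-value input)))

(defun with-tag (tag)
  "Stage: keep only records whose :tags contain TAG."
  (lambda (records)
    (remove-if-not (lambda (record)
                     (member tag (getf record :tags) :test #'string=))
                   records)))

(defun select (key)
  "Stage: project each record down to the value stored under KEY."
  (lambda (records)
    (mapcar (lambda (record) (getf record key)) records)))

(funcall (pipe (with-tag "@urgent") (select :title)) *todos*)
;; => ("Ship cl-tty" "Fix gate stack")
#+end_src

Each stage receives structured records, so no stage has to re-parse the output of the previous one.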

*** Emacs migration — three phases

The Emacs bridge (v0.4.0) is Phase I. The deep integration is three phases, not one:

*Phase I — Parasite (v0.4.0).* Emacs is a client. The elisp TCP bridge sends text and receives responses. The agent does not control Emacs. Emacs users get a native chat experience alongside the TUI.

*Phase II — Interpreter (v2.0.0).* An ELisp compatibility layer runs inside Passepartout's Common Lisp image. Key Emacs packages (Org-mode, Magit) run natively without an Emacs process. The compatibility layer does not aim for 100% coverage — it targets the packages the agent's workflows depend on.

*Phase III — Successor (v2.0.0 and beyond).* Native Common Lisp implementations of Org-mode workflows and Git integration read/write the same file formats. Total independence from Emacs. Emacs users who prefer Emacs keep the bridge. New users get the native experience.

*** Strategic timeline

v0.4.0 Emacs bridge (Phase I Parasite) → v1.0.0 Neurosymbolic Maturity → v2.0.0 Lish editor + Nyxt browser (Stage 1) + Emacs Phase II/III + mobile. The Qt/QML surface enables gradual erosion of the rendering stack without rewriting the application logic. The three-phase Emacs migration ensures Emacs users are never abandoned: the bridge works from day one, and the native experience grows under it.

* v3.0.0+: Cannibalization — Eat Your Dependencies

v3.0.0 begins the erosion of external dependencies — the system that was bootstrapped on Qt, WebKit, C runtime, and Linux starts replacing them piece by piece with native Lisp components. This is the realization of the Lisp Machine: not built from scratch, but arrived at through gradual replacement of a working system.

** v3.0.0: Single-Process Convergence
- TCP bridge between daemon and EQL5 client becomes an internal function call
- One SBCL image: daemon + editor + shell + browser share one address space
- The wire protocol becomes nil — all communication is plist exchange in memory

** v3.1.0: Lisp-Native Layout Engine
- Replace QML layout with Lisp layout (Yoga FFI as intermediate step)
- CLOS-based widget tree with computed dirty regions
- Diff-based redisplay: only changed cells re-render
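
A minimal sketch of diff-based redisplay, assuming a frame is a list of equal-length strings (one per row) and that both frames have the same dimensions. The function name ~changed-cells~ is hypothetical, and a real engine would work from the widget tree's dirty regions rather than comparing every cell.

#+begin_src lisp
;; Assumption: OLD and NEW are lists of strings with identical dimensions.
(defun changed-cells (old new)
  "Return a list of (row col char) for every cell where NEW differs from OLD."
  (loop for row from 0
        for old-line in old
        for new-line in new
        nconc (loop for col from 0 below (length new-line)
                    unless (char= (char old-line col) (char new-line col))
                      collect (list row col (char new-line col)))))

(changed-cells '("abc" "def") '("abc" "dXf"))
;; => ((1 1 #\X))
#+end_src

Only the cells in the returned list need to be repainted; an unchanged frame yields an empty list and therefore no drawing work.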

** v3.2.0: Browser Stage 2 — S-Expression DOM
- Lisp builds its own DOM as native s-expressions
- WebKit reduced to pixel painting only
- Agent traverses and manipulates DOM as Lisp data without serialization

** v3.3.0: Browser Stage 3 — Pure Lisp Browser
- Lisp-native layout engine handles CSS subset
- JavaScript via QuickJS remains
- WebKit turned off entirely
- The browser is now a Lisp application

** v3.4.0+: Qt/QML Erosion
- Replace QML components with Lisp-native widgets (one at a time)
- Window management via Lisp-native X11/Wayland bindings
- Font rendering via HarfBuzz FFI → Lisp replacement
- Event loop: Qt's → SBCL's native thread scheduler
- Each replacement is verified by the eval harness; the system remains usable at every step

** v3.6.0: Stage0 Lisp Bootstrap
- 500-byte hex bootstrap → self-hosting Lisp
- Replace Linux bootloader
- The Lisp machine runs on bare metal

* v4.0.0: Native Inference

LLM inference moves in-process. No external servers. No API keys required for inference.

*Lisp as Sovereign Governor, not as Math Engine.* The weights themselves are not stored as Lisp objects — this would waste 50% memory on type tags and destroy cache locality through pointer-chasing. Instead, the entire tensor is tagged as a single Lisp object (~macro-tag~). The Lisp image holds a pointer to optimized flat binary (GPU-friendly, FPGA-compatible). The tag is checked once. After that, all math happens in the optimized backend.
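
The "one tag for the whole tensor" idea has a rough software analogue in portable Common Lisp: a single struct wrapping a specialized array of unboxed floats. The sketch below is only that analogue. The names ~tensor~, ~make-tensor~, and ~tensor-dot~ are hypothetical, and a specialized array stands in for the foreign flat buffer the roadmap describes.

#+begin_src lisp
;; Analogue only: a specialized array replaces the foreign flat binary.
(defstruct (tensor (:constructor %make-tensor))
  (shape '() :type list :read-only t)
  (data (make-array 0 :element-type 'single-float)
        :type (simple-array single-float (*))
        :read-only t))

(defun make-tensor (shape &optional (initial 0.0))
  "Allocate a tensor of SHAPE whose elements all equal INITIAL."
  (%make-tensor :shape shape
                :data (make-array (reduce #'* shape)
                                  :element-type 'single-float
                                  :initial-element initial)))

(defun tensor-dot (a b)
  "Dot product over the flat data of A and B."
  ;; The type check happens once per call, not once per weight.
  (check-type a tensor)
  (check-type b tensor)
  (let ((x (tensor-data a))
        (y (tensor-data b)))
    (declare (type (simple-array single-float (*)) x y))
    (assert (= (length x) (length y)))
    (loop for i below (length x)
          sum (* (aref x i) (aref y i)))))

(tensor-dot (make-tensor '(2 3) 2.0) (make-tensor '(2 3) 3.0))
;; => 36.0
#+end_src

Inside the loop the elements are raw single floats, so no per-element tag is stored or inspected.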

** Native inference (FFI binding to llama.cpp)

- FFI binding to llama.cpp via CFFI: load GGUF models, run inference, manage KV cache. Single SBCL image, zero process boundaries. The agent and the model share memory.
- Speculative safety: the Dispatcher gate stack intercepts token generation in real time. A token that would produce a blocked action is suppressed before it is emitted. No external inference API supports this.
- Foveal-peripheral compute: the model skips pruned context nodes during attention computation. External APIs compute full attention regardless of what you send. In-process inference makes the sparse-tree rendering pay off at the compute level, not just the token level.
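
One common way to suppress a token before it is emitted is to mask its logit ahead of sampling. The sketch below assumes the gate stack has already produced a list of blocked token ids, and it uses greedy (argmax) selection for simplicity. ~mask-blocked~ and ~argmax~ are hypothetical names, and nothing here calls llama.cpp.

#+begin_src lisp
;; Assumption: BLOCKED-IDS comes from the gate stack; LOGITS is a vector
;; of single floats indexed by token id.
(defun mask-blocked (logits blocked-ids)
  "Return a copy of LOGITS with every blocked id forced to the lowest float."
  (let ((masked (copy-seq logits)))
    (dolist (id blocked-ids masked)
      (setf (aref masked id) most-negative-single-float))))

(defun argmax (vector)
  "Index of the largest element of a non-empty VECTOR (first one wins ties)."
  (let ((best 0))
    (loop for i from 1 below (length vector)
          when (> (aref vector i) (aref vector best))
            do (setf best i))
    best))

(let ((logits (vector 0.1 2.5 1.0)))
  (list (argmax logits)                      ; unconstrained choice
        (argmax (mask-blocked logits '(1))))) ; token 1 is blocked
;; => (1 2)
#+end_src

The original logits are left untouched, so the unmasked distribution stays available for logging or inspection.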

** Live surgery on cognition

With in-process inference, the agent's internal state becomes inspectable:

- Pause inference mid-stream. Inspect hidden states and activations as Lisp variables.
- Modify a vector, change a sampling parameter, resume.
- Detect when the agent is likely to hallucinate by comparing current activation patterns against historical baselines.
- The REPL becomes a surgical instrument for the agent's own cognition — not just for verifying code, but for inspecting and correcting the neural process that generates it.

** DSL-compiled model architectures

Model architectures are described as Lisp DSL:

- ~(defmodel passepartout-reasoning :type 'transformer :heads 32 :dim 4096 :layers 32)~
- The DSL compiles to machine code for the target backend (GPU via CUDA, FPGA via VexRiscv, CPU via llama.cpp).
- Python interprets at runtime. Lisp compiles once. Model architecture changes are treated the same as code changes — edited, verified, hot-reloaded.
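
A sketch of how a ~defmodel~ form like the one above could be accepted by a macro. This version only validates the description and records it in a registry; it does not compile anything to a backend. The registry ~*models*~, the divisibility check, and the helper ~head-dim~ are assumptions added for illustration.

#+begin_src lisp
;; Hypothetical registry of model descriptions, keyed by model name.
(defvar *models* (make-hash-table :test #'eq))

(defmacro defmodel (name &key type heads dim layers)
  "Record a model description under NAME after a basic sanity check."
  `(let ((spec (list :type ,type :heads ,heads :dim ,dim :layers ,layers)))
     (unless (zerop (mod (getf spec :dim) (getf spec :heads)))
       (error "Model ~S: :dim must be divisible by :heads." ',name))
     (setf (gethash ',name *models*) spec)
     ',name))

(defun head-dim (name)
  "Per-head dimension of the registered model NAME."
  (let ((spec (gethash name *models*)))
    (/ (getf spec :dim) (getf spec :heads))))

(defmodel passepartout-reasoning
  :type 'transformer :heads 32 :dim 4096 :layers 32)

(head-dim 'passepartout-reasoning)
;; => 128
#+end_src

Re-evaluating the ~defmodel~ form replaces the registry entry, which is the hook a hot-reload step would build on.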

* v5.0.0: Hardware — Tagged Lisp Architecture

The Lisp machine becomes physical. RISC-V with tagged architecture, hardware-enforced type checking, and FPGA prototype for the symbolic core.

*Not a from-scratch processor.* Use RISC-V as the skeleton, add custom Lisp extensions. RISC-V provides the carrier architecture (standard instruction set, existing toolchain, LLVM support). Lisp extensions provide tagged computation (type checking in hardware, parallel garbage collection, S-expression traversal as atomic operations).

** The macro-tag approach

- Top 4–8 bits of every memory word = Type Tag. Hardware checks tags in parallel with ALU operations. Trap on type mismatch.
- A tensor (70B weights) is one macro-tagged Lisp object — a pointer to flat binary. The tag is checked once. Math happens at native speed. This replaces "weights as sexps" (which wastes 50% memory on per-weight tags and destroys cache locality).
- Custom instructions: TADD (tagged add), LISP.CAR, LISP.CDR — Lisp primitives as single-cycle hardware operations.
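
The tag layout and the TADD trap can be modelled in software before any hardware exists. The sketch below assumes a 64-bit word with a 4-bit tag in the top bits and an arbitrary tag value of 1 for integers. All names (~make-word~, ~word-tag~, ~word-payload~, ~tadd~) and the tag assignment are hypothetical; a Lisp error stands in for the hardware trap.

#+begin_src lisp
;; Assumed layout: 64-bit word, top 4 bits are the type tag.
(defparameter *word-bits* 64)
(defparameter *tag-bits* 4)
(defparameter *integer-tag* 1)   ; arbitrary tag value chosen for this sketch

(defun payload-width ()
  (- *word-bits* *tag-bits*))

(defun make-word (tag payload)
  "Pack TAG into the top bits and PAYLOAD (truncated to fit) below it."
  (dpb tag
       (byte *tag-bits* (payload-width))
       (ldb (byte (payload-width) 0) payload)))

(defun word-tag (word)
  (ldb (byte *tag-bits* (payload-width)) word))

(defun word-payload (word)
  (ldb (byte (payload-width) 0) word))

(defun tadd (a b)
  "Tagged add: signal an error (the software stand-in for a trap)
unless both operands carry the integer tag."
  (unless (and (= (word-tag a) *integer-tag*)
               (= (word-tag b) *integer-tag*))
    (error "TADD trap: operand tags ~D and ~D" (word-tag a) (word-tag b)))
  (make-word *integer-tag* (+ (word-payload a) (word-payload b))))

(word-payload (tadd (make-word *integer-tag* 2) (make-word *integer-tag* 3)))
;; => 5
#+end_src

In hardware the tag comparison would run alongside the addition; here it runs first, which changes the timing but not the observable behavior.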

** Phase migration: Host → Co-processor → Self-hosted

1. *Parasitic.* Lisp card (FPGA) is a PCIe co-processor. Host CPU (Intel/AMD, Linux/Windows) handles "dirty" I/O — networking, display, file systems. Lisp card handles tagged computation and the agent's cognitive loop. If Lisp crashes, host survives. Reset card, reload. Memory mapping: the card can see the host's memory. The Lisp environment reaches out and inspects data.

2. *Functional Hijacking.* Lisp UI runs on the card, displays through the PC's GPU. The agent indexes Linux files into Lisp objects. The host becomes an I/O server for the Lisp card.

3. *Driver Cannibalization.* Point the agent at C drivers. Ask it to generate native Lisp drivers for the hardware the card controls directly. PCIe Passthrough for direct hardware access.

4. *Self-Hosting.* Replace the Linux bootloader with Stage0 Lisp (a bootstrap from 500 bytes of hex to a self-hosting Lisp). Cut the umbilical cord. The Lisp machine runs on bare metal.

** Concrete prototyping milestones

| Stage | Hardware | Cost | What it delivers |
|-------+----------+------+-----------------|
| TinyTapeout | Custom silicon (130nm) | ~$500–1,000 | 8-bit tagged toy processor with Lisp primitives |
| Shuttle | Multi-project wafer | ~$10,000–20,000 | Tagged RISC-V core at 100–300MHz |
| FPGA | Terasic DE10-Nano / Xilinx KCU105 | ~$200–500 | VexRiscv with custom Lisp extensions, PCIe card form factor |
| Industrial | Commercial foundry (5nm) | ~$10M–100M+ | Competes with modern CPUs on tagged workloads |

Start at TinyTapeout. Validate the tagged architecture works. Move to FPGA. Validate at speed. Only then consider silicon.

** Garbage collection in hardware

Dedicated bus master (Scavenger) runs background garbage collection while the main CPU executes code. No "GC pause." The scavenger traverses the heap in parallel with computation, freeing unreachable objects without stopping the agent.

** Persistent single-address-space memory

NVRAM for the entire heap. Turn on the machine — state is exactly where you left it. No "booting." No "loading memory from disk." The agent's Merkle-tree memory, skill registry, knowledge graph, and induced functions survive restarts as a contiguous hardware state.

** Why this is not "Lisp inside browser"

Most Lisp-on-hardware attempts fail because they try to compete with Intel on raw math. That's the wrong axis. The tagged architecture doesn't need to beat a GPU at matrix multiplication. It needs to beat a CPU at symbolic computation — graph traversal, constraint solving, theorem proving, garbage collection. These make up the workload of the v3.0.0 symbolic engine. Hardware that makes them single-cycle is the differentiator, not hardware that runs matrix math faster.

* v6.0.0: True Agency

World models, temporal reasoning, goal persistence across restarts.

- World models: Predictive models of user behavior, project dynamics, system state.
- Temporal reasoning: Scheduling, deadlines, elapsed duration awareness.
- Goal persistence: Goals survive restarts. Long-term projects in memory-objects.