From 80daaa4830d7ca73ecc89ed05da5780db3e82253 Mon Sep 17 00:00:00 2001 From: Hermes Date: Thu, 21 May 2026 18:58:26 +0000 Subject: [PATCH] lisp-machine-bootstrap: all software domains flip in days-weeks - Every subdomain for bootstrapping the Lisp Machine is software: RISC-V ISA, SBCL runtime, ACL2 logic, CIC type theory, compiler optimization, device drivers. Every one flips. - Fastest sequence: Day 1 ingestion (LLM + human review), Day 1-2 profiling (benchmark sweep), Day 2-3 active probing (synthetic microcode routines), Day 3-7 transfer + sufficiency (ACL2 verifies new dispatch routines, zero LLM tokens) - Result: self-driving Lisp Machine in under a month with one human review session and a Tenstorrent P150 --- ideas/passepartout-economics.org | 109 +++++++++++++++++++++++++++++++ 1 file changed, 109 insertions(+) diff --git a/ideas/passepartout-economics.org b/ideas/passepartout-economics.org index 1a80de8..aece83e 100644 --- a/ideas/passepartout-economics.org +++ b/ideas/passepartout-economics.org @@ -850,6 +850,115 @@ only checks for safety (no shell commands, no file deletions) and passes through everything else to the LLM. The system is honest about its frontier. +*** Bootstrapping the Lisp Machine: all domains are software + +For the concrete goal of bootstrapping a self-driving Lisp Machine, +every domain involved is software — the most codifiable domain in +existence. Code has formal specifications, documented ISAs, +deterministic behavior, and objectively testable correctness. +Every subdomain required for the bootstrap flips. + +| Subdomain | Knowledge type | Flip timeline | Why | +|-----------|---------------|---------------|-----| +| RISC-V ISA, Tenstorrent Tensix dispatch | Structural (ISA spec, API docs) + performance (profiling) | Days | Published spec, deterministic hardware, benchmark harness characterizes real behavior | +| SBCL runtime internals (GC, type dispatch, threading) | Structural (source code) + performance (latency profiles) | Days | Full source available, system can instrument itself | +| ACL2 metafunctions and macro layers | Structural (the logic is ACL2's own) | Instant | The theorem language is already the system's native logic — no translation step | +| FPGA/Verilog descriptions (if FPGA path) | Structural (VHDL/Verilog semantics) + performance (timing analysis, power) | Weeks | Published language semantics, but synthesis is slower and bitstream verification is harder than RISC-V | +| CIC prover kernel | Structural (type theory rules — these ARE formal) | Days | Mathematics is the most codified domain. ACL2 already does structural verification. Building a CIC kernel that ACL2 verifies is well-understood work. | +| Operating system interfaces, device drivers | Structural (syscall API, register maps) + empirical (test results) | Weeks | Published interfaces, deterministic behavior, but hardware quirks require empirical probing | +| Compiler optimization | Structural (IR semantics, optimization passes) + performance (benchmark before/after) | Weeks | Published semantics, objective quality metric (faster = better), benchmark harness measures | + +Every single subdomain flips. The only variable is calendar days +to accumulate the knowledge. + +*** The fastest acquisition sequence + +Optimized for minimal wall-clock time to a self-driving Lisp Machine: + +**Day 1: Ingestion day** +- LLM translates: RISC-V ISA spec, SBCL source, Tenstorrent API docs, + ACL2 reference, CIC type theory rules. All structural knowledge + enters the fact store in one parallel pass. +- ACL2 verifies consistency across all ingested domains. +- Human expert reviews the 5% of rules Screamer flagged as uncertain. + One session, a few hours. + +**Day 1-2: Profiling day** +- Benchmark harness sweeps all 72 Tensix cores: measure instruction + latency, memory bandwidth, GC pause distribution, dispatch overhead. +- Each measurement is a fact with `:provenance :benchmark`. +- The benchmark harness is itself verified by ACL2 (it runs inside + a controlled sandbox, bounded time, no side effects on production data). + +**Day 2-3: Active probing day** +- The system generates synthetic microcode routines: short programs + that exercise specific instructions, specific GC patterns, specific + dispatch paths. +- It loads them onto spare Tensix cores, measures actual latency, and + compares against the spec. +- Discrepancies become facts: `(:entity "core-42" :relation :dispatch-latency + :value "14 cycles" :source :measured :expected "12 cycles" :provenance :probe)`. +- After ~1,000 probes, the system knows the hardware's actual behavior + better than the published spec does. + +**Day 3-7: Transfer and sufficiency** +- ACL2's existing knowledge about induction, rewriting, and termination + transfers directly to verifying microcode routines (same logic, different + subject matter). +- Screamer aligns microcode verification patterns with existing gate + verification patterns — both are structural proofs over finite state. +- The benchmark facts give ACL2 a concrete cost model. ACL2 can now + prove not just correctness but also "this microcode routine is at + least 10% faster than the current implementation." +- Sufficiency flip for microcode generation: the system proposes new + dispatch routines, ACL2 verifies them, Screamer checks against + hardware constraints, the gate stack blocks anything unsafe. Zero LLM + tokens for the optimization loop. + +**Week 2-4: Self-optimizing system** +- The system profiles its own gate verification latency (already + instrumented via telemetry, Phase v0.66.0). +- It identifies the hot path: "fact-query accounts for 34% of verify time." +- It generates a new dispatch routine for fact-query, targets the + nearest idle Tensix core, loads it, benchmarks, and commits if + faster. +- The ontology now includes facts about its own optimization history: + `(:entity "fact-query-dispatch-v3" :relation :speedup-baseline + :value "1.34x" :provenance :self-optimize)`. + +**After the flip: purely symbolic optimization** + +The LLM is no longer needed for any optimization proposal. The system +profiles, proposes, verifies, tests, and commits entirely within the +symbolic engine. The LLM remains only for the boundary: interpreting +a human's high-level goal ("make the system faster") into a structured +optimization target, and formatting the benchmark report for human +readability. Those calls shrink toward zero as the system internalizes +common optimization goals as gate rules. + +*** The surprising result for bootstrapping specifically + +Because every subdomain of the Lisp Machine bootstrap is software, +and software is the most codifiable domain, the entire bootstrap +can flip in **days to weeks** with a single human review session. + +The bottleneck is not knowledge acquisition. It is not the LLM's +coding ability. It is the initial human review of the 5% of +ambiguous rules that Screamer flags — a session measured in hours, +not weeks. + +The Tenstorrent approach makes this even faster because the +microcode is software (RISC-V assembly), not hardware (FPGA +bitstream). The system can propose, load, test, and roll back +a new dispatch routine in seconds. An FPGA path would add +synthesis time (minutes to hours per iteration), stretching +the bootstrap from days to months. + +A system with a Tenstorrent P150, the AGPL Passepartout code, +a RISC-V cross-compiler, and one patient human who reviews the +contrastive queries can achieve a self-driving Lisp Machine in +under a month. + Large refactoring projects (extract module, rename API, split monolith) are the hardest test for any AI agent. Current approaches (Claude Code, Copilot) handle them probabilistically — every step costs tokens, and