refactoring: semantic equivalence boundary, self-driving Lisp Machine

- ACL2 proves semantic equivalence for Passepartout's own Lisp code today; for other languages via logical specification modeling
- CIC prover (future) extends to dependent-type-level equivalence across language boundaries
- Self-driving threshold: when system can synthesize and load its own FPGA microcode or RISC-V dispatch from within the running image
- Tenstorrent P150 (72 RISC-V cores) is particularly interesting: microcode is RISC-V software, not FPGA hardware, so the system writes, compiles, loads, and benchmarks its own core dispatch logic
2026-05-21 18:47:49 +00:00
parent 852fcae4a6
commit f9085a4690
1 changed files with 127 additions and 1 deletions
--- a/ideas/passepartout-economics.org
+++ b/ideas/passepartout-economics.org
@@ -594,7 +594,133 @@ context would not accept an unverified upgrade anyway.
  signed and verified against the hardware root of trust before
  applying.

-** Large refactoring in a neurosymbolic planner
+** Large refactoring in a neurosymbolic planner — semantic equivalence
+
+*** The workflow
+
+ACL2 proves semantic equivalence of programs written in its own
+logic, which includes Passepartout's own source code. When the
+system refactors one of its own skills, ACL2 can prove that the new
+function returns the same output as the old one for every input.
+This is standard ACL2 practice (verifying compiler optimizations,
+sort algorithm replacements).
+
+For other languages (Python, Java, JavaScript), the path is:
+1. Model the critical subset (API surface, contracts, data
+   transformations) in ACL2 as a logical specification
+2. Prove the specification is preserved across the refactoring
+3. The actual implementation stays in the target language —
+   ACL2 proves the structural contract, not the runtime behavior
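
A minimal sketch of what "the specification is preserved across the refactoring" means for a Python target. It is a bounded exhaustive check, not an ACL2 proof: it only covers the small input domain it enumerates. The functions `total_old`, `total_new`, and `check_equivalence` are hypothetical names invented for this illustration.

```python
from itertools import product


def total_old(prices, discount):
    """Original implementation: accumulate the discounted amount item by item."""
    total = 0
    for p in prices:
        total += p * (100 - discount)
    return total // 100


def total_new(prices, discount):
    """Refactored implementation: apply the discount once to the sum."""
    return sum(prices) * (100 - discount) // 100


def check_equivalence(old, new, max_len=3, max_price=4):
    """Compare old and new on every input in a small bounded domain.

    Returns the first counterexample as (prices, discount), or None if
    the two functions agree everywhere in the domain. This is a stand-in
    for a proof, which would cover all inputs rather than a finite subset.
    """
    for n in range(max_len + 1):
        for prices in product(range(max_price + 1), repeat=n):
            for discount in range(0, 101, 25):
                if old(list(prices), discount) != new(list(prices), discount):
                    return (list(prices), discount)
    return None


if __name__ == "__main__":
    print(check_equivalence(total_old, total_new))
```

A refactoring that floors each item separately instead of flooring the sum would be rejected by the same check, which is the kind of contract violation the ACL2 model is meant to rule out for all inputs.
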
+
+The CIC prover upgrade (Lean-in-Lisp, planned as future work)
+would extend this to dependent-type-level equivalence proofs
+across language boundaries — verifying that a Rust API binding
+correctly wraps a C library, or that a Python refactoring
+preserves the type-level contract of the original.
+
+** The self-driving Lisp Machine on FPGA or Tenstorrent
+
+A Tenstorrent P150 (~72 RISC-V Tensix cores on a PCIe card) or
+a mid-range FPGA (AMD Alveo, Intel Agilex) offers enough
+hardware to run a full Passepartout image with Lisp microcode
+acceleration. The host Linux system provides boot, I/O, and
+thermal management; the accelerator card provides the Lisp
+execution fabric.
+
+*** What it can do today
+
+- *Run the full symbolic engine.* ACL2, Screamer, VivaceGraph,
+  and the fact store are pure Lisp — they run on any Lisp backend.
+  The RISC-V cores on a Tenstorrent or the soft-core on an FPGA
+  provide enough compute for real-time gate verification and
+  constraint solving.
+
+- *Hot-reload skills and macro layers.* The Lisp image loads
+  skills, tangles Org files, compiles ACL2 books, and registers
+  metafunctions — all without reboot. The FPGA fabric can be
+  reprogrammed with new microcode in milliseconds.
+
+- *Manage its own knowledge base.* The fact store grows and
+  evolves. Gate rules are proposed by the LLM and verified by
+  ACL2. Ontology versions are tracked. The system knows what
+  it knows and what changed.
+
+- *Roll back failed upgrades.* Merkle snapshots provide
+  instant undo for both software state and FPGA configuration.
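
One way the Merkle-snapshot rollback could look, as a sketch under stated assumptions: state is modeled as a list of byte chunks (for example a serialized skill set and an FPGA configuration blob), and snapshots are kept in memory. `merkle_root` and `SnapshotStore` are hypothetical names, not part of Passepartout.

```python
import hashlib


def merkle_root(chunks):
    """Return the hex Merkle root of a list of byte chunks (SHA-256)."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


class SnapshotStore:
    """Keeps immutable copies of state, addressed by their Merkle root."""

    def __init__(self):
        self._snapshots = {}  # root (hex str) -> tuple of chunks

    def snapshot(self, chunks):
        root = merkle_root(chunks)
        self._snapshots[root] = tuple(chunks)
        return root

    def rollback(self, root):
        chunks = self._snapshots[root]
        # Re-hash before restoring so a corrupted snapshot is never applied.
        if merkle_root(chunks) != root:
            raise ValueError("snapshot does not match its Merkle root")
        return list(chunks)


if __name__ == "__main__":
    store = SnapshotStore()
    known_good = store.snapshot([b"skills-v1", b"bitstream-v1"])
    store.snapshot([b"skills-v2", b"bitstream-v1"])  # the upgrade being tried
    print(store.rollback(known_good))
```

Because the root covers both the software chunks and the configuration chunk, a single identifier names the whole state to return to.
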
+
+*** What it needs to cross the threshold to self-driving
+
+The system is not yet fully self-driving because three things
+still require external intervention:
+
+1. *The LLM dependency.* The 10% I/O translation (natural
+   language → structured goal, structured result → natural
+   language) requires an LLM. A small local model (Phi-4,
+   Qwen 2.5) on the host or card can serve this. The symbolic
+   engine handles everything else. Once sufficiency flips
+   (Phase 4), even the LLM is rarely needed.
+
+2. *Hardware driver development.* The FPGA microcode (tagged
+   memory, hardware GC, Lisp dispatch in hardware) is currently
+   written by humans. The system could eventually propose new
+   microcode patterns from profiling data — "your GC accounts
+   for 12% of runtime; here is a hardware GC barrier that
+   reduces it to 3%" — but the synthesis and verification of
+   hardware descriptions (VHDL, Verilog) requires a separate
+   toolchain.
+
+3. *The initial bootstrap.* The first FPGA load, the first
+   Linux boot, and the first Lisp image are done by a
+   human or a pre-existing system. Once bootstrapped, the
+   system manages itself.
+
+*** The threshold
+
+The self-driving threshold is crossed when the system can
+synthesize and load its own FPGA microcode or Tensix dispatch
+programs from within the running Lisp image. At that point:
+
+- The system profiles its own gate verification latency
+- It proposes a new microcoded instruction for the hot path
+- It compiles Verilog from ACL2-verified specifications
+- It reprograms the FPGA fabric via PCIe DMA from within SBCL
+- It benchmarks the new instruction against the old one
+- If throughput improves, the new microcode becomes permanent
+- If not, it rolls back and tries another approach
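
The propose, verify, benchmark, commit-or-roll-back loop above can be sketched in software terms as follows. This is an illustration only: candidates are plain Python callables rather than microcode, verification is an output comparison on sample inputs rather than an ACL2 proof, and `wall_clock`, `same_outputs`, and `try_upgrade` are hypothetical names.

```python
import time


def wall_clock(fn, inputs, repeats=5):
    """Best-of-N wall-clock time for running fn over all inputs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        best = min(best, time.perf_counter() - start)
    return best


def same_outputs(old, new, inputs):
    """Stand-in for verification: both versions agree on the sample inputs."""
    return all(old(x) == new(x) for x in inputs)


def try_upgrade(current, candidates, inputs,
                verify=same_outputs, measure=wall_clock):
    """Return the fastest verified candidate, or `current` if none beats it."""
    best_fn = current
    best_cost = measure(current, inputs)
    for candidate in candidates:
        if not verify(current, candidate, inputs):
            continue  # unverified candidates are never benchmarked or loaded
        cost = measure(candidate, inputs)
        if cost < best_cost:
            best_fn, best_cost = candidate, cost  # commit the improvement
        # otherwise keep the previous best, which is the rollback case
    return best_fn


if __name__ == "__main__":
    def double_mul(x):
        return x * 2

    def double_add(x):
        return x + x

    chosen = try_upgrade(double_mul, [double_add], list(range(1000)))
    print(chosen.__name__)
```

Passing `measure` as a parameter keeps the selection logic separate from the timing source, so the same loop could be driven by hardware counters instead of wall-clock time.
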
+
+This is not science fiction — it is the natural extension of
+an architecture that already hot-reloads its own code, tracks
+its own performance telemetry, and verifies its own changes
+before committing them. The hardware description language is
+the last abstraction boundary.
+
+*** Remaining barriers
+
+| Barrier | Status | Path |
+|---------|--------|------|
+| LLM dependency | Phase 4 flip reduces it to near-zero | Already designed |
+| Hardware microcode synthesis | Most speculative | Requires hardware DSL verified by ACL2, then compiled to FPGA bitstream |
+| Initial bootstrap | One-time human action | After first load, system manages itself |
+| Power and thermal | Handled by host Linux | Unchanged |
+| PCIe DMA from SBCL | Feasible with sb-alien + libpcie | Needs driver, but well-understood |
+
+The Tenstorrent approach is particularly interesting because
+its Tensix cores are *already* programmed through embedded RISC-V
+processors. The microcode is not FPGA logic; it is a RISC-V program.
+The system can write RISC-V assembly, build it with the RISC-V
+toolchain, load it onto the Tensix cores, and benchmark the result.
+This is dramatically simpler than FPGA synthesis because it is
+software, not hardware.
+
+A Tenstorrent P150 running Passepartout would be: 72 RISC-V
+cores running Lisp microcode, one core dedicated to the ACL2
+prover, one to Screamer, the rest to gate verification and
+fact store operations. The host Linux system handles I/O and
+the LLM. The system designs its own core dispatch logic,
+loads it onto idle cores, and verifies the result with ACL2
+before committing.

 Large refactoring projects (extract module, rename API, split monolith)
 are the hardest test for any AI agent. Current approaches (Claude Code,