refactoring: semantic equivalence boundary, self-driving Lisp Machine

- ACL2 proves semantic equivalence for Passepartout's own Lisp code today; for other languages via logical specification modeling
- CIC prover (future) extends to dependent-type-level equivalence across language boundaries
- Self-driving threshold: when system can synthesize and load its own FPGA microcode or RISC-V dispatch from within the running image
- Tenstorrent P150 (72 RISC-V cores) is particularly interesting: microcode is RISC-V software, not FPGA hardware, so the system writes, compiles, loads, and benchmarks its own core dispatch logic
2026-05-21 18:47:49 +00:00
parent 852fcae4a6
commit f9085a4690
1 changed files with 127 additions and 1 deletions
--- a/ideas/passepartout-economics.org
+++ b/ideas/passepartout-economics.org
@@ -594,7 +594,133 @@ context would not accept an unverified upgrade anyway.
  signed and verified against the hardware root of trust before
  applying.
-** Large refactoring in a neurosymbolic planner
+** Large refactoring in a neurosymbolic planner — semantic equivalence
 *** The workflow
 ACL2 proves semantic equivalence of programs written in its own
 logic — which includes Passepartout's own source code. When the
 system refactors its own skills, ACL2 can prove the new function
 produces the same outputs for all inputs as the old one. This is
 standard ACL2 practice (verifying compiler optimizations, sort
 algorithm replacements).
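 The idea of "same outputs for all inputs" can be illustrated with a
 small Python sketch of a sort algorithm replacement. This is only an
 illustration under stated assumptions: it swaps the ACL2 proof for an
 exhaustive check over a bounded input domain, which is much weaker
 than a theorem covering all inputs. The names =insertion_sort=,
 =merge_sort=, and =find_counterexample= are hypothetical and do not
 come from Passepartout.

```python
from itertools import product


def insertion_sort(xs):
    """Old implementation: straightforward insertion sort."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left until the insertion point for x is found.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def merge_sort(xs):
    """New implementation: top-down merge sort."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def find_counterexample(old, new, alphabet, max_len):
    """Return the first input on which old and new disagree, or None.

    Only lists up to max_len over the given alphabet are checked, so a
    None result is evidence of equivalence, not a proof of it.
    """
    for n in range(max_len + 1):
        for xs in product(alphabet, repeat=n):
            if old(list(xs)) != new(list(xs)):
                return list(xs)
    return None
```

 A refactoring gate built this way would accept the replacement only
 when =find_counterexample= returns =None=; a theorem prover removes
 the bound on input size that this check depends on.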
 For other languages (Python, Java, JavaScript), the path is:
 1. Model the critical subset (API surface, contracts, data
   transformations) in ACL2 as a logical specification
 2. Prove the specification is preserved across the refactoring
 3. The actual implementation stays in the target language —
   ACL2 proves the structural contract, not the runtime behavior
 The CIC (Calculus of Inductive Constructions) prover upgrade
 (Lean-in-Lisp, planned as future work)
 would extend this to dependent-type-level equivalence proofs
 across language boundaries — verifying that a Rust API binding
 correctly wraps a C library, or that a Python refactoring
 preserves the type-level contract of the original.
 ** The self-driving Lisp Machine on FPGA or Tenstorrent
 A Tenstorrent P150 (~72 RISC-V Tensix cores on a PCIe card) or
 a mid-range FPGA (AMD Alveo, Intel Agilex) offers enough
 hardware to run a full Passepartout image with Lisp microcode
 acceleration. The host Linux system provides boot, I/O, and
 thermal management; the accelerator card provides the Lisp
 execution fabric.
 *** What it can do today
 - *Run the full symbolic engine.* ACL2, Screamer, VivaceGraph,
  and the fact store are pure Lisp — they run on any Lisp backend.
  The RISC-V cores on a Tenstorrent or the soft-core on an FPGA
  provide enough compute for real-time gate verification and
  constraint solving.
 - *Hot-reload skills and macro layers.* The Lisp image loads
  skills, tangles Org files, compiles ACL2 books, and registers
  metafunctions — all without reboot. The FPGA fabric can be
  reprogrammed with new microcode in milliseconds.
 - *Manage its own knowledge base.* The fact store grows and
  evolves. Gate rules are proposed by the LLM and verified by
  ACL2. Ontology versions are tracked. The system knows what
  it knows and what changed.
 - *Roll back failed upgrades.* Merkle snapshots provide
  instant undo for both software state and FPGA configuration.
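 The text does not say how the Merkle snapshots are implemented. As a
 minimal sketch, assuming state is split into byte chunks (for example
 one chunk for skills and one for an FPGA configuration), rollback can
 be modelled as a content-addressed store keyed by Merkle root. The
 =SnapshotStore= class and the chunking scheme are assumptions made for
 illustration, written in Python rather than the project's Lisp.

```python
import hashlib


def merkle_root(chunks):
    """Compute a hex Merkle root over a list of byte strings."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            # Duplicate the last node so every node has a sibling.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


class SnapshotStore:
    """Hypothetical in-memory snapshot store keyed by Merkle root."""

    def __init__(self):
        self._snapshots = {}  # root -> tuple of chunks
        self._history = []    # roots in commit order

    def commit(self, chunks):
        """Record a new state and return its Merkle root."""
        chunks = tuple(chunks)
        root = merkle_root(chunks)
        self._snapshots[root] = chunks
        self._history.append(root)
        return root

    def rollback(self):
        """Drop the latest snapshot and return the previous state."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier snapshot to roll back to")
        self._history.pop()
        return list(self._snapshots[self._history[-1]])
```

 Because old snapshots stay in the store, undo is a history pop plus a
 lookup; restoring the state onto real hardware would still take
 however long the reload itself takes.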
 *** What it needs to cross the threshold to self-driving
 The system is not yet fully self-driving because three things
 still require external intervention:
 1. *The LLM dependency.* The 10% I/O translation (natural
   language → structured goal, structured result → natural
   language) requires an LLM. A small local model (Phi-4,
   Qwen 2.5) on the host or card can serve this. The symbolic
   engine handles everything else. Once sufficiency flips
   (Phase 4), even the LLM is rarely needed.
 2. *Hardware driver development.* The FPGA microcode (tagged
   memory, hardware GC, Lisp dispatch in hardware) is currently
   written by humans. The system could eventually propose new
   microcode patterns from profiling data ("your GC accounts
   for 12% of runtime; here is a hardware GC barrier that
   reduces it to 3%"), but synthesizing and verifying
   hardware descriptions (VHDL, Verilog) requires a separate
   toolchain.
 3. *The initial bootstrap.* The first FPGA load, the first
   Linux boot, the first Lisp image — these are done by a
   human or a pre-existing system. Once bootstrapped, the
   system manages itself. The threshold is crossed when the
   system can design, compile, and load its own FPGA microcode
   from within the running image.
 *** The threshold
 The self-driving threshold is crossed when the system can
 synthesize and load its own FPGA microcode or Tensix dispatch
 programs from within the running Lisp image. At that point:
 - The system profiles its own gate verification latency
 - It proposes a new microcoded instruction for the hot path
 - It compiles Verilog from ACL2-verified specifications
 - It reprograms the FPGA fabric via PCIe DMA from within SBCL
 - It benchmarks the new instruction against the old one
 - If throughput improves, the new microcode becomes permanent
 - If not, it rolls back and tries another approach
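 The loop above (propose, verify, benchmark, commit or roll back) can
 be sketched in a few lines of Python. This is an illustrative sketch,
 not the system's implementation: =try_candidates= and its =benchmark=
 and =verify= callables are hypothetical stand-ins for the profiling,
 ACL2 verification, and hardware loading steps, none of which are
 modelled here.

```python
def try_candidates(current, candidates, benchmark, verify):
    """Return the best verified implementation and a decision log.

    current    -- the implementation in use now
    candidates -- proposed replacements, tried in order
    benchmark  -- callable returning throughput (higher is better)
    verify     -- callable returning True if the candidate passed
                  verification (stand-in for the ACL2 check)
    """
    best, best_score = current, benchmark(current)
    log = []
    for cand in candidates:
        # Unverified candidates are never benchmarked or loaded.
        if not verify(cand):
            log.append((cand, "rejected: verification failed"))
            continue
        score = benchmark(cand)
        if score > best_score:
            # Throughput improved: the candidate becomes permanent.
            log.append((cand, "committed"))
            best, best_score = cand, score
        else:
            # No improvement: keep the previous implementation.
            log.append((cand, "rolled back"))
    return best, log
```

 Verification runs before benchmarking so that an unverified candidate
 never executes, matching the text's claim that changes are verified
 before they are committed.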
 This is not science fiction — it is the natural extension of
 an architecture that already hot-reloads its own code, tracks
 its own performance telemetry, and verifies its own changes
 before committing them. The hardware description language is
 the last abstraction boundary.
 *** Remaining barriers to self-driving
 | Barrier | Status | Path |
 |---------|--------|------|
 | LLM dependency | Phase 4 flip reduces it to near-zero | Already designed |
 | Hardware microcode synthesis | Most speculative | Requires hardware DSL verified by ACL2, then compiled to FPGA bitstream |
 | Initial bootstrap | One-time human action | After first load, system manages itself |
 | Power and thermal | Handled by host Linux | Unchanged |
 | PCIe DMA from SBCL | Feasible with sb-alien + libpcie | Needs driver, but well-understood |
 The Tenstorrent approach is particularly interesting because
 its Tensix cores are *already* RISC-V processors. The microcode
 is not FPGA logic — it's a RISC-V program. The system can write
 RISC-V assembly, compile it with the RISC-V toolchain, load it
 onto the Tensix cores, and benchmark the result. This is
 dramatically simpler than FPGA synthesis because it's software,
 not hardware.
 A Tenstorrent P150 running Passepartout would divide its 72
 RISC-V cores, all running Lisp microcode, as follows: one core
 dedicated to the ACL2 prover, one to Screamer, and the
 remaining 70 to gate verification and fact store operations.
 The host Linux system handles I/O and the LLM. The system
 designs its own core dispatch logic, loads it onto idle cores,
 and verifies the result with ACL2 before committing.
 Large refactoring projects (extract module, rename API, split monolith)
 are the hardest test for any AI agent. Current approaches (Claude Code,