refactoring: semantic equivalence boundary, self-driving Lisp Machine
- ACL2 proves semantic equivalence for Passepartout's own Lisp code today; for other languages via logical specification modeling - CIC prover (future) extends to dependent-type-level equivalence across language boundaries - Self-driving threshold: when system can synthesize and load its own FPGA microcode or RISC-V dispatch from within the running image - Tenstorrent P150 (72 RISC-V cores) is particularly interesting: microcode is RISC-V software, not FPGA hardware — system writes, compiles, loads, benchmarks its own core dispatch logic
This commit is contained in:
@@ -594,7 +594,133 @@ context would not accept an unverified upgrade anyway.
|
||||
signed and verified against the hardware root of trust before
|
||||
applying.
|
||||
|
||||
** Large refactoring in a neurosymbolic planner
|
||||
** Large refactoring in a neurosymbolic planner — semantic equivalence
|
||||
|
||||
*** The workflow
|
||||
|
||||
ACL2 proves semantic equivalence of programs written in its own
|
||||
logic — which includes Passepartout's own source code. When the
|
||||
system refactors its own skills, ACL2 can prove the new function
|
||||
produces the same outputs for all inputs as the old one. This is
|
||||
standard ACL2 practice (verifying compiler optimizations, sort
|
||||
algorithm replacements).
|
||||
|
||||
For other languages (Python, Java, JavaScript), the path is:
|
||||
1. Model the critical subset (API surface, contracts, data
|
||||
transformations) in ACL2 as a logical specification
|
||||
2. Prove the specification is preserved across the refactoring
|
||||
3. The actual implementation stays in the target language —
|
||||
ACL2 proves the structural contract, not the runtime behavior
|
||||
|
||||
The CIC prover upgrade (Lean-in-Lisp, planned as future work)
|
||||
would extend this to dependent-type-level equivalence proofs
|
||||
across language boundaries — verifying that a Rust API binding
|
||||
correctly wraps a C library, or that a Python refactoring
|
||||
preserves the type-level contract of the original.
|
||||
|
||||
** The self-driving Lisp Machine on FPGA or Tenstorrent
|
||||
|
||||
A Tenstorrent P150 (~72 RISC-V Tensix cores on a PCIe card) or
|
||||
a mid-range FPGA (AMD Alveo, Intel Agilex) offers enough
|
||||
hardware to run a full Passepartout image with Lisp microcode
|
||||
acceleration. The host Linux system provides boot, I/O, and
|
||||
thermal management; the accelerator card provides the Lisp
|
||||
execution fabric.
|
||||
|
||||
*** What it can do today
|
||||
|
||||
- **Run the full symbolic engine.** ACL2, Screamer, VivaceGraph,
|
||||
and the fact store are pure Lisp — they run on any Lisp backend.
|
||||
The RISC-V cores on a Tenstorrent or the soft-core on an FPGA
|
||||
provide enough compute for real-time gate verification and
|
||||
constraint solving.
|
||||
|
||||
- **Hot-reload skills and macro layers.** The Lisp image loads
|
||||
skills, tangles Org files, compiles ACL2 books, and registers
|
||||
metafunctions — all without reboot. The FPGA fabric can be
|
||||
reprogrammed with new microcode in milliseconds.
|
||||
|
||||
- **Manage its own knowledge base.** The fact store grows and
|
||||
evolves. Gate rules are proposed by the LLM and verified by
|
||||
ACL2. Ontology versions are tracked. The system knows what
|
||||
it knows and what changed.
|
||||
|
||||
- **Roll back failed upgrades.** Merkle snapshots provide
|
||||
instant undo for both software state and FPGA configuration.
|
||||
|
||||
*** What it needs to cross the threshold to self-driving
|
||||
|
||||
The system is not yet fully self-driving because three things
|
||||
still require external intervention:
|
||||
|
||||
1. **The LLM dependency.** The 10% I/O translation (natural
|
||||
language → structured goal, structured result → natural
|
||||
language) requires an LLM. A small local model (Phi-4,
|
||||
Qwen 2.5) on the host or card can serve this. The symbolic
|
||||
engine handles everything else. Once sufficiency flips
|
||||
(Phase 4), even the LLM is rarely needed.
|
||||
|
||||
2. **Hardware driver development.** The FPGA microcode (tagged
|
||||
memory, hardware GC, Lisp dispatch in hardware) is currently
|
||||
written by humans. The system could eventually propose new
|
||||
microcode patterns from profiling data — "your GC accounts
|
||||
for 12% of runtime; here is a hardware GC barrier that
|
||||
reduces it to 3%" — but the synthesis and verification of
|
||||
hardware descriptions (VHDL, Verilog) requires a separate
|
||||
toolchain.
|
||||
|
||||
3. **The initial bootstrap.** The first FPGA load, the first
|
||||
Linux boot, the first Lisp image — these are done by a
|
||||
human or a pre-existing system. Once bootstrapped, the
|
||||
system manages itself. The threshold is crossed when the
|
||||
system can design, compile, and load its own FPGA microcode
|
||||
from within the running image.
|
||||
|
||||
*** The threshold
|
||||
|
||||
The self-driving threshold is crossed when the system can
|
||||
synthesize and load its own FPGA microcode or Tensix dispatch
|
||||
programs from within the running Lisp image. At that point:
|
||||
|
||||
- The system profiles its own gate verification latency
|
||||
- It proposes a new microcoded instruction for the hot path
|
||||
- It compiles Verilog from ACL2-verified specifications
|
||||
- It reprograms the FPGA fabric via PCIe DMA from within SBCL
|
||||
- It benchmarks the new instruction against the old one
|
||||
- If throughput improves, the new microcode becomes permanent
|
||||
- If not, it rolls back and tries another approach
|
||||
|
||||
This is not science fiction — it is the natural extension of
|
||||
an architecture that already hot-reloads its own code, tracks
|
||||
its own performance telemetry, and verifies its own changes
|
||||
before committing them. The hardware description language is
|
||||
the last abstraction boundary.
|
||||
|
||||
*** What stops it from being full science fiction
|
||||
|
||||
| Barrier | Status | Path |
|
||||
|---------|--------|------|
|
||||
| LLM dependency | Phase 4 flip reduces it to near-zero | Already designed |
|
||||
| Hardware microcode synthesis | Most speculative | Requires hardware DSL verified by ACL2, then compiled to FPGA bitstream |
|
||||
| Initial bootstrap | One-time human action | After first load, system manages itself |
|
||||
| Power and thermal | Handled by host Linux | Unchanged |
|
||||
| PCIe DMA from SBCL | Feasible with sb-alien + libpcie | Needs driver, but well-understood |
|
||||
|
||||
The Tenstorrent approach is particularly interesting because
|
||||
its Tensix cores are *already* RISC-V processors. The microcode
|
||||
is not FPGA logic — it's a RISC-V program. The system can write
|
||||
RISC-V assembly, compile it with the RISC-V toolchain, load it
|
||||
onto the Tensix cores, and benchmark the result. This is
|
||||
dramatically simpler than FPGA synthesis because it's software,
|
||||
not hardware.
|
||||
|
||||
A Tenstorrent P150 running Passepartout would be: 72 RISC-V
|
||||
cores running Lisp microcode, one core dedicated to the ACL2
|
||||
prover, one to Screamer, the rest to gate verification and
|
||||
fact store operations. The host Linux system handles I/O and
|
||||
the LLM. The system designs its own core dispatch logic,
|
||||
loads it onto idle cores, and verifies the result with ACL2
|
||||
before committing.
|
||||
|
||||
Large refactoring projects (extract module, rename API, split monolith)
|
||||
are the hardest test for any AI agent. Current approaches (Claude Code,
|
||||
|
||||
Reference in New Issue
Block a user