Files
hermes-brain/ideas/lisp-provers-and-rust-comparison.org

6.6 KiB

Lisp, Provers, and vs Rust

Lisp vs Rust — the honest gap

True defects in Lisp (inherent, not fixable without changing what Lisp is)

  • GC pauses — everything is a tagged pointer on the heap; mitigatable but not eliminatable
  • No memory model / no Send/Sync — threads share everything; concurrency correctness is a discipline, not a guarantee

Achievable improvements (implementation/ecosystem, not philosophy)

  • Package manager and build system like Cargo (ASDF + Quicklisp is from 2005)
  • LSP-level tooling (SBCL infers types well — surface it)
  • Static binaries and deployment (save-lisp-and-die + tree-shaker exists, not standard)
  • Standard library modernization (hash sets, priority queues, JSON, HTTP, async — Clojure proved this)
  • Ecosystem scale (network effect, fixable with sustained effort)

Things I was wrong about (corrected during conversation)

  • The type system is not a fundamental limitation — Coalton embeds sound Hindley-Milner typing inside CL, preserving homoiconicity. You opt in per-file.
  • Unpredictable performance is not a language defect — it's a consequence of macro-heavy programming styles. Write in a disciplined subset with declarations and SBCL compiles to within 2x of C. The compiler is deterministic.
  • Macros don't cause unpredictability; they enable styles that confuse the optimizer. You choose not to use those styles.

Everything else is ecosystem and convention, not language capability: Unicode handling, pattern matching, async I/O, immutable data structures, module systems, error messages.

The numeric tower is a genuine trade-off: Lisp's auto-promoting math (integers, ratios, floats, complex) is mathematically correct but blocks SIMD and overflow assumptions that make Rust's arithmetic fast. Pick your priority.

Prover discussion

The architecture for a Lisp prover

An autonomous agent with mathematical proving can close the gap between Lisp's flexibility and Rust's guarantees without making Lisp rigid. The prover sits between the programmer and the compiler:

[Lisp code] → [Passepartout prover] → [SBCL compiler] → [binary]
                   ↑
         [Verified: race-free, type-safe, bounded memory, constant-factor performance]

The prover can:

  • Prove memory safety without a borrow checker — analyze call graphs and data flow
  • Predict performance — unfold macros, SSA-ize, tell the programmer what will optimize
  • Infer types across untyped CL code — flag contradictions, converge toward provably-correct programs

This preserves Lisp's core property: the language stays fluid. The proving agent is an external constraint, not a compiler-enforced limitation. You can always bypass it for fast prototyping.

Where the coming shift ranks languages

Rust has the most to lose — its entire value proposition is compile-time safety. An AI prover that verifies the same properties about Lisp code makes the borrow checker a solved problem. Rust keeps the advantage only as long as the prover is slower and more expensive than a type system that runs in milliseconds.

Python/JS have the most surface to gain — an AI prover could give them type soundness, no null pointers, thread safety, without changing the languages.

Lisp is uniquely positioned because of homoiconicity. The prover works on S-expressions, which are the AST. No parse/unparse gap. The prover and programmer work on exactly the same artifact. The prover can rewrite a macro expansion without expanding it, annotate specific cons cells with lifetime constraints, and the output is standard diffable code.

ACL2 and the bootstrap

ACL2 is the right foundation but not the complete solution:

  • It is first-order — can't quantify over functions, can't prove meta-level properties about the evaluator itself
  • It is interactive — human must guide the proof, lemmas must be manually provided
  • It is a subset — pure, no side effects, no CLOS, no I/O, no threading
  • It doesn't scale to everyday production code without massive human effort

But ACL2 proves the architecture is viable: a Lisp-based prover verifying Lisp programs is deeply natural.

Bootstrapping a HOL prover via ACL2 + Screamer

The HOL kernel is famously tiny (HOL Light: ~400 lines of OCaml, 10 primitive inference rules). The path:

  1. Write the HOL kernel in pure CL (~500-800 lines)
  2. ACL2 verifies the kernel implements the inference rules correctly
  3. Screamer provides the proof search engine (backtracking maps to proof tree exploration)
  4. HOL proves meta-level properties: macro soundness, evaluator equivalence, compositionality of skills, safety of self-modification
  5. HOL theorems are reflected back as ACL2 rewrite rules — automation compounds

Trust chain: SBCL runtime → ACL2 (verified by its own meta-circular proof) → HOL kernel (verified by ACL2) → any HOL theorem.

This is not a research problem — the kernel is a well-known artifact. It's an engineering task: transcribe HOL Light's kernel into CL, verify with ACL2, couple with Screamer.

An LLM can write the HOL kernel

The HOL kernel is the ideal LLM task: ~400-800 lines of pure code, fully specified, well-documented, stateless, with unambiguous correctness criteria.

Workflow: LLM writes plausible kernel → ACL2 verifies it → ACL2 catches hallucinations → iteration converges in 2-3 attempts.

The LLM is allowed to be wrong. ACL2 is the filter. This is achievable today.

Comparison with Lean

Lean is the standard for mathematics and Passepartout will not compete with it for that purpose:

  • mathlib4 has 100,000+ theorems. A bootstrapped HOL prover starts at zero.
  • Lean has a community of mathematicians; a Lisp prover won't attract one.
  • Lean's elaborator is the result of decades of type theory research; a HOL kernel + Screamer doesn't touch that.

Where Passepartout's prover wins: it verifies running code, not abstract mathematics. It is embedded in a self-modifying agent. Its design is optimized for program properties (memory safety, race freedom, macro soundness), not mathematical expressiveness.

Convergence is possible if the agent automates proof discovery at sufficient speed, but that question is unanswered and is the one Passepartout is designed to answer.

Key insight from the conversation

The self-writing machine doesn't need to prove itself from the void. It proves each next step using everything proven before. The writer is allowed to be heuristic (LLM). The prover just needs to be conservative and accurate. Every time a human signs off on a proof, it becomes future automation.