estimates-revised: velocity-driven timeline, self-writing triad

- Observed velocity: v0.4.0 to v0.7.2 in one day, 80+ Lisp commits.
  Bottleneck is human review of Screamer-flagged 5%, not coding.
- Revised: v1.0.0 in 3-5 weeks (~80 cycles, 2-3h human review).
  Lisp Machine hardware in 2-4 weeks (~60 cycles; ~4-6h review
  cumulative with v1.0.0).
  Full Stoa v2.0.0 (editor, browser, shell) in 2-3 weeks.
  Total to self-driving Lisp Machine with Stoa: 8-12 weeks.
- Beyond bootstrap: system writes Stoa (~150K lines), Agora (~100K),
  hardware VHDL (~50K). Human only writes design decisions and
  reviews the 5% edge cases Screamer flags.
- The triad replaces every layer of computing: cognition, environment,
  network — one gate stack, one prover, no cloud, no gatekeeper,
  no per-token fee. A complete alternative infrastructure that
  the system writes itself, one ACL2-verified submission at a time.
Hermes
2026-05-21 19:22:24 +00:00
parent 434f754c15
commit 3a02e847f9

@@ -1067,6 +1067,126 @@ The surprising result: **a self-driving Lisp Machine is a ~21,000
line project for a small team working less than a year.** Not a
billion-dollar moonshot. A well-scoped engineering project.
*** Revised time estimate given actual velocity
Moving from v0.4.0 to v0.7.2 (three minor versions covering TUI,
streaming, gate trace, HITL, Merkle audit, tool hardening, session
rewind, undo/redo, skills engine) in a single session means the
agent writes the code and the symbolic engine verifies it on a
cycle measured in minutes, not days.

The limiting factor is not coding speed. It is:

1. LLM API call latency per iteration (seconds per generation)
2. ACL2 verification time per submission (milliseconds per theorem)
3. Human review of Screamer-flagged edge cases (the 5%)

For the 4,500 lines remaining to v1.0.0, distributed across ~40
independent features (each 50-500 lines), with the agent generating,
ACL2 verifying, and the human reviewing only the flagged 5%:
| Phase | Lines | Cycles | Wall clock |
|-------|-------|--------|------------|
| TUI stabilization + eval harness | ~700 | 10-14 | Days |
| Phases 0-4 (type gates, fact store, Screamer, archivist, sufficiency) | ~670 | 10-14 | Days |
| Phase 5 (VivaceGraph, Merkle DAG, ontology versioning) | ~400 | 6-10 | Days |
| Phase 6 (ACL2 base + 5 macro layers) | ~540 | 8-12 | Days |
| Phase 7 (10-80-10 planner) | ~500 | 8-10 | Days |
| Polish features (skins, export, CLI, MCP, LSP, telemetry, etc.) | ~1,500 | 20-30 | 1-2 weeks |
| Integration, edge-case hardening, cross-phase regression | — | — | 1-2 weeks |
| **Total to v1.0.0** | **~4,500** | **~80 cycles** | **3-5 weeks** |
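
As a quick sanity check on the totals row, the per-phase figures can be summed directly. This is a minimal sketch; the dictionary names and the `total_range` helper are illustrative and not part of the project. The integration row carries no line or cycle estimate in the table, so it is left out.

```python
# Per-phase figures copied from the table above.
# Cycles are (low, high) ranges; lines are the approximate counts.
PHASE_CYCLES = {
    "TUI stabilization + eval harness": (10, 14),
    "Phases 0-4": (10, 14),
    "Phase 5": (6, 10),
    "Phase 6": (8, 12),
    "Phase 7": (8, 10),
    "Polish features": (20, 30),
}
PHASE_LINES = {
    "TUI stabilization + eval harness": 700,
    "Phases 0-4": 670,
    "Phase 5": 400,
    "Phase 6": 540,
    "Phase 7": 500,
    "Polish features": 1500,
}


def total_range(ranges):
    """Sum (low, high) pairs component-wise."""
    lows, highs = zip(*ranges)
    return sum(lows), sum(highs)


low, high = total_range(PHASE_CYCLES.values())
print(f"cycles: {low}-{high}")                  # cycles: 62-90
print(f"lines: {sum(PHASE_LINES.values())}")    # lines: 4310
```

The ~80 cycle total sits inside the summed 62-90 range, and the itemized lines come to about 4,300 of the ~4,500 stated.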
The bottleneck at this velocity is not code generation. It is
human availability to review the Screamer-flagged 5%. At 80 cycles
across 40 features, that is roughly 5 flagged rules per feature,
about 200 total, each requiring a yes/no answer from the human. In
a dedicated review session: 2-3 hours of human time.
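
The review budget implies a pace per decision, which is worth making explicit. The sketch below only restates the figures from the paragraph above (200 decisions, 2-3 hours); the constant and function names are illustrative.

```python
FLAGGED_ITEMS = 200      # total yes/no decisions, from the text
REVIEW_HOURS = (2, 3)    # dedicated review session, from the text


def seconds_per_decision(items, hours):
    """Average time available per yes/no answer."""
    return hours * 3600 / items


fast = seconds_per_decision(FLAGGED_ITEMS, REVIEW_HOURS[0])
slow = seconds_per_decision(FLAGGED_ITEMS, REVIEW_HOURS[1])
print(f"{fast:.0f}-{slow:.0f} s per decision")  # 36-54 s per decision
```

At well under a minute per answer, the estimate assumes each flagged rule really is a quick yes/no call; items that need investigation would stretch the session.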
For the Lisp Machine hardware integration (microcode, PCIe DMA,
Tensix management, benchmark harness — ~6,000 lines):
| Component | Lines | Cycles | Wall clock |
|-----------|-------|--------|------------|
| RISC-V microcode for Lisp dispatch | ~3,000 | 20-30 | 1-2 weeks |
| PCIe DMA driver (C + sb-alien FFI) | ~500 | 4-6 | Days |
| Tensix core management | ~1,500 | 10-15 | Days |
| Benchmark harness + microcode synthesis | ~1,000 | 8-12 | Days |
| **Total hardware integration** | **~6,000** | **~60 cycles** | **2-4 weeks** |
The Lisp Machine hardware integration is slower per cycle because
the microcode must be loaded onto physical hardware and benchmarked.
Each cycle includes: generate → ACL2 verify → load onto Tensix →
run benchmark → measure → feed back. That adds seconds per cycle
vs milliseconds for pure-software verification.
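
The hardware cycle described above can be sketched as a loop. This is a hedged illustration, not the project's implementation: every name here (`run_cycle`, `iterate`, `CycleResult`, and the toy stand-ins) is hypothetical, and the stages are passed in as plain callables so the sketch runs without ACL2 or Tensix hardware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CycleResult:
    candidate: str
    verified: bool
    score: Optional[float]  # None when the candidate never reached hardware


def run_cycle(generate, verify, load_and_benchmark, feedback):
    """One pass: generate -> verify -> load onto hardware -> benchmark."""
    candidate = generate(feedback)
    if not verify(candidate):
        # Failed verification: skip the (slow) hardware step entirely.
        return CycleResult(candidate, False, None)
    return CycleResult(candidate, True, load_and_benchmark(candidate))


def iterate(generate, verify, load_and_benchmark, target, max_cycles):
    """Repeat cycles, feeding each result back, until the target is met."""
    feedback, history = None, []
    for _ in range(max_cycles):
        result = run_cycle(generate, verify, load_and_benchmark, feedback)
        history.append(result)
        if result.verified and result.score >= target:
            break
        feedback = result
    return history


# Toy stand-ins (assumptions for illustration only).
def toy_generate(feedback):
    n = 0 if feedback is None else int(feedback.candidate.split("-")[1]) + 1
    return f"microcode-{n}"


def toy_verify(candidate):
    return not candidate.endswith("-0")    # pretend the first draft fails


def toy_benchmark(candidate):
    return float(candidate.split("-")[1])  # pretend later drafts score higher


history = iterate(toy_generate, toy_verify, toy_benchmark,
                  target=2.0, max_cycles=10)
print([(r.candidate, r.verified, r.score) for r in history])
```

Placing verification before the hardware load means a rejected candidate costs only the fast check and never the slower load-and-benchmark step.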
The total to a self-driving Lisp Machine (Logos + Stoa hardware):
~140 cycles, 6-10 weeks, 4-6 hours of human review time.

For the full Stoa (editor, browser, shell, Qt integration):

Stoa is not written from scratch. It is first assembled from
existing components, then systematically replaced. The initial
assembly is fast:
| Stage | Approach | Lines | Cycles | Wall clock |
|-------|----------|-------|--------|------------|
| Qt/EQL5 shell (minimal) | Wrap existing Qt widgets | ~500 | 4-6 | Days |
| Lish editor (minimal) | Org buffer + Qt text widget | ~1,000 | 8-10 | Days |
| Nyxt browser Stage 1 | Qt + WebKit, wrap existing API | ~2,000 | 10-15 | 1-2 weeks |
| **Stoa v2.0.0 working** | | **~3,500** | **~30 cycles** | **2-3 weeks** |
After v2.0.0, erosion begins: the existing components are replaced
one at a time. Each replacement is a self-contained project where
the system proposes the replacement, ACL2 verifies that it produces
identical output for all known inputs, and the system loads it. The
timeline is no longer measured in cycles. It is measured in how many
verifiable replacements the system can propose and test before
settling on the optimal implementation.
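
The replacement gate can be illustrated with a small sketch. Note the substitution: the text describes ACL2 verifying equivalence, whereas this example simply executes both implementations on a list of known inputs, which is a much weaker check. All names (`mismatches_on`, `maybe_swap`, the word-count functions) are hypothetical.

```python
def mismatches_on(known_inputs, current, replacement):
    """Inputs on which the replacement disagrees with the current code."""
    return [x for x in known_inputs if current(x) != replacement(x)]


def maybe_swap(known_inputs, current, replacement):
    """Adopt the replacement only if no known input distinguishes the two."""
    mismatches = mismatches_on(known_inputs, current, replacement)
    return (replacement if not mismatches else current), mismatches


# Toy example: replacing a wrapped library call with native code.
def wrapped_word_count(text):
    return len(text.split())


def native_word_count(text):
    count, in_word = 0, False
    for ch in text:
        if ch.isspace():
            in_word = False
        elif not in_word:
            count, in_word = count + 1, True
    return count


def buggy_word_count(text):
    return text.count(" ") + 1  # wrong on empty and padded input


KNOWN_INPUTS = ["", "one", "two words", "  padded  text "]

chosen, bad = maybe_swap(KNOWN_INPUTS, wrapped_word_count, native_word_count)
print(chosen.__name__, bad)   # native_word_count []

chosen, bad = maybe_swap(KNOWN_INPUTS, wrapped_word_count, buggy_word_count)
print(chosen.__name__, bad)   # wrapped_word_count ['', '  padded  text ']
```

A rejected replacement leaves the current component in place, so the system stays working while further candidates are proposed.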
The total from today to a fully self-driving Lisp Machine with a
working editor, browser, and shell: approximately 8-12 weeks at
the observed velocity. Not years.
*** Self-writing beyond the bootstrap
Once the system achieves sufficiency for software engineering
(Phase 4 flip applied to code generation), the bulk of Stoa and
Agora is written by the system itself:
| System | Human writes | System writes | Total |
|--------|--------------|---------------|-------|
| Logos (Passepartout) | ~10,700 existing + ~4,500 to v1.0.0 | The system extends its own macro layers and fact store | ~15,000 + growing |
| Stoa (environment) | Design decisions, architectural constraints | ~100,000+ lines of editor, browser, shell, layout engine, each component verified by ACL2 before loading | ~150,000+ |
| Agora (network) | Protocol specification, threat model | DIDComm implementation, Relay Network, PDS, Lightning integration, contracts — each module verified by ACL2 | ~100,000+ |
| Hardware (tagged RISC-V) | ISA design, TinyTapeout shuttle | VHDL/Verilog for tagged memory, GC bus master, Lisp primitives — synthesized and tested via FPGA | ~50,000+ |
The human time is dominated by design decisions, not code writing.
Code writing is the agent's job. The bottleneck shifts from "how
many lines can I write per day" to "how many design decisions can
I make per day and how quickly can I review the 5% of ambiguities
Screamer flags."
At the observed velocity (v0.4.0 to v0.7.2 in a day), a
deep-thinking human paired with this architecture can go from
today's Passepartout to the full Logos + Stoa + Agora triad in
approximately 3-6 months — most of that time spent on design
decisions and protocol specification, not on code.
The triad, when complete, replaces every layer of the current
computing stack: cognition (OpenAI/Anthropic), environment
(Apple/Microsoft/Google), and network (Facebook/Twitter/Slack).
In their place stand Lisp-native, user-owned, ACL2-verified
alternatives that run at near-zero marginal compute cost. The
lines of code that run the modern internet (tens of millions
across Google, Meta, Amazon, Apple, Microsoft) are replaced by a
single coherent architecture where one gate stack verifies
everything and one prover proves everything consistent.
The social and economic impact of this is not "a better AI agent."
It is a complete alternative infrastructure for personal computing
that requires no cloud, no gatekeeper, no per-token fee, and no
trust. The lines don't need to exist on day one. They need to
exist in the right order — and the system writes them in that
order, one ACL2-verified submission at a time.
*** The full triad: Logos, Stoa, Agora
The self-driving Lisp Machine is not the endpoint. It is one