passepartout/docs/ROADMAP.org

#+TITLE: Passepartout Evolutionary Roadmap
#+STARTUP: content
#+FILETAGS: :docs:roadmap:

* The Evolutionary Roadmap

The roadmap is designed working backwards from SOTA parity (v1.0.0), guiding each version toward a fully autonomous, self-editing agent. Each version builds on the previous, with features designed to be implemented in pure Common Lisp + Org-mode.

The TODO states in each version's Tasks section are the authoritative task tracker. The feature tables describe what each version delivers.

** Non-Negotiable Identity
- Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
- Single-address-space memory (Lisp hash tables in RAM — the agent IS the memory).
- "Thin harness, fat skills" — complexity lives at the edges, not the kernel.
- One agent composed of many skills. Concurrency via bordeaux-threads (shared memory).
- Plists everywhere — homoiconic communication between all components.

** Version Roadmap

Understanding Passepartout as a function of time is not nostalgia. It is architectural guidance. Every decision in v0.x should be made with awareness of where the system is going. Code written today becomes the substrate for v3.0. Skills designed today become the vocabulary the symbolic engine speaks tomorrow.

The probabilistic beginning is not a weakness to overcome. It is the bootstrap. The system learns the domain through probabilistic inference, and that learned knowledge becomes the seed for the symbolic engine. By the time the symbolic engine takes over, it has a rich knowledge graph to reason about, grown from thousands of probabilistic interactions.

This is how you build a reasoning machine: start with a learner, make it learn to verify, let verification become the core, remove the learner once it has learned enough.


*** v0.1.0: The Autonomous Foundation — RELEASED 2026-04-20

The secure, auditable Lisp kernel. All core infrastructure in place.

**** DONE Perceive-Reason-Act pipeline
:PROPERTIES:
:ID:       id-06f10b9a-4054-4dea-a927-b0935fbdcd2f
:CREATED:  [2026-03-22 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE Skills engine with jailed loading
:PROPERTIES:
:ID:       id-dc83944f-3923-4142-b324-c317dacd6b0b
:CREATED:  [2026-03-22 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE Policy skill (6 invariants)
:PROPERTIES:
:ID:       id-929c84b7-d6ae-42b9-a8b5-d9df962db826
:CREATED:  [2026-03-22 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE Memory (memory-object + Merkle hashing)
:PROPERTIES:
:ID:       id-3a96b384-cacf-4da0-8faa-1647739feba9
:CREATED:  [2026-03-22 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE Scribe + Gardener background workers
:PROPERTIES:
:ID:       id-3f618a38-ec23-4034-ba3c-ef272e212e2b
:CREATED:  [2026-03-22 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE LLM gateway (OpenRouter, Ollama)
:PROPERTIES:
:ID:       id-f5d870e2-cbd2-4c00-a8d4-174ab4118afc
:CREATED:  [2026-04-11 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE Shell actuator, Emacs bridge, credentials vault
:PROPERTIES:
:ID:       id-7ca3167f-8353-4bb7-8b97-c039017716b0
:CREATED:  [2026-04-11 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

**** DONE FiveAM test suite
:PROPERTIES:
:ID:       id-925d4180-764b-4219-8bdc-8e1849572da1
:CREATED:  [2026-04-11 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-20 Mon]
:END:

*** v0.2.0: Interactive Refinement — RELEASED 2026-04-29

The "Brain" meets the "Machine." Standardization and professionalization of the user interface and environment.

*v0.2.0 through v0.5.0: The Dispatcher Learns*

Each version expands the deterministic layer. The Dispatcher writes rules from approved exceptions. Shadow mode runs trial executions. Tool permission tiers mature from simple allow/deny to nuanced context-aware policies. The agent becomes less likely to attempt dangerous actions not because it is smarter but because the guard has more complete information.

This is the bootstrapping phase. The system learns by watching itself and its user. Every blocked action becomes a rule. Every approved exception becomes a pattern. The symbolic layer grows at the probabilistic layer's expense.


**** DONE Professional TUI (Croatoan-based, styled, scrollable)
:PROPERTIES:
:ID:       id-57cef382-fe14-42e6-aade-03e05e3e920b
:CREATED:  [2026-04-28 Tue]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed]
:END:

**** DONE Self-editing (error detection, surgical fix, hot-reload)
:PROPERTIES:
:ID:       id-459b8275-9979-4d0f-8d61-a9af883930d4
:CREATED:  [2026-04-23 Wed]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed]
:END:

**** DONE Enhanced utilities (structural Lisp/Org manipulation + REPL)
:PROPERTIES:
:ID:       id-23f37c0d-4e77-4dc3-ab43-52a5987eb426
:CREATED:  [2026-04-23 Wed]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed]
:END:

**** DONE Onboarding wizard (modular Lisp setup for LLM providers)
:PROPERTIES:
:ID:       id-bd497de7-3533-4056-b89f-2c992d2ea28b
:CREATED:  [2026-04-28 Tue]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed]
:END:

**** DONE Memory rollback (snapshot and restore)
:PROPERTIES:
:ID:       id-fd2fb6e3-03e7-4e22-b9e9-a7eecfd06718
:CREATED:  [2026-04-12 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-04-29 Wed]
:END:

**** DONE Secret Exposure Gate, Shell Safety, Lisp Validation
:PROPERTIES:
:ID:       id-aa53c128-195b-42d4-9838-2def59faf7cf
:CREATED:  [2026-05-02 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat]
:END:

**** DONE Multi-distro deployment (Debian+Fedora, systemd, Docker)
:PROPERTIES:
:ID:       id-783df999-f7fe-45c8-896d-2fd07c604d64
:CREATED:  [2026-05-02 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat]
:END:

**** DONE Project rename to Passepartout (files, packages, env vars)
:PROPERTIES:
:ID:       id-91724874-aa0d-4804-9220-8bc5551f1366
:CREATED:  [2026-05-02 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat]
:END:

**** DONE 31 org files with full literate prose
:PROPERTIES:
:ID:       id-597b2a92-aac6-481a-b2c4-4f9842ced97c
:CREATED:  [2026-05-02 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat]
:END:

*** v0.3.0: Event Orchestration + HITL

Unified control plane and Human-in-the-Loop state management.

**** Remediation: Backfill v0.1.0/v0.2.0 Gaps

These features were marked DONE in prior versions but are stubs, no-ops, or
missing. They must be completed before v0.3.0 feature work proceeds.

***** DONE P0: Add vault-get-secret / vault-set-secret wrappers       :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-vault-secret-wrappers
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=vault-get-secret= and =vault-set-secret= are exported from =core-defpackage=
and called from =gateway-messaging.org= (lines 36, 86, 180) but never defined.
=gateway-link= crashes at runtime. Add one-line wrappers in =security-vault.org=
that delegate to the existing =vault-get=/=vault-set= with ~:type :secret~.
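
A minimal sketch of the two wrappers follows. The =vault-get= and =vault-set= definitions here are stand-ins so the block runs on its own; their real signatures in =security-vault.org= are not shown in this roadmap, so the =key= argument and the =:generic= default type are assumptions.

#+begin_src lisp
;; Stand-ins for the existing vault primitives. The real ones live in
;; security-vault.org; this signature (key plus a :type keyword) is assumed.
(defvar *sketch-vault* (make-hash-table :test #'equal))

(defun vault-get (key &key (type :generic))
  (gethash (list type key) *sketch-vault*))

(defun vault-set (key value &key (type :generic))
  (setf (gethash (list type key) *sketch-vault*) value))

;; The one-line wrappers described above.
(defun vault-get-secret (key)
  (vault-get key :type :secret))

(defun vault-set-secret (key value)
  (vault-set key value :type :secret))
#+end_src

Keeping the wrappers this thin means any later change to vault storage stays inside =vault-get= and =vault-set=.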

***** DONE P0: system-archivist — Scribe + Gardener                   :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-archivist-distillation
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
Scribe: distill daily Org logs into atomic Zettelkasten notes with backlinks.
Gardener: scan for broken =[[file:]]= links and orphaned =memory-object= entries.
Wire both as cron jobs via =system-event-orchestrator=.
Depends on: orchestrator bootstrap (P1 item below).

***** DONE P0: system-self-improve — surgical edit + error fix        :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-self-improve-real
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=self-improve-edit=: =org-read-file= → text replace → =snapshot-memory= →
=org-write-file= → =literate-block-balance-check= → tangle → reload.
=self-improve-fix=: parse error log → =lisp-structural-check= →
=lisp-extract= → surgical repair → =repl-eval= verify.
Remove the dead first =defskill= registration (trigger nil, overwritten by second).
Depends on: =programming-org=, =programming-literate= (P0 items below).

***** DONE P0: programming-org — fix org-modify + org-ast-render      :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-org-modify-render
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=org-modify(filepath, id, changes)= ignores ~changes~ and only logs. Should locate
node by ID in file and apply changes to its content.
=org-ast-render(ast)= returns a hardcoded placeholder. Should convert plist AST
back to Org text.

***** DONE P0: programming-literate — fix both stubs                  :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-literate-real
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=literate-block-balance-check=: verify all =#+begin_src lisp= blocks in an Org file
have balanced parentheses. Returns T if all balanced, error message otherwise.
=literate-tangle-sync-check=: verify =.lisp= file matches tangled output of =.org= file.
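
The balance check could look roughly like the sketch below. It takes the Org text as a string rather than a file path, and it counts parentheses while skipping strings, =;= comments, and escaped characters. Block comments and =|symbol|= syntax are not handled, and the helper names =prefix-p= and =balanced-parens-p= are hypothetical.

#+begin_src lisp
(defun prefix-p (prefix string)
  "Case-insensitive test that STRING starts with PREFIX."
  (and (>= (length string) (length prefix))
       (string-equal prefix string :end2 (length prefix))))

(defun balanced-parens-p (code)
  "True when CODE has balanced parentheses outside strings and ; comments."
  (let ((depth 0) (state :code))
    (loop with i = 0
          while (< i (length code))
          do (let ((ch (char code i)))
               (ecase state
                 (:code (case ch
                          (#\" (setf state :string))
                          (#\; (setf state :comment))
                          (#\\ (incf i))   ; skip the escaped character
                          (#\( (incf depth))
                          (#\) (decf depth)
                               (when (minusp depth)
                                 (return-from balanced-parens-p nil)))))
                 (:string (case ch
                            (#\\ (incf i))
                            (#\" (setf state :code))))
                 (:comment (when (char= ch #\Newline)
                             (setf state :code)))))
             (incf i))
    (zerop depth)))

(defun literate-block-balance-check (org-text)
  "Return T if every lisp src block in ORG-TEXT is balanced, else a message."
  (let ((in-block nil) (block-lines '()) (block-number 0))
    (with-input-from-string (in org-text)
      (loop for line = (read-line in nil)
            while line
            do (let ((trimmed (string-left-trim " " line)))
                 (cond ((and (not in-block) (prefix-p "#+begin_src lisp" trimmed))
                        (setf in-block t block-lines '())
                        (incf block-number))
                       ((and in-block (prefix-p "#+end_src" trimmed))
                        (setf in-block nil)
                        (unless (balanced-parens-p
                                 (format nil "~{~a~%~}" (reverse block-lines)))
                          (return-from literate-block-balance-check
                            (format nil "Unbalanced parentheses in lisp block ~d"
                                    block-number))))
                       (in-block (push line block-lines))))))
    t))
#+end_src

Returning the block number in the message gives =self-improve-fix= a place to start looking; a real implementation would probably report the file line as well.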

***** DONE P1: system-event-orchestrator — bootstrap implementation   :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-orchestrator-bootstrap
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=orchestrator-bootstrap= currently only logs. Should scan Org files for =#+HOOK:=
and =#+CRON:= properties and register them via the existing registries.
Prerequisite for archivist cron jobs.
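
As an illustration of the scanning step only, the sketch below collects the raw values of =#+HOOK:= and =#+CRON:= lines from Org text. The value syntax is not specified in this roadmap, so the values are returned as unparsed strings, and the function names are hypothetical. Registration into the existing registries is left out.

#+begin_src lisp
(defun keyword-line-value (keyword line)
  "If LINE looks like an Org keyword line for KEYWORD, return its trimmed value."
  (let* ((trimmed (string-trim " " line))
         (prefix (format nil "#+~a:" keyword)))
    (when (and (>= (length trimmed) (length prefix))
               (string-equal prefix trimmed :end2 (length prefix)))
      (string-trim " " (subseq trimmed (length prefix))))))

(defun scan-org-registrations (org-text)
  "Return a plist (:hooks (...) :crons (...)) of raw keyword values."
  (let ((hooks '()) (crons '()))
    (with-input-from-string (in org-text)
      (loop for line = (read-line in nil)
            while line
            do (let ((hook (keyword-line-value "HOOK" line))
                     (cron (keyword-line-value "CRON" line)))
                 (when hook (push hook hooks))
                 (when cron (push cron crons)))))
    (list :hooks (nreverse hooks) :crons (nreverse crons))))
#+end_src

Returning a plist keeps the result in the same shape the rest of the system passes around, so the bootstrap can hand each entry straight to its registry.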

***** DONE P1: system-memory — memory introspection                   :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-memory-inspect
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=memory-inspect= only logs. Should return structured statistics: object count
by type, TODO state distribution, orphan count, snapshot list. Trigger on
=:INTROSPECTION= sensor type.
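
A rough sketch of the statistics part, assuming memory objects can be viewed as plists with =:type= and =:todo-state= keys (both key names are assumptions). Orphan counting and the snapshot list are omitted because they depend on store internals not described here.

#+begin_src lisp
(defun memory-inspect-sketch (objects)
  "Summarise OBJECTS, a list of plists with assumed keys :type and :todo-state."
  (let ((by-type '()) (by-state '()))
    (dolist (obj objects)
      ;; Count every object under its type.
      (incf (getf by-type (getf obj :type) 0))
      ;; Count TODO states only for objects that carry one.
      (let ((state (getf obj :todo-state)))
        (when state
          (incf (getf by-state state 0)))))
    (list :object-count (length objects)
          :by-type by-type
          :todo-states by-state)))
#+end_src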

***** DONE P1: Path relic — skills/ → lisp/ in skill-initialize-all   :backfill:
CLOSED: [2026-05-03 Sun 10:42]
:PROPERTIES:
:ID:       id-path-relic
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 10:42]
:END:
=skill-initialize-all= and =context-skill-source= resolve against =skills/=
under =$PASSEPARTOUT_DATA_DIR=. Core and skills were merged into =lisp/=.
Update both functions to point at =lisp/=.

***** DONE P2: core-context — semantic retrieval (embeddings)         :backfill:
CLOSED: [2026-05-03 Sun 11:42]
:PROPERTIES:
:ID:       id-embeddings
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 11:42]
:END:
=org-object-vector= is never populated; all similarities are 0.0. Generate
embeddings via Ollama =nomic-embed-text= at ingest time. Store in
=memory-object.vector=. Fallback: TF-IDF bag-of-words.
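
To make the fallback concrete, here is a sketch of bag-of-words similarity using raw term counts and cosine similarity. It deliberately leaves out the IDF weighting, which needs corpus-wide document frequencies, and the function names are hypothetical.

#+begin_src lisp
(defun tokenize (text)
  "Split TEXT into lowercase alphanumeric tokens."
  (let ((tokens '()) (current '()))
    (flet ((flush ()
             (when current
               (push (coerce (nreverse current) 'string) tokens)
               (setf current '()))))
      (loop for ch across text
            do (if (alphanumericp ch)
                   (push (char-downcase ch) current)
                   (flush)))
      (flush))
    (nreverse tokens)))

(defun term-counts (text)
  "Return a hash table mapping each token of TEXT to its count."
  (let ((counts (make-hash-table :test #'equal)))
    (dolist (tok (tokenize text) counts)
      (incf (gethash tok counts 0)))))

(defun cosine-similarity (a b)
  "Cosine of two term-count hash tables; 0.0 when either is empty."
  (let ((dot 0) (norm-a 0) (norm-b 0))
    (maphash (lambda (term n)
               (incf dot (* n (gethash term b 0)))
               (incf norm-a (* n n)))
             a)
    (maphash (lambda (term n)
               (declare (ignore term))
               (incf norm-b (* n n)))
             b)
    (if (or (zerop norm-a) (zerop norm-b))
        0.0
        (/ dot (sqrt (* norm-a norm-b))))))
#+end_src

The explicit zero-norm branch matters here: an unpopulated vector yields 0.0 rather than a division error, which is the same symptom the item above describes, so callers may want to treat an empty vector as "not yet embedded" instead of "dissimilar".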

***** DONE P2: core-context — subtree-based skill source loading      :backfill:
CLOSED: [2026-05-03 Sun 11:42]
:PROPERTIES:
:ID:       id-skill-subtree
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 11:42]
:END:
=context-skill-source= reads entire Org files. Add =context-skill-subtree=
for targeted retrieval of specific function docs or test blocks by heading name.

***** DONE P3: Variable name drift normalization (out of scope for now) :backfill:
CLOSED: [2026-05-03 Sun 11:50]
:PROPERTIES:
:ID:       id-name-normalization
:CREATED:  [2026-05-03 Sun]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 11:50]
:END:
=*memory*= (context) vs =*memory-store*= (memory). =*skills-registry*= with
underscore (reason/context) vs =*skill-registry*= with hyphen (defpackage).
Normalization pass across all modules. Touches every file; do after P0-P2
are stable. Do not mix with functional changes.

***** DONE P4: Eliminate STYLE-WARNINGs from setup output             :cosmetic:
CLOSED: [2026-05-04 Mon]
SBCL emits about 25 STYLE-WARNINGs at boot due to forward references (function
called before its =defun= appears in the file). Actual bugs (C/T, handler-case,
bare =return=) are already fixed. Remaining warnings fall into two categories:
1. Same-file forward references (reorder =defun=s to fix).
2. Cross-skill references (inherent to skill architecture; suppress or accept).
Reordering is mechanical but tedious: grep each file's =defun= list, compute
topological order, move definitions down. Do not change function bodies.

**** DONE Project Renaming (Bouncer → Dispatcher)
:PROPERTIES:
:ID:       id-9e779580-287b-b3d1-37b9-bcefd750bf9e
:CREATED:  [2026-05-01 Fri 15:40]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat 22:00]
:END:
The Dispatcher's role has evolved beyond security guard. It is the seed of the deterministic engine — it learns to execute procedures without invoking the neural net.

**** DONE Event Orchestrator (unified hooks+cron+routing)
:PROPERTIES:
:ID:       id-d35aea3d-2e5f-4a12-a9b0-1c2d3e4f5a6b
:CREATED:  [2026-05-02 Sat 14:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat 22:36]
:END:
Unified control plane for hooks, cron, and complexity-based routing.
- ~*hook-registry*~ + ~*cron-registry*~ + tier classifier
- Hooks via ~#+HOOK:~ Org-mode properties
- Three complexity tiers: ~:REFLEX~ (no LLM), ~:COGNITION~ (light LLM), ~:REASONING~ (full LLM)
- Hooked into heartbeat for cron processing
- Rule-based tier classifier (overrideable via ~*tier-classifier*~)
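
A rule-based classifier with an override point might be sketched as follows. The event keys =:kind= and =:text=, the specific rules, and the 200-character threshold are all illustrative assumptions; only the three tier names and the ~*tier-classifier*~ variable come from the list above.

#+begin_src lisp
(defun default-tier-classifier (event)
  "Classify EVENT (a plist) into :reflex, :cognition or :reasoning.
The :kind and :text keys and the length threshold are assumptions."
  (let ((kind (getf event :kind))
        (text (or (getf event :text) "")))
    (cond ((member kind '(:cron :heartbeat)) :reflex)   ; no LLM needed
          ((< (length text) 200) :cognition)            ; light LLM
          (t :reasoning))))                             ; full LLM

(defvar *tier-classifier* #'default-tier-classifier
  "Function of one event plist returning a tier keyword; rebind to override.")

(defun classify-tier (event)
  (funcall *tier-classifier* event))
#+end_src

Because the classifier is held in a special variable, a skill can override it for one call with a plain =let= binding and the default comes back automatically afterwards.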

**** DONE Context Manager (project scoping)
CLOSED: [2026-05-05 Tue]
:PROPERTIES:
:ID:       id-context-manager-scoping
:CREATED:  [2026-05-05 Tue]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-05 Tue]
:END:
Stack-based project focusing with persistence.
- ~push-context~, ~pop-context~, ~with-context~ stack operations
- ~current-scope~ wired into perceive gate ~*scope-resolver*~
- ~/focus~, ~/scope~, ~/unfocus~ TUI commands
- Context stack persisted to =~/.cache/passepartout/context.lisp=, auto-restores on boot
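
The stack operations and persistence could be sketched like this. The operation names come from the list above, but the argument lists, the ~*context-stack*~ variable, and the save/load helpers are assumptions made for illustration.

#+begin_src lisp
(defvar *context-stack* '()
  "Project scopes, most recently focused first.")

(defun push-context (scope)
  (push scope *context-stack*)
  scope)

(defun pop-context ()
  (pop *context-stack*))

(defun current-scope ()
  (first *context-stack*))

(defmacro with-context ((scope) &body body)
  "Run BODY with SCOPE focused; the previous stack returns on any exit."
  `(let ((*context-stack* (cons ,scope *context-stack*)))
     ,@body))

(defun save-context-stack (path)
  "Write the stack to PATH as a readable s-expression."
  (with-open-file (out path :direction :output :if-exists :supersede)
    (with-standard-io-syntax
      (prin1 *context-stack* out))))

(defun load-context-stack (path)
  "Restore the stack from PATH if the file exists."
  (with-open-file (in path :if-does-not-exist nil)
    (when in
      (with-standard-io-syntax
        (let ((*read-eval* nil))   ; never evaluate code while reading state
          (setf *context-stack* (read in nil '())))))))
#+end_src

Using a dynamic binding in =with-context= means a non-local exit, such as an error inside the body, cannot leave a stale scope on the stack.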

**** DONE Model-Tier Routing (cost optimization)
CLOSED: [2026-05-03 Sun 16:00]
:PROPERTIES:
:ID:       id-model-tier-routing
:CREATED:  [2026-05-02 Sat 23:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 16:00]
:END:
Extend ~*model-selector*~ for quadrant-based routing with per-slot provider cascades.
- Privacy filter (local-only for @personal content) — top priority
- Quadrant tagging (foreground/background × probabilistic/deterministic)
- Complexity classifier (code/plan/chat/background slots), each with its own provider cascade
- Model-selector skill registers into ~*model-selector*~ hook
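
One way to picture the per-slot cascades with the privacy filter applied first is the sketch below. The slot names follow the list above; the cascade contents, the ~*local-providers*~ list, and the =select-providers= function are hypothetical.

#+begin_src lisp
(defparameter *cascades*
  '(:code       (:openrouter :ollama)
    :plan       (:openrouter :ollama)
    :chat       (:ollama :openrouter)
    :background (:ollama))
  "Slot -> ordered provider list. Contents are illustrative only.")

(defparameter *local-providers* '(:ollama)
  "Providers that never send content off the machine (assumed).")

(defun select-providers (slot &key personal-p)
  "Return the provider cascade for SLOT.
When PERSONAL-P is true, only local providers survive the filter."
  (let ((cascade (getf *cascades* slot)))
    (if personal-p
        (remove-if-not (lambda (p) (member p *local-providers*)) cascade)
        cascade)))
#+end_src

Applying the privacy filter to the cascade itself, rather than to the final choice, means a failed local provider cannot fall through to a remote one for personal content.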

Deferred:
- Economics / budget tracking (per-request cost, cumulative caps)
- TUI /config command for cascade configuration (env vars for now)
- Skill metadata declaring complexity at defskill time (keyword-based for now)
- Visual model indicator in TUI status bar

**** DONE Memory Scope Segmentation
CLOSED: [2026-05-03 Sun 16:30]
:PROPERTIES:
:ID:       id-memory-scope-segmentation
:CREATED:  [2026-05-02 Sat 23:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 16:30]
:END:
Extend memory-object with ~:scope~ property.
- ~:memex~ (permanent knowledge), ~:session~ (ephemeral), ~:project~ (current work)
- Scope-aware retrieval in memory layer
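
Scope-aware retrieval can be as small as a filter over the candidate set. The sketch assumes memory objects are visible as plists carrying the ~:scope~ property; the =retrieve-in-scopes= name is hypothetical.

#+begin_src lisp
(defun retrieve-in-scopes (objects &key (scopes '(:memex :session :project)))
  "Return the OBJECTS (plists) whose :scope is one of SCOPES."
  (remove-if-not (lambda (obj) (member (getf obj :scope) scopes))
                 objects))
#+end_src

For example, passing ~:scopes '(:memex :project)~ would keep ephemeral ~:session~ entries out of a long-term recall query.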

**** DONE Asynchronous Embedding Gateway
CLOSED: [2026-05-05 Tue]
:PROPERTIES:
:ID:       id-async-embedding
:CREATED:  [2026-05-02 Sat 23:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-05 Tue]
:END:
Provider-agnostic vector generation (Ollama, OpenAI, hashing fallback).
- Three backends: local (Ollama-compatible), openai (/v1/embeddings), hashing (SHA-256)
- ~embeddings-compute~ and ~*embedding-backend*~ for runtime provider selection
- ~ingest-ast~ populates vectors at object creation time
- ~mark-vector-stale~ marks vectors as ~:pending~ and queues for re-embedding
- ~embed-all-pending~ drains queue, computes vectors, stores in ~*memory-store*~
- Cron job registered with orchestrator: runs every 10m on ~:reflex~ tier
- ~EMBEDDING_PROVIDER~ env var for provider selection
- Registered as proper skill (~defskill :passepartout-system-model-embedding~)
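
The stale-marking and queue-draining flow might look like the sketch below. The store layout (id to plist), the ~*pending-embeddings*~ queue, and the placeholder backend are assumptions; a real backend would call Ollama or an OpenAI-compatible =/v1/embeddings= endpoint, and a multi-threaded image would need a lock around the queue.

#+begin_src lisp
(defvar *memory-store* (make-hash-table :test #'equal)
  "Stand-in store: id -> plist with :text and :vector.")

(defvar *pending-embeddings* '()
  "Ids whose vectors need recomputing.")

(defvar *embedding-backend*
  (lambda (text) (vector (float (length text))))
  "Placeholder backend returning a one-element vector; not a real embedding.")

(defun mark-vector-stale (id)
  "Flag ID's vector as :pending and queue it for re-embedding."
  (when (gethash id *memory-store*)
    (setf (getf (gethash id *memory-store*) :vector) :pending)
    (pushnew id *pending-embeddings* :test #'equal))
  id)

(defun embed-all-pending ()
  "Drain the queue; return the number of vectors computed."
  (let ((done 0))
    (loop while *pending-embeddings*
          do (let* ((id (pop *pending-embeddings*))
                    (obj (gethash id *memory-store*)))
               (when obj
                 (setf (getf (gethash id *memory-store*) :vector)
                       (funcall *embedding-backend* (getf obj :text)))
                 (incf done))))
    done))
#+end_src

Marking the vector ~:pending~ before the cron job runs lets retrieval code tell "not embedded yet" apart from a genuine vector.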

**** DONE TUI Experience (Daily Driver Quality)
CLOSED: [2026-05-05 Tue]
:PROPERTIES:
:ID:       id-tui-experience
:CREATED:  [2026-05-02 Sat 23:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-05 Tue]
:END:
All P0-P4 items implemented:
- P0: Chat scrollback (Page Up/Down), Input history (up/down arrows)
- P1: Status bar (connection, mode, msg count, scroll, activity indicator)
- P1: Message rendering (timestamps, colors, role icons)
- P2: Command palette (~/help~ command listing)
- P2: Multi-line input (~\ + Enter~ inserts newline)
- P3: Background activity indicator (~…thinking~ spinner)
- P4: Tab completion for all ~/~ commands
- P4: Configurable theme (~*tui-theme*~ plist, ~/theme~ command)

**** DONE Human-in-the-Loop (HITL)
CLOSED: [2026-05-03 Sun 14:00]
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-03 Sun 14:00]
:END:
Continuation-based interaction. The agent can suspend its cognitive loop to ask for
permission or clarification and resume precisely where it left off. Builds on the
Dispatcher's existing Flight Plan mechanism.
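
Common Lisp has no first-class continuations, so one plausible reading of "continuation-based" is continuation-passing with closures: the rest of the work is parked as a function until the human answers. The sketch below shows that pattern only. How it connects to the Flight Plan mechanism is not described in this roadmap, and every name in the block is hypothetical.

#+begin_src lisp
(defvar *suspended* (make-hash-table)
  "request-id -> closure awaiting the human's answer.")

(defvar *next-request-id* 0)

(defun suspend-for-approval (question continuation)
  "Park CONTINUATION and return a plist describing the pending request."
  (let ((id (incf *next-request-id*)))
    (setf (gethash id *suspended*) continuation)
    (list :type :hitl-request :id id :question question)))

(defun resume (id answer)
  "Run the parked continuation for ID with ANSWER; error if ID is unknown."
  (let ((k (gethash id *suspended*)))
    (unless k
      (error "No suspended request with id ~a" id))
    (remhash id *suspended*)   ; each request resumes at most once
    (funcall k answer)))

;; Usage: the lambda is "the rest of the loop" after the question.
(defun demo-approval ()
  (let ((request (suspend-for-approval
                  "Delete the build directory?"
                  (lambda (answer)
                    (if (eq answer :approve) :executed :skipped)))))
    (resume (getf request :id) :approve)))
#+end_src

Removing the entry before calling the closure keeps a duplicate answer from replaying the same action.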

*** v0.4.0: Long-Horizon Planning + Git Workflows

Structured tracking, failure handling, and course correction for multi-step engineering work.


**** TODO Long-Horizon Planning (task tree DAG)
Decompose complex tasks into Org-mode headline trees.
Task states: ~:todo~ → ~:next-action~ → ~:in-progress~, ending in one of the terminal states ~:done~, ~:blocked~, or ~:stuck~.
Parent summarises child results.
Branch pruning when paths fail.
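
A small sketch of a task tree as nested plists, with a parent state rolled up from its children and a pruning pass for failed branches. The roll-up rule (all children done means done, any failed child means blocked) is an assumption, as are the function names.

#+begin_src lisp
(defun make-task (title &key (state :todo) children)
  (list :title title :state state :children children))

(defun rollup-state (task)
  "Leaf tasks report their own state; parents summarise their children."
  (let ((children (getf task :children)))
    (if (null children)
        (getf task :state)
        (let ((states (mapcar #'rollup-state children)))
          (cond ((every (lambda (s) (eq s :done)) states) :done)
                ((some (lambda (s) (member s '(:blocked :stuck))) states) :blocked)
                (t :in-progress))))))

(defun prune-failed (task)
  "Return a copy of TASK without subtrees whose own state is :stuck."
  (list :title (getf task :title)
        :state (getf task :state)
        :children (mapcar #'prune-failed
                          (remove :stuck (getf task :children)
                                  :key (lambda (c) (getf c :state))))))
#+end_src

Pruning returns a fresh tree instead of mutating the original, so the failed branch can still be logged to memory before it is dropped.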

**** TODO Git Steward (version control integration)
Status, diff, commit, push, branch operations.
Policy enforces commit-before-modify gate.
Log commits to memory.

**** TODO TDD Runner Integration
Run FiveAM tests on file save.
Inject ~:test-failure~ event on red.
Hook into self-fix for auto-repair proposals.

**** TODO Deep Emacs Integration
Full org-agenda awareness: navigate, clock time, refile, archive.
Uses org-element + org-id.

*** v0.5.0: Interactive Actuation & Environment Stewardship

Interactive terminal sessions and autonomous dependency management.


**** TODO Interactive PTY Actuator
Stream long-running process output to the context window (e.g., ~npm run dev~, REPLs).
Async interrupt control (Ctrl+C emulation).

**** TODO The Environment Steward
Autonomously detect missing dependencies ("Command not found").
Propose installation command and retry the failed action.

*** v0.6.0: Concurrency + Creator + GTD


*v0.6.0 through v0.7.0: The Architecture Crystallizes*

Skills become more deterministic. The agent learns to write its own skills - first drafts generated by the LLM, but verified and refined by the symbolic engine. Self-editing improves. The REPL becomes a first-class cognitive substrate - code is not just written but verified, iterated, tested before committing.

The balance shifts. The neural engine still translates and generates, but the symbolic engine checks, constrains, and corrects. The system is becoming what Gemini called "the strict guard" - a mathematically rigorous layer intercepting probabilistic output.

The agent bootstraps itself and manages parallel workstreams.


**** TODO Skill Creator (autonomous skill generation)
LLM drafts complete skill org-file from natural language.
Mandatory: syntax validation → jail-load → test → register.

**** TODO Architect Agent (PRD → PROTOCOL)
Scan ~:STATUS: FROZEN~ PRDs. Generate Phase B PROTOCOL from Phase A.

**** TODO GTD Integration (project tracking)
Full GTD cycle: capture, clarify, organize, reflect, engage.
org-gtd v4.0 DAG (~:TRIGGER:~, ~:BLOCKER:~).

**** TODO Consensus Loop (multi-model agreement)
Run multiple providers for critical decisions.
Compare results, detect disagreements.
Confidence scoring.
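
As a hedged illustration, agreement can be scored by simple voting: the confidence is the winning answer's share of the votes. Real model outputs rarely match verbatim, so the comparison function would need normalisation or semantic matching; the =consensus= function below is hypothetical.

#+begin_src lisp
(defun consensus (answers &key (test #'equal))
  "ANSWERS is a list of provider results.
Returns three values: the most common answer, its vote share as a
rational in [0, 1], and T when providers disagreed."
  (let ((tally '()))
    (dolist (a answers)
      (let ((entry (assoc a tally :test test)))
        (if entry
            (incf (cdr entry))
            (push (cons a 1) tally))))
    (let ((best (first (sort (copy-list tally) #'> :key #'cdr))))
      (values (car best)
              (if answers (/ (cdr best) (length answers)) 0)
              (> (length tally) 1)))))
#+end_src

A caller could escalate to the human whenever the third value is true and the vote share falls below a chosen threshold.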

**** TODO Web Research (Playwright browsing)
Headless Chromium via Python bridge.
Text extraction, screenshots, Gemini Web UI automation.

**** TODO Memex Management (PARA lifecycle)
Archive DONE tasks, suggest refiling.
Detect orphaned nodes.
PARA/Zettelkasten maintenance.

*** v0.7.0: Visual Grounding & MCP Bridge

Multimodal visual interaction and ecosystem-wide tool compatibility.


**** TODO Computer Use / Vision
Allow the agent to request host OS or browser screenshots.
Analyze UI and issue precise X/Y coordinate click/type commands via X11/Wayland bridge.

**** TODO MCP Gateway Bridge
Lisp-native client for the Model Context Protocol.
Connect Passepartout to external tools and data sources.

*** v0.8.0: The Evaluation Harness

Automated benchmarking to mathematically prove the agent's reasoning capabilities.


**** TODO SWE-Bench Harness
Automated pipeline that clones repositories and feeds GitHub issues.
Track multi-step resolution trajectory, run tests, and score success.

*** v1.0.0: SOTA Parity

Feature-complete agent competitive with commercial agents. All features from v0.2.0 through v0.8.0 combined, verified, and tested end-to-end.

Achieving feature parity with commercial agents requires the full v0.x series complete. At this point, Passepartout is a reliable autonomous agent - it can handle multi-step engineering tasks, maintain context across sessions, recover from errors, pass benchmarks. It is safer than alternatives because the Dispatcher is mature and the memory architecture is sound.

But it is still fundamentally probabilistic at its core. The symbolic engine verifies and constrains, but the generative engine is still the primary reasoning source.


| Area              | Parity Target                               |
|-------------------+---------------------------------------------|
| Self-improvement  | Claude Code self-debug                      |
| Planning          | ULTRAPLAN equivalent                        |
| Tool ecosystem    | 10+ cognitive tools                         |
| Context window    | Semantic search + scope segmentation        |
| Safety            | 6 Policy invariants + formal verification   |
| Multi-step tasks  | Task trees with terminal states             |
| Code editing      | Full file read/write via org manipulation   |
| Memory            | Vector recall in memory-object              |
| Emacs integration | Full org-mode control (exceeds Claude Code) |
| Autonomy          | 100% local capable (exceeds Claude Code)    |

*** v2.0.0: Lisp Machine Emergence


This version is not about the symbolic engine - it is about tools. The agent stops running inside Emacs and starts replacing it. Lish (Lisp shell) emerges: a shell that speaks plists, not POSIX. Org-mode buffers become the file system. Org-babel becomes the REPL. The agent is no longer a passenger in Emacs - it is the operating system.

The key insight is that the agent's interface and the agent's brain become the same thing. In earlier versions, there is a clear separation: the agent produces output, the TUI displays it. In v2.0.0, the distinction blurs. The agent's thoughts are displayed in Org buffers that are also the interface that the agent manipulates.

This is the Emacs cannibalization phase. Not hostile replacement but evolution - Emacs was always a Lisp machine, and v2.0.0 completes the metamorphosis.

From Lisp-using agent to true Lisp machine. Agent IS the Emacs process.

- Lish: Lisp editor — Org-mode as IDE. Org-babel for interactive evaluation. Full REPL in TUI.
- Lish: Shell replacement — Lisp-based shell that speaks plists. Org-mode buffers as file system.

*** v3.0.0: Neurosymbolic Maturity

Deterministic planner takes the wheel. LLM relegated to semantic translation.

- Deterministic planner: Pure Lisp task scheduler. No LLM needed for scheduling.
- Self-correcting gates: Gates learn from false positives (user override patterns).

This is the architectural leap. The system transitions from "probabilistic engine with symbolic verification" to "symbolic engine with probabilistic input and output."

The 10-80-10 architecture becomes fully realized: ten percent neural for input translation, eighty percent symbolic for reasoning against a knowledge graph, ten percent neural for output formatting. The symbolic engine maintains facts, relationships, rules, and formal proofs. When the neural engine generates something, the symbolic engine verifies it - not by checking against a blocklist, but by running the proposal through a Prolog/Datalog reasoner that understands the domain constraints.

The deterministic planner takes the wheel. The LLM is no longer consulted for planning decisions - it translates human language to structured queries and structured results back to human language. The planning itself is pure Lisp: task graphs generated by a symbolic reasoner that has access to the full knowledge graph.

Self-correcting gates replace the learned Dispatcher rules. The system learns not just from approved exceptions but from the full history of outcomes - did the plan succeed? Where did it fail? The symbolic engine updates its own rules based on the results.

The implications are significant. Hallucination becomes structurally impossible because the symbolic engine will not accept a fact that contradicts its knowledge graph. Safety becomes provable because the formal verification layer can prove properties about the system's behavior. Self-improvement becomes stable because the agent modifies skills that are then verified before execution.

*** v4.0.0: AI Stack Internalized

The agent understands its own weights. No external inference.

- Llama.cpp in Lisp: FFI binding. No Python subprocess. Inference runs in-process, driven from Common Lisp.
- Weights as sexps: Neural weights as Lisp data structures. Homoiconic model introspection.

*** v5.0.0: Hardware

The Lisp machine becomes physical. RISC-V with tagged architecture, hardware-enforced type checking, FPGA prototype for the symbolic core. The agent runs not in emulation but on silicon purpose-built for the architecture.

This is the long horizon. The symbolic engine runs on logic ASICs optimized for symbolic computation. The neural engine runs on GPU or purpose-built matrix math hardware. Lisp orchestrates both, enforcing at the hardware level what it enforced at the software level in earlier versions.

*** v6.0.0: True Agency

World models, temporal reasoning, goal persistence across restarts.

- World models: Predictive models of user behavior, project dynamics, system state.
- Temporal reasoning: Scheduling, deadlines, elapsed duration awareness.
- Goal persistence: Goals survive restarts. Long-term projects in memory-objects.