Amr Gharbeia 5a0d1b1c38 remediation: backfill v0.1.0/v0.2.0 gaps (P0+P1)

- vault: add vault-get-secret/vault-set-secret wrappers
- programming-org: implement org-modify (text search-replace) and org-ast-render (AST to Org text)
- programming-literate: implement literate-block-balance-check (paren validation) and literate-tangle-sync-check (org→lisp diff)
- system-self-improve: replace stubs with surgical text editing and error diagnosis; remove dead first defskill
- system-event-orchestrator: implement orchestrator-bootstrap (scan Org files for HOOK/CRON)
- system-archivist: implement Scribe distillation (daily logs→atomic notes) and Gardener link/orphan repair
- system-memory: implement memory-inspect with type/todo/orphan statistics
- core-skills, core-context: fix path relic (skills/ → lisp/, org/)
- docs: add Token Economics section to DESIGN_DECISIONS, remediation roadmap entries

2026-05-03 10:43:14 -04:00

Passepartout Evolutionary Roadmap

The Evolutionary Roadmap

The roadmap is designed working backwards from SOTA parity (v1.0.0), guiding each version toward a fully autonomous, self-editing agent. Each version builds on the previous, with features designed to be implemented in pure Common Lisp + Org-mode.

The TODO states in each version's Tasks section are the authoritative task tracker. The feature tables describe what each version delivers.

Non-Negotiable Identity

Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
Single-address-space memory (Lisp hash tables in RAM — the agent IS the memory).
"Thin harness, fat skills" — complexity lives at the edges, not the kernel.
One agent composed of many skills. Concurrency via bordeaux-threads (shared memory).
Plists everywhere — homoiconic communication between all components.

Version Roadmap

v0.1.0: The Autonomous Foundation — RELEASED 2026-04-20

The secure, auditable Lisp kernel. All core infrastructure in place.

DONE Perceive-Reason-Act pipeline

State "DONE" from "TODO" [2026-04-20 Mon]

DONE Skills engine with jailed loading

State "DONE" from "TODO" [2026-04-20 Mon]

DONE Policy skill (6 invariants)

State "DONE" from "TODO" [2026-04-20 Mon]

DONE Memory (memory-object + Merkle hashing)

State "DONE" from "TODO" [2026-04-20 Mon]

DONE Scribe + Gardener background workers

State "DONE" from "TODO" [2026-04-20 Mon]

DONE LLM gateway (OpenRouter, Ollama)

State "DONE" from "TODO" [2026-04-20 Mon]

DONE Shell actuator, Emacs bridge, credentials vault

State "DONE" from "TODO" [2026-04-20 Mon]

DONE FiveAM test suite

State "DONE" from "TODO" [2026-04-20 Mon]

v0.2.0: Interactive Refinement — RELEASED 2026-04-29

The "Brain" meets the "Machine." Standardization and professionalization of the user interface and environment.

DONE Professional TUI (Croatoan-based, styled, scrollable)

State "DONE" from "TODO" [2026-04-29 Wed]

DONE Self-editing (error detection, surgical fix, hot-reload)

State "DONE" from "TODO" [2026-04-29 Wed]

DONE Enhanced utilities (structural Lisp/Org manipulation + REPL)

State "DONE" from "TODO" [2026-04-29 Wed]

DONE Onboarding wizard (modular Lisp setup for LLM providers)

State "DONE" from "TODO" [2026-04-29 Wed]

DONE Memory rollback (snapshot and restore)

State "DONE" from "TODO" [2026-04-29 Wed]

DONE Secret Exposure Gate, Shell Safety, Lisp Validation

State "DONE" from "TODO" [2026-05-02 Sat]

DONE Multi-distro deployment (Debian+Fedora, systemd, Docker)

State "DONE" from "TODO" [2026-05-02 Sat]

DONE Project rename to Passepartout (files, packages, env vars)

State "DONE" from "TODO" [2026-05-02 Sat]

DONE 31 org files with full literate prose

State "DONE" from "TODO" [2026-05-02 Sat]

v0.3.0: Event Orchestration + HITL

Unified control plane and Human-in-the-Loop state management.

Tasks

Remediation: Backfill v0.1.0/v0.2.0 Gaps

These features were marked DONE in prior versions but are stubs, no-ops, or missing. They must be completed before v0.3.0 feature work proceeds.

TODO P0: Add vault-get-secret / vault-set-secret wrappers backfill

vault-get-secret and vault-set-secret are exported from core-defpackage and called from gateway-manager.org (lines 36, 86, 180) but never defined, so gateway-link crashes at runtime. Add one-line wrappers in security-vault.org that delegate to the existing vault-get / vault-set with :type :secret.
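A minimal sketch of the wrappers described above. The real vault-get / vault-set signatures live in security-vault.org and are not shown in this roadmap, so the keyword-argument shape used here, along with the in-memory *demo-vault* stand-in, is an assumption made only so the example runs on its own.

```lisp
;; Stand-ins for the existing vault-get / vault-set (ASSUMED signatures).
;; The demo store is keyed on (type . key) so different types do not collide.
(defvar *demo-vault* (make-hash-table :test #'equal))

(defun vault-get (key &key (type :secret))
  (gethash (cons type key) *demo-vault*))

(defun vault-set (key value &key (type :secret))
  (setf (gethash (cons type key) *demo-vault*) value))

;; The proposed one-line wrappers: each delegates with :type :secret.
(defun vault-get-secret (key)
  (vault-get key :type :secret))

(defun vault-set-secret (key value)
  (vault-set key value :type :secret))
```

Callers such as the gateway would then use (vault-set-secret "provider" "token") and (vault-get-secret "provider") without needing to know about the :type argument.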

TODO P0: system-archivist — Scribe + Gardener backfill

Scribe: distill daily Org logs into atomic Zettelkasten notes with backlinks. Gardener: scan for broken [[file:]] links and orphaned memory-object entries. Wire both as cron jobs via system-event-orchestrator. Depends on: orchestrator bootstrap (P1 item below).

TODO P0: system-self-improve — surgical edit + error fix backfill

self-improve-edit: org-read-file → text replace → snapshot-memory → org-write-file → literate-block-balance-check → tangle → reload. self-improve-fix: parse error log → lisp-structural-check → lisp-extract → surgical repair → repl-eval verify. Remove the dead first defskill registration (its trigger is nil and it is overwritten by the second). Depends on: programming-org, programming-literate (P0 items below).

TODO P0: programming-org — fix org-modify + org-ast-render backfill

org-modify(filepath, id, changes) ignores changes and only logs. Should locate node by ID in file and apply changes to its content. org-ast-render(ast) returns a hardcoded placeholder. Should convert plist AST back to Org text.

TODO P0: programming-literate — fix both stubs backfill

literate-block-balance-check: verify all #+begin_src lisp blocks in an Org file have balanced parentheses. Returns T if all balanced, error message otherwise. literate-tangle-sync-check: verify .lisp file matches tangled output of .org file.
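One way the balance check could work, as a sketch rather than the project's implementation. It takes the Org text as a string (the real function may take a file path), and the helper names starts-with-p and paren-depth-error are hypothetical. The paren counter is deliberately naive: it does not skip parentheses inside strings, comments, or character literals.

```lisp
(defun starts-with-p (prefix line)
  "Case-insensitive prefix test that ignores leading whitespace in LINE."
  (let ((trimmed (string-left-trim '(#\Space #\Tab) line)))
    (and (>= (length trimmed) (length prefix))
         (string-equal prefix trimmed :end2 (length prefix)))))

(defun paren-depth-error (code)
  "Return NIL when CODE has balanced parens, else a short description."
  (let ((depth 0))
    (loop for ch across code
          do (case ch
               (#\( (incf depth))
               (#\) (decf depth)))
             (when (minusp depth)
               (return-from paren-depth-error "unexpected closing paren")))
    (if (zerop depth)
        nil
        (format nil "~D unclosed paren~:P" depth))))

(defun literate-block-balance-check (org-text)
  "Return T if every lisp source block in ORG-TEXT is balanced,
otherwise an error message naming the first offending block."
  (with-input-from-string (in org-text)
    (loop with in-block = nil
          with block-no = 0
          with code = (make-string-output-stream)
          for line = (read-line in nil)
          while line
          do (cond ((and (not in-block)
                         (starts-with-p "#+begin_src lisp" line))
                    (setf in-block t)
                    (incf block-no))
                   ((and in-block (starts-with-p "#+end_src" line))
                    (setf in-block nil)
                    ;; GET-OUTPUT-STREAM-STRING also resets the buffer.
                    (let ((problem (paren-depth-error
                                    (get-output-stream-string code))))
                      (when problem
                        (return (format nil "block ~D: ~A"
                                        block-no problem)))))
                   (in-block (write-line line code)))
          finally (return (if in-block
                              "unterminated source block"
                              t)))))
```

A production version would more likely call READ on each block so that strings and comments are handled by the Lisp reader itself.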

TODO P1: system-event-orchestrator — bootstrap implementation backfill

orchestrator-bootstrap currently only logs. Should scan Org files for #+HOOK: and #+CRON: properties and register them via the existing registries. Prerequisite for archivist cron jobs.
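The scanning half of the bootstrap might look like the sketch below. It works on Org text held in a string and returns plists; handing those plists to the existing hook and cron registries is left out because their registration functions are not named in this roadmap. The function names and the :kind / :spec keys are hypothetical, and the value after each keyword is kept as a raw string since its syntax is not specified here.

```lisp
(defun directive-value (keyword line)
  "If LINE starts with KEYWORD (e.g. \"#+HOOK:\"), return the trimmed rest."
  (let ((trimmed (string-left-trim '(#\Space #\Tab) line))
        (n (length keyword)))
    (when (and (>= (length trimmed) n)
               (string-equal keyword trimmed :end2 n))
      (string-trim '(#\Space #\Tab) (subseq trimmed n)))))

(defun scan-org-directives (org-text)
  "Collect #+HOOK: and #+CRON: lines from ORG-TEXT as a list of plists,
in document order."
  (with-input-from-string (in org-text)
    (loop for line = (read-line in nil)
          while line
          append (let ((hook (directive-value "#+HOOK:" line))
                       (cron (directive-value "#+CRON:" line)))
                   (cond (hook (list (list :kind :hook :spec hook)))
                         (cron (list (list :kind :cron :spec cron))))))))
```

orchestrator-bootstrap would then walk the Org files, call a scanner like this on each, and dispatch on :kind to the matching registry.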

TODO P1: system-memory — memory introspection backfill

memory-inspect only logs. Should return structured statistics: object count by type, TODO state distribution, orphan count, snapshot list. Trigger on :INTROSPECTION sensor type.
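A sketch of the kind of structured result described above. It models memory objects as plists with :id, :type, :todo, and :links keys, which is an assumption about the memory-object layout, and it treats an orphan as an object that no other object links to, which is also an assumption. The snapshot list is omitted because its storage is not described here.

```lisp
(defun tally (objects key)
  "Return an alist of (VALUE . COUNT) for KEY across the plists in OBJECTS,
in order of first appearance."
  (let ((counts '()))
    (dolist (obj objects (nreverse counts))
      (let* ((value (getf obj key))
             (cell (assoc value counts)))
        (if cell
            (incf (cdr cell))
            (push (cons value 1) counts))))))

(defun memory-inspect-sketch (objects)
  "Summarise OBJECTS (a list of plists) as a statistics plist."
  (let ((linked-ids (loop for obj in objects
                          append (getf obj :links))))
    (list :count (length objects)
          :by-type (tally objects :type)
          :by-todo (tally objects :todo)
          ;; ASSUMPTION: an orphan has no inbound links.
          :orphans (count-if (lambda (obj)
                               (not (member (getf obj :id) linked-ids
                                            :test #'equal)))
                             objects))))
```

Returning a plist keeps the result consistent with the "plists everywhere" rule, so a caller can pick out a single figure with getf.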

TODO P1: Path relic — skills/ → lisp/ in skill-initialize-all backfill

skill-initialize-all and context-skill-source resolve against skills/ under $PASSEPARTOUT_DATA_DIR. Core and skills were merged into lisp/. Update both functions to point at lisp/.

TODO P2: core-context — semantic retrieval (embeddings) backfill

org-object-vector is never populated; all similarities are 0.0. Generate embeddings via Ollama nomic-embed-text at ingest time. Store in memory-object.vector. Fallback: TF-IDF bag-of-words.

TODO P2: core-context — subtree-based skill source loading backfill

context-skill-source reads entire Org files. Add context-skill-subtree for targeted retrieval of specific function docs or test blocks by heading name.

TODO P3: Variable name drift normalization (out of scope for now) backfill

*memory* (context) vs *memory-store* (memory). *skills-registry* with underscore (reason/context) vs *skill-registry* with hyphen (defpackage). Normalization pass across all modules. Touches every file — do after P0-P2 are stable. Do not mix with functional changes.

DONE Project Renaming (Bouncer → Dispatcher)

State "DONE" from "TODO" [2026-05-02 Sat 22:00]

The Dispatcher's role has evolved beyond security guard. It is the seed of the deterministic engine — it learns to execute procedures without invoking the neural net.

DONE Event Orchestrator (unified hooks+cron+routing)

State "DONE" from "TODO" [2026-05-02 Sat 22:36]

Unified control plane for hooks, cron, and complexity-based routing.

hook-registry + cron-registry + tier classifier
Hooks via #+HOOK: Org-mode properties
Three complexity tiers: :REFLEX (no LLM), :COGNITION (light LLM), :REASONING (full LLM)
Hooked into heartbeat for cron processing
Rule-based tier classifier (overrideable via *tier-classifier*)
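To illustrate the overrideable classifier, here is a minimal sketch. Only the three tier names and the *tier-classifier* variable come from the list above; the event plist shape, the event types, and the 200-character threshold are hypothetical rules chosen for the example.

```lisp
(defun default-tier-classifier (event)
  "Rule-based default: map EVENT (a plist) to a complexity tier.
The rules here are illustrative, not the project's actual rules."
  (case (getf event :type)
    ((:heartbeat :cron) :reflex)
    (:user-input (if (> (length (getf event :text "")) 200)
                     :reasoning
                     :cognition))
    (t :cognition)))

(defvar *tier-classifier* #'default-tier-classifier
  "Function of one event plist returning :REFLEX, :COGNITION, or :REASONING.")

(defun classify-tier (event)
  "Classify EVENT with whatever function is currently bound."
  (funcall *tier-classifier* event))

;; Overriding for one dynamic extent, e.g. to force full reasoning:
(defun classify-with-full-reasoning (event)
  (let ((*tier-classifier* (lambda (e)
                             (declare (ignore e))
                             :reasoning)))
    (classify-tier event)))
```

Because the classifier is held in a special variable, a skill or test can rebind it with let and the default is restored automatically when the binding exits.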

TODO Context Manager (project scoping)

Stack-based context with push-context / pop-context. Path resolution relative to current context. Memory scope: :scope property on memory-objects (memex/session/project). Implement lazy-loading proxies for large-scale memory traversal.
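The stack and path resolution could be sketched as follows. push-context and pop-context are named in the task; the *context-stack* variable, the resolve-path helper, and the plist layout of a context are assumptions for illustration. Scope handling and lazy-loading proxies are not covered.

```lisp
(defvar *context-stack* '()
  "Innermost context first; each entry is a plist with :name and :root.")

(defun push-context (name root)
  "Enter a context called NAME rooted at directory ROOT.
ROOT should end in a slash so it is parsed as a directory."
  (push (list :name name :root (pathname root)) *context-stack*)
  name)

(defun pop-context ()
  "Leave the innermost context and return it (NIL if the stack is empty)."
  (pop *context-stack*))

(defun resolve-path (relative)
  "Resolve RELATIVE against the current context root, if there is one."
  (let ((current (first *context-stack*)))
    (if current
        (merge-pathnames relative (getf current :root))
        (pathname relative))))
```

With a context pushed for a project root, (resolve-path "notes/todo.org") resolves inside that project; after pop-context the same call resolves against whichever context is next on the stack.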

TODO Model-Tier Routing (cost optimization)

Extend *model-selector-fn* for complexity-based routing.

Heartbeats → smallest model
User input → medium model
Complex reasoning → large model

TODO Memory Scope Segmentation

Extend memory-object with :scope property.

:memex (permanent knowledge), :session (ephemeral), :project (current work)
Scope-aware retrieval in memory layer

TODO Asynchronous Embedding Gateway

Provider-agnostic vector generation (Ollama, llama.cpp, OpenAI). Edits mark nodes as :vector :pending; background worker batches and updates Merkle tree.

TODO TUI Experience (Daily Driver Quality)

The TUI is a standalone Croatoan app in org/gateway-tui.org. None of these changes require daemon modifications — the protocol between TUI and daemon (port 9105, framed plists) is stable.

P0: Chat scrollback (Page Up/Down) — ~2h
P0: Input history (up/down arrows) — ~1h
P1: Status bar (daemon, model, time) — ~3h
P1: Message rendering (timestamps, colors, wrapping) — ~2h
P2: Command palette (/help redesign) — ~4h
P2: Multi-line input (Shift+Enter) — ~3h
P3: Background activity indicator — ~2h
P4: Tab completion for / commands — ~3h
P4: Configurable theme — ~4h

TODO Human-in-the-Loop (HITL)

Continuation-based interaction. The agent can suspend its cognitive loop to ask for permission or clarification and resume precisely where it left off. Builds on the dispatcher's existing Flight Plan mechanism.

v0.4.0: Long-Horizon Planning + Git Workflows

Structured tracking, failure handling, and course correction for multi-step engineering work.

Tasks

TODO Long-Horizon Planning (task tree DAG)

Decompose complex tasks into Org-mode headline trees. Task states progress :todo → :next-action → :in-progress and end in one of the terminal states :done, :blocked, or :stuck. Each parent summarises its children's results. Branches are pruned when their paths fail.
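The state progression above can be expressed as a small transition table. This is a sketch under the assumption that a task is a plist with a :state key; the names *task-transitions*, terminal-state-p, and advance-task are hypothetical, and parent summaries and branch pruning are not modelled.

```lisp
(defparameter *task-transitions*
  '((:todo        . (:next-action))
    (:next-action . (:in-progress))
    (:in-progress . (:done :blocked :stuck)))
  "Alist of STATE -> allowed successors. States with no entry are terminal.")

(defun terminal-state-p (state)
  "True for states that have no outgoing transitions."
  (null (assoc state *task-transitions*)))

(defun advance-task (task new-state)
  "Return a copy of TASK (a plist) moved to NEW-STATE.
Signals an error when the transition is not in the table."
  (let* ((current (getf task :state))
         (allowed (cdr (assoc current *task-transitions*))))
    (unless (member new-state allowed)
      (error "Illegal transition ~S -> ~S" current new-state))
    (let ((copy (copy-list task)))
      (setf (getf copy :state) new-state)
      copy)))
```

Keeping the table as data means a planner can inspect which moves are legal before acting, and the rules can be changed without touching advance-task.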

TODO Git Steward (version control integration)

Status, diff, commit, push, branch operations. Policy enforces commit-before-modify gate. Log commits to memory.

TODO TDD Runner Integration

Run FiveAM tests on file save. Inject :test-failure event on red. Hook into self-fix for auto-repair proposals.

TODO Deep Emacs Integration

Full org-agenda awareness: navigate, clock time, refile, archive. Uses org-element + org-id.

v0.5.0: Interactive Actuation & Environment Stewardship

Interactive terminal sessions and autonomous dependency management.

Tasks

TODO Interactive PTY Actuator

Stream long-running process output to the context window (e.g., npm run dev, REPLs). Async interrupt control (Ctrl+C emulation).

TODO The Environment Steward

Autonomously detect missing dependencies ("Command not found"). Propose installation command and retry the failed action.

v0.6.0: Concurrency + Creator + GTD

The agent bootstraps itself and manages parallel workstreams.

Tasks

TODO Skill Creator (autonomous skill generation)

LLM drafts complete skill org-file from natural language. Mandatory: syntax validation → jail-load → test → register.

TODO Architect Agent (PRD → PROTOCOL)

Scan :STATUS: FROZEN PRDs. Generate Phase B PROTOCOL from Phase A.

TODO GTD Integration (project tracking)

Full GTD cycle: capture, clarify, organize, reflect, engage. org-gtd v4.0 DAG (:TRIGGER:, :BLOCKER:).

TODO Consensus Loop (multi-model agreement)

Run multiple providers for critical decisions. Compare results, detect disagreements. Confidence scoring.

TODO Web Research (Playwright browsing)

Headless Chromium via Python bridge. Text extraction, screenshots, Gemini Web UI automation.

TODO Memex Management (PARA lifecycle)

Archive DONE tasks, suggest refiling. Detect orphaned nodes. PARA/Zettelkasten maintenance.

v0.7.0: Visual Grounding & MCP Bridge

Multimodal visual interaction and ecosystem-wide tool compatibility.

Tasks

TODO Computer Use / Vision

Allow the agent to request host OS or browser screenshots. Analyze UI and issue precise X/Y coordinate click/type commands via X11/Wayland bridge.

TODO MCP Gateway Bridge

Lisp-native client for the Model Context Protocol. Connect Passepartout to external tools and data sources.

v0.8.0: The Evaluation Harness

Automated benchmarking to mathematically prove the agent's reasoning capabilities.

Tasks

TODO SWE-Bench Harness

Automated pipeline that clones repositories and feeds GitHub issues. Track multi-step resolution trajectory, run tests, and score success.

v1.0.0: SOTA Parity

Feature-complete agent competitive with commercial agents. All features from v0.2.0 through v0.8.0 combined, verified, and tested end-to-end.

| Area              | Parity Target                               |
|-------------------+---------------------------------------------|
| Self-improvement  | Claude Code self-debug                      |
| Planning          | ULTRAPLAN equivalent                        |
| Tool ecosystem    | 10+ cognitive tools                         |
| Context window    | Semantic search + scope segmentation        |
| Safety            | 6 Policy invariants + formal verification   |
| Multi-step tasks  | Task trees with terminal states             |
| Code editing      | Full file read/write via org manipulation   |
| Memory            | Vector recall in memory-object              |
| Emacs integration | Full org-mode control (exceeds Claude Code) |
| Autonomy          | 100% local capable (exceeds Claude Code)    |

v2.0.0: Lisp Machine Emergence

From Lisp-using agent to true Lisp machine. Agent IS the Emacs process.

Lish: Lisp editor — Org-mode as IDE. Org-babel for interactive evaluation. Full REPL in TUI.
Lish: Shell replacement — Lisp-based shell that speaks plists. Org-mode buffers as file system.

v3.0.0: Neurosymbolic Maturity

Deterministic planner takes the wheel. LLM relegated to semantic translation.

Deterministic planner: Pure Lisp task scheduler. No LLM needed for scheduling.
Self-correcting gates: Gates learn from false positives (user override patterns).

v4.0.0: AI Stack Internalized

The agent understands its own weights. No external inference.

Llama.cpp in Lisp: FFI binding. No Python subprocess. Pure Common Lisp inference.
Weights as sexps: Neural weights as Lisp data structures. Homoiconic model introspection.

v5.0.0: True Agency

World models, temporal reasoning, goal persistence across restarts.

World models: Predictive models of user behavior, project dynamics, system state.
Temporal reasoning: Scheduling, deadlines, elapsed duration awareness.
Goal persistence: Goals survive restarts. Long-term projects in memory-objects.
