#+TITLE: Passepartout Evolutionary Roadmap
#+STARTUP: content
#+FILETAGS: :docs:roadmap:

* The Evolutionary Roadmap

The roadmap is designed working backwards from SOTA parity (v1.0.0), guiding each version toward a fully autonomous, self-editing agent. Each version builds on the previous, with features designed to be implemented in pure Common Lisp + Org-mode.

The TODO states in each version's Tasks section are the authoritative task tracker. The feature tables describe what each version delivers.

** Non-Negotiable Identity

- Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
- Single-address-space memory (Lisp hash tables in RAM — the agent IS the memory).
- "Thin harness, fat skills" — complexity lives at the edges, not the kernel.
- One agent composed of many skills. Concurrency via bordeaux-threads (shared memory).
- Plists everywhere — homoiconic communication between all components.

** Version Roadmap

*** v0.1.0: The Autonomous Foundation — RELEASED

The secure, auditable Lisp kernel. All core infrastructure in place.

- Perceive-Reason-Act pipeline (3-stage metabolic loop)
- Skills engine with jailed loading (defskill, topological sort, hot-reload)
- Policy skill (6 invariants)
- Memory (memory-object + Merkle hashing)
- Scribe + Gardener background workers
- LLM gateway (OpenRouter, Ollama)
- Shell actuator, Emacs bridge, credentials vault
- FiveAM test suite

*** v0.2.0: Interactive Refinement — RELEASED

The "Brain" meets the "Machine." Standardization and professionalization of the user interface and environment.

- Professional TUI (Croatoan-based, styled, scrollable)
- Self-editing (detects errors, applies fixes, learns from outcomes)
- Enhanced utilities (structural Lisp/Org manipulation + REPL)
- Onboarding wizard (modular Lisp setup for multiple LLM providers)
- Memory rollback (snap back to known-good state)
- Project renamed to Passepartout
- Secret Exposure Gate, Shell Safety, Lisp Validation Gate
- Multi-distro deployment (Debian + Fedora), systemd service, Docker
- 31 org files with full literate prose

*** v0.3.0: Event Orchestration + HITL

Unified control plane and Human-in-the-Loop state management.

** Tasks

*** DONE Project Renaming (Bouncer → Dispatcher)
:PROPERTIES:
:ID: id-9e779580-287b-b3d1-37b9-bcefd750bf9e
:CREATED: [2026-05-01 Fri 15:40]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat 22:00]
:END:
CLOSED: [2026-05-02 Sat 22:00]

The Dispatcher's role has evolved beyond security guard. It is the seed of the deterministic engine — it learns to execute procedures without invoking the neural net.

*** DONE Event Orchestrator (unified hooks+cron+routing)
:PROPERTIES:
:ID: id-d35aea3d-2e5f-4a12-a9b0-1c2d3e4f5a6b
:CREATED: [2026-05-02 Sat 14:00]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat 22:36]
:END:
CLOSED: [2026-05-02 Sat 22:36]

Unified control plane for hooks, cron, and complexity-based routing.

- ~*hook-registry*~ + ~*cron-registry*~ + tier classifier
- Hooks via ~#+HOOK:~ Org-mode properties
- Three complexity tiers: ~:REFLEX~ (no LLM), ~:COGNITION~ (light LLM), ~:REASONING~ (full LLM)
- Hooked into heartbeat for cron processing
- Rule-based tier classifier (overrideable via ~*tier-classifier*~)

*** TODO Context Manager (project scoping)
:PROPERTIES:
:ID: id-a10ed34e-9f37-4a15-b499-46672c00d951
:CREATED: [2026-05-02 Sat 23:00]
:END:

Stack-based context with ~push-context~ / ~pop-context~. Path resolution relative to current context. Memory scope: ~:scope~ property on memory-objects (memex/session/project).
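
The stack discipline could be sketched as follows. Only ~push-context~ and ~pop-context~ are named in this roadmap; the ~*context-stack*~ variable, ~current-context~, ~resolve-path~, and the plist layout are assumptions made for illustration, not the project's actual API. After ~(push-context :demo "/srv/demo/")~, a call such as ~(resolve-path "org/roadmap.org")~ would resolve under ~/srv/demo/~ until ~pop-context~ is called.

#+begin_src lisp
;; Hypothetical sketch: everything except the names push-context and
;; pop-context is assumed for illustration.
(defvar *context-stack* '()
  "Stack of context plists; the current (innermost) context comes first.")

(defun push-context (name root)
  "Enter a context called NAME rooted at directory ROOT.
ROOT should end in a slash so it parses as a directory."
  (push (list :name name :root (pathname root)) *context-stack*)
  (first *context-stack*))

(defun pop-context ()
  "Leave the current context and return it, or NIL if the stack is empty."
  (pop *context-stack*))

(defun current-context ()
  "Return the innermost context plist, or NIL outside any context."
  (first *context-stack*))

(defun resolve-path (relative)
  "Resolve RELATIVE against the current context root, if there is one."
  (let ((ctx (current-context)))
    (if ctx
        (merge-pathnames relative (getf ctx :root))
        (pathname relative))))
#+end_src
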
Implement lazy-loading proxies for large-scale memory traversal.

*** TODO Model-Tier Routing (cost optimization)

Extend ~*model-selector-fn*~ for complexity-based routing.

- Heartbeats → smallest model
- User input → medium model
- Complex reasoning → large model

*** TODO Memory Scope Segmentation

Extend memory-object with ~:scope~ property.

- ~:memex~ (permanent knowledge), ~:session~ (ephemeral), ~:project~ (current work)
- Scope-aware retrieval in memory layer

*** TODO Asynchronous Embedding Gateway

Provider-agnostic vector generation (Ollama, llama.cpp, OpenAI). Edits mark nodes as ~:vector :pending~; a background worker batches them and updates the Merkle tree.

*** TODO TUI Experience (Daily Driver Quality)

The TUI is a standalone Croatoan app in ~org/gateway-tui.org~. None of these changes require daemon modifications — the protocol between TUI and daemon (port 9105, framed plists) is stable.

- P0: Chat scrollback (Page Up/Down) — ~2h
- P0: Input history (up/down arrows) — ~1h
- P1: Status bar (daemon, model, time) — ~3h
- P1: Message rendering (timestamps, colors, wrapping) — ~2h
- P2: Command palette (/help redesign) — ~4h
- P2: Multi-line input (Shift+Enter) — ~3h
- P3: Background activity indicator — ~2h
- P4: Tab completion for / commands — ~3h
- P4: Configurable theme — ~4h

*** TODO Human-in-the-Loop (HITL)

Continuation-based interaction. The agent can suspend its cognitive loop to ask for permission or clarification and resume precisely where it left off. Builds on the dispatcher's existing Flight Plan mechanism.

*** v0.4.0: Long-Horizon Planning + Git Workflows

Structured tracking, failure handling, and course correction for multi-step engineering work.

** Tasks

*** TODO Long-Horizon Planning (task tree DAG)

Decompose complex tasks into Org-mode headline trees. Task states: ~:todo~ → ~:next-action~ → ~:in-progress~ → ~:done~ / ~:blocked~ / ~:stuck~, where the last three are the terminal states. Each parent summarises its children's results. Branches are pruned when their paths fail.
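
A minimal sketch of the state progression described above, assuming tasks are plists and that ~:done~, ~:blocked~, and ~:stuck~ are the terminal states. The names ~*task-transitions*~, ~advance-task~, and ~summarise-task~ are hypothetical, and the roll-up rule is only one possible reading of "parent summarises child results"; branch pruning is not modelled here.

#+begin_src lisp
;; Hypothetical sketch: names and the roll-up rule are assumptions.
(defparameter *task-transitions*
  '((:todo        . (:next-action))
    (:next-action . (:in-progress))
    (:in-progress . (:done :blocked :stuck)))
  "Allowed moves between task states; states without an entry are terminal.")

(defun terminal-state-p (state)
  "True when no further transition is defined for STATE."
  (null (assoc state *task-transitions*)))

(defun advance-task (task new-state)
  "Return a copy of the plist TASK moved to NEW-STATE, or signal an error."
  (let ((allowed (cdr (assoc (getf task :state) *task-transitions*))))
    (unless (member new-state allowed)
      (error "Illegal transition ~S -> ~S" (getf task :state) new-state))
    (let ((copy (copy-list task)))
      (setf (getf copy :state) new-state)
      copy)))

(defun summarise-task (task)
  "Roll child states up into the parent.
A leaf reports its own state; a parent is :done only when every child is,
reports :stuck or :blocked if any child does, and is :in-progress otherwise."
  (let ((children (getf task :children)))
    (if (null children)
        (getf task :state)
        (let ((states (mapcar #'summarise-task children)))
          (cond ((every (lambda (s) (eq s :done)) states) :done)
                ((member :stuck states) :stuck)
                ((member :blocked states) :blocked)
                (t :in-progress))))))
#+end_src

Keeping the transition table as plain data means a policy gate could inspect or extend it without touching the functions that use it.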

*** TODO Git Steward (version control integration)

Status, diff, commit, push, branch operations. Policy enforces commit-before-modify gate. Log commits to memory.

*** TODO TDD Runner Integration

Run FiveAM tests on file save. Inject ~:test-failure~ event on red. Hook into self-fix for auto-repair proposals.

*** TODO Deep Emacs Integration

Full org-agenda awareness: navigate, clock time, refile, archive. Uses org-element + org-id.

*** v0.5.0: Interactive Actuation & Environment Stewardship

Interactive terminal sessions and autonomous dependency management.

** Tasks

*** TODO Interactive PTY Actuator

Stream long-running process output to the context window (e.g., ~npm run dev~, REPLs). Async interrupt control (Ctrl+C emulation).

*** TODO The Environment Steward

Autonomously detect missing dependencies ("Command not found"). Propose installation command and retry the failed action.

*** v0.6.0: Concurrency + Creator + GTD

The agent bootstraps itself and manages parallel workstreams.

** Tasks

*** TODO Skill Creator (autonomous skill generation)

LLM drafts complete skill org-file from natural language. Mandatory: syntax validation → jail-load → test → register.

*** TODO Architect Agent (PRD → PROTOCOL)

Scan ~:STATUS: FROZEN~ PRDs. Generate Phase B PROTOCOL from Phase A.

*** TODO GTD Integration (project tracking)

Full GTD cycle: capture, clarify, organize, reflect, engage. org-gtd v4.0 DAG (~:TRIGGER:~, ~:BLOCKER:~).

*** TODO Consensus Loop (multi-model agreement)

Run multiple providers for critical decisions. Compare results, detect disagreements. Confidence scoring.

*** TODO Web Research (Playwright browsing)

Headless Chromium via Python bridge. Text extraction, screenshots, Gemini Web UI automation.

*** TODO Memex Management (PARA lifecycle)

Archive DONE tasks, suggest refiling. Detect orphaned nodes. PARA/Zettelkasten maintenance.

*** v0.7.0: Visual Grounding & MCP Bridge

Multimodal visual interaction and ecosystem-wide tool compatibility.

** Tasks

*** TODO Computer Use / Vision

Allow the agent to request host OS or browser screenshots. Analyze UI and issue precise X/Y coordinate click/type commands via X11/Wayland bridge.

*** TODO MCP Gateway Bridge

Lisp-native client for the Model Context Protocol. Connect Passepartout to external tools and data sources.

*** v0.8.0: The Evaluation Harness

Automated benchmarking to mathematically prove the agent's reasoning capabilities.

** Tasks

*** TODO SWE-Bench Harness

Automated pipeline that clones repositories and feeds GitHub issues. Track multi-step resolution trajectory, run tests, and score success.

*** v1.0.0: SOTA Parity

Feature-complete agent competitive with commercial agents. All features from v0.2.0 through v0.8.0 combined, verified, and tested end-to-end.

| Area              | Parity Target                                 |
|-------------------+-----------------------------------------------|
| Self-improvement  | Claude Code self-debug                        |
| Planning          | ULTRAPLAN equivalent                          |
| Tool ecosystem    | 10+ cognitive tools                           |
| Context window    | Semantic search + scope segmentation          |
| Safety            | 6 Policy invariants + formal verification     |
| Multi-step tasks  | Task trees with terminal states               |
| Code editing      | Full file read/write via org manipulation     |
| Memory            | Vector recall in memory-object                |
| Emacs integration | Full org-mode control (exceeds Claude Code)   |
| Autonomy          | 100% local capable (exceeds Claude Code)      |

*** v2.0.0: Lisp Machine Emergence

From Lisp-using agent to true Lisp machine. Agent IS the Emacs process.

- Lish: Lisp editor — Org-mode as IDE. Org-babel for interactive evaluation. Full REPL in TUI.
- Lish: Shell replacement — Lisp-based shell that speaks plists. Org-mode buffers as file system.

*** v3.0.0: Neurosymbolic Maturity

Deterministic planner takes the wheel. LLM relegated to semantic translation.

- Deterministic planner: Pure Lisp task scheduler. No LLM needed for scheduling.
- Self-correcting gates: Gates learn from false positives (user override patterns).

*** v4.0.0: AI Stack Internalized

The agent understands its own weights. No external inference.

- Llama.cpp in Lisp: FFI binding. No Python subprocess. Pure Common Lisp inference.
- Weights as sexps: Neural weights as Lisp data structures. Homoiconic model introspection.

*** v5.0.0: True Agency

World models, temporal reasoning, goal persistence across restarts.

- World models: Predictive models of user behavior, project dynamics, system state.
- Temporal reasoning: Scheduling, deadlines, elapsed duration awareness.
- Goal persistence: Goals survive restarts. Long-term projects in memory-objects.
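
One way goal persistence could work without leaving the plist-only identity is to print goals as readable s-expressions and read them back at startup. This is a sketch under assumptions: ~save-goals~, ~load-goals~, and the goal plist keys are hypothetical names, and the roadmap does not specify a storage format.

#+begin_src lisp
;; Hypothetical sketch: function names and the on-disk format are assumed.
(defun save-goals (goals path)
  "Write GOALS (a list of plists) to PATH as one readable s-expression."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :if-does-not-exist :create)
    (with-standard-io-syntax
      (prin1 goals out)))
  path)

(defun load-goals (path)
  "Read the goal list back from PATH, or return NIL if no file exists yet."
  (with-open-file (in path :direction :input :if-does-not-exist nil)
    (when in
      (with-standard-io-syntax
        (let ((*read-eval* nil))   ; refuse #. forms read from disk
          (read in))))))
#+end_src

Binding ~*read-eval*~ to ~nil~ keeps a tampered file from executing code when it is read, which fits the policy-gate stance taken elsewhere in this roadmap.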