
#+TITLE: OpenCortex: The Conductor of your Life Stack

#+CAPTION: A neurosymbolic AI agent framework for the 100-year Memex

*openCortex* is a minimalist, extensible AI agent framework designed to manage and continuously organize your personal knowledge base. It transforms a static collection of plaintext notes into a live, programmable [[https://en.wikipedia.org/wiki/Memex][Memex]]: an automated, personalized memory system where humans and AI collaborate in the same workspace.

* The Problem with Current AI Agents

The current ecosystem of AI agents (typically written in Python or TypeScript) rests on architectural choices that prioritize rapid prototyping over long-term reliability, security, and self-modification:

** 1. The Format Trap (Markdown & JSON)

Most agents force a painful translation layer. Humans write in Markdown, which has no single strict grammar and therefore no canonical Abstract Syntax Tree (AST), the rigorous, nested representation that machines need to parse context reliably. Machines, in turn, output JSON, which is hostile to human thought and note-taking.

The result is a fractured workspace where the agent's memory and the human's memory are fundamentally incompatible. You cannot see what the agent sees. The agent cannot naturally work with your notes.

** 2. The Language Trap (Python & TypeScript)

Python and TypeScript are fantastic for gluing together APIs, but they are poorly suited for an agent that needs to safely read, write, and execute its own code at runtime. Their underlying structures are complex and opaque, making autonomous self-editing incredibly brittle and dangerous.

How do you trust an agent to modify its own Python code when Python's AST is so complex that even human programmers need IDEs to navigate it?

** 3. The Probabilistic Trap

Almost all modern agents rely entirely on /probabilistic/ reasoning. We ask an AI model to guess a shell command or write a Python script, and then blindly pipe that output to a terminal. Without a rigorous, /deterministic/ layer to formally verify the model's proposals before execution, these systems are fundamentally unsafe.

The model might hallucinate a command. It might output valid syntax that still does something dangerous. Without a deterministic gate, there's nothing between the guess and the terminal.

* The Vision: A Modern, Homoiconic Memex

openCortex abandons these fragile paradigms by returning to first principles and embracing two historically powerful technologies: *Org-mode* and *Common Lisp*.

** Org-mode: The Universal Language

Instead of wrestling with Markdown parsers or hiding data in opaque databases, openCortex mandates that *Org-mode is the native AST for both humans and machines.*

Org-mode is unique because it seamlessly brings together:
- Human-readable prose
- Structured metadata (properties and tags)
- Lifecycle states (TODO/DONE/PLAN)
- Executable code blocks

...all in a single plain-text file. The code is the data, and the data is the interface. When the agent "remembers" a fact or schedules a task, it writes an Org headline. You read exactly what the agent reads.

This is not a compromise—it's the design principle. The agent's memory and your memory are the same format, the same file, the same text.
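
As a minimal sketch of what "writing an Org headline" could look like, the function below renders one memory entry as a headline with a lifecycle state, tags, and a property drawer. The name =render-headline= and its keyword arguments are hypothetical and are not taken from the openCortex source.

#+begin_src lisp
;; Hypothetical helper, not openCortex's actual API: renders one memory
;; entry as an Org headline with an optional property drawer.
(defun render-headline (title &key (level 1) todo tags properties)
  "Return TITLE as an Org headline string. PROPERTIES is a plist."
  (with-output-to-string (out)
    ;; Leading stars encode the outline depth.
    (format out "~A " (make-string level :initial-element #\*))
    (when todo (format out "~A " todo))
    (format out "~A" title)
    ;; Tags are written as :tag1:tag2: after the title.
    (when tags (format out " :~{~A:~}" tags))
    (terpri out)
    (when properties
      (format out ":PROPERTIES:~%")
      (loop for (key value) on properties by #'cddr
            do (format out ":~A: ~A~%" (string-upcase (string key)) value))
      (format out ":END:~%"))))

;; Example call:
;; (render-headline "Renew passport" :level 2 :todo "TODO"
;;                  :tags '("memex" "admin") :properties '(:source "agent"))
#+end_src

The result is ordinary Org text, so the same entry can be opened, edited, and re-read by a person in Emacs without any conversion step.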

** Common Lisp: The Engine of Self-Modification

There is a beautiful irony to openCortex: Lisp was invented in 1958 specifically to achieve Artificial Intelligence, and it has been waiting nearly 70 years for /this exact moment/ in computing history.

Lisp possesses a unique property called *Homoiconicity*: the primary representation of the program is also a data structure (nested lists) within the language itself. Because Lisp code /is/ Lisp data, it is trivially easy for an AI to generate, manipulate, and safely evaluate new tools at runtime.

This makes Lisp a robust fit for a "self-writing" agent. The agent needs no separate AST parser, because the Lisp reader already turns source text into lists it can inspect and rewrite directly. It needs no separate code generator either, because the lists it builds are themselves executable Lisp.
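
The following sketch illustrates the idea with standard Common Lisp only. A proposed expression arrives as a plain list, ordinary list functions inspect it, and only then is it wrapped in a =lambda= and compiled. The names =form-operators=, =safe-arithmetic-p=, and =build-tool= are illustrative assumptions, and the operator check is a toy, not a real sandbox.

#+begin_src lisp
;; Illustrative only: a proposed expression held as plain list data.
(defparameter *proposed-body* '(+ 32 (* 9/5 celsius)))

(defun form-operators (form)
  "Collect the symbol in operator position of FORM and of every nested call."
  (when (consp form)
    (cons (first form)
          (loop for sub in (rest form) append (form-operators sub)))))

(defun safe-arithmetic-p (form)
  "True when FORM calls nothing except the four arithmetic operators."
  (subsetp (form-operators form) '(+ - * /)))

(defun build-tool (parameter body)
  "Assemble (lambda (PARAMETER) BODY) as a list and compile it.
Returns NIL when BODY fails the operator check."
  (when (safe-arithmetic-p body)
    (compile nil (list 'lambda (list parameter) body))))

;; Example call:
;; (funcall (build-tool 'celsius *proposed-body*) 100)
#+end_src

No parser or template engine is involved: the same list that was inspected is the list that gets compiled.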

** The Probabilistic-Deterministic Loop

openCortex does not let AI models touch your system directly. Instead, it splits cognition into two distinct engines:

1. *The Probabilistic Engine (Neural/Dynamic):* Provides semantic understanding and dynamic reasoning. It utilizes a *Dynamic LLM Cascade* (OpenRouter, Ollama, Anthropic) to ensure the agent always has a "brain," falling back to local models if cloud services are unavailable.

2. *The Deterministic Engine (Logic/Safety):* Intercepts LLM proposals and formally verifies them against your security rules (the "Bouncer" pattern) before execution.

#+begin_src mermaid
flowchart LR
    subgraph Probabilistic["Probabilistic Engine (LLM)"]
        LLM[LLM Call]
    end

    subgraph Deterministic["Deterministic Engine (Skills)"]
        Policy[Policy Skill<br/>Constitutional invariants]
        Bouncer[Bouncer Skill<br/>Security vectors]
        Validator[Lisp Validator<br/>Structural verification]
    end

    subgraph Actuation["Actuation"]
        Shell[Shell Actuator]
        TUI[TUI Client]
        Emacs[Emacs Gateway]
    end

    LLM -->|Proposes action| Deterministic
    Policy -->|Checks| Bouncer
    Bouncer -->|Verifies| Validator
    Validator -->|Approves| Actuation
    Actuation -->|Feeds back| LLM
#+end_src
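
To make the gate concrete, here is a minimal sketch of a deterministic check that sits between a proposal and the shell. Everything in it is an assumption made for illustration: the =:type= / =:command= plist layout, the variable =*allowed-commands*=, and the function =verify-proposal= are hypothetical and do not describe the actual Bouncer skill.

#+begin_src lisp
;; Hypothetical names throughout; the action plist layout is assumed.
(defparameter *allowed-commands* '("ls" "cat" "git")
  "Executables the gate will let through.")

(defun verify-proposal (action)
  "Return (values :approved ACTION) or (values :rejected REASON)."
  (let ((type (getf action :type))
        (command (getf action :command)))
    (cond ((not (eq type :shell))
           (values :rejected "unknown action type"))
          ;; The command must be an argument list, never a raw shell string.
          ((not (and (consp command) (every #'stringp command)))
           (values :rejected "command must be a list of strings"))
          ((not (member (first command) *allowed-commands* :test #'string=))
           (values :rejected "executable not on the whitelist"))
          (t (values :approved action)))))

;; Example calls:
;; (verify-proposal '(:type :shell :command ("ls" "-la")))
;; (verify-proposal '(:type :shell :command ("rm" "-rf" "/")))
#+end_src

The check involves no model call, so the same proposal always receives the same verdict.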

* Architecture: Thin Harness, Fat Skills

To guarantee long-term stability, openCortex enforces a strict architectural boundary inspired by the "thin harness, fat skills" philosophy.

** The Minimalist Harness

The Lisp microkernel is a thin, unbreakable harness strictly responsible for:

| Layer | Responsibility | Examples |
|-------|----------------|----------|
| *Perceive* | Normalize sensory input | CLI parsing, Emacs events, heartbeats |
| *Reason* | Bridge neural and deterministic | LLM dispatch, response parsing, skill routing |
| *Act* | Execute approved actions | Shell commands, tool calls, UI output |
| *Memory* | Live object store | Org-object graph, snapshots, rollback |

What the harness does /not/ contain:
- Policy rules (those are skills)
- LLM integrations (those are skills)
- Domain-specific functionality (those are skills)

** Literate, Single-File Skills

In openCortex, a Skill is simply a *single .org file* containing everything:
- The documentation (prose explaining the skill's purpose)
- The AI instructions (how the LLM should use this skill)
- The deterministic code (Lisp that verifies/proposes actions)

When the system boots, it compiles these skills directly into the live Lisp image. Skills are hot-reloadable without restarting the daemon.

#+begin_src mermaid
flowchart TD
    subgraph Skill["Skill: policy.org"]
        Docs["Documentation<br/>'This skill enforces...'"]
        AI["AI Instructions<br/>'When the user asks about...'"]
        Code["Deterministic Code<br/>'(defun policy-check-...)'"]
    end

    subgraph Harness["Harness Core"]
        Package["package.lisp"]
        Loop["loop.lisp"]
        Perceive["perceive.lisp"]
        Reason["reason.lisp"]
        Act["act.lisp"]
    end

    Code --> |Compiles into| Harness
    Harness --> |Runs| Pipeline
    Pipeline --> |Feeds| Skill
#+end_src
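
One way a loader could pull the deterministic code out of a literate skill file is sketched below. It scans Org text for lisp source blocks and reads each block as data. This is an assumption about the mechanism, not a description of the real loader, and the names =prefixp=, =extract-lisp-blocks=, and =read-skill-forms= are hypothetical.

#+begin_src lisp
(defun prefixp (prefix string)
  "True when STRING starts with PREFIX."
  (let ((n (length prefix)))
    (and (>= (length string) n)
         (string= prefix string :end2 n))))

(defun extract-lisp-blocks (org-text)
  "Return the body of each lisp source block in ORG-TEXT as a string."
  (with-input-from-string (in org-text)
    (let ((blocks '()) (current '()) (inside nil))
      (loop for line = (read-line in nil nil)
            while line
            do (let ((marker (string-downcase
                              (string-trim '(#\Space #\Tab) line))))
                 (cond ((and (not inside) (prefixp "#+begin_src lisp" marker))
                        (setf inside t current '()))
                       ((and inside (prefixp "#+end_src" marker))
                        (push (format nil "~{~A~%~}" (reverse current)) blocks)
                        (setf inside nil))
                       (inside (push line current)))))
      (nreverse blocks))))

(defun read-skill-forms (org-text)
  "Read the first form of each block as data, with #. evaluation disabled."
  (let ((*read-eval* nil))
    (mapcar #'read-from-string (extract-lisp-blocks org-text))))
#+end_src

Because the forms come back as lists, a validator can inspect them before anything is compiled into the image.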

** The Metabolic Pipeline

Every signal in openCortex moves through the same three-stage pipeline:

1. *Perceive:* Normalize raw input into a standardized Signal
2. *Reason:* Generate a proposal via LLM, verify via skills
3. *Act:* Execute the approved action, generate feedback

#+begin_src mermaid
sequenceDiagram
    participant User
    participant Gateway
    participant Perceive
    participant Reason
    participant Act

    User->>Gateway: "Write a note about X"
    Gateway->>Perceive: Raw message
    Perceive->>Perceive: Normalize to Signal
    Perceive->>Reason: Signal
    Reason->>Reason: LLM generates proposal
    Reason->>Reason: Skills verify proposal
    Reason->>Act: Approved action
    Act->>Act: Execute action
    Act->>Reason: Feedback signal
    Reason->>Perceive: New signal
    Perceive->>Gateway: Response
    Gateway->>User: "Done"
#+end_src
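
The three stages can be sketched as three functions that pass plists to one another. The signal and decision layouts are assumptions, and the proposer and verifier are passed in as plain functions so the sketch runs without an LLM. None of these definitions are taken from the harness source.

#+begin_src lisp
;; All names and plist keys here are hypothetical.
(defun perceive (raw-input source)
  "Normalize RAW-INPUT into a signal plist."
  (list :type :user-message
        :source source
        :payload (string-trim '(#\Space #\Tab #\Newline) raw-input)))

(defun reason (sig proposer verifier)
  "Ask PROPOSER for an action, then let VERIFIER accept or reject it."
  (let ((proposal (funcall proposer sig)))
    (list :status (if (funcall verifier proposal) :approved :rejected)
          :action proposal)))

(defun act (decision)
  "Run an approved action and return a feedback signal."
  (if (eq (getf decision :status) :approved)
      (list :type :feedback
            :result (funcall (getf (getf decision :action) :thunk)))
      (list :type :feedback :result :refused)))

(defun run-pipeline (raw-input proposer verifier)
  "Push one raw input through perceive, reason, and act."
  (act (reason (perceive raw-input :cli) proposer verifier)))
#+end_src

In this sketch a rejected proposal never reaches its =:thunk=, which is the property the deterministic stage exists to guarantee.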

** The Skill Registry

Skills are discovered, sorted by dependency, and loaded at boot:

#+begin_src mermaid
flowchart LR
    subgraph Discovery["Skill Discovery"]
        Scan["Scan skills/ directory"]
        Sort["Topological sort by DEPENDS_ON"]
    end

    subgraph Loading["Skill Loading"]
        Validate["Validate syntax"]
        Jail["Jail in package namespace"]
        Register["Register in catalog"]
    end

    Scan --> Sort --> Validate --> Jail --> Register
#+end_src
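
The dependency ordering step is a standard topological sort. A minimal depth-first version is sketched below, assuming each skill is represented as a list whose first element is its name and whose remaining elements are the names it depends on. That representation and the name =sort-skills= are illustrative assumptions.

#+begin_src lisp
(defun sort-skills (skills)
  "SKILLS is a list of (name dependency...). Return names in load order.
Signals an error on a dependency cycle or an unknown dependency."
  (let ((state (make-hash-table :test #'equal))
        (order '()))
    (labels ((visit (name)
               (ecase (gethash name state :unvisited)
                 (:done nil)
                 (:visiting (error "Dependency cycle at skill ~A" name))
                 (:unvisited
                  (let ((entry (assoc name skills :test #'equal)))
                    (unless entry (error "Unknown skill ~A" name))
                    (setf (gethash name state) :visiting)
                    ;; Load every dependency before the skill itself.
                    (mapc #'visit (rest entry))
                    (setf (gethash name state) :done)
                    (push name order))))))
      (mapc #'visit (mapcar #'first skills))
      (nreverse order))))

;; Example call:
;; (sort-skills '(("bouncer" "policy") ("policy") ("validator" "bouncer")))
#+end_src

Failing loudly on a cycle at boot is a deliberate choice in this sketch: a skill set that cannot be ordered should not be partially loaded.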

* The Three Data Stores

openCortex maintains three distinct representations of your knowledge:

| Store | Format | Location | Purpose |
|-------|--------|----------|---------|
| *Source of Truth* | Plaintext .org files | =~/memex/= | Human-readable, version-controlled |
| *Active Brain* | RAM (Lisp hash tables) | Memory | Fast, live, queryable |
| *Snapshots* | Binary .snap files | =~/.opencortex/= | Crash recovery, rollback |

The Active Brain is built from the Source of Truth on boot and kept in sync via:
- Buffer updates from Emacs (when you edit)
- Heartbeat snapshots (periodic persistence)
- Graceful shutdown saves
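
The relationship between the live store and a snapshot can be sketched in memory as follows. Writing the binary =.snap= files is not shown, and the names =*objects*=, =copy-store=, =take-snapshot=, and =rollback= are hypothetical stand-ins rather than the real memory API.

#+begin_src lisp
(defvar *objects* (make-hash-table :test #'equal)
  "Hypothetical active store mapping an id string to a property list.")

(defun copy-store (table)
  "Copy TABLE and each plist in it, so later edits do not leak across copies."
  (let ((copy (make-hash-table :test #'equal)))
    (maphash (lambda (id plist)
               (setf (gethash id copy) (copy-list plist)))
             table)
    copy))

(defun take-snapshot ()
  "Return an independent copy of the live store."
  (copy-store *objects*))

(defun rollback (snapshot)
  "Replace the live store with a fresh copy of SNAPSHOT."
  (setf *objects* (copy-store snapshot)))
#+end_src

Copying on both snapshot and rollback keeps a saved state reusable: restoring it once does not let later edits corrupt it.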

* The Evolutionary Roadmap

openCortex's roadmap is designed by working backwards from SOTA parity (v1.0.0), guided by a critical analysis of four reference systems: OpenCode, Claude Code (leaked source), GBrain, and OpenClaw/Hermes. Every borrowed concept is reimplemented in pure Lisp. Every rejected pattern is documented.

** Non-Negotiable Identity
- Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
- Single-address-space memory (Lisp hash tables in RAM — we *are* the memory).
- "Thin harness, fat skills" — complexity lives at the edges, not the kernel.
- One agent composed of many skills. No sub-agent topologies.
- Plists everywhere — homoiconic communication between all components.

*** OpenCode: Borrowed / Rejected

| Feature | Decision | Rationale |
|---------|----------|-----------|
| Permission filtering before LLM sees tools | BORROW | Hook into =generate-tool-belt-prompt= to exclude denied tools. We have =:guard= but no pre-filter. |
| Hook system (session start/end) | BORROW | Already designing event-orchestrator. Expose via =#+HOOK:= properties. |
| Skills with YAML frontmatter | REJECT | Our Org-mode =:PROPERTIES:= + =#+FILETAGS= already do this. |

*** Claude Code: Borrowed / Rejected

| Feature | Decision | Rationale |
|---------|----------|-----------|
| ULTRAPLAN / structured task decomposition | BORROW (reimplement) | LLM already generates plist actions. Add task-tree skill that decomposes into Org-mode headline DAGs with terminal states. |
| 43 integrated tools | BORROW (approach) | Start with ~3. Build more as skills. Keep =def-cognitive-tool= pattern. |
| 4-tier permission chain (ask/allow/deny) | BORROW (concept) | Three-tier per-tool permission: ask/allow/deny stored in org-objects. |
| Multi-agent hub-and-spoke topology | REJECT | We have one agent. Concurrency via bordeaux-threads (shared memory). Skills ARE the specialization — intra-process, not inter-process. |
| Mailbox pattern for dangerous ops | REJECT | Jailed skill packages + Policy skill already provide isolation. Bouncer gate satisfies "worker can't self-approve". |
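
The permission pre-filter borrowed in the tables above can be sketched as a plist lookup plus a filter. The variable =*tool-permissions*=, the tool names, and both functions are hypothetical. How the result would be wired into =generate-tool-belt-prompt= is not shown, because that function's signature is not given here.

#+begin_src lisp
;; Hypothetical permission table: tool name followed by its tier.
(defparameter *tool-permissions*
  '(:read-file :allow :shell :ask :delete-file :deny))

(defun tool-permission (tool)
  "Look up TOOL's tier; tools without an entry default to :ask."
  (getf *tool-permissions* tool :ask))

(defun visible-tools (tools)
  "Drop denied tools so they are never described to the model."
  (remove :deny tools :key #'tool-permission))

;; Example call:
;; (visible-tools '(:read-file :shell :delete-file :browse))
#+end_src

Defaulting unknown tools to =:ask= is an assumption of this sketch; a stricter deployment might default to =:deny= instead.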

*** GBrain: Borrowed / Rejected

| Feature | Decision | Rationale |
|---------|----------|-----------|
| RESOLVER.md intent routing | BORROW (concept) | =find-triggered-skill= already does this. Enhance with multi-skill triggers for complex intents. |
| Three search modes (keyword, hybrid, direct) | BORROW | Keep keyword + direct. Hybrid/vector via local Ollama embeddings — no external DBs. |
| Memory segmentation (brain/agent/session) | BORROW (concept) | Extend org-object with =:scope= property: =:memex= (permanent), =:session= (ephemeral), =:project= (scoped). |
| 20+ cron jobs for background work | BORROW (concept) | Heartbeat already does this. Enhance with Event Orchestrator's cron registry — pure Lisp. |
| Sub-agent model routing for cost | BORROW (concept) | Our =*model-selector-fn*= already selects models. Extend to route by complexity tier. |
| Postgres + pgvector | REJECT | Single-address-space hash tables. No external databases. |

*** opencortex-contrib: Integrate / Reject

| Skill | Decision | Rationale |
|-------|----------|-----------|
| self-fix + lisp-repair | INTEGRATE | Merge into =org-skill-self-edit=. Our memory has snapshot/rollback. Add =repair-file= as cognitive tool. |
| event-orchestrator | INTEGRATE | Merge hooks + cron + routing into ONE skill. Our loop has no unified orchestration. |
| formal-verification | INTEGRATE | =def-invariant= macro + =verify-action-formally= belong in =org-skill-policy.org= as additional checks. |
| engineering-standards | INTEGRATE | Git-clean-p gate + "Commit Before Modify" belong in Policy. |
| sub-agent-manager | REJECT | Redundant with BT threads. Our =defskill= pattern (trigger + probabilistic + deterministic) is intra-process specialization — same goal, zero process overhead. |
| embedding-generator | BORROW | Ollama embeddings for semantic search — no external vector DB. |
| playwright + web-research | DEFER | v0.5.0. Browser automation via Python bridge. |

** Version Roadmap

*** v0.1.0: The Autonomous Foundation — CURRENT RELEASE ✅

The secure, auditable Lisp kernel. All core infrastructure in place.

| Component | Status | Notes |
|-----------|--------|-------|
| Perceive-Reason-Act pipeline | ✅ | 3-stage metabolic loop |
| Skills engine with jailed loading | ✅ | defskill, topological sort, hot-reload |
| Policy skill (6 invariants) | ✅ | Transparency, Autonomy, Bloat, Modularity, Mentorship, Sustainability |
| Bouncer skill | ✅ | Command whitelist guard functions |
| Memory (org-object + Merkle) | ✅ | Hash tables, snapshots, rollback |
| Lisp validator skill | ✅ | Syntax validation before eval |
| Scribe + Gardener skills | ✅ | Heartbeat-driven distillation + audit |
| LLM gateway (OpenRouter + Ollama) | ✅ | Provider cascade |
| Shell actuator | ✅ | Safe command execution |
| Emacs bridge via Swank | ✅ | Point/buffer updates |
| FiveAM test suite | ✅ | Memory, boot, pipeline, act, communication |
| Credentials vault | ✅ | Encrypted storage |

*** v0.2.0: Self-Improvement + Local LLMs — NEXT

Priority: Self-editing is the foundation of all growth. Full org-mode manipulation makes the agent a true Emacs citizen.

| Feature | Source | Implementation |
|---------|--------|----------------|
| org-skill-self-edit (self-modification) | contrib self-fix + lisp-repair | Hook into =:syntax-error= events. Deterministic: auto-balance parens. Probabilistic: LLM surgical fix. Memory rollback on failure. |
| org-skill-emacs-edit (full org manipulation) | Own need | Read org buffers, parse AST, create/update/delete headlines, set properties, manage TODO, handle links. Uses org-element. |
| Local vector search (Ollama embeddings) | contrib embedding-generator | =generate-embeddings= via Ollama. Add =:vector= to org-object. Semantic search with cosine similarity. |
| Tool permission tiers (ask/allow/deny) | Claude Code | Per-tool permission plist in org-object. =generate-tool-belt-prompt= filters denied tools. |
| Skill hot-reload (=:reload-skill= tool) | Own need | Swap compiled skill files without breaking sockets. |
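
The cosine-similarity ranking mentioned for local vector search needs no external library. The sketch below assumes embeddings are already available as numeric vectors; obtaining them from Ollama is not shown, and the function names are illustrative.

#+begin_src lisp
(defun cosine-similarity (a b)
  "Cosine of the angle between equal-length numeric vectors A and B."
  (assert (= (length a) (length b)))
  (let ((dot (reduce #'+ (map 'list #'* a b)))
        (norm-a (sqrt (reduce #'+ (map 'list #'* a a))))
        (norm-b (sqrt (reduce #'+ (map 'list #'* b b)))))
    ;; A zero vector has no direction, so treat it as unrelated.
    (if (or (zerop norm-a) (zerop norm-b))
        0.0
        (/ dot (* norm-a norm-b)))))

(defun nearest (query candidates)
  "CANDIDATES is a list of (id . vector). Return ids, most similar first."
  (mapcar #'car
          (sort (copy-list candidates) #'>
                :key (lambda (entry)
                       (cosine-similarity query (cdr entry))))))
#+end_src

A linear scan like this is a reasonable starting point when every vector already lives in the same address space; an index only becomes necessary once the scan itself is the bottleneck.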

*** v0.3.0: Event Orchestration + Context Awareness

Priority: Unified control plane, deep project understanding before complex work.

| Feature | Source | Implementation |
|---------|--------|----------------|
| org-skill-event-orchestrator (hooks+cron+routing) | contrib event-orchestrator | Merge =*hook-registry*= + =*cron-registry*= + complexity classifier. Hooks via =#+HOOK:=. Three tiers: =:REFLEX= (no LLM), =:COGNITION= (light LLM), =:REASONING= (full LLM). |
| org-skill-context-manager (project scoping) | contrib context-manager | Stack-based context. =push-context= / =pop-context=. Path resolution relative to current context. |
| Memory scope segmentation | GBrain | =:scope= on org-objects: memex/session/project. Scope-aware retrieval. |
| Model-tier routing (cost optimization) | GBrain | Heartbeat → smallest model. User input → medium. Complex reasoning → large. |
| Slash commands (TUI ergonomics) | Own need | =M-x= style command palette. =/-= prefix. Commands defined in org-mode. |

*** v0.4.0: Long-Horizon Planning + Git Workflows

Priority: Real engineering work spans dozens of steps. Structured tracking, failure handling, course correction.

| Feature | Source | Implementation |
|---------|--------|----------------|
| org-skill-long-horizon (task tree DAG) | Claude Code ULTRAPLAN | Decompose tasks into Org-mode headline trees. Terminal states: =:done= / =:blocked= / =:stuck=. Parent summarises children. Branch pruning. |
| org-skill-git-steward (version control) | contrib git-steward | Status, diff, commit, push, branch. Policy enforces commit-before-modify. |
| TDD runner integration | contrib tdd-runner | FiveAM on file save. =:test-failure= events. Hook into self-fix for auto-repair. |
| Deep Emacs integration | Own need | Full org-agenda awareness. Clock time, refile, archive. |
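
The "parent summarises children" rule for task trees can be sketched as a recursive function over nested lists. The task representation, the =:todo= fallback state, and the precedence of =:stuck= over =:blocked= are all assumptions made for this illustration.

#+begin_src lisp
(defun task-state (task)
  "TASK is (title state child...). A leaf reports its own state;
a parent's state is derived from its children."
  (destructuring-bind (title state &rest children) task
    (declare (ignore title))
    (if (null children)
        state
        (let ((states (mapcar #'task-state children)))
          (cond ((every (lambda (s) (eq s :done)) states) :done)
                ;; Assumed precedence: a stuck child outranks a blocked one.
                ((member :stuck states) :stuck)
                ((member :blocked states) :blocked)
                (t :todo))))))

;; Example call:
;; (task-state '("Ship" nil ("Write" :done) ("Test" :blocked)))
#+end_src

Because the tree is plain nested data, the same structure maps directly onto Org headlines, with each sublist corresponding to a child headline.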

*** v0.5.0: Creator + Architect + GTD

Priority: Agent bootstraps itself. Creates skills autonomously, designs projects from PRDs, tracks work.

| Feature | Source | Implementation |
|---------|--------|----------------|
| org-skill-creator (autonomous skill generation) | contrib creator | LLM drafts complete skill org-file. Mandatory: syntax validation → jail-load → test → register. |
| org-skill-architect (PRD → PROTOCOL) | contrib architect | Scan =:STATUS: FROZEN= PRDs. Generate Phase B PROTOCOL. |
| org-skill-gtd (project tracking) | contrib gtd | Full GTD cycle. org-gtd v4.0 DAG (=:TRIGGER:=, =:BLOCKER:=). |
| Consensus loop (multi-model agreement) | contrib consensus | Run multiple providers, compare results, detect disagreements. |
| Web research (Playwright browsing) | contrib playwright | Headless Chromium via Python bridge. Gemini Web UI automation. |

*** v1.0.0: SOTA Parity

Feature-complete agent, competitive with commercial agents. All borrowed concepts reimplemented in pure Lisp.

| Area | Status | Notes |
|------|--------|-------|
| Self-improvement | ✅ v0.2.0 | Self-edit + lisp-repair = Claude Code self-debug parity |
| Planning | ✅ v0.4.0 | Task tree DAGs = ULTRAPLAN equivalent |
| Tool ecosystem | 🟡 v0.4.0 | 10+ tools (expand from 3) |
| Context window | ✅ v0.3.0 | Semantic search + scope segmentation |
| Safety | ✅ v0.1.0 | 6 Policy invariants + formal verification |
| Multi-step tasks | ✅ v0.4.0 | Task trees with terminal states |
| Code editing | ✅ v0.2.0 | Full file read/write via org manipulation |
| Memory | 🟡 v0.2.0 | Add vector recall to org-object |
| Emacs integration | ✅ v0.2.0 | Full org-mode control — exceeds Claude Code |
| Autonomy | ✅ v0.1.0 | 100% local capable (Ollama) — exceeds Claude Code |

*** v2.0.0: Lisp Machine Emergence

The agent moves from "using Lisp" to "being Lisp."

| Feature | Implementation |
|---------|----------------|
| Lisp editor (Lish) | Org-mode as IDE. Org-babel for interactive evaluation. Full REPL in TUI. |
| Shell replacement (Lish) | Lisp-based shell that speaks plists. Org-mode buffers as file system. |

*** v3.0.0: Neurosymbolic Maturity

| Feature | Implementation |
|---------|----------------|
| Deterministic planner | Planner as pure Lisp function. No LLM for scheduling. |
| Self-correcting gates | Gates learn from false positives (user override patterns). |

*** v4.0.0: AI Stack Internalized

| Feature | Implementation |
|---------|----------------|
| Llama.cpp in Lisp | FFI binding to llama.cpp. No Python. |
| Weights as sexps | Neural weights as Lisp data structures. |

*** v5.0.0: True Agency

| Feature | Implementation |
|---------|----------------|
| World models | Agent builds predictive models of user behavior, project dynamics, system state. |
| Temporal reasoning | The agent reasons about time: scheduling, deadlines, elapsed duration. |
| Goal persistence | Goals survive restarts. Long-term projects tracked in org-objects. |

* Design Principles

** 1. Radical Transparency

If you can't explain it, you can't do it. Every action must be auditable. Hidden reasoning is forbidden.

** 2. Autonomy First

Dependency on proprietary systems is debt. Prefer local, offline-capable solutions.

** 3. Zero Bloat

Complexity must be earned, not anticipated. The harness must remain minimal.

** 4. Modularity

The kernel must survive even if all skills fail. Complexity belongs at the edges.

** 5. Mentorship

Teaching is the highest form of assistance. Every action should increase capability.

** 6. Sustainability

Build for the 100-year horizon. Design for offline operation and local inference.

* Contributing

See [[file:docs/CONTRIBUTING.org][CONTRIBUTING.org]] for the Literate Granularity standard and skill creation guidelines.

* License

openCortex is released under the [[file:LICENSE][AGPLv3 license]].

See [[file:CLA.org][CLA.org]] for the Contributor License Agreement.