3 Commits

Author SHA1 Message Date
9e77958028 feat(opencortex): project-local TODO.org and expanded design decisions
Some checks failed
Deploy-Agent-V15-Stdin / JOB-V15-STDIN (push) Failing after 2s
- Create TODO.org for project-specific tasks (migrated from gtd.org)
- Expand DESIGN_DECISIONS.org with 8 new sections:
  - Self-modification without boundaries (vs Hermes)
  - Lisp and the AI Dream (1958 vision fulfilled)
  - REPL as cognitive substrate (with REPL explanation)
  - Evaluation harness (SWE-bench, chaos testing)
  - Observability and the thought trace
  - MCP strategy (native Lisp client)
  - Local-first architecture
  - Zero-dependendency deployment
- Fix org-mode syntax errors in tui-client

Parent gtd.org now links to projects/opencortex/TODO.org
Add projects/opencortex/TODO.org to org-agenda-files in emacs-gtd.org
2026-05-01 21:42:54 -04:00
9191aecab2 fix(tui): resolve crash by removing --non-interactive and adding defensive rendering 2026-05-01 18:16:55 -04:00
48520ec517 refactor(harness): centralize mandates, fix TUI reader structure, and enhance memory/perceive 2026-05-01 12:43:25 -04:00
20 changed files with 1670 additions and 231 deletions

View File

@@ -1,18 +0,0 @@
# OpenCortex Agent Mandates
This file defines mandatory workflows and technical standards for the Gemini CLI agent operating within the OpenCortex environment. These mandates supersede general defaults.
## Lisp Integrity Mandates
- **Validation:** Before applying any change to a `.lisp` file or a Lisp block in an `.org` file, you MUST use `utils-lisp-validate` to ensure structural and semantic integrity.
- **Formatting:** All generated Lisp code MUST be piped through `utils-lisp-format` to maintain project-standard indentation before being saved.
- **Structural Editing:** When modifying complex Lisp forms (nested macros or large functions), prefer using `utils-lisp-structural-extract` and `utils-lisp-structural-wrap` to avoid manual parenthesis errors.
- **Verification:** For new or non-trivial logic, use `utils-lisp-eval` to test the behavior of the isolated S-expression in a live REPL environment before tangling.
## Literate Org Mandates
- **AST Integrity:** When modifying Org files, utilize `utils-org-set-property`, `utils-org-set-todo`, and `utils-org-add-headline` to manipulate the document structure programmatically whenever possible.
- **ID Management:** Every new headline intended for tracking or tangling MUST have a unique ID generated via `utils-org-generate-id`.
## Engineering Workflow
- **Commit-Before-Modify:** Verify the git state is clean before starting a multi-file refactor.
- **Tangle Sync:** After modifying any `.org` file, you MUST ensure the corresponding `.lisp` artifacts are tangled and in sync.
- **Validation:** Run the project-specific test suite (`sbcl --load opencortex.asd`) after every significant change to verify system stability.

793
TODO.org Normal file
View File

@@ -0,0 +1,793 @@
# OpenCortex Project Tasks
# All OpenCortex-related TODOs live here. gtd.org links to this file.
* PHASE: AUTONOMOUS MVP (v0.1.0 Released)
:PROPERTIES:
:ID: proj-mvp-v0-1-0
:END:
The "Zero-to-One" release. The agent must be mathematically secure, CLI-first, and capable of autonomous Memex maintenance.
** DONE 1. Harness Hardening (The Final Audit)
*** DONE Audit remaining core skills (`org-skill-policy.org`, `org-skill-bouncer.org`) to the new Literate Granularity standard.
*** DONE Implement Verification Lock: Ensure `MANDATORY_SKILLS` pass `validate-lisp-syntax` before boot proceeds.
*** DONE Logging & Transparency: Ensure `context-get-system-logs` is utilized by the Reason engine to explain blocked actions.
** DONE 2. The Autonomous Scribe & Gardener (The Primary Value Prop)
*** DONE Implement `org-skill-scribe.org`: Background worker that distills daily chronological logs into structured Zettelkasten notes.
*** DONE Implement `org-skill-gardener.org`: Heartbeat-driven skill that autonomously flags orphaned nodes and repairs broken links.
** DONE 3. The Zero-to-One Experience (setup.org)
*** DONE Consolidate installation instructions, `onboard.sh`, and `Dockerfile` into a single, literate `setup.org` file.
*** DONE Ensure the setup process interactively builds the `.env` and verifies SBCL/Quicklisp dependencies.
** DONE 4. CLI-First Actuation
CLOSED: [2026-04-14 Tue 09:40]
*** DONE Verified the `cli` actuator and inbound gateway handle standard I/O interaction gracefully via a stateful `socat` connection.
* PHASE: PUBLICATION & VERIFICATION (v0.1.0 Post-Release)
:PROPERTIES:
:ID: proj-pub-v0-1-0
:END:
Ensuring the system is ready for the world through collaborative testing, documentation, and licensing.
** DONE 1. Collaborative End-to-End Testing [2026-04-21 Tue]
CLOSED: [2026-04-21 Tue 17:30]
*** DONE Verified stable foundation at commit `cab0e5a`.
*** DONE Verified boot sequence and bidirectional connectivity.
** DONE 2. Semantic Reorganization & System Stabilization [2026-04-21 Tue]
CLOSED: [2026-04-21 Tue 18:30]
*** DONE Rename directories: harness/, library/, environment/, infrastructure/.
*** DONE Consolidate Probabilistic engine into reason.lisp.
*** DONE Embed bidirectional CLI logic into opencortex.sh.
*** DONE Stabilize skill engine: 12/12 skills loaded with package jailing.
*** DONE Cleanup legacy documentation and deployment artifacts.
** DONE 2. Comprehensive Documentation <2026-04-14 Tue>
CLOSED: [2026-04-20 Mon 18:00]
*** DONE Draft `USER_MANUAL.org`: Focus on CLI interaction, installation, and Memex structure.
*** DONE Draft `CONTRIBUTING.org`: Explain Literate Granularity and Skill creation standards.
** DONE 3. License & Legal Finalization <2026-04-14 Tue>
CLOSED: [2026-04-17 Fri 11:25]
*** DONE Assign the AGPLv3 open-source license.
*** DONE Implement a broad Contributor License Agreement (CLA) process.
*** DONE Update `LICENSE` and `CHANGELOG` accordingly.
** TODO 4. GitHub Migration & Repository Setup <2026-04-14 Tue>
*** TODO Migrate primary remote to GitHub and configure canonical repository.
*** TODO Set repository topics, badges, issue templates, and CI/CD foundations.
** TODO 5. Marketing & Social Media Launch <2026-04-14 Tue>
*** TODO Execute PR plan (Reddit, Hacker News, X/Twitter).
*** TODO Create a short, high-quality terminal demo GIF/video of the TUI interaction.
* PHASE: INTERACTIVE REFINEMENT (v0.2.0 Target)
:PROPERTIES:
:ID: proj-refinement-v0-2-0
:END:
Elevating the user interface from raw shell piping to a high-fidelity, native Lisp experience. Priority: Self-editing is the foundation of all future growth. Full org-mode manipulation makes the agent a true Emacs citizen.
Roadmap basis: Evolutionary roadmap from README.org. Working backwards from SOTA parity.
** DONE 0. Autonomous Self-Editing Foundation
*** DONE org-skill-lisp-repair (Lisp syntax repair)
- Deterministic: auto-balance parens via paren-counting
- Probabilistic: LLM generates surgical fix on =:syntax-error= events
- Memory rollback on failure
DONE: Now in org-skill-lisp-utils (merged from contrib)
*** DONE org-skill-emacs-edit (full org-mode manipulation)
- Read org buffers, parse AST via org-element
- Create/update/delete headlines, set properties, manage TODO states
- Handle =id:= links and internal links
- Pure Lisp implementation (no Emacs subprocess)
*** DONE Expose Structural AST Editing Tools
- Wrap org-skill-emacs-edit into def-cognitive-tool definitions
- Force LLM to use semantic node updates instead of regex file I/O
*** DONE Implement Reflection Loops
- Feed rejection traces (syntax errors, policy blocks) back to LLM to trigger self-correction
*** DONE Harden Actuators
- Fix path-traversal vulnerabilities in file I/O tools (e.g. :write-file)
- Enforce Merkle-snapshots on all state-modifying actions
*** DONE Implement tool permission tiers (ask/allow/deny)
- Per-tool permission plist stored in org-object
- =generate-tool-belt-prompt= filters denied tools before LLM sees them
- Ask-tier prompts user before execution
*** DONE Implement skill hot-reload (=:reload-skill= tool)
- Swap compiled skill files without breaking active sockets
- Reload skill into jailed package namespace
- DONE: Added :reload-skill, :read-file, :write-file, :replace-string tools
- DONE: Fixed ASDF compilation bug (position tracking issue with :serial t)
- DONE: Added explicit :depends-on declarations to opencortex.asd
** DONE Engineering Process Improvements [2026-04-23 Wed]
*** DONE Fix ASDF compilation bug (position tracking at byte 16834)
- Root cause: Duplicate proto-get, bt: prefix issues, :serial t position cache
- Fix: Removed duplicate, fixed bt:->bordeaux-threads, explicit dependencies
- Added eval-when wrapper for new tools (good Lisp practice)
*** DONE Add test-first methodology to engineering standards
- Rule 10: Test-first - design tests before coding, run chaos testing
- Rule 11: Org as thinking medium - document investigations in prose
- Rule 12: Engineering decision audit trail - document root cause, tradeoffs
- Added to opencortex-contrib/skills/org-skill-engineering-standards.org
*** DONE Perform comprehensive architectural review and evolution strategy [2026-04-27 Mon]
- Identified hidden gaps: Org-mode round-trip, sandboxing vulnerabilities, and GC scaling.
- Defined "Structural AST Editing" and "Reflection Loops" as core strategic requirements.
- Captured findings in [[file:notes/opencortex-architectural-evolution.org][opencortex-architectural-evolution.org]].
*** DONE Fix API drift in opencortex-contrib [2026-04-27 Mon]
- Standardized legacy keywords (:neuro/:symbolic) to new harness standard (:probabilistic/:deterministic).
- Updated 16 skills in opencortex-contrib for kernel compatibility.
** DONE 4. Core Skills Consolidation [2026-04-23 Thu]
- Merged lisp-validator + lisp-repair → org-skill-lisp-utils.org
- Added lisp utilities: count-char, deterministic-repair, neural-repair
- Added validation: structural, syntactic, semantic checks
- Moved org-skill-self-fix from contrib → core
- Moved org-skill-engineering-standards from contrib → core
- Deleted old org-skill-lisp-validator.org
** DONE 5. Advanced CLI Onboarding Experience
*** DONE Implement interactive Lisp CLI wizard (=opencortex setup=)
*** TODO Implement =opencortex link <gateway>= for Telegram/Signal bot connection
*** DONE Implement =opencortex doctor= for environment health and API key validation [2026-04-28 Tue]
- Verified 22/22 skills loading with clean boot.
- Fixed macro conflicts and package jailing bugs.
*** TODO Implement =opencortex install <skill>= for dynamic skill downloading
** DONE Chaos-Driven Bug Fixes (v0.2.0 Pre-Release) [2026-04-28 Tue]
- Fixed major conflict between Type A and Type B def-cognitive-tool macros.
- Enforced dynamic-only loading by removing skills from opencortex.asd.
- Fixed let/let* sequential binding bugs in emacs-edit and self-edit.
- Standardized *cognitive-tools* as a centralized hash table.
- Resolved missing in-package declarations in core skills.
** DONE 1. Common Lisp TUI Implementation [2026-04-28 Tue]
*** DONE Integrate =croatoan= for native terminal rendering
*** DONE Implement scrollable history viewport for chat and thought streams
*** DONE Implement fixed bottom input box with multi-line support and command history
*** DONE Implement persistent status bar for background workers (Scribe/Gardener)
*** DONE Support syntax highlighting for Lisp code blocks and Org-mode syntax
** DONE 2. Slash Commands & Interactive Control [2026-04-28 Tue]
*** DONE Implement =/help= command for system overview
*** DONE Implement =/exit= and =/clear= commands
*** DONE Implement =/skill-load <name>= for dynamic hot-reloading
*** DONE Implement =/status=, =/config=, =/search=, =/commit= slash commands
** DONE 3. Direct Lisp-to-Terminal Actuation [2026-04-28 Tue]
*** DONE Refactor the =:cli= actuator to use native TUI rendering
** DONE 4. Persistent REPL for Interactive Development [2026-04-30 Thu]
*** DONE Implement org-skill-repl for persistent Lisp evaluation
- repl-eval: evaluate code with result+output+error separation
- repl-inspect: inspect variables and functions
- repl-list-vars: list bound symbols in package
- repl-load-file: load files into image
- Supports REPL-first workflow with literate reflection in org
* PHASE: EVENT ORCHESTRATION + HITL (v0.3.0)
:PROPERTIES:
:ID: proj-orchestration-v0-3-0
:END:
Unified control plane: hooks + cron + routing in one skill. Deep project understanding.
** TODO 0. Project Renaming (Bouncer → Dispatcher)
*** TODO Audit all files for component names to rename
*** TODO Rename org-skill-bouncer.org → org-skill-dispatcher.org
*** TODO Rename skill-bouncer package → skill-dispatcher
*** TODO Rename cognitive tool =:bouncer= → =:dispatcher=
*** TODO Update all references in harness, skills, documentation
*** TODO Update gtd.org and ROADMAP.org terminology
*** TODO Update DESIGN_DECISIONS.org section if applicable
*** TODO Verify all tests pass after rename
:LOGBOOK:
- State "TODO" from "" [2026-05-01 Fri 15:40]
:END:
The Dispatcher's role has evolved beyond security guard. It is the seed of the deterministic engine - it learns to execute procedures without invoking the neural net.
** TODO 1. Event Orchestrator (unified hooks+cron+routing)
*** TODO Integrate contrib org-skill-event-orchestrator
- Merge *hook-registry* + *cron-registry* + complexity classifier
- Hooks via =#+HOOK:= Org-mode properties
- Three complexity tiers: =:REFLEX= (no LLM), =:COGNITION= (light LLM), =:REASONING= (full LLM)
- Hook into heartbeat for cron processing
** TODO 2. Context Manager (project scoping)
*** TODO Integrate contrib org-skill-context-manager
- Stack-based context with =push-context= / =pop-context=
- Path resolution relative to current context
- Memory scope: =:scope= property on org-objects (memex/session/project)
- Implement lazy-loading proxies for large-scale memory traversal (offload cold nodes to disk)
** TODO 3. Model-Tier Routing (cost optimization)
*** TODO Extend =*model-selector-fn= for complexity-based routing
- Heartbeats → smallest model
- User input → medium model
- Complex reasoning → large model
- Source: GBrain sub-agent model routing
** TODO 4. Memory Scope Segmentation
*** TODO Extend org-object with =:scope= property
- =:memex= (permanent knowledge)
- =:session= (ephemeral context)
- =:project= (scoped to current work)
- Scope-aware retrieval in memory.lisp
** TODO 5. Asynchronous Embedding Gateway
*** TODO Implement provider-agnostic org-skill-embedding-gateway
- Support Ollama, llama.cpp, and OpenAI based on .env config
- Implement lazy-loading: edits mark nodes as =:vector :pending=
- Background worker thread batches pending nodes and updates Merkle tree silently
** TODO 6. Slash Commands (TUI ergonomics)
*** TODO M-x style command palette
*** TODO /- prefix for command mode
*** TODO Commands defined in Org-mode
* PHASE: LONG-HORIZON PLANNING + GIT WORKFLOWS (v0.4.0)
:PROPERTIES:
:ID: proj-planning-v0-4-0
:END:
Multi-step task mastery, structured tracking with failure handling and course correction.
** TODO 0. Long-Horizon Planning (task tree DAG)
*** TODO Implement org-skill-long-horizon
- Decompose complex tasks into Org-mode headline trees
- Terminal states: =:todo==:next-action==:in-progress==:done= / =:blocked= / =:stuck=
- Parent summarises child results
- Branch pruning when paths fail
- Source: Claude Code ULTRAPLAN (reimplemented in Lisp)
** TODO 1. Git Steward (version control integration)
*** TODO Integrate contrib org-skill-git-steward
- Status, diff, commit, push, branch operations
- Policy: commit-before-modify gate (from contrib engineering-standards)
- Log commits to memory
** TODO 2. TDD Runner Integration
*** TODO Integrate contrib org-skill-tdd-runner
- Run FiveAM tests on file save
- Inject =:test-failure= event on red
- Hook into self-fix for auto-repair proposals
** TODO 3. Deep Emacs Integration
*** TODO Full org-agenda awareness
- Navigate, clock time, refile, archive
- Uses org-element + org-id
* PHASE: INTERACTIVE ACTUATION & ENVIRONMENT STEWARDSHIP (v0.5.0)
:PROPERTIES:
:ID: proj-actuation-v0-5-0
:END:
Interactive terminal sessions and autonomous dependency management.
** TODO 0. Interactive PTY Actuator
*** TODO Stream long-running process output to the context window (e.g., `npm run dev`, REPLs)
*** TODO Implement async interrupt control (Ctrl+C emulation)
** TODO 1. The Environment Steward
*** TODO Autonomously detect missing dependencies (e.g., "Command not found")
*** TODO Propose an installation command and retry the failed action
* PHASE: CREATOR + ARCHITECT + GTD (v0.6.0)
:PROPERTIES:
:ID: proj-creator-v0-5-0
:END:
Agent bootstraps itself: creates skills autonomously, designs projects from PRDs, tracks work.
** TODO 0. Skill Creator (autonomous skill generation)
*** TODO Integrate contrib org-skill-creator
- LLM drafts complete skill org-file from natural language
- Mandatory: syntax validation → jail-load → test → register
** TODO 1. Architect Agent (PRD → PROTOCOL)
*** TODO Integrate contrib org-skill-architect
- Scan =:STATUS: FROZEN= PRDs
- Generate Phase B PROTOCOL from Phase A
- Write to same file
** TODO 2. GTD Integration (project tracking)
*** TODO Integrate contrib org-skill-gtd
- Full GTD cycle: capture, clarify, organize, reflect, engage
- org-gtd v4.0 DAG (=:TRIGGER:=, =:BLOCKER:=)
** TODO 3. Consensus Loop (multi-model agreement)
*** TODO Integrate contrib org-skill-consensus
- Run multiple providers for critical decisions
- Compare results, detect disagreements
- Confidence scoring
** TODO 4. Web Research (Playwright browsing)
*** TODO Integrate contrib org-skill-playwright + org-skill-web-research
- Headless Chromium via Python bridge
- Text extraction and screenshots
- Gemini Web UI automation
** TODO 5. Memex Management (PARA lifecycle)
*** TODO Integrate contrib org-skill-memex + org-skill-workspace-manager
- Archive DONE tasks, suggest refiling
- Detect orphaned nodes
- PARA/Zettelkasten maintenance
* PHASE: VISUAL GROUNDING & MCP BRIDGE (v0.7.0)
:PROPERTIES:
:ID: proj-vision-v0-7-0
:END:
Multimodal visual interaction and ecosystem-wide tool compatibility.
** TODO 0. Computer Use / Vision
*** TODO Allow the agent to request host OS or browser screenshots
*** TODO Analyze UI and issue precise X/Y coordinate click/type commands via an X11/Wayland bridge
** TODO 1. MCP Gateway Bridge
*** TODO Build a Lisp-native client for the Model Context Protocol
*** TODO Connect OpenCortex to external tools and data sources
* PHASE: THE EVALUATION HARNESS (v0.8.0)
:PROPERTIES:
:ID: proj-eval-v0-8-0
:END:
Automated benchmarking to mathematically prove the agent's reasoning capabilities.
** TODO 0. SWE-Bench Harness
*** TODO Automated pipeline that clones repositories and feeds GitHub issues
*** TODO Track multi-step resolution trajectory, run tests, and score success
* PHASE: SOTA PARITY (v1.0.0)
:PROPERTIES:
:ID: proj-sota-v1-0-0
:END:
Feature-complete agent competitive with commercial agents. All borrowed concepts reimplemented in pure Lisp.
All features from v0.2.0 through v0.8.0 combined, verified, and tested end-to-end.
| Area | Parity Target |
|------|--------------|
| Self-improvement | Claude Code self-debug |
| Planning | ULTRAPLAN equivalent |
| Tool ecosystem | 10+ cognitive tools |
| Context window | Semantic search + scope segmentation |
| Safety | 6 Policy invariants + formal verification |
| Multi-step tasks | Task trees with terminal states |
| Code editing | Full file read/write via org manipulation |
| Memory | Vector recall in org-object |
| Emacs integration | Full org-mode control (exceeds Claude Code) |
| Autonomy | 100% local capable (exceeds Claude Code) |
* PHASE: LISP MACHINE EMERGENCE (v2.0.0)
:PROPERTIES:
:ID: proj-lisp-v2-0-0
:END:
From Lisp-using agent to true Lisp machine. Agent IS the Emacs process.
** TODO Lish: Lisp editor as Org-mode IDE
- Org-babel for interactive Lisp evaluation
- Full REPL in TUI
- No bridge needed — direct memory access
** TODO Lish: Shell replacement
- Lisp-based shell that speaks plists
- Org-mode buffers as file system
- No bash dependency
* PHASE: NEUROSYMBOLIC MATURITY (v3.0.0)
:PROPERTIES:
:ID: proj-neuro-v3-0-0
:END:
Deterministic planner takes the wheel. LLM relegated to semantic translation.
** TODO Deterministic planner
- Planner as pure Lisp function
- No LLM needed for scheduling
- Generates task graphs without probabilistic inference
** TODO Self-correcting gates
- Gates learn from false positives (user override patterns)
- Adaptive threshold adjustment
* PHASE: AI STACK INTERNALIZED (v4.0.0)
:PROPERTIES:
:ID: proj-ai-v4-0-0
:END:
The agent understands its own weights. No external inference.
** TODO Llama.cpp in Lisp
- FFI binding to llama.cpp
- No Python subprocess
- Pure Common Lisp inference
** TODO Weights as sexps
- Neural weights represented as Lisp data structures
- Homoiconic model introspection
* PHASE: TRUE AGENCY (v5.0.0)
:PROPERTIES:
:ID: proj-agency-v5-0-0
:END:
World models, temporal reasoning, goal persistence across restarts.
** TODO World models
- Agent builds predictive models of user behavior
- Project dynamics awareness
- System state forecasting
** TODO Temporal reasoning
- Scheduling and deadline awareness
- Elapsed duration tracking
- Calendar integration
** TODO Goal persistence
- Goals survive restarts
- Long-term projects tracked in org-objects
- Cross-session continuity
* PHASE: EVOLUTIONARY ROADMAP (Previous — Superseded by Critical Analysis)
:PROPERTIES:
:ID: proj-old-roadmap
:END:
Superseded by the critical analysis-informed roadmap above (v0.2.0 through v5.0.0). This section kept for historical reference.
** TODO v0.1.0: The Autonomous Foundation (Current Release) — Now COMPLETE
** TODO v1.0.0 (Phase 2.5): The Verified Wrapper (SOTA Parity) — Now v1.0.0
** TODO v2.0.0 (Phase 3): Cannibalizing the Toolchain — Now v2.0.0
** TODO v3.0.0 (Phase 4): True Symbolic Determinism — Now v3.0.0
* PHASE: FOUNDATION (Complete)
** DONE Draft Swank/Socket communication protocol between CL and Emacs
:PROPERTIES:
:CREATED: [2026-03-22 Sun 14:00]
:ASSIGNED: Agent
:END:
** DONE Implement core Perceive-Think-Act loop in Common Lisp
:PROPERTIES:
:CREATED: [2026-03-22 Sun 14:00]
:ASSIGNED: Agent
:END:
** DONE Implement Persistent Object-Store for Org entities in CL
:PROPERTIES:
:CREATED: [2026-03-22 Sun 16:30]
:ASSIGNED: Agent
:END:
** DONE Implement LLM Connector (Probabilistic Engine) in CL Daemon
:PROPERTIES:
:CREATED: [2026-03-22 Sun 17:30]
:ASSIGNED: Agent
:END:
** DONE Design Deterministic Engine Heuristics (Lisp logic over Memory)
:PROPERTIES:
:CREATED: [2026-03-22 Sun 17:30]
:END:
** DONE Achieve Phase 3: The Self-Editing Kernel
:PROPERTIES:
:CREATED: [2026-03-23 Mon 16:30]
:END:
- Jailing & Sandboxing implemented
- Org-Native Skill Standard established
- Telemetry & Introspection API active
* PHASE: THE SOVEREIGN BOUNDARY (Core vs Skills Refactor)
:PROPERTIES:
:ID: proj-autonomous-boundary
:END:
Slim down the opencortex microharness by moving non-essential cognitive functions to hot-reloadable user-space skills.
** DONE Extract LLM Provider Routing to a Skill (neuro.lisp)
** DONE Extract Vector Embedding Algorithms to a Skill (embedding.lisp)
CLOSED: [2026-04-12 Sun 14:10]
:PROPERTIES:
:ID: extract-embedding-skill
:END:
- Created `org-skill-embedding.org`.
- Moved logic to `src/embedding-logic.lisp` via tangling.
- Updated `system-definition.org`.
** DONE Extract Sparse Tree Context Pruning Strategies to a Skill (context.lisp)
CLOSED: [2026-04-12 Sun 14:25]
:PROPERTIES:
:ID: extract-context-skill
:END:
- Created `org-skill-peripheral-vision.org`.
- Moved logic to `src/context-logic.lisp` via tangling.
- Updated `system-definition.org`.
** DONE Implement `org-skill-peripheral-vision` (Moving embedding logic out of core)
CLOSED: [2026-04-12 Sun 14:25]
:PROPERTIES:
:ID: impl-peripheral-vision
:END:
** DONE Implement communication protocol Schema Validation (Prevent reader macro injection in communication.lisp)
CLOSED: [2026-04-12 Sun 14:45]
:PROPERTIES:
:ID: communication-protocol-schema-validation
:END:
- Created `org-skill-protocol-validator.org`.
- Integrated `validate-communication-protocol-schema` into `communication.org`.
- Added `protocol-validator.lisp` to system definition.
** DONE Implement Pluggable communication protocol Integrity Hashing (Core interface, Skill-based algorithms)
CLOSED: [2026-04-12 Sun 15:15]
:PROPERTIES:
:ID: communication-protocol-integrity-hashing
:END:
- Integrated HMAC-SHA256 (`ironclad:make-mac`) in `literate/communication.org`.
** DONE Implement Native Lisp Merkle-Tree Versioning (Short-term undo buffer in memory.lisp)
CLOSED: [2026-04-12 Sun 19:15]
** DONE Performance: Implement Copy-on-Write (CoW) or Persistent Data Structures for Memory
CLOSED: [2026-04-12 Sun 19:15]
** DONE Feature: Implement Latent Reflection (Proactive Gardening) using heartbeat idle cycles
CLOSED: [2026-04-12 Sun 19:15]
** DONE Simplification: Refactor Cognitive Cycle into a Unified Reactive Signal Pipeline
CLOSED: [2026-04-12 Sun 19:15]
** DONE Resilience: Implement Micro-Rollbacks for the Immune System
CLOSED: [2026-04-12 Sun 19:15]
** DONE Implement `org-skill-memory-archivist` (Long-term IPFS checkpointing and P2P sync)
CLOSED: [2026-04-12 Sun 19:15]
** DONE Implement True Lisp Sandboxing (eval-safe mechanism in core and policy in skills)
CLOSED: [2026-04-12 Sun 19:15]
** DONE Decouple Vendor Logic from Probabilistic Engine (Move Google/Anthropic/OpenAI to Skills)
CLOSED: [2026-04-12 Sun 19:15]
** DONE Component IV: Comprehensive Core Skill Audit (Review all 39 skills)
CLOSED: [2026-04-12 Sun 19:45]
:PROPERTIES:
:ID: core-skill-audit-task
:END:
** DONE Consolidation I: Unified LLM Gateway (Anthropic, Gemini, Groq, OpenAI, etc.)
** DONE Consolidation II: Credentials Vault (Secure Enclave & Masked Logging)
** DONE Consolidation III: Homoiconic Memory (Unified Grammar, Bridge, & ID Generation)
** DONE Consolidation IV: State Persistence Layer (Unified Local & IPFS Checkpointing)
** DONE Consolidation V: Event Orchestrator (Unified Cron, Hooks, & Cognitive Routing)
** DONE Consolidation VI: Task Orchestrator (Task Integrity, Delegation, & Consensus)
CLOSED: [2026-04-11 Sat 13:45]
:PROPERTIES:
:ID: task-orchestrator-consolidation
:END:
- Implemented Parallel Multi-Backend Consensus in neuro.lisp.
- Implemented Task Integrity (GTD semantics) in symbolic.lisp.
- Integrated Consensus Gate and Delegation hooks in core.lisp.
- Verified with new task-orchestrator-tests.lisp.
** DONE Implement Unified Envelope Architecture & Channel-Awareness
CLOSED: [2026-04-20 Mon 13:20]
- Removed specialized :CHAT type; reverted to semantic :REQUEST/:EVENT protocol.
- Decoupled routing metadata into a :META envelope (SOURCE, SESSION-ID).
- Updated TUI, Emacs, and CLI gateways to use the unified protocol.
- Verified end-to-end loop with TUI; kernel correctly routes responses back to origin interface.
- Achieved "Equality of Clients" mandate.
** DONE Full review of opencortex's harness
CLOSED: [2026-05-01 Fri 12:46]
:PROPERTIES:
:CREATED: [2026-04-13 Mon 13:30]
:ASSIGNED: Agent
:END:
- [X] Audit terminology: Replaced OACP with "communication protocol" workspace-wide.
- [X] Audit boot sequence: Synchronized loader with `org-skill-policy.org`.
- [X] Implement Unified Envelope (Channel-Aware Routing).
- [X] Audit core Perceive-Think-Act loop.
- [X] Verified protocol framing and reader jailing (`*read-eval* nil`).
- [X] Refactored `loop.org` for literate granularity and configuration externalization.
- [X] Improved error handling (restricted rollback) and added graceful shutdown.
- [X] **FIXED:** Implemented symbolic guard check in `act-gate` via Dispatcher skill refactoring.
- [X] **FIXED:** Harness `deterministic-verify` now correctly respects skill triggers.
- [X] **FIXED:** Resolved TUI crash by removing `--non-interactive` from `opencortex.sh` and adding defensive coordinate handling.
- [X] **VERIFIED:** Confirmed bidirectional TUI communication and signed off v0.2.0.
- [X] Ensure alignment with System Policy and Engineering Standards.
- [X] Restored structural integrity by fixing `manifest.org` loading sequence.
** TODO Wake up the Scribe (Implement autonomous weekly Journal-to-Ledger distillation in org-skill-scribe.org)
** TODO Implement `org-skill-lisp-repair` (Self-correcting syntax gate for Deterministic Engine)
CLOSED: [2026-04-11 Sat 15:10]
:PROPERTIES:
:ID: lisp-repair-gate
:END:
- Implemented asynchronous, event-driven repair logic.
- Decoupled core from repair logic (emits `:syntax-error` event).
- Proven via lisp-repair-tests.lisp (Asynchronous flow verified).
** DONE Implement `org-skill-formal-verification` (Prove safety of high-impact actions)
CLOSED: [2026-04-11 Sat 18:15]
:PROPERTIES:
:ID: formal-verification-task
:END:
- Implemented `org-skill-formal-verification.org`.
- Created Lisp-Native Symbolic Prover for security invariants.
- Implemented `path-confinement` invariant (restricted to memex root).
- Implemented `no-network-exfil` invariant (blocking nc, ssh, etc).
- Verified with `formal-verification-tests.lisp`.
* PHASE: DETERMINISTIC ENGINE REFINEMENT
** DONE Verify Autonomous Self-Fix Loop
CLOSED: [2026-04-11 Sat 14:20]
:PROPERTIES:
:CREATED: [2026-03-23 Mon 16:30]
:END:
- Proven repair capability via self-fix-tests.lisp.
- Verified surgical code patching and hot-reloading.
- Documentation and RCA complete.
** DONE Implement "Planning Mode" (Deterministic Engine Dispatcher) for Complex Actions
CLOSED: [2026-04-11 Sat 15:30]
:PROPERTIES:
:CREATED: [2026-04-01 Wed 17:00]
:END:
- Implemented `dispatcher-check` interceptor in `symbolic.lisp`.
- Created `org-skill-dispatcher.org` for flight plan serialization.
- Verified asynchronous Org-native approval loop via `dispatcher-tests.lisp`.
** DONE Implement Authorization Gate (communication protocol) for "Planning Mode"
CLOSED: [2026-04-11 Sat 15:30]
:PROPERTIES:
:CREATED: [2026-04-01 Wed 17:00]
:END:
- Integrated with Org-mode state transitions (`PLAN` -> `APPROVED`).
- Leveraged Memory event bus for asynchronous re-injection.
** DONE Refactor Architecture Terminology (Associative -> Probabilistic, Deliberate -> Deterministic)
CLOSED: [2026-04-12 Sun 21:00]
:PROPERTIES:
:ID: terminology-refactor-task
:END:
- Updated codebase-wide terminology to use Probabilistic/Deterministic Engines.
- Replaced System 1/2 with Probabilistic/Deterministic Engines respectively.
** DONE Refactor org-skill-policy.org: Concrete Invariants, Conflict Hierarchy, and Auditable Gate
CLOSED: [2026-04-22 Wed 11:50]
:PROPERTIES:
:ID: policy-refactor-concrete-invariants
:END:
- Added explicit Override Hierarchy (Transparency > Autonomy > Bloat > Mentorship > Sustainability).
- Implemented `policy-check-transparency`: blocks user-facing actions without :explanation.
- Implemented `policy-check-autonomy`: flags proprietary domain references as autonomy debt.
- Implemented `policy-check-bloat`: warns on :create-skill actions exceeding size threshold.
- Implemented `policy-check-mentorship`: blocks high-impact actions missing :mentorship-note.
- Implemented `policy-check-sustainability`: logs cloud-provider usage as sustainability debt.
- Implemented `policy-explain`: formats auditable rationale for every policy decision.
- Implemented `policy-find-engineering-standards-gate`: robust cross-package search for standards skill.
- Hardened `policy-deterministic-gate`: never returns NIL silently; always returns action or auditable log.
- Raised skill priority from 100 to 500 to ensure it runs before other deterministic gates.
** DONE Add Invariant 6 (Modularity) and Harness Boundary Contract to Policy/Manifest
CLOSED: [2026-04-22 Wed 12:10]
:PROPERTIES:
:ID: policy-modularity-invariant
:END:
- Added Modularity as Invariant 6 in `org-skill-policy.org`: general life principle that complexity must live at the edges.
- Implemented `policy-check-modularity`: blocks modifications to protected core paths unless `:modularity-justification` is provided.
- Defined `*modularity-protected-paths*` as project-configurable variable (defaults: harness/, opencortex.asd).
- Updated Override Hierarchy to include Modularity between Bloat and Mentorship.
- Added Harness Boundary Contract section to `harness/manifest.org` documenting primary boundary files and generated artifacts.
- Converted checkbox sub-tasks to hierarchical TODO headlines per GTD standard.
** DONE Implement `org-skill-lisp-validator` (3-phase deterministic validation gate)
CLOSED: [2026-04-22 Wed 12:30]
:PROPERTIES:
:ID: lisp-validator-implementation
:END:
- Created 3-phase validation pipeline: Structural (O(n) paren scanner), Syntactic (reader with *read-eval* nil), Semantic (whitelist AST walk).
- Implemented `lisp-validator-validate` returning structured plists for machine parsing.
- Exposed `:validate-lisp` cognitive tool for Probabilistic Engine self-correction.
- Replaced `validate-lisp-syntax` in `harness/skills.org` with delegation to the validator.
- Added mandatory validation rule to Probabilistic Engine system prompt in `harness/reason.org`.
- Fixed paren balance and `return-from` compilation errors in org source; tangled and validated in SBCL.
** DONE Fix Skill Loader to Respect `:tangle` Blocks and Eliminate Circular Dependency
CLOSED: [2026-04-22 Wed 12:45]
:PROPERTIES:
:ID: skill-loader-tangle-fix
:END:
- Updated `load-skill-from-org` in `harness/skills.org` to only collect blocks with `:tangle` directives pointing to runtime `.lisp` files, excluding `tests/` and `test/` paths.
- Added fallback to `validate-lisp-syntax` so it uses a basic reader check when `lisp-validator-validate` is not yet loaded (breaks circular harness->skill dependency).
- Verified full boot: 13/13 skills loaded successfully into SBCL, including `skill-lisp-validator` at priority 900 and `skill-policy` at priority 500.
* TRACK: SECURITY & CONTAINMENT (The 5-Vector Dispatcher Matrix)
** DONE Implement Path-Based Scoping for File Writes (DNA/State vs Work)
CLOSED: [2026-04-12 Sun 15:15]
:PROPERTIES:
:ID: path-based-scoping
:END:
- Implemented as `path-confinement` invariant in `org-skill-formal-verification.org`.
** DONE Implement Network Exfiltration Gate (Intercept generic HTTP requests)
CLOSED: [2026-04-12 Sun 15:15]
:PROPERTIES:
:ID: network-exfiltration-gate
:END:
- Implemented as `no-network-exfil` invariant in `org-skill-formal-verification.org`.
** TODO Implement Secret Exposure Gate (Intercept reads to .env, keys)
* TRACK: INTELLIGENCE & ACTUATION (The Engines)
** DONE Verify individual provider track (Anthropic, Gemini, Groq, OpenAI, OpenRouter, Ollama)
CLOSED: [2026-04-11 Sat 15:45]
:PROPERTIES:
:ID: provider-verification-track
:END:
- Added unit tests for each provider in `llm-gateway-tests.lisp`.
- Mocked `dex:post` to verify JSON payload formatting and response parsing.
- Implemented robust `get-nested` helper to handle various provider structures.
- Integrated `llm-gateway` and `credentials-vault` into `opencortex.asd`.
** TODO Verify org-skill-shell-actuator formal safety harnesses
** DONE Build Playwright-Python Bridge for high-fidelity browsing
CLOSED: [2026-04-11 Sat 18:30]
:PROPERTIES:
:ID: playwright-bridge-task
:END:
- Created `scripts/browser-bridge.py` (Playwright wrapper).
- Implemented `org-skill-playwright.org`.
- Registered `:browser` cognitive tool (JS-rendering, text extraction, screenshots).
- Updated `Dockerfile` with Python/Playwright dependencies.
- Verified with `playwright-tests.lisp`.
* TRACK: COMMUNICATION & INTERFACES
** DONE Implement org-skill-gateway-telegram
CLOSED: [2026-04-11 Sat 16:15]
:PROPERTIES:
:ID: gateway-telegram-task
:END:
- Implemented `org-skill-gateway-telegram.org`.
- Added automated background polling for Telegram GetUpdates.
- Implemented `:telegram` actuator for outbound responses.
- Refactored `org-skill-chat` to be channel-aware.
- Verified with `gateway-telegram-tests.lisp`.
** DONE Implement org-skill-gateway-signal
CLOSED: [2026-04-11 Sat 16:50]
:PROPERTIES:
:ID: gateway-signal-task
:END:
- Implemented `org-skill-gateway-signal.org` (signal-cli wrapper).
- Added background polling for `signal-cli receive --json`.
- Implemented `:signal` actuator for outbound responses.
- Updated `org-skill-chat` to support Signal channel.
- Verified with `gateway-signal-tests.lisp`.
** DONE Implement org-skill-gateway-matrix
CLOSED: [2026-04-11 Sat 17:15]
:PROPERTIES:
:ID: gateway-matrix-task
:END:
- Implemented `org-skill-gateway-matrix.org` (Client-Server API).
- Added background polling for `/sync` with token persistence.
- Implemented `:matrix` actuator for `m.room.message` delivery.
- Updated `org-skill-chat` to support Matrix channel and room IDs.
- Verified with `gateway-matrix-tests.lisp`.
* TRACK: DEPLOYMENT & INFRASTRUCTURE
** DONE Create Dockerfile and docker-compose.yml for containerized setup
CLOSED: [2026-04-11 Sat 17:30]
:PROPERTIES:
:ID: docker-infra-task
:END:
- Created `Dockerfile` (Debian-based, SBCL + Quicklisp + signal-cli).
- Created `docker-compose.yml` with host-volume mapping for memex.
- Created `docs/deployment.org` guide.
** TODO Create Bare Metal installation scripts/playbooks
** TODO Create LXC (Linux Containers) template/guide
** TODO Create VM Vagrantfiles/Cloud-init configs
* TRACK: MAINTENANCE & HYGIENE
** TODO [RECURRING: Monthly] Review and test Infrastructure Dependency Upgrades
:PROPERTIES:
:ID: monthly-infra-audit
:REPEAT_TO_STATE: TODO
:END:
*** TODO Check for new Debian security patches (`apt-get update` check)
*** TODO Check for new `signal-cli` releases (compare vs v0.14.0)
*** TODO Check for new Quicklisp distribution (monthly snapshot)
*** TODO Verification: Update `Dockerfile`, run `docker-compose build --no-cache`, and execute full test suite
*** TODO If all tests pass, commit updated `Dockerfile` and `.asd` dependencies
* TRACK: COMMUNITY & DOCS
** TODO Write Quickstart Guide
** TODO Write Skill Creation Guide
** TODO Write Architecture Deep-Dive
** TODO Clean up GitHub repository structure and add CI/CD
** TODO Create Marketing Material (Landing page copy, diagrams)
** TODO Draft Release Plan checklist
* SUB-PROJECT: THE BOOT SEQUENCE (skills.lisp)
:PROPERTIES:
:ID: proj-skill-boot-sequence
:END:
** DONE Refactor `skills.lisp` into a Micro-Loader (Harness)
CLOSED: [2026-04-12 Sun 19:10]
** DONE Implement Topological Sort based on `#+DEPENDS_ON:` tags
CLOSED: [2026-04-12 Sun 15:15]
:PROPERTIES:
:ID: topological-sort-skills
:END:
- Implemented in `literate/skills.org`.
** DONE Enforce `org-skill-system-invariants` as the mandatory Gateway Skill (Loaded first)
CLOSED: [2026-04-12 Sun 15:15>
:PROPERTIES:
:ID: enforce-mandatory-skill
:END:
- Enforced in `initialize-all-skills` in `literate/skills.org`.
** DONE Formalize the "Minimal Boot Set" (Router, Vision, Steward, Actuator)
CLOSED: [2026-04-12 Sun 19:10>

339
docs/DESIGN_DECISIONS.org Normal file
View File

@@ -0,0 +1,339 @@
# OpenCortex Design Decisions
This document captures the rationale behind key architectural choices. It is not a specification - it is a thinking medium for future architects and contributors who need to understand why the system is built this way, not just how.
* Multi-Agent by Default is a Smell
:PROPERTIES:
:ID: design-multi-agent-default
:END:
The AI industry has developed an intuition toward multi-agent systems as the default solution to hard problems. Multiple agents spawn, delegate, coordinate, debate, and consensus their way toward solutions. This pattern is compelling in demos and genuinely useful in specific contexts - but it has become a default assumption that warrants scrutiny.
When context windows grew expensive and task complexity increased, the response was natural: split the problem across agents, each handling a slice. But this architectural choice carries hidden costs that are rarely acknowledged in the enthusiasm of implementation.
**The synchronization tax** is the most immediate burden. Each agent operates with partial information, and maintaining coherence requires continuous state reconciliation. Tokens and processing cycles are spent not on the task itself, but on protocol overhead - who holds what, who decided what, who is correct when they disagree.
**Fragmented context** is the deeper problem. When Agent A writes a function and Agent B modifies a type it depends on, neither has the full picture. Integration failures emerge not from individual incompetence but from systemic communication gaps. Single-agent systems avoid this entirely: one brain holds the complete model, every decision is made with full visibility.
**Audit trails become complex** in multi-agent systems. A decision traced through a single-agent system has a clean, linear history. A decision traced through a multi-agent system branches and forks, with each agent's reasoning partially overlapping and partially conflicting.
None of this is to say multi-agent systems are never appropriate. Embarrassingly parallel workloads - scanning ten thousand files, processing batch jobs - benefit from parallelism regardless of context. When distinct expertises are required and cannot coexist in one model, delegation makes sense. In adversarial scenarios where conflicting goals are features, multi-agent architectures shine.
But the default assumption that complex reasoning tasks are best solved by multiple agents is unproven and likely wrong for the engineering domain. Claude Code is a single-agent system. It handles 50-file refactors, debugs complex stack traces, writes tests, and navigates large codebases. The assumption that you need five agents to do what one well-designed agent can do is an industry habit, not a technical necessity.
OpenCortex is single-agent by default not from limitation but from conviction: for reasoning-heavy work where coherence matters, a unified memory space and single decision-making locus are architectural assets, not constraints.
* The Unified Memory Argument
:PROPERTIES:
:ID: design-unified-memory
:END:
If single-agent architecture is the decision, unified memory becomes the mechanism that makes it viable. The critical question is not "how many agents" but "how does the agent manage context without saturating."
Context window limits are largely a symptom of lazy architecture. The default approach - stuff everything in, hope the model figures it out - works poorly at scale. A more principled approach inverts the problem: the system should hold effectively infinite context, with the active window kept lean through intelligent management.
**Lazy loading** is the core technique. When an agent needs information about a function, it does not load the entire codebase. It loads precisely what the function does. Context stays lean - 2,000 to 4,000 tokens - while the full context remains accessible through retrieval.
**Compaction events** are scheduled during idle cycles. The system extracts new facts from active context and writes them to permanent storage. Active context is wiped clean, not because space ran out, but because the information has been preserved in a form that can be retrieved when relevant.
**Org-mode as externalized memory** solves the persistence problem elegantly. Every decision, every note, every task lives in plain text files the user already owns. The agent does not maintain a separate database. It queries files it can already access, modifies files it already owns.
**Retrieval is the key primitive.** Semantic search across Org files finds relevant nodes. The agent does not hold the full context - it holds pointers to context, loaded on demand. This is how a single agent handles tasks that would saturate a naive multi-megabyte context window.
The unified memory argument is not that infinite context is free. It is that with proper architecture, effective infinite context is achievable without the synchronization and fragmentation costs of multi-agent systems.
* The Probabilistic-Deterministic Split
:PROPERTIES:
:ID: design-probabilistic-deterministic
:END:
The architecture divides cognition into two fundamentally different reasoning systems. This is not arbitrary engineering but a structural response to a fundamental truth: probabilistic systems will hallucinate, and you cannot build reliable autonomy on an unreliable foundation.
An LLM is a statistical engine. It generates outputs based on patterns in training data. It is remarkable at translation, generation, pattern matching, and fuzzy reasoning. It can take messy human intent and produce structured queries. It can take structured results and produce natural language. It is, in the terminology of the system, the creative brain.
But it cannot be trusted. Not because it is poorly designed or insufficiently trained, but because hallucination is a fundamental property of probabilistic inference. The model generates the most likely continuation, not the correct one. Given sufficient context, the most likely continuation is correct. Given novel context, it is often wrong in confident-sounding ways.
The deterministic engine addresses this by being what the probabilistic engine is not: mathematically rigorous, formally verifiable, and incapable of hallucination by design. It operates on explicit symbolic representations - lists, property lists, knowledge graphs - not on floating-point activations. When it evaluates a path confinement check, it returns true or false, not a probability distribution.
The division of labor is architectural. The LLM handles the fuzzy interface between human language and structured representation. It translates what the user wants into what the system can reason about. The deterministic engine receives those structured representations and evaluates them against formal invariants. It decides whether to execute, not whether the translation was semantically plausible.
This separation is the source of OpenCortex's safety guarantee. Other agents add "guardrails" as an afterthought - a layer of filtering around a dangerous core. OpenCortex makes the division explicit: the LLM never touches the file system, never executes a command, never modifies memory. It generates proposals. The deterministic engine evaluates and executes. The dangerous operations are never in the probabilistic path.
The split also explains why the system gets safer over time without the LLM improving. The deterministic engine accumulates rules. The LLM proposes actions, the engine evaluates them against a growing rule set. Early versions block obvious dangers. Later versions block sophisticated attacks that were previously unknown. The safety grows logarithmically with the number of interactions, not linearly with model capability.
* Homoiconicity as Foundation
:PROPERTIES:
:ID: design-homoiconicity
:END:
Common Lisp is homoiconic: code and data share the same representation. A Lisp program is a list, and a list is a Lisp program. This is usually presented as a curiosity, an interesting property that enables macros. In OpenCortex, it is the foundational enabling property of the entire self-modification architecture.
When code is data, the agent can read its own source the same way it reads a text file or an Org buffer. There is no AST parser required, no external tool to extract the function object from the running image. The agent evaluates (read-from-string source) and the result is executable Lisp. The representation it manipulates is the same representation that the runtime executes.
This is not true of most languages. In Python, the agent can inspect an AST through the ast module, but that AST is a foreign object - a data structure that represents code but is not code itself. The agent can see that a function takes certain arguments and returns a certain type, but it cannot treat the AST as a live object it can modify and re-evaluate. In C, the agent cannot inspect its own compiled machine code at all.
In Lisp, the distinction between code and data is a convention, not a barrier. The agent's skills are lists. The agent can take a skill, extract a function definition, modify the body, wrap it in a new list, and evaluate it. The modification is surgical: it changes exactly what it intends to change, with no risk of corrupting adjacent state, because the representation is a tree that the runtime understands natively.
Runtime introspection is therefore native. The agent does not need a debugger API or a reflection protocol. It operates on its own code as data because its own code is data. (describe 'function-name) returns the function's documentation. (function-lambda-list 'function-name) returns its parameters. (macroexpand-1 '(defskill ...)) shows what the macro produces. There is no impedance mismatch between the agent's reasoning and the system's representation.
Self-modification is the practical consequence. The agent can detect an error, locate the erroneous function, generate a corrected version, and hot-reload it into the running image. The correction is not applied to a file that requires a restart - it is applied to the live object that the system is currently executing. This is what makes the self-editing skill viable: the agent can fix itself without stopping.
In v3.0.0, when the symbolic engine takes over the reasoning core, homoiconicity becomes the bridge between the neural and symbolic layers. The neural engine generates proposals as s-expressions. The symbolic engine evaluates them against formal constraints. The result is a modification that is simultaneously a data structure the symbolic engine can analyze and code the runtime can execute. The two representations are identical by construction.
This is the technical meaning of "Lisp as Governor": not just that Lisp orchestrates the other components, but that the representation of the system is uniform and inspectable at every level. There is no hidden state, no opaque machine code, no representation that the agent cannot reach into and modify. The system is legible to itself by design.
**Self-Modification Without Boundaries**
Other systems that support self-editing draw a line between the core and the skills. Hermes can modify its skills at runtime, but the core harness is protected - editing it requires a restart because the core is treated as privileged code that cannot be safely modified while running.
OpenCortex has no such boundary. The "thin harness, fat skills" distinction describes where complexity lives, not where authority flows. The harness is small by design, but it is not privileged. The agent can read and write any part of the system - including the very code that is currently executing - without restarting.
This is only possible because Lisp code is mutable data at runtime. In a compiled language, the machine code for a running function is locked in memory, protected by the call stack, impossible to modify safely. In Lisp, the function object is a list you can modify with =setf=. When the agent changes a harness function, the running image immediately reflects the change. The next invocation uses the new code. There is no restart, no special boot mode, no distinction between development and production.
The implications extend beyond convenience. A system that cannot modify its own core is a system that has limits on its own adaptability. It can learn skills but not improve its own structure. It can grow but not evolve. OpenCortex's lack of a core boundary means the system can improve its own reasoning engine, fix bugs in its own cognition, and evolve its own architecture - all while continuing to operate.
This is the final expression of homoiconicity: not just that code is readable as data, or that skills are modifiable, but that the entire system - including the parts that other systems protect - is open to modification. There is no ceiling on self-improvement. The agent can rewrite the very code that rewrites itself.
**Lisp and the AI Dream**
Lisp was invented in 1958 by John McCarthy with artificial intelligence explicitly in mind. Its design - code as data, runtime mutation, symbols and lists as first-class constructs - was shaped by the belief that a truly intelligent machine would need to reason about and modify its own reasoning. For decades, Lisp machines were the closest thing to thinking machines that existed.
Then the AI winter came. Symbolic AI fell out of favor. Statistical learning and neural networks dominated. Lisp was relegated to niche applications and academic curiosity. The machine that was designed for AI was never used for the task it was designed for.
Six decades later, neural networks have arrived at the problem from a different direction. They can learn and generalize, but they hallucinate, cannot explain their reasoning, and cannot safely modify themselves. The neuro-symbolic synthesis - combining neural pattern recognition with symbolic reasoning - is recognized as the path toward AI that is both powerful and trustworthy.
Lisp's time may finally have come. Not as a replacement for neural networks, but as the governor that makes them safe - the symbolic engine that verifies what the neural engine proposes, the homoiconic substrate that allows the system to inspect, modify, and improve its own reasoning. The machine that was designed for AI in 1958 may be the exact machine needed for AI in 2026 and beyond.
* Org-Mode as Unified AST
:PROPERTIES:
:ID: design-org-unified-ast
:END:
OpenCortex makes a bet that most systems consider too expensive to place: that humans and machines should share the same file format. That bet is Org-mode.
Most systems separate human-readable notes from machine-readable data. The user writes Markdown. The system stores it, indexes it, searches it. But internally, the system maintains its own model - a database, an object store, a knowledge graph - that is disconnected from the Markdown. When the user dies or leaves, the Markdown survives but the model must be reconstructed.
OpenCortex refuses this separation. The Org file is not a representation of the data. The Org file IS the data. The same text that the user reads and edits is what the system parses and operates on. org-element reads an Org buffer and returns a tree structure that is the direct Lisp representation of the file's content.
This has several profound implications.
First, there is no translation layer between human and machine. When the agent writes a skill, it writes Org text that is immediately readable by the human who owns the file. When the human writes a note, it is immediately accessible to the agent as a native data structure. The communication is not mediated by a schema or an import/export process.
Second, the format is genuinely readable by both parties, not just technically accessible. Org-mode's syntax is human-friendly: headlines begin with asterisks, properties live in drawers, tags are labels after colons. The human does not have to understand the full Org specification to read what the agent wrote. The agent does not have to handle edge cases in human notation.
Third, the format is stable across decades. Org-mode has been in active development since 2003. The files written today will be readable by Org-mode in 2040. There is no schema migration, no database upgrade, no vendor lock-in. The human's notes survive the system.
Fourth, the format is universally available. Org-mode is free software. The files are plain text. There is no proprietary format to decode, no application to purchase, no cloud service to access.
Fifth, the format is header-aware and sparse-tree capable. Org-mode's headline hierarchy is not just formatting - it is a semantic structure the system can query. The agent can retrieve only the relevant subtree under a heading, ignoring the rest of the file. This is fundamentally different from Markdown, where the entire file must be loaded or the retrieval logic must parse and filter at the string level.
Sparse tree retrieval is the key to efficient context management. When the agent needs information about the =openctl-db= function, it queries for the =openctl-db= subtree specifically. It receives exactly the code, documentation, and metadata under that heading - nothing more. The context stays lean not because the file was pre-split but because the retrieval is structural. In a Markdown system, the agent either loads the entire file (expensive, noisy) or relies on imprecise grep-like search (fragile, loses hierarchy). In Org-mode, retrieval is precise, hierarchical, and cheap. The heading boundary is the access boundary.
Sixth, Org-mode unifies what every other format fragments. A single Org file contains the headline hierarchy, prose documentation, source code blocks with live evaluation, tags for categorization, metadata in property drawers, TODO state for task management, timestamps and deadlines, and links to other nodes. Markdown cannot express TODO state without external tools. JSON cannot contain prose. YAML cannot embed runnable code. Each format serves one purpose; Org-mode serves all of them. When the agent reads a skill file, it reads documentation, code, dependencies, metadata, and task state in one parseable structure. When the human reads the same file, they see the same information rendered in a human-friendly form. No other format achieves this unification without maintaining parallel files or external databases.
Seventh, a skill lives in one Org file, not a directory. The standard pattern for a software project is a directory containing =README.md=, =package.json=, =src/main.py=, =src/utils.py=, =tests/test_main.py=, =scripts/deploy.sh=, and =config.yaml=. Each file type is isolated by convention: prose lives in README, code lives in src, tests in tests, configuration in config. This fragmentation means the skill is not a single object the system can reason about - it is a collection of files the system must assemble. OpenCortex's skills violate this convention deliberately. Each skill is one Org file. The file contains the skill's documentation, the skill's code, the skill's metadata, the skill's TODO state, and the skill's dependencies on other skills. There is no directory to navigate, no external files to locate, no risk that the README describes behavior that the code does not implement. The skill is a single atomic unit: readable by human and machine, editable by both, versionable as one entity.
The unified format is what makes the memory architecture work. The agent's memory is not a database that the user cannot inspect. It is a folder of Org files that the user can read, edit, and understand. The agent manipulates these files directly, using the same tools the user would use. There is no hidden state, no shadow database, no model that differs from the source.
This is what "sovereignty" means in technical terms: the user owns the data in a format they can access, and the agent operates on the data in the same format they own.
* Literate Programming as Discipline
:PROPERTIES:
:ID: design-literate-programming
:END:
The decision to use Org-mode as the source of truth for code, not just documentation, is not a ceremonial preference. It is a constraint mechanism that enforces better engineering habits at the cost of convenience.
The traditional development workflow is: write code, write comments, commit. The literate programming workflow is: write prose, write code, commit the Org. The order matters. The prose must come first not because of style guidelines but because the act of explaining what a function does before writing it forces clarity of thought that editing code directly does not.
When you must write a paragraph describing what a function does before you write the function, you discover the cases you have not considered. You find the edge conditions that are ambiguous. You realize that the function's name does not match its behavior, or that its behavior does not match your intent. The friction is not a bug - it is the mechanism by which thinking is enforced.
The one-function-per-block rule enforces granularity. A function that cannot be explained in a paragraph is a function that is doing too much. The block boundary is not aesthetic - it is architectural. It prevents the drift toward monolithic functions that accumulate responsibilities over time and become untestable, unmaintainable, and incomprehensible.
The tangle step enforces source-of-truth discipline. The .lisp file is generated from the Org file. This means the Org file cannot drift from the implementation. If the implementation changes, the Org must be updated to match. If the Org describes behavior that the implementation does not perform, the tangle produces code that does not match the Org description. Either way, inconsistency is visible and recoverable.
The evaluation gate enforces correctness. Every block can be evaluated independently in a running Lisp image. This means syntax errors are caught at authorship time, not at integration time. The function that compiles in isolation but fails in context is the function whose context dependencies were never made explicit. The evaluation gate forces those dependencies to surface.
Together, these constraints create a development experience that is slower in the small and faster in the large. Writing a new function takes longer because you must explain it. But debugging, maintaining, and extending the codebase is faster because every function has a human-readable explanation of its intent, every function is testable in isolation, and every function's source is always synchronized with its documentation.
The literate programming discipline is not about producing documentation. It is about producing code whose correctness has been verified by the act of explaining it.
* The Bouncer as Learning System
:PROPERTIES:
:ID: design-bouncer-learning
:END:
The Bouncer begins as a static guard - a set of rules that block obviously dangerous actions. But defining "obviously" is the hard problem. The agent encounters situations the rules do not anticipate. The Bouncer must grow.
The human-in-the-loop exception is the seed. When the LLM proposes an action the Bouncer does not recognize, the system does not default to blocking or allowing. It suspends. It writes the proposed action to an Org buffer in a format the human can read and understand. The human reviews and approves or denies. The Bouncer observes the decision.
From this single observation, the Bouncer extracts a rule. Not merely "allow this specific action" but "allow this class of actions parameterized by these dimensions." The human approved a write to ~/projects/myapp/src/core.clj. The Bouncer generalizes: writes to ~/projects/*/src/*.lisp are approved for this session, or for this project, or indefinitely depending on the context and the user's pattern of decisions.
Shadow mode is where rules are tested before deployment. When the Bouncer encounters a novel situation and is uncertain, it can run the proposed action in a simulated environment. It observes the side effects - what files would be modified, what processes would be spawned, what network calls would be made. If the simulation produces dangerous side effects, the rule is discarded. If it appears safe, the rule is added to the active set with a confidence rating.
Formal verification is where the learned rules are checked against invariants. The Bouncer's rules are not merely patterns observed from human behavior. They are formulas in a logic that the system can reason about. A rule that would enable path traversal is not discarded because it was observed to be safe in prior instances - it is discarded because it violates the path-confinement invariant by construction.
The Bouncer becomes, over time, not a guard that blocks bad actions but a reasoning system that understands why actions are good or bad. Early versions learn from human decisions. Later versions learn from their own logical analysis. The human's role transitions from approver to auditor to, eventually, unnecessary oversight.
This is the bootstrap. The system begins dependent on human judgment because it has no basis for judgment of its own. Through accumulated decisions, it constructs a model of what is permitted and why. That model is the foundation for the deterministic symbolic engine that in v3.0.0 takes over the reasoning that the Bouncer learned to perform.
* OpenCortex as a Function in Time
:PROPERTIES:
:ID: design-trajectory
:END:
The system is not static. OpenCortex is defined not just by its current state but by its trajectory - how its cognitive architecture evolves over versions, with each phase reducing probabilistic surface area while increasing deterministic control.
**v0.1.0: The Probabilistic Foundation**
The agent begins by relying heavily on the neural engine. The LLM translates messy human intent into structured queries, generates code, proposes solutions. The Bouncer is present but thin - it blocks obviously dangerous actions, verifies path confinement, enforces basic invariants. Most reasoning is probabilistic because the symbolic infrastructure does not yet exist to do otherwise.
At this stage, OpenCortex is similar to other LLM-based agents. The key difference is the gate is already there - the architecture assumes the LLM will hallucinate and structures safety accordingly.
**v0.2.0 through v0.5.0: The Bouncer Learns**
Each version expands the deterministic layer. The Bouncer writes rules from approved exceptions. Shadow mode runs trial executions. Tool permission tiers mature from simple allow/deny to nuanced context-aware policies. The agent becomes less likely to attempt dangerous actions not because it is smarter but because the guard has more complete information.
This is the bootstrapping phase. The system learns by watching itself and its user. Every blocked action becomes a rule. Every approved exception becomes a pattern. The symbolic layer grows at the probabilistic layer's expense.
**v0.6.0 through v0.7.0: The Architecture Crystallizes**
Skills become more deterministic. The agent learns to write its own skills - first drafts generated by the LLM, but verified and refined by the symbolic engine. Self-editing improves. The REPL becomes a first-class cognitive substrate - code is not just written but verified, iterated, tested before committing.
The balance shifts. The neural engine still translates and generates, but the symbolic engine checks, constrains, and corrects. The system is becoming what Gemini called "the strict guard" - a mathematically rigorous layer intercepting probabilistic output.
**v1.0.0: SOTA Parity - The Probabilistic Ceiling**
Achieving feature parity with commercial agents requires the full v0.x series complete. At this point, OpenCortex is a reliable autonomous agent - it can handle multi-step engineering tasks, maintain context across sessions, recover from errors, pass benchmarks. It is safer than alternatives because the Bouncer is mature and the memory architecture is sound.
But it is still fundamentally probabilistic at its core. The symbolic engine verifies and constrains, but the generative engine is still the primary reasoning source.
**v2.0.0: The Agent Becomes the Interface**
This version is not about the symbolic engine - it is about tools. The agent stops running inside Emacs and starts replacing it. Lish (Lisp shell) emerges: a shell that speaks plists, not POSIX. Org-mode buffers become the file system. Org-babel becomes the REPL. The agent is no longer a passenger in Emacs - it is the operating system.
The key insight is that the agent's interface and the agent's brain become the same thing. In earlier versions, there is a clear separation: the agent produces output, the TUI displays it. In v2.0.0, the distinction blurs. The agent's thoughts are displayed in Org buffers that are also the interface that the agent manipulates.
This is the Emacs cannibalization phase. Not hostile replacement but evolution - Emacs was always a Lisp machine, and v2.0.0 completes the metamorphosis.
**v3.0.0: The Symbolic Breakthrough**
This is the architectural leap. The system transitions from "probabilistic engine with symbolic verification" to "symbolic engine with probabilistic input and output."
The 10-80-10 architecture becomes fully realized: ten percent neural for input translation, eighty percent symbolic for reasoning against a knowledge graph, ten percent neural for output formatting. The symbolic engine maintains facts, relationships, rules, and formal proofs. When the neural engine generates something, the symbolic engine verifies it - not by checking against a blocklist, but by running the proposal through a Prolog/Datalog reasoner that understands the domain constraints.
The deterministic planner takes the wheel. The LLM is no longer consulted for planning decisions - it translates human language to structured queries and structured results back to human language. The planning itself is pure Lisp: task graphs generated by a symbolic reasoner that has access to the full knowledge graph.
Self-correcting gates replace the learned Bouncer rules. The system learns not just from approved exceptions but from the full history of outcomes - did the plan succeed? Where did it fail? The symbolic engine updates its own rules based on the results.
The implications are significant. Hallucination becomes structurally impossible because the symbolic engine will not accept a fact that contradicts its knowledge graph. Safety becomes provable because the formal verification layer can prove properties about the system's behavior. Self-improvement becomes stable because the agent modifies skills that are then verified before execution.
**v4.0.0 and Beyond: Hardware as the Final Constraint**
The Lisp machine becomes physical. RISC-V with tagged architecture, hardware-enforced type checking, FPGA prototype for the symbolic core. The agent runs not in emulation but on silicon purpose-built for the architecture.
This is the long horizon. The symbolic engine runs on logic ASICs optimized for symbolic computation. The neural engine runs on GPU or purpose-built matrix math hardware. Lisp orchestrates both, enforcing at the hardware level what it enforced at the software level in earlier versions.
**The Trajectory as Design Principle**
Understanding OpenCortex as a function in time is not nostalgia. It is architectural guidance. Every decision in v0.x should be made with awareness of where the system is going. Code written today becomes the substrate for v3.0. Skills designed today become the vocabulary the symbolic engine speaks tomorrow.
The probabilistic beginning is not a weakness to overcome. It is the bootstrap. The system learns the domain through probabilistic inference, and that learned knowledge becomes the seed for the symbolic engine. By the time the symbolic engine takes over, it has a rich knowledge graph to reason about, grown from thousands of probabilistic interactions.
This is how you build a reasoning machine: start with a learner, make it learn to verify, let verification become the core, remove the learner once it has learned enough.
* The REPL as Cognitive Substrate
:PROPERTIES:
:ID: design-repl-cognition
:END:
A REPL - Read, Eval, Print, Loop - is an interactive programming environment that reads an expression, evaluates it, prints the result, and loops back to read the next expression. It is the opposite of batch processing: where batch compiles and runs a program in one shot, a REPL works one expression at a time, with each evaluation building on all previous ones. The programmer defines a function, calls it, inspects the result, modifies it, and calls it again. The state accumulates. The session is the program.
In Lisp, the REPL is not a debugging tool bolted onto the language - it is the natural mode of interaction. The running image is the environment. When you evaluate =(+ 2 2)=, the result =4= is printed, and you remain in the same image where =+= is defined, where previous definitions persist, where the next expression can reference anything that came before. There is no separation between development and execution. The REPL is not a simulation of the program - it is the program running.
OpenCortex uses the REPL in this spirit, but elevated: it is not merely a tool for writing code, it is the mechanism by which the agent interacts with its own cognition - a loop that mirrors the perceive-reason-act metabolic cycle at the implementation level.
In the agent's cognitive architecture, the REPL serves three functions that are difficult or impossible to achieve through batch processing or stateless API calls.
First, the REPL enables verification before commitment. When the agent generates code, it does not write and forget - it evaluates in a running image, observes the result, iterates if incorrect. The feedback loop is tight: the time between writing and seeing the error is measured in milliseconds, not in the round-trip to a language server or a batch compiler. This is the "verification over hallucination" principle from the RLM paper made concrete: the agent tests what it writes before claiming it works.
Second, the REPL enables stateful exploration. The agent can define a variable, inspect it, modify it, redefine it. The exploration accumulates state across interactions. This is not a debugging session - it is the agent thinking with its hands, working through a problem by trying variations and observing outcomes, keeping the successful ones and discarding the failures.
Third, the REPL is a shared substrate. When the agent evaluates code, that code runs in the same image as the agent's own cognition. There is no process boundary between the agent and its tools. The REPL is not a subprocess the agent controls - it is a direct interface to the agent's own nervous system.
This is why the REPL becomes more important as the system matures. In early versions, it is a development tool. In v0.6.0 and beyond, it becomes a cognitive tool: the agent explores hypotheses by evaluating them, verifies the output of sub-agents by inspecting live state, and tests modifications before committing them to the knowledge graph.
* The Evaluation Harness
:PROPERTIES:
:ID: design-evaluation-harness
:END:
SOTA parity is meaningless without measurement. A system that claims to match commercial agents must demonstrate it through reproducible benchmarks, not through feature checklists. The evaluation harness is the apparatus by which OpenCortex proves its capabilities.
The industry standard for coding agents is SWE-bench: a corpus of GitHub issues paired with pull requests. The agent is given an issue, must understand the codebase, write a fix, and submit. Success is measured by whether the submitted PR passes the existing test suite. This tests the full chain: understanding, planning, code generation, verification, and multi-step reasoning.
OpenCortex implements a native Lisp harness for this. A background thread clones repositories, feeds issues into the cognitive loop, tracks the resolution trajectory as an Org-mode headline tree, and scores success by test outcomes. The trajectory is persisted: when a resolution fails, the system can inspect where in the chain the reasoning broke down. The headline tree records the agent's thoughts at each step, making the failure auditable and the debugging human-assisted.
Beyond SWE-bench, the harness includes chaos testing. The system is subjected to resource starvation, concurrent load, and adversarial input. The deterministic engine must maintain safety invariants under pressure. The symbolic verifier must not deadlock or livelock. The probabilistic engine must degrade gracefully - if tokens are limited, it must still produce valid proposals that the deterministic engine can evaluate. Failure under chaos is a design flaw, not a benchmark anomaly.
The harness also supports regression testing on the skill set. Every skill is tested against a suite of known inputs and expected outputs. When a modification is proposed to any skill - whether through manual editing or the agent's own self-modification - the test suite runs first. A skill that fails its tests is rejected before it can propagate to the running image. This is not a convenience - it is the mechanism by which self-modification remains safe. The agent can propose changes, but the harness verifies them before the changes take effect.
* Observability and the Thought Trace
:PROPERTIES:
:ID: design-observability
:END:
When a human asks why the system made a decision, the answer must be findable. In most AI systems, the reasoning is ephemeral - it exists in the model's activations and disappears when the session ends. In OpenCortex, every significant cognitive event is written to an Org buffer as it happens.
The thought trace is the agent's journal, written in parallel with its reasoning. When the probabilistic engine generates a proposal, the trace records the input, the prompt, and the raw output. When the deterministic engine evaluates it, the trace records which rules were checked, which passed, which failed, and why. When an action is executed, the trace records the timestamp, the user who approved it (if human-in-the-loop), and the outcome.
This is not logging in the traditional sense. Logs are forensically useful but are written in a machine format optimized for storage, not for human reading. The thought trace is written in Org-mode: headlines for major events, property drawers for structured data, tags for categorization. The human can open the trace in Emacs and navigate it like any other Org file. They can search for a specific decision, filter by time range, find all actions blocked by a specific rule, or see the complete trajectory of a multi-step task.
The trace becomes the foundation for the Bouncer's learning. Every blocked action is in the trace. Every approved exception is in the trace. The human-in-the-loop decisions are in the trace. The system does not need to reconstruct what happened - it reads what happened from the trace it wrote.
Without observability, the system is a black box that happens to produce correct outputs sometimes. With observability, the system is auditable. The human can see why a decision was made, identify where the reasoning failed, and course-correct the system or its own behavior accordingly.
* The MCP Strategy
:PROPERTIES:
:ID: design-mcp-strategy
:END:
The Model Context Protocol (MCP) is a standard for connecting AI systems to external tools and data sources. It defines how a client requests tools from a server, how the server exposes its capabilities, and how the client invokes them. The ecosystem is growing: MCP servers exist for GitHub, Slack, Postgres, filesystem access, and much more.
OpenCortex connects to this ecosystem, but not by becoming a Node.js runtime. The architecture is: external MCP servers communicate via stdio or SSE to a Lisp-native MCP client that runs in the same image as the agent. The client is pure Common Lisp - it parses the JSON-RPC messages, invokes the tools, and presents results to the agent as Lisp data structures. There is no serialization overhead between the agent and the MCP layer, no process boundary, no impedance mismatch.
When the agent calls a tool via MCP, it receives a plist with the tool name, arguments, and result. The result is immediately usable by the agent's symbolic engine. When the agent generates a file, it can be written to the filesystem through an MCP filesystem server. When the agent needs to send a message, it can use an MCP Slack server. The agent does not need to know that these are MCP interactions - it sees only the plists that flow through its cognitive architecture.
The alternative is to build MCP wrappers in Python or TypeScript and bridge to Lisp via subprocess. This is what OpenClaw does: a Node.js runtime that manages MCP servers, with a bridge to the Lisp process. The bridge introduces latency, serialization costs, and a maintenance burden. The Node.js process must be kept running. The bridge must be maintained across Lisp and JavaScript runtimes. The cognitive architecture must handle errors that cross the process boundary.
OpenCortex's native client is smaller, faster, and more maintainable. The MCP client is a skill, not a core component. It can be reloaded, replaced, or removed without restarting the agent. The agent can add new MCP tool integrations by loading new skills, not by deploying new infrastructure.
* Local-First Architecture
:PROPERTIES:
:ID: design-local-first
:END:
OpenCortex is designed to run on the user's machine, on their hardware, with their data, without requiring an internet connection. This is not a deployment option - it is an architectural commitment. The system must be able to reason, plan, and act using only the resources available locally.
The motivation is not merely philosophical. Cloud-based AI agents are economically incentivized to collect data, to train on user interactions, and to build lock-in through proprietary formats and network effects. When the agent runs locally, the user owns the hardware, owns the data, and can terminate the process without asking permission. There is no vendor that can change terms, no service that can go offline, no model that can be updated without consent.
Technically, local-first means several things. The LLM must be able to run on local hardware. OpenCortex supports Ollama as a provider, which runs quantized models on CPU and GPU without requiring an external API. The vector database must be local. OpenCortex uses its own org-object store, which is a folder of Org files that the agent already owns. There is no ChromaDB or Qdrant to install, no cloud vector service to authenticate with.
The symbolic engine does not require a network connection. The Prolog/Datalog reasoner that in v3.0.0 verifies neural proposals runs entirely in the Lisp image. The Bouncer's rule synthesis does not call an external service. The agent can operate in a disconnected environment indefinitely, resuming full capability when connectivity is restored.
This does not mean OpenCortex refuses to use cloud services when available and appropriate. It means cloud services are optional enhancements, not architectural requirements. The core is local. The user can choose to add cloud LLM providers for more capable inference, but the system functions without them.
* Zero-Dependency Deployment
:PROPERTIES:
:ID: design-zero-dependency
:END:
The simplest deployment is one that requires no installation steps. The user downloads one file, runs it, and the system works. OpenCortex approximates this through SBCL's ability to produce standalone executables via save-lisp-and-die. The executable contains the Lisp runtime, the compiled system, and Quicklisp libraries - everything bundled into one binary.
The practical reality is more nuanced. Building a truly standalone executable requires resolving all library dependencies at build time and embedding them in the binary. SBCL supports this, but the resulting binary is large (tens of megabytes), and updating any component requires a full rebuild. The current deployment model uses a Docker container that maps the user's memex directory as a volume. The container starts, loads the system, and is ready. No compilation on the user's machine, no dependency installation, no platform-specific quirks.
The long-term goal is a single =opencortex= binary that the user runs. It starts a local web server on a Unix domain socket. The TUI connects through the socket. The user's Org files are in =~/memex/=. The binary is the only thing that needs to be installed.
This stands in stark contrast to most AI agent systems, which require managing Python environments, npm packages, API keys, environment variables, and configuration files. OpenAI's agents SDK requires pip install, a Python environment, and external API access. OpenClaw requires Node.js, npm, and a plugin ecosystem that must be individually installed. LangChain requires a Python environment with dozens of dependencies that must be kept compatible.
OpenCortex's dependency model is SBCL plus Quicklisp. Quicklisp loads libraries on demand from the internet, but caches them locally. A system with internet access can fetch any library it needs. A system without internet access uses only the libraries it has already loaded - and those are preserved in the cache. The agent does not require internet access to function after initial setup.

View File

@@ -84,8 +84,8 @@
(format t "==================================================~%")
(handler-case
(progn
(when (fboundp 'doctor-run-all)
(let ((result (doctor-run-all :auto-install nil)))
(when (fboundp 'doctor-run-all)
(let ((result (doctor-run-all)))
(setf *health-check-ran* t)
(if result
(progn

View File

@@ -4,13 +4,16 @@
(defvar *history-store* (make-hash-table :test 'equal)
"Immutable Merkle-Tree versioning store mapping hashes to objects.")
(defun lookup-object (id)
(gethash id *memory*))
(defstruct org-object
id type attributes content vector parent-id children version last-sync hash)
(defmethod make-load-form ((obj org-object) &optional env)
(make-load-form-saving-slots obj :environment env))
(defun copy-org-object (obj)
(defun deep-copy-org-object (obj)
(make-org-object :id (org-object-id obj)
:type (org-object-type obj)
:attributes (copy-list (org-object-attributes obj))
@@ -71,7 +74,7 @@
(defun snapshot-memory ()
(let ((snapshot (make-hash-table :test 'equal :size (hash-table-size *memory*))))
(maphash (lambda (k v) (setf (gethash k snapshot) (copy-org-object v))) *memory*)
(maphash (lambda (k v) (setf (gethash k snapshot) (deep-copy-org-object v))) *memory*)
(push (list :timestamp (get-universal-time) :data snapshot) *object-store-snapshots*)
(when (> (length *object-store-snapshots*) 20) (setf *object-store-snapshots* (subseq *object-store-snapshots* 0 20)))
(harness-log "MEMORY - CoW Memory snapshot created.")))

View File

@@ -21,6 +21,12 @@ The Memory module is the cognitive bedrock of the opencortex. It is not a databa
"Immutable Merkle-Tree versioning store mapping hashes to objects.")
#+end_src
** Object Lookup
#+begin_src lisp
(defun lookup-object (id)
(gethash id *memory*))
#+end_src
** The Data Structure (org-object)
#+begin_src lisp
(defstruct org-object
@@ -29,7 +35,7 @@ The Memory module is the cognitive bedrock of the opencortex. It is not a databa
(defmethod make-load-form ((obj org-object) &optional env)
(make-load-form-saving-slots obj :environment env))
(defun copy-org-object (obj)
(defun deep-copy-org-object (obj)
(make-org-object :id (org-object-id obj)
:type (org-object-type obj)
:attributes (copy-list (org-object-attributes obj))
@@ -99,7 +105,7 @@ The Memory module is the cognitive bedrock of the opencortex. It is not a databa
(defun snapshot-memory ()
(let ((snapshot (make-hash-table :test 'equal :size (hash-table-size *memory*))))
(maphash (lambda (k v) (setf (gethash k snapshot) (copy-org-object v))) *memory*)
(maphash (lambda (k v) (setf (gethash k snapshot) (deep-copy-org-object v))) *memory*)
(push (list :timestamp (get-universal-time) :data snapshot) *object-store-snapshots*)
(when (> (length *object-store-snapshots*) 20) (setf *object-store-snapshots* (subseq *object-store-snapshots* 0 20)))
(harness-log "MEMORY - CoW Memory snapshot created.")))

View File

@@ -254,6 +254,7 @@
;; --- Debugger Hook ---
(setf *debugger-hook* (lambda (condition hook)
"Friendly error handler - shows diagnostic message instead of raw debugger."
(declare (ignore hook))
(format t "~%")
(format t "┌─────────────────────────────────────────────┐~%")
(format t "│ ERROR: ~A~%" (type-of condition))

View File

@@ -268,8 +268,10 @@ The ~package.lisp~ file defines the public API of the ~opencortex~ harness.
(finish-output)))
;; --- Debugger Hook ---
#+begin_src lisp :tangle package.lisp
(setf *debugger-hook* (lambda (condition hook)
"Friendly error handler - shows diagnostic message instead of raw debugger."
(declare (ignore hook))
(format t "~%")
(format t "┌─────────────────────────────────────────────┐~%")
(format t "│ ERROR: ~A~%" (type-of condition))

View File

@@ -1,5 +1,6 @@
(in-package :opencortex)
(defvar *interrupt-flag* nil)
(defvar *async-sensors* '(:chat-message :delegation :user-command)
"Sensors that are processed in dedicated threads.")

View File

@@ -14,6 +14,11 @@ The Perceive stage is the "sensory cortex" of OpenCortex. Its job is to take raw
(in-package :opencortex)
#+end_src
** Interrupt Handling
#+begin_src lisp
(defvar *interrupt-flag* nil)
#+end_src
** Sensor Configuration
#+begin_src lisp
(defvar *async-sensors* '(:chat-message :delegation :user-command)

View File

@@ -1,6 +1,6 @@
(in-package :cl-user)
(defpackage :opencortex.tui
(:use :cl :croatoan :usocket)
(:use :cl :croatoan :usocket :bordeaux-threads)
(:export :main))
(in-package :opencortex.tui)
@@ -9,139 +9,139 @@
(defvar *socket* nil)
(defvar *stream* nil)
(defvar *chat-history* nil)
(defvar *scroll-index* 0)
(defvar *input-buffer* (make-array 0 :element-type 'char :fill-pointer 0 :adjustable t))
(defvar *input-list* nil) ; List of characters (stored in reverse)
(defvar *is-running* t)
(defvar *queue-lock* (bt:make-lock))
(defvar *incoming-msgs* nil)
(defun log-debug (msg &rest args)
(ignore-errors
(with-open-file (s "/tmp/opencortex-tui-debug.log" :direction :output :if-exists :append :if-does-not-exist :create)
(format s "[~a] " (get-universal-time))
(apply #'format s msg args)
(terpri s)
(finish-output s))))
(defun enqueue-msg (msg)
"Thread-safe addition to incoming message queue."
(bt:with-lock-held (*queue-lock*)
(setf *incoming-msgs* (append *incoming-msgs* (list msg)))))
(defun dequeue-msgs ()
"Thread-safe retrieval of incoming messages."
(bt:with-lock-held (*queue-lock*)
(let ((msgs *incoming-msgs*))
(setf *incoming-msgs* nil)
msgs)))
(defun get-line-style (text)
(cond
((uiop:string-prefix-p "*" text) '(:bold :yellow))
((uiop:string-prefix-p "⬆" text) '(:cyan))
((uiop:string-prefix-p "🤔" text) '(:italic))
((uiop:string-prefix-p "ERROR" text) '(:bold :red))
(t nil)))
(defun render-chat (win)
(clear win)
(let* ((h (height win))
(view-height (max 0 (- h 2)))
(history-len (length *chat-history*))
(start-idx *scroll-index*)
(end-idx (min history-len (+ start-idx view-height)))
(slice (reverse (subseq *chat-history* start-idx end-idx))))
(loop for msg in slice
for i from 1
do (add-string win (format nil "│ ~a" msg) :y i :x 1 :attributes (get-line-style msg)))
(defun render-chat (win h)
(when (and win (integerp h))
(clear win)
(box win 0 0)
(let* ((view-height (- h 2))
(history (copy-list *chat-history*))
(len (length history))
(num-to-draw (min len view-height))
(slice (subseq history 0 num-to-draw)))
(loop for i from 0 below num-to-draw
for msg in (reverse slice)
do (when msg
(add-string win (format nil "│ ~a" msg) :y (1+ i) :x 2))))
(refresh win)))
(defun handle-backspace ()
(when (> (fill-pointer *input-buffer*) 0)
(decf (fill-pointer *input-buffer*))))
(pop *input-list*))
(defun handle-return (stream)
(let ((cmd (coerce *input-buffer* 'string)))
(setf (fill-pointer *input-buffer*) 0)
(let ((cmd (coerce (reverse *input-list*) 'string)))
(setf *input-list* nil)
(log-debug "SUBMITTING: '~a'" cmd)
(when (> (length cmd) 0)
(enqueue-msg (format nil "⬆ ~a" cmd))
(push (format nil "⬆ ~a" cmd) *chat-history*)
(handler-case
(progn
(when (and stream (open-stream-p stream))
(let* ((msg (list :TYPE :EVENT
:META (list :SOURCE :tui)
:PAYLOAD (list :SENSOR :user-input :TEXT cmd)))
(payload (format nil "~s" msg))
(len (length payload)))
(format stream "~6,'0x~a" len payload)
(finish-output stream)))
(enqueue-msg "✓ Sent"))
(if (and stream (open-stream-p stream))
(let* ((msg (list :TYPE :EVENT
:META (list :SOURCE :tui)
:PAYLOAD (list :SENSOR :user-input :TEXT cmd)))
(payload (format nil "~s" msg))
(len (length payload)))
(format stream "~6,'0x~a" len payload)
(finish-output stream)
(log-debug "SENT WIRE: ~a" payload))
(push "ERROR: Not connected." *chat-history*)))
(error (c)
(format t "Send error: ~a~%" c)
(enqueue-msg "ERROR: Connection to daemon lost.")
(log-debug "SEND ERROR: ~a" c)
(push (format nil "ERROR: ~a" c) *chat-history*)
(setf *is-running* nil))))
(when (string= cmd "/exit") (setf *is-running* nil))
(when (string= cmd "/clear") (setf *chat-history* nil))))
(defun start-background-reader (stream)
"Starts a thread that reads framed messages from the daemon stream."
(bt:make-thread
(lambda ()
(loop while *is-running* do
(handler-case
(let* ((len-buf (make-string 6))
(count (read-sequence len-buf stream)))
(when (= count 6)
(let* ((msg-len (parse-integer len-buf :radix 16))
(msg-buf (make-string msg-len)))
(read-sequence msg-buf stream)
(let ((msg (read-from-string msg-buf)))
(let ((payload (getf msg :payload)))
(cond
((eq (getf payload :action) :handshake)
(enqueue-msg "* Connected to daemon *"))
((and (eq (getf payload :sensor) :loop-error)
(not (string= (or (getf payload :message) "") "Neural Cascade Failure: All providers exhausted.")))
(enqueue-msg (format nil "ERROR: Daemon loop error (~a)"
(getf payload :message))))
(t
(let ((text (or (getf payload :text) (format nil "~a" payload))))
(enqueue-msg (format nil "⬇ ~a" text)))))))))
(if (= count 6)
(let* ((msg-len (parse-integer len-buf :radix 16))
(msg-buf (make-string msg-len)))
(read-sequence msg-buf stream)
(log-debug "DAEMON MSG: ~a" msg-buf)
(let ((msg (read-from-string msg-buf)))
(let ((payload (getf msg :payload)))
(cond
((eq (getf payload :action) :handshake)
(enqueue-msg "* Connected *"))
(t
(let ((text (or (getf payload :text) (format nil "~a" payload))))
(enqueue-msg (format nil "⬇ ~a" text))))))))
(sleep 0.05)))
(error (c)
(when *is-running*
(enqueue-msg (format nil "ERROR: Connection lost (~a)" c))
(log-debug "READER ERROR: ~a" c)
(enqueue-msg "ERROR: Connection lost.")
(setf *is-running* nil))))))
:name "opencortex-tui-reader"))
(defun main ()
(log-debug "=== START ===")
(handler-case
(setf *socket* (usocket:socket-connect *daemon-host* *daemon-port*))
(error (e) (format t "Offline: ~a~%" e) (return-from main)))
(setf *stream* (usocket:socket-stream *socket*))
;; Guard: Croatoan needs a real terminal (TERM env var, real TTY)
(unless (uiop:getenv "TERM")
(format t "TUI requires a terminal. Set TERM environment variable.~%")
(format t "Or use: echo 'your message' | nc localhost 9105~%")
(return-from main))
(unwind-protect
(handler-case
(with-screen (scr :input-echoing nil :input-blocking nil :enable-colors t)
(let* ((h (height scr)) (w (width scr)))
(let ((chat-win (make-instance 'window :height (- h 5) :width (- w 2) :position '(1 1) :border t))
(input-win (make-instance 'window :height 1 :width (- w 2) :position (list (- h 2) 1) :border t)))
(setf (input-blocking input-win) nil)
(start-background-reader *stream*)
(loop :while *is-running* :do
(let ((msgs (dequeue-msgs)))
(when msgs
(dolist (m msgs) (push m *chat-history*))
(render-chat chat-win)))
(let* ((ev (get-event input-win))
(ch (when (and ev (typep ev 'event)) (event-key ev))))
(when ch
(cond
((or (eq ch #\Newline) (eq ch #\Return)) (handle-return *stream*))
((or (eq ch :backspace) (eq ch (code-char 127))) (handle-backspace))
((characterp ch) (vector-push-extend ch *input-buffer*))))
(clear input-win)
(add-string input-win (format nil "▶ ~a" (coerce *input-buffer* 'string)) :y 0 :x 1)
(refresh input-win))
(sleep 0.02)))))
(error (c)
(format t "TUI Error: ~a~%" c)))
(with-screen (scr :input-echoing nil :input-blocking nil :enable-colors t)
(let* ((h (or (height scr) 24))
(w (or (width scr) 80))
(chat-h (- h 4))
(chat-win (make-instance 'window :height chat-h :width (- w 2) :y 1 :x 1))
(input-win (make-instance 'window :height 1 :width (- w 2) :y (- h 2) :x 1)))
(setf (input-blocking input-win) nil)
(start-background-reader *stream*)
(loop :while *is-running* :do
(let ((msgs (dequeue-msgs)))
(when msgs
(dolist (m msgs) (push m *chat-history*))
(render-chat chat-win chat-h)))
(let ((ch (get-char input-win)))
(when (and ch (not (equal ch -1)))
(log-debug "KEY: ~s" ch)
(cond
((or (eql ch 10) (eql ch 13) (eq ch :enter) (eql ch #\Newline) (eql ch #\Return))
(handle-return *stream*)
(render-chat chat-win chat-h))
((or (eql ch 127) (eql ch 8) (eq ch :backspace) (eql ch #\Backspace))
(handle-backspace))
((characterp ch)
(push ch *input-list*))
((integerp ch)
(let ((converted (code-char ch)))
(when (graphic-char-p converted)
(push converted *input-list*))))))
(clear input-win)
(add-string input-win (format nil "▶ ~a" (coerce (reverse *input-list*) 'string)) :y 0 :x 1)
(refresh input-win))
(sleep 0.01))))
(setf *is-running* nil)
(when *socket* (ignore-errors (usocket:socket-close *socket*)))))

124
harness/tui-client.lisp.tmp Normal file
View File

@@ -0,0 +1,124 @@
(defpackage :opencortex-tui-tests
(:use :cl :opencortex)
(:export #:tui-suite))
(in-package :opencortex-tui-tests)
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(fiveam:def-suite tui-suite :description "Verification of the TUI parsing and styling logic")
(fiveam:in-suite tui-suite)
(fiveam:test test-tui-connection-drop
"Tier 2 Chaos: Verify that handle-return degrades gracefully when the daemon connection is lost."
;; Create a closed stream to simulate connection drop
(mock-stream (make-string-output-stream)))
(close mock-stream)
(opencortex.tui::handle-return mock-stream)
;; Check if the error was enqueued to history instead of crashing
(in-package :cl-user)
(defpackage :opencortex.tui
(:use :cl :croatoan :usocket :bordeaux-threads)
(:export :main))
(in-package :opencortex.tui)
(defun enqueue-msg (msg)
"Thread-safe addition to incoming message queue."
(defun dequeue-msgs ()
"Thread-safe retrieval of incoming messages."
msgs)))
(defun get-line-style (text)
(cond
((uiop:string-prefix-p "⬆" text) '(:cyan))
((uiop:string-prefix-p "🤔" text) '(:italic))
((uiop:string-prefix-p "ERROR" text) '(:bold :red))
(t nil)))
(defun render-chat (win)
(clear win)
(view-height (max 0 (- h 2)))
(end-idx (min history-len (+ start-idx view-height)))
(loop for msg in slice
for i from 1
do (add-string win (format nil "│ ~a" msg) :y i :x 1 :attributes (get-line-style msg)))
(refresh win)))
(defun handle-backspace ()
(defun handle-return (stream)
(when (> (length cmd) 0)
(enqueue-msg (format nil "⬆ ~a" cmd))
(handler-case
(progn
(when (and stream (open-stream-p stream))
:META (list :SOURCE :tui)
:PAYLOAD (list :SENSOR :user-input :TEXT cmd)))
(payload (format nil "~s" msg))
(len (length payload)))
(format stream "~6,'0x~a" len payload)
(finish-output stream)))
(enqueue-msg "✓ Sent"))
(error (c)
(format t "Send error: ~a~%" c)
(enqueue-msg "ERROR: Connection to daemon lost.")
(defun start-background-reader (stream)
"Starts a thread that reads framed messages from the daemon stream."
(bt:make-thread
(lambda ()
(handler-case
(count (read-sequence len-buf stream)))
(when (= count 6)
(msg-buf (make-string msg-len)))
(read-sequence msg-buf stream)
(let ((msg (read-from-string msg-buf)))
(let ((payload (getf msg :payload)))
(cond
((eq (getf payload :action) :handshake)
((and (eq (getf payload :sensor) :loop-error)
(not (string= (or (getf payload :message) "") "Neural Cascade Failure: All providers exhausted.")))
(enqueue-msg (format nil "ERROR: Daemon loop error (~a)"
(getf payload :message))))
(t
(let ((text (or (getf payload :text) (format nil "~a" payload))))
(enqueue-msg (format nil "⬇ ~a" text)))))))))
(error (c)
(enqueue-msg (format nil "ERROR: Connection lost (~a)" c))
:name "opencortex-tui-reader")))
(defun main ()
(handler-case
(error (e) (format t "Offline: ~a~%" e) (return-from main)))
;; Guard: Croatoan needs a real terminal (TERM env var, real TTY)
(unless (uiop:getenv "TERM")
(format t "TUI requires a terminal. Set TERM environment variable.~%")
(format t "Or use: echo 'your message' | nc localhost 9105~%")
(return-from main))
(unwind-protect
(handler-case
(with-screen (scr :input-echoing nil :input-blocking nil :enable-colors t)
(let ((chat-win (make-instance 'window :height (- h 5) :width (- w 2) :position '(1 1) :border t))
(input-win (make-instance 'window :height 1 :width (- w 2) :position (list (- h 2) 1) :border t)))
(setf (input-blocking input-win) nil)
(let ((msgs (dequeue-msgs)))
(when msgs
(render-chat chat-win)))
(ch (when (and ev (typep ev 'event)) (event-key ev))))
(when ch
(cond
((or (eq ch :backspace) (eq ch (code-char 127))) (handle-backspace))
(clear input-win)
(refresh input-win))
(sleep 0.02)))))
(error (c)
(format t "TUI Error: ~a~%" c)))

View File

@@ -6,39 +6,13 @@
* Overview
The OpenCortex TUI Client is a standalone Common Lisp application built on **Croatoan**.
* Test Suite
#+begin_src lisp :tangle ../tests/tui-tests.lisp
(defpackage :opencortex-tui-tests
(:use :cl :opencortex)
(:export #:tui-suite))
(in-package :opencortex-tui-tests)
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(fiveam:def-suite tui-suite :description "Verification of the TUI parsing and styling logic")
(fiveam:in-suite tui-suite)
(fiveam:test test-tui-connection-drop
"Tier 2 Chaos: Verify that handle-return degrades gracefully when the daemon connection is lost."
(let ((opencortex.tui::*incoming-msgs* nil)
(opencortex.tui::*input-buffer* (make-array 5 :element-type 'char :initial-contents "hello" :fill-pointer 5 :adjustable t))
;; Create a closed stream to simulate connection drop
(mock-stream (make-string-output-stream)))
(close mock-stream)
(opencortex.tui::handle-return mock-stream)
;; Check if the error was enqueued to history instead of crashing
(fiveam:is (member "ERROR: Connection to daemon lost." opencortex.tui::*incoming-msgs* :test #'string=))))
#+end_src
* Implementation
** Package Context
#+begin_src lisp
(in-package :cl-user)
(defpackage :opencortex.tui
(:use :cl :croatoan :usocket)
(:use :cl :croatoan :usocket :bordeaux-threads)
(:export :main))
(in-package :opencortex.tui)
#+end_src
@@ -50,8 +24,7 @@ The OpenCortex TUI Client is a standalone Common Lisp application built on **Cro
(defvar *socket* nil)
(defvar *stream* nil)
(defvar *chat-history* nil)
(defvar *scroll-index* 0)
(defvar *input-buffer* (make-array 0 :element-type 'char :fill-pointer 0 :adjustable t))
(defvar *input-list* nil) ; List of characters (stored in reverse)
(defvar *is-running* t)
(defvar *queue-lock* (bt:make-lock))
(defvar *incoming-msgs* nil)
@@ -59,68 +32,69 @@ The OpenCortex TUI Client is a standalone Common Lisp application built on **Cro
** Utilities
#+begin_src lisp
(defun log-debug (msg &rest args)
(ignore-errors
(with-open-file (s "/tmp/opencortex-tui-debug.log" :direction :output :if-exists :append :if-does-not-exist :create)
(format s "[~a] " (get-universal-time))
(apply #'format s msg args)
(terpri s)
(finish-output s))))
(defun enqueue-msg (msg)
"Thread-safe addition to incoming message queue."
(bt:with-lock-held (*queue-lock*)
(setf *incoming-msgs* (append *incoming-msgs* (list msg)))))
(defun dequeue-msgs ()
"Thread-safe retrieval of incoming messages."
(bt:with-lock-held (*queue-lock*)
(let ((msgs *incoming-msgs*))
(setf *incoming-msgs* nil)
msgs)))
(defun get-line-style (text)
(cond
((uiop:string-prefix-p "*" text) '(:bold :yellow))
((uiop:string-prefix-p "⬆" text) '(:cyan))
((uiop:string-prefix-p "🤔" text) '(:italic))
((uiop:string-prefix-p "ERROR" text) '(:bold :red))
(t nil)))
#+end_src
** Rendering
#+begin_src lisp
(defun render-chat (win)
(clear win)
(let* ((h (height win))
(view-height (max 0 (- h 2)))
(history-len (length *chat-history*))
(start-idx *scroll-index*)
(end-idx (min history-len (+ start-idx view-height)))
(slice (reverse (subseq *chat-history* start-idx end-idx))))
(loop for msg in slice
for i from 1
do (add-string win (format nil "│ ~a" msg) :y i :x 1 :attributes (get-line-style msg)))
(defun render-chat (win h)
(when (and win (integerp h))
(clear win)
(box win 0 0)
(let* ((view-height (- h 2))
(history (copy-list *chat-history*))
(len (length history))
(num-to-draw (min len view-height))
(slice (subseq history 0 num-to-draw)))
(loop for i from 0 below num-to-draw
for msg in (reverse slice)
do (when msg
(add-string win (format nil "│ ~a" msg) :y (1+ i) :x 2))))
(refresh win)))
#+end_src
** Input Handling
#+begin_src lisp
(defun handle-backspace ()
(when (> (fill-pointer *input-buffer*) 0)
(decf (fill-pointer *input-buffer*))))
(pop *input-list*))
(defun handle-return (stream)
(let ((cmd (coerce *input-buffer* 'string)))
(setf (fill-pointer *input-buffer*) 0)
(let ((cmd (coerce (reverse *input-list*) 'string)))
(setf *input-list* nil)
(log-debug "SUBMITTING: '~a'" cmd)
(when (> (length cmd) 0)
(enqueue-msg (format nil "⬆ ~a" cmd))
(push (format nil "⬆ ~a" cmd) *chat-history*)
(handler-case
(progn
(when (and stream (open-stream-p stream))
(let* ((msg (list :TYPE :EVENT
:META (list :SOURCE :tui)
:PAYLOAD (list :SENSOR :user-input :TEXT cmd)))
(payload (format nil "~s" msg))
(len (length payload)))
(format stream "~6,'0x~a" len payload)
(finish-output stream)))
(enqueue-msg "✓ Sent"))
(if (and stream (open-stream-p stream))
(let* ((msg (list :TYPE :EVENT
:META (list :SOURCE :tui)
:PAYLOAD (list :SENSOR :user-input :TEXT cmd)))
(payload (format nil "~s" msg))
(len (length payload)))
(format stream "~6,'0x~a" len payload)
(finish-output stream)
(log-debug "SENT WIRE: ~a" payload))
(push "ERROR: Not connected." *chat-history*)))
(error (c)
(format t "Send error: ~a~%" c)
(enqueue-msg "ERROR: Connection to daemon lost.")
(log-debug "SEND ERROR: ~a" c)
(push (format nil "ERROR: ~a" c) *chat-history*)
(setf *is-running* nil))))
(when (string= cmd "/exit") (setf *is-running* nil))
(when (string= cmd "/clear") (setf *chat-history* nil))))
@@ -129,32 +103,30 @@ The OpenCortex TUI Client is a standalone Common Lisp application built on **Cro
** Background Reader
#+begin_src lisp
(defun start-background-reader (stream)
"Starts a thread that reads framed messages from the daemon stream."
(bt:make-thread
(lambda ()
(loop while *is-running* do
(handler-case
(let* ((len-buf (make-string 6))
(count (read-sequence len-buf stream)))
(when (= count 6)
(let* ((msg-len (parse-integer len-buf :radix 16))
(msg-buf (make-string msg-len)))
(read-sequence msg-buf stream)
(let ((msg (read-from-string msg-buf)))
(let ((payload (getf msg :payload)))
(cond
((eq (getf payload :action) :handshake)
(enqueue-msg "* Connected to daemon *"))
((and (eq (getf payload :sensor) :loop-error)
(not (string= (or (getf payload :message) "") "Neural Cascade Failure: All providers exhausted.")))
(enqueue-msg (format nil "ERROR: Daemon loop error (~a)"
(getf payload :message))))
(t
(let ((text (or (getf payload :text) (format nil "~a" payload))))
(enqueue-msg (format nil "⬇ ~a" text)))))))))
(if (= count 6)
(let* ((msg-len (parse-integer len-buf :radix 16))
(msg-buf (make-string msg-len)))
(read-sequence msg-buf stream)
(log-debug "DAEMON MSG: ~a" msg-buf)
(let ((msg (read-from-string msg-buf)))
(let ((payload (getf msg :payload)))
(cond
((eq (getf payload :action) :handshake)
(enqueue-msg "* Connected *"))
(t
(let ((text (or (getf payload :text) (format nil "~a" payload))))
(enqueue-msg (format nil "⬇ ~a" text))))))))
(sleep 0.05)))
(error (c)
(when *is-running*
(enqueue-msg (format nil "ERROR: Connection lost (~a)" c))
(log-debug "READER ERROR: ~a" c)
(enqueue-msg "ERROR: Connection lost.")
(setf *is-running* nil))))))
:name "opencortex-tui-reader"))
#+end_src
@@ -162,43 +134,45 @@ The OpenCortex TUI Client is a standalone Common Lisp application built on **Cro
** Main Entry Point
#+begin_src lisp
(defun main ()
(log-debug "=== START ===")
(handler-case
(setf *socket* (usocket:socket-connect *daemon-host* *daemon-port*))
(error (e) (format t "Offline: ~a~%" e) (return-from main)))
(setf *stream* (usocket:socket-stream *socket*))
;; Guard: Croatoan needs a real terminal (TERM env var, real TTY)
(unless (uiop:getenv "TERM")
(format t "TUI requires a terminal. Set TERM environment variable.~%")
(format t "Or use: echo 'your message' | nc localhost 9105~%")
(return-from main))
(unwind-protect
(handler-case
(with-screen (scr :input-echoing nil :input-blocking nil :enable-colors t)
(let* ((h (height scr)) (w (width scr)))
(let ((chat-win (make-instance 'window :height (- h 5) :width (- w 2) :position '(1 1) :border t))
(input-win (make-instance 'window :height 1 :width (- w 2) :position (list (- h 2) 1) :border t)))
(setf (input-blocking input-win) nil)
(start-background-reader *stream*)
(loop :while *is-running* :do
(let ((msgs (dequeue-msgs)))
(when msgs
(dolist (m msgs) (push m *chat-history*))
(render-chat chat-win)))
(let* ((ev (get-event input-win))
(ch (when (and ev (typep ev 'event)) (event-key ev))))
(when ch
(cond
((or (eq ch #\Newline) (eq ch #\Return)) (handle-return *stream*))
((or (eq ch :backspace) (eq ch (code-char 127))) (handle-backspace))
((characterp ch) (vector-push-extend ch *input-buffer*))))
(clear input-win)
(add-string input-win (format nil "▶ ~a" (coerce *input-buffer* 'string)) :y 0 :x 1)
(refresh input-win))
(sleep 0.02)))))
(error (c)
(format t "TUI Error: ~a~%" c)))
(with-screen (scr :input-echoing nil :input-blocking nil :enable-colors t)
(let* ((h (or (height scr) 24))
(w (or (width scr) 80))
(chat-h (- h 4))
(chat-win (make-instance 'window :height chat-h :width (- w 2) :y 1 :x 1))
(input-win (make-instance 'window :height 1 :width (- w 2) :y (- h 2) :x 1)))
(setf (input-blocking input-win) nil)
(start-background-reader *stream*)
(loop :while *is-running* :do
(let ((msgs (dequeue-msgs)))
(when msgs
(dolist (m msgs) (push m *chat-history*))
(render-chat chat-win chat-h)))
(let ((ch (get-char input-win)))
(when (and ch (not (equal ch -1)))
(log-debug "KEY: ~s" ch)
(cond
((or (eql ch 10) (eql ch 13) (eq ch :enter) (eql ch #\Newline) (eql ch #\Return))
(handle-return *stream*)
(render-chat chat-win chat-h))
((or (eql ch 127) (eql ch 8) (eq ch :backspace) (eql ch #\Backspace))
(handle-backspace))
((characterp ch)
(push ch *input-list*))
((integerp ch)
(let ((converted (code-char ch)))
(when (graphic-char-p converted)
(push converted *input-list*))))))
(clear input-win)
(add-string input-win (format nil "▶ ~a" (coerce (reverse *input-list*) 'string)) :y 0 :x 1)
(refresh input-win))
(sleep 0.01))))
(setf *is-running* nil)
(when *socket* (ignore-errors (usocket:socket-close *socket*)))))
#+end_src

View File

@@ -15,6 +15,7 @@
(:file "harness/perceive")
(:file "harness/reason")
(:file "harness/act")
(:file "harness/doctor")
(:file "harness/loop")))
(defsystem :opencortex/tests

View File

@@ -342,7 +342,7 @@ case "$COMMAND" in
echo "Daemon not running. Starting daemon first..."
$0 daemon
fi
if sbcl --non-interactive \
if sbcl \
--eval '(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))' \
--eval "(push (truename \"$OC_DATA_DIR/\") asdf:*central-registry*)" \
--eval '(ql:quickload :opencortex/tui)' \
@@ -361,7 +361,7 @@ case "$COMMAND" in
cli|boot)
check_dependencies
if sbcl --non-interactive \
if sbcl \
--eval '(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))' \
--eval "(push (truename \"$OC_DATA_DIR/\") asdf:*central-registry*)" \
--eval "(ql:quickload '(:opencortex :croatoan))" \

View File

@@ -0,0 +1,9 @@
(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))
(ql:quickload :croatoan :silent t)
(handler-case
(croatoan:with-screen (scr)
(format t "Screen height: ~s~%" (croatoan:height scr))
(format t "Screen width: ~s~%" (croatoan:width scr))
(finish-output))
(error (c) (format t "Croatoan Error: ~a~%" c)))
(uiop:quit 0)

62
scripts/harness-screen.sh Executable file
View File

@@ -0,0 +1,62 @@
#!/bin/bash
# OpenCortex TUI Harness via GNU Screen
# Provides a persistent PTY for Croatoan/ncurses TUI testing.
set -euo pipefail
SESSION="oct-tui"
LOG="$HOME/.local/state/opencortex/tui-screen.log"
function cleanup() {
screen -S "$SESSION" -X quit 2>/dev/null || true
}
case "${1:-start}" in
start)
cleanup
mkdir -p "$(dirname "$LOG")"
export TERM=screen-256color
export SKILLS_DIR="$HOME/.local/share/opencortex/skills"
screen -dmS "$SESSION" bash -c '
sbcl --non-interactive \
--eval "(load (merge-pathnames \"quicklisp/setup.lisp\" (user-homedir-pathname)))" \
--eval "(push (truename \"$HOME/.local/share/opencortex/\") asdf:*central-registry*)" \
--eval "(ql:quickload :opencortex/tui :silent t)" \
--eval "(opencortex.tui:main)" \
2>&1 | tee '"$LOG"'
echo "[TUI exited with code $?]"
sleep 3600
'
sleep 2
echo "TUI started in screen session '$SESSION'"
echo "Logs: $LOG"
;;
send)
shift
screen -S "$SESSION" -X stuff "$*"
;;
enter)
screen -S "$SESSION" -X stuff "$(printf '\r')"
;;
capture)
screen -S "$SESSION" -X hardcopy -h /tmp/oct-tui-capture.txt
cat /tmp/oct-tui-capture.txt
;;
log)
tail -f "$LOG"
;;
kill)
cleanup
echo "TUI session killed."
;;
*)
echo "Usage: $0 {start|send <text>|enter|capture|log|kill}"
exit 1
;;
esac

48
scripts/test-hi.lisp Normal file
View File

@@ -0,0 +1,48 @@
(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))
(ql:quickload :usocket :silent t)
(defun frame-message (msg)
(let* ((payload (format nil "~s" msg))
(len (length payload)))
(format nil "~6,'0x~a" len payload)))
(defun test-hi ()
(handler-case
(let* ((socket (usocket:socket-connect "127.0.0.1" 9105))
(stream (usocket:socket-stream socket)))
(format t "Connected to daemon.~%")
;; Read HELLO
(let* ((len-buf (make-string 6))
(count (read-sequence len-buf stream)))
(when (= count 6)
(let* ((len (parse-integer len-buf :radix 16))
(msg-buf (make-string len)))
(read-sequence msg-buf stream)
(format t "Received HELLO: ~a~%" msg-buf))))
;; Send HI
(let* ((msg '(:TYPE :EVENT :META (:SOURCE :tui) :PAYLOAD (:SENSOR :user-input :TEXT "hi")))
(framed (frame-message msg)))
(format stream "~a" framed)
(finish-output stream)
(format t "Sent HI.~%"))
;; Wait for response
(loop
(let* ((len-buf (make-string 6))
(count (read-sequence len-buf stream)))
(if (= count 6)
(let* ((len (parse-integer len-buf :radix 16))
(msg-buf (make-string len)))
(read-sequence msg-buf stream)
(format t "Received Response: ~a~%" msg-buf)
(return))
(progn
(format t "Waiting...~%")
(sleep 1)))))
(usocket:socket-close socket))
(error (c) (format t "Error: ~a~%" c))))
(test-hi)
(uiop:quit 0)

89
scripts/test-tui.sh Executable file
View File

@@ -0,0 +1,89 @@
#!/bin/bash
# OpenCortex TUI Automated Test Harness
# Runs the TUI in a tmux pane, sends "hi", captures response.
set -euo pipefail
SESSION="opencortex-tui-test"
TUI_LOG="/tmp/opencortex-tui-test.log"
CAPTURE="/tmp/opencortex-tui-capture.txt"
TIMEOUT_SEC=30
echo "=== OpenCortex TUI Test Harness ==="
echo "Log: $TUI_LOG"
echo "Capture: $CAPTURE"
# Clean up any stale session
tmux kill-session -t "$SESSION" 2>/dev/null || true
# Verify daemon is running
if ! ss -tln | grep -q ':9105'; then
echo "ERROR: Daemon not running on port 9105"
echo "Start it with: cd ~/memex/projects/opencortex && ./opencortex.sh daemon"
exit 1
fi
# Create tmux session with TUI
echo "[1/5] Starting TUI in tmux session '$SESSION'..."
tmux new-session -d -s "$SESSION" \
-e OC_CONFIG_DIR="$HOME/.config/opencortex" \
-e OC_DATA_DIR="$HOME/.local/share/opencortex" \
-e SKILLS_DIR="$HOME/.local/share/opencortex/skills" \
-e TERM="screen-256color" \
"sbcl --non-interactive \
--eval '(load (merge-pathnames \"quicklisp/setup.lisp\" (user-homedir-pathname)))' \
--eval '(push (truename \"$HOME/.local/share/opencortex/\") asdf:*central-registry*)' \
--eval '(ql:quickload :opencortex/tui)' \
--eval '(opencortex.tui:main)' 2>&1 | tee $TUI_LOG"
sleep 3
# Capture initial state
tmux capture-pane -t "$SESSION" -p > "$CAPTURE"
echo "[2/5] Initial TUI state captured ($(wc -l < "$CAPTURE") lines)"
# Send message
echo "[3/5] Sending 'hi' + Enter..."
tmux send-keys -t "$SESSION" "hi" Enter
# Wait for response
echo "[4/5] Waiting up to ${TIMEOUT_SEC}s for response..."
for i in $(seq 1 $TIMEOUT_SEC); do
tmux capture-pane -t "$SESSION" -p > "$CAPTURE"
# Check if daemon response arrived (contains arrow-down marker or actual response text)
if grep -qE "(⬇|Hi|Hello|Neural Cascade)" "$CAPTURE"; then
echo " ✓ Response detected after ${i}s"
break
fi
sleep 1
done
# Final capture
tmux capture-pane -t "$SESSION" -p > "$CAPTURE"
echo "[5/5] Final capture ($(wc -l < "$CAPTURE") lines)"
# Extract and display results
echo ""
echo "=== SCREEN CAPTURE ==="
cat "$CAPTURE"
echo ""
echo "=== TUI LOG (last 20 lines) ==="
tail -20 "$TUI_LOG"
echo ""
# Check for errors
if grep -qE "(TUI Error|Connection lost|ERROR:)" "$TUI_LOG"; then
echo "❌ TEST FAILED: Errors detected in TUI log"
tmux kill-session -t "$SESSION" 2>/dev/null || true
exit 1
fi
if grep -qE "(⬇|Hi|Hello)" "$CAPTURE"; then
echo "✅ TEST PASSED: Response received from daemon"
else
echo "⚠️ TEST INCOMPLETE: No response marker found (daemon may have timed out)"
fi
# Cleanup
tmux kill-session -t "$SESSION" 2>/dev/null || true
echo "Done."

View File

@@ -13,7 +13,7 @@
(fiveam:test test-tui-connection-drop
"Tier 2 Chaos: Verify that handle-return degrades gracefully when the daemon connection is lost."
(let ((opencortex.tui::*incoming-msgs* nil)
(opencortex.tui::*input-buffer* (make-array 5 :element-type 'char :initial-contents "hello" :fill-pointer 5 :adjustable t))
(opencortex.tui::*input-buffer* (make-array 5 :element-type 'character :initial-contents "hello" :fill-pointer 5 :adjustable t))
;; Create a closed stream to simulate connection drop
(mock-stream (make-string-output-stream)))
(close mock-stream)