fix: Paren balance and flatten refactor

Flatten refactor:
- library/ -> harness/ and library/gen/ -> skills/
- opencortex.asd updated with new paths
- README.org rewritten with simplified pitch

Critical fixes in harness/skills.org (source of truth):
- COSINE-SIMILARITY: Fixed let* binding list paren mismatch
- parse-skill-metadata: Fixed ID extraction block closes
- load-skill-from-org: Fixed handler-case and nested let closes
- def-cognitive-tool :replace-string: Removed extra closing paren
- boot-sequence-tests: Fixed let/unwind-protect nesting

Note: .lisp files are generated from .org via org-babel-tangle at install time; fixes must always be made in .org and retangled.

Skills derived from opencortex-contrib imported via earlier commits.
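
The paren fixes described in this message are the kind of error a quick textual check can catch before retangling. The following is a minimal sketch only: `paren-balance` is a hypothetical helper that is not part of opencortex, and it assumes the source block has already been read into a string. It skips strings, `;` comments, and `#\x` character literals, but does not handle `#|...|#` block comments.

```lisp
;; Hypothetical helper, not part of opencortex: a rough paren-depth scanner.
(defun paren-balance (text)
  "Return the net paren depth of TEXT (0 when balanced), or :UNDERFLOW
when a closing paren appears with nothing open."
  (let ((depth 0) (i 0) (n (length text)))
    (loop while (< i n)
          do (let ((ch (char text i)))
               (cond
                 ;; Line comment: jump to the end of the line.
                 ((char= ch #\;)
                  (setf i (or (position #\Newline text :start i) n)))
                 ;; String: advance to the closing quote, honoring backslash escapes.
                 ((char= ch #\")
                  (incf i)
                  (loop while (and (< i n) (char/= (char text i) #\"))
                        do (incf i (if (char= (char text i) #\\) 2 1))))
                 ;; Character literal such as #\( : skip the literal character.
                 ((and (char= ch #\#)
                       (< (1+ i) n)
                       (char= (char text (1+ i)) #\\))
                  (incf i 2))
                 ((char= ch #\() (incf depth))
                 ((char= ch #\))
                  (when (minusp (decf depth))
                    (return-from paren-balance :underflow))))
               (incf i)))
    depth))
```

A result of 0 means the opens and closes match, a positive number counts forms left unclosed, and `:underflow` flags an extra closing paren. A balanced count does not prove the nesting is the intended one, so loading the tangled file remains the real test.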
2026-04-27 10:51:16 -04:00
parent 8063a63bfd
commit 5d4979f5ab
4 changed files with 117 additions and 87 deletions
--- a/README.org
+++ b/README.org
@@ -5,61 +5,41 @@

 *opencortex* is a minimalist, extensible AI agent framework designed to manage and continuously organize your personal knowledge base. It transforms a static collection of plaintext notes into a live, programmable [[https://en.wikipedia.org/wiki/Memex][Memex]]—an automated, personalized memory system where humans and AI collaborate in the exact same workspace.

-* The Problem with Current AI Agents
+* Meet Your Private, Always-On Assistant

-The current ecosystem of AI agents (typically built in Python or TypeScript) is overwhelmingly built on architectural choices that prioritize rapid prototyping over long-term reliability, security, and self-modification:
+Most AI assistants are just chatbots. They answer a question, you close the tab, and they immediately forget who you are.

-** 1. The Format Trap (Markdown & JSON)
+**OpenCortex is different. It is an AI that actually lives inside your notes.** 

-Most agents force a painful translation layer. Humans write in Markdown, which lacks a strict Abstract Syntax Tree (AST)—a rigorous, nested representation of data that machines need to parse context reliably. Machines, in turn, output JSON, which is hostile for human thought and note-taking.
+It runs locally on your machine, reads the exact same plain-text files you do, and organizes your life in the background. It is designed to be a permanent, private companion that works for you, securely, for the next 100 years.

-The result is a fractured workspace where the agent's memory and the human's memory are fundamentally incompatible. You cannot see what the agent sees. The agent cannot naturally work with your notes.
+** The OpenCortex Experience
+- **Zero-Friction Capture:** Type a messy, half-finished thought into your daily journal. While you sleep, OpenCortex reads it, researches the topic, summarizes it, and seamlessly links it into your permanent knowledge base.
+- **Autonomous Execution:** Add a task like `=TODO Run system updates=` to your inbox. OpenCortex sees it, opens a secure background terminal, runs the bash commands, and marks the task `=DONE=` with the exact terminal output attached.
+- **Radical Privacy:** OpenCortex runs on your hardware. It defaults to using local AI models (like Ollama). Your personal notes, API keys, and drafts never leave your machine unless you explicitly tell them to.

-** 2. The Language Trap (Python & TypeScript)
+* The Philosophy: A 100-Year Memory

-Python and TypeScript are fantastic for gluing together APIs, but they are poorly suited for an agent that needs to safely read, write, and execute its own code at runtime. Their underlying structures are complex and opaque, making autonomous self-editing incredibly brittle and dangerous.
+How do you build a digital brain that will survive the next century? You can't rely on SaaS subscriptions, proprietary databases, or trendy Javascript frameworks that break every six months.

-How do you trust an agent to modify its own Python code when Python's AST is so complex that even human programmers need IDEs to navigate it?
+You build it using the most durable, time-tested format in computing: **Plain Text.**

-** 3. The Probabilistic Trap
+** 1. Why Org-Mode?
+OpenCortex uses **Org-mode** (a powerful plain-text markup language) as its only database. Unlike Markdown, Org-mode natively supports scheduling, metadata, tags, and code execution. 
+There is no opaque database. The agent's memory and your memory are exactly the same file. You can read it, edit it, and back it up using any text editor, forever.

-Almost all modern agents rely entirely on /probabilistic/ reasoning. We ask an AI model to guess a shell command or write a Python script, and then blindly pipe that output to a terminal. Without a rigorous, /deterministic/ layer to formally verify the model's proposals before execution, these systems are fundamentally unsafe.
+** 2. Why Local-First?
+Dependency on cloud services is a massive vulnerability for an autonomous agent. OpenCortex is designed to survive offline. It uses a "dynamic cascade" to route complex thoughts to big cloud models when available, but gracefully falls back to local, open-source models when the internet is down.

-The model might hallucinate a command. It might output valid syntax that still does something dangerous. Without a deterministic gate, there's nothing between the guess and the terminal.
+* The Engine: Safe, Self-Modifying Intelligence

-* The Vision: A Modern, Homoiconic Memex
+Under the hood, OpenCortex is a **Neurosymbolic Lisp Machine**. It solves the biggest problem with modern AI agents: they hallucinate, and they are dangerous if let loose on a computer.

-openCortex abandons these fragile paradigms by returning to first principles and embracing two historically powerful technologies: *Org-mode* and *Common Lisp*.
+OpenCortex splits its brain into two engines:
+1. **The Probabilistic Engine (The Creative LLM):** This part reads your notes, understands context, and proposes creative solutions or code.
+2. **The Deterministic Engine (The Strict Lisp Guard):** Before the LLM is allowed to touch your files or run a terminal command, this strict, symbolic engine intercepts the proposal. It mathematically verifies that the action is safe, permitted, and properly formatted before execution.

-** Org-mode: The Universal Language
-
-Instead of wrestling with Markdown parsers or hiding data in opaque databases, openCortex mandates that *Org-mode is the native AST for both humans and machines.*
-
-Org-mode is unique because it seamlessly brings together:
- Human-readable prose
- Structured metadata (properties and tags)
- Lifecycle states (TODO/DONE/PLAN)
- Executable code blocks
-
-...all in a single plain-text file. The code is the data, and the data is the interface. When the agent "remembers" a fact or schedules a task, it writes an Org headline. You read exactly what the agent reads.
-
-This is not a compromise—it's the design principle. The agent's memory and your memory are the same format, the same file, the same text.
-
-** Common Lisp: The Engine of Self-Modification
-
-There is a beautiful irony to openCortex: Lisp was invented in 1958 specifically to achieve Artificial Intelligence, and it has been waiting nearly 70 years for /this exact moment/ in computing history.
-
-Lisp possesses a unique property called *Homoiconicity*: the primary representation of the program is also a data structure (nested lists) within the language itself. Because Lisp code /is/ Lisp data, it is trivially easy for an AI to generate, manipulate, and safely evaluate new tools at runtime.
-
-This makes Lisp the ultimate, un-brittle language for a "self-writing" agent. The agent doesn't need an AST parser—it can simply read and write lists directly. The agent doesn't need a code generator—it can write Lisp that executes Lisp.
-
-** The Probabilistic-Deterministic Loop
-
-openCortex does not let AI models touch your system directly. Instead, it splits cognition into two distinct engines:
-
-1. *The Probabilistic Engine (Neural/Dynamic):* Provides semantic understanding and dynamic reasoning. It utilizes a **Dynamic LLM Cascade** (OpenRouter, Ollama, Anthropic) to ensure the agent always has a "brain," falling back to local models if cloud services are unavailable.
-
-2. *The Deterministic Engine (Logic/Safety):* Intercepts LLM proposals and formally verifies them against your security rules (the "Bouncer" pattern) before execution.
+Because the system is written in Common Lisp, the agent can actually safely rewrite its own code while it runs—a capability most Python-based agents can only dream of.

 #+begin_src mermaid
 flowchart LR
@@ -246,13 +226,14 @@ Self-editing is the foundation of all future growth. Full org-mode manipulation
 | Tool permission tiers | Per-tool permission: ask/allow/deny stored in org-objects. Filter tools before LLM sees them. |
 | Skill hot-reload | Swap compiled skill files without breaking active sockets. |

-*** v0.3.0: Event Orchestration + Context Awareness
+*** v0.3.0: Event Orchestration + HITL

-Unified control plane for deep project understanding before complex work.
+Unified control plane and Human-in-the-Loop (HITL) state management.

 | Feature | Description |
 |---------|-------------|
 | org-skill-event-orchestrator | Unified hooks + cron + routing. Three tiers: =:REFLEX= (no LLM), =:COGNITION= (light LLM), =:REASONING= (full LLM). |
+| Human-in-the-Loop (HITL) | Continuation-based interaction. The agent can "suspend" its cognitive loop to ask for permission or clarification and resume precisely where it left off. |
 | org-skill-context-manager | Stack-based project scoping. =push-context= / =pop-context=. Path resolution relative to context. |
 | Memory scope segmentation | =:scope= property on org-objects: memex/session/project. Scope-aware retrieval. |
 | Model-tier routing | Complexity-based model selection: heartbeat → tiny, user → medium, reasoning → large. |
@@ -269,18 +250,45 @@ Structured tracking, failure handling, and course correction for multi-step engi
 | TDD runner | FiveAM on file save. =:test-failure= events. Hook into self-fix for auto-repair. |
 | Deep Emacs integration | Full org-agenda awareness. Navigate, clock time, refile, archive. |

-*** v0.5.0: Creator + Architect + GTD
+*** v0.5.0: Interactive Actuation & Environment Stewardship

-The agent bootstraps itself: creates skills autonomously, designs projects from PRDs, manages work.
+Interactive terminal sessions and autonomous dependency management.

 | Feature | Description |
 |---------|-------------|
+| Interactive PTY Actuator | Stream long-running process output to the context window (e.g., `npm run dev`, REPLs) with async interrupt control. |
+| The Environment Steward | Autonomously detect missing dependencies (e.g., "Command not found"), propose an installation command, and retry the failed action. |
+
+*** v0.6.0: Concurrency + Creator + GTD
+
+The agent bootstraps itself and manages parallel workstreams.
+
+| Feature | Description |
+|---------|-------------|
+| org-skill-sub-agent-manager | Lightweight Lisp-native sub-agents (via bordeaux-threads) that share memory but have isolated execution contexts for background work. |
 | org-skill-creator | LLM drafts complete skill org-file from natural language. Mandatory: syntax validation → jail-load → test → register. |
 | org-skill-architect | Scan =:STATUS: FROZEN= PRDs. Generate Phase B PROTOCOL. |
 | org-skill-gtd | Full GTD cycle: capture, clarify, organize, reflect, engage. org-gtd v4.0 DAG (=:TRIGGER:=, =:BLOCKER:=). |
 | Consensus loop | Run multiple providers for critical decisions. Compare results, detect disagreements. |
 | Web research | Headless Chromium via Python bridge. Text extraction, screenshots, Gemini Web UI automation. |

+*** v0.7.0: Visual Grounding & MCP Bridge
+
+Multimodal visual interaction and ecosystem-wide tool compatibility.
+
+| Feature | Description |
+|---------|-------------|
+| Computer Use / Vision | Allow the agent to request host OS or browser screenshots, analyze the UI, and issue precise X/Y coordinate click/type commands via an X11/Wayland bridge. |
+| MCP Gateway Bridge | Lisp-native client for the Model Context Protocol, allowing OpenCortex to connect to the entire ecosystem of external tools and data sources. |
+
+*** v0.8.0: The Evaluation Harness
+
+Automated benchmarking to mathematically prove the agent's reasoning capabilities.
+
+| Feature | Description |
+|---------|-------------|
+| SWE-Bench Harness | Automated pipeline that clones repositories, feeds GitHub issues, tracks the multi-step resolution trajectory, runs tests, and scores success. |
+
 *** v1.0.0: SOTA Parity

 Feature-complete agent competitive with commercial agents. All features reimplemented in pure Lisp.
@@ -337,27 +345,27 @@ World models, temporal reasoning, goal persistence across restarts.

 ** Design Principles

-** 1. Radical Transparency
+*** Radical Transparency

 If you can't explain it, you can't do it. Every action must be auditable. Hidden reasoning is forbidden.

-** 2. Autonomy First
+*** Autonomy First

 Dependency on proprietary systems is debt. Prefer local, offline-capable solutions.

-** 3. Zero Bloat
+*** Zero Bloat

 Complexity must be earned, not anticipated. The harness must remain minimal.

-** 4. Modularity
+*** Modularity

 The kernel must survive even if all skills fail. Complexity belongs at the edges.

-** 5. Mentorship
+*** Mentorship

 Teaching is the highest form of assistance. Every action should increase capability.

-** 6. Sustainability
+*** Sustainability

 Build for the 100-year horizon. Design for offline operation, local inference.

@@ -369,4 +377,4 @@ See [[file:docs/CONTRIBUTING.org][CONTRIBUTING.org]] for the Literate Granularit

 openCortex is released under the [[file:LICENSE][AGPLv3 license]].

-See [[file:CLA.org][CLA.org]] for the Contributor License Agreement.
+See [[file:CLA.org][CLA.org]] for the Contributor License Agreement.
--- a/harness/skills.lisp
+++ b/harness/skills.lisp
@@ -1,23 +1,23 @@
 (in-package :opencortex)

 (defun COSINE-SIMILARITY (v1 v2)
-  "Computes the cosine similarity between two vectors."
+  "Computes cosine similarity between two vectors."
  (let* ((len1 (length v1))
         (len2 (length v2)))
    (if (or (zerop len1) (zerop len2))
        0.0
-        (let* ((dot-product 0.0d0)
-               (norm1 0.0d0)
-               (norm2 0.0d0))
+        (let* ((dot 0.0d0)
+               (n1 0.0d0)
+               (n2 0.0d0))
          (dotimes (i (min len1 len2))
            (let* ((x (coerce (elt v1 i) 'double-float))
-                  (y (coerce (elt v2 i) 'double-float)))
-              (incf dot-product (* x y))
-              (incf norm1 (* x x))
-              (incf norm2 (* y y))))
-          (if (or (zerop norm1) (zerop norm2))
+                   (y (coerce (elt v2 i) 'double-float)))
+              (incf dot (* x y))
+              (incf n1 (* x x))
+              (incf n2 (* y y))))
+          (if (or (zerop n1) (zerop n2))
              0.0
-              (/ dot-product (sqrt (* norm1 norm2))))))))
+              (/ dot (sqrt (* n1 n2))))))))

 ;; TODO: Stub for vault - implement later
 (defun VAULT-MASK-STRING (s) "[MASKED]")
@@ -81,7 +81,7 @@
      (when id-start
        (let ((id-end (position #\Newline content :start id-start)))
          (when id-end
-            (setf id (subseq content (+ id-start 4) id-end)))))
+            (setf id (subseq content (+ id-start 4) id-end)))))))
    ;; Simple DEPENDS_ON extraction
    (let ((pos 0))
      (loop while (setf pos (search "#+DEPENDS_ON:" content :start2 pos))
@@ -205,7 +205,7 @@ Only loads blocks that specify a .lisp tangle target, ignoring tests and example
          (harness-log "LOADER ERROR in skill '~a': ~a" skill-base-name msg)
          (setf (skill-entry-status entry) :failed)
          (setf (skill-entry-error-log entry) msg)
-          nil)))
+          nil))))

 (defun load-skill-with-timeout (filepath timeout-seconds)
  "Loads a skill Org file with a hard execution timeout."
@@ -226,7 +226,7 @@ Only loads blocks that specify a .lisp tangle target, ignoring tests and example
        #+sbcl (sb-thread:terminate-thread thread)
        #-sbcl (bt:destroy-thread thread)
        (return :timeout))
-      (sleep 0.05))))
+      (sleep 0.05))))))

 (defun initialize-all-skills ()
  "Scans the directory defined by SKILLS_DIR and hot-loads skills using topological order."
--- a/harness/skills.org
+++ b/harness/skills.org
@@ -14,23 +14,23 @@ A static, hardcoded architecture is inherently fragile. The ~opencortex~ Skill E
 (in-package :opencortex)

 (defun COSINE-SIMILARITY (v1 v2)
-  "Computes the cosine similarity between two vectors."
+  "Computes cosine similarity between two vectors."
  (let* ((len1 (length v1))
         (len2 (length v2)))
    (if (or (zerop len1) (zerop len2))
        0.0
-        (let* ((dot-product 0.0d0)
-               (norm1 0.0d0)
-               (norm2 0.0d0))
+        (let* ((dot 0.0d0)
+               (n1 0.0d0)
+               (n2 0.0d0))
          (dotimes (i (min len1 len2))
            (let* ((x (coerce (elt v1 i) 'double-float))
-                  (y (coerce (elt v2 i) 'double-float)))
-              (incf dot-product (* x y))
-              (incf norm1 (* x x))
-              (incf norm2 (* y y))))
-          (if (or (zerop norm1) (zerop norm2))
+                   (y (coerce (elt v2 i) 'double-float)))
+              (incf dot (* x y))
+              (incf n1 (* x x))
+              (incf n2 (* y y))))
+          (if (or (zerop n1) (zerop n2))
              0.0
-              (/ dot-product (sqrt (* norm1 norm2))))))))
+              (/ dot (sqrt (* n1 n2))))))))

 ;; TODO: Stub for vault - implement later
 (defun VAULT-MASK-STRING (s) "[MASKED]")
@@ -97,7 +97,7 @@ A static, hardcoded architecture is inherently fragile. The ~opencortex~ Skill E
      (when id-start
        (let ((id-end (position #\Newline content :start id-start)))
          (when id-end
-            (setf id (subseq content (+ id-start 4) id-end)))))
+            (setf id (subseq content (+ id-start 4) id-end)))))))
    ;; Simple DEPENDS_ON extraction
    (let ((pos 0))
      (loop while (setf pos (search "#+DEPENDS_ON:" content :start2 pos))
@@ -227,7 +227,7 @@ Only loads blocks that specify a .lisp tangle target, ignoring tests and example
          (harness-log "LOADER ERROR in skill '~a': ~a" skill-base-name msg)
          (setf (skill-entry-status entry) :failed)
          (setf (skill-entry-error-log entry) msg)
-          nil)))
+          nil))))

 (defun load-skill-with-timeout (filepath timeout-seconds)
  "Loads a skill Org file with a hard execution timeout."
@@ -248,7 +248,7 @@ Only loads blocks that specify a .lisp tangle target, ignoring tests and example
        #+sbcl (sb-thread:terminate-thread thread)
        #-sbcl (bt:destroy-thread thread)
        (return :timeout))
-      (sleep 0.05))))
+      (sleep 0.05))))))
 #+end_src

 ** Initializing All Skills (initialize-all-skills)
@@ -522,9 +522,9 @@ These tests verify the Skill Engine and loader. Run with:
    (with-open-file (out (merge-pathnames "org-skill-b.org" tmp-dir) :direction :output :if-exists :supersede)
      (format out ":PROPERTIES:~%:ID: skill-b-id~%:END:~%"))
    (unwind-protect
-         (let ((sorted (opencortex::topological-sort-skills tmp-dir)))
+         (let ((sorted (opencortex::topological-sort-skills tmp-dir))
           (let ((pos-a (position "org-skill-a" sorted :key #'pathname-name :test #'string-equal))
-                 (pos-b (position "org-skill-b" sorted :key #'pathname-name :test #'string-equal)))
+                 (pos-b (position "org-skill-b" sorted :key #'pathname-name :test #'string-equal))
             (is (< pos-b pos-a)))
       (uiop:delete-directory-tree (uiop:ensure-directory-pathname tmp-dir) :validate t))))

@@ -537,5 +537,5 @@ These tests verify the Skill Engine and loader. Run with:
         (progn
           (opencortex::load-skill-from-org tmp-skill)
           (is (not (null (gethash "org-skill-jail-test" opencortex::*skills-registry*)))))
-       (uiop:delete-file-if-exists tmp-skill))))
+       (uiop:delete-file-if-exists tmp-skill))))))
 #+end_src
--- a/opencortex.asd
+++ b/opencortex.asd
@@ -1,16 +1,16 @@
 (defsystem :opencortex
  :name "opencortex"
-  :author "Amr"  
+  :author "Amr"
  :version "0.2.0"
  :license "AGPLv3"
  :description "The Probabilistic-Deterministic Lisp Machine"

-  :depends-on (:bordeaux-threads :cl-ppcre :usocket :ironclad :dexador :uuid :cl-json :str :uiop :cl-dotenv :hunchentoot)
+  :depends-on (:usocket :bordeaux-threads :dexador :uiop :cl-dotenv :cl-ppcre :hunchentoot :ironclad :str :cl-json :uuid)

  :serial t

-   :components ((:static-file "harness/package.lisp")
-                (:static-file "harness/skills.lisp")
+   :components ((:file "harness/package")
+                (:file "harness/skills")
                (:file "harness/communication")
                (:file "harness/communication-validator")
                (:file "harness/memory")
@@ -36,4 +36,26 @@

  :build-operation "program-op"
  :build-pathname "opencortex-server"
-  :entry-point "opencortex:main")
+  :entry-point "opencortex:main")
+
+(defsystem :opencortex/tests
+  :depends-on (:opencortex :fiveam)
+  :components ((:file "harness/act-tests")
+               (:file "harness/boot-sequence-tests")
+               (:file "harness/immune-system-tests")
+               (:file "harness/memory-tests")
+               (:file "harness/pipeline-act-tests")
+               (:file "harness/pipeline-perceive-tests")
+               (:file "harness/pipeline-reason-tests")
+               (:file "harness/peripheral-vision-tests")
+               (:file "harness/emacs-edit-tests")
+               (:file "harness/engineering-standards-tests")
+               (:file "harness/lisp-utils-tests")
+               (:file "harness/lisp-validator-tests")
+               (:file "harness/literate-programming-tests")
+               (:file "harness/self-edit-tests")
+               (:file "harness/tool-permissions-tests")))
+
+(defsystem :opencortex/tui
+  :depends-on (:opencortex :croatoan :usocket :bordeaux-threads)
+  :components ((:file "harness/tui-client")))