diff --git a/harness/memory.org b/harness/memory.org
index f6d5177..484858f 100644
--- a/harness/memory.org
+++ b/harness/memory.org
@@ -1,4 +1,4 @@
-#+PROPERTY: header-args:lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")
+#+PROPERTY: header-args:lisp :tangle memory.lisp
 #+TITLE: The System Memory (memory.lisp)
 #+AUTHOR: Amr
 #+FILETAGS: :harness:memory:
@@ -11,57 +11,50 @@ Yes, the Memory module is the cognitive bedrock of the opencortex. It is not a d
 
 Traditional architectures rely on external databases (SQLite, Vector DBs) which introduce I/O latency and structural impedance. The opencortex architecture chooses a different path: the **Single Address Space**. By treating the entire knowledge base as a graph of Lisp pointers, we achieve microsecond recollection and total structural transparency.
 
--- **Pointer-Based Reasoning:** By loading the entire knowledge graph into a live Common Lisp hash table, we achieve microsecond recollection. The harness doesn't "search a file"; it traverses a memory pointer.
--- **Memory Imaging:** The ability to snapshot the Lisp image allows the agent to resume its entire cognitive state instantly, solving the "Cold Start" problem.
--- **Merkle-Tree Integrity:** Every node in the Memory is cryptographically hashed. By recursively hashing content and children, the root hash provides a single, immutable fingerprint of the entire system state.
-
 ** System Architecture
-#+begin_src mermaid
-flowchart TD
-    subgraph LispMachine[Lisp Machine]
-        H[Harness Pipeline] --> OS[(Memory)]
-        S1[Skill: Architect] --> OS
-        S2[Skill: Analyst] --> OS
-        S3[Skill: GTD] --> OS
-        H -- Pointers --> S1
-        H -- Pointers --> S2
-    end
-    subgraph IPCSlow[External Layer]
-        E[Emacs / Actuators] -. communication protocol .-> H
-    end
-#+end_src
-
 ** Package Context
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (in-package :opencortex)
 #+end_src
 
 ** The Object Repository
 The `*memory*` is the global hash table that holds every Org element by its unique ID. This is the "live RAM" of the agent's memory.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defvar *memory* (make-hash-table :test 'equal))
 
 (defvar *history-store* (make-hash-table :test 'equal)
-  "Immutable Merkle-Tree versioning store mapping hashes to objects.
+  "Immutable Merkle-Tree versioning store mapping hashes to objects.")
 #+end_src
 
 ** The Data Structure (org-object)
 Every element in the Memex (headlines, paragraphs, etc.) is represented by an `org-object` structure. It contains both semantic metadata (attributes, content) and structural metadata (parent/child pointers, Merkle hashes).
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defstruct org-object id type attributes content vector parent-id children version last-sync hash)
 
 ;; Enable serialization via make-load-form (standard CL)
 (defmethod make-load-form ((obj org-object) &optional env)
   (make-load-form-saving-slots obj :environment env))
+
+(defun copy-org-object (obj)
+  "Creates a shallow copy of an org-object structure."
+  (make-org-object :id (org-object-id obj)
+                   :type (org-object-type obj)
+                   :attributes (copy-list (org-object-attributes obj))
+                   :content (org-object-content obj)
+                   :vector (org-object-vector obj)
+                   :parent-id (org-object-parent-id obj)
+                   :children (copy-list (org-object-children obj))
+                   :version (org-object-version obj)
+                   :last-sync (org-object-last-sync obj)
+                   :hash (org-object-hash obj)))
 #+end_src
 
 ** Merkle Tree Integrity (compute-merkle-hash)
 The `compute-merkle-hash` function ensures the cryptographic integrity of the knowledge graph. A node's hash depends on its own properties and the hashes of all its children. This creates a recursive fingerprint where any change to a single note propagates up to the root hash.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defun compute-merkle-hash (id type attributes content child-hashes)
   "Computes a SHA-256 Merkle hash for a node based on its core properties and children's hashes."
   (let* ((alist (loop for (k v) on attributes by #'cddr collect (cons k v)))
@@ -69,7 +62,7 @@ The `compute-merkle-hash` function ensures the cryptographic integrity of the kn
          (attr-string (format nil "~s" sorted-alist))
          (children-string (format nil "~{~a~}" child-hashes))
          (data-string (format nil "ID:~a|TYPE:~s|ATTRS:~a|CONTENT:~a|CHILDREN:~a"
-                              id type attr-string (or content " children-string))
+                              id type attr-string (or content "") children-string))
          (digester (ironclad:make-digest :sha256)))
     (ironclad:update-digest digester (ironclad:ascii-string-to-byte-array data-string))
     (ironclad:byte-array-to-hex-string (ironclad:produce-digest digester))))
@@ -78,7 +71,7 @@ The `compute-merkle-hash` function ensures the cryptographic integrity of the kn
 ** Ingesting the AST (ingest-ast)
 The `ingest-ast` function is the primary bridge between the external world (Emacs/JSON) and the internal Lisp machine. It recursively parses an Org-mode Abstract Syntax Tree (AST) into `org-object` structures and registers them in the store.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defun ingest-ast (ast &optional parent-id)
   "Parses an Org AST into the recursive Lisp Memory with Merkle hashing."
   (let* ((type (getf ast :type))
@@ -86,17 +79,16 @@ The `ingest-ast` function is the primary bridge between the external world (Emac
          (id (or (getf props :ID) (format nil "temp-~a" (get-universal-time))))
          (contents (getf ast :contents))
          (raw-content (when (eq type :HEADLINE)
-                        (format nil "~a~%~a" (getf props :TITLE) (or (cl:getf ast :raw-content) ))
-         (should-embed (and raw-content (equal (getf props :EMBED) "t))
+                        (format nil "~a~%~a" (getf props :TITLE) (or (getf ast :raw-content) ""))))
+         (should-embed (and raw-content (equal (getf props :EMBED) "t")))
          (child-ids nil)
          (child-hashes nil))
     (dolist (child contents)
       (when (listp child)
         (let ((child-id (ingest-ast child id)))
           (push child-id child-ids)
-          (let ((child-id-val child-id))
-            (let ((child-obj (lookup-object child-id-val)))
-              (when child-obj (push (org-object-hash child-obj) child-hashes)))))))
+          (let ((child-obj (gethash child-id *memory*)))
+            (when child-obj (push (org-object-hash child-obj) child-hashes))))))
     (setf child-ids (nreverse child-ids))
     (setf child-hashes (nreverse child-hashes))
     (let* ((hash (compute-merkle-hash id type props raw-content child-hashes))
@@ -104,7 +96,6 @@ The `ingest-ast` function is the primary bridge between the external world (Emac
            (obj (or existing-obj
                     (make-org-object :id id :type type
                                      :attributes props :content raw-content
-                                     :vector (when should-embed (get-embedding raw-content))
                                      :parent-id parent-id :children child-ids
                                      :version (get-universal-time)
                                      :last-sync (get-universal-time) :hash hash))))
@@ -117,7 +108,7 @@ The `ingest-ast` function is the primary bridge between the external world (Emac
 ** Memory Snapshots (snapshot-memory)
 Because objects are stored immutably in the `*history-store*`, a snapshot is a lightweight shallow copy of the active `*memory*` pointers. The system maintains a rolling buffer of 20 snapshots, allowing for near-instant, zero-cost rollback.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defvar *object-store-snapshots* nil)
 
 (defun copy-hash-table (hash-table)
@@ -138,13 +129,8 @@ Because objects are stored immutably in the `*history-store*`, a snapshot is a l
     (push (list :timestamp (get-universal-time) :data snapshot) *object-store-snapshots*)
     (when (> (length *object-store-snapshots*) 20)
       (setf *object-store-snapshots* (subseq *object-store-snapshots* 0 20)))
-    (harness-log "MEMORY - CoW Memory snapshot created.))
-#+end_src
+    (harness-log "MEMORY - CoW Memory snapshot created.")))
 
-** Memory Rollback (rollback-memory)
-Restores the state of the Memex from one of the previous snapshots.
-
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
 (defun rollback-memory (&optional (index 0))
   "Restores the Memory to a previously captured snapshot using immutable history pointers."
   (let ((snapshot (nth index *object-store-snapshots*)))
@@ -157,25 +143,25 @@ Restores the state of the Memex from one of the previous snapshots.
 ** Disk Persistence (save-memory / load-memory)
 Essential for surviving crashes. Saves the in-memory hash tables to disk and loads them back on restart.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defvar *memory-snapshot-path* nil
-  "Path to the memory snapshot file. Set from MEMORY_SNAPSHOT_PATH env or default.
+  "Path to the memory snapshot file. Set from MEMORY_SNAPSHOT_PATH env or default.")
 
 (defun ensure-memory-snapshot-path ()
   "Initializes the snapshot path from environment or default location."
   (or *memory-snapshot-path*
-      (let ((env-path (getenv "MEMORY_SNAPSHOT_PATH))
+      (let ((env-path (uiop:getenv "MEMORY_SNAPSHOT_PATH")))
         (setf *memory-snapshot-path*
               (or env-path
-                  (uiop:merge-pathnames* "memory.snap" (user-homedir-pathname)))))))
+                  (namestring (uiop:merge-pathnames* "memory.snap" (user-homedir-pathname))))))))
 
 (defun save-memory-to-disk ()
   "Serializes *memory* and *history-store* to disk for crash recovery.
 Converts hash tables to alists for proper serialization."
   (let ((path (ensure-memory-snapshot-path)))
     (with-open-file (stream path :direction :output :if-exists :supersede :if-does-not-exist :create)
-      (format stream ";; OpenCortex Memory Snapshot~%
-      (format stream ";; Created: ~a~%~%" (format nil "~a" (get-universal-time)))
+      (format stream ";; OpenCortex Memory Snapshot~%")
+      (format stream ";; Created: ~a~%~%" (get-universal-time))
       (let ((memory-alist nil)
             (history-alist nil))
         (maphash (lambda (k v) (push (cons k v) memory-alist)) *memory*)
@@ -210,13 +196,13 @@ Reconstitutes alists into hash tables."
 ** Semantic Search (get-embedding, semantic-search)
 Support for vector embeddings via Ollama and semantic search with cosine similarity.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defvar *embedding-cache* (make-hash-table :test 'equal)
-  "Cache for embeddings to avoid redundant API calls.
+  "Cache for embeddings to avoid redundant API calls.")
 
 (defun get-embedding (text)
   "Generates a vector embedding for the given text via Ollama. Returns nil on failure."
-  (when (or (null text) (string= text
+  (when (or (null text) (string= text ""))
     (return-from get-embedding nil))
   (let ((cached (gethash text *embedding-cache*)))
     (when cached (return-from get-embedding cached)))
@@ -255,47 +241,14 @@ Returns up to LIMIT objects with similarity >= MIN-SIMILARITY, sorted by similar
                    (push (list :id id :object obj :similarity sim) results))))))
              *memory*)
     (setf results (sort results #'> :key (lambda (r) (getf r :similarity))))
-    (subseq results 0 (min limit (length results)))))
-#+end_src
-
-** Cognitive Tool: Semantic Search
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
-(def-cognitive-tool :semantic-search
-  "Searches memory for objects semantically similar to a query."
-  ((:query :type :string :description "The search query.
-   (:limit :type :integer :description "Maximum results to return." :default 10)
-   (:min-similarity :type :number :description "Minimum similarity threshold (0-1)." :default 0.5))
-  :body (lambda (args)
-          (semantic-search (getf args :query)
-                           :limit (or (getf args :limit) 10)
-                           :min-similarity (or (getf args :min-similarity) 0.5))))
-#+end_src
-
-** Cognitive Tool: Generate Embeddings
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
-(def-cognitive-tool :generate-embeddings
-  "Generates vector embeddings for given text via the configured embedding backend (Ollama)."
-  ((:texts :type :list :description "List of text strings to embed.)
-  :body (lambda (args)
-          (let ((texts (getf args :texts)))
-            (if (not (and texts (listp texts)))
-                (list :status :error :message ":texts must be a list of strings.
-                (let ((results nil) (errors nil))
-                  (dolist (text texts)
-                    (let ((vec (get-embedding text)))
-                      (if vec
-                          (push (list :text text :vector vec) results)
-                          (push text errors))))
-                  (list :status (if errors :partial :success)
-                        :embeddings (nreverse results)
-                        :failed (when errors (nreverse errors))
-                        :count (length results)))))))
+    (let ((n (min limit (length results))))
+      (subseq results 0 n))))
 #+end_src
 
 ** Lookup Utilities
 Basic functions for retrieving objects by ID or type.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defun org-id-new ()
   "Generates a new UUID string for Org-mode identification."
   (string-downcase (format nil "~a" (uuid:make-v4-uuid))))
@@ -325,7 +278,7 @@ Basic functions for retrieving objects by ID or type.
 ** Structural Helpers
 Utility functions for AST traversal and path resolution.
 
-#+begin_src lisp :tangle (concat (identity (getenv "INSTALL_DIR")) "/harness/memory.lisp")" )
+#+begin_src lisp
 (defun find-headline-missing-id (ast)
   "Traverses an AST to find headlines that lack an :ID: property."
   (when (listp ast)
@@ -340,7 +293,7 @@ Utility functions for AST traversal and path resolution.
 
 * Test Suite
 
-#+begin_src lisp :tangle memory-tests.lisp" (concat (concat (or (getenv "INSTALL_DIR ". "/harness "/tests)
+#+begin_src lisp :tangle tests/memory-tests.lisp
 (defpackage :opencortex-memory-tests
   (:use :cl :fiveam :opencortex)
   (:export #:memory-suite))
@@ -348,13 +301,13 @@ Utility functions for AST traversal and path resolution.
 (in-package :opencortex-memory-tests)
 
 (def-suite memory-suite
-  :description "Tests for the Merkle-Tree Memory
+  :description "Tests for the Merkle-Tree Memory")
 
 (in-suite memory-suite)
 
 (test merkle-hash-consistency
   "Verify identical ASTs produce identical Merkle hashes."
-  (let* ((ast1 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1 :contents nil)))
+  (let* ((ast1 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1") :contents nil)))
     (clrhash *memory*)
     (let ((id1 (ingest-ast ast1)))
       (let ((hash1 (org-object-hash (lookup-object id1))))
@@ -367,14 +320,14 @@ Utility functions for AST traversal and path resolution.
   "Verify that *history-store* retains old versions."
   (clrhash *memory*)
   (clrhash *history-store*)
-  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 1 :contents nil))
+  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 1") :contents nil))
          (id-v1 (ingest-ast ast-v1))
          (obj-v1 (lookup-object id-v1))
          (hash-v1 (org-object-hash obj-v1)))
-    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 2 :contents nil))
+    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 2") :contents nil))
            (id-v2 (ingest-ast ast-v2))
            (hash-v2 (org-object-hash (lookup-object id-v2))))
-      (is (equal (org-object-hash (lookup-object "test-node) hash-v2))
+      (is (equal (org-object-hash (lookup-object "test-node")) hash-v2))
       (is (not (null (gethash hash-v1 *history-store*))))
       (is (not (null (gethash hash-v2 *history-store*)))))))
 
@@ -382,30 +335,29 @@ Utility functions for AST traversal and path resolution.
   "Verify that lightweight snapshots restore previous pointer states."
   (clrhash *memory*)
   (setf *object-store-snapshots* nil)
-  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State A :contents nil))
+  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State A") :contents nil))
          (id-v1 (ingest-ast ast-v1))
          (hash-v1 (org-object-hash (lookup-object id-v1))))
     (snapshot-memory)
-    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State B :contents nil))
+    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State B") :contents nil))
            (id-v2 (ingest-ast ast-v2))
            (hash-v2 (org-object-hash (lookup-object id-v2))))
-      (is (equal (org-object-hash (lookup-object "cow-node) hash-v2))
+      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v2))
       (rollback-memory 0)
-      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v1)))))
 
 (test test-merkle-corruption-rollback
   "Tier 2 Chaos: Verify that Merkle hash corruption triggers a Micro-Rollback."
   (clrhash *memory*)
   (setf *object-store-snapshots* nil)
-  (let* ((ast '(:type :HEADLINE :properties (:ID "node-1" :TITLE "Original :contents nil))
+  (let* ((ast '(:type :HEADLINE :properties (:ID "node-1" :TITLE "Original") :contents nil))
          (id (ingest-ast ast)))
     (snapshot-memory)
     ;; Manually corrupt the hash in the live memory
     (let ((obj (lookup-object id)))
-      (setf (org-object-hash obj) "CORRUPTED-HASH)
+      (setf (org-object-hash obj) "CORRUPTED-HASH"))
     ;; Simulate a system integrity check that should fail and rollback
-    ;; We'll use a manual check here since automatic validation is in the Loop
     (let ((obj (lookup-object id)))
       (let ((current-hash (org-object-hash obj))
             (computed-hash (opencortex::compute-merkle-hash (org-object-id obj)