Files
passepartout/literate/object-store.org

13 KiB

The Object Store (object-store.lisp)

The Object Store (object-store.lisp)

Architectural Intent: The Single Address Space

Traditional architectures rely on external databases (SQLite, Vector DBs) which introduce I/O latency and structural impedance. The org-agent architecture chooses a different path: the Single Address Space.

  • Pointer-Based Reasoning: By loading the entire knowledge graph into a live Common Lisp hash table, we achieve microsecond recollection. The harness doesn't "search a file"; it traverses a memory pointer.
  • Memory Imaging: The ability to snapshot the Lisp image allows the agent to resume its entire cognitive state instantly, solving the "Cold Start" problem.
  • Merkle-Tree Integrity: Every node in the Object Store is cryptographically hashed. By recursively hashing content and children, the root hash provides a single, immutable fingerprint of the entire system state.

System Architecture

flowchart TD
    subgraph LispMachine[Lisp Machine]
        H[Harness Pipeline] --> OS[(Object Store)]
        S1[Skill: Architect] --> OS
        S2[Skill: Analyst] --> OS
        S3[Skill: GTD] --> OS
        H -- Pointers --> S1
        H -- Pointers --> S2
    end
    subgraph IPCSlow[External Layer]
        E[Emacs / Actuators] -. Harness Communication .-> H
    end

Package Context

(in-package :org-agent)

The Object Repository

The `*object-store*` is the global hash table that holds every Org element by its unique ID. This is the "live RAM" of the agent's memory.

(defvar *object-store* (make-hash-table :test 'equal))

(defvar *history-store* (make-hash-table :test 'equal)
  "Immutable Merkle-Tree versioning store mapping hashes to objects.")

The Data Structure (org-object)

Every element in the Memex (headlines, paragraphs, etc.) is represented by an `org-object` structure. It contains both semantic metadata (attributes, content) and structural metadata (parent/child pointers, Merkle hashes).

(defstruct org-object
  id type attributes content vector parent-id children version last-sync hash)

Merkle Tree Integrity (compute-merkle-hash)

The `compute-merkle-hash` function ensures the cryptographic integrity of the knowledge graph. A node's hash depends on its own properties and the hashes of all its children. This creates a recursive fingerprint where any change to a single note propagates up to the root hash.

(defun compute-merkle-hash (id type attributes content child-hashes)
  "Computes a SHA-256 Merkle hash for a node based on its core properties and children's hashes."
  (let* ((alist (loop for (k v) on attributes by #'cddr collect (cons k v)))
         (sorted-alist (sort alist #'string< :key (lambda (x) (format nil "~a" (car x)))))
         (attr-string (format nil "~s" sorted-alist))
         (children-string (format nil "~{~a~}" child-hashes))
         (data-string (format nil "ID:~a|TYPE:~s|ATTRS:~a|CONTENT:~a|CHILDREN:~a"
                              id type attr-string (or content "") children-string))
         (digester (ironclad:make-digest :sha256)))
    (ironclad:update-digest digester (ironclad:ascii-string-to-byte-array data-string))
    (ironclad:byte-array-to-hex-string (ironclad:produce-digest digester))))

Ingesting the AST (ingest-ast)

The `ingest-ast` function is the primary bridge between the external world (Emacs/JSON) and the internal Lisp machine. It recursively parses an Org-mode Abstract Syntax Tree (AST) into `org-object` structures and registers them in the store.

(defun ingest-ast (ast &optional parent-id)
  "Parses an Org AST into the recursive Lisp Object Store with Merkle hashing."
  (let* ((type (getf ast :type))
         (props (getf ast :properties))
         (id (or (getf props :ID) (format nil "temp-~a" (get-universal-time))))
         (contents (getf ast :contents))
         (raw-content (when (eq type :HEADLINE)
                        (format nil "~a~%~a" (getf props :TITLE) (or (cl:getf ast :raw-content) ""))))
         (should-embed (and raw-content (equal (getf props :EMBED) "t")))
         (child-ids nil)
         (child-hashes nil))
    (dolist (child contents)
      (when (listp child)
        (let ((child-id (ingest-ast child id)))
          (push child-id child-ids)
          (let ((child-id-val child-id))
             (let ((child-obj (lookup-object child-id-val)))
               (when child-obj (push (org-object-hash child-obj) child-hashes)))))))
    (setf child-ids (nreverse child-ids))
    (setf child-hashes (nreverse child-hashes))
    (let* ((hash (compute-merkle-hash id type props raw-content child-hashes))
           (existing-obj (gethash hash *history-store*))
           (obj (or existing-obj
                    (make-org-object 
                     :id id :type type :attributes props :content raw-content
                     :vector (when should-embed (get-embedding raw-content))
                     :parent-id parent-id :children child-ids
                     :version (get-universal-time) :last-sync (get-universal-time)
                     :hash hash))))
      (unless existing-obj
        (setf (gethash hash *history-store*) obj))
      (setf (gethash id *object-store*) obj)
      id)))

Memory Snapshots (snapshot-object-store)

Because objects are stored immutably in the `*history-store*`, a snapshot is a lightweight shallow copy of the active `*object-store*` pointers. The system maintains a rolling buffer of 20 snapshots, allowing for near-instant, zero-cost rollback.

(defvar *object-store-snapshots* nil)

(defun copy-hash-table (hash-table)
  "Creates a shallow copy of a hash table."
  (let ((new-table (make-hash-table :test (hash-table-test hash-table) 
                                    :size (hash-table-size hash-table))))
    (maphash (lambda (k v) (setf (gethash k new-table) v)) hash-table)
    new-table))

(defun snapshot-object-store ()
  "Creates a lightweight, Copy-on-Write snapshot using Merkle-Tree pointers."
  (let ((snapshot (copy-hash-table *object-store*)))
    (push (list :timestamp (get-universal-time) :data snapshot) *object-store-snapshots*)
    (when (> (length *object-store-snapshots*) 20)
      (setf *object-store-snapshots* (subseq *object-store-snapshots* 0 20)))
    (harness-log "MEMORY - CoW Object Store snapshot created.")))

Memory Rollback (rollback-object-store)

Restores the state of the Memex from one of the previous snapshots.

(defun rollback-object-store (&optional (index 0))
  "Restores the Object Store to a previously captured snapshot using immutable history pointers."
  (let ((snapshot (nth index *object-store-snapshots*)))
    (if snapshot
        (progn (setf *object-store* (copy-hash-table (getf snapshot :data)))
               (harness-log "MEMORY - Object Store rolled back to snapshot ~a" index))
        (harness-log "MEMORY ERROR - Snapshot ~a not found." index))))

Lookup Utilities

Basic functions for retrieving objects by ID or type.

(defun lookup-object (id) 
  "Retrieves an object from the store by its unique ID."
  (gethash id *object-store*))

(defun list-objects-by-type (type)
  "Returns a list of all objects matching a specific Org element type."
  (let ((results nil))
    (maphash (lambda (id obj) (declare (ignore id)) (when (eq (org-object-type obj) type) (push obj results))) *object-store*)
    results))

Structural Helpers

Utility functions for AST traversal and path resolution.

(defun find-headline-missing-id (ast)
  "Traverses an AST to find headlines that lack an :ID: property."
  (when (listp ast)
    (if (and (eq (getf ast :type) :HEADLINE) (not (getf (getf ast :properties) :ID)))
        ast
        (cl:some #'find-headline-missing-id (getf ast :contents)))))

(defun file-name-nondirectory (path)
  "Extracts the filename from a full path string."
  (let ((pos (position #\/ path :from-end t))) (if pos (subseq path (1+ pos)) path)))

Phase E: Chaos (Verification)

Following the Engineering Standards, the Object Store must be empirically verified through automated testing. The following test suite ensures the mathematical integrity of the Merkle hashes and the behavioral correctness of the immutable versioning and rollback systems.

(defpackage :org-agent-object-store-tests
  (:use :cl :fiveam :org-agent)
  (:export #:object-store-suite))

(in-package :org-agent-object-store-tests)

(def-suite object-store-suite
  :description "Tests for the Merkle-Tree Object Store.")

(in-suite object-store-suite)

(test merkle-hash-consistency
  (let* ((ast1 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1") :contents nil))
         (ast2 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1") :contents nil)))
    (clrhash *object-store*)
    (let ((id1 (ingest-ast ast1)))
      (let ((hash1 (org-object-hash (lookup-object id1))))
        (clrhash *object-store*)
        (let ((id2 (ingest-ast ast2)))
          (let ((hash2 (org-object-hash (lookup-object id2))))
            (is (equal hash1 hash2))))))))

(test merkle-hash-cascading
  (let* ((ast-leaf '(:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf") :contents nil))
         (ast-root-full '(:type :HEADLINE :properties (:ID "root" :TITLE "Root") 
                           :contents ((:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf") :contents nil))))
         (id-root (progn (clrhash *object-store*) (ingest-ast ast-root-full)))
         (initial-root-hash (org-object-hash (lookup-object id-root))))
      
      ;; Now ingest a modified version (title change)
      (let* ((ast-root-modified '(:type :HEADLINE :properties (:ID "root" :TITLE "Root") 
                                 :contents ((:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf Modified") :contents nil))))
             (id-root-mod (progn (clrhash *object-store*) (ingest-ast ast-root-modified)))
             (modified-root-hash (org-object-hash (lookup-object id-root-mod))))
        (is (not (equal initial-root-hash modified-root-hash))))))

(test history-store-immutability
  "Verify that *history-store* retains old versions even after *object-store* updates."
  (clrhash *object-store*)
  (clrhash *history-store*)
  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 1") :contents nil))
         (id-v1 (ingest-ast ast-v1))
         (obj-v1 (lookup-object id-v1))
         (hash-v1 (org-object-hash obj-v1)))
    
    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 2") :contents nil))
           (id-v2 (ingest-ast ast-v2))
           (obj-v2 (lookup-object id-v2))
           (hash-v2 (org-object-hash obj-v2)))
      
      ;; The active pointer should be v2
      (is (equal (org-object-hash (lookup-object "test-node")) hash-v2))
      
      ;; Both v1 and v2 should exist in the immutable history store
      (is (not (null (gethash hash-v1 *history-store*))))
      (is (not (null (gethash hash-v2 *history-store*))))
      
      ;; Modifying v2 should not affect v1 in the history store
      (is (equal (org-object-content (gethash hash-v1 *history-store*)) "Version 1
"))
      (is (equal (org-object-content (gethash hash-v2 *history-store*)) "Version 2
")))))

(test cow-snapshot-and-rollback
  "Verify that lightweight snapshots can accurately restore previous pointer states."
  (clrhash *object-store*)
  (clrhash *history-store*)
  (setf *object-store-snapshots* nil)
  
  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State A") :contents nil))
         (id-v1 (ingest-ast ast-v1))
         (hash-v1 (org-object-hash (lookup-object id-v1))))
    
    ;; Take a snapshot at State A
    (snapshot-object-store)
    
    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State B") :contents nil))
           (id-v2 (ingest-ast ast-v2))
           (hash-v2 (org-object-hash (lookup-object id-v2))))
      
      ;; Verify we are currently in State B
      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v2))
      
      ;; Rollback to State A (index 0 because we only took 1 snapshot)
      (rollback-object-store 0)
      
      ;; Verify we are back in State A
      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v1))
      
      ;; Verify State B is still safely in the history store (no data loss)
      (is (not (null (gethash hash-v2 *history-store*)))))))