Files
passepartout/literate/object-store.org

13 KiB

The Object Store (object-store.lisp)

The Object Store (object-store.lisp)

Deep Reasoning: The Single Address Space Advantage

Industry-standard "Vector Databases" or "SQLite Backends" add external complexity and I/O latency.

  • Pointer-Based Reasoning: By loading the entire Memex into a live Lisp hash table, we achieve microsecond recollection. The agent doesn't "search a file"; it traverses a memory pointer.
  • Memory Imaging: The `memory-image.lisp` snapshot allows the agent to wake up with its entire context already parsed. This solves the "Cold Start" problem of massive Org files.
  • Merkle-Tree Integrity: Every node in the Object Store is cryptographically hashed. By hashing the content and the hashes of its children, the root hash provides a single, immutable fingerprint of the entire Memex state.

The Single Address Space (Architecture)

graph TD
    subgraph LispMachine[Lisp Machine]
        K[Kernel Core] --> OS[(Object Store)]
        S1[Skill: Architect] --> OS
        S2[Skill: Analyst] --> OS
        S3[Skill: GTD] --> OS
        K -- Pointers --> S1
        K -- Pointers --> S2
    end
    subgraph IPCSlow[IPC Slow]
        E[Emacs / Actuators] -. OACP .-> K
    end

Package Context

We begin by establishing the `org-agent` package context.

(in-package :org-agent)

The Object Repository

The `*object-store*` is the global hash table that holds every Org element by its unique ID. This is the "live RAM" of the agent's memory.

(defvar *object-store* (make-hash-table :test 'equal))

(defvar *history-store* (make-hash-table :test 'equal)
  "Immutable Merkle-Tree versioning store mapping hashes to objects.")

The Data Structure (org-object)

Every element in the Memex (headlines, paragraphs, etc.) is represented by an `org-object` structure. It contains both semantic metadata (attributes, content) and structural metadata (parent/child pointers, Merkle hashes).

(defstruct org-object
  id type attributes content vector parent-id children version last-sync hash)

Merkle Tree Integrity (compute-merkle-hash)

The `compute-merkle-hash` function ensures the cryptographic integrity of the knowledge graph. A node's hash depends on its own properties and the hashes of all its children. This creates a recursive fingerprint where any change to a single note propagates up to the root hash.

(defun compute-merkle-hash (id type attributes content child-hashes)
  "Computes a SHA-256 Merkle hash for a node based on its core properties and children's hashes."
  (let* ((alist (loop for (k v) on attributes by #'cddr collect (cons k v)))
         (sorted-alist (sort alist #'string< :key (lambda (x) (format nil "~a" (car x)))))
         (attr-string (format nil "~s" sorted-alist))
         (children-string (format nil "~{~a~}" child-hashes))
         (data-string (format nil "ID:~a|TYPE:~s|ATTRS:~a|CONTENT:~a|CHILDREN:~a"
                              id type attr-string (or content "") children-string))
         (digester (ironclad:make-digest :sha256)))
    (ironclad:update-digest digester (ironclad:ascii-string-to-byte-array data-string))
    (ironclad:byte-array-to-hex-string (ironclad:produce-digest digester))))

Ingesting the AST (ingest-ast)

The `ingest-ast` function is the primary bridge between the external world (Emacs/JSON) and the internal Lisp machine. It recursively parses an Org-mode Abstract Syntax Tree (AST) into `org-object` structures and registers them in the store.

(defun ingest-ast (ast &optional parent-id)
  "Parses an Org AST into the recursive Lisp Object Store with Merkle hashing."
  (let* ((type (getf ast :type))
         (props (getf ast :properties))
         (id (or (getf props :ID) (format nil "temp-~a" (get-universal-time))))
         (contents (getf ast :contents))
         (raw-content (when (eq type :HEADLINE)
                        (format nil "~a~%~a" (getf props :TITLE) (or (cl:getf ast :raw-content) ""))))
         (should-embed (and raw-content (equal (getf props :EMBED) "t")))
         (child-ids nil)
         (child-hashes nil))
    (dolist (child contents)
      (when (listp child)
        (let ((child-id (ingest-ast child id)))
          (push child-id child-ids)
          (let ((child-id-val child-id))
             (let ((child-obj (lookup-object child-id-val)))
               (when child-obj (push (org-object-hash child-obj) child-hashes)))))))
    (setf child-ids (nreverse child-ids))
    (setf child-hashes (nreverse child-hashes))
    (let* ((hash (compute-merkle-hash id type props raw-content child-hashes))
           (existing-obj (gethash hash *history-store*))
           (obj (or existing-obj
                    (make-org-object 
                     :id id :type type :attributes props :content raw-content
                     :vector (when should-embed (get-embedding raw-content))
                     :parent-id parent-id :children child-ids
                     :version (get-universal-time) :last-sync (get-universal-time)
                     :hash hash))))
      (unless existing-obj
        (setf (gethash hash *history-store*) obj))
      (setf (gethash id *object-store*) obj)
      id)))

Memory Snapshots (snapshot-object-store)

Because objects are stored immutably in the `*history-store*`, a snapshot is no longer a heavy, recursive deep-copy of the Memex. It is a lightweight shallow copy of the active `*object-store*` pointers. The system maintains a rolling buffer of 20 snapshots. This allows for near-instant, zero-cost rollback if the agent makes a mistake.

(defvar *object-store-snapshots* nil)

(defun copy-hash-table (hash-table)
  "Creates a shallow copy of a hash table."
  (let ((new-table (make-hash-table :test (hash-table-test hash-table) 
                                    :size (hash-table-size hash-table))))
    (maphash (lambda (k v) (setf (gethash k new-table) v)) hash-table)
    new-table))

(defun snapshot-object-store ()
  "Creates a lightweight, Copy-on-Write snapshot using Merkle-Tree pointers."
  (let ((snapshot (copy-hash-table *object-store*)))
    (push (list :timestamp (get-universal-time) :data snapshot) *object-store-snapshots*)
    (when (> (length *object-store-snapshots*) 20)
      (setf *object-store-snapshots* (subseq *object-store-snapshots* 0 20)))
    (kernel-log "MEMORY - CoW Object Store snapshot created.")))

Memory Rollback (rollback-object-store)

Restores the state of the Memex from one of the previous snapshots.

(defun rollback-object-store (&optional (index 0))
  "Restores the Object Store to a previously captured snapshot using immutable history pointers."
  (let ((snapshot (nth index *object-store-snapshots*)))
    (if snapshot
        (progn (setf *object-store* (copy-hash-table (getf snapshot :data)))
               (kernel-log "MEMORY - Object Store rolled back to snapshot ~a" index))
        (kernel-log "MEMORY ERROR - Snapshot ~a not found." index))))

Lookup Utilities

Basic functions for retrieving objects by ID or type.

(defun lookup-object (id) 
  "Retrieves an object from the store by its unique ID."
  (gethash id *object-store*))

(defun list-objects-by-type (type)
  "Returns a list of all objects matching a specific Org element type."
  (let ((results nil))
    (maphash (lambda (id obj) (declare (ignore id)) (when (eq (org-object-type obj) type) (push obj results))) *object-store*)
    results))

Structural Helpers

Utility functions for AST traversal and path resolution.

(defun find-headline-missing-id (ast)
  "Traverses an AST to find headlines that lack an :ID: property."
  (when (listp ast)
    (if (and (eq (getf ast :type) :HEADLINE) (not (getf (getf ast :properties) :ID)))
        ast
        (cl:some #'find-headline-missing-id (getf ast :contents)))))

(defun file-name-nondirectory (path)
  "Extracts the filename from a full path string."
  (let ((pos (position #\/ path :from-end t))) (if pos (subseq path (1+ pos)) path)))

Phase E: Chaos (Verification)

Following the PSF mandates, the Object Store must be empirically verified through automated testing. The following test suite ensures the mathematical integrity of the Merkle hashes and the behavioral correctness of the immutable versioning and rollback systems.

(defpackage :org-agent-object-store-tests
  (:use :cl :fiveam :org-agent)
  (:export #:object-store-suite))

(in-package :org-agent-object-store-tests)

(def-suite object-store-suite
  :description "Tests for the Merkle-Tree Object Store.")

(in-suite object-store-suite)

(test merkle-hash-consistency
  (let* ((ast1 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1") :contents nil))
         (ast2 '(:type :HEADLINE :properties (:ID "test-1" :TITLE "Node 1") :contents nil)))
    (clrhash *object-store*)
    (let ((id1 (ingest-ast ast1)))
      (let ((hash1 (org-object-hash (lookup-object id1))))
        (clrhash *object-store*)
        (let ((id2 (ingest-ast ast2)))
          (let ((hash2 (org-object-hash (lookup-object id2))))
            (is (equal hash1 hash2))))))))

(test merkle-hash-cascading
  (let* ((ast-leaf '(:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf") :contents nil))
         (ast-root-full '(:type :HEADLINE :properties (:ID "root" :TITLE "Root") 
                           :contents ((:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf") :contents nil))))
         (id-root (progn (clrhash *object-store*) (ingest-ast ast-root-full)))
         (initial-root-hash (org-object-hash (lookup-object id-root))))
      
      ;; Now ingest a modified version (title change)
      (let* ((ast-root-modified '(:type :HEADLINE :properties (:ID "root" :TITLE "Root") 
                                 :contents ((:type :HEADLINE :properties (:ID "leaf" :TITLE "Leaf Modified") :contents nil))))
             (id-root-mod (progn (clrhash *object-store*) (ingest-ast ast-root-modified)))
             (modified-root-hash (org-object-hash (lookup-object id-root-mod))))
        (is (not (equal initial-root-hash modified-root-hash))))))

(test history-store-immutability
  "Verify that *history-store* retains old versions even after *object-store* updates."
  (clrhash *object-store*)
  (clrhash *history-store*)
  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 1") :contents nil))
         (id-v1 (ingest-ast ast-v1))
         (obj-v1 (lookup-object id-v1))
         (hash-v1 (org-object-hash obj-v1)))
    
    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "test-node" :TITLE "Version 2") :contents nil))
           (id-v2 (ingest-ast ast-v2))
           (obj-v2 (lookup-object id-v2))
           (hash-v2 (org-object-hash obj-v2)))
      
      ;; The active pointer should be v2
      (is (equal (org-object-hash (lookup-object "test-node")) hash-v2))
      
      ;; Both v1 and v2 should exist in the immutable history store
      (is (not (null (gethash hash-v1 *history-store*))))
      (is (not (null (gethash hash-v2 *history-store*))))
      
      ;; Modifying v2 should not affect v1 in the history store
      (is (equal (org-object-content (gethash hash-v1 *history-store*)) "Version 1
"))
      (is (equal (org-object-content (gethash hash-v2 *history-store*)) "Version 2
")))))

(test cow-snapshot-and-rollback
  "Verify that lightweight snapshots can accurately restore previous pointer states."
  (clrhash *object-store*)
  (clrhash *history-store*)
  (setf *object-store-snapshots* nil)
  
  (let* ((ast-v1 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State A") :contents nil))
         (id-v1 (ingest-ast ast-v1))
         (hash-v1 (org-object-hash (lookup-object id-v1))))
    
    ;; Take a snapshot at State A
    (snapshot-object-store)
    
    (let* ((ast-v2 '(:type :HEADLINE :properties (:ID "cow-node" :TITLE "State B") :contents nil))
           (id-v2 (ingest-ast ast-v2))
           (hash-v2 (org-object-hash (lookup-object id-v2))))
      
      ;; Verify we are currently in State B
      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v2))
      
      ;; Rollback to State A (index 0 because we only took 1 snapshot)
      (rollback-object-store 0)
      
      ;; Verify we are back in State A
      (is (equal (org-object-hash (lookup-object "cow-node")) hash-v1))
      
      ;; Verify State B is still safely in the history store (no data loss)
      (is (not (null (gethash hash-v2 *history-store*)))))))