- TUI: Fix stale contract (remove handle-return/*incoming-msgs*), rewrite 10->13 tests (38 checks, 100% pass). Export missing symbols from TUI package. Fix view-chat contract arity. - Gateway messaging: Add :configured key to registry (boolean, nil default). Fix contract to match (vault-based, not env-var-based). - Async Embedding Gateway: Add *embedding-backend* var, embeddings-compute function. Modify ingest-ast to populate vectors on new objects. Add EMBEDDING_PROVIDER env var support. Add Contract + 4 tests (8 checks). - Context Manager: Add /focus, /scope, /unfocus commands to TUI on-key handler. Commands degrade gracefully when context-manager not loaded. - Export hygiene: Remove 30+ ghost exports (undefined symbols). Remove duplicate/mismatched names. Exports now match actual definitions.
8.2 KiB
SKILL: Embedding Gateway (org-skill-embedding-gateway.org)
Architectural Intent
system-model-embedding converts text into vector representations for semantic search and memory retrieval. It provides three backends:
:local— any OpenAI-compatible/api/embeddingsendpoint (Ollama, vLLM, etc.):openai— the OpenAI/v1/embeddingsAPI with an API key:hashing— a zero-dependency fallback that produces deterministic vectors from SHA-256 hashes. No server, no config, works offline.
The embedding queue (embed-queue-object / embed-all-pending) decouples document indexing from the main loop. On each heartbeat tick, embed-all-pending drains the queue and embeds all accumulated objects. This prevents indexing traffic from blocking conversational responses.
The default provider is :hashing — useful for bootstrapping with zero configuration and for deployments where embedding quality isn't critical. Switch to :local or :openai when you have an embedding server available.
This replaces the old system-embedding-gateway with the same logic but renamed to system-model-embedding to live alongside the other system-model-* skills.
Implementation
State
(in-package :passepartout)
(defvar *embedding-provider* :hashing
"Active embedding provider: :hashing, :local, :openai.")
(defvar *embedding-queue* nil
"Queue of text objects awaiting embedding.")
(defvar *embedding-batch-size* 10
"Maximum texts per embedding API call.")
Local backend (OpenAI-compatible)
(defun embedding-backend-local (text)
"Generate embeddings via a local OpenAI-compatible endpoint."
(let* ((url (or (uiop:getenv "LOCAL_BASE_URL") (format nil "http://~a" (or (uiop:getenv "OLLAMA_HOST") "localhost:11434"))))
(model (or (uiop:getenv "EMBEDDING_MODEL") "nomic-embed-text"))
(body (cl-json:encode-json-to-string
`((model . ,model) (input . ,text)))))
(handler-case
(let* ((response (dex:post (format nil "~a/api/embeddings" url)
:headers '(("Content-Type" . "application/json"))
:content body :connect-timeout 5 :read-timeout 30))
(json (cl-json:decode-json-from-string response))
(data (car (cdr (assoc :data json)))))
(or (cdr (assoc :embedding data))
(list :error "No embedding in response")))
(error (c)
(list :error (format nil "Embedding failed: ~a" c))))))
OpenAI backend
(defun embedding-backend-openai (text)
"Generate embeddings via OpenAI compatible /v1/embeddings endpoint."
(let* ((api-key (uiop:getenv "OPENAI_API_KEY"))
(base-url (or (uiop:getenv "EMBEDDING_BASE_URL") "https://api.openai.com/v1"))
(model (or (uiop:getenv "EMBEDDING_MODEL") "text-embedding-3-small"))
(body (cl-json:encode-json-to-string
`((model . ,model) (input . ,text)))))
(handler-case
(let* ((response (dex:post (format nil "~a/embeddings" base-url)
:headers `(("Content-Type" . "application/json")
("Authorization" . ,(format nil "Bearer ~a" api-key)))
:content body :connect-timeout 5 :read-timeout 30))
(json (cl-json:decode-json-from-string response))
(data (car (cdr (assoc :data json)))))
(or (cdr (assoc :embedding data))
(list :error "No embedding in response")))
(error (c)
(list :error (format nil "OpenAI Embedding failed: ~a" c))))))
Hashing fallback
(defun embedding-backend-hashing (text)
"Fallback: produces a deterministic vector from the text hash."
(let* ((digest (ironclad:digest-sequence :sha256 (babel:string-to-octets text)))
(vec (make-array 8 :element-type 'single-float :initial-element 0.0)))
(dotimes (i (min (length digest) 8))
(setf (aref vec i) (float (/ (aref digest i) 255.0) 0.0)))
vec))
Object embedding and queuing
(defvar *embedding-backend* nil
"Explicit backend override (nil = use *embedding-provider*).")
(defun embeddings-compute (text)
"Compute an embedding vector for text using the active backend."
(embed-object text))
(defun embed-object (text)
"Embed a single text string using the active backend."
(let* ((selected (or *embedding-backend* *embedding-provider* :hashing))
(backend (case selected
(:local #'embedding-backend-local)
(:openai #'embedding-backend-openai)
(t #'embedding-backend-hashing))))
(if backend
(progn
(log-message "EMBEDDING: Provider ~a, backend=~a" selected backend)
(funcall backend text))
(progn
(log-message "EMBEDDING: No backend for provider ~a, using hashing" selected)
(embedding-backend-hashing text)))))
(defun embed-queue-object (object)
"Queue a text object for async embedding."
(push object *embedding-queue*)
(log-message "EMBEDDING: Queued object"))
(defun embed-all-pending ()
"Drain the embedding queue, store vectors in the store-keyed objects."
(let ((batch (nreverse *embedding-queue*)))
(setf *embedding-queue* nil)
(dolist (item batch)
(handler-case
(let ((id (getf item :id))
(text (getf item :text)))
(when (and id text)
(let ((vec (embeddings-compute text))
(obj (gethash id *memory-store*)))
(when (and obj vec (not (listp vec)))
(setf (memory-object-vector obj) vec))
(log-message "EMBEDDING: Computed vector for ~a (~d dims)" id (length vec)))))
(error (c)
(log-message "EMBEDDING: Failed to embed object: ~a" c))))))
;; Apply env var override at load time
(let ((provider-env (uiop:getenv "EMBEDDING_PROVIDER")))
(when provider-env
(let ((kw (intern (string-upcase provider-env) :keyword)))
(setf *embedding-provider* kw)
(log-message "EMBEDDING: Set provider to ~a from EMBEDDING_PROVIDER env" kw))))
(log-message "EMBEDDING: Gateway loaded with provider ~a" *embedding-provider*)
Contract
- (embeddings-compute text): produces a vector (single-float array) for
any text string using the active backend (
*embedding-backend*or*embedding-provider*). - (embedding-backend-hashing text): zero-dependency fallback. Returns an 8-element single-float vector deterministically from SHA-256.
- (embed-all-pending): drains
*embedding-queue*, computes vectors for all queued objects, and stores them in*memory-store*entries.
Test Suite
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(defpackage :passepartout-embedding-tests
(:use :cl :passepartout)
(:export #:embedding-suite))
(in-package :passepartout-embedding-tests)
(fiveam:def-suite embedding-suite :description "Embedding gateway verification")
(fiveam:in-suite embedding-suite)
(fiveam:test test-embedding-backend-hashing
"Contract 2: hashing backend produces 8-element float vector."
(let ((vec (embedding-backend-hashing "hello world")))
(fiveam:is (arrayp vec))
(fiveam:is (= 8 (length vec)))
(fiveam:is (every #'numberp (coerce vec 'list)))))
(fiveam:test test-embedding-backend-hashing-deterministic
"Contract 2: same input produces same vector."
(let ((v1 (embedding-backend-hashing "test"))
(v2 (embedding-backend-hashing "test")))
(fiveam:is (equalp v1 v2))))
(fiveam:test test-embeddings-compute
"Contract 1: embeddings-compute returns a float vector."
(let ((vec (embeddings-compute "some text")))
(fiveam:is (arrayp vec))
(fiveam:is (> (length vec) 0))))
(fiveam:test test-embed-queue-and-drain
"Contract 3: embed-all-pending drains queue and stores vectors."
(let ((*embedding-queue* nil))
(embed-queue-object '(:id "test-obj" :text "sample text"))
(fiveam:is (= 1 (length *embedding-queue*)))
(embed-all-pending)
(fiveam:is (null *embedding-queue*))))