cl-dotenv preserves surrounding quotes in .env values (unlike bash). PROVIDER_CASCADE="deepseek,..." resulted in keywords like :"DEEPSEEK instead of :DEEPSEEK, causing all cascade lookups to fail silently.
Fixes:
- .env.example: remove quotes from PROVIDER_CASCADE
- provider-cascade-initialize: add #\" and #\' to string-trim chars
- system-model-router: same fix for LOCAL_BACKENDS parsing
SKILL: Model Router (org-skill-model-router.org)
Overview: Quadrant-Based Model Routing
The Model Router implements the four-quadrant cognitive architecture for LLM model selection. Each signal is routed through a pipeline of three filters — privacy, quadrant, and complexity — before a model is chosen.
The routing pipeline for every probabilistic signal:
all backends → privacy filter → quadrant/classifier → per-slot cascade → model
- Privacy filter strips cloud backends when content carries @personal tags.
- Quadrant determines if the signal is foreground or background.
- Complexity classifier assigns foreground signals to one of three slots: :code, :plan, or :chat.
- Per-slot cascade selects a backend and model for the slot, with fallback ordering defined in each cascade list.
The model selector function is registered into the core *model-selector* hook
at load time. The core iterates providers, calling the selector for each one.
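The per-provider call pattern described above can be sketched as follows. This is a minimal illustration, not the core's actual loop: first-usable-model and toy-selector are hypothetical names, and :other-backend is a made-up provider keyword. It assumes only what this document states, namely that the selector returns a model name, :skip, or nil.

```lisp
;; Hypothetical sketch: FIRST-USABLE-MODEL is not part of the skill or the core.
(defun first-usable-model (providers selector context)
  "Return (PROVIDER . MODEL) for the first provider SELECTOR accepts, or NIL."
  (loop for provider in providers
        for model = (funcall selector provider context)
        ;; NIL means "no entry for this provider"; :SKIP means "do not try it".
        unless (or (null model) (eq model :skip))
          return (cons provider model)))

;; Toy selector standing in for the router: refuses :openrouter, knows :ollama.
(defun toy-selector (provider context)
  (declare (ignore context))
  (case provider
    (:openrouter :skip)
    (:ollama "qwen2.5:14b")
    (t nil)))

(first-usable-model '(:openrouter :other-backend :ollama) #'toy-selector nil)
;; => (:OLLAMA . "qwen2.5:14b")
```

Treating nil and :skip the same way at the call site keeps the selector free to distinguish "no mapping" from "deliberately excluded" for logging purposes.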
Implementation
Package Context
(in-package :passepartout)
Configuration: Per-Slot Cascades
Four env-configurable cascade variables: one for each of the three foreground
slots and one for background tasks. Each cascade is a list of
(provider-keyword . "model-name") pairs. The first match for the current
backend is used.
Example: MODEL_CASCADE_CODE='((:ollama . "deepseek-coder:6.7b") (:openrouter . "claude-sonnet"))'
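To make the value format concrete, here is a small sketch of how such a string can be turned into an alist with the standard reader. parse-cascade-string is an illustrative name; it mirrors the approach of the local parse-cascade helper in model-router-init but is not that function.

```lisp
;; Sketch only: PARSE-CASCADE-STRING is an illustrative name.
(defun parse-cascade-string (str)
  "Read STR as a cascade alist; return NIL for a missing or empty value."
  (when (and str (plusp (length str)))
    ;; Bind *READ-EVAL* to NIL so #. forms in the value cannot run code.
    (let ((*read-eval* nil))
      (values (read-from-string str)))))

(defparameter *parsed-example*
  (parse-cascade-string
   "((:ollama . \"deepseek-coder:6.7b\") (:openrouter . \"claude-sonnet\"))"))

(cdr (assoc :ollama *parsed-example*))  ; => "deepseek-coder:6.7b"

;; If the surrounding single quotes reach the reader, the result is a
;; (QUOTE ...) form rather than the alist itself.
(first (parse-cascade-string "'((:ollama . \"m\"))"))  ; => QUOTE
```

The last form matters when the value is loaded by something that keeps the surrounding quotes, as the commit note at the top reports for cl-dotenv. A shell strips the single quotes before the process sees the value, so the problem would only appear on the .env path.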
*model-cascade-code*
The cascade for :code tasks (code generation, refactoring, bug fixing).
Format: ((:ollama . "model-name") ...). Configured via MODEL_CASCADE_CODE.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defvar *model-cascade-code* nil
  "Cascade for :code tasks: ((:ollama . \"model\") ...)")
*model-cascade-plan*
Cascade for planning and architecture tasks. Configured via MODEL_CASCADE_PLAN.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defvar *model-cascade-plan* nil
  "Cascade for :plan tasks.")
*model-cascade-chat*
Cascade for general conversation and simple Q&A. Configured via MODEL_CASCADE_CHAT.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defvar *model-cascade-chat* nil
  "Cascade for :chat tasks.")
*model-cascade-background*
Cascade for background tasks (heartbeat scraping, delegation processing).
Configured via MODEL_CASCADE_BACKGROUND.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defvar *model-cascade-background* nil
  "Cascade for background tasks (heartbeat, delegation).")
*local-backends*
List of backend keywords considered local for privacy routing. Content tagged
with @personal will only be sent to these backends.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defvar *local-backends* '(:ollama :llama-cpp)
  "Backend keywords considered local (privacy-safe).")
Complexity Classifier
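A keyword-based heuristic that assigns signal text to a complexity slot.
Branches are checked in order, so text that matches both :code and :plan
keywords is classified as :code. Intended to be pluggable: set
*complexity-classifier* to override. Note that model-select, as listed in
this file, calls model-classify-complexity directly.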
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defun model-classify-complexity (text)
  "Classify TEXT into :code, :plan, or :chat."
  (let ((lower (string-downcase text)))
    (cond
      ((or (search "defun" lower) (search "defmacro" lower)
           (search "write" lower) (search "refactor" lower)
           (search "fix " lower) (search "implement" lower)
           (search "code" lower)
           (search "#+begin_src" lower))
       :code)
      ((or (search "plan" lower) (search "roadmap" lower)
           (search "strategy" lower) (search "design" lower)
           (search "architecture" lower))
       :plan)
      (t :chat))))
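The same heuristic can be written in a table-driven form, which makes the first-hit-wins ordering and the substring matching easier to see. This is a hypothetical variant for illustration only: *example-slot-rules* and classify-by-rules are invented names, and the keyword lists are copied from the function above.

```lisp
;; Hypothetical table-driven variant; rule order matters, first hit wins.
(defparameter *example-slot-rules*
  '((:code "defun" "defmacro" "write" "refactor" "fix " "implement"
           "code" "#+begin_src")
    (:plan "plan" "roadmap" "strategy" "design" "architecture")))

(defun classify-by-rules (text &optional (rules *example-slot-rules*))
  "Return the slot of the first rule with a keyword found in TEXT, else :CHAT."
  (let ((lower (string-downcase text)))
    (or (loop for (slot . keywords) in rules
              when (some (lambda (kw) (search kw lower)) keywords)
                return slot)
        :chat)))

(classify-by-rules "Refactor the parser")        ; => :CODE
(classify-by-rules "Draft a migration roadmap")  ; => :PLAN
(classify-by-rules "Write a plan for Q3")        ; => :CODE ("write" is checked first)
(classify-by-rules "Decode this message")        ; => :CODE ("code" is a substring of "decode")
(classify-by-rules "How are you?")               ; => :CHAT
```

The last two :code results show the cost of plain substring search: short keywords match inside unrelated words, and any code keyword outranks every plan keyword.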
Cascade Lookup
The core iterates each backend in *provider-cascade* and calls the model
selector for each one. This function matches the current backend against the
per-slot cascade list to find the appropriate model. It returns the first
(provider . model) entry whose provider matches, or nil if
the backend has no entry in that slot's cascade (the core will skip to
the next provider).
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defun model-cascade-find (cascade backend)
  "Find first (PROVIDER . MODEL) in CASCADE matching BACKEND."
  (assoc backend cascade
         :test (lambda (a b) (string-equal (string a) (string b)))))
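A few lookups show what name-based, case-insensitive matching means in practice. The sketch below uses the standard string-equal directly, which accepts symbols as string designators and therefore compares the same way as the lambda above. *lookup-example* is an illustrative variable.

```lisp
(defparameter *lookup-example*
  '((:ollama . "deepseek-coder:6.7b") (:openrouter . "claude-sonnet")))

;; STRING-EQUAL compares symbol names and strings, ignoring case and package.
(assoc :ollama *lookup-example* :test #'string-equal)
;; => (:OLLAMA . "deepseek-coder:6.7b")
(assoc "OpenRouter" *lookup-example* :test #'string-equal)
;; => (:OPENROUTER . "claude-sonnet")
(assoc 'ollama *lookup-example* :test #'string-equal)
;; => (:OLLAMA . "deepseek-coder:6.7b")
(assoc :llama-cpp *lookup-example* :test #'string-equal)
;; => NIL

;; A keyword whose name still carries a stray quote character does not match.
(assoc (intern "\"OLLAMA" :keyword) *lookup-example* :test #'string-equal)
;; => NIL
```

The final lookup reproduces the failure mode from the commit note at the top: a leading quote in the keyword name makes the lookup return nil without any error.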
Model Selector
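The main routing function. Registered into *model-selector* at init time.
Called per-backend by backend-cascade-call. Returns a model name string,
:skip if the backend should not be tried (e.g., privacy filter), or nil if
the foreground slot's cascade has no entry for the backend. For background
sensors the backend is not matched against the cascade: the model of the
first background cascade entry is returned for whichever backend is asked.
Filter order: privacy → quadrant → complexity → cascade.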
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defun model-select (backend context)
  "Select model for BACKEND given CONTEXT signal.
Returns model name or :skip."
  (let* ((payload (getf context :payload))
         (text (or (getf payload :text) ""))
         (sensor (getf payload :sensor))
         (has-personal (and (boundp '*dispatcher-privacy-tags*)
                            (some (lambda (tag) (search tag text))
                                  (symbol-value '*dispatcher-privacy-tags*))))
         (is-local (member backend *local-backends*)))
    ;; Privacy: skip cloud backends for personal content
    (when (and has-personal (not is-local))
      (log-message "MODEL-ROUTER: Skipping ~a (personal content)" backend)
      (return-from model-select :skip))
    ;; Quadrant: background tasks use background cascade
    (if (member sensor '(:heartbeat :delegation :tool-output :loop-error))
        (let ((entry (car (or *model-cascade-background*
                              '((:ollama . "phi-2"))))))
          (cdr entry))
        ;; Foreground: classify complexity, use slot cascade
        (let* ((slot (model-classify-complexity text))
               (cascade (case slot
                          (:code *model-cascade-code*)
                          (:plan *model-cascade-plan*)
                          (t *model-cascade-chat*)))
               (entry (model-cascade-find
                       (or cascade '((:ollama . "qwen2.5:14b"))) backend)))
          (if entry (cdr entry) nil)))))
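The privacy step at the top of model-select can be examined in isolation. The sketch below is self-contained and uses illustrative names throughout; in particular, the contents of *dispatcher-privacy-tags* are not shown in this file, so the single "@personal" string here is an assumption.

```lisp
;; Illustrative stand-ins; the real tag list is *dispatcher-privacy-tags*.
(defparameter *example-privacy-tags* '("@personal"))
(defparameter *example-local-backends* '(:ollama :llama-cpp))

(defun backend-allowed-p (backend text)
  "True unless TEXT carries a privacy tag and BACKEND is not local."
  (let ((personal (some (lambda (tag) (search tag text))
                        *example-privacy-tags*))
        (local (member backend *example-local-backends*)))
    (or (not personal) (and local t))))

(backend-allowed-p :ollama "notes @personal: call the bank")      ; => T
(backend-allowed-p :openrouter "notes @personal: call the bank")  ; => NIL
(backend-allowed-p :openrouter "summarise this RFC")              ; => T
(backend-allowed-p :openrouter "notes @Personal: call the bank")  ; => T
```

The last call is worth noting: search is case-sensitive by default, so a differently cased tag is not detected unless the text or the tags are normalised first.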
Initialization
Reads cascade configuration from environment variables and registers
model-select into the core *model-selector* hook.
;; REPL-VERIFIED: 2026-05-03T14:00:00
(defun model-router-init ()
  "Read env vars and wire model-select into *model-selector*."
  (flet ((parse-cascade (str)
           (when (and str (> (length str) 0))
             (let ((*read-eval* nil))
               (read-from-string str)))))
    (setf *model-cascade-code* (parse-cascade (uiop:getenv "MODEL_CASCADE_CODE"))
          *model-cascade-plan* (parse-cascade (uiop:getenv "MODEL_CASCADE_PLAN"))
          *model-cascade-chat* (parse-cascade (uiop:getenv "MODEL_CASCADE_CHAT"))
          *model-cascade-background* (parse-cascade (uiop:getenv "MODEL_CASCADE_BACKGROUND"))
          *local-backends* (let ((env (uiop:getenv "LOCAL_BACKENDS")))
                             (if env
                                 (mapcar (lambda (s)
                                           (intern (string-upcase
                                                    (string-trim '(#\Space #\" #\') s))
                                                   :keyword))
                                         (uiop:split-string env :separator '(#\,)))
                                 '(:ollama :llama-cpp)))))
  (setf *model-selector* #'model-select)
  (log-message "MODEL-ROUTER: Initialized, selector=~a" *model-selector*))
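The LOCAL_BACKENDS handling can be tried out without UIOP or a real environment. The following sketch is an approximation under stated assumptions: split-on-comma and parse-backend-list are hypothetical helpers, and skipping empty fields is an addition that the function above does not perform.

```lisp
;; Sketch without UIOP so it runs in a bare image; names are illustrative.
(defun split-on-comma (string)
  "Split STRING at commas, keeping empty fields."
  (loop with start = 0
        for pos = (position #\, string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun parse-backend-list (value)
  "Turn a comma-separated VALUE into keywords, trimming spaces and quotes."
  (loop for field in (split-on-comma value)
        for name = (string-trim '(#\Space #\" #\') field)
        ;; Assumption: empty fields are dropped instead of becoming :||.
        unless (string= name "")
          collect (intern (string-upcase name) :keyword)))

(parse-backend-list "ollama,llama-cpp")        ; => (:OLLAMA :LLAMA-CPP)
(parse-backend-list "\"ollama, llama-cpp\"")   ; => (:OLLAMA :LLAMA-CPP)
(parse-backend-list "")                        ; => NIL
```

Trimming per field rather than once over the whole value handles both a fully quoted list and individually quoted entries with the same code path.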
Skill Registration
The model router is an observer skill: its trigger always returns nil and it
has no deterministic gate. All work happens at load time via model-router-init,
which reads env vars and registers into the core *model-selector* hook.
The defskill call exists only to register metadata (priority, name) for
telemetry and lifecycle management.
(defskill :passepartout-model-router
  :priority 250
  :trigger (lambda (ctx) (declare (ignore ctx)) nil))
Auto-Init
(model-router-init)