#+TITLE: SKILL: Model Router (org-skill-model-router.org)
#+AUTHOR: Agent
#+FILETAGS: :system:model:routing:
#+PROPERTY: header-args:lisp :tangle ../lisp/system-model-router.lisp

* Overview: Quadrant-Based Model Routing

The Model Router implements the four-quadrant cognitive architecture for LLM model selection. Each signal is routed through a pipeline of three filters (privacy, quadrant, and complexity) before a model is chosen.

The routing pipeline for every probabilistic signal:

  all backends → privacy filter → quadrant/classifier → per-slot cascade → model

- *Privacy filter* strips cloud backends when content carries ~@personal~ tags.
- *Quadrant* determines if the signal is foreground or background.
- *Complexity classifier* assigns foreground signals to one of three slots: ~:code~, ~:plan~, or ~:chat~.
- *Per-slot cascade* selects a backend and model for the slot, with fallback ordering defined in each cascade list.

The model selector function is registered into the core ~*model-selector*~ hook at load time. The core iterates providers, calling the selector for each one.

* Implementation

** Package Context

#+begin_src lisp
(in-package :passepartout)
#+end_src

** Configuration: Per-Slot Cascades

Four env-configurable cascade variables, one per slot. Each cascade is a list of ~(provider-keyword . "model-name")~ pairs. The first match for the current backend is used.

Example:

  MODEL_CASCADE_CODE='((:ollama . "deepseek-coder:6.7b") (:openrouter . "claude-sonnet"))'

*** *model-cascade-code*

The cascade for ~:code~ tasks (code generation, refactoring, bug fixing). Format: ~((:ollama . "model-name") ...)~. Configured via ~MODEL_CASCADE_CODE~.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *model-cascade-code* nil
  "Cascade for :code tasks: ((:ollama . \"model\") ...)")
#+end_src

*** *model-cascade-plan*

Cascade for planning and architecture tasks. Configured via ~MODEL_CASCADE_PLAN~.
;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *model-cascade-plan* nil
  "Cascade for :plan tasks.")
#+end_src

*** *model-cascade-chat*

Cascade for general conversation and simple Q&A. Configured via ~MODEL_CASCADE_CHAT~.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *model-cascade-chat* nil
  "Cascade for :chat tasks.")
#+end_src

*** *model-cascade-background*

Cascade for background tasks (heartbeat scraping, delegation processing). Configured via ~MODEL_CASCADE_BACKGROUND~.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *model-cascade-background* nil
  "Cascade for background tasks (heartbeat, delegation).")
#+end_src

*** *local-backends*

List of backend keywords considered local for privacy routing. Content tagged with ~@personal~ will only be sent to these backends.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *local-backends* '(:ollama :llama-cpp)
  "Backend keywords considered local (privacy-safe).")
#+end_src

** Complexity Classifier

Keyword-based heuristic that assigns signal text to a complexity slot. Pluggable: set ~*complexity-classifier*~ to override.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defun model-classify-complexity (text)
  "Classify TEXT into :code, :plan, or :chat."
  (let ((lower (string-downcase text)))
    (cond
      ;; Code keywords are checked first, so they win over plan keywords.
      ((or (search "defun" lower) (search "defmacro" lower)
           (search "write" lower) (search "refactor" lower)
           (search "fix " lower) (search "implement" lower)
           (search "code" lower) (search "#+begin_src" lower))
       :code)
      ((or (search "plan" lower) (search "roadmap" lower)
           (search "strategy" lower) (search "design" lower)
           (search "architecture" lower))
       :plan)
      ;; No keyword matched: treat as general conversation.
      (t :chat))))
#+end_src

** Cascade Lookup

The core iterates each backend in ~*provider-cascade*~ and calls the model selector for each one. This function matches the current backend against the per-slot cascade list to find the appropriate model. Returns the first ~(provider .
model)~ entry whose provider matches, or ~nil~ if the backend has no entry in that slot's cascade (the core will skip to the next provider).

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defun model-cascade-find (cascade backend)
  "Find first (PROVIDER . MODEL) in CASCADE matching BACKEND."
  ;; Compare by name, case-insensitively, rather than by symbol identity.
  (assoc backend cascade
         :test (lambda (a b) (string-equal (string a) (string b)))))
#+end_src

** Model Selector

The main routing function. Registered into ~*model-selector*~ at init time. Called per-backend by ~backend-cascade-call~. Returns a model name string, ~:skip~ if the backend should not be tried (e.g., privacy filter), or ~nil~ if the slot's cascade has no entry for the backend.

Filter order: privacy → quadrant → complexity → cascade.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defun model-select (backend context)
  "Select model for BACKEND given CONTEXT signal. Returns model name, :skip, or NIL."
  (let* ((payload (getf context :payload))
         (text (or (getf payload :text) ""))
         (sensor (getf payload :sensor))
         (has-personal (and (boundp '*dispatcher-privacy-tags*)
                            (some (lambda (tag) (search tag text))
                                  (symbol-value '*dispatcher-privacy-tags*))))
         (is-local (member backend *local-backends*)))
    ;; Privacy: skip cloud backends for personal content
    (when (and has-personal (not is-local))
      (log-message "MODEL-ROUTER: Skipping ~a (personal content)" backend)
      (return-from model-select :skip))
    ;; Quadrant: background tasks use the first entry of the background
    ;; cascade (BACKEND is not consulted in this branch)
    (if (member sensor '(:heartbeat :delegation :tool-output :loop-error))
        (let ((entry (car (or *model-cascade-background* '((:ollama . "phi-2"))))))
          (cdr entry))
        ;; Foreground: classify complexity, use slot cascade
        (let* ((slot (model-classify-complexity text))
               (cascade (case slot
                          (:code *model-cascade-code*)
                          (:plan *model-cascade-plan*)
                          (t *model-cascade-chat*)))
               (entry (model-cascade-find (or cascade '((:ollama . "qwen2.5:14b")))
                                          backend)))
          ;; NIL when BACKEND has no entry in the slot's cascade
          (cdr entry)))))
#+end_src

** Initialization

Reads cascade configuration from environment variables and registers ~model-select~ into the core ~*model-selector*~ hook.

;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defun model-router-init ()
  "Read env vars and wire model-select into *model-selector*."
  (flet ((parse-cascade (str)
           ;; Read the cascade alist with reader evaluation (#.) disabled.
           (when (and str (> (length str) 0))
             (let ((*read-eval* nil))
               (read-from-string str)))))
    (setf *model-cascade-code* (parse-cascade (uiop:getenv "MODEL_CASCADE_CODE"))
          *model-cascade-plan* (parse-cascade (uiop:getenv "MODEL_CASCADE_PLAN"))
          *model-cascade-chat* (parse-cascade (uiop:getenv "MODEL_CASCADE_CHAT"))
          *model-cascade-background* (parse-cascade (uiop:getenv "MODEL_CASCADE_BACKGROUND"))
          ;; LOCAL_BACKENDS is a comma-separated list of backend names.
          *local-backends* (let ((env (uiop:getenv "LOCAL_BACKENDS")))
                             (if env
                                 (mapcar (lambda (s)
                                           (intern (string-upcase (string-trim '(#\Space #\" #\') s))
                                                   :keyword))
                                         (uiop:split-string env :separator '(#\,)))
                                 '(:ollama :llama-cpp)))))
  (setf *model-selector* #'model-select)
  (log-message "MODEL-ROUTER: Initialized, selector=~a" *model-selector*))
#+end_src

** Skill Registration

The model router is an observer skill: its ~:trigger~ always returns ~nil~, so it never fires, and it has no deterministic gate. All setup happens at load time via ~model-router-init~, which reads env vars and registers into the core ~*model-selector*~ hook. The ~defskill~ call exists only to register metadata (priority, name) for telemetry and lifecycle management.

#+begin_src lisp
(defskill :passepartout-model-router
  :priority 250
  :trigger (lambda (ctx) (declare (ignore ctx)) nil))
#+end_src

** Auto-Init

#+begin_src lisp
(model-router-init)
#+end_src