Files
passepartout/org/core-reason.org
Amr Gharbeia 0b1fbc36bb
Some checks failed
Deploy (Gitea) / deploy (push) Failing after 2s
v0.5.0 hotfix: restore register-probabilistic-backend
Restored core-reason from clean git base:
- Re-add register-probabilistic-backend (lost during rename)
- Re-add (in-package :passepartout)
- Don't pass tools to cascade (avoids unsupported-provider 400s)

Daemon stable, cascade reaches providers, gate trace works.
2026-05-07 20:56:42 -04:00

598 lines
31 KiB
Org Mode

#+TITLE: Stage 2: Reason (reason.lisp)
#+AUTHOR: Agent
#+FILETAGS: :harness:reason:
#+STARTUP: content
#+PROPERTY: header-args:lisp :tangle ../lisp/core-reason.lisp
* Overview: Architectural Intent
The Reason stage is the cognitive heart of Passepartout. It takes a normalized signal from Perceive and produces an approved action for Act. This is where the two engines — probabilistic (LLM) and deterministic (Lisp logic) — collaborate.
The design is shaped by one non-negotiable constraint: **the LLM must never touch the actuators directly.** Every action the LLM proposes must pass through a deterministic verification gate that has the final say. This is what separates Passepartout from every other AI agent: the creative brain suggests, but the logical brain decides.
** The Probabilistic-Deterministic Split
An LLM is a statistical engine. Given enough context, it is remarkably good at translation, generation, and pattern matching. But it cannot be trusted with authority because hallucination is a *fundamental property* of probabilistic inference — the model generates the most likely continuation, not the correct one.
The deterministic engine addresses this by being what the probabilistic engine is not: mathematically rigorous, formally verifiable, and incapable of hallucination by design. It operates on explicit symbolic representations — lists, property lists, knowledge graphs — not on floating-point activations. When it evaluates a path confinement check, it returns true or false, not a probability distribution.
The division of labor is architectural:
- The LLM handles the fuzzy interface between human language and structured representation
- The deterministic engine receives those structured representations and evaluates them against formal invariants
- The LLM never reads a file, never executes a command, never modifies memory — it generates proposals
This separation is the source of Passepartout's safety guarantee. Other agents add "guardrails" as an afterthought — a layer of filtering around a dangerous core. Passepartout makes the division explicit.
** Why Plists for Communication?
Every message in the Reason pipeline is a property list (plist):
(TYPE :REQUEST TARGET :CLI PAYLOAD (ACTION :MESSAGE TEXT "Hello"))
A plist is simultaneously:
- Human-readable text
- Machine-parseable data structure
- Executable Lisp code
This is not a cosmetic choice. It means the reasoning pipeline can generate, modify, and execute its own communication protocol without external parsing libraries. There is no JSON encoder, no schema validator, no serialization layer between the two engines. They speak the same language because they *are* the same language.
** Contract
1. (cognitive-verify proposed-action context): runs all registered
deterministic gates sorted by priority. Returns a rejection plist
(~:LOG~ or ~:EVENT~) if any gate blocks the action, an
~:approval-required~ event if a gate requires HITL, or the action
(potentially modified) if it passes.
2. (loop-gate-reason signal): the full reason pipeline — only processes
~:user-input~ and ~:chat-message~ sensors. Runs ~think~ to generate
a candidate, then ~cognitive-verify~ to gate it. Retries up to 3
times on rejection. Sets ~:status :reasoned~ on completion.
3. (reason-gate signal): thin alias for ~loop-gate-reason~.
4. (backend-cascade-call prompt): iterates ~*provider-cascade*~ calling
each backend's handler until one succeeds. Returns the LLM content
string, or a ~:LOG~ failure if all backends are exhausted.
5. (json-alist-to-plist alist): converts a JSON alist (from
~cl-json:decode-json-from-string~) to a keyword-prefixed plist.
String keys → upcased keywords. Nested alists recurse into plists.
JSON arrays (lists whose first element is not a cons) pass through.
Scalars and nil pass through.
* Implementation
** Package Context
#+begin_src lisp
(in-package :passepartout)
#+end_src
** Probabilistic Backend Registry
~*probabilistic-backends*~ is a hash table mapping provider keywords to
their handler functions. Populated by ~register-probabilistic-backend~.
Skills like system-model-provider register into this table at boot time.
;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defvar *probabilistic-backends* (make-hash-table :test 'equal)
"Maps provider keyword → handler function (prompt system-prompt &key model).")
(defun register-probabilistic-backend (name fn)
"Register FN as the handler for provider NAME."
(setf (gethash name *probabilistic-backends*) fn))
#+end_src
The probabilistic engine maintains four pieces of global state that control how LLM requests are dispatched:
~*backend-registry*~ is a hash table mapping provider keywords (like ~:ollama~ or ~:openrouter~) to the actual function that calls that provider's API. ~*provider-cascade*~ is the ordered list of providers to try — if the first one fails, the cascade falls through to the next. ~*model-selector*~ is an optional function that examines the context and picks a model per request (useful for routing simple questions to a small fast model and complex reasoning to a large expensive one). ~*consensus-enabled*~ toggles multi-provider agreement, where multiple LLMs run the same prompt and the system waits for consensus.
These variables are configurable at runtime. The cascade can be changed without restart: (setf *provider-cascade* (quote (:ollama :openrouter))).
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defvar *backend-registry* (make-hash-table :test 'equal))
#+end_src
** Provider Cascade
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defvar *provider-cascade* nil)
#+end_src
** Model Selector
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defvar *model-selector* nil)
#+end_src
** Consensus Toggle
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defvar *consensus-enabled* nil)
#+end_src
** Backend Registration (backend-register)
Each LLM provider registers itself by calling this function. The backend function receives a prompt string, a system prompt string, and optional keyword arguments for model selection. It must return either a plist with ~:status :success~ and ~:content~, or ~:status :error~ with a message.
Registration is typically done at boot time by the unified-llm-backend skill, but can also be done dynamically:
(backend-register :my-custom-provider #'my-fn)
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun backend-register (name fn)
(setf (gethash name *backend-registry*) fn))
#+end_src
** Cascade Dispatch (backend-cascade-call)
Given a prompt, this function iterates through the provider cascade and calls each backend in order until one succeeds. A provider "succeeds" when it returns ~:status :success~ with content, or when it returns a plain string (the LLM's raw output).
The function has a fallback for every failure mode:
- If a backend returns ~:status :error~, the cascade moves to the next provider
- If a backend throws an exception, it is caught and logged, and the cascade moves on
- If ALL backends are exhausted, a structured LOG message is returned saying "Neural Cascade Failure"
This is deliberately resilient. The system should never crash because an LLM provider is down. It should log the failure, try the next provider, and if all fail, return a diagnostic message that the deterministic engine can present to the user.
;; REPL-VERIFIED: 2026-05-03T14:00:00
#+begin_src lisp
(defun backend-cascade-call (prompt &key
(system-prompt "You are the Probabilistic engine.")
(cascade nil)
(context nil)
tools)
(let ((backends (or cascade *provider-cascade*))
(result nil))
(dolist (backend backends (or result
(list :type :LOG
:payload (list :text "Neural Cascade Failure: All providers exhausted."))))
(let ((backend-fn (or (gethash backend *backend-registry*)
(gethash backend *probabilistic-backends*))))
(when backend-fn
(log-message "PROBABILISTIC: Attempting backend ~a..." backend)
(let* ((model (and *model-selector*
(funcall *model-selector* backend context)))
(skip (eq model :skip))
(r (unless skip
(apply backend-fn
(append (list prompt system-prompt :model model)
(when tools (list :tools tools)))))))
(when skip
(log-message "PROBABILISTIC: Skipping ~a (filtered)" backend))
(cond ((and (listp r) (eq (getf r :status) :success))
(let ((tool-calls (getf r :tool-calls)))
(if tool-calls
(return (list :status :success :tool-calls tool-calls))
(progn
(setf result (getf r :content))
(return result)))))
((stringp r)
(setf result r)
(return result))
(t
(log-message "PROBABILISTIC: Backend ~a failed: ~a"
backend (getf r :message))))))))))
#+end_src
** Markdown Strip
The LLM might wrap its output in Markdown code fences (~```~). This function strips them before parsing. It also strips trailing/leading whitespace.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun markdown-strip (text)
(if (and text (stringp text))
(let ((cleaned text))
(setf cleaned (cl-ppcre:regex-replace-all "^```[a-z]*\\n" cleaned ""))
(setf cleaned (cl-ppcre:regex-replace-all "\\n```$" cleaned ""))
(setf cleaned (cl-ppcre:regex-replace-all "```" cleaned ""))
(string-trim '(#\Space #\Newline #\Tab) cleaned))
text))
#+end_src
** Normalize plist keywords
Lisp keywords are case-sensitive. The LLM might produce ~:payload~ or ~:PAYLOAD~ or ~:Payload~ depending on the model. This function normalizes all keyword keys to uppercase to ensure the deterministic engine receives consistent input.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun plist-keywords-normalize (plist)
(when (listp plist)
(loop for (k v) on plist by #'cddr
collect (if (and (symbolp k) (not (keywordp k)))
(intern (string k) :keyword)
k)
collect v)))
#+end_src
** Think: assemble context and call the LLM
This is the main entry point for the probabilistic engine. Every cognitive cycle goes through here.
The function handles several cases:
- If a triggered skill provides a probabilistic prompt generator, that replaces the raw user input
- If the previous proposal was rejected, the rejection trace is injected into the LLM's context so it can self-correct
- Standing mandates from ~*standing-mandates*~ are injected into the IDENTITY section of the system prompt
The system prompt assembly order — identity (including mandates), tools, context, logs — is intentional: standing mandates appear early in IDENTITY so they set the behavioral frame before the model processes tools, context, and logs.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun think (context)
(let* ((active-skill (find-triggered-skill context))
(tool-belt (generate-tool-belt-prompt))
(global-context (context-assemble-global-awareness))
(system-logs (context-get-system-logs))
(assistant-name (or (uiop:getenv "MEMEX_ASSISTANT") "Agent"))
(rejection-trace (proto-get (proto-get context :payload) :rejection-trace))
(prompt-generator (when active-skill (skill-probabilistic-prompt active-skill)))
(raw-prompt (if prompt-generator
(funcall prompt-generator context)
(let ((p (proto-get (proto-get context :payload) :text)))
(if (and p (stringp p)) p "Maintain metabolic stasis."))))
(reflection-feedback (if rejection-trace
(format nil "~%~%PREVIOUS PROPOSAL REJECTED: ~a" rejection-trace)
""))
(standing-mandates-text (let ((out ""))
(dolist (fn *standing-mandates*)
(let ((text (ignore-errors (funcall fn context))))
(when (and text (stringp text) (> (length text) 0))
(setf out (concatenate 'string out text (string #\Newline))))))
(when (> (length out) 0) out)))
(system-prompt (format nil "IDENTITY: ~a~a~a~%~%TOOLS:~%~a~%~%CONTEXT:~%~a~%~%LOGS:~%~a"
assistant-name reflection-feedback
(if standing-mandates-text
(concatenate 'string (string #\Newline) standing-mandates-text)
"")
tool-belt global-context system-logs))
(api-tools (let ((tools nil))
(maphash (lambda (k tool)
(declare (ignore k))
(push (list :name (cognitive-tool-name tool)
:description (cognitive-tool-description tool)
:parameters (cognitive-tool-parameters tool))
tools))
*cognitive-tool-registry*)
(when tools tools))))
(let* ((thought (backend-cascade-call raw-prompt
:system-prompt system-prompt
:context context
:tools api-tools))
(tool-calls (and (listp thought) (getf thought :tool-calls))))
(if tool-calls
(let* ((first-call (car tool-calls))
(tool-name (getf first-call :name))
(args (getf first-call :arguments))
(args-plist (json-alist-to-plist args)))
(list :TYPE :REQUEST
:PAYLOAD (list* :TOOL tool-name
:ARGS args-plist
:EXPLANATION "Generated by function-calling engine.")))
(let* ((cleaned (if (and (listp thought) (getf thought :type))
(format nil "~a" (getf (getf thought :payload) :text))
(markdown-strip thought))))
(if (and cleaned (stringp cleaned) (> (length cleaned) 0) (or (char= (char cleaned 0) #\() (char= (char cleaned 0) #\[)))
(handler-case
(let ((parsed (let ((*read-eval* nil)) (read-from-string cleaned))))
(if (listp parsed)
(let ((normalized (plist-keywords-normalize parsed)))
;; Ensure explanation is present in the payload for policy gate
(let ((payload (proto-get normalized :payload)))
(if (and payload (proto-get payload :explanation))
normalized
(let ((new-payload (list* :EXPLANATION "Generated by the Probabilistic engine."
(if (listp payload) payload nil))))
(list* :PAYLOAD new-payload
(loop for (k v) on normalized by #'cddr
unless (eq k :PAYLOAD)
collect k collect v))))))
(list :TYPE :REQUEST :PAYLOAD (list :ACTION :MESSAGE :TEXT cleaned :EXPLANATION "Generated by the Probabilistic engine."))))
(error () (list :TYPE :REQUEST :PAYLOAD (list :ACTION :MESSAGE :TEXT cleaned :EXPLANATION "Generated by the Probabilistic engine."))))
(list :TYPE :REQUEST :PAYLOAD (list :ACTION :MESSAGE :TEXT (if (stringp cleaned) cleaned "No response") :EXPLANATION "Generated by the Probabilistic engine."))))))))
#+end_src
** JSON-to-Plist Conversion (json-alist-to-plist)
Converts a JSON alist as returned by ~cl-json:decode-json-from-string~ to a keyword-prefixed plist — the internal data format that ~cognitive-verify~ and the actuator layer expect. This is the boundary where the probabilistic layer's output format (JSON) meets the deterministic layer's input format (plists).
String keys are interned as upcased keywords (~"action" → :ACTION~). Nested alists recurse. JSON arrays (lists whose first element is an atom) pass through unchanged since the actuator layer handles list arguments natively.
#+begin_src lisp
(defun json-alist-to-plist (alist)
"Convert a JSON alist to a keyword-prefixed plist."
(when (listp alist)
(loop for (key . value) in alist
append (list (intern (string-upcase (string key)) :keyword)
(if (listp value)
(if (consp (car value))
(json-alist-to-plist value)
value)
value)))))
#+end_src
** Deterministic Engine (cognitive-verify)
The deterministic engine is the strict guard. It receives a proposed action from the probabilistic engine and runs it through every registered deterministic gate, sorted by priority.
**Gate Trace (v0.4.0)**
As part of v0.4.0's TUI differentiator visualizations, ~cognitive-verify~ now accumulates a ~:gate-trace~ — a list of ~(:gate <name> :result <:passed|:blocked|:approval>)~ entries — as each deterministic gate processes the action. The trace is prepended to the result plist via ~list*~ and flows through the pipeline to the TUI actuator, which transmits it to the client.
This is Passepartout's permanent UX advantage: no competitor can ship a gate trace because none has deterministic gates to trace. Claude Code, OpenClaw, and Hermes Agent all use prompt-based guardrails where the safety decision is invisible. In Passepartout, the user sees exactly which nine safety gates ran, what each decided, and why — all at 0 LLM tokens.
Skills register deterministic gates via ~defskill~ with the ~:deterministic~ keyword. Each gate is a function that receives (action context) and returns either:
- A modified action (the gate approves or adjusts the proposal)
- A LOG or EVENT plist (the gate rejects the proposal with a reason)
Gates run in priority order, highest first. If any gate returns a LOG or EVENT, the proposal is rejected immediately and the rejection reason flows back to the probabilistic engine via the rejection trace. If all gates pass, the proposal is approved.
This architecture makes safety compositional: each skill adds one constraint. The dispatcher checks secrets. The policy checks explanations. The shell actuator checks destructive commands. No single skill needs to understand the full security model.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun cognitive-verify (proposed-action context)
"Runs all registered deterministic gates against the proposed action,
sorted by priority (highest first). Returns a rejection plist or the action."
(let ((current-action (copy-tree proposed-action))
(approval-needed nil)
(approval-action nil)
(gates nil)
(gate-trace nil))
;; Collect gates sorted by priority (highest first)
(maphash (lambda (name skill)
(declare (ignore name))
(when (skill-deterministic-fn skill)
(push (cons (skill-priority skill) (cons (skill-name skill) (skill-deterministic-fn skill))) gates)))
*skill-registry*)
(setf gates (sort gates #'> :key #'car))
(dolist (gate-entry gates)
(let* ((gate-name (cadr gate-entry))
(result (funcall (cddr gate-entry) current-action context)))
(cond
((eq (getf result :level) :approval-required)
(push (list :gate (or gate-name (car gate-entry)) :result :approval) gate-trace)
(setf approval-needed t
approval-action (getf (getf result :payload) :action)))
((member (getf result :type) '(:LOG :EVENT))
(push (list :gate (or gate-name (car gate-entry)) :result :blocked) gate-trace)
(return-from cognitive-verify
(list* :gate-trace (nreverse gate-trace) result)))
((and (listp result) result)
(push (list :gate (or gate-name (car gate-entry)) :result :passed) gate-trace)
(setf current-action result)))))
(if approval-needed
(list :type :EVENT :level :approval-required
:gate-trace (nreverse gate-trace)
:payload (list :sensor :approval-required
:action approval-action))
(list* :gate-trace (nreverse gate-trace) current-action))))
#+end_src
** Reason Gate (Stage 2)
The reason gate is the pipeline stage that orchestrates Think + Determine. It receives a signal, checks if it requires reasoning (only ~:user-input~ and ~:chat-message~ events do), and runs through the cognitive + verification loop.
The loop has retry logic: up to 3 attempts. If the deterministic engine rejects a proposal, the rejection reason is fed back into the next think call so the LLM can self-correct. This loop — propose, reject, correct, re-propose — is the core mechanism by which the agent improves its own output without human intervention.
The retry limit prevents infinite loops. If the LLM cannot produce a passable proposal within 3 attempts, the last rejection reason is attached to the signal and the acted pipeline sees a failed reasoning cycle.
*** loop-gate-reason
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun loop-gate-reason (signal)
(let* ((type (proto-get signal :type))
(payload (proto-get signal :payload))
(sensor (proto-get payload :sensor)))
(unless (and (eq type :EVENT) (member sensor '(:user-input :chat-message)))
(return-from loop-gate-reason signal))
(let ((retries 3)
(current-signal (copy-tree signal))
(last-rejection nil))
(loop
(when (<= retries 0)
(setf (getf signal :approved-action) last-rejection)
(setf (getf signal :status) :reasoned)
(return signal))
(when last-rejection
(setf (getf (getf current-signal :payload) :rejection-trace) last-rejection))
(let ((candidate (think current-signal)))
(if (and candidate (listp candidate))
(let ((verified (cognitive-verify candidate current-signal)))
;; Approval-required is not a rejection — pass to act for Flight Plan
(if (eq (getf verified :level) :approval-required)
(progn
(setf (getf signal :approved-action) verified)
(setf (getf signal :status) :requires-approval)
(return signal))
;; Hard rejection: retry with feedback
(if (member (getf verified :type) '(:LOG :EVENT))
(progn (decf retries) (setf last-rejection verified))
(progn
(setf (getf signal :approved-action) verified)
(setf (getf signal :status) :reasoned)
(return signal)))))
(progn
(setf (getf signal :approved-action) nil)
(setf (getf signal :status) :reasoned)
(return signal))))))))
#+end_src
*** reason-gate (backward-compatibility alias)
The pipeline gate was originally named ~reason-gate~. Code that still
uses the old name can call this alias. New code should call
~loop-gate-reason~.
;; REPL-VERIFIED: 2026-05-03T13:00:00
#+begin_src lisp
(defun reason-gate (signal)
(loop-gate-reason signal))
#+end_src
* Test Suite
Verifies that the deterministic engine correctly rejects unsafe actions (like ~rm -rf /~) while allowing safe ones.
#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload :fiveam :silent t))
(defpackage :passepartout-pipeline-reason-tests
(:use :cl :fiveam :passepartout)
(:export #:pipeline-reason-suite))
(in-package :passepartout-pipeline-reason-tests)
(def-suite pipeline-reason-suite :description "Test suite for Reason pipeline")
(in-suite pipeline-reason-suite)
(test test-decide-gate-safety
"Contract 1: cognitive-verify blocks unsafe actions with :LOG rejection."
(clrhash passepartout::*skill-registry*)
(passepartout::defskill :mock-safety
:priority 50
:trigger (lambda (ctx) (declare (ignore ctx)) t)
:deterministic (lambda (action ctx)
(declare (ignore ctx))
(if (search "rm -rf" (format nil "~s" action))
(list :type :LOG :payload (list :text "Rejected"))
action)))
(let* ((candidate '(:type :REQUEST :payload (:action :shell :cmd "rm -rf /")))
(signal '(:type :EVENT :payload (:sensor :user-input)))
(result (cognitive-verify candidate signal)))
(is (eq :LOG (getf result :type)))))
(test test-cognitive-verify-pass-through
"Contract 1: safe actions pass through cognitive-verify unchanged."
(clrhash passepartout::*skill-registry*)
(passepartout::defskill :mock-passthrough
:priority 50
:trigger (lambda (ctx) (declare (ignore ctx)) t)
:deterministic (lambda (action ctx)
(declare (ignore ctx))
action))
(let* ((candidate '(:type :REQUEST :payload (:action :shell :cmd "echo hello")))
(signal '(:type :EVENT :payload (:sensor :user-input)))
(result (cognitive-verify candidate signal)))
(is (eq :REQUEST (getf result :type)))
(is (equal (getf candidate :payload) (getf result :payload)))
(is (getf result :gate-trace))))
(test test-cognitive-verify-empty-registry
"Contract 1: with no gates registered, action passes through unchanged."
(clrhash passepartout::*skill-registry*)
(let* ((candidate '(:type :REQUEST :payload (:action :shell :cmd "ls")))
(signal '(:type :EVENT :payload (:sensor :user-input)))
(result (cognitive-verify candidate signal)))
(is (eq :REQUEST (getf result :type)))
(is (equal (getf candidate :payload) (getf result :payload)))))
(test test-cognitive-verify-approval-required
"Contract 1: gate returning :approval-required produces an approval event."
(clrhash passepartout::*skill-registry*)
(passepartout::defskill :mock-approval
:priority 50
:trigger (lambda (ctx) (declare (ignore ctx)) t)
:deterministic (lambda (action ctx)
(declare (ignore ctx))
(list :type :EVENT :level :approval-required
:payload (list :action action))))
(let* ((candidate '(:type :REQUEST :payload (:action :shell :cmd "sudo reboot")))
(signal '(:type :EVENT :payload (:sensor :user-input)))
(result (cognitive-verify candidate signal)))
(is (eq :approval-required (getf result :level)))
(is (eq :EVENT (getf result :type)))))
(test test-loop-gate-reason-passthrough
"Contract 2: non-user-input sensors pass through loop-gate-reason unchanged."
(let* ((signal '(:type :EVENT :payload (:sensor :heartbeat) :meta (:source :system)))
(result (loop-gate-reason signal)))
(is (not (null result)))))
(test test-loop-gate-reason-sets-status
"Contract 2: loop-gate-reason sets :status on :user-input signals."
(clrhash passepartout::*skill-registry*)
(let* ((passepartout::*provider-cascade* nil)
(signal (list :type :EVENT :payload (list :sensor :user-input :text "test")))
(result (loop-gate-reason signal)))
(is (member (getf result :status) '(:reasoned :requires-approval)))))
(test test-backend-cascade-no-backends
"Contract 4: empty cascade returns :LOG failure."
(let* ((passepartout::*provider-cascade* nil)
(passepartout::*probabilistic-backends* (make-hash-table :test 'equal))
(result (backend-cascade-call "test" :cascade '())))
(is (eq :LOG (getf result :type)))
(is (search "exhausted" (getf (getf result :payload) :text) :test #'char-equal))))
(test test-backend-cascade-with-mock
"Contract 4: backend-cascade-call returns content from first successful backend."
(let ((passepartout::*backend-registry* (make-hash-table :test 'equal)))
(setf (gethash :mock-backend passepartout::*backend-registry*)
(lambda (prompt sp &key model)
(declare (ignore prompt sp model))
(list :status :success :content "mock-response")))
(let ((result (backend-cascade-call "hello" :cascade '(:mock-backend))))
(is (string= "mock-response" result)))))
(test test-read-eval-rce-blocked
"Contract 1/v0.3.1: #. reader macro in LLM output must not execute arbitrary code."
(let ((passepartout::*backend-registry* (make-hash-table :test 'equal))
(passepartout::*provider-cascade* '(:mock-evil)))
(setf (gethash :mock-evil passepartout::*backend-registry*)
(lambda (prompt sp &key model)
(declare (ignore prompt sp model))
(list :status :success :content "(#.(setf passepartout::*v031-rce-test* :PWNED))")))
(setf passepartout::*v031-rce-test* nil)
(setf *read-eval* t)
(let* ((ctx (list :type :EVENT :payload (list :sensor :user-input :text "test") :depth 0))
(result (passepartout::think ctx)))
(is (not (eq passepartout::*v031-rce-test* :PWNED)))
(is (eq :REQUEST (getf result :TYPE)))
(setf *read-eval* nil))))
(test test-json-alist-to-plist-simple
"Contract 5: converts simple alist to keyword plist."
(let ((alist (list (cons "action" "shell") (cons "cmd" "echo hello"))))
(let ((result (json-alist-to-plist alist)))
(is (eq :ACTION (first result)))
(is (string= "shell" (second result)))
(is (eq :CMD (third result)))
(is (string= "echo hello" (fourth result))))))
(test test-json-alist-to-plist-nested
"Contract 5: nested alists recurse into nested plists."
(let ((alist (list (cons "tool" "write-file")
(cons "args" (list (cons "filepath" "/tmp/x")
(cons "content" "hi"))))))
(let ((result (json-alist-to-plist alist)))
(is (eq :TOOL (first result)))
(is (eq :ARGS (third result)))
(let ((inner (fourth result)))
(is (eq :FILEPATH (first inner)))
(is (string= "/tmp/x" (second inner)))
(is (eq :CONTENT (third inner)))))))
(test test-json-alist-to-plist-array-passthrough
"Contract 5: JSON arrays pass through unchanged."
(let ((alist (list (cons "names" (list "alice" "bob")))))
(let ((result (json-alist-to-plist alist)))
(is (eq :NAMES (first result)))
(is (equal (list "alice" "bob") (second result))))))
(test test-json-alist-to-plist-null
"Contract 5: nil passes through unchanged."
(let ((result (json-alist-to-plist nil)))
(is (null result))))
(test test-json-alist-to-plist-scalar
"Contract 5: scalar values pass through."
(let ((alist (list (cons "count" 42) (cons "active" :true))))
(let ((result (json-alist-to-plist alist)))
(is (eq :COUNT (first result)))
(is (= 42 (second result)))
(is (eq :ACTIVE (third result)))
(is (eq :true (fourth result))))))
#+end_src