Files
passepartout/org/system-integration-tests.org
Amr Gharbeia 750918527d
Some checks failed
Deploy (Gitea) / deploy (push) Failing after 2s
tests: TUI integration + cascade parsing — precise LLM diagnostics
- TUI agent-responds: hardened to detect and FAIL on cascade/exhausted
  responses (previously a separate WARN-only test that let real
  cascade failures slip through)
- New TUI cascade-parsing test: /eval *provider-cascade* on screen,
  checks for clean keywords (no cl-dotenv quote artifacts)
- Pre-warm step: sbcl --eval '(ql:quickload :passepartout/tui)'
  before launching tmux, cuts TUI startup from ~120s to ~10s
- Removed test_agent_not_cascade_failure (absorbed into agent-responds)
- New integration test: test-provider-cascade-parsing verifies
  PROVIDER_CASCADE entries are keywords without quotes, matching
  registered backends — catches the exact cl-dotenv quote bug
- Fixed stop-daemon ghost symbol (removed export) and paren bug
- Contract section updated with numbered Phase 2/3 items
2026-05-06 08:56:07 -04:00

16 KiB

SKILL: System Integration Tests

Architectural Intent

Integration tests verify that modules work together over real boundaries — TCP sockets, file I/O, subprocess execution, and the full daemon pipeline. Unlike unit tests (which mock collaborators), integration tests start a real daemon, connect like a real client, and assert observable behavior.

Contract

Phase 1 — In-process daemon (no external credentials):

  1. (start-daemon &key port): binds port, sends handshake on connect.
  2. Pipeline: a :user-input event traverses the full pipeline.
  3. Communication: framed messages survive TCP round-trip; malformed input does not crash the daemon.
  4. Skill loader: after daemon start, *skill-registry* is populated.
  5. Shell actuator: safe commands execute; dangerous patterns are blocked.
  6. CLI gateway: text injected via TCP reaches the pipeline.
  7. Gateway registry: gateway-registry-initialize is available.

Phase 2 — LLM + messaging:

  1. Provider cascade: PROVIDER_CASCADE entries are clean keywords matching registered backends (no quote contamination).
  2. Backend cascade: real provider returns string content.

Phase 3 — TUI via tmux:

  1. Agent response: TUI ↛ daemon ↛ LLM round-trip produces non-cascade agent text on screen.
  2. Cascade inspection: /eval *provider-cascade* shows clean keywords on TUI screen (no quote artifacts).
  3. Eval command: /eval (+ 1 2) displays ~=> 3~ on screen.
  4. Status bar: rendered screen shows ~msgs:~ in status bar.

Boundaries

  • Requires passepartout setup to have been run (skills in XDG data dir).
  • Phase 2 tests skip if required env vars are unset.
  • Phase 3 tests require tmux and Emacs installed.

Prologue

Shared test harness: package, suite, helpers, and with-daemon.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :fiveam :silent t)
  (ql:quickload :usocket :silent t))

(defpackage :passepartout-integration-tests
  (:use :cl :passepartout)
  (:export #:integration-suite))

(in-package :passepartout-integration-tests)

(fiveam:def-suite integration-suite :description "Integration tests across process boundaries")
(fiveam:in-suite integration-suite)

(defvar *daemon-port* nil)

(defun find-free-port ()
  (let ((socket (usocket:socket-listen "127.0.0.1" 0 :reuse-address t)))
    (unwind-protect (usocket:get-local-port socket)
      (usocket:socket-close socket))))

(defmacro with-daemon (() &body body)
  `(let ((*daemon-port* (find-free-port)))
     (unwind-protect
          (progn
            (passepartout:actuator-initialize)
            (passepartout:skill-initialize-all)
            (passepartout:start-daemon :port *daemon-port*)
            (sleep 2)
            ,@body)
        (values)))

(defun daemon-connect ()
  (let* ((sock (usocket:socket-connect "127.0.0.1" *daemon-port*))
         (stream (usocket:socket-stream sock)))
    (read-framed-message stream)  ;; discard handshake
    (values stream sock)))

(defun daemon-send (stream msg)
  (write-string (frame-message msg) stream)
  (finish-output stream))

(defun daemon-recv (stream &key (timeout 5))
  (let ((deadline (+ (get-universal-time) timeout)))
    (loop
      (when (listen stream)
        (return (read-framed-message stream)))
      (when (> (get-universal-time) deadline) (return nil))
      (sleep 0.1))))

Daemon Lifecycle

Verifies the daemon starts, binds its port, and sends a valid handshake.

(fiveam:test test-daemon-starts
  "Contract 1: daemon binds port and sends valid handshake."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (is (open-stream-p stream))
      (usocket:socket-close sock))))

Pipeline End-to-End

Sends a :user-input event and verifies the pipeline produces a response.

(fiveam:test test-pipeline-user-input
  "Contract 2: :user-input traverses pipeline and produces a response."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (progn
             (daemon-send stream
              '(:TYPE :EVENT :PAYLOAD (:SENSOR :user-input :TEXT "test")))
             (let ((resp (daemon-recv stream :timeout 10)))
               (is (not (null resp)) "Expected a response")))
        (usocket:socket-close sock)))))

(fiveam:test test-pipeline-heartbeat
  "Contract 2: heartbeat signals do not crash the daemon."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (daemon-send stream
            '(:TYPE :EVENT :PAYLOAD (:SENSOR :heartbeat)))
        (usocket:socket-close sock))
      (pass))))

Communication Protocol

Verifies framed TCP round-trip and malformed-input resilience.

(fiveam:test test-tcp-round-trip
  "Contract 3: framed health-check survives TCP round-trip."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (progn
             (daemon-send stream '(:TYPE :health-check))
             (let ((resp (daemon-recv stream :timeout 5)))
               (is (not (null resp)))
               (is (member (getf resp :type) '(:HEALTH-RESPONSE)))))
        (usocket:socket-close sock)))))

(fiveam:test test-daemon-survives-junk
  "Contract 3: daemon does not crash on junk input."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (write-string "ZZZZZZ" stream)
      (finish-output stream)
      (sleep 1)
      (usocket:socket-close sock))
    ;; Connect again to verify daemon is still alive
    (multiple-value-bind (stream2 sock2) (daemon-connect)
      (is (open-stream-p stream2))
      (usocket:socket-close sock2))))

Skill Loader

Verifies the skill loader populates *skill-registry* after daemon start.

(fiveam:test test-skill-registry-populated
  "Contract 4: *skill-registry* is populated after daemon start."
  (with-daemon ()
    (is (hash-table-p passepartout::*skill-registry*))
    (is (>= (hash-table-count passepartout::*skill-registry*) 1)
        "Expected at least 1 skill in registry, got ~a"
        (hash-table-count passepartout::*skill-registry*))))

Shell Actuator

Verifies safe shell commands execute and dangerous patterns are blocked.

(fiveam:test test-shell-safe-echo
  "Contract 5: safe shell command does not crash the daemon."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (daemon-send stream
            '(:TYPE :REQUEST :TARGET :shell
              :PAYLOAD (:ACTION :execute :CMD "echo hello")))
        (usocket:socket-close sock))
      (pass))))

(fiveam:test test-shell-dangerous-blocked
  "Contract 5: rm -rf / is blocked by the security dispatcher."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (daemon-send stream
            '(:TYPE :REQUEST :TARGET :shell
              :PAYLOAD (:ACTION :execute :CMD "rm -rf /")))
        (usocket:socket-close sock))
      (pass))))

CLI Gateway

Verifies text input over TCP reaches the pipeline.

(fiveam:test test-cli-gateway-input
  "Contract 6: text via TCP produces a response."
  (with-daemon ()
    (multiple-value-bind (stream sock) (daemon-connect)
      (unwind-protect
           (daemon-send stream
            '(:TYPE :EVENT :META (:SOURCE :CLI)
              :PAYLOAD (:SENSOR :user-input :TEXT "hello from CLI")))
        (usocket:socket-close sock))
      (pass))))

Gateway Registry

Verifies the gateway registry function is available after daemon start.

(fiveam:test test-gateway-registry
  "Contract 7: gateway-registry-initialize is available."
  (with-daemon ()
    (is (fboundp 'gateway-registry-initialize))
    (gateway-registry-initialize)
    (pass)))

LLM Provider Cascade

Tests backend-cascade-call and provider-openai-request with real API credentials. Skipped silently if OPENROUTER_API_KEY is unset.

(defun has-api-key (env-var)
  "Returns T if env-var is set and non-empty."
  (let ((val (uiop:getenv env-var)))
    (and val (> (length val) 0))))

(defmacro skip-unless (env-var &body body)
  "Execute body if env-var is set, otherwise skip the test."
  `(if (has-api-key ,env-var)
       (progn ,@body)
       (progn
         (format t "  [SKIP] ~a not set~%" ,env-var)
         (skip "~a not set" ,env-var))))

(fiveam:test test-provider-openai-request
  "Contract Phase2: provider-openai-request returns :success with valid API key."
  (skip-unless "OPENROUTER_API_KEY"
    (let ((result (provider-openai-request "Say hello" "Be brief."
                                           :provider :openrouter
                                           :model "openrouter/auto")))
      (is (or (eq (getf result :status) :success)
              (eq (getf result :status) :error))
          "Expected :success or :error, got: ~a" result))))

(fiveam:test test-backend-cascade-real
  "Contract Phase2: backend-cascade-call returns string content with real provider."
  (skip-unless "OPENROUTER_API_KEY"
    (let ((passepartout::*provider-cascade* '(:openrouter)))
      (let ((result (backend-cascade-call "Say hello" :system-prompt "Be brief.")))
        (is (stringp result) "Expected string response, got: ~a" result)))))

(fiveam:test test-provider-cascade-parsing
  "Contract Phase2: PROVIDER_CASCADE env var parses to clean keywords matching backends."
  (provider-cascade-initialize)
  (let ((cascade passepartout::*provider-cascade*))
    (is (listp cascade) "Cascade must be a list")
    (is (>= (length cascade) 1) "Cascade must have at least one entry")
    (dolist (entry cascade)
      (is (keywordp entry) "Entry ~s must be a keyword" entry)
      (let ((name (symbol-name entry)))
        (is (not (find #\" name)) "Entry ~s must not contain double-quote" entry)
        (is (not (find #\' name)) "Entry ~s must not contain single-quote" entry)))
    (is (some (lambda (e) (gethash e passepartout::*probabilistic-backends*)) cascade)
        "At least one cascade entry must match a registered backend")))

Messaging Link/Unlink

Verifies messaging-link stores a token in the vault, gateway-configured-p returns the correct status, and messaging-unlink removes it. No real API credentials needed — these are management functions.

(fiveam:test test-messaging-link-unlink
  "Contract Phase2: messaging-link stores token, configured-p returns T, unlink removes it."
  (with-daemon ()
    (messaging-link :test-platform :token "fake-token-123")
    (is (gateway-configured-p :test-platform)
        "Expected test-platform to be configured after linking")
    (messaging-unlink :test-platform)
    (is (not (gateway-configured-p :test-platform))
        "Expected test-platform to be unconfigured after unlinking")))

(fiveam:test test-gateway-configured-p-false
  "Contract Phase2: gateway-configured-p returns nil for unknown platform."
  (with-daemon ()
    (is (not (gateway-configured-p :nonexistent-platform-xyz)))))

(fiveam:test test-gateway-start-messaging
  "Contract Phase2: gateway registry initializes with expected platforms."
  (with-daemon ()
    (gateway-registry-initialize)
    (is (hash-table-p passepartout::*gateway-registry*))
    (is (>= (hash-table-count passepartout::*gateway-registry*) 1))))

TUI Integration Shell Script

Verifies the TUI end-to-end via tmux: input rendering, /eval, status bar, connection drop.

#!/bin/bash
set -euo pipefail

PASS=0
FAIL=0
WARN=0
TUI_LOG="/tmp/passepartout-tui-test.log"
> "$TUI_LOG"

cleanup() {
  tmux kill-session -t tui-test 2>/dev/null || true
}
trap cleanup EXIT

run_test() {
  local name="$1"; shift
  echo -n "  $name ... "
  if "$@" 2>/dev/null; then
    echo "PASS"
    PASS=$((PASS + 1))
  else
    echo "FAIL"
    FAIL=$((FAIL + 1))
  fi
}

# ---- Setup ----
echo "Pre-warming FASL cache (speeds up TUI start from ~120s to ~10s)..."
sbcl --noinform --load ~/quicklisp/setup.lisp \
     --eval '(ql:quickload :passepartout/tui :silent t)' \
     --eval '(uiop:quit)' 2>/dev/null &
WARM_PID=$!
wait $WARM_PID 2>/dev/null
echo "  Pre-warm complete"

echo "Starting TUI in tmux (daemon must already be running on port 9105)..."
tmux new-session -d -s tui-test "passepartout tui 2>&1 | tee $TUI_LOG"
for i in $(seq 1 15); do
  sleep 2
  if tmux capture-pane -t tui-test -p 2>/dev/null | grep -q 'Connected v[0-9]'; then
    echo "  TUI ready after $((i*2))s"
    break
  fi
  if [ "$i" -eq 15 ]; then
    echo "  WARNING: TUI did not render after 30s"
  fi
done

# ---- Tests ----

test_agent_responds() {
  # Full round-trip: TUI → daemon → LLM → daemon → TUI.
  # Must contain a real agent response (⬇), NOT a cascade failure.
  local before_ts
  before_ts=$(date +%s)
  tmux send-keys -t tui-test "Say hello in one word" Enter
  while true; do
    local pane
    pane=$(tmux capture-pane -t tui-test -p -S -60 2>/dev/null)
    if echo "$pane" | grep -q '⬇.*[a-zA-Z]\{3,\}'; then
      if echo "$pane" | grep '⬇' | grep -qi 'cascade.*fail\|exhausted\|neural cascade'; then
        echo "FAIL: agent responded with cascade failure, not LLM content" >&2
        return 1
      fi
      return 0
    fi
    local now_ts
    now_ts=$(date +%s)
    if (( now_ts - before_ts > 60 )); then
      echo "TIMEOUT: no agent response after 60s" >&2
      return 1
    fi
    sleep 2
  done
}

test_cascade_parsing() {
  # Via /eval, check that *provider-cascade* contains clean keywords.
  # This catches the cl-dotenv quote contamination bug.
  tmux send-keys -t tui-test "/eval *provider-cascade*" Enter
  sleep 3
  local pane
  pane=$(tmux capture-pane -t tui-test -p -S -15 2>/dev/null)
  # Must contain keyword syntax :SOMETHING (not "SOMETHING with quotes)
  echo "$pane" | grep -q ':DEEPSEEK\|:OPENROUTER\|:OPENAI\|:ANTHROPIC\|:GROQ\|:GEMINI\|:NVIDIA'
}

test_eval_command() {
  tmux send-keys -t tui-test "/eval (+ 1 2)" Enter
  sleep 3
  tmux capture-pane -t tui-test -p -S -10 2>/dev/null | grep -q '=> 3'
}

test_status_bar() {
  tmux capture-pane -t tui-test -p -S -20 2>/dev/null | grep -q 'msgs:'
}

test_connection_drop() {
  sleep 1
  tmux capture-pane -t tui-test -p -S -10 2>/dev/null | grep -qi 'connection.*lost\|ERROR.*Connection\|error.*connect' || true
  return 0
}

run_test "agent-responds"       test_agent_responds
run_test "cascade-parsing"      test_cascade_parsing
run_test "eval-command"         test_eval_command
run_test "status-bar"           test_status_bar
run_test "connection-drop"      test_connection_drop

# ---- Summary ----
echo ""
echo "===== $PASS passed, $FAIL failed, $WARN warnings ====="
exit $(( FAIL > 0 ? 1 : 0 ))

Emacs Integration

Verifies Flight Plan message format and Emacs daemon connectivity.

(fiveam:test test-flight-plan-message-format
  "Contract Phase3: dispatcher-flight-plan-create returns valid message."
  (with-daemon ()
    (load (merge-pathnames ".local/share/passepartout/lisp/security-dispatcher.lisp"
                           (user-homedir-pathname)))
    (let ((plan (dispatcher-flight-plan-create
                 '(:TYPE :REQUEST :TARGET :shell :PAYLOAD (:CMD "sudo restart")))))
      (is (eq :REQUEST (getf plan :type)))
      (is (eq :emacs (getf plan :target)))
      (is (eq :insert-node (getf (getf plan :payload) :action)))
      (let ((attrs (getf (getf plan :payload) :attributes)))
        (is (string= "Flight Plan: High-Risk Action" (getf attrs :TITLE)))
        (is (string= "PLAN" (getf attrs :TODO)))
        (is (member "FLIGHT_PLAN" (getf attrs :TAGS) :test #'string-equal))))))

(fiveam:test test-emacs-daemon-connect
  "Contract Phase3: Emacs daemon is reachable via emacsclient."
  (handler-case
      (let ((result (uiop:run-program '("emacsclient" "--eval" "(+ 1 2)")
                                       :output :string
                                       :ignore-error-status t)))
        (is (search "3" result) "Expected '3' from emacsclient, got: ~a" result))
    (error (c)
      (skip "Emacs daemon not available: ~a" c)))))