Files
passepartout/skills/org-skill-lisp-validator.org
Amr Gharbeia 76040c1f48
Some checks failed
Deploy-Agent-V15-Stdin / JOB-V15-STDIN (push) Failing after 5s
feat: Add Lisp Validator skill with 3-phase deterministic gate
- Implements structural (O(n) paren balance), syntactic (reader with *read-eval* nil),
  and semantic (whitelist AST walk) validation.
- Exposes :validate-lisp cognitive tool for Probabilistic Engine self-correction.
- Replaces validate-lisp-syntax stub in harness/skills.org with delegation.
- Adds mandatory validation rule to Probabilistic Engine system prompt.
- Refactors org-skill-policy.org with 6 concrete invariants (Transparency, Autonomy,
  Zero-Bloat, Modularity, Mentorship, Sustainability) and explicit override hierarchy.
- Adds Harness Boundary Contract to harness/manifest.org.
2026-04-22 13:12:49 -04:00

17 KiB

SKILL: Lisp Validator (Structural & Semantic Gate)

Overview

The Lisp Validator is the primary structural gate for the Probabilistic-Deterministic Lisp Machine. It eliminates the token-waste of probabilistic paren-balancing by providing a deterministic, three-phase validation pipeline: Structural, Syntactic, and Semantic. It is exposed as a cognitive tool so both the harness and the Probabilistic Engine can invoke it before declaring code complete.

Phase A: Demand (PRD)

1. Purpose

Provide a deterministic, fast, and auditable validation gate for all Lisp code proposals.

2. User Needs

  • Structural Validation: Detect unbalanced parentheses, brackets, and unterminated strings without invoking the reader.
  • Syntactic Validation: Ensure the code can be read by SBCL with `*read-eval*` disabled.
  • Semantic Validation: Optionally enforce a whitelist of safe symbols for sandboxed execution.
  • Tool Exposure: The Probabilistic Engine must be able to call this as a cognitive tool.

3. Success Criteria

  • Structural check runs in O(n) and catches all paren/string defects.
  • Syntactic check catches reader errors and malformed sexps.
  • Semantic check blocks non-whitelisted symbols when strict mode is enabled.
  • Returns structured plist for machine parsing and human explanation.

Phase B: Blueprint (PROTOCOL)

1. Architectural Intent

A single entry point, `lisp-validator-validate`, runs three sequential checks. Each check is isolated so a failure in one does not obscure failures in others. The function returns a unified result plist.

2. Semantic Interfaces

  • `(lisp-validator-validate code-string &key strict)` → plist
  • Tool `:validate-lisp` with args `(:code "…" :strict t/nil)`

Phase D: Build (Implementation)

Package Context

(in-package :opencortex)

Check 1: Structural Validation (Paren Balance)

Scans the raw string character-by-character, tracking open/close pairs for `()`, `[]`, `#()`, and string delimiters `"`. Ignores escaped characters and line comments (`;`). This is O(n) and does not invoke the Lisp reader.

(defun lisp-validator-check-structural (code-string)
  "Checks for balanced parens, brackets, and terminated strings.
Returns (VALUES t nil) if clean, or (VALUES nil reason-string line col)."
  (let ((stack nil)
        (in-string nil)
        (escaped nil)
        (line 1)
        (col 0)
        (last-open-line 1)
        (last-open-col 0))
    (dotimes (i (length code-string)
               (if (null stack)
                   (values t nil nil nil)
                   (values nil (format nil "Unbalanced '~a' opened at line ~a, col ~a"
                                       (caar stack) last-open-line last-open-col)
                           last-open-line last-open-col)))
      (let ((ch (char code-string i)))
        (cond (escaped (setf escaped nil))
              ((char= ch #\\) (setf escaped t))
              (in-string
               (when (char= ch #\") (setf in-string nil)))
              ((char= ch #\;)
               ;; Skip to end of line
               (loop while (and (< i (1- (length code-string)))
                                (not (char= (char code-string (1+ i)) #\Newline)))
                     do (incf i))
               (incf line) (setf col 0))
              ((char= ch #\")
               (setf in-string t))
              ((member ch '(#\( #\[))
               (push (list (string ch) line col) stack)
               (setf last-open-line line last-open-col col))
              ((char= ch #\))
               (cond ((null stack)
                      (return-from lisp-validator-check-structural
                        (values nil (format nil "Unexpected ')' at line ~a, col ~a" line col) line col)))
                     ((string= (caar stack) "[")
                      (return-from lisp-validator-check-structural
                        (values nil (format nil "Mismatched ']' expected at line ~a, col ~a" line col) line col)))
                     (t (pop stack))))
              ((char= ch #\])
               (cond ((null stack)
                      (return-from lisp-validator-check-structural
                        (values nil (format nil "Unexpected ']' at line ~a, col ~a" line col) line col)))
                     ((string= (caar stack) "(")
                      (return-from lisp-validator-check-structural
                        (values nil (format nil "Mismatched ')' expected at line ~a, col ~a" line col) line col)))
                     (t (pop stack))))
               ((char= ch #\Newline)
                (incf line) (setf col 0)))
          (unless (char= ch #\Newline) (incf col))))))

Check 2: Syntactic Validation (Reader Check)

Wraps the code in `(progn …)` and attempts to read every top-level form with `*read-eval*` disabled. Catches reader errors, invalid syntax, and malformed sexps that the structural check cannot detect (e.g., invalid reader macros).

(defun lisp-validator-check-syntactic (code-string)
  "Checks if the code can be read by SBCL with *read-eval* nil.
Returns (VALUES t nil) if clean, or (VALUES nil error-message line col)."
  (handler-case
      (let ((*read-eval* nil))
        (with-input-from-string (stream (format nil "(progn ~a)" code-string))
          (loop for form = (read stream nil :eof) until (eq form :eof)))
        (values t nil nil nil))
    (error (c)
      (let ((msg (format nil "~a" c)))
        (values nil msg nil nil)))))

Check 3: Semantic Validation (Whitelist AST Walk)

Recursively walks the parsed AST and verifies that every function call and symbol reference appears on a whitelist. This is the "Deny-by-Default" sandbox. When `strict` is nil, this check is skipped for general validation (e.g., skill loading) but enforced for `:eval` tool execution.

(defparameter *lisp-validator-whitelist*
  '(;; Math & Logic
    + - * / = < > <= >= 1+ 1- min max mod abs floor ceiling round
    and or not null eq eql equal string= string-equal char= char-equal
    ;; List Manipulation
    list cons car cdr cadr cddr cdar caar caddr cdddr append mapcar remove-if remove-if-not
    length reverse sort nth nthcdr push pop last butlast subseq
    ;; Plists, Alists, and Hash Tables
    getf gethash assoc acons pairlis rassoc
    ;; Control Flow
    let let* if cond when unless case typecase prog1 progn
    ;; Strings
    format concatenate string-downcase string-upcase search subseq replace
    ;; Type predicates
    stringp numberp integerp listp symbolp keywordp null
    ;; Kernel safe symbols
    opencortex::harness-log
    opencortex::snapshot-memory opencortex::rollback-memory
    opencortex::lookup-object opencortex::list-objects-by-type
    opencortex::ingest-ast opencortex::find-headline-missing-id
    opencortex::context-query-store opencortex::context-get-active-projects
    opencortex::context-get-recent-completed-tasks opencortex::context-list-all-skills
    opencortex::context-get-system-logs opencortex::context-assemble-global-awareness
    opencortex::org-object-id opencortex::org-object-type opencortex::org-object-attributes
    opencortex::org-object-content opencortex::org-object-parent-id
    opencortex::org-object-children opencortex::org-object-version
    opencortex::org-object-last-sync opencortex::org-object-hash
    opencortex::org-object-vector
    ;; Essential macros and special operators
    declare ignore quote function lambda defun defvar defparameter defmacro
    ;; Safe I/O
    with-open-file write-string read-line
    ;; Package introspection
    find-package make-package in-package do-external-symbols find-symbol
    ;; Safe system interaction
    uiop:run-program uiop:getenv uiop:merge-pathnames* uiop:file-exists-p
    uiop:directory-exists-p uiop:read-file-string uiop:split-string
    ;; Time
    get-universal-time get-internal-real-time sleep
    ;; Equality
    equalp = equal eq eql))
  "Static whitelist of symbols permitted in the Lisp Validator sandbox."

(defvar *lisp-validator-registry* nil
  "List of dynamically registered safe symbols.")

(defun lisp-validator-register (symbols)
  "Adds symbols to the global validator registry."
  (setf *lisp-validator-registry*
        (append *lisp-validator-registry*
                (if (listp symbols) symbols (list symbols))))
  (harness-log "LISP VALIDATOR: Registered ~a new safe symbols."
               (length (if (listp symbols) symbols (list symbols)))))

(defun lisp-validator-is-safe (symbol)
  "Checks if a symbol is in the static whitelist or the dynamic registry."
  (or (member symbol *lisp-validator-whitelist* :test #'string-equal)
      (member symbol *lisp-validator-registry* :test #'string-equal)))

(defun lisp-validator-ast-walk (form)
  "Recursively walks the Lisp AST. Returns T if safe, NIL if unsafe."
  (cond
    ;; Self-evaluating objects are safe.
    ((or (stringp form) (numberp form) (keywordp form) (characterp form)) t)
    ;; Symbols used as variables (in non-function position)
    ((symbolp form) (lisp-validator-is-safe form))
    ;; Lists represent function calls or special forms.
    ((listp form)
     (let ((head (car form)))
       (cond
         ((eq head 'quote) t)
         ((not (symbolp head)) nil)
         ((lisp-validator-is-safe head)
          (every #'lisp-validator-ast-walk (cdr form)))
         (t
          (harness-log "LISP VALIDATOR: Blocked call to non-whitelisted function ~a" head)
          nil))))
    (t nil)))

(defun lisp-validator-check-semantic (code-string)
  "Checks if all symbols in CODE-STRING are whitelisted.
Returns (VALUES t nil) if clean, or (VALUES nil reason-string nil nil)."
  (handler-case
      (let ((*read-eval* nil))
        (with-input-from-string (stream (format nil "(progn ~a)" code-string))
          (loop for form = (read stream nil :eof)
                until (eq form :eof)
                do (unless (lisp-validator-ast-walk form)
                     (return-from lisp-validator-check-semantic
                       (values nil "Code contains non-whitelisted symbols." nil nil)))))
        (values t nil nil nil))
    (error (c)
      (values nil (format nil "Semantic check failed: ~a" c) nil nil))))

Unified Entry Point

Orchestrates the three checks and returns a single structured plist.

(defun lisp-validator-validate (code-string &key strict)
  "Validates Lisp code through structural, syntactic, and optional semantic checks.
Returns a plist:
  (:status :success :checks (:structural t :syntactic t :semantic t))
or
  (:status :error :failed <check-key> :reason <string> :line <n> :col <n>)

When STRICT is non-nil, the semantic whitelist check is enforced.
When STRICT is nil, semantic check is skipped for general validation."
  (let ((structural-ok nil) (syntactic-ok nil) (semantic-ok nil)
        (reason nil) (line nil) (col nil))
    ;; Phase 1: Structural
    (multiple-value-setq (structural-ok reason line col)
      (lisp-validator-check-structural code-string))
    (unless structural-ok
      (return-from lisp-validator-validate
        (list :status :error :failed :structural :reason reason :line line :col col)))
    ;; Phase 2: Syntactic
    (multiple-value-setq (syntactic-ok reason line col)
      (lisp-validator-check-syntactic code-string))
    (unless syntactic-ok
      (return-from lisp-validator-validate
        (list :status :error :failed :syntactic :reason reason :line line :col col)))
    ;; Phase 3: Semantic (only when strict)
    (when strict
      (multiple-value-setq (semantic-ok reason line col)
        (lisp-validator-check-semantic code-string))
      (unless semantic-ok
        (return-from lisp-validator-validate
          (list :status :error :failed :semantic :reason reason :line line :col col))))
    ;; All clear
    (list :status :success
          :checks (list :structural t :syntactic t :semantic (or (not strict) semantic-ok)))))

Cognitive Tool: :validate-lisp

Exposes the validator to the Probabilistic Engine so it can self-correct before presenting code.

(def-cognitive-tool :validate-lisp
  "Deterministically validates Lisp code for structural, syntactic, and semantic correctness.
Use this BEFORE declaring any Lisp code edit complete."
  ((:code :type :string :description "The Lisp code string to validate.")
   (:strict :type :boolean :description "If non-nil, enforces the semantic whitelist."))
  :body (lambda (args)
          (let ((code (getf args :code))
                (strict (getf args :strict)))
            (if (and code (stringp code))
                (lisp-validator-validate code :strict strict)
                (list :status :error :reason "Missing :code argument.")))))

Skill Registration

(defskill :skill-lisp-validator
  :priority 900
  :trigger (lambda (ctx)
             ;; Trigger on any eval or shell action, or when validation is explicitly requested
             (let ((candidate (getf ctx :approved-action)))
               (when candidate
                 (let ((payload (getf candidate :payload)))
                   (member (getf payload :action) '(:eval :shell))))))
  :probabilistic nil
  :deterministic (lambda (action context)
                   (declare (ignore context))
                   (let ((payload (getf action :payload)))
                     (if (eq (getf payload :action) :eval)
                         (let* ((code (getf payload :code))
                                (result (lisp-validator-validate code :strict t)))
                           (if (eq (getf result :status) :error)
                               (progn
                                 (harness-log "LISP VALIDATOR: Blocked unsafe :eval action. ~a"
                                              (getf result :reason))
                                 (list :type :LOG
                                       :payload (list :level :error
                                                      :text (format nil "LISP VALIDATOR: Blocked unsafe eval. ~a"
                                                                    (getf result :reason)))))
                               action))
                         action))))

Phase E: Chaos (Verification)

(defpackage :opencortex-lisp-validator-tests
  (:use :cl :fiveam :opencortex)
  (:export #:lisp-validator-suite))

(in-package :opencortex-lisp-validator-tests)

(def-suite lisp-validator-suite
  :description "Tests for the Lisp Validator structural, syntactic, and semantic gates.")

(in-suite lisp-validator-suite)

(test structural-balanced
  (let ((result (opencortex::lisp-validator-check-structural "(+ 1 2)")))
    (is (eq result t))))

(test structural-unbalanced-open
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-structural "(+ 1 2")
    (is (null ok))
    (is (search "Unbalanced" reason))
    (is (= line 1))))

(test structural-unbalanced-close
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-structural "+ 1 2)")
    (is (null ok))
    (is (search "Unexpected" reason)))

(test structural-mismatched-bracket
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-structural "(let [x 1) x)")
    (is (null ok))
    (is (search "Mismatched" reason))))

(test syntactic-valid
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-syntactic "(+ 1 2) (* 3 4)")
    (is ok)))

(test syntactic-invalid-reader
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-syntactic "(1+ 2 #")")
    (is (null ok))))

(test semantic-safe
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-semantic "(+ 1 2)")
    (is ok)))

(test semantic-blocked-eval
  (multiple-value-bind (ok reason line col)
      (opencortex::lisp-validator-check-semantic "(eval '(+ 1 2))")
    (is (null ok))))

(test unified-success
  (let ((result (opencortex::lisp-validator-validate "(+ 1 2)" :strict t)))
    (is (eq (getf result :status) :success))
    (is (getf (getf result :checks) :structural))
    (is (getf (getf result :checks) :syntactic))
    (is (getf (getf result :checks) :semantic))))

(test unified-structural-failure
  (let ((result (opencortex::lisp-validator-validate "(+ 1 2" :strict nil)))
    (is (eq (getf result :status) :error))
    (is (eq (getf result :failed) :structural))))

(test unified-semantic-failure-strict
  (let ((result (opencortex::lisp-validator-validate "(delete-file \"x.txt\")" :strict t)))
    (is (eq (getf result :status) :error))
    (is (eq (getf result :failed) :semantic))))