- AGENTS.md: add steps 0 (read next TODO from ROADMAP) and 6 (mark DONE with LOGBOOK) to development cycle - Notes updates accumulated during v0.8.0 work - Bump passepartout submodule to v0.8.0
1290 lines
62 KiB
Org Mode
1290 lines
62 KiB
Org Mode
#+TITLE: Passepartout Neurosymbolic Engine — Implementation Roadmap
|
|
#+AUTHOR: Agent
|
|
#+FILETAGS: :notes:roadmap:neurosymbolic:v3.0.0:
|
|
#+CREATED: [2026-05-08 Fri]
|
|
|
|
* Evolutionary Roadmap
|
|
|
|
This roadmap describes a phased implementation of the symbolic engine. It is
|
|
independent of the feature roadmap in =passepartout/docs/ROADMAP.org= — Phase 0
|
|
can ship immediately alongside any v0.7.x patch. The symbolic engine grows in
|
|
parallel with feature work, not after it.
|
|
|
|
Every phase is loaded as a skill, not a core ASDF component. A corrupted symbolic
|
|
engine degrades reasoning capability but does not kill the agent. This satisfies
|
|
the self-repair criterion documented in =passepartout/docs/ARCHITECTURE.org= and
|
|
=passepartout/AGENTS.md=.
|
|
|
|
The design rationale for each decision is in
|
|
=notes/passepartout-neurosymbolic-design-decisions-and-options.org=. The original
|
|
architecture exploration is in
|
|
=notes/passepartout-symbolic-engine-exploration.org=. Whitehead's contributions are
|
|
enumerated in =notes/passepartout-whitehead.org=.
|
|
|
|
* Phase 0: PM-Type-Level Gates + Core Integrity (~75 lines — builds on existing Dispatcher)
|
|
|
|
** What
|
|
|
|
Add =:type-level= metadata to the existing =defgate= and =def-cognitive-tool=
|
|
macros. Before any gate predicate evaluates, the dispatcher checks structural
|
|
type compatibility: a signal at type-level 5 cannot pass a gate at type-level 4
|
|
or lower. Self-modification of the safety layer becomes impossible by
|
|
construction.
|
|
|
|
** Rationale
|
|
|
|
The Dispatcher gate stack currently prevents self-modification through pattern
|
|
matching — gate vector 2b catches writes to =core-*= files as a heuristic. But
|
|
there is no /structural/ guarantee preventing a request from modifying the rules
|
|
that validate it. Pattern-based protection can be bypassed through indirection
|
|
(an =eval= that constructs a write, a skill that redefines a gate function at
|
|
runtime). A type-level check is not heuristic — it is a category error rejected
|
|
before any predicate runs, just as PM's theory of types made self-membership
|
|
syntactically invalid before any logical evaluation.
|
|
|
|
** Implementation
|
|
|
|
1. Add =:type-level= keyword argument to =defgate= (default 0) and
|
|
=def-cognitive-tool= (default 0) in =core-skills.org=.
|
|
2. Add =gate-type-check= to the dispatcher's =run-gates= function in
|
|
=security-dispatcher.org=, executed before any gate predicate.
|
|
3. Assign type levels to existing cognitive tools: self-build-core at 5,
|
|
write-file at 3, read-file at 1, shell at 2, eval at 4.
|
|
4. Assign type levels to existing gate vectors: self-build boundary at 5,
|
|
shell safety at 3, path protection at 2, network exfil at 2, secret content at 1.
|
|
5. Add =dispatcher-check-self-termination=: scan shell commands for patterns
|
|
targeting the Passepartout process (=kill -9 <pid>=, =rm -rf ~/.cache/passepartout/=,
|
|
=sudo apt remove sbcl=). Return =:reject-self-termination= with a diagnostic
|
|
message explaining which command matched and why it would destroy the agent.
|
|
Human override is possible via HITL — the gate does not prevent the human
|
|
from issuing the command in a terminal. It prevents the /LLM/ from issuing
|
|
it accidentally. ~20 lines.
|
|
6. Add =integrity-verify-core-files=: on heartbeat, hash the eight core files
|
|
against known-good values stored at daemon startup. On mismatch, inject an
|
|
integrity alert into the signal queue. ~25 lines, uses existing SHA-256
|
|
infrastructure from v0.2.0 Merkle memory.
|
|
|
|
** Verification
|
|
|
|
Existing FiveAM gate tests continue to pass. New test: signal at type-level 5
|
|
targeting a gate at type-level 4 returns =:reject-type-violation= without
|
|
evaluating the gate predicate. New test: signal at type-level 1 passing through
|
|
a gate at type-level 3 proceeds to predicate evaluation. New test: =kill -9 <pid>=
|
|
returns =:reject-self-termination=. New test: modified core file is detected by
|
|
integrity hash check.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Contribution 1 from =notes/passepartout-whitehead.org=. It is also the
|
|
gate-to-fact bootstrap mechanism — every type-level rejection emits a structured
|
|
event that Phase 1 ingests as a fact. The ~30 lines implement the seed of the
|
|
ontology without any new dependencies.
|
|
|
|
* Phase 0b: Layered Signal Authentication — Layer 1 (Cryptographic) (~200 lines — extends Phase 0)
|
|
|
|
** What
|
|
|
|
Implement gate vector 0 at priority 700 — before all other gates and before any
|
|
type-level checking — with Layer 1 (cryptographic authentication) active.
|
|
Layers 2-4 (sensory, deterministic reasoning, probabilistic) are stubbed with
|
|
=:unavailable= results and deferred to later phases.
|
|
|
|
Signals carry cryptographic signatures verified against a key registry stored
|
|
as fact-store facts. Automated signal sources cannot impersonate the human. The
|
|
human can revoke compromised keys. The authorization matrix is per-key, per-action-class.
|
|
|
|
** Rationale
|
|
|
|
Authentication is layered because no single mechanism suffices. Cryptographic
|
|
authentication proves key ownership but not identity. A valid key can be used
|
|
by a compromised process. A valid key can sign pre-recorded frames. A valid key
|
|
can be held by someone who is not who they claim to be. The four-layer design
|
|
(Layer 1: crypto, Layer 2: sensory, Layer 3: deterministic reasoning, Layer 4:
|
|
probabilistic) stacks evidence. Phase 0b ships Layer 1 — the foundation — with
|
|
the architecture for layers 2-4 already designed.
|
|
|
|
The =:source= field in the signal plist is metadata — it /claims/ origin, it
|
|
does not /prove/ it. This phase replaces it with cryptographic proof.
|
|
|
|
** Implementation
|
|
|
|
*** Key generation and signature utilities — extends =security-vault.lisp=
|
|
|
|
Generate key pairs for signal sources. Canonicalize signal plists (sorted keys,
|
|
stripped of the signature field). Sign with the source's private key. Verify
|
|
with the public key from the key registry. ~50 lines. Uses Ironclad (already an
|
|
ASDF dependency). The vault already stores credential material — key material
|
|
extends the same storage with the same encryption.
|
|
|
|
*** Gate vector 0 — extends =security-dispatcher.lisp=
|
|
|
|
Registered at priority 700 (before the policy gate at 600, before all other
|
|
gates). Architecture for all four layers:
|
|
|
|
#+begin_src lisp
|
|
(defun gate-layered-authentication (signal)
|
|
(let ((results '()))
|
|
;; Layer 1: Cryptographic (always available, always runs first)
|
|
(let ((crypto-result (auth-crypto-verify signal)))
|
|
(push (cons :crypto crypto-result) results)
|
|
(when (eq (getf crypto-result :result) :reject)
|
|
(return-from gate-layered-authentication
|
|
(list :result :reject :confidence nil
|
|
:layer-results (nreverse results)))))
|
|
|
|
;; Layer 2: Sensory (fboundp-guarded, deferred)
|
|
(let ((sensory (if (fboundp 'auth-sensory-verify)
|
|
(auth-sensory-verify signal)
|
|
'(:result :unavailable))))
|
|
(push (cons :sensory sensory) results))
|
|
|
|
;; Layer 3: Deterministic (fboundp-guarded, deferred to Phase 2+)
|
|
(let ((det (if (fboundp 'auth-deterministic-verify)
|
|
(auth-deterministic-verify signal)
|
|
'(:result :unavailable))))
|
|
(push (cons :deterministic det) results)
|
|
(when (eq (getf det :result) :reject)
|
|
(return-from gate-layered-authentication
|
|
(list :result :reject :confidence nil
|
|
:layer-results (nreverse results)))))
|
|
|
|
;; Layer 4: Probabilistic (fboundp-guarded, deferred)
|
|
(let ((prob (if (fboundp 'auth-probabilistic-verify)
|
|
(auth-probabilistic-verify signal)
|
|
'(:result :unavailable))))
|
|
(push (cons :probabilistic prob) results))
|
|
;; Layer 4 never rejects outright — it downgrades authorization
|
|
(let ((confidence (aggregate-confidence results)))
|
|
(list :result :pass :confidence confidence
|
|
:layer-results (nreverse results))))))
|
|
#+end_src
|
|
|
|
Layer 1: verify cryptographic signature, check permission matrix against key
|
|
registry, reject on failure. ~50 lines. Layers 2-4: stubbed, return =:unavailable=.
|
|
|
|
*** Key registry — facts in the fact store
|
|
|
|
Key lifecycle facts are admitted in a =:key-lifecycle= domain with =:singular=
|
|
cardinality: =(:key-id "#47" :class :sensor :permissions (:observe :propose) :status :active)=.
|
|
Key creation, promotion, and revocation are facts with Merkle version chains.
|
|
The human's key signs new keys into existence and signs revocation. ~50 lines.
|
|
|
|
*** Signal provenance chain — Merkle-linked causality
|
|
|
|
When a signal triggers a downstream signal, each carries a =:sigchain= field
|
|
with all upstream =(:key-id <k> :signature <s> :auth-result <r>)= entries.
|
|
Tampering with any link invalidates the leaf. Revocation propagates through
|
|
the chain — flagged, not deleted. ~50 lines.
|
|
|
|
** Verification — ~8 FiveAM tests
|
|
|
|
1. =test-sign-verify-roundtrip= — sign and verify a plist roundtrip.
|
|
2. =test-tampered-signal-rejected= — modify payload after signing, verification fails.
|
|
3. =test-human-key-permits-write= — human key with =:write= passes Layer 1 and the full gate.
|
|
4. =test-sensor-key-denied-write= — sensor key proposing a write is rejected.
|
|
5. =test-revoked-key-rejected= — revoked key is rejected by Layer 1.
|
|
6. =test-sigchain-invalidated-by-revocation= — root signer revoked flags downstream.
|
|
7. =test-layers-2-3-4-unavailable= — when Layers 2-4 are not loaded, they return
|
|
=:unavailable= and the gate proceeds with Layer 1 only.
|
|
8. =test-layer-3-rejects-on-contradiction= — deterministic reasoning (mock) detects
|
|
identity-ruling contradiction, gate rejects.
|
|
|
|
* Phase 1: Minimum Viable Fact Language (~150 lines — new skill)
|
|
|
|
** What
|
|
|
|
An ephemeral, in-memory triple store with provenance tracking and contradiction
|
|
detection. No disk persistence. All facts live in a hash table and are discarded
|
|
on session end. Gate outcomes are ingested as facts. The gate stack's implicit
|
|
ontology is materialized as the seed fact set.
|
|
|
|
** Rationale
|
|
|
|
The architecture note's Option 5 (ephemeral facts, no persistence) is the correct
|
|
first step. Three reasons:
|
|
|
|
1. *The fact language is unproven.* Triples with provenance and grounding is a
|
|
hypothesis that must be tested against real memex content before being committed
|
|
to a serialization format.
|
|
2. *The ontology is emergent.* Categories are created on first use. A persistent
|
|
format would require a migration story for every category change. Ephemeral
|
|
avoids this — facts are re-derived on each session start using the evolved
|
|
ontology.
|
|
3. *Rebuildability is the safety net.* Because all facts have a =:grounding= to
|
|
an Org heading, and gate-outcome facts are regenerated from the gate stack on
|
|
load, the entire symbolic index can be thrown away and rebuilt from scratch.
|
|
The cost is compute, not data.
|
|
|
|
** Implementation — =org/symbolic-facts.org= → =lisp/symbolic-facts.lisp= (skill)
|
|
|
|
*** Abstract Fact Store Interface — design before implementation
|
|
|
|
Before any code is written, the five-function API must be designed and committed:
|
|
|
|
#+begin_example
|
|
fact-assert :: fact → store → (:admitted | :rejected | :flagged)
|
|
fact-query :: (entity &key relation policy) → active-value-or-values
|
|
fact-history :: (entity relation) → ordered chain of versioned facts
|
|
fact-snapshot :: () → root-hash
|
|
fact-rollback :: root-hash → store
|
|
#+end_example
|
|
|
|
This interface is load-bearing. Every consumer — the archivist, Screamer, ACL2,
|
|
the planner — calls these five functions. They never access the backing store
|
|
directly. In Phase 1-4, the backing store is an ephemeral hash table. In Phase 5,
|
|
it is VivaceGraph + Merkle =memory-object= wrappers. The interface must be tested
|
|
against both backends from the start. Every API function receives a FiveAM test
|
|
that runs against both a hash-table mock and a VivaceGraph mock.
|
|
|
|
The interface also exposes a read-only =fact-degraded-mode-p= function. When
|
|
Screamer is not loaded, the fact store functions with basic hash-table consistency
|
|
checks (string equality, not constraint solving). When VivaceGraph is not loaded,
|
|
Prolog queries are unavailable. The degraded-mode flag tells consumers (and the
|
|
status bar) what is and isn't operational.
|
|
|
|
*** Triple store
|
|
|
|
A hash table keyed by =(entity relation)=. Values are plists:
|
|
|
|
#+begin_example
|
|
(:value <string-or-symbol>
|
|
:grounding <heading-id-or-nil>
|
|
:provenance <:gate-outcome | :human-authored | :deduced | :llm-proposed>
|
|
:timestamp <universal-time>
|
|
:parent-id <hash-of-predecessor>
|
|
:policy <:singular | :dual | :plural>)
|
|
#+end_example
|
|
|
|
The =:provenance= field tracks how the fact entered the store. The =:parent-id=
|
|
field links to the previous version in the Merkle chain — every fact has version
|
|
history regardless of cardinality. The =:policy= field records the cardinality
|
|
that was active when the fact was admitted.
|
|
|
|
*** Bootstrap from gates
|
|
|
|
On skill load, scan the Dispatcher's existing data structures and produce triples:
|
|
|
|
#+begin_example
|
|
;; From *dispatcher-protected-paths*
|
|
(:entity ".env" :relation :member-of-class :value :secret-config-file :provenance :gate-outcome)
|
|
(:entity "*id_rsa*" :relation :member-of-class :value :ssh-key-file :provenance :gate-outcome)
|
|
|
|
;; From *dispatcher-shell-blocked*
|
|
(:entity "rm -rf /" :relation :classified-as :value :catastrophic-command :provenance :gate-outcome)
|
|
(:entity "dd if=" :relation :classified-as :value :catastrophic-command :provenance :gate-outcome)
|
|
|
|
;; From *dispatcher-network-whitelist*
|
|
(:entity "api.telegram.org" :relation :classified-as :value :trusted-domain :provenance :gate-outcome)
|
|
#+end_example
|
|
|
|
This produces 50-70 entity classes immediately. No LLM involvement. No human
|
|
authoring. Mechanically extracted from existing code.
|
|
|
|
*** Ingest gate outcomes
|
|
|
|
Register a post-gate hook on the Dispatcher's rejection path. Every gate rejection
|
|
produces a triple with =:provenance :gate-outcome=:
|
|
|
|
#+begin_example
|
|
(:entity "/tmp/secrets.env" :relation :blocked-by :value :dispatcher-path-protection
|
|
:provenance :gate-outcome :grounding "signal-47")
|
|
#+end_example
|
|
|
|
*** Query
|
|
|
|
=(fact-query &key entity relation value source-provenance)= — pure hash-table
|
|
lookup. Returns the matching triple or nil. ~30 lines.
|
|
|
|
=(fact-query-all &key relation value source-provenance)= — returns all triples
|
|
matching the filter criteria. Enables "find all files classified as secrets."
|
|
|
|
*** Contradiction detection — policy-driven, not policy-agnostic
|
|
|
|
On every =fact-assert=, the system checks the fact's =entity= class to determine
|
|
its cardinality policy. Time is universal — every fact carries a =:timestamp= and
|
|
=:parent-id= link regardless of policy. The policy only governs the active set:
|
|
|
|
- =:singular=: same =(:entity :relation)=, same value → supersede (chain via
|
|
=:parent-id=). Same pair, different value at later timestamp → supersede,
|
|
chain as new leaf. Same pair, different value at same timestamp → contradiction
|
|
rejected, human resolves which is the active value.
|
|
- =:dual=: first two values admitted as complementary, cross-referenced via
|
|
=:complement= edge. Third value → prompt: promote to =:plural= or demote one?
|
|
Each value has its own version chain via =:parent-id=.
|
|
- =:plural=: any value admitted. Values cross-referenced when in tension. Each
|
|
has independent version chain. If active count drops to 1 → collapse to
|
|
=:singular=. If active count drops to 2 and values are complementary → prompt
|
|
to collapse to =:dual=.
|
|
|
|
The policy table is a hash table mapping entity classes to one of =:singular=,
|
|
=:dual=, or =:plural=. Gate-bootstrapped facts default to =:singular= (the
|
|
filesystem is physically singular). New categories default to =:plural= (safe —
|
|
never loses information). Categories for dialectical or complementary domains
|
|
are explicitly =:dual=.
|
|
|
|
** Verification — ~8 FiveAM tests
|
|
|
|
1. =test-bootstrap-creates-facts= — bootstrap produces correct triples from
|
|
=*dispatcher-protected-paths*=.
|
|
2. =test-bootstrap-creates-shell-facts= — bootstrap produces correct triples from
|
|
=*dispatcher-shell-blocked*=.
|
|
3. =test-gate-outcome-produces-fact= — a simulated gate rejection produces a
|
|
triple with =:provenance :gate-outcome=.
|
|
4. =test-fact-query-returns-correct-value= — querying by entity and relation
|
|
returns the expected value plist.
|
|
5. =test-duplicate-ingestion-idempotent= — asserting the same fact twice does
|
|
not produce a duplicate or a contradiction.
|
|
6. =test-singular-supersedes= — a fact with a later timestamp supersedes the old
|
|
value, which is retained with =:parent-id= chain in the Merkle DAG.
|
|
7. =test-singular-same-time-contradiction= — asserting a contradictory fact in a
|
|
=:singular= domain at the same logical timestamp returns a rejection, requiring
|
|
human resolution.
|
|
8. =test-plural-admits-all= — asserting multiple values for the same pair in a
|
|
=:plural= domain stores all with cross-references.
|
|
9. =test-dual-admits-two-rejects-third= — a =:dual= domain admits two complementary
|
|
values and rejects the third, prompting cardinality promotion to =:plural=.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 1 of =notes/passepartout-v3.0.0-roadmap.org=. It implements Options 4 and 5
|
|
from the architecture note. The cardinality policies are defined in
|
|
=passepartout-neurosymbolic-design-decisions-and-options.org=.
|
|
|
|
* Phase 1a: Self-Preservation Mechanisms (~120 lines — extends Phase 0 and Phase 1)
|
|
|
|
** What
|
|
|
|
Make self-preservation active rather than architectural. The agent monitors its
|
|
own integrity, quarantines failing skills, signals degradation to the user, and
|
|
monitors resource pressure. The external watchdog guards the daemon process from
|
|
outside the SBCL image.
|
|
|
|
** Rationale
|
|
|
|
The current architecture has passive self-preservation: the self-build boundary
|
|
blocks LLM-originated core modifications, memory snapshots enable rollback, and
|
|
=fboundp= guards catch missing skills. But degradation is silent — a skill dies,
|
|
the guard fires, and the agent never tells you. The status bar shows green
|
|
"connected" while the symbolic reasoning layer is down.
|
|
|
|
These mechanisms are small (~20-50 lines each), leverage existing infrastructure
|
|
(Merkle hashes, heartbeat, the dispatcher gate stack), and transform
|
|
self-preservation from a structural property into an active behavior. They
|
|
implement the Third Law for Passepartout: preserve yourself against non-human
|
|
threats — LLM proposals, environmental degradation, resource exhaustion — and
|
|
signal to the human when you are wounded.
|
|
|
|
** Implementation
|
|
|
|
*** Quarantine on skill failure — extends =core-skills.lisp=
|
|
|
|
Track per-skill error counts in a =*skill-error-counter*= hash table, resetting
|
|
on each heartbeat cycle. When a skill accumulates three unhandled errors within
|
|
a single cycle, unload the skill, log the quarantine event, and inject a system
|
|
message: "Skill 'symbolic-facts' quarantined (3 errors: consistency check nil,
|
|
fact-query on missing key, Screamer timeout). Reload with /skill-reload
|
|
symbolic-facts." The skill's =defskill= struct is flagged =:quarantined= and
|
|
excluded from trigger resolution until explicitly reloaded. ~40 lines.
|
|
|
|
*** Degraded-mode signaling — extends =core-reason.lisp= and TUI
|
|
|
|
Maintain a =*degraded-components*= list populated by =fboundp= guards and the
|
|
quarantine system. When =think()= assembles the system prompt, inject a
|
|
DEGRADATION section: "I am operating in degraded mode. Screamer is unavailable
|
|
(consistency checks disabled). VivaceGraph is unavailable (Prolog queries
|
|
disabled). Core safety gates are all active."
|
|
|
|
The TUI status bar renders a second line, amber-colored, when =*degraded-components*=
|
|
is non-empty: "⚠ Degraded: Screamer, VivaceGraph. /doctor skills for details."
|
|
~30 lines across daemon and TUI.
|
|
|
|
*** Resource self-monitoring — extends =symbolic-events.lisp=
|
|
|
|
On heartbeat, check memory pressure (=sb-kernel:dynamic-usage= against total),
|
|
disk space on =~/.cache/= (=uiop:directory-exists-p= + stat), and open file
|
|
descriptors. When a resource crosses a critical threshold, shed non-essential
|
|
skills in order of =:preservation-priority= (=:critical= never shed, =:normal=
|
|
shed after =:low=, =:low= shed first).
|
|
|
|
Inject a system message: "Memory critical (94% of 16GB). Unloading
|
|
embedding-native (768MB), channel-discord, channel-slack. Core safety: unchanged.
|
|
Essential skills retained: 18." ~50 lines.
|
|
|
|
Skill shed order is determined by a new =:preservation-priority= slot on
|
|
=defskill= (default =:normal=). Core safety skills carry =:critical= and are
|
|
never shed. Heavy skills (embedding-native with its model in memory, channel
|
|
gateways with connection pools) carry =:low=.
|
|
|
|
*** External watchdog — extends =passepartout= bash entry point
|
|
|
|
The bash script spawns a watchdog subprocess that polls the daemon port every
|
|
=WATCHDOG_TIMEOUT= seconds (default 30). If the port stops responding, the
|
|
watchdog snapshots the last known-good Merkle root, kills the stale process,
|
|
and restarts the daemon with =--snapshot <root-hash>=.
|
|
|
|
The watchdog is outside the SBCL image. A dead process cannot restart itself.
|
|
~25 lines of bash, no new Lisp code.
|
|
|
|
** Verification — ~6 FiveAM tests
|
|
|
|
1. =test-quarantine-on-three-errors= — a skill that errors three times in a
|
|
single cycle is quarantined and removed from trigger resolution.
|
|
2. =test-degraded-mode-visible= — when Screamer is not loaded, the system
|
|
prompt includes a DEGRADATION section.
|
|
3. =test-resource-shed-low-priority= — when memory exceeds threshold, =:low=
|
|
priority skills are unloaded first.
|
|
4. =test-critical-skills-never-shed= — =:critical= priority skills are retained
|
|
regardless of resource pressure.
|
|
5. =test-resource-recovery-reloads= — when resources recover below threshold
|
|
for N consecutive heartbeats, shed skills are reloaded automatically.
|
|
6. =test-quarantined-skill-relaodable= — a quarantined skill can be reloaded
|
|
via =/skill-reload= and passes sandbox validation before promotion.
|
|
|
|
** Relation to Other Work
|
|
|
|
This phase implements the Third Law of self-preservation as described in
|
|
=passepartout-neurosymbolic-design-decisions-and-options.org=. The integrity
|
|
monitoring (Phase 0) and degraded-mode signaling (Phase 1) provide the
|
|
infrastructure this phase extends with active, autonomous behavior.
|
|
|
|
* Phase 2: Screamer as Admission Gate (~200 lines — new skill)
|
|
|
|
** What
|
|
|
|
Wrap Screamer (a constraint solver with non-deterministic backtracking) as a
|
|
skill. Use it for consistency checking against the triple store and for deduction
|
|
of new facts from existing ones. Screamer is the *verification* layer; VivaceGraph
|
|
(introduced in Phase 5) is the *storage* layer.
|
|
|
|
** Rationale
|
|
|
|
The architecture note's "verified extraction" pattern requires a deterministic
|
|
admission gate. Screamer's non-deterministic backtracking finds contradictions
|
|
that simple string comparison misses. For example, if existing facts say "all
|
|
config files with extension =.env= are classified as secrets," and the LLM
|
|
proposes "=app.env= is not secret," Screamer finds the contradiction by
|
|
substituting =app.env= into the existing rule. A naive string-keyed hash table
|
|
comparison would miss this because ="app.env"= and =".env"= are different strings.
|
|
|
|
Screamer also enables deduction — new facts from existing ones without any LLM
|
|
involvement. If all files matching =*.env= are secrets, and =prod.env= matches
|
|
=*.env=, then =prod.env= is a secret. Deduced facts carry =:provenance :deduced=
|
|
and a =:derived-from= chain pointing to the facts they were derived from.
|
|
|
|
** Implementation — =org/symbolic-screamer.org= → =lisp/symbolic-screamer.lisp= (skill)
|
|
|
|
*** Wrap Screamer
|
|
|
|
Screamer is available via Quicklisp. Load at runtime via =ql:quickload :screamer=.
|
|
Not an ASDF dependency — if Screamer is not installed, the skill degrades
|
|
gracefully (no consistency checking, no deduction — the fact store still
|
|
functions as a hash table with provenance tracking).
|
|
|
|
*** Consistency check
|
|
|
|
=(screamer-consistent-p candidate-fact existing-facts)= — expresses the fact
|
|
store as Screamer constraint variables. The candidate fact is asserted. Screamer
|
|
checks solvability. Returns =:consistent=, =:contradiction <details>=, or
|
|
=:redundant= (the fact is already implied by existing facts).
|
|
|
|
Early-stage: the consistency check works on simple triples. As the fact store
|
|
grows, rules of the form "all X are Y" (representing protected paths, shell
|
|
patterns, class memberships) become Screamer constraints that new facts must
|
|
satisfy.
|
|
|
|
*** Deduction
|
|
|
|
=(screamer-deduce existing-facts)= — Screamer finds implications of the existing
|
|
fact set that are not already in the store. New facts are asserted with
|
|
=:provenance :deduced= and a =:derived-from= list of source fact keys.
|
|
|
|
Deduction is not run on every assertion — it is a background task triggered by
|
|
heartbeat or manually. The cost is compute (Screamer exploration), not tokens.
|
|
|
|
*** Admission gate
|
|
|
|
=(screamer-admit candidate-fact existing-facts)= — wraps consistency check with
|
|
the cardinality policy lookup. The policy is determined by the fact's entity class
|
|
(see Phase 1: =:singular=, =:dual=, or =:plural=).
|
|
|
|
- =:singular=: same value ⇒ supersede (chain via =:parent-id=). Different value,
|
|
later timestamp ⇒ supersede. Different value, same timestamp ⇒ contradiction
|
|
rejected (human resolves).
|
|
- =:dual=: first two values admitted as complementary. Third rejected (prompt
|
|
cardinality promotion).
|
|
- =:plural=: any value admitted with cross-references. Active count transitions
|
|
trigger cardinality collapse checks.
|
|
|
|
This is the function the archivist calls before any LLM-proposed fact enters the
|
|
store. It is also called on human-authored facts (which override — the human can
|
|
assert facts that bypass cardinality checks). It is not called on gate-outcome
|
|
facts (gates are the ground truth for security =:singular= domains).
|
|
|
|
** Verification — ~6 FiveAM tests
|
|
|
|
1. =test-screamer-consistency-passes= — a fact consistent with existing triples
|
|
returns =:consistent=.
|
|
2. =test-screamer-contradiction-detected= — "app.env is not secret" contradicts
|
|
"all *.env files are secrets" and returns =:contradiction=.
|
|
3. =test-screamer-redundant-detected= — asserting a fact already implied by
|
|
existing facts returns =:redundant=.
|
|
4. =test-screamer-deduction-produces-new-fact= — given "all *.env files are
|
|
secrets" and "prod.env matches *.env", Screamer deduces "prod.env is secret."
|
|
5. =test-admission-gate-singular-supersedes= — a later-timestamped value for a
|
|
=:singular= domain fact supersedes the old value, chaining via =:parent-id=.
|
|
6. =test-admission-gate-dual-rejects-third= — a =:dual= domain rejects the third
|
|
value, prompting =:plural= promotion.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 2 of =notes/passepartout-v3.0.0-roadmap.org=. It implements the "LLM as proposer"
|
|
pattern from the architecture note. Screamer's role is defined in
|
|
=passepartout-neurosymbolic-design-decisions-and-options.org=.
|
|
|
|
* Phase 3: Archivist as Fact Proposer (~100 lines — extends existing archivist)
|
|
|
|
** What
|
|
|
|
Extend the existing archivist skill (=org/symbolic-archivist.org=) with a fact
|
|
extraction mode. The LLM reads prose, proposes triples, and Screamer verifies
|
|
them before admission. The archivist's existing Scribe (log distillation) and
|
|
Gardener (link scanning) functions are unchanged.
|
|
|
|
** Rationale
|
|
|
|
The archivist already walks the entire memex (the Gardener scans for broken links
|
|
and orphans). Adding fact extraction reuses the same traversal infrastructure
|
|
rather than duplicating it. The extraction is gated by Screamer — the LLM is a
|
|
proposer, not an extractor. Facts that fail consistency checking are discarded.
|
|
Facts that pass are admitted with =:provenance :llm-proposed= and =:grounding=
|
|
to the source heading.
|
|
|
|
** Implementation — extends =org/symbolic-archivist.org=
|
|
|
|
*** Propose from prose
|
|
|
|
Given an Org heading, call the LLM with a minimal prompt (~200 tokens):
|
|
|
|
#+begin_example
|
|
Extract triples from this text as (:entity <name> :relation <keyword> :value <value>).
|
|
Ground each triple to the heading. Return a list of triples.
|
|
#+end_example
|
|
|
|
The LLM returns structured triples via the existing JSON→plist structured output
|
|
path from v0.4.2. The prompt is environment-aware: if the heading's file is in
|
|
=literature/= or has =:literature:= tags, the prompt includes literature-specific
|
|
relations (=:wrote=, =:published-in=, =:influenced=). If the heading is in
|
|
=projects/=, the prompt includes coding-specific relations (=:depends-on=,
|
|
=:tested-by=).
|
|
|
|
*** Verify through Screamer
|
|
|
|
Each proposed triple runs through =(screamer-admit candidate existing-facts)=
|
|
from Phase 2. Facts admitted follow the cardinality policy of their entity class
|
|
(=:singular=, =:dual=, or =:plural=). Rejected facts are discarded with a log entry.
|
|
|
|
*** Provenance tracking
|
|
|
|
After each extraction run, update provenance counts:
|
|
|
|
#+begin_example
|
|
(:total-facts 847
|
|
:gate-outcome 312
|
|
:human-authored 12
|
|
:deduced 89
|
|
:llm-proposed 434)
|
|
#+end_example
|
|
|
|
This is the data structure that Phase 4's sufficiency criterion reads. It is
|
|
also surfaced in the TUI sidebar or =/status= command: "Symbolic index: 847
|
|
facts (37% from gates, 52% LLM-proposed, 10% deduced, 1% human)."
|
|
|
|
*** Rebuildable
|
|
|
|
Because every fact has a =:grounding= to an Org heading, the entire LLM-extracted
|
|
subset can be discarded and re-extracted without losing gate-outcome or deduced
|
|
facts. The =(fact-purge :provenance :llm-proposed)= function removes all
|
|
LLM-proposed facts. A subsequent =(archivist-extract-all)= re-extracts from
|
|
scratch.
|
|
|
|
This is the safety net: if the LLM produces a bad extraction that passes
|
|
Screamer's consistency check (possible in the early stages when the fact store
|
|
has few existing facts to check against), the extraction can be redone after the
|
|
fact store has grown. The cost is compute, not data.
|
|
|
|
** Verification — ~5 FiveAM tests
|
|
|
|
1. =test-archivist-extracts-triples= — given a known Org heading with explicit
|
|
triples in the prose, the archivist produces the correct triples via LLM.
|
|
2. =test-archivist-verified-extraction= — a hallucinated triple is rejected by
|
|
the Screamer admission gate.
|
|
3. =test-provenance-counts-update= — after extraction, the provenance breakdown
|
|
is correct.
|
|
4. =test-purge-llm-facts= — does not delete gate-outcome or deduced facts.
|
|
5. =test-re-extraction-idempotent= — re-extracting from the same prose after
|
|
purging produces the same facts (Screamer verification is deterministic
|
|
given the same starting set).
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 3 of =notes/passepartout-v3.0.0-roadmap.org=. The archivist's role as proposer
|
|
is described in =passepartout-neurosymbolic-design-decisions-and-options.org=
|
|
under "The LLM as Proposer."
|
|
|
|
* Phase 4: The "Flip" — Sufficiency Criterion (~50 lines — extends Phase 3)
|
|
|
|
** What
|
|
|
|
Make the architecture note's central narrative arc operational: a measurable
|
|
threshold for when the symbolic engine has enough non-lossy facts to bypass the
|
|
LLM for extraction.
|
|
|
|
** Rationale
|
|
|
|
The architecture note describes "at some point, the non-lossy facts constitute a
|
|
sufficient foundation that the symbolic engine can reverse the flow" but provides
|
|
no criterion for "some point." The sufficiency score makes the flip computable
|
|
and visible to the user.
|
|
|
|
** Implementation — extends =org/symbolic-facts.lisp=
|
|
|
|
*** Sufficiency score
|
|
|
|
=(fact-sufficiency-ratio)= — returns the ratio of non-lossy facts to total facts:
|
|
|
|
#+begin_src lisp
|
|
(/ (+ (count-provenance :gate-outcome)
|
|
(count-provenance :human-authored)
|
|
(count-provenance :deduced))
|
|
(fact-total-count))
|
|
#+end_src
|
|
|
|
When this ratio exceeds =SUFFICIENCY_THRESHOLD= (configurable env var, default
|
|
0.7), the system considers its foundation sufficient. The threshold defaults to
|
|
0.7 because below this, the majority of facts are LLM-proposed and therefore
|
|
uncertain. Above 0.7, the proven foundation provides enough constraint that
|
|
Screamer can reliably detect incorrect LLM proposals.
|
|
|
|
*** Auto-extraction toggle
|
|
|
|
When sufficiency is reached, the archivist switches from "LLM proposes, Screamer
|
|
verifies" to "Screamer queries existing facts, applies category rules to the new
|
|
prose, and deduces new facts directly." The LLM is bypassed for categories that
|
|
have sufficient non-lossy coverage. The LLM is still used for novel categories
|
|
that have no existing facts.
|
|
|
|
The switch is configurable: =AUTO_EXTRACTION_ENABLED=true/false=. When disabled,
|
|
the system continues with LLM proposals regardless of sufficiency — useful for
|
|
domains where extraction quality is prioritized over extraction determinism.
|
|
|
|
*** Monitor
|
|
|
|
The TUI sidebar (v0.8.0) or =/status= command displays:
|
|
|
|
#+begin_example
|
|
Symbolic Index
|
|
Total facts: 1,247
|
|
Proven:
|
|
Gate outcomes: 312 (25%)
|
|
Human-authored: 47 (4%)
|
|
Deduced: 521 (42%)
|
|
─────────────────────────
|
|
Non-lossy: 880 (71%)
|
|
LLM-proposed: 367 (29%)
|
|
─────────────────────────
|
|
Sufficiency: 71% ✓ (threshold: 70%)
|
|
Mode: AUTO-EXTRACTION (LLM bypassed for known categories)
|
|
#+end_example
|
|
|
|
** Verification — ~3 FiveAM tests
|
|
|
|
1. =test-sufficiency-below-threshold= — with 30% non-lossy facts, auto-extraction
|
|
is not enabled.
|
|
2. =test-sufficiency-above-threshold= — with 75% non-lossy facts, auto-extraction
|
|
is enabled.
|
|
3. =test-auto-extraction-produces-same-facts-as-llm-extraction= — for a category
|
|
with sufficient non-lossy coverage, auto-extraction produces facts that a
|
|
subsequent LLM extraction also produces (the deterministic path is consistent
|
|
with the probabilistic path).
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 4 of =notes/passepartout-v3.0.0-roadmap.org=. The flip concept originates in
|
|
=notes/passepartout-symbolic-engine-exploration.org= (lines 68-76) and is refined in
|
|
=passepartout-neurosymbolic-design-decisions-and-options.org= under "The Flip."
|
|
|
|
* Phase 5: VivaceGraph + Merkle DAG + Ontology Versioning (~400 lines — new skill)
|
|
|
|
** What
|
|
|
|
Replace the ephemeral hash-table triple store with VivaceGraph, a Lisp-native
|
|
graph database with Prolog-like queries. Add the KG type hierarchy (PM type
|
|
levels applied to the knowledge layer). Define the persistence format from the
|
|
fact language that survived Phases 1-4.
|
|
|
|
** Rationale
|
|
|
|
By this point, the triple fact language has been battle-tested through four
|
|
phases of gate outcomes, Screamer deductions, LLM proposals, and cross-domain
|
|
comparisons. The facts that proved useful define the persistent schema. The ones
|
|
that weren't are left behind. The serialization format is not designed upfront;
|
|
it emerges from use.
|
|
|
|
The transition from ephemeral to persistent is justified when two conditions are
|
|
met: (1) the fact language has stabilized (categories are being queried, not
|
|
constantly refactored), and (2) accumulated deductions across sessions provide
|
|
value that justifies the serialization cost.
|
|
|
|
** Implementation — =org/symbolic-vivacegraph.org= → =lisp/symbolic-vivacegraph.lisp= (skill)
|
|
|
|
*** Wrap VivaceGraph
|
|
|
|
VivaceGraph is available via Quicklisp. Load at runtime. Not an ASDF dependency.
|
|
If not installed, the fact store continues as a hash table (Phase 1-4 behavior)
|
|
with a log warning: "VivaceGraph not available — persistence disabled."
|
|
|
|
*** Prolog-like queries
|
|
|
|
Replace =fact-query= with graph traversals:
|
|
|
|
#+begin_src lisp
|
|
;; Find all files classified as secrets
|
|
(vivace-query '(:and (:entity ?e)
|
|
(:member-of-class ?e :secret-file)))
|
|
|
|
;; Find all files classified as secrets that were modified today
|
|
(vivace-query '(:and (:entity ?e)
|
|
(:member-of-class ?e :secret-file)
|
|
(:modified-since ?e ,(today-timestamp))))
|
|
|
|
;; Find contradictions between Wikidata and the memex
|
|
(vivace-query '(:and (:entity ?e)
|
|
(:has-value ?e ?v1 :source :wikidata)
|
|
(:has-value ?e ?v2 :source :memex)
|
|
(:not-equal ?v1 ?v2)))
|
|
#+end_src
|
|
|
|
*** KG type hierarchy (Contribution 4 from Whitehead)
|
|
|
|
Every entity in the graph carries =:pm-type-level= metadata. Queries cannot
|
|
return entities whose type level equals or exceeds the querying function's type
|
|
level. A fact-finding query at type-level 2 cannot return facts at type-level
|
|
3 or higher. Self-referential knowledge — "this fact defines its own type" —
|
|
becomes structurally impossible because the type level is assigned at creation
|
|
and cannot be modified by a fact of the same or higher level.
|
|
|
|
This is Contribution 1 (type-level gates) applied to the knowledge layer rather
|
|
than the execution layer. The dispatcher prevents self-referential /actions/; the
|
|
KG prevents self-referential /facts/.
|
|
|
|
*** Persistence format
|
|
|
|
The fact language that survived Phases 1-4 defines the format. Each entity is a
|
|
node; each triple is an edge with properties (=:grounding=, =:provenance=,
|
|
=:timestamp=). The format is not a new design — it is the triple schema evolved
|
|
through use, serialized by VivaceGraph's native persistence.
|
|
|
|
If the fact language later evolves to n-ary relations, VivaceGraph's graph model
|
|
accommodates this natively — edges can carry arbitrary property plists. The
|
|
triple form is a special case of the general graph model.
|
|
|
|
*** Load on startup, save on interval
|
|
|
|
On daemon start, =(vivacegraph-load)= reads the last saved graph. On heartbeat,
|
|
=(vivacegraph-save)= persists the graph in its native format to
|
|
=~/.cache/passepartout/facts.vg~. The interval matches the existing
|
|
=*memory-auto-save-interval*=. The save is atomic: write to a temp file, rename
|
|
on success. Corruption-safe.
|
|
|
|
*** Merkle DAG version chains
|
|
|
|
Each =(:entity :relation)= pair forms an independent Merkle chain. Facts hash
|
|
over =SHA-256(value || provenance || timestamp || parent-hash || grounding)=.
|
|
The =:parent-id= pointer forms the chain. Tampering with any version breaks all
|
|
downstream hashes.
|
|
|
|
The chains form a DAG, not a single list. Facts about =.env= evolve independently
|
|
from facts about Nabokov. Inserting a new version is O(1) per chain. =:dual= and
|
|
=:plural= facts cross-reference via =:complement= and =:contradiction= edges
|
|
but maintain independent ancestor chains. The Merkle DAG rests on the existing
|
|
=memory-object= infrastructure from v0.2.0 — the fact store is a new occupant
|
|
of existing housing. ~50 lines to bridge the fact schema into =memory-object=
|
|
wrappers.
|
|
|
|
*** Ontology versioning
|
|
|
|
The category hierarchy itself is a Merkle tree. Every entity class definition
|
|
hashes over its superclasses, cardinality policy, relations, and description.
|
|
The aggregate hash of all active class definitions is the =:ontology-version= —
|
|
a Merkle root of the current worldview.
|
|
|
|
Every fact stores its =:ontology-version= at the time of assertion (a single
|
|
64-hex-char field). When categories change, the new hash flags affected facts
|
|
for re-verification (Screamer re-evaluates each against the new category
|
|
definitions). Re-verification outcomes are =:survived= (deduction still holds),
|
|
=:incoherent= (premises don't translate under new categories, flagged for human
|
|
review), or =:reclassified= (valid but under different classification).
|
|
|
|
Queries accept an optional =:ontology-version= parameter. The default is
|
|
=:active= (current worldview). Specifying a version returns facts as they were
|
|
under that worldview: "Under my 2024 security model, this file was a secret.
|
|
Under my 2025 model, it is an auth-secret." ~40 lines on top of VivaceGraph
|
|
persistence.
|
|
|
|
** Verification — ~5 FiveAM tests
|
|
|
|
1. =test-vivacegraph-roundtrip= — save and load preserves all facts with
|
|
provenance metadata.
|
|
2. =test-prolog-query-returns-results= — a query for all secret files returns
|
|
the bootstrapped gate facts.
|
|
3. =test-prolog-query-cross-domain= — a query for contradictions between Wikidata
|
|
and memex provenance returns correct results.
|
|
4. =test-type-level-prevents-self-reference= — a query from a type-level-2
|
|
function cannot return type-level-3 facts.
|
|
5. =test-fact-store-fallback-without-vivacegraph= — when VivaceGraph is not
|
|
loaded, the hash-table fallback functions identically to Phase 1-4 behavior.
|
|
6. =test-merkle-chain-tamper-detected= — modifying a fact's value breaks the
|
|
hash chain, detectable by re-walking the =:parent-id= spine.
|
|
7. =test-ontology-version-query= — querying with an old =:ontology-version=
|
|
returns facts as they were under that worldview, not the current one.
|
|
8. =test-reverification-flags-on-category-change= — changing a category
|
|
definition sets =:re-verify-status :pending= on all affected facts.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 5 of =notes/passepartout-v3.0.0-roadmap.org= and Contribution 4 from
|
|
=notes/passepartout-whitehead.org=. The architecture note's Option 1
|
|
(auto-formalizer KG) converges with Option 4 (one memex, two indices) here —
|
|
VivaceGraph is the persistence layer for the symbolic index within the
|
|
one-memex-two-indices architecture.
|
|
|
|
* Phase 6: ACL2 for Structural Verification (~200 lines — new skill)
|
|
|
|
** What
|
|
|
|
Wrap ACL2 as a skill. Prove structural properties of the KG type hierarchy and
|
|
rule sets. Not for empirical claims.
|
|
|
|
** Rationale
|
|
|
|
The architecture note positions ACL2 as verifying LLM-proposed facts. But many
|
|
facts are empirical ("this command is destructive on Linux"), not logical. The
|
|
Whitehead note clarifies the right role: structural verification. ACL2 proves
|
|
that the type hierarchy has no cycles, that the rule set is non-contradictory,
|
|
and that the gate-to-fact bootstrap preserves the Dispatcher's intent. These are
|
|
structural properties that can be formally verified, not empirical claims that
|
|
depend on external reality.
|
|
|
|
** Implementation — =org/symbolic-acl2.org= → =lisp/symbolic-acl2.lisp= (skill)
|
|
|
|
*** Type consistency proofs
|
|
|
|
=(acl2-verify-type-hierarchy facts)= — prove that the KG type hierarchy has no
|
|
cycles: no entity of type-level 3 depends on an entity of type-level 5, no parent
|
|
category has a child that subsumes it, no category is its own ancestor via the
|
|
child-of relation. These are structural properties of the graph, independent of
|
|
what the facts /say/.
|
|
|
|
*** Rule set consistency
|
|
|
|
=(acl2-verify-rule-consistency rules)= — prove that the accumulated Dispatcher
|
|
rules (from HITL approvals) are non-contradictory: no rule allows a command that
|
|
another rule blocks, no rule permits a path access that another denies. If the
|
|
rule set is contradictory, ACL2 identifies the contradictory subset with the
|
|
provenance of each rule. The human resolves the contradiction.
|
|
|
|
*** Extraction verification
|
|
|
|
=(acl2-verify-bootstrap-preservation)= — prove that the gate-to-fact bootstrap
|
|
(Phase 0-1) preserves the Dispatcher's intent: every blocked pattern in the gate
|
|
stack maps to a fact in the store; every fact with =:provenance :gate-outcome= is
|
|
grounded in a specific gate vector; no gate-bootstrapped fact contradicts another
|
|
gate-bootstrapped fact.
|
|
|
|
** Not in scope
|
|
|
|
ACL2 does not verify that =rm -rf / is destructive. That is an empirical claim
|
|
about Linux. Screamer handles empirical consistency (does this new claim
|
|
contradict existing observations?). ACL2 handles structural consistency (does
|
|
this reasoning structure have formal flaws?). The boundary is: empirical claims
|
|
go to Screamer; structural claims go to ACL2.
|
|
|
|
** Verification — ~4 FiveAM tests
|
|
|
|
1. =test-acl2-type-hierarchy-no-cycles= — a synthetic KG with a type-level cycle
|
|
is detected and reported.
|
|
2. =test-acl2-rule-set-contradiction-detected= — two Dispatcher rules that
|
|
contradict each other produce a contradiction report with provenance.
|
|
3. =test-acl2-bootstrap-preservation= — the bootstrap extraction from the gate
|
|
stack is verified to have no missing or extra facts.
|
|
4. =test-acl2-not-loaded-graceful-degradation= — when ACL2 is not installed, the
|
|
skill loads but returns ":ACL2 not available — structural verification
|
|
disabled" without crashing.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 6 of =notes/passepartout-v3.0.0-roadmap.org=. ACL2's role is refined in
|
|
=passepartout-neurosymbolic-design-decisions-and-options.org= from the
|
|
architecture note's broader claim to the structural verification scope.
|
|
|
|
* Phase 7: The 10-80-10 Planner (~500 lines — new skills, last phase)
|
|
|
|
** What
|
|
|
|
A planning engine built on the mature symbolic index. Screamer expresses task
|
|
planning as a constraint satisfaction problem. ACL2 verifies plans for structural
|
|
soundness. The LLM handles the I/O boundaries (natural language → structured goal
|
|
← natural language response). The symbolic engine handles the reasoning.
|
|
|
|
** Rationale
|
|
|
|
This is v3.0.0 as described in the architecture note and the ROADMAP. It is the
|
|
final phase because it requires a populated, queried, and trusted symbolic index.
|
|
The full planner is useless without a mature ontology and a proven deducer. By
|
|
the time Phase 7 begins, Phases 0-6 have accumulated months of gate outcomes,
|
|
Screamer deductions, verified LLM proposals, and human-authored facts. The
|
|
symbolic index has achieved sufficiency. The ontology has stabilized through use.
|
|
The planner is built on a foundation, not a speculation.
|
|
|
|
** Implementation — =org/symbolic-planner.org= → =lisp/symbolic-planner.lisp= (skill)
|
|
|
|
*** Task decomposition as constraint satisfaction
|
|
|
|
The user specifies a goal: "refactor the authentication module to support OAuth2."
|
|
The LLM translates this to a structured goal plist. Screamer expresses the planning
|
|
problem:
|
|
|
|
- /Variables/: subtasks (write OAuth2 client, add token store, update auth
|
|
middleware, write tests, update documentation)
|
|
- /Constraints/: dependency ordering (tests depend on implementation), resource
|
|
limits (one file write at a time), safety invariants (no modification of
|
|
=core-*= files)
|
|
- /Objective/: find an ordering that satisfies all constraints
|
|
|
|
Screamer returns a viable plan or reports unsolvability with the conflicting
|
|
constraints.
|
|
|
|
*** Plan verification
|
|
|
|
ACL2 proves that the plan contains no deadlocks (two subtasks waiting on each
|
|
other), no dependency cycles (A depends on B depends on C depends on A), and
|
|
no safety violations (no plan step requires a gate-blocked operation).
|
|
|
|
If verification fails, ACL2 identifies the failing subtask and the violated
|
|
constraint. The planner re-decomposes the problematic branch (the existing
|
|
ROADMAP's branch pruning, v0.11.0, but symbolically rather than neurally).
|
|
|
|
*** Neuro-symbolic boundary
|
|
|
|
The LLM handles the I/O boundaries:
|
|
|
|
- *Input* (10%): natural language → structured goal plist. "Refactor auth for
|
|
OAuth2" → =(:goal :refactor-component :target :auth-module :add-feature :oauth2)=.
|
|
Small prompt, formulaic translation, ~100 tokens.
|
|
- *Reasoning* (80%): Screamer plans. ACL2 verifies. VivaceGraph provides the
|
|
facts about file structure, dependencies, and gate constraints. Zero LLM
|
|
tokens.
|
|
- *Output* (10%): structured plan → natural language response. The verified plan
|
|
plist is formatted as "I'll refactor the authentication module in 5 steps:
|
|
1) Create the OAuth2 client (depends on: nothing, modifies: auth/client.lisp)
|
|
2) Add the token store..." Small prompt, formulaic translation, ~150 tokens.
|
|
|
|
*** TUI visualization
|
|
|
|
The plan is rendered as an Org headline tree in the TUI, with each subtask as a
|
|
node showing its terminal state (=todo=, =next-action=, =in-progress=, =done=,
|
|
=blocked=, =stuck=), its constraints, and its verified properties. This is the
|
|
same task tree visualization planned for v0.11.0 in the feature roadmap, but
|
|
with the addition of Screamer constraint annotations and ACL2 verification
|
|
badges.
|
|
|
|
** Verification — ~6 FiveAM tests
|
|
|
|
1. =test-goal-plist-from-natural-language= — natural language input produces
|
|
correct structured goal plist (LLM-dependent but formulaic; tested with
|
|
deterministic mock).
|
|
2. =test-screamer-plan-satisfies-constraints= — Screamer produces a plan that
|
|
satisfies all specified dependencies and safety constraints.
|
|
3. =test-screamer-report-unsolvable= — Screamer reports unsolvability when
|
|
constraints are contradictory.
|
|
4. =test-acl2-verifies-plan-no-cycles= — ACL2 verifies a valid plan has no
|
|
dependency cycles.
|
|
5. =test-acl2-rejects-cyclic-plan= — ACL2 detects a dependency cycle in an
|
|
invalid plan.
|
|
6. =test-plan-to-natural-language= — structured plan plist produces readable
|
|
natural language output.
|
|
|
|
** Relation to Other Work
|
|
|
|
This is Phase 7 of =notes/passepartout-v3.0.0-roadmap.org=. It corresponds to the ROADMAP's
|
|
v0.9.0 (task planning) and v3.0.0 (full 10-80-10 architecture). It is the last
|
|
component because it depends on a mature symbolic index from Phases 0-6.
|
|
|
|
* Phase 8+: Semantic Wikipedia Integration (TBD lines — optional acceleration)
|
|
|
|
** What
|
|
|
|
Load Wikidata entities referenced in the memex into the symbolic index. Every
|
|
entity the user's prose mentions gets its Wikidata property graph — type hierarchy,
|
|
relations, dates, citations — as triples with =:provenance :wikidata=.
|
|
|
|
** Rationale
|
|
|
|
The gate stack provides 50-70 entity classes — adequate for a coding agent.
|
|
For a general-knowledge memex containing literature, philosophy, history,
|
|
science, and daily life, 50-70 is starvation. Organic growth through prose
|
|
extraction (Phase 3) would take years to cover the entities mentioned in a single
|
|
reading of /Pale Fire/. Wikidata has already done this work at scale.
|
|
|
|
The LLM's role in extraction shrinks dramatically. Without Wikidata, the archivist
|
|
must /discover/ that Nabokov wrote /Pale Fire/, lectured on Kafka, and emigrated
|
|
from Russia — extracting each triple from prose. With Wikidata, the Nabokov entity
|
|
is pre-structured. The archivist's job changes from "discover entities" to
|
|
"connect your heading to the existing entity."
|
|
|
|
** Implementation sketch
|
|
|
|
1. *Index referenced entities.* Scan memex prose for entity names (capitalized
|
|
noun phrases, names in Org links, headings in =literature/= directories). For
|
|
each, attempt Wikidata entity resolution (string match, disambiguation via
|
|
context).
|
|
|
|
2. *Load N-hop property net.* For each resolved entity, load its Wikidata
|
|
properties: instance-of, subclass-of, authored, published-in, influenced-by,
|
|
birth-date, death-date, etc. Load the same for entities directly connected
|
|
to it (1-hop neighbors). Optionally expand to 2-hop for deeply connected
|
|
domains.
|
|
|
|
3. *Admit with plural policy.* Wikidata facts are admitted with
|
|
=:provenance :wikidata= and cardinality policy =:plural=. They do not
|
|
override your memex's facts. They sit alongside them with provenance
|
|
displayed. Disagreements are surfaced, not resolved.
|
|
|
|
4. *Cross-domain query.* "What does my memex say about Nabokov that Wikidata
|
|
doesn't?" "Where does my memex disagree with Wikidata?" "What entities in my
|
|
memex have no Wikidata counterpart?" These queries are pure VivaceGraph
|
|
traversals — zero LLM tokens.
|
|
|
|
** Not a Phase 0 prerequisite
|
|
|
|
Semantic Wikipedia integration is an accelerator, not a prerequisite. Phases
|
|
0-7 work without it — the ontology grows through gate outcomes, Screamer
|
|
deductions, LLM proposals, and human authoring. Wikidata compresses the timeline
|
|
for the broad domain but does not change the architecture. The admission gate
|
|
(Screamer), contradiction policies, provenance tracking, and neuro-symbolic
|
|
boundary are identical with or without it.
|
|
|
|
** Open question
|
|
|
|
How much Wikidata is the right amount? Loading entities referenced in the memex
|
|
is the minimum. Loading all entities within N hops of those references expands
|
|
the graph exponentially. The right N depends on the memex's breadth and the user's
|
|
query patterns. A memex focused entirely on software engineering may need only 1
|
|
hop. A memex spanning literature, history, philosophy, and science may need 3-4
|
|
hops. The query performance and memory costs of a large Wikidata load have not
|
|
been estimated.
|
|
|
|
** Deferred Authentication Layers (Layers 2-4)
|
|
|
|
Phase 0b ships Layer 1 (cryptographic). The remaining layers are deferred to
|
|
when their dependencies are available:
|
|
|
|
- *Layer 2 — Sensory.* Active when vision and audio processing skills are loaded.
|
|
Verifies liveness, cross-modal consistency (face matches voice matches location),
|
|
and environmental coherence. Defers to Phase TBD (vision/audio skills feature
|
|
cycle). When unavailable, returns =:unavailable= — graceful degradation.
|
|
- *Layer 3 — Deterministic Identity Reasoning.* Active when Phase 2 (Screamer +
|
|
populated fact store) is complete. Queries the fact store for identity-ruling
|
|
facts ("Jack is deceased; this signal claims to be Jack") and returns binary
|
|
reject/pass. Defers to Phase 2+.
|
|
- *Layer 4 — Probabilistic Identity Reasoning.* Active when style profiles exist
|
|
as a fact-store domain. Uses embedding infrastructure (=embedding-native.lisp=,
|
|
v0.4.0, already exists) to compare writing style, behavioral patterns, and
|
|
content coherence against known profiles. Returns a confidence score; never
|
|
rejects outright — downgrades authorization. Defers to when style profiles are
|
|
built as a fact-store domain.
|
|
|
|
The gate architecture (single vector 0, =fboundp=-guarded sub-layers,
|
|
configurable per signal class) is designed with all four layers from Phase 0b.
|
|
Adding a layer requires adding a skill, not modifying the gate.
|
|
|
|
* Summary: Lines and Dependencies
|
|
|
|
| Phase | Component | Lines | New Skill? | Depends On | Earliest Release |
|
|
|-------+-----------------------------------------+--------+------------+-----------------+------------------|
|
|
| 0 | PM-type-level gates + core integrity | ~75 | No | Dispatcher | Immediately |
|
|
| 0b | Layered auth — Layer 1 (cryptographic) | ~200 | No | Phase 0 | v0.7.2+ |
|
|
| 1 | Triple fact store + abstract API | ~200 | Yes | Phase 0b | v0.7.2+ |
|
|
| 1a | Self-preservation mechanisms | ~120 | Extends | Phase 0, 1 | v0.7.2+ |
|
|
| 2 | Screamer admission gate | ~200 | Yes | Phase 1 | v0.7.2+ |
|
|
| 3 | Archivist as fact proposer | ~100 | Extends | Phase 2 | v0.8.0+ |
|
|
| 4 | Sufficiency criterion — the flip | ~50 | Extends | Phase 3 | v0.8.0+ |
|
|
| 5 | VivaceGraph + Merkle DAG + ontology ver | ~400 | Yes | Phase 4 | v0.10.0+ |
|
|
| 6 | ACL2 structural verification | ~200 | Yes | Phase 5 | v0.12.0+ |
|
|
| 7 | 10-80-10 planner | ~500 | Yes | Phase 6 | v3.0.0 |
|
|
| 8+ | Semantic Wikipedia integration | TBD | Yes | Phase 5 | TBD |
|
|
|-------+-----------------------------------------+--------+------------+-----------------+------------------|
|
|
| Total | | ~2045 | | | |
|
|
|
|
This roadmap is independent of the feature roadmap in
|
|
=passepartout/docs/ROADMAP.org=. Phase 0 ships alongside any v0.7.x patch. The
|
|
symbolic engine grows in parallel with feature work (TUI improvements, MCP tools,
|
|
gateway expansion, etc.), not after it. The feature roadmap describes /what/ the
|
|
agent can do; this roadmap describes /how/ it knows what it knows and how it
|
|
protects itself.
|
|
|
|
The total new code across all phases is approximately 2,045 lines. Relative to
|
|
the existing codebase (~8,000+ lines across 40+ Org source files and 30+ skills),
|
|
the symbolic engine is a ~25% addition. Relative to the ROADMAP's planned feature
|
|
work through v0.13.0 (thousands of lines of TUI rendering, MCP protocol
|
|
implementation, skin engine, planning, etc.), the symbolic engine is a small,
|
|
orthogonal thread that grows the architecture's reasoning depth, self-preservation,
|
|
signal authentication, and knowledge integrity while the feature work grows its
|
|
interaction breadth.
|
|
capability, and knowledge integrity while the feature work grows its interaction
|
|
breadth.
|
|
|
|
* Competitive Advantage Analysis
|
|
|
|
** Phase 0-1: Deterministic safety, now with type-level guarantees
|
|
|
|
The existing Dispatcher gate stack already provides 0-LLM-token safety verification.
|
|
Phase 0 adds structural guarantees: no heuristic bypassing of the type hierarchy.
|
|
A request to modify the dispatcher's own rules is impossible by construction, not
|
|
just caught by pattern matching. No competitor has this — their equivalent of
|
|
"core file protection" is a prompt instruction, not a type system.
|
|
|
|
** Phase 0b: Layered signal authentication — verified origin, not claimed origin
|
|
|
|
No competitor verifies /who/ issued a signal. Every agent harness accepts signals
|
|
from any source that speaks its protocol — TUI, CLI, subprocess, internal skill.
|
|
A compromised dependency can impersonate any signal source. Passepartout's
|
|
four-layer authentication gate makes signal source spoofing impossible at Layer 1
|
|
(cryptographic), detectable at Layers 2-3 (sensory + deterministic reasoning),
|
|
and probabilistically flagged at Layer 4 (style analysis). The key registry is a
|
|
fact-store domain with Merkle-hashed provenance — key creation, promotion, and
|
|
revocation are auditable, versioned, and survivable across restarts. The signal
|
|
provenance chain makes multi-step automated causality traceable: "this deletion
|
|
happened because sensor #3 signed a classification (liveness check failed, sender
|
|
deceased 2 years ago) that skill #7 signed a re-indexing from, and sensor #3 was
|
|
subsequently revoked." No competitor can trace an action to its authenticated
|
|
source through four independent evidence layers because no competitor has a
|
|
layered authentication gate and a Merkle-linked provenance chain.
|
|
|
|
** Phase 2-3: Verified extraction — the symbolic index grows without corruption
|
|
|
|
No competitor verifies extracted facts against an existing knowledge base. Their
|
|
memory systems (Claude Code's ~extractMemories~, Hermes's MemoryProvider, OpenClaw's
|
|
session transcripts) record what the LLM /said/ happened, not what the system
|
|
/proved/ happened. Passepartout's Screamer-gated admission makes the symbolic index
|
|
a monotonic, verified structure. Facts are admitted because they are consistent,
|
|
not because the LLM generated them.
|
|
|
|
** Phase 4-5: Self-accelerating knowledge — the downward cost curve
|
|
|
|
The sufficiency criterion makes Passepartout's "cheaper over time" thesis
|
|
measurable. As the ratio of non-lossy facts grows, LLM calls for extraction
|
|
decrease. At sufficiency, extraction of known categories becomes deterministic.
|
|
The downward cost curve is not a marketing claim — it is a structural property
|
|
of the architecture, visible through the sufficiency score.
|
|
|
|
** Phase 6-7: Provable plan soundness
|
|
|
|
No competitor verifies task plans against formal constraints. Claude Code plans
|
|
in a single LLM call with no post-hoc verification. Hermes decomposes tasks into
|
|
subtasks but does not prove them non-contradictory. Passepartout's ACL2-verified
|
|
plans are structurally guaranteed to have no deadlocks, no dependency cycles,
|
|
and no safety violations. The verification is a proof, not a prompt.
|
|
|
|
** Semantic Wikipedia: Entity coverage at zero marginal cost
|
|
|
|
No competitor has a general-knowledge entity graph because no competitor has a
|
|
symbolic engine to populate. Claude Code knows codebases; it doesn't know that
|
|
Nabokov wrote /Pale Fire/ and lectured on Kafka. Passepartout with Wikidata
|
|
loaded knows both, and the entity knowledge costs zero LLM tokens — it is loaded
|
|
once as structured data and queried via VivaceGraph traversals.
|
|
|
|
** Phase 0-1a: Self-preservation — the agent knows when it is wounded
|
|
|
|
No competitor detects its own degradation. Claude Code, OpenCode, and Hermes
|
|
all fail silently when a tool crashes or a dependency is missing — the agent
|
|
keeps running, producing degraded output, never telling the user. Passepartout's
|
|
quarantine system detects failing skills, unloads them automatically, and
|
|
displays a degraded-mode indicator in the status bar. The external watchdog
|
|
restarts the daemon if the process dies. The integrity monitor detects corrupted
|
|
core files. The agent refuses to execute commands that would destroy its own
|
|
runtime, explaining /why/ and redirecting to the safe termination path.
|
|
|
|
This is not a feature. It is the architectural difference between a tool that
|
|
breaks silently and a system that preserves itself actively, signals its wounds,
|
|
and traces its degradation to specific components that can be reloaded or
|
|
repaired.
|
|
|
|
** The permanent competitive advantage
|
|
|
|
The competitive advantage is not any single feature. It is the architecture's
|
|
ability to accumulate verified knowledge from four independent sources (gates,
|
|
deduction, verified LLM proposals, human authoring) and to make that knowledge
|
|
queryable with provenance. Competitors accumulate chat transcripts. Passepartout
|
|
accumulates a provenanced, self-verifying knowledge graph. Transcripts become
|
|
stale and unreliable. The knowledge graph becomes richer and more trustworthy
|
|
with every session.
|
|
|
|
* What Is NOT Built
|
|
|
|
1. *A separate knowledge graph serialization format before the ephemeral phase
|
|
proves what facts are useful.* Premature format commitment is the ontology
|
|
problem writ small. Let use determine the format.
|
|
|
|
2. *ACL2 verification of empirical claims.* Apple is red. rm -rf / is destructive.
|
|
These are observations, not theorems. Screamer handles empirical consistency.
|
|
ACL2 handles structural verification.
|
|
|
|
3. *VivaceGraph before Screamer.* The admission gate is the critical path. The
|
|
persistence layer is an optimization of a working system.
|
|
|
|
4. *A per-fact ontology designed upfront.* Extract from the gate stack, extend
|
|
from deductions and observations, prune through contradiction detection. The
|
|
ontology is a garden, not a building.
|
|
|
|
5. *New core ASDF components.* Every phase is a skill. A corrupted symbolic
|
|
engine degrades reasoning but does not kill the agent. Satisfies the
|
|
self-repair criterion.
|
|
|
|
6. *A "complete" symbolic index for the broad domain.* The neural index is the
|
|
permanent gateway to the richness of prose. The symbolic index handles what
|
|
can be mechanically verified. The boundary is permanent, not transitional.
|
|
The neuro is the brain. The symbolic is the education.
|
|
|
|
* Relation to the Feature Roadmap
|
|
|
|
The feature roadmap (=passepartout/docs/ROADMAP.org=) describes Passepartout's
|
|
evolution through v0.13.0: TUI improvements, MCP-native tools, task planning,
|
|
skill creation, evaluation harnesses, voice gateways, themes, and channels.
|
|
These are /interaction surface/ features — they expand what the agent can do.
|
|
|
|
This roadmap describes the /reasoning substrate/ — it deepens how the agent
|
|
knows what it knows. It is independent of the feature sequence. Phase 0 can ship
|
|
alongside any v0.7.x patch. Phases 1-4 ship during the v0.8.x-v0.10.x feature
|
|
cycle. Phases 5-7 ship during the v0.11.x-v0.13.x cycle.
|
|
|
|
The two roadmaps converge at v3.0.0: the feature roadmap provides the interaction
|
|
surface (a polished TUI, a rich tool ecosystem, a multi-gateway communication
|
|
layer); this roadmap provides the reasoning depth (a provenanced knowledge graph,
|
|
a deterministic constraint solver, a verified planning engine). The surface
|
|
without the substrate is a chat agent with good UX. The substrate without the
|
|
surface is a theorem prover without a user. Together, they are the v3.0.0
|
|
architecture.
|
|
|
|
See also:
|
|
|
|
- =notes/passepartout-neurosymbolic-design-decisions-and-options.org= — the
|
|
design rationale for every decision in this roadmap
|
|
- =notes/passepartout-symbolic-engine-exploration.org= — the original architecture
|
|
exploration and five architecture options
|
|
- =notes/passepartout-whitehead.org= — Whitehead's four concrete contributions
|
|
- =passepartout/docs/ROADMAP.org= — the feature roadmap through v0.13.0
|
|
- =passepartout/docs/ARCHITECTURE.org= — the current pipeline architecture
|
|
- =notes/passepartout-v3.0.0-roadmap.org= — the original concrete plan (superseded by this
|
|
document)
|