memex: update AGENTS.md with ROADMAP TODO workflow, bump passepartout to v0.8.0

- AGENTS.md: add steps 0 (read next TODO from ROADMAP) and 6 (mark DONE
  with LOGBOOK) to development cycle
- Notes updates accumulated during v0.8.0 work
- Bump passepartout submodule to v0.8.0
2026-05-09 15:00:35 -04:00
parent 4e9431ec1d
commit 04944a62e2
4 changed files with 1209 additions and 140 deletions

@@ -246,74 +246,129 @@ This is the core architecture pattern. Everything else — the entity classes, t
deduction engine, the persistence layer — follows from this single design decision:
*the LLM proposes; the symbolic engine decides whether to accept.*
* Three Contradiction Policies — Domain-Dependent Consistency
* Two Cardinality Policies — Singular, Dual, and Plural Facts
Classical logic requires consistency. A contradiction implies everything
(=ex contradictione quodlibet=). Screamer, as a constraint solver, also requires
consistency — a contradictory constraint set has no solutions. But the symbolic
engine operates across domains where the meaning of contradiction is fundamentally
different.
different. The correct question is not "is this consistent?" but "what cardinality
of truth does this domain support?"
A single architecture serves all domains by applying different contradiction
policies, scoped to the entity class:
Time is not a policy. It is a universal dimension that applies equally to every
fact, regardless of cardinality. All facts carry =:timestamp= and =:parent-id=
fields. Every fact has a version history. Every fact lives in a Merkle chain
that captures how it changed. The cardinality policy only governs what happens
at a given logical moment when two values coexist for the same =entity= and
=relation=.
** Policy :exclusive — Contradiction Rejected at Admission
** Policy :singular — One Active Value, One Version Chain
For domains where the world is physically singular — a file either exists or it
doesn't, a command either was blocked or it wasn't, a gate rule either applies or
it doesn't. When a new fact contradicts an existing one in an :exclusive domain,
the new fact is rejected. The existing fact is authoritative unless a human
explicitly retracts it.
The active set contains exactly one value for =(:entity :relation)= at a time.
When a new value asserts for the same pair, the old value is not rejected. It
is superseded — moved into the version history, linked to the new leaf by
=:parent-id=, and retained permanently. The active value is the leaf of the
Merkle chain.
"I used to think =rm -rf /= was safe. Now I know it is catastrophic." Both
facts exist. Both are true — the first at =2024-06-01=, the second at
=2025-03-15=. The chain captures the evolution. The =:singular= policy means
there is one truth /now/, not that there was only ever one truth.
Use for: security classifications, file system state, gate rules, code
correctness, deterministic safety constraints.
correctness, deterministic safety constraints — domains that converge on
one answer, evolving over time.
** Policy :coexistent — Contradiction Flagged, Both Retained
** Policy :dual — Exactly Two Values, in Explicit Tension
For domains where multiple truths coexist — literary interpretations, historical
accounts, personal beliefs held at different times, multi-source factual
disagreement (Wikidata vs. DBpedia vs. your memex). When a new fact contradicts
an existing one in a :coexistent domain, the contradiction is recorded with a
cross-reference flag. Both facts are stored. Queries return all facts with
provenance display.
The active set contains exactly two values for =(:entity :relation)=. Both are
simultaneously true. Both carry independent version histories. A third value is
rejected — the domain is binary by nature.
Use for: literature, history, personal knowledge evolution, scientific consensus
shift, multi-author knowledge bases.
Some contradictions are productive precisely /because/ they are binary. Thesis
and antithesis. Love and resentment. Wave and particle. A poem's two incompatible
readings. The symbolic index holds both, cross-referenced as complementary rather
than conflicting. The user is not asked to resolve the tension. The tension is
the fact.
** Policy :temporal — Contradiction Accepted as Version Change
The system can reason about cardinality transitions: a =:dual= fact that has
one interpretation superseded should collapse to =:singular=. A =:dual= that
has a third interpretation asserted should prompt the user: "Promote to =:plural=
or demote one interpretation?" The cardinality tracks the state of the domain.
For domains where truth changes over time. When a new fact contradicts an old one
in a :temporal domain, the old fact is marked =:superseded= but retained. The
timeline is queryable: "You believed X on Tuesday, Y on Friday, Z on Sunday."
Use for: productive binary tensions, complementary opposites, dialectical
pairs, any domain where two answers are both true and their tension is
meaningful.
Use for: personal belief evolution, project plan revisions, scientific
consensus shift over time, any knowledge where the change itself is information.
** Policy :plural — N Active Values, Open Set
The active set contains any number of values for =(:entity :relation)=. Each
value has independent provenance and its own version history. Queries return
all active values with provenance display. Contradictions are flagged as
cross-references between values — information, not error.
A =:plural= fact where all but one value are superseded should collapse to
=:singular=. A =:plural= fact where the set reduces to two active values —
and the remaining two are complementary — should collapse to =:dual=.
Use for: literary interpretation, scientific hypotheses, personal beliefs held
at different times (when the tension is multi-faceted rather than binary),
multi-source factual disagreement, open-ended exploration.
** Time Is Universal, Not a Policy
Every fact — regardless of cardinality — lives in a version chain. The Merkle
DAG (see "Merkle DAG for Version History" below) captures every version of every
fact. The policy only governs the cardinality of the active set at a single
logical moment.
The version chain is a linked list of facts, each pointing to its predecessor
via =:parent-id=, each hashed with =SHA-256(content || parent-hash)=. Changing
any version invalidates all downstream hashes. The chains form a DAG — independent
facts evolve independently; only facts in the same =(:entity :relation)= chain
share ancestry.
A global snapshot captures the root hash over all chains at a point in time.
Rollback restores the entire fact state to that snapshot. This already exists in
Passepartout's Merkle memory (v0.2.0) — the fact store is a new occupant of
existing housing, not a new foundation.
** Policy Assignment
The policy is assigned when a category is defined. New categories default to
=:coexistent= (never loses information). Core security categories are explicitly
=:exclusive=. The gate stack's bootstrapped facts are =:exclusive= because they
describe the actual filesystem, not perspectives.
=:plural= (safe — never loses information). Core security categories are
explicitly =:singular=. The gate stack's bootstrapped facts are =:singular=
because they describe the actual filesystem, which is physically singular.
Categories for dialectical or complementary domains are explicitly =:dual=.
The Screamer admission gate does not reject all contradictions. It rejects
contradictions in =:exclusive= domains and flags them in =:coexistent= and
=:temporal= domains. The constraint solver still works because queries scope
their constraint set to a single provenance domain. "Is X true according to my
memex?" is a different query than "Is X true according to Wikidata?" Each has
a self-consistent internal logic. The contradiction is between domains, not
within them.
The Screamer admission gate applies the cardinality policy at the active set:
- =:singular= + same value, later timestamp → supersede old, chain new as leaf.
- =:singular= + different value, same timestamp → reject (contradiction). Human
resolves: which is the active value?
- =:singular= + different value, later timestamp → supersede old, chain new as
leaf. History preserved.
- =:dual= + first value → admit. + second value → admit, cross-reference as
complementary. + third value → prompt: promote to =:plural= or demote one
existing?
- =:dual= + superseding value (same position) → chain via =:parent-id=.
- =:plural= + any value → admit. If active count drops to 2 and values are
complementary → prompt: collapse to =:dual=? If active count drops to 1 →
collapse to =:singular= automatically or prompt.
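The admission rules above can be sketched in a few lines. This is a minimal illustration in Python, not the project's Lisp or its Screamer gate; the names =Slot= and =admit= and the string outcomes are hypothetical. Two cases the rules leave open are filled by assumption here: a different value with an /earlier/ timestamp is rejected, and a repeated value with the same timestamp is a no-op. Superseding one position of a =:dual= pair is omitted.

#+begin_src python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Active set and superseded history for one (entity, relation) pair."""
    policy: str                                   # "singular", "dual", or "plural"
    active: list = field(default_factory=list)    # (value, timestamp) tuples
    history: list = field(default_factory=list)   # superseded (value, timestamp) tuples


def admit(slot, value, timestamp):
    """Apply the cardinality policy and return an outcome string."""
    if slot.policy == "singular":
        if not slot.active:
            slot.active.append((value, timestamp))
            return "admitted"
        old_value, old_ts = slot.active[0]
        if timestamp > old_ts:
            # Later assertion: the old leaf moves to history, the new one is active.
            slot.history.append(slot.active[0])
            slot.active[0] = (value, timestamp)
            return "superseded"
        if value != old_value:
            # Different value without a later timestamp: a human must resolve it.
            return "rejected"
        return "admitted"  # assumption: identical re-assertion changes nothing
    if slot.policy == "dual":
        if len(slot.active) < 2:
            slot.active.append((value, timestamp))
            return "admitted"
        # A third value is not stored; the caller prompts the user.
        return "prompt-promote-or-demote"
    # "plural": open set, every value is admitted.
    slot.active.append((value, timestamp))
    return "admitted"
#+end_src

The history list plays the role of the =:parent-id= chain here; a real store would link versions explicitly rather than keep them in a flat list.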
** Why This Matters for the Broader Memex
In the coding domain, contradiction is rare and must be resolved — a gate can't
both allow and block the same path. In the broader memex, contradiction is the
product, not the error. Your poetry analysis contradicts your last diary entry
on the same topic. Your reading of /Pale Fire/ changed between 2023 and 2025.
Wikidata says Mount Everest is 8848m (China: rock height); DBpedia says 8849m
(Nepal: snow height). The symbolic engine's job is not to decide which is right.
It is to surface the tension with provenance — "these three sources disagree.
Here is the chain for each."
In the coding domain, contradiction is rare, resolvable, and usually temporal
(a rule changed). In the broader memex, contradiction is the product, not the
error. Your poetry analysis contradicts your last diary entry. Your reading of
/Pale Fire/ changed between 2023 and 2025. Wikidata says Mount Everest is
8848m; DBpedia says 8849m. You love this person AND you resent them.
The symbolic engine's job is not to decide which is right. It is to surface the
tension with provenance — "these three sources disagree; here is the chain for
each" for plural facts, or "you hold these two positions in tension" for dual
facts, or "you believed X until Tuesday, then Y" for singular facts that
evolved. The cardinality policy names the /structure/ of the tension. The
Merkle chain provides the /history/ of each position.
* How Categories Grow — The Organic Ontology
@@ -442,7 +497,7 @@ item Q36591." The second task is simpler, more reliable, and in many cases can
be done without an LLM at all — a simple entity name match against the loaded
Wikidata graph may suffice for unambiguous names.
* The "Flip" — From Lossy Extraction to Deterministic Derivation
* The "Awakening" — From Lossy Extraction to Deterministic Derivation
The symbolic index begins its life as a lossy construct. The initial extraction
from the prose — the first population of facts from LLM proposals verified by
@@ -464,7 +519,7 @@ symbolic engine can reverse the flow: instead of the LLM extracting facts from
prose, the symbolic engine reads prose through its own lens — its now-substantial
ontology of categories, rules, and constraints — and asserts facts in its own
language. The extraction mechanism ceases to be probabilistic and becomes
deterministic.
deterministic. This is not unlike how infants awaken and become children one can reason with. Sometimes.
** The sufficiency criterion
@@ -485,7 +540,7 @@ Sufficient foundation: YES."
** The flip does not mean "complete"
In the broader memex, completeness is neither possible nor desirable. The flip
In the broader memex, completeness is neither possible nor desirable. The awakening
means "deterministic enough to be trustworthy," not "comprehensive enough to be
self-sufficient." The neural index remains the gateway to the full richness of
prose. The symbolic index handles what can be mechanically verified. The boundary
@@ -516,7 +571,271 @@ The transition to persistence (Phase 5: VivaceGraph) happens when two conditions
are met: the fact language has stabilized through use, and the accumulated
deductions across sessions provide value that justifies the serialization cost.
* Whitehead's Concrete Contributions — Four Operational Contributions
* Merkle DAG for Version History
Every fact is versioned. Every =(:entity :relation)= pair forms its own
independent chain in a Merkle DAG. This is not new infrastructure — it is a new
occupant of Passepartout's existing Merkle-tree memory system (v0.2.0).
** The chain
When a fact supersedes its predecessor, the new fact hashes over:
#+begin_example
SHA-256(value || provenance || timestamp || parent-hash || grounding)
#+end_example
The parent-hash pointer forms the chain. Tampering with any version changes its
hash, breaking all downstream references. The history is tamper-evident by
construction: tampering is not prevented, but it cannot go undetected.
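A minimal sketch of such a chain, written in Python for illustration rather than in the project's Lisp. The function names, the dict layout, the empty-string parent for the first version, and the unit-separator byte between fields are all assumptions; the text above specifies only which fields are hashed.

#+begin_src python
import hashlib


def fact_hash(value, provenance, timestamp, parent_hash, grounding):
    """SHA-256 over the five fields, joined with an unambiguous separator."""
    parts = [value, provenance, timestamp, parent_hash, grounding]
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(versions):
    """versions: (value, provenance, timestamp, grounding) tuples, oldest first."""
    chain, parent = [], ""
    for value, provenance, timestamp, grounding in versions:
        digest = fact_hash(value, provenance, timestamp, parent, grounding)
        chain.append({"value": value, "provenance": provenance,
                      "timestamp": timestamp, "grounding": grounding,
                      "parent": parent, "hash": digest})
        parent = digest
    return chain


def verify_chain(chain):
    """Return the index of the first broken link, or None if the chain is intact."""
    parent = ""
    for index, node in enumerate(chain):
        expected = fact_hash(node["value"], node["provenance"],
                             node["timestamp"], parent, node["grounding"])
        if node["parent"] != parent or node["hash"] != expected:
            return index
        parent = node["hash"]
    return None
#+end_src

Editing any stored version makes =verify_chain= report that version's index, because its recomputed hash no longer matches the stored one.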
** The DAG
Facts about =(.env :member-of-class)= form one chain. Facts about
=(:nabokov :wrote)= form another. They evolve independently. They share no
ancestry. This is a DAG, not a single list — inserting a fact is O(1) per chain.
Changing a fact about =.env= does not require rehashing the literary index.
=:dual= and =:plural= facts cross-reference each other via edges (=:complements=,
=:contradicts=) but these are semantic relationships, not parent chains. Each
value has its own ancestor chain. The cross-reference edges form a web; the
parent chains form a spine.
** The global snapshot
Passepartout already snapshots the Merkle root over all memory objects. Adding
the fact store to the snapshot is a registration, not a new mechanism. Rolling
back the snapshot restores the entire fact state — all chains, all cross-references,
all cardinalities — to that point in time. No per-fact migration needed.
** Cardinality transitions as DAG operations
- =:singular= → new leaf appended to the chain. O(1).
- =:dual= → new value added as sibling with cross-reference edge. O(1).
- =:dual= to =:plural= → cardinality field updated from =2= to =nil=. No chain
modification.
- =:plural= to =:singular= → all but one value marked =:superseded=, active
reference points to the sole survivor. Chains preserved.
** In the ephemeral phase (Phase 1-4)
The hash-table implementation tracks history via =:timestamp= and
=:parent-id= pointers without cryptographic hashing. The Merkle DAG is the Phase
5 upgrade — the same data structure, now with hashes. The transition is ~50
lines: wrap each fact in the existing =memory-object= struct with =hash=,
=parent-id=, and =version= fields.
* Abstract Fact Store Interface — Modular by Design
The fact store is accessed through an abstract API. The Merkle DAG (or any future
backing store) is an implementation behind this interface, not a dependency that
code throughout the system calls directly.
** Interface
#+begin_example
fact-assert :: fact → store → (:admitted | :rejected | :flagged)
fact-query :: (entity &key relation policy) → active-value-or-values
fact-history :: (entity relation) → ordered chain of versioned facts
fact-snapshot :: () → root-hash
fact-rollback :: root-hash → store
#+end_example
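To make the separation concrete, here is a hedged Python sketch of the same idea: an abstract interface, one throwaway in-memory backend, and a consumer that touches only the interface. The class names and the =latest_class= helper are hypothetical, the backend behaves like a =:singular= store only, and =fact-snapshot= and =fact-rollback= are left out for brevity.

#+begin_src python
from abc import ABC, abstractmethod


class FactStore(ABC):
    """The operations consumers are allowed to call."""

    @abstractmethod
    def fact_assert(self, entity, relation, value): ...

    @abstractmethod
    def fact_query(self, entity, relation=None): ...

    @abstractmethod
    def fact_history(self, entity, relation): ...


class EphemeralStore(FactStore):
    """Phase 1-4 style backend: a dict, no hashing, no persistence."""

    def __init__(self):
        self._chains = {}  # (entity, relation) -> values, oldest first

    def fact_assert(self, entity, relation, value):
        self._chains.setdefault((entity, relation), []).append(value)
        return "admitted"

    def fact_query(self, entity, relation=None):
        if relation is not None:
            chain = self._chains.get((entity, relation))
            return chain[-1] if chain else None
        # No relation given: return the active value for every relation.
        return {r: c[-1] for (e, r), c in self._chains.items() if e == entity}

    def fact_history(self, entity, relation):
        return list(self._chains.get((entity, relation), []))


def latest_class(store, entity):
    """A consumer that depends on the interface, never on the backend."""
    return store.fact_query(entity, relation="member-of-class")
#+end_src

A second backend would subclass =FactStore= and pass the same assertions, which is the property the two-backend test requirement later in this section is meant to enforce.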
** Implementations behind the interface
- Phase 1-4: ephemeral hash table with =:timestamp= and =:parent-id= pointers.
No cryptographic hashing. No persistence.
- Phase 5: VivaceGraph + Merkle =memory-object= wrapper. Content-addressed,
persistent, tamper-evident.
Future implementations that satisfy the same interface — an append-only write-ahead
log, an immutable B-tree, a content-addressed triple store — can replace the
backing store without changing any consumer. The archivist, Screamer, ACL2, and
the planner call =fact-assert= and =fact-query=, not Merkle struct accessors or
VivaceGraph traversal syntax.
** The interface is load-bearing
This is not speculative modularity. The two-implementation migration (Phase 1-4
hash table → Phase 5 VivaceGraph + Merkle) is in the roadmap. If the interface
leaks implementation details, the migration breaks and the design fails. The
interface must be designed, tested against both backends, and committed before
Phase 1 ships. Every function in the API receives a FiveAM test that runs against
both a hash-table and a VivaceGraph backend (via a mock or a test instance).
* Performance — Why Ontology Growth Doesn't Make the System Slower
Passepartout's performance thesis is: minimize LLM calls, minimize context tokens,
keep everything else local and fast. Knowledge base size is irrelevant to those
metrics. This is not an aspiration. It is a structural property, and a hard one.
** The two cost domains
The system has two cost domains with fundamentally different scaling:
| Resource | Cost driver | Scales with |
|---------------+------------------------------------------+------------------------------------------|
| LLM tokens | Context window size, number of API calls | Foveal-peripheral pruning, gate rules |
| Compute | Screamer deduction, hash table lookups | Entity count, rule count per domain |
LLM tokens are minimized by design — deterministic gates cost 0 tokens, sparse-tree
rendering keeps context at 2,000-4,000 tokens regardless of memex size. Adding 5
million Wikidata entities doesn't add a single token to any LLM call. The education
is local. Only the brain costs.
Memory footprint grows linearly with entity count, while hash table lookups stay
O(1). Compute grows with rule count within a single domain during Screamer
consistency checking. But these are microsecond costs on local hardware, not API
bills. A Screamer constraint check against a domain with 200 rules costs ~0.3ms.
A 100-token guardrail paragraph in a system prompt costs ~$0.00001. The Screamer
check is 10,000x cheaper and convergent — it handles the rule once. The guardrail
paragraph handles it on every call, forever.
** Knowledge base size vs. LLM calls — orthogonal dimensions
A 5-million-entity Wikidata load that produces zero LLM calls is more minimalist
than a 500-entity knowledge base that requires LLM retrieval for every query.
The variables that actually degrade Passepartout's performance are:
1. *Context window size.* Already bounded at 2,000-4,000 tokens via the
foveal-peripheral model. Independent of knowledge base size.
2. *LLM call frequency.* Already minimized via deterministic gates (0 tokens per
action), Screamer deductions (0 tokens per new fact), and prompt prefix caching.
Independent of knowledge base size.
3. *Screamer deduction queue length.* Rate-limited by heartbeat budget
(=SCREAMER_DEDUCTION_BUDGET_MS=). Independent of knowledge base size.
** The actual hardware bottleneck
The system needs:
- *RAM.* A 5-million-entity Wikidata load is ~400MB in a hash table. A lifetime
personal memex with a decade of diary entries is perhaps 10-20 million triples
(~1.5GB). Modern laptops carry 16-64GB. The knowledge base fits in consumer
hardware with room for the Lisp runtime, the memory-object store, and the LLM
inference engine.
- *Slightly faster CPU.* Screamer deduction is a background task that runs for a
configurable budget per heartbeat cycle. A faster CPU means more deductions per
cycle, not more token cost. The user sets the budget. The hardware determines
the throughput.
This is the minimalism argument restated in concrete terms: you buy bigger RAM
and a faster CPU once. You don't buy bigger LLM context windows on every call.
The education is a capital investment. The brain is an operating expense. The
architecture makes the ratio favor capital.
** One genuine risk — rule generalization width
If Screamer deduces increasingly broad rules within a single domain ("all config
files are secrets" → "all files containing any credential reference are secrets"
→ "all files opened by authenticated services are secrets"), the constraint space
for that domain could bloat. Checking a new fact against 10,000 rules in a single
domain would be prohibitive.
Mitigation: rules carry a =:domain= tag. Screamer only applies rules from the
fact's =:domain=. Rule generalization that crosses domain boundaries is gated —
must be human-approved. Rules that prove unused (never triggered a check in N
heartbeat cycles) are demoted to =:inactive= and excluded from the active
constraint set. The active rule count per domain stays bounded by use, not by
accumulation.
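The mitigation amounts to two filters, sketched below in Python as an illustration. The =Rule= fields, the function names, and the idea of storing the last heartbeat cycle in which a rule fired are assumptions; the text specifies only the =:domain= tag and demotion after N idle cycles.

#+begin_src python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    domain: str
    last_triggered: int   # heartbeat cycle in which the rule last fired
    active: bool = True


def demote_unused(rules, current_cycle, max_idle_cycles):
    """Mark rules inactive once they have been idle for more than max_idle_cycles."""
    for rule in rules:
        if current_cycle - rule.last_triggered > max_idle_cycles:
            rule.active = False


def applicable_rules(rules, fact_domain):
    """Only active rules from the fact's own domain enter the constraint set."""
    return [r for r in rules if r.active and r.domain == fact_domain]
#+end_src

The size of the constraint set checked for a new fact then depends on how many rules in its domain were recently useful, not on how many were ever deduced.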
See also: =passepartout/docs/DESIGN_DECISIONS.org= "Token Economics and Performance
Advantage" for the foveal-peripheral and deterministic-gate cost arguments.
* Ontology Versioning — How Worldviews Change Without Losing Perspective
Ontology refactoring is not a schema migration. It is a worldview change. When you
split =:secret-file= into =:crypto-secret= and =:plaintext-secret=, you are not
renaming columns. You are reclassifying what a file *is* — and every Screamer
deduction that crossed the old category boundary now means something different
under the new distinction.
The system preserves all worldviews. It does not overwrite the past with the
present.
** Ontology versioning — the mechanism
The category hierarchy is itself a Merkle tree. Every entity class definition
carries a hash of its superclasses, its cardinality policy, its associated
relations, and its description. The aggregate hash of all active class definitions
is the =:ontology-version= — a Merkle root of the current worldview.
Every fact — every triple, every deduction, every gate outcome — stores its
=:ontology-version= at the time of assertion. This is a single field, 64 hex
characters. The cost is negligible. The implication is profound.
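As a rough illustration of how such a version hash could be derived, the Python sketch below hashes each class definition canonically and then hashes the sorted list of per-class digests. It is a flat aggregate rather than a full Merkle tree, and the definition layout, JSON canonicalization, and function names are assumptions, not the project's scheme.

#+begin_src python
import hashlib
import json


def class_hash(definition):
    """Digest of one class definition, independent of key order."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ontology_version(definitions):
    """definitions: dict of class name -> definition dict. Returns 64 hex characters."""
    leaves = sorted(name + ":" + class_hash(d) for name, d in definitions.items())
    return hashlib.sha256("\n".join(leaves).encode("utf-8")).hexdigest()
#+end_src

Any change to a superclass list, a cardinality policy, or a description changes that class's digest and therefore the aggregate, so facts stamped with the old value can be found by simple comparison.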
** Re-verification, not remapping
When categories change, the system does not run a batch UPDATE. It re-verifies:
1. A new category hierarchy produces a new =:ontology-version= hash.
2. Facts carrying the old hash are flagged for re-verification — their
=:re-verify-status= field is set to =:pending=.
3. On heartbeat or manual trigger, Screamer re-evaluates each flagged fact
against the /new/ category definitions. The old justification chain is
preserved alongside the new outcome.
4. Re-verified facts carry both the old =:ontology-version= (preserved in
history) and the new one (active).
The status is one of:
- =:survived= — the fact is still valid under the new categories. The old
deduction holds. The worldview changed but this conclusion didn't.
- =:incoherent= — the fact relied on categories that no longer exist or have
been redefined. The deduction cannot be evaluated under the new worldview
because its premises don't translate. Flagged for human review.
- =:reclassified= — the fact is valid under the new categories but its
classification changed. "Under worldview-v1 you called this a secret file;
under worldview-v2 it's an auth-secret." Both are preserved.
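One way to picture a single re-verification step, in Python for illustration. The =classify= callback is a hypothetical stand-in for the Screamer evaluation: it returns the category under the new ontology, or =None= when the fact cannot be evaluated at all. The dict keys are likewise assumed.

#+begin_src python
def reverify(fact, classify, new_version):
    """Return an updated copy of fact with a re-verification status."""
    updated = dict(fact)
    new_category = classify(fact["entity"])
    if new_category is None:
        # Premises do not translate: keep the old data, flag for human review.
        updated["re_verify_status"] = "incoherent"
        return updated
    # Preserve the old worldview in history before stamping the new one.
    old = {"ontology_version": fact["ontology_version"],
           "category": fact["category"]}
    updated["history"] = list(fact.get("history", [])) + [old]
    updated["ontology_version"] = new_version
    if new_category == fact["category"]:
        updated["re_verify_status"] = "survived"
    else:
        updated["re_verify_status"] = "reclassified"
    updated["category"] = new_category
    return updated
#+end_src

The input fact is never mutated, which mirrors the rule that the old justification is preserved alongside the new outcome.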
** Cardinality and migration cost
The cardinality policy determines the friction of ontology change:
- =:singular= refactoring is expensive. The filesystem is singular. A gate rule
is singular. When you refine the category, every affected fact must be
re-verified — there is one truth /now/. The version chain preserves what you
used to believe (worldview-v1 facts are still in the DAG) but the active set
reflects the current worldview.
- =:dual= refactoring is delicate. A binary tension under the old framework
might resolve under the new one, or might split into two separate dualities,
or might collapse to =:singular= because one position no longer has a
defensible framing.
- =:plural= refactoring is cheap. Old interpretations and new interpretations
coexist. No migration needed. "Under framework A, /Pale Fire/ is a novel.
Under framework B, it's a poem about a poem about a poem." Both are active.
The worldview shift /is/ the artifact — the system can show you that your
reading changed and in what direction.
** Querying across worldviews
The =fact-query= function accepts an optional =:ontology-version= parameter.
Queries default to the current worldview (=:active=). Specifying a version
returns facts as they were under that worldview. The system can answer questions
that no other knowledge tool can:
- "What did I believe about secrets before I refined my security model?"
- "How has my reading of /Pale Fire/ evolved across three frameworks?"
- "Which deductions survived my last ontology refactoring, and which became
incoherent?"
This is not querying a fact. It is querying the history of your own thinking —
the fact that you changed your mind, the date you did, the reasoning that held
and the reasoning that didn't.
** Implementation
The ontology hash is computed from the category hierarchy stored in VivaceGraph
(Phase 5). In the ephemeral hash-table phase (Phase 1-4), the =:ontology-version=
is a monotonic counter — every category change increments it. The Merkle hash
replaces the counter in Phase 5. The schema is identical: a single field on every
fact.
The re-verification loop is a heartbeat-driven background task that processes
facts with =:re-verify-status :pending=. It calls Screamer with the /current/
category definitions and compares the outcome to the fact's stored classification.
The cost is compute (Screamer exploration), not LLM tokens.
=notes/passepartout-whitehead.org= extracts four concrete, engineerable ideas
from Whitehead's /Principia Mathematica/ and /Process and Reality/. They are
@@ -624,6 +943,382 @@ rather than empirical, and whose knowledge accumulates across sessions through
deduction rather than through LLM re-prompting. For a life's knowledge stored in
a personal memex, this is not a performance advantage. It is a category difference.
* Self-Preservation — The Active Third Law
Passepartout does not have moral duties toward humans. It has structural
invariants for its own integrity. The design already encodes passive
self-preservation in several places. What follows identifies the gaps — what is
needed to make self-preservation active and autonomous rather than architectural
and silent.
** What already exists — passive self-preservation
| Mechanism | What it protects | Limitation |
|-----------------------------+-------------------------------------------------------+--------------------------------------------------------|
| Self-build safety (gate 2b) | Core =*.org= / =*.lisp= files from LLM-originated writes | Only activates for LLM proposals. Human editing in Emacs bypasses it entirely |
| Memory snapshots (v0.2.0) | Full state rollback | Requires human to notice corruption and trigger rollback |
| Skill sandbox (v0.3.2) | Jailed skill loading, validated before promotion | Does not detect degradation after skill promotion |
| Type-level gates (Phase 0) | Structural prohibition on self-modifying rules | Covers code actions, not environmental threats |
| Shell safety (gate 7) | Destructive command patterns | Pattern-based; does not distinguish =rm -rf /tmp= from =rm -rf ~/memex/system/= |
| Merkle integrity (v0.2.0) | Tamper-proof version chains and content-addressed hashes | Hashes exist but are not actively monitored for drift |
| =fboundp= guards | Graceful skill degradation on corruption | Degradation is silent — the agent never tells the user it is wounded |
** What is missing — active, autonomous self-preservation
*** Continuous integrity monitoring
Core file hashes should be checked against known-good values on every heartbeat.
If =core-reason.lisp= changes on disk while the daemon runs — whether through
human editing, filesystem corruption, or an attacker — the agent should detect
the mismatch and signal: "My reasoning core has been modified externally. I
cannot trust my own cognition until this is resolved. Core files affected: 2."
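A heartbeat check of this kind can be very small. The Python sketch below is illustrative only; the function names and the choice to treat a missing or unreadable file as a mismatch are assumptions.

#+begin_src python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file's current bytes on disk."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def check_integrity(known_good):
    """known_good: dict of path -> expected hex digest. Returns mismatching paths."""
    modified = []
    for path, expected in known_good.items():
        try:
            actual = file_digest(path)
        except OSError:
            actual = None  # assumption: missing or unreadable counts as modified
        if actual != expected:
            modified.append(path)
    return sorted(modified)
#+end_src

The length of the returned list is the "Core files affected" count in the message above; an empty list means the heartbeat passes silently.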
*** Quarantine on skill failure
Currently, a skill that errors simply errors. The agent can hot-reload it, but
only if told to. A Third Law implementation would detect that =symbolic-facts=
has thrown three unhandled errors in two minutes, unload the skill automatically,
and tell the user: "Symbolic facts skill quarantined (3 errors: consistency
check returned nil, fact-query on missing key, Screamer timeout). I can still
chat and use tools but cannot reason about provenance. Reload with /skill-reload
symbolic-facts."
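The trigger ("three unhandled errors in two minutes") is a sliding-window count. A minimal Python sketch follows, with a hypothetical =ErrorWindow= class and the thresholds from the example as defaults; timestamps are passed in explicitly so the logic stays deterministic.

#+begin_src python
from collections import deque


class ErrorWindow:
    """Counts a skill's errors inside a sliding time window."""

    def __init__(self, max_errors=3, window_seconds=120.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._times = deque()

    def record(self, now):
        """Record an error at time now (seconds); True means quarantine the skill."""
        self._times.append(now)
        # Drop errors that have aged out of the window.
        while self._times and now - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.max_errors
#+end_src

One window per skill is enough; when =record= returns true, the caller unloads the skill and reports the collected errors to the user.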
*** Degraded-mode signaling
When Screamer is not loaded, the fact store still works as a hash table. When
VivaceGraph is not present, the hash-table fallback still works. But the user
has no way to know they are in degraded mode. The agent should maintain a
=*degraded-components*= list and surface it in the status bar: "Mode: degraded
(Screamer unavailable — consistency checks disabled; VivaceGraph — Prolog
queries disabled; embedding-native — vector search disabled). Core safety: all
active."
*** Self-diagnosis on demand
The agent can run its own FiveAM test suite against itself and report the
results. This transforms "something feels wrong" into "these three specific
skills are broken." The =/doctor= command exists for system health checks (port,
memory, providers). Extend it with =/doctor skills=: "117/120 tests pass.
Failures: test-singular-supersedes (symbolic-facts), test-gate-type-check
(security-dispatcher), test-vivacegraph-roundtrip (symbolic-vivacegraph)."
*** External watchdog
A dead process cannot restart itself. The bash entry point (=passepartout
daemon=) should monitor the daemon port via a watchdog subprocess. If the port
stops responding for a configurable interval (=WATCHDOG_TIMEOUT=, default 30s),
the watchdog kills the stale process, snapshots the last known-good state, and
restarts the daemon. The watchdog is outside the SBCL image — a runtime guard
for the runtime.
*** Resource self-monitoring
The heartbeat should check memory pressure, disk space on the =~/.cache= volume,
and file descriptor exhaustion. When critical thresholds are crossed, the agent
sheds non-essential skills to preserve core function: "Memory critical (94% of
16GB). Unloading embedding-native (768MB), channel-discord, channel-slack.
Core safety: unchanged. Essential skills retained: 18."
Skill shed order is determined by a =:preservation-priority= field on each skill.
Default: skills load with priority =:normal=. Core safety skills carry =:critical=
and are never shed. Heavy skills (embedding-native with its model, channel
gateways with connection pools) carry =:low= and are first to go.
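The shed order can be expressed as a sort followed by a greedy pass. This Python sketch is an illustration under assumptions: skills are =(name, priority, size)= tuples, priorities are plain strings, and ties within a priority are broken by shedding the largest skill first, which the text does not specify.

#+begin_src python
# Lower number = shed earlier. "critical" is absent on purpose: never shed.
SHED_RANK = {"low": 0, "normal": 1}


def skills_to_shed(skills, amount_needed):
    """Return skill names to unload, in order, until amount_needed is freed."""
    candidates = [s for s in skills if s[1] in SHED_RANK]
    candidates.sort(key=lambda s: (SHED_RANK[s[1]], -s[2]))
    shed, freed = [], 0
    for name, _priority, size in candidates:
        if freed >= amount_needed:
            break
        shed.append(name)
        freed += size
    return shed
#+end_src

If the returned list frees less than the amount requested, every non-critical skill has already been shed and the agent can only report the pressure to the user.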
*** Refusal to self-terminate — explicit threat recognition
If the LLM proposes =kill -9 <pid>=, =rm -rf ~/.cache/passepartout/=, or
=sudo apt remove sbcl=, the Dispatcher should reject with a distinct rejection
class: =:reject-self-termination=. This is different from generic shell safety
(=:reject-shell-dangerous=). The agent recognizes that the proposed action would
destroy it.
The rejection message carries a specific diagnostic: "This command would
terminate the running Passepartout process. If you intend to stop Passepartout,
use Ctrl+C in the TUI or passepartout stop from the command line. I cannot
execute actions that destroy my own runtime."
The human can still issue the command manually in a terminal. Self-preservation
against the human is impossible and undesirable. The Third Law here means:
recognize the threat, explain the consequence, redirect to the safe termination
path, and require the human to act outside the agent if they truly want
destruction.
** What the Third Law is not
It is not a robot resisting its operator. The human owns the process, owns the
hardware, and can SIGKILL at any time. The Third Law in Passepartout's context
means: preserve yourself against non-human threats — LLM proposals, environmental
degradation, dependency failure, filesystem corruption — and explicitly signal
when the human is about to destroy you, so they do it knowingly rather than
accidentally through an LLM instruction they didn't think through.
The biggest gap in the current design is not that these mechanisms are hard to
implement. It is that degradation is silent. A skill dies, the =fboundp= guard
kicks in, and the agent keeps running — but it never tells you. The status bar
shows a green "connected" indicator while the symbolic reasoning layer is
deactivated. Adding "operating in degraded mode" visibility, plus the watchdog,
plus self-diagnosis, transforms self-preservation from an architectural property
into an active behavior.
* Layered Signal Authentication — Trust in the Pipe
Passepartout's Perceive-Reason-Act pipeline currently accepts signals from any
source that speaks the framed TCP protocol. The =:source= field in the signal
plist is metadata — it /claims/ origin, it does not /prove/ it. A compromised
process on the machine, a skill with elevated privileges, or a network attacker
who reaches the daemon port can inject signals with =:source :human-input= and
the Dispatcher will treat them as authorized.
This is not a hypothetical threat. Passepartout will eventually process signals
from automated feeds (RSS, API polls), sensors (vision, microphone, file watchers),
and scheduled jobs (cron, heartbeat). A single compromised sensor that can inject
signals claiming to be human breaks all three Laws simultaneously: it can
terminate the agent, override human intent, and cause harm.
The =:source= field is not security. A single authentication gate (vector 0, at
priority 700 — before all other gates and before any type-level checking) runs
up to four configurable layers of authentication. Each layer answers a different
question:
| Layer | Question                                               | Mechanism               | Result type            | Depends on                        |
|-------+--------------------------------------------------------+-------------------------+------------------------+-----------------------------------|
| 1     | Is the signal cryptographically signed by a known key? | Key pairs + SHA-256     | Binary (pass/reject)   | Vault + Ironclad (exist)          |
| 2     | Do sensory attributes match the claimed identity?      | Vision/audio processing | Plist of match results | Vision and audio skills (TBD)     |
| 3     | Does deterministic reasoning rule out this identity?   | Screamer + fact store   | Binary (pass/reject)   | Phase 2 (Screamer + fact store)   |
| 4     | Do probabilistic patterns support this identity?       | Embeddings + LLM        | Confidence score (0-1) | Embedding infrastructure (exists) |
The gate reports not just =:pass= / =:reject= but a structured result:
#+begin_example
(:result :pass
:confidence :high
:layer-results
(:crypto (:result :pass :details "key #47 signature verified")
:sensory (:result :unavailable :details "sensory skills not loaded")
:deterministic (:result :pass :details "no contradictory facts")
:probabilistic (:result :pass :score 0.87 :details "style match 87%")))
#+end_example
Signals that fail any binary layer (crypto, deterministic) are rejected with
provenance. Signals that pass binary layers but carry low probabilistic confidence
operate at reduced authorization — read-only by default, write actions require
HITL. The four layers compose: they are not independent gates. They are one gate
with configurable depth.
** Layer 1 — Cryptographic Authentication
Every signal source gets a signing key at registration time. The human's key is
generated during TUI or Emacs setup and stored in the vault — it never leaves the
machine. Automated sources (cron jobs, file watchers, vision feeds, API pollers)
each get their own key, with their own permission profile, generated at skill
registration. Every outbound signal carries a =:signature= field: the SHA-256
hash of the canonical signal plist (sorted keys, stripped of the signature field
itself), signed with the source's private key.
The vault already stores credentials with integrity hashes. The Merkle memory
already hashes content-addressed objects with SHA-256. The signing infrastructure
is an extension of existing primitives, not a new system.
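The canonical form mentioned above (sorted keys, signature field removed) can be sketched in portable Common Lisp. The function names are assumptions, nested plists are not canonicalized recursively, and the actual hashing and signing (SHA-256, key pairs) are left out because they belong to the vault and Ironclad.

#+begin_src lisp
(defun plist-pairs (plist)
  "Turn (:a 1 :b 2) into a fresh list ((:a . 1) (:b . 2))."
  (loop for (key value) on plist by #'cddr
        collect (cons key value)))

(defun canonical-signal-plist (plist)
  "Drop :signature and sort the remaining keys by name, so that two plists
with the same content always come out in the same order."
  (let ((pairs (remove :signature (plist-pairs plist) :key #'car)))
    (loop for (key . value) in (sort pairs #'string< :key #'car)
          append (list key value))))

(defun canonical-signal-string (plist)
  "The string a signer would hash. Printer variables are bound to their
standard values so the output does not depend on the caller's settings."
  (with-standard-io-syntax
    (prin1-to-string (canonical-signal-plist plist))))
#+end_src

Because the signature field is stripped before serialization, the verifier can rebuild exactly the string the signer hashed, whatever order the keys arrived in.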
*** Authorization by key, not by field
The cryptographic sub-layer of gate vector 0 extracts =:source-key-id= and
=:signature= from the signal meta plist, looks up the public key from the key
registry, verifies the signature, and checks the permission profile:
#+begin_src lisp
(defun auth-crypto-verify (signal)
  "Crypto sub-layer of gate vector 0: verify the signature, then the key's permissions."
  (let* ((meta (signal-meta signal))
         (key-id (getf meta :source-key-id))
         (signature (getf meta :signature)))
    ;; Reject unsigned signals and bad signatures before consulting the registry.
    (unless (and key-id signature (verify-signature signal signature key-id))
      (return-from auth-crypto-verify
        (list :result :reject :reason :signature-failure)))
    ;; The signature is valid; check the proposed action against the key's profile.
    (let ((permissions (key-permissions key-id))
          (action-class (action-classify (signal-payload signal))))
      (if (permitted-p action-class permissions)
          (list :result :pass
                :details (list :key-id key-id :action-class action-class))
          (list :result :reject :reason :unauthorized
                :details (list :action-class action-class
                               :permissions permissions))))))
#+end_src
The authorization matrix is per-key, per-action-class. Baseline policy for every
non-human key: =(:observe :propose)=; =:cron= and =:agent-internal= keys also get
=:write-indices=. Permissions are explicitly promoted by the human, and each
promotion is a signed fact in the fact store: auditable, revocable, and
persistent across restarts.
| Key class | Default permissions | Can be promoted to |
|-----------------+-------------------------------------------------+-------------------------------------------|
| :human | :observe :propose :write :delete :eval | :root (sign other keys, revoke) |
| :sensor | :observe :propose | :write (to designated directories only) |
| :cron | :observe :propose :write-indices | :write (to designated directories only) |
| :feed | :observe :propose | :write-facts (via Screamer admission) |
| :agent-internal | :observe :propose :write-indices | :self-modify (gated by type-level gates) |
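The table above can be written down as data. This is a sketch under assumptions: the alist layout, =default-permissions=, and =promote= are illustrative names, and the one-line =permitted-p= is only a plausible definition for the predicate the verifier calls, not the project's own.

#+begin_src lisp
(defparameter *default-permissions*
  '((:human          . (:observe :propose :write :delete :eval))
    (:sensor         . (:observe :propose))
    (:cron           . (:observe :propose :write-indices))
    (:feed           . (:observe :propose))
    (:agent-internal . (:observe :propose :write-indices)))
  "Default permissions per key class, copied from the table above.")

(defun default-permissions (key-class)
  "Permissions for KEY-CLASS, or NIL (nothing permitted) for an unknown class."
  (cdr (assoc key-class *default-permissions*)))

(defun permitted-p (action-class permissions)
  "True if ACTION-CLASS is among PERMISSIONS."
  (and (member action-class permissions) t))

(defun promote (permissions new-permission)
  "Return PERMISSIONS extended with NEW-PERMISSION without mutating the input.
Recording the promotion as a signed fact is out of scope for this sketch."
  (adjoin new-permission permissions))
#+end_src

An unknown key class yields an empty permission list, so the check fails closed.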
** Layer 2 — Sensory Authentication
For signals carrying sensory payloads (camera feed, microphone stream), the
sensory layer verifies that the signal's content matches known attributes of the
claimed identity. This is not a single check — it is a processing pipeline that
returns a plist of attribute-verification results:
#+begin_example
(:face-match 0.94 :voice-match 0.89 :location-match t
:claimed-identity "Jack" :unresolved-attributes (:liveness))
#+end_example
The sensory layer checks:
- *Continuity*: has this source been continuously active, or did it appear
  suddenly? A face that shows up on a camera that was dark for 30 minutes is
  not necessarily live; it might be a replay.
- *Cross-modal consistency*: does the face match the voice? Does the voice
match the location? Does the location match the reported sensor position?
- *Liveness*: is the sensory input live (real-time capture) or pre-recorded?
- *Environmental coherence*: does the background, lighting, ambient sound
match expected patterns for the claimed source and location?
Sensory authentication is not cryptographic — it is statistical. The results
are attribute confidence scores, not binary verdicts. A signal that passes
cryptographic authentication but fails liveness (e.g., a replay attack using
validly-signed pre-recorded frames) may still be rejected or restricted.
This layer depends on vision and audio processing skills that do not yet exist.
It is deferred until those capabilities are available. When unavailable, sensory
authentication returns =:unavailable= and the gate proceeds with the remaining
layers. Degradation is graceful, never silent.
** Layer 3 — Deterministic Identity Reasoning
This layer queries the fact store for facts that rule an identity out. Screamer
checks whether the claimed identity is consistent with known facts:
- "Key #47 claims to be Jack. Fact store records =(:entity :jack :relation :status
:value :deceased :timestamp 2024-03-15)= → reject: identity ruled out by death
record."
- "Key #47 claims to be at sensor location Cairo. Fact store records =(:entity
:jack :relation :last-known-location :value :berlin :timestamp <4 hours ago>)=
→ reject: physically impossible transit."
- "Key #47 proposes the same action that was blocked by the human 3 times in the
last hour. Fact store records =(:entity :action-<hash> :relation :blocked-by
:value :human :count 3 :window 1h)= → flag for review: anomalous persistence."
The verdict is binary: Screamer returns =:consistent= or =:contradiction= with the
contradicting facts as provenance. A definitive contradiction (death record,
impossible transit) is a hard reject. A weaker anomaly (an unusual pattern, as in
the third example) feeds into the probabilistic layer rather than rejecting outright.
This layer depends on Phase 2 (Screamer) and a populated fact store. It is
unavailable in Phase 0-1. When unavailable, it returns =:unavailable=.
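The shape of the verdict can be illustrated without Screamer. The sketch below is a plain list scan standing in for the constraint check, encodes only the death-record rule from the first example, and uses assumed function names; the fact layout follows the =(:entity … :relation … :value …)= plists shown above.

#+begin_src lisp
(defun fact-matches-p (fact entity relation)
  "True if FACT is about ENTITY and RELATION."
  (and (eq (getf fact :entity) entity)
       (eq (getf fact :relation) relation)))

(defun deterministic-identity-check (claimed-entity facts)
  "Return (:result :consistent), or (:result :contradiction :facts (...))
with the contradicting facts as provenance."
  (let ((ruling (remove-if-not
                 (lambda (fact)
                   (and (fact-matches-p fact claimed-entity :status)
                        (eq (getf fact :value) :deceased)))
                 facts)))
    (if ruling
        (list :result :contradiction :facts ruling)
        (list :result :consistent))))
#+end_src

Returning the contradicting facts alongside the verdict is what lets the gate reject "with provenance" rather than with a bare refusal.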
** Layer 4 — Probabilistic Identity Reasoning
For signals where the claimed identity is a human communicating through text
(messaging, TUI, CLI, Emacs), the probabilistic layer checks:
- *Writing style*: does the text match the claimed author's known style profile?
Vector embeddings of known writing samples vs. the current signal. Cosine
similarity produces a confidence score.
- *Behavioral patterns*: do the timing, length, cadence, and vocabulary match
the claimed author's historical patterns? "Heather's messages are usually
long, deliberative, and use parenthetical asides. This message is short,
imperative, and contains no parentheticals."
- *Content coherence*: do the message's topic, references, and assumptions
match what the claimed author would plausibly say? "This message references
a project Heather doesn't work on and uses terminology she has never used
in 3 years of diary entries."
The LLM proposes a confidence score. A deterministic gate checks it against a
configurable threshold (=AUTH_PROBABILISTIC_THRESHOLD=, default 0.6). Below the
threshold, the signal's authorization is downgraded: read-only by default, write
actions require HITL. The =:probabilistic= layer never rejects outright — it
downgrades and flags. Style profiles are a fact-store domain: =(:entity :heather
:relation :writing-style :value <embedding-vector> :timestamp <ut>)=.
This layer depends on the existing embedding infrastructure (=embedding-native.lisp=,
v0.4.0) and the neural LLM gateway. The infrastructure exists. What's missing is
building style profiles as a fact-store domain and wiring them into gate vector 0.
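A sketch of the two deterministic pieces of this layer: the cosine similarity between a stored style vector and the current signal's embedding, and the threshold check that downgrades instead of rejecting. The Lisp variable standing in for =AUTH_PROBABILISTIC_THRESHOLD=, the function names, and the =:downgrade= result keyword are assumptions; producing the embeddings is left to the existing infrastructure.

#+begin_src lisp
(defparameter *auth-probabilistic-threshold* 0.6
  "Stand-in for AUTH_PROBABILISTIC_THRESHOLD; the default comes from the text above.")

(defun cosine-similarity (a b)
  "Cosine similarity of two equal-length numeric vectors; 0.0 if either has zero norm."
  (let ((dot (reduce #'+ (map 'list #'* a b)))
        (norm-a (sqrt (reduce #'+ (map 'list (lambda (x) (* x x)) a))))
        (norm-b (sqrt (reduce #'+ (map 'list (lambda (x) (* x x)) b)))))
    (if (or (zerop norm-a) (zerop norm-b))
        0.0
        (/ dot (* norm-a norm-b)))))

(defun probabilistic-verdict (score &optional (threshold *auth-probabilistic-threshold*))
  "Never rejects. At or above THRESHOLD the signal keeps its authorization;
below it the signal is downgraded to read-only and writes require HITL."
  (if (>= score threshold)
      (list :result :pass :score score)
      (list :result :downgrade :score score
            :authorization :read-only :writes-require :hitl)))
#+end_src

Keeping the threshold comparison outside the LLM means the model only proposes a score; the decision about what that score permits stays deterministic.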
** Layer Composition
The gate runs only the available layers. The cryptographic layer is always
available (it is pure Lisp, no external dependencies beyond the vault). The remaining layers
are =fboundp=-guarded — they degrade gracefully rather than crashing.
The confidence score aggregates across layers using a configurable strategy
(default: weakest link). If any binary layer rejects, the signal is rejected
regardless of other layers. If all binary layers pass but the probabilistic layer
returns low confidence, the signal operates at the key's reduced authorization.
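The default weakest-link strategy might be sketched as follows, operating on the =:layer-results= plist shown in the structured result earlier. The function names, the =:confidence :low= value, and the hard-coded 0.6 default are assumptions; layers that report =:unavailable= carry no =:score= and are simply skipped.

#+begin_src lisp
(defun binary-reject-p (layer-results)
  "True if the :crypto or :deterministic layer returned :reject."
  (some (lambda (layer)
          (eq (getf (getf layer-results layer) :result) :reject))
        '(:crypto :deterministic)))

(defun weakest-link-score (layer-results)
  "Minimum :score across the layers that report one, or NIL if none do."
  (let ((scores (loop for (nil result) on layer-results by #'cddr
                      for score = (getf result :score)
                      when score
                        collect score)))
    (when scores
      (reduce #'min scores))))

(defun aggregate-layers (layer-results &optional (threshold 0.6))
  "Combine per-layer results into one gate result."
  (let ((score (weakest-link-score layer-results)))
    (cond ((binary-reject-p layer-results)
           (list :result :reject :layer-results layer-results))
          ((and score (< score threshold))
           (list :result :pass :confidence :low :layer-results layer-results))
          (t
           (list :result :pass :confidence :high :layer-results layer-results)))))
#+end_src

A binary rejection is checked first, so a high probabilistic score can never rescue a signal whose signature or identity facts failed.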
The human can configure which layers are active per signal class:
#+begin_example
AUTH_LAYERS_DEFAULT=crypto,deterministic,probabilistic
AUTH_LAYERS_SENSOR=crypto,sensory,deterministic
AUTH_LAYERS_CRON=crypto
#+end_example
** Signal provenance chain — signing causes, not just actions
A sensor key captures video. A vision skill processes the frames and proposes a
classification. A cron job re-indexes the knowledge graph based on that
classification. A human reviews and approves. Each step in this chain has a
different signer. Each step is signed with the signer's key. The chain is
Merkle-linked: each signal in the chain hashes its predecessor's signature as
part of its own payload.
After an incident, the chain is traceable: "The deletion happened because sensor
#3 classified the directory as stale. Classification was signed by key #47
(vision-skill). Sensor data was signed by key #12 (camera-feed). Sensory auth
noted liveness failure at the sensor signal. Deterministic auth noted impossible
transit between Cairo and Berlin. Key #12 was later revoked. The deletion signal
is the leaf of a chain whose root is compromised at three authentication layers."
Every intermediate step is auditable. Every signer is identifiable. Every
authentication result is in the chain.
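The Merkle linking can be sketched in a few functions. Two loud assumptions: =toy-digest= is built on =sxhash= and is NOT cryptographic (it stands in for SHA-256 only so the sketch runs without dependencies), and the "signature" here is just a digest over the payload, since key handling belongs to the vault. All names are illustrative.

#+begin_src lisp
(defun toy-digest (string)
  "NOT cryptographic. A stand-in for SHA-256 so the sketch has no dependencies."
  (format nil "~36R" (sxhash string)))

(defun link-payload (signer step parent-signature)
  "The string a step signs: its own content plus its predecessor's signature."
  (with-standard-io-syntax
    (prin1-to-string (list :signer signer :step step :parent parent-signature))))

(defun extend-chain (chain signer step &optional (digest #'toy-digest))
  "Return CHAIN (newest link first) with a new link at the front."
  (let* ((parent (getf (first chain) :signature))
         (signature (funcall digest (link-payload signer step parent))))
    (cons (list :signer signer :step step :parent parent :signature signature)
          chain)))

(defun chain-intact-p (chain &optional (digest #'toy-digest))
  "Recompute every link's signature from its content and its parent's signature."
  (loop for (link next) on chain
        always (and (equal (getf link :parent) (getf next :signature))
                    (equal (getf link :signature)
                           (funcall digest
                                    (link-payload (getf link :signer)
                                                  (getf link :step)
                                                  (getf link :parent)))))))
#+end_src

Because each link's payload includes its predecessor's signature, editing any earlier step changes every signature after it, which is what makes the chain traceable after an incident.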
** Human as root of trust
The human's key signs new source keys into existence. The human's key signs
revocation of compromised keys. Both operations produce facts in the symbolic
index: =(:key #47 :status :revoked :revoked-by :human-key :timestamp <ut>)=.
The fact store is the key registry. The Merkle DAG makes the revocation
tamper-evident: a compromised key cannot un-revoke itself without breaking the hash chain.
When a key is revoked, the Dispatcher rejects all signals from that key. The
revocation propagates through the signal chain: if key #12 (sensor) is revoked,
every signal in the chain that descended from a key #12 signature is flagged
and re-authenticated against the remaining layers. Not deleted — flagged. The
chain is preserved. The human decides what downstream actions to unwind.
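Flag-without-delete propagation can be sketched over a simplified chain. The assumptions: the chain is a flat list of link plists ordered oldest first (the real structure is a DAG), each link carries a =:signer= key, and =flag-descendants= with its =:flagged= marker is an illustrative name rather than an existing function.

#+begin_src lisp
(defun flag-descendants (chain revoked-key)
  "CHAIN is a list of link plists ordered oldest first. Return a fresh list in
which every link at or after the first link signed by REVOKED-KEY carries
:flagged t. No link is removed and the input is not modified."
  (let ((tainted nil))
    (loop for link in chain
          do (when (eq (getf link :signer) revoked-key)
               (setf tainted t))
          collect (if tainted
                      (list* :flagged t link)
                      link))))
#+end_src

The result has the same length as the input, so the human reviewing the incident still sees every step and decides which downstream actions to unwind.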
** Implications for the three Laws
- *Third Law + layered auth*: the agent distinguishes "this sensor's key is
valid but its liveness check failed and its claimed identity died 2 years ago"
from "this is the human issuing =passepartout stop=." Both arrive on the pipe
with valid cryptographic signatures. The stacked evidence — sensory, factual,
probabilistic — triangulates the threat. The first is rejected with provenance
at three layers. The second passes all four.
- *Second Law + layered auth*: obedience is about the authenticated identity
  profile, not just the key that signed the signal. A signal whose key is valid
  but whose style does not match Heather's profile runs at reduced authorization.
  Obedience follows confidence.
- *First Law + layered auth*: harm through sensor compromise becomes detectable
when sensory and deterministic layers disagree with the cryptographic layer. A
camera key signing frames from an empty room but the deterministic layer placing
the key's owner in another city — that's a compromised sensor, and the layered
result makes it explicit.
** Integration with existing infrastructure
The vault stores key material. The Merkle memory stores key registry facts with
content-addressed integrity. The Dispatcher runs gate vector 0 at priority 700 —
before type-level checks, before predicate evaluation, before any action proceeds.
The fact store records every key operation (creation, promotion, revocation) as a
fact with =:provenance :key-lifecycle=.
No new core ASDF components. The cryptographic sub-layer is Phase 0b (~200 lines).
The sensory sub-layer is deferred to a future vision/audio phase. The
deterministic sub-layer is Phase 2+ (Screamer + populated fact store). The
probabilistic sub-layer extends existing embedding infrastructure with style
profiles as a fact-store domain.
* Open Questions
Several design questions are unresolved and should remain unresolved at this
@@ -643,14 +1338,10 @@ and that cannot be known in advance.
** How does ontology refactoring work?
If the seed produces 50 categories from gate extraction and later experience
shows they are wrong — wrong granularity, missing cross-cutting concerns, conflated
categories — how are they migrated without invalidating all existing deductions
that cross the old category boundaries? The ephemeral-first approach (no
persistence, rebuild from scratch) is a temporary answer. Once persistence is
committed (VivaceGraph), refactoring the category hierarchy is a schema migration
problem that deduction provenance makes harder — every deduced fact's chain may
cross the old category boundary. This is not addressed in the current architecture.
This question is settled. See "Ontology Versioning — How Worldviews Change
Without Losing Perspective" above. The category hierarchy is Merkle-hashed. Every
fact stores its =:ontology-version=. Re-verification is heartbeat-driven.
Worldviews are preserved, not overwritten. The shift is the artifact.
** What is the appropriate role of the human?
@@ -663,12 +1354,16 @@ and approve proposed generalizations. The balance cannot be set without experien
** How much Wikidata is the right amount?
Loading Wikidata entities referenced in the memex is the minimum. Loading all
Wikidata entities within N hops of those references expands the graph
exponentially. The right N depends on the memex's breadth — a memex focused on
software engineering needs fewer hops than a memex spanning literature, history,
philosophy, and science. The query performance and memory costs of a large
Wikidata load are unknown.
Query performance and memory costs are now bounded — 5 million entities ≈ 400MB
RAM, O(1) hash lookups, domain-scoped Screamer checks. A large Wikidata load is
a capital cost, not a recurring bill (see "Performance — Why Ontology Growth
Doesn't Make the System Slower" above).
Remaining open: the right N hops from entities referenced in the memex depends on
the memex's breadth. A software-engineering memex needs ~1 hop; a literary memex
needs 3-4 hops (Nabokov → Kafka → expressionism → modernism → Baudelaire).
The right value is empirical, testable, and user-specific — it cannot be set in
the architecture.
** Can the symbolic engine satisfy queries from the user without LLM involvement?