Amr Gharbeia 4e9431ec1d memex: update passepartout submodule → v0.7.2, add notes

2026-05-08 21:56:11 -04:00

Passepartout Neurosymbolic Engine — Design Decisions and Architecture Options

The Hallucination Problem — Why Neurosymbolic
- See also:
The Five Architecture Options
The Chosen Path: Option 4, Starting with Option 5
- Why the dual index is permanent, not transitional
The Neuro as Brain, the Symbolic as Education
The Gate-to-Fact Bootstrap — Extracting the First Ontology from Existing Code
The LLM as Proposer — Verified Extraction
Three Contradiction Policies — Domain-Dependent Consistency
How Categories Grow — The Organic Ontology
- Growth is self-limiting by design
Semantic Wikipedia as Entity Backbone
- The decisive simplification
The "Flip" — From Lossy Extraction to Deterministic Derivation
- The sufficiency criterion
- The flip does not mean "complete"
Ephemeral First, Persistent Later
Whitehead's Concrete Contributions — Four Operational Contributions
The Provenance Chain as Product
The Competitive Argument
Open Questions
Relation to Passepartout's Existing Architecture

The Hallucination Problem — Why Neurosymbolic

An LLM is a statistical engine trained on token sequences. It generates the most probable continuation of a prompt. Given sufficient context, that continuation is usually correct. Given novel context, it is often wrong in confident-sounding ways.

This is not a training deficiency. Hallucination is a fundamental property of probabilistic inference. You can reduce it with better models, longer contexts, and clever prompting, but you cannot eliminate it by making the LLM better. You eliminate it by not asking the LLM to do things that require certainty.

This is the architectural bet at the heart of Passepartout's neurosymbolic design. The LLM should not be the reasoning engine. It should be the creative engine — proposing possibilities, surfacing connections, translating between natural language and formal representation. The reasoning engine should be symbolic: deterministic, verification-grounded, provenance-tracked, and incapable of hallucination by construction.

This is not a rejection of neural methods. It is a division of labor. The neuro is the brain — generative, associative, creative, comfortable with ambiguity. It produces hypotheses. The symbolic engine is the education — accumulated, verified, provenance-tracked knowledge that the brain draws on and is disciplined by. It doesn't think. It remembers, checks, and constrains.

The brain is always smarter than the education, but the education prevents the brain from being confidently wrong.

The Five Architecture Options

The symbolic engine must relate to the human memex. The relationship is not obvious because knowledge lives in two incompatible forms: natural language prose (what the human reads and writes) and formal facts (what the symbolic engine reasons about). The translation between them is lossy by nature. The architecture is defined by how it handles that lossiness.

notes/passepartout-symbolic-engine-exploration.org explores five options. They are summarized here to make subsequent decisions legible.

Option 1: The Auto-Formalizer

A separate knowledge graph stores symbolic facts. The LLM populates it by extracting triples from unstructured data — documentation, manuals, logs, session histories. The KG becomes co-authoritative with the human prose.

This is the simplest to implement but inherits the dual-representation problem in its most acute form. The KG and the prose can disagree, and the architecture provides no mechanism for resolving disagreements. It also stores knowledge twice — once in the user's Org files, once in the KG — with no guarantee that they stay synchronized.

Option 2: Two Intentionally Separate Memexes

The human memex contains prose: thoughts, diaries, decisions, documentation. The symbolic memex contains formal facts: constraints, rules, relationships, deductions. The archivist bridges between them but does not try to keep them synchronized. They are allowed to diverge because they serve different purposes. The prose captures what the human intended. The symbolic memex captures what the symbolic engine has proven.

This is philosophically honest — it admits that no lossless translation between natural language and formal logic is possible. But it forces the user to reason about two separate knowledge stores and understand when to trust each.

Option 3: Tangled Fact Blocks in Org Files

The tangle mechanism already handles the dual-representation problem for code. Lisp code lives in literate blocks within Org files (#+begin_src lisp). The tangle mechanism extracts these blocks and generates .lisp files. A new block type — #+begin_src knowledge — would contain symbolic facts in a formal language. The tangle mechanism would load these facts into the symbolic engine's in-memory store, just as it loads Lisp code into the SBCL image.

This is aesthetically appealing because it unifies the format. One toolchain, one version control system, one Merkle tree. But the block language itself IS the knowledge representation language, and that language is the ontology we have not yet defined. The format is unified but the content is unspecified.

Option 4: One Memex, Two Indices

The prose remains in human language in Org files. The prose is always the ground truth. Two indices sit on top of the prose as derived views:

The neural index uses vector embeddings to enable semantic search. The LLM navigates the prose through embedding space, retrieving relevant headings.
The symbolic index stores formal assertions about what the prose says — predicates, relations, constraints — each grounded to a specific heading or block in the Org file.

Each index serves its own side of the machine. They do not need to understand each other's representations. They only need to agree on which heading or block they are referring to. Because the prose is always the ground truth, the symbolic index can be thrown away and rebuilt from scratch if it becomes corrupted or stale. Nothing authoritative is lost: only the extracted assertions are discarded, and they can be re-derived from the prose.
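
The rebuild property can be illustrated with a minimal sketch. It assumes headings are identified by string IDs and that some extractor function turns a heading's text into grounded assertions. The names *prose*, toy-extractor, and rebuild-symbolic-index are hypothetical, and the extractor is a toy stand-in, not the archivist.

```lisp
;; Hypothetical prose store: heading ID -> heading text (the ground truth).
(defparameter *prose*
  '(("h-001" . "The .env file holds API keys.")
    ("h-002" . "Deploys run every Friday.")))

(defun toy-extractor (heading-id text)
  "Stand-in for real extraction: one assertion per heading, grounded to it."
  (list (list :grounding heading-id
              :word-count (1+ (count #\Space text)))))

(defun rebuild-symbolic-index (prose extractor)
  "Derive a fresh index (heading ID -> assertions) from PROSE alone."
  (let ((index (make-hash-table :test #'equal)))
    (loop for (heading-id . text) in prose
          do (setf (gethash heading-id index)
                   (funcall extractor heading-id text)))
    index))
```

Calling (rebuild-symbolic-index *prose* #'toy-extractor) twice yields two independent tables with equal contents, which is the sense in which discarding the index costs compute rather than data.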

Option 5: Ephemeral Symbolic Facts

No persistence, no serialization format, no knowledge graph stored on disk. VivaceGraph exists in memory during the session. Screamer derives facts from the prose as needed. When the session ends, the facts are discarded and re-derived from the prose on the next start.

This punts the ontological design problem entirely. You never have to decide on a serialization format because you never serialize. The cost is compute (re-derivation on every restart) and the inability to accumulate facts across sessions. But it is the correct first step — a way to learn what kinds of facts are actually useful before committing to a storage format.

The Chosen Path: Option 4, Starting with Option 5

The one-memex-two-indices architecture (Option 4) is the correct long-term architecture. The prose is the ground truth. The symbolic index is a derived view that can be rebuilt. The neural index handles what the symbolic index cannot — semantic search, fuzzy matching, associative leaps.

But committing to a persistence format before knowing what facts are useful is premature. The practical path starts with Option 5 (ephemeral facts) as the Phase 1-4 implementation, then graduates to Option 4 with VivaceGraph persistence in Phase 5 when the fact language has been battle-tested (see passepartout-neurosymbolic-roadmap.org).

Why the dual index is permanent, not transitional

In the coding domain, there is an aspiration that the symbolic index could eventually capture enough of the prose's propositional content to become a complete representation — the "flip" described in the architecture note. But for the broader memex (literature, poetry, personal reflection, daily logs), completeness is neither possible nor desirable. You cannot formalize what makes a poem beautiful. You cannot extract a triple that captures the emotional weight of a diary entry. The neural index will always be the gateway to the full richness of the prose. The symbolic index handles what can be mechanically verified: citations, entities, temporal order, contradictions, provenance. The division of labor between the two indices is permanent because the domains they serve are fundamentally different kinds of knowledge.

The Neuro as Brain, the Symbolic as Education

The original 10-80-10 architecture (10% neural, 80% symbolic, 10% neural) describes the target ratios for a coding agent — a domain where most reasoning is formalizable. For the broader memex, the ratios are different and less important than the metaphor itself.

The neuro is the brain — generative, associative, creative, comfortable with ambiguity. It produces insights that are provisional, connections that are speculative, hypotheses that may be wrong. It is the driver.

The symbolic engine is the education — accumulated, verified, provenance-tracked knowledge that the brain draws on and is disciplined by. It doesn't think creatively. It remembers, checks, and constrains. It prevents the brain from being confidently wrong.

This framing resolves a tension in the original architecture. The 10-80-10 implies the symbolic engine replaces the neuro for reasoning. But a symbolic engine is terrible at creativity, ambiguity, and associative leaps across unrelated domains — exactly what you need for a memex that contains Pale Fire, a shopping list, and a project plan. The brain proposes that your sudden interest in unreliable narrators coincides with a week where your project retrospective used the word "deception." The education verifies: "those two diary entries are 4 days apart; the word 'deception' appears in both; here are the headings." The brain makes the leap. The education makes it trustworthy.

This means the symbolic engine never needs to be "complete." Education isn't complete knowledge — it's structured knowledge. You don't need a fact for every sentence in your diary. You need facts for what can be mechanically verified: dates, citations, entities, contradictions, temporal order. The brain handles the rest.

The Gate-to-Fact Bootstrap — Extracting the First Ontology from Existing Code

The Dispatcher gate stack already encodes an implicit ontology. Every gate vector asserts the existence of a category of things:

Gate vector 2 asserts there exists a class of files called secrets.
Gate vector 7 asserts there exists a class of commands called destructive.
Gate vector 8 asserts there exists a class of domains called trusted.
The self-build boundary asserts there exists a class of files called core-harness and a class called skills.

These claims are currently expressed as code — Lisp functions that pattern-match against file paths, shell commands, and URLs. They are not facts the symbolic engine can query, derive from, or check for consistency. But they can be made explicit.

The bootstrap makes every gate a set of initial symbolic facts: (:file ".env" :member-of-class :secret-files :source gate-vector-2), (:command "rm -rf /" :classified-as :catastrophic :source gate-vector-7), (:domain "api.telegram.org" :classified-as :trusted :source gate-vector-8).

This produces 50-70 ontology terms (entity classes together with the relations, qualities, and provenance sources they use) directly from the existing gate stack, without any new infrastructure:

| Source                         | Count | Example categories                                      |
|--------------------------------+-------+---------------------------------------------------------|
| `dispatcher-protected-paths`   | 11    | :secret-config-file, :ssh-key-file, :gpg-key-file       |
| `dispatcher-shell-blocked`     | 8     | :catastrophic-command, :injection-pattern               |
| `dispatcher-network-whitelist` | 2     | :trusted-domain, :untrusted-domain                      |
| Self-build boundary            | 2     | :core-harness-file, :skill-file                         |
| Privacy tags                   | 3     | :private-content, :financial-content                    |
| Permission table               | 3     | :read-only-tool, :write-tool, :eval-tool                |
| Cognitive tools                | 6     | :code-search-tool, :file-io-tool, :shell-tool           |
| Relations (all gates)          | ~15   | :member-of-class, :classified-as, :depends-on           |
| Qualities                      | ~8    | :catastrophic, :dangerous, :moderate, :harmless         |
| Provenance sources             | 4     | :gate-outcome, :human-authored, :deduced, :llm-proposed |

This is the seed. It gives Screamer a domain to reason about immediately, without any LLM involvement. It proves the pattern — code becomes facts, facts enable reasoning — at the cost of approximately 30 lines of Lisp.
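
A rough sketch of what the bootstrap might look like, assuming the gate patterns are available as plain lists of strings. The variables *protected-paths*, *blocked-commands*, and *trusted-domains* and both functions are hypothetical stand-ins, since the real dispatcher variables and their contents are not shown in this note. The fact shape follows the three examples given above.

```lisp
;; Hypothetical stand-ins for the dispatcher's pattern lists.
(defparameter *protected-paths* '(".env" "id_rsa"))
(defparameter *blocked-commands* '("rm -rf /"))
(defparameter *trusted-domains* '("api.telegram.org"))

(defun gate-facts (patterns subject-key relation class source)
  "Turn each pattern in PATTERNS into one fact plist tagged with SOURCE."
  (mapcar (lambda (pattern)
            (list subject-key pattern relation class :source source))
          patterns))

(defun bootstrap-gate-facts ()
  "Collect the seed facts from all gate pattern lists."
  (append
   (gate-facts *protected-paths* :file :member-of-class :secret-files
               'gate-vector-2)
   (gate-facts *blocked-commands* :command :classified-as :catastrophic
               'gate-vector-7)
   (gate-facts *trusted-domains* :domain :classified-as :trusted
               'gate-vector-8)))
```

With these toy lists, (bootstrap-gate-facts) returns four plists, the first of which is (:file ".env" :member-of-class :secret-files :source gate-vector-2). No LLM call appears anywhere in the path.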

The LLM as Proposer — Verified Extraction

The LLM cannot be trusted to populate the symbolic index directly. Its outputs are sampled, not proven. A probabilistic extraction feeding a deterministic engine defeats the purpose of being deterministic.

But the LLM is still useful. It can surface facts that are obvious to a human reader of prose but would take the symbolic engine many deduction steps to reach independently. The solution is to demote the LLM from extractor to proposer:

The archivist reads a prose heading.
The LLM proposes candidate triples.
Screamer checks each triple for consistency against the existing fact store.
Only consistent triples are admitted to the symbolic index, flagged with :provenance :llm-proposed and grounded to the source heading.

The LLM might hallucinate facts that don't correspond to the prose. It might extract facts that contradict existing knowledge. It might produce syntactically malformed triples. None of these failures contaminate the symbolic index because proposals are not admitted automatically. The admission gate (Screamer) is deterministic.

This is the core architecture pattern. Everything else — the entity classes, the deduction engine, the persistence layer — follows from this single design decision: the LLM proposes; the symbolic engine decides whether to accept.
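
The propose-then-decide loop can be sketched in plain Common Lisp. This is an illustration under stated assumptions, not the Screamer-based gate: contradicts-p is a crude stand-in that treats every relation as single-valued, and *fact-store*, well-formed-p, and admit-proposals are hypothetical names.

```lisp
(defvar *fact-store* '()
  "Admitted facts, newest first. Each fact is a plist.")

(defun well-formed-p (triple)
  "A proposal must be a proper list (ENTITY RELATION VALUE), RELATION a keyword."
  (and (consp triple)
       (null (cdr (last triple)))
       (= (length triple) 3)
       (keywordp (second triple))))

(defun contradicts-p (triple)
  "Crude stand-in for the consistency check: a stored fact has the same
entity and relation but a different value."
  (destructuring-bind (entity relation value) triple
    (some (lambda (fact)
            (and (equal (getf fact :entity) entity)
                 (eq (getf fact :relation) relation)
                 (not (equal (getf fact :value) value))))
          *fact-store*)))

(defun admit-proposals (proposals heading-id)
  "Admit well-formed, consistent PROPOSALS; return the rejected ones."
  (let ((rejected '()))
    (dolist (triple proposals (nreverse rejected))
      (if (and (well-formed-p triple) (not (contradicts-p triple)))
          (destructuring-bind (entity relation value) triple
            (push (list :entity entity :relation relation :value value
                        :provenance :llm-proposed :grounding heading-id)
                  *fact-store*))
          (push triple rejected)))))
```

Malformed output and contradicting proposals come back as the return value and never touch the store, so a bad LLM response degrades to "nothing admitted" rather than to a corrupted index.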

Three Contradiction Policies — Domain-Dependent Consistency

Classical logic requires consistency. A contradiction implies everything (ex contradictione quodlibet). Screamer, as a constraint solver, also requires consistency — a contradictory constraint set has no solutions. But the symbolic engine operates across domains where the meaning of contradiction is fundamentally different.

A single architecture serves all domains by applying different contradiction policies, scoped to the entity class:

Policy :exclusive — Contradiction Rejected at Admission

For domains where the world is physically singular — a file either exists or it doesn't, a command either was blocked or it wasn't, a gate rule either applies or it doesn't. When a new fact contradicts an existing one in an :exclusive domain, the new fact is rejected. The existing fact is authoritative unless a human explicitly retracts it.

Use for: security classifications, file system state, gate rules, code correctness, deterministic safety constraints.

Policy :coexistent — Contradiction Flagged, Both Retained

For domains where multiple truths coexist — literary interpretations, historical accounts, personal beliefs held at different times, multi-source factual disagreement (Wikidata vs. DBpedia vs. your memex). When a new fact contradicts an existing one in a :coexistent domain, the contradiction is recorded with a cross-reference flag. Both facts are stored. Queries return all facts with provenance display.

Use for: literature, history, personal knowledge evolution, scientific consensus shift, multi-author knowledge bases.

Policy :temporal — Contradiction Accepted as Version Change

For domains where truth changes over time. When a new fact contradicts an old one in a :temporal domain, the old fact is marked :superseded but retained. The timeline is queryable: "You believed X on Tuesday, Y on Friday, Z on Sunday."

Use for: personal belief evolution, project plan revisions, scientific consensus shift over time, any knowledge where the change itself is information.

Policy Assignment

The policy is assigned when a category is defined. New categories default to :coexistent, the policy that never discards information. Core security categories are explicitly :exclusive. The gate stack's bootstrapped facts are :exclusive because they describe the actual filesystem, not perspectives.

The Screamer admission gate does not reject all contradictions. It rejects contradictions in :exclusive domains and flags them in :coexistent and :temporal domains. The constraint solver still works because queries scope their constraint set to a single provenance domain. "Is X true according to my memex?" is a different query than "Is X true according to Wikidata?" Each has a self-consistent internal logic. The contradiction is between domains, not within them.
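
As a sketch of how the three policies could share one admission function, assume facts are plists carrying :category, :entity, :relation, and :value, and that "contradiction" means the same entity and relation with a different value. The registry *policies* and the functions category-policy and admit are hypothetical; scoping queries by provenance domain is left out to keep the example short.

```lisp
(defparameter *policies* (make-hash-table :test #'eq)
  "Hypothetical registry: category keyword -> contradiction policy.")

(setf (gethash :gate-rule *policies*) :exclusive
      (gethash :belief *policies*) :temporal)

(defun category-policy (category)
  "Look up CATEGORY's policy; undeclared categories default to :coexistent."
  (gethash category *policies* :coexistent))

(defun admit (fact store)
  "Return (values NEW-STORE OUTCOME) after applying the category's policy."
  (let ((rival (find-if
                (lambda (old)
                  (and (equal (getf old :entity) (getf fact :entity))
                       (eq (getf old :relation) (getf fact :relation))
                       (not (getf old :superseded))
                       (not (equal (getf old :value) (getf fact :value)))))
                store)))
    (if (null rival)
        (values (cons fact store) :admitted)
        (ecase (category-policy (getf fact :category))
          ;; The existing fact stays authoritative.
          (:exclusive (values store :rejected))
          ;; Both facts are kept; the new one records the cross-reference.
          (:coexistent
           (values (cons (append fact (list :contradicts rival)) store)
                   :flagged))
          ;; The old fact is retained but marked as superseded.
          (:temporal
           (values (cons fact
                         (substitute (append rival (list :superseded t))
                                     rival store :test #'eq))
                   :superseded))))))
```

The function never mutates its input store, so a caller can inspect the outcome before deciding whether to keep the new store.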

Why This Matters for the Broader Memex

In the coding domain, contradiction is rare and must be resolved: a gate can't both allow and block the same path. In the broader memex, contradiction is the product, not the error. Your poetry analysis contradicts your last diary entry on the same topic. Your reading of Pale Fire changed between 2023 and 2025. Wikidata says Mount Everest is 8848m (the long-standing snow-height figure); DBpedia says 8849m (the 2020 China-Nepal survey's 8848.86m, rounded up). The symbolic engine's job is not to decide which is right. It is to surface the tension with provenance: "these three sources disagree. Here is the chain for each."

How Categories Grow — The Organic Ontology

Whitehead and Russell's Principia Mathematica took over 300 pages to define the logical foundations before it could prove that one plus one equals two. Every category introduced carried a burden of justification. Every inference rule had to be demonstrated sound. This is the classical approach to ontology: define everything upfront, exhaustively, formally.

Passepartout cannot afford this and does not need it. Its domain is bounded (software engineering, personal knowledge, literary engagement, daily life) and its ontology grows from the system's own operation:

The gate stack seeds the ontology. Every gate vector is an implicit claim about a category of things. The bootstrap makes these claims explicit. The seed is 50-70 entity classes with no human authoring required — they are mechanically extracted from the existing code.
New gate vectors add categories directly. As the Dispatcher grows (new shell patterns, new path protections, new tool classifications), the ontology grows with it. Every new pattern in the gate stack becomes a fact on skill load. No human effort. The gate stack grows, the ontology grows.
Screamer generalizes from gate outcomes. After 37 shell commands are blocked as destructive, Screamer extracts structural commonalities: "commands writing to block devices," "commands recursively deleting outside the workspace." These become new subcategories (:block-device-command, :workspace-external-delete) that didn't exist in the original gate patterns. The ontology deepens through observation.
The archivist proposes from prose. The archivist reads a diary entry about a book: "Nabokov's lectures on Kafka." The LLM proposes (:entity :nabokov :relation :lectures-on :value :kafka). Screamer checks consistency. Admitted. The categories :author, :lectures-on, and :subject didn't exist before — they are created on first use. This is the primary growth mechanism for the broader memex.
The human declares explicitly. The human writes a declarative fact directly into the symbolic index. No extraction step. No LLM involvement. The fact is admitted with :provenance :human-authored — the highest trust level.
Temporal patterns crystallize into categories. Every Sunday the memex gets a retrospective heading. Every Monday a planning heading. The time-awareness system observes the periodicity and proposes :weekly-retrospective and :weekly-planning as fact types. Screamer verifies they don't contradict existing categorizations. Admitted.
Cross-domain overlap produces parent categories. Screamer notices that :secret-files (from the gate stack) and :private-content (from privacy tags) share members — .env is both a secret file and private content. It proposes :sensitive-material as a parent with both as children. Taxonomy building happens automatically through overlap detection.

Growth is self-limiting by design

Not every conceivable category is added. The system prunes through use:

New categories are admitted only through Screamer's consistency check. A category that contradicts an existing classification is rejected.
A category that never gets queried costs almost nothing (a hash table entry) but produces no value. It fades from use naturally.
Overly fine-grained categories (.env.foo.bar.baz as its own class) are rejected because they are redundant with the wildcard pattern that already covers them.
Overly broad categories that subsume meaningful distinctions ("everything is a :file") produce contradictions when Screamer tries to apply existing rules. Rejected.

The system converges on a useful granularity through use, not through upfront design. The gate stack provides the seed. Gate outcomes, prose extraction, deduction, and human authoring grow the shoots. Screamer prunes contradictions. The ontology is a garden, not a building.
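
The overlap-detection mechanism described above lends itself to a short sketch. It assumes category membership is available as a simple table; *members*, overlapping-categories, and propose-parents are hypothetical names, and naming the parent (for example :sensitive-material) is left to whoever reviews the proposal.

```lisp
(defparameter *members*
  '((:secret-files ".env" "id_rsa")
    (:private-content ".env" "diary.org")
    (:trusted-domain "api.telegram.org"))
  "Hypothetical category -> members table.")

(defun overlapping-categories (members)
  "Return (CAT-A CAT-B SHARED-MEMBERS) for each pair sharing members."
  (loop for (a . others) on members
        append (loop for b in others
                     for shared = (intersection (cdr a) (cdr b)
                                                :test #'equal)
                     when shared
                       collect (list (car a) (car b) shared))))

(defun propose-parents (members)
  "Propose one parent category per overlapping pair. These are proposals
only; admission would still go through the consistency check."
  (mapcar (lambda (pair)
            (destructuring-bind (a b shared) pair
              (list :parent-of (list a b)
                    :evidence shared
                    :provenance :deduced)))
          (overlapping-categories members)))
```

On the toy table, the only overlap is between :secret-files and :private-content, through the shared member ".env".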

Semantic Wikipedia as Entity Backbone

The gate stack provides 50-70 entity classes — adequate for a coding agent where the domain is bounded to files, commands, and code symbols. For a general-knowledge memex, 50-70 is starvation. Your memex mentions Nabokov, Pale Fire, Kinbote, Zembla, paranoid reading, unreliable narrators, postmodernism, butterfly migration, chess problems, and the Russian exile experience. The gate stack knows none of these. Organic growth through prose extraction would take years just to cover the entities in one person's engagement with a single novel.

Wikidata has already done this work: approximately 2 million entity classes, over 100 million entities, a decade of human curation. By loading the neighborhood of your memex into the symbolic index (entities referenced in your prose, plus their N-hop property net from Wikidata), the entity recognition problem vanishes. The archivist doesn't need to discover Nabokov from your diary. It needs to connect your heading to the existing Wikidata entity. That is a simpler task — reference resolution, not knowledge extraction.

The LLM's role shrinks to three thin boundaries:

Input translation — natural language question to structured query. "What do I think about monorepos?" → (fact-query :entity :monorepo :relation :opinion :source :memex). Formulaic, ~100 tokens, any model sufficient.
Prose to candidate triple — for personal memex entries that have no Wikidata counterpart: your opinions, your day's events, your project plans. Proposals are verified by Screamer before admission. This is the only extraction path that still requires an LLM, and its scope is limited to what Wikidata cannot provide — your subjective, personal, or novel content.
Result to prose — structured answer to readable sentence. "Your 2023 diary says 8848m. Wikidata (last edited Feb 2024) says 8849m. They disagree on height." The reasoning is done; the LLM wraps the plist in grammar. ~100 tokens, any model sufficient, purely cosmetic. Users who prefer no LLM at all can navigate through command-driven interaction (/query, /contradictions, /audit, /context why).

Everything else — the gate stack, the fact store, the constraint solver, the type hierarchy, the provenance tracking, the contradiction surfacing, the cross-domain comparison — is pure deterministic Lisp with zero LLM tokens.

The decisive simplification

Without Wikidata as the entity backbone, the archivist must discover entities from prose: extract a triple for every person, place, work, concept, and event mentioned in the memex. This is unbounded LLM work and the quality depends on extraction accuracy.

With Wikidata loaded, the entity graph is pre-structured. The archivist's job changes from "discover that Nabokov wrote Pale Fire and lectured on Kafka" to "verify that the Nabokov referenced in heading #47 is the same entity as Wikidata item Q36591." The second task is simpler, more reliable, and in many cases can be done without an LLM at all — a simple entity name match against the loaded Wikidata graph may suffice for unambiguous names.
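
A minimal sketch of the name-match path, assuming the loaded neighborhood is available as a list of label and ID pairs. The table *loaded-entities* and the function resolve-entity are hypothetical. Q36591 is the ID given in the text; the other IDs are placeholders, not real Wikidata items.

```lisp
;; Hypothetical slice of a loaded entity graph: (label . id).
(defparameter *loaded-entities*
  '(("Nabokov" . "Q36591")
    ("Kafka" . "Q-PLACEHOLDER-1")
    ("Paris" . "Q-PLACEHOLDER-2")
    ("Paris" . "Q-PLACEHOLDER-3")))

(defun resolve-entity (name)
  "Return (values ID :resolved), (values CANDIDATES :ambiguous),
or (values NIL :unknown). Matching is case-insensitive on the full label."
  (let ((ids (loop for (label . id) in *loaded-entities*
                   when (string-equal label name)
                     collect id)))
    (cond ((null ids) (values nil :unknown))
          ((null (rest ids)) (values (first ids) :resolved))
          (t (values ids :ambiguous)))))
```

Only the :ambiguous and :unknown outcomes would need to fall back to the LLM or to a clarification prompt; a :resolved name costs no tokens.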

The "Flip" — From Lossy Extraction to Deterministic Derivation

The symbolic index begins its life as a lossy construct. The initial extraction from the prose — the first population of facts from LLM proposals verified by Screamer — is built from an uncertain foundation. Some facts are correct. Some are missing. Some are wrong.

But the symbolic engine accumulates non-lossy facts through three independent mechanisms:

Gate outcomes — every gate rejection is a fact. No LLM involved. These accumulate at the rate of user interactions.
Screamer deductions — new facts derived from existing facts. No LLM involved. These accumulate whenever the fact store crosses a density threshold where structural patterns emerge.
Human authoring — the human explicitly declares facts. No LLM involved.

At some point, the non-lossy facts constitute a sufficient foundation that the symbolic engine can reverse the flow: instead of the LLM extracting facts from prose, the symbolic engine reads prose through its own lens — its now-substantial ontology of categories, rules, and constraints — and asserts facts in its own language. The extraction mechanism ceases to be probabilistic and becomes deterministic.

The sufficiency criterion

The architecture note (notes/passepartout-symbolic-engine-exploration.org) describes this "flip" as aspirational: "at some point, the non-lossy facts constitute a sufficient foundation." This design decision makes it operational:

(/ (count-provenance :gate-outcome :human-authored :deduced) total-facts)

When this ratio exceeds a configurable threshold (SUFFICIENCY_THRESHOLD, default 0.7), the system considers its foundation sufficient. The archivist switches from "LLM proposes, Screamer verifies" to "Screamer queries existing facts, applies to the new prose, and deduces new facts directly."

The flip is visible to the user through the TUI sidebar or /status command: "Symbolic index: 847 facts (73% non-lossy, 12% LLM-proposed, 15% Wikidata). Sufficient foundation: YES."
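
The ratio above can be made concrete with a short sketch, assuming facts are plists with a :provenance key. The names *sufficiency-threshold*, *non-lossy-sources*, non-lossy-ratio, and sufficient-foundation-p are hypothetical; only the three provenance sources and the 0.7 default come from the text.

```lisp
(defparameter *sufficiency-threshold* 0.7
  "Mirrors SUFFICIENCY_THRESHOLD from the text; the Lisp name is assumed.")

(defparameter *non-lossy-sources* '(:gate-outcome :human-authored :deduced))

(defun non-lossy-ratio (facts)
  "Fraction of FACTS whose :provenance is a non-lossy source (0 if empty)."
  (if (null facts)
      0
      (/ (count-if (lambda (fact)
                     (member (getf fact :provenance) *non-lossy-sources*))
                   facts)
         (length facts))))

(defun sufficient-foundation-p (facts)
  "True when the non-lossy ratio exceeds the configured threshold."
  (> (non-lossy-ratio facts) *sufficiency-threshold*))
```

Three non-lossy facts out of four give a ratio of 3/4, which exceeds 0.7; two out of four do not. The empty store is treated as insufficient rather than as a division by zero.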

The flip does not mean "complete"

In the broader memex, completeness is neither possible nor desirable. The flip means "deterministic enough to be trustworthy," not "comprehensive enough to be self-sufficient." The neural index remains the gateway to the full richness of prose. The symbolic index handles what can be mechanically verified. The boundary is permanent.

Ephemeral First, Persistent Later

The architecture note's Option 5 (ephemeral facts, no disk persistence) is the correct first implementation. Three reasons:

The fact language is unproven. Triples with provenance and grounding are a hypothesis. The format may be too simple for some domains, too complex for others. Committing to a serialization format before knowing what's useful is premature.
The ontology is emergent. Categories are created on first use. What proves useful stays; what doesn't fades. A persistent format would need a migration story every time the category structure changes. Ephemeral avoids this entirely — the facts are re-derived on each session start using the current (evolved) ontology.
Rebuildability is the safety net. Because all facts have a :grounding to an Org heading, and gate-outcome facts are regenerated from the gate stack on every load, the entire symbolic index can be thrown away and rebuilt from scratch. The cost is compute, not data. This is the practical realization of "the prose is always the ground truth."

The transition to persistence (Phase 5: VivaceGraph) happens when two conditions are met: the fact language has stabilized through use, and the accumulated deductions across sessions provide value that justifies the serialization cost.

Whitehead's Concrete Contributions — Four Operational Contributions

notes/passepartout-whitehead.org extracts four concrete, engineerable ideas from Whitehead and Russell's Principia Mathematica and Whitehead's Process and Reality. They are summarized here because each informs the neurosymbolic design.

Contribution 1: PM-Type-Level Gates

PM's ramified theory of types solved Russell's paradox by assigning every propositional function a type level, making self-application syntactically invalid. Passepartout applies the same principle to prevent a request from modifying the rules that validate it. Every cognitive tool and gate vector carries a :type-level integer. Before any gate predicate runs, the dispatcher checks: if the signal's type level equals or exceeds the gate's type level, the signal is rejected. A request to modify dispatcher rules (type-level 5) cannot pass a gate of type-level 4 or lower. This is a structural prohibition, not a heuristic — self-modification of the safety layer is impossible by construction.

Implementation: approximately 30 lines in the existing dispatcher. No new dependencies. Backward compatible. This is Phase 0 of the symbolic engine roadmap.
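
The check itself is small. The sketch below assumes requests and gates are plists with a :type-level integer and that a gate carries its predicate under :predicate; type-level-permits-p and run-gate are hypothetical names, not the dispatcher's. Under the rule as stated, a request whose level equals the gate's is rejected as well.

```lisp
(defun type-level-permits-p (request-level gate-level)
  "A request passes only if its type level is strictly below the gate's."
  (< request-level gate-level))

(defun run-gate (request gate)
  "Check the type level before the gate's own predicate is consulted."
  (cond ((not (type-level-permits-p (getf request :type-level)
                                    (getf gate :type-level)))
         :rejected-by-type-level)
        ((funcall (getf gate :predicate) request) :passed)
        (t :rejected-by-predicate)))
```

Because the level comparison runs first, a permissive predicate cannot let a rule-modifying request through: the predicate is never called for it.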

Contribution 2: Theory of Descriptions → Reference Resolution

PM's theory of descriptions addressed the problem of referring to nonexistent entities: "the current king of France is bald" is false, not meaningless, when there is no unique referent. Passepartout applies this to reference resolution: when the user says "the function that validates secrets," a cognitive tool checks uniqueness before resolving. Ambiguous references trigger a clarification prompt rather than a blind guess.

Implementation: approximately 40 lines as a cognitive tool. When the knowledge graph ships, descriptions become native Prolog queries with uniqueness constraints.

Contribution 3: Process and Reality → Architectural Vocabulary

Whitehead's process ontology maps with surprising precision to Passepartout's pipeline architecture. Prehension = a gate grasping a signal. Positive prehension = a gate passing. Negative prehension = a gate rejecting. Concrescence = the pipeline process from input to output. Satisfaction = the final agent response. This vocabulary is precise, standard, and already mapped to the architecture. It provides the language for the /why command, the gate trace, and the ARCHITECTURE documentation. It is descriptive, not operational — the design would be correct without it, but it would lack the vocabulary to describe why it is correct.

Contribution 4: VivaceGraph + PM Types → KG Type Hierarchy

When the knowledge graph ships, every entity inherits PM's type hierarchy. Entities carry :pm-type-level metadata. Queries cannot return entities of the same level as the querying function. Self-referential knowledge becomes structurally impossible — no "this entity defines its own type level." This is Contribution 1 applied to the knowledge layer rather than the execution layer. The dispatcher prevents self-referential actions; the KG prevents self-referential facts.

The Provenance Chain as Product

In the coding domain, the value of the symbolic engine is the verified fact: "this command is safe." In the broader memex, the value is the provenance itself: "this claim originated in that diary entry on that date, has been referenced 7 times across 4 different projects, was contradicted in a retrospective 6 months later, and was revised in a note 3 weeks after that."

The symbolic engine doesn't tell you what is true. It tells you what you wrote, when, where, and how it connects to everything else you wrote — with a verifiable audit trail. It is a memory prosthesis that makes your own mind legible to you.

Every fact carries:

:grounding — the specific Org heading from which it was extracted
:provenance — who or what produced it (gate-outcome, human-authored, deduced, LLM-proposed)
:timestamp — when it was admitted to the symbolic index
:referenced-by — other facts that depend on or reference this one
:contradicted-by — other facts that disagree with this one (if any)
:superseded-by — if this fact was replaced by a newer version

These fields make every fact auditable. The /audit <node-id> command renders the full provenance chain as an Org headline tree. The provenance is not a logging feature. It is the product.
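A fact record with these six fields, and an /audit-style rendering of it, might look like the following. This is a sketch in Python under stated assumptions: the `Fact` class, the `node_id` and `claim` fields, the in-memory `store` dict, and the exact output layout are all hypothetical; only the six keyword fields come from the list above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fact:
    node_id: str
    claim: str
    grounding: str       # Org heading the fact was extracted from
    provenance: str      # gate-outcome, human-authored, deduced, LLM-proposed
    timestamp: str       # when it was admitted to the symbolic index
    referenced_by: list = field(default_factory=list)
    contradicted_by: list = field(default_factory=list)
    superseded_by: Optional[str] = None


def audit(store, node_id, depth=1, seen=None):
    """Render a fact and everything linked from it as Org headline lines."""
    seen = set() if seen is None else seen
    fact = store[node_id]
    head = f"{'*' * depth} {fact.node_id}: {fact.claim}"
    if node_id in seen:  # guard against reference cycles
        return [head + " (see above)"]
    seen.add(node_id)
    lines = [
        head,
        f":grounding {fact.grounding}",
        f":provenance {fact.provenance}",
        f":timestamp {fact.timestamp}",
    ]
    links = [("referenced-by", i) for i in fact.referenced_by]
    links += [("contradicted-by", i) for i in fact.contradicted_by]
    if fact.superseded_by:
        links.append(("superseded-by", fact.superseded_by))
    for label, other in links:
        child = audit(store, other, depth + 1, seen)
        # Tag the child headline with the relation that led to it.
        child[0] = child[0].replace(" ", f" [{label}] ", 1)
        lines.extend(child)
    return lines


store = {
    "f1": Fact("f1", "monorepos scale well", "Diary/2024-01-10",
               "human-authored", "2024-01-10", contradicted_by=["f2"]),
    "f2": Fact("f2", "monorepos slowed CI", "Retro/2024-07-02",
               "human-authored", "2024-07-02", contradicted_by=["f1"]),
}
```

Calling `"\n".join(audit(store, "f1"))` yields a nested headline tree in which the contradiction appears as a child of the original claim, and the cycle back to `f1` is cut off rather than followed.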

The Competitive Argument

No competitor has this problem because no competitor has a symbolic engine. The 55 systems surveyed in notes/competitive-landscape.org range from pure chat agents (Claude, ChatGPT) to agent harnesses (Claude Code, OpenCode, Hermes) to platform agents (OpenClaw). None of them encode knowledge as formal facts with provenance. None of them verify extractions against an existing knowledge base. None of them can prove properties about their own rulesets.

Their safety is heuristic (prompt-based guardrails that consume LLM tokens and can be evaded with clever phrasing). Their memory is flat (JSONL transcripts without content-addressed identity or provenance chains). Their reasoning is entirely neural — when you ask "why did you decide that?", the answer is a regenerated LLM explanation, not a retrieved inference chain.

Passepartout's architectural bet is that this problem is worth solving — that a system which can surface contradictions with provenance, derive new facts from observations, and verify claims against a provenanced knowledge graph is fundamentally different from a system that can only call an LLM and hope the response is correct.

The cost is ontological work, which is genuinely difficult. The reward is a system that cannot hallucinate at the reasoning level, whose memory is provable rather than empirical, and whose knowledge accumulates across sessions through deduction rather than through LLM re-prompting. For a life's knowledge stored in a personal memex, this is not a performance advantage. It is a category difference.

Open Questions

Several design questions are unresolved and should remain unresolved at this stage. They represent research decisions that require experience running the system.

What is the minimum viable fact language?

Triples of the form (:entity :relation :value), each carrying provenance and grounding, are the current hypothesis. The format is simple enough to be parseable, expressive enough to capture the gate stack's implicit claims, and extensible enough that Screamer can operate on it. But it may be too simple. Triples do not naturally express temporal relations ("was X before Y?"), modal claims ("should not do X unless Y"), or counterfactuals, all of which may be essential for a symbolically aided memex. The right granularity depends on what queries actually need to be made, and that cannot be known in advance.
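To show what the triple hypothesis does and does not cover, here is a minimal pattern-matching sketch in Python rather than the project's Lisp. The `Triple` type, the `match` function, and the sample facts are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    entity: str
    relation: str
    value: str
    provenance: str
    grounding: str


def match(triples, entity=None, relation=None, value=None):
    """Return triples matching every non-None position (None is a wildcard)."""
    return [
        t for t in triples
        if (entity is None or t.entity == entity)
        and (relation is None or t.relation == relation)
        and (value is None or t.value == value)
    ]


facts = [
    Triple("rm", "classified-as", "destructive", "gate-outcome", "Gates/Shell"),
    Triple("ls", "classified-as", "read-only", "gate-outcome", "Gates/Shell"),
    Triple("rm", "requires", "approval", "human-authored", "Policy/HITL"),
]
```

A query such as `match(facts, entity="rm")` answers "what is known about rm", but nothing in the three positions can say that one fact held before another or only under a condition. That gap is the concern about temporal and modal claims.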

How does ontology refactoring work?

If the seed produces 50 categories from gate extraction and later experience shows they are wrong — wrong granularity, missing cross-cutting concerns, conflated categories — how are they migrated without invalidating all existing deductions that cross the old category boundaries? The ephemeral-first approach (no persistence, rebuild from scratch) is a temporary answer. Once persistence is committed (VivaceGraph), refactoring the category hierarchy is a schema migration problem that deduction provenance makes harder — every deduced fact's chain may cross the old category boundary. This is not addressed in the current architecture.
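The size of the migration problem can at least be measured. The sketch below, in Python, computes which facts would need re-derivation if one category were retired. The `affected_by` function and the dict-based representation of premises and categories are assumptions for illustration; the document does not specify how deduction chains are stored.

```python
def affected_by(premises, category_of, old_category):
    """Return ids of all facts touched by retiring `old_category`.

    premises:    fact id -> list of fact ids it was deduced from
    category_of: fact id -> category name
    A fact is affected if it belongs to the old category or if any
    fact in its deduction chain does.
    """
    tainted = {f for f, c in category_of.items() if c == old_category}
    changed = True
    while changed:  # propagate until no new fact is added
        changed = False
        for fact, deps in premises.items():
            if fact not in tainted and any(d in tainted for d in deps):
                tainted.add(fact)
                changed = True
    return tainted


category_of = {"a": "shell", "b": "fs", "c": "fs", "d": "net"}
premises = {"b": [], "c": ["a"], "d": ["c"]}
```

Here retiring the "shell" category affects `c` and `d` even though neither belongs to it, because their chains pass through `a`.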

What is the appropriate role of the human?

The human can explicitly declare facts, write constraints, and correct wrong extractions. But how much of the ontology should the human need to maintain? If the human must write a definition for every new category the symbolic engine encounters, the overhead is prohibitive. If the symbolic engine can generalize from instances, the human role becomes supervision rather than authorship — review and approve proposed generalizations. The balance cannot be set without experience.

How much Wikidata is the right amount?

Loading Wikidata entities referenced in the memex is the minimum. Loading all Wikidata entities within N hops of those references expands the graph exponentially. The right N depends on the memex's breadth — a memex focused on software engineering needs fewer hops than a memex spanning literature, history, philosophy, and science. The query performance and memory costs of a large Wikidata load are unknown.
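The effect of N can be explored on a small graph before committing to a load. This Python sketch uses a breadth-first search over an adjacency dict; the `within_hops` function and the toy entity names are hypothetical, and a real load would go through Wikidata identifiers and its own access layer, which are not modeled here.

```python
from collections import deque


def within_hops(graph, seeds, n):
    """Return every entity reachable from `seeds` in at most n hops."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == n:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return seen


# Toy adjacency list standing in for entity links.
graph = {
    "Lisp": ["John McCarthy", "programming language"],
    "John McCarthy": ["Stanford"],
    "programming language": ["formal language"],
    "Stanford": ["California"],
}
```

Counting `len(within_hops(graph, seeds, n))` for n = 0, 1, 2, ... over the memex's actual references gives a direct measurement of how fast the load grows.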

Can the symbolic engine satisfy queries from the user without LLM involvement?

The design aims for zero-LLM query answering: the user issues a structured command (/query, /contradictions, /audit), and the symbolic engine responds directly. But natural language questions ("what do I think about monorepos?") still require the LLM as a thin translation layer. Whether the structured command interface is sufficient for daily use, or whether users will demand natural language interaction, determines how much LLM involvement remains in the mature system.
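The split between structured commands and natural language can be sketched as a dispatcher that only hands free text to the LLM. This is a Python illustration; the `dispatch` function, the handler table, and the return tags are hypothetical, and only the command names /query, /contradictions, and /audit come from the text.

```python
def dispatch(line, handlers):
    """Route a structured command to a symbolic handler.

    Returns a (route, payload) pair. Free text is not answered here;
    it is flagged so the caller can pass it to the LLM for translation.
    """
    if not line.startswith("/"):
        return ("needs-llm", line)
    name, _, arg = line[1:].partition(" ")
    handler = handlers.get(name)
    if handler is None:
        return ("unknown-command", name)
    return ("symbolic", handler(arg.strip()))


# Placeholder handlers; real ones would query the fact store.
handlers = {
    "query": lambda arg: f"facts matching {arg}",
    "contradictions": lambda arg: "no contradictions recorded",
    "audit": lambda arg: f"provenance chain for {arg}",
}
```

Logging the proportion of "needs-llm" results over a period of daily use would give an empirical answer to the question posed here.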

Is the triplestore physically bounded or does it explode?

A personal memex with years of diary entries, project notes, reading logs, and literary analyses could produce millions of triples. A naive hash table scales linearly, but VivaceGraph's Prolog-like queries may not. The performance characteristics of graph queries over a million-triple knowledge base have not been estimated.

Relation to Passepartout's Existing Architecture

The neurosymbolic engine is an extension of the existing probabilistic-deterministic split, not a replacement for it. The current architecture divides cognition into LLM-driven proposals and Lisp-driven verification. The symbolic engine deepens the verification side from "is this action safe?" to "is this claim supported?" — the same architectural pattern applied to a broader domain.

The self-repair criterion (a file belongs in core only if, when corrupted, the agent cannot fix it without human help) applies to every component of the symbolic engine. Screamer, VivaceGraph, the fact store, the archivist — all are skills, loaded at runtime, hot-reloadable, and recoverable from corruption. A corrupted symbolic engine degrades reasoning capability but does not kill the agent. The eight existing core ASDF files are unchanged.

The symbolic engine is not v3.0.0 alone. It is the layer that sits between the existing gate stack (which it makes explicit as facts) and the existing skill system (which it extends with deduction, contradiction detection, and provenance tracking). It grows within the current architecture without replacing any existing component.
