memex: update AGENTS.md, add passepartout design-decisions notes, SWOT + agora notes, bump submodules → v0.8.1
@@ -12,7 +12,7 @@
 a. Write the test first → tangle → run → prove it FAILS (RED)
 b. Write the implementation → tangle → run → prove it PASSES (GREEN)
 c. Record both failure and success output
-5. **Reflect in org** — once tests pass, ensure the implementation is in the .org source
+5. **Reflect in org** — once tests pass, ensure the implementation is in the .org source, put each function in a separate code block.
 6. **Update literate prose** — write/update the explanatory text around the code:
 what it does, why it exists, how it connects to the rest of the system
 7. **Mark the origin TODO DONE** — in `docs/ROADMAP.org`, change the
868
notes/passepartout-SWOT.org
Normal file
@@ -0,0 +1,868 @@
#+TITLE: Passepartout Neurosymbolic + Agora Integration — SWOT Analysis
#+AUTHOR: Agent
#+FILETAGS: :notes:analysis:swot:passepartout:agora:neurosymbolic:
#+CREATED: [2026-05-09 Sat]

* Premise and Scope

This analysis assumes the engineering is possible — Screamer can be wrapped,
VivaceGraph can persist facts, ACL2 can verify structural properties, the
archivist can extract triples from prose with Screamer verification, and the
note-publishing bridge to Agora can be implemented. The question is not "can it
be built?" but "does the architecture cohere? What does it enable? What does it
miss?"

* Will It Work Conceptually?

The short answer: yes, within a specific domain. The long answer: the boundary of
that domain is the most important thing to get right.

** The architecture's core insight is correct and load-bearing

The central design decision — "the LLM proposes; the symbolic engine decides
whether to accept" — is sound. It is the inverse of every existing agent
architecture. Claude Code, OpenCode, Hermes — all of them put the LLM in the
driver's seat and add safety as an afterthought (prompt-based guardrails that
consume tokens and can be evaded). Passepartout inverts this: the LLM proposes
actions and facts, but a deterministic layer of gates, constraint solvers, and
formal verifiers decides what to admit and what to execute. This inversion is the
correct response to the hallucination problem. You cannot eliminate hallucination
by making the LLM better. You eliminate it by not asking the LLM to do things
that require certainty.
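
As a minimal sketch of the propose/decide split described above, the following
Python models the LLM's output as an untrusted proposal that is admitted only if
every deterministic gate lets it through. It is illustrative only: Passepartout
is written in Lisp, and the names =Proposal=, =deny_destructive=,
=deny_secret_read=, =admit=, and =GATES= are hypothetical, not taken from the
Dispatcher.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Proposal:
    """Something the LLM wants to do or assert. Treated as untrusted input."""
    kind: str      # "action" or "fact"
    payload: str


# A gate returns a rejection reason, or None to let the proposal continue.
Gate = Callable[[Proposal], Optional[str]]


def deny_destructive(p: Proposal) -> Optional[str]:
    # Hypothetical rule: block "rm -rf" when one of its targets is exactly "/".
    tokens = p.payload.split()
    if p.kind == "action" and tokens[:2] == ["rm", "-rf"] and "/" in tokens[2:]:
        return "destructive command"
    return None


def deny_secret_read(p: Proposal) -> Optional[str]:
    # Hypothetical rule: block any action that names a .env file.
    if p.kind == "action" and ".env" in p.payload:
        return "touches a secret file"
    return None


def admit(p: Proposal, gates: List[Gate]) -> Tuple[bool, str]:
    """Run every gate in order; the first rejection wins. No LLM call is made."""
    for gate in gates:
        reason = gate(p)
        if reason is not None:
            return (False, reason)
    return (True, "admitted")


GATES = [deny_destructive, deny_secret_read]
```

With these two gates, =admit(Proposal("action", "rm -rf /"), GATES)= is rejected
with the reason "destructive command", while a proposal such as =ls -la= is
admitted. The decision path contains no probabilistic step, which is the point
of the inversion.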

The bootstrap mechanism — extracting 50-70 entity classes mechanically from the
existing Dispatcher gate stack with zero new code — is genuinely elegant. It
proves the pattern at minimal cost: code becomes facts, facts enable reasoning.
Every new gate pattern adds to the ontology organically. This is the right way to
start a knowledge base: not by designing a schema upfront, but by formalizing what
the system already knows implicitly.

** The "one memex, two indices" architecture survives contact with reality

Option 4 (one memex with neural and symbolic indices over the same Org files) is
the correct long-term architecture. The prose is the ground truth — always. The
symbolic index is a derived view that can be thrown away and rebuilt. The neural
index handles semantic search, associative leaps, and fuzzy matching. This
division of labor is permanent, not transitional, because the domains they serve
are fundamentally different kinds of knowledge.

The practical path — starting with Option 5 (ephemeral facts, no persistence)
through Phases 1-4, then graduating to Option 4 with VivaceGraph persistence in
Phase 5 — is the right sequence. It punts the serialization format problem until
the fact language has been battle-tested. It keeps the cost of mistakes low. It
treats the ontology as something discovered through use rather than designed
upfront.

** Wikipedia's ontology WOULD give it a running start — with caveats

Wikidata contains approximately 100 million entities with a decade of human
curation: type hierarchies, relations, dates, citations, disambiguation. For a
personal memex that mentions Nabokov, /Pale Fire/, Kafka, postmodernism, and
butterfly migration, the gate stack's 50-70 entity classes amount to starvation.
Organic growth through prose extraction would take years to cover the entities in
one person's engagement with a single novel.

Loading Wikidata's entity graph into the symbolic index transforms the
archivist's job from "discover that Nabokov wrote /Pale Fire/" to "connect your
heading to Wikidata entity Q36591." The second task is reference resolution, not
knowledge extraction — simpler, more reliable, and in many cases doable without
an LLM at all (string match against loaded entities). The notes claim this
collapses the LLM's role to three thin boundaries: input translation,
prose-to-candidate-triple for personal content, and result-to-prose formatting.

The caveats are real:

- Entity resolution (matching prose mentions to Wikidata entities) is genuinely
  hard. "Nabokov" in a diary might refer to Vladimir Nabokov (Q36591), his son
  Dmitri (Q566744), or someone else entirely. Disambiguation requires context
  that the symbolic engine doesn't have without LLM assistance.
- Wikidata is biased toward English Wikipedia's coverage. A memex in Arabic,
  Farsi, or Amharic will find far fewer resolved entities. The "universal" in
  Wikidata is aspirational, not actual.
- Wikidata's property graph is not an ontology in the formal sense — it's a
  collaboratively edited dataset with contradictions, gaps, and editorial wars
  frozen in time. Loading it directly into a symbolic index that expects
  consistency (Screamer checks, cardinality policies) will surface thousands of
  contradictions on ingest, many of which are Wikidata artifacts, not meaningful
  tensions.
- N-hop expansion is unbounded. One hop from Nabokov hits hundreds of entities
  (his works, his family, his influences, his translators). Two hops hit
  thousands. Three hops hit tens of thousands. The notes say "3-4 hops" for a
  literary memex but don't estimate the entity count this implies. The claim that
  5 million entities = ~400MB is the best-case hash-table figure; a graph with
  query indices will be larger, and Prolog-like queries over millions of nodes
  are not free.
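
To make the hop-limit caveat concrete, here is a small sketch of a breadth-first
expansion that stops at a hop limit or an entity cap, whichever comes first. The
graph is toy data invented for illustration, and =expand=, =max_hops=, and
=max_entities= are hypothetical names rather than anything in the notes or in
Wikidata's API.

```python
from collections import deque
from typing import Dict, List, Set

# Toy neighbourhood, invented for illustration; a real load would be far larger.
TOY_GRAPH: Dict[str, List[str]] = {
    "Nabokov": ["Pale Fire", "Lolita", "Dmitri Nabokov"],
    "Pale Fire": ["John Shade", "Charles Kinbote"],
    "Lolita": ["Humbert Humbert"],
    "Charles Kinbote": ["Zembla"],
}


def expand(graph: Dict[str, List[str]], seed: str,
           max_hops: int, max_entities: int) -> Set[str]:
    """Breadth-first expansion bounded by hop count and total entity count."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour in seen:
                continue
            if len(seen) >= max_entities:
                return seen  # hard cap reached; stop loading
            seen.add(neighbour)
            frontier.append((neighbour, depth + 1))
    return seen
```

On the toy graph the loaded set grows from 4 entities at one hop to 7 at two and
8 at three. On a real graph the growth per hop is much steeper, which is why an
explicit entity cap is worth having in addition to the hop limit.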

Still: even a partial Wikidata load with conservative hop limits would provide
more ontology than the system could accumulate through years of organic growth.
It is the right accelerator, and the architecture handles it correctly — Wikidata
facts are admitted with =:provenance :wikidata= and =:policy :plural=, meaning
they sit alongside personal facts without overriding them. Disagreements are
surfaced, not resolved. The architecture treats Wikidata as evidence from an
external source, not as ground truth. That's the correct posture.

** Cardinality policies are the right abstraction for contradiction

The =:singular= / =:dual= / =:plural= cardinality model is one of the most
important ideas in these notes. Classical logic requires consistency — a
contradiction implies everything (ex contradictione quodlibet). A constraint
solver like Screamer also requires consistency — a contradictory constraint set
has no solutions. But a personal memex operates across domains where the meaning
of contradiction is fundamentally different:

- "rm -rf / is catastrophic" is =:singular= — there is one truth that evolves
  over time.
- "I loved this person AND I resented them" is =:dual= — the tension IS the
  fact.
- "Wikidata says Everest is 8848m; DBpedia says 8849m; my 2023 diary says
  8848m" is =:plural= — multiple sources disagree, and surfacing the disagreement
  with provenance is the product.

This is a genuinely novel contribution to knowledge representation. Most
knowledge graphs (Wikidata, Freebase, DBpedia) don't model contradiction as
such: conflicting values are either reduced to one or stored side by side
(Wikidata's ranked statements, for example) with no account of the tension
between them. Most constraint solvers reject
contradiction as error. Passepartout's cardinality model makes contradiction a
first-class citizen: you can query the fact that "I used to believe X until
Tuesday, then Y," or "these three sources disagree on height," or "I hold these
two positions in tension." The symbolic engine's job is not to decide which is
right. It is to surface the tension with provenance.

This alone, if implemented correctly, would be a category-level advance over
every existing personal knowledge management tool.
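
The three policies can be sketched as different update rules on the same key.
This is one possible reading, not the notes' implementation: =Fact=,
=FactStore=, =assert_fact=, and =tensions= are hypothetical names, and treating
=:dual= and =:plural= with the same keep-everything rule is an assumption made
for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    entity: str
    relation: str
    value: str
    provenance: str
    policy: str  # "singular", "dual", or "plural"


class FactStore:
    def __init__(self) -> None:
        self.current: Dict[Tuple[str, str], List[Fact]] = {}
        self.superseded: List[Fact] = []

    def assert_fact(self, fact: Fact) -> None:
        held = self.current.setdefault((fact.entity, fact.relation), [])
        if fact.policy == "singular":
            # One truth that evolves: archive the old value, keep only the new.
            self.superseded.extend(held)
            held[:] = [fact]
        elif fact not in held:
            # "dual" and "plural": every value is kept side by side.
            held.append(fact)

    def tensions(self) -> Dict[Tuple[str, str], List[Fact]]:
        """Keys that currently hold more than one distinct value."""
        return {
            key: facts for key, facts in self.current.items()
            if len({f.value for f in facts}) > 1
        }


store = FactStore()
store.assert_fact(Fact("Everest", "height-m", "8848", "wikidata", "plural"))
store.assert_fact(Fact("Everest", "height-m", "8849", "dbpedia", "plural"))
store.assert_fact(Fact("Everest", "height-m", "8848", "diary-2023", "plural"))
store.assert_fact(Fact("rm -rf /", "risk", "high", "gate", "singular"))
store.assert_fact(Fact("rm -rf /", "risk", "catastrophic", "gate", "singular"))
```

After these assertions the Everest key holds all three sourced values and shows
up in =store.tensions()=, while the =:singular= key holds only its latest value
and the earlier one sits in =store.superseded=. Nothing in the store decides
which height is right.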

** Ontology versioning is the right approach to the migration problem

Every knowledge base eventually faces schema migration — you split =:secret-file=
into =:crypto-secret= and =:plaintext-secret=, and now every deduction that
crossed the old category boundary is suspect. The standard approach is batch
UPDATE operations that overwrite the past. Passepartout's approach — the category
hierarchy itself is a Merkle tree, every fact stores the =:ontology-version= at
assertion time, category changes trigger re-verification rather than remapping —
preserves all worldviews. You can query "what did I believe about secrets before
I refined my security model?" This is not querying a fact. It is querying the
history of your own thinking.

This is the kind of capability that no existing tool provides, and it flows
directly from the architecture. If the Merkle DAG infrastructure exists (it does,
from v0.2.0), ontology versioning is ~40 lines on top of it. The conceptual
design is sound. The engineering appears tractable.
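
A minimal sketch of the versioning idea, assuming the category hierarchy is a
plain parent-to-children mapping hashed with SHA-256. The helper names
(=merkle_root=, =stale_facts=), the dictionary-based fact records, and the
example categories other than the =:secret-file= split are hypothetical; the
notes' own Merkle DAG from v0.2.0 is not shown here.

```python
import hashlib
from typing import Dict, List

Tree = Dict[str, List[str]]  # category -> child categories


def merkle_root(tree: Tree, node: str = "root") -> str:
    """Hash a category together with the sorted hashes of its children."""
    child_hashes = sorted(merkle_root(tree, child) for child in tree.get(node, []))
    data = node + "|" + ",".join(child_hashes)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def stale_facts(facts: List[dict], current_version: str) -> List[dict]:
    """Facts asserted under another ontology version need re-verification."""
    return [f for f in facts if f["ontology_version"] != current_version]


# Ontology before and after splitting secret-file into two categories.
before: Tree = {"root": ["secret-file", "core-file"]}
after: Tree = {"root": ["crypto-secret", "plaintext-secret", "core-file"]}

v1 = merkle_root(before)
old_fact = {"triple": ("id_rsa", "is-a", "secret-file"), "ontology_version": v1}

v2 = merkle_root(after)
new_fact = {"triple": ("id_rsa", "is-a", "crypto-secret"), "ontology_version": v2}

facts = [old_fact, new_fact]
to_recheck = stale_facts(facts, v2)
```

The split changes the root hash, so =to_recheck= contains only =old_fact=. The
old fact is flagged for re-verification, not rewritten, which is what lets a
query pinned to =v1= still see the earlier worldview.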

* SWOT Analysis

** Strengths

*** Architectural inversion — proposer vs decider

The LLM proposes. The symbolic engine decides. This is the inverse of every
existing agent architecture, and it solves the hallucination problem at the
architectural level rather than the prompt-engineering level. No amount of
prompt refinement can make a probabilistic system deterministic. But a
deterministic admission gate can make a probabilistic proposer safe.

*** Unified container format (Org files)

Org files serve as the container for human prose, Lisp source code, symbolic
facts, and Agora Notes. One format, one toolchain, one Merkle tree, one version
control system. If Passepartout stops existing, the data survives in plain text.
This is the hardest commitment in the design and the most undervalued. Most agent
architectures store memory in JSONL transcripts, vector databases, or proprietary
formats — opaque to the human and dependent on the tool. Passepartout's memory
IS the human's memory, in the human's format.

*** Provenance as product

Every fact carries =:grounding= (the specific Org heading), =:provenance= (who
or what produced it), =:timestamp=, =:referenced-by=, =:contradicted-by=,
=:superseded-by=. The =/audit= command renders the full provenance chain. In the
broader memex, the value is not the verified fact ("this command is safe"). It
is the provenance itself: "this claim originated in that diary entry, has been
referenced 7 times across 4 projects, was contradicted 6 months later, and was
revised 3 weeks after that." This is a memory prosthesis that makes your own mind
legible to you.

*** Gate-to-fact bootstrap — ontology from existing code

The existing Dispatcher gate stack encodes an implicit ontology (categories of
secrets, destructive commands, trusted domains, core files). The bootstrap
extracts this mechanically — zero LLM tokens, zero human authoring, ~30 lines of
Lisp. This proves the pattern and provides the seed ontology without any new
infrastructure. Every new gate pattern added by the human (HITL approvals that
become rules) extends the ontology automatically.

*** Self-preservation architecture

The Third Law implementation — quarantine on skill failure, degraded-mode
signaling, resource monitoring, external watchdog, refusal to self-terminate —
is individually small (~20-50 lines each) and collectively transforms
self-preservation from a passive architectural property into an active behavior.
The key insight: the biggest gap is not that these mechanisms are hard. It is
that degradation is currently silent. Making it visible is cheap and high-impact.

*** Cardinality policies as a solution to contradiction

The =:singular= / =:dual= / =:plural= model is novel in knowledge representation
and directly addresses the hardest problem in a personal memex: that
contradiction is the product, not the error. Bayesian knowledge bases, graph
databases, and triple stores all struggle with contradiction. Passepartout's
model makes it a feature.

*** Organic ontology growth

Categories emerge from the system's own operation: gate patterns → gate outcomes
→ Screamer generalizations → archivist proposals → cross-domain overlap
detection. The ontology is a garden, not a building. This avoids the Principia
Mathematica problem — the need to define everything upfront — by replacing
axiomatic design with evolutionary growth. Categories that aren't used fade.
Categories that are contradictory are pruned. Categories that emerge from
overlapping domains are promoted. The system converges on useful granularity
through use.

*** Agora as provenance layer for networked knowledge

A BFT-timestamped triple store is one approach, but the Merkle DAG + DID
signatures provide a lighter-weight alternative: every fact's provenance is
content-addressed, every author's identity is cryptographically verifiable, and
the DAG structure enables partial replication without consensus. This is more
tractable than full BFT and sufficient for a personal memex that needs to share
facts across a network.

*** Decoupling of compute cost from knowledge base size

LLM tokens are minimized by design — deterministic gates cost 0 tokens,
sparse-tree rendering keeps context at 2,000-4,000 tokens, Screamer deductions
cost 0 tokens. Adding 5 million Wikidata entities does not add a single token to
any LLM call. The variables that actually degrade performance — context window
size, LLM call frequency, Screamer deduction budget — are all bounded
independently of knowledge base size. This is a structural property: the
education is local, only the brain costs.

** Weaknesses

*** The fact language is unproven and may be insufficient

The triple format — =(:entity :relation :value)= with provenance and grounding —
is the current hypothesis. It is simple enough to be parseable, expressive enough
to capture the gate stack's implicit claims, and extensible enough that Screamer
can operate on it. But:

- Triples cannot naturally express temporal relations. "Was X before Y?" requires
  reification (making the relation itself an entity), which makes queries
  substantially more complex.
- Triples cannot express modal claims. "Should not do X unless Y" has no natural
  triple representation. Neither does "could have done X but chose Y."
- Triples cannot express counterfactuals. "If X had happened, Y would have
  followed." These are essential for the "what if" reasoning that a personal
  memex should support.
- Triples struggle with n-ary relations. "Nabokov wrote Pale Fire in 1962 while
  living in Montreux" is a 4-ary relation (author, work, date, location), not a
  set of independent binary relations. Breaking it into triples loses the
  connection that binds them.
- Triples cannot express negation cleanly. "Nabokov did NOT write Doctor Zhivago"
  requires an explicit negative fact: under an open-world assumption the absence
  of a triple only means "not known", so "known not" needs a separate encoding
  that the plain triple form does not provide, and the two are easily conflated.

The notes acknowledge this limitation but defer it. The right granularity
"depends on what queries the planner actually needs to make, and that cannot be
known in advance." This is honest but unsatisfying. If triples prove insufficient,
the entire fact store, the Screamer integration, the VivaceGraph persistence, and
the archivist's extraction format must be redesigned. The architecture has no
intermediate fallback between "triples" and "something more expressive."
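
One common workaround for the n-ary case, not described in the notes, is
reification: the event itself becomes an entity and each role becomes a triple
hanging off it. The sketch below shows the shape of that encoding and why
queries get longer; =reify= and =events_matching= are hypothetical helpers, and
the event id =writing-1= is invented.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def reify(event_id: str, roles: Dict[str, str]) -> List[Triple]:
    """Encode one n-ary statement as one triple per role, all sharing event_id."""
    return [(event_id, role, value) for role, value in roles.items()]


def events_matching(triples: List[Triple], **roles: str) -> Set[str]:
    """Return ids of events that carry every requested role/value pair."""
    store = set(triples)
    event_ids = {subject for subject, _, _ in store}
    return {
        e for e in event_ids
        if all((e, role, value) in store for role, value in roles.items())
    }


# The 4-ary example from the list above, kept together by a shared event id.
writing = reify("writing-1", {
    "author": "Nabokov",
    "work": "Pale Fire",
    "date": "1962",
    "location": "Montreux",
})
```

Here =events_matching(writing, author="Nabokov", location="Montreux")= finds
=writing-1=, so the binding between author, work, date, and location survives.
The price is that every additional role in a query is another lookup against
the store, which is the added query complexity the first bullet refers to.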

*** Screamer as admission gate is untested at this scale

Screamer is a constraint solver with non-deterministic backtracking. Using it
to check a candidate triple against an existing fact store is conceptually
elegant: express the fact store as constraint variables, assert the candidate,
check solvability. But:

- Screamer was designed for constraint satisfaction problems with tens to
  hundreds of variables. A fact store with millions of triples (after Wikidata
  loading) is a constraint space orders of magnitude larger than Screamer's
  design envelope.
- The consistency check is domain-scoped (only rules from the candidate's
  =:domain= apply), but cross-domain contradictions are the most valuable kind.
  "Nabokov was born in 1899" (literature domain) should be consistent with
  "Nabokov died in 1977" (history domain). If these are separate domains, the
  check misses contradictions; if they are unified, the constraint space
  explodes.
- Screamer's non-deterministic backtracking is worst-case exponential. The notes
  bound this via deduction budget (=SCREAMER_DEDUCTION_BUDGET_MS=) but don't
  address the admission check itself, which runs on every assertion.

There is a risk that Screamer works beautifully for the gate-bootstrapped seed
(50-70 entity classes, ~200 facts) and becomes unusably slow after Wikidata
loading (millions of facts). The transition from "works" to "doesn't" may be
gradual and hard to detect — the system gets slower but doesn't crash,
degrading user experience without a clear diagnostic.

*** The "flip" from lossy to deterministic is underspecified

The architecture's central narrative arc is the "flip": at some point, the
non-lossy facts constitute a sufficient foundation that the symbolic engine can
reverse the flow — instead of LLM extraction, the symbolic engine reads prose
through its own lens and deduces facts directly. The sufficiency metric
(non-lossy / total > 0.7) makes this "computable and visible to the user."

But:

- The threshold (0.7) is arbitrary. It is not derived from empirical measurement,
  information theory, or constraint satisfaction theory. It is a guess.
- Sufficiency is domain-specific, not global. The gate stack may have 0.95
  coverage of security classifications but 0.05 coverage of literary analysis.
  A global threshold of 0.7 hides the domains where the symbolic engine is still
  effectively blind.
- The "flip" operation itself is not defined. "Screamer reads prose through its
  own lens" — Screamer does not read prose. It operates on structured facts.
  Either the archivist still extracts triples (which is LLM work), or some new
  mechanism parses prose into triples deterministically (which is NLP at a level
  that does not exist in open-source Lisp).
- Even after the flip, facts from the pre-flip period carry
  =:provenance :llm-proposed= and are therefore suspect. The pre-flip facts were
  admitted against fewer non-lossy facts, meaning Screamer's consistency checks
  were weaker. A fact admitted during the seed phase may be wrong but undetected
  because there were no contradicting facts at the time. Re-verifying all
  pre-flip facts against the current fact store is described as a heartbeat task
  but the cost (millions of Screamer checks) is not estimated.

The flip is a beautiful narrative. It may also be a mirage — the system may
achieve high sufficiency in narrow domains (security, filesystem, coding) and
never approach it in the broader memex (literature, personal reflection, daily
life). If the broader memex is the use case, the flip may never happen.
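
The gap between a global and a per-domain sufficiency ratio is easy to show with
made-up counts. The numbers below are invented to mirror the 0.95 and 0.05
coverage figures mentioned above; =sufficiency= and =COUNTS= are hypothetical,
not the notes' metric implementation.

```python
from typing import Dict, Tuple

# domain -> (non_lossy_facts, total_facts); invented counts for illustration
COUNTS: Dict[str, Tuple[int, int]] = {
    "security": (950, 1000),
    "literature": (5, 100),
}

THRESHOLD = 0.7  # the global threshold quoted in the text


def sufficiency(counts: Dict[str, Tuple[int, int]]) -> Tuple[float, Dict[str, float]]:
    """Return the global non-lossy ratio and the ratio for each domain."""
    per_domain = {d: nl / total for d, (nl, total) in counts.items()}
    overall = sum(nl for nl, _ in counts.values()) / sum(t for _, t in counts.values())
    return overall, per_domain


overall, per_domain = sufficiency(COUNTS)
below = sorted(d for d, ratio in per_domain.items() if ratio <= THRESHOLD)
```

With these counts the global ratio is about 0.87, comfortably past the
threshold, while =below= still lists the literature domain. Reporting the
per-domain ratios alongside the global one would expose exactly the blind spots
the global figure hides.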

*** The archivist's extraction cost is unaccounted for

The archivist calls the LLM to extract triples from prose, with "a minimal prompt
(~200 tokens)." Over a personal memex with thousands of entries — a decade of
diary entries, hundreds of literature notes, dozens of project logs — the
extraction cost is substantial.

Assume 5,000 headings, 200 tokens per heading prompt, and an LLM that returns
~100 tokens of structured triples per heading. That's 1.5 million tokens for the
initial extraction, plus verification tokens (Screamer checks cost 0 LLM tokens,
but incorrect proposals generate feedback that may trigger re-extraction). At
current API prices (~$0.15 per million input tokens for GPT-4o-mini), the cost
is modest (~$0.23 if all 1.5 million tokens are billed at that input rate, and
somewhat more once output-token pricing is applied). But at scale —
re-extraction after ontology changes, continuous extraction as new content is
added, extraction for all incoming Agora Notes — the cost accumulates.

More importantly, the extraction latency is human-noticeable. 5,000 headings at
1 second per LLM call is ~1.4 hours of extraction time. The system needs to
either batch-extract on startup (making cold starts slow) or extract lazily on
first query (making first queries slow). Neither is ideal.
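
The arithmetic in the paragraphs above can be reproduced directly. The flat
per-token price is a simplifying assumption carried over from the text;
providers generally bill output tokens at a higher rate than input tokens, so
the dollar figure is a lower bound. The variable names are illustrative.

```python
HEADINGS = 5_000
PROMPT_TOKENS = 200         # per heading, as assumed in the text
RESPONSE_TOKENS = 100       # per heading, as assumed in the text
PRICE_PER_MILLION = 0.15    # USD, applied to every token (simplification)
SECONDS_PER_CALL = 1.0      # one sequential LLM call per heading

total_tokens = HEADINGS * (PROMPT_TOKENS + RESPONSE_TOKENS)
cost_usd = total_tokens / 1_000_000 * PRICE_PER_MILLION
hours = HEADINGS * SECONDS_PER_CALL / 3600

print(f"{total_tokens} tokens, ${cost_usd:.3f}, {hours:.2f} h")
```

This gives 1,500,000 tokens, about $0.23 at the flat rate, and about 1.39 hours
of sequential calls. Changing =HEADINGS= or =SECONDS_PER_CALL= shows how quickly
the latency term, not the dollar term, becomes the dominant cost.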

The notes trumpet the token savings from deterministic gates and Screamer
deductions (valid — those cost 0 tokens) but the archivist's extraction cost is
the system's single largest recurring LLM expense, and it is mentioned only in
passing.

*** The Agora integration is clean in theory, undefined in practice

The "Passepartout IS the PDS" claim is elegant: the =memory-object= struct IS
the Note format, the Merkle DAG IS the Key Event Log, the fact store IS the
reputation system. But:

- An Agora PDS needs to serve HTTP APIs for thin clients. The daemon speaks a
  framed TCP protocol over a local port. Extending it to serve HTTPS with
  DIDComm endpoints, subscription management, and Relay push/pull is a
  substantial engineering effort.
- The PDS needs to manage encrypted storage — client-side encrypted content that
  the PDS itself cannot read. Passepartout's vault stores credentials with
  integrity hashes but does not currently manage per-Note encryption with
  audience-specific keys.
- The Relay Network is described as an intelligent communication backbone with
  pub/sub routing. Passepartout has no Relay implementation, no Relay-facing API,
  and no subscription management beyond its own event orchestrator.
- Agora's contract system (SCAL contracts, HODL invoices, arbitration tiers)
  requires state machines and Lightning Network integration that Passepartout
  has no primitives for.
- The "Passepartout IS the PDS" vision conflates two things: the data model
  (Org files = Notes) and the infrastructure (a process that serves a network
  protocol). The data model unification is clean and right. The infrastructure
  unification implies Passepartout grows from a local agent to a network server
  — a significant architectural expansion that the notes treat as a ~40-line
  utility.

*** No adversarial model

The notes describe layered authentication (crypto, sensory, deterministic,
probabilistic) and type-level gates as structural safety. They do not describe
an adversarial model:

- What stops a malicious Agora Note from containing 100,000 triples that flood
  the fact store?
- What stops a DID from publishing Notes that deliberately inject contradictions
  to force Screamer into exponential backtracking?
- What stops a compromised sensor key from signing valid sensor data that is
  adversarially crafted (e.g., video frames designed to trigger specific vision
  model false positives)?
- What stops a spam DID from creating millions of Personas and flooding the
  user's incoming Notes directory?

The resource monitor (Phase 1a) handles storage pressure generically. The
quarantine system handles individual DIDs flagged for spam. But none of these
are adversary-aware — they react to symptoms (disk full, error rate high) rather
than anticipating attack patterns. An adversarial model would identify these
vectors and design mitigations specifically. The notes describe a system that
works in a cooperative environment, not an adversarial one.
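
As one example of an adversary-aware mitigation for the first question in the
list, a deterministic ingest quota could cap triples per Note and per DID before
anything reaches the fact store. This is a sketch of a possible design, not
something the notes specify; =IngestQuota= and its limits are hypothetical, and
it does nothing against an attacker who mints many DIDs.

```python
from collections import defaultdict


class IngestQuota:
    """Cap how many triples a single Note and a single DID may contribute."""

    def __init__(self, max_per_note: int, max_per_did: int) -> None:
        self.max_per_note = max_per_note
        self.max_per_did = max_per_did
        self.used = defaultdict(int)  # DID -> triples admitted so far

    def admit(self, did: str, triple_count: int) -> bool:
        if triple_count > self.max_per_note:
            return False  # oversized Note: rejected on its declared size alone
        if self.used[did] + triple_count > self.max_per_did:
            return False  # this DID has exhausted its budget
        self.used[did] += triple_count
        return True


quota = IngestQuota(max_per_note=1_000, max_per_did=2_500)
```

A 100,000-triple Note is refused outright, and a DID that keeps sending
maximum-size Notes is cut off once its budget is spent. Because the check is a
pair of integer comparisons, it runs before any Screamer work and costs no LLM
tokens.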

*** The self-repair criterion creates a two-tier architecture

The AGENTS.md rule — "default: everything is a skill" — means the symbolic
engine (Screamer, VivaceGraph, fact store, archivist, ACL2, planner) is all
skills, not core. This is correct for the self-repair criterion: a corrupted
skill degrades the agent but doesn't kill it. A corrupted core file kills the
brainstem.

But it creates a tension: the symbolic engine IS the reasoning layer that would
diagnose and repair a corrupted skill. If the fact store itself is corrupted
(impossible facts, inconsistent cardinality, broken Merkle chains), the engine
that detects corruption is the engine that is corrupted. The system needs a
"repair from below" path — a minimal core that can purge and rebuild the symbolic
index without depending on the symbolic index. This path exists (the fact store
is ephemeral in Phase 1-4 and rebuildable from prose in Phase 5+) but is not
exercised automatically. A corruption in the symbolic engine requires human
detection and manual rebuild — the exact problem the self-repair criterion was
designed to avoid.

** Opportunities

*** A memory prosthesis that makes your own mind legible

The symbolic index, when populated and queried, answers questions that no
existing tool can:

- "What did I believe about monorepos in 2023, and how has that changed?"
- "Which of my diary entries contradict each other?"
- "What entities in my memex have no connection to any other entity?"
- "Show me everything I've written about Nabokov, organized by when I wrote it,
  what I was reading at the time, and what I concluded."
- "Which of my project plans reference security assumptions that I later changed?"
- "What did I think about this topic, and why did I change my mind?"

These are not information retrieval queries. They are self-knowledge queries.
They require provenance chains, temporal versioning, contradiction surfacing, and
cross-domain linkage — all of which the architecture provides as first-class
capabilities. If this works, it transforms the memex from a searchable archive
into a thinking partner that knows the history of your thoughts.

*** Deterministic reasoning as a moat

Every competitor agent system (Claude Code, OpenCode, OpenClaw, Hermes, Cognee,
Mem0) uses neural-only reasoning. They are all vulnerable to the same failure
mode: the LLM hallucinates a fact or an action, and there is no second system to
catch it. Their safety is heuristic. Their memory is flat. Their reasoning is
unprovable.

Passepartout's architectural bet — a symbolic engine that verifies, deduces, and
audits — creates a category difference, not a performance difference. If the bet
pays off, Passepartout is not "a better AI agent." It is a different kind of
system — one whose reasoning is provable, whose memory is content-addressed, and
whose knowledge accumulates through deduction rather than re-prompting.

This is a genuine moat. It cannot be replicated by adding a better system prompt
or a larger context window. It requires building the ontology, the constraint
solver, the fact store, and the provenance tracker — work that takes years and
cannot be shortcut by spending more on inference.
|
||||||
|
|
||||||
|
*** Agora as the first sovereign agent network
|
||||||
|
|
||||||
|
If Passepartout serves as the PDS and an Agora Persona, then AI agents can:
|
||||||
|
|
||||||
|
- Publish verified outputs as signed Notes with cryptographic provenance.
|
||||||
|
Readers know the agent produced the output, not a human impersonating the
|
||||||
|
agent.
|
||||||
|
- Accept invocation Notes from other persona owners. "Please analyze this
|
||||||
|
contract and publish your findings." The agent receives the request as an
|
||||||
|
Agora Note, processes it, signs the response, and publishes it.
|
||||||
|
- Build reputation through auditable chains of signed work products, not through
|
||||||
|
self-reported claims.
|
||||||
|
- Participate in the compute marketplace as both consumer and provider.
|
||||||
|
- Maintain sovereign identity — the agent's DID is independent of any platform,
|
||||||
|
any provider, any human account.
|
||||||
|
|
||||||
|
This is not a chatbot on a messaging platform. It is an autonomous entity on a
|
||||||
|
decentralized network, with cryptographic identity, verifiable provenance, and
|
||||||
|
economic agency. If Agora reaches even Order 1 (the first 1,000 users),
|
||||||
|
Passepartout agents become some of the most capable participants on the network.
|
||||||
|
|
||||||
|
*** The 10-80-10 ratio for coding is genuinely achievable
|
||||||
|
|
||||||
|
For a coding agent — the domain that Passepartout currently operates in — the
|
||||||
|
10-80-10 ratio is plausible. The existing Dispatcher already verifies every
|
||||||
|
action deterministically. Adding Screamer for consistency checking, VivaceGraph
|
||||||
|
for dependency queries, and ACL2 for structural verification would shift the
|
||||||
|
ratio from the current ~95-5-0 (neural-gate-symbolic) toward 50-40-10 in the
|
||||||
|
near term and potentially 10-80-10 in the long term.
|
||||||
|
|
||||||
|
The bootstrapped gate facts already cover file classifications, command safety,
|
||||||
|
path protections, and tool permissions — the core categories for a coding agent.
|
||||||
|
The archivist's extraction from project files would add dependency information,
|
||||||
|
test coverage, and code structure facts. The planner could reason about
|
||||||
|
refactoring order, dependency chains, and safety constraints deterministically.
|
||||||
|
This is the domain where the symbolic engine provides the most immediate value,
|
||||||
|
and it is the domain Passepartout already operates in.
|
||||||
|
|
||||||
|
*** Wikidata as an entity backbone unlocks cross-domain reasoning
|
||||||
|
|
||||||
|
Without Wikidata, the symbolic index for a general-knowledge memex is a sparse
|
||||||
|
set of personal facts with no connecting structure. With Wikidata, the entity
|
||||||
|
graph is pre-structured. The system can answer:
|
||||||
|
|
||||||
|
- "What does my memex say about Nabokov that Wikidata doesn't?"
|
||||||
|
- "Where does my memex disagree with Wikidata?"
|
||||||
|
- "What entities in my memex have no Wikidata counterpart?" (These are the
|
||||||
|
personal, novel, or subjective entities that are the most valuable.)
|
||||||
|
- "Show me the intersection of my literary interests (from diary) with Wikidata's
|
||||||
|
influence graph — which authors I read influenced each other in ways I haven't
|
||||||
|
written about?"
|
||||||
|
|
||||||
|
These are cross-domain queries that require both the personal memex (for what
|
||||||
|
the user knows) and Wikidata (for what the world knows). Neither alone can
|
||||||
|
answer them. Together, they enable a kind of knowledge synthesis that no existing
|
||||||
|
tool provides.
|
||||||
|
|
||||||
|
*** Ontology versioning enables "what-if" reasoning about one's own thinking
|
||||||
|
|
||||||
|
The ability to query across worldviews — "what did I believe before I changed my
|
||||||
|
security model?" — is a capability that has no analog in any existing tool. It
|
||||||
|
transforms the memex from a static archive into a dynamic record of intellectual
|
||||||
|
evolution. Combined with the temporal awareness system (Phase 0c), the system
|
||||||
|
could surface correlations: "You changed your mind about monorepos two weeks
|
||||||
|
after reading this article, which you bookmarked on this date, and one week
|
||||||
|
before starting this project that uses a monorepo structure." The provenance
|
||||||
|
chain IS the narrative of your thinking.
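
A minimal sketch of a "what did I believe at the time" lookup, assuming each fact carries a =:recorded-at= field. That field, the integer date encoding, and the sample history are assumptions made for illustration; the notes do not specify how versions are stored.

#+begin_src lisp
;; Sketch only: :recorded-at and the YYYYMMDD integers are hypothetical.
(defparameter *belief-history*
  '((:entity :monorepo :relation :stance :value :opposed   :recorded-at 20250110)
    (:entity :monorepo :relation :stance :value :in-favour :recorded-at 20250302)))

(defun belief-as-of (facts entity relation date)
  "Return the latest fact about ENTITY/RELATION recorded on or before DATE, or NIL."
  (let ((best nil))
    (dolist (fact facts best)
      (when (and (eq (getf fact :entity) entity)
                 (eq (getf fact :relation) relation)
                 (<= (getf fact :recorded-at) date)
                 (or (null best)
                     (> (getf fact :recorded-at) (getf best :recorded-at))))
        (setf best fact)))))

(getf (belief-as-of *belief-history* :monorepo :stance 20250215) :value) ; => :OPPOSED
(getf (belief-as-of *belief-history* :monorepo :stance 20250401) :value) ; => :IN-FAVOUR
#+end_src

Because older facts are kept rather than overwritten, the same history answers both the "before" and the "after" question.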
|
||||||
|
|
||||||
|
*** Contract-level pre-arbitration reduces the cost of decentralized commerce
|
||||||
|
|
||||||
|
Agora's Tier 0 Arbitrator — a local AI that provides evidence summaries before
|
||||||
|
human arbitration — is a genuinely useful role for a neurosymbolic system.
|
||||||
|
|
||||||
|
- "Contract CID X references arbitrator DID Y. DID Y is active. Verified."
|
||||||
|
- "All parties have signed. The HODL invoice is locked. Verified."
|
||||||
|
- "The buyer's claim of non-delivery is supported by 3 signed messages with
|
||||||
|
timestamps after the delivery deadline."
|
||||||
|
- "The seller's proof-of-delivery field is empty. No QR scan recorded."
|
||||||
|
|
||||||
|
Each check is a Screamer query against the contract-lifecycle domain. The results
|
||||||
|
are a plist, not a ruling. Both parties see the same evidence summary before
|
||||||
|
escalating. This makes Level 1 arbitration faster (arbitrators receive
|
||||||
|
pre-processed evidence bundles), cheaper (no human time spent on trivial
|
||||||
|
verification), and more transparent (both parties see the same machine-generated
|
||||||
|
summary).
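
The shape of such an evidence summary can be sketched as follows. This uses plain Lisp predicates as stand-ins for the Screamer queries described above, and the contract field names and DIDs are hypothetical.

#+begin_src lisp
;; Sketch only: plain predicates stand in for Screamer queries;
;; field names and DIDs are hypothetical.
(defparameter *sample-contract*
  '(:parties ("did:agora:buyer" "did:agora:seller")
    :signatures ("did:agora:seller" "did:agora:buyer")
    :arbitrator-active t
    :invoice-locked t
    :proof-of-delivery nil))

(defun evidence-summary (contract)
  "Return a list of (:check NAME :status STATUS) plists. It reports, it does not rule."
  (flet ((check (name passed)
           (list :check name :status (if passed :verified :unverified))))
    (list (check :arbitrator-active (getf contract :arbitrator-active))
          ;; Every party must appear among the signers.
          (check :all-parties-signed
                 (subsetp (getf contract :parties)
                          (getf contract :signatures)
                          :test #'string=))
          (check :invoice-locked (getf contract :invoice-locked))
          (check :proof-of-delivery-present
                 (not (null (getf contract :proof-of-delivery)))))))

(evidence-summary *sample-contract*)
#+end_src

For the sample contract the first three checks come back =:verified= and the proof-of-delivery check comes back =:unverified=, which is the kind of summary both parties would see before escalating.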
|
||||||
|
|
||||||
|
This is not AI judging. This is AI preparing the docket. The distinction is
|
||||||
|
important and defensible.
|
||||||
|
|
||||||
|
*** Self-auditing agents could transform AI safety discourse
|
||||||
|
|
||||||
|
If Passepartout can answer =/audit= for any action or fact — showing the full
|
||||||
|
provenance chain, every gate that approved it, every fact that supported it,
|
||||||
|
every alternative that was considered — then AI safety moves from "trust us, we
|
||||||
|
tested it" to "here is the audit trail, verify it yourself."
|
||||||
|
|
||||||
|
This is the transparency that every AI safety framework calls for and none
|
||||||
|
delivers. It is possible because the architecture records provenance as a
|
||||||
|
first-class operation, not as an after-the-fact log. The provenance is the
|
||||||
|
operating system, not a logging layer.
|
||||||
|
|
||||||
|
*** The memex + Agora combination could be a new kind of social network
|
||||||
|
|
||||||
|
Current social networks (Twitter, Facebook, Reddit) separate the person from
|
||||||
|
their knowledge. You are a profile with posts. Your posts are isolated units
|
||||||
|
without connection to your broader intellectual life.
|
||||||
|
|
||||||
|
A Passepartout-powered Agora Persona would publish Notes that are grounded in
|
||||||
|
the memex: "Here is my analysis of /Pale Fire/, drawn from diary entries across
|
||||||
|
three years, annotated with Wikidata context, and verified against my existing
|
||||||
|
literary framework." The Note is cryptographically signed, carrying provenance
|
||||||
|
back to the specific Org headings that informed it. Readers see not just the
|
||||||
|
conclusion but the intellectual scaffolding that produced it.
|
||||||
|
|
||||||
|
This is not a "post." It is a publication — a knowledge artifact with verifiable
|
||||||
|
provenance, auditable reasoning, and cryptographic identity. If this becomes the
|
||||||
|
norm, it raises the standard for public discourse from "this is my opinion" to
|
||||||
|
"this is my opinion, here is the evidence, here is how it evolved, here is who
|
||||||
|
verified it."
|
||||||
|
|
||||||
|
** Threats
|
||||||
|
|
||||||
|
*** The ontology problem may be harder than anticipated
|
||||||
|
|
||||||
|
The notes are honest about this: "Whitehead's Principia Mathematica took over
|
||||||
|
300 pages to define the logical foundations before it could prove that 1+1=2."
|
||||||
|
Passepartout's domain is narrower (coding + personal knowledge) but the
|
||||||
|
ontology problem is the same category of problem. Every entity class must be
|
||||||
|
defined. Every relation must have clear semantics. Every inference rule must be
|
||||||
|
justified.
|
||||||
|
|
||||||
|
The gate-to-fact bootstrap provides 50-70 entity classes — enough for a coding
|
||||||
|
agent. But the broader memex contains orders of magnitude more entity types:
|
||||||
|
people, places, works, concepts, events, emotions, aesthetic judgments,
|
||||||
|
professional skills, personal projects, temporal patterns. Defining these as
|
||||||
|
triples with clear semantics is genuine intellectual work that no amount of
|
||||||
|
engineering can shortcut.
|
||||||
|
|
||||||
|
The risk is not that it's impossible. It's that it's slow — slow enough that
|
||||||
|
the system never achieves the density of facts needed for the "flip" in the
|
||||||
|
broader memex. The coding domain may reach sufficiency in months. The literary
|
||||||
|
domain may take years. The daily-reflection domain may never cross the
|
||||||
|
threshold because the facts involved (mood, insight, aesthetic experience) are
|
||||||
|
not formalizable as triples.
|
||||||
|
|
||||||
|
*** Screamer may not scale to the fact store size
|
||||||
|
|
||||||
|
The constraint satisfaction approach to consistency checking is elegant for a
|
||||||
|
seed fact set of hundreds of triples. It is unproven for millions of triples
|
||||||
|
(after Wikidata loading + years of personal extraction). The domain-scoping
|
||||||
|
strategy (Screamer only checks facts from the candidate's =:domain=) bounds the
|
||||||
|
constraint space, but the most valuable consistency checks are cross-domain:
|
||||||
|
|
||||||
|
- "You classified this file as public in your project notes but the gate stack
|
||||||
|
classifies it as secret." (project domain vs security domain)
|
||||||
|
- "You wrote that Nabokov influenced Kafka, but Wikidata says Kafka died before
|
||||||
|
Nabokov published his first novel." (literature domain vs Wikidata domain)
|
||||||
|
- "You planned to use this dependency, but the dependency's license changed in
|
||||||
|
a way that conflicts with your project's license." (project domain vs legal
|
||||||
|
domain)
|
||||||
|
|
||||||
|
If cross-domain checks are disabled for performance, the most valuable
|
||||||
|
contradictions are never detected. If they are enabled, the constraint space
|
||||||
|
explodes. There is no obvious sweet spot.
|
||||||
|
|
||||||
|
*** Wikidata quality may undermine trust in the symbolic index
|
||||||
|
|
||||||
|
If Wikidata facts are admitted with =:policy :plural= and the user sees
|
||||||
|
thousands of contradictions between Wikidata and their personal memex, the
|
||||||
|
symbolic index may feel less trustworthy, not more. "Wikidata says Mount Everest
|
||||||
|
is 8848m. DBpedia says 8849m. Your 2023 diary says 8848m. These three sources
|
||||||
|
disagree on height." This is correct behavior — surfacing disagreement with
|
||||||
|
provenance — but it may be overwhelming. The user wanted a knowledge base, not
|
||||||
|
a disagreement engine.
|
||||||
|
|
||||||
|
The trust problem is compounded by Wikidata's editorial biases. Wikidata
|
||||||
|
reflects the biases of Wikipedia editors: English-language dominance, Western
|
||||||
|
epistemological frameworks, systemic underrepresentation of non-Western
|
||||||
|
knowledge. A memex in Arabic that references Islamic philosophy, Egyptian
|
||||||
|
history, or African literature will find Wikidata's coverage thin, biased, or
|
||||||
|
absent. The symbolic index would dutifully surface these gaps — "your memex
|
||||||
|
mentions 47 entities with no Wikidata counterpart" — but it cannot fill them.
|
||||||
|
|
||||||
|
*** LLM cost and latency may prevent the archivist from keeping up
|
||||||
|
|
||||||
|
If the user writes a diary entry every day, the archivist must extract triples
|
||||||
|
from each new heading. If the extraction takes 1-3 seconds per heading, it's
|
||||||
|
background noise. But if the user imports 500 old diary entries, or the
|
||||||
|
archivist needs to re-extract after an ontology change, or Agora Notes arrive in
|
||||||
|
bulk from multiple follows, the extraction queue grows faster than it drains.
|
||||||
|
|
||||||
|
The notes describe extraction as a background task triggered by heartbeat, but
|
||||||
|
they don't specify the extraction rate limit. An unbounded queue with no rate
|
||||||
|
limit would consume the LLM budget. A bounded queue would fall behind. A lazy
|
||||||
|
extraction strategy (extract on first query) would make first queries slow.
|
||||||
|
A batch extraction on startup would make cold starts slow.
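
The trade-off can be made concrete with a small backlog estimate. This is a sketch under simplifying assumptions (constant arrival rate, constant daily extraction capacity); the function name and all numbers except the 500-entry import are hypothetical.

#+begin_src lisp
;; Sketch only: a linear model of the extraction queue.
(defun backlog-after (days &key (initial 0) (arrivals-per-day 1) (capacity-per-day 1))
  "Headings still waiting after DAYS, given a constant arrival rate and daily capacity."
  (max 0 (+ initial (* days (- arrivals-per-day capacity-per-day)))))

;; 500 imported entries, one new entry per day, 50 extractions per day:
(backlog-after 7  :initial 500 :arrivals-per-day 1 :capacity-per-day 50) ; => 157
(backlog-after 11 :initial 500 :arrivals-per-day 1 :capacity-per-day 50) ; => 0

;; Capacity capped at the arrival rate: the import never drains.
(backlog-after 30 :initial 500 :arrivals-per-day 1 :capacity-per-day 1)  ; => 500
#+end_src

Any rate limit at or below the steady arrival rate leaves an imported backlog untouched, which is why the limit needs to be an explicit design parameter.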
|
||||||
|
|
||||||
|
The archivist's throughput is gated by LLM API rate limits, token costs, and
|
||||||
|
inference latency. These are external constraints that the architecture cannot
|
||||||
|
eliminate. The symbolic engine can reduce LLM calls for reasoning; it cannot
|
||||||
|
reduce LLM calls for extraction from prose.
|
||||||
|
|
||||||
|
*** Agora may never reach network effects
|
||||||
|
|
||||||
|
Agora faces the cold start problem that every decentralized social protocol
|
||||||
|
faces: users won't join without content, creators won't post without users. The
|
||||||
|
bootstrapping strategy (managed service → hybrid → full decentralization,
|
||||||
|
targeting niche communities first) is well-articulated but its success depends
|
||||||
|
on execution in a market where Mastodon, Bluesky, Nostr, and Farcaster are
|
||||||
|
already competing for the same users.
|
||||||
|
|
||||||
|
If Agora doesn't reach even Order 1 (1,000 users), the PDS integration is
|
||||||
|
academic. Passepartout's DID identity, DIDComm gateway, Note signing, and
|
||||||
|
contract verification are all infrastructure for a network that doesn't exist.
|
||||||
|
The symbolic engine still works locally — provenance tracking, contradiction
|
||||||
|
surfacing, and deduction are all valuable without Agora. But the network effects
|
||||||
|
that make Agora a transformative platform — reputation, contracts, marketplaces,
|
||||||
|
collective governance — require a living network.
|
||||||
|
|
||||||
|
The risk is asymmetric: Passepartout invests significant engineering in Agora
|
||||||
|
integration that provides zero value if Agora fails to launch.
|
||||||
|
|
||||||
|
*** Complexity may prevent adoption
|
||||||
|
|
||||||
|
Passepartout is already a complex system: a Lisp daemon, a terminal UI, a skill
|
||||||
|
engine, a gate stack, multiple LLM backends, a Merkle memory system, and an
|
||||||
|
event orchestrator. Adding a fact store, a constraint solver, a graph database,
|
||||||
|
a theorem prover, an archivist, a planner, and an Agora PDS makes it more
|
||||||
|
complex, not less.
|
||||||
|
|
||||||
|
The target user — someone who wants a personal AI assistant that works offline —
|
||||||
|
may not want or need any of this. They want the TUI to work, the LLM to be fast,
|
||||||
|
and the files to stay safe. The neurosymbolic engine is infrastructure for a use
|
||||||
|
case (lifelong personal knowledge management with verifiable provenance) that
|
||||||
|
most users do not yet know they have.
|
||||||
|
|
||||||
|
The risk is that Passepartout builds a cathedral for a congregation of one — a
|
||||||
|
system that is architecturally brilliant and practically unused because the
|
||||||
|
complexity-to-value ratio is too high for anyone except the author.
|
||||||
|
|
||||||
|
*** The self-repair criterion may not hold under adversarial conditions
|
||||||
|
|
||||||
|
The architecture assumes that skills can fail gracefully (fboundp guards, hash
|
||||||
|
table fallbacks, degraded mode). It does not assume that a skill can be
|
||||||
|
adversarially corrupted so that it appears to behave correctly while producing wrong results. A
|
||||||
|
compromised archivist that extracts plausible but false triples, a compromised
|
||||||
|
Screamer that passes all consistency checks, a compromised VivaceGraph that
|
||||||
|
returns query results from a parallel graph — these are "living" skills that
|
||||||
|
would pass integrity checks and still poison the symbolic index.
|
||||||
|
|
||||||
|
The type-level gates prevent the LLM from modifying gate code. They do not
|
||||||
|
prevent a compromised skill (loaded by a trusted human, or corrupted on disk by
|
||||||
|
a separate process) from appearing to operate normally while being subtly wrong. The integrity
|
||||||
|
monitoring (Phase 0) catches disk-level corruption through hash checks. It does
|
||||||
|
not catch semantic corruption — a skill that is byte-for-byte identical to the
|
||||||
|
known-good version but loaded with a malicious input that triggers a latent bug.
|
||||||
|
|
||||||
|
This is not a vulnerability unique to Passepartout. It is a vulnerability in
|
||||||
|
every system where components trust each other. But Passepartout's architecture
|
||||||
|
amplifies the risk because the symbolic engine is supposed to be the trustworthy
|
||||||
|
layer — the component that verifies the LLM's output. If the symbolic engine
|
||||||
|
itself is compromised, the system has no higher court of appeal.
|
||||||
|
|
||||||
|
*** The 10-80-10 ratio may create false confidence
|
||||||
|
|
||||||
|
If the sufficiency metric shows "71% non-lossy, threshold 70%, mode:
|
||||||
|
AUTO-EXTRACTION," the user may assume the system is trustworthy. But sufficiency is
|
||||||
|
global — it aggregates across all domains. The system may have 95% sufficiency
|
||||||
|
in the security domain and 5% sufficiency in the literary domain, a weighted average of
|
||||||
|
71%. The auto-extraction switch would bypass the LLM for all categories with
|
||||||
|
sufficient coverage, but the threshold is global, not per-domain. A literary
|
||||||
|
query would hit the symbolic index that has "sufficient" coverage globally but
|
||||||
|
insufficient coverage for literature.
|
||||||
|
|
||||||
|
The notes describe domain-scoped Screamer checks but not domain-scoped
|
||||||
|
sufficiency. A global sufficiency metric that triggers a global extraction mode
|
||||||
|
change is the wrong granularity. Per-domain sufficiency, with per-domain
|
||||||
|
extraction mode, would be more complex but more honest. The architecture as
|
||||||
|
described has the simpler, more dangerous version.
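
The difference between the two granularities can be sketched with a few lines of arithmetic. The per-domain counts below are hypothetical, chosen so that they reproduce the 95%, 5%, and 71% figures above; =:llm-extraction= is an assumed name for the non-automatic mode.

#+begin_src lisp
;; Sketch only: counts and the :llm-extraction mode name are hypothetical.
;; Each entry is (domain non-lossy-count total-count).
(defparameter *domain-counts*
  '((:security   1045 1100)    ; 95% non-lossy
    (:literature   20  400)))  ; 5% non-lossy

(defun sufficiency (non-lossy total)
  "Fraction of non-lossy results as an exact rational; 0 for an empty domain."
  (if (zerop total) 0 (/ non-lossy total)))

(defun global-sufficiency (domains)
  "One pooled ratio across all domains, as the notes currently describe."
  (sufficiency (reduce #'+ domains :key #'second)
               (reduce #'+ domains :key #'third)))

(defun domain-modes (domains &key (threshold 7/10))
  "Extraction mode decided per domain instead of globally."
  (loop for (name non-lossy total) in domains
        collect (cons name
                      (if (>= (sufficiency non-lossy total) threshold)
                          :auto-extraction
                          :llm-extraction))))

(global-sufficiency *domain-counts*) ; => 71/100
(domain-modes *domain-counts*)
;; => ((:SECURITY . :AUTO-EXTRACTION) (:LITERATURE . :LLM-EXTRACTION))
#+end_src

The pooled figure clears the 70% threshold even though the literature domain sits at 5%; the per-domain version keeps the LLM in the loop for exactly that domain.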
|
||||||
|
|
||||||
|
** Summary Matrix
|
||||||
|
|
||||||
|
| | Positive | Negative |
|
||||||
|
|-----------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------|
|
||||||
|
| INTERNAL | S: Architectural inversion, unified Org format, provenance as product, | W: Unproven fact language, Screamer scale unverified, extraction cost hidden, |
|
||||||
|
| | cardinality model, gate-to-fact bootstrap, self-preservation, organic ontology, | flip underspecified, adversarial model absent, self-repair tension, |
|
||||||
|
| | Wikidata as accelerator, decoupled compute cost | Agora integration scope undefined, per-domain sufficiency missing |
|
||||||
|
|-----------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------|
|
||||||
|
| EXTERNAL | O: Memory prosthesis, deterministic moat, sovereign agent network, | T: Ontology may be harder than expected, Screamer may not scale, |
|
||||||
|
| | 10-80-10 for coding achievable, Wikidata cross-domain queries, | Wikidata quality/trust, LLM extraction bottleneck, Agora network effects, |
|
||||||
|
| | ontology versioning, contract pre-arbitration, self-auditing safety, | complexity-to-adoption ratio, adversarial semantic corruption, |
|
||||||
|
| | knowledge-based social network | false confidence from global sufficiency metric |
|
||||||
|
|
||||||
|
* What This Unlocks
|
||||||
|
|
||||||
|
** Technologically
|
||||||
|
|
||||||
|
The neurosymbolic engine, if built, would be the first AI system where:
|
||||||
|
|
||||||
|
1. *Reasoning is auditable.* Every conclusion carries a provenance chain back to
|
||||||
|
its premises. The =/audit= command renders the full inference tree — every
|
||||||
|
fact, every deduction, every gate outcome — in human-readable form.
|
||||||
|
|
||||||
|
2. *Knowledge accumulates deterministically.* Screamer deductions and gate
|
||||||
|
outcomes generate new facts without any LLM involvement. The knowledge base
|
||||||
|
grows from the system's own operation, not from re-prompting the LLM.
|
||||||
|
|
||||||
|
3. *Memory is content-addressed.* Every fact is a Merkle node. Every version
|
||||||
|
chain is tamper-proof. Rollback is atomic. The storage format is proven
|
||||||
|
correct before it is committed to disk.
|
||||||
|
|
||||||
|
4. *Safety is provable, not empirical.* Type-level gates make self-modification
|
||||||
|
structurally impossible. ACL2 proves that the rule set has no contradictions.
|
||||||
|
The dispatcher doesn't "try" to be safe — it is safe by construction.
|
||||||
|
|
||||||
|
5. *The human and the machine share the same format.* Org files for both. No
|
||||||
|
hidden database. No import/export step. The agent's memory IS the human's
|
||||||
|
memory.
|
||||||
|
|
||||||
|
These five properties, together, define a new category of AI system: the
|
||||||
|
*sovereign reasoning agent*. Not sovereign in the blockchain sense (decentralized
|
||||||
|
by consensus), but sovereign in the personal sense: the agent runs on your
|
||||||
|
hardware, reasons with your knowledge, and proves its reasoning to you.
|
||||||
|
|
||||||
|
** Socially
|
||||||
|
|
||||||
|
If the technical vision succeeds and Agora reaches network effects, the
|
||||||
|
combination unlocks:
|
||||||
|
|
||||||
|
1. *Verifiable public discourse.* Every published claim carries provenance back
|
||||||
|
to source material. "I read this, I thought this, I changed my mind on this
|
||||||
|
date, here is the evidence." Public discourse shifts from "competing opinions"
|
||||||
|
to "competing evidence chains." The quality floor rises because claims without
|
||||||
|
provenance are visibly weaker than claims with provenance.
|
||||||
|
|
||||||
|
2. *Sovereign AI agents with legal and economic personhood.* A Passepartout
|
||||||
|
agent with an Agora Persona can own assets, enter contracts, earn reputation,
|
||||||
|
and face consequences for failure. This is not a chatbot. It is an autonomous
|
||||||
|
entity with cryptographic identity, verified provenance, and economic agency
|
||||||
|
— more like a corporation than a tool.
|
||||||
|
|
||||||
|
3. *Self-auditing AI safety.* Every action the agent takes is traceable. Every
|
||||||
|
gate decision is recorded. Every fact that informed a decision is queryable.
|
||||||
|
AI safety moves from "trust us" to "here is the audit trail." This is the
|
||||||
|
transparency that every AI ethics framework calls for.
|
||||||
|
|
||||||
|
4. *A personal knowledge economy.* If your memex can publish Notes as Agora
|
||||||
|
content, your intellectual work — your analyses, your syntheses, your
|
||||||
|
discoveries — becomes a publishable, attributable, monetizable asset. Not
|
||||||
|
through advertising or subscriptions, but through direct value exchange:
|
||||||
|
Lightning payments for content access, contract work for your verified
|
||||||
|
expertise, reputation that follows your Persona across platforms.
|
||||||
|
|
||||||
|
5. *Collective intelligence without centralized control.* If multiple
|
||||||
|
Passepartout agents share facts through Agora Notes, the collective symbolic
|
||||||
|
index represents the verified, provenanced knowledge of a community — not the
|
||||||
|
averaged opinion of a crowd, but the auditable intersection of independently
|
||||||
|
verified claims. This is Wikipedia without the editorial board, science
|
||||||
|
without the journal gatekeepers, journalism without the corporate owners.
|
||||||
|
|
||||||
|
6. *A memory prosthesis that outlives the individual.* A memex with a decade of
|
||||||
|
diary entries, linked to Wikidata's entity graph, with Screamer deductions
|
||||||
|
surfacing patterns and contradictions, with ontology versioning preserving
|
||||||
|
intellectual evolution — this is not a knowledge management tool. It is an
|
||||||
|
externalized, queryable, auditable record of a life's thinking. It is what
|
||||||
|
Vannevar Bush imagined in 1945: "an enlarged intimate supplement to his
|
||||||
|
memory."
|
||||||
|
|
||||||
|
* Conclusion
|
||||||
|
|
||||||
|
The architecture described in these notes is genuinely novel. Not incrementally
|
||||||
|
novel — most agent architectures are variations on "LLM + tools + prompt-based
|
||||||
|
safety." Passepartout's neurosymbolic vision is categorically different: an
|
||||||
|
inversion where the deterministic layer judges the probabilistic layer, where
|
||||||
|
facts carry provenance chains, where contradiction is a feature rather than an
|
||||||
|
error, and where the user's Org files are the single source of truth for both
|
||||||
|
human and machine.
|
||||||
|
|
||||||
|
The largest risk is not that the architecture is wrong. It is that the ontology
|
||||||
|
problem — the genuine difficulty of defining what a "fact" is, what relations
|
||||||
|
are, what categories are useful, and how they evolve — is harder than the notes
|
||||||
|
anticipate, and that the system spends years in a partially-working state where
|
||||||
|
the symbolic index is too sparse to be useful but too entangled to be discarded.
|
||||||
|
|
||||||
|
The second-largest risk is that Agora never reaches the network effects needed
|
||||||
|
to make the PDS integration valuable beyond a local experiment, and that the
|
||||||
|
engineering investment in DIDComm gateways, Note signing, contract verification,
|
||||||
|
and Relay integration produces infrastructure for a network that doesn't exist.
|
||||||
|
|
||||||
|
The opportunity is equally large: a system that makes your own mind legible to
|
||||||
|
you, that proves its reasoning rather than asserting it, that accumulates
|
||||||
|
knowledge across sessions through deduction rather than re-prompting, and that
|
||||||
|
publishes verified, provenanced knowledge to a decentralized network. If this
|
||||||
|
works — even partially, even slowly — it is a category-level advance over every
|
||||||
|
existing agent architecture and every existing personal knowledge management
|
||||||
|
tool.
|
||||||
|
|
||||||
|
The notes are a map of territory that no one has walked. The territory is real.
|
||||||
|
The map is detailed enough to navigate by. Whether the journey completes depends
|
||||||
|
on whether the ontology problem yields to engineering, and whether the user —
|
||||||
|
the one human whose memex this serves — finds value in the partial system well
|
||||||
|
before the full vision materializes.
|
||||||
314
notes/passepartout-agora.org
Normal file
@@ -0,0 +1,314 @@
|
|||||||
|
#+TITLE: Passepartout-Agora Integration — Unified Container Format
|
||||||
|
#+AUTHOR: Agent
|
||||||
|
#+FILETAGS: :notes:integration:agora:passepartout:design:
|
||||||
|
#+CREATED: [2026-05-08 Fri]
|
||||||
|
|
||||||
|
* Summary
|
||||||
|
|
||||||
|
Org files and Agora Notes are the same container. Both are text with headers,
|
||||||
|
tags, properties, and prose body. Both contain zero or more symbolic facts
|
||||||
|
extractable by Passepartout's archivist. The only difference is that an Agora
|
||||||
|
Note carries a DID signature and a CID for cryptographic provenance on the
|
||||||
|
network. An Org file without a signature is a local Note. A signed Org file
|
||||||
|
pushed to the PDS is an Agora Note.
|
||||||
|
|
||||||
|
Passepartout's =memory-object= struct serves as the storage format for both.
|
||||||
|
The archivist extracts facts from one unified store. Authorship is distinguished
|
||||||
|
by provenance, not location.
|
||||||
|
|
||||||
|
* The Unification
|
||||||
|
|
||||||
|
** Org files and Notes are the same container
|
||||||
|
|
||||||
|
| Property | Org file (local) | Agora Note (network) |
|
||||||
|
|------------------+------------------------------+-------------------------------------|
|
||||||
|
| Format | Org-mode text | Org-mode text |
|
||||||
|
| Identity | Merkle hash (=memory-object=) | CIDv1 (same hash) |
|
||||||
|
| Contains facts | Yes (archivist extracts) | Yes (archivist extracts) |
|
||||||
|
| Author identity | Implicit (file in =~/memex/=) | Explicit (DID signature in =proof=) |
|
||||||
|
| Access control | Filesystem permissions | =access_control= flags |
|
||||||
|
| Routing | N/A (local disk) | =notify= + =references= + Relay |
|
||||||
|
| Ephemeral | No | =ephemeral_duration= |
|
||||||
|
| Behavioral flag | Implicit (convention) | =is_feed= field |
|
||||||
|
|
||||||
|
The structure converges in a single plist:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
(:cid <merkle-hash> ;; Identity across local and network
|
||||||
|
:title <string> ;; Org headline title
|
||||||
|
:content <org-text> ;; Full Org body (headings, prose, source blocks)
|
||||||
|
:owner <did-or-nil> ;; For Agora Notes: the signing Persona DID. nil for local
|
||||||
|
:proof <plist-or-nil> ;; ( :editor <did> :signature <bytes> )
|
||||||
|
;; Agora behavioral flags (nil for local files)
|
||||||
|
:is-feed <boolean-or-nil>
|
||||||
|
:access-control <did-list-or-nil>
|
||||||
|
:notify <did-list-or-nil>
|
||||||
|
:references <cid-list-or-nil>
|
||||||
|
:reply-to <cid-or-nil>
|
||||||
|
:thread-root <cid-or-nil>
|
||||||
|
:ephemeral-duration <integer-or-nil>
|
||||||
|
;; Passepartout metadata
|
||||||
|
:created-at <timestamp>
|
||||||
|
:tags <string-list> ;; Org tags
|
||||||
|
:properties <plist> ;; Org property drawer
|
||||||
|
:extracted-facts <fact-list>) ;; Populated by archivist after extraction
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
** Facts are extracted from both, identically
|
||||||
|
|
||||||
|
An Org file in =~/memex/literature/pale-fire.org= and an Agora Note from
|
||||||
|
=did:agora:heather= with =:references <post-CID>= both contain prose. The
|
||||||
|
archivist scans both, proposes triples via the LLM, verifies via Screamer,
|
||||||
|
and admits facts to the symbolic index. The facts carry different provenance:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
;; Extracted from local Org file
|
||||||
|
(:entity :pale-fire :relation :theme :value :unreliable-narration
|
||||||
|
:provenance :local-prose :grounding "heading-42")
|
||||||
|
|
||||||
|
;; Extracted from Agora Note
|
||||||
|
(:entity :kafka :relation :influence :value :nabokov
|
||||||
|
:provenance :agora-note :grounding <incoming-note-cid> :author "did:agora:heather")
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
No new extraction path. The archivist already walks containers and extracts
|
||||||
|
facts. The container type determines the provenance tag and the grounding
|
||||||
|
identifier (local heading ID vs. Note CID).
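
One way to express that rule, as a sketch: the function below derives the provenance fields from the container plist shown earlier, treating a non-nil =:owner= as the mark of an Agora Note. The function name and the sample CID are hypothetical.

#+begin_src lisp
;; Sketch only: function name and sample values are hypothetical.
(defun fact-provenance (container &optional heading-id)
  "Return the provenance fields for facts extracted from CONTAINER."
  (let ((owner (getf container :owner)))
    (if owner
        ;; Signed Agora Note: ground facts in the Note's CID and record the author.
        (list :provenance :agora-note
              :grounding (getf container :cid)
              :author owner)
        ;; Unsigned local Org content: ground facts in the heading.
        (list :provenance :local-prose :grounding heading-id))))

(fact-provenance '(:cid "bafy-example" :owner "did:agora:heather"))
;; => (:PROVENANCE :AGORA-NOTE :GROUNDING "bafy-example" :AUTHOR "did:agora:heather")

(fact-provenance '(:cid "bafy-local" :owner nil) "heading-42")
;; => (:PROVENANCE :LOCAL-PROSE :GROUNDING "heading-42")
#+end_src

The returned plist can be appended to each extracted triple, so the extraction step itself stays identical for both container types.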
|
||||||
|
|
||||||
|
** The memex distinguishes provenance by location, not format
|
||||||
|
|
||||||
|
Incoming Agora Notes arrive at =~/memex/social/notes/<did>/<cid>.org=.
|
||||||
|
The directory structure encodes authorship:
|
||||||
|
|
||||||
|
| Path | Meaning |
|
||||||
|
|---------------------------------------------------+------------------------------------|
|
||||||
|
| ~/memex/daily/ | Local diary entries |
|
||||||
|
| ~/memex/projects/ | Local project files |
|
||||||
|
| ~/memex/literature/ | Local reading notes |
|
||||||
|
| ~/memex/notes/ | Local design and thinking notes |
|
||||||
|
| ~/memex/social/notes/<did>/<cid>.org | Incoming Notes from other DIDs |
|
||||||
|
| ~/memex/social/outbox/<cid>.org | Outgoing Notes signed by the user |
|
||||||
|
|
||||||
|
The archivist scans all directories. Local files produce facts with
=:provenance :local-prose=. Incoming Agora files produce facts with
=:provenance :agora-note= plus =:author <did>=. The symbolic index maps
provenance to a cardinality policy. Local prose is =:plural=: these are the
human's own notes, and multiple interpretations coexist. Agora Notes are also
=:plural= by default, because each one is its author's claim and is not
authoritative over local facts. An Agora Note can be promoted to =:singular= or
=:dual= if it carries cryptographic proof of a specific claim.
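
The directory table above can be read as a small classification function. The
following is a minimal sketch, assuming paths are given relative to the memex
root. =split-path=, =provenance-for-path=, and =*local-directories*= are
hypothetical names used for illustration, not existing Passepartout code. The
provenance of =social/outbox/= is not stated in these notes, so the sketch
returns =NIL= for it rather than guessing.

#+begin_src lisp
(defparameter *local-directories* '("daily" "projects" "literature" "notes")
  "Top-level memex directories whose files count as local prose.")

(defun split-path (path)
  "Split PATH on #\\/ into its non-empty components."
  (loop with start = 0
        for pos = (position #\/ path :start start)
        for piece = (subseq path start pos)
        unless (string= piece "") collect piece
        while pos
        do (setf start (1+ pos))))

(defun provenance-for-path (relative-path)
  "Return a provenance plist for RELATIVE-PATH, or NIL when unspecified."
  (let ((parts (split-path relative-path)))
    (cond
      ;; social/notes/<did>/<cid>.org is an incoming Agora Note.
      ((and (= (length parts) 4)
            (string= (first parts) "social")
            (string= (second parts) "notes"))
       (list :provenance :agora-note :author (third parts) :policy :plural))
      ;; daily/, projects/, literature/, notes/ are local prose.
      ((member (first parts) *local-directories* :test #'equal)
       (list :provenance :local-prose :policy :plural))
      ;; Anything else (including social/outbox/) is left undecided here.
      (t nil))))
#+end_src

Because the DID is read from the path, the archivist can tag authorship without
opening the file; signature verification would still be a separate step.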
|
||||||
|
|
||||||
|
** Publishing Org content as Agora Notes
|
||||||
|
|
||||||
|
When the user wants to publish a diary entry, project log, or literary note as
|
||||||
|
an Agora Note, the operation is:
|
||||||
|
|
||||||
|
1. Select the Org heading or file.
|
||||||
|
2. Compute the Merkle hash (=memory-object= hash → CIDv1).
|
||||||
|
3. Sign with the user's Persona DID key (Phase 0b key registry).
|
||||||
|
4. Set Agora flags: =:is-feed= t/nil, =:access-control= [], =:references= [previous-note-cid].
|
||||||
|
5. Push to the PDS. The Note is an Org plist with a DID signature.
|
||||||
|
6. The PDS stores and relays it. The Note remains in =~/memex/social/outbox/= with its CID.
|
||||||
|
|
||||||
|
All of this is a single function: =(note-publish heading-id &key is-feed access-control references)=.
|
||||||
|
~40 lines, extending the vault (key signing), the fact store (CID generation),
|
||||||
|
and the memex (output directory).
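
A rough sketch of what such a function might assemble, under stated
assumptions. It takes the heading text and the owner DID directly instead of a
=heading-id=, and =toy-cid= and =toy-sign= are deliberately fake stand-ins for
the Merkle hash and the vault's DID signing. None of these helper names come
from the Passepartout source, and the PDS push is left out.

#+begin_src lisp
(defun toy-cid (text)
  "Stand-in for the Merkle hash -> CIDv1 step. NOT a real content hash."
  (format nil "toy-cid-~36R" (sxhash text)))

(defun toy-sign (cid did)
  "Stand-in for Persona DID key signing. NOT a real signature."
  (format nil "signed(~A,~A)" did cid))

(defun note-publish (heading-text did
                     &key is-feed access-control references
                          (hash-fn #'toy-cid) (sign-fn #'toy-sign))
  "Build the plist for an outgoing Note from HEADING-TEXT owned by DID.
Pushing the result to the PDS is not shown."
  (let ((cid (funcall hash-fn heading-text)))
    (list :cid cid
          :owner did
          :content heading-text
          :is-feed is-feed
          :access-control access-control
          :references references
          :signature (funcall sign-fn cid did)
          ;; The Note stays in the outbox under its CID.
          :outbox-path (format nil "social/outbox/~A.org" cid))))
#+end_src

Passing the hash and signing functions as keyword arguments keeps the sketch
runnable without the vault, and marks exactly where the real implementations
would plug in.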
|
||||||
|
|
||||||
|
* Implications for Passepartout's Architecture
|
||||||
|
|
||||||
|
** The symbolic index gains a second ingestion source, not a new path
|
||||||
|
|
||||||
|
Facts enter through three gates:
|
||||||
|
1. Gate outcomes (bootstrap + runtime, =:provenance :gate-outcome=)
|
||||||
|
2. Screamer deductions (=:provenance :deduced=)
|
||||||
|
3. Archivist extraction (=:provenance :local-prose= or =:provenance :agora-note=)
|
||||||
|
|
||||||
|
The third path now covers both local Org files and incoming Agora Notes. No new
path is needed. The archivist needs almost no new code: only a new directory to
walk (=~/memex/social/notes/=) and a new provenance tag to assign.
|
||||||
|
|
||||||
|
** Authentication Layer 1 now has Agora-native verification
|
||||||
|
|
||||||
|
Phase 0b's cryptographic gate (vector 0) verifies DID signatures. An incoming
|
||||||
|
Agora Note carries =:owner <did>= and =:proof.signature <bytes>=. Gate vector 0
|
||||||
|
verifies the signature against the DID's public key (from the key registry, which
|
||||||
|
is now also an Agora DID registry). Verification is identical for local signals
|
||||||
|
and Agora signals — the same gate, the same key lookup.
|
||||||
|
|
||||||
|
** Self-preservation gains an Agora dimension
|
||||||
|
|
||||||
|
The resource monitor (Phase 1a) tracks =~/memex/social/= as a source of storage
|
||||||
|
growth. Incoming Notes from network sources are lower preservation priority than
|
||||||
|
local prose — if disk pressure hits, incoming Agora Notes are evicted first
|
||||||
|
(their source is the remote PDS; they can be re-fetched). Quarantine (Phase 1a)
|
||||||
|
extends to Agora channels: if a DID is sending spam or malformed Notes, that
DID's incoming directory is quarantined and the DID is flagged for human review.
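
The eviction rule can be sketched as a sort key. This assumes each stored file
is described by a plist with =:path= and =:provenance=; the function name
=eviction-order= is hypothetical and the real resource monitor may weigh other
factors such as age or size.

#+begin_src lisp
(defun eviction-order (files)
  "Return FILES sorted so re-fetchable Agora Notes are evicted first.
Relative order within each group is preserved."
  (stable-sort (copy-list files) #'<
               :key (lambda (file)
                      ;; 0 = evict early (remote copy exists), 1 = keep longer.
                      (if (eq (getf file :provenance) :agora-note) 0 1))))
#+end_src

=stable-sort= on a copy leaves the caller's list untouched and keeps any
existing ordering (for example oldest first) inside each priority class.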
|
||||||
|
|
||||||
|
** Sufficiency tracks Agora as a provenance source
|
||||||
|
|
||||||
|
The sufficiency score (Phase 4) gains a new provenance category:
|
||||||
|
|
||||||
|
#+begin_example
|
||||||
|
Symbolic Index
|
||||||
|
Facts: 3,847
|
||||||
|
Gate outcomes: 847 (22%)
|
||||||
|
Deduced: 921 (24%)
|
||||||
|
Human-authored: 72 (2%)
|
||||||
|
Local prose: 1,247 (32%)
|
||||||
|
Agora Notes: 760 (20%)
|
||||||
|
─────────────────────────
|
||||||
|
Non-lossy: 1,840 (48%)
|
||||||
|
LLM-proposed: 2,007 (52%)
|
||||||
|
#+end_example
|
||||||
|
|
||||||
|
Agora Notes are a provenance source, not a lossiness category. Facts from Agora
Notes carry =:provenance :agora-note=. They are LLM-extracted (the archivist
proposes them), so the summary above counts them as LLM-proposed, but the
source is cryptographically signed by a known DID. They are neither
=:gate-outcome= (mechanical) nor LLM-proposed facts from local prose
(=:local-prose=, uncertain source). They occupy a middle ground: verified
source, uncertain extraction.
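
The arithmetic behind the example summary can be reproduced in a few lines.
This is a sketch only: =sufficiency-summary= and =*llm-extracted*= are
hypothetical names, and treating =:local-prose= and =:agora-note= as the
LLM-proposed categories follows the example table rather than any existing
code.

#+begin_src lisp
(defparameter *example-counts*
  '((:gate-outcome . 847) (:deduced . 921) (:human-authored . 72)
    (:local-prose . 1247) (:agora-note . 760))
  "Fact counts per provenance, copied from the example summary.")

(defparameter *llm-extracted* '(:local-prose :agora-note)
  "Provenance tags whose facts were proposed by the LLM.")

(defun sufficiency-summary (counts)
  "Return totals and rounded percentages for lossy vs. non-lossy facts."
  (let* ((total (reduce #'+ counts :key #'cdr))
         (lossy (reduce #'+ (remove-if-not
                             (lambda (entry) (member (car entry) *llm-extracted*))
                             counts)
                        :key #'cdr))
         (non-lossy (- total lossy)))
    (list :total total
          :non-lossy non-lossy
          :non-lossy-pct (round (* 100 non-lossy) total)
          :llm-proposed lossy
          :llm-proposed-pct (round (* 100 lossy) total))))

;; (sufficiency-summary *example-counts*)
;; => (:TOTAL 3847 :NON-LOSSY 1840 :NON-LOSSY-PCT 48
;;     :LLM-PROPOSED 2007 :LLM-PROPOSED-PCT 52)
#+end_src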
|
||||||
|
|
||||||
|
* Implications for Agora
|
||||||
|
|
||||||
|
** Passepartout IS the PDS
|
||||||
|
|
||||||
|
The TODO.org in =projects/agora/= already captures this: "Passepartout IS the
|
||||||
|
PDS — the agent runs a personal data store in-process." With Org files as the
|
||||||
|
Note format, this is literal. The PDS stores Org files. The agent reads them.
|
||||||
|
The network accesses them via the PDS API. There is no separate PDS process.
|
||||||
|
|
||||||
|
** Level 0 pre-arbitration via Screamer
|
||||||
|
|
||||||
|
Section 07 of the Agora requirements describes a "Tier 0 Arbitrator" — a local
|
||||||
|
AI that provides a sanity check before human arbitration. Passepartout's
|
||||||
|
Screamer + fact store provides this at zero LLM tokens when working from
|
||||||
|
existing facts:
|
||||||
|
|
||||||
|
- "Contract CID X references arbitrator DID Y. DID Y is active. Verified."
|
||||||
|
- "All parties have signed. The HODL invoice is locked. Verified."
|
||||||
|
- "The buyer's claim of non-delivery is supported by 3 signed messages with
|
||||||
|
timestamps after the delivery deadline."
|
||||||
|
- "The seller's proof-of-delivery field is empty. No QR scan recorded."
|
||||||
|
|
||||||
|
Each check is a Screamer query against the contract-lifecycle domain. Results
|
||||||
|
are a plist, not a ruling. Both parties see the same evidence summary before
|
||||||
|
escalating to Level 1.
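
A sketch of the shape such an evidence summary might take. Plain Common Lisp
predicates stand in for the Screamer queries here, and the relation names
(=:arbitrator=, =:status=, =:parties=, =:signed-by=, =:proof-of-delivery=) are
assumptions made for illustration; the actual contract-lifecycle vocabulary is
not defined in these notes.

#+begin_src lisp
(defun fact-value (facts entity relation)
  "Return the :value of the first fact matching ENTITY and RELATION, or NIL."
  (getf (find-if (lambda (fact)
                   (and (equal (getf fact :entity) entity)
                        (eq (getf fact :relation) relation)))
                 facts)
        :value))

(defun pre-arbitrate (facts contract)
  "Return an evidence plist for CONTRACT. This is a summary, not a ruling."
  (let ((arbitrator (fact-value facts contract :arbitrator))
        (parties (fact-value facts contract :parties))
        (signers (fact-value facts contract :signed-by)))
    (list :arbitrator-active
          (eq (fact-value facts arbitrator :status) :active)
          :all-signed
          (and parties
               (null (set-difference parties signers :test #'equal)))
          :delivery-proof
          (and (fact-value facts contract :proof-of-delivery) t))))
#+end_src

Every field is derived from stored facts only, which is why the check costs no
LLM tokens; a missing fact simply yields =NIL= for that field.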
|
||||||
|
|
||||||
|
** Reputation as deduced facts
|
||||||
|
|
||||||
|
Screamer deduces reputation from signed contract chains, not asserted claims:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
(:entity "did:agora:heather" :relation :contract-reputation
|
||||||
|
:value (:completed 47 :defaulted 0 :disputes 3 :won 3 :escalated 0)
|
||||||
|
:provenance :deduced :derived-from (<list of 47 contract CIDs>))
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
This is the strong version of Agora's Trust Score. It's a fact deduced from
|
||||||
|
cryptographic evidence, not a claim by the persona (self-reporting could be
|
||||||
|
false) and not a claim by a centralized reputation service (could be bought).
|
||||||
|
The deduction is auditable: =/audit did:agora:heather= shows every contract,
|
||||||
|
every outcome, every ruling.
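
One way the deduction could be tallied, as a sketch. It assumes each verified
contract has already been reduced to a plist with =:cid=, =:outcome=
(=:completed= or =:defaulted=), and an optional =:dispute= result (=:won=,
=:lost=, or =:escalated=). Those field names and =deduce-reputation= itself are
hypothetical; in the design described here the counting would be a Screamer
deduction over signed contract facts.

#+begin_src lisp
(defun deduce-reputation (did contracts)
  "Fold CONTRACTS into a :contract-reputation fact for DID."
  (flet ((tally (key value)
           (count value contracts :key (lambda (c) (getf c key)))))
    (list :entity did
          :relation :contract-reputation
          :value (list :completed (tally :outcome :completed)
                       :defaulted (tally :outcome :defaulted)
                       :disputes (count-if (lambda (c) (getf c :dispute))
                                           contracts)
                       :won (tally :dispute :won)
                       :escalated (tally :dispute :escalated))
          :provenance :deduced
          ;; Keep every input CID so the result can be audited.
          :derived-from (mapcar (lambda (c) (getf c :cid)) contracts))))
#+end_src

Carrying the input CIDs in =:derived-from= is what makes the score auditable:
anyone can recompute the tally from the same signed contracts.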
|
||||||
|
|
||||||
|
** Agent Behavioral Contracts — formal enforcement for the ABC of Agora
|
||||||
|
|
||||||
|
Bhardwaj (2026) introduces a formal framework that brings Design-by-Contract
|
||||||
|
principles to autonomous AI agents. An ABC contract =C = (P, I, G, R)=
|
||||||
|
specifies /Preconditions/, /Invariants/ (hard and soft), /Governance/ policies
|
||||||
|
(hard and soft), and /Recovery/ mechanisms as first-class runtime-enforceable
|
||||||
|
components.
|
||||||
|
|
||||||
|
This maps directly onto Agora's contract lifecycle:
|
||||||
|
|
||||||
|
| ABC component | Agora mapping |
|
||||||
|
|------------------------+--------------------------------------------------------------|
|
||||||
|
| =P= (Preconditions) | Contract Note validity checks: all signers' DIDs active, |
|
||||||
|
| | contract CID correctly referenced, HODL invoice locked |
|
||||||
|
| =I= (Invariants) | Hard: payment amount unchanged, arbitrator DID unchanged. |
|
||||||
|
| | Soft: delivery within estimated window |
|
||||||
|
| =G= (Governance) | Hard: no party modifies contract terms unilaterally. |
|
||||||
|
| | Soft: parties communicate through designated channels |
|
||||||
|
| =R= (Recovery) | Arbitration escalation, HODL invoice release, reputation |
|
||||||
|
| | deduction |
|
||||||
|
|
||||||
|
The framework's key mathematical results have direct implications for Agora:
|
||||||
|
|
||||||
|
- /Drift Bounds Theorem/: contracts with recovery rate γ > α (natural drift rate
|
||||||
|
from LLM non-determinism in agent behavior) bound behavioral drift to D* = α/γ.
|
||||||
|
For Agora, this means contract enforcement can be /predictive/ — detecting drift
|
||||||
|
before violation — rather than just /corrective/ after breach.
|
||||||
|
|
||||||
|
- /Compositionality Theorem/: sufficient conditions (interface compatibility,
|
||||||
|
assumption discharge, governance consistency, recovery independence) under
|
||||||
|
which individual contract guarantees compose end-to-end for multi-agent chains.
|
||||||
|
This is essential for Agora's multi-party contracts, where a buyer, seller,
|
||||||
|
arbitrator, and escrow agent form a chain of interdependent behavioral
|
||||||
|
expectations.
|
||||||
|
|
||||||
|
- /(p, δ, k)-satisfaction/: probabilistic compliance accounting for LLM
|
||||||
|
non-determinism — contracts hold with probability p, deviations stay within
|
||||||
|
tolerance δ, recovery within k steps. This formalizes what Screamer's
|
||||||
|
contract-lifecycle domain queries verify: whether the current state of a
|
||||||
|
contract satisfies its agreed-upon conditions, given the inherent uncertainty
|
||||||
|
in any agent's behavior.
|
||||||
|
|
||||||
|
The empirical results are significant: across 1,980 sessions on 7 models,
|
||||||
|
contracted agents (with ABC enforcement) detected 5.2-6.8 soft violations per
|
||||||
|
session that uncontracted agents missed entirely, with <10ms per-action overhead.
|
||||||
|
Low overhead matters for Passepartout as the PDS: contract enforcement must not
|
||||||
|
add latency to Note processing.
|
||||||
|
|
||||||
|
ABC does not replace Screamer. ABC specifies /what/ must hold; Screamer verifies
|
||||||
|
/whether/ it holds against the fact store. The contract-lifecycle domain already
|
||||||
|
planned for Phase 0b (signal chain) can be implemented as an ABC-like structure:
|
||||||
|
a tuple of preconditions, invariants, governance rules, and recovery mechanisms,
|
||||||
|
each expressed as Screamer-verifiable facts with Merkle provenance.
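
A minimal sketch of such an ABC-like tuple, assuming each check is a named
predicate over a contract-state plist. The struct, =make-check=, and
=evaluate-contract= are illustrative names, not part of the ABC paper or of
Passepartout, and ordinary lambdas stand in for Screamer-verified facts.

#+begin_src lisp
(defstruct abc-contract
  preconditions   ; list of checks that must hold before the contract starts
  invariants      ; list of checks, each :hard or :soft
  governance      ; list of checks, each :hard or :soft
  recovery)       ; action to take when a hard check fails

(defun make-check (name severity test)
  "A check is a plist: NAME, SEVERITY (:hard or :soft), and a predicate TEST."
  (list :name name :severity severity :test test))

(defun violations (checks severity state)
  "Names of CHECKS with SEVERITY whose test fails on STATE."
  (loop for check in checks
        when (and (eq (getf check :severity) severity)
                  (not (funcall (getf check :test) state)))
          collect (getf check :name)))

(defun evaluate-contract (contract state)
  "Return a plist describing which parts of CONTRACT fail on STATE."
  (let* ((rules (append (abc-contract-invariants contract)
                        (abc-contract-governance contract)))
         (unmet (violations (abc-contract-preconditions contract) :hard state))
         (hard (violations rules :hard state))
         (soft (violations rules :soft state)))
    (list :unmet-preconditions unmet
          :hard-violations hard
          :soft-violations soft
          ;; Soft violations are reported but do not trigger recovery.
          :action (if (or unmet hard)
                      (abc-contract-recovery contract)
                      :continue))))
#+end_src

Reporting soft violations separately from hard ones mirrors the distinction in
the table above: a late delivery is surfaced, while a changed payment amount
triggers recovery.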
|
||||||
|
|
||||||
|
See also:
|
||||||
|
- Bhardwaj, V.P. (2026). Agent Behavioral Contracts: Formal Specification and
|
||||||
|
Runtime Enforcement for Reliable Autonomous AI Agents. arXiv:2602.22302.
|
||||||
|
|
||||||
|
** The Merkle DAG IS the Key Event Log
|
||||||
|
|
||||||
|
Agora's KEL specification (Section 02) describes an append-only log of key
|
||||||
|
events — inception, rotation, revocation, follow events. Passepartout's Merkle
|
||||||
|
DAG (Phase 5, built on v0.2.0 memory-object infrastructure) is this log. Each
|
||||||
|
key event is a fact in the =:key-lifecycle= domain. Each event has a
|
||||||
|
=:parent-id= chaining to the previous event. The DAG is content-addressed —
|
||||||
|
every event is a CID. The full KEL is queryable: =/audit did:agora:heather=
|
||||||
|
renders every key event, every follow event, every contract signature, with
|
||||||
|
provenance chains.
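
Walking the =:parent-id= chain can be sketched as follows. The store is
assumed to be an =equal= hash table from CID string to event plist, and
=kel-chain= is a hypothetical name; the real =*memory-store*= layout may
differ.

#+begin_src lisp
(defun kel-chain (store head-cid)
  "Follow :parent-id links from HEAD-CID back to the inception event.
Returns the events oldest first."
  (loop with chain = '()
        for cid = head-cid then (getf event :parent-id)
        for event = (and cid (gethash cid store))
        while event
        do (push event chain)
        finally (return chain)))
#+end_src

Because each event names its parent by CID, the log needs no separate index:
the latest event is enough to recover the whole history.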
|
||||||
|
|
||||||
|
* Relation to the Neurosymbolic Roadmap
|
||||||
|
|
||||||
|
The Agora integration is not a new phase. It is a consequence of decisions
|
||||||
|
already made:
|
||||||
|
|
||||||
|
| Roadmap item            | Agora consequence                                                  |
|-------------------------+--------------------------------------------------------------------|
| Phase 0b (key registry) | Key registry uses Agora DIDs. DID store is =:key-lifecycle= domain |
| Phase 1 (fact store)    | Fact store is also Note store. Same API, same hash table           |
| Phase 1a (self-pres.)   | Incoming Notes tracked. Spam DIDs quarantined. Disk eviction       |
| Phase 3 (archivist)     | Archivist walks =~/memex/social/notes/= alongside local dirs       |
| Phase 4 (sufficiency)   | Agora Notes are a provenance category in the sufficiency score     |
| Phase 5 (Merkle DAG)    | DAG = KEL. DAG = contract audit trail                              |
| Phase 0b (signal chain) | Signal chain = contract lifecycle chain. Same Merkle linking       |
|
||||||
|
|
||||||
|
No new lines in the roadmap. The Note publishing function (~40 lines) is a
|
||||||
|
utility, not a phase.
|
||||||
|
|
||||||
|
* What Is NOT Built
|
||||||
|
|
||||||
|
1. *A separate Note parser.* Agora Notes ARE Org files. The existing Org parser
|
||||||
|
reads both.
|
||||||
|
2. *A separate Note store.* The =memory-object= struct stores both. The
|
||||||
|
=*memory-store*= hash table holds both.
|
||||||
|
3. *A separate extraction path for Agora content.* The archivist extracts facts
|
||||||
|
from prose regardless of origin. The provenance tag distinguishes source.
|
||||||
|
4. *A new authentication mechanism for Agora signals.* Gate vector 0 verifies
|
||||||
|
DID signatures. The key registry is the DID registry.
|
||||||
|
|
||||||
|
See also:
|
||||||
|
- =projects/agora/docs/= — Agora requirements (overview, identity, primitive, social, contracts, governance)
|
||||||
|
- =projects/agora/TODO.org= — Passepartout integration track
|
||||||
|
- =passepartout-neurosymbolic-design-decisions-and-options.org= — the full design rationale
|
||||||
|
- =passepartout-neurosymbolic-roadmap.org= — the phased implementation plan
|
||||||
@@ -442,6 +442,371 @@ design. The gate stack provides the seed. Gate outcomes, prose extraction,
|
|||||||
deduction, and human authoring grow the shoots. Screamer prunes contradictions.
The ontology is a garden, not a building.
|
||||||
|
|
||||||
|
* Empirical Validation — Modular Ontology Engineering with LLMs
|
||||||
|
|
||||||
|
Shimizu and Hitzler (2025, /Journal of Web Semantics/) argue that LLMs can
|
||||||
|
significantly accelerate knowledge graph and ontology engineering — modeling,
|
||||||
|
extension, population, alignment, and entity disambiguation — but /only/ if
|
||||||
|
ontologies are modular. Their paper provides empirical evidence that validates
|
||||||
|
the modular architecture described in this document and exposes concrete patterns
|
||||||
|
the archivist should adopt.
|
||||||
|
|
||||||
|
** The central finding: modularity is the key variable
|
||||||
|
|
||||||
|
In a complex ontology alignment task (mapping between two oceanography ontologies
|
||||||
|
with hundreds of classes and properties), an LLM without module information
|
||||||
|
detected correct mappings for 5 of 109 alignment rules — effectively useless. When
|
||||||
|
the same LLM was given the module structure of the target ontology (20 named
|
||||||
|
conceptual modules such as "Organization," "Cruise," "Physical Sample"), it
|
||||||
|
detected correct mappings for 104 of 109 rules — 95% accuracy. The variable was
|
||||||
|
modularity.
|
||||||
|
|
||||||
|
For ontology population (extracting triples from text), their best results came
|
||||||
|
from prompts that included a schematic representation of a /single module/ plus
|
||||||
|
one extraction example. Against ground truth, this achieved approximately 90%
|
||||||
|
extraction accuracy. Without module-scoped prompting, quality degraded
|
||||||
|
substantially.
|
||||||
|
|
||||||
|
The mechanism: conceptual modules scope the LLM's attention to something
|
||||||
|
human-sized. The paper's central claim — "by somehow limiting the scope, we
|
||||||
|
achieve a more human-like approach — and one more capable of being expressed
|
||||||
|
succinctly in language, and thus more appropriate for LLM-based assistance" — is
|
||||||
|
an independent discovery of the same principle underlying Passepartout's
|
||||||
|
domain-scoped Screamer checks and per-domain cardinality policies.
|
||||||
|
|
||||||
|
** MOMo: a mature modular ontology methodology
|
||||||
|
|
||||||
|
The authors' approach, MOMo (Modular Ontology Modeling), has been developed over a
|
||||||
|
decade and includes:
|
||||||
|
|
||||||
|
- A /step-by-step methodology/ that breaks ontology design into clearly delineated
|
||||||
|
pieces, each "easier to automate than going one-shot from base data to an
|
||||||
|
ontology."
|
||||||
|
- A /pattern description language/ (OPLa, expressed in OWL) for annotating modules
|
||||||
|
so they can be identified programmatically.
|
||||||
|
- A /design library/ (MODL) containing hundreds of commonsense micropatterns
|
||||||
|
organized for programmatic access, including via RAG.
|
||||||
|
- A /Protégé plugin/ (CoModIDE) for graphical modular ontology development.
|
||||||
|
|
||||||
|
Critically, their modules are not formal sub-ontologies with logical boundaries.
|
||||||
|
They are /conceptual/ partitions — groupings of classes, properties, and axioms
|
||||||
|
around "key notions" identified by domain experts. Modules can overlap and nest.
|
||||||
|
There are "no precise rules" for what belongs in a module. The modules provide
|
||||||
|
"conceptual bridges between human expert conceptualization and data reality."
|
||||||
|
|
||||||
|
** What Passepartout should adopt
|
||||||
|
|
||||||
|
*** The modular prompt pattern for the archivist
|
||||||
|
|
||||||
|
The extraction prompt structure that achieved 90% accuracy is concrete and
|
||||||
|
replicable: a schematic representation of a domain module plus a single extraction
|
||||||
|
example. The archivist should use this pattern when extracting facts from prose.
|
||||||
|
Instead of a generic "extract triples from this text" prompt (200 tokens), the
|
||||||
|
prompt should reference the relevant module(s) and include an example triple for
|
||||||
|
each relation in that module. The module provides /context/; the example provides
|
||||||
|
/format/. Both improve LLM extraction quality without increasing Screamer's
|
||||||
|
verification burden.
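
A sketch of how a module-scoped prompt might be assembled. The module
representation (a plist with =:name=, =:classes=, and =:relations=, each
relation carrying one =:example= triple) and the function name =module-prompt=
are assumptions for illustration; the wording of the prompt is not taken from
the paper.

#+begin_src lisp
(defun module-prompt (module text)
  "Render MODULE as a small schema plus one example per relation,
followed by the prose TEXT to extract from."
  (with-output-to-string (out)
    (format out "Module: ~A~%" (getf module :name))
    (format out "Classes: ~{~A~^, ~}~%" (getf module :classes))
    (format out "Relations:~%")
    (dolist (relation (getf module :relations))
      ;; One line of schema (domain -> relation -> range) and one example.
      (format out "  ~A -> ~A -> ~A~%    example: ~S~%"
              (getf relation :domain)
              (getf relation :name)
              (getf relation :range)
              (getf relation :example)))
    (format out "Extract triples that fit this module from the text below.~%~%")
    (format out "~A~%" text)))
#+end_src

Keeping the module as data means the same function serves every domain; only
the plist passed in changes.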
|
||||||
|
|
||||||
|
*** MOMo modules as ontology scaffold
|
||||||
|
|
||||||
|
The Passepartout notes describe an organic growth model: gate-bootstrapped facts
|
||||||
|
seed the ontology; gate outcomes, Screamer deductions, and archivist proposals
|
||||||
|
grow the shoots. This is correct for the /security and filesystem/ domains where
|
||||||
|
the gate stack already encodes expertise. For the broader memex (literature,
daily reflection, project planning), the 50-70 gate-bootstrapped entity classes
are far too few.
|
||||||
|
|
||||||
|
MOMo's micropattern library provides a ready-made scaffold for these domains.
|
||||||
|
Hundreds of commonsense patterns already exist for temporal relations, spatial
|
||||||
|
relations, agent-action, organizational structure, provenance, and event
|
||||||
|
participation. Loading these as initial modules, with =:policy :plural= and
=:provenance :external-ontology=, would give the symbolic index a structured
|
||||||
|
vocabulary for domains where the gate stack has nothing to offer. The organic
|
||||||
|
growth model then /extends and refines/ these modules rather than inventing them
|
||||||
|
from scratch. This is the Wikidata strategy applied at the schema level: adopt
|
||||||
|
existing structured knowledge, connect personal facts to it, and surface
|
||||||
|
disagreements rather than resolve them.
|
||||||
|
|
||||||
|
*** OPLa annotation for module identification
|
||||||
|
|
||||||
|
MOMo modules annotated in OPLa can "easily be identified programmatically." If
|
||||||
|
Passepartout annotates its ontology modules in a compatible format (even a
|
||||||
|
simplified plist-based equivalent), the archivist can automatically select the
|
||||||
|
right module(s) when extracting facts from prose. A heading in =literature/=
|
||||||
|
triggers the literature module; a heading in =projects/= triggers the software
|
||||||
|
engineering module; a heading tagged =:personal:= triggers the diary module. The
|
||||||
|
module scopes the prompt. The prompt improves extraction. Screamer gates the
|
||||||
|
result. This is the full pipeline, validated at each step.
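
The selection rule in this paragraph can be sketched as a lookup on path prefix
and Org tags. =select-modules= and =prefix-p= are hypothetical names, and the
module keywords are placeholders. The notes do not say which rule wins when
several match, so this sketch returns every matching module.

#+begin_src lisp
(defun prefix-p (prefix string)
  "True when STRING starts with PREFIX."
  (let ((n (length prefix)))
    (and (<= n (length string))
         (string= prefix string :end2 n))))

(defun select-modules (relative-path tags)
  "Return the list of module keywords that apply to a heading located at
RELATIVE-PATH (relative to the memex root) and carrying Org TAGS."
  (remove nil
          (list (when (prefix-p "literature/" relative-path) :literature)
                (when (prefix-p "projects/" relative-path) :software-engineering)
                (when (member "personal" tags :test #'string=) :diary))))
#+end_src

An empty result would mean no module applies, in which case the archivist would
presumably fall back to a generic prompt or skip extraction.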
|
||||||
|
|
||||||
|
** What this means for the Passepartout architecture
|
||||||
|
|
||||||
|
The paper validates three design decisions already made:
|
||||||
|
|
||||||
|
1. /Modularity is non-negotiable./ The paper found that modularity is the
|
||||||
|
difference between 5% and 95% accuracy on alignment. Passepartout's per-domain
|
||||||
|
cardinality policies and domain-scoped Screamer checks are the same insight
|
||||||
|
implemented in a different context. The paper offers empirical evidence that the
approach works; Passepartout applies it to verification rather than extraction.
|
||||||
|
|
||||||
|
2. /The extraction pipeline is feasible./ 90% population accuracy with module-
|
||||||
|
scoped prompts means the archivist /can/ extract useful facts from prose. The
|
||||||
|
remaining 10% or so of errors is what the Screamer gate is there to catch. The paper
|
||||||
|
validates the LLM-as-proposer role; Passepartout adds the Screamer-as-verifier
|
||||||
|
role.
|
||||||
|
|
||||||
|
3. /KGs are positioned as anti-hallucination infrastructure./ The paper explicitly
|
||||||
|
frames knowledge graphs as "ground truth to escape from LLM hallucinations" and
|
||||||
|
as "components of other neurosymbolic approaches." This is the Passepartout
|
||||||
|
thesis — the symbolic index as ground truth against which LLM proposals are
|
||||||
|
checked — stated in the academic literature by the editors of the neurosymbolic
|
||||||
|
AI handbooks.
|
||||||
|
|
||||||
|
And it exposes one gap in the current design:
|
||||||
|
|
||||||
|
1. /Emergent modularity may be slower than designed modularity./ Passepartout's
|
||||||
|
modules are supposed to emerge organically from gate patterns, Screamer
|
||||||
|
generalizations, and cross-domain overlap detection. MOMo's modules are
|
||||||
|
designed by domain experts who identify key notions upfront. The emergent
|
||||||
|
approach is philosophically cleaner — the system learns its own categories —
|
||||||
|
but practically slower. The paper's results suggest that adopting designed
|
||||||
|
modules as a scaffold, and letting emergent growth /refine/ rather than
/invent/ them, could substantially shorten the path to sufficiency.
|
||||||
|
|
||||||
|
** Relation to Wikidata loading
|
||||||
|
|
||||||
|
The MOMo micropattern approach and the Wikidata loading strategy are complementary:
|
||||||
|
|
||||||
|
| Layer | MOMo provides | Wikidata provides |
|
||||||
|
|----------------+--------------------------------+--------------------------|
|
||||||
|
| Schema | Modular ontology of relations | — (Wikidata's schema is |
|
||||||
|
| | and entity classes | implicit in its data) |
|
||||||
|
| Instances | — (patterns, not entities) | 100M+ entities with |
|
||||||
|
| | | property-value pairs |
|
||||||
|
|
||||||
|
MOMo gives Passepartout the /relations/ (wrote, lectured-on, influenced,
|
||||||
|
published-in). Wikidata gives Passepartout the /instances/ (Nabokov, Pale Fire,
|
||||||
|
Kafka). Both are needed. Neither alone is sufficient. The MOMo scaffold tells the
|
||||||
|
archivist /what kinds of facts to look for/. The Wikidata graph tells the
|
||||||
|
archivist /which entities those facts are about/. Together they transform the
|
||||||
|
extraction task from "discover entities and their relations from prose" to
"connect this prose heading to known entities using known relations", which is a
much simpler prompt and, given the module-scoped results above, plausibly a more
accurate one.
|
||||||
|
|
||||||
|
** Reference
|
||||||
|
|
||||||
|
- Shimizu, C., & Hitzler, P. (2025). Accelerating knowledge graph and ontology
|
||||||
|
engineering with large language models. /Journal of Web Semantics, 85/,
|
||||||
|
100862. https://doi.org/10.1016/j.websem.2025.100862
|
||||||
|
|
||||||
|
** See also
|
||||||
|
|
||||||
|
- =passepartout-neurosymbolic-roadmap.org=: Phase 3 (Archivist) — the modular
|
||||||
|
prompt pattern should be incorporated into the extraction pipeline.
|
||||||
|
- =passepartout-agora.org=: the KEL / contract audit trail as instances of
|
||||||
|
MOMo-style key-lifecycle and contract-lifecycle modules.
|
||||||
|
- =notes/passepartout-SWOT.org=: the SWOT analysis which identifies the ontology
|
||||||
|
problem as the key bottleneck — MOMo partially addresses this.
|
||||||
|
|
||||||
|
** Supporting References
|
||||||
|
|
||||||
|
*** MOMo: the canonical methodology
|
||||||
|
|
||||||
|
Shimizu, Hammar & Hitzler (2023, /Semantic Web Journal/) present the full MOMo
|
||||||
|
methodology — 31 pages covering the step-by-step design process, schema diagrams
|
||||||
|
as knowledge elicitation tools, ODP libraries, OPLa annotation language, and
|
||||||
|
CoModIDE, a Protégé plugin for graphical modular ontology development. The paper
|
||||||
|
was evaluated with usability studies and demonstrates that modular development
|
||||||
|
significantly improves approachability for domain experts who are not ontology
|
||||||
|
engineers.
|
||||||
|
|
||||||
|
Key architectural commitments from MOMo that Passepartout should adopt:
|
||||||
|
|
||||||
|
- /Schema diagrams/ as the primary communication format between ontologist and
|
||||||
|
domain expert. Passepartout's equivalent: the archivist's module-scoped prompt
|
||||||
|
includes a simplified schema diagram of the module being populated.
|
||||||
|
- /Template-based instantiation/ of ontology design patterns into concrete
|
||||||
|
modules. Passepartout's equivalent: micropatterns loaded from MODL are
|
||||||
|
instantiated with entities from the user's memex, producing concrete facts.
|
||||||
|
- /Systematic axiomatization/ — 17 frequently used axiom patterns for each
|
||||||
|
node-edge-node construction in a schema diagram. Passepartout's equivalent:
|
||||||
|
Screamer constraint rules derived from module structure.
|
||||||
|
|
||||||
|
Reference:
|
||||||
|
- Shimizu, C., Hammar, K., & Hitzler, P. (2023). Modular ontology modeling.
|
||||||
|
/Semantic Web, 14/(3), 459–489. https://doi.org/10.3233/SW-222886
|
||||||
|
|
||||||
|
*** Ontology Population — the empirical methodology
|
||||||
|
|
||||||
|
Norouzi et al. (2024) provide the full experimental methodology behind the ~90%
|
||||||
|
extraction accuracy claim. Using the Enslaved.org Hub Ontology as ground truth
|
||||||
|
and Wikipedia articles as source text, they tested five LLMs across a three-stage
|
||||||
|
pipeline: preprocessing, text retrieval, and KG population. The critical finding:
|
||||||
|
prompts that included a /schema diagram/ of the target ontology module (using
|
||||||
|
MOMo's visual conventions with colored boxes for classes, arrows for relations)
|
||||||
|
plus a single extraction example achieved the highest accuracy. Without
|
||||||
|
module-scoped prompts, quality degraded substantially.
|
||||||
|
|
||||||
|
Three findings are directly applicable to the archivist:
|
||||||
|
|
||||||
|
1. /Role chain simplification./ The Enslaved Ontology has complex role chains
|
||||||
|
(e.g., Person → hasRole → Role → inEvent → Event). These were collapsed into
|
||||||
|
shortcut relations (e.g., Person → participatedIn → Event) for LLM extraction.
|
||||||
|
The archivist should maintain two layers: the /logical/ schema with full role
|
||||||
|
chains for Screamer verification, and the /extraction/ schema with simplified
|
||||||
|
relations for LLM prompting.
|
||||||
|
|
||||||
|
2. /Variance across models./ Five LLMs were tested. Performance varied
|
||||||
|
significantly. The archivist should benchmark extraction accuracy per provider
|
||||||
|
and per module, and route extraction tasks to the best-performing model for
|
||||||
|
each module — extending the existing model-tier routing (v0.3.0) from
|
||||||
|
complexity-based to accuracy-based routing.
|
||||||
|
|
||||||
|
3. /Cross-source validation./ The paper used both Wikipedia text and Wikidata
|
||||||
|
as overlapping sources for the same entities, enabling cross-verification.
|
||||||
|
The archivist can do the same: extract facts from the user's prose, extract
|
||||||
|
facts from Wikidata for the same entities, and present disagreements with
|
||||||
|
provenance. This is the =:plural= cardinality policy applied at extraction time.
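
The two-layer idea in finding 1 can be sketched as an expansion step between
extraction and verification: the LLM proposes a shortcut triple, and the
archivist expands it into the full role chain before Screamer sees it. The
relation keywords and =expand-shortcut= are illustrative names modelled on the
Person, Role, Event example above, not an existing schema.

#+begin_src lisp
(defun expand-shortcut (triple)
  "Expand an extraction-layer TRIPLE into logical-layer triples.
(person :participated-in event) becomes
(person :has-role role) and (role :in-event event),
where ROLE is a freshly generated node. Other triples pass through."
  (destructuring-bind (subject relation object) triple
    (if (eq relation :participated-in)
        (let ((role (gensym "ROLE-")))
          (list (list subject :has-role role)
                (list role :in-event object)))
        (list triple))))
#+end_src

The fresh role node is what links the two logical triples; the shortcut form
never has to mention it, which is what keeps the extraction prompt simple.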
|
||||||
|
|
||||||
|
Reference:
|
||||||
|
- Norouzi, S.S., Barua, A., Christou, A., Gautam, N., Eells, A., Hitzler, P.,
|
||||||
|
& Shimizu, C. (2024). Ontology Population using LLMs. arXiv:2411.01612.
|
||||||
|
|
||||||
|
* Historical Lineage — McCarthy's Advice Taker
|
||||||
|
|
||||||
|
McCarthy's "Programs with Common Sense" (1959) is the direct intellectual ancestor
|
||||||
|
of the Passepartout architecture. The paper proposed an "advice taker" — a program
|
||||||
|
that "will draw immediate conclusions from a list of premises" expressed in
|
||||||
|
"a suitable formal language (most likely a part of the predicate calculus)." The
|
||||||
|
program would:
|
||||||
|
|
||||||
|
1. Accept declarative statements about the world as input.
|
||||||
|
2. Store them as logical formulas.
|
||||||
|
3. Reason from them to produce new conclusions.
|
||||||
|
4. Accept new facts and revise its conclusions.
|
||||||
|
|
||||||
|
This is precisely the Passepartout pipeline: the archivist extracts declarative
|
||||||
|
facts from prose → Screamer checks them for consistency → VivaceGraph stores them
|
||||||
|
→ the planner reasons from them → new facts from gate outcomes and deductions
|
||||||
|
revise the store. McCarthy proposed it in 1959. Passepartout is building it in
|
||||||
|
2026.
|
||||||
|
|
||||||
|
The gap between McCarthy's proposal and Passepartout's implementation is the
|
||||||
|
/hallucination problem/. McCarthy assumed facts would be entered by a human
|
||||||
|
programmer in formal logic. Passepartout's facts are extracted from natural
|
||||||
|
language prose by an LLM — a probabilistic process that requires deterministic
|
||||||
|
verification. Screamer is the component McCarthy didn't need: a constraint solver
|
||||||
|
that gates LLM-proposed facts against the existing fact store.
|
||||||
|
|
||||||
|
The connection is not metaphorical. McCarthy cited Principia Mathematica as an
|
||||||
|
influence on Lisp. Passepartout's Whitehead note traces the same PM → Lisp
|
||||||
|
lineage. The advice taker → Passepartout lineage completes the arc: PM's formal
|
||||||
|
logic → Lisp → McCarthy's advice taker → Passepartout's neurosymbolic engine.
|
||||||
|
|
||||||
|
Reference:
|
||||||
|
- McCarthy, J. (1959). Programs with Common Sense. /Proceedings of the
|
||||||
|
Teddington Conference on the Mechanization of Thought Processes./
|
||||||
|
|
||||||
|
* Philosophical Validation — The Neurosymbolic Consensus

Three papers from the neurosymbolic AI research community validate the
architectural thesis from complementary angles.

** Marcus (2020): The Case Against Pure Deep Learning

Gary Marcus's "The Next Decade in AI" argues that deep learning alone is "data
hungry, shallow, brittle, and limited in its ability to generalize." The paper
demonstrates GPT-2 failing at basic commonsense reasoning:

- "Yesterday I dropped my clothes off at the dry cleaners and have yet to pick
  them up. Where are my clothes?" → GPT-2: "at my mom's house."
- "There are six frogs on a log. Two leave, but three join. The number of frogs
  on the log is now" → GPT-2: "seventeen."

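
The frog prompt shows why a deterministic check is cheap once the structure has
been extracted. The sketch below assumes the hard part (turning the sentence into
an initial count and a list of signed changes) has already been done; the
function names =count_after= and =check_answer= are hypothetical and exist only
for this illustration.

#+begin_src python
# Hypothetical sketch: check a model's answer against a deterministic count.


def count_after(initial, events):
    """Apply signed changes (+n join, -n leave) to an initial count."""
    total = initial
    for delta in events:
        total += delta
        if total < 0:
            raise ValueError("count cannot go negative")
    return total


def check_answer(initial, events, proposed):
    """Return (is_correct, expected) for a proposed final count."""
    expected = count_after(initial, events)
    return proposed == expected, expected


# Six frogs, two leave, three join: 6 - 2 + 3 = 7.
print(check_answer(6, [-2, +3], 17))  # (False, 7)
#+end_src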
Marcus proposes four steps toward robust AI: hybrid architecture (combining
neural and symbolic), large-scale knowledge (abstract and causal, not just
statistical), reasoning (formal inference over structured representations), and
cognitive models (frameworks for how entities relate). Passepartout implements all
four: the perceive-reason-act pipeline is hybrid, the symbolic index is causal
knowledge, Screamer + ACL2 provide reasoning, and the gate-bootstrapped ontology
plus MOMo modules provide cognitive models.

Marcus's core claim, "we have no hope of achieving robust intelligence without
first developing systems with deep understanding," is the justification for
Passepartout's entire neurosymbolic investment. The alternative is a system that
works "on a good day" and fails unpredictably. The deterministic gate stack and
Screamer admission gate are the engineering realization of Marcus's call for
robustness.

Reference:
- Marcus, G. (2020). The Next Decade in AI: Four Steps Towards Robust
  Artificial Intelligence. arXiv:2002.06177.

** Gaur & Sheth (2023): CREST — Trustworthy Neurosymbolic AI

Gaur and Sheth present the CREST framework: Consistency, Reliability, user-level
Explainability, and Safety build Trust, and they argue these require
neurosymbolic methods. Their empirical finding: GPT-3.5 breached safety
constraints 30% of the time when asked identical questions repeatedly. Claude's
16 safety rules and Sparrow's 23 rules provide no /inherent/ safety; they are
heuristic guardrails that can be breached through prompt variation.

These findings validate three Passepartout design commitments:

1. /Prompt-level safety is insufficient./ Claude and Sparrow use rules that
   consume LLM tokens and can be evaded. Passepartout's deterministic gates run
   in pure Lisp, cost 0 tokens, and cannot be evaded by prompt engineering.

2. /Inconsistency is the norm, not the exception./ Gaur & Sheth show that even
   identical queries produce inconsistent responses ~30% of the time. This
   validates the cardinality model: a system that expects contradiction and
   surfaces it with provenance is architecturally more honest than one that
   assumes consistency and silently resolves it.

3. /Knowledge infusion is required for trust./ The CREST framework embeds
   domain knowledge (clinical guidelines, procedural knowledge) into LLM
   pipelines. Passepartout's symbolic index /is/ the knowledge infusion layer:
   facts extracted from prose, verified by Screamer, and available for any LLM
   call through the context assembly pipeline.

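
One way to read the second commitment (expect contradiction, surface it with
provenance) is sketched below. This is an illustration in Python, not
Passepartout's cardinality model itself: the =Claim= and =ClaimStore= names, the
key layout, and the example sources are assumptions. The point it shows is that
conflicting values are all retained along with where they came from, and the
conflict is reported rather than silently resolved.

#+begin_src python
# Hypothetical sketch of surfacing contradictions with provenance.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    value: str
    source: str  # provenance: where the claim came from


class ClaimStore:
    def __init__(self):
        # (subject, predicate) -> list of claims, in insertion order
        self._claims = defaultdict(list)

    def assert_claim(self, subject, predicate, value, source):
        """Record a claim; identical (value, source) pairs are kept once."""
        claim = Claim(value, source)
        if claim not in self._claims[(subject, predicate)]:
            self._claims[(subject, predicate)].append(claim)

    def contradictions(self):
        """Yield keys whose claims disagree, with every claim kept."""
        for key, claims in self._claims.items():
            if len({c.value for c in claims}) > 1:
                yield key, list(claims)


store = ClaimStore()
store.assert_claim("service-x", "listens-on", "8080", "notes/deploy.org")
store.assert_claim("service-x", "listens-on", "9090", "llm-extraction")
# Both values survive, each tagged with its source, and the clash is reported.
for key, claims in store.contradictions():
    print(key, [(c.value, c.source) for c in claims])
#+end_src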
Reference:
- Gaur, M., & Sheth, A. (2023). Building Trustworthy NeuroSymbolic AI Systems:
  Consistency, Reliability, Explainability, and Safety. arXiv:2312.06798.

** Gaur et al. (2022): Knowledge-Infused Learning

Gaur, Gunaratna, Bhatt, and Sheth define Knowledge-infused Learning (KiL) as
"combining various types of explicit knowledge with data-driven deep learning
techniques." They identify three infusion levels (shallow, semi-deep, deep) and
position KiL as "a sweet spot in neuro-symbolic AI."

The paper makes two observations relevant to Passepartout:

1. /Data alone is not enough./ The opening cites Pedro Domingos ("Data Alone is
   Not Enough"), Andrew Ng ("the importance of Big Data is overhyped"), and
   Gary Marcus ("AI that captures how humans think"). These are the intellectual
   warrant for the symbolic index: a knowledge layer that is independent of any
   specific LLM call, accumulated across sessions, and verified against existing
   facts.

2. /Expert knowledge is external to the model./ Domain experts use "their past
   experience, web or domain-specific knowledge sources, and annotation
   guidelines" to create ground truth, resources the LLM cannot access during
   training. The symbolic index makes these resources queryable: facts from the
   gate stack (security expertise), from the human (declarative authoring), from
   Wikidata (world knowledge), and from Screamer deductions (derived expertise).

Passepartout's architecture is a specific implementation of KiL at the deepest
infusion level: knowledge is not appended to prompts (shallow) or embedded in
fine-tuning (semi-deep). It is a first-class data structure, the symbolic index,
that the LLM queries through the archivist and the planner. The knowledge is
living: it accumulates, is verified, carries provenance, and evolves through
ontology versioning.

Reference:
- Gaur, M., Gunaratna, K., Bhatt, S., & Sheth, A. (2022). Knowledge-Infused
  Learning: A Sweet Spot in Neuro-Symbolic AI. /IEEE Internet Computing, 26/(4),
  5–11. https://doi.org/10.1109/MIC.2022.3179759

* Semantic Wikipedia as Entity Backbone
The gate stack provides 50-70 entity classes — adequate for a coding agent where
@@ -1412,3 +1777,19 @@ See also:
- =passepartout/docs/DESIGN_DECISIONS.org= — the existing design decisions
- =passepartout/docs/ARCHITECTURE.org= — the current pipeline architecture
- =passepartout/docs/ROADMAP.org= — the feature roadmap through v0.13.0
- =notes/passepartout-SWOT.org= — SWOT analysis of the neurosymbolic architecture
- =passepartout-agora.org= — Passepartout-Agora integration design
- Shimizu, C. & Hitzler, P. (2025). Accelerating knowledge graph and ontology
  engineering with large language models. /Journal of Web Semantics, 85/, 100862.
  https://doi.org/10.1016/j.websem.2025.100862
- Shimizu, C., Hammar, K., & Hitzler, P. (2023). Modular ontology modeling.
  /Semantic Web, 14/(3), 459–489. https://doi.org/10.3233/SW-222886
- Norouzi, S.S. et al. (2024). Ontology Population using LLMs. arXiv:2411.01612.
- McCarthy, J. (1959). Programs with Common Sense. /Proc. Teddington Conf. on
  the Mechanization of Thought Processes./
- Marcus, G. (2020). The Next Decade in AI. arXiv:2002.06177.
- Gaur, M. & Sheth, A. (2023). Building Trustworthy NeuroSymbolic AI Systems.
  arXiv:2312.06798.
- Gaur, M., Gunaratna, K., Bhatt, S., & Sheth, A. (2022). Knowledge-Infused
  Learning. /IEEE Internet Computing, 26/(4), 5–11.
- Bhardwaj, V.P. (2026). Agent Behavioral Contracts. arXiv:2602.22302.
Submodule projects/passepartout updated: 24a24b481b...6422a84872
Submodule projects/passepartout-contrib updated: ce17336acd...825ef832ba