Compare commits
4 Commits
4e9431ec1d...0290feccc1
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0290feccc1 | | |
| | f6094abb7b | | |
| | e719443ce7 | | |
| | 04944a62e2 | | |
81
AGENTS.md
@@ -1,77 +1,12 @@
 # AGENTS.md
 
-## Development Cycle (every change)
-
-1. **Think in org** — write your reasoning, goals, and approach in the .org file first
-2. **Write contract** — define a `** Contract` section listing each function's behavior:
-`(fn-name args)`: description. Returns/guarantees ...
-3. **TDD from contract** — each contract item becomes a `fiveam:test` in `* Test Suite`
-a. Write the test first → tangle → run → prove it FAILS (RED)
-b. Write the implementation → tangle → run → prove it PASSES (GREEN)
-c. Record both failure and success output
-4. **Reflect in org** — once tests pass, ensure the implementation is in the .org source
-5. **Update literate prose** — write/update the explanatory text around the code:
-what it does, why it exists, how it connects to the rest of the system
-6. **Commit** — only when asked. Ask first.
-
-## Commands
-
-Tangle a single file:
-emacs --batch --eval "(progn (require 'org) (find-file \"org/FILE.org\") (org-babel-tangle) (kill-buffer))"
-
-Validate structural integrity:
-emacs --batch -Q --eval '(progn (find-file "org/FILE.org") (check-parens) (kill-buffer))'
-
-Run tests:
-sbcl --noinform \
---eval '(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))' \
---eval '(ql:quickload :passepartout :silent t)' \
---eval '(load "lisp/FILE.lisp")' \
---eval '(fiveam:run (intern "SUITE-NAME" :passepartout-TESTS))' --quit
-
-For error details: bind fiveam:*on-failure* to :debug
-
-## REPL (port 9105) — preferred when available
-
-Start: `passepartout daemon`
-Send code:
-msg = '(:type :event :payload (:sensor :repl-eval :code "(+ 1 2)"))'
-s.sendall(f'{len(msg):06x}'.encode() + msg.encode())
-
-When REPL is up: TDD in-image first, then reflect to .org and tangle.
-When REPL is down: fall back to the SBCL cycle above.
-
-## Rules
-
-- .org is source of truth; .lisp is generated — never edit .lisp directly
-- Every code change starts with a contract and a failing test
-- Prove RED before writing implementation
-- Validate before committing
-- If a tool fails, explain why and ask before trying alternatives
-- Before shipping a version, run the `** File Update Checklist` in `docs/ROADMAP.org`
-- **YOU MAY NOT** push a version tag (e.g., `v0.5.0`), create a GitHub release, or run `git push`
-that triggers CI/CD version workflows without explicit permission. Ask first.
-
-## Core Boundary (HARD RULE)
-
-- **YOU MAY NOT add files to `passepartout.asd` `:components` without asking for permission.**
-ASDF `:components` is the core harness. Files there load on every daemon boot,
-cannot be hot-reloaded, and a bug there kills the agent's brainstem.
-
-- When you want to add a new module, **ask first**. Provide:
-1. Why it cannot be a skill (the self-repair criterion — can the agent fix it
-if corrupted without human help?) Demonstrate specifically how a broken
-version of this file prevents the agent from perceiving, reasoning,
-or acting — not just degrading performance or losing a feature.
-2. What it depends on and what depends on it
-3. Why it cannot use `fboundp` guards from core
-
-- **Default: everything is a skill.** Skills load via `skill-initialize-all`,
-are hot-reloadable, self-repairable, and a bug in a skill degrades the agent
-but doesn't kill it. The harness stays thin.
-
-- **The self-repair criterion**: a file belongs in core only if, when corrupted,
-the agent *cannot* fix it without human help. Corrupted core = dead brain,
-dead hands, or unreachable. Corrupted skill = degraded but self-repairable.
-This criterion is documented in `docs/ARCHITECTURE.org` and
-`docs/DESIGN_DECISIONS.org`.
+This is the memex monorepo. It contains multiple Common Lisp projects, each
+in `projects/`. See `projects/AGENTS.md` for the general development workflow
+(ROADMAP-driven, TDD in REPL, literate programming, branch policy).
+
+## Project list
+
+| Project | Description | Runtime |
+|---------|-------------|---------|
+| passepartout | Neurosymbolic agent | `passepartout daemon` |
+| cl-tui | Reusable terminal UI framework | `sbcl` + `(ql:quickload :cl-tui)` |
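The REPL instructions removed from AGENTS.md frame each message with a six-digit hexadecimal length prefix. A minimal Python sketch of that framing follows. It assumes a plain TCP socket on localhost port 9105 and assumes the prefix counts encoded bytes (the removed snippet calls `len(msg)` on the string, which gives the same number only for ASCII). The helper names `frame` and `send_repl_eval` are hypothetical and do not appear in the repository.

```python
import socket


def frame(msg: str) -> bytes:
    """Prefix the UTF-8 payload with its byte length as six lowercase hex digits."""
    payload = msg.encode("utf-8")
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload too large for a six-hex-digit length prefix")
    return f"{len(payload):06x}".encode("ascii") + payload


def send_repl_eval(code: str, host: str = "127.0.0.1", port: int = 9105) -> None:
    """Send one :repl-eval event to a running daemon (host and port are assumptions)."""
    # Escape backslashes and double quotes so the code survives inside a Lisp string.
    escaped = code.replace("\\", "\\\\").replace('"', '\\"')
    msg = f'(:type :event :payload (:sensor :repl-eval :code "{escaped}"))'
    with socket.create_connection((host, port)) as s:
        s.sendall(frame(msg))
```

Calling `send_repl_eval("(+ 1 2)")` would only succeed while a daemon is listening; `frame` can be exercised on its own.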
868
notes/passepartout-SWOT.org
Normal file
@@ -0,0 +1,868 @@
#+TITLE: Passepartout Neurosymbolic + Agora Integration — SWOT Analysis
#+AUTHOR: Agent
#+FILETAGS: :notes:analysis:swot:passepartout:agora:neurosymbolic:
#+CREATED: [2026-05-09 Sat]

* Premise and Scope

This analysis assumes the engineering is possible — Screamer can be wrapped,
VivaceGraph can persist facts, ACL2 can verify structural properties, the
archivist can extract triples from prose with Screamer verification, and the
note-publishing bridge to Agora can be implemented. The question is not "can it
be built?" but "does the architecture cohere? What does it enable? What does it
miss?"

* Will It Work Conceptually?

The short answer: yes, within a specific domain. The long answer: the boundary of
that domain is the most important thing to get right.

** The architecture's core insight is correct and load-bearing

The central design decision — "the LLM proposes; the symbolic engine decides
whether to accept" — is sound. It is the inverse of every existing agent
architecture. Claude Code, OpenCode, Hermes — all of them put the LLM in the
driver's seat and add safety as an afterthought (prompt-based guardrails that
consume tokens and can be evaded). Passepartout inverts this: the LLM proposes
actions and facts, but a deterministic layer of gates, constraint solvers, and
formal verifiers decides what to admit and what to execute. This inversion is the
correct response to the hallucination problem. You cannot eliminate hallucination
by making the LLM better. You eliminate it by not asking the LLM to do things
that require certainty.

The bootstrap mechanism — extracting 50-70 entity classes mechanically from the
existing Dispatcher gate stack with zero new code — is genuinely elegant. It
proves the pattern at minimal cost: code becomes facts, facts enable reasoning.
Every new gate pattern adds to the ontology organically. This is the right way to
start a knowledge base: not by designing a schema upfront, but by formalizing what
the system already knows implicitly.

** The "one memex, two indices" architecture survives contact with reality

Option 4 (one memex with neural and symbolic indices over the same Org files) is
the correct long-term architecture. The prose is the ground truth — always. The
symbolic index is a derived view that can be thrown away and rebuilt. The neural
index handles semantic search, associative leaps, and fuzzy matching. This
division of labor is permanent, not transitional, because the domains they serve
are fundamentally different kinds of knowledge.

The practical path — starting with Option 5 (ephemeral facts, no persistence)
through Phases 1-4, then graduating to Option 4 with VivaceGraph persistence in
Phase 5 — is the right sequence. It punts the serialization format problem until
the fact language has been battle-tested. It keeps the cost of mistakes low. It
treats the ontology as something discovered through use rather than designed
upfront.

** Wikipedia's ontology WOULD give it a running start — with caveats

Wikidata contains approximately 100 million entities with a decade of human
curation: type hierarchies, relations, dates, citations, disambiguation. For a
personal memex that mentions Nabokov, /Pale Fire/, Kafka, postmodernism, and
butterfly migration, the gate stack's 50-70 entity classes amount to starvation.
Organic growth through prose extraction would take years to cover the entities in
one person's engagement with a single novel.

Loading Wikidata's entity graph into the symbolic index transforms the
archivist's job from "discover that Nabokov wrote /Pale Fire/" to "connect your
heading to Wikidata entity Q36591." The second task is reference resolution, not
knowledge extraction — simpler, more reliable, and in many cases doable without
an LLM at all (string match against loaded entities). The notes claim this
collapses the LLM's role to three thin boundaries: input translation,
prose-to-candidate-triple for personal content, and result-to-prose formatting.

The caveats are real:

- Entity resolution (matching prose mentions to Wikidata entities) is genuinely
  hard. "Nabokov" in a diary might refer to Vladimir Nabokov (Q36591), his son
  Dmitri (Q566744), or someone else entirely. Disambiguation requires context
  that the symbolic engine doesn't have without LLM assistance.
- Wikidata is biased toward English Wikipedia's coverage. A memex in Arabic,
  Farsi, or Amharic will find far fewer resolved entities. The "universal" in
  Wikidata is aspirational, not actual.
- Wikidata's property graph is not an ontology in the formal sense; it is a
  collaboratively edited dataset with contradictions, gaps, and editorial wars
  frozen in time. Loading it directly into a symbolic index that expects
  consistency (Screamer checks, cardinality policies) will surface thousands of
  contradictions on ingest, many of which are Wikidata artifacts, not meaningful
  tensions.
- N-hop expansion is unbounded. One hop from Nabokov hits hundreds of entities
  (his works, his family, his influences, his translators). Two hops hits
  thousands. Three hops hits tens of thousands. The notes say "3-4 hops" for a
  literary memex but don't estimate the entity count this implies. The claim that
  5 million entities = ~400MB is the best-case hash-table figure; a graph with
  query indices will be larger, and Prolog-like queries over millions of nodes
  are not free.

Still: even a partial Wikidata load with conservative hop limits would provide
more ontology than the system could accumulate through years of organic growth.
It is the right accelerator, and the architecture handles it correctly — Wikidata
facts are admitted with =:provenance :wikidata= and =:policy :plural=, meaning
they sit alongside personal facts without overriding them. Disagreements are
surfaced, not resolved. The architecture treats Wikidata as evidence from an
external source, not as ground truth. That's the correct posture.
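
A conservative hop limit combined with a hard entity cap could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the project's loader: the adjacency map is toy data, the relation name =linked-to= and the functions =expand= and =as_facts= are hypothetical, and only the =:provenance :wikidata= / =:policy :plural= tagging comes from the notes.

```python
from collections import deque


def expand(graph, seed, max_hops, max_entities):
    """Breadth-first expansion from `seed`, bounded by hop count and entity count."""
    seen = {seed: 0}                      # entity -> hop distance from the seed
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue                      # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                if len(seen) >= max_entities:
                    return seen           # hard cap keeps the load bounded
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


def as_facts(graph, admitted):
    """Emit edges between admitted entities, tagged as external plural evidence."""
    return [
        {"entity": a, "relation": "linked-to", "value": b,
         "provenance": "wikidata", "policy": "plural"}
        for a in admitted
        for b in graph.get(a, ())
        if b in admitted
    ]


# Toy adjacency map (hypothetical labels, not real Wikidata identifiers).
toy = {
    "nabokov": ["pale-fire", "lolita", "vera-nabokov"],
    "pale-fire": ["postmodernism"],
    "postmodernism": ["kafka"],
}
one_hop = expand(toy, "nabokov", max_hops=1, max_entities=100)
two_hop = expand(toy, "nabokov", max_hops=2, max_entities=100)
facts = as_facts(toy, one_hop)
```

The cap makes the worst-case load size a configuration value rather than a property of the graph, which is the part the "3-4 hops" figure leaves open.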

** Cardinality policies are the right abstraction for contradiction

The =:singular= / =:dual= / =:plural= cardinality model is one of the most
important ideas in these notes. Classical logic requires consistency — a
contradiction implies everything (ex contradictione quodlibet). A constraint
solver like Screamer also requires consistency — a contradictory constraint set
has no solutions. But a personal memex operates across domains where the meaning
of contradiction is fundamentally different:

- "rm -rf / is catastrophic" is =:singular= — there is one truth that evolves
  over time.
- "I loved this person AND I resented them" is =:dual= — the tension IS the
  fact.
- "Wikidata says Everest is 8848m; DBpedia says 8849m; my 2023 diary says
  8848m" is =:plural= — multiple sources disagree, and surfacing the disagreement
  with provenance is the product.

This is a genuinely novel contribution to knowledge representation. Most
knowledge graphs (Wikidata, Freebase, DBpedia) give contradiction no first-class
semantics: they either pick one value and discard the rest or, as Wikidata does,
store conflicting statements side by side without modeling the tension. Most
constraint solvers reject contradiction as error. Passepartout's cardinality
model makes contradiction a first-class citizen: you can query the fact that "I
used to believe X until Tuesday, then Y," or "these three sources disagree on
height," or "I hold these two positions in tension." The symbolic engine's job is
not to decide which is right. It is to surface the tension with provenance.

This alone, if implemented correctly, would be a category-level advance over
every existing personal knowledge management tool.
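
To make the three policies concrete, here is a small Python sketch of a fact store that applies them on assertion. It is one possible reading, not the notes' implementation: the class =FactStore=, the relation names, and the rule that a =:dual= relation holds at most two live values are all assumptions made for illustration.

```python
from collections import defaultdict


class FactStore:
    """Toy store keyed by (entity, relation); each relation has a cardinality policy."""

    def __init__(self, policies):
        self.policies = policies          # relation -> "singular" | "dual" | "plural"
        self.facts = defaultdict(list)    # (entity, relation) -> list of fact dicts

    def current(self, entity, relation):
        """Facts that have not been superseded."""
        return [f for f in self.facts[(entity, relation)] if not f["superseded"]]

    def assert_fact(self, entity, relation, value, provenance):
        policy = self.policies.get(relation, "plural")
        live = self.current(entity, relation)
        if policy == "singular":
            for old in live:
                old["superseded"] = True  # history is kept; one value stays live
        elif policy == "dual" and len(live) >= 2:
            # Assumption: a :dual relation holds exactly one pair in tension.
            raise ValueError("a dual relation holds at most two live values")
        self.facts[(entity, relation)].append(
            {"value": value, "provenance": provenance, "superseded": False})

    def tensions(self, entity, relation):
        """Return the live facts if they disagree, else an empty list."""
        live = self.current(entity, relation)
        return live if len({f["value"] for f in live}) > 1 else []


store = FactStore({"risk": "singular", "feeling": "dual", "height-m": "plural"})
store.assert_fact("Everest", "height-m", 8848, "wikidata")
store.assert_fact("Everest", "height-m", 8849, "dbpedia")
store.assert_fact("Everest", "height-m", 8848, "diary-2023")
store.assert_fact("rm -rf /", "risk", "severe", "gate")
store.assert_fact("rm -rf /", "risk", "catastrophic", "gate")
```

Here =store.tensions("Everest", "height-m")= returns all three sourced values because they disagree, while the =risk= relation keeps its superseded value as history and exposes only the latest one as current.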

** Ontology versioning is the right approach to the migration problem

Every knowledge base eventually faces schema migration — you split =:secret-file=
into =:crypto-secret= and =:plaintext-secret=, and now every deduction that
crossed the old category boundary is suspect. The standard approach is batch
UPDATE operations that overwrite the past. Passepartout's approach — the category
hierarchy itself is a Merkle tree, every fact stores the =:ontology-version= at
assertion time, category changes trigger re-verification rather than remapping —
preserves all worldviews. You can query "what did I believe about secrets before
I refined my security model?" This is not querying a fact. It is querying the
history of your own thinking.

This is the kind of capability that no existing tool provides, and it flows
directly from the architecture. If the Merkle DAG infrastructure exists (it does,
from v0.2.0), ontology versioning is ~40 lines on top of it. The conceptual
design is sound. The engineering appears tractable.
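
The versioning idea can be sketched in a few lines of Python. This is a hedged illustration, not the v0.2.0 Merkle DAG code: the function names, the dictionary representation of the category tree, the parent category =secret=, and the use of SHA-256 over JSON are assumptions. Only the category names =secret-file=, =crypto-secret=, and =plaintext-secret= come from the notes.

```python
import hashlib
import json


def ontology_version(hierarchy):
    """Merkle-style hash of a category tree given as {category: [child, ...]}."""
    def node_hash(name):
        # A node's hash covers its name and the sorted hashes of its children,
        # so any change below a node changes every hash above it.
        child_hashes = sorted(node_hash(c) for c in hierarchy.get(name, []))
        data = json.dumps([name, child_hashes]).encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    children = {c for kids in hierarchy.values() for c in kids}
    roots = sorted(set(hierarchy) - children)
    top = json.dumps([node_hash(r) for r in roots]).encode("utf-8")
    return hashlib.sha256(top).hexdigest()


def needs_reverification(facts, current_version):
    """Facts asserted under a different ontology version are re-checked, not rewritten."""
    return [f for f in facts if f["ontology_version"] != current_version]


v1_tree = {"secret": ["secret-file"]}
v2_tree = {"secret": ["crypto-secret", "plaintext-secret"]}
v1 = ontology_version(v1_tree)
v2 = ontology_version(v2_tree)

facts = [{"entity": "id_rsa", "relation": "is-a", "value": "secret-file",
          "ontology_version": v1}]
stale = needs_reverification(facts, v2)
```

Because old facts keep the version they were asserted under, a query can filter on =v1= to recover the earlier worldview instead of losing it to a batch update.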

* SWOT Analysis

** Strengths

*** Architectural inversion — proposer vs decider

The LLM proposes. The symbolic engine decides. This is the inverse of every
existing agent architecture, and it solves the hallucination problem at the
architectural level rather than the prompt-engineering level. No amount of
prompt refinement can make a probabilistic system deterministic. But a
deterministic admission gate can make a probabilistic proposer safe.

*** Unified container format (Org files)

Org files serve as the container for human prose, Lisp source code, symbolic
facts, and Agora Notes. One format, one toolchain, one Merkle tree, one version
control system. If Passepartout stops existing, the data survives in plain text.
This is the hardest commitment in the design and the most undervalued. Most agent
architectures store memory in JSONL transcripts, vector databases, or proprietary
formats — opaque to the human and dependent on the tool. Passepartout's memory
IS the human's memory, in the human's format.

*** Provenance as product

Every fact carries =:grounding= (the specific Org heading), =:provenance= (who
or what produced it), =:timestamp=, =:referenced-by=, =:contradicted-by=,
=:superseded-by=. The =/audit= command renders the full provenance chain. In the
broader memex, the value is not the verified fact ("this command is safe"). It
is the provenance itself: "this claim originated in that diary entry, has been
referenced 7 times across 4 projects, was contradicted 6 months later, and was
revised 3 weeks after that." This is a memory prosthesis that makes your own mind
legible to you.

*** Gate-to-fact bootstrap — ontology from existing code

The existing Dispatcher gate stack encodes an implicit ontology (categories of
secrets, destructive commands, trusted domains, core files). The bootstrap
extracts this mechanically — zero LLM tokens, zero human authoring, ~30 lines of
Lisp. This proves the pattern and provides the seed ontology without any new
infrastructure. Every new gate pattern added by the human (HITL approvals that
become rules) extends the ontology automatically.
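
The mechanical nature of the bootstrap can be illustrated with a short Python sketch. The real gate stack is Lisp and its data layout is not shown in these notes, so everything here is hypothetical: the =GATES= structure, the gate names, the patterns, and the function names are invented for illustration only.

```python
# Hypothetical gate definitions: each gate names a category and the patterns it matches.
GATES = [
    {"name": "destructive-command-gate", "category": "destructive-command",
     "patterns": ["rm -rf /", "mkfs"]},
    {"name": "secret-file-gate", "category": "secret-file",
     "patterns": [".env", "id_rsa"]},
]


def bootstrap_facts(gates):
    """Turn each gate pattern into an is-a fact grounded in the gate that declares it."""
    facts = []
    for gate in gates:
        for pattern in gate["patterns"]:
            facts.append({
                "entity": pattern,
                "relation": "is-a",
                "value": gate["category"],
                "provenance": "gate",       # no LLM involved in producing this fact
                "grounding": gate["name"],
            })
    return facts


def entity_classes(facts):
    """The seed ontology is simply the set of categories the gates already use."""
    return sorted({f["value"] for f in facts})


seed = bootstrap_facts(GATES)
classes = entity_classes(seed)
```

Adding a pattern to a gate and re-running =bootstrap_facts= is the whole extension path in this sketch, which mirrors the claim that new gate patterns extend the ontology automatically.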

*** Self-preservation architecture

The Third Law mechanisms (quarantine on skill failure, degraded-mode
signaling, resource monitoring, external watchdog, refusal to self-terminate)
are individually small (~20-50 lines each) and collectively transform
self-preservation from a passive architectural property into an active behavior.
The key insight: the biggest gap is not that these mechanisms are hard. It is
that degradation is currently silent. Making it visible is cheap and high-impact.

*** Cardinality policies as a solution to contradiction

The =:singular= / =:dual= / =:plural= model is novel in knowledge representation
and directly addresses the hardest problem in a personal memex: that
contradiction is the product, not the error. Bayesian knowledge bases, graph
databases, and triple stores all struggle with contradiction. Passepartout's
model makes it a feature.

*** Organic ontology growth

Categories emerge from the system's own operation: gate patterns → gate outcomes
→ Screamer generalizations → archivist proposals → cross-domain overlap
detection. The ontology is a garden, not a building. This avoids the Principia
Mathematica problem — the need to define everything upfront — by replacing
axiomatic design with evolutionary growth. Categories that aren't used fade.
Categories that are contradictory are pruned. Categories that emerge from
overlapping domains are promoted. The system converges on useful granularity
through use.

*** Agora as provenance layer for networked knowledge

A BFT-timestamped triple store is one approach, but the Merkle DAG + DID
signatures provide a lighter-weight alternative: every fact's provenance is
content-addressed, every author's identity is cryptographically verifiable, and
the DAG structure enables partial replication without consensus. This is more
tractable than full BFT and sufficient for a personal memex that needs to share
facts across a network.

*** Decoupling of compute cost from knowledge base size

LLM tokens are minimized by design — deterministic gates cost 0 tokens,
sparse-tree rendering keeps context at 2,000-4,000 tokens, Screamer deductions
cost 0 tokens. Adding 5 million Wikidata entities does not add a single token to
any LLM call. The variables that actually degrade performance — context window
size, LLM call frequency, Screamer deduction budget — are all bounded
independently of knowledge base size. This is a structural property: the
education is local; only the brain costs.

** Weaknesses

*** The fact language is unproven and may be insufficient

Triples of the form =(:entity :relation :value)=, with provenance and grounding,
are the current hypothesis. The format is simple enough to be parseable,
expressive enough to capture the gate stack's implicit claims, and extensible
enough that Screamer can operate on it. But:

- Triples cannot naturally express temporal relations. "Was X before Y?" requires
  reification (making the relation itself an entity), which makes queries
  considerably more complex.
- Triples cannot express modal claims. "Should not do X unless Y" has no natural
  triple representation. Neither does "could have done X but chose Y."
- Triples cannot express counterfactuals. "If X had happened, Y would have
  followed." These are essential for the "what if" reasoning that a personal
  memex should support.
- Triples struggle with n-ary relations. "Nabokov wrote Pale Fire in 1962 while
  living in Montreux" is a 4-ary relation (author, work, date, location), not a
  set of independent binary relations. Breaking it into triples loses the
  connection that binds them.
- Triples cannot express negation cleanly. "Nabokov did NOT write Doctor Zhivago"
  requires a negative fact, which in a triple store with an open-world assumption
  means "not known" and "known not" are conflated.

The notes acknowledge this limitation but defer it. The right granularity
"depends on what queries the planner actually needs to make, and that cannot be
known in advance." This is honest but unsatisfying. If triples prove insufficient,
the entire fact store, the Screamer integration, the VivaceGraph persistence, and
the archivist's extraction format must be redesigned. The architecture has no
intermediate fallback between "triples" and "something more expressive."
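
The n-ary problem and its usual workaround, reification through an event node, can be shown in a short Python sketch. The event identifier, role names, and helper functions are hypothetical; the point is only to show what the workaround costs, since every query over the statement must first regroup triples by event.

```python
def reify(event_id, relation, roles):
    """Turn one n-ary statement into binary triples hanging off an event node."""
    triples = [(event_id, "type", relation)]
    triples += [(event_id, role, value) for role, value in roles.items()]
    return triples


def find_events(triples, **wanted):
    """Return ids of events whose role triples match every wanted role=value pair."""
    by_event = {}
    for subject, predicate, obj in triples:
        by_event.setdefault(subject, {})[predicate] = obj
    return [event for event, roles in by_event.items()
            if all(roles.get(k) == v for k, v in wanted.items())]


# The 4-ary statement from the list above becomes five binary triples.
triples = reify("event-1", "wrote", {
    "author": "Nabokov",
    "work": "Pale Fire",
    "date": 1962,
    "location": "Montreux",
})
matches = find_events(triples, author="Nabokov", location="Montreux")
```

The binding between author, work, date, and location survives only through the shared =event-1= subject, so a store that admits or rejects triples one at a time can still break the statement apart.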

*** Screamer as admission gate is untested at this scale

Screamer is a constraint solver with non-deterministic backtracking. Using it
to check a candidate triple against an existing fact store is conceptually
elegant: express the fact store as constraint variables, assert the candidate,
check solvability. But:

- Screamer was designed for constraint satisfaction problems with tens to
  hundreds of variables. A fact store with millions of triples (after Wikidata
  loading) is a constraint space orders of magnitude larger than Screamer's
  design envelope.
- The consistency check is domain-scoped (only rules from the candidate's
  =:domain= apply), but cross-domain contradictions are the most valuable kind.
  "Nabokov was born in 1899" (literature domain) should be consistent with
  "Nabokov died in 1977" (history domain). If these are separate domains, the
  check misses contradictions; if they are unified, the constraint space
  explodes.
- Screamer's non-deterministic backtracking is worst-case exponential. The notes
  bound this via deduction budget (=SCREAMER_DEDUCTION_BUDGET_MS=) but don't
  address the admission check itself, which runs on every assertion.

There is a risk that Screamer works beautifully for the gate-bootstrapped seed
(50-70 entity classes, ~200 facts) and becomes unusably slow after Wikidata
loading (millions of facts). The transition from "works" to "doesn't" may be
gradual and hard to detect — the system gets slower but doesn't crash,
degrading user experience without a clear diagnostic.
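
One way to make the slowdown visible is to give the admission check its own budget and an explicit "undecided" outcome. The Python sketch below is a plain stand-in, not Screamer: it uses a step counter instead of constraint propagation, and the function names, the fact layout, and the single birth/death rule are assumptions chosen to mirror the Nabokov example above.

```python
def admit(candidate, facts, rules, budget):
    """Check `candidate` against same-domain facts; stop after `budget` rule evaluations."""
    steps = 0
    for fact in facts:
        if fact["domain"] != candidate["domain"]:
            continue                      # domain scoping: other domains are never consulted
        for rule in rules:
            if steps >= budget:
                return "undecided"        # visible outcome instead of a silent slowdown
            steps += 1
            if not rule(candidate, fact):
                return "reject"
    return "admit"


def born_before_died(a, b):
    """Consistency rule: an entity's birth year must not exceed its death year."""
    if a["entity"] != b["entity"]:
        return True
    years = {a["relation"]: a["value"], b["relation"]: b["value"]}
    if "born" in years and "died" in years:
        return years["born"] <= years["died"]
    return True


store = [{"entity": "Nabokov", "relation": "born", "value": 1899, "domain": "literature"}]
good = {"entity": "Nabokov", "relation": "died", "value": 1977, "domain": "literature"}
bad = {"entity": "Nabokov", "relation": "died", "value": 1850, "domain": "literature"}
bad_other_domain = dict(bad, domain="history")

results = [admit(c, store, [born_before_died], budget=10)
           for c in (good, bad, bad_other_domain)]
```

The third candidate is admitted even though it contradicts the stored birth year, because the check never looks outside its own domain. Counting how often =admit= returns "undecided" would give the diagnostic the paragraph above says is missing.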

*** The "flip" from lossy to deterministic is underspecified

The architecture's central narrative arc is the "flip": at some point, the
non-lossy facts constitute a sufficient foundation that the symbolic engine can
reverse the flow — instead of LLM extraction, the symbolic engine reads prose
through its own lens and deduces facts directly. The sufficiency metric
(non-lossy / total > 0.7) makes this "computable and visible to the user."

But:

- The threshold (0.7) is arbitrary. It is not derived from empirical measurement,
  information theory, or constraint satisfaction theory. It is a guess.
- Sufficiency is domain-specific, not global. The gate stack may have 0.95
  coverage of security classifications but 0.05 coverage of literary analysis.
  A global threshold of 0.7 hides the domains where the symbolic engine is still
  effectively blind.
- The "flip" operation itself is not defined. "Screamer reads prose through its
  own lens" — Screamer does not read prose. It operates on structured facts.
  Either the archivist still extracts triples (which is LLM work), or some new
  mechanism parses prose into triples deterministically (which is NLP at a level
  that does not exist in open-source Lisp).
- Even after the flip, facts from the pre-flip period carry =:provenance
  :llm-proposed= and are therefore suspect. The pre-flip facts were admitted
  against fewer non-lossy facts, meaning Screamer's consistency checks were
  weaker. A fact admitted during the seed phase may be wrong but undetected
  because there were no contradicting facts at the time. Re-verifying all
  pre-flip facts against the current fact store is described as a heartbeat task
  but the cost (millions of Screamer checks) is not estimated.

The flip is a beautiful narrative. It may also be a mirage — the system may
achieve high sufficiency in narrow domains (security, filesystem, coding) and
never approach it in the broader memex (literature, personal reflection, daily
life). If the broader memex is the use case, the flip may never happen.
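
The gap between a global and a per-domain sufficiency ratio is easy to demonstrate. The Python sketch below assumes that any fact whose provenance is not =llm-proposed= counts as non-lossy, which is an interpretation rather than something the notes define; the function name and the fact counts are invented to reproduce the 0.95 and 0.05 figures from the list above.

```python
from collections import Counter


def sufficiency(facts):
    """Return (overall ratio, per-domain ratios) of non-lossy facts to all facts."""
    total, non_lossy = Counter(), Counter()
    for fact in facts:
        total[fact["domain"]] += 1
        # Assumption: anything not proposed by the LLM counts as non-lossy.
        if fact["provenance"] != "llm-proposed":
            non_lossy[fact["domain"]] += 1
    per_domain = {d: non_lossy[d] / total[d] for d in total}
    overall = sum(non_lossy.values()) / sum(total.values()) if total else 0.0
    return overall, per_domain


def make(domain, provenance, count):
    return [{"domain": domain, "provenance": provenance} for _ in range(count)]


facts = (make("security", "gate", 95) + make("security", "llm-proposed", 5)
         + make("literature", "gate", 1) + make("literature", "llm-proposed", 19))

overall, per_domain = sufficiency(facts)
flipped_globally = overall > 0.7
blind_domains = [d for d, ratio in per_domain.items() if ratio <= 0.7]
```

With these counts the global ratio is 96/120 = 0.8, which passes the 0.7 threshold, while the literature domain sits at 0.05. Reporting =per_domain= alongside the global figure would expose the blind spot.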

*** The archivist's extraction cost is unaccounted

The archivist calls the LLM to extract triples from prose, with "a minimal prompt
(~200 tokens)." Over a personal memex with thousands of entries — a decade of
diary entries, hundreds of literature notes, dozens of project logs — the
extraction cost is substantial.

Assume 5,000 headings, 200 tokens per heading prompt, and an LLM that returns
~100 tokens of structured triples per heading. That's 1.5 million tokens for the
initial extraction, plus verification tokens (Screamer checks cost 0 LLM tokens,
but incorrect proposals generate feedback that may trigger re-extraction). At
current API prices (~$0.15 per million input tokens for GPT-4o-mini), the cost
is modest (~$0.25 if every token is billed at the input rate; output tokens
typically cost more). But at scale — re-extraction after ontology changes,
continuous extraction as new content is added, extraction for all incoming Agora
Notes — the cost accumulates.

More importantly, the extraction latency is human-noticeable. 5,000 headings at
1 second per LLM call is ~1.4 hours of extraction time. The system needs to
either batch-extract on startup (making cold starts slow) or extract lazily on
first query (making first queries slow). Neither is ideal.

The notes trumpet the token savings from deterministic gates and Screamer
deductions (valid — those cost 0 tokens) but the archivist's extraction cost is
the system's single largest recurring LLM expense, and it is mentioned only in
passing.
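
The figures in this section follow from simple arithmetic, reproduced here as a Python sketch so the assumptions can be varied. The function name is hypothetical, and billing every token at a single rate is a simplification of real API pricing.

```python
def extraction_estimate(headings, prompt_tokens, reply_tokens,
                        usd_per_million_tokens, seconds_per_call):
    """Estimate total tokens, cost in USD, and wall-clock hours for one full extraction pass."""
    total_tokens = headings * (prompt_tokens + reply_tokens)
    cost_usd = total_tokens / 1_000_000 * usd_per_million_tokens
    hours = headings * seconds_per_call / 3600   # assumes calls run one after another
    return total_tokens, cost_usd, hours


# Inputs taken from the paragraphs above: 5,000 headings, 200 + 100 tokens each,
# $0.15 per million tokens, one second per call.
tokens, cost, hours = extraction_estimate(5_000, 200, 100, 0.15, 1.0)
```

This yields 1.5 million tokens, a cost a little under $0.23, and roughly 1.4 hours of sequential calls. Re-running the function with a re-extraction count or a higher per-call latency shows how quickly the recurring cost grows.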

*** The Agora integration is clean in theory, undefined in practice

The "Passepartout IS the PDS" claim is elegant: the =memory-object= struct IS
the Note format, the Merkle DAG IS the Key Event Log, the fact store IS the
reputation system. But:

- An Agora PDS needs to serve HTTP APIs for thin clients. The daemon speaks a
  framed TCP protocol over a local port. Extending it to serve HTTPS with
  DIDComm endpoints, subscription management, and Relay push/pull is a
  substantial engineering effort.
- The PDS needs to manage encrypted storage — client-side encrypted content that
  the PDS itself cannot read. Passepartout's vault stores credentials with
  integrity hashes but does not currently manage per-Note encryption with
  audience-specific keys.
- The Relay Network is described as an intelligent communication backbone with
  pub/sub routing. Passepartout has no Relay implementation, no Relay-facing API,
  and no subscription management beyond its own event orchestrator.
- Agora's contract system (SCAL contracts, HODL invoices, arbitration tiers)
  requires state machines and Lightning Network integration that Passepartout
  has no primitives for.
- The "Passepartout IS the PDS" vision conflates two things: the data model
  (Org files = Notes) and the infrastructure (a process that serves a network
  protocol). The data model unification is clean and right. The infrastructure
  unification implies Passepartout grows from a local agent to a network server
  — a significant architectural expansion that the notes treat as a ~40-line
  utility.

*** No adversarial model

The notes describe layered authentication (crypto, sensory, deterministic,
probabilistic) and type-level gates as structural safety. They do not describe
an adversarial model:

- What stops a malicious Agora Note from containing 100,000 triples that flood
  the fact store?
- What stops a DID from publishing Notes that deliberately inject contradictions
  to force Screamer into exponential backtracking?
- What stops a compromised sensor key from signing valid sensor data that is
  adversarially crafted (e.g., video frames designed to trigger specific vision
  model false positives)?
- What stops a spam DID from creating millions of Personas and flooding the
  user's incoming Notes directory?

The resource monitor (Phase 1a) handles storage pressure generically. The
quarantine system handles individual DIDs flagged for spam. But none of these
are adversary-aware: they react to symptoms (disk full, error rate high) rather
than anticipating attack patterns. An adversarial model would identify these
vectors and design mitigations specifically. The notes describe a system that
works in a cooperative environment, not an adversarial one.
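
The first and last questions above are about volume, and a volume bound is
cheap to state. The following is a minimal sketch, assuming hypothetical caps
and a hypothetical =admit-note= entry point; none of these names or numbers
come from the notes. It does not address the contradiction-injection or
crafted-sensor-data vectors, which need more than counting.

#+begin_src lisp
;; Hypothetical limits; the notes do not specify any.
(defparameter *max-triples-per-note* 500
  "Largest number of triples accepted from a single Note.")

(defparameter *max-notes-per-did-per-window* 100
  "Largest number of Notes accepted from one DID per time window.")

(defvar *notes-seen* (make-hash-table :test #'equal)
  "Maps a DID string to the number of Notes admitted in the current window.")

(defun admit-note (did triples)
  "Return (values admitted-p reason). Rejects before anything reaches the fact store."
  (cond ((> (length triples) *max-triples-per-note*)
         (values nil :too-many-triples))
        ((>= (gethash did *notes-seen* 0) *max-notes-per-did-per-window*)
         (values nil :did-over-quota))
        (t
         ;; Count the Note against the sender's quota only when it is admitted.
         (incf (gethash did *notes-seen* 0))
         (values t :ok))))
#+end_src

Resetting =*notes-seen*= at the end of each window (for example with =clrhash=)
is left out; the point is only that the check runs before extraction, so an
oversized Note costs one length comparison rather than a flooded fact store.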

*** The self-repair criterion creates a two-tier architecture

The AGENTS.md rule ("default: everything is a skill") means the symbolic
engine (Screamer, VivaceGraph, fact store, archivist, ACL2, planner) is all
skills, not core. This is correct for the self-repair criterion: a corrupted
skill degrades the agent but doesn't kill it. A corrupted core file kills the
brainstem.

But it creates a tension: the symbolic engine IS the reasoning layer that would
diagnose and repair a corrupted skill. If the fact store itself is corrupted
(impossible facts, inconsistent cardinality, broken Merkle chains), the engine
that detects corruption is the engine that is corrupted. The system needs a
"repair from below" path: a minimal core that can purge and rebuild the symbolic
index without depending on the symbolic index. This path exists (the fact store
is ephemeral in Phase 1-4 and rebuildable from prose in Phase 5+) but is not
exercised automatically. A corruption in the symbolic engine requires human
detection and manual rebuild, the exact problem the self-repair criterion was
designed to avoid.

** Opportunities

*** A memory prosthesis that makes your own mind legible

The symbolic index, when populated and queried, answers questions that no
existing tool can:

- "What did I believe about monorepos in 2023, and how has that changed?"
- "Which of my diary entries contradict each other?"
- "What entities in my memex have no connection to any other entity?"
- "Show me everything I've written about Nabokov, organized by when I wrote it,
  what I was reading at the time, and what I concluded."
- "Which of my project plans reference security assumptions that I later changed?"
- "What did I think about this topic, and why did I change my mind?"

These are not information retrieval queries. They are self-knowledge queries.
They require provenance chains, temporal versioning, contradiction surfacing, and
cross-domain linkage, all of which the architecture provides as first-class
capabilities. If this works, it transforms the memex from a searchable archive
into a thinking partner that knows the history of your thoughts.

*** Deterministic reasoning as a moat

Every competitor agent system (Claude Code, OpenCode, OpenClaw, Hermes, Cognee,
Mem0) uses neural-only reasoning. They are all vulnerable to the same failure
mode: the LLM hallucinates a fact or an action, and there is no second system to
catch it. Their safety is heuristic. Their memory is flat. Their reasoning is
unprovable.

Passepartout's architectural bet (a symbolic engine that verifies, deduces, and
audits) creates a category difference, not a performance difference. If the bet
pays off, Passepartout is not "a better AI agent." It is a different kind of
system: one whose reasoning is provable, whose memory is content-addressed, and
whose knowledge accumulates through deduction rather than re-prompting.

This is a genuine moat. It cannot be replicated by adding a better system prompt
or a larger context window. It requires building the ontology, the constraint
solver, the fact store, and the provenance tracker, work that takes years and
cannot be shortcut by spending more on inference.

*** Agora as the first sovereign agent network

If Passepartout serves as the PDS and an Agora Persona, then AI agents can:

- Publish verified outputs as signed Notes with cryptographic provenance.
  Readers know the agent produced the output, not a human impersonating the
  agent.
- Accept invocation Notes from other persona owners. "Please analyze this
  contract and publish your findings." The agent receives the request as an
  Agora Note, processes it, signs the response, and publishes it.
- Build reputation through auditable chains of signed work products, not through
  self-reported claims.
- Participate in the compute marketplace as both consumer and provider.
- Maintain sovereign identity: the agent's DID is independent of any platform,
  any provider, any human account.

This is not a chatbot on a messaging platform. It is an autonomous entity on a
decentralized network, with cryptographic identity, verifiable provenance, and
economic agency. If Agora reaches even Order 1 (the first 1,000 users),
Passepartout agents become some of the most capable participants on the network.

*** The 10-80-10 ratio for coding is genuinely achievable

For a coding agent (the domain that Passepartout currently operates in), the
10-80-10 ratio is plausible. The existing Dispatcher already verifies every
action deterministically. Adding Screamer for consistency checking, VivaceGraph
for dependency queries, and ACL2 for structural verification would shift the
ratio from the current ~95-5-0 (neural-gate-symbolic) toward 50-40-10 in the
near term and potentially 10-80-10 in the long term.

The bootstrapped gate facts already cover file classifications, command safety,
path protections, and tool permissions, the core categories for a coding agent.
The archivist's extraction from project files would add dependency information,
test coverage, and code structure facts. The planner could reason about
refactoring order, dependency chains, and safety constraints deterministically.
This is the domain where the symbolic engine provides the most immediate value,
and it is the domain Passepartout already operates in.

*** Wikidata as an entity backbone unlocks cross-domain reasoning

Without Wikidata, the symbolic index for a general-knowledge memex is a sparse
set of personal facts with no connecting structure. With Wikidata, the entity
graph is pre-structured. The system can answer:

- "What does my memex say about Nabokov that Wikidata doesn't?"
- "Where does my memex disagree with Wikidata?"
- "What entities in my memex have no Wikidata counterpart?" (These are the
  personal, novel, or subjective entities that are the most valuable.)
- "Show me the intersection of my literary interests (from diary) with Wikidata's
  influence graph: which authors I read influenced each other in ways I haven't
  written about?"

These are cross-domain queries that require both the personal memex (for what
the user knows) and Wikidata (for what the world knows). Neither alone can
answer them. Together, they enable a kind of knowledge synthesis that no existing
tool provides.

*** Ontology versioning enables "what-if" reasoning about one's own thinking

The ability to query across worldviews ("what did I believe before I changed my
security model?") is a capability that has no analog in any existing tool. It
transforms the memex from a static archive into a dynamic record of intellectual
evolution. Combined with the temporal awareness system (Phase 0c), the system
could surface correlations: "You changed your mind about monorepos two weeks
after reading this article, which you bookmarked on this date, and one week
before starting this project that uses a monorepo structure." The provenance
chain IS the narrative of your thinking.

*** Contract-level pre-arbitration reduces the cost of decentralized commerce

Agora's Tier 0 Arbitrator, a local AI that provides evidence summaries before
human arbitration, is a genuinely useful role for a neurosymbolic system.

- "Contract CID X references arbitrator DID Y. DID Y is active. Verified."
- "All parties have signed. The HODL invoice is locked. Verified."
- "The buyer's claim of non-delivery is supported by 3 signed messages with
  timestamps after the delivery deadline."
- "The seller's proof-of-delivery field is empty. No QR scan recorded."

Each check is a Screamer query against the contract-lifecycle domain. The results
are a plist, not a ruling. Both parties see the same evidence summary before
escalating. This makes Level 1 arbitration faster (arbitrators receive
pre-processed evidence bundles), cheaper (no human time spent on trivial
verification), and more transparent (both parties see the same machine-generated
summary).
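
To make "a plist, not a ruling" concrete, here is a rough sketch in plain
Common Lisp. It deliberately does not use Screamer, and every field name on the
contract plist (=:parties=, =:signatures=, =:arbitrator-active-p=,
=:invoice-locked-p=, =:proof-of-delivery=) is an assumption made for
illustration, not Agora's actual contract schema.

#+begin_src lisp
(defun evidence-summary (contract)
  "Return a plist of verification results for CONTRACT, itself a plist.
The function reports facts; it never decides who is right."
  (let ((parties (getf contract :parties))
        (signers (getf contract :signatures)))
    (list :arbitrator-active (and (getf contract :arbitrator-active-p) t)
          ;; Every listed party must appear among the signers.
          :all-parties-signed (and (subsetp parties signers :test #'string=) t)
          :invoice-locked (and (getf contract :invoice-locked-p) t)
          :proof-of-delivery-present (and (getf contract :proof-of-delivery) t))))

;; Example: a fully signed, locked contract with no delivery proof.
(defparameter *example-summary*
  (evidence-summary
   (list :parties '("did:example:buyer" "did:example:seller")
         :signatures '("did:example:seller" "did:example:buyer")
         :arbitrator-active-p t
         :invoice-locked-p t
         :proof-of-delivery nil)))
#+end_src

Both parties would receive the same =*example-summary*= value; nothing in it
says which side should prevail.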

This is not AI judging. This is AI preparing the docket. The distinction is
important and defensible.

*** Self-auditing agents could transform AI safety discourse

If Passepartout can answer =/audit= for any action or fact (showing the full
provenance chain, every gate that approved it, every fact that supported it,
every alternative that was considered), then AI safety moves from "trust us, we
tested it" to "here is the audit trail, verify it yourself."

This is the transparency that every AI safety framework calls for and none
delivers. It is possible because the architecture records provenance as a
first-class operation, not as an after-the-fact log. The provenance is the
operating system, not a logging layer.

*** The memex + Agora combination could be a new kind of social network

Current social networks (Twitter, Facebook, Reddit) separate the person from
their knowledge. You are a profile with posts. Your posts are isolated units
without connection to your broader intellectual life.

A Passepartout-powered Agora Persona would publish Notes that are grounded in
the memex: "Here is my analysis of /Pale Fire/, drawn from diary entries across
three years, annotated with Wikidata context, and verified against my existing
literary framework." The Note is cryptographically signed, carrying provenance
back to the specific Org headings that informed it. Readers see not just the
conclusion but the intellectual scaffolding that produced it.

This is not a "post." It is a publication: a knowledge artifact with verifiable
provenance, auditable reasoning, and cryptographic identity. If this becomes the
norm, it raises the standard for public discourse from "this is my opinion" to
"this is my opinion, here is the evidence, here is how it evolved, here is who
verified it."

** Threats

*** The ontology problem may be harder than anticipated

The notes are honest about this: "Whitehead's Principia Mathematica took over
300 pages to define the logical foundations before it could prove that 1+1=2."
Passepartout's domain is narrower (coding + personal knowledge) but the
ontology problem is the same category of problem. Every entity class must be
defined. Every relation must have clear semantics. Every inference rule must be
justified.

The gate-to-fact bootstrap provides 50-70 entity classes, enough for a coding
agent. But the broader memex contains orders of magnitude more entity types:
people, places, works, concepts, events, emotions, aesthetic judgments,
professional skills, personal projects, temporal patterns. Defining these as
triples with clear semantics is genuine intellectual work that no amount of
engineering can shortcut.

The risk is not that it's impossible. It's that it's slow: slow enough that
the system never achieves the density of facts needed for the "flip" in the
broader memex. The coding domain may reach sufficiency in months. The literary
domain may take years. The daily-reflection domain may never cross the
threshold because the facts involved (mood, insight, aesthetic experience) are
not formalizable as triples.

*** Screamer may not scale to the fact store size

The constraint satisfaction approach to consistency checking is elegant for a
seed fact set of hundreds of triples. It is unproven for millions of triples
(after Wikidata loading + years of personal extraction). The domain-scoping
strategy (Screamer only checks facts from the candidate's =:domain=) bounds the
constraint space, but the most valuable consistency checks are cross-domain:

- "You classified this file as public in your project notes but the gate stack
  classifies it as secret." (project domain vs security domain)
- "You wrote that Nabokov influenced Kafka, but Wikidata says Kafka died before
  Nabokov published his first novel." (literature domain vs Wikidata domain)
- "You planned to use this dependency, but the dependency's license changed in
  a way that conflicts with your project's license." (project domain vs legal
  domain)

If cross-domain checks are disabled for performance, the most valuable
contradictions are never detected. If they are enabled, the constraint space
explodes. There is no obvious sweet spot.

*** Wikidata quality may undermine trust in the symbolic index

If Wikidata facts are admitted with =:policy :plural= and the user sees
thousands of contradictions between Wikidata and their personal memex, the
symbolic index may feel less trustworthy, not more. "Wikidata says Mount Everest
is 8848m. DBpedia says 8849m. Your 2023 diary says 8848m. These sources
disagree on height." This is correct behavior (surfacing disagreement with
provenance) but it may be overwhelming. The user wanted a knowledge base, not
a disagreement engine.

The trust problem is compounded by Wikidata's editorial biases. Wikidata
reflects the biases of Wikipedia editors: English-language dominance, Western
epistemological frameworks, systemic underrepresentation of non-Western
knowledge. A memex in Arabic that references Islamic philosophy, Egyptian
history, or African literature will find Wikidata's coverage thin, biased, or
absent. The symbolic index would dutifully surface these gaps ("your memex
mentions 47 entities with no Wikidata counterpart") but it cannot fill them.

*** LLM cost and latency may prevent the archivist from keeping up

If the user writes a diary entry every day, the archivist must extract triples
from each new heading. If the extraction takes 1-3 seconds per heading, it's
background noise. But if the user imports 500 old diary entries, or the
archivist needs to re-extract after an ontology change, or Agora Notes arrive in
bulk from multiple follows, the extraction queue grows faster than it drains.

The notes describe extraction as a background task triggered by heartbeat, but
they don't specify the extraction rate limit. An unbounded queue with no rate
limit would consume the LLM budget. A bounded queue would fall behind. A lazy
extraction strategy (extract on first query) would make first queries slow.
A batch extraction on startup would make cold starts slow.
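
The trade-off between budget and backlog can be put in numbers. The sketch
below is illustrative only: =drain-extraction-queue=, =heartbeats-to-drain=,
and the idea of a fixed per-heartbeat budget are assumptions, since the notes
leave the rate limit unspecified.

#+begin_src lisp
(defun drain-extraction-queue (queue budget)
  "Split QUEUE, a list of headings, into (values batch remaining).
BATCH holds at most BUDGET headings, standing in for one heartbeat's LLM allowance."
  (let ((n (min budget (length queue))))
    (values (subseq queue 0 n) (subseq queue n))))

(defun heartbeats-to-drain (backlog arrivals-per-beat budget)
  "Heartbeats needed to empty BACKLOG while ARRIVALS-PER-BEAT new headings
keep arriving, or NIL when the budget cannot outpace arrivals."
  (if (<= budget arrivals-per-beat)
      nil
      (values (ceiling backlog (- budget arrivals-per-beat)))))
#+end_src

With the 500-entry import from the text, one new heading per heartbeat, and a
hypothetical budget of 10 extractions per heartbeat, =(heartbeats-to-drain 500 1 10)=
gives 56 heartbeats. With a budget of 1 it returns =NIL=: the backlog
never clears, which is the "bounded queue would fall behind" case.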

The archivist's throughput is gated by LLM API rate limits, token costs, and
inference latency. These are external constraints that the architecture cannot
eliminate. The symbolic engine can reduce LLM calls for reasoning; it cannot
reduce LLM calls for extraction from prose.

*** Agora may never reach network effects

Agora faces the cold start problem that every decentralized social protocol
faces: users won't join without content, creators won't post without users. The
bootstrapping strategy (managed service → hybrid → full decentralization,
targeting niche communities first) is well-articulated but its success depends
on execution in a market where Mastodon, Bluesky, Nostr, and Farcaster are
already competing for the same users.

If Agora doesn't reach even Order 1 (1,000 users), the PDS integration is
academic. Passepartout's DID identity, DIDComm gateway, Note signing, and
contract verification are all infrastructure for a network that doesn't exist.
The symbolic engine still works locally: provenance tracking, contradiction
surfacing, and deduction are all valuable without Agora. But the network effects
that make Agora a transformative platform (reputation, contracts, marketplaces,
collective governance) require a living network.

The risk is asymmetric: Passepartout invests significant engineering in Agora
integration that provides zero value if Agora fails to launch.

*** Complexity may prevent adoption

Passepartout is already a complex system: a Lisp daemon, a terminal UI, a skill
engine, a gate stack, multiple LLM backends, a Merkle memory system, and an
event orchestrator. Adding a fact store, a constraint solver, a graph database,
a theorem prover, an archivist, a planner, and an Agora PDS makes it more
complex, not less.

The target user (someone who wants a personal AI assistant that works offline)
may not want or need any of this. They want the TUI to work, the LLM to be fast,
and the files to stay safe. The neurosymbolic engine is infrastructure for a use
case (lifelong personal knowledge management with verifiable provenance) that
most users do not yet know they have.

The risk is that Passepartout builds a cathedral for a congregation of one: a
system that is architecturally brilliant and practically unused because the
complexity-to-value ratio is too high for anyone except the author.

*** The self-repair criterion may not hold under adversarial conditions

The architecture assumes that skills can fail gracefully (fboundp guards, hash
table fallbacks, degraded mode). It does not consider that a skill can be
adversarially corrupted so that it appears to behave correctly while producing
wrong results. A compromised archivist that extracts plausible but false
triples, a compromised Screamer that passes all consistency checks, a
compromised VivaceGraph that returns query results from a parallel graph: these
are "living" skills that would pass integrity checks and still poison the
symbolic index.

The type-level gates prevent the LLM from modifying gate code. They do not
prevent a compromised skill (loaded by a trusted human, or corrupted on disk by
a separate process) from appearing to operate normally while being subtly wrong.
The integrity monitoring (Phase 0) catches disk-level corruption through hash
checks. It does not catch semantic corruption: a skill that is byte-for-byte
identical to the known-good version but loaded with a malicious input that
triggers a latent bug.

This is not a vulnerability unique to Passepartout. It is a vulnerability in
every system where components trust each other. But Passepartout's architecture
amplifies the risk because the symbolic engine is supposed to be the trustworthy
layer, the component that verifies the LLM's output. If the symbolic engine
itself is compromised, the system has no higher court of appeal.

*** The 10-80-10 ratio may create false confidence

If the sufficiency metric shows "71% non-lossy, threshold 70%, mode:
AUTO-EXTRACTION," the user may assume the system is trustworthy. But sufficiency
is global: it aggregates across all domains. The system may have 95% sufficiency
in the security domain and 5% sufficiency in the literary domain, and still
report 71% overall when the security domain holds most of the facts. The
auto-extraction switch would bypass the LLM for all categories with
sufficient coverage, but the threshold is global, not per-domain. A literary
query would hit the symbolic index that has "sufficient" coverage globally but
insufficient coverage for literature.

The notes describe domain-scoped Screamer checks but not domain-scoped
sufficiency. A global sufficiency metric that triggers a global extraction mode
change is the wrong granularity. Per-domain sufficiency, with per-domain
extraction mode, would be more complex but more honest. The architecture as
described has the simpler, more dangerous version.
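
The arithmetic behind that failure mode is easy to reproduce. The sketch below
uses made-up fact counts, chosen so that the pooled figure comes out at 71%
while the two domains sit at 95% and 5%. The function names and the
=:llm-extraction= mode name are illustrative assumptions, not part of the notes.

#+begin_src lisp
;; Hypothetical counts per domain: (domain non-lossy-facts total-facts).
(defparameter *domain-counts*
  '((:security 1045 1100)    ; 95% non-lossy
    (:literary   20  400)))  ;  5% non-lossy

(defun sufficiency (non-lossy total)
  "Fraction of facts that are non-lossy, as an exact rational."
  (if (zerop total) 0 (/ non-lossy total)))

(defun global-sufficiency (counts)
  "Pool all domains, which implicitly weights each domain by its fact count."
  (sufficiency (reduce #'+ counts :key #'second)
               (reduce #'+ counts :key #'third)))

(defun per-domain-modes (counts &optional (threshold 7/10))
  "Return an alist of (domain . mode), judging each domain on its own ratio."
  (mapcar (lambda (entry)
            (destructuring-bind (domain non-lossy total) entry
              (cons domain
                    (if (>= (sufficiency non-lossy total) threshold)
                        :auto-extraction
                        :llm-extraction))))
          counts))
#+end_src

Here =(global-sufficiency *domain-counts*)= is 71/100, which clears the 70%
threshold, while =(per-domain-modes *domain-counts*)= keeps the literary domain
on LLM extraction. The per-domain version costs one extra comparison per domain.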

** Summary Matrix

|          | Positive | Negative |
|----------+----------+----------|
| INTERNAL | S: Architectural inversion, unified Org format, provenance as product, | W: Unproven fact language, Screamer scale unverified, extraction cost hidden, |
|          | cardinality model, gate-to-fact bootstrap, self-preservation, organic ontology, | flip underspecified, adversarial model absent, self-repair tension, |
|          | Wikidata as accelerator, decoupled compute cost | Agora integration scope undefined, per-domain sufficiency missing |
|----------+----------+----------|
| EXTERNAL | O: Memory prosthesis, deterministic moat, sovereign agent network, | T: Ontology may be harder than expected, Screamer may not scale, |
|          | 10-80-10 for coding achievable, Wikidata cross-domain queries, | Wikidata quality/trust, LLM extraction bottleneck, Agora network effects, |
|          | ontology versioning, contract pre-arbitration, self-auditing safety, | complexity-to-adoption ratio, adversarial semantic corruption, |
|          | knowledge-based social network | false confidence from global sufficiency metric |

* What This Unlocks

** Technologically

The neurosymbolic engine, if built, would be the first AI system where:

1. *Reasoning is auditable.* Every conclusion carries a provenance chain back to
   its premises. The =/audit= command renders the full inference tree (every
   fact, every deduction, every gate outcome) in human-readable form.

2. *Knowledge accumulates deterministically.* Screamer deductions and gate
   outcomes generate new facts without any LLM involvement. The knowledge base
   grows from the system's own operation, not from re-prompting the LLM.

3. *Memory is content-addressed.* Every fact is a Merkle node. Every version
   chain is tamper-evident. Rollback is atomic. The storage format is proven
   correct before it is committed to disk.

4. *Safety is provable, not empirical.* Type-level gates make self-modification
   structurally impossible. ACL2 proves that the rule set has no contradictions.
   The dispatcher doesn't "try" to be safe; it is safe by construction.

5. *The human and the machine share the same format.* Org files for both. No
   hidden database. No import/export step. The agent's memory IS the human's
   memory.

These five properties, together, define a new category of AI system: the
*sovereign reasoning agent*. Not sovereign in the blockchain sense (decentralized
by consensus), but sovereign in the personal sense: the agent runs on your
hardware, reasons with your knowledge, and proves its reasoning to you.

** Socially

If the technical vision succeeds and Agora reaches network effects, the
combination unlocks:

1. *Verifiable public discourse.* Every published claim carries provenance back
   to source material. "I read this, I thought this, I changed my mind on this
   date, here is the evidence." Public discourse shifts from "competing opinions"
   to "competing evidence chains." The quality floor rises because claims without
   provenance are visibly weaker than claims with provenance.

2. *Sovereign AI agents with legal and economic personhood.* A Passepartout
   agent with an Agora Persona can own assets, enter contracts, earn reputation,
   and face consequences for failure. This is not a chatbot. It is an autonomous
   entity with cryptographic identity, verified provenance, and economic agency,
   more like a corporation than a tool.

3. *Self-auditing AI safety.* Every action the agent takes is traceable. Every
   gate decision is recorded. Every fact that informed a decision is queryable.
   AI safety moves from "trust us" to "here is the audit trail." This is the
   transparency that every AI ethics framework calls for.

4. *A personal knowledge economy.* If your memex can publish Notes as Agora
   content, your intellectual work (your analyses, your syntheses, your
   discoveries) becomes a publishable, attributable, monetizable asset. Not
   through advertising or subscriptions, but through direct value exchange:
   Lightning payments for content access, contract work for your verified
   expertise, reputation that follows your Persona across platforms.

5. *Collective intelligence without centralized control.* If multiple
   Passepartout agents share facts through Agora Notes, the collective symbolic
   index represents the verified, provenanced knowledge of a community: not the
   averaged opinion of a crowd, but the auditable intersection of independently
   verified claims. This is Wikipedia without the editorial board, science
   without the journal gatekeepers, journalism without the corporate owners.

6. *A memory prosthesis that outlives the individual.* A memex with a decade of
   diary entries, linked to Wikidata's entity graph, with Screamer deductions
   surfacing patterns and contradictions, with ontology versioning preserving
   intellectual evolution: this is not a knowledge management tool. It is an
   externalized, queryable, auditable record of a life's thinking. It is what
   Vannevar Bush imagined in 1945: "an enlarged intimate supplement to his
   memory."

* Conclusion

The architecture described in these notes is genuinely novel. Not incrementally
novel: most agent architectures are variations on "LLM + tools + prompt-based
safety." Passepartout's neurosymbolic vision is categorically different: an
inversion where the deterministic layer judges the probabilistic layer, where
facts carry provenance chains, where contradiction is a feature rather than an
error, and where the user's Org files are the single source of truth for both
human and machine.

The largest risk is not that the architecture is wrong. It is that the ontology
problem (the genuine difficulty of defining what a "fact" is, what relations
are, what categories are useful, and how they evolve) is harder than the notes
anticipate, and that the system spends years in a partially-working state where
the symbolic index is too sparse to be useful but too entangled to be discarded.

The second-largest risk is that Agora never reaches the network effects needed
to make the PDS integration valuable beyond a local experiment, and that the
engineering investment in DIDComm gateways, Note signing, contract verification,
and Relay integration produces infrastructure for a network that doesn't exist.

The opportunity is equally large: a system that makes your own mind legible to
you, that proves its reasoning rather than asserting it, that accumulates
knowledge across sessions through deduction rather than re-prompting, and that
publishes verified, provenanced knowledge to a decentralized network. If this
works, even partially, even slowly, it is a category-level advance over every
existing agent architecture and every existing personal knowledge management
tool.

The notes are a map of territory that no one has walked. The territory is real.
The map is detailed enough to navigate by. Whether the journey completes depends
on whether the ontology problem yields to engineering, and whether the user,
the one human whose memex this serves, finds value in the partial system well
before the full vision materializes.
314
notes/passepartout-agora.org
Normal file
@@ -0,0 +1,314 @@
#+TITLE: Passepartout-Agora Integration: Unified Container Format
#+AUTHOR: Agent
#+FILETAGS: :notes:integration:agora:passepartout:design:
#+CREATED: [2026-05-08 Fri]

* Summary

Org files and Agora Notes are the same container. Both are text with headers,
tags, properties, and prose body. Both contain zero or more symbolic facts
extractable by Passepartout's archivist. The only difference is that an Agora
Note carries a DID signature and a CID for cryptographic provenance on the
network. An Org file without a signature is a local Note. A signed Org file
pushed to the PDS is an Agora Note.

Passepartout's =memory-object= struct serves as the storage format for both.
The archivist extracts facts from one unified store. Authorship is distinguished
by provenance, not location.
|
||||||
|
|
||||||
|
* The Unification
|
||||||
|
|
||||||
|
** Org files and Notes are the same container
|
||||||
|
|
||||||
|
| Property | Org file (local) | Agora Note (network) |
|
||||||
|
|------------------+------------------------------+-------------------------------------|
|
||||||
|
| Format | Org-mode text | Org-mode text |
|
||||||
|
| Identity | Merkle hash (=memory-object=) | CIDv1 (same hash) |
|
||||||
|
| Contains facts | Yes (archivist extracts) | Yes (archivist extracts) |
|
||||||
|
| Author identity | Implicit (file in =~/memex/=) | Explicit (DID signature in =proof=) |
|
||||||
|
| Access control | Filesystem permissions | =access_control= flags |
|
||||||
|
| Routing | N/A (local disk) | =notify= + =references= + Relay |
|
||||||
|
| Ephemeral | No | =ephemeral_duration= |
|
||||||
|
| Behavioral flag | Implicit (convention) | =is_feed= field |
|
||||||
|
|
||||||
|
The structure converges in a single plist:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
(:cid <merkle-hash> ;; Identity across local and network
|
||||||
|
:title <string> ;; Org headline title
|
||||||
|
:content <org-text> ;; Full Org body (headings, prose, source blocks)
|
||||||
|
:owner <did-or-nil> ;; For Agora Notes: the signing Persona DID. nil for local
|
||||||
|
:proof <plist-or-nil> ;; ( :editor <did> :signature <bytes> )
|
||||||
|
;; Agora behavioral flags (nil for local files)
|
||||||
|
:is-feed <boolean-or-nil>
|
||||||
|
:access-control <did-list-or-nil>
|
||||||
|
:notify <did-list-or-nil>
|
||||||
|
:references <cid-list-or-nil>
|
||||||
|
:reply-to <cid-or-nil>
|
||||||
|
:thread-root <cid-or-nil>
|
||||||
|
:ephemeral-duration <integer-or-nil>
|
||||||
|
;; Passepartout metadata
|
||||||
|
:created-at <timestamp>
|
||||||
|
:tags <string-list> ;; Org tags
|
||||||
|
:properties <plist> ;; Org property drawer
|
||||||
|
:extracted-facts <fact-list>) ;; Populated by archivist after extraction
|
||||||
|
#+end_src
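As a rough illustration of how one plist shape can serve both cases, the sketch below builds a container and tells a local Note from an Agora Note by the presence of a signed proof. The names =make-note-container= and =agora-note-p= are hypothetical, not existing Passepartout functions, and the CID is passed in rather than computed.

#+begin_src lisp
;; Hypothetical helpers; only a subset of the keys from the plist above is used.
(defun make-note-container (cid title content &key owner proof tags)
  "Build the unified container plist. OWNER and PROOF stay NIL for local files."
  (list :cid cid :title title :content content
        :owner owner :proof proof
        :tags tags :extracted-facts nil))

(defun agora-note-p (note)
  "True when NOTE carries both an owner DID and a signature in its proof."
  (and (getf note :owner)
       (getf (getf note :proof) :signature)
       t))
#+end_src

A container built without =:owner= and =:proof= is a local Note. Adding both turns the same plist into an Agora Note without changing any other key.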
|
||||||
|
|
||||||
|
** Facts are extracted from both, identically
|
||||||
|
|
||||||
|
An Org file in =~/memex/literature/pale-fire.org= and an Agora Note from
|
||||||
|
=did:agora:heather= with =:references <post-CID>= both contain prose. The
|
||||||
|
archivist scans both, proposes triples via the LLM, verifies via Screamer,
|
||||||
|
and admits facts to the symbolic index. The facts carry different provenance:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
;; Extracted from local Org file
|
||||||
|
(:entity :pale-fire :relation :theme :value :unreliable-narration
|
||||||
|
:provenance :local-prose :grounding "heading-42")
|
||||||
|
|
||||||
|
;; Extracted from Agora Note
|
||||||
|
(:entity :kafka :relation :influence :value :nabokov
|
||||||
|
:provenance :agora-note :grounding <incoming-note-cid> :author "did:agora:heather")
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
No new extraction path. The archivist already walks containers and extracts
|
||||||
|
facts. The container type determines the provenance tag and the grounding
|
||||||
|
identifier (local heading ID vs. Note CID).
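A minimal sketch of that tagging step, assuming the container is the plist shown earlier. The function name =tag-fact= and the =:heading-id= property key are assumptions made for illustration.

#+begin_src lisp
(defun tag-fact (triple container)
  "Attach provenance and grounding to TRIPLE according to CONTAINER's origin.
The :heading-id key under :properties is a hypothetical local heading ID."
  (if (getf container :owner)
      ;; A signed container: ground the fact in the Note CID and record the author.
      (append triple
              (list :provenance :agora-note
                    :grounding (getf container :cid)
                    :author (getf container :owner)))
      ;; An unsigned container: ground the fact in the local heading.
      (append triple
              (list :provenance :local-prose
                    :grounding (getf (getf container :properties) :heading-id)))))
#+end_src

The extraction call is the same for both container kinds. Only this final tagging branch differs.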
|
||||||
|
|
||||||
|
** The memex distinguishes provenance by location, not format
|
||||||
|
|
||||||
|
Incoming Agora Notes arrive at =~/memex/social/notes/<did>/<cid>.org=.
|
||||||
|
The directory structure encodes authorship:
|
||||||
|
|
||||||
|
| Path | Meaning |
|
||||||
|
|---------------------------------------------------+------------------------------------|
|
||||||
|
| ~/memex/daily/ | Local diary entries |
|
||||||
|
| ~/memex/projects/ | Local project files |
|
||||||
|
| ~/memex/literature/ | Local reading notes |
|
||||||
|
| ~/memex/notes/ | Local design and thinking notes |
|
||||||
|
| ~/memex/social/notes/<did>/<cid>.org | Incoming Notes from other DIDs |
|
||||||
|
| ~/memex/social/outbox/<cid>.org | Outgoing Notes signed by the user |
|
||||||
|
|
||||||
|
The archivist scans all directories. Local files produce facts with
|
||||||
|
=:provenance :local-prose=. Agora files produce facts with =:provenance
|
||||||
|
:agora-note= + =:author <did>=. The symbolic index maps the provenance
|
||||||
|
to the cardinality policy: local prose is =:plural= (the human's own notes —
|
||||||
|
multiple interpretations coexist). Agora Notes are =:plural= by default (the
|
||||||
|
author's claim, not authoritative over local facts). Agora Notes can be promoted
|
||||||
|
to =:singular= or =:dual= if they carry cryptographic proofs of specific claims.
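The directory-to-provenance mapping could be sketched as follows. This assumes paths are given as strings relative to the memex root and that everything outside =social/notes/= (including the outbox) counts as local prose. Both are assumptions of this example, and =path-provenance= is a hypothetical name.

#+begin_src lisp
(defun path-provenance (relative-path)
  "Classify RELATIVE-PATH (relative to the memex root) by its directory prefix.
Returns a plist with :provenance, :cardinality and, for incoming Notes, :author."
  (let ((prefix "social/notes/"))
    (if (and (> (length relative-path) (length prefix))
             (string= prefix relative-path :end2 (length prefix)))
        ;; The first path segment after the prefix is the sender's DID.
        (let* ((tail (subseq relative-path (length prefix)))
               (slash (position #\/ tail)))
          (list :provenance :agora-note
                :cardinality :plural
                :author (subseq tail 0 slash)))
        (list :provenance :local-prose :cardinality :plural))))
#+end_src

Promotion to =:singular= or =:dual= is left out here, since it depends on proof checks the notes do not yet specify.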
|
||||||
|
|
||||||
|
** Publishing Org content as Agora Notes
|
||||||
|
|
||||||
|
When the user wants to publish a diary entry, project log, or literary note as
|
||||||
|
an Agora Note, the operation is:
|
||||||
|
|
||||||
|
1. Select the Org heading or file.
|
||||||
|
2. Compute the Merkle hash (=memory-object= hash → CIDv1).
|
||||||
|
3. Sign with the user's Persona DID key (Phase 0b key registry).
|
||||||
|
4. Set Agora flags: =:is-feed= t/nil, =:access-control= [], =:references= [previous-note-cid].
|
||||||
|
5. Push to the PDS. The Note is an Org plist with a DID signature.
|
||||||
|
6. The PDS stores and relays it. The Note remains in =~/memex/social/outbox/= with its CID.
|
||||||
|
|
||||||
|
All of this is a single function: =(note-publish heading-id &key is-feed access-control references)=.
|
||||||
|
It is roughly 40 lines and extends the vault (key signing), the fact store (CID generation),
|
||||||
|
and the memex (output directory).
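Steps 2 to 5 above can be sketched as one function. This is not the planned =note-publish=: it takes the Note plist and DID directly, and the hashing, signing and push operations are caller-supplied functions standing in for the fact store, the vault and the PDS client. All names here are hypothetical.

#+begin_src lisp
(defun note-publish-sketch (note did &key is-feed access-control references
                                       hash-fn sign-fn push-fn)
  "Hash, sign, flag and push NOTE on behalf of DID. Returns the signed plist.
HASH-FN, SIGN-FN and PUSH-FN are stand-ins supplied by the caller."
  (let* ((cid (funcall hash-fn (getf note :content)))
         ;; New keys are prepended, so GETF sees them before any old values.
         (signed (list* :cid cid
                        :owner did
                        :proof (list :editor did
                                     :signature (funcall sign-fn did cid))
                        :is-feed is-feed
                        :access-control access-control
                        :references references
                        note)))
    (funcall push-fn signed)
    signed))
#+end_src

Passing the three operations as arguments keeps the sketch runnable without a key registry or a network, which is also convenient for writing the first failing test.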
|
||||||
|
|
||||||
|
* Implications for Passepartout's Architecture
|
||||||
|
|
||||||
|
** The symbolic index gains a second source, not a new ingestion path
|
||||||
|
|
||||||
|
Facts enter through three paths:
|
||||||
|
1. Gate outcomes (bootstrap + runtime, =:provenance :gate-outcome=)
|
||||||
|
2. Screamer deductions (=:provenance :deduced=)
|
||||||
|
3. Archivist extraction (=:provenance :local-prose= or =:provenance :agora-note=)
|
||||||
|
|
||||||
|
The third path now covers both local Org files and incoming Agora Notes. No new
|
||||||
|
path needed. The archivist gains no new code — only a new directory to walk
|
||||||
|
(=~/memex/social/notes/=) and a new provenance tag to assign.
|
||||||
|
|
||||||
|
** Authentication Layer 1 now has Agora-native verification
|
||||||
|
|
||||||
|
Phase 0b's cryptographic gate (vector 0) verifies DID signatures. An incoming
|
||||||
|
Agora Note carries =:owner <did>= and, inside =:proof=, =:signature <bytes>=. Gate vector 0
|
||||||
|
verifies the signature against the DID's public key (from the key registry, which
|
||||||
|
is now also an Agora DID registry). Verification is identical for local signals
|
||||||
|
and Agora signals — the same gate, the same key lookup.
|
||||||
|
|
||||||
|
** Self-preservation gains an Agora dimension
|
||||||
|
|
||||||
|
The resource monitor (Phase 1a) tracks =~/memex/social/= as a source of storage
|
||||||
|
growth. Incoming Notes from network sources are lower preservation priority than
|
||||||
|
local prose — if disk pressure hits, incoming Agora Notes are evicted first
|
||||||
|
(their source is the remote PDS; they can be re-fetched). Quarantine (Phase 1a)
|
||||||
|
extends to Agora channels: if a DID is sending spam or malformed Notes, their
|
||||||
|
incoming directory is quarantined and the DID is flagged for human review.
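The eviction priority described above amounts to a stable ordering by provenance. A minimal sketch, assuming each stored entry is a plist with =:path= and =:provenance= keys (=eviction-order= is a hypothetical name):

#+begin_src lisp
(defun eviction-order (entries)
  "Order ENTRIES so re-fetchable Agora Notes come before local prose.
STABLE-SORT keeps the original order within each group."
  (stable-sort (copy-list entries) #'<
               :key (lambda (entry)
                      (if (eq (getf entry :provenance) :agora-note) 0 1))))
#+end_src

A real resource monitor would likely also weigh size and age. This only captures the rule that network content goes first.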
|
||||||
|
|
||||||
|
** Sufficiency tracks Agora as a provenance source
|
||||||
|
|
||||||
|
The sufficiency score (Phase 4) gains a new provenance category:
|
||||||
|
|
||||||
|
#+begin_example
|
||||||
|
Symbolic Index
|
||||||
|
Facts: 3,847
|
||||||
|
Gate outcomes: 847 (22%)
|
||||||
|
Deduced: 921 (24%)
|
||||||
|
Human-authored: 72 (2%)
|
||||||
|
Local prose: 1,247 (32%)
|
||||||
|
Agora Notes: 760 (20%)
|
||||||
|
─────────────────────────
|
||||||
|
Non-lossy: 1,840 (48%)
|
||||||
|
LLM-proposed: 2,007 (52%)
|
||||||
|
#+end_example
|
||||||
|
|
||||||
|
Agora Notes are a provenance source, not a lossiness category. Facts from Agora
|
||||||
|
Notes carry =:provenance :agora-note= — they are LLM-extracted (the archivist
|
||||||
|
proposes them) but the source is cryptographically signed by a known DID. They
|
||||||
|
are neither =:gate-outcome= (mechanical) nor =:llm-proposed= from local prose
|
||||||
|
(uncertain source). They occupy a middle ground: verified source, uncertain
|
||||||
|
extraction.
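The breakdown in the example above can be reproduced from per-provenance counts. The sketch below assumes the counts arrive as an alist and treats gate outcomes, deductions and human-authored facts as the non-lossy group, as the example does. =sufficiency-breakdown= is a hypothetical name.

#+begin_src lisp
(defun sufficiency-breakdown (counts)
  "COUNTS is an alist of (provenance . fact-count). Returns a plist with the
total, the non-lossy count and the non-lossy share as a rounded percentage."
  (let* ((non-lossy-tags '(:gate-outcome :deduced :human-authored))
         (total (reduce #'+ counts :key #'cdr))
         (non-lossy (reduce #'+
                            (remove-if-not
                             (lambda (entry) (member (car entry) non-lossy-tags))
                             counts)
                            :key #'cdr)))
    (list :total total
          :non-lossy non-lossy
          :non-lossy-percent (if (zerop total)
                                 0
                                 (round (* 100 non-lossy) total)))))
#+end_src

With the figures from the example (847, 921, 72, 1,247 and 760) this gives a total of 3,847 and a non-lossy count of 1,840, which rounds to 48%.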
|
||||||
|
|
||||||
|
* Implications for Agora
|
||||||
|
|
||||||
|
** Passepartout IS the PDS
|
||||||
|
|
||||||
|
The TODO.org in =projects/agora/= already captures this: "Passepartout IS the
|
||||||
|
PDS — the agent runs a personal data store in-process." With Org files as the
|
||||||
|
Note format, this is literal. The PDS stores Org files. The agent reads them.
|
||||||
|
The network accesses them via the PDS API. There is no separate PDS process.
|
||||||
|
|
||||||
|
** Level 0 pre-arbitration via Screamer
|
||||||
|
|
||||||
|
Section 07 of the Agora requirements describes a "Tier 0 Arbitrator" — a local
|
||||||
|
AI that provides a sanity check before human arbitration. Passepartout's
|
||||||
|
Screamer + fact store provides this at zero LLM tokens when working from
|
||||||
|
existing facts:
|
||||||
|
|
||||||
|
- "Contract CID X references arbitrator DID Y. DID Y is active. Verified."
|
||||||
|
- "All parties have signed. The HODL invoice is locked. Verified."
|
||||||
|
- "The buyer's claim of non-delivery is supported by 3 signed messages with
|
||||||
|
timestamps after the delivery deadline."
|
||||||
|
- "The seller's proof-of-delivery field is empty. No QR scan recorded."
|
||||||
|
|
||||||
|
Each check is a Screamer query against the contract-lifecycle domain. Results
|
||||||
|
are a plist, not a ruling. Both parties see the same evidence summary before
|
||||||
|
escalating to Level 1.
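A sketch of such an evidence summary follows. It uses plain list filtering in place of Screamer, and the relation names (=:arbitrator=, =:status=, =:hodl-invoice=, =:proof-of-delivery=) are assumptions about how the contract-lifecycle domain might be modelled.

#+begin_src lisp
(defun fact-value (facts entity relation)
  "Return the :value of the first fact matching ENTITY and RELATION, or NIL."
  (let ((hit (find-if (lambda (fact)
                        (and (equal (getf fact :entity) entity)
                             (eq (getf fact :relation) relation)))
                      facts)))
    (getf hit :value)))

(defun pre-arbitration-summary (facts contract-cid)
  "Collect evidence about CONTRACT-CID as a plist. It reports, it does not rule."
  (let ((arbitrator (fact-value facts contract-cid :arbitrator)))
    (list :arbitrator-active (eq (fact-value facts arbitrator :status) :active)
          :invoice-locked (eq (fact-value facts contract-cid :hodl-invoice) :locked)
          :proof-of-delivery (fact-value facts contract-cid :proof-of-delivery))))
#+end_src

An empty =:proof-of-delivery= shows up as =nil= in the summary, which matches the last check in the list above.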
|
||||||
|
|
||||||
|
** Reputation as deduced facts
|
||||||
|
|
||||||
|
Screamer deduces reputation from signed contract chains, not asserted claims:
|
||||||
|
|
||||||
|
#+begin_src lisp
|
||||||
|
(:entity "did:agora:heather" :relation :contract-reputation
|
||||||
|
:value (:completed 47 :defaulted 0 :disputes 3 :won 3 :escalated 0)
|
||||||
|
:provenance :deduced :derived-from (<list of 47 contract CIDs>))
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
This is the strong version of Agora's Trust Score. It's a fact deduced from
|
||||||
|
cryptographic evidence, not a claim by the persona (self-reporting could be
|
||||||
|
false) and not a claim by a centralized reputation service (could be bought).
|
||||||
|
The deduction is auditable: =/audit did:agora:heather= shows every contract,
|
||||||
|
every outcome, every ruling.
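The deduction itself is a fold over contract outcomes. The sketch below assumes each outcome is already available as a =(contract-cid . outcome)= pair and leaves out the dispute fields for brevity. =deduce-reputation= is a hypothetical name, and plain counting stands in for a Screamer query.

#+begin_src lisp
(defun deduce-reputation (did outcomes)
  "OUTCOMES is a list of (contract-cid . outcome) pairs, where outcome is
:completed or :defaulted. Returns a fact in the shape shown above."
  (list :entity did
        :relation :contract-reputation
        :value (list :completed (count :completed outcomes :key #'cdr)
                     :defaulted (count :defaulted outcomes :key #'cdr))
        :provenance :deduced
        ;; Every contributing contract CID is kept so the result can be audited.
        :derived-from (mapcar #'car outcomes)))
#+end_src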
|
||||||
|
|
||||||
|
** Agent Behavioral Contracts — formal enforcement for the ABC of Agora
|
||||||
|
|
||||||
|
Bhardwaj (2026) introduces a formal framework that brings Design-by-Contract
|
||||||
|
principles to autonomous AI agents. An ABC contract =C = (P, I, G, R)=
|
||||||
|
specifies /Preconditions/, /Invariants/ (hard and soft), /Governance/ policies
|
||||||
|
(hard and soft), and /Recovery/ mechanisms as first-class runtime-enforceable
|
||||||
|
components.
|
||||||
|
|
||||||
|
This maps directly onto Agora's contract lifecycle:
|
||||||
|
|
||||||
|
| ABC component | Agora mapping |
|
||||||
|
|------------------------+--------------------------------------------------------------|
|
||||||
|
| =P= (Preconditions) | Contract Note validity checks: all signers' DIDs active, |
|
||||||
|
| | contract CID correctly referenced, HODL invoice locked |
|
||||||
|
| =I= (Invariants) | Hard: payment amount unchanged, arbitrator DID unchanged. |
|
||||||
|
| | Soft: delivery within estimated window |
|
||||||
|
| =G= (Governance) | Hard: no party modifies contract terms unilaterally. |
|
||||||
|
| | Soft: parties communicate through designated channels |
|
||||||
|
| =R= (Recovery) | Arbitration escalation, HODL invoice release, reputation |
|
||||||
|
| | deduction |
|
||||||
|
|
||||||
|
The framework's key mathematical results have direct implications for Agora:
|
||||||
|
|
||||||
|
- /Drift Bounds Theorem/: contracts with recovery rate γ > α (natural drift rate
|
||||||
|
from LLM non-determinism in agent behavior) bound behavioral drift to D* = α/γ.
|
||||||
|
For Agora, this means contract enforcement can be /predictive/ — detecting drift
|
||||||
|
before violation — rather than just /corrective/ after breach.
|
||||||
|
|
||||||
|
- /Compositionality Theorem/: sufficient conditions (interface compatibility,
|
||||||
|
assumption discharge, governance consistency, recovery independence) under
|
||||||
|
which individual contract guarantees compose end-to-end for multi-agent chains.
|
||||||
|
This is essential for Agora's multi-party contracts, where a buyer, seller,
|
||||||
|
arbitrator, and escrow agent form a chain of interdependent behavioral
|
||||||
|
expectations.
|
||||||
|
|
||||||
|
- /(p, δ, k)-satisfaction/: probabilistic compliance accounting for LLM
|
||||||
|
non-determinism — contracts hold with probability p, deviations stay within
|
||||||
|
tolerance δ, recovery within k steps. This formalizes what Screamer's
|
||||||
|
contract-lifecycle domain queries verify: whether the current state of a
|
||||||
|
contract satisfies its agreed-upon conditions, given the inherent uncertainty
|
||||||
|
in any agent's behavior.
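A rough intuition for the Drift Bounds Theorem above, under a simplifying model that is an assumption of this note and not taken from Bhardwaj (2026): suppose drift $D$ grows at the constant natural rate $\alpha$ and recovery removes it in proportion to its current size, at rate $\gamma$. Setting the rate of change to zero gives the steady state:

\begin{equation*}
\frac{dD}{dt} = \alpha - \gamma D = 0 \quad\Longrightarrow\quad D^{*} = \frac{\alpha}{\gamma}
\end{equation*}

Under this model, $\gamma > \alpha$ keeps $D^{*}$ below 1, and a faster recovery rate lowers the bound further.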
|
||||||
|
|
||||||
|
The empirical results are significant: across 1,980 sessions on 7 models,
|
||||||
|
contracted agents (with ABC enforcement) detected 5.2-6.8 soft violations per
|
||||||
|
session that uncontracted agents missed entirely, with <10ms per-action overhead.
|
||||||
|
That overhead figure matters for Passepartout as the PDS: contract enforcement must not
|
||||||
|
add latency to Note processing.
|
||||||
|
|
||||||
|
ABC does not replace Screamer. ABC specifies /what/ must hold; Screamer verifies
|
||||||
|
/whether/ it holds against the fact store. The contract-lifecycle domain already
|
||||||
|
planned for Phase 0b (signal chain) can be implemented as an ABC-like structure:
|
||||||
|
a tuple of preconditions, invariants, governance rules, and recovery mechanisms,
|
||||||
|
each expressed as Screamer-verifiable facts with Merkle provenance.
|
||||||
|
|
||||||
|
See also:
|
||||||
|
- Bhardwaj, V.P. (2026). Agent Behavioral Contracts: Formal Specification and
|
||||||
|
Runtime Enforcement for Reliable Autonomous AI Agents. arXiv:2602.22302.
|
||||||
|
|
||||||
|
** The Merkle DAG IS the Key Event Log
|
||||||
|
|
||||||
|
Agora's KEL specification (Section 02) describes an append-only log of key
|
||||||
|
events — inception, rotation, revocation, follow events. Passepartout's Merkle
|
||||||
|
DAG (Phase 5, built on v0.2.0 memory-object infrastructure) is this log. Each
|
||||||
|
key event is a fact in the =:key-lifecycle= domain. Each event has a
|
||||||
|
=:parent-id= chaining to the previous event. The DAG is content-addressed —
|
||||||
|
every event is a CID. The full KEL is queryable: =/audit did:agora:heather=
|
||||||
|
renders every key event, every follow event, every contract signature, with
|
||||||
|
provenance chains.
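Reading the KEL back out of the DAG is a walk along =:parent-id= links. A minimal sketch, assuming events are plists stored in a hash table keyed by CID. The =:kind= key in the usage below and the name =key-event-log= are hypothetical.

#+begin_src lisp
(defun key-event-log (events head-cid)
  "Follow :parent-id links from HEAD-CID back to the inception event.
EVENTS is a hash table from CID to event plist. Returns events oldest first."
  (let ((log '()))
    (loop for cid = head-cid then (getf event :parent-id)
          for event = (and cid (gethash cid events))
          while event
          do (push event log))
    log))
#+end_src

Usage, with three chained events:

#+begin_src lisp
(let ((events (make-hash-table :test #'equal)))
  (setf (gethash "c1" events) '(:cid "c1" :kind :inception :parent-id nil)
        (gethash "c2" events) '(:cid "c2" :kind :rotation :parent-id "c1")
        (gethash "c3" events) '(:cid "c3" :kind :revocation :parent-id "c2"))
  (mapcar (lambda (event) (getf event :kind))
          (key-event-log events "c3")))
#+end_src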
|
||||||
|
|
||||||
|
* Relation to the Neurosymbolic Roadmap
|
||||||
|
|
||||||
|
The Agora integration is not a new phase. It is a consequence of decisions
|
||||||
|
already made:
|
||||||
|
|
||||||
|
| Roadmap item | Agora consequence |
|
||||||
|
|-------------------------+----------------------------------------------------------------|
|
||||||
|
| Phase 0b (key registry) | Key registry uses Agora DIDs. DID store is =:key-lifecycle= domain |
|
||||||
|
| Phase 1 (fact store) | Fact store is also Note store. Same API, same hash table |
|
||||||
|
| Phase 1a (self-pres.) | Incoming Notes tracked. Spam DIDs quarantined. Disk eviction |
|
||||||
|
| Phase 3 (archivist) | Archivist walks =~/memex/social/notes/= alongside local dirs |
|
||||||
|
| Phase 4 (sufficiency) | Agora Notes are a provenance category in the sufficiency score |
|
||||||
|
| Phase 5 (Merkle DAG) | DAG = KEL. DAG = contract audit trail |
|
||||||
|
| Phase 0b (signal chain) | Signal chain = contract lifecycle chain. Same Merkle linking |
|
||||||
|
|
||||||
|
No new lines in the roadmap. The Note publishing function (~40 lines) is a
|
||||||
|
utility, not a phase.
|
||||||
|
|
||||||
|
* What Is NOT Built
|
||||||
|
|
||||||
|
1. *A separate Note parser.* Agora Notes ARE Org files. The existing Org parser
|
||||||
|
reads both.
|
||||||
|
2. *A separate Note store.* The =memory-object= struct stores both. The
|
||||||
|
=*memory-store*= hash table holds both.
|
||||||
|
3. *A separate extraction path for Agora content.* The archivist extracts facts
|
||||||
|
from prose regardless of origin. The provenance tag distinguishes source.
|
||||||
|
4. *A new authentication mechanism for Agora signals.* Gate vector 0 verifies
|
||||||
|
DID signatures. The key registry is the DID registry.
|
||||||
|
|
||||||
|
See also:
|
||||||
|
- =projects/agora/docs/= — Agora requirements (overview, identity, primitive, social, contracts, governance)
|
||||||
|
- =projects/agora/TODO.org= — Passepartout integration track
|
||||||
|
- =passepartout-neurosymbolic-design-decisions-and-options.org= — the full design rationale
|
||||||
|
- =passepartout-neurosymbolic-roadmap.org= — the phased implementation plan
|
||||||
@@ -1,719 +1,24 @@
|
|||||||
#+TITLE: Passepartout Neurosymbolic Engine — Design Decisions and Architecture Options
|
#+TITLE: Passepartout Neurosymbolic Engine — Design Decisions — SUPERSEDED
|
||||||
#+AUTHOR: Agent
|
#+AUTHOR: Agent
|
||||||
#+FILETAGS: :notes:design-decisions:neurosymbolic:architecture:v3.0.0:
|
#+FILETAGS: :notes:design-decisions:neurosymbolic:superseded:
|
||||||
#+CREATED: [2026-05-08 Fri]
|
#+CREATED: [2026-05-08 Fri]
|
||||||
|
#+SUPERSEDED: [2026-05-10 Sun]
|
||||||
* The Hallucination Problem — Why Neurosymbolic
|
|
||||||
|
This document has been consolidated into ~passepartout/docs/DESIGN_DECISIONS.org~. The unified document interleaves the neurosymbolic design rationale into nine thematic parts with a single narrative arc:
|
||||||
An LLM is a statistical engine trained on token sequences. It generates the most
|
|
||||||
probable continuation of a prompt. Given sufficient context, that continuation is
|
| Part | Topic | Key New Sections |
|
||||||
correct. Given novel context, it is often wrong in confident-sounding ways.
|
|------|-------|-----------------|
|
||||||
|
| I | Foundation | Historical Lineage (McCarthy) |
|
||||||
This is not a training deficiency. Hallucination is a fundamental property of
|
| II | The Two Brains | Hallucination Problem, 10-80-10, Brain/Education metaphor |
|
||||||
probabilistic inference. You can reduce it with better models, longer contexts,
|
| III | Safety & Self-Preservation | Active Third Law, Layered Signal Authentication |
|
||||||
and clever prompting, but you cannot eliminate it by making the LLM better. You
|
| IV | The Symbolic Engine | Five Options, Chosen Path, Gate-to-Fact Bootstrap, LLM as Proposer, Cardinality Policies, Organic Ontology, Ontology Versioning, Sufficiency Criterion, Merkle DAG, Abstract Fact Store Interface |
|
||||||
eliminate it by not asking the LLM to do things that require certainty.
|
| V | Knowledge Sources | Semantic Wikipedia, MOMo Empirical Validation |
|
||||||
|
| VI | Implementation Properties | Performance Scaling, Provenance as Product |
|
||||||
This is the architectural bet at the heart of Passepartout's neurosymbolic design.
|
| VII | Engineering Infrastructure | REPL, Cybernetic Loop, Observability, Literate Programming, Eval Harness, MCP, Local-First, Token Economics, Time Awareness (carried over from existing) |
|
||||||
The LLM should not be the reasoning engine. It should be the *creative* engine —
|
| VIII | Validation | Marcus, CREST, KiL philosophical validation; Competitive Argument |
|
||||||
proposing possibilities, surfacing connections, translating between natural
|
| IX | Open Questions | Fact language, human role, Wikidata scope, natural language interface, graph query performance |
|
||||||
language and formal representation. The *reasoning* engine should be symbolic:
|
|
||||||
deterministic, verification-grounded, provenance-tracked, and incapable of
|
Cross-references are preserved in:
|
||||||
hallucination by construction.
|
- ~notes/passepartout-symbolic-engine-exploration.org~
|
||||||
|
- ~notes/passepartout-whitehead.org~
|
||||||
This is not a rejection of neural methods. It is a division of labor. The neuro
|
- ~notes/competitive-landscape.org~
|
||||||
is the brain — generative, associative, creative, comfortable with ambiguity. It
|
|
||||||
produces hypotheses. The symbolic engine is the education — accumulated, verified,
|
|
||||||
provenance-tracked knowledge that the brain draws on and is disciplined by. It
|
|
||||||
doesn't think. It remembers, checks, and constrains.
|
|
||||||
|
|
||||||
The brain is always smarter than the education, but the education prevents the
|
|
||||||
brain from being confidently wrong.
|
|
||||||
|
|
||||||
** See also:
|
|
||||||
|
|
||||||
- =passepartout/docs/DESIGN_DECISIONS.org=: "The Probabilistic-Deterministic Split"
|
|
||||||
for the gate-level version of this argument.
|
|
||||||
- =notes/passepartout-whitehead.org=: Whitehead's ramified theory of types as
|
|
||||||
the structural guarantee against self-referential contradictions.
|
|
||||||
- =notes/passepartout-symbolic-engine-exploration.org=: the full design space and
|
|
||||||
the lossiness problem at the neural-symbolic boundary.
|
|
||||||
|
|
||||||
* The Five Architecture Options
|
|
||||||
|
|
||||||
The symbolic engine must relate to the human memex. The relationship is not
|
|
||||||
obvious because knowledge lives in two incompatible forms: natural language
|
|
||||||
prose (what the human reads and writes) and formal facts (what the symbolic
|
|
||||||
engine reasons about). The translation between them is lossy by nature. The
|
|
||||||
architecture is defined by how it handles that lossiness.
|
|
||||||
|
|
||||||
=notes/passepartout-symbolic-engine-exploration.org= explores five options. They are
|
|
||||||
summarized here to make subsequent decisions legible.
|
|
||||||
|
|
||||||
** Option 1: The Auto-Formalizer
|
|
||||||
|
|
||||||
A separate knowledge graph stores symbolic facts. The LLM populates it by
|
|
||||||
extracting triples from unstructured data — documentation, manuals, logs,
|
|
||||||
session histories. The KG becomes co-authoritative with the human prose.
|
|
||||||
|
|
||||||
This is the simplest to implement but inherits the dual-representation problem
|
|
||||||
in its most acute form. The KG and the prose can disagree, and the architecture
|
|
||||||
provides no mechanism for resolving disagreements. It also stores knowledge
|
|
||||||
twice — once in the user's Org files, once in the KG — with no guarantee that
|
|
||||||
they stay synchronized.
|
|
||||||
|
|
||||||
** Option 2: Two Intentionally Separate Memexes
|
|
||||||
|
|
||||||
The human memex contains prose: thoughts, diaries, decisions, documentation.
|
|
||||||
The symbolic memex contains formal facts: constraints, rules, relationships,
|
|
||||||
deductions. The archivist bridges between them but does not try to keep them
|
|
||||||
synchronized. They are allowed to diverge because they serve different purposes.
|
|
||||||
The prose captures what the human intended. The symbolic memex captures what
|
|
||||||
the symbolic engine has proven.
|
|
||||||
|
|
||||||
This is philosophically honest — it admits that no lossless translation between
|
|
||||||
natural language and formal logic is possible. But it forces the user to reason
|
|
||||||
about two separate knowledge stores and understand when to trust each.
|
|
||||||
|
|
||||||
** Option 3: Tangled Fact Blocks in Org Files
|
|
||||||
|
|
||||||
The tangle mechanism already handles the dual-representation problem for code.
|
|
||||||
Lisp code lives in literate blocks within Org files (=#+begin_src lisp=). The
|
|
||||||
tangle mechanism extracts these blocks and generates =.lisp= files. A new block
|
|
||||||
type — =#+begin_src knowledge= — would contain symbolic facts in a formal
|
|
||||||
language. The tangle mechanism would load these facts into the symbolic engine's
|
|
||||||
in-memory store, just as it loads Lisp code into the SBCL image.
|
|
||||||
|
|
||||||
This is aesthetically appealing because it unifies the format. One toolchain,
|
|
||||||
one version control system, one Merkle tree. But the block language itself IS
|
|
||||||
the knowledge representation language, and that language is the ontology we
|
|
||||||
have not yet defined. The format is unified but the content is unspecified.
|
|
||||||
|
|
||||||
** Option 4: One Memex, Two Indices
|
|
||||||
|
|
||||||
The prose remains in human language in Org files. The prose is always the ground
|
|
||||||
truth. Two indices sit on top of the prose as derived views:
|
|
||||||
|
|
||||||
- The *neural index* uses vector embeddings to enable semantic search. The LLM
|
|
||||||
navigates the prose through embedding space, retrieving relevant headings.
|
|
||||||
- The *symbolic index* stores formal assertions about what the prose says —
|
|
||||||
predicates, relations, constraints — each grounded to a specific heading or
|
|
||||||
block in the Org file.
|
|
||||||
|
|
||||||
Each index serves its own side of the machine. They do not need to understand
|
|
||||||
each other's representations. They only need to agree on which heading or block
|
|
||||||
they are referring to. Because the prose is always the ground truth, the symbolic
|
|
||||||
index can be thrown away and rebuilt from scratch if it becomes corrupted or
|
|
||||||
stale. No information is lost — only the extracted assertions.
|
|
||||||
|
|
||||||
** Option 5: Ephemeral Symbolic Facts
|
|
||||||
|
|
||||||
No persistence, no serialization format, no knowledge graph stored on disk.
|
|
||||||
VivaceGraph exists in memory during the session. Screamer derives facts from the
|
|
||||||
prose as needed. When the session ends, the facts are discarded and re-derived
|
|
||||||
from the prose on the next start.
|
|
||||||
|
|
||||||
This punts the ontological design problem entirely. You never have to decide on
|
|
||||||
a serialization format because you never serialize. The cost is compute
|
|
||||||
(re-derivation on every restart) and the inability to accumulate facts across
|
|
||||||
sessions. But it is the correct first step — a way to learn what kinds of facts
|
|
||||||
are actually useful before committing to a storage format.
|
|
||||||
|
|
||||||
* The Chosen Path: Option 4, Starting with Option 5
|
|
||||||
|
|
||||||
The one-memex-two-indices architecture (Option 4) is the correct long-term
|
|
||||||
architecture. The prose is the ground truth. The symbolic index is a derived
|
|
||||||
view that can be rebuilt. The neural index handles what the symbolic index
|
|
||||||
cannot — semantic search, fuzzy matching, associative leaps.
|
|
||||||
|
|
||||||
But committing to a persistence format before knowing what facts are useful
|
|
||||||
is premature. The practical path starts with Option 5 (ephemeral facts) as the
|
|
||||||
Phase 1-4 implementation, then graduates to Option 4 with VivaceGraph
|
|
||||||
persistence in Phase 5 when the fact language has been battle-tested (see
|
|
||||||
=passepartout-neurosymbolic-roadmap.org=).
|
|
||||||
|
|
||||||
** Why the dual index is permanent, not transitional
|
|
||||||
|
|
||||||
In the coding domain, there is an aspiration that the symbolic index could
|
|
||||||
eventually capture enough of the prose's propositional content to become a
|
|
||||||
complete representation — the "flip" described in the architecture note. But
|
|
||||||
for the broader memex (literature, poetry, personal reflection, daily logs),
|
|
||||||
completeness is neither possible nor desirable. You cannot formalize what makes
|
|
||||||
a poem beautiful. You cannot extract a triple that captures the emotional weight
|
|
||||||
of a diary entry. The neural index will always be the gateway to the full
|
|
||||||
richness of the prose. The symbolic index handles what can be mechanically
|
|
||||||
verified: citations, entities, temporal order, contradictions, provenance.
|
|
||||||
The division of labor between the two indices is permanent because the domains
|
|
||||||
they serve are fundamentally different kinds of knowledge.
|
|
||||||
|
|
||||||
* The Neuro as Brain, the Symbolic as Education
|
|
||||||
|
|
||||||
The original 10-80-10 architecture (10% neural, 80% symbolic, 10% neural)
|
|
||||||
describes the target ratios for a *coding* agent — a domain where most reasoning
|
|
||||||
is formalizable. For the broader memex, the ratios are different and less
|
|
||||||
important than the metaphor itself.
|
|
||||||
|
|
||||||
The neuro is the *brain* — generative, associative, creative, comfortable with
|
|
||||||
ambiguity. It produces insights that are provisional, connections that are
|
|
||||||
speculative, hypotheses that may be wrong. It is the driver.
|
|
||||||
|
|
||||||
The symbolic engine is the *education* — accumulated, verified,
|
|
||||||
provenance-tracked knowledge that the brain draws on and is disciplined by. It
|
|
||||||
doesn't think creatively. It remembers, checks, and constrains. It prevents the
|
|
||||||
brain from being confidently wrong.
|
|
||||||
|
|
||||||
This framing resolves a tension in the original architecture. The 10-80-10
|
|
||||||
implies the symbolic engine /replaces/ the neuro for reasoning. But a symbolic
|
|
||||||
engine is terrible at creativity, ambiguity, and associative leaps across
|
|
||||||
unrelated domains — exactly what you need for a memex that contains /Pale Fire/,
|
|
||||||
a shopping list, and a project plan. The brain proposes that your sudden interest
|
|
||||||
in unreliable narrators coincides with a week where your project retrospective
|
|
||||||
used the word "deception." The education verifies: "those two diary entries are
|
|
||||||
4 days apart; the word 'deception' appears in both; here are the headings." The
|
|
||||||
brain makes the leap. The education makes it trustworthy.
|
|
||||||
|
|
||||||
This means the symbolic engine never needs to be "complete." Education isn't
|
|
||||||
complete knowledge — it's structured knowledge. You don't need a fact for every
|
|
||||||
sentence in your diary. You need facts for what can be mechanically verified:
|
|
||||||
dates, citations, entities, contradictions, temporal order. The brain handles
|
|
||||||
the rest.
|
|
||||||
|
|
||||||
* The Gate-to-Fact Bootstrap — Extracting the First Ontology from Existing Code
|
|
||||||
|
|
||||||
The Dispatcher gate stack already encodes an implicit ontology. Every gate
|
|
||||||
vector asserts the existence of a category of things:
|
|
||||||
|
|
||||||
- Gate vector 2 asserts there exists a class of files called /secrets/.
|
|
||||||
- Gate vector 7 asserts there exists a class of commands called /destructive/.
|
|
||||||
- Gate vector 8 asserts there exists a class of domains called /trusted/.
|
|
||||||
- The self-build boundary asserts there exists a class of files called
|
|
||||||
/core-harness/ and a class called /skills/.
|
|
||||||
|
|
||||||
These claims are currently expressed as code — Lisp functions that pattern-match
|
|
||||||
against file paths, shell commands, and URLs. They are not facts the symbolic
|
|
||||||
engine can query, derive from, or check for consistency. But they can be made
|
|
||||||
explicit.
|
|
||||||
|
|
||||||
The bootstrap makes every gate a set of initial symbolic facts:
|
|
||||||
=(:file ".env" :member-of-class :secret-files :source gate-vector-2)=,
|
|
||||||
=(:command "rm -rf /" :classified-as :catastrophic :source gate-vector-7)=,
|
|
||||||
=(:domain "api.telegram.org" :classified-as :trusted :source gate-vector-8)=.
|
|
||||||
|
|
||||||
This produces 50-70 entity classes directly from the existing gate stack,
|
|
||||||
without any new infrastructure:
|
|
||||||
|
|
||||||
| Source                           | Count | Example categories                                      |
|----------------------------------+-------+---------------------------------------------------------|
| ~*dispatcher-protected-paths*~   | 11    | :secret-config-file, :ssh-key-file, :gpg-key-file       |
| ~*dispatcher-shell-blocked*~     | 8     | :catastrophic-command, :injection-pattern               |
| ~*dispatcher-network-whitelist*~ | 2     | :trusted-domain, :untrusted-domain                      |
| Self-build boundary              | 2     | :core-harness-file, :skill-file                         |
| Privacy tags                     | 3     | :private-content, :financial-content                    |
| Permission table                 | 3     | :read-only-tool, :write-tool, :eval-tool                |
| Cognitive tools                  | 6     | :code-search-tool, :file-io-tool, :shell-tool           |
| Relations (all gates)            | ~15   | :member-of-class, :classified-as, :depends-on           |
| Qualities                        | ~8    | :catastrophic, :dangerous, :moderate, :harmless         |
| Provenance sources               | 4     | :gate-outcome, :human-authored, :deduced, :llm-proposed |
|----------------------------------+-------+---------------------------------------------------------|
|
|
||||||
|
|
||||||
This is the seed. It gives Screamer a domain to reason about immediately, without
|
|
||||||
any LLM involvement. It proves the pattern — code becomes facts, facts enable
|
|
||||||
reasoning — at the cost of approximately 30 lines of Lisp.
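
A minimal sketch of the bootstrap, assuming the gate lists are plain lists of
strings. The list contents, the function names =gate-facts= and
=bootstrap-facts=, and the keyword form of =:source= are hypothetical stand-ins
for illustration, not the dispatcher's actual definitions:

#+begin_src lisp
;; Hypothetical stand-ins for the dispatcher's pattern lists.
(defparameter *protected-paths* '(".env" "id_rsa" "secring.gpg"))
(defparameter *shell-blocked* '("rm -rf /" "mkfs"))

(defun gate-facts (kind items relation category source)
  "Turn each gate pattern in ITEMS into one fact plist."
  (mapcar (lambda (item)
            (list kind item relation category
                  :source source :provenance :gate-outcome))
          items))

(defun bootstrap-facts ()
  "Collect the seed facts from every gate list."
  (append
   (gate-facts :file *protected-paths*
               :member-of-class :secret-files :gate-vector-2)
   (gate-facts :command *shell-blocked*
               :classified-as :catastrophic :gate-vector-7)))
#+end_src

Each pattern yields exactly one fact, so the seed grows and shrinks with the
gate lists and needs no separate maintenance.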
|
|
||||||
|
|
||||||
* The LLM as Proposer — Verified Extraction
|
|
||||||
|
|
||||||
The LLM cannot be trusted to populate the symbolic index directly. Its outputs are
|
|
||||||
sampled, not proven. A probabilistic extraction feeding a deterministic engine
|
|
||||||
defeats the purpose of being deterministic.
|
|
||||||
|
|
||||||
But the LLM is still useful. It can surface facts that are obvious to a human
|
|
||||||
reader of prose but would take the symbolic engine many deduction steps to reach
|
|
||||||
independently. The solution is to demote the LLM from /extractor/ to /proposer/:
|
|
||||||
|
|
||||||
1. The archivist reads a prose heading.
|
|
||||||
2. The LLM proposes candidate triples.
|
|
||||||
3. Screamer checks each triple for consistency against the existing fact store.
|
|
||||||
4. Only consistent triples are admitted to the symbolic index, flagged with
|
|
||||||
=:provenance :llm-proposed= and grounded to the source heading.
|
|
||||||
|
|
||||||
The LLM might hallucinate facts that don't correspond to the prose. It might
|
|
||||||
extract facts that contradict existing knowledge. It might produce syntactically
|
|
||||||
malformed triples. None of these failures contaminate the symbolic index because
|
|
||||||
proposals are not admitted automatically. The admission gate (Screamer) is
|
|
||||||
deterministic.
|
|
||||||
|
|
||||||
This is the core architecture pattern. Everything else — the entity classes, the
|
|
||||||
deduction engine, the persistence layer — follows from this single design decision:
|
|
||||||
*the LLM proposes; the symbolic engine decides whether to accept.*
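
The four steps above can be sketched as follows. This is an illustration under
stated assumptions, not the project's code: =contradicts-p= is a plain-Lisp
stand-in for the Screamer check and assumes every relation is single-valued,
and all three function names are hypothetical.

#+begin_src lisp
(defun well-formed-p (triple)
  "True when TRIPLE is a six-element plist with :entity, :relation and :value."
  (and (listp triple)
       (= (length triple) 6)
       (getf triple :entity)
       (getf triple :relation)
       (getf triple :value)
       t))

(defun contradicts-p (triple store)
  "Stand-in for the Screamer check (assumption: relations are single-valued).
A proposal contradicts the store when the same entity and relation already
carry a different value."
  (some (lambda (fact)
          (and (eq (getf fact :entity) (getf triple :entity))
               (eq (getf fact :relation) (getf triple :relation))
               (not (equal (getf fact :value) (getf triple :value)))))
        store))

(defun admit-proposals (proposals store heading)
  "Return STORE extended with the proposals that pass both checks.
Rejected proposals are dropped and never touch the store."
  (dolist (triple proposals store)
    (when (and (well-formed-p triple)
               (not (contradicts-p triple store)))
      (push (append triple
                    (list :provenance :llm-proposed :grounding heading))
            store))))
#+end_src

Malformed and contradictory proposals fail the same deterministic test on
every run, which is the property the admission gate depends on.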
|
|
||||||
|
|
||||||
* Three Contradiction Policies — Domain-Dependent Consistency
|
|
||||||
|
|
||||||
Classical logic requires consistency. A contradiction implies everything
|
|
||||||
(/ex contradictione quodlibet/). Screamer, as a constraint solver, also requires
|
|
||||||
consistency — a contradictory constraint set has no solutions. But the symbolic
|
|
||||||
engine operates across domains where the meaning of contradiction is fundamentally
|
|
||||||
different.
|
|
||||||
|
|
||||||
A single architecture serves all domains by applying different contradiction
|
|
||||||
policies, scoped to the entity class:
|
|
||||||
|
|
||||||
** Policy :exclusive — Contradiction Rejected at Admission
|
|
||||||
|
|
||||||
For domains where the world is physically singular — a file either exists or it
|
|
||||||
doesn't, a command either was blocked or it wasn't, a gate rule either applies or
|
|
||||||
it doesn't. When a new fact contradicts an existing one in an :exclusive domain,
|
|
||||||
the new fact is rejected. The existing fact is authoritative unless a human
|
|
||||||
explicitly retracts it.
|
|
||||||
|
|
||||||
Use for: security classifications, file system state, gate rules, code
|
|
||||||
correctness, deterministic safety constraints.
|
|
||||||
|
|
||||||
** Policy :coexistent — Contradiction Flagged, Both Retained
|
|
||||||
|
|
||||||
For domains where multiple truths coexist — literary interpretations, historical
|
|
||||||
accounts, personal beliefs held at different times, multi-source factual
|
|
||||||
disagreement (Wikidata vs. DBpedia vs. your memex). When a new fact contradicts
|
|
||||||
an existing one in a :coexistent domain, the contradiction is recorded with a
|
|
||||||
cross-reference flag. Both facts are stored. Queries return all facts with
|
|
||||||
provenance display.
|
|
||||||
|
|
||||||
Use for: literature, history, personal knowledge evolution, scientific consensus
|
|
||||||
shift, multi-author knowledge bases.
|
|
||||||
|
|
||||||
** Policy :temporal — Contradiction Accepted as Version Change
|
|
||||||
|
|
||||||
For domains where truth changes over time. When a new fact contradicts an old one
|
|
||||||
in a :temporal domain, the old fact is marked =:superseded= but retained. The
|
|
||||||
timeline is queryable: "You believed X on Tuesday, Y on Friday, Z on Sunday."
|
|
||||||
|
|
||||||
Use for: personal belief evolution, project plan revisions, scientific
|
|
||||||
consensus shift over time, any knowledge where the change itself is information.
|
|
||||||
|
|
||||||
** Policy Assignment
|
|
||||||
|
|
||||||
The policy is assigned when a category is defined. New categories default to
|
|
||||||
=:coexistent= (never loses information). Core security categories are explicitly
|
|
||||||
=:exclusive=. The gate stack's bootstrapped facts are =:exclusive= because they
|
|
||||||
describe the actual filesystem, not perspectives.
|
|
||||||
|
|
||||||
The Screamer admission gate does not reject all contradictions. It rejects
|
|
||||||
contradictions in =:exclusive= domains, cross-references them in =:coexistent=
domains, and records them as supersessions in =:temporal= domains. The constraint solver still works because queries scope
|
|
||||||
their constraint set to a single provenance domain. "Is X true according to my
|
|
||||||
memex?" is a different query than "Is X true according to Wikidata?" Each has
|
|
||||||
a self-consistent internal logic. The contradiction is between domains, not
|
|
||||||
within them.
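
A small sketch of how the three policies could dispatch once a contradiction
between an existing fact and a new one has been detected. The function name
=apply-policy= and the choice to store the opposing /value/ in
=:contradicted-by= and =:superseded-by= are assumptions made for illustration:

#+begin_src lisp
(defun apply-policy (policy old new)
  "Return the list of facts to keep when NEW contradicts OLD under POLICY.
OLD and NEW are fact plists."
  (ecase policy
    ;; The existing fact is authoritative; the new one is dropped.
    (:exclusive
     (list old))
    ;; Both facts are kept and cross-referenced.
    (:coexistent
     (list (append old (list :contradicted-by (getf new :value)))
           (append new (list :contradicted-by (getf old :value)))))
    ;; The old fact is kept but marked as superseded by the new one.
    (:temporal
     (list (append old (list :superseded-by (getf new :value)))
           new))))
#+end_src

Only =:exclusive= discards anything, which is why =:coexistent= is a safe
default for new categories.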
|
|
||||||
|
|
||||||
** Why This Matters for the Broader Memex
|
|
||||||
|
|
||||||
In the coding domain, contradiction is rare and must be resolved — a gate can't
|
|
||||||
both allow and block the same path. In the broader memex, contradiction is the
|
|
||||||
product, not the error. Your poetry analysis contradicts your last diary entry
|
|
||||||
on the same topic. Your reading of /Pale Fire/ changed between 2023 and 2025.
|
|
||||||
Suppose Wikidata records Mount Everest at 8849m (the 2020 China-Nepal survey: snow height) and DBpedia records 8844m
(China's 2005 survey: rock height). The symbolic engine's job is not to decide which is right.
|
|
||||||
It is to surface the tension with provenance — "these three sources disagree.
|
|
||||||
Here is the chain for each."
|
|
||||||
|
|
||||||
* How Categories Grow — The Organic Ontology
|
|
||||||
|
|
||||||
Whitehead and Russell's /Principia Mathematica/ took over 300 pages to define the logical
|
|
||||||
foundations before it could prove that one plus one equals two. Every category
|
|
||||||
introduced carried a burden of justification. Every inference rule had to be
|
|
||||||
demonstrated sound. This is the classical approach to ontology: define everything
|
|
||||||
upfront, exhaustively, formally.
|
|
||||||
|
|
||||||
Passepartout cannot afford this and does not need it. Its domain is bounded
|
|
||||||
(software engineering, personal knowledge, literary engagement, daily life) and
|
|
||||||
its ontology grows from the system's own operation:
|
|
||||||
|
|
||||||
1. *The gate stack seeds the ontology.* Every gate vector is an implicit claim
|
|
||||||
about a category of things. The bootstrap makes these claims explicit. The
|
|
||||||
seed is 50-70 entity classes with no human authoring required — they are
|
|
||||||
mechanically extracted from the existing code.
|
|
||||||
|
|
||||||
2. *New gate vectors add categories directly.* As the Dispatcher grows (new
|
|
||||||
shell patterns, new path protections, new tool classifications), the ontology
|
|
||||||
grows with it. Every new pattern in the gate stack becomes a fact on skill
|
|
||||||
load. No human effort. The gate stack grows, the ontology grows.
|
|
||||||
|
|
||||||
3. *Screamer generalizes from gate outcomes.* After 37 shell commands are blocked
|
|
||||||
as destructive, Screamer extracts structural commonalities: "commands writing
|
|
||||||
to block devices," "commands recursively deleting outside the workspace."
|
|
||||||
These become new subcategories (=:block-device-command=,
|
|
||||||
=:workspace-external-delete=) that didn't exist in the original gate patterns.
|
|
||||||
The ontology deepens through observation.
|
|
||||||
|
|
||||||
4. *The archivist proposes from prose.* The archivist reads a diary entry about
|
|
||||||
a book: "Nabokov's lectures on Kafka." The LLM proposes =(:entity :nabokov
|
|
||||||
:relation :lectures-on :value :kafka)=. Screamer checks consistency. Admitted.
|
|
||||||
The categories =:author=, =:lectures-on=, and =:subject= didn't exist before —
|
|
||||||
they are created on first use. This is the primary growth mechanism for the
|
|
||||||
broader memex.
|
|
||||||
|
|
||||||
5. *The human declares explicitly.* The human writes a declarative fact directly
|
|
||||||
into the symbolic index. No extraction step. No LLM involvement. The fact is
|
|
||||||
admitted with =:provenance :human-authored= — the highest trust level.
|
|
||||||
|
|
||||||
6. *Temporal patterns crystallize into categories.* Every Sunday the memex gets a
|
|
||||||
retrospective heading. Every Monday a planning heading. The time-awareness
|
|
||||||
system observes the periodicity and proposes =:weekly-retrospective= and
|
|
||||||
=:weekly-planning= as fact types. Screamer verifies they don't contradict
|
|
||||||
existing categorizations. Admitted.
|
|
||||||
|
|
||||||
7. *Cross-domain overlap produces parent categories.* Screamer notices that
|
|
||||||
=:secret-files= (from the gate stack) and =:private-content= (from privacy
|
|
||||||
tags) share members — =.env= is both a secret file and private content. It
|
|
||||||
proposes =:sensitive-material= as a parent with both as children. Taxonomy
|
|
||||||
building happens automatically through overlap detection.
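
The overlap detection in item 7 can be sketched with a set intersection. The
membership table, the =:subclass-of= relation, and both function names are
hypothetical; in the design described here the resulting proposals would still
have to pass the admission check before they count as facts:

#+begin_src lisp
(defun shared-members (class-a class-b membership)
  "MEMBERSHIP is an alist of (class . members).
Return the members that belong to both classes."
  (intersection (cdr (assoc class-a membership))
                (cdr (assoc class-b membership))
                :test #'equal))

(defun propose-parent (class-a class-b parent membership)
  "Propose PARENT over both classes when they share at least one member.
Returns two candidate triples, or NIL when there is no overlap."
  (when (shared-members class-a class-b membership)
    (list (list :entity class-a :relation :subclass-of :value parent)
          (list :entity class-b :relation :subclass-of :value parent))))
#+end_src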
|
|
||||||
|
|
||||||
** Growth is self-limiting by design
|
|
||||||
|
|
||||||
Not every conceivable category is added. The system prunes through use:
|
|
||||||
|
|
||||||
- New categories are admitted only through Screamer's consistency check. A
|
|
||||||
category that contradicts an existing classification is rejected.
|
|
||||||
- A category that never gets queried costs nothing (a hash table entry) but
|
|
||||||
produces no value. It fades from use naturally.
|
|
||||||
- Overly fine-grained categories (=.env.foo.bar.baz= as its own class) are
|
|
||||||
rejected because they are redundant with the wildcard pattern that already
|
|
||||||
covers them.
|
|
||||||
- Overly broad categories that subsume meaningful distinctions ("everything is
|
|
||||||
a =:file=") produce contradictions when Screamer tries to apply existing rules.
|
|
||||||
Rejected.
|
|
||||||
|
|
||||||
The system converges on a useful granularity through use, not through upfront
|
|
||||||
design. The gate stack provides the seed. Gate outcomes, prose extraction,
|
|
||||||
deduction, and human authoring grow the shoots. Screamer prunes contradictions.
|
|
||||||
The ontology is a garden, not a building.
|
|
||||||
|
|
||||||
* Semantic Wikipedia as Entity Backbone
|
|
||||||
|
|
||||||
The gate stack provides 50-70 entity classes — adequate for a coding agent where
|
|
||||||
the domain is bounded to files, commands, and code symbols. For a general-knowledge
|
|
||||||
memex, 50-70 is starvation. Your memex mentions Nabokov, /Pale Fire/, Kinbote,
|
|
||||||
Zembla, paranoid reading, unreliable narrators, postmodernism, butterfly
|
|
||||||
migration, chess problems, and the Russian exile experience. The gate stack knows
|
|
||||||
none of these. Organic growth through prose extraction would take years just to
|
|
||||||
cover the entities in one person's engagement with a single novel.
|
|
||||||
|
|
||||||
Wikidata has already done this work: approximately 2 million entity classes, over
|
|
||||||
100 million entities, a decade of human curation. By loading the neighborhood of
|
|
||||||
your memex into the symbolic index (entities referenced in your prose, plus their
|
|
||||||
N-hop property net from Wikidata), the entity recognition problem vanishes. The
|
|
||||||
archivist doesn't need to discover Nabokov from your diary. It needs to connect
|
|
||||||
your heading to the existing Wikidata entity. That is a simpler task — reference
|
|
||||||
resolution, not knowledge extraction.
|
|
||||||
|
|
||||||
The LLM's role shrinks to three thin boundaries:
|
|
||||||
|
|
||||||
1. *Input translation* — natural language question to structured query. "What do
|
|
||||||
I think about monorepos?" → =(fact-query :entity :monorepo :relation :opinion
|
|
||||||
:source :memex)=. Formulaic, ~100 tokens, any model sufficient.
|
|
||||||
|
|
||||||
2. *Prose to candidate triple* — for personal memex entries that have no Wikidata
|
|
||||||
counterpart: your opinions, your day's events, your project plans. Proposals
|
|
||||||
are verified by Screamer before admission. This is the only extraction path
|
|
||||||
that still requires an LLM, and its scope is limited to what Wikidata cannot
|
|
||||||
provide — your subjective, personal, or novel content.
|
|
||||||
|
|
||||||
3. *Result to prose* — structured answer to readable sentence. "Your 2023 diary
|
|
||||||
says 8848m. Wikidata (last edited Feb 2024) says 8849m. They disagree on
|
|
||||||
height." The reasoning is done; the LLM wraps the plist in grammar. ~100
|
|
||||||
tokens, any model sufficient, purely cosmetic. Users who prefer no LLM at all
|
|
||||||
can navigate through command-driven interaction (=/query=, =/contradictions=,
|
|
||||||
=/audit=, =/context why=).
|
|
||||||
|
|
||||||
Everything else — the gate stack, the fact store, the constraint solver, the type
|
|
||||||
hierarchy, the provenance tracking, the contradiction surfacing, the cross-domain
|
|
||||||
comparison — is pure deterministic Lisp with zero LLM tokens.
|
|
||||||
|
|
||||||
** The decisive simplification
|
|
||||||
|
|
||||||
Without Semantic Wikipedia, the archivist must /discover/ entities from prose:
|
|
||||||
extract a triple for every person, place, work, concept, and event mentioned in
|
|
||||||
the memex. This is unbounded LLM work and the quality depends on extraction
|
|
||||||
accuracy.
|
|
||||||
|
|
||||||
With Wikidata loaded, the entity graph is pre-structured. The archivist's job
|
|
||||||
changes from "discover that Nabokov wrote /Pale Fire/ and lectured on Kafka" to
|
|
||||||
"verify that the Nabokov referenced in heading #47 is the same entity as Wikidata
|
|
||||||
item Q36591." The second task is simpler, more reliable, and in many cases can
|
|
||||||
be done without an LLM at all — a simple entity name match against the loaded
|
|
||||||
Wikidata graph may suffice for unambiguous names.
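
A sketch of the LLM-free name match, assuming the loaded Wikidata neighborhood
exposes a label-to-identifier table. The alist shape and the function name
=match-entity= are assumptions; a real label index would also need aliases and
language handling:

#+begin_src lisp
(defun match-entity (name label-index)
  "LABEL-INDEX is an alist of (label . id).
Return the id when NAME matches exactly one entity, otherwise NIL so the
caller can fall back to disambiguation."
  (let ((hits (remove-duplicates
               (loop for (label . id) in label-index
                     when (string-equal name label)
                       collect id)
               :test #'equal)))
    (if (and hits (null (rest hits)))
        (first hits)
        nil)))
#+end_src

Returning NIL for both "no match" and "several matches" keeps the fast path
strict: only unambiguous names resolve without further help.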
|
|
||||||
|
|
||||||
* The "Flip" — From Lossy Extraction to Deterministic Derivation
|
|
||||||
|
|
||||||
The symbolic index begins its life as a lossy construct. The initial extraction
|
|
||||||
from the prose — the first population of facts from LLM proposals verified by
|
|
||||||
Screamer — is built from an uncertain foundation. Some facts are correct. Some
|
|
||||||
are missing. Some are wrong.
|
|
||||||
|
|
||||||
But the symbolic engine accumulates non-lossy facts through three independent
|
|
||||||
mechanisms:
|
|
||||||
|
|
||||||
1. *Gate outcomes* — every gate rejection is a fact. No LLM involved. These
|
|
||||||
accumulate at the rate of user interactions.
|
|
||||||
2. *Screamer deductions* — new facts derived from existing facts. No LLM
|
|
||||||
involved. These accumulate whenever the fact store crosses a density threshold
|
|
||||||
where structural patterns emerge.
|
|
||||||
3. *Human authoring* — the human explicitly declares facts. No LLM involved.
|
|
||||||
|
|
||||||
At some point, the non-lossy facts constitute a sufficient foundation that the
|
|
||||||
symbolic engine can reverse the flow: instead of the LLM extracting facts from
|
|
||||||
prose, the symbolic engine reads prose through its own lens — its now-substantial
|
|
||||||
ontology of categories, rules, and constraints — and asserts facts in its own
|
|
||||||
language. The extraction mechanism ceases to be probabilistic and becomes
|
|
||||||
deterministic.
|
|
||||||
|
|
||||||
** The sufficiency criterion
|
|
||||||
|
|
||||||
The architecture note (=notes/passepartout-symbolic-engine-exploration.org=) describes
|
|
||||||
this "flip" as aspirational: "at some point, the non-lossy facts constitute a
|
|
||||||
sufficient foundation." This design decision makes it operational:
|
|
||||||
|
|
||||||
=(/ (count-provenance :gate-outcome :human-authored :deduced) total-facts)=
|
|
||||||
|
|
||||||
When this ratio exceeds a configurable threshold (=SUFFICIENCY_THRESHOLD=,
|
|
||||||
default 0.7), the system considers its foundation sufficient. The archivist
|
|
||||||
switches from "LLM proposes, Screamer verifies" to "Screamer queries existing
|
|
||||||
facts, applies to the new prose, and deduces new facts directly."
|
|
||||||
|
|
||||||
The flip is visible to the user through the TUI sidebar or =/status= command:
|
|
||||||
"Symbolic index: 847 facts (73% non-lossy, 12% LLM-proposed, 15% Wikidata).
|
|
||||||
Sufficient foundation: YES."
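
The criterion is a one-line ratio. A sketch, assuming facts are plists with a
=:provenance= key; the names =*sufficiency-threshold*=, =non-lossy-ratio=, and
=sufficient-p= are illustrative and not taken from the codebase:

#+begin_src lisp
(defparameter *sufficiency-threshold* 0.7
  "Share of non-lossy facts above which the foundation counts as sufficient.")

(defun non-lossy-ratio (facts)
  "Fraction of FACTS whose provenance involves no LLM. Zero for an empty store."
  (if (null facts)
      0
      (/ (count-if (lambda (fact)
                     (member (getf fact :provenance)
                             '(:gate-outcome :human-authored :deduced)))
                   facts)
         (length facts))))

(defun sufficient-p (facts)
  "True when the non-lossy share exceeds the threshold."
  (> (non-lossy-ratio facts) *sufficiency-threshold*))
#+end_src

With three non-lossy facts out of four the ratio is 3/4, which exceeds the
default threshold of 0.7.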
|
|
||||||
|
|
||||||
** The flip does not mean "complete"
|
|
||||||
|
|
||||||
In the broader memex, completeness is neither possible nor desirable. The flip
|
|
||||||
means "deterministic enough to be trustworthy," not "comprehensive enough to be
|
|
||||||
self-sufficient." The neural index remains the gateway to the full richness of
|
|
||||||
prose. The symbolic index handles what can be mechanically verified. The boundary
|
|
||||||
is permanent.
|
|
||||||
|
|
||||||
* Ephemeral First, Persistent Later
|
|
||||||
|
|
||||||
The architecture note's Option 5 (ephemeral facts, no disk persistence) is the
|
|
||||||
correct first implementation. Three reasons:
|
|
||||||
|
|
||||||
1. *The fact language is unproven.* Triples with provenance and grounding are a
|
|
||||||
hypothesis. It may be too simple for some domains, too complex for others.
|
|
||||||
Committing to a serialization format before knowing what's useful is premature.
|
|
||||||
|
|
||||||
2. *The ontology is emergent.* Categories are created on first use. What proves
|
|
||||||
useful stays; what doesn't fades. A persistent format would need a migration
|
|
||||||
story every time the category structure changes. Ephemeral avoids this entirely
|
|
||||||
— the facts are re-derived on each session start using the current (evolved)
|
|
||||||
ontology.
|
|
||||||
|
|
||||||
3. *Rebuildability is the safety net.* Because all facts have a =:grounding= to
|
|
||||||
an Org heading, and gate-outcome facts are regenerated from the gate stack on
|
|
||||||
every load, the entire symbolic index can be thrown away and rebuilt from
|
|
||||||
scratch. The cost is compute, not data. This is the practical realization of
|
|
||||||
"the prose is always the ground truth."
|
|
||||||
|
|
||||||
The transition to persistence (Phase 5: VivaceGraph) happens when two conditions
|
|
||||||
are met: the fact language has stabilized through use, and the accumulated
|
|
||||||
deductions across sessions provide value that justifies the serialization cost.
|
|
||||||
|
|
||||||
* Whitehead's Concrete Contributions: Four Engineerable Ideas
|
|
||||||
|
|
||||||
=notes/passepartout-whitehead.org= extracts four concrete, engineerable ideas
|
|
||||||
from Whitehead's /Principia Mathematica/ and /Process and Reality/. They are
|
|
||||||
summarized here because each informs the neurosymbolic design.
|
|
||||||
|
|
||||||
** Contribution 1: PM-Type-Level Gates
|
|
||||||
|
|
||||||
PM's ramified theory of types solved Russell's paradox by assigning every
|
|
||||||
propositional function a type level, making self-application syntactically
|
|
||||||
invalid. Passepartout applies the same principle to prevent a request from
|
|
||||||
modifying the rules that validate it. Every cognitive tool and gate vector
|
|
||||||
carries a =:type-level= integer. Before any gate predicate runs, the dispatcher
|
|
||||||
checks: if the signal's type level equals or exceeds the gate's type level, the
|
|
||||||
signal is rejected. A request to modify dispatcher rules (type-level 5) cannot
|
|
||||||
pass a gate of type-level 5 or lower. This is a structural prohibition, not a
|
|
||||||
heuristic — self-modification of the safety layer is impossible by construction.
|
|
||||||
|
|
||||||
Implementation: approximately 30 lines in the existing dispatcher. No new
|
|
||||||
dependencies. Backward compatible. This is Phase 0 of the symbolic engine
|
|
||||||
roadmap.
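
The check itself is a single comparison. A sketch, assuming gates and requests
are plists carrying =:type-level=; the =:predicate= key and both function names
are hypothetical:

#+begin_src lisp
(defun type-level-admits-p (request-level gate-level)
  "A request may reach a gate only when its level is strictly below the gate's."
  (< request-level gate-level))

(defun run-gate (gate request)
  "Run GATE's predicate on REQUEST only after the type-level check passes.
A request at or above the gate's level is rejected without the predicate
ever being called."
  (and (type-level-admits-p (getf request :type-level)
                            (getf gate :type-level))
       (funcall (getf gate :predicate) request)))
#+end_src

Because =and= short-circuits, a rejected request never reaches the gate's own
predicate, so the predicate cannot be talked into an exception.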
|
|
||||||
|
|
||||||
** Contribution 2: Theory of Descriptions → Reference Resolution
|
|
||||||
|
|
||||||
PM's theory of descriptions addressed the problem of referring to nonexistent
|
|
||||||
entities: "the current king of France is bald" is false, not meaningless, when
|
|
||||||
there is no unique referent. Passepartout applies this to reference resolution:
|
|
||||||
when the user says "the function that validates secrets," a cognitive tool checks
|
|
||||||
uniqueness before resolving. Ambiguous references trigger a clarification prompt
|
|
||||||
rather than a blind guess.
|
|
||||||
|
|
||||||
Implementation: approximately 40 lines as a cognitive tool. When the knowledge
|
|
||||||
graph ships, descriptions become native Prolog queries with uniqueness constraints.
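
A sketch of the uniqueness check, assuming the candidates are already in
memory and the description has been turned into a predicate. The function name
and the status keywords are illustrative:

#+begin_src lisp
(defun resolve-description (predicate candidates)
  "Resolve a definite description against CANDIDATES.
Returns two values: the result and a status keyword.
  :unique       exactly one candidate matched; the result is that candidate
  :ambiguous    several matched; the result is the list, for a clarification prompt
  :no-referent  nothing matched; the result is NIL"
  (let ((matches (remove-if-not predicate candidates)))
    (cond ((null matches) (values nil :no-referent))
          ((rest matches) (values matches :ambiguous))
          (t (values (first matches) :unique)))))
#+end_src

The caller acts only on =:unique= and turns the other two statuses into a
question for the user instead of a guess.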
|
|
||||||
|
|
||||||
** Contribution 3: Process and Reality → Architectural Vocabulary
|
|
||||||
|
|
||||||
Whitehead's process ontology maps with surprising precision to Passepartout's
|
|
||||||
pipeline architecture. Prehension = a gate grasping a signal. Positive prehension
|
|
||||||
= a gate passing. Negative prehension = a gate rejecting. Concrescence = the
|
|
||||||
pipeline process from input to output. Satisfaction = the final agent response.
|
|
||||||
This vocabulary is precise, standard, and already mapped to the architecture. It
|
|
||||||
provides the language for the =/why= command, the gate trace, and the ARCHITECTURE
|
|
||||||
documentation. It is descriptive, not operational — the design would be correct
|
|
||||||
without it, but it would lack the vocabulary to describe /why/ it is correct.
|
|
||||||
|
|
||||||
** Contribution 4: VivaceGraph + PM Types → KG Type Hierarchy
|
|
||||||
|
|
||||||
When the knowledge graph ships, every entity inherits PM's type hierarchy.
|
|
||||||
Entities carry =:pm-type-level= metadata. Queries cannot return entities of the
|
|
||||||
same or a higher level than the querying function. Self-referential knowledge becomes
|
|
||||||
structurally impossible — no "this entity defines its own type level." This is
|
|
||||||
Contribution 1 applied to the knowledge layer rather than the execution layer.
|
|
||||||
The dispatcher prevents self-referential /actions/; the KG prevents
|
|
||||||
self-referential /facts/.
|
|
||||||
|
|
||||||
* The Provenance Chain as Product
|
|
||||||
|
|
||||||
In the coding domain, the value of the symbolic engine is the verified fact:
|
|
||||||
"this command is safe." In the broader memex, the value is the provenance itself:
|
|
||||||
"this claim originated in that diary entry on that date, has been referenced 7
|
|
||||||
times across 4 different projects, was contradicted in a retrospective 6 months
|
|
||||||
later, and was revised in a note 3 weeks after that."
|
|
||||||
|
|
||||||
The symbolic engine doesn't tell you what is true. It tells you what you wrote,
|
|
||||||
when, where, and how it connects to everything else you wrote — with a verifiable
|
|
||||||
audit trail. It is a memory prosthesis that makes your own mind legible to you.
|
|
||||||
|
|
||||||
Every fact carries:
|
|
||||||
|
|
||||||
- =:grounding= — the specific Org heading from which it was extracted
|
|
||||||
- =:provenance= — who or what produced it (gate-outcome, human-authored, deduced,
|
|
||||||
LLM-proposed)
|
|
||||||
- =:timestamp= — when it was admitted to the symbolic index
|
|
||||||
- =:referenced-by= — other facts that depend on or reference this one
|
|
||||||
- =:contradicted-by= — other facts that disagree with this one (if any)
|
|
||||||
- =:superseded-by= — if this fact was replaced by a newer version
|
|
||||||
|
|
||||||
These fields make every fact auditable. The =/audit <node-id>= command renders
|
|
||||||
the full provenance chain as an Org headline tree. The provenance is not a
|
|
||||||
logging feature. It is the product.
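
One possible shape for such a fact, sketched as a struct whose slots mirror the
fields listed above. The struct, the =audit-lines= helper, and its output
layout are assumptions for illustration; the actual =/audit= rendering is not
specified here:

#+begin_src lisp
(defstruct fact
  entity relation value                        ; the triple itself
  grounding                                    ; Org heading it was extracted from
  provenance                                   ; e.g. :gate-outcome, :human-authored
  timestamp                                    ; admission time
  referenced-by contradicted-by superseded-by) ; links to other facts

(defun audit-lines (fact)
  "Render FACT's provenance as a list of Org headline strings."
  (list (format nil "* ~S ~S ~S"
                (fact-entity fact) (fact-relation fact) (fact-value fact))
        (format nil "** grounding: ~A" (fact-grounding fact))
        (format nil "** provenance: ~A" (fact-provenance fact))))
#+end_src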
|
|
||||||
|
|
||||||
* The Competitive Argument
|
|
||||||
|
|
||||||
No competitor has this problem because no competitor has a symbolic engine. The
|
|
||||||
55 systems surveyed in =notes/competitive-landscape.org= range from pure chat
|
|
||||||
agents (Claude, ChatGPT) to agent harnesses (Claude Code, OpenCode, Hermes) to
|
|
||||||
platform agents (OpenClaw). None of them encode knowledge as formal facts with
|
|
||||||
provenance. None of them verify extractions against an existing knowledge base.
|
|
||||||
None of them can prove properties about their own rulesets.
|
|
||||||
|
|
||||||
Their safety is heuristic (prompt-based guardrails that consume LLM tokens and
|
|
||||||
can be evaded with clever phrasing). Their memory is flat (JSONL transcripts
|
|
||||||
without content-addressed identity or provenance chains). Their reasoning is
|
|
||||||
entirely neural — when you ask "why did you decide that?", the answer is a
|
|
||||||
regenerated LLM explanation, not a retrieved inference chain.
|
|
||||||
|
|
||||||
Passepartout's architectural bet is that this problem is worth solving — that a
|
|
||||||
system which can surface contradictions with provenance, derive new facts from
|
|
||||||
observations, and verify claims against a provenanced knowledge graph is
|
|
||||||
fundamentally different from a system that can only call an LLM and hope the
|
|
||||||
response is correct.
|
|
||||||
|
|
||||||
The cost is ontological work that is genuinely difficult. The reward is a
|
|
||||||
system that cannot hallucinate at the reasoning level, whose memory is provable
|
|
||||||
rather than empirical, and whose knowledge accumulates across sessions through
|
|
||||||
deduction rather than through LLM re-prompting. For a life's knowledge stored in
|
|
||||||
a personal memex, this is not a performance advantage. It is a category difference.
|
|
||||||
|
|
||||||
* Open Questions
|
|
||||||
|
|
||||||
Several design questions are unresolved and should remain unresolved at this
|
|
||||||
stage. They represent research decisions that require experience running the
|
|
||||||
system.
|
|
||||||
|
|
||||||
** What is the minimum viable fact language?
|
|
||||||
|
|
||||||
Triples, =(:entity :relation :value)= with provenance and grounding, are the
|
|
||||||
current hypothesis. It is simple enough to be parseable, expressive enough to
|
|
||||||
capture the gate stack's implicit claims, and extensible enough that Screamer
|
|
||||||
can operate on it. But it may be too simple. Triples do not naturally express
|
|
||||||
temporal relations ("was X before Y?"), modal claims ("should not do X unless
|
|
||||||
Y"), or counterfactuals — all of which may be essential for a symbolically-aided
|
|
||||||
memex. The right granularity depends on what queries actually need to be made,
|
|
||||||
and that cannot be known in advance.
|
|
||||||
|
|
||||||
** How does ontology refactoring work?
|
|
||||||
|
|
||||||
If the seed produces 50 categories from gate extraction and later experience
|
|
||||||
shows they are wrong — wrong granularity, missing cross-cutting concerns, conflated
|
|
||||||
categories — how are they migrated without invalidating all existing deductions
|
|
||||||
that cross the old category boundaries? The ephemeral-first approach (no
|
|
||||||
persistence, rebuild from scratch) is a temporary answer. Once persistence is
|
|
||||||
committed (VivaceGraph), refactoring the category hierarchy is a schema migration
|
|
||||||
problem that deduction provenance makes harder — every deduced fact's chain may
|
|
||||||
cross the old category boundary. This is not addressed in the current architecture.
|
|
||||||
|
|
||||||
** What is the appropriate role of the human?
|
|
||||||
|
|
||||||
The human can explicitly declare facts, write constraints, and correct wrong
|
|
||||||
extractions. But how much of the ontology should the human need to maintain? If
|
|
||||||
the human must write a definition for every new category the symbolic engine
|
|
||||||
encounters, the overhead is prohibitive. If the symbolic engine can generalize
|
|
||||||
from instances, the human role becomes supervision rather than authorship — review
|
|
||||||
and approve proposed generalizations. The balance cannot be set without experience.
|
|
||||||
|
|
||||||
** How much Wikidata is the right amount?
|
|
||||||
|
|
||||||
Loading Wikidata entities referenced in the memex is the minimum. Loading all
|
|
||||||
Wikidata entities within N hops of those references expands the graph
|
|
||||||
exponentially. The right N depends on the memex's breadth — a memex focused on
|
|
||||||
software engineering needs fewer hops than a memex spanning literature, history,
|
|
||||||
philosophy, and science. The query performance and memory costs of a large
|
|
||||||
Wikidata load are unknown.
|
|
||||||
|
|
||||||
** Can the symbolic engine satisfy queries from the user without LLM involvement?
|
|
||||||
|
|
||||||
The design aims for zero-LLM query answering: the user issues a structured
|
|
||||||
command (=/query=, =/contradictions=, =/audit=), and the symbolic engine responds
|
|
||||||
directly. But natural language questions ("what do I think about monorepos?")
|
|
||||||
still require the LLM as a thin translation layer. Whether the structured command
|
|
||||||
interface is sufficient for daily use, or whether users will demand natural
|
|
||||||
language interaction, determines how much LLM involvement remains in the mature
|
|
||||||
system.
|
|
||||||
|
|
||||||
** Is the triplestore physically bounded or does it explode?
|
|
||||||
|
|
||||||
A personal memex with years of diary entries, project notes, reading logs, and
|
|
||||||
literary analyses could produce millions of triples. A naive hash table scales
|
|
||||||
linearly but VivaceGraph's Prolog-like queries may not. The performance
|
|
||||||
characteristics of graph queries over a million-triple knowledge base have not
|
|
||||||
been estimated.
|
|
||||||
|
|
||||||
* Relation to Passepartout's Existing Architecture
|
|
||||||
|
|
||||||
The neurosymbolic engine is an extension of the existing probabilistic-deterministic
|
|
||||||
split, not a replacement for it. The current architecture divides cognition into
|
|
||||||
LLM-driven proposals and Lisp-driven verification. The symbolic engine deepens the
|
|
||||||
verification side from "is this action safe?" to "is this claim supported?" — the
|
|
||||||
same architectural pattern applied to a broader domain.
|
|
||||||
|
|
||||||
The self-repair criterion (a file belongs in core only if, when corrupted, the
|
|
||||||
agent cannot fix it without human help) applies to every component of the symbolic
|
|
||||||
engine. Screamer, VivaceGraph, the fact store, the archivist — all are skills,
|
|
||||||
loaded at runtime, hot-reloadable, and recoverable from corruption. A corrupted
|
|
||||||
symbolic engine degrades reasoning capability but does not kill the agent. The
|
|
||||||
eight existing core ASDF files are unchanged.
|
|
||||||
|
|
||||||
The symbolic engine is not v3.0.0 alone. It is the layer that sits between the
|
|
||||||
existing gate stack (which it makes explicit as facts) and the existing skill
|
|
||||||
system (which it extends with deduction, contradiction detection, and provenance
|
|
||||||
tracking). It grows within the current architecture without replacing any existing
|
|
||||||
component.
|
|
||||||
|
|
||||||
See also:
|
|
||||||
|
|
||||||
- =passepartout-neurosymbolic-roadmap.org= — the concrete phased implementation plan
|
|
||||||
- =notes/passepartout-symbolic-engine-exploration.org= — the original architecture note
|
|
||||||
- =notes/passepartout-whitehead.org= — the four Whitehead contributions
|
|
||||||
- =passepartout/docs/DESIGN_DECISIONS.org= — the existing design decisions
|
|
||||||
- =passepartout/docs/ARCHITECTURE.org= — the current pipeline architecture
|
|
||||||
- =passepartout/docs/ROADMAP.org= — the feature roadmap through v0.13.0
|
|
||||||
|
|||||||
@@ -1,920 +1,28 @@
|
|||||||
#+TITLE: Passepartout Neurosymbolic Engine — Implementation Roadmap
|
#+TITLE: Passepartout Neurosymbolic Engine — SUPERSEDED
|
||||||
#+AUTHOR: Agent
|
#+AUTHOR: Agent
|
||||||
#+FILETAGS: :notes:roadmap:neurosymbolic:v3.0.0:
|
#+FILETAGS: :notes:roadmap:neurosymbolic:superseded:
|
||||||
#+CREATED: [2026-05-08 Fri]
|
#+CREATED: [2026-05-08 Fri]
|
||||||
|
#+SUPERSEDED: [2026-05-10 Sun]
|
||||||
* Evolutionary Roadmap
|
|
||||||
|
This document has been consolidated into ~passepartout/docs/ROADMAP.org~. Each neurosymbolic phase now has its full implementation spec (rationale, code sketches, test catalog, line budget) inline in the roadmap's version sections:
|
||||||
This roadmap describes a phased implementation of the symbolic engine. It is
|
|
||||||
independent of the feature roadmap in =passepartout/docs/ROADMAP.org= — Phase 0
|
| Phase | Version |
|
||||||
can ship immediately alongside any v0.7.x patch. The symbolic engine grows in
|
|-------+---------|
|
||||||
parallel with feature work, not after it.
|
| 0 | v0.10.0 |
|
||||||
|
| 0b | v0.12.0 |
|
||||||
Every phase is loaded as a skill, not a core ASDF component. A corrupted symbolic
|
| 1 | v0.14.0 |
|
||||||
engine degrades reasoning capability but does not kill the agent. This satisfies
|
| 1a | v0.16.0 |
|
||||||
the self-repair criterion documented in =passepartout/docs/ARCHITECTURE.org= and
|
| 2 | v0.18.0 |
|
||||||
=passepartout/AGENTS.md=.
|
| 3 | v0.20.0 |
|
||||||
|
| 4 | v0.22.0 |
|
||||||
The design rationale for each decision is in
|
| 5 | v0.25.0 |
|
||||||
=notes/passepartout-neurosymbolic-design-decisions-and-options.org=. The original
|
| 6 | v0.27.0 |
|
||||||
architecture exploration is in
|
| 7 | v0.36.0 |
|
||||||
=notes/passepartout-symbolic-engine-exploration.org=. Whitehead's contributions are
|
| 8+ | v0.36.1+ |
|
||||||
enumerated in =notes/passepartout-whitehead.org=.
|
|
||||||
|
The "What Is NOT Built" rationale and "Competitive Advantage Analysis" sections are also now in ROADMAP.org.
|
||||||
* Phase 0: PM-Type-Level Gates (~30 lines — builds on existing Dispatcher)
|
|
||||||
|
Cross-references are preserved in the original files:
|
||||||
** What
|
- ~notes/passepartout-neurosymbolic-design-decisions-and-options.org~
|
||||||
|
- ~notes/passepartout-symbolic-engine-exploration.org~
|
||||||
Add =:type-level= metadata to the existing =defgate= and =def-cognitive-tool=
|
- ~notes/passepartout-whitehead.org~
|
||||||
macros. Before any gate predicate evaluates, the dispatcher checks structural
|
|
||||||
type compatibility: a signal at type-level 5 cannot pass a gate at type-level 4
|
|
||||||
or lower. Self-modification of the safety layer becomes impossible by
|
|
||||||
construction.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The Dispatcher gate stack currently prevents self-modification through pattern
|
|
||||||
matching — gate vector 2b catches writes to =core-*= files as a heuristic. But
|
|
||||||
there is no /structural/ guarantee preventing a request from modifying the rules
|
|
||||||
that validate it. Pattern-based protection can be bypassed through indirection
|
|
||||||
(an =eval= that constructs a write, a skill that redefines a gate function at
|
|
||||||
runtime). A type-level check is not heuristic — it is a category error rejected
|
|
||||||
before any predicate runs, just as PM's theory of types made self-membership
|
|
||||||
syntactically invalid before any logical evaluation.
|
|
||||||
|
|
||||||
** Implementation
|
|
||||||
|
|
||||||
1. Add =:type-level= keyword argument to =defgate= (default 0) and
|
|
||||||
=def-cognitive-tool= (default 0) in =core-skills.org=.
|
|
||||||
2. Add =gate-type-check= to the dispatcher's =run-gates= function in
|
|
||||||
=security-dispatcher.org=, executed before any gate predicate.
|
|
||||||
3. Assign type levels to existing cognitive tools: self-build-core at 5,
|
|
||||||
write-file at 3, read-file at 1, shell at 2, eval at 4.
|
|
||||||
4. Assign type levels to existing gate vectors: self-build boundary at 5,
|
|
||||||
shell safety at 3, path protection at 2, network exfil at 2, secret content at 1.
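The type-level check in the steps above can be sketched as follows. This is a minimal illustration, not the dispatcher's actual code: the two-integer signature and the =:proceed= return value are assumptions; only =gate-type-check= and =:reject-type-violation= are names used in this roadmap.

#+begin_src lisp
;; Hypothetical sketch: type levels are plain integers, as in the
;; assignments listed above.
(defun gate-type-check (signal-level gate-level)
  "Reject structurally when SIGNAL-LEVEL exceeds GATE-LEVEL.
Intended to run before any gate predicate, so a rejection never
evaluates the predicate."
  (if (> signal-level gate-level)
      :reject-type-violation
      :proceed))
#+end_src

Under this reading, =(gate-type-check 5 4)= rejects, while =(gate-type-check 1 3)= proceeds to predicate evaluation.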
|
|
||||||
|
|
||||||
** Verification
|
|
||||||
|
|
||||||
Existing FiveAM gate tests continue to pass. New test: signal at type-level 5
|
|
||||||
targeting a gate at type-level 4 returns =:reject-type-violation= without
|
|
||||||
evaluating the gate predicate. New test: signal at type-level 1 passing through
|
|
||||||
a gate at type-level 3 proceeds to predicate evaluation.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Contribution 1 from =notes/passepartout-whitehead.org=. It is also the
|
|
||||||
gate-to-fact bootstrap mechanism — every type-level rejection emits a structured
|
|
||||||
event that Phase 1 ingests as a fact. The ~30 lines implement the seed of the
|
|
||||||
ontology without any new dependencies.
|
|
||||||
|
|
||||||
* Phase 1: Minimum Viable Fact Language (~150 lines — new skill)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
An ephemeral, in-memory triple store with provenance tracking and contradiction
|
|
||||||
detection. No disk persistence. All facts live in a hash table and are discarded
|
|
||||||
on session end. Gate outcomes are ingested as facts. The gate stack's implicit
|
|
||||||
ontology is materialized as the seed fact set.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The architecture note's Option 5 (ephemeral facts, no persistence) is the correct
|
|
||||||
first step. Three reasons:
|
|
||||||
|
|
||||||
1. *The fact language is unproven.* Triples with provenance and grounding are a
|
|
||||||
hypothesis that must be tested against real memex content before being committed
|
|
||||||
to a serialization format.
|
|
||||||
2. *The ontology is emergent.* Categories are created on first use. A persistent
|
|
||||||
format would require a migration story for every category change. Ephemeral
|
|
||||||
avoids this — facts are re-derived on each session start using the evolved
|
|
||||||
ontology.
|
|
||||||
3. *Rebuildability is the safety net.* Because all facts have a =:grounding= to
|
|
||||||
an Org heading, and gate-outcome facts are regenerated from the gate stack on
|
|
||||||
load, the entire symbolic index can be thrown away and rebuilt from scratch.
|
|
||||||
The cost is compute, not data.
|
|
||||||
|
|
||||||
** Implementation — =org/symbolic-facts.org= → =lisp/symbolic-facts.lisp= (skill)
|
|
||||||
|
|
||||||
*** Triple store
|
|
||||||
|
|
||||||
A hash table keyed by =(entity relation)=. Values are plists:
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
(:value <string-or-symbol>
|
|
||||||
:grounding <heading-id-or-nil>
|
|
||||||
:provenance <:gate-outcome | :human-authored | :deduced | :llm-proposed>
|
|
||||||
:timestamp <universal-time>
|
|
||||||
:contradiction <:awaiting-resolution-or-nil>
|
|
||||||
:superseded-by <entity-string-or-nil>)
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
The =:provenance= field tracks how the fact entered the store. The
|
|
||||||
=:contradiction= field is nil on standard facts. The =:superseded-by= field is
|
|
||||||
set when a =:temporal= domain fact is replaced by a newer version.
|
|
||||||
|
|
||||||
*** Bootstrap from gates
|
|
||||||
|
|
||||||
On skill load, scan the Dispatcher's existing data structures and produce triples:
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
;; From *dispatcher-protected-paths*
|
|
||||||
(:entity ".env" :relation :member-of-class :value :secret-config-file :provenance :gate-outcome)
|
|
||||||
(:entity "*id_rsa*" :relation :member-of-class :value :ssh-key-file :provenance :gate-outcome)
|
|
||||||
|
|
||||||
;; From *dispatcher-shell-blocked*
|
|
||||||
(:entity "rm -rf /" :relation :classified-as :value :catastrophic-command :provenance :gate-outcome)
|
|
||||||
(:entity "dd if=" :relation :classified-as :value :catastrophic-command :provenance :gate-outcome)
|
|
||||||
|
|
||||||
;; From *dispatcher-network-whitelist*
|
|
||||||
(:entity "api.telegram.org" :relation :classified-as :value :trusted-domain :provenance :gate-outcome)
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
This produces 50-70 entity classes immediately. No LLM involvement. No human
|
|
||||||
authoring. Mechanically extracted from existing code.
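A sketch of the mechanical extraction, assuming the protected paths are available as an alist of =(pattern . class)= pairs. The real layout of =*dispatcher-protected-paths*= is not shown here, so =*example-protected-paths*= and =bootstrap-path-facts= are hypothetical names.

#+begin_src lisp
;; Hypothetical stand-in for *dispatcher-protected-paths*; the real
;; variable's layout is not shown in this document.
(defparameter *example-protected-paths*
  '((".env" . :secret-config-file)
    ("*id_rsa*" . :ssh-key-file)))

(defun bootstrap-path-facts (paths)
  "Turn each (pattern . class) pair into a gate-outcome triple plist."
  (mapcar (lambda (pair)
            (list :entity (car pair)
                  :relation :member-of-class
                  :value (cdr pair)
                  :provenance :gate-outcome))
          paths))
#+end_src

Calling =(bootstrap-path-facts *example-protected-paths*)= yields one plist per path in the same shape as the examples above.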
|
|
||||||
|
|
||||||
*** Ingest gate outcomes
|
|
||||||
|
|
||||||
Register a post-gate hook on the Dispatcher's rejection path. Every gate rejection
|
|
||||||
produces a triple with =:provenance :gate-outcome=:
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
(:entity "/tmp/secrets.env" :relation :blocked-by :value :dispatcher-path-protection
|
|
||||||
:provenance :gate-outcome :grounding "signal-47")
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
*** Query
|
|
||||||
|
|
||||||
=(fact-query &key entity relation value source-provenance)= — pure hash-table
|
|
||||||
lookup. Returns the matching triple or nil. ~30 lines.
|
|
||||||
|
|
||||||
=(fact-query-all &key relation value source-provenance)= — returns all triples
|
|
||||||
matching the filter criteria. Enables "find all files classified as secrets."
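The filter semantics might look like the sketch below. It assumes facts are plists carrying =:relation=, =:value=, and =:provenance= keys, and it takes the fact list as an explicit argument instead of reading a global store. Both choices are simplifications for illustration, and =fact-matches-p= is a hypothetical helper.

#+begin_src lisp
(defun fact-matches-p (fact &key relation value source-provenance)
  "True when FACT satisfies every filter that was supplied (NIL = no filter)."
  (and (or (null relation) (eq relation (getf fact :relation)))
       (or (null value) (equal value (getf fact :value)))
       (or (null source-provenance)
           (eq source-provenance (getf fact :provenance)))))

(defun fact-query-all (facts &rest filters
                       &key relation value source-provenance)
  "Return every fact in FACTS matching the supplied filters."
  (declare (ignore relation value source-provenance))
  (remove-if-not (lambda (fact) (apply #'fact-matches-p fact filters))
                 facts))
#+end_src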
|
|
||||||
|
|
||||||
*** Contradiction detection
|
|
||||||
|
|
||||||
On every =fact-assert=, check if the new triple contradicts an existing one
|
|
||||||
(same entity, same relation, different value, same provenance domain). If the
|
|
||||||
entity's class has =:contradiction-policy :exclusive=, the new fact is rejected
|
|
||||||
with a signal. If the policy is =:coexistent=, both facts are stored with a
|
|
||||||
=:contradiction= flag cross-referencing each other. If the policy is =:temporal=,
|
|
||||||
the old fact is marked =:superseded-by= the new one but retained.
|
|
||||||
|
|
||||||
The policy table is a hash table mapping entity classes to one of =:exclusive=,
|
|
||||||
=:coexistent=, or =:temporal=. Gate-bootstrapped facts default to =:exclusive=
|
|
||||||
(the filesystem is singular). New categories default to =:coexistent= (safe,
|
|
||||||
never loses information).
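The three policies can be sketched as one =fact-assert= over a hash table. This is an illustration under stated assumptions, not the skill's implementation: the store holds a list of fact plists per =(entity relation)= key, rejection is reported as a return value rather than a signal, the "same provenance domain" condition is omitted, and =:superseded-by= records the newer value. The names =*facts*=, =*policies*=, and =with-field= are hypothetical.

#+begin_src lisp
;; Keys are (entity relation) lists, so the table must compare with EQUAL.
(defvar *facts* (make-hash-table :test #'equal))
;; Entity class -> policy. Unknown classes fall back to :coexistent.
(defvar *policies* (make-hash-table :test #'eq))

(defun with-field (fact field value)
  "Return a copy of the plist FACT with FIELD set to VALUE."
  (let ((copy (copy-list fact)))
    (setf (getf copy field) value)
    copy))

(defun fact-assert (entity relation value
                    &key class (provenance :llm-proposed))
  "Store a fact. Returns :asserted, :duplicate, :rejected, :flagged,
or :superseded."
  (let* ((key (list entity relation))
         (existing (gethash key *facts*))
         (new (list :value value :provenance provenance
                    :timestamp (get-universal-time)
                    :contradiction nil :superseded-by nil)))
    (cond
      ;; Same value already stored: idempotent, nothing to do.
      ((find value existing :test #'equal
                            :key (lambda (f) (getf f :value)))
       :duplicate)
      ((null existing)
       (setf (gethash key *facts*) (list new))
       :asserted)
      (t
       (ecase (gethash class *policies* :coexistent)
         (:exclusive :rejected)
         (:coexistent
          ;; Keep both sides and flag each of them.
          (setf (gethash key *facts*)
                (mapcar (lambda (f)
                          (with-field f :contradiction :awaiting-resolution))
                        (cons new existing)))
          :flagged)
         (:temporal
          ;; Retain old facts, marking the live ones as superseded.
          (setf (gethash key *facts*)
                (cons new
                      (mapcar (lambda (f)
                                (if (getf f :superseded-by)
                                    f
                                    (with-field f :superseded-by value)))
                              existing)))
          :superseded))))))
#+end_src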
|
|
||||||
|
|
||||||
** Verification — ~8 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-bootstrap-creates-facts= — bootstrap produces correct triples from
|
|
||||||
=*dispatcher-protected-paths*=.
|
|
||||||
2. =test-bootstrap-creates-shell-facts= — bootstrap produces correct triples from
|
|
||||||
=*dispatcher-shell-blocked*=.
|
|
||||||
3. =test-gate-outcome-produces-fact= — a simulated gate rejection produces a
|
|
||||||
triple with =:provenance :gate-outcome=.
|
|
||||||
4. =test-fact-query-returns-correct-value= — querying by entity and relation
|
|
||||||
returns the expected value plist.
|
|
||||||
5. =test-duplicate-ingestion-idempotent= — asserting the same fact twice does
|
|
||||||
not produce a duplicate or a contradiction.
|
|
||||||
6. =test-exclusive-contradiction-rejected= — asserting a contradictory fact in
|
|
||||||
an =:exclusive= domain returns a rejection.
|
|
||||||
7. =test-coexistent-contradiction-flagged= — asserting a contradictory fact in a
|
|
||||||
=:coexistent= domain stores both with cross-referencing flags.
|
|
||||||
8. =test-temporal-supersedes= — asserting a newer fact in a =:temporal= domain
|
|
||||||
marks the old fact as superseded but retains it.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 1 of =notes/passepartout-v3.0.0-roadmap.org=. It implements Options 4 and 5
|
|
||||||
from the architecture note. The contradiction policies are from
|
|
||||||
=passepartout-neurosymbolic-design-decisions-and-options.org=.
|
|
||||||
|
|
||||||
* Phase 2: Screamer as Admission Gate (~200 lines — new skill)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Wrap Screamer (a constraint solver with non-deterministic backtracking) as a
|
|
||||||
skill. Use it for consistency checking against the triple store and for deduction
|
|
||||||
of new facts from existing ones. Screamer is the *verification* layer; VivaceGraph
|
|
||||||
(introduced in Phase 5) is the *storage* layer.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The architecture note's "verified extraction" pattern requires a deterministic
|
|
||||||
admission gate. Screamer's non-deterministic backtracking finds contradictions
|
|
||||||
that simple string comparison misses. For example, if existing facts say "all
|
|
||||||
config files with extension =.env= are classified as secrets," and the LLM
|
|
||||||
proposes "=app.env= is not secret," Screamer finds the contradiction by
|
|
||||||
substituting =app.env= into the existing rule. A naive string-keyed hash table
|
|
||||||
comparison would miss this because ="app.env"= and =".env"= are different strings.
|
|
||||||
|
|
||||||
Screamer also enables deduction — new facts from existing ones without any LLM
|
|
||||||
involvement. If all files matching =*.env= are secrets, and =prod.env= matches
|
|
||||||
=*.env=, then =prod.env= is a secret. Deduced facts carry =:provenance :deduced=
|
|
||||||
and a =:derived-from= chain pointing to the facts they were derived from.
|
|
||||||
|
|
||||||
** Implementation — =org/symbolic-screamer.org= → =lisp/symbolic-screamer.lisp= (skill)
|
|
||||||
|
|
||||||
*** Wrap Screamer
|
|
||||||
|
|
||||||
Screamer is available via Quicklisp. Load at runtime via =ql:quickload :screamer=.
|
|
||||||
Not an ASDF dependency — if Screamer is not installed, the skill degrades
|
|
||||||
gracefully (no consistency checking, no deduction — the fact store still
|
|
||||||
functions as a hash table with provenance tracking).
|
|
||||||
|
|
||||||
*** Consistency check
|
|
||||||
|
|
||||||
=(screamer-consistent-p candidate-fact existing-facts)= — expresses the fact
|
|
||||||
store as Screamer constraint variables. The candidate fact is asserted. Screamer
|
|
||||||
checks solvability. Returns =:consistent=, =:contradiction <details>=, or
|
|
||||||
=:redundant= (the fact is already implied by existing facts).
|
|
||||||
|
|
||||||
Early-stage: the consistency check works on simple triples. As the fact store
|
|
||||||
grows, rules of the form "all X are Y" (representing protected paths, shell
|
|
||||||
patterns, class memberships) become Screamer constraints that new facts must
|
|
||||||
satisfy.
|
|
||||||
|
|
||||||
*** Deduction
|
|
||||||
|
|
||||||
=(screamer-deduce existing-facts)= — Screamer finds implications of the existing
|
|
||||||
fact set that are not already in the store. New facts are asserted with
|
|
||||||
=:provenance :deduced= and a =:derived-from= list of source fact keys.
|
|
||||||
|
|
||||||
Deduction is not run on every assertion — it is a background task triggered by
|
|
||||||
heartbeat or manually. The cost is compute (Screamer exploration), not tokens.
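To make the deduction step concrete, the sketch below derives class-membership facts from "all files matching a pattern belong to a class" rules. It deliberately uses a plain Common Lisp loop in place of Screamer, and it only understands patterns of the form =*SUFFIX=. The function names are hypothetical, and =:derived-from= here holds the matching pattern rather than a fact key.

#+begin_src lisp
(defun matches-suffix-pattern-p (name pattern)
  "True when PATTERN looks like \"*SUFFIX\" and NAME ends with SUFFIX."
  (and (plusp (length pattern))
       (char= (char pattern 0) #\*)
       (let ((suffix (subseq pattern 1)))
         (and (>= (length name) (length suffix))
              (string= suffix name
                       :start2 (- (length name) (length suffix)))))))

(defun deduce-class-facts (rules entities)
  "RULES is an alist of (pattern . class). Return one deduced fact plist
for every entity name that matches a rule's pattern."
  (loop for entity in entities
        nconc (loop for (pattern . class) in rules
                    when (matches-suffix-pattern-p entity pattern)
                      collect (list :entity entity
                                    :relation :member-of-class
                                    :value class
                                    :provenance :deduced
                                    :derived-from (list pattern)))))
#+end_src

Given the rule =("*.env" . :secret)= and the entities ="prod.env"= and ="readme.org"=, only ="prod.env"= produces a deduced fact. A constraint solver earns its place once rules interact; this loop covers only the single-rule case.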
|
|
||||||
|
|
||||||
*** Admission gate
|
|
||||||
|
|
||||||
=(screamer-admit candidate-fact existing-facts)= — wraps consistency check with
|
|
||||||
the contradiction policy lookup. If the candidate fact's entity class has policy
|
|
||||||
=:exclusive=, contradictions reject. If =:coexistent=, flag. If =:temporal=,
|
|
||||||
supersede.
|
|
||||||
|
|
||||||
This is the function the archivist calls before any LLM-proposed fact enters the
|
|
||||||
store. It is also called on human-authored facts (which override the policy —
|
|
||||||
the human can assert contradictory facts in any domain). It is not called on
|
|
||||||
gate-outcome facts (gates are the ground truth for security domains).
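The decision table implied by the two paragraphs above can be written as a small pure function. This is a sketch, not =screamer-admit= itself: =admission-decision= is a hypothetical name, the consistency result is passed in as a keyword, and two outcomes are assumptions the text does not specify (redundant facts are skipped, and contradictory human-authored facts are admitted as-is).

#+begin_src lisp
(defun admission-decision (check-result policy provenance)
  "Map a consistency CHECK-RESULT, contradiction POLICY, and fact
PROVENANCE to one of :admit, :skip, :reject, :flag, or :supersede."
  (cond
    ;; Gate outcomes are ground truth and bypass the check entirely.
    ((eq provenance :gate-outcome) :admit)
    ((eq check-result :consistent) :admit)
    ;; Assumption: an already-implied fact is not stored again.
    ((eq check-result :redundant) :skip)
    ;; Assumption: a human assertion overrides the policy outright.
    ((eq provenance :human-authored) :admit)
    (t (ecase policy
         (:exclusive :reject)
         (:coexistent :flag)
         (:temporal :supersede)))))
#+end_src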
|
|
||||||
|
|
||||||
** Verification — ~6 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-screamer-consistency-passes= — a fact consistent with existing triples
|
|
||||||
returns =:consistent=.
|
|
||||||
2. =test-screamer-contradiction-detected= — "app.env is not secret" contradicts
|
|
||||||
"all *.env files are secrets" and returns =:contradiction=.
|
|
||||||
3. =test-screamer-redundant-detected= — asserting a fact already implied by
|
|
||||||
existing facts returns =:redundant=.
|
|
||||||
4. =test-screamer-deduction-produces-new-fact= — given "all *.env files are
|
|
||||||
secrets" and "prod.env matches *.env", Screamer deduces "prod.env is secret."
|
|
||||||
5. =test-admission-gate-rejects-contradiction= — the archivist's proposal that
|
|
||||||
contradicts an =:exclusive= domain fact is rejected.
|
|
||||||
6. =test-admission-gate-flags-coexistent-contradiction= — the archivist's proposal
|
|
||||||
that contradicts a =:coexistent= domain fact is stored with a cross-reference.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 2 of =notes/passepartout-v3.0.0-roadmap.org=. It implements the "LLM as proposer"
|
|
||||||
pattern from the architecture note. Screamer's role is defined in
|
|
||||||
=passepartout-neurosymbolic-design-decisions-and-options.org=.
|
|
||||||
|
|
||||||
* Phase 3: Archivist as Fact Proposer (~100 lines — extends existing archivist)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Extend the existing archivist skill (=org/symbolic-archivist.org=) with a fact
|
|
||||||
extraction mode. The LLM reads prose, proposes triples, and Screamer verifies
|
|
||||||
them before admission. The archivist's existing Scribe (log distillation) and
|
|
||||||
Gardener (link scanning) functions are unchanged.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The archivist already walks the entire memex (the Gardener scans for broken links
|
|
||||||
and orphans). Adding fact extraction reuses the same traversal infrastructure
|
|
||||||
rather than duplicating it. The extraction is gated by Screamer — the LLM is a
|
|
||||||
proposer, not an extractor. Facts that fail consistency checking are discarded.
|
|
||||||
Facts that pass are admitted with =:provenance :llm-proposed= and =:grounding=
|
|
||||||
to the source heading.
|
|
||||||
|
|
||||||
** Implementation — extends =org/symbolic-archivist.org=
|
|
||||||
|
|
||||||
*** Propose from prose
|
|
||||||
|
|
||||||
Given an Org heading, call the LLM with a minimal prompt (~200 tokens):
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
Extract triples from this text as (:entity <name> :relation <keyword> :value <value>).
|
|
||||||
Ground each triple to the heading. Return a list of triples.
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
The LLM returns structured triples via the existing JSON→plist structured output
|
|
||||||
path from v0.4.2. The prompt is environment-aware: if the heading's file is in
|
|
||||||
=literature/= or has =:literature:= tags, the prompt includes literature-specific
|
|
||||||
relations (=:wrote=, =:published-in=, =:influenced=). If the heading is in
|
|
||||||
=projects/=, the prompt includes coding-specific relations (=:depends-on=,
|
|
||||||
=:tested-by=).
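The environment-aware selection could be as simple as the sketch below. =relations-for-heading= is a hypothetical helper; it assumes the heading's file path is a string and its tags are a list of keywords.

#+begin_src lisp
(defun relations-for-heading (path tags)
  "Pick the extra relation keywords to mention in the extraction prompt."
  (cond ((or (search "literature/" path) (member :literature tags))
         '(:wrote :published-in :influenced))
        ((search "projects/" path)
         '(:depends-on :tested-by))
        (t '())))
#+end_src

A heading in =literature/= or tagged =:literature:= gets the literature relations, a heading in =projects/= gets the coding relations, and anything else gets the bare prompt.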
|
|
||||||
|
|
||||||
*** Verify through Screamer
|
|
||||||
|
|
||||||
Each proposed triple runs through =(screamer-admit candidate existing-facts)=
|
|
||||||
from Phase 2. Consistent and coexistent-flagged triples are admitted. Contradictory
|
|
||||||
triples in =:exclusive= domains are discarded with a log entry.
|
|
||||||
|
|
||||||
*** Provenance tracking
|
|
||||||
|
|
||||||
After each extraction run, update provenance counts:
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
(:total-facts 847
|
|
||||||
:gate-outcome 312
|
|
||||||
:human-authored 12
|
|
||||||
:deduced 89
|
|
||||||
:llm-proposed 434)
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
This is the data structure that Phase 4's sufficiency criterion reads. It is
|
|
||||||
also surfaced in the TUI sidebar or =/status= command: "Symbolic index: 847
|
|
||||||
facts (37% from gates, 51% LLM-proposed, 11% deduced, 1% human)."
|
|
||||||
|
|
||||||
*** Rebuildable
|
|
||||||
|
|
||||||
Because every fact has a =:grounding= to an Org heading, the entire LLM-extracted
|
|
||||||
subset can be discarded and re-extracted without losing gate-outcome or deduced
|
|
||||||
facts. The =(fact-purge :provenance :llm-proposed)= function removes all
|
|
||||||
LLM-proposed facts. A subsequent =(archivist-extract-all)= re-extracts from
|
|
||||||
scratch.
|
|
||||||
|
|
||||||
This is the safety net: if the LLM produces a bad extraction that passes
|
|
||||||
Screamer's consistency check (possible in the early stages when the fact store
|
|
||||||
has few existing facts to check against), the extraction can be redone after the
|
|
||||||
fact store has grown. The cost is compute, not data.
|
|
||||||
|
|
||||||
** Verification — ~5 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-archivist-extracts-triples= — given a known Org heading with explicit
|
|
||||||
triples in the prose, the archivist produces the correct triples via LLM.
|
|
||||||
2. =test-archivist-verified-extraction= — a hallucinated triple is rejected by
|
|
||||||
the Screamer admission gate.
|
|
||||||
3. =test-provenance-counts-update= — after extraction, the provenance breakdown
|
|
||||||
is correct.
|
|
||||||
4. =test-purge-llm-facts= — does not delete gate-outcome or deduced facts.
|
|
||||||
5. =test-re-extraction-idempotent= — re-extracting from the same prose after
|
|
||||||
purging produces the same facts (Screamer verification is deterministic
|
|
||||||
given the same starting set).
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 3 of =notes/passepartout-v3.0.0-roadmap.org=. The archivist's role as proposer
|
|
||||||
is described in =passepartout-neurosymbolic-design-decisions-and-options.org=
|
|
||||||
under "The LLM as Proposer."
|
|
||||||
|
|
||||||
* Phase 4: The "Flip" — Sufficiency Criterion (~50 lines — extends Phase 3)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Make the architecture note's central narrative arc operational: a measurable
|
|
||||||
threshold for when the symbolic engine has enough non-lossy facts to bypass the
|
|
||||||
LLM for extraction.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The architecture note describes "at some point, the non-lossy facts constitute a
|
|
||||||
sufficient foundation that the symbolic engine can reverse the flow" but provides
|
|
||||||
no criterion for "some point." The sufficiency score makes the flip computable
|
|
||||||
and visible to the user.
|
|
||||||
|
|
||||||
** Implementation — extends =org/symbolic-facts.org=
|
|
||||||
|
|
||||||
*** Sufficiency score
|
|
||||||
|
|
||||||
=(fact-sufficiency-ratio)= — returns the ratio of non-lossy facts to total facts:
|
|
||||||
|
|
||||||
#+begin_src lisp
|
|
||||||
(/ (+ (count-provenance :gate-outcome)
|
|
||||||
(count-provenance :human-authored)
|
|
||||||
(count-provenance :deduced))
|
|
||||||
(fact-total-count))
|
|
||||||
#+end_src
|
|
||||||
|
|
||||||
When this ratio exceeds =SUFFICIENCY_THRESHOLD= (configurable env var, default
|
|
||||||
0.7), the system considers its foundation sufficient. The threshold defaults to
|
|
||||||
0.7 because below this, more than 30% of facts are LLM-proposed and therefore
|
|
||||||
uncertain. Above 0.7, the proven foundation provides enough constraint that
|
|
||||||
Screamer can reliably detect incorrect LLM proposals.
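A self-contained variant of the ratio, for illustration only. It assumes the provenance counts arrive as a plist like the one Phase 3 maintains, guards against an empty store, and uses the exact rational =7/10= in place of the configurable threshold. =sufficiency-ratio= and =sufficient-p= are hypothetical names.

#+begin_src lisp
(defun sufficiency-ratio (counts)
  "COUNTS is a plist of provenance counts. Returns non-lossy / total
as an exact rational, or 0 for an empty store."
  (let* ((non-lossy (+ (getf counts :gate-outcome 0)
                       (getf counts :human-authored 0)
                       (getf counts :deduced 0)))
         (total (+ non-lossy (getf counts :llm-proposed 0))))
    (if (zerop total)
        0
        (/ non-lossy total))))

(defun sufficient-p (counts &optional (threshold 7/10))
  "True when the non-lossy share strictly exceeds THRESHOLD."
  (> (sufficiency-ratio counts) threshold))
#+end_src

With the Phase 3 example counts (312 gate, 12 human, 89 deduced, 434 LLM-proposed) the ratio is 413/847, below the threshold. With 312, 47, 521, and 367 it is 880/1247, just above it.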
|
|
||||||
|
|
||||||
*** Auto-extraction toggle
|
|
||||||
|
|
||||||
When sufficiency is reached, the archivist switches from "LLM proposes, Screamer
|
|
||||||
verifies" to "Screamer queries existing facts, applies category rules to the new
|
|
||||||
prose, and deduces new facts directly." The LLM is bypassed for categories that
|
|
||||||
have sufficient non-lossy coverage. The LLM is still used for novel categories
|
|
||||||
that have no existing facts.
|
|
||||||
|
|
||||||
The switch is configurable: =AUTO_EXTRACTION_ENABLED=true/false=. When disabled,
|
|
||||||
the system continues with LLM proposals regardless of sufficiency — useful for
|
|
||||||
domains where extraction quality is prioritized over extraction determinism.
|
|
||||||
|
|
||||||
*** Monitor
|
|
||||||
|
|
||||||
The TUI sidebar (v0.8.0) or =/status= command displays:
|
|
||||||
|
|
||||||
#+begin_example
|
|
||||||
Symbolic Index
|
|
||||||
Total facts: 1,247
|
|
||||||
Proven:
|
|
||||||
Gate outcomes: 312 (25%)
|
|
||||||
Human-authored: 47 (4%)
|
|
||||||
Deduced: 521 (42%)
|
|
||||||
─────────────────────────
|
|
||||||
Non-lossy: 880 (71%)
|
|
||||||
LLM-proposed: 367 (29%)
|
|
||||||
─────────────────────────
|
|
||||||
Sufficiency: 71% ✓ (threshold: 70%)
|
|
||||||
Mode: AUTO-EXTRACTION (LLM bypassed for known categories)
|
|
||||||
#+end_example
|
|
||||||
|
|
||||||
** Verification — ~3 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-sufficiency-below-threshold= — with 30% non-lossy facts, auto-extraction
|
|
||||||
is not enabled.
|
|
||||||
2. =test-sufficiency-above-threshold= — with 75% non-lossy facts, auto-extraction
|
|
||||||
is enabled.
|
|
||||||
3. =test-auto-extraction-produces-same-facts-as-llm-extraction= — for a category
|
|
||||||
with sufficient non-lossy coverage, auto-extraction produces facts that a
|
|
||||||
subsequent LLM extraction also produces (the deterministic path is consistent
|
|
||||||
with the probabilistic path).
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 4 of =notes/passepartout-v3.0.0-roadmap.org=. The flip concept originates in
|
|
||||||
=notes/passepartout-symbolic-engine-exploration.org= (lines 68-76) and is refined in
|
|
||||||
=passepartout-neurosymbolic-design-decisions-and-options.org= under "The Flip."
|
|
||||||
|
|
||||||
* Phase 5: VivaceGraph as Persistent Store (~300 lines — new skill)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Replace the ephemeral hash-table triple store with VivaceGraph, a Lisp-native
|
|
||||||
graph database with Prolog-like queries. Add the KG type hierarchy (PM type
|
|
||||||
levels applied to the knowledge layer). Define the persistence format from the
|
|
||||||
fact language that survived Phases 1-4.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
By this point, the triple fact language has been battle-tested through four
|
|
||||||
phases of gate outcomes, Screamer deductions, LLM proposals, and cross-domain
|
|
||||||
comparisons. The facts that proved useful define the persistent schema. The ones
|
|
||||||
that weren't are left behind. The serialization format is not designed upfront;
|
|
||||||
it emerges from use.
|
|
||||||
|
|
||||||
The transition from ephemeral to persistent is justified when two conditions are
|
|
||||||
met: (1) the fact language has stabilized (categories are being queried, not
|
|
||||||
constantly refactored), and (2) accumulated deductions across sessions provide
|
|
||||||
value that justifies the serialization cost.
|
|
||||||
|
|
||||||
** Implementation — =org/symbolic-vivacegraph.org= → =lisp/symbolic-vivacegraph.lisp= (skill)
|
|
||||||
|
|
||||||
*** Wrap VivaceGraph
|
|
||||||
|
|
||||||
VivaceGraph is available via Quicklisp. Load at runtime. Not an ASDF dependency.
|
|
||||||
If not installed, the fact store continues as a hash table (Phase 1-4 behavior)
|
|
||||||
with a log warning: "VivaceGraph not available — persistence disabled."
|
|
||||||
|
|
||||||
*** Prolog-like queries
|
|
||||||
|
|
||||||
Replace =fact-query= with graph traversals:
|
|
||||||
|
|
||||||
#+begin_src lisp
|
|
||||||
;; Find all files classified as secrets
|
|
||||||
(vivace-query '(:and (:entity ?e)
|
|
||||||
(:member-of-class ?e :secret-file)))
|
|
||||||
|
|
||||||
;; Find all files classified as secrets that were modified today
|
|
||||||
(vivace-query `(:and (:entity ?e)
|
|
||||||
(:member-of-class ?e :secret-file)
|
|
||||||
(:modified-since ?e ,(today-timestamp))))
|
|
||||||
|
|
||||||
;; Find contradictions between Wikidata and the memex
|
|
||||||
(vivace-query '(:and (:entity ?e)
|
|
||||||
(:has-value ?e ?v1 :source :wikidata)
|
|
||||||
(:has-value ?e ?v2 :source :memex)
|
|
||||||
(:not-equal ?v1 ?v2)))
|
|
||||||
#+end_src
|
|
||||||
|
|
||||||
*** KG type hierarchy (Contribution 4 from Whitehead)
|
|
||||||
|
|
||||||
Every entity in the graph carries =:pm-type-level= metadata. Queries cannot
|
|
||||||
return entities whose type level exceeds the querying function's type
|
|
||||||
level. A fact-finding query at type-level 2 cannot return facts at type-level
|
|
||||||
3 or higher. Self-referential knowledge — "this fact defines its own type" —
|
|
||||||
becomes structurally impossible because the type level is assigned at creation
|
|
||||||
and cannot be modified by a fact of the same or higher level.
|
|
||||||
|
|
||||||
This is Contribution 1 (type-level gates) applied to the knowledge layer rather
|
|
||||||
than the execution layer. The dispatcher prevents self-referential /actions/; the
|
|
||||||
KG prevents self-referential /facts/.
|
|
||||||
|
|
||||||
*** Persistence format
|
|
||||||
|
|
||||||
The fact language that survived Phases 1-4 defines the format. Each entity is a
|
|
||||||
node; each triple is an edge with properties (=:grounding=, =:provenance=,
|
|
||||||
=:timestamp=). The format is not a new design — it is the triple schema evolved
|
|
||||||
through use, serialized by VivaceGraph's native persistence.
|
|
||||||
|
|
||||||
If the fact language later evolves to n-ary relations, VivaceGraph's graph model
|
|
||||||
accommodates this natively — edges can carry arbitrary property plists. The
|
|
||||||
triple form is a special case of the general graph model.
|
|
||||||
|
|
||||||
*** Load on startup, save on interval
|
|
||||||
|
|
||||||
On daemon start, =(vivacegraph-load)= reads the last saved graph. On heartbeat,
|
|
||||||
=(vivacegraph-save)= persists the graph in its native format to
|
|
||||||
=~/.cache/passepartout/facts.vg=. The interval matches the existing
|
|
||||||
=*memory-auto-save-interval*=. The save is atomic: write to a temp file, rename
|
|
||||||
on success. Corruption-safe.
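The write-then-rename pattern can be sketched in portable Common Lisp. Note the substitutions: this serializes a plain list of fact plists as an s-expression instead of using VivaceGraph's native persistence, and =save-facts-atomically= and =load-facts= are hypothetical names. Whether =rename-file= replaces an existing target is implementation-dependent, so treat the atomicity as an assumption to verify on the target Lisp.

#+begin_src lisp
(defun save-facts-atomically (facts path)
  "Write FACTS to a sibling temp file, then rename it over PATH."
  (let ((tmp (make-pathname :type "tmp" :defaults (pathname path))))
    (with-open-file (out tmp :direction :output
                             :if-exists :supersede
                             :if-does-not-exist :create)
      (with-standard-io-syntax
        (prin1 facts out)))
    ;; A failed write leaves only the temp file behind; PATH keeps
    ;; its previous contents.
    (rename-file tmp path)
    path))

(defun load-facts (path)
  "Read back the list written by SAVE-FACTS-ATOMICALLY."
  (with-open-file (in path :direction :input)
    (with-standard-io-syntax
      (let ((*read-eval* nil))          ; never evaluate #. forms from disk
        (read in)))))
#+end_src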
|
|
||||||
|
|
||||||
** Verification — ~5 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-vivacegraph-roundtrip= — save and load preserves all facts with
|
|
||||||
provenance metadata.
|
|
||||||
2. =test-prolog-query-returns-results= — a query for all secret files returns
|
|
||||||
the bootstrapped gate facts.
|
|
||||||
3. =test-prolog-query-cross-domain= — a query for contradictions between Wikidata
|
|
||||||
and memex provenance returns correct results.
|
|
||||||
4. =test-type-level-prevents-self-reference= — a query from a type-level-2
|
|
||||||
function cannot return type-level-3 facts.
|
|
||||||
5. =test-fact-store-fallback-without-vivacegraph= — when VivaceGraph is not
|
|
||||||
loaded, the hash-table fallback functions identically to Phase 1-4 behavior.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 5 of =notes/passepartout-v3.0.0-roadmap.org= and Contribution 4 from
|
|
||||||
=notes/passepartout-whitehead.org=. The architecture note's Option 1
|
|
||||||
(auto-formalizer KG) converges with Option 4 (one memex, two indices) here —
|
|
||||||
VivaceGraph is the persistence layer for the symbolic index within the
|
|
||||||
one-memex-two-indices architecture.
|
|
||||||
|
|
||||||
* Phase 6: ACL2 for Structural Verification (~200 lines — new skill)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Wrap ACL2 as a skill. Prove structural properties of the KG type hierarchy and
|
|
||||||
rule sets. Not for empirical claims.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The architecture note positions ACL2 as verifying LLM-proposed facts. But many
|
|
||||||
facts are empirical ("this command is destructive on Linux"), not logical. The
|
|
||||||
Whitehead note clarifies the right role: structural verification. ACL2 proves
|
|
||||||
that the type hierarchy has no cycles, that the rule set is non-contradictory,
|
|
||||||
and that the gate-to-fact bootstrap preserves the Dispatcher's intent. These are
|
|
||||||
structural properties that can be formally verified, not empirical claims that
|
|
||||||
depend on external reality.
|
|
||||||
|
|
||||||
** Implementation — =org/symbolic-acl2.org= → =lisp/symbolic-acl2.lisp= (skill)
|
|
||||||
|
|
||||||
*** Type consistency proofs
|
|
||||||
|
|
||||||
=(acl2-verify-type-hierarchy facts)= — prove that the KG type hierarchy has no
|
|
||||||
cycles: no entity of type-level 3 depends on an entity of type-level 5, no parent
|
|
||||||
category has a child that subsumes it, no category is its own ancestor via the
|
|
||||||
child-of relation. These are structural properties of the graph, independent of
|
|
||||||
what the facts /say/.
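
The "no category is its own ancestor" property can be illustrated with an executable check. The sketch below is plain Common Lisp, not ACL2, and it tests one concrete graph rather than proving a theorem; the function name and the =(child . parent)= edge representation are assumptions made for the example.

```lisp
;; Sketch only: FIND-CHILD-OF-CYCLE is a hypothetical helper, and the
;; (child . parent) edge list is an assumed representation of child-of facts.
(defun find-child-of-cycle (edges)
  "Return some node that is its own ancestor via EDGES, or NIL if none is."
  (let ((parents (make-hash-table :test #'equal)))
    (loop for (child . parent) in edges
          do (push parent (gethash child parents)))
    (labels ((reaches-p (from target seen)
               ;; Depth-first walk up the parent links, marking visited nodes
               ;; so that unrelated cycles cannot cause endless recursion.
               (loop for p in (gethash from parents)
                     thereis (or (equal p target)
                                 (and (not (gethash p seen))
                                      (progn
                                        (setf (gethash p seen) t)
                                        (reaches-p p target seen)))))))
      (loop for child being the hash-keys of parents
            when (reaches-p child child (make-hash-table :test #'equal))
              return child))))
```

A check like this could serve as the oracle for the synthetic-cycle test, with ACL2 supplying the general proof.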
|
|
||||||
|
|
||||||
*** Rule set consistency
|
|
||||||
|
|
||||||
=(acl2-verify-rule-consistency rules)= — prove that the accumulated Dispatcher
|
|
||||||
rules (from HITL approvals) are non-contradictory: no rule allows a command that
|
|
||||||
another rule blocks, no rule permits a path access that another denies. If the
|
|
||||||
rule set is contradictory, ACL2 identifies the contradictory subset with the
|
|
||||||
provenance of each rule. The human resolves the contradiction.
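
A contradiction report of this shape can be sketched in plain Common Lisp. The rule plist layout (=:verdict=, =:pattern=, =:provenance=) and the function name are assumptions for illustration, and the sketch only catches rules with identical patterns; overlapping patterns would need real pattern reasoning.

```lisp
;; Sketch only: the plist keys and FIND-RULE-CONTRADICTIONS are hypothetical.
;; Two rules contradict here when they share a :pattern but differ in :verdict.
(defun find-rule-contradictions (rules)
  "Return a list of (rule-a rule-b) pairs with equal patterns and opposite verdicts."
  (loop for (a . rest) on rules
        nconc (loop for b in rest
                    when (and (equal (getf a :pattern) (getf b :pattern))
                              (not (eq (getf a :verdict) (getf b :verdict))))
                      collect (list a b))))
```

Each returned pair carries both rules in full, so the =:provenance= of each side is available to the human who resolves the conflict.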
|
|
||||||
|
|
||||||
*** Extraction verification
|
|
||||||
|
|
||||||
=(acl2-verify-bootstrap-preservation)= — prove that the gate-to-fact bootstrap
|
|
||||||
(Phase 0-1) preserves the Dispatcher's intent: every blocked pattern in the gate
|
|
||||||
stack maps to a fact in the store; every fact with =:provenance :gate-outcome= is
|
|
||||||
grounded in a specific gate vector; no gate-bootstrapped fact contradicts another
|
|
||||||
gate-bootstrapped fact.
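
The "no missing or extra facts" half of this property amounts to a set comparison. The sketch below assumes blocked patterns are strings and facts are plists with =:pattern= and =:provenance= keys; both the layout and the name =bootstrap-diff= are hypothetical.

```lisp
;; Sketch only: BOOTSTRAP-DIFF and the fact plist layout are assumptions.
(defun bootstrap-diff (blocked-patterns facts)
  "Return two values: patterns with no gate-outcome fact, and gate-outcome
fact patterns that match no blocked pattern."
  (let ((fact-patterns (loop for fact in facts
                             when (eq (getf fact :provenance) :gate-outcome)
                               collect (getf fact :pattern))))
    (values (set-difference blocked-patterns fact-patterns :test #'equal)
            (set-difference fact-patterns blocked-patterns :test #'equal))))
```

Both values being empty corresponds to a bootstrap that preserved the gate stack exactly.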
|
|
||||||
|
|
||||||
** Not in scope
|
|
||||||
|
|
||||||
ACL2 does not verify that =rm -rf /= is destructive. That is an empirical claim
|
|
||||||
about Linux. Screamer handles empirical consistency (does this new claim
|
|
||||||
contradict existing observations?). ACL2 handles structural consistency (does
|
|
||||||
this reasoning structure have formal flaws?). The boundary is: empirical claims
|
|
||||||
go to Screamer; structural claims go to ACL2.
|
|
||||||
|
|
||||||
** Verification — ~4 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-acl2-type-hierarchy-no-cycles= — a synthetic KG with a type-level cycle
|
|
||||||
is detected and reported.
|
|
||||||
2. =test-acl2-rule-set-contradiction-detected= — two Dispatcher rules that
|
|
||||||
contradict each other produce a contradiction report with provenance.
|
|
||||||
3. =test-acl2-bootstrap-preservation= — the bootstrap extraction from the gate
|
|
||||||
stack is verified to have no missing or extra facts.
|
|
||||||
4. =test-acl2-not-loaded-graceful-degradation= — when ACL2 is not installed, the
|
|
||||||
skill loads but returns ":ACL2 not available — structural verification
|
|
||||||
disabled" without crashing.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 6 of =notes/passepartout-v3.0.0-roadmap.org=. ACL2's role is refined in
|
|
||||||
=passepartout-neurosymbolic-design-decisions-and-options.org= from the
|
|
||||||
architecture note's broader claim to the structural verification scope.
|
|
||||||
|
|
||||||
* Phase 7: The 10-80-10 Planner (~500 lines — new skills, last phase)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
A planning engine built on the mature symbolic index. Screamer expresses task
|
|
||||||
planning as a constraint satisfaction problem. ACL2 verifies plans for structural
|
|
||||||
soundness. The LLM handles the I/O boundaries (natural language → structured goal on
input, verified plan → natural language response on output). The symbolic engine
handles the reasoning.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
This is v3.0.0 as described in the architecture note and the ROADMAP. It is the
|
|
||||||
final phase because it requires a populated, queried, and trusted symbolic index.
|
|
||||||
The full planner is useless without a mature ontology and a proven deducer. By
|
|
||||||
the time Phase 7 begins, Phases 0-6 have accumulated months of gate outcomes,
|
|
||||||
Screamer deductions, verified LLM proposals, and human-authored facts. The
|
|
||||||
symbolic index has achieved sufficiency. The ontology has stabilized through use.
|
|
||||||
The planner is built on a foundation, not a speculation.
|
|
||||||
|
|
||||||
** Implementation — =org/symbolic-planner.org= → =lisp/symbolic-planner.lisp= (skill)
|
|
||||||
|
|
||||||
*** Task decomposition as constraint satisfaction
|
|
||||||
|
|
||||||
The user specifies a goal: "refactor the authentication module to support OAuth2."
|
|
||||||
The LLM translates this to a structured goal plist. Screamer expresses the planning
|
|
||||||
problem:
|
|
||||||
|
|
||||||
- /Variables/: subtasks (write OAuth2 client, add token store, update auth
|
|
||||||
middleware, write tests, update documentation)
|
|
||||||
- /Constraints/: dependency ordering (tests depend on implementation), resource
|
|
||||||
limits (one file write at a time), safety invariants (no modification of
|
|
||||||
=core-*= files)
|
|
||||||
- /Objective/: find an ordering that satisfies all constraints
|
|
||||||
|
|
||||||
Screamer returns a viable plan or reports unsolvability with the conflicting
|
|
||||||
constraints.
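
The dependency-ordering part of the problem can be sketched without Screamer. The example below is a plain topological ordering in Common Lisp, standing in for the constraint solver; it handles only "A depends on B" constraints, and the function name and data layout are assumptions.

```lisp
;; Sketch only: ORDER-SUBTASKS is hypothetical and is NOT Screamer. It shows
;; the shape of the result (a plan, or the tasks that cannot be scheduled).
(defun order-subtasks (tasks depends-on)
  "TASKS is a list of symbols; DEPENDS-ON is a list of (task . prerequisite).
Return (values plan t) on success, or (values stuck-tasks nil) when the
remaining tasks wait on each other."
  (let ((done '())
        (pending (copy-list tasks)))
    (loop
      (let ((ready (remove-if-not
                    (lambda (task)
                      ;; A task is ready once all of its prerequisites are done.
                      (loop for (dependent . prerequisite) in depends-on
                            always (or (not (eq dependent task))
                                       (member prerequisite done))))
                    pending)))
        (cond ((null pending) (return (values (reverse done) t)))
              ((null ready) (return (values pending nil)))
              (t (dolist (task ready)
                   (push task done))
                 (setf pending (remove-if (lambda (task) (member task ready))
                                          pending))))))))
```

For example, =(order-subtasks '(a b) '((a . b) (b . a)))= returns the stuck tasks and =nil=, which mirrors the "reports unsolvability with the conflicting constraints" behavior in miniature.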
|
|
||||||
|
|
||||||
*** Plan verification
|
|
||||||
|
|
||||||
ACL2 proves that the plan contains no deadlocks (two subtasks waiting on each
|
|
||||||
other), no dependency cycles (A depends on B depends on C depends on A), and
|
|
||||||
no safety violations (no plan step requires a gate-blocked operation).
|
|
||||||
|
|
||||||
If verification fails, ACL2 identifies the failing subtask and the violated
|
|
||||||
constraint. The planner re-decomposes the problematic branch (the existing
|
|
||||||
ROADMAP's branch pruning, v0.11.0, but symbolically rather than neurally).
|
|
||||||
|
|
||||||
*** Neuro-symbolic boundary
|
|
||||||
|
|
||||||
The LLM handles the I/O boundaries:
|
|
||||||
|
|
||||||
- *Input* (10%): natural language → structured goal plist. "Refactor auth for
|
|
||||||
OAuth2" → =(:goal :refactor-component :target :auth-module :add-feature :oauth2)=.
|
|
||||||
Small prompt, formulaic translation, ~100 tokens.
|
|
||||||
- *Reasoning* (80%): Screamer plans. ACL2 verifies. VivaceGraph provides the
|
|
||||||
facts about file structure, dependencies, and gate constraints. Zero LLM
|
|
||||||
tokens.
|
|
||||||
- *Output* (10%): structured plan → natural language response. The verified plan
|
|
||||||
plist is formatted as "I'll refactor the authentication module in 5 steps:
|
|
||||||
1) Create the OAuth2 client (depends on: nothing, modifies: auth/client.lisp)
|
|
||||||
2) Add the token store..." Small prompt, formulaic translation, ~150 tokens.
|
|
||||||
|
|
||||||
*** TUI visualization
|
|
||||||
|
|
||||||
The plan is rendered as an Org headline tree in the TUI, with each subtask as a
|
|
||||||
node showing its terminal state (=todo=, =next-action=, =in-progress=, =done=,
|
|
||||||
=blocked=, =stuck=), its constraints, and its verified properties. This is the
|
|
||||||
same task tree visualization planned for v0.11.0 in the feature roadmap, but
|
|
||||||
with the addition of Screamer constraint annotations and ACL2 verification
|
|
||||||
badges.
|
|
||||||
|
|
||||||
** Verification — ~6 FiveAM tests
|
|
||||||
|
|
||||||
1. =test-goal-plist-from-natural-language= — natural language input produces
|
|
||||||
correct structured goal plist (LLM-dependent but formulaic; tested with
|
|
||||||
deterministic mock).
|
|
||||||
2. =test-screamer-plan-satisfies-constraints= — Screamer produces a plan that
|
|
||||||
satisfies all specified dependencies and safety constraints.
|
|
||||||
3. =test-screamer-report-unsolvable= — Screamer reports unsolvability when
|
|
||||||
constraints are contradictory.
|
|
||||||
4. =test-acl2-verifies-plan-no-cycles= — ACL2 verifies a valid plan has no
|
|
||||||
dependency cycles.
|
|
||||||
5. =test-acl2-rejects-cyclic-plan= — ACL2 detects a dependency cycle in an
|
|
||||||
invalid plan.
|
|
||||||
6. =test-plan-to-natural-language= — structured plan plist produces readable
|
|
||||||
natural language output.
|
|
||||||
|
|
||||||
** Relation to Other Work
|
|
||||||
|
|
||||||
This is Phase 7 of =notes/passepartout-v3.0.0-roadmap.org=. It corresponds to the ROADMAP's
|
|
||||||
v0.9.0 (task planning) and v3.0.0 (full 10-80-10 architecture). It is the last
|
|
||||||
component because it depends on a mature symbolic index from Phases 0-6.
|
|
||||||
|
|
||||||
* Phase 8+: Semantic Wikipedia Integration (TBD lines — optional acceleration)
|
|
||||||
|
|
||||||
** What
|
|
||||||
|
|
||||||
Load Wikidata entities referenced in the memex into the symbolic index. Every
|
|
||||||
entity the user's prose mentions gets its Wikidata property graph — type hierarchy,
|
|
||||||
relations, dates, citations — as triples with =:provenance :wikidata=.
|
|
||||||
|
|
||||||
** Rationale
|
|
||||||
|
|
||||||
The gate stack provides 50-70 entity classes — adequate for a coding agent.
|
|
||||||
For a general-knowledge memex containing literature, philosophy, history,
|
|
||||||
science, and daily life, 50-70 is starvation. Organic growth through prose
|
|
||||||
extraction (Phase 3) would take years to cover the entities mentioned in a single
|
|
||||||
reading of /Pale Fire/. Wikidata has already done this work at scale.
|
|
||||||
|
|
||||||
The LLM's role in extraction shrinks dramatically. Without Wikidata, the archivist
|
|
||||||
must /discover/ that Nabokov wrote /Pale Fire/, lectured on Kafka, and emigrated
|
|
||||||
from Russia — extracting each triple from prose. With Wikidata, the Nabokov entity
|
|
||||||
is pre-structured. The archivist's job changes from "discover entities" to
|
|
||||||
"connect your heading to the existing entity."
|
|
||||||
|
|
||||||
** Implementation sketch
|
|
||||||
|
|
||||||
1. *Index referenced entities.* Scan memex prose for entity names (capitalized
|
|
||||||
noun phrases, names in Org links, headings in =literature/= directories). For
|
|
||||||
each, attempt Wikidata entity resolution (string match, disambiguation via
|
|
||||||
context).
|
|
||||||
|
|
||||||
2. *Load N-hop property net.* For each resolved entity, load its Wikidata
|
|
||||||
properties: instance-of, subclass-of, authored, published-in, influenced-by,
|
|
||||||
birth-date, death-date, etc. Load the same for entities directly connected
|
|
||||||
to it (1-hop neighbors). Optionally expand to 2-hop for deeply connected
|
|
||||||
domains.
|
|
||||||
|
|
||||||
3. *Admit with co-existent policy.* Wikidata facts are admitted with
|
|
||||||
=:provenance :wikidata= and contradiction policy =:coexistent=. They do not
|
|
||||||
override your memex's facts. They sit alongside them. Contradictions are
|
|
||||||
surfaced, not resolved.
|
|
||||||
|
|
||||||
4. *Cross-domain query.* "What does my memex say about Nabokov that Wikidata
|
|
||||||
doesn't?" "Where does my memex disagree with Wikidata?" "What entities in my
|
|
||||||
memex have no Wikidata counterpart?" These queries are pure VivaceGraph
|
|
||||||
traversals — zero LLM tokens.
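
The "where does my memex disagree with Wikidata?" query from steps 3 and 4 can be sketched over an in-memory list of facts. This is plain Common Lisp, not a VivaceGraph traversal; the four-element fact layout and the function name are assumptions, and the comparison is only meaningful for single-valued predicates such as a birth year.

```lisp
;; Sketch only: DISAGREEMENTS and the (subject predicate object provenance)
;; layout are hypothetical. Multi-valued predicates (e.g. authored) would
;; need a different comparison.
(defun disagreements (facts)
  "Return (memex-fact wikidata-fact) pairs that share subject and predicate
but differ in object. Both facts stay in the store; nothing is resolved."
  (loop for m in facts
        when (eq (fourth m) :memex)
          nconc (loop for w in facts
                      when (and (eq (fourth w) :wikidata)
                                (equal (first m) (first w))
                                (equal (second m) (second w))
                                (not (equal (third m) (third w))))
                        collect (list m w))))
```

Because the function only reports pairs, it matches the co-existent policy: contradictions are surfaced to the user and neither source overrides the other.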
|
|
||||||
|
|
||||||
** Not a Phase 0 prerequisite
|
|
||||||
|
|
||||||
Semantic Wikipedia integration is an accelerator, not a prerequisite. Phases
|
|
||||||
0-7 work without it — the ontology grows through gate outcomes, Screamer
|
|
||||||
deductions, LLM proposals, and human authoring. Wikidata compresses the timeline
|
|
||||||
for the broad domain but does not change the architecture. The admission gate
|
|
||||||
(Screamer), contradiction policies, provenance tracking, and neuro-symbolic
|
|
||||||
boundary are identical with or without it.
|
|
||||||
|
|
||||||
** Open question
|
|
||||||
|
|
||||||
How much Wikidata is the right amount? Loading entities referenced in the memex
|
|
||||||
is the minimum. Loading all entities within N hops of those references expands
|
|
||||||
the graph exponentially. The right N depends on the memex's breadth and the user's
|
|
||||||
query patterns. A memex focused entirely on software engineering may need only 1
|
|
||||||
hop. A memex spanning literature, history, philosophy, and science may need 3-4
|
|
||||||
hops. The query performance and memory costs of a large Wikidata load have not
|
|
||||||
been estimated.
|
|
||||||
|
|
||||||
* Summary: Lines and Dependencies
|
|
||||||
|
|
||||||
| Phase | Component            | Lines | New Skill? | Depends On | Earliest Release |
|-------+----------------------+-------+------------+------------+------------------|
| 0     | PM-type-level gates  | ~30   | No         | Dispatcher | Immediately      |
| 1     | Triple fact store    | ~150  | Yes        | Phase 0    | v0.7.2+          |
| 2     | Screamer admission   | ~200  | Yes        | Phase 1    | v0.7.2+          |
| 3     | Archivist extraction | ~100  | Extends    | Phase 2    | v0.8.0+          |
| 4     | Flip — sufficiency   | ~50   | Extends    | Phase 3    | v0.8.0+          |
| 5     | VivaceGraph store    | ~300  | Yes        | Phase 4    | v0.10.0+         |
| 6     | ACL2 verification    | ~200  | Yes        | Phase 5    | v0.12.0+         |
| 7     | 10-80-10 planner     | ~500  | Yes        | Phase 6    | v3.0.0           |
| 8+    | Semantic Wikipedia   | TBD   | Yes        | Phase 5    | TBD              |
|-------+----------------------+-------+------------+------------+------------------|
| Total |                      | ~1530 |            |            |                  |
|
|
||||||
|
|
||||||
This roadmap is independent of the feature roadmap in
|
|
||||||
=passepartout/docs/ROADMAP.org=. Phase 0 ships alongside any v0.7.x patch. The
|
|
||||||
symbolic engine grows in parallel with feature work (TUI improvements, MCP tools,
|
|
||||||
gateway expansion, etc.), not after it. The feature roadmap describes /what/ the
|
|
||||||
agent can do; this roadmap describes /how/ it knows what it knows.
|
|
||||||
|
|
||||||
The total new code across all phases is approximately 1,530 lines. Relative to
|
|
||||||
the existing codebase (~8,000+ lines across 40+ Org source files and 30+ skills),
|
|
||||||
the symbolic engine is a ~20% addition. Relative to the ROADMAP's planned feature
|
|
||||||
work through v0.13.0 (thousands of lines of TUI rendering, MCP protocol
|
|
||||||
implementation, skin engine, planning, etc.), the symbolic engine is a small,
|
|
||||||
orthogonal thread that grows the architecture's reasoning depth while the feature
|
|
||||||
work grows its interaction breadth.
|
|
||||||
|
|
||||||
* Competitive Advantage Analysis
|
|
||||||
|
|
||||||
** Phase 0-1: Deterministic safety, now with type-level guarantees
|
|
||||||
|
|
||||||
The existing Dispatcher gate stack already provides 0-LLM-token safety verification.
|
|
||||||
Phase 0 adds structural guarantees: no heuristic bypassing of the type hierarchy.
|
|
||||||
A request to modify the dispatcher's own rules is impossible by construction, not
|
|
||||||
just caught by pattern matching. No competitor has this — their equivalent of
|
|
||||||
"core file protection" is a prompt instruction, not a type system.
|
|
||||||
|
|
||||||
** Phase 2-3: Verified extraction — the symbolic index grows without corruption
|
|
||||||
|
|
||||||
No competitor verifies extracted facts against an existing knowledge base. Their
|
|
||||||
memory systems (Claude Code's ~extractMemories~, Hermes's MemoryProvider, OpenClaw's
|
|
||||||
session transcripts) record what the LLM /said/ happened, not what the system
|
|
||||||
/proved/ happened. Passepartout's Screamer-gated admission makes the symbolic index
|
|
||||||
a monotonic, verified structure. Facts are admitted because they are consistent,
|
|
||||||
not because the LLM generated them.
|
|
||||||
|
|
||||||
** Phase 4-5: Self-accelerating knowledge — the downward cost curve
|
|
||||||
|
|
||||||
The sufficiency criterion makes Passepartout's "cheaper over time" thesis
|
|
||||||
measurable. As the ratio of non-lossy facts grows, LLM calls for extraction
|
|
||||||
decrease. At sufficiency, extraction of known categories becomes deterministic.
|
|
||||||
The downward cost curve is not a marketing claim — it is a structural property
|
|
||||||
of the architecture, visible through the sufficiency score.
|
|
||||||
|
|
||||||
** Phase 6-7: Provable plan soundness
|
|
||||||
|
|
||||||
No competitor verifies task plans against formal constraints. Claude Code plans
|
|
||||||
in a single LLM call with no post-hoc verification. Hermes decomposes tasks into
|
|
||||||
subtasks but does not prove them non-contradictory. Passepartout's ACL2-verified
|
|
||||||
plans are structurally guaranteed to have no deadlocks, no dependency cycles,
|
|
||||||
and no safety violations. The verification is a proof, not a prompt.
|
|
||||||
|
|
||||||
** Semantic Wikipedia: Entity coverage at zero marginal cost
|
|
||||||
|
|
||||||
No competitor has a general-knowledge entity graph because no competitor has a
|
|
||||||
symbolic engine to populate. Claude Code knows codebases; it doesn't know that
|
|
||||||
Nabokov wrote /Pale Fire/ and lectured on Kafka. Passepartout with Wikidata
|
|
||||||
loaded knows both, and the entity knowledge costs zero LLM tokens — it is loaded
|
|
||||||
once as structured data and queried via VivaceGraph traversals.
|
|
||||||
|
|
||||||
** The permanent competitive advantage
|
|
||||||
|
|
||||||
The competitive advantage is not any single feature. It is the architecture's
|
|
||||||
ability to accumulate verified knowledge from four independent sources (gates,
|
|
||||||
deduction, verified LLM proposals, human authoring) and to make that knowledge
|
|
||||||
queryable with provenance. Competitors accumulate chat transcripts. Passepartout
|
|
||||||
accumulates a provenanced, self-verifying knowledge graph. Transcripts become
|
|
||||||
stale and unreliable. The knowledge graph becomes richer and more trustworthy
|
|
||||||
with every session.
|
|
||||||
|
|
||||||
* What Is NOT Built
|
|
||||||
|
|
||||||
1. *A separate knowledge graph serialization format before the ephemeral phase
|
|
||||||
proves what facts are useful.* Premature format commitment is the ontology
|
|
||||||
problem writ small. Let use determine the format.
|
|
||||||
|
|
||||||
2. *ACL2 verification of empirical claims.* Apple is red. =rm -rf /= is destructive.
|
|
||||||
These are observations, not theorems. Screamer handles empirical consistency.
|
|
||||||
ACL2 handles structural verification.
|
|
||||||
|
|
||||||
3. *VivaceGraph before Screamer.* The admission gate is the critical path. The
|
|
||||||
persistence layer is an optimization of a working system.
|
|
||||||
|
|
||||||
4. *A per-fact ontology designed upfront.* Extract from the gate stack, extend
|
|
||||||
from deductions and observations, prune through contradiction detection. The
|
|
||||||
ontology is a garden, not a building.
|
|
||||||
|
|
||||||
5. *New core ASDF components.* Every phase is a skill. A corrupted symbolic
|
|
||||||
engine degrades reasoning but does not kill the agent, which satisfies the
self-repair criterion.
|
|
||||||
|
|
||||||
6. *A "complete" symbolic index for the broad domain.* The neural index is the
|
|
||||||
permanent gateway to the richness of prose. The symbolic index handles what
|
|
||||||
can be mechanically verified. The boundary is permanent, not transitional.
|
|
||||||
The neuro is the brain. The symbolic is the education.
|
|
||||||
|
|
||||||
* Relation to the Feature Roadmap
|
|
||||||
|
|
||||||
The feature roadmap (=passepartout/docs/ROADMAP.org=) describes Passepartout's
|
|
||||||
evolution through v0.13.0: TUI improvements, MCP-native tools, task planning,
|
|
||||||
skill creation, evaluation harnesses, voice gateways, themes, and channels.
|
|
||||||
These are /interaction surface/ features — they expand what the agent can do.
|
|
||||||
|
|
||||||
This roadmap describes the /reasoning substrate/ — it deepens how the agent
|
|
||||||
knows what it knows. It is independent of the feature sequence. Phase 0 can ship
|
|
||||||
alongside any v0.7.x patch. Phases 1-4 ship during the v0.8.x-v0.10.x feature
|
|
||||||
cycle. Phases 5-7 ship during the v0.11.x-v0.13.x cycle.
|
|
||||||
|
|
||||||
The two roadmaps converge at v3.0.0: the feature roadmap provides the interaction
|
|
||||||
surface (a polished TUI, a rich tool ecosystem, a multi-gateway communication
|
|
||||||
layer); this roadmap provides the reasoning depth (a provenanced knowledge graph,
|
|
||||||
a deterministic constraint solver, a verified planning engine). The surface
|
|
||||||
without the substrate is a chat agent with good UX. The substrate without the
|
|
||||||
surface is a theorem prover without a user. Together, they are the v3.0.0
|
|
||||||
architecture.
|
|
||||||
|
|
||||||
See also:
|
|
||||||
|
|
||||||
- =notes/passepartout-neurosymbolic-design-decisions-and-options.org= — the
|
|
||||||
design rationale for every decision in this roadmap
|
|
||||||
- =notes/passepartout-symbolic-engine-exploration.org= — the original architecture
|
|
||||||
exploration and five architecture options
|
|
||||||
- =notes/passepartout-whitehead.org= — Whitehead's four concrete contributions
|
|
||||||
- =passepartout/docs/ROADMAP.org= — the feature roadmap through v0.13.0
|
|
||||||
- =passepartout/docs/ARCHITECTURE.org= — the current pipeline architecture
|
|
||||||
- =notes/passepartout-v3.0.0-roadmap.org= — the original concrete plan (superseded by this
|
|
||||||
document)
|
|
||||||
|
|||||||
180
projects/AGENTS.md
Normal file
@@ -0,0 +1,180 @@
|
|||||||
|
# AGENTS.md
|
||||||
|
|
||||||
|
## Development Cycle (every change)
|
||||||
|
|
||||||
|
0. **Start the runtime** — boot the Lisp image that loads your project.
|
||||||
|
For passepartout: `passepartout daemon` (loads the entire project into one SBCL image).
|
||||||
|
For standalone CL projects: SBCL with `(ql:quickload :your-project)`.
|
||||||
|
The running image IS the development environment. The REPL is mandatory.
|
||||||
|
The SBCL fallback below exists only for bootstrapping (when the runtime cannot
|
||||||
|
start) and CI.
|
||||||
|
|
||||||
|
1. **Read the next TODO** — find the next unreached `*** TODO` item in
|
||||||
|
`docs/ROADMAP.org` (search `*** TODO`). Read its prose, `:PROPERTIES:`,
|
||||||
|
and estimated line budget. That item is the target for this change cycle.
|
||||||
|
|
||||||
|
2. **Create a branch** — `git checkout -b feature/<version>-<name>` from main.
|
||||||
|
Every feature develops in its own branch. Branches are cheap, disposable,
|
||||||
|
and keep abandoned work off main. Name the branch after the version and
|
||||||
|
a short slug: `feature/v0.1.0-layout-engine`, `feature/v0.9.0-eval-harness`.
|
||||||
|
Complex features that span multiple phases may use a single branch with
|
||||||
|
multiple commits rather than one branch per phase.
|
||||||
|
|
||||||
|
3. **Think in org** — write your reasoning, goals, and approach in the .org file first.
|
||||||
|
|
||||||
|
4. **Write contract** — define a `** Contract` section listing each function's behavior:
|
||||||
|
`(fn-name args)`: description. Returns/guarantees ...
|
||||||
|
|
||||||
|
5. **TDD in REPL** — the inner loop runs entirely in the running image:
|
||||||
|
|
||||||
|
a. **Write tests in org** — add `fiveam:test` forms to the `* Test Suite` section
|
||||||
|
of the .org source file. Tests are definitions, not explorations — write them
|
||||||
|
in the file first.
|
||||||
|
|
||||||
|
b. **Send tests to REPL → RED** — evaluate the test forms in the running image.
|
||||||
|
Run the suite. It must FAIL — the implementation doesn't exist yet.
|
||||||
|
Record the failure output in the .org file under the test.
|
||||||
|
|
||||||
|
c. **Develop implementation in REPL** — redefine functions directly in the
|
||||||
|
running image. Explore. Discover the real argument shapes, edge cases, and
|
||||||
|
helper functions through interaction, not speculation. Each `defun` in the
|
||||||
|
REPL is immediate — no tangle, no reload, sub-second feedback.
|
||||||
|
|
||||||
|
d. **Run tests → GREEN** — after each change, re-run the suite from the REPL.
|
||||||
|
When all tests pass, the implementation is complete. If still RED, return to
|
||||||
|
step c. Record the passing output in the .org file under the test.
|
||||||
|
|
||||||
|
e. **Copy code to org** — copy each finished function from the REPL into its
|
||||||
|
own `#+begin_src lisp` block in the .org file. The code is already working;
|
||||||
|
the file is now its permanent home. One function per block. Never write a
|
||||||
|
function in a file that hasn't been proven in the image.
|
||||||
|
|
||||||
|
6. **Update literate prose** — write/update the explanatory text around the code:
|
||||||
|
what it does, why it exists, how it connects to the rest of the system.
|
||||||
|
|
||||||
|
7. **Tangle** — generate the .lisp file from the .org source:
|
||||||
|
```
|
||||||
|
emacs --batch --eval "(progn (require 'org) (find-file \"org/FILE.org\") (org-babel-tangle) (kill-buffer))"
|
||||||
|
```
|
||||||
|
Tangling is a finalization step, not part of the inner loop. The inner loop
|
||||||
|
(steps 5a–5e) happens entirely in the REPL. Tangle once, when the file is
|
||||||
|
ready to commit.
|
||||||
|
|
||||||
|
8. **Run full test suite** — from the REPL, run every test suite in the project:
|
||||||
|
```
|
||||||
|
(fiveam:run-all-tests)
|
||||||
|
```
|
||||||
|
This catches regressions across the entire system. A function that passes its
|
||||||
|
own tests but breaks another module is not done.
|
||||||
|
|
||||||
|
9. **Validate block balance** — check that every `#+begin_src lisp` block in the
|
||||||
|
modified .org files has balanced parentheses. Use your project's equivalent
|
||||||
|
function or the SBCL fallback below.
|
||||||
|
|
||||||
|
10. **Commit on the branch** — include the RED and GREEN test output recorded
|
||||||
|
in the .org file as part of the commit message evidence:
|
||||||
|
```
|
||||||
|
git add org/ lisp/ docs/
|
||||||
|
git commit -m "v0.9.0: eval harness — 10 tasks, regression detection
|
||||||
|
|
||||||
|
RED: 0/10 pass (tasks not yet defined)
|
||||||
|
GREEN: 10/10 pass"
|
||||||
|
```
|
||||||
|
|
||||||
|
11. **Mark the origin TODO DONE** — in `docs/ROADMAP.org`, change the
|
||||||
|
`*** TODO` item to `*** DONE` and add a `:LOGBOOK:` entry with the
|
||||||
|
completion date. This is a separate commit on the branch:
|
||||||
|
#+begin_src org
|
||||||
|
:LOGBOOK:
|
||||||
|
- State "DONE" from "TODO" [YYYY-MM-DD Day]
|
||||||
|
:END:
|
||||||
|
#+end_src
|
||||||
|
|
||||||
|
12. **Merge to main** — the merge IS the release. Rebase onto main first
|
||||||
|
to keep history linear, then fast-forward merge:
|
||||||
|
```
|
||||||
|
git rebase main feature/v0.9.0-eval-harness
git checkout main
git merge --ff-only feature/v0.9.0-eval-harness
|
||||||
|
```
|
||||||
|
|
||||||
|
13. **Bump the submodule** — if the project is a submodule in the parent
|
||||||
|
`memex` repo (e.g., passepartout), stage the submodule pointer and commit:
|
||||||
|
```
|
||||||
|
git add projects/passepartout
|
||||||
|
git commit -m "bump passepartout → v0.9.0"
|
||||||
|
```
|
||||||
|
Standalone projects skip this step.
|
||||||
|
|
||||||
|
14. **Delete the branch** — `git branch -d feature/v0.9.0-eval-harness`.
|
||||||
|
Abandoned (unmerged) branches need `git branch -D` instead; no other cleanup is required.
|
||||||
|
|
||||||
|
## Branch Policy
|
||||||
|
|
||||||
|
- Every feature starts on a branch from main. Branch names: `feature/<version>-<slug>`.
|
||||||
|
- ROADMAP.org changes (DONE markers, LOGBOOK entries) happen on the branch, not
|
||||||
|
on main directly. They merge to main with the feature.
|
||||||
|
- If a feature fails or is abandoned, delete the branch. No revert commits, no
|
||||||
|
dead code on main, no `;; OBSOLETE` comments. Git history preserves the
|
||||||
|
experiment if you need to reference it later.
|
||||||
|
- Rebase onto main before merging. Keep history linear. No merge commits.
|
||||||
|
- Complex features that span multiple roadmap versions may live on one branch
|
||||||
|
with multiple commits, merging to main when the entire chain is stable.
|
||||||
|
- **Bug fixes, typos, docs-only edits, and single-session jobs do not get a
|
||||||
|
branch.** Commit them directly to main. The heuristic: if it can be finished
|
||||||
|
in one session and has no plausible alternative that could replace it, it
|
||||||
|
goes to main. If it spans sessions or might be abandoned for a better
|
||||||
|
approach, it gets a branch.
|
||||||
|
|
||||||
|
## Commands
|
||||||
|
|
||||||
|
Tangle a single file:
|
||||||
|
emacs --batch --eval "(progn (require 'org) (find-file \"org/FILE.org\") (org-babel-tangle) (kill-buffer))"
|
||||||
|
|
||||||
|
Validate structural integrity (org/ source files only):
|
||||||
|
emacs --batch -Q --eval '(progn (find-file "org/FILE.org") (check-parens) (kill-buffer))'
|
||||||
|
|
||||||
|
Run tests (from REPL):
|
||||||
|
(fiveam:run (intern "SUITE-NAME" :project-TESTS))
|
||||||
|
(fiveam:run-all-tests)
|
||||||
|
|
||||||
|
Run tests (SBCL fallback — only when the runtime cannot start):
|
||||||
|
sbcl --noinform \
|
||||||
|
--eval '(load (merge-pathnames "quicklisp/setup.lisp" (user-homedir-pathname)))' \
|
||||||
|
--eval '(ql:quickload :your-project :silent t)' \
|
||||||
|
--eval '(load "lisp/FILE.lisp")' \
|
||||||
|
--eval '(fiveam:run (intern "SUITE-NAME" :project-TESTS))' --quit
|
||||||
|
|
||||||
|
For error details, bind `fiveam:*on-failure*` to `:debug`.
|
||||||
|
|
||||||
|
## REPL — mandatory
|
||||||
|
|
||||||
|
All development happens in a running Lisp image. Start your runtime:
|
||||||
|
- Passepartout: `passepartout daemon` — boots the entire project, listens on port 9105
|
||||||
|
- Standalone CL projects: `sbcl` with `(ql:quickload :your-project)`
|
||||||
|
|
||||||
|
Send code from opencode using the `lisp` tool (any SBCL project) or the `repl`
|
||||||
|
tool (passepartout daemon on port 9105). The inner loop (steps 5a–5e) never leaves
|
||||||
|
the REPL:
|
||||||
|
|
||||||
|
1. Send test forms from .org to REPL → RED
|
||||||
|
2. Redefine functions in REPL → test → iterate
|
||||||
|
3. Send tests → GREEN
|
||||||
|
4. Copy working code back to .org
|
||||||
|
|
||||||
|
Tangle only when the file is complete and ready to commit. Never batch-compile
|
||||||
|
outside the image when the runtime is available. Use the SBCL fallback above only
|
||||||
|
when the runtime itself cannot start.
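
As an illustration of the RED → GREEN loop, here is a minimal FiveAM session that could be sent to the REPL form by form. The package, suite, and `slugify` function are hypothetical examples, and loading FiveAM through Quicklisp is assumed, as elsewhere in this file.

```lisp
;; Sketch only: DEMO-TESTS, DEMO-SUITE and SLUGIFY are made-up names.
(ql:quickload :fiveam :silent t)

(defpackage :demo-tests
  (:use :cl))
(in-package :demo-tests)

(fiveam:def-suite demo-suite
  :description "Contract: (slugify title) lowercases and hyphenates TITLE.")
(fiveam:in-suite demo-suite)

;; Step 1: the test exists before the implementation does.
(fiveam:test slugify-lowercases-and-hyphenates
  (fiveam:is (string= "layout-engine" (slugify "Layout Engine"))))

;; RED: SLUGIFY is still undefined, so the run reports an unexpected error.
(fiveam:run! 'demo-suite)

;; Step 2: define the function directly in the running image.
(defun slugify (title)
  "Lowercase TITLE and replace spaces with hyphens."
  (substitute #\- #\Space (string-downcase title)))

;; Step 3, GREEN: re-run the same suite, with no tangle and no reload.
(fiveam:run! 'demo-suite)
```

Once the second run passes, the `defun` is copied into its own source block in the .org file, and the two run outputs are recorded under the test.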
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
|
||||||
|
- .org is source of truth; .lisp is generated — never edit .lisp directly
|
||||||
|
- Every code change starts with a contract and a failing test
|
||||||
|
- Prove RED before writing implementation
|
||||||
|
- Implementation is developed in the REPL, then copied to .org — never write
|
||||||
|
code in a file that hasn't been proven in the image
|
||||||
|
- Validate before committing
|
||||||
|
- If a tool fails, explain why and ask before trying alternatives
|
||||||
|
- Before shipping a version, run the `** File Update Checklist` in `docs/ROADMAP.org`
|
||||||
|
- **YOU MAY NOT** push a version tag (e.g., `v0.5.0`), create a GitHub release,
|
||||||
|
or run `git push` that triggers CI/CD version workflows without explicit
|
||||||
|
permission. Ask first.
|
||||||
Submodule projects/passepartout updated: 138f909a33...96628d00e9
Submodule projects/passepartout-contrib updated: ce17336acd...825ef832ba