Amr Gharbeia e719443ce7 memex: update AGENTS.md, add passepartout design-decisions notes, SWOT + agora notes, bump submodules → v0.8.1

2026-05-10 07:11:08 -04:00

Passepartout Neurosymbolic + Agora Integration — SWOT Analysis

Premise and Scope
Will It Work Conceptually?
SWOT Analysis
What This Unlocks
- Technologically
- Socially
Conclusion

Premise and Scope

This analysis assumes the engineering is possible — Screamer can be wrapped, VivaceGraph can persist facts, ACL2 can verify structural properties, the archivist can extract triples from prose with Screamer verification, and the note-publishing bridge to Agora can be implemented. The question is not "can it be built?" but "does the architecture cohere? What does it enable? What does it miss?"

Will It Work Conceptually?

The short answer: yes, within a specific domain. The long answer: the boundary of that domain is the most important thing to get right.

The architecture's core insight is correct and load-bearing

The central design decision — "the LLM proposes; the symbolic engine decides whether to accept" — is sound. It is the inverse of every existing agent architecture. Claude Code, OpenCode, Hermes — all of them put the LLM in the driver's seat and add safety as an afterthought (prompt-based guardrails that consume tokens and can be evaded). Passepartout inverts this: the LLM proposes actions and facts, but a deterministic layer of gates, constraint solvers, and formal verifiers decides what to admit and what to execute. This inversion is the correct response to the hallucination problem. You cannot eliminate hallucination by making the LLM better. You eliminate it by not asking the LLM to do things that require certainty.
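The proposer/decider split can be illustrated with a minimal sketch. It is written in Python for brevity rather than in Passepartout's Lisp, and every name in it (`Proposal`, `decide`, the two example gates) is hypothetical, not the project's Dispatcher API. The only point it makes is structural: the proposal is data, and a deterministic function, not the LLM, returns the verdict.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Proposal:
    """An action proposed by the untrusted, probabilistic side (the LLM)."""
    tool: str
    argument: str


# A gate returns None to let the proposal pass, or a rejection reason.
Gate = Callable[[Proposal], Optional[str]]


def deny_destructive(p: Proposal) -> Optional[str]:
    # Hypothetical gate: block one well-known destructive shell command.
    if p.tool == "shell" and p.argument.strip().startswith("rm -rf /"):
        return "destructive command"
    return None


def deny_secret_paths(p: Proposal) -> Optional[str]:
    # Hypothetical gate: block reads of files that look like secrets.
    if p.tool == "read_file" and p.argument.endswith(".env"):
        return "secret file"
    return None


GATES: List[Gate] = [deny_destructive, deny_secret_paths]


def decide(p: Proposal, gates: List[Gate] = GATES) -> Tuple[bool, str]:
    """Deterministic decider: the first gate that objects wins."""
    for gate in gates:
        reason = gate(p)
        if reason is not None:
            return (False, reason)
    return (True, "admitted")
```

The same proposal always gets the same verdict, and no tokens are spent reaching it.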

The bootstrap mechanism — extracting 50-70 entity classes mechanically from the existing Dispatcher gate stack with zero new code — is genuinely elegant. It proves the pattern at minimal cost: code becomes facts, facts enable reasoning. Every new gate pattern adds to the ontology organically. This is the right way to start a knowledge base: not by designing a schema upfront, but by formalizing what the system already knows implicitly.
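A rough illustration of "code becomes facts", again in Python rather than the project's Lisp: if the gate stack is, in effect, a table of categories and the patterns they match, the seed facts fall out of a single loop. The table contents, the `is-a` relation name, and the `bootstrap_facts` helper are assumptions made up for this sketch.

```python
# Hypothetical stand-in for the Dispatcher's gate patterns:
# category name -> patterns that the gate already matches on.
GATE_PATTERNS = {
    "secret-file": [".env", "id_rsa"],
    "destructive-command": ["rm -rf", "mkfs"],
}


def bootstrap_facts(patterns):
    """Turn every (category, pattern) pair into a triple with provenance."""
    facts = []
    for category in sorted(patterns):
        for member in patterns[category]:
            facts.append({
                "triple": (member, "is-a", category),
                "provenance": "gate-bootstrap",
            })
    return facts


SEED_FACTS = bootstrap_facts(GATE_PATTERNS)
```

No LLM call and no hand-authored schema is involved; adding a pattern to the table adds a fact on the next bootstrap.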

The "one memex, two indices" architecture survives contact with reality

Option 4 (one memex with neural and symbolic indices over the same Org files) is the correct long-term architecture. The prose is the ground truth — always. The symbolic index is a derived view that can be thrown away and rebuilt. The neural index handles semantic search, associative leaps, and fuzzy matching. This division of labor is permanent, not transitional, because the domains they serve are fundamentally different kinds of knowledge.

The practical path — starting with Option 5 (ephemeral facts, no persistence) through Phases 1-4, then graduating to Option 4 with VivaceGraph persistence in Phase 5 — is the right sequence. It punts the serialization format problem until the fact language has been battle-tested. It keeps the cost of mistakes low. It treats the ontology as something discovered through use rather than designed upfront.

Wikipedia's ontology WOULD give it a running start — with caveats

Wikidata contains approximately 100 million entities with a decade of human curation: type hierarchies, relations, dates, citations, disambiguation. For a personal memex that mentions Nabokov, Pale Fire, Kafka, postmodernism, and butterfly migration, the gate stack's 50-70 entity classes are a starvation diet. Organic growth through prose extraction would take years to cover the entities in one person's engagement with a single novel.

Loading Wikidata's entity graph into the symbolic index transforms the archivist's job from "discover that Nabokov wrote Pale Fire" to "connect your heading to Wikidata entity Q36591." The second task is reference resolution, not knowledge extraction: simpler, more reliable, and in many cases doable without an LLM at all (string match against loaded entities). The notes claim this collapses the LLM's role to three thin boundaries: input translation, prose-to-candidate-triple for personal content, and result-to-prose formatting.
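A minimal sketch of LLM-free reference resolution by string match, under the assumption that loaded entities carry a label and a list of aliases. The entity IDs are the ones quoted in this document; the alias lists and the `build_index` / `resolve` helpers are illustrative. Returning every candidate instead of a single winner keeps ambiguous mentions visible to the caller.

```python
# Toy slice of a loaded entity table. IDs are those quoted in the text;
# labels and aliases are assumptions for illustration.
ENTITIES = {
    "Q36591": {"label": "Vladimir Nabokov", "aliases": ["Nabokov"]},
    "Q566744": {"label": "Dmitri Nabokov", "aliases": ["Nabokov"]},
}


def build_index(entities):
    """Map each case-folded label or alias to the set of entity IDs using it."""
    index = {}
    for qid, entity in entities.items():
        for name in [entity["label"], *entity["aliases"]]:
            index.setdefault(name.casefold(), set()).add(qid)
    return index


def resolve(mention, index):
    """Return all candidate IDs for a mention: [] unknown, 1 resolved, 2+ ambiguous."""
    return sorted(index.get(mention.strip().casefold(), set()))


INDEX = build_index(ENTITIES)
```

A single-element result can be linked directly; a longer list is exactly the case where extra context, and possibly the LLM, is still needed.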

The caveats are real:

Entity resolution (matching prose mentions to Wikidata entities) is genuinely hard. "Nabokov" in a diary might refer to Vladimir Nabokov (Q36591), his son Dmitri (Q566744), or someone else entirely. Disambiguation requires context that the symbolic engine doesn't have without LLM assistance.
Wikidata is biased toward English Wikipedia's coverage. A memex in Arabic, Farsi, or Amharic will find far fewer resolved entities. The "universal" in Wikidata is aspirational, not actual.
Wikidata's property graph is not an ontology in the formal sense. It is a collaboratively edited dataset with contradictions, gaps, and editorial wars frozen in time. Loading it directly into a symbolic index that expects consistency (Screamer checks, cardinality policies) will surface thousands of contradictions on ingest, many of which are Wikidata artifacts, not meaningful tensions.
N-hop expansion is unbounded. One hop from Nabokov hits hundreds of entities (his works, his family, his influences, his translators). Two hops hit thousands. Three hops hit tens of thousands. The notes say "3-4 hops" for a literary memex but don't estimate the entity count this implies. The claim that 5 million entities = ~400MB is the best-case hash-table figure; a graph with query indices will be larger, and Prolog-like queries over millions of nodes are not free.
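The missing entity-count estimate for N-hop expansion is cheap to obtain before committing to a hop limit. The sketch below is a plain breadth-first search with a depth cap over an adjacency dict; the toy graph is made up, and a real run would use the loaded Wikidata edges.

```python
from collections import Counter, deque


def expand(graph, seed, max_hops):
    """Return {entity: hop distance} for every entity within max_hops of seed."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def entities_per_hop(distance):
    """Count how many entities first appear at each hop."""
    return dict(Counter(distance.values()))


# Made-up toy graph, only to show the shape of the output.
TOY_GRAPH = {"seed": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": ["e"]}
```

Running `entities_per_hop(expand(graph, seed, k))` for k = 1, 2, 3, 4 against the real graph would turn "3-4 hops" into a concrete entity and memory budget.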

Still: even a partial Wikidata load with conservative hop limits would provide more ontology than the system could accumulate through years of organic growth. It is the right accelerator, and the architecture handles it correctly — Wikidata facts are admitted with :provenance :wikidata and :policy :plural, meaning they sit alongside personal facts without overriding them. Disagreements are surfaced, not resolved. The architecture treats Wikidata as evidence from an external source, not as ground truth. That's the correct posture.

Cardinality policies are the right abstraction for contradiction

The :singular / :dual / :plural cardinality model is one of the most important ideas in these notes. Classical logic requires consistency — a contradiction implies everything (ex contradictione quodlibet). A constraint solver like Screamer also requires consistency — a contradictory constraint set has no solutions. But a personal memex operates across domains where the meaning of contradiction is fundamentally different:

"rm -rf / is catastrophic" is :singular — there is one truth that evolves over time.
"I loved this person AND I resented them" is :dual — the tension IS the fact.
"Wikidata says Everest is 8848m; DBpedia says 8849m; my 2023 diary says 8848m" is :plural — multiple sources disagree, and surfacing the disagreement with provenance is the product.

This is a genuinely novel contribution to knowledge representation. Most knowledge graphs (Wikidata, Freebase, DBpedia) do not model contradiction as such: at best they store several values side by side (Wikidata's statement ranks, for example), and consumers typically pick one and ignore the rest. Most constraint solvers reject contradiction as error. Passepartout's cardinality model makes contradiction a first-class citizen: you can query the fact that "I used to believe X until Tuesday, then Y," or "these three sources disagree on height," or "I hold these two positions in tension." The symbolic engine's job is not to decide which is right. It is to surface the tension with provenance.
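One way to read the cardinality policies operationally is sketched below, in Python rather than Lisp. The `FactStore` class and its method names are hypothetical. In this simplified reading, :singular supersedes older differing values while keeping them in history, and :dual and :plural both keep every value current; they differ only in interpretation (one holder in tension versus several sources in disagreement), which the sketch does not try to encode.

```python
from collections import defaultdict


class FactStore:
    def __init__(self, policies):
        self.policies = policies          # relation -> "singular" | "dual" | "plural"
        self.facts = defaultdict(list)    # (entity, relation) -> list of fact dicts

    def assert_fact(self, entity, relation, value, provenance):
        key = (entity, relation)
        if self.policies.get(relation, "plural") == "singular":
            # One evolving truth: older differing values stay in history,
            # marked with the provenance of whatever replaced them.
            for old in self.facts[key]:
                if old["superseded_by"] is None and old["value"] != value:
                    old["superseded_by"] = provenance
        self.facts[key].append(
            {"value": value, "provenance": provenance, "superseded_by": None}
        )

    def current(self, entity, relation):
        return [f for f in self.facts[(entity, relation)]
                if f["superseded_by"] is None]

    def tensions(self, entity, relation):
        """Distinct current values, or [] if the sources agree."""
        values = {f["value"] for f in self.current(entity, relation)}
        return sorted(values) if len(values) > 1 else []


store = FactStore({"height-m": "plural", "risk": "singular"})
store.assert_fact("Everest", "height-m", 8848, "wikidata")
store.assert_fact("Everest", "height-m", 8849, "dbpedia")
store.assert_fact("Everest", "height-m", 8848, "diary-2023")
store.assert_fact("rm -rf /", "risk", "unknown", "seed")
store.assert_fact("rm -rf /", "risk", "catastrophic", "gate")
```

Nothing is ever deleted: the superseded :singular value is still there to answer "what did I believe before?", and the :plural disagreement is returned with its sources rather than resolved.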

This alone, if implemented correctly, would be a category-level advance over every existing personal knowledge management tool.

Ontology versioning is the right approach to the migration problem

Every knowledge base eventually faces schema migration — you split :secret-file into :crypto-secret and :plaintext-secret, and now every deduction that crossed the old category boundary is suspect. The standard approach is batch UPDATE operations that overwrite the past. Passepartout's approach — the category hierarchy itself is a Merkle tree, every fact stores the :ontology-version at assertion time, category changes trigger re-verification rather than remapping — preserves all worldviews. You can query "what did I believe about secrets before I refined my security model?" This is not querying a fact. It is querying the history of your own thinking.

This is the kind of capability that no existing tool provides, and it flows directly from the architecture. If the Merkle DAG infrastructure exists (it does, from v0.2.0), ontology versioning is ~40 lines on top of it. The conceptual design is sound. The engineering appears tractable.
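The versioning idea can be sketched in a few lines, assuming the category hierarchy is a parent-to-children mapping. The `merkle_hash` and `stale_facts` helpers and the example categories beyond those named in the text are illustrative, and the sketch is in Python rather than the project's Lisp. Each node's hash covers its name and its children's hashes, so any refinement changes the root hash, and facts stamped with an older root are flagged for re-verification instead of being rewritten.

```python
import hashlib


def merkle_hash(name, children_of):
    """Hash a category together with the hashes of its subcategories."""
    child_hashes = sorted(
        merkle_hash(child, children_of) for child in children_of.get(name, [])
    )
    payload = name + "|" + ",".join(child_hashes)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def stale_facts(facts, current_version):
    """Facts asserted under another ontology version need re-verification."""
    return [f for f in facts if f["ontology_version"] != current_version]


# Before and after splitting :secret-file, as in the example above.
ONTOLOGY_V1 = {"file": ["secret-file", "public-file"]}
ONTOLOGY_V2 = {
    "file": ["secret-file", "public-file"],
    "secret-file": ["crypto-secret", "plaintext-secret"],
}
V1 = merkle_hash("file", ONTOLOGY_V1)
V2 = merkle_hash("file", ONTOLOGY_V2)

FACTS = [
    {"triple": (".env", "is-a", "secret-file"), "ontology_version": V1},
    {"triple": ("id_rsa", "is-a", "crypto-secret"), "ontology_version": V2},
]
```

Because the old fact keeps its original version stamp, a query scoped to `V1` still sees the worldview that produced it.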

SWOT Analysis

Strengths

Architectural inversion — proposer vs decider

The LLM proposes. The symbolic engine decides. This is the inverse of every existing agent architecture, and it solves the hallucination problem at the architectural level rather than the prompt-engineering level. No amount of prompt refinement can make a probabilistic system deterministic. But a deterministic admission gate can make a probabilistic proposer safe.

Unified container format (Org files)

Org files serve as the container for human prose, Lisp source code, symbolic facts, and Agora Notes. One format, one toolchain, one Merkle tree, one version control system. If Passepartout stops existing, the data survives in plain text. This is the hardest commitment in the design and the most undervalued. Most agent architectures store memory in JSONL transcripts, vector databases, or proprietary formats — opaque to the human and dependent on the tool. Passepartout's memory IS the human's memory, in the human's format.

Provenance as product

Every fact carries :grounding (the specific Org heading), :provenance (who or what produced it), :timestamp, :referenced-by, :contradicted-by, :superseded-by. The /audit command renders the full provenance chain. In the broader memex, the value is not the verified fact ("this command is safe"). It is the provenance itself: "this claim originated in that diary entry, has been referenced 7 times across 4 projects, was contradicted 6 months later, and was revised 3 weeks after that." This is a memory prosthesis that makes your own mind legible to you.

Gate-to-fact bootstrap — ontology from existing code

The existing Dispatcher gate stack encodes an implicit ontology (categories of secrets, destructive commands, trusted domains, core files). The bootstrap extracts this mechanically — zero LLM tokens, zero human authoring, ~30 lines of Lisp. This proves the pattern and provides the seed ontology without any new infrastructure. Every new gate pattern added by the human (HITL approvals that become rules) extends the ontology automatically.

Self-preservation architecture

The Third Law implementation consists of quarantine on skill failure, degraded-mode signaling, resource monitoring, an external watchdog, and refusal to self-terminate. Each mechanism is small (~20-50 lines), and together they transform self-preservation from a passive architectural property into an active behavior. The key insight: the biggest gap is not that these mechanisms are hard. It is that degradation is currently silent. Making it visible is cheap and high-impact.

Cardinality policies as a solution to contradiction

The :singular / :dual / :plural model is novel in knowledge representation and directly addresses the hardest problem in a personal memex: that contradiction is the product, not the error. Bayesian knowledge bases, graph databases, and triple stores all struggle with contradiction. Passepartout's model makes it a feature.

Organic ontology growth

Categories emerge from the system's own operation: gate patterns → gate outcomes → Screamer generalizations → archivist proposals → cross-domain overlap detection. The ontology is a garden, not a building. This avoids the Principia Mathematica problem — the need to define everything upfront — by replacing axiomatic design with evolutionary growth. Categories that aren't used fade. Categories that are contradictory are pruned. Categories that emerge from overlapping domains are promoted. The system converges on useful granularity through use.

Agora as provenance layer for networked knowledge

A BFT-timestamped triple store is one approach, but the Merkle DAG + DID signatures provide a lighter-weight alternative: every fact's provenance is content-addressed, every author's identity is cryptographically verifiable, and the DAG structure enables partial replication without consensus. This is more tractable than full BFT and sufficient for a personal memex that needs to share facts across a network.

Decoupling of compute cost from knowledge base size

LLM tokens are minimized by design: deterministic gates cost 0 tokens, sparse-tree rendering keeps context at 2,000-4,000 tokens, Screamer deductions cost 0 tokens. Adding 5 million Wikidata entities does not add a single token to any LLM call. The variables that actually degrade performance (context window size, LLM call frequency, Screamer deduction budget) are all bounded independently of knowledge base size. This is a structural property: the education is local, only the brain costs.

Weaknesses

The fact language is unproven and may be insufficient

Triples of the form (:entity :relation :value), with provenance and grounding, are the current hypothesis. They are simple enough to be parseable, expressive enough to capture the gate stack's implicit claims, and extensible enough that Screamer can operate on them. But:

Triples cannot naturally express temporal relations. "Was X before Y?" requires reification (making the relation itself an entity), which adds extra nodes and joins to every query that touches the relation.
Triples cannot express modal claims. "Should not do X unless Y" has no natural triple representation. Neither does "could have done X but chose Y."
Triples cannot express counterfactuals. "If X had happened, Y would have followed." These are essential for the "what if" reasoning that a personal memex should support.
Triples struggle with n-ary relations. "Nabokov wrote Pale Fire in 1962 while living in Montreux" is a 4-ary relation (author, work, date, location), not a set of independent binary relations. Breaking it into triples loses the connection that binds them.
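Triples cannot express negation cleanly. "Nabokov did NOT write Doctor Zhivago" requires an explicit negative fact. Under an open-world assumption the absence of a triple means only "not known", and under a closed-world assumption it means "false", so without negative facts the store cannot tell "not known" from "known not."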
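The n-ary point is easiest to see by writing the reification out. In the sketch below (Python, with hypothetical helper names and role names), the four-part statement from the list becomes one event node plus one triple per role, and any question about the statement has to regroup triples by event before it can match.

```python
def reify(event_id, relation, roles):
    """Encode one n-ary statement as an event node plus one triple per role."""
    triples = [(event_id, "type", relation)]
    triples += [(event_id, role, value) for role, value in roles.items()]
    return triples


def find_events(triples, **constraints):
    """Return event IDs whose roles match every constraint (an implicit join)."""
    by_event = {}
    for subject, predicate, obj in triples:
        by_event.setdefault(subject, {})[predicate] = obj
    return sorted(
        event for event, roles in by_event.items()
        if all(roles.get(key) == value for key, value in constraints.items())
    )


TRIPLES = reify(
    "event-1",
    "wrote",
    {"author": "Nabokov", "work": "Pale Fire", "year": 1962, "location": "Montreux"},
)
```

One sentence costs five triples here, and the binding between author, work, year, and place survives only because all five share the `event-1` subject.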

The notes acknowledge this limitation but defer it. The right granularity "depends on what queries the planner actually needs to make, and that cannot be known in advance." This is honest but unsatisfying. If triples prove insufficient, the entire fact store, the Screamer integration, the VivaceGraph persistence, and the archivist's extraction format must be redesigned. The architecture has no intermediate fallback between "triples" and "something more expressive."

Screamer as admission gate is untested at this scale

Screamer is a constraint solver with non-deterministic backtracking. Using it to check a candidate triple against an existing fact store is conceptually elegant: express the fact store as constraint variables, assert the candidate, check solvability. But:

Screamer was designed for constraint satisfaction problems with tens to hundreds of variables. A fact store with millions of triples (after Wikidata loading) is a constraint space orders of magnitude larger than Screamer's design envelope.
The consistency check is domain-scoped (only rules from the candidate's :domain apply), but cross-domain contradictions are the most valuable kind. "Nabokov was born in 1899" (literature domain) should be consistent with "Nabokov died in 1977" (history domain). If these are separate domains, the check misses contradictions; if they are unified, the constraint space explodes.
Screamer's non-deterministic backtracking is worst-case exponential. The notes bound this via deduction budget (SCREAMER_DEDUCTION_BUDGET_MS) but don't address the admission check itself, which runs on every assertion.
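The cross-domain gap can be shown with a deliberately tiny stand-in for the admission check. This is ordinary Python, not Screamer, and the single rule (born must not be later than died), the tuple layout, and the `consistent` helper are assumptions for illustration. The candidate carries a mistyped birth year: the domain-scoped check admits it because the death date lives in another domain, while the unified check rejects it.

```python
def consistent(candidate, facts, scope="unified"):
    """Check one rule (born <= died) for the candidate's entity.

    Facts and candidate are (entity, relation, year, domain) tuples.
    With scope="domain", only facts from the candidate's domain are seen.
    """
    entity, _, _, domain = candidate
    years = {}
    for e, relation, year, d in facts + [candidate]:
        if e != entity:
            continue
        if scope == "domain" and d != domain:
            continue
        years.setdefault(relation, []).append(year)
    return all(
        born <= died
        for born in years.get("born", [])
        for died in years.get("died", [])
    )


STORE = [("Nabokov", "died", 1977, "history")]
TYPO = ("Nabokov", "born", 1999, "literature")    # should have been 1899
CORRECT = ("Nabokov", "born", 1899, "literature")
```

The unified check is the one that catches the error, and it is also the one whose search space grows with the whole store, which is the trade-off described above.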

There is a risk that Screamer works beautifully for the gate-bootstrapped seed (50-70 entity classes, ~200 facts) and becomes unusably slow after Wikidata loading (millions of facts). The transition from "works" to "doesn't" may be gradual and hard to detect — the system gets slower but doesn't crash, degrading user experience without a clear diagnostic.

The "flip" from lossy to deterministic is underspecified

The architecture's central narrative arc is the "flip": at some point, the non-lossy facts constitute a sufficient foundation that the symbolic engine can reverse the flow. Instead of LLM extraction, the symbolic engine reads prose through its own lens and deduces facts directly. The sufficiency metric (non-lossy / total > 0.7) makes this "computable and visible to the user."

But:

The threshold (0.7) is arbitrary. It is not derived from empirical measurement, information theory, or constraint satisfaction theory. It is a guess.
Sufficiency is domain-specific, not global. The gate stack may have 0.95 coverage of security classifications but 0.05 coverage of literary analysis. A global threshold of 0.7 hides the domains where the symbolic engine is still effectively blind.
The "flip" operation itself is not defined. "Screamer reads prose through its own lens" — Screamer does not read prose. It operates on structured facts. Either the archivist still extracts triples (which is LLM work), or some new mechanism parses prose into triples deterministically (which is NLP at a level that does not exist in open-source Lisp).
Even after the flip, facts from the pre-flip period carry :provenance :llm-proposed and are therefore suspect. The pre-flip facts were admitted against fewer non-lossy facts, meaning Screamer's consistency checks were weaker. A fact admitted during the seed phase may be wrong but undetected because there were no contradicting facts at the time. Re-verifying all pre-flip facts against the current fact store is described as a heartbeat task, but the cost (millions of Screamer checks) is not estimated.
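The objection that a global threshold hides blind domains is a matter of arithmetic. The sketch below computes the non-lossy / total ratio both globally and per domain; the fact counts are invented so that the per-domain ratios match the 0.95 and 0.05 figures used in the list above.

```python
from collections import Counter


def sufficiency(facts):
    """facts: iterable of (domain, is_non_lossy). Returns (global, per-domain)."""
    total, non_lossy = Counter(), Counter()
    for domain, is_non_lossy in facts:
        total[domain] += 1
        if is_non_lossy:
            non_lossy[domain] += 1
    per_domain = {d: non_lossy[d] / total[d] for d in total}
    overall = sum(non_lossy.values()) / sum(total.values())
    return overall, per_domain


# Invented counts: a dense security domain and a sparse literature domain.
FACTS = (
    [("security", True)] * 950 + [("security", False)] * 50
    + [("literature", True)] * 1 + [("literature", False)] * 19
)
OVERALL, PER_DOMAIN = sufficiency(FACTS)
```

With these counts the global ratio is about 0.93, comfortably above 0.7, while literature sits at 0.05. A flip gated on the global number would fire for a domain in which the engine is effectively blind.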

The flip is a beautiful narrative. It may also be a mirage — the system may achieve high sufficiency in narrow domains (security, filesystem, coding) and never approach it in the broader memex (literature, personal reflection, daily life). If the broader memex is the use case, the flip may never happen.

The archivist's extraction cost is unaccounted

The archivist calls the LLM to extract triples from prose, with "a minimal prompt (~200 tokens)." Over a personal memex with thousands of entries — a decade of diary entries, hundreds of literature notes, dozens of project logs — the extraction cost is substantial.

Assume 5,000 headings, 200 tokens per heading prompt, and an LLM that returns ~100 tokens of structured triples per heading. That's 1 million input tokens plus 0.5 million output tokens, 1.5 million in total, for the initial extraction, plus verification tokens (Screamer checks cost 0 LLM tokens, but incorrect proposals generate feedback that may trigger re-extraction). At current API prices (~$0.15 per million input tokens for GPT-4o-mini), pricing all 1.5 million tokens at the input rate gives about $0.23; output tokens are usually billed at a higher rate, but the total remains modest. But at scale (re-extraction after ontology changes, continuous extraction as new content is added, extraction for all incoming Agora Notes) the cost accumulates.

More importantly, the extraction latency is human-noticeable. 5,000 headings at 1 second per LLM call is ~1.4 hours of extraction time. The system needs to either batch-extract on startup (making cold starts slow) or extract lazily on first query (making first queries slow). Neither is ideal.
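The back-of-envelope figures above are easy to re-run with different assumptions. In the sketch below the function and parameter names are hypothetical; the output price defaults to the input price only because the text quotes a single rate, and the latency figure assumes strictly sequential calls.

```python
def extraction_estimate(headings, prompt_tokens, response_tokens,
                        input_price_per_m, output_price_per_m=None,
                        seconds_per_call=1.0):
    """Estimate tokens, cost, and sequential wall-clock time for one full pass."""
    if output_price_per_m is None:
        # Assumption: no separate output rate given, so reuse the input rate.
        output_price_per_m = input_price_per_m
    input_tokens = headings * prompt_tokens
    output_tokens = headings * response_tokens
    cost = (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "cost_usd": cost,
        "sequential_hours": headings * seconds_per_call / 3600,
    }


ESTIMATE = extraction_estimate(5_000, 200, 100, input_price_per_m=0.15)
```

With the numbers from the text this gives 1.5 million tokens, roughly $0.23, and about 1.4 hours of sequential calls. The same function can be multiplied by the number of re-extractions per year to see where the recurring cost actually lands.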

The notes trumpet the token savings from deterministic gates and Screamer deductions (valid — those cost 0 tokens) but the archivist's extraction cost is the system's single largest recurring LLM expense, and it is mentioned only in passing.

The Agora integration is clean in theory, undefined in practice

The "Passepartout IS the PDS" claim is elegant: the memory-object struct IS the Note format, the Merkle DAG IS the Key Event Log, the fact store IS the reputation system. But:

An Agora PDS needs to serve HTTP APIs for thin clients. The daemon speaks a framed TCP protocol over a local port. Extending it to serve HTTPS with DIDComm endpoints, subscription management, and Relay push/pull is a substantial engineering effort.
The PDS needs to manage encrypted storage — client-side encrypted content that the PDS itself cannot read. Passepartout's vault stores credentials with integrity hashes but does not currently manage per-Note encryption with audience-specific keys.
The Relay Network is described as an intelligent communication backbone with pub/sub routing. Passepartout has no Relay implementation, no Relay-facing API, and no subscription management beyond its own event orchestrator.
Agora's contract system (SCAL contracts, HODL invoices, arbitration tiers) requires state machines and Lightning Network integration that Passepartout has no primitives for.
The "Passepartout IS the PDS" vision conflates two things: the data model (Org files = Notes) and the infrastructure (a process that serves a network protocol). The data model unification is clean and right. The infrastructure unification implies Passepartout grows from a local agent to a network server — a significant architectural expansion that the notes treat as a ~40-line utility.

No adversarial model

The notes describe layered authentication (crypto, sensory, deterministic, probabilistic) and type-level gates as structural safety. They do not describe an adversarial model:

What stops a malicious Agora Note from containing 100,000 triples that flood the fact store?
What stops a DID from publishing Notes that deliberately inject contradictions to force Screamer into exponential backtracking?
What stops a compromised sensor key from signing valid sensor data that is adversarially crafted (e.g., video frames designed to trigger specific vision model false positives)?
What stops a spam DID from creating millions of Personas and flooding the user's incoming Notes directory?

The resource monitor (Phase 1a) handles storage pressure generically. The quarantine system handles individual DIDs flagged for spam. But none of these are adversary-aware — they react to symptoms (disk full, error rate high) rather than anticipating attack patterns. An adversarial model would identify these vectors and design mitigations specifically. The notes describe a system that works in a cooperative environment, not an adversarial one.
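As one example of what an adversary-aware mitigation might look like, the sketch below applies hard per-Note and per-DID limits before any triple reaches the fact store. The `IngestQuota` class, its limits, and the example DIDs are hypothetical and are not part of the notes. It addresses only the first vector in the list (oversized Notes) and part of the last (one DID publishing too often); an attacker who mints many DIDs needs a different defense.

```python
from collections import Counter


class IngestQuota:
    """Reject incoming Notes by size and per-DID volume before ingestion."""

    def __init__(self, max_triples_per_note=500, max_notes_per_did=100):
        self.max_triples_per_note = max_triples_per_note
        self.max_notes_per_did = max_notes_per_did
        self.notes_seen = Counter()   # DID -> Notes admitted so far

    def admit(self, did, triples):
        if len(triples) > self.max_triples_per_note:
            return (False, "note exceeds triple limit")
        if self.notes_seen[did] >= self.max_notes_per_did:
            return (False, "DID exceeded note quota")
        self.notes_seen[did] += 1
        return (True, "admitted")


quota = IngestQuota(max_triples_per_note=3, max_notes_per_did=2)
RESULTS = [
    quota.admit("did:example:alice", ["t1", "t2", "t3", "t4"]),  # too large
    quota.admit("did:example:alice", ["t1"]),
    quota.admit("did:example:alice", ["t2"]),
    quota.admit("did:example:alice", ["t3"]),                    # over quota
    quota.admit("did:example:bob", ["t1"]),
]
```

The check runs before Screamer sees anything, so a hostile Note cannot buy solver time merely by being large.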

The self-repair criterion creates a two-tier architecture

The AGENTS.md rule — "default: everything is a skill" — means the symbolic engine (Screamer, VivaceGraph, fact store, archivist, ACL2, planner) is all skills, not core. This is correct for the self-repair criterion: a corrupted skill degrades the agent but doesn't kill it. A corrupted core file kills the brainstem.

But it creates a tension: the symbolic engine IS the reasoning layer that would diagnose and repair a corrupted skill. If the fact store itself is corrupted (impossible facts, inconsistent cardinality, broken Merkle chains), the engine that detects corruption is the engine that is corrupted. The system needs a "repair from below" path — a minimal core that can purge and rebuild the symbolic index without depending on the symbolic index. This path exists (the fact store is ephemeral in Phase 1-4 and rebuildable from prose in Phase 5+) but is not exercised automatically. A corruption in the symbolic engine requires human detection and manual rebuild — the exact problem the self-repair criterion was designed to avoid.

Opportunities

A memory prosthesis that makes your own mind legible

The symbolic index, when populated and queried, answers questions that no existing tool can:

"What did I believe about monorepos in 2023, and how has that changed?"
"Which of my diary entries contradict each other?"
"What entities in my memex have no connection to any other entity?"
"Show me everything I've written about Nabokov, organized by when I wrote it, what I was reading at the time, and what I concluded."
"Which of my project plans reference security assumptions that I later changed?"
"What did I think about this topic, and why did I change my mind?"

These are not information retrieval queries. They are self-knowledge queries. They require provenance chains, temporal versioning, contradiction surfacing, and cross-domain linkage — all of which the architecture provides as first-class capabilities. If this works, it transforms the memex from a searchable archive into a thinking partner that knows the history of your thoughts.

Deterministic reasoning as a moat

Every competitor agent system (Claude Code, OpenCode, OpenClaw, Hermes, Cognee, Mem0) uses neural-only reasoning. They are all vulnerable to the same failure mode: the LLM hallucinates a fact or an action, and there is no second system to catch it. Their safety is heuristic. Their memory is flat. Their reasoning is unprovable.

Passepartout's architectural bet — a symbolic engine that verifies, deduces, and audits — creates a category difference, not a performance difference. If the bet pays off, Passepartout is not "a better AI agent." It is a different kind of system — one whose reasoning is provable, whose memory is content-addressed, and whose knowledge accumulates through deduction rather than re-prompting.

This is a genuine moat. It cannot be replicated by adding a better system prompt or a larger context window. It requires building the ontology, the constraint solver, the fact store, and the provenance tracker — work that takes years and cannot be shortcut by spending more on inference.

Agora as the first sovereign agent network

If Passepartout serves as the PDS and an Agora Persona, then AI agents can:

Publish verified outputs as signed Notes with cryptographic provenance. Readers know the agent produced the output, not a human impersonating the agent.
Accept invocation Notes from other persona owners. "Please analyze this contract and publish your findings." The agent receives the request as an Agora Note, processes it, signs the response, and publishes it.
Build reputation through auditable chains of signed work products, not through self-reported claims.
Participate in the compute marketplace as both consumer and provider.
Maintain sovereign identity — the agent's DID is independent of any platform, any provider, any human account.

This is not a chatbot on a messaging platform. It is an autonomous entity on a decentralized network, with cryptographic identity, verifiable provenance, and economic agency. If Agora reaches even Order 1 (the first 1,000 users), Passepartout agents become some of the most capable participants on the network.

The 10-80-10 ratio for coding is genuinely achievable

For a coding agent — the domain that Passepartout currently operates in — the 10-80-10 ratio is plausible. The existing Dispatcher already verifies every action deterministically. Adding Screamer for consistency checking, VivaceGraph for dependency queries, and ACL2 for structural verification would shift the ratio from the current ~95-5-0 (neural-gate-symbolic) toward 50-40-10 in the near term and potentially 10-80-10 in the long term.

The bootstrapped gate facts already cover file classifications, command safety, path protections, and tool permissions — the core categories for a coding agent. The archivist's extraction from project files would add dependency information, test coverage, and code structure facts. The planner could reason about refactoring order, dependency chains, and safety constraints deterministically. This is the domain where the symbolic engine provides the most immediate value, and it is the domain Passepartout already operates in.

Wikidata as an entity backbone unlocks cross-domain reasoning

Without Wikidata, the symbolic index for a general-knowledge memex is a sparse set of personal facts with no connecting structure. With Wikidata, the entity graph is pre-structured. The system can answer:

"What does my memex say about Nabokov that Wikidata doesn't?"
"Where does my memex disagree with Wikidata?"
"What entities in my memex have no Wikidata counterpart?" (These are the personal, novel, or subjective entities that are the most valuable.)
"Show me the intersection of my literary interests (from diary) with Wikidata's influence graph — which authors I read influenced each other in ways I haven't written about?"

These are cross-domain queries that require both the personal memex (for what the user knows) and Wikidata (for what the world knows). Neither alone can answer them. Together, they enable a kind of knowledge synthesis that no existing tool provides.

Ontology versioning enables "what-if" reasoning about one's own thinking

The ability to query across worldviews — "what did I believe before I changed my security model?" — is a capability that has no analog in any existing tool. It transforms the memex from a static archive into a dynamic record of intellectual evolution. Combined with the temporal awareness system (Phase 0c), the system could surface correlations: "You changed your mind about monorepos two weeks after reading this article, which you bookmarked on this date, and one week before starting this project that uses a monorepo structure." The provenance chain IS the narrative of your thinking.

Contract-level pre-arbitration reduces the cost of decentralized commerce

Agora's Tier 0 Arbitrator — a local AI that provides evidence summaries before human arbitration — is a genuinely useful role for a neurosymbolic system.

"Contract CID X references arbitrator DID Y. DID Y is active. Verified."
"All parties have signed. The HODL invoice is locked. Verified."
"The buyer's claim of non-delivery is supported by 3 signed messages with timestamps after the delivery deadline."
"The seller's proof-of-delivery field is empty. No QR scan recorded."

Each check is a Screamer query against the contract-lifecycle domain. The results are a plist, not a ruling. Both parties see the same evidence summary before escalating. This makes Level 1 arbitration faster (arbitrators receive pre-processed evidence bundles), cheaper (no human time spent on trivial verification), and more transparent (both parties see the same machine-generated summary).
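The "docket, not ruling" idea can be sketched as a pure function from contract state to an evidence summary. This is Python rather than Screamer queries, and the contract field names, the integer timestamps, and the `evidence_summary` helper are all assumptions for illustration; Agora's actual contract schema is not given in the notes.

```python
def evidence_summary(contract, active_dids):
    """Return machine-checkable findings about a dispute, with no verdict."""
    late_messages = [
        m for m in contract["buyer_messages"]
        if m["timestamp"] > contract["delivery_deadline"]
    ]
    return {
        "arbitrator_active": contract["arbitrator_did"] in active_dids,
        "all_parties_signed": set(contract["parties"]) <= set(contract["signatures"]),
        "invoice_locked": contract["invoice_state"] == "locked",
        "signed_messages_after_deadline": len(late_messages),
        "proof_of_delivery_present": bool(contract.get("proof_of_delivery")),
    }


# Hypothetical contract state mirroring the four findings listed above.
CONTRACT = {
    "arbitrator_did": "did:example:arbitrator",
    "parties": ["did:example:buyer", "did:example:seller"],
    "signatures": ["did:example:seller", "did:example:buyer"],
    "invoice_state": "locked",
    "delivery_deadline": 1_000,
    "buyer_messages": [{"timestamp": t} for t in (900, 1_100, 1_200, 1_300)],
    "proof_of_delivery": "",
}
SUMMARY = evidence_summary(CONTRACT, active_dids={"did:example:arbitrator"})
```

The output is a flat record of findings that both parties can recompute from the same signed inputs; deciding what those findings mean is left to the human arbitrator.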

This is not AI judging. This is AI preparing the docket. The distinction is important and defensible.

Self-auditing agents could transform AI safety discourse

If Passepartout can answer /audit for any action or fact — showing the full provenance chain, every gate that approved it, every fact that supported it, every alternative that was considered — then AI safety moves from "trust us, we tested it" to "here is the audit trail, verify it yourself."

This is the transparency that every AI safety framework calls for and none delivers. It is possible because the architecture records provenance as a first-class operation, not as an after-the-fact log. The provenance is the operating system, not a logging layer.

The memex + Agora combination could be a new kind of social network

Current social networks (Twitter, Facebook, Reddit) separate the person from their knowledge. You are a profile with posts. Your posts are isolated units without connection to your broader intellectual life.

A Passepartout-powered Agora Persona would publish Notes that are grounded in the memex: "Here is my analysis of Pale Fire, drawn from diary entries across three years, annotated with Wikidata context, and verified against my existing literary framework." The Note is cryptographically signed, carrying provenance back to the specific Org headings that informed it. Readers see not just the conclusion but the intellectual scaffolding that produced it.

This is not a "post." It is a publication — a knowledge artifact with verifiable provenance, auditable reasoning, and cryptographic identity. If this becomes the norm, it raises the standard for public discourse from "this is my opinion" to "this is my opinion, here is the evidence, here is how it evolved, here is who verified it."

Threats

The ontology problem may be harder than anticipated

The notes are honest about this: "Whitehead's Principia Mathematica took over 300 pages to define the logical foundations before it could prove that 1+1=2." Passepartout's domain is narrower (coding + personal knowledge) but the ontology problem is the same category of problem. Every entity class must be defined. Every relation must have clear semantics. Every inference rule must be justified.

The gate-to-fact bootstrap provides 50-70 entity classes — enough for a coding agent. But the broader memex contains orders of magnitude more entity types: people, places, works, concepts, events, emotions, aesthetic judgments, professional skills, personal projects, temporal patterns. Defining these as triples with clear semantics is genuine intellectual work that no amount of engineering can shortcut.

The risk is not that it's impossible. It's that it's slow — slow enough that the system never achieves the density of facts needed for the "flip" in the broader memex. The coding domain may reach sufficiency in months. The literary domain may take years. The daily-reflection domain may never cross the threshold because the facts involved (mood, insight, aesthetic experience) are not formalizable as triples.

Screamer may not scale to the fact store size

The constraint satisfaction approach to consistency checking is elegant for a seed fact set of hundreds of triples. It is unproven for millions of triples (after Wikidata loading + years of personal extraction). The domain-scoping strategy (Screamer only checks facts from the candidate's :domain) bounds the constraint space, but the most valuable consistency checks are cross-domain:

"You classified this file as public in your project notes but the gate stack classifies it as secret." (project domain vs security domain)
"You wrote that Nabokov influenced Kafka, but Wikidata says Kafka died before Nabokov published his first novel." (literature domain vs Wikidata domain)
"You planned to use this dependency, but the dependency's license changed in a way that conflicts with your project's license." (project domain vs legal domain)

If cross-domain checks are disabled for performance, the most valuable contradictions are never detected. If they are enabled, the constraint space explodes. There is no obvious sweet spot.
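The trade-off can be made concrete with some counting. The Python sketch below assumes, as a simplification, that consistency checking compares facts pairwise (real Screamer constraints may involve more than two facts) and uses made-up domain sizes. It compares the number of candidate pairs when checks stay inside a domain with the number when every fact may be checked against every other.

```python
from math import comb

# Hypothetical fact counts per domain; the real distribution is unknown.
domain_sizes = {"security": 10_000, "project": 10_000,
                "literature": 10_000, "legal": 10_000}


def scoped_pairs(sizes):
    """Pairs compared when each fact is only checked against its own domain."""
    return sum(comb(n, 2) for n in sizes.values())


def global_pairs(sizes):
    """Pairs compared when every fact may be checked against every other fact."""
    return comb(sum(sizes.values()), 2)


scoped = scoped_pairs(domain_sizes)
everything = global_pairs(domain_sizes)
cross_only = everything - scoped   # pairs that domain scoping never looks at

print(scoped, everything, cross_only)
```

With four equal domains the cross-domain pairs outnumber the in-domain pairs three to one, and the gap widens as domains are added. The pairs that scoping skips are exactly the ones where the examples above would be found.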

Wikidata quality may undermine trust in the symbolic index

If Wikidata facts are admitted with :policy :plural and the user sees thousands of contradictions between Wikidata and their personal memex, the symbolic index may feel less trustworthy, not more. "Wikidata says Mount Everest is 8848m. DBpedia says 8849m. Your 2023 diary says 8848m. These three sources do not all agree on height." This is correct behavior (surfacing disagreement with provenance), but it may be overwhelming. The user wanted a knowledge base, not a disagreement engine.
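A minimal sketch of what surfacing plural claims could look like, written in Python for illustration. The tuple layout, the `height_m` predicate name, and the `disagreements` helper are assumptions; only the three values and their sources come from the example above.

```python
from collections import defaultdict

# (subject, predicate, value, source); values mirror the example in the text.
claims = [
    ("Mount Everest", "height_m", 8848, "Wikidata"),
    ("Mount Everest", "height_m", 8849, "DBpedia"),
    ("Mount Everest", "height_m", 8848, "diary-2023"),
]


def disagreements(claims):
    """Group claims by (subject, predicate); keep those with more than one distinct value."""
    by_key = defaultdict(lambda: defaultdict(list))
    for subject, predicate, value, source in claims:
        by_key[(subject, predicate)][value].append(source)
    return {key: dict(values) for key, values in by_key.items() if len(values) > 1}


report = disagreements(claims)
print(report)
```

Grouping by value collapses the three claims into two positions with their supporting sources. Whether that kind of summarization is enough to keep the volume of disagreements manageable is an open question the notes do not settle.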

The trust problem is compounded by Wikidata's editorial biases. Wikidata reflects the biases of Wikipedia editors: English-language dominance, Western epistemological frameworks, systemic underrepresentation of non-Western knowledge. A memex in Arabic that references Islamic philosophy, Egyptian history, or African literature will find Wikidata's coverage thin, biased, or absent. The symbolic index would dutifully surface these gaps — "your memex mentions 47 entities with no Wikidata counterpart" — but it cannot fill them.

LLM cost and latency may prevent the archivist from keeping up

If the user writes a diary entry every day, the archivist must extract triples from each new heading. If the extraction takes 1-3 seconds per heading, it's background noise. But if the user imports 500 old diary entries, or the archivist needs to re-extract after an ontology change, or Agora Notes arrive in bulk from multiple follows, the extraction queue grows faster than it drains.

The notes describe extraction as a background task triggered by heartbeat, but they don't specify the extraction rate limit. An unbounded queue with no rate limit would consume the LLM budget. A bounded queue would fall behind. A lazy extraction strategy (extract on first query) would make first queries slow. A batch extraction on startup would make cold starts slow.

The archivist's throughput is gated by LLM API rate limits, token costs, and inference latency. These are external constraints that the architecture cannot eliminate. The symbolic engine can reduce LLM calls for reasoning; it cannot reduce LLM calls for extraction from prose.
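The queue behavior described above is simple arithmetic, sketched here in Python. The daily extraction budget and arrival rates are hypothetical numbers, and the model assumes one heading per diary entry. The notes do not specify any of these figures.

```python
def backlog_after(days, initial_backlog, arrivals_per_day, extractions_per_day):
    """Queue length after `days`, never dropping below zero."""
    backlog = initial_backlog
    for _ in range(days):
        backlog = max(0, backlog + arrivals_per_day - extractions_per_day)
    return backlog


def days_to_drain(initial_backlog, arrivals_per_day, extractions_per_day):
    """Days until the queue is empty, or None if it never drains."""
    if initial_backlog == 0:
        return 0
    net = extractions_per_day - arrivals_per_day
    if net <= 0:
        return None                        # arrivals keep pace with or outrun extraction
    return -(-initial_backlog // net)      # ceiling division


# Importing 500 old entries while writing one new entry a day, budget of 50 per day.
import_case = days_to_drain(500, arrivals_per_day=1, extractions_per_day=50)

# Bulk Agora Notes arriving faster than the same budget allows.
flood_case = days_to_drain(500, arrivals_per_day=80, extractions_per_day=50)

print(import_case, flood_case)
```

Under these assumed numbers the import clears in 11 days, while the second case never drains. No queue discipline changes that outcome once arrivals exceed the budget.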

Agora may never reach network effects

Agora faces the cold start problem that every decentralized social protocol faces: users won't join without content, creators won't post without users. The bootstrapping strategy (managed service → hybrid → full decentralization, targeting niche communities first) is well-articulated but its success depends on execution in a market where Mastodon, Bluesky, Nostr, and Farcaster are already competing for the same users.

If Agora doesn't reach even Order 1 (1,000 users), the PDS integration is academic. Passepartout's DID identity, DIDComm gateway, Note signing, and contract verification are all infrastructure for a network that doesn't exist. The symbolic engine still works locally — provenance tracking, contradiction surfacing, and deduction are all valuable without Agora. But the network effects that make Agora a transformative platform — reputation, contracts, marketplaces, collective governance — require a living network.

The risk is asymmetric: Passepartout invests significant engineering in Agora integration that provides zero value if Agora fails to launch.

Complexity may prevent adoption

Passepartout is already a complex system: a Lisp daemon, a terminal UI, a skill engine, a gate stack, multiple LLM backends, a Merkle memory system, and an event orchestrator. Adding a fact store, a constraint solver, a graph database, a theorem prover, an archivist, a planner, and an Agora PDS makes it more complex, not less.

The target user — someone who wants a personal AI assistant that works offline — may not want or need any of this. They want the TUI to work, the LLM to be fast, and the files to stay safe. The neurosymbolic engine is infrastructure for a use case (lifelong personal knowledge management with verifiable provenance) that most users do not yet know they have.

The risk is that Passepartout builds a cathedral for a congregation of one — a system that is architecturally brilliant and practically unused because the complexity-to-value ratio is too high for anyone except the author.

The self-repair criterion may not hold under adversarial conditions

The architecture assumes that skills can fail gracefully (fboundp guards, hash table fallbacks, degraded mode). It does not account for a skill that has been adversarially corrupted so that it appears to behave correctly while producing wrong results. A compromised archivist that extracts plausible but false triples, a compromised Screamer that passes all consistency checks, a compromised VivaceGraph that returns query results from a parallel graph: these are "living" skills that would pass integrity checks and still poison the symbolic index.

The type-level gates prevent the LLM from modifying gate code. They do not prevent a compromised skill (loaded by a trusted human, or corrupted on disk by a separate process) from operating normally while subtly wrong. The integrity monitoring (Phase 0) catches disk-level corruption through hash checks. It does not catch semantic corruption — a skill that is byte-for-byte identical to the known-good version but loaded with a malicious input that triggers a latent bug.
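To show the limit of hash-based integrity monitoring, here is a small Python sketch. It assumes SHA-256 and a manifest of known-good digests; the notes say only "hash checks", so the algorithm, the manifest shape, and the `passes_integrity` helper are all illustrative assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for a skill's on-disk bytes; not real Passepartout code.
known_good = b"skill source bytes, version 1"
manifest = {"archivist": sha256_hex(known_good)}


def passes_integrity(name, current_bytes, manifest):
    """True when the bytes on disk match the digest recorded for this skill."""
    return manifest.get(name) == sha256_hex(current_bytes)


tampered = known_good + b" plus one patched line"

unchanged_ok = passes_integrity("archivist", known_good, manifest)
tampered_ok = passes_integrity("archivist", tampered, manifest)
print(unchanged_ok, tampered_ok)
```

The check sees bytes, not behavior. A skill whose bytes are unchanged passes regardless of what input it is later given, which is the gap the paragraph above describes.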

This is not a vulnerability unique to Passepartout. It is a vulnerability in every system where components trust each other. But Passepartout's architecture amplifies the risk because the symbolic engine is supposed to be the trustworthy layer — the component that verifies the LLM's output. If the symbolic engine itself is compromised, the system has no higher court of appeal.

The 10-80-10 ratio may create false confidence

If the sufficiency metric shows "71% non-lossy, threshold 70%, mode: AUTO-EXTRACTION," the user may assume the system is trustworthy. But sufficiency is global: it aggregates across all domains. The system may have 95% sufficiency in the security domain and 5% sufficiency in the literary domain and still report 71% overall, provided the security domain holds roughly three quarters of the facts (a plain average of the two figures would be 50%). The auto-extraction switch is meant to bypass the LLM for categories with sufficient coverage, but the threshold is global, not per-domain. A literary query would hit a symbolic index that has "sufficient" coverage globally but insufficient coverage for literature.

The notes describe domain-scoped Screamer checks but not domain-scoped sufficiency. A global sufficiency metric that triggers a global extraction mode change is the wrong granularity. Per-domain sufficiency, with per-domain extraction mode, would be more complex but more honest. The architecture as described has the simpler, more dangerous version.
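The difference between the two granularities can be shown with a few lines of Python. The sketch assumes the global figure is a fact-count-weighted aggregate, and the fact counts are invented so that the numbers reproduce the 95% / 5% / 71% scenario. The mode label "LLM-EXTRACTION" is a hypothetical name for the non-automatic mode.

```python
THRESHOLD = 0.70

# domain -> (non_lossy_facts, total_facts); counts are invented for illustration.
coverage = {
    "security": (1045, 1100),    # 95%
    "literature": (20, 400),     # 5%
}


def global_sufficiency(coverage):
    """Single ratio over all facts, regardless of domain."""
    covered = sum(c for c, _ in coverage.values())
    total = sum(t for _, t in coverage.values())
    return covered / total


def per_domain_modes(coverage, threshold):
    """Extraction mode decided separately for each domain."""
    return {
        domain: "AUTO-EXTRACTION" if covered / total >= threshold else "LLM-EXTRACTION"
        for domain, (covered, total) in coverage.items()
    }


overall = global_sufficiency(coverage)
global_mode = "AUTO-EXTRACTION" if overall >= THRESHOLD else "LLM-EXTRACTION"
modes = per_domain_modes(coverage, THRESHOLD)
print(round(overall, 2), global_mode, modes)
```

The global metric crosses the threshold and flips the whole system, while the per-domain view would keep literature on the LLM path. The extra cost of the per-domain version is one ratio per domain instead of one ratio overall.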

Summary Matrix

| | Positive | Negative |
| INTERNAL | S: Architectural inversion, unified Org format, provenance as product, cardinality model, gate-to-fact bootstrap, self-preservation, organic ontology, Wikidata as accelerator, decoupled compute cost | W: Unproven fact language, Screamer scale unverified, extraction cost hidden, flip underspecified, adversarial model absent, self-repair tension, Agora integration scope undefined, per-domain sufficiency missing |
| EXTERNAL | O: Memory prosthesis, deterministic moat, sovereign agent network, 10-80-10 for coding achievable, Wikidata cross-domain queries, ontology versioning, contract pre-arbitration, self-auditing safety, knowledge-based social network | T: Ontology may be harder than expected, Screamer may not scale, Wikidata quality/trust, LLM extraction bottleneck, Agora network effects, complexity-to-adoption ratio, adversarial semantic corruption, false confidence from global sufficiency metric |

What This Unlocks

Technologically

The neurosymbolic engine, if built, would be the first AI system where:

Reasoning is auditable. Every conclusion carries a provenance chain back to its premises. The /audit command renders the full inference tree — every fact, every deduction, every gate outcome — in human-readable form.
Knowledge accumulates deterministically. Screamer deductions and gate outcomes generate new facts without any LLM involvement. The knowledge base grows from the system's own operation, not from re-prompting the LLM.
Memory is content-addressed. Every fact is a Merkle node. Every version chain is tamper-proof. Rollback is atomic. The storage format is proven correct before it is committed to disk.
Safety is provable, not empirical. Type-level gates make self-modification structurally impossible. ACL2 proves that the rule set has no contradictions. The dispatcher doesn't "try" to be safe — it is safe by construction.
The human and the machine share the same format. Org files for both. No hidden database. No import/export step. The agent's memory IS the human's memory.

These five properties, together, define a new category of AI system: the sovereign reasoning agent. Not sovereign in the blockchain sense (decentralized by consensus), but sovereign in the personal sense: the agent runs on your hardware, reasons with your knowledge, and proves its reasoning to you.
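The content-addressing property in the list above can be sketched briefly. The Python below assumes SHA-256 over a canonical JSON encoding and a single `parent` link per version; the notes do not specify the hash function, the serialization, or the node layout, so `fact_id` and the example triples are illustrative only.

```python
import hashlib
import json


def fact_id(subject, predicate, obj, parent_id=None):
    """Content address: digest of the triple plus the id of the version it replaces."""
    payload = json.dumps(
        {"s": subject, "p": predicate, "o": obj, "parent": parent_id},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# A two-step version chain for one fact.
v1 = fact_id("config.lisp", "classified-as", "public")
v2 = fact_id("config.lisp", "classified-as", "secret", parent_id=v1)

# Rewriting history: change v1 and rebuild v2 on top of the forged parent.
forged_v1 = fact_id("config.lisp", "classified-as", "internal")
v2_on_forged = fact_id("config.lisp", "classified-as", "secret", parent_id=forged_v1)

print(v2 != v2_on_forged)
```

Because each id depends on its parent's id, altering an earlier version changes every later id in the chain. Strictly speaking this makes tampering detectable rather than impossible.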

Socially

If the technical vision succeeds and Agora reaches network effects, the combination unlocks:

Verifiable public discourse. Every published claim carries provenance back to source material. "I read this, I thought this, I changed my mind on this date, here is the evidence." Public discourse shifts from "competing opinions" to "competing evidence chains." The quality floor rises because claims without provenance are visibly weaker than claims with provenance.
Sovereign AI agents with legal and economic personhood. A Passepartout agent with an Agora Persona can own assets, enter contracts, earn reputation, and face consequences for failure. This is not a chatbot. It is an autonomous entity with cryptographic identity, verified provenance, and economic agency — more like a corporation than a tool.
Self-auditing AI safety. Every action the agent takes is traceable. Every gate decision is recorded. Every fact that informed a decision is queryable. AI safety moves from "trust us" to "here is the audit trail." This is the transparency that every AI ethics framework calls for.
A personal knowledge economy. If your memex can publish Notes as Agora content, your intellectual work — your analyses, your syntheses, your discoveries — becomes a publishable, attributable, monetizable asset. Not through advertising or subscriptions, but through direct value exchange: Lightning payments for content access, contract work for your verified expertise, reputation that follows your Persona across platforms.
Collective intelligence without centralized control. If multiple Passepartout agents share facts through Agora Notes, the collective symbolic index represents the verified, provenanced knowledge of a community — not the averaged opinion of a crowd, but the auditable intersection of independently verified claims. This is Wikipedia without the editorial board, science without the journal gatekeepers, journalism without the corporate owners.
A memory prosthesis that outlives the individual. A memex with a decade of diary entries, linked to Wikidata's entity graph, with Screamer deductions surfacing patterns and contradictions, with ontology versioning preserving intellectual evolution: this is not a knowledge management tool. It is an externalized, queryable, auditable record of a life's thinking. It is what Vannevar Bush imagined in 1945: "an enlarged intimate supplement to his memory."

Conclusion

The architecture described in these notes is genuinely novel. Not incrementally novel — most agent architectures are variations on "LLM + tools + prompt-based safety." Passepartout's neurosymbolic vision is categorically different: an inversion where the deterministic layer judges the probabilistic layer, where facts carry provenance chains, where contradiction is a feature rather than an error, and where the user's Org files are the single source of truth for both human and machine.

The largest risk is not that the architecture is wrong. It is that the ontology problem — the genuine difficulty of defining what a "fact" is, what relations are, what categories are useful, and how they evolve — is harder than the notes anticipate, and that the system spends years in a partially-working state where the symbolic index is too sparse to be useful but too entangled to be discarded.

The second-largest risk is that Agora never reaches the network effects needed to make the PDS integration valuable beyond a local experiment, and that the engineering investment in DIDComm gateways, Note signing, contract verification, and Relay integration produces infrastructure for a network that doesn't exist.

The opportunity is equally large: a system that makes your own mind legible to you, that proves its reasoning rather than asserting it, that accumulates knowledge across sessions through deduction rather than re-prompting, and that publishes verified, provenanced knowledge to a decentralized network. If this works — even partially, even slowly — it is a category-level advance over every existing agent architecture and every existing personal knowledge management tool.

The notes are a map of territory that no one has walked. The territory is real. The map is detailed enough to navigate by. Whether the journey completes depends on whether the ontology problem yields to engineering, and whether the user — the one human whose memex this serves — finds value in the partial system well before the full vision materializes.
