56 lines
4.3 KiB
Org Mode
---
title: Implementation Properties
type: reference
tags: :passepartout:architecture:
---

* Implementation Properties

** Performance: Why Ontology Growth Doesn't Make the System Slower
:PROPERTIES:
:ID: 772ae489-b10a-48a0-bc3b-29136163d45b
:ID: design-performance
:CREATED: [2026-05-10 Sun]
:WEIGHT: 40
:END:

Passepartout's performance thesis is: minimize LLM calls, minimize context tokens, keep everything else local and fast. Knowledge base size is irrelevant to those metrics. This is not an aspiration. It is a structural property.

The system has two cost domains with fundamentally different scaling:

| Resource   | Cost driver                              | Scales with                           |
|------------+------------------------------------------+---------------------------------------|
| LLM tokens | Context window size, number of API calls | Foveal-peripheral pruning, gate rules |
| Compute    | Screamer deduction, hash table lookups   | Entity count, rule count per domain   |

LLM tokens are minimized by design: deterministic gates cost 0 tokens, and sparse-tree rendering keeps context at 2,000–4,000 tokens regardless of memex size. Adding 5 million Wikidata entities doesn't add a single token to any LLM call. The education (the knowledge base) is local. Only the brain (the LLM) costs.

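To illustrate why context size can stay flat while the memex grows, here is a minimal Python sketch of budget-bounded context selection. It is not Passepartout's actual foveal-peripheral pruning, which this section does not specify. The function =select_context=, the node tuple shape, the synthetic relevance scores, and the 4,000-token default are all assumptions made for illustration.

#+begin_src python
def select_context(nodes, budget=4000):
    """Greedily pick the most relevant nodes that fit a fixed token budget.

    nodes: iterable of (relevance, token_count, text) tuples (hypothetical shape).
    Returns (chosen_texts, tokens_used).
    """
    chosen, used = [], 0
    # Highest relevance first; skip anything that would exceed the budget.
    for relevance, tokens, text in sorted(nodes, key=lambda n: n[0], reverse=True):
        if used + tokens <= budget:
            chosen.append(text)
            used += tokens
    return chosen, used


def make_store(size):
    """Synthetic store: every node costs 50 tokens; relevance values are made up."""
    return [((i * 7919) % 1000 / 1000.0, 50, f"node-{i}") for i in range(size)]


# Same budget, stores of very different sizes.
small_texts, small_used = select_context(make_store(1_000))
large_texts, large_used = select_context(make_store(100_000))
print(small_used, large_used)
#+end_src

In this sketch the selection work is local and grows with the store, but the number of tokens handed to the LLM is capped by the budget in both cases.
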
Memory footprint grows linearly with entity count; lookup time does not, because hash table lookups are O(1) on average. Compute also grows with rule count within a single domain during Screamer consistency checking. But these are sub-millisecond costs on local hardware, not API bills. A Screamer constraint check against a domain with 200 rules costs ~0.3ms. A 100-token guardrail paragraph in a system prompt costs ~$0.00001. The Screamer check is 10,000x cheaper, and it is convergent: it handles the rule once. The guardrail paragraph handles it on every call, forever.

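The recurring-versus-local distinction can be made concrete with a small worked calculation. The two constants below are the estimates quoted above; the call counts and function names are hypothetical and chosen only to show how the totals accumulate.

#+begin_src python
GUARDRAIL_COST_PER_CALL_USD = 0.00001  # ~100-token guardrail paragraph, estimate from the text
CHECK_TIME_MS = 0.3                    # one Screamer check against 200 rules, estimate from the text


def guardrail_spend_usd(calls):
    """API spend for carrying the guardrail paragraph on every LLM call."""
    return calls * GUARDRAIL_COST_PER_CALL_USD


def check_cpu_seconds(checks):
    """Local CPU time for running the constraint check that many times."""
    return checks * CHECK_TIME_MS / 1000.0


for calls in (1_000, 1_000_000):  # hypothetical volumes
    print(f"{calls:>9} calls: ${guardrail_spend_usd(calls):.2f} API spend, "
          f"{check_cpu_seconds(calls):.1f} s local CPU")
#+end_src

At one million calls the guardrail paragraph has cost about $10 in API spend, while the same number of local checks adds up to about 300 seconds of CPU time on hardware that is already paid for.
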
A 5-million-entity Wikidata load is ~400MB in a hash table. A lifetime personal memex with a decade of diary entries is perhaps 10–20 million triples (~1.5GB). Modern laptops carry 16–64GB. The knowledge base fits in consumer hardware with room for the Lisp runtime, the memory-object store, and the LLM inference engine.

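As a sanity check on these figures, the following sketch works out the implied per-item footprint and the share of a 16GB laptop that both stores would occupy together. It assumes decimal units (1 MB = 10^6 bytes) and takes the sizes above at face value; the variable names are illustrative.

#+begin_src python
MB = 10**6
GB = 10**9

wikidata_bytes = 400 * MB        # ~400MB for 5 million entities (from the text)
wikidata_entities = 5_000_000
memex_bytes = 1500 * MB          # ~1.5GB for 10-20 million triples (from the text)
ram_bytes = 16 * GB              # smallest laptop configuration mentioned

bytes_per_entity = wikidata_bytes / wikidata_entities
bytes_per_triple_low = memex_bytes / 20_000_000
bytes_per_triple_high = memex_bytes / 10_000_000
ram_share = (wikidata_bytes + memex_bytes) / ram_bytes

print(f"{bytes_per_entity:.0f} bytes per entity")
print(f"{bytes_per_triple_low:.0f}-{bytes_per_triple_high:.0f} bytes per triple")
print(f"{ram_share:.1%} of 16GB RAM for both stores")
#+end_src

Under these assumptions the estimates imply 80 bytes per entity, 75 to 150 bytes per triple, and a combined footprint of roughly 12% of a 16GB machine.
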
*One genuine risk: rule generalization width.* If Screamer deduces increasingly broad rules within a single domain, the constraint space could bloat. Mitigation: rules carry a =:domain= tag, and Screamer only applies rules from the fact's domain. Rule generalization that crosses domain boundaries is gated and must be human-approved. Rules that prove unused (never triggered a check in N heartbeat cycles) are demoted to =:inactive= and excluded from the active constraint set.

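The mitigation above can be sketched in a few lines of Python. This is an illustration of the bookkeeping only, not the system's Lisp code and not Screamer itself: the =Rule= class, the =heartbeat= and =active_rules_for= functions, the rule names, and the choice of N = 3 are all hypothetical.

#+begin_src python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    domain: str
    status: str = "active"   # "active" or "inactive"
    idle_cycles: int = 0     # heartbeat cycles since the rule last triggered a check


def active_rules_for(rules, fact_domain):
    """Only active rules tagged with the fact's own domain are applied."""
    return [r for r in rules if r.domain == fact_domain and r.status == "active"]


def heartbeat(rules, triggered_names, max_idle):
    """Advance one heartbeat cycle; demote rules idle for max_idle cycles."""
    for r in rules:
        if r.status != "active":
            continue
        if r.name in triggered_names:
            r.idle_cycles = 0
        else:
            r.idle_cycles += 1
            if r.idle_cycles >= max_idle:
                r.status = "inactive"


rules = [
    Rule("no-force-push", "git"),
    Rule("no-rm-root", "shell"),
    Rule("stale-rule", "git"),   # never triggers in this example
]
for _ in range(3):
    heartbeat(rules, triggered_names={"no-force-push", "no-rm-root"}, max_idle=3)

git_rules = [r.name for r in active_rules_for(rules, "git")]
print(git_rules)
#+end_src

After three idle cycles the unused rule drops out of the active set, so a check on a fact in the git domain sees only the rule that is still earning its place.
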
This is the minimalism argument restated in concrete terms: you buy bigger RAM and a faster CPU once, instead of paying for a bigger LLM context window on every call. The education is a capital investment. The brain is an operating expense. The architecture makes the ratio favor capital.

** The Provenance Chain as Product
:PROPERTIES:
:ID: design-provenance-product
:CREATED: [2026-05-10 Sun]
:WEIGHT: 40
:END:

In the coding domain, the value of the symbolic engine is the verified fact: "this command is safe." In the broader memex, the value is the provenance itself: "this claim originated in that diary entry on that date, has been referenced 7 times across 4 different projects, was contradicted in a retrospective 6 months later, and was revised in a note 3 weeks after that."

The symbolic engine doesn't tell you what is true. It tells you what you wrote, when, where, and how it connects to everything else you wrote, with a verifiable audit trail. It is a memory prosthesis that makes your own mind legible to you.

Every fact carries:
- =:grounding= :: the specific Org heading from which it was extracted
- =:provenance= :: who or what produced it (gate-outcome, human-authored, deduced, LLM-proposed)
- =:timestamp= :: when it was admitted to the symbolic index
- =:referenced-by= :: other facts that depend on or reference this one
- =:contradicted-by= :: other facts that disagree with this one (if any)
- =:superseded-by= :: if this fact was replaced by a newer version

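The field list above maps naturally onto a small record type. The following Python sketch is one possible shape, not the system's actual representation: the =Fact= class, the =node_id= and =claim= fields, the =supersede= helper, and the sample values are hypothetical, while the remaining fields and the four provenance kinds come from the list.

#+begin_src python
from dataclasses import dataclass, field
from typing import List, Optional

# Provenance kinds named in the field list.
PROVENANCE_KINDS = {"gate-outcome", "human-authored", "deduced", "LLM-proposed"}


@dataclass
class Fact:
    node_id: str                 # hypothetical identifier
    claim: str                   # hypothetical: the text of the fact
    grounding: str               # Org heading the fact was extracted from
    provenance: str              # one of PROVENANCE_KINDS
    timestamp: str               # when it was admitted to the symbolic index
    referenced_by: List[str] = field(default_factory=list)
    contradicted_by: List[str] = field(default_factory=list)
    superseded_by: Optional[str] = None

    def __post_init__(self):
        if self.provenance not in PROVENANCE_KINDS:
            raise ValueError(f"unknown provenance: {self.provenance}")

    def is_current(self):
        """A fact is current until a newer version supersedes it."""
        return self.superseded_by is None


def supersede(old, new):
    """Link old to its replacement instead of deleting it, so the trail survives."""
    old.superseded_by = new.node_id


original = Fact("f1", "Ship by June", "2025 plans", "human-authored", "2025-01-10")
revised = Fact("f2", "Ship by September", "Retrospective", "human-authored", "2025-07-02")
supersede(original, revised)
print(original.is_current(), revised.is_current())
#+end_src

Superseding links the old fact to the new one and never removes it, which is what keeps the history auditable.
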
These fields make every fact auditable. The =/audit <node-id>= command renders the full provenance chain as an Org headline tree. The provenance is not a logging feature. It is the product.
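
To show what rendering a provenance chain as an Org headline tree could look like, here is a minimal Python sketch. The real output layout of =/audit= is not specified in this section, so the =render_audit= function, the dictionary shape, the sample facts, and the headline format are all assumptions.

#+begin_src python
RELATIONS = ("referenced-by", "contradicted-by", "superseded-by")


def render_audit(facts, node_id, depth=1, label=None, seen=None):
    """Return Org lines for node_id and every fact linked to it."""
    seen = set() if seen is None else seen
    stars = "*" * depth
    prefix = f"{label}: " if label else ""
    if node_id in seen:  # already rendered; avoid looping on reference cycles
        return [f"{stars} {prefix}{node_id} (see above)"]
    seen.add(node_id)
    fact = facts[node_id]
    lines = [
        f"{stars} {prefix}{node_id}",
        f"- claim :: {fact['claim']}",
        f"- grounding :: {fact['grounding']}",
        f"- provenance :: {fact['provenance']}",
        f"- timestamp :: {fact['timestamp']}",
    ]
    # Each linked fact becomes a child headline one level deeper.
    for relation in RELATIONS:
        for child_id in fact.get(relation, []):
            lines += render_audit(facts, child_id, depth + 1, relation, seen)
    return lines


facts = {  # hypothetical sample data
    "f1": {"claim": "Ship by June", "grounding": "2025 plans",
           "provenance": "human-authored", "timestamp": "2025-01-10",
           "contradicted-by": ["f2"]},
    "f2": {"claim": "June slipped", "grounding": "Retrospective",
           "provenance": "human-authored", "timestamp": "2025-07-02",
           "superseded-by": ["f3"]},
    "f3": {"claim": "Ship by September", "grounding": "Revised plan",
           "provenance": "human-authored", "timestamp": "2025-07-23"},
}
audit_lines = render_audit(facts, "f1")
print("\n".join(audit_lines))
#+end_src

The root fact becomes a top-level headline, and each contradiction or revision nests one level deeper, so the chain can be folded and navigated like any other Org subtree.
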