diff --git a/ideas/passepartout-economics.org b/ideas/passepartout-economics.org new file mode 100644 index 0000000..9754eb6 --- /dev/null +++ b/ideas/passepartout-economics.org @@ -0,0 +1,409 @@ +#+TITLE: Passepartout — Patents, Moats, Economics, Design Implications +#+AUTHOR: Hermes agent distillation of 2026-05-21 discussion with Amr +#+FILETAGS: :passepartout:agent:economics:ip:licensing: +#+STARTUP: content + +* Summary + +Discussion about the economic and strategic implications of Passepartout's +architecture — a self-bootstrapping agent that combines deterministic safety +gates (0 LLM tokens per verification), Merkle-tree memory with provenance, +a symbolic fact store with sufficiency criterion, and ACL2-based macro layer +bootstrapping for provable reasoning. + +The central claim: this architecture decouples intelligence from LLM API +consumption. The probabilistic engine (LLM) handles ~10% input/output +translation; the symbolic engine handles ~80% of reasoning at near-zero +marginal cost. The cost curve inverts: generation is expensive, verification +is cheap. + +* Patentability + +** Likely patentable + +- **Probabilistic-deterministic split with deterministic gates between LLM + proposal and execution.** The LLM proposes, the gate stack decides. Each + gate is a pure Lisp function costing 0 LLM tokens. Every competitor uses + prompt-based guardrails. The specific 11-vector gate stack (secret + exposure, path protection, self-build boundary, shell safety, network + exfiltration, privacy tags, Lisp syntax, credential vault, tool permissions, + policy, protocol validation) is a specific novel implementation. + +- **Foveal-peripheral context model with Org-tree structured retrieval.** + Depth ≤ 2 always; full render on foveal node; full render on semantic + similarity to foveal; full render on temporal relevance (modified today, + upcoming deadlines); everything else title-only. Targets 2,000-4,000 tokens. + No agent does this. 
+ +- **Merkle-tree memory with copy-on-write snapshots and operation-level + undo/redo.** Every memory-object is content-addressed. Snapshots are + deep-copies. Undo/redo at the individual operation level. Applied to an + agent's reasoning loop. + +- **Gate-to-fact bootstrap with sufficiency criterion.** Mechanically + extracting facts from the gate stack's own data structures (protected paths, + shell blocked patterns, network whitelist) as the seed of an ontology. A + measurable sufficiency threshold that flips the system from LLM-proposes + to Screamer-deduces. + +- **Macro-layer-as-skill bootstrapping architecture.** Encoding theorem-proving + capability as hot-reloadable skills where each layer is verified by the layer + below. The proof forest is a Merkle-versioned dependency tree. + +** Likely not patentable (known techniques in expected applications) + +- ACL2 itself (decades old) +- Screamer for consistency checking (constraint solving on a triple store is + an obvious application) +- Hot-reloadable skills (Lisp images have been hot-reloadable for 40 years) +- Org-mode as a data format +- Multi-layer signal authentication (known in network security) + +** Counterargument from prior art + +A patent examiner will argue that: +- "Thin harness, fat skills" is the standard OS microkernel architecture + applied to an AI agent +- Foveal-peripheral context is locality of reference (standard in OS design) +- Merkle-tree memory is content-addressed storage (standard in distributed + systems) +- Deterministic gate stack is capability-based security (going back to + KeyKOS in the 1980s) + +The defense: these principles have never been *combined* in an AI agent, and +the combination produces emergent effects (cost curve inversion, sufficiency +flip, self-repairing bootstrapping chain) that no single principle produces +alone. Good patent claims would cover the specific combination, not the +individual components. 
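To make the first "likely patentable" item concrete, here is a minimal sketch of the propose-then-gate split: the model only proposes, and a stack of pure functions decides. It is written in Python for illustration rather than in the project's Lisp, and it covers only three of the eleven vectors. Every name and value in it (`Proposal`, `decide`, the protected paths, the blocked pattern, the tool names) is a hypothetical placeholder, not taken from Passepartout's code.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Proposal:
    """An action proposed by the LLM that has not been executed yet."""
    tool: str
    argument: str


# A gate is a pure function: it returns a denial reason, or None to pass.
Gate = Callable[[Proposal], Optional[str]]

# Hypothetical gate data; a real deployment would load its own rules.
PROTECTED_PATHS = ("/etc/", "/root/.ssh/")
BLOCKED_SHELL = (re.compile(r"\brm\s+-rf\s+/"),)
ALLOWED_TOOLS = {"read_file", "shell"}


def tool_permission_gate(p: Proposal) -> Optional[str]:
    if p.tool not in ALLOWED_TOOLS:
        return f"tool not permitted: {p.tool}"
    return None


def path_protection_gate(p: Proposal) -> Optional[str]:
    for prefix in PROTECTED_PATHS:
        if prefix in p.argument:
            return f"protected path: {prefix}"
    return None


def shell_safety_gate(p: Proposal) -> Optional[str]:
    if p.tool != "shell":
        return None
    for pattern in BLOCKED_SHELL:
        if pattern.search(p.argument):
            return f"blocked shell pattern: {pattern.pattern}"
    return None


GATE_STACK: List[Gate] = [
    tool_permission_gate,
    path_protection_gate,
    shell_safety_gate,
]


def decide(proposal: Proposal,
           gates: List[Gate] = GATE_STACK) -> Tuple[str, Optional[str]]:
    """Run the gates in order; the first denial wins. No model call is made."""
    for gate in gates:
        reason = gate(proposal)
        if reason is not None:
            return (":deny", reason)
    return (":allow", None)
```

Because each gate is an ordinary function over plain data, a verdict costs CPU time only, and the data the gates read (paths, patterns, permitted tools) is the same material the gate-to-fact bootstrap would extract as seed facts.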
+ +** Strongest single claim + +An AI agent system comprising: +1. A probabilistic language model +2. A stack of deterministic safety gates operating at zero LLM-token cost + between the model's proposal and execution +3. A Merkle-versioned memory store from which gate outcomes are mechanically + extracted as facts +4. A symbolic reasoning engine seeded by those facts with a measurable + sufficiency criterion that determines when the probabilistic model can + be bypassed + +Each element is known. The combination is novel and non-obvious. + +* Licensing Strategy + +** AGPLv3 for the public repository + +AGPLv3 closes the ASP loophole (Section 13): anyone who modifies the +software and offers it over a network must release their modified source. +This protects against proprietary forks that extract value without +contributing back. + +Crucially: AGPL is a *product requirement*, not a concession to openness. +The system's value proposition is provable correctness — every decision has +Merkle provenance, the proof forest is visible, the sufficiency meter is +readable. This claim is structurally incredible with closed source. An +enterprise buyer needs to inspect the gate stack, verify the Merkle +implementation, and confirm ACL2 integration is sound. AGPL makes this +possible without signing an NDA. 
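What "verifying the Merkle implementation" could mean for such a buyer can be shown with a small sketch: each memory object is addressed by a hash of its payload and its parents' hashes, so an auditor can recompute the chain and detect any altered record. This is an illustrative Python stand-in, not the project's Lisp code. `MerkleStore`, `object_id`, and the payload fields are hypothetical names, and SHA-256 over canonical JSON is an assumed encoding.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def object_id(payload: dict, parents: List[str]) -> str:
    """Content address: SHA-256 of the canonical JSON of payload and parent ids."""
    body = json.dumps(
        {"payload": payload, "parents": sorted(parents)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class MerkleStore:
    """Hypothetical in-memory store of content-addressed memory objects."""

    def __init__(self) -> None:
        self.objects: Dict[str, dict] = {}

    def put(self, payload: dict, parents: Iterable[str] = ()) -> str:
        parent_ids = list(parents)
        for pid in parent_ids:
            if pid not in self.objects:
                raise KeyError(f"unknown parent: {pid}")
        oid = object_id(payload, parent_ids)
        self.objects[oid] = {"payload": payload, "parents": parent_ids}
        return oid

    def verify(self, oid: str) -> bool:
        """Recompute the hash of oid and of every ancestor; False on any mismatch."""
        seen = set()
        stack = [oid]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            obj = self.objects.get(current)
            if obj is None:
                return False
            if object_id(obj["payload"], obj["parents"]) != current:
                return False
            stack.extend(obj["parents"])
        return True


store = MerkleStore()
rule = store.put({"kind": "gate-rule", "name": "path-protection"})
outcome = store.put({"kind": "gate-outcome", "verdict": ":deny"}, [rule])
print(store.verify(outcome))
```

Changing the stored rule after the fact makes `verify` fail for every outcome derived from it, which is the property an external auditor would want to confirm from source rather than take on trust.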
+ +** AGPL only covers modifications to code, not: + +- Gate rules specific to a domain (these are data, not code) +- The fact store (empirical data generated from usage) +- Ontology categories (design decisions stored as configuration) +- Proprietary skills loaded at runtime (AGPL boundary on plugin systems + is legally unsettled) + +** Dual license model + +- AGPLv3 for open source: builds ecosystem, trust, and community +- Commercial license for enterprises that cannot accept AGPL (blanket + policies against AGPL-licensed code), following the MySQL/SugarCRM model + +* Moats + +** Re-evaluated: time is not the primary moat + +Initial assumption: the bootstrapping chain (gate outcomes → facts → +Screamer rules → ACL2 theorems → macro layers) takes months to build, +giving first-mover advantage. + +Challenge: a Phase 4+ Passepartout fed on Wikipedia + Wikidata can build +a general ontology in two weeks. Entity resolution is batch work. Structural +consistency verification takes minutes. The organic growth advantage collapses +for general knowledge. + +** Actual moats (weaker than initially assumed) + +1. **Domain-specific gate rules**: thin. A few hundred lines of Lisp data + encoding deployment-specific path patterns, shell safety rules, and + volume layouts. Write once, trivial to copy. Not a real moat. + +2. **Empirical decision history**: every human-in-the-loop (HITL) decision + is a Merkle fact. "On date T, user approved action X under context Y." + A fresh instance has none of this. This makes *your* instance more + valuable but doesn't prevent competition: it's a switching cost, not a + barrier to entry. + +3. **Evaluation harness (regression suite)**: thousands of test cases + accumulated from every bug fix. Cannot be ingested from public data. + Built only by using the system, breaking it, fixing it, and adding a + test. Strongest residual moat, but even this can be partially + compressed through public benchmarks (SWE-bench, etc.). + +4. 
**Infrastructure integration** — the specific Docker compose layouts, + Traefik router patterns, Authentik provider configurations, backup + policies encoded as gate rules over months of use. A competitor's + infrastructure is different; their generic Passepartout does not know + your topology. + +** Strongest competitor strategy + +Not copying your gate rules — offering the same architecture as a service +with their own pre-seeded general knowledge, a generic safety baseline, +and a consulting engagement to customize gate rules for each customer. +The AGPL prevents closing the architecture but does not prevent offering +it as a service with a customization layer. + +** The defensible business is services, not product + +The defensible entity is "the organization that best understands how to +adapt Passepartout to your domain" — not "the organization that owns +Passepartout." The Lisp Machine appliance (hardware + certification) and +evaluation harness certification service are the closest thing to product +defensibility. + +* Economics and Monetization + +** Cost structure + +- One-time cost: gate-rule encoding for a domain (from hours for codified + domains — FAR, HIPAA, ISO standards — up to months for tacit domains) +- The LLM translates codified rules directly: ingest regulation → produce + gate rule plist → ACL2 verifies consistency → human reviews. This is + translation, not reasoning. 
+- For non-codified knowledge (craft expertise, organizational culture): + Phase 3 archivist loop over time +- Near-zero marginal cost: ACL2 proof + Screamer consistency check + + VivaceGraph lookup per interaction — all CPU-native, all in-image +- No recurring LLM API costs for the 80% symbolic reasoning layer +- After sufficiency flip: pennies per day vs dollars per day for LLM-only + +** Revenue models by field + +| Field | Why Passepartout | Revenue Model | +|-------+------------------+---------------| +| Industrial infrastructure (refineries, power grids, manufacturing) | Offline operation, provably safe, near-zero marginal cost, mandatory audit trail | Lisp Machine appliance + SCADA certification package | +| Healthcare administration (billing, claims, prior authorization) | Rule-heavy domain, privacy-mandated, audit-driven, high per-transaction cost today | Subscription for regulatory gate packages (CPT/ICD-10/HIPAA rules), updated when CMS publishes new rules | +| Software supply chain (CI/CD security, SBOM verification) | First-order structural verification — ACL2 is natural fit, CI/CD pipeline is already a sequence of gate-checkable steps | Evaluation harness as certification service — "run our 10,000-task suite and get a provable score" | +| Regulatory compliance (GDPR, SOC2, SOX, GxP) | Rule-completeness, active enforcement (not document-based), provable audit trail | Subscription for regulation-specific gate packages — GDPR package, SOC2 package, FedRAMP package, updated when regulations change | +| Defense and classified environments | Air-gapped operation, classification-level gate rules, Merkle provenance is court-admissible evidence | Government contract + hardened appliance with hardware root of trust | + +** Critical insight: encoding cost drops to near-zero for codified domains ** + +Laws, regulations, standards, procedures, and technical specifications are +already written down in structured text. 
The LLM does not need to *reason* +about them — it needs to *translate* them into gate rules and ACL2 theorems. + +Example: The US Federal Acquisition Regulation (FAR) is ~2,000 pages of +"thou shalt" and "thou shalt not" statements. A frontier LLM can ingest +the FAR and produce a plist of gate rules: +- (if contract > $250K AND not small-business-set-aside → :deny) +- (if sole-source AND no justification-documented → :deny, produce-justification) + +ACL2 then verifies the rule set for internal consistency (Phase 6). Screamer +checks against existing compliance facts. The human reviews the bootstrap +output and approves or corrects individual rules. + +The key distinction: the LLM is not *extracting knowledge from prose* in the +way Phase 3 archivist does (which is open-ended, noisy, requires grounding). +It is *translating a known rule system into a formal representation* — a +mechanical transformation of structured text into structured rules. The +result is not "the LLM's best guess at the rules" but "the rule set as +stated in the source document, mechanically transcribed." + +For domains where the knowledge is codified as text, the gate-rule encoding +time drops from weeks to hours. The only bottleneck is human review of the +output — and the system can assist here by surfacing contradictions for +resolution rather than requiring a full line-by-line audit. + +** What can actually be monetized (TLDR) + +1. **Pre-loaded bootstrapping chains for specific verticals** — domain gate + rules, pre-seeded fact stores, mature proof forests. Saves the buyer + months of bootstrapping. Distributed as data packages under commercial + license, not AGPL. + +2. **Evaluation harness as certification service** — "Bring your agent, + we'll run it through our suite and give a Merkle-verified score." + The regression suite grows with every deployment; a competitor's + regression suite starts empty. + +3. 
**Hardened Lisp Machine appliance** — RISC-V soft-core with Lisp + microcode, pre-loaded mature Passepartout, certified for specific + verticals (IEC 62443 for industrial, HIPAA for healthcare). Value is + in integration and certification, not the AGPL software. + +4. **Verified skill marketplace** — marketplace where skills are verified + (sandbox + ACL2 non-contradiction proof) before listing. Marketplace + takes a cut. Value is in the verification infrastructure, not the + skills themselves. + +5. **Support and consulting** — the Red Hat model. AGPL code is free; + training, custom gate rules, ontology design, and emergency support + are paid. + +* Design and Architectural Implications + +** The self-improving system + +Passepartout bootstraps two feedback loops: + +- **Empirical loop:** gate outcomes → facts → Screamer-verified patterns → + sufficiency flip → auto-extraction. Knowledge grows without the LLM + touching most of it. + +- **Logical loop:** ACL2 theorems → macro layers (generators, metafunctions, + induction DSL, abstract theories) → richer proof strategies → better + verification. Reasoning capacity grows without changing the prover binary. + +These loops intersect at the fact store: proven theorems become facts, richer +facts generate better proof strategies, better strategies verify more facts. +The system upgrades itself. + +** The 10-80-10 becomes approximately true + +- 10%: LLM handles input translation (natural language → structured goal) + and output formatting (structured result → natural language) +- 80%: Symbolic engine handles reasoning — Screamer plans, ACL2 verifies, + VivaceGraph retrieves facts. Zero LLM tokens. +- The cost curve inverts: verification is cheaper than generation. + +** Key implications + +1. **Verification becomes cheaper than generation.** Once macro layers are + mature, proving a new rule non-contradictory costs near-zero. The LLM + proposes; the symbolic engine accepts or rejects. + +2. 
**Trust scales with use.** Every interaction produces a structurally + verified outcome. Non-lossy fact base grows. Proof forest thickens. An + auditor can inspect the Merkle tree of gate outcomes and trace any + decision to its root theorem. + +3. **Degradation is reversible.** Every proof layer is a hot-reloadable + skill. Every fact has provenance. A bad metafunction is unloaded; + theorems proven under it are flagged for re-verification; the fact + store retains the pre-upgrade ontology version. + +4. **The system can diagnose its own logical frontier.** If ACL2 keeps + failing on a class of properties, and the failure mode is structural + (not solvable by more macros), the fact store accumulates a pattern: + "These N properties are first-order inexpressible." This signals the + human: the system needs a CIC prover (dependent types) for this domain. + The system cannot transcend its logic without external intervention — + but it can surface the boundary precisely. + +** The Lisp Machine endpoint + +If the system designs and builds itself on Lisp Machine hardware: +- The same system that proves theorems also optimizes the microcode +- No OS boundary, no driver layer — system and proof environment are one +- A RISC-V soft-core with Lisp microcode is manufacturable at older fab + nodes (28nm, 45nm) — sovereign intelligence without GPU supply chains + +** Social implications + +- **Concentration of reasoning.** The macro layers become opaque to anyone + who doesn't understand the bootstrapping history. The system understands + its own reasoning better than its users do. + +- **Cost advantage widens inequality asymmetrically.** The first instance + to reach maturity requires significant gate-rule design (from hours for + codified domains to months for tacit ones). After that, replication is + cheap. Organizations that invest early have a permanent cost advantage + over those that wait for a turnkey product. 
+ +- **Sovereign artifact.** A self-building system on its own hardware does + not depend on cloud APIs, GPU supply chains, or proprietary model + weights. Its intelligence is generated, verified, and sustained locally. + This enables sovereign AI for nations without GPU access. + +* Open Questions + +1. Can CIC (dependent type theory) be implemented as a Passepartout skill, + verified for crash-freedom and rule fidelity by ACL2, and integrated + into the existing fact store API? The Gödelian boundary: ACL2 can + verify the kernel's implementation but not its soundness in any + absolute sense. This matches current practice, though: Lean 4's small + C++ kernel is trusted, not proved. + +2. Can the system generate novel proof strategies? A sufficiently rich + abstract theory layer + Screamer could propose: "Proofs in domain X + all use induction schema Y. Generalizing to Z would prove new + properties across A, B, C." The LLM translates to a metafunction; + ACL2 verifies it; the prover gains a new tactic that the system itself + invented. + +3. What is the social contract for a system that can truthfully say + both "I know this is correct" and "I know what I don't know"? + Most current AI systems can do neither. + +* Impact on the AI and GPU Industry + +If a symbolic-bootstrapping architecture becomes popular, especially now +that codified domains can be ingested at near-zero encoding cost, the +industry structure shifts fundamentally. + +** Token demand compresses + +The entire AI industry (OpenAI, Anthropic, Google; ~$50B API revenue) is +built on per-token pricing: metered cognition. A mature Passepartout +reduces token consumption to the ~10% of work at the I/O boundary. Token demand +shifts from "every interaction burns tokens" to "only unfamiliar +interactions burn tokens." Steady-state per-user LLM consumption drops by +an order of magnitude. + +** GPU inference demand plateaus in regulated industries + +GPU demand is driven by two things: training and per-request inference. 
+Training demand is unaffected (frontier models still train on clusters). +Inference demand drops 80-90% in any sector where the rule book is +published — which covers most economically significant sectors (finance, +healthcare, industrial, government procurement, legal compliance). + +Nvidia's growth narrative shifts from "every transaction goes through a +GPU" to "every training run needs a GPU, and the generative 20% needs +inference." A smaller inference TAM than current market pricing assumes. + +** Hyperscaler competition shifts + +The competitive thesis "AI is the next OS, and we own the compute layer" +weakens if the most valuable AI workloads run on a $500 RISC-V board on +your premises. The hyperscalers respond by: +- Offering Passepartout as a managed service (AGPL allows this) +- Differentiating on the frontier I/O API and world model API +- Competing on gate rule libraries for specific industries + +The race shifts from "who has the most H100s" to "who has the best +domain-specific gate rules." Google's industry data advantage matters +more than Azure's raw compute. + +** New hardware tier: verification appliances + +A new category emerges: CPU-native verification appliances running a Lisp +microcode on RISC-V cores. Low volume (hundreds of thousands/year), +high margin ($5K-50K/unit), high switching costs. The Sun Microsystems +model, not the Intel model. Manufacturable at older fab nodes (28nm, +45nm) — no dependency on TSMC's leading edge. + +** The key uncertainty and its resolution + +Original question: how long does gate-rule encoding take? + +Resolution: for codified domains, near-zero. The LLM translates published +regulations into formal rules in one pass — it is a mechanical transformation, +not open-ended reasoning. The bottleneck only exists for tacit, oral, unwritten +knowledge (craft expertise, organizational culture). 
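The translate-then-check step for a codified domain can be sketched as follows: rules are plain data, and a checker surfaces pairs of rules that give opposite verdicts for the same situation, so the human reviews conflicts instead of every line. This is a Python illustration only. The exhaustive enumeration is a deliberately naive stand-in for the ACL2 and Screamer checks, not how those tools work. `GateRule`, `contradictions`, and the fact names are hypothetical, and the third rule is invented purely to trigger a conflict.

```python
from itertools import product
from typing import Dict, List, NamedTuple, Tuple


class GateRule(NamedTuple):
    name: str
    when: Dict[str, bool]  # every listed fact must have this truth value
    verdict: str           # ":deny" or ":allow"


def applies(rule: GateRule, facts: Dict[str, bool]) -> bool:
    return all(facts[key] == value for key, value in rule.when.items())


def contradictions(rules: List[GateRule]) -> Dict[Tuple[str, str], Dict[str, bool]]:
    """Map each conflicting rule pair to one fact assignment that exposes it."""
    names = sorted({key for rule in rules for key in rule.when})
    found: Dict[Tuple[str, str], Dict[str, bool]] = {}
    # Brute force over all truth assignments; fine for a handful of facts.
    for values in product([False, True], repeat=len(names)):
        facts = dict(zip(names, values))
        hits = [rule for rule in rules if applies(rule, facts)]
        for i, first in enumerate(hits):
            for second in hits[i + 1:]:
                if first.verdict != second.verdict:
                    found.setdefault((first.name, second.name), facts)
    return found


# The first two rules paraphrase the FAR example in this note; the third is
# a made-up rule added only to demonstrate a conflict.
RULES = [
    GateRule("far-threshold",
             {"over_250k": True, "small_business_set_aside": False}, ":deny"),
    GateRule("far-sole-source",
             {"sole_source": True, "justification_documented": False}, ":deny"),
    GateRule("hypothetical-urgent",
             {"sole_source": True, "urgent": True}, ":allow"),
]

for pair, witness in contradictions(RULES).items():
    print(pair, witness)
```

The output of such a check is a short list of rule pairs with a witness situation for each, which is the form in which contradictions could be handed to the human reviewer for resolution.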
+ +Consequence for the transition timeline: Phase 2 (sufficiency) happens +within months for any domain whose rule book is published. The disruption +accelerates from years to quarters.