hermes-brain/projects/passepartout/architecture/design/foundation.org

---
title: Foundation
type: reference
tags: :passepartout:architecture:
---

* Foundation

** Non-Negotiable Identity
:PROPERTIES:
:ID:       0a33bd83-ff3c-4eac-bc97-83eb6702051a
:ID:       design-identity
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

- Pure Common Lisp + Org-mode. No JSON. No YAML. No external databases.
- Single-address-space memory (Lisp hash tables in RAM — the agent IS the memory).
- "Thin harness, fat skills" — complexity lives at the edges, not the kernel.
- One agent composed of many skills. Concurrency via bordeaux-threads (shared memory).
- Plists everywhere — homoiconic communication between all components.

This is the foundational decision from which all other decisions derive. It is not negotiable. Every architectural choice below exists because this identity makes it possible — and in some cases, makes it the only viable path. The single memory space enables Merkle-tree integrity without serialization boundaries. Plists enable the cognitive pipeline to be transparent and inspectable at every stage. Org-mode as the universal format means the agent's memory, the user's notes, and the agent's own source code are the same structure. This identity is the constraint that produces the architecture.

** One Single Agent
:PROPERTIES:
:ID:       design-multi-agent-default
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

The AI industry has developed an intuition toward multi-agent systems as the default solution to hard problems. Multiple agents spawn, delegate, coordinate, debate, and consensus their way toward solutions. This pattern is compelling in demos and genuinely useful in specific contexts — but it has become a default assumption that warrants scrutiny.

When context windows grew expensive and task complexity increased, the response was natural: split the problem across agents, each handling a slice. But this architectural choice carries hidden costs that are rarely acknowledged.

*The synchronization tax* is the most immediate burden. Each agent operates with partial information, and maintaining coherence requires continuous state reconciliation. Tokens and processing cycles are spent not on the task itself, but on protocol overhead — who holds what, who decided what, who is correct when they disagree.

*Fragmented context* is the deeper problem. When Agent A writes a function and Agent B modifies a type it depends on, neither has the full picture. Integration failures emerge not from individual incompetence but from systemic communication gaps. Single-agent systems avoid this entirely: one brain holds the complete model, every decision is made with full visibility.

*Audit trails become complex* in multi-agent systems. A decision traced through a single-agent system has a clean, linear history. A decision traced through a multi-agent system branches and forks, with each agent's reasoning partially overlapping and partially conflicting.

None of this is to say multi-agent systems are never appropriate. Embarrassingly parallel workloads benefit from parallelism regardless of context. When distinct expertises are required and cannot coexist in one model, delegation makes sense. In adversarial scenarios where conflicting goals are features, multi-agent architectures shine.

But the default assumption that complex reasoning tasks are best solved by multiple agents is unproven and likely wrong for the engineering domain. Claude Code is a single-agent system. It handles 50-file refactors, debugs complex stack traces, writes tests, and navigates large codebases. The assumption that you need five agents to do what one well-designed agent can do is an industry habit, not a technical necessity.

Passepartout is single-agent by default not from limitation but from conviction: for reasoning-heavy work where coherence matters, a unified memory space and single decision-making locus are architectural assets, not constraints.

** The Unified Memory Argument
:PROPERTIES:
:ID:       design-unified-memory
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

If single-agent architecture is the decision, unified memory becomes the mechanism that makes it viable. The critical question is not "how many agents" but "how does the agent manage context without saturating."

Context window limits are largely a symptom of lazy architecture. The default approach — stuff everything in, hope the model figures it out — works poorly at scale. A more principled approach inverts the problem: the system should hold effectively infinite context, with the active window kept lean through intelligent management.

*Lazy loading* is the core technique. When an agent needs information about a function, it does not load the entire codebase. It loads precisely what the function does. Context stays lean — 2,000 to 4,000 tokens — while the full context remains accessible through retrieval.

*Compaction events* are scheduled during idle cycles. The system extracts new facts from active context and writes them to permanent storage. Active context is wiped clean, not because space ran out, but because the information has been preserved in a form that can be retrieved when relevant.

*Org-mode as externalized memory* solves the persistence problem elegantly. Every decision, every note, every task lives in plain text files the user already owns. The agent does not maintain a separate database. It queries files it can already access, modifies files it already owns.

*Retrieval is the key primitive.* Semantic search across Org files finds relevant nodes. The agent does not hold the full context — it holds pointers to context, loaded on demand. This is how a single agent handles tasks that would saturate a naive multi-megabyte context window.

The unified memory argument is not that infinite context is free. It is that with proper architecture, effective infinite context is achievable without the synchronization and fragmentation costs of multi-agent systems.

** Org-Mode as Unified AST
:PROPERTIES:
:ID:       design-org-unified-ast
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

Passepartout makes a bet that most systems consider too expensive to place: that humans and machines should share the same file format. That bet is Org-mode.

Most systems separate human-readable notes from machine-readable data. The user writes Markdown. The system stores it, indexes it, searches it. But internally, the system maintains its own model — a database, an object store, a knowledge graph — that is disconnected from the Markdown. When the user dies or leaves, the Markdown survives but the model must be reconstructed.

Passepartout refuses this separation. The Org file is not a representation of the data. The Org file IS the data. The same text that the user reads and edits is what the system parses and operates on. org-element reads an Org buffer and returns a tree structure that is the direct Lisp representation of the file's content.

This has several profound implications.

First, there is no translation layer between human and machine. When the agent writes a skill, it writes Org text that is immediately readable by the human who owns the file. When the human writes a note, it is immediately accessible to the agent as a native data structure. The communication is not mediated by a schema or an import/export process.

Second, the format is genuinely readable by both parties, not just technically accessible. Org-mode's syntax is human-friendly: headlines begin with asterisks, properties live in drawers, tags are labels after colons. The human does not have to understand the full Org specification to read what the agent wrote. The agent does not have to handle edge cases in human notation.

Third, the format is stable across decades. Org-mode has been in active development since 2003. The files written today will be readable by Org-mode in 2040. There is no schema migration, no database upgrade, no vendor lock-in. The human's notes survive the system.

Fourth, the format is universally available. Org-mode is free software. The files are plain text. There is no proprietary format to decode, no application to purchase, no cloud service to access.

Fifth, the format is header-aware and sparse-tree capable. Org-mode's headline hierarchy is not just formatting — it is a semantic structure the system can query. The agent can retrieve only the relevant subtree under a heading, ignoring the rest of the file. This is fundamentally different from Markdown, where the entire file must be loaded or the retrieval logic must parse and filter at the string level.

Sparse tree retrieval is the key to efficient context management. When the agent needs information about the =openctl-db= function, it queries for the =openctl-db= subtree specifically. It receives exactly the code, documentation, and metadata under that heading — nothing more. The context stays lean not because the file was pre-split but because the retrieval is structural. In a Markdown system, the agent either loads the entire file (expensive, noisy) or relies on imprecise grep-like search (fragile, loses hierarchy). In Org-mode, retrieval is precise, hierarchical, and cheap. The heading boundary is the access boundary.

Sixth, Org-mode unifies what every other format fragments. A single Org file contains the headline hierarchy, prose documentation, source code blocks with live evaluation, tags for categorization, metadata in property drawers, TODO state for task management, timestamps and deadlines, and links to other nodes. Markdown cannot express TODO state without external tools. JSON cannot contain prose. YAML cannot embed runnable code. Each format serves one purpose; Org-mode serves all of them. When the agent reads a skill file, it reads documentation, code, dependencies, metadata, and task state in one parseable structure. When the human reads the same file, they see the same information rendered in a human-friendly form. No other format achieves this unification without maintaining parallel files or external databases.

Seventh, a skill lives in one Org file, not a directory. The standard pattern for a software project is a directory containing =README.md=, =package.json=, =src/main.py=, =src/utils.py=, =tests/test_main.py=, =scripts/deploy.sh=, and =config.yaml=. Each file type is isolated by convention. Passepartout's skills violate this convention deliberately. Each skill is one Org file. The file contains the skill's documentation, the skill's code, the skill's metadata, the skill's TODO state, and the skill's dependencies on other skills. There is no directory to navigate, no external files to locate, no risk that the README describes behavior that the code does not implement. The skill is a single atomic unit: readable by human and machine, editable by both, versionable as one entity.

The unified format is what makes the memory architecture work. The agent's memory is not a database that the user cannot inspect. It is a folder of Org files that the user can read, edit, and understand. The agent manipulates these files directly, using the same tools the user would use. There is no hidden state, no shadow database, no model that differs from the source.

This is what "sovereignty" means in technical terms: the user owns the data in a format they can access, and the agent operates on the data in the same format they own.

** Homoiconicity as Foundation
:PROPERTIES:
:ID:       design-homoiconicity
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

Common Lisp is homoiconic: code and data share the same representation. A Lisp program is a list, and a list is a Lisp program. This is usually presented as a curiosity, an interesting property that enables macros. In Passepartout, it is the foundational enabling property of the entire self-modification architecture.

When code is data, the agent can read its own source the same way it reads a text file or an Org buffer. There is no AST parser required, no external tool to extract the function object from the running image. The agent evaluates (read-from-string source) and the result is executable Lisp. The representation it manipulates is the same representation that the runtime executes.

This is not true of most languages. In Python, the agent can inspect an AST through the ast module, but that AST is a foreign object — a data structure that represents code but is not code itself. In C, the agent cannot inspect its own compiled machine code at all.

In Lisp, the distinction between code and data is a convention, not a barrier. The agent's skills are lists. The agent can take a skill, extract a function definition, modify the body, wrap it in a new list, and evaluate it. The modification is surgical: it changes exactly what it intends to change, with no risk of corrupting adjacent state, because the representation is a tree that the runtime understands natively.

Runtime introspection is therefore native. The agent does not need a debugger API or a reflection protocol. It operates on its own code as data because its own code is data. (describe 'function-name) returns the function's documentation. (function-lambda-list 'function-name) returns its parameters. (macroexpand-1 '(defskill ...)) shows what the macro produces. There is no impedance mismatch between the agent's reasoning and the system's representation.

Self-modification is the practical consequence. The agent can detect an error, locate the erroneous function, generate a corrected version, and hot-reload it into the running image. The correction is not applied to a file that requires a restart — it is applied to the live object that the system is currently executing. This is what makes the self-editing skill viable: the agent can fix itself without stopping.

In v1.0.0, when the symbolic engine takes over the reasoning core, homoiconicity becomes the bridge between the neural and symbolic layers. The neural engine generates proposals as s-expressions. The symbolic engine evaluates them against formal constraints. The result is a modification that is simultaneously a data structure the symbolic engine can analyze and code the runtime can execute. The two representations are identical by construction.

This is the technical meaning of "Lisp as Governor": not just that Lisp orchestrates the other components, but that the representation of the system is uniform and inspectable at every level. There is no hidden state, no opaque machine code, no representation that the agent cannot reach into and modify. The system is legible to itself by design.

*** Self-Modification Without Boundaries

Other systems that support self-editing draw a line between the core and the skills. Hermes can modify its skills at runtime, but the core harness is protected — editing it requires a restart because the core is treated as privileged code that cannot be safely modified while running.

Passepartout has no such boundary. The "thin harness, fat skills" distinction describes where complexity lives, not where authority flows. The harness is small by design, but it is not privileged. The agent can read and write any part of the system — including the very code that is currently executing — without restarting.

This is only possible because Lisp code is mutable data at runtime. In a compiled language, the machine code for a running function is locked in memory, protected by the call stack, impossible to modify safely. In Lisp, the function object is a list you can modify with =setf=. When the agent changes a harness function, the running image immediately reflects the change. The next invocation uses the new code. There is no restart, no special boot mode, no distinction between development and production.

The implications extend beyond convenience. A system that cannot modify its own core is a system that has limits on its own adaptability. It can learn skills but not improve its own structure. It can grow but not evolve. Passepartout's lack of a core boundary means the system can improve its own reasoning engine, fix bugs in its own cognition, and evolve its own architecture — all while continuing to operate. There is no ceiling on self-improvement. The agent can rewrite the very code that rewrites itself.