hermes-brain/projects/passepartout/architecture/design/the-two-brains/the-dispatcher-as-learning-system.org

---
title: The Dispatcher as Learning System
type: reference
tags: :passepartout:architecture:
---

* The Dispatcher as Learning System
:PROPERTIES:
:ID:       76d92677-1eb0-4a3a-bee7-561874841e45
:ID:       design-dispatcher-learning
:CREATED:  [2026-05-07 Wed]
:WEIGHT:  40
:END:

The Dispatcher begins as a static guard — a set of rules that block obviously dangerous actions. But defining "obviously" is the hard problem. The agent encounters situations the rules do not anticipate. The Dispatcher must grow.

The human-in-the-loop exception is the seed. When the LLM proposes an action the Dispatcher does not recognize, the system does not default to blocking or allowing. It suspends. It writes the proposed action to an Org buffer in a format the human can read and understand. The human reviews and approves or denies. The Dispatcher observes the decision.

From this single observation, the Dispatcher extracts a rule. Not merely "allow this specific action" but "allow this class of actions parameterized by these dimensions." The human approved a write to ~/projects/myapp/src/core.clj. The Dispatcher generalizes: writes to ~/projects/*/src/*.lisp are approved for this session, or for this project, or indefinitely depending on the context and the user's pattern of decisions.

Shadow mode is where rules are tested before deployment. When the Dispatcher encounters a novel situation and is uncertain, it can run the proposed action in a simulated environment. It observes the side effects — what files would be modified, what processes would be spawned, what network calls would be made. If the simulation produces dangerous side effects, the rule is discarded. If it appears safe, the rule is added to the active set with a confidence rating.

Formal verification is where the learned rules are checked against invariants. The Dispatcher's rules are not merely patterns observed from human behavior. They are formulas in a logic that the system can reason about. A rule that would enable path traversal is not discarded because it was observed to be safe in prior instances — it is discarded because it violates the path-confinement invariant by construction.

The Dispatcher becomes, over time, not a guard that blocks bad actions but a reasoning system that understands why actions are good or bad. Early versions learn from human decisions. Later versions learn from their own logical analysis. The human's role transitions from approver to auditor to, eventually, unnecessary oversight.

This is the bootstrap. The system begins dependent on human judgment because it has no basis for judgment of its own. Through accumulated decisions, it constructs a model of what is permitted and why. That model is the foundation for the deterministic symbolic engine that in v1.0.0 takes over the reasoning that the Dispatcher learned to perform.