---
title: Layered Signal Authentication — Trust in the Pipe
type: reference
tags: :passepartout:architecture:
---

Layered Signal Authentication — Trust in the Pipe

Passepartout's Perceive-Reason-Act pipeline currently accepts signals from any source that speaks the framed TCP protocol. The :source field in the signal plist is metadata: it claims an origin but does not prove it. A compromised process on the machine, a skill with elevated privileges, or a network attacker who reaches the daemon port can inject signals with :source :human-input, and the Dispatcher will treat them as authorized.

This is not a hypothetical threat. Passepartout will eventually process signals from automated feeds (RSS, API polls), sensors (vision, microphone, file watchers), and scheduled jobs (cron, heartbeat). A single compromised sensor that can inject signals claiming to be human breaks all three Laws simultaneously: it can make the system self-terminate, override human intent, and cause harm.

The solution is a single authentication gate (vector 0, at priority 700, ahead of all other gates and any type-level checking) that runs up to four configurable layers:

| Layer | Question | Mechanism | Result type | Depends on |
|-------+----------+-----------+-------------+------------|
| 1 | Is the signal cryptographically signed by a known key? | Key pairs + SHA-256 | Binary (pass/reject) | Vault + Ironclad (exist) |
| 2 | Do sensory attributes match the claimed identity? | Vision/audio processing | Plist of match results | Vision and audio skills (TBD) |
| 3 | Does deterministic reasoning rule out this identity? | Screamer + fact store | Binary (pass/reject) | Phase 2 (Screamer + fact store) |
| 4 | Do probabilistic patterns support this identity? | Embeddings + LLM | Confidence score (0-1) | Embedding infrastructure (exists) |
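
To make layer 1 concrete, here is a minimal sketch in Python rather than in Passepartout's own code. It assumes HMAC-SHA-256 with shared secrets from the standard library as a stand-in for the key-pair signatures named in the table. The names KNOWN_KEYS, canonical_bytes, sign_signal, and verify_signal, and the example key ids and secrets, are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical key registry (key id -> secret); a stand-in for keys held in the Vault.
KNOWN_KEYS = {"key-1": b"human-secret", "key-12": b"camera-feed-secret"}


def canonical_bytes(signal: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(signal, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_signal(signal: dict, key_id: str) -> dict:
    """Wrap a signal in an envelope carrying the signing key id and the signature."""
    mac = hmac.new(KNOWN_KEYS[key_id], canonical_bytes(signal), hashlib.sha256)
    return {"signal": signal, "key_id": key_id, "signature": mac.hexdigest()}


def verify_signal(envelope: dict) -> bool:
    """Layer 1 check: binary pass/reject. Unknown keys and bad signatures both reject."""
    secret = KNOWN_KEYS.get(envelope.get("key_id"))
    if secret is None:
        return False
    expected = hmac.new(
        secret, canonical_bytes(envelope["signal"]), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, envelope.get("signature", ""))


# A camera feed signs its own signal; an attacker then rewrites the claimed source.
genuine = sign_signal({"source": "camera-feed", "payload": "frame-001"}, "key-12")
forged = dict(genuine, signal={"source": "human-input", "payload": "frame-001"})
```

In this sketch verify_signal(genuine) passes, while verify_signal(forged) rejects because the signature no longer matches the altered content. The claimed source only counts once a known key vouches for the exact bytes that carry it.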

Signals that fail any binary layer (crypto or deterministic) are rejected with provenance. Signals that pass the binary layers but carry low probabilistic confidence operate at reduced authorization: read-only by default, with write actions requiring human-in-the-loop (HITL) approval. The four layers compose; they are not independent gates but one gate with configurable depth.
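
The composition rule can be sketched as a single function that walks the active layers in order. This is an illustrative Python sketch, not the Dispatcher's implementation: AuthResult, run_gate, the verdict strings, and the 0.8 confidence threshold are assumptions. The sensory layer's match results are only recorded in the provenance here, since the text does not say how they affect the verdict.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

BINARY_LAYERS = {"crypto", "deterministic"}
CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; the source gives no value


@dataclass
class AuthResult:
    verdict: str  # "rejected", "reduced", or "full"
    provenance: List[Tuple[str, object]] = field(default_factory=list)


def run_gate(
    signal: dict,
    layers: Dict[str, Callable[[dict], object]],
    active: List[str],
) -> AuthResult:
    """Run the active layers in order as one gate and return a single verdict."""
    provenance: List[Tuple[str, object]] = []
    confidence = None
    for name in active:
        outcome = layers[name](signal)
        provenance.append((name, outcome))  # every layer result is recorded
        if name in BINARY_LAYERS and outcome is not True:
            # A failed binary layer stops the gate; later layers never run.
            return AuthResult("rejected", provenance)
        if name == "probabilistic":
            confidence = float(outcome)
    if confidence is not None and confidence < CONFIDENCE_THRESHOLD:
        # Passed the binary layers with low confidence: reduced authorization.
        return AuthResult("reduced", provenance)
    return AuthResult("full", provenance)
```

A caller would map "reduced" to read-only access with HITL approval for writes. Because the provenance list is returned even on rejection, the reason for a reject stays attached to the signal.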

The authorization matrix is per-key, per-action-class. Default policy for every non-human key: (:read-only :propose). The human's key signs new source keys into existence and signs the revocation of compromised keys. Both operations produce facts in the symbolic index, so they are auditable, revocable, and survive restarts.
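
A small sketch of how such a matrix might behave, written in Python for illustration. The AuthorizationMatrix class, its method names, the "write" action class, and the in-memory facts list (standing in for the symbolic index) are all hypothetical. The signer is passed as a plain string; a real gate would first verify the human key's signature on the operation.

```python
HUMAN_KEY = "human"
DEFAULT_POLICY = frozenset({"read-only", "propose"})  # mirrors (:read-only :propose)


class AuthorizationMatrix:
    """Per-key, per-action-class permissions with an append-only audit trail."""

    def __init__(self):
        self.grants = {}     # key id -> set of permitted action classes
        self.revoked = set()
        self.facts = []      # stand-in for facts written to the symbolic index

    def register_key(self, signer, key_id, actions=DEFAULT_POLICY):
        """Bring a new source key into existence; only the human's key may do so."""
        if signer != HUMAN_KEY:
            raise PermissionError("only the human's key may sign new source keys")
        self.grants[key_id] = set(actions)
        self.facts.append(("key-registered", key_id, signer))

    def revoke_key(self, signer, key_id):
        """Revoke a compromised key; only the human's key may do so."""
        if signer != HUMAN_KEY:
            raise PermissionError("only the human's key may sign revocations")
        self.revoked.add(key_id)
        self.facts.append(("key-revoked", key_id, signer))

    def is_allowed(self, key_id, action_class):
        """Unknown and revoked keys are allowed nothing."""
        if key_id in self.revoked:
            return False
        return action_class in self.grants.get(key_id, set())


matrix = AuthorizationMatrix()
matrix.register_key(HUMAN_KEY, "vision-skill")
```

With the default policy, matrix.is_allowed("vision-skill", "propose") holds while a "write" request does not. After matrix.revoke_key(HUMAN_KEY, "vision-skill") every action class is denied, and both operations remain in matrix.facts.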

The signal provenance chain is Merkle-linked: each signal in a multi-step chain hashes its predecessor's signature as part of its own payload. After an incident: "The deletion happened because sensor #3 classified the directory as stale. Classification was signed by key #47 (vision-skill). Sensor data was signed by key #12 (camera-feed). Sensory auth noted liveness failure. Deterministic auth noted impossible transit. Key #12 was later revoked." Every intermediate step is auditable. Every signer is identifiable. Every authentication result is in the chain.
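
The linking step can be illustrated with a short Python sketch. It is not the project's wire format: the entry fields, GENESIS placeholder, and function names are hypothetical, and a plain SHA-256 digest stands in for a real signature. What it shows is the structural property described above, where each entry commits to its predecessor's signature.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor for the first signal in a chain


def digest(key_id: str, payload: str, prev_signature: str) -> str:
    """Stand-in signature: SHA-256 over the signer, payload, and predecessor."""
    body = json.dumps([key_id, payload, prev_signature]).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def append_signal(chain: list, key_id: str, payload: str) -> None:
    """Append an entry whose signature covers the previous entry's signature."""
    prev = chain[-1]["signature"] if chain else GENESIS
    chain.append({
        "key_id": key_id,
        "payload": payload,
        "prev_signature": prev,
        "signature": digest(key_id, payload, prev),
    })


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited entry breaks it and all later links."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_signature"] != prev:
            return False
        if entry["signature"] != digest(entry["key_id"], entry["payload"], prev):
            return False
        prev = entry["signature"]
    return True


chain = []
append_signal(chain, "key-12", "camera-feed: raw sensor data")
append_signal(chain, "key-47", "vision-skill: directory classified as stale")
```

Here verify_chain(chain) passes as built. Changing the payload of the first entry afterwards makes verification fail, which is what lets an incident review trust the intermediate steps.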

The human can configure which layers are active per signal class:

- AUTH_LAYERS_DEFAULT=crypto,deterministic,probabilistic
- AUTH_LAYERS_SENSOR=crypto,sensory,deterministic
- AUTH_LAYERS_CRON=crypto
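
One way these per-class settings could be read is sketched below in Python. It assumes the values arrive as environment-style key/value pairs and that a signal class with no entry falls back to the DEFAULT list; neither assumption is confirmed by the spec. The names parse_auth_layers and layers_for are hypothetical.

```python
KNOWN_LAYERS = ("crypto", "sensory", "deterministic", "probabilistic")
PREFIX = "AUTH_LAYERS_"


def parse_auth_layers(env: dict) -> dict:
    """Turn AUTH_LAYERS_<CLASS>=a,b,c pairs into {class: [layers]}."""
    config = {}
    for name, value in env.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated settings
        signal_class = name[len(PREFIX):].lower()
        layers = [part.strip() for part in value.split(",") if part.strip()]
        unknown = [layer for layer in layers if layer not in KNOWN_LAYERS]
        if unknown:
            # Fail loudly: a misspelled layer should not silently weaken the gate.
            raise ValueError(f"unknown auth layers for {signal_class}: {unknown}")
        config[signal_class] = layers
    return config


def layers_for(config: dict, signal_class: str) -> list:
    """Look up a class's layers, falling back to 'default' (an assumption)."""
    if signal_class in config:
        return config[signal_class]
    return config["default"]


env = {
    "AUTH_LAYERS_DEFAULT": "crypto,deterministic,probabilistic",
    "AUTH_LAYERS_SENSOR": "crypto,sensory,deterministic",
    "AUTH_LAYERS_CRON": "crypto",
}
config = parse_auth_layers(env)
```

With this input, layers_for(config, "cron") yields only the crypto layer, and a class with no entry of its own receives the default list.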

For full implementation detail, see the Phase 0b spec in ROADMAP.org v0.12.0.