passepartout: v0.5.0 hotfix 2 — daemon stable

- Restore (in-package :passepartout) to core-reason - Move *VAULT-MEMORY* back to core-skills - Fix ASDF and defstruct/defpackage ordering - Increase daemon timeout to 120s - Handshake: 0.5.0 Verified: daemon processes messages, TUI clean, gate trace works
2026-05-07 20:14:51 -04:00
parent da160b71e3
commit 924bf8f479
71 changed files with 2976 additions and 3888 deletions
--- a/docs/ARCHITECTURE.org
+++ b/docs/ARCHITECTURE.org
@@ -63,6 +63,7 @@ When the agent assembles context for the LLM, it does not send the entire memory
 1. *Depth ≤ 2* — the root node and its immediate children are always included (title and properties only, no content).
 2. *Foveal focus* — the node the user is currently interacting with is rendered in full, including its body content and all descendants.
 3. *Semantic relevance* — any node whose embedding vector has cosine similarity ≥ threshold (default 0.75) to the foveal node is rendered in full.
+4. *Temporal relevance* — nodes modified within a time window (current session, today) are rendered in full. Deadlines and scheduled items approaching within the warning window (default 60 minutes) are surfaced proactively in the awareness context. Nodes older than the window are title-only. This is the temporal dimension of the foveal-peripheral model: prune in time as well as in semantic space.

 Nodes that don't match any rule are rendered as title-only — a single Org headline with its :ID: property. This keeps active context between 2,000–4,000 tokens for typical memex sizes, versus 50,000–150,000 tokens for a full serialization. The embedding vectors that power semantic retrieval are computed at ingest time (~ingest-ast~ in core-memory.lisp) and can use local models (Ollama), cloud APIs (OpenAI embeddings), or a zero-dependency lexical fallback (trigram Jaccard similarity).

--- a/docs/DESIGN_DECISIONS.org
+++ b/docs/DESIGN_DECISIONS.org
@@ -327,6 +327,8 @@ The structural multipliers are:

 4. *Hot state* — in a REPL-based agent, variables, file handles, sub-routine results, and memory objects are already in memory. Every turn in a standard chat agent re-sends the full conversation history. Token costs in chat agents are quadratic: a 10-turn session pays for ~55 "turns" of context (10 + 9 + 8 + ... + 1 = 55). In Passepartout, context is stored once in the Lisp image. A 10-turn session pays for ~10 turns of context. This is an ~82% reduction on protocol overhead alone, before any foveal-peripheral pruning. This argument is testable: send the same multi-turn session through both architectures and count tokens.

+5. *Temporal filtering* — time-scoped memory queries (what happened today? what's due in the next hour?) return only nodes matching the time window. The temporal filter is a pure-Lisp hash-table walk with a numeric comparison on ~memory-object-version~. Sub-millisecond. 0 LLM tokens. Competitors without time-indexed memory must serialize all nodes and let the LLM scan for temporal relevance — 5,000–50,000 tokens per temporal query. This is the same principle as the foveal-peripheral model applied to the time dimension.
+
 ** The Compounding Cost Curve — Unique Among Agents

 Every AI agent grows more expensive over time. Context histories accumulate. Safety instructions grow more elaborate. Guardrails become longer prompt paragraphs. The user's data grows. The only way to reduce cost in a standard agent is to cap context — sacrificing capability.
@@ -343,6 +345,24 @@ Passepartout has a downward cost curve. Four mechanisms compound:

 After 12 months of daily use, Passepartout's per-session costs are expected to be 40–60% of baseline, while competitors' costs rise to 125–140% of baseline. The crossover point is estimated at 3–6 months. This is not a model quality claim — it is a structural property of the architecture.

+** Time Awareness as a Structural Advantage
+:PROPERTIES:
+:ID:       design-time-awareness
+:CREATED:  [2026-05-07 Thu]
+:END:
+
+Passepartout's architecture provides three layers of time awareness, each enabled by infrastructure that competitors lack:
+
+*Level 1 — Present Awareness.* The LLM knows the current time, date, and session duration because a single ~format-time-for-llm~ call injects it into the system prompt. Most agents know the date from the OS. None know the time or session duration. The cost is ~8 incremental tokens per call (trivially prefix-cached). The saving is eliminating "I don't know the current time" preamble tokens, time-check tool calls, and incorrect temporal reasoning from a model guessing the time.
+
+*Level 2 — Temporal Memory.* Memory queries accept ~:since~ and ~:until~ parameters. "What did I work on in the last hour?" filters 500 nodes to 12 in sub-millisecond Lisp rather than serializing 500 nodes to the LLM at ~5,000 tokens for it to scan. Every memory node carries a ~memory-object-version~ timestamp (a monotonic ~get-universal-time~ value set at ingest since v0.1.0). The temporal filter is a hash-table walk with numeric comparison. 0 LLM tokens. >90% token reduction on time-scoped queries.
+
+*Level 3 — Proactive Triggers.* The heartbeat tick (existing infrastructure since v0.3.0) scans for approaching deadlines every 60 seconds. When a deadline is within the warning window (~DEADLINE_WARNING_MINUTES~, default 60), a temporal context note is injected into the awareness assembly. The LLM sees "3 deadlines today: Submit report (45min)" in its context without a triggering call. A "what should I work on today?" query is answered from pre-loaded context — 0 LLM tokens versus 1,500–4,000 for an unassisted agent.
+
+None of these three layers require new infrastructure. Time awareness is not a feature Passepartout builds — it is a feature Passepartout *unlocks* by having timestamped memory (v0.1.0), heartbeat+cron (v0.3.0), and the foveal-peripheral context pruning model (v0.2.0) already in place. Adding time awareness costs ~175 lines of Lisp. Building it in competitors would require building the heartbeat, the time-indexed memory, and the proactive context injection — 800+ lines each — and would still cost LLM tokens because their safety verification is prompt-based.
+
+The structural principle generalizes: Passepartout's infrastructure investments compound. Each new subsystem (Merkle memory, heartbeat, skill engine, embedding pipeline) lowers the cost of the next feature. Time awareness is the first demonstration of this compounding — three layers unlocked by infrastructure already built for other purposes.
+
 ** Tiered Pricing: Cheap Models for Simple Tasks, Free for Learned Patterns

 The model-tier router (v0.3.0) classifies every task by complexity and routes it to the cheapest capable model. Simple lookups go to tiny local models or deterministic hash table scans (0 LLM tokens). Text processing goes to mid-tier models. Complex planning and code generation go to the premium model. The consensus loop (v0.10.0) only fires for high-impact actions.
--- a/docs/ROADMAP.org
+++ b/docs/ROADMAP.org
@@ -884,55 +884,6 @@ After all renames complete, update every remaining reference:

 *** Verify: ASDF compiles, FiveAM suite passes, integration tests pass.

-*** Time Awareness — Temporal Context for the Agent
-
-Rationale: Passepartout already has the infrastructure for time awareness — timestamped memory (v0.1.0), heartbeat+cron (v0.3.0), and foveal-peripheral context pruning (v0.2.0). Adding time awareness costs ~175 lines of Lisp and unlocks three layers that no competitor provides. The temporal dimension is the missing axis in the foveal-peripheral model: prune in time as well as in semantic space.
-
-*** TODO Time Awareness — Level 2: temporal memory filtering
-:PROPERTIES:
-:ID:       id-v050-time-memory
-:CREATED:  [2026-05-07 Thu]
-:END:
-
-Rationale: ~memory-object-version~ has been set to ~get-universal-time~ on every ingest since v0.1.0. Every memory node carries a timestamp. But ~context-query~ has no time filter — "what did I work on today?" serializes all nodes to the LLM instead of filtering 500→12 in sub-millisecond Lisp.
-
- ~memory-objects-since(timestamp)~ in ~core-memory.lisp~: hash-table walk returning objects with ~version >= timestamp~. ~20 lines.
- ~memory-objects-in-range(since until)~ in ~core-memory.lisp~: version between two timestamps. ~15 lines.
- Extend ~context-query~ in ~symbolic-awareness.lisp~ with ~:since~ and ~:until~ keyword parameters. ~10 lines.
- Pure Lisp, sub-millisecond, 0 LLM tokens. ~90% token reduction on time-scoped memory queries.
- FiveAM test: ingest 3 nodes at T0, sleep, ingest 2 nodes at T1, verify ~memory-objects-since(T1)~ returns exactly 2.
-
-*** TODO Time Awareness — Level 3: ~sensor-time~ skill
-:PROPERTIES:
-:ID:       id-v050-sensor-time
-:CREATED:  [2026-05-07 Thu]
-:END:
-
-Rationale: The heartbeat fires every 60 seconds for maintenance tasks. It can also carry temporal awareness — scanning for approaching deadlines, tracking session duration, and injecting temporal context so the LLM knows "3 deadlines today: Submit report (45min)" without triggering a call. This turns "what should I do today?" from a 1,500–4,000 token LLM call into a 0-token pre-loaded context answer.
-
- New skill: ~sensor-time.org~ → ~sensor-time.lisp~. ~120 lines.
- Session tracking: record session start time at load. Expose ~(session-duration)~.
- Cron-registered heartbeat tick: ~orchestrator-register-cron "time-tick"~ with ~:action sensor-time-tick~, ~:tier :reflex~ (no LLM), ~:repeat "+1m"~.
- Deadline scanning on tick: query memory for headlines with ~:DEADLINE~ or ~:SCHEDULED~ properties. If within ~DEADLINE_WARNING_MINUTES~ (env var, default 60), inject deadline note into awareness context.
- Deadline context note format: ~"3 deadlines approaching: Submit report (45min), Review PR (2h), Call mom (3h)."~
- ~TUI status bar~: add session duration and deadline count to the status bar (reuse existing gate-trace / focus-map rendering from v0.4.0).
- FiveAM test: set deadline 30 minutes from now, fire tick, verify deadline appears in awareness context.
-
-*** TODO Time Awareness — Level 1: timestamp in system prompt
-:PROPERTIES:
-:ID:       id-v050-time-prompt
-:CREATED:  [2026-05-07 Thu]
-:END:
-
-Rationale: The system prompt currently has IDENTITY, TOOLS, CONTEXT, LOGS. No TIME. The LLM cannot answer "what time is it?" or contextualize deadlines correctly. Adding a timestamp costs ~8 incremental tokens and eliminates guessing, time-check tool calls, and preamble hedging. Combined with session duration from Level 3, the LLM knows "2026-05-07 Thu 14:32:17 UTC. Session: 3h 12m."
-
- ~format-time-for-llm~ function: returns human-readable date + time + optional session duration. Uses ~multiple-value-bind~ with ~decode-universal-time~. ~15 lines.
- Inject into ~think()~'s system prompt format string in ~core-reason.lisp~: add ~TIME:~ section between IDENTITY and TOOLS. ~5 lines.
- ~TIME_AWARENESS~ env var (default ~true~) in ~.env.example~. When ~false~, timestamp omitted.
- ~TIME_FORMAT~ env var (default ~iso~): ~iso~ = ~2026-05-07T14:32:17Z~, ~natural~ = ~2:32 PM UTC, Thursday May 7, 2026~.
- Session duration from ~session-duration~ function in ~sensor-time~ skill (Level 3). If skill not loaded, omit duration, show time only.
- FiveAM test: ~format-time-for-llm~ returns string containing current year and UTC; with ~TIME_AWARENESS=false~ returns empty string.
-
 *** Token Economics (foundation complete — now build features)

 **Design insight: why token economics is the structural differentiator.** Passepartout's sparse-tree rendering and deterministic safety gates should produce 2–3x fewer tokens than competitors for equivalent coding tasks, and 13–24x fewer for knowledge management. But without caching and budget enforcement, the fixed overhead per call eats these savings. A coding session that touches 30 files with competent context management costs ~72K tokens (Passepartout) versus ~185K (Claude Code). Without caching, the Passepartout number climbs toward ~150K because every call retransmits the static prefix. The architectural advantage exists in theory but requires operational plumbing to materialize.
@@ -1187,6 +1138,55 @@ The cost tracking and budget enforcement are defensive advantages: no competitor

 The minimum viable local model advantage is structural: at 2,000–4,000 effective tokens (foveal-peripheral + caching), a 7–8B parameter model on consumer hardware is a daily driver. Competitors at 32K+ effective tokens require 70B+ parameter models and 16–32 GB VRAM. Passepartout runs on a laptop GPU where competitors need a data center card or cloud API.

+** v0.5.1: Time Awareness
+
+Rationale: Passepartout already has the infrastructure for time awareness — timestamped memory (v0.1.0), heartbeat+cron (v0.3.0), and foveal-peripheral context pruning (v0.2.0). Adding time awareness costs ~175 lines of Lisp and unlocks three layers that no competitor provides. The temporal dimension is the missing axis in the foveal-peripheral model: prune in time as well as in semantic space.
+
+*** TODO Time Awareness — Level 2: temporal memory filtering
+:PROPERTIES:
+:ID:       id-v051-time-memory
+:CREATED:  [2026-05-07 Thu]
+:END:
+
+Rationale: ~memory-object-version~ has been set to ~get-universal-time~ on every ingest since v0.1.0. Every memory node carries a timestamp. But ~context-query~ has no time filter — "what did I work on today?" serializes all nodes to the LLM instead of filtering 500→12 in sub-millisecond Lisp.
+
+- ~memory-objects-since(timestamp)~ in ~core-memory.lisp~: hash-table walk returning objects with ~version >= timestamp~. ~20 lines.
+- ~memory-objects-in-range(since until)~ in ~core-memory.lisp~: version between two timestamps. ~15 lines.
+- Extend ~context-query~ in ~symbolic-awareness.lisp~ with ~:since~ and ~:until~ keyword parameters. ~10 lines.
+- Pure Lisp, sub-millisecond, 0 LLM tokens. ~90% token reduction on time-scoped memory queries.
+- FiveAM test: ingest 3 nodes at T0, sleep, ingest 2 nodes at T1, verify ~memory-objects-since(T1)~ returns exactly 2.
+
+*** TODO Time Awareness — Level 3: ~sensor-time~ skill
+:PROPERTIES:
+:ID:       id-v051-sensor-time
+:CREATED:  [2026-05-07 Thu]
+:END:
+
+Rationale: The heartbeat fires every 60 seconds for maintenance tasks. It can also carry temporal awareness — scanning for approaching deadlines, tracking session duration, and injecting temporal context so the LLM knows "3 deadlines today: Submit report (45min)" without triggering a call. This turns "what should I do today?" from a 1,500–4,000 token LLM call into a 0-token pre-loaded context answer.
+
+- New skill: ~sensor-time.org~ → ~sensor-time.lisp~. ~120 lines.
+- Session tracking: record session start time at load. Expose ~(session-duration)~.
+- Cron-registered heartbeat tick: ~orchestrator-register-cron "time-tick"~ with ~:action sensor-time-tick~, ~:tier :reflex~ (no LLM), ~:repeat "+1m"~.
+- Deadline scanning on tick: query memory for headlines with ~:DEADLINE~ or ~:SCHEDULED~ properties. If within ~DEADLINE_WARNING_MINUTES~ (env var, default 60), inject deadline note into awareness context.
+- Deadline context note format: ~"3 deadlines approaching: Submit report (45min), Review PR (2h), Call mom (3h)."~
+- ~TUI status bar~: add session duration and deadline count to the status bar (reuse existing gate-trace / focus-map rendering from v0.4.0).
+- FiveAM test: set deadline 30 minutes from now, fire tick, verify deadline appears in awareness context.
+
+*** TODO Time Awareness — Level 1: timestamp in system prompt
+:PROPERTIES:
+:ID:       id-v051-time-prompt
+:CREATED:  [2026-05-07 Thu]
+:END:
+
+Rationale: The system prompt currently has IDENTITY, TOOLS, CONTEXT, LOGS. No TIME. The LLM cannot answer "what time is it?" or contextualize deadlines correctly. Adding a timestamp costs ~8 incremental tokens and eliminates guessing, time-check tool calls, and preamble hedging. Combined with session duration from Level 3, the LLM knows "2026-05-07 Thu 14:32:17 UTC. Session: 3h 12m."
+
+- ~format-time-for-llm~ function: returns human-readable date + time + optional session duration. Uses ~multiple-value-bind~ with ~decode-universal-time~. ~15 lines.
+- Inject into ~think()~'s system prompt format string in ~core-reason.lisp~: add ~TIME:~ section between IDENTITY and TOOLS. ~5 lines.
+- ~TIME_AWARENESS~ env var (default ~true~) in ~.env.example~. When ~false~, timestamp omitted.
+- ~TIME_FORMAT~ env var (default ~iso~): ~iso~ = ~2026-05-07T14:32:17Z~, ~natural~ = ~2:32 PM UTC, Thursday May 7, 2026~.
+- Session duration from ~session-duration~ function in ~sensor-time~ skill (Level 3). If skill not loaded, omit duration, show time only.
+- FiveAM test: ~format-time-for-llm~ returns string containing current year and UTC; with ~TIME_AWARENESS=false~ returns empty string.
+
 ** v0.6.0: Signal Pipeline, Concurrency & Streaming

 The current pipeline is strictly sequential — one signal traverses Perceive → Reason → Act before the next signal begins. Background tasks (heartbeat, embedding cron, gardener scans) compete with foreground interactions. A heartbeat that fires during a long tool chain is queued. A Telegram message during a multi-step planning cycle is queued. The system feels sluggish under concurrent load even though the symbolic operations are near-instant (SBCL hash table lookups are microseconds) — the bottleneck is the single-pipeline architecture, not the hardware.