passepartout/docs/v0.2.x-REMEDIATION.org

#+TITLE: v0.2.x Remediation Plan
#+AUTHOR:
#+STARTUP: content
#+FILETAGS: :docs:plan:remediation:

* Summary

This plan covers features that are marked DONE in the ROADMAP for v0.1.0 and v0.2.0
but whose implementations are stubs, no-ops, or missing critical functionality.
They should have been completed in their respective versions and must be
addressed before v0.3.0 development proceeds.

* P0: system-archivist — Proper Distillation and Link Maintenance

** Claimed status
=DONE= (v0.1.0: "Scribe + Gardener background workers" + v0.2.0: "31 org files with full literate prose")

** Actual state
=archivist-log= is a trivial log wrapper (~10 lines). It performs no knowledge
distillation, no broken-link detection, and no orphaned-node flagging.

** What it should do

*** Scribe (knowledge distillation)
1. Read daily Org log files from the Memex =daily/= directory
2. Identify new entries (since last processed commit or timestamp)
3. Extract conceptual claims, decisions, and atomic facts from prose
4. Generate atomic Zettelkasten notes in =notes/= with:
   - Descriptive snake_case filename (no dates)
   - =:CREATED:= property from the source log's date
   - =Source:= backlink to the original daily file and headline
   - Tags inferred from content and parent file
5. Track processed state to avoid re-distilling the same content
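
The Scribe steps above can be sketched in plain Common Lisp. This is a minimal illustration, not the project's implementation: =snake-case-filename=, =unprocessed-entries=, and =render-atomic-note= are hypothetical helpers, entries are assumed to be plists carrying an ISO =:date= string, and the claim extraction in step 3 is left out entirely.

#+begin_src lisp
(defun snake-case-filename (title)
  "Turn TITLE into a snake_case .org filename (hypothetical helper)."
  (let ((chars '())
        (pending-separator nil))
    (loop for ch across (string-downcase title)
          do (cond ((alphanumericp ch)
                    ;; Emit one underscore per run of separators, never a leading one.
                    (when (and pending-separator chars)
                      (push #\_ chars))
                    (setf pending-separator nil)
                    (push ch chars))
                   (t (setf pending-separator t))))
    (concatenate 'string (coerce (nreverse chars) 'string) ".org")))

(defun unprocessed-entries (entries last-processed)
  "Keep the entries dated after LAST-PROCESSED.
Assumes ISO dates (YYYY-MM-DD), which sort lexicographically."
  (remove-if-not (lambda (entry)
                   (or (null last-processed)
                       (string> (getf entry :date) last-processed)))
                 entries))

(defun render-atomic-note (title claim
                           &key created source-file source-headline tags)
  "Return the Org text of one atomic note with a CREATED property and a
Source backlink to the originating daily file and headline."
  (with-output-to-string (out)
    ;; Headline, with an optional :tag1:tag2: suffix.
    (format out "* ~A~@[ :~{~A~^:~}:~]~%" title tags)
    (format out ":PROPERTIES:~%:CREATED: [~A]~%:END:~%" created)
    (format out "Source: [[file:~A::*~A][~A]]~%~%"
            source-file source-headline source-file)
    (format out "~A~%" claim)))
#+end_src

Deriving the filename from the note title rather than the log date keeps notes addressable by concept, and filtering on the last processed date is the simplest form of the state tracking described in step 5.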

*** Gardener (structural maintenance)
1. Scan all Org files in the Memex for broken =[[file:...][...]]= links
2. Scan =memory-store= for =memory-object= entries whose =:parent-id= or =:children=
   references point to deleted objects (orphaned nodes)
3. Flag broken links and orphans with =:GARDENER: broken-link= or =:GARDENER: orphan= tags
4. Generate a maintenance report as an Org buffer the user can review
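
The broken-link scan in step 1 might look like the following sketch. It assumes links use the =[[file:PATH][DESCRIPTION]]= or =[[file:PATH::SEARCH][DESCRIPTION]]= form and that paths are relative to a base directory; =file-link-targets= and =broken-file-links= are hypothetical names, and the orphan check against =memory-store= is not covered here.

#+begin_src lisp
(defun file-link-targets (text)
  "Return the path part of every [[file:...]] link in TEXT, in order."
  (let ((targets '())
        (start 0)
        (prefix "[[file:"))
    (loop
      (let ((pos (search prefix text :start2 start)))
        (unless pos (return))
        (let* ((begin (+ pos (length prefix)))
               ;; The target ends at the first closing bracket.
               (end (or (position #\] text :start begin) (length text)))
               (target (subseq text begin end))
               ;; Drop an optional ::search suffix such as ::*Headline.
               (sep (search "::" target)))
          (push (if sep (subseq target 0 sep) target) targets)
          (setf start end))))
    (nreverse targets)))

(defun broken-file-links (text base-directory)
  "Return the link targets in TEXT that do not exist under BASE-DIRECTORY."
  (remove-if (lambda (target)
               (probe-file (merge-pathnames target base-directory)))
             (file-link-targets text)))
#+end_src

The result of =broken-file-links= is the list a Gardener run would tag with =:GARDENER: broken-link= and include in the maintenance report.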

*** Implementation approach
- Wire into =system-event-orchestrator= as cron jobs:
  - Scribe: daily cron (="<%Y-%m-%d %a +1d>"=, tier =:cognition=)
  - Gardener: weekly cron (="<%Y-%m-%d %a +1w>"=, tier =:cognition=)
- Use =orchestrator-register-cron= to schedule
- Replace the trivial =archivist-log= function with real implementation
- Track last-processed state via =memory-store= (a =:LATEST_PROCESSED_DATETIME= property)
  or a git commit hash

** Dependencies
=system-event-orchestrator= (cron scheduling), =core-memory= (object store)

** Verification
A FiveAM test that creates a daily log with known content, runs the Scribe, and
asserts that an atomic note was created with correct backlinks.

* P0: system-self-improve — Surgical Self-Editing and Self-Repair

** Claimed status
=DONE= (v0.2.0: "Self-editing (error detection, surgical fix, hot-reload)")

** Actual state
=self-improve-edit= does =(declare (ignore old-text new-text))= followed by a log
message, with no actual text transformation. =self-improve-fix= follows the same
pattern. The skill's trigger is =nil=, so it never fires.

** What it should do

*** Self-edit (surgical text replacement)
1. Accept (=filepath=, =old-text=, =new-text=) and apply the transformation
2. Read the file, locate =old-text= (with exact match verification), replace with =new-text=
3. If the target is an Org file with a =#+begin_src lisp= block, tangle the file
   and reload the skill after the edit
4. Create a memory snapshot before editing (rollback safety)
5. Verify the edit succeeded (re-read file, confirm =new-text= appears)
6. Return success/failure with a diff summary
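
A minimal sketch of steps 2 and 5, using only standard Common Lisp. The function names are hypothetical, and the snapshot (step 4), tangling (step 3), and diff summary (step 6) are deliberately left out. The replacement refuses to proceed when =old-text= is missing or appears more than once, which is one way to read "exact match verification".

#+begin_src lisp
(defun replace-exactly-once (text old-text new-text)
  "Replace the single occurrence of OLD-TEXT in TEXT with NEW-TEXT.
Return (values updated-text :ok), or (values nil reason) on failure."
  (let ((pos (and (plusp (length old-text))
                  (search old-text text))))
    (cond ((null pos)
           (values nil :not-found))
          ;; A second match means the edit target is ambiguous.
          ((search old-text text :start2 (1+ pos))
           (values nil :ambiguous))
          (t
           (values (concatenate 'string
                                (subseq text 0 pos)
                                new-text
                                (subseq text (+ pos (length old-text))))
                   :ok)))))

(defun read-file-text (path)
  "Return the contents of the file at PATH as a string."
  (with-open-file (in path :direction :input)
    (let* ((buffer (make-string (file-length in)))
           (count (read-sequence buffer in)))
      ;; COUNT can be smaller than the byte length for multi-byte encodings.
      (subseq buffer 0 count))))

(defun edit-file-surgically (path old-text new-text)
  "Apply the replacement to the file at PATH and verify it by re-reading.
Return (values success-p status)."
  (multiple-value-bind (updated status)
      (replace-exactly-once (read-file-text path) old-text new-text)
    (when updated
      (with-open-file (out path :direction :output :if-exists :supersede)
        (write-string updated out))
      ;; Step 5: confirm that what is on disk is what we meant to write.
      (unless (string= (read-file-text path) updated)
        (setf status :verify-failed)))
    (values (eq status :ok) status)))
#+end_src

Keeping the string transformation separate from the file I/O makes the core logic testable without touching the file system.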

*** Self-fix (error diagnosis and repair)
1. Accept (=skill-name=, =error-log=) and diagnose the failure
2. Parse the error log for: syntax errors (unmatched parens, invalid forms),
   undefined symbol references, semantic issues (prohibited forms)
3. For syntax errors: locate the problematic region, propose a correction
   using structural Lisp knowledge
4. For undefined references: check whether the symbol exists in another package
   or whether the skill's =#+DEPENDS_ON:= declaration is missing a dependency
5. For semantic issues: identify the prohibited operation and suggest alternatives
6. Invoke =self-improve-edit= to apply the fix
7. After repair, run the skill's tests if they exist; if tests pass, hot-reload
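
The diagnosis in step 2 could start from something like the sketch below. Both functions are hypothetical. =syntax-problem= relies on the standard reader to detect unbalanced forms (note that reading interns symbols in the current package), and =classify-error-log= matches on message fragments that are assumptions about what the error log contains, not a documented log format.

#+begin_src lisp
(defun syntax-problem (source)
  "Return NIL if every top-level form in SOURCE reads cleanly,
otherwise a keyword describing the failure."
  (let ((*read-eval* nil))            ; never evaluate #. forms while checking
    (handler-case
        (with-input-from-string (in source)
          ;; Use the stream itself as a unique end-of-input marker.
          (loop for form = (read in nil in)
                until (eq form in)
                finally (return nil)))
      (end-of-file () :unbalanced-open-paren)
      (reader-error () :reader-error)
      (error () :unreadable))))

(defun mentions-p (fragment error-log)
  "True if ERROR-LOG contains FRAGMENT, ignoring case."
  (search fragment error-log :test #'char-equal))

(defun classify-error-log (error-log)
  "Map ERROR-LOG to :syntax, :undefined-reference, :semantic, or :unknown.
The fragments below are assumed wording, chosen for illustration."
  (cond ((or (mentions-p "end of file" error-log)
             (mentions-p "unmatched" error-log))
         :syntax)
        ((or (mentions-p "undefined function" error-log)
             (mentions-p "unbound variable" error-log))
         :undefined-reference)
        ((mentions-p "prohibited" error-log)
         :semantic)
        (t :unknown)))
#+end_src

The classification result selects which of steps 3 to 5 applies before the fix is handed to =self-improve-edit=.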

*** Implementation approach
- Add an actual =:trigger= function that activates on =:ERROR= or =:STUCK= signal types
- =self-improve-edit=: use =uiop:read-file-string=, then replace via substring operations
  or =ppcre:regex-replace= (escaping =old-text= with =ppcre:quote-meta-chars= so it is
  matched literally), and write back with =with-open-file=
- =self-improve-fix=: add structural analysis in =programming-lisp.lisp= for error parsing
- Leverage the REPL skill for verification after repair (call =lisp-eval= on the fixed code block)

** Dependencies
=programming-lisp= (=lisp-structural-check=), =programming-org= (tangling),
=core-memory= (=snapshot-memory=), =core-skills= (jailed reload)

** Verification
A FiveAM test that creates a file with known content, calls =self-improve-edit=, and
asserts the replacement was applied. A second test creates a file containing a
deliberate error, calls =self-improve-fix=, and asserts the error was corrected.

* P1: system-event-orchestrator — Bootstrap Implementation

** Claimed status
v0.3.0 partially DONE ("hook-registry + cron-registry + tier classifier")

** Actual state
Hook/cron registries, tier dispatching, and heartbeat integration work. However,
=orchestrator-bootstrap= is a stub that only logs a message:
=(log-message "ORCHESTRATOR: Bootstrap complete")=

** What it should do

1. Scan the Memex =projects/= and =notes/= directories for Org files containing =#+HOOK:= keywords
2. For each =#+HOOK:= keyword found, call =orchestrator-register-hook= with
   the hook name and a gate function
3. For files with =#+CRON:= keywords (or cron expressions in timestamps),
   register them via =orchestrator-register-cron=
4. Log the count of registered hooks and cron jobs at completion
5. Run bootstrap once at startup (after memory is loaded but before cognitive loop begins)

*** Implementation approach
- Use =uiop:directory-files= with glob patterns for =*.org= files
- Use =org-element= from Emacs (via =emacs-bridge= or =org-eval= skill) for parsing,
  or implement a simple regex-based Org property parser in Lisp
- Scan each file's =#+HOOK:= / =#+CRON:= keyword lines and headline property drawers
  (=:HOOK:=, =:CRON:=), and collect their values
- Call existing =orchestrator-register-hook= / =orchestrator-register-cron=
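
The Lisp-native parsing option could be as small as the sketch below, which handles only file-level =#+HOOK:= and =#+CRON:= lines. The names =org-keyword-values= and =bootstrap-from-text= are hypothetical. Because the argument lists of =orchestrator-register-hook= and =orchestrator-register-cron= are not specified here, the registration functions are passed in as parameters rather than called directly.

#+begin_src lisp
(defun org-keyword-values (text keyword)
  "Return the value of every #+KEYWORD: line in TEXT, in file order.
Matching is case-insensitive, as Org keywords are."
  (let ((prefix (concatenate 'string "#+" keyword ":")))
    (with-input-from-string (in text)
      (loop for line = (read-line in nil)
            while line
            append (let ((trimmed (string-left-trim '(#\Space #\Tab) line)))
                     (when (and (>= (length trimmed) (length prefix))
                                (string-equal prefix trimmed
                                              :end2 (length prefix)))
                       (list (string-trim '(#\Space #\Tab #\Return)
                                          (subseq trimmed (length prefix))))))))))

(defun bootstrap-from-text (text &key register-hook register-cron)
  "Register every hook and cron value found in TEXT and return the counts.
REGISTER-HOOK and REGISTER-CRON are one-argument functions supplied by the caller."
  (let ((hooks (org-keyword-values text "HOOK"))
        (crons (org-keyword-values text "CRON")))
    (dolist (hook hooks) (funcall register-hook hook))
    (dolist (cron crons) (funcall register-cron cron))
    (list :hooks (length hooks) :crons (length crons))))
#+end_src

The returned plist gives the counts that step 4 asks to be logged when bootstrap completes.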

** Dependencies
=programming-org= (Org file parsing), file system access

** Verification
Create a test Org file with =#+HOOK: on-write=, run bootstrap, and assert that the
hook registry contains the expected entry.

* P1: system-memory — Memory Introspection

** Claimed status
The skill exists but was never part of a version milestone.

** Actual state
=memory-inspect= is a no-op: =(log-message "MEMORY: Self-inspection triggered.")=.
The =:trigger= is =nil=, so the skill never activates.

** What it should do

1. Return a structured report of memory state:
   - Total objects in =*memory-store*=
   - Distribution by type (=:HEADLINE=, =:PARAGRAPH=, etc.)
   - Distribution by =:TODO-STATE= (=TODO=, =NEXT=, =DONE=, etc.)
   - Count of privacy-filtered objects
   - Most recent objects (by =:version= timestamp)
   - Current snapshot count and timestamps
   - Orphaned objects (parent-id references a deleted ID)
2. Accept an optional filter to narrow the report (by type, by tag, by time range)
3. Wire the trigger to activate on =:INTROSPECTION= signal type or =/memory= commands

*** Implementation approach
- Iterate =*memory-store*= with =maphash=, collect statistics
- Add to skill trigger: =(eq (getf (getf ctx :payload) :sensor) :introspection)=
- Return results as a plist that can be rendered in the TUI
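
The =maphash= pass could be sketched as follows. Since the real =memory-object= definition is not shown in this plan, the example uses a stand-in struct, =sketch-object=, with assumed slot names, and a hash table keyed by string IDs in place of =*memory-store*=. Only the total, the two distributions, and the orphan list are computed.

#+begin_src lisp
;; Stand-in for the real memory-object struct; the slot names are assumptions.
(defstruct sketch-object id type todo-state parent-id)

(defun table->alist (table)
  "Return the entries of TABLE as an alist (order unspecified)."
  (let ((pairs '()))
    (maphash (lambda (key value) (push (cons key value) pairs)) table)
    pairs))

(defun inspect-store (store)
  "Return a plist report for STORE, a hash table mapping ID to object."
  (let ((by-type (make-hash-table))
        (by-todo (make-hash-table :test #'equal))
        (orphans '()))
    (maphash
     (lambda (id object)
       (incf (gethash (sketch-object-type object) by-type 0))
       (let ((todo (sketch-object-todo-state object)))
         (when todo
           (incf (gethash todo by-todo 0))))
       ;; An orphan names a parent that is no longer in the store.
       (let ((parent (sketch-object-parent-id object)))
         (when (and parent (not (nth-value 1 (gethash parent store))))
           (push id orphans))))
     store)
    (list :total (hash-table-count store)
          :by-type (table->alist by-type)
          :by-todo-state (table->alist by-todo)
          :orphans (sort orphans #'string<))))
#+end_src

Returning a plist keeps the report easy to render in the TUI and easy to assert against in a FiveAM test.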

** Dependencies
=core-memory= (=memory-store= and the =memory-object= struct)

** Verification
Ingest known objects, call =memory-inspect=, and assert that the type counts and
object counts match.

* P2: core-context — Semantic Retrieval (Embeddings)

** Claimed status
The foveal-peripheral model is implemented and tested, but the embedding pipeline
that feeds it is listed as TODO for v0.3.0.

** Actual state
The context rendering code (=context-object-render=) computes =cosine-similarity=
correctly, but =org-object-vector= is never populated. All objects have =nil=
vectors, all similarities are =0.0=, and the model falls back to "include
everything within depth 2." This is functionally equivalent to no retrieval at all.

** What it should do

1. Add a =populate-vector= function to =core-memory= that calls an embedding
   provider and stores the result in the =memory-object= =:vector= slot
2. At ingest time (=ingest-ast=), generate embeddings for new objects
3. Embedding provider options (in priority order):
   - Ollama (local, =nomic-embed-text= or =mxbai-embed-large=)
   - OpenAI-compatible embedding API (=text-embedding-3-small=)
   - Fallback: TF-IDF bag-of-words vector (no external dependency)
4. Updates: when =memory-object= content changes, mark =:vector= as =:pending=
   and process in a background batch via the event orchestrator
5. Add an environment variable =EMBEDDING_PROVIDER= with default =ollama=

*** Implementation approach
- Add an =:embedding-provider= function stored in =*config*=
- =embed-object=: take content string → call provider → store float vector
- Modify =ingest-ast= to call =embed-object= on each new object
- Add batch processing in =system-event-orchestrator= for vector updates
- Use =bordeaux-threads= with a lock for async embedding generation
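
For the dependency-free fallback, a rough sketch is shown below. It is a simplification of the TF-IDF option: it hashes raw term counts into a fixed-size vector and omits the inverse-document-frequency weighting. All function names are hypothetical, the 64-dimension default is arbitrary, and =cosine= is a local stand-in rather than the project's =cosine-similarity=.

#+begin_src lisp
(defun tokenize (text)
  "Split TEXT into lowercase alphanumeric tokens."
  (let ((tokens '())
        (current '()))
    (flet ((flush ()
             (when current
               (push (coerce (nreverse current) 'string) tokens)
               (setf current '()))))
      (loop for ch across (string-downcase text)
            do (if (alphanumericp ch)
                   (push ch current)
                   (flush)))
      (flush))
    (nreverse tokens)))

(defun bag-of-words-vector (text &optional (dimensions 64))
  "Hash each token of TEXT into one of DIMENSIONS buckets and count it."
  (let ((vec (make-array dimensions :element-type 'single-float
                                    :initial-element 0.0)))
    (dolist (token (tokenize text) vec)
      (incf (aref vec (mod (sxhash token) dimensions)) 1.0))))

(defun cosine (a b)
  "Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."
  (let ((dot 0.0) (norm-a 0.0) (norm-b 0.0))
    (loop for x across a
          for y across b
          do (incf dot (* x y))
             (incf norm-a (* x x))
             (incf norm-b (* y y)))
    (if (or (zerop norm-a) (zerop norm-b))
        0.0
        (/ dot (sqrt (* norm-a norm-b))))))

(defun embed-text (text &key (provider #'bag-of-words-vector))
  "Call PROVIDER on TEXT. PROVIDER stands in for the configured embedding provider."
  (funcall provider text))
#+end_src

Passing the provider as a function mirrors the =:embedding-provider= idea above: an Ollama or API-backed provider could replace the fallback without changing callers. Vectors from this fallback capture only word overlap, so a similarity threshold tuned for a neural embedding model would not carry over to it.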

** Dependencies
External embedding provider (Ollama or API), =core-memory= (vector slot)

** Verification
Create objects with content, run the embedding pipeline, and assert that vectors are
non-nil and have the correct dimensionality. Verify that =cosine-similarity=
between semantically similar objects exceeds a 0.75 threshold.

* P2: core-context — Subtree-Based Skill Source Loading

** Claimed status
DESIGN_DECISIONS §"Org-Mode as Unified AST" describes: "When the agent needs
information about the =openctl-db= function, it queries for the =openctl-db=
subtree specifically."

** Actual state
=context-skill-source= reads the ENTIRE Org file as a string via
=uiop:read-file-string=. No subtree query exists.

** What it should do

1. Add a =context-skill-subtree= function that takes (=skill-name=, =heading-name=)
   and returns only the content under that headline
2. Add a =context-skill-function-signature= function that returns only the function
   name, lambda list, and docstring
3. Add a =context-skill-tests= function that returns only test blocks
4. Modify =context-skill-source= to optionally accept a =:subtree= keyword argument
5. If an Org-element parser is available, use it for structural queries;
   otherwise fall back to regex-based headline matching

*** Implementation approach
- Use =org-element= via =org-eval= skill (REPL bridge to Emacs) if available
- Lisp-native fallback: parse Org headlines with a regex (=^\*+[ ]= pattern),
  match the heading name by string comparison, and extract content until the next
  headline of equal or higher level (the same number of stars or fewer)
- Cache parsed results to avoid re-parsing on repeated queries
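
The Lisp-native fallback could be sketched without a regex library at all, as below. =headline-level= and =org-subtree= are hypothetical names. The match is against the full headline text after the stars, so TODO keywords, priorities, and tags would need to be stripped in a fuller version, and headline-like lines inside source blocks are not special-cased.

#+begin_src lisp
(defun headline-level (line)
  "Return the number of leading stars if LINE is an Org headline, else NIL."
  (let ((stars (position-if (lambda (ch) (char/= ch #\*)) line)))
    ;; A headline is one or more stars followed by a space.
    (when (and stars
               (plusp stars)
               (char= (char line stars) #\Space))
      stars)))

(defun org-subtree (text heading-name)
  "Return the subtree of TEXT whose headline is HEADING-NAME, or NIL.
The subtree runs until the next headline with the same number of stars or fewer."
  (with-input-from-string (in text)
    (let ((collecting nil)
          (target-level nil)
          (lines '()))
      (loop for line = (read-line in nil)
            while line
            do (let ((level (headline-level line)))
                 (cond ((and collecting level (<= level target-level))
                        (return))       ; reached a sibling or an ancestor
                       (collecting
                        (push line lines))
                       ((and level
                             (string= heading-name
                                      (string-trim " " (subseq line level))))
                        (setf collecting t
                              target-level level)
                        (push line lines)))))
      (when lines
        (format nil "~{~A~%~}" (nreverse lines))))))
#+end_src

A =context-skill-subtree= built on this would read the skill's Org file once, call =org-subtree= with the requested heading, and cache the result per file and heading.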

** Dependencies
=programming-org= (Org parsing utilities), =emacs-bridge= (if Emacs Org-element is preferred)

** Verification
Create a test Org file with multiple headlines, query for a specific subtree, and
assert that only that subtree's content is returned.

* Priority and Sequencing

The remediation should proceed in this order:

1. *system-event-orchestrator bootstrap* (P1): needed as infrastructure for Scribe/Gardener cron scheduling
2. *system-archivist* (P0): depends on orchestrator for cron scheduling
3. *system-self-improve* (P0): independent, can proceed in parallel with #2
4. *core-context embeddings* (P2): independent, unlocks semantic retrieval
5. *core-context subtree loading* (P2): independent, improves context efficiency
6. *system-memory inspect* (P1): lowest priority, nice-to-have introspection

P0 items must be completed before v0.3.0 development begins. P1 items should be
completed before v0.3.0 is released. P2 items can extend into early v0.3.0.

* Out of Scope

Features listed as TODO in the ROADMAP for v0.3.0+ are NOT in this remediation
plan. Specifically excluded:

- HITL continuation-based suspension (v0.3.0 TODO)
- Model-tier routing / cost optimization (v0.3.0 TODO)
- Memory scope segmentation (v0.3.0 TODO)
- Long-horizon planning / task trees (v0.4.0 TODO)
- Shadow simulation mode (not on roadmap, aspirational)
- Formal verification of dispatcher rules (not on roadmap, aspirational)
- Bouncer rule learning from HITL decisions (not on roadmap, aspirational)