v0.8.2: cleanup + prose + structure + decomposition + budget + errors

Phase 1 — dedup + hardening (~9 items):
- Remove duplicate *skill-registry* defvar from core-skills
- Merge *backend-registry* into *probabilistic-backends*, delete backend-register
- Remove inject-stimulus alias, standardize on stimulus-inject
- Add pre-eval sandbox (skill-source-scan) that blocks restricted symbols before eval
- Remove dead plist-get function; remove duplicate json-alist-to-plist export
- Fix read-framed-message whitespace DoS (4096-iteration max)
- Add *read-eval* nil to dispatcher-approvals-process read-from-string (RCE)
- Add test-op to ASDF; update .asd version 0.4.3→0.7.2

Phase 2 — prose + contracts + reorder:
- Split ROADMAP: 2623→1089 lines (TODO only), CHANGELOG: 260→1528 lines (full DONE history, 14 versions reverse chron)
- Add Contracts + Overview to 6 channel files + embedding-native + programming-standards + symbolic-scope
- Reorder 28 .org files: Contract → Test Suite → Implementation (TDD order)
- Add 7-phase inline prose to think() in core-reason
- Expand USER_MANUAL: 183→461 lines (10 new sections)

Phase 3 — decomposition + export organization:
- Decompose think() into think-assemble-prompt, think-call-llm, think-parse-response + orchestrator
- Organize 188 exports into 16 grouped sections by module

Phase 4 — budget enforcement + error protocol:
- Per-session budget enforcement (SESSION_BUDGET_USD env var, budget-exhausted-p, guard in think-call-llm)
- Error condition hierarchy (6 conditions: pipeline-error, llm-error, gate-error, budget-error, protocol-error)
- Restarts in loop-process: skip-signal, use-fallback, abort-pipeline
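
A minimal sketch of how the Phase 4 pieces could fit together, assuming all listed conditions inherit from pipeline-error. The condition and restart names come from the list above. The parent/child layout, the `detail` slot, the two-argument signature of budget-exhausted-p, and the helpers `call-llm-guarded` and `process-signal` are hypothetical and are not taken from the commit.

```lisp
;; Assumed layout: one root condition with the others as direct children.
(define-condition pipeline-error (error)
  ((detail :initarg :detail :initform nil :reader pipeline-error-detail))
  (:report (lambda (c stream)
             (format stream "~A: ~A" (type-of c) (pipeline-error-detail c)))))

(define-condition llm-error (pipeline-error) ())
(define-condition gate-error (pipeline-error) ())
(define-condition budget-error (pipeline-error) ())
(define-condition protocol-error (pipeline-error) ())

(defun budget-exhausted-p (spent-usd budget-usd)
  "True when a budget is set and spending has reached it (assumed signature)."
  (and budget-usd (>= spent-usd budget-usd)))

(defun call-llm-guarded (prompt spent-usd budget-usd)
  "Hypothetical stand-in for the guard in think-call-llm."
  (when (budget-exhausted-p spent-usd budget-usd)
    (error 'budget-error :detail "session budget exhausted"))
  ;; Placeholder for a real LLM call.
  (format nil "response to ~A" prompt))

(defun process-signal (prompt spent-usd budget-usd)
  "Run one step and offer the three restarts named above."
  (restart-case (call-llm-guarded prompt spent-usd budget-usd)
    (skip-signal () :skipped)            ; drop this signal and continue
    (use-fallback (value) value)         ; substitute a caller-supplied value
    (abort-pipeline () :aborted)))       ; simplified: just report the abort

;; A caller chooses the recovery policy without unwinding first.
(defun process-with-fallback (prompt spent-usd budget-usd fallback)
  (handler-bind ((budget-error
                   (lambda (c)
                     (declare (ignore c))
                     (invoke-restart 'use-fallback fallback))))
    (process-signal prompt spent-usd budget-usd)))
```

Keeping the restarts at the call site and the policy in a `handler-bind` lets the same loop be driven with different recovery choices per caller.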
2026-05-10 09:07:44 -04:00
parent 27d203ad67
commit 8fd56dece3
68 changed files with 7014 additions and 6521 deletions


@@ -77,95 +77,18 @@ The Diagnostics skill is the self-knowledge of Passepartout. It answers
2. The ~** Contract~ section MUST list every public function.
3. Every test in ~* Test Suite~ MUST reference a specific Contract item.
4. If you change a function's signature, you MUST update its Contract item.
5. These files are excluded (no defuns): ~core-manifest.org~, ~setup.org~.
6. **NO-HARDCODED-CONSTANTS**: All configurable values (thresholds, intervals,
paths, limits, counters) MUST be read from environment variables with a
documented default in ~.env.example~. No magic numbers, no hardcoded
string literals in function bodies for any value a user might need to
change. The user owns their configuration — they change it in ~.env~, not
in the source code. Exceptions: internal implementation details that are
never user-facing (hash-table sizes, buffer capacity limits, loop
iteration caps) may live in source. But if the value controls *behavior*
(how many approvals before a rule, what similarity threshold gates
context, how long a shell command runs before timeout), it lives
in ~.env~ with a fallback default.
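The rule above can be sketched as a small accessor that reads a value from the environment and falls back to a documented default. This is an illustration only: ~env-integer~, ~shell-timeout-seconds~, and the variable name ~SHELL_TIMEOUT_SECONDS~ are hypothetical, and the sketch assumes UIOP is available (it ships with ASDF).
#+begin_src lisp
(require :asdf) ; UIOP is bundled with ASDF

(defun env-integer (name default)
  "Return the integer value of environment variable NAME, or DEFAULT
when the variable is unset, empty, or not a valid integer."
  (let ((raw (uiop:getenv name)))
    (or (and raw (ignore-errors (parse-integer raw)))
        default)))

;; Hypothetical call site: the default 30 would also be documented
;; in .env.example, so the function body carries no magic number
;; that the user cannot override.
(defun shell-timeout-seconds ()
  (env-integer "SHELL_TIMEOUT_SECONDS" 30))
#+end_src
Reading the variable at call time rather than at load time means a changed ~.env~ takes effect without recompiling, at the cost of one lookup per call.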
** Engineering Lifecycle (Two-Track)
** Contract
The canonical workflow. Two tracks, not to be confused:
The standards skill itself guarantees:
*** Track 1 — Org-First: Prose, Tests, Thinking (Phases 0/A)
This track stays in Org. No code is written yet.
**** Phase 0: Exploration & Documentation
1. Read the relevant Org source files for context
2. Explore the problem in the running REPL with ~repl-inspect~ and ~repl-eval~
3. Document findings in Org prose
4. If a bug: document investigation in Org before fixing (Org as thinking medium)
**** Phase A: Test-First Design
1. Write the success criteria as Contract items in the ~** Contract~ section
2. Write the FiveAM test in the ~* Test Suite~ section at the bottom of the file, with a comment referencing which Contract item it verifies. Tests are embedded — no ~:tangle ../tests/...~ override.
3. Tangle and evaluate in the REPL — confirm it fails (red)
4. The failing test is the success criteria. Do not proceed to Track 2 until it exists and is red.
*** Track 2 — REPL-First: Implementation, Iteration, Reflection (Phases B/C/D/E)
Code is prototyped in the REPL, never written directly into Org first.
**** Phase B/C: REPL Implementation
1. Write the function directly in the REPL using ~repl-eval~
2. Iterate: evaluate, inspect, fix, re-evaluate — the image accumulates state
3. Run the test in the REPL — confirm green
4. Explore edge cases with ~repl-inspect~ and ad-hoc evaluations
5. Before writing any ~defun~ in an Org block, verify it was prototyped and tested in the REPL first
**** Phase D: Chaos Verification
Run the appropriate chaos tier before reflecting code back to Org:
- *Tier 1 (Deterministic)*: Full FiveAM test suite — required on every change
- *Tier 2 (Probabilistic)*: Randomized fuzzing — required on every major release
- *Tier 3 (Stress)*: Load and resource starvation — required during hardening sprints
**** Phase E: Reflect Back to Org
1. Copy the working function into its own ~#+begin_src lisp~ block in the Org file
2. Update the prose to match what the function actually does (arguments, return, rationale)
3. Before closing Phase E, run ~(lisp-validate (uiop:read-file-string "path/to/file.lisp") :strict t)~ in the REPL — never external scripts or manual paren-counting
4. Verify the Org file tangles correctly
5. Tangle, commit, update GTD
**** Syntax Error Protocol
If a LOADER ERROR or reader-error occurs:
1. Run ~(lisp-validate (uiop:read-file-string "file.lisp") :strict t)~ in the REPL — never Python, never grep, never manual counting
2. Fix the error in the Org file (since the code was prototyped in REPL first, this should be rare)
3. Retangle and re-evaluate
Rationale: The two tracks prevent the two failure modes we have observed. Writing implementation code directly in Org (without REPL prototyping) produces syntax errors that require external tools to debug. Skipping Org-first test writing produces code without verified success criteria. The split is not bureaucratic — it is the mechanism by which both failures are prevented.
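The Syntax Error Protocol relies on the project's ~lisp-validate~, whose implementation is not shown here. As an independent illustration of why an in-image check is enough, the following sketch uses only the standard reader to test whether every form in a string can be read. The name ~readable-forms-p~ is hypothetical, and this is not a substitute for ~lisp-validate~.
#+begin_src lisp
(defun readable-forms-p (code)
  "Return T when every top-level form in the string CODE can be read.
Otherwise return NIL and the signalled condition as a second value.
Note: READ interns symbols in the current package, and a reference to
an unknown package counts as a failure."
  (let ((*read-eval* nil)            ; never execute #. forms while checking
        (eof (gensym "EOF")))
    (handler-case
        (with-input-from-string (in code)
          (loop for form = (read in nil eof)
                until (eq form eof)
                finally (return t)))
      (error (c) (values nil c)))))
#+end_src
An unbalanced form such as ~"(defun f (x) x"~ hits end of file inside the list, which the reader signals as an error, so the function returns NIL with the condition available for inspection.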
** GTD Conventions
Every task headline in the project's ROADMAP.org and gtd.org follows these rules:
1. **:ID:** — generated by ~memory-id-generate~ (UUIDv4 with ~id-~ prefix), never written manually. Use ~(memory-id-generate)~ in the REPL to produce one.
2. **:CREATED:** — ISO-8601 timestamp: ~[2026-05-02 Sat 14:30]~. Set when the headline is first created, never changed.
3. **:LOGBOOK:** — each state transition is logged: ~- State "DONE" from "TODO" [2026-05-02 Sat 15:00]~.
4. **CLOSED:** — set when the task reaches DONE: ~CLOSED: [2026-05-02 Sat 15:00]~.
5. **TODO keywords** follow the standard sequence: ~TODO~ → ~NEXT~ → ~IN-PROGRESS~ → ~DONE~ / ~BLOCKED~ / ~CANCELLED~.
6. **The Agent** updates these automatically during Phase E of the lifecycle. The human never needs to write a UUID or timestamp manually — the agent generates and inserts them.
Example:
#+begin_src org
*** DONE Event Orchestrator
:PROPERTIES:
:ID: id-4a2b9c8f-3d7e-4f12-a9b0-1c2d3e4f5a6b
:CREATED: [2026-05-02 Sat]
:END:
:LOGBOOK:
- State "DONE" from "TODO" [2026-05-02 Sat 18:00]
:END:
CLOSED: [2026-05-02 Sat 18:00]
#+end_src
1. (standards-git-clean-p dir): checks whether directory ~dir~ has
uncommitted git changes. Returns T if clean, NIL if dirty. Runs
~git status --porcelain~ in the target directory.
2. (standards-lisp-verify code): validates Lisp code string for
structural correctness. Delegates to ~lisp-syntax-validate~.
3. (standards-lisp-format code): applies formatting conventions to
Lisp code. Delegates to ~lisp-format~.
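Contract item 1 can be sketched as follows, assuming UIOP is available and ~git~ is on the PATH. The names ~porcelain-output-clean-p~ and ~git-clean-sketch-p~ are hypothetical stand-ins, not the skill's actual code. The sketch uses ~git -C~ to select the directory.
#+begin_src lisp
(require :asdf) ; UIOP is bundled with ASDF

(defun porcelain-output-clean-p (output)
  "Return T when OUTPUT, the text printed by `git status --porcelain`,
lists no changes (empty or whitespace only)."
  (null (find-if-not (lambda (ch)
                       (member ch '(#\Space #\Tab #\Newline #\Return)))
                     output)))

(defun git-clean-sketch-p (dir)
  "Return T when the work tree at DIR has no uncommitted changes, else NIL.
Signals an error if git exits non-zero (for example, DIR is not a repository)."
  (porcelain-output-clean-p
   (uiop:run-program (list "git" "-C" (namestring dir)
                           "status" "--porcelain")
                     :output :string)))
#+end_src
Splitting the output check from the subprocess call keeps the predicate testable without a repository on disk.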
* Implementation