v0.7.2: agent identity injection (CONFIG section) — TDD

Live config section injected into system prompt between time and IDENTITY. assemble-config-section reads *provider-cascade*, tokenizer-context-limit, gate count, and *hitl-pending* at each think() call. fboundp-guarded. Tested.

- core-reason: assemble-config-section fn, config-section binding, injected into all 3 prompt assembly paths
- Reason tests: +4 checks (Passepartout, version, gates)
2026-05-08 16:48:10 -04:00
parent 26fd756222
commit ca44136a55
3 changed files with 96 additions and 16 deletions
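
The config-section assembly described in the commit message can be sketched roughly as follows. This is a language-neutral illustration in Python, not the project's Lisp code: the `Readers` registry, the label strings, and `assemble_system_prompt` are hypothetical names, and a missing or `None` entry stands in for a function that is not fbound.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry standing in for the Lisp image: a missing or None
# entry plays the role of a function that is not fbound.
Readers = Dict[str, Optional[Callable[[], object]]]

# Assumed labels for the four values named in the commit message.
LABELS = ("provider-cascade", "tokenizer-context-limit", "gate-count", "hitl-pending")


def assemble_config_section(readers: Readers) -> str:
    """Build the CONFIG block from whichever readers are currently defined."""
    lines: List[str] = ["CONFIG"]
    for label in LABELS:
        reader = readers.get(label)
        if not callable(reader):
            # Analogue of the fboundp guard: skip the entry instead of failing.
            continue
        lines.append(f"{label}: {reader()}")
    return "\n".join(lines)


def assemble_system_prompt(time_section: str, readers: Readers, identity_section: str) -> str:
    """Place a freshly assembled CONFIG block between time and IDENTITY."""
    return "\n\n".join([time_section, assemble_config_section(readers), identity_section])
```

Calling `assemble_system_prompt` on every `think()` invocation, rather than caching its result, is what would keep the section live in this sketch.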
--- a/docs/ROADMAP.org
+++ b/docs/ROADMAP.org
@@ -1701,6 +1701,24 @@ Passepartout's image-based Lisp model enables hot-reload — redefine a function
 - On compile error: keep the old version loaded, log the error, show TUI warning: ~"✗ Skill 'skill-name' failed to compile — old version retained."~
 ~80 lines in a new ~symbolic-file-watch.org~ skill.

+*** TODO Heavy thinking skill — parallel reasoning + sequential deliberation
+:PROPERTIES:
+:ID:       id-v082-heavy-thinking
+:CREATED:  [2026-05-08 Fri]
+:END:
+
+The HeavySkill paper (arXiv:2605.02396v1) demonstrates that a two-stage pipeline (K independent reasoning trajectories followed by a critical deliberation step) consistently outperforms majority voting and approaches Pass@K. The authors distill it into a readable skill file that works across any agent harness. Passepartout's Merkle tree makes this auditable, rewindable, and cross-session comparable.
+
+- New skill: ~org/heavy-thinking.org~ — a readable skill document loaded at startup. The agent follows a defined protocol when facing complex reasoning tasks:
+  1. *Activation*: triggers when the complexity classifier detects a STEM/reasoning/code-generation task. Dormant for simple factual queries or casual conversation
+  2. *Parallel reasoning*: spawns K independent ~think()~ calls (default K=3, ~HEAVY_THINKING_WIDTH~ env var). Each call solves the same problem from scratch without access to other trajectories. Encourages diverse strategies
+  3. *Sequential deliberation*: a second model call reads all K trajectories (pruned to essential thinking content to stay under context budget). Critically evaluates each — not voting, but re-reasoning. Produces a synthesized final answer with a deliberation trace: "Trajectories 1,3 converged on answer X. Trajectory 2 had error Y. Synthesized answer: X."
+  4. *Output*: returns the synthesized answer with ~[Heavy-thinking: 3 parallel, 1 deliberate]~ annotation in the response metadata
+- Merkle advantage: each trajectory is stored as a content-addressed node. The deliberation trace is permanent and auditable — users can see WHY one answer was chosen
+- Iterative deliberation optional (capped at 2 — the paper shows iterations 3+ degrade HP@K)
+- Cost model: 3 parallel + 1 deliberation = 4 API calls for complex tasks (vs 1 normally). ~HEAVY_THINKING_COST_MULTIPLIER~ env var for cost-aware auto-activation
+About 100 lines as a skill (about 60 prompt template + about 40 orchestration in ~symbolic-heavy-thinking.org~).
+
 ** v0.8.3: Direction 3 — Adaptive Layout + Personality

 The TUI adapts to the terminal it's running in — full sidebar at ultrawide, compact at standard, minimal at narrow (phone/SSH). It has a personality: spinner style, relative timestamps, progress bars, live context help.
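
Returning to the heavy-thinking entry added in this hunk, its two-stage protocol can be sketched as below. This is a minimal Python illustration under stated assumptions, not the planned skill: `heavy_think`, the `think` and `deliberate` callables, and the returned dictionary layout are hypothetical, and SHA-256 is only one plausible way to content-address a trajectory. `HEAVY_THINKING_WIDTH` and the annotation format come from the roadmap text.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def heavy_think(
    problem: str,
    think: Callable[[str], str],
    deliberate: Callable[[str, List[str]], str],
    width: Optional[int] = None,
) -> Dict[str, object]:
    """Run K independent trajectories, then one deliberation pass over them."""
    if width is None:
        width = int(os.environ.get("HEAVY_THINKING_WIDTH", "3"))
    k = max(1, width)

    # Stage 1: K calls on the same problem, none of which sees the others.
    with ThreadPoolExecutor(max_workers=k) as pool:
        trajectories = list(pool.map(think, [problem] * k))

    # Assumed content addressing: hash each trajectory so it can be stored
    # and looked up later by its digest.
    node_ids = [hashlib.sha256(t.encode("utf-8")).hexdigest() for t in trajectories]

    # Stage 2: a single call that re-reasons over all trajectories (not a vote).
    answer = deliberate(problem, trajectories)

    return {
        "answer": answer,
        "trajectories": trajectories,
        "node_ids": node_ids,
        "api_calls": k + 1,  # K parallel calls plus 1 deliberation call
        "annotation": f"[Heavy-thinking: {k} parallel, 1 deliberate]",
    }
```

With the default width of 3 this makes four model calls per task, so a cost-aware activation check would presumably sit in front of `heavy_think` rather than inside it.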