feat(arch): finalize Universal Literate Note transition for all projects and skills

2026-03-31 16:14:37 -04:00
parent 1712b1e4a9
commit 70be8ab93e
79 changed files with 1606 additions and 417 deletions
--- a/notes/token-optimization.org
+++ b/notes/token-optimization.org
@@ -0,0 +1,55 @@
+#+TITLE: PROJECT: Token Optimization (Universal Literate Note)
+#+ID: project-token-optimization
+#+STARTUP: content
+#+FILETAGS: :strategy:token:optimization:cost:psf:
+
+* Overview
+The *Token Optimization* project defines the strategy and implementation for cost-effective LLM usage. It takes a multi-tier, multi-provider approach: complexity-based routing sends each task to the cheapest capable model, and context compression keeps prompts small, so inference costs fall without sacrificing reasoning capability.
+
+* Phase A: Demand (PRD)
+:PROPERTIES:
+:STATUS: FROZEN
+:END:
+
+** 1. Purpose
+Minimize LLM operational expenses while maintaining high-fidelity agentic performance.
+
+** 2. User Needs
+- *Multi-Tier Strategy:* Resolve each task with the cheapest model that meets its required intelligence threshold.
+- *Failover Resilience:* Fall back automatically along the provider chain (Gemini -> OpenRouter -> GPT-4o).
+- *Context Efficiency:* Use pruning and retrieval-augmented generation (RAG) to avoid token bloat.
+- *Usage Transparency:* Track usage in real time and raise budget alerts.
+
+** 3. Success Criteria
+*** TODO 80% of queries handled by Tier 1 (Free/Fast) models
+*** TODO Automated fallback triggers on rate limits
+*** TODO Context compression reduces average prompt size by 30%
+*** TODO Budget alerts active at 80% threshold
+
+* Phase B: Blueprint (PROTOCOL)
+:PROPERTIES:
+:STATUS: SIGNED
+:END:
+
+** 1. Architectural Intent
+This phase defines the interfaces for dynamic model selection and cost-aware request routing. The source of truth is the =openclaw.json= configuration together with real-time provider telemetry.
+
+** 2. Semantic Interfaces
+#+begin_src lisp
+(defun token-resolve-model (task-complexity)
+  "Selects the optimal model tier based on task metadata.")
+
+(defun token-compress-context (raw-context)
+  "Applies pruning heuristics to reduce token count.")
+#+end_src
+
+* Phase D: Build (Implementation)
+Implementation consists of configuration and routing logic located in =projects/token-optimization/=.
+
+** Routing Logic
+#+begin_src lisp
+;; Stub: complexity-based routing logic goes here.
+#+end_src
+
+* Phase E: Chaos (Verification)
+Verification consists of A/B testing model choices and simulating rate limits to confirm that the fallback chain engages correctly.
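
The routing stub in Phase D and the fallback and budget requirements in Phase A are still unimplemented in this commit. As a rough illustration only, they might be sketched as below. Everything beyond the `token-resolve-model` signature and the Gemini -> OpenRouter -> GPT-4o order is an assumption: the 0 to 10 complexity scale, the tier symbols and their cut-offs, and the helper names `token-next-provider` and `token-budget-alert-p` are hypothetical and are not taken from `openclaw.json`.

```lisp
;; Hypothetical sketch. The complexity scale, tier cut-offs, and helper
;; names are assumptions for illustration, not values from openclaw.json.

(defvar token-fallback-chain '(gemini openrouter gpt-4o)
  "Provider order taken from the PRD fallback chain.")

(defun token-resolve-model (task-complexity)
  "Return a tier symbol for TASK-COMPLEXITY, an integer on an assumed 0-10 scale."
  (cond ((<= task-complexity 3) 'tier-1)   ; cheap, fast models
        ((<= task-complexity 7) 'tier-2)   ; mid-range models
        (t 'tier-3)))                      ; most capable models

(defun token-next-provider (failed-provider)
  "Return the provider that follows FAILED-PROVIDER in the chain, or nil at the end."
  (car (cdr (member failed-provider token-fallback-chain))))

(defun token-budget-alert-p (spent budget)
  "Return non-nil once SPENT has reached 80% of BUDGET."
  ;; Integer arithmetic avoids floating-point rounding at the threshold.
  (>= (* 100 spent) (* 80 budget)))
```

A rate-limit simulation for Phase E could then call `(token-next-provider 'gemini)` and check that it yields `openrouter`, and that the last provider in the chain yields nil so the caller knows to give up.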