chore: unify bold syntax to single asterisk in .org files and update legacy memex-amero references
This commit is contained in:
@@ -4,7 +4,7 @@
|
||||
#+FILETAGS: :strategy:token:optimization:cost:psf:
|
||||
|
||||
* Overview
|
||||
The **Token Optimization** project defines the strategy and implementation for cost-effective LLM usage. It implements a multi-tier, multi-provider approach to minimize inference costs while maximizing reasoning capability through smart routing and context compression.
|
||||
The *Token Optimization* project defines the strategy and implementation for cost-effective LLM usage. It implements a multi-tier, multi-provider approach to minimize inference costs while maximizing reasoning capability through smart routing and context compression.
|
||||
|
||||
* Phase A: Demand (PRD)
|
||||
:PROPERTIES:
|
||||
@@ -15,10 +15,10 @@ The **Token Optimization** project defines the strategy and implementation for c
|
||||
Minimize LLM operational expenses while maintaining high-fidelity agentic performance.
|
||||
|
||||
** 2. User Needs
|
||||
- **Multi-Tier Strategy:** Resolve tasks using the cheapest model that meets the required intelligence threshold.
|
||||
- **Failover Resilience:** Automated fallback chain (Gemini -> OpenRouter -> GPT-4o).
|
||||
- **Context Efficiency:** Implement pruning and RAG to avoid token bloat.
|
||||
- **Usage Transparency:** Real-time tracking and budget alerts.
|
||||
- *Multi-Tier Strategy:* Resolve tasks using the cheapest model that meets the required intelligence threshold.
|
||||
- *Failover Resilience:* Automated fallback chain (Gemini -> OpenRouter -> GPT-4o).
|
||||
- *Context Efficiency:* Implement pruning and RAG to avoid token bloat.
|
||||
- *Usage Transparency:* Real-time tracking and budget alerts.
|
||||
|
||||
** 3. Success Criteria
|
||||
*** TODO 80% of queries handled by Tier 1 (Free/Fast) models
|
||||
|
||||
Reference in New Issue
Block a user