67 lines
1.9 KiB
Org Mode
67 lines
1.9 KiB
Org Mode
#+TITLE: Token Management & Model Optimization Research
|
|
#+author: Amero Garcia
|
|
#+created: [2026-03-16 Mon 14:28]
|
|
#+DATE: 2026-03-04
|
|
#+FILETAGS: :research:token:optimization:models
|
|
|
|
* Token Management Strategy Research
|
|
|
|
** Initial Findings
|
|
|
|
*** OpenRouter Free Tier
|
|
- URL: https://openrouter.ai/collections/free-models
|
|
- Providers moving from free to paid-only models
|
|
- Belief: "Free models play crucial role in democratizing access"
|
|
|
|
*** Google AI Studio (Gemini)
|
|
- Free tier available
|
|
- Limits: 60 requests/minute, 300K tokens/day
|
|
- No credit card required
|
|
- Every API key gets these limits
|
|
|
|
** Research Questions
|
|
|
|
1. Which providers offer free or low-cost tiers?
|
|
2. What are the rate limits and quotas?
|
|
3. Which models are best for which use cases?
|
|
4. How to optimize context windows?
|
|
5. What is the cost per token breakdown?
|
|
|
|
** To Research Further
|
|
|
|
| Provider | Free Tier | Paid Tier | Best For |
|
|
|----------|-----------|-----------|----------|
|
|
| Google Gemini | 300K tokens/day | Pay per use? | General, coding |
|
|
| OpenRouter | Varies by model | Per-request | Routing, variety |
|
|
| OpenAI | ? | ? | GPT-4 quality |
|
|
| Anthropic | ? | ? | Claude capabilities |
|
|
| Mistral | ? | ? | Open weights |
|
|
| Local | Hardware cost | Free | Privacy, control |
|
|
|
|
** Token Optimization Strategies to Explore
|
|
|
|
1. *Tiered Model Usage*
|
|
- Simple tasks: Fast/cheap models
|
|
- Complex tasks: Stronger models
|
|
- Fallback: Lower tier if higher fails
|
|
|
|
2. *Context Compression*
|
|
- Summarize long contexts
|
|
- Use RAG instead of full context
|
|
- Prune old conversation
|
|
|
|
3. *Caching*
|
|
- Cache common responses
|
|
- Reuse embeddings
|
|
- Batch requests
|
|
|
|
4. *Hybrid Approach*
|
|
- Local models for simple queries
|
|
- Cloud APIs for complex tasks
|
|
- Manual review for critical outputs
|
|
|
|
** X Account Access
|
|
|
|
*Pending:* X account access via Google login
|
|
*Blocker:* Requires OTP from user per security rule (SOUL.md)
|
|
*Action needed:* User provides OTP, I complete OAuth, access bookmarks |