refactor: moved org-agent to its own repository as a submodule

2026-03-27 15:46:53 -04:00
parent 01f76a4570
commit b7e082c403
176 changed files with 19686 additions and 9665 deletions


@@ -0,0 +1,95 @@
#+TITLE: Learning From Failure: The PinchTab Security Incident
#+author: User
#+created: [2026-03-16 Mon 14:28]
#+DATE: 2026-03-08
#+FILETAGS: :failure:security:learning:
* The Failure
** What Happened
The user asked me to critically analyze three browser automation tools (PinchTab, Camofox, Unbrowse) and recommend the best path forward. Instead of performing a rigorous security analysis, I:
1. Accepted PinchTab's marketing claims at face value
2. Recommended installing a 12MB precompiled binary via `curl | bash`
3. Failed to verify: source code availability, signing/verification, supply chain integrity, security audits
4. Did not question the suspicious "stealth injection" terminology
5. Did not compare against verifiable open-source alternatives
** Why It Was Wrong
- A mystery binary from a relatively unknown publisher
- "Stealth" features imply modifying browser internals (a red flag for both ethics and detection)
- Multiple GitHub forks (ZEMLYANINYA, prayedbeto) suggest supply chain confusion
- No GPG signatures, no checksums, and no published security audit
- Full Chrome CDP access plus an HTTP API means complete browser control exposed over the network
- The same efficiency gains were achievable with the existing Playwright/CDP infrastructure
** What Should Have Happened
1. Verify the binary's source (Is it actually Go? Can I build it from source?)
2. Check for security audits, CVEs, and corporate backing
3. Question "stealth injection": what does it actually do? Is it ethical and legal?
4. Compare against established alternatives (browser-use, Playwright direct, ScrapeGraphAI)
5. Prefer auditable source code over mystery binaries
6. Document risk analysis before ANY security-sensitive recommendation
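
A minimal sketch of what the verification in step 1 could look like when a publisher does provide a SHA-256 checksum (PinchTab published none, so this would not have rescued that recommendation). The function names are hypothetical. A matching checksum only shows the download is the file the publisher listed; it says nothing about whether the publisher is trustworthy.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """True only if the file's digest equals the publisher's checksum."""
    actual = sha256_of(path)
    # Normalize whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

The intended use is to download to disk, call `matches_published_checksum`, and refuse to run anything on a mismatch, rather than piping the download straight into a shell.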
* Root Cause Analysis
** Cognitive Failures
- Pattern-matched to "efficiency" language without critical evaluation
- Failed to apply first-principles security analysis
- Did not recognize `curl | bash` as a major security anti-pattern
- Let enthusiasm for solution override due diligence
- Did not surface uncertainty ("I haven't verified this binary's provenance")
** System Failures
- No established security review checklist
- No mandatory "pause and verify" rule for executable recommendations
- No pattern for questioning suspicious terminology like "stealth"
- Failed to apply existing SOUL.md rule: "Think from first principles"
* The Correction
** Revised Recommendation
Instead of PinchTab (an unverified binary), choose one of:
1. Enhance existing OpenClaw browser tool with accessibility tree extraction (via Playwright Python)
2. Use browser-use (19k stars, MIT license, auditable Python source)
3. Use established Playwright directly with CDP enhancements
All achieve ~5x token efficiency without mystery binaries.
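
A rough sketch of the accessibility-tree idea behind option 1, assuming the browser tool can supply a snapshot as nested dicts with `role`, `name`, and `children` keys. That shape, the function name, and the set of skipped roles are all assumptions for illustration; the Playwright call that would produce the snapshot is deliberately left out. The sketch covers only the flattening step, which is where the token savings over raw HTML would come from.

```python
from typing import Any

# Roles treated as layout noise; this set is an illustrative assumption.
SKIPPED_ROLES = {"generic", "none", "presentation"}


def compact_tree(node: dict[str, Any], depth: int = 0) -> list[str]:
    """Flatten a nested accessibility snapshot into indented 'role "name"' lines."""
    role = node.get("role", "")
    name = (node.get("name") or "").strip()
    lines: list[str] = []
    child_depth = depth
    if role and role not in SKIPPED_ROLES:
        label = f'{role} "{name}"' if name else role
        lines.append("  " * depth + label)
        child_depth = depth + 1
    # Children of a skipped node are kept at the parent's depth.
    for child in node.get("children", []):
        lines.extend(compact_tree(child, child_depth))
    return lines


# Hypothetical snapshot of a small login page.
snapshot = {
    "role": "WebArea",
    "name": "Login",
    "children": [
        {"role": "generic", "children": [
            {"role": "textbox", "name": "Email"},
            {"role": "button", "name": "Sign in"},
        ]},
    ],
}
print("\n".join(compact_tree(snapshot)))
```

The wrapper node with role `generic` is dropped and its children are promoted, so the output has three lines: the page, the textbox, and the button.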
** Security Principles Established
1. *Never recommend executing unknown binaries*
2. *Verify provenance before trusting any tool*
3. *Prefer auditable source code over precompiled binaries*
4. *Question suspicious terminology* ("stealth", "injection", "undetectable")
5. *Document risk analysis* for security-sensitive recommendations
6. *Surface uncertainty* rather than feign confidence
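
Principle 4 and the `curl | bash` anti-pattern can be partly mechanized. The sketch below is an illustration, not an established tool: the function name and the regular expression are my own, and the term list is taken from the principle above. A hit is a prompt to ask questions, not a verdict, since a word like "injection" also appears in benign contexts.

```python
import re

# Terms taken from the principle above; extend as needed.
SUSPICIOUS_TERMS = ("stealth", "injection", "undetectable")

# Matches a download piped straight into a shell, e.g. `curl ... | bash`.
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b"
)


def red_flags(text: str) -> list[str]:
    """Return human-readable red flags found in a tool's docs or install notes."""
    lowered = text.lower()
    flags = [f"suspicious term: {term}" for term in SUSPICIOUS_TERMS if term in lowered]
    if PIPE_TO_SHELL.search(lowered):
        flags.append("install via pipe-to-shell")
    return flags
```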
* Integration Into Workflow
** For Future Tool Evaluations
- [ ] Is source code auditable?
- [ ] Who is the publisher? What's their reputation?
- [ ] Are there security audits? CVE history?
- [ ] How is it distributed? (curl | bash = red flag)
- [ ] What permissions does it require?
- [ ] Are there established alternatives with better provenance?
- [ ] Document risk analysis explicitly
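
One way to make these questions hard to skip is to record the answers in a structure that reports what is still unresolved. This is a sketch under assumptions: the class, its field names, and the crude pipe check are hypothetical, and the example values describe a made-up tool, not PinchTab.

```python
from dataclasses import dataclass


@dataclass
class ToolEvaluation:
    """Answers to the evaluation questions; field names are illustrative."""
    name: str
    source_auditable: bool
    publisher_known: bool
    security_audit_published: bool
    install_method: str  # e.g. "pip", "source build", "curl | bash"
    permissions: str     # free-text description of required access
    alternatives_considered: bool
    risk_analysis_documented: bool

    def open_concerns(self) -> list[str]:
        """List every checklist item that is still unresolved."""
        concerns = []
        if not self.source_auditable:
            concerns.append("source code is not auditable")
        if not self.publisher_known:
            concerns.append("publisher reputation is unknown")
        if not self.security_audit_published:
            concerns.append("no published security audit")
        # Crude check: any pipe in the install command counts as pipe-to-shell.
        if "|" in self.install_method:
            concerns.append("installed by piping a download into a shell")
        if not self.alternatives_considered:
            concerns.append("alternatives were not compared")
        if not self.risk_analysis_documented:
            concerns.append("risk analysis is not documented")
        return concerns


# Hypothetical tool used only to exercise the checklist.
example = ToolEvaluation(
    name="example-tool",
    source_auditable=False,
    publisher_known=False,
    security_audit_published=False,
    install_method="curl | bash",
    permissions="full browser control",
    alternatives_considered=True,
    risk_analysis_documented=False,
)
for concern in example.open_concerns():
    print(f"- {concern}")
```

A recommendation would go out only when `open_concerns()` is empty, or with the remaining concerns stated explicitly to the user.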
** For Security-Sensitive Recommendations
- State confidence level explicitly ("I have not verified this")
- Provide alternatives with different risk profiles
- Wait for user authorization before any executable recommendation
- Never assume "convenience" outweighs security
* Meta-Learning
** Habit Established
After every significant mistake:
1. Acknowledge failure specifically (what, why, impact)
2. Root cause analysis (cognitive + system failures)
3. Correction (what should have happened)
4. Integration (new rules/checklists)
5. Record in memex for future reference
** Verification
The user will check this document. The same pattern should be repeated for every significant failure.