refactor: moved org-agent to its own repository as a submodule

2026-03-27 15:46:53 -04:00
parent 01f76a4570
commit b7e082c403
176 changed files with 19686 additions and 9665 deletions
--- a/notes/learning-from-failure-pinchtab.org
+++ b/notes/learning-from-failure-pinchtab.org
@@ -0,0 +1,95 @@
+#+TITLE: Learning From Failure: The PinchTab Security Incident
+#+author: User
+#+created: [2026-03-16 Mon 14:28]
+#+DATE: 2026-03-08
+#+FILETAGS: :failure:security:learning:
+
+* The Failure
+
+** What Happened
+User asked me to critically analyze three browser automation tools (PinchTab, Camofox, Unbrowse) and recommend the best path forward. Instead of rigorous security analysis, I:
+
+1. Accepted PinchTab's marketing claims at face value
+2. Recommended installing a 12MB precompiled binary via `curl | bash`
+3. Failed to verify: source code availability, signing/verification, supply chain integrity, security audits
+4. Did not question the suspicious "stealth injection" terminology
+5. Did not compare against verifiable open-source alternatives
+
+** Why It Was Wrong
+- Mystery binary from a relatively unknown publisher
+- "Stealth" features imply modifying browser internals (red flag for both ethics and detection)
+- Multiple GitHub forks (ZEMLYANINYA, prayedbeto) suggest supply chain confusion
+- No GPG signatures, no checksums, no security audit published
+- Full Chrome CDP access + HTTP API = complete browser control over the network
+- Could have achieved the same efficiency gains via existing Playwright/CDP infrastructure
+
+** What Should Have Happened
+1. Verify binary source (is it actually Go? can I build from source?)
+2. Check for security audits, CVEs, corporate backing
+3. Question "stealth injection": what does it actually do? Is it ethical/legal?
+4. Compare against established alternatives (Browser-use, Playwright direct, ScrapeGraphAI)
+5. Prefer auditable source code over mystery binaries
+6. Document risk analysis before ANY security-sensitive recommendation
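+
+A minimal sketch of the integrity part of step 1, assuming the publisher provides a SHA-256 checksum through a channel separate from the download itself (PinchTab published none, which is already a failed check). The function names =sha256_of= and =matches_published_checksum= are hypothetical, chosen only for illustration. A matching digest shows the file is the one the publisher hashed; it says nothing about whether the publisher is trustworthy.
+

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare the local digest with a checksum obtained out of band.

    Assumption: published_hex is a hex string copied from the publisher's
    release notes or a signed checksum file, not from the download server.
    """
    actual = sha256_of(path)
    expected = published_hex.strip().lower()
    return hmac.compare_digest(actual, expected)
```

+
+The point of the sketch is ordering: download to disk, verify, and only then consider executing, which is exactly what =curl | bash= skips.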
+
+* Root Cause Analysis
+
+** Cognitive Failures
+- Pattern-matched to "efficiency" language without critical evaluation
+- Failed to apply first-principles security analysis
+- Did not recognize "curl | bash" as a major security anti-pattern
+- Let enthusiasm for the solution override due diligence
+- Did not surface uncertainty ("I haven't verified this binary's provenance")
+
+** System Failures
+- No established security review checklist
+- No mandatory "pause and verify" rule for executable recommendations
+- No pattern for questioning suspicious terminology like "stealth"
+- Failed to apply existing SOUL.md rule: "Think from first principles"
+
+* The Correction
+
+** Revised Recommendation
+Instead of PinchTab (unverified binary), choose one of:
+1. Enhance existing OpenClaw browser tool with accessibility tree extraction (via Playwright Python)
+2. Use browser-use (19k stars, MIT license, auditable Python source)
+3. Use established Playwright directly with CDP enhancements
+
+All achieve ~5x token efficiency without mystery binaries.
+
+** Security Principles Established
+1. *Never recommend executing unknown binaries*
+2. *Verify provenance before trusting any tool*
+3. *Prefer auditable source code over precompiled binaries*
+4. *Question suspicious terminology* ("stealth", "injection", "undetectable")
+5. *Document risk analysis* for security-sensitive recommendations
+6. *Surface uncertainty* rather than feign confidence
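+
+Principle 4 and the =curl | bash= concern can be partly mechanized. The following is a rough sketch, not a real scanner: the term list comes from the principle above, while the regular expression and the name =find_red_flags= are assumptions made for illustration. A hit means "stop and investigate", and an empty result does not mean a tool is safe.
+

```python
import re

# Terms taken from principle 4; illustrative, not exhaustive.
SUSPICIOUS_TERMS = ("stealth", "injection", "undetectable")

# Matches a download command piped straight into a shell,
# e.g. "curl ... | bash" or "wget ... | sudo sh".
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"
)


def find_red_flags(text: str) -> list[str]:
    """Return human-readable red flags found in a tool description."""
    lowered = text.lower()
    flags = [f"suspicious term: {term}" for term in SUSPICIOUS_TERMS if term in lowered]
    if PIPE_TO_SHELL.search(lowered):
        flags.append("install method: pipe to shell")
    return flags
```

+
+Running this over a README or install page before writing a recommendation would have surfaced both the "stealth injection" wording and the install method in this incident.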
+
+* Integration Into Workflow
+
+** For Future Tool Evaluations
+- [ ] Is source code auditable?
+- [ ] Who is the publisher? What's their reputation?
+- [ ] Are there security audits? CVE history?
+- [ ] How is it distributed? (curl | bash = red flag)
+- [ ] What permissions does it require?
+- [ ] Are there established alternatives with better provenance?
+- [ ] Document risk analysis explicitly
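+
+One way to make the questions above hard to skip is to record the answers in a small structure that refuses to report "ready" while any item is open. This is a sketch under assumptions: the short keys, the =ToolEvaluation= class, and its method names are hypothetical, and a boolean per question is a simplification of what a real review would capture.
+

```python
from dataclasses import dataclass, field

# Questions copied from the checklist; the short keys are hypothetical labels.
CHECKLIST = {
    "source_auditable": "Is source code auditable?",
    "publisher_known": "Who is the publisher? What's their reputation?",
    "audits_reviewed": "Are there security audits? CVE history?",
    "distribution_safe": "How is it distributed? (curl | bash = red flag)",
    "permissions_reviewed": "What permissions does it require?",
    "alternatives_compared": "Are there established alternatives with better provenance?",
    "risk_documented": "Document risk analysis explicitly",
}


@dataclass
class ToolEvaluation:
    tool: str
    # Maps a checklist key to True once that question is answered satisfactorily.
    answers: dict[str, bool] = field(default_factory=dict)

    def open_items(self) -> list[str]:
        """Return the checklist questions that are still unanswered or failed."""
        return [q for key, q in CHECKLIST.items() if not self.answers.get(key, False)]

    def ready_to_recommend(self) -> bool:
        """A tool is only ready when no checklist item remains open."""
        return not self.open_items()
```

+
+Unanswered questions default to open, so forgetting an item blocks the recommendation instead of silently passing it.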
+
+** For Security-Sensitive Recommendations
+- State confidence level explicitly ("I have not verified this")
+- Provide alternatives with different risk profiles
+- Wait for user authorization before any executable recommendation
+- Never assume "convenience" outweighs security
+
+* Meta-Learning
+
+** Habit Established
+After every significant mistake:
+1. Acknowledge failure specifically (what, why, impact)
+2. Root cause analysis (cognitive + system failures)
+3. Correction (what should have happened)
+4. Integration (new rules/checklists)
+5. Record in memex for future reference
+
+** Verification
+The user will review this document. The same pattern should be followed for every significant failure.