Add Thoth to competitive analysis; refine compute marketplace thesis
- Thoth: new Category 2 entry (Personal AI Assistants), LangGraph ReAct agent with knowledge graph, Developer/Designer studios, 151K LOC
- Compute marketplace: answer the structural question 'why buy compute if every user runs their own Passepartout?' Three structural reasons: specialized proof libraries, certification weight, bootstrap verification
@@ -7,14 +7,14 @@
 
 * Overview
 
-Analyzed 8 competitor codebases alongside Passepartout. The competitive landscape
+Analyzed 9 competitor codebases alongside Passepartout. The competitive landscape
 divides into three categories:
 
 1. Coding agents (Aider, OpenCode, Codex CLI, Claude Code, Gemini CLI)
-2. Personal AI assistants (Hermes, OpenClaw)
+2. Personal AI assistants (Hermes, OpenClaw, Thoth)
 3. CI/check-based systems (Continue)
 
-None of the eight compete with Passepartout on all axes simultaneously. Passepartout's
+None of the nine compete with Passepartout on all axes simultaneously. Passepartout's
 strongest differentiators — Org-mode data model, deterministic gate stack, ACL2
 verification, Merkle-treed memory, and the triad architecture — are absent from
 every competitor.
@@ -263,6 +263,85 @@ no Org-mode, no verification, no neurosymbolic architecture. Differentiated by
 vastly broader channel support and mature plugin ecosystem. But architecturally
 conventional — LLM + tools + channels, no cognitive architecture innovation.
 
+** Thoth (Python, ~151K lines, Apache 2.0)
+
+https://github.com/siddsachar/Thoth — Personal AI Sovereignty. Local-first
+desktop AI assistant with knowledge graph, tools, voice, vision, shell,
+browser automation, workflow engine, and messaging channels.
+
+Architecture: LangGraph create_react_agent (prebuilt ReAct pattern). Dual-mode
+streaming via agent.stream(). NiceGUI web UI served by Python app.py with
+desktop launcher (tray icon, Ollama auto-start, browser/OS window). Context
+trimming via tiktoken to ~85% of model window, base64 data redaction, stale
+browser snapshot compression (keeps last 8), MD5 tool result dedup, old tool
+result summarization. 50-step recursion limit (chat), 100 (tasks), 120 (Developer
+Studio). Agent graph cached by tool set + model override. Checkpoints via
+LangGraph's SQLite-backed checkpointer. 30+ tool modules.
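Two of the context-trimming steps named above, MD5 dedup of tool results and keeping only the last 8 browser snapshots, can be sketched as follows. This is a minimal illustration, not Thoth's code: the message shape (dicts with `role`, `content`, and an optional `kind`), the function name, and the placeholder strings are all assumptions made for the example.

```python
import hashlib


def trim_tool_results(messages, keep_snapshots=8):
    """Return a copy of `messages` with stale snapshots and duplicate tool results collapsed.

    Assumed (hypothetical) message shape: {"role": ..., "content": str, "kind": optional str}.
    """
    # Indices of browser snapshots, oldest first; everything except the newest
    # `keep_snapshots` entries is treated as stale.
    snapshot_idx = [i for i, m in enumerate(messages) if m.get("kind") == "browser_snapshot"]
    stale = set(snapshot_idx[:-keep_snapshots]) if keep_snapshots > 0 else set(snapshot_idx)

    seen_digests = set()
    trimmed = []
    for i, msg in enumerate(messages):
        if msg.get("role") != "tool":
            trimmed.append(msg)  # only tool results are trimmed
            continue
        if i in stale:
            trimmed.append({**msg, "content": "[stale browser snapshot omitted]"})
            continue
        # MD5 is used here only as a cheap content fingerprint, not for security.
        digest = hashlib.md5(msg["content"].encode("utf-8")).hexdigest()
        if digest in seen_digests:
            trimmed.append({**msg, "content": "[duplicate of earlier tool result]"})
        else:
            seen_digests.add(digest)
            trimmed.append(msg)
    return trimmed
```

The sketch keeps one placeholder message per collapsed entry so that tool-call/tool-result pairing stays intact. Whether Thoth does the same is not stated in the analysis.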
+
+Safety model: Shell command classification (tools/shell_tool.py) with 17 blocked
+patterns (rm -rf /, mkfs, dd of=/dev/, shutdown, fork bombs, pipe-to-bash, etc.),
+30+ safe auto-execute prefixes (ls, cat, grep, git status, etc.), needs-approval
+for compound commands (;, &&, ||, |, $(), backticks). Interactive interrupt() for
+non-safe shell — LangGraph human-in-the-loop pauses the graph. Per-workflow safety
+modes: block (default, refuse non-safe), approve (pause), allow_all.
+Prompt-injection defense: scans tool outputs and user inputs for 5 categories
+(role overrides, instruction hijacking, data exfiltration, invisible unicode,
+hidden HTML directives) — detection-only, no stripping. Filesystem workspace
+boundary (~/Documents/Thoth). Opt-in Docker Sandbox for Developer Studio.
+Destructive ops (file delete, moderate shell, Gmail send, calendar delete,
+memory/task/tracker delete) require confirmation. MCP servers disabled until
+tested. Custom Tools reviewed and promoted. No sandboxing of agent runtime
+itself — agent runs in-process. No response-level guardrails.
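The three-way shell classification and the per-workflow safety modes described above can be sketched roughly as below. This is not the code in tools/shell_tool.py: the regexes and prefixes are a small illustrative subset of the 17 blocked patterns and 30+ safe prefixes, the function names are hypothetical, and the choice to refuse blocked commands even under allow_all is an assumption the analysis does not confirm.

```python
import re

# Illustrative subset only; the real lists are longer.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),   # rm -rf /
    re.compile(r"\bmkfs\b"),               # filesystem creation
    re.compile(r"\bdd\b.*\bof=/dev/"),     # raw writes to devices
    re.compile(r"\bshutdown\b"),
    re.compile(r"\|\s*(ba)?sh\b"),         # pipe-to-shell
]
SAFE_PREFIXES = ("ls", "cat", "grep", "git status")
COMPOUND_TOKENS = (";", "&&", "||", "|", "$(", "`")


def classify(command):
    """Return 'blocked', 'safe', or 'needs_approval' for a shell command string."""
    cmd = command.strip()
    if any(p.search(cmd) for p in BLOCKED_PATTERNS):
        return "blocked"
    # Compound commands never auto-execute, even if they start with a safe prefix.
    if any(tok in cmd for tok in COMPOUND_TOKENS):
        return "needs_approval"
    if any(cmd == p or cmd.startswith(p + " ") for p in SAFE_PREFIXES):
        return "safe"
    return "needs_approval"


def decide(command, mode="block"):
    """Map a classification to an action under a workflow safety mode.

    Modes follow the text: 'block' (default), 'approve', 'allow_all'.
    Assumption: blocked commands are refused in every mode.
    """
    verdict = classify(command)
    if verdict == "blocked":
        return "refuse"
    if verdict == "safe" or mode == "allow_all":
        return "run"
    return "pause" if mode == "approve" else "refuse"
```

A list like this matches on strings, so the outcome depends on how a command happens to be spelled. That is the contrast the analysis draws with typed gates.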
+
+Data model: SQLite (WAL mode) at ~/.thoth/memory.db — shared between knowledge
+graph and legacy memory. Knowledge graph: SQLite (durable) + NetworkX MultiDiGraph
+(in-memory, rebuilt on startup) + FAISS vector index (semantic recall, rebuilt on
+every entity write). 11 entity types (person, preference, fact, event, place,
+project, organisation, concept, skill, media, self_knowledge). 67+ typed relations
+with 30+ LLM-produced aliases mapped to canonical forms. Dream Cycle refinement
+pipeline for entity dedup/merge/stale-confidence decay. Config: JSON files
+(skills_config.json, api_keys.json, providers.json, channels_config.json). Keys in
+OS credential store (Windows Credential Manager, macOS Keychain, Linux Secret
+Service/KWallet). Memory extraction background daemon scanning past conversations
+every ~2 hours.
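The mapping of LLM-produced relation aliases to canonical forms can be illustrated with a small stdlib-only sketch. The entity types are the 11 listed above. The alias table, the relation names, and the `TinyGraph` class (a stand-in for the NetworkX MultiDiGraph, which allows several typed edges between the same pair of nodes) are hypothetical and only show the shape of the idea.

```python
from collections import defaultdict

ENTITY_TYPES = {
    "person", "preference", "fact", "event", "place", "project",
    "organisation", "concept", "skill", "media", "self_knowledge",
}

# Hypothetical aliases; the real table reportedly has 30+ entries.
RELATION_ALIASES = {
    "employed_by": "works_at",
    "works_for": "works_at",
    "resides_in": "lives_in",
}


def canonical_relation(name):
    """Normalise spelling, then map known aliases to their canonical relation."""
    key = name.strip().lower().replace(" ", "_").replace("-", "_")
    return RELATION_ALIASES.get(key, key)


class TinyGraph:
    """Minimal multi-edge directed graph keyed by (source, target)."""

    def __init__(self):
        self.nodes = {}                 # entity name -> entity type
        self.edges = defaultdict(list)  # (src, dst) -> canonical relation names

    def add_entity(self, name, entity_type):
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.nodes[name] = entity_type

    def add_relation(self, src, relation, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as entities first")
        rel = canonical_relation(relation)
        # Aliases collapse onto one edge instead of creating near-duplicates.
        if rel not in self.edges[(src, dst)]:
            self.edges[(src, dst)].append(rel)
        return rel
```

Canonicalisation of this kind reduces duplicate edges, but it does not check whether an extracted relation is true. That is the "no structural integrity guarantees" point made later in the entry.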
+
+Self-modification: Agent CAN create/update/delete skills via dedicated tools
+(thoth_create_skill, thoth_patch_skill, thoth_delete_skill). SKILL.md files with
+YAML frontmatter at ~/.thoth/skills/. Bundled skills (read-only) at app root;
+user skills override by name. Skill patching requires user confirmation + auto
+backup. Maximum 1 patch proposal per conversation. Tool guides cannot be patched.
+Self-knowledge block injected into system prompt. No tool to modify agent.py,
+prompts.py, or system prompt directly. Developer Studio provides code editing
+through approval-gated tools (tool-assisted human workflow, not agent self-mod).
+
+Verification: None formal. Update signature verification (updater.py).
+Comprehensive test suite at tests/test_suite.py. No tool-call verification beyond
+LangGraph schema validation. No output verification or fact-checking.
+
+Key differentiators vs other assistants: LangGraph ReAct agent with structured
+streaming event model. Personal knowledge graph (11 entity types, 67+ relations,
+NetworkX + FAISS). Developer Studio (Docker sandbox, code threads, Git operations,
+approval modes). Designer Studio (decks, documents, landing pages, sandboxed
+interactive runtime). 5 messaging channels (Telegram, Discord, Slack, WhatsApp,
+SMS) with streaming, reactions, media processing. Background workflow engine
+(schedules, webhooks, step pipelines, conditions, approvals, concurrency groups).
+30+ tool modules including browser automation, shell, Gmail, Calendar, X, image/
+video generation. 39 curated Ollama tool-calling models. 10 LLM providers (Ollama,
+OpenAI, Anthropic, Google AI/Gemini, xAI/Grok, MiniMax, OpenRouter, Ollama Cloud,
+ChatGPT/Codex subscription, custom endpoints). MCP client (stdio, Streamable HTTP,
+SSE) with namespaced tools, approval gates. No accounts, no telemetry, no hosted
+server. Local-first with OS credential store.
+
+Key gap vs Passepartout: No deterministic gate stack — shell safety is pattern
+list (17 blocked, 30+ safe), not typed gates. No sandboxed agent runtime. No
+proof system. No output guardrails. No neurosymbolic architecture. No Org-mode.
+No Merkle-tree memory. Knowledge graph (SQLite+FAISS) is richer than Hermes but
+is LLM-driven entity extraction — no structural integrity guarantees. Thoth's
+differentiation from Hermes/OpenClaw is the knowledge graph + Developer/Designer
+studios + embedded LangGraph framework — a broader product scope, but still
+architecturally conventional (LLM + tools + channels + KG), not a new cognitive
+architecture.
+
 * Category 3: CI/Check Systems
 
 ** Continue (TypeScript, ~328K lines, Apache 2.0)
@@ -379,5 +458,6 @@ Repository dumps and analysis artifacts at /tmp/:
 - /tmp/claude-code-leaked-source/ — Claude Code leaked (TypeScript/Bun)
 - /tmp/gemini-cli/ — Google Gemini CLI (TypeScript)
 - /tmp/openclaw/ — OpenClaw source (TypeScript)
+- /tmp/thoth/ — Thoth source (Python)
 - /tmp/continue/ — Continue source (TypeScript)
 - /usr/local/lib/hermes-agent/ — Hermes Agent (Python)

@@ -8,6 +8,22 @@ Passepartout instances offer their symbolic engine capacity (ACL2 cycles, Scream
 
 The early player runs a large instance and sells compute to smaller instances. The AGPL allows this because the marketplace is a service, not a modification of the code. Revenue is a percentage of each compute transaction.
 
+But the question is structural: if every user runs their own Passepartout — each with the same symbolic engine, the same gate stack, the same ACL2 prover — why would they need to buy compute from anyone? The answer is that Passepartout's symbolic engine is /domain-specific/, not /generalized/. Local compute handles your daily gate stack (milliseconds per verification). The marketplace sells three things a local instance cannot produce:
+
+**1. Specialized proof libraries and search strategies.** ACL2 is a search: the prover tries strategies until something works. A fresh Passepartout has generic strategies (the default waterfall, basic arithmetic, simple induction). A provider who has run 10,000 medical-device ISO 13485 proofs has tuned rewrite rules, custom clause processors, cached lemmas, and known failure-mode workarounds for that domain. You don't want to rediscover those from scratch, so you buy them as a burst compute transaction. The provider isn't selling raw CPU cycles; they are selling /the accumulated search strategy from every proof ever run in that domain/, pre-packaged as a service. Over time, your own Passepartout learns the patterns and needs less external compute, but the provider stays ahead because they aggregate proof experience from /every/ client in that domain.
+
+**2. Certification weight for third-party trust.** Your Passepartout can prove "this gate rule is correct" to /you/. ACL2 produces a machine-checkable proof log — anyone can mechanically verify it. But when a hospital buyer evaluating a published HIPAA gate rule needs to know the rule satisfies the regulation, they do not care about your Passepartout's isolated run of the proof. They want the rule verified by a provider who:
+- Carries errors-and-omissions insurance for the specific regulation
+- Submits to annual third-party audits
+- Maintains compliance documentation for the proof pipeline
+- Has a publicly verifiable track record of correct certifications
+
+Your local instance cannot produce any of this. The provider's proof carries /reputational weight/ because the provider is a legal and economic entity, not a process. This is the same reason software is certified by UL or TÜV rather than by the developer running the test suite locally.
+
+**3. Bootstrap verification for new instances.** A fresh Passepartout cannot verify its own initial state — the bootstrapping problem. You need a working system to generate the proof that the system is correct, but the proof refers to the system itself. The marketplace provides bootstrap proofs from existing trusted providers. Once verified, your instance stands on its own, but the initial self-certification requires an external prover that /already/ has a self-verified image. This is a one-time cost per instance (or per upgrade).
+
+Secondary but real: burst capacity for heavy proofs (hours-long ACL2 conjectures you do not want tying up your daily agent's CPU), collective regression suite execution (small instances contribute edge cases but cannot run the full suite on every change), and latency guarantees for time-critical gate verifications (trading, emergency shutdown). These are infrastructure economics — the same reason individuals buy cloud burst instances despite having their own hardware.
+
 If Passepartout instances on Agora transact billions of verified operations per day, the spread on compute transactions is enormous. This is not a product sale — it is a bet on network effects. Every new instance increases the value of the network (more capacity, more diversity, more resilience).
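As a rough illustration of the percentage-of-transaction revenue model described above, the sketch below multiplies daily volume, average price, and take rate. All three inputs in the usage lines are hypothetical placeholders chosen only to make the arithmetic concrete; none of them comes from the document.

```python
from decimal import Decimal


def daily_marketplace_revenue(transactions_per_day, avg_price, take_rate):
    """Revenue = volume x average price per transaction x marketplace take rate.

    Prices and rates are passed as strings so Decimal parses them exactly.
    """
    gross = Decimal(transactions_per_day) * Decimal(avg_price)
    return gross * Decimal(take_rate)


# Hypothetical inputs: one billion operations per day, a price of 0.0001
# currency units each, and a 5% take rate.
revenue = daily_marketplace_revenue(1_000_000_000, "0.0001", "0.05")
print(revenue)
```

The model is linear in all three factors, so the thesis stands or falls on transaction volume growing with the network rather than on any single sale.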
 
 The early player that provisions the largest compute capacity on Agora becomes the default infrastructure provider for the entire network. This is venture-scale money.