Passepartout Native Org-Mode Knowledge Base
What
Passepartout should be able to use Org-mode files directly as its knowledge base — no pandoc conversion, no markdown intermediary.
Currently gbrain provides vector search and entity linking over markdown, so Org files reach it only through a conversion layer (org → pandoc → markdown → gbrain). The conversion loses Org-mode semantics: properties drawers become flat YAML, tag inheritance is lost, file: links become relative markdown links, TODO states vanish, and the tree structure (headings with content subtrees) collapses into flat markdown headings.
Why
Org-mode's data model is strictly richer than markdown's. A Passepartout that can ingest, index, and query org files natively has:
- Property-based entity extraction (no separate links: frontmatter needed)
- Tag-inheritance for automatic categorization
- TODO/priority/timestamps for knowledge freshness signals
- ID-based stable cross-references (org-id) that survive file moves
- Heading-level chunking (one heading = one knowledge unit)
- The same file format for everything — no split between "authoring format" and "knowledge base format"
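To make the org-id point concrete, here is a minimal sketch of ID-based resolution, assuming IDs live in `:ID:` properties and links use Org's `[[id:...]]` form. The function names (`build_id_index`, `resolve_links`), the file paths, and the ID value are hypothetical and only illustrate the idea; nothing here is an existing Passepartout API.

```python
import re

# Simplified patterns (assumption: one :ID: property per line, [[id:...]] links).
ID_PROP = re.compile(r"^[ \t]*:ID:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
ID_LINK = re.compile(r"\[\[id:([^\]]+)\]")


def build_id_index(files):
    """Map every :ID: value to the path of the file that currently holds it."""
    index = {}
    for path, text in files.items():
        for org_id in ID_PROP.findall(text):
            index[org_id] = path
    return index


def resolve_links(text, index):
    """Return (id, current_path) for each id: link; path is None if unknown."""
    return [(org_id, index.get(org_id)) for org_id in ID_LINK.findall(text)]


# Illustrative in-memory "files"; paths and the ID are made up for the example.
before = {
    "ideas/compliance.org": "* SOC 2\n:PROPERTIES:\n:ID: 2222-bbbb\n:END:\n",
    "ideas/thesis.org": "* Thesis\nDepends on [[id:2222-bbbb][SOC 2 mapping]].\n",
}
# Simulate moving compliance.org; the linking file is left untouched.
after = {
    ("ideas/archive/compliance.org" if p == "ideas/compliance.org" else p): t
    for p, t in before.items()
}

links_before = resolve_links(before["ideas/thesis.org"], build_id_index(before))
links_after = resolve_links(after["ideas/thesis.org"], build_id_index(after))
print(links_before)
print(links_after)
```

The link text in `thesis.org` never changes; only the rebuilt index does, which is what lets the reference survive the move.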
What it replaces
The current pipeline: org file → pandoc → markdown file → gbrain import → gbrain embed → gbrain query. This is four serial steps with a conversion at each boundary that degrades the data model.
The target: org file → (Passepartout-native indexer) → query. Zero conversion, zero data loss.
Architecture sketch
A Passepartout-native knowledge module that directly ingests ideas/*.org:
- Parser: extract each heading as a chunk. Preserve:
  - Heading path (H1 → H2 → H3) as a hierarchical path
  - Properties drawer as structured metadata
  - file: links as typed entity references
  - org-id as stable identifier
  - Tags (inherited from parent headings)
  - TODO state, priority, timestamps
- Embedder: vector-embed each heading chunk with metadata prefix
- Query: hybrid search over headings + full-text over content. Result includes the heading path + sibling headings for context.
- Cross-reference graph: build a typed entity graph from:
  - file: links → typed reference
  - org-id links → stable cross-doc reference
  - Tag co-occurrence → implicit relationship
  - Same-property values → attribute-based grouping
- Dream cycle: auto-discover entities from org properties and file: links. Enrich thin sections. Flag sections with stale timestamps.
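The parser step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not a full Org parser: it recognizes only the `TODO` and `DONE` keywords, ignores timestamps and `#+FILETAGS`, and uses line-based regexes. The names `Chunk` and `parse_org` and the sample content are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumption: only TODO/DONE keywords and [#A]-[#C] priorities are recognized.
HEADING = re.compile(
    r"^(\*+)\s+(?:(TODO|DONE)\s+)?(?:\[#([A-C])\]\s+)?(.*?)(?:\s+(:[\w@:]+:))?\s*$"
)
PROPERTY = re.compile(r"^\s*:([^:\s]+):\s*(.*)$")
FILE_LINK = re.compile(r"\[\[file:([^\]]+)\]")


@dataclass
class Chunk:
    path: list                 # heading path from the top-level heading down
    todo: Optional[str]
    priority: Optional[str]
    tags: set                  # own tags plus tags inherited from ancestors
    properties: dict = field(default_factory=dict)
    file_links: list = field(default_factory=list)
    body: list = field(default_factory=list)


def parse_org(text):
    """Split Org text into one Chunk per heading."""
    chunks, stack, current, in_drawer = [], [], None, False
    for line in text.splitlines():
        m = HEADING.match(line)
        if m:
            level = len(m.group(1))
            # Drop ancestors that are not strictly shallower than this heading.
            while stack and stack[-1][0] >= level:
                stack.pop()
            parent = stack[-1][1] if stack else None
            own = set(m.group(5).strip(":").split(":")) if m.group(5) else set()
            current = Chunk(
                path=(parent.path if parent else []) + [m.group(4)],
                todo=m.group(2),
                priority=m.group(3),
                tags=(parent.tags if parent else set()) | own,
            )
            chunks.append(current)
            stack.append((level, current))
            in_drawer = False
            continue
        if current is None:
            continue  # text before the first heading is skipped in this sketch
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            pm = PROPERTY.match(line)
            if pm:
                current.properties[pm.group(1)] = pm.group(2).strip()
        elif stripped:
            current.file_links.extend(FILE_LINK.findall(line))
            current.body.append(stripped)
    return chunks


# Illustrative input; headings, IDs, and the linked file name are made up.
SAMPLE = """\
* Compliance :legal:
:PROPERTIES:
:ID: 1111-aaaa
:END:
Overview text.
** TODO [#A] Map SOC 2 controls :soc2:
:PROPERTIES:
:ID: 2222-bbbb
:OWNER: alice
:END:
See [[file:compliance-index.org][the index]].
* Growth
Unrelated.
"""

chunks = parse_org(SAMPLE)
for c in chunks:
    print(" → ".join(c.path), sorted(c.tags), c.todo, c.properties.get("ID"))
```

Each `Chunk` carries what the embedder and the cross-reference graph would need: the heading path can serve as the metadata prefix, `properties["ID"]` as the stable identifier, and `file_links` as graph edges.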
Priority
Below the gate stack and the ACL2 planner (core dependencies) but above the Lisp Machine hardware. Target: after TUI stabilization and the eval harness, once the Screamer planner is stable enough to route queries through the knowledge base.
The short-term bridge (current) is gbrain with nightly org→md sync. This is adequate while the gate stack and planner are the priority. The native org module replaces gbrain entirely once built.
The nightly pipeline uses gbrain to provide hybrid search across the existing org files. The compliance framework mapping is the largest single dataset the native module would serve, and the broader Passepartout economics knowledge base demonstrates the value of native org querying at scale.