Passepartout Native Org-Mode Knowledge Base
What
Passepartout should be able to use Org-mode files directly as its knowledge base — no pandoc conversion, no markdown intermediary.
Currently gbrain provides vector search + entity linking over markdown, but we bridge via a conversion layer (org → pandoc → markdown → gbrain). This loses Org-mode semantics: properties drawers become flat YAML, tag inheritance is lost, file: links become relative markdown links, TODO states vanish, and the tree structure (headings with content subtrees) collapses into flat markdown headings.
Why
Org-mode's data model is strictly richer than markdown's. A Passepartout that can ingest, index, and query org files natively gains:
- Property-based entity extraction (no separate links: frontmatter needed)
- Tag-inheritance for automatic categorization
- TODO/priority/timestamps for knowledge freshness signals
- ID-based stable cross-references (org-id) that survive file moves
- Heading-level chunking (one heading = one knowledge unit)
- The same file format for everything — no split between "authoring format" and "knowledge base format"
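To make the features above concrete, here is a minimal sketch of heading-level chunking with tag inheritance, TODO state, priority, and properties drawers. It is an illustration only, not Passepartout's parser: `Chunk`, `parse_org`, and both regular expressions are hypothetical names, only the `TODO` and `DONE` keywords are recognized, and file-level settings such as `#+FILETAGS` are ignored.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Heading: stars, optional TODO keyword, optional [#A] priority, title, optional :tags:
# Assumption: only TODO/DONE keywords and priorities A-C are recognized.
HEADING_RE = re.compile(
    r"^(?P<stars>\*+)\s+"
    r"(?:(?P<todo>TODO|DONE)\s+)?"
    r"(?:\[#(?P<priority>[A-C])\]\s+)?"
    r"(?P<title>.*?)"
    r"(?:\s+(?P<tags>:[\w@#%:]+:))?\s*$"
)
PROPERTY_RE = re.compile(r"^\s*:(?P<key>[^:\s]+):\s*(?P<value>.*)$")


@dataclass
class Chunk:
    """One heading = one knowledge unit (hypothetical shape)."""
    path: list                 # titles from the top-level heading down to this one
    todo: Optional[str]
    priority: Optional[str]
    tags: set                  # own tags plus tags inherited from ancestors
    properties: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def parse_org(text):
    """Split org text into one Chunk per heading."""
    chunks, stack, current, in_drawer = [], [], None, False
    for line in text.splitlines():
        # Lines inside a properties drawer are never treated as headings.
        m = None if in_drawer else HEADING_RE.match(line)
        if m:
            level = len(m["stars"])
            # Close siblings and deeper headings; what remains are ancestors.
            while stack and stack[-1][0] >= level:
                stack.pop()
            own = set(m["tags"].strip(":").split(":")) if m["tags"] else set()
            inherited = stack[-1][2] if stack else set()
            tags = own | inherited
            path = [title for _, title, _ in stack] + [m["title"]]
            current = Chunk(path, m["todo"], m["priority"], tags)
            chunks.append(current)
            stack.append((level, m["title"], tags))
        elif current is None:
            continue  # text before the first heading is skipped in this sketch
        elif line.strip() == ":PROPERTIES:":
            in_drawer = True
        elif line.strip() == ":END:":
            in_drawer = False
        elif in_drawer:
            p = PROPERTY_RE.match(line)
            if p:
                current.properties[p["key"]] = p["value"].strip()
        else:
            current.body.append(line)
    return chunks
```

Because each stack entry carries the accumulated tag set of its ancestors, a child heading picks up inherited tags with a single set union, and the heading path falls out of the same stack.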
What it replaces
The current pipeline: org file → pandoc → markdown file → gbrain import → gbrain embed → gbrain query. This is four serial steps with a conversion at each boundary that degrades the data model.
The target: org file → (Passepartout-native indexer) → query. Zero conversion, zero data loss.
Architecture sketch
A Passepartout-native knowledge module that directly ingests ideas/*.org:
- Parser: extract each heading as a chunk. Preserve:
  - Heading path (H1 → H2 → H3) as a hierarchical path
  - Properties drawer as structured metadata
  - file: links as typed entity references
  - org-id as stable identifier
  - Tags (inherited from parent headings)
  - TODO state, priority, timestamps
- Embedder: vector-embed each heading chunk with metadata prefix
- Query: hybrid search over headings + full-text over content. Result includes the heading path + sibling headings for context.
- Cross-reference graph: build a typed entity graph from:
  - file: links → typed reference
  - org-id links → stable cross-doc reference
  - Tag co-occurrence → implicit relationship
  - Same-property values → attribute-based grouping
- Dream cycle: auto-discover entities from org properties and file: links. Enrich thin sections. Flag sections with stale timestamps.
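The cross-reference graph in the sketch above could be built roughly as follows. This is a minimal illustration under stated assumptions, not the planned module: `build_graph`, the edge labels, and the chunk dictionary keys (`id`, `tags`, `properties`, `body`) are all hypothetical, and links are read only from the bracketed `[[target][description]]` form.

```python
import re
from collections import defaultdict
from itertools import combinations

# Org bracket links: [[target]] or [[target][description]].
LINK_RE = re.compile(r"\[\[(?P<target>[^\]]+)\](?:\[[^\]]*\])?\]")


def build_graph(chunks):
    """Return a set of (source, relation, target) edges.

    Each chunk is a dict with the hypothetical keys
    'id', 'tags', 'properties', and 'body'.
    """
    edges = set()
    by_tag = defaultdict(set)    # tag -> ids of chunks carrying it
    by_prop = defaultdict(set)   # (key, value) -> ids of chunks sharing it

    for chunk in chunks:
        # Explicit links: file: targets and id: targets become typed edges.
        for m in LINK_RE.finditer(chunk["body"]):
            target = m["target"]
            if target.startswith("file:"):
                edges.add((chunk["id"], "file", target[len("file:"):]))
            elif target.startswith("id:"):
                edges.add((chunk["id"], "id", target[len("id:"):]))
        for tag in chunk["tags"]:
            by_tag[tag].add(chunk["id"])
        for key, value in chunk["properties"].items():
            by_prop[(key, value)].add(chunk["id"])

    # Implicit edges: every pair of chunks sharing a tag or a property value.
    for tag, ids in by_tag.items():
        for a, b in combinations(sorted(ids), 2):
            edges.add((a, f"tag:{tag}", b))
    for (key, value), ids in by_prop.items():
        for a, b in combinations(sorted(ids), 2):
            edges.add((a, f"prop:{key}={value}", b))
    return edges
```

Pairwise co-occurrence edges grow quadratically with the number of chunks sharing a tag, so a real implementation would likely store the tag and property groupings themselves instead of expanding every pair.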
Priority
Below the gate stack and ACL2 planner (core dependencies) but above the Lisp Machine hardware. Target: after TUI stabilization and the eval harness, once the Screamer planner is stable enough to route queries through the knowledge base.
The short-term bridge (current) is gbrain with nightly org→md sync. This is adequate while the gate stack and planner are the priority. The native org module replaces gbrain entirely once built.
Until then, the nightly pipeline uses gbrain to provide hybrid search across the existing org files. The compliance framework mapping is the largest single dataset the native module would serve, and the broader Passepartout economics knowledge base demonstrates the value of native org querying at scale.