:PROPERTIES:
:ID:       7f4e6b9a-2c1d-5e8f-9a3b-6d7c4e5f2a1b
:CREATED:  [2026-05-23 Sat]
:END:
#+title: Passepartout Native Org-Mode Knowledge Base
#+filetags: :passepartout:roadmap:knowledge:org:gbrain:

** What

Passepartout should be able to use Org-mode files directly as its knowledge base: no pandoc conversion, no markdown intermediary. Currently gbrain provides vector search + entity linking over markdown, but we bridge via a conversion layer (org → pandoc → markdown → gbrain). This loses Org-mode semantics: properties drawers become flat YAML, tag inheritance is lost, file: links become relative markdown links, TODO states vanish, and the tree structure (headings with content subtrees) collapses into flat markdown headings.

** Why

Org-mode's data model is strictly richer than markdown's. A Passepartout that can ingest, index, and query org files natively has:

- Property-based entity extraction (no separate links: frontmatter needed)
- Tag-inheritance for automatic categorization
- TODO/priority/timestamps for knowledge freshness signals
- ID-based stable cross-references (org-id) that survive file moves
- Heading-level chunking (one heading = one knowledge unit)
- The same file format for everything, with no split between "authoring format" and "knowledge base format"

** What it replaces

The current pipeline: org file → pandoc → markdown file → gbrain import → gbrain embed → gbrain query. This is four serial steps with a conversion at each boundary that degrades the data model.

The target: org file → (Passepartout-native indexer) → query. Zero conversion, zero data loss.

** Architecture sketch

A Passepartout-native knowledge module that directly ingests ideas/*.org:

- Parser: extract each heading as a chunk. Preserve:
  - Heading path (H1 → H2 → H3) as a hierarchical path
  - Properties drawer as structured metadata
  - file: links as typed entity references
  - org-id as stable identifier
  - Tags (inherited from parent headings)
  - TODO state, priority, timestamps
- Embedder: vector-embed each heading chunk with metadata prefix
- Query: hybrid search over headings + full-text over content. Result includes the heading path + sibling headings for context.
- Cross-reference graph: build a typed entity graph from:
  - file: links → typed reference
  - org-id links → stable cross-doc reference
  - Tag co-occurrence → implicit relationship
  - Same-property values → attribute-based grouping
- Dream cycle: auto-discover entities from org properties and file: links. Enrich thin sections. Flag sections with stale timestamps.

** Priority

Below the gate stack and ACL2 planner (core dependencies) but above the Lisp Machine hardware. Target: after TUI stabilization and eval harness, once Screamer planner is stable enough to route queries through the knowledge base.

The short-term bridge (current) is gbrain with nightly org→md sync. This is adequate while the gate stack and planner are the priority. The native org module replaces gbrain entirely once built.

** See also

[[file:../../concepts/compliance-framework-mapping.org][Compliance framework mapping]]
[[file:../../ideas/passepartout-economics.org][Passepartout economics]]
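
** Parser sketch

The Parser and Embedder steps from the architecture sketch can be illustrated with a minimal, regex-based sketch. It is written in Python purely for illustration and is not a statement about Passepartout's implementation language. The names =Chunk=, =parse_org=, and =embedding_text= are hypothetical, and the sketch assumes only the default TODO keywords (TODO, DONE), colon-separated =#+filetags:=, and single-line property values. It does not handle timestamps or the full Org syntax.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Assumption: only the default TODO keywords (TODO, DONE) are recognized.
HEADING_RE = re.compile(
    r"^(\*+)\s+(?:(TODO|DONE)\s+)?(?:\[#([A-Z])\]\s+)?(.*?)"
    r"(?:\s+(:(?:[\w@#%]+:)+))?\s*$"
)
PROPERTY_RE = re.compile(r"^\s*:([^:\s]+):\s*(.*)$")
FILETAGS_RE = re.compile(r"^#\+filetags:\s*(.*)$", re.IGNORECASE)
FILE_LINK_RE = re.compile(r"\[\[file:([^\]]+)\]")


@dataclass
class Chunk:
    """One heading = one knowledge unit (hypothetical structure)."""
    level: int
    path: List[str]                      # heading path, outermost first
    todo: Optional[str] = None
    priority: Optional[str] = None
    tags: Set[str] = field(default_factory=set)   # own + inherited
    properties: Dict[str, str] = field(default_factory=dict)
    file_links: List[str] = field(default_factory=list)
    body_lines: List[str] = field(default_factory=list)

    @property
    def org_id(self) -> Optional[str]:
        return self.properties.get("ID")

    @property
    def body(self) -> str:
        return "\n".join(self.body_lines).strip()


def parse_org(text: str) -> List[Chunk]:
    chunks: List[Chunk] = []
    stack: List[Chunk] = []      # currently open ancestor headings
    file_tags: Set[str] = set()
    current: Optional[Chunk] = None
    in_drawer = False
    for line in text.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            stars, todo, priority, title, tags = heading.groups()
            level = len(stars)
            # Close headings at the same or a deeper level.
            while stack and stack[-1].level >= level:
                stack.pop()
            inherited = stack[-1].tags if stack else file_tags
            own = set(tags.strip(":").split(":")) if tags else set()
            parent_path = stack[-1].path if stack else []
            current = Chunk(level, parent_path + [title], todo, priority,
                            set(inherited) | own)
            chunks.append(current)
            stack.append(current)
            in_drawer = False
            continue
        stripped = line.strip()
        if stripped.upper() == ":PROPERTIES:":
            in_drawer = True
            continue
        if in_drawer:
            if stripped.upper() == ":END:":
                in_drawer = False
            else:
                prop = PROPERTY_RE.match(line)
                if prop and current is not None:
                    current.properties[prop.group(1)] = prop.group(2).strip()
            continue
        filetags = FILETAGS_RE.match(line)
        if filetags and current is None:
            file_tags |= {t for t in filetags.group(1).split(":") if t.strip()}
            continue
        if current is not None:
            current.body_lines.append(line)
            current.file_links.extend(FILE_LINK_RE.findall(line))
    return chunks


def embedding_text(chunk: Chunk) -> str:
    """Prefix the body with metadata before handing it to an embedder."""
    path = " > ".join(chunk.path)
    tags = ",".join(sorted(chunk.tags))
    return f"[{path}] [tags: {tags}]\n{chunk.body}"
```

Each =Chunk= carries the heading path, inherited tags, properties, and file: link targets, so the cross-reference graph described above could be built from =org_id= and =file_links= without a second pass over the source files. A real indexer would likely rely on a full Org parser rather than regular expressions.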