Merge verification-monopoly, evaluation-harness, collective-regression-suite into one page

Combined all three under verification-monopoly.org with title: 'The Evaluation Harness — Collective Regression Suite as Certification Monopoly'. Structure: (1) vision from monopoly, (2) service from harness, (3) spec from collective-regression. All three IDs preserved in PROPERTIES. Deleted evaluation-harness.org and collective-regression-suite.org.
2026-05-24 19:12:49 +00:00
parent 348f2736a8
commit ede891f2ce
12 changed files with 226 additions and 232 deletions
--- a/projects/passepartout/architecture/native-org-knowledge-base.org
+++ b/projects/passepartout/architecture/native-org-knowledge-base.org
@@ -0,0 +1,83 @@
+:PROPERTIES:
+:ID:       7f4e6b9a-2c1d-5e8f-9a3b-6d7c4e5f2a1b
+:CREATED:  [2026-05-23 Sat]
+:END:
+#+title: Passepartout Native Org-Mode Knowledge Base
+#+filetags: :passepartout:roadmap:knowledge:org:gbrain:
+
+** What
+
+[[id:28c46769-c14b-42aa-ac7a-69d310157f8f][Passepartout]] should be able to use Org-mode files directly as its
+knowledge base — no pandoc conversion, no markdown intermediary.
+
+Currently gbrain provides vector search + entity linking over markdown,
+but we bridge via a conversion layer (org → pandoc → markdown → gbrain).
+This loses Org-mode semantics: properties drawers become flat YAML, tag
+inheritance is lost, file: links become relative markdown links, TODO
+states vanish, and the tree structure (headings with content subtrees)
+collapses into flat markdown headings.
+
+** Why
+
+Org-mode's data model is strictly richer than markdown's. A Passepartout
+that can ingest, index, and query org files natively has:
+- Property-based entity extraction (no separate links: frontmatter needed)
+- Tag-inheritance for automatic categorization
+- TODO/priority/timestamps for knowledge freshness signals
+- ID-based stable cross-references (org-id) that survive file moves
+- Heading-level chunking (one heading = one knowledge unit)
+- The same file format for everything — no split between "authoring format"
+  and "knowledge base format"
+
+** What it replaces
+
+The current pipeline: org file → pandoc → markdown file →
+gbrain import → gbrain embed → gbrain query. This is four serial
+steps, with a conversion at each boundary that degrades the data
+model.
+
+The target: org file → (Passepartout-native indexer) → query. Zero
+conversion, zero data loss.
+
+** Architecture sketch
+
+A Passepartout-native knowledge module that directly ingests
+ideas/*.org:
+
+- Parser: extract each heading as a chunk. Preserve:
+  - Heading path (H1 → H2 → H3) as a hierarchical path
+  - Properties drawer as structured metadata
+  - file: links as typed entity references
+  - org-id as stable identifier
+  - Tags (inherited from parent headings)
+  - TODO state, priority, timestamps
+
+- Embedder: vector-embed each heading chunk with metadata prefix
+
+- Query: hybrid search over headings + full-text over content.
+  Result includes the heading path + sibling headings for context.
+
+- Cross-reference graph: build a typed entity graph from:
+  - file: links → typed reference
+  - org-id links → stable cross-doc reference
+  - Tag co-occurrence → implicit relationship
+  - Same-property values → attribute-based grouping
+
+- Dream cycle: auto-discover entities from org properties and file:
+  links. Enrich thin sections. Flag sections with stale timestamps.
+
+** Priority
+
+Below the gate stack and ACL2 planner (core dependencies) but above the Lisp
+Machine hardware. Target: after TUI stabilization and the eval harness, once
+the Screamer planner is stable enough to route queries through the knowledge base.
+
+The short-term bridge (current) is gbrain with nightly org→md sync.
+This is adequate while the gate stack and planner are the priority.
+The native org module replaces gbrain entirely once built.
+
+The nightly pipeline uses gbrain to provide hybrid search across the existing
+org files. The [[id:36e5b948-e07b-477f-9036-4dfe88254347][compliance framework mapping]] is the largest single
+dataset this would serve, and the broader
+[[id:28c46769-c14b-42aa-ac7a-69d310157f8f][Passepartout economics]] knowledge base demonstrates the value of
+native org querying at scale.
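
The parser step in the architecture sketch above (one heading = one chunk, preserving heading path, properties drawer, inherited tags, and TODO state) could look roughly like the following. This is a minimal Python sketch and not part of this commit: `Chunk`, `parse_org`, and the regex-based line matching are hypothetical names and simplifications chosen for illustration. It handles only `TODO`/`DONE` keywords, ignores priorities, timestamps, and links, and skips any file-level drawer that appears before the first heading. A real indexer would more likely build on a full Org parser.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Heading line: stars, optional TODO keyword, title, optional trailing :tag:list:
HEADING_RE = re.compile(r"^(\*+)\s+(?:(TODO|DONE)\s+)?(.*?)(?:\s+(:[\w@:]+:))?\s*$")
# Property line inside a drawer, e.g. ":ID:  abc-123"
PROPERTY_RE = re.compile(r"^\s*:([\w-]+):\s*(.*)$")


@dataclass
class Chunk:
    """One heading treated as one knowledge unit (hypothetical structure)."""
    path: list                      # titles from the top-level heading down to this one
    todo: Optional[str]             # TODO keyword, if any
    tags: set                       # own tags plus tags inherited from ancestors
    properties: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def parse_org(text, file_tags=()):
    """Split Org text into heading chunks, inheriting tags down the tree."""
    chunks = []
    stack = []          # (level, title, own_tags) for the current heading and its ancestors
    current = None
    in_drawer = False
    for line in text.splitlines():
        m = HEADING_RE.match(line)
        if m:
            level = len(m.group(1))
            todo, title, raw_tags = m.group(2), m.group(3), m.group(4)
            own = set(raw_tags.strip(":").split(":")) if raw_tags else set()
            # Pop siblings and deeper headings so the stack holds only ancestors.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, title, own))
            tags = set(file_tags).union(*(t for _, _, t in stack))
            current = Chunk(path=[t for _, t, _ in stack], todo=todo, tags=tags)
            chunks.append(current)
            in_drawer = False
            continue
        if current is None:
            continue    # content before the first heading is skipped in this sketch
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            pm = PROPERTY_RE.match(line)
            if pm:
                current.properties[pm.group(1)] = pm.group(2).strip()
        elif stripped:
            current.body.append(stripped)
    return chunks


example = """\
* Projects :work:
** TODO Build indexer :passepartout:
:PROPERTIES:
:ID: abc-123
:END:
Parse headings into chunks.
"""

for chunk in parse_org(example):
    print(" → ".join(chunk.path), sorted(chunk.tags), chunk.todo, chunk.properties)
```

Keeping the ancestor stack alongside each chunk is what lets the heading path and inherited tags travel with the chunk into the embedder as a metadata prefix, which is the information the org → markdown conversion currently discards.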