Files
hermes-brain/projects/passepartout/architecture/design/knowledge-sources/empirical-validation-momo-and-modular-ontology-engineering.org

39 lines
3.9 KiB
Org Mode
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
title: Empirical Validation — MOMo and Modular Ontology Engineering
type: reference
tags: :passepartout:architecture:
---
* Empirical Validation — MOMo and Modular Ontology Engineering
:PROPERTIES:
:ID: b572b3a0-0238-470f-9bf8-63d356f68fe0
:ID: design-momo
:CREATED: [2026-05-08 Fri]
:WEIGHT: 40
:END:
Shimizu and Hitzler (2025, /Journal of Web Semantics/) argue that LLMs can significantly accelerate knowledge graph and ontology engineering — modeling, extension, population, alignment, and entity disambiguation — but /only/ if ontologies are modular.
*** The central finding: modularity is the key variable
In a complex ontology alignment task, an LLM without module information detected correct mappings for 5 of 109 alignment rules — effectively useless. When the same LLM was given the module structure of the target ontology (20 named conceptual modules), it detected correct mappings for 104 of 109 rules — 95% accuracy. The variable was modularity.
For ontology population (extracting triples from text), their best results came from prompts that included a schematic representation of a /single module/ plus one extraction example. Against ground truth, this achieved approximately 90% extraction accuracy. Without module-scoped prompting, quality degraded substantially.
The mechanism: conceptual modules scope the LLM's attention to something human-sized. The paper's central claim — "by somehow limiting the scope, we achieve a more human-like approach — and one more capable of being expressed succinctly in language" — is an independent discovery of the same principle underlying Passepartout's domain-scoped Screamer checks and per-domain cardinality policies.
*** What Passepartout should adopt
*The modular prompt pattern.* The archivist should use module-scoped prompts: a schematic representation of a domain module plus a single extraction example. Instead of a generic "extract triples" prompt, the prompt should reference the relevant module(s) and include an example triple for each relation in that module. The module provides /context/; the example provides /format/. Both improve LLM extraction quality without increasing Screamer's verification burden.
*MOMo modules as ontology scaffold.* The 50-70 gate-bootstrapped entity classes are starvation for the broader memex. MOMo's micropattern library provides a ready-made scaffold — hundreds of commonsense patterns for temporal relations, spatial relations, agent-action, organizational structure, provenance, and event participation. Loading these as initial modules — with =:policy :plural= and =:provenance :external-ontology= — would give the symbolic index a structured vocabulary for domains where the gate stack has nothing to offer. Organic growth then /extends and refines/ these modules rather than inventing them from scratch.
*Cross-source validation.* The archivist can extract facts from the user's prose, extract facts from Wikidata for the same entities, and present disagreements with provenance. This is the =:plural= cardinality policy applied at extraction time.
The paper validates three design decisions already made: (1) modularity is non-negotiable — the difference between 5% and 95% accuracy; (2) the extraction pipeline is feasible — 90% population accuracy with module-scoped prompts means the archivist /can/ extract useful facts, and the remaining 10% hallucination rate is what Screamer catches; (3) knowledge graphs are positioned as anti-hallucination infrastructure — the Passepartout thesis stated in the academic literature.
References:
- Shimizu, C., & Hitzler, P. (2025). Accelerating knowledge graph and ontology engineering with large language models. /Journal of Web Semantics, 85/, 100862.
- Shimizu, C., Hammar, K., & Hitzler, P. (2023). Modular ontology modeling. /Semantic Web, 14/(3), 459489.
- Norouzi, S.S. et al. (2024). Ontology Population using LLMs. arXiv:2411.01612.