---
title: Empirical Validation — MOMo and Modular Ontology Engineering
type: reference
tags: :passepartout:architecture:
---
Empirical Validation — MOMo and Modular Ontology Engineering
Shimizu and Hitzler (2025, Journal of Web Semantics) argue that LLMs can significantly accelerate knowledge graph and ontology engineering — modeling, extension, population, alignment, and entity disambiguation — but only if ontologies are modular.
The central finding: modularity is the key variable
In a complex ontology alignment task, an LLM without module information detected correct mappings for only 5 of 109 alignment rules (about 5%), which is effectively useless. When the same LLM was given the module structure of the target ontology (20 named conceptual modules), it detected correct mappings for 104 of 109 rules (about 95%). The variable that changed was modularity.
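The percentages follow directly from the reported counts. A quick check, with illustrative variable names that are not taken from the paper:

```python
# Counts as reported in the paragraph above; the names are illustrative only.
TOTAL_RULES = 109
correct_without_modules = 5
correct_with_modules = 104


def accuracy(correct: int, total: int) -> float:
    """Share of alignment rules with a correct mapping, as a percentage."""
    return 100.0 * correct / total


baseline = accuracy(correct_without_modules, TOTAL_RULES)
with_modules = accuracy(correct_with_modules, TOTAL_RULES)

print(f"without module info: {baseline:.1f}%")      # 4.6%
print(f"with module info:    {with_modules:.1f}%")  # 95.4%
```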
For ontology population (extracting triples from text), their best results came from prompts that included a schematic representation of a single module plus one extraction example. Against ground truth, this achieved approximately 90% extraction accuracy. Without module-scoped prompting, quality degraded substantially.
The mechanism: conceptual modules scope the LLM's attention to something human-sized. The paper's central claim — "by somehow limiting the scope, we achieve a more human-like approach — and one more capable of being expressed succinctly in language" — is an independent discovery of the same principle underlying Passepartout's domain-scoped Screamer checks and per-domain cardinality policies.
What Passepartout should adopt
The modular prompt pattern. The archivist should use module-scoped prompts: a schematic representation of a domain module plus a single extraction example. Instead of a generic "extract triples" prompt, the prompt should reference the relevant module(s) and include an example triple for each relation in that module. The module provides context; the example provides format. Both improve LLM extraction quality without increasing Screamer's verification burden.
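A minimal sketch of what such a prompt builder might look like. It is not Passepartout's archivist code: the `Module` and `Relation` types, the `build_prompt` function, the prompt wording, and the sample module contents are all hypothetical, chosen only to show a module schematic followed by one example triple per relation.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """One relation in a module, with a single worked example triple."""
    name: str
    subject_class: str
    object_class: str
    example: tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Module:
    """Hypothetical stand-in for a domain module's schematic representation."""
    name: str
    classes: list[str]
    relations: list[Relation] = field(default_factory=list)


def build_prompt(module: Module, text: str) -> str:
    """Assemble a module-scoped prompt: schema for context, examples for format."""
    lines = [
        f"Module: {module.name}",
        "Classes: " + ", ".join(module.classes),
        "Relations:",
    ]
    for rel in module.relations:
        lines.append(f"  {rel.name}({rel.subject_class} -> {rel.object_class})")
    lines.append("Example triples (one per relation):")
    for rel in module.relations:
        s, p, o = rel.example
        lines.append(f"  ({s}, {p}, {o})")
    lines.append("Extract only triples that fit the relations above from this text:")
    lines.append(text)
    return "\n".join(lines)


# Illustrative module; the class and relation names are made up for this sketch.
module = Module(
    name="EventParticipation",
    classes=["Event", "Agent", "Role"],
    relations=[
        Relation("participatesIn", "Agent", "Event",
                 ("alice", "participatesIn", "kickoff_meeting")),
        Relation("hasRole", "Agent", "Role",
                 ("alice", "hasRole", "organizer")),
    ],
)
prompt = build_prompt(module, "Bob chaired the review on Tuesday.")
print(prompt)
```

Because the prompt names the allowed relations explicitly, anything the LLM returns outside that set can be rejected before it reaches verification.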
MOMo modules as ontology scaffold. The 50-70 gate-bootstrapped entity classes are a starvation diet for the broader memex. MOMo's micropattern library provides a ready-made scaffold: hundreds of commonsense patterns for temporal relations, spatial relations, agent-action, organizational structure, provenance, and event participation. Loading these as initial modules, with :policy :plural and :provenance :external-ontology, would give the symbolic index a structured vocabulary for domains where the gate stack has nothing to offer. Organic growth then extends and refines these modules rather than inventing them from scratch.
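One way the seeding step could be modeled, as a sketch under stated assumptions. The `ModuleRegistry` and `ModuleEntry` types are hypothetical, the string values mirror the `:policy :plural` and `:provenance :external-ontology` keywords from the text, and the `"gate-bootstrapped"` provenance tag is invented for illustration. The module names are the pattern areas listed above, not actual MOMo identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleEntry:
    name: str
    policy: str      # mirrors :policy, e.g. "plural"
    provenance: str  # mirrors :provenance, e.g. "external-ontology"


class ModuleRegistry:
    """Hypothetical in-memory stand-in for the symbolic index's module table."""

    def __init__(self) -> None:
        self._modules: dict[str, ModuleEntry] = {}

    def register(self, entry: ModuleEntry) -> None:
        self._modules[entry.name] = entry

    def seed_external(self, names: list[str]) -> list[str]:
        """Add scaffold modules, skipping any name that already exists."""
        added = []
        for name in names:
            if name not in self._modules:
                self._modules[name] = ModuleEntry(name, "plural", "external-ontology")
                added.append(name)
        return added

    def get(self, name: str) -> ModuleEntry:
        return self._modules[name]


# Pattern areas named in the text, used here as placeholder module names.
PATTERN_AREAS = [
    "temporal-relations", "spatial-relations", "agent-action",
    "organizational-structure", "provenance", "event-participation",
]

registry = ModuleRegistry()
# A module that already exists locally (hypothetical provenance tag).
registry.register(ModuleEntry("provenance", "plural", "gate-bootstrapped"))
added = registry.seed_external(PATTERN_AREAS)
print(added)
```

Skipping names that are already present means the external scaffold only fills gaps and never overwrites a module the system has built itself.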
Cross-source validation. The archivist can extract facts from the user's prose, extract facts from Wikidata for the same entities, and present disagreements with provenance. This is the :plural cardinality policy applied at extraction time.
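A rough sketch of the comparison step, assuming facts arrive as (subject, predicate, object) triples grouped by source. The `find_disagreements` function, the source labels, and the sample facts are all hypothetical placeholders. No real Wikidata call is made here.

```python
from collections import defaultdict

Fact = tuple[str, str, str]  # (subject, predicate, object)


def find_disagreements(
    sourced_facts: dict[str, list[Fact]],
) -> dict[tuple[str, str], dict[str, list[str]]]:
    """Group objects by (subject, predicate) and keep keys with conflicting values.

    Every competing value is retained together with the sources that assert it;
    no winner is chosen, which is the plural policy applied at extraction time.
    """
    by_key: dict[tuple[str, str], dict[str, list[str]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for source, facts in sourced_facts.items():
        for subject, predicate, obj in facts:
            by_key[(subject, predicate)][obj].append(source)
    return {
        key: dict(values) for key, values in by_key.items() if len(values) > 1
    }


# Made-up sample inputs, for illustration only.
facts = {
    "user-prose": [
        ("Project X", "startYear", "2019"),
        ("Project X", "lead", "Alice"),
    ],
    "wikidata": [
        ("Project X", "startYear", "2020"),
        ("Project X", "lead", "Alice"),
    ],
}
disagreements = find_disagreements(facts)
for (subject, predicate), values in disagreements.items():
    print(subject, predicate, values)
```

Facts on which the sources agree drop out of the result, so the user only sees the conflicts, each labeled with where it came from.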
The paper validates three design decisions already made: (1) modularity is non-negotiable, since it is the difference between roughly 5% and 95% accuracy; (2) the extraction pipeline is feasible, since approximately 90% population accuracy with module-scoped prompts means the archivist can extract useful facts, and the remaining roughly 10% of erroneous extractions is what Screamer is there to catch; (3) knowledge graphs are positioned as anti-hallucination infrastructure, which is the Passepartout thesis stated in the academic literature.
References:
- Shimizu, C., & Hitzler, P. (2025). Accelerating knowledge graph and ontology engineering with large language models. Journal of Web Semantics, 85, 100862.
- Shimizu, C., Hammar, K., & Hitzler, P. (2023). Modular ontology modeling. Semantic Web, 14(3), 459–489.
- Norouzi, S.S. et al. (2024). Ontology Population using LLMs. arXiv:2411.01612.