Restructure three-pronged → knowledge-layers: collapse 11 files to 3, integrate into main architecture

- Rename 'three-pronged' folder to 'knowledge-layers': prong metaphor was misleading (implied parallel tines), replaced with epistemic layers (deductive base, empirical middle, probabilistic oracle; vertical stack)
- Collapse 11 overlapping files into 3 coherent documents:
  - knowledge-layers/_index.org: core framework (two engines + one store, World Model formula, 0-14 layer table, provenance store design, conflict resolution, cold-start, stage mapping)
  - knowledge-layers/practical-implications.org: design-world-aware-of-physics, 10 powers, Schafmeister existence proof, epistemic transparency
  - knowledge-layers/neurological-empirical.org: neural networks in provenance framework (kept intact)
- Relocate wolfram/mathematica and Schafmeister docs to ideas/viability/
- Integrate into main architecture _index.org:
  - Gate: expanded from two vectors (ACL2+LLM) to three (deductive, provenance/empirical, LLM oracle)
  - Autodidactic loop: split into Track 1 (deductive hardening, fast) and Track 2 (empirical validation, slow, experimental-feedback-driven)
  - See also: added Knowledge Layers cross-reference
- Add all-lisp geometry engine note (ideas/lisp-geometry-engine.org) as concrete illustration of the empirical layer's effect on design work
- Rebuild site: 148 files, 0 errors
2026-06-04 19:09:44 +00:00
parent 2e8cf19f9e
commit 6e992cc0c5
92 changed files with 921 additions and 2628 deletions
--- a/projects/passepartout/architecture/knowledge-layers/_index.org
+++ b/projects/passepartout/architecture/knowledge-layers/_index.org
@@ -0,0 +1,163 @@
+:PROPERTIES:
+:CREATED:  [2026-05-24 Sun]
+:ID:       329bd4fb-702a-4a2b-9c63-69281aacb83a
+:END:
+#+title: Knowledge Layers
+#+filetags: :architecture:knowledge-layers:verification:epistemology:
+
+Passepartout's architecture for how the system knows what it knows: deductive proofs (mathematical certainty), provenance-tracked empirical models (statistical validity), and a probabilistic oracle (LLM-aided guidance), all governed by the gate.
+
+These three epistemic layers form a vertical stack: the deductive base provides formal guarantees, the empirical middle bridges mathematics to physical reality through curated data, and the probabilistic oracle generates hypotheses and interprets results within the boundaries set by the lower layers.
+
+---
+
+* Two Engines, One Store
+
+The three layers are not three parallel engines. They are two reasoning engines and one curated data store:
+
+- **The symbolic engine** handles everything that can be formalized: deductive proofs, empirical equations, validity predicates, pipeline composition, uncertainty propagation. This is one engine — it reasons about symbols using rules that are either proven (ACL2) or well-defined (force field equations). It is authoritative where it applies.
+
+- **The probabilistic oracle** (the LLM) handles everything that cannot be formalized: parameter selection, model choice, interpretation of results in natural language, failure diagnosis, creative hypothesis generation. It proposes; the symbolic engine checks. It is bounded — it cannot execute actions, only recommend them.
+
+- **The provenance store** is not an engine. It is a structured database that stores empirical parameter sets, validity envelopes, experimental benchmarks, and comparison histories. Neither engine reasons about it as a whole. The symbolic engine queries it for parameters and validity predicates. The LLM queries it for context and updates it with new data.
+
+The gate is the integration point. Every action is checked against three vectors:
+1. **Security policy** — is this action safe? (ACL2-verified gate rules)
+2. **Scientific validity** — is this model valid in this context? (provenance store query + validity envelope check)
+3. **Consistency** — do the symbolic check and the oracle's assessment agree? If the LLM's recommendation violates a validity envelope, the gate rejects.
+
+* The Knowledge Tree
+
+The layers of human knowledge, from formal foundations to empirical design, mapped to their epistemic status:
+
+| Layer | Domain | Formal status | Verification model |
+|---|---|---|---|
+| 0. Logic / Foundations | Proof theory, set theory, type theory | Deductive | Complete — provable from axioms |
+| 1. Algebra | Groups, rings, fields, vector spaces | Deductive | Complete |
+| 2. Analysis | Calculus, limits, real numbers, measure theory | Deductive | Complete (in principle; deep results are hard) |
+| 3. Geometry / Topology | Manifolds, differential forms, curvature | Deductive | Complete |
+| 4. Classical Mechanics | Lagrangian/Hamiltonian mechanics | Deductive | Complete |
+| 5. Quantum Mechanics | Hilbert spaces, operators, Schrödinger equation | Deductive | Complete |
+| 6. Statistical Mechanics | Ensembles, partition functions, entropy | Deductive | Complete |
+| 7. Electrodynamics | Maxwell's equations, potentials, radiation | Deductive | Complete |
+| 8. Quantum Chemistry | Born-Oppenheimer, DFT, coupled cluster | Partially deductive — equations formal, approximations necessary | The implementation is verifiable; the model choice is not |
+| 9. Molecular Mechanics | Force fields, potential functions | Empirical parameterization | Simulation is deterministic; parameters fitted to experiment |
+| 10. Molecular Dynamics | Integration schemes, thermostats | Deductive mechanics + empirical inputs | Integrator provable; force field parameters are not |
+| 11. Chemical Thermodynamics | Binding constants, free energy surfaces | Mixed — statistical mechanics deductive, solvation models empirical | Provenance tracked for empirical components |
+| 12. Structural Biochemistry | Protein folding, docking, enzyme kinetics | Largely empirical | Validation against experiment, not deductive proof |
+| 13. Organic Chemistry | Reaction mechanisms, functional group transformations | Empirical with formal structure | Mechanism hypotheses falsified by experiment, not proved |
+| 14. Molecular Design | Shape-programmable molecules, therapeutic targeting | Empirical design space | Design rules validated by experiment, not derived from QM |
+
+The critical transition is between layers 7 and 8. Layers 0-7 are fully formalizable: ACL2 can verify correctness against first principles. Layer 8 introduces the first irreducible approximations (Born-Oppenheimer, DFT exchange-correlation functionals). From layer 9 onward, models are empirical through and through: mathematically rigorous in their execution, but with parameters fitted to experiment and provisional validity.
+
+Passepartout can verify the *computation* at every layer — that the Schrödinger equation is correctly solved, that the MD integrator preserves phase space, that the docking algorithm correctly explores conformational space. It cannot verify that the *model* matches reality. That is the domain of the empirical layer.
+
+* World Model = Verified Equations ⊗ Parameters ⊗ Validity Envelope
+
+Every computation that bridges formal mathematics and physical reality decomposes into a world model triple:
+
+**World Model = Verified Equations ⊗ Provenance-Tracked Parameters ⊗ Validity Envelope**
+
+| Component | What it is | Who handles it |
+|---|---|---|
+| Verified Equations | The formal skeleton: differential equations, integration schemes, force field functional forms | Symbolic engine — ACL2 verifies the implementation against the mathematical theory |
+| Provenance-Tracked Parameters | The numbers that make the model match reality: force constants, partial charges, solvation parameters, scoring weights | Provenance store — each carries a source (paper, dataset, calculation), confidence interval, validity regime (temperature, molecular class, solvent), and last-validation date |
+| Validity Envelope | The region of input space where the model has been experimentally validated | Gate — checked as a predicate before execution: is the current input within the model's validated range? |
+
+A force field, for example, is:
+- Bond stretching follows Hooke's law (verified equation — proven by the symbolic engine)
+- The specific spring constant for a C-C bond is 600 kcal/mol/Å² (provenance-tracked parameter — from Cornell et al. 1995, validated against 50+ small molecules)
+- The model is valid for proteins and nucleic acids in aqueous solution at 273-373K (validity envelope — checked by the gate before each simulation)
+
+The three components are inseparable. Without verified equations, the computation is untrustworthy. Without provenance-tracked parameters, the numbers are arbitrary. Without a validity envelope, the user cannot know whether the model applies to their problem.
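
The triple can be sketched as a small data structure. This is a minimal illustration, not Passepartout code: the names `Parameter`, `ValidityEnvelope`, `WorldModel`, `hooke_energy`, and `evaluate` are hypothetical, the plain Python function only stands in for an ACL2-verified equation, and the equilibrium length 1.525 Å is borrowed from the AMBER example in the neural-network note.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Parameter:
    """A provenance-tracked number: the value plus where it came from."""
    value: float
    unit: str
    source: str


@dataclass(frozen=True)
class ValidityEnvelope:
    """Region of input space where the model has been validated."""
    temperature_k: Tuple[float, float]
    molecule_classes: FrozenSet[str]

    def contains(self, temperature_k: float, molecule_class: str) -> bool:
        low, high = self.temperature_k
        return low <= temperature_k <= high and molecule_class in self.molecule_classes


@dataclass(frozen=True)
class WorldModel:
    equation: Callable[..., float]       # stand-in for the verified equation
    parameters: Dict[str, Parameter]     # provenance-tracked parameters
    envelope: ValidityEnvelope           # checked before execution


def hooke_energy(k_b: float, r: float, r_0: float) -> float:
    """Harmonic bond term E = k_b * (r - r_0)^2."""
    return k_b * (r - r_0) ** 2


# Values quoted in the text; treated here as illustrative inputs only.
cc_bond = WorldModel(
    equation=hooke_energy,
    parameters={
        "k_b": Parameter(600.0, "kcal/mol/A^2", "Cornell et al. 1995"),
        "r_0": Parameter(1.525, "A", "Cornell et al. 1995"),
    },
    envelope=ValidityEnvelope((273.0, 373.0), frozenset({"protein", "nucleic acid"})),
)


def evaluate(model: WorldModel, r: float, temperature_k: float, molecule_class: str) -> float:
    """Run the equation only if the input lies inside the validity envelope."""
    if not model.envelope.contains(temperature_k, molecule_class):
        raise ValueError("input outside validity envelope")
    p = model.parameters
    return model.equation(p["k_b"].value, r, p["r_0"].value)
```

Calling `evaluate(cc_bond, 1.625, 300.0, "protein")` computes the bond energy, while the same call at 400 K raises instead of returning a number, which is the behaviour the envelope is meant to enforce.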
+
+* The Provenance Store
+
+The provenance store is the infrastructure that makes the empirical layer operational. It is not a single database — it is a structured knowledge base that holds:
+
+**For traditional empirical models** (force fields, solvation equations, scoring functions):
+- The functional form of the model (e.g., AMBER ff14SB: harmonic bond + harmonic angle + Fourier torsion + LJ + Coulomb)
+- Every parameter with its source (paper, dataset, QM calculation), confidence interval, and validity regime
+- Validation history: which experimental measurements have been compared to this model, with what outcome
+- Revision history: when parameters were updated, by whom, and what changed
+
+**For neural network models** (ANI-2x, AlphaFold, learned potentials):
+- Architecture description and training hyperparameters
+- Training dataset provenance (level of theory, molecule coverage, element coverage)
+- Validation benchmarks with per-benchmark error metrics
+- Distribution summary statistics (needed for the distribution match check)
+- Domain of applicability (elements, charge ranges, molecule classes)
+
+**The gate checks:**
+
+For every model type:
+1. Does the model support the elements/atom types in the current input? (parameter availability check)
+2. Are the conditions (temperature, pressure, solvent) within the model's validated range? (validity envelope check)
+3. Is the input within the model's training distribution? (distribution match check — primarily for neural network models)
+
+For neural network models, check 3 requires new machinery: a distribution match function that computes how similar the current input is to the model's training distribution in latent space. This is a standard technique in reliable ML (distance to training data, density estimation, conformal prediction). It integrates into the gate as a predicate: input within distribution = pass; outside distribution = flag with confidence reduction.
+
+Every check outputs pass, flag with reduced confidence, or block. The gate never silently permits a computation outside a model's validated range.
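
One way to read the three-outcome rule is as a "worst result wins" combination. The sketch below assumes hypothetical check functions and an assumed mapping of failures to outcomes (a missing element blocks, an out-of-range condition or distribution mismatch flags); the text does not specify that mapping.

```python
from enum import IntEnum
from typing import List, Set


class Outcome(IntEnum):
    PASS = 0
    FLAG = 1    # proceed, but with reduced confidence
    BLOCK = 2


def check_elements(input_elements: Set[str], supported: Set[str]) -> Outcome:
    """Parameter availability: every input element must be covered by the model."""
    return Outcome.PASS if input_elements <= supported else Outcome.BLOCK


def check_range(value: float, low: float, high: float) -> Outcome:
    """Validity envelope: the condition must lie inside the validated range."""
    return Outcome.PASS if low <= value <= high else Outcome.FLAG


def check_distribution(distance: float, threshold: float) -> Outcome:
    """Distribution match: distance to training data must stay under a threshold."""
    return Outcome.PASS if distance <= threshold else Outcome.FLAG


def gate(outcomes: List[Outcome]) -> Outcome:
    """Combine checks; the most severe outcome decides."""
    if not outcomes:
        # Refuse to pass silently when nothing was checked.
        raise ValueError("no checks were evaluated")
    return max(outcomes)
```

Because `Outcome` is ordered by severity, adding a new check never weakens the gate: it can only leave the result unchanged or make it stricter.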
+
+* Conflict Resolution
+
+The three layers can disagree. The arbitration rules:
+
+1. **Deductive overrides both.** If the symbolic engine (ACL2) proves that a computation is formally incorrect, it is blocked regardless of what the LLM recommends or what the provenance store reports. Formal correctness is the non-negotiable base layer.
+
+2. **Empirical overrides probabilistic.** If the provenance store reports that a model's validity envelope excludes the current conditions, the LLM cannot override that judgment. The LLM may recommend a different model, or the gate may flag for human review — but it cannot proceed with the invalid model.
+
+3. **Probabilistic proposes, never executes.** The LLM recommends model selections, parameter choices, and design alternatives. Every proposal is checked against the deductive layer (formal correctness) and the empirical layer (validity envelope) before execution. The LLM cannot write a file, run a command, or send a message — it can only propose.
+
+4. **Human override is always recorded.** A user can override any layer's judgment. The override is logged to the provenance chain with the user's signature and reason. The result of an overridden computation is tagged as "human override — bypassed [layer] check" with reduced default confidence.
+
+5. **Uncertainty propagates upward.** If two empirical models disagree, the system reports both results with their respective confidence intervals and a flag: "Models disagree by 2.3 kcal/mol. Model A's uncertainty: ±0.8 kcal/mol. Model B's uncertainty: ±1.1 kcal/mol. Recommend experimental validation." The gate does not force agreement; it reports the conflict transparently.
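
The precedence in rules 1 to 4 can be sketched as a single arbitration function. The `Verdicts` record and `arbitrate` function are hypothetical names introduced for illustration; real verdicts would come from ACL2, the provenance store, and the LLM rather than from booleans.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Verdicts:
    deductive_ok: bool                     # symbolic engine: formally correct?
    empirical_ok: bool                     # provenance store: inside validity envelope?
    oracle_recommends: bool                # LLM proposal (it only proposes)
    human_override: Optional[str] = None   # reason text if a user overrides


def arbitrate(v: Verdicts) -> Tuple[bool, str]:
    """Return (execute?, tag) following the precedence of the rules above."""
    if not v.oracle_recommends:
        return (False, "not proposed")
    failed = None
    if not v.deductive_ok:
        failed = "deductive"       # rule 1: deductive overrides both
    elif not v.empirical_ok:
        failed = "empirical"       # rule 2: empirical overrides probabilistic
    if failed is None:
        return (True, "checked")
    if v.human_override is not None:
        # Rule 4: the override is allowed but tagged so it can be logged.
        return (True, f"human override: bypassed {failed} check")
    return (False, f"blocked by {failed} layer")
```

The deductive check is tested first, so a computation that fails both layers is reported against the deductive layer, matching its role as the base of the stack.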
+
+* Cold Start
+
+The provenance store must be populated with validated data before it can enforce validity envelopes. The bootstrap sequence:
+
+1. **Seed from curated sources.** Initial parameter sets from established force fields (AMBER, CHARMM, OPLS), benchmark datasets (PDBbind, COMP6), and published experimental reference data are loaded with explicit provenance tagging. Each entry is marked "unverified by this instance" but carries its original source citation.
+
+2. **LLM provides provisional parameters.** For domains where no curated data exists, the LLM proposes parameter values based on training data knowledge. These are tagged as "unvalidated — LLM-sourced" with reduced confidence and clearly marked validity envelopes.
+
+3. **Validation through use.** Every time the system runs a computation and receives experimental feedback (or the user provides a measurement), the comparison is recorded. Disagreements between prediction and measurement trigger parameter updates. Over hundreds of comparisons, the provenance store's confidence intervals tighten.
+
+4. **Community amplification (Stage 1+).** Through the social protocol, instances share validated parameter sets with provenance chains. A force field validated by one instance for ethanol and another for DMSO accumulates a broader validity envelope than either alone. The network effect compounds the cold-start investment.
+
+The cold start never reaches the same confidence as a mature instance with years of experimental feedback. But even a seeded provenance store with provisional parameters is strictly better than a system with no provenance — because the provisional parameters are explicitly tagged as provisional, and the user can see the confidence for every result rather than trusting a single unmarked number.
+
+* Mapping to Stages
+
+The knowledge-layers infrastructure is staged, not all-at-once:
+
+- **Stage 0 (current).** The probabilistic oracle exists (the LLM). The provenance store does not. The deductive engine partially exists through Hermes skills (symbolic gate rules as Python, not ACL2). The empirical layer is invisible — the LLM reasons about chemistry, biology, and engineering using training data alone, without systematic provenance.
+
+- **Stage 1 (social protocol).** The provenance store prototype can be introduced as a side effect of signed messages and data exchange. When instances share a validated parameter set, the message carries a signature and source. The receiving instance stores it with provenance. This is a natural crawl stage before the full infrastructure.
+
+- **Stage 2 (gate as software).** The provenance store becomes operational infrastructure. The gate checks scientific validity alongside security policy. The provenance store integrates with the Knowledge subsystem as a structured data store — the symbolic index holds formal facts; the provenance store holds empirical parameters. Same storage mechanism, different data type.
+
+- **Stage 3 (Lisp machine).** The symbolic engine is native in one address space. ACL2 runs at hardware level. The provenance store becomes a native Lisp hash table with persistence. The gate checks validity predicates in the evaluation loop itself. The LLM proposes model selections; every proposal is verified against the provenance store before execution. The symbolic engine, gate, and provenance store share one address space; the LLM is still out of process.
+
+- **Stage 4+ (in-process inference).** The LLM moves in-process. All three components share one address space. No IPC between them. The query cycle is: LLM proposes → symbolic engine checks against provenance store → if valid, execute → if invalid, return to LLM with diagnostic. This loop runs at native speed.
+
+* What This Changes in the Architecture
+
+The knowledge-layers model adds a dimension to the existing architecture that was only implicit before:
+
+1. **The gate gets a third vector.** Previously the gate checked security (is this action safe?) and, through its ACL2 verification, mathematical correctness. Now it also checks scientific validity (is this model valid in this context?). The mechanism is the same — a policy evaluated before the computation proceeds — but the policy now includes predicates over empirical model applicability, not just safety and formal correctness.
+
+2. **The autodidactic loop gets two speeds.** The fast loop (deductive — generate code, prove it, hot-reload) runs autonomously at LLM speed. The slow loop (empirical — make prediction, get experimental data, update parameters) requires real-world feedback and cannot run without it. Both are essential. The fast loop makes the system mathematically powerful; the slow loop makes it useful for real-world science and engineering.
+
+3. **The provenance store is a new data type in the Knowledge subsystem.** It is neither the symbolic index (formal facts) nor the neural index (embedding vectors). It is a third index: parametric, uncertain, provisional — but no less essential for its lack of deductive certainty.
+
+4. **The gate becomes a configurable integrity layer.** Security, scientific validity, ethical constraints, legal constraints, economic constraints — all expressed as predicates over the computation's inputs, models, and parameters. Users, institutions, or jurisdictions can configure different policies without changing anything else in the system. Compliance becomes configuration.
+
+---
+
+See also:
+- [[id:971cd9e7-2cc5-4743-8042-2469dbe4078f][Lisp Foundation]] — the prover bootstrapping path that enables the deductive layer
+- [[id:1c3ec48b-446c-50d2-b53e-126a81f5143f][Architecture]] — the gate, subsystems, and staged progression
+- [[id:4b5c6d7e-8f9a-0b1c-2d3e-4f5a6b7c8d9e][Neural Networks in the Empirical Layer]] — how trained models fit into the provenance framework
+- [[id:5c6d7e8f-9a0b-1c2d-3e4f-5a6b7c8d9e0f][Practical Implications of the Knowledge-Layers Architecture]]: concrete consequences for design, safety, regulation, and trust
+- [[id:f4e5d6c7-b8a9-0c1d-2e3f-4a5b6c7d8e9f][Schafmeister and Clasp]] — existence proof: Lisp in computational nanotechnology
--- a/projects/passepartout/architecture/knowledge-layers/neurological-empirical.org
+++ b/projects/passepartout/architecture/knowledge-layers/neurological-empirical.org
@@ -0,0 +1,128 @@
+:PROPERTIES:
+:CREATED:  [2026-05-25 Mon]
+:ID:       4b5c6d7e-8f9a-0b1c-2d3e-4f5a6b7c8d9e
+:END:
+#+title: Neurological Software in the Empirical Middle
+#+filetags: :ideas:passepartout:architecture:world-models:
+
+The empirical middle of the knowledge tree (layers 8-14) is increasingly dominated by neural networks trained on data — not symbolic equations with fitted parameters. ANI, MACE, SchNet for molecular energies and forces. AlphaFold for protein structure prediction. Neural docking scores, learned solvation models, QSAR neural nets, RL-based molecular design agents. These are not traditional empirical models with interpretable parameters. They are learned function approximators with millions of inscrutable weights.
+
+The knowledge-layers architecture must accommodate them. This note analyzes how.
+
+**What changes when the model is a neural network.**
+
+A traditional empirical model (force field, solvation equation, docking scoring function) has:
+
+- A **symbolic expression** for the relationship between inputs and outputs (E = k_b(r - r_0)² + ...)
+- **Interpretable parameters** that correspond to physical quantities (spring constant = 600 kcal/mol/Å²)
+- **Known failure modes** from the equation's form (harmonic approximation fails at extreme bond lengths)
+
+A neural network model has:
+
+- A **learned function** with no simple symbolic expression
+- **Inscrutable parameters** (weights) that do not correspond to physical quantities
+- **Unknown failure modes** — neural networks interpolate well in-distribution and fail unpredictably out-of-distribution
+
+From the architecture's perspective, the critical difference is not that neural networks are harder to verify (they are, but that is a secondary concern). The critical difference is that the provenance information shifts: instead of tracking where a parameter value came from and what it means, you track what the network was trained on, what it was validated against, and whether the current input resembles its training distribution.
+
+**The provenance store handles the shift by tracking three things instead of one.**
+
+A traditional empirical model's provenance entry:
+
+```
+Model: AMBER ff14SB
+Equation: Harmonic bond + harmonic angle + Fourier torsion + LJ + Coulomb
+Parameters:
+  - k_b(C-C): 600 kcal/mol/Å², source: Cornell et al. (1995), validated: 50+ small molecules
+  - r_0(C-C): 1.525 Å, source: Cornell et al. (1995), validated: 50+ small molecules
+  - ...
+Validity envelope:
+  - Temperature: 273-373K
+  - Solvents: water, methanol, ethanol
+  - Molecule classes: proteins, nucleic acids
+```
+
+A neural network model's provenance entry:
+
+```
+Model: ANI-2x
+Architecture: Ensemble of 8 ANI networks
+Parameters: ~8 million weights (not interpretable individually)
+Training data:
+  - Level of theory: ωB97X/6-31G(d) (DFT)
+  - Molecules: ~8 million conformations from 63,000 organic molecules
+  - Elements: H, C, N, O, S, F, Cl
+  - Conformational coverage: ANI-2x conformational space (RDKit + stochastic sampling)
+Validation benchmarks:
+  - COMP6 benchmark (drug-like molecules): MAE 1.2 kcal/mol
+  - Dihedral profiles: MAE 0.8 kcal/mol
+  - Isomerization energies: MAE 0.9 kcal/mol
+Validity envelope (domain check):
+  - Elements: H, C, N, O, S, F, Cl only
+  - Atomic charge range: not validated for charged species outside training distribution
+  - Conformational novelty flag: activated if RMSD to nearest training point > threshold
+```
+
+The structure is the same: model → training/validation data → domain of applicability. The content differs: traditional models have interpretable parameters with experimental sources; neural networks have training dataset provenance and aggregated validation benchmarks.
+
+**The gate checks the same things regardless of model type.**
+
+The gate predicates for model validity are:
+
+1. **Does the model support the elements/atoms/molecule types in the current input?** — This is the same check for a force field (does the force field have parameters for this atom type?) and a neural network (was this element in the training data?).
+
+2. **Are the conditions within the model's validated range?** — Temperature, pressure, solvent, etc. Same predicate, same structure. The neural network's validated range may be narrower or less well-defined, but the check is the same.
+
+3. **Is the input within the model's training/validation distribution?** — For traditional models, this is a direct validity envelope check. For neural networks, this is a **distribution match** — a statistical check that the current molecular conformation resembles the training set. If the input is far from the training distribution in latent space, the gate flags it regardless of whether the model predicts confidently.
+
+The distribution match check is the new machinery that neural network models require. It is a standard technique in reliable ML (distance to training data, density estimation in latent space, conformal prediction). It integrates into the gate as a predicate: "input is within training distribution: PASS" or "input is outside training distribution: FLAG with confidence reduction."
+
+**The symbolic engine does not need to understand the network.**
+
+This is the key simplification. The symbolic engine — ACL2, the gate predicates, the formal reasoning — does not need to parse the neural network's weights or architecture. It needs to:
+
+- Query the provenance store for the model's training data description
+- Compute a distribution match score for the current input against the training data
+- Compare the result to a threshold from the validity envelope
+- Output: pass, flag, or block
+
+None of these operations require understanding what the network does. They are metadata operations on the provenance store and geometric operations on the input space. The network itself is a black box — the symbolic engine treats it as a function with a known domain of applicability, the same way it treats a force field as a function with a known validity envelope.
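
The four steps above can be sketched with the simplest of the techniques the text names, distance to training data. This is an assumption-laden illustration: the descriptor vectors stand in for whatever latent-space features a real model exposes, and `nearest_training_distance`, `distribution_match`, and both thresholds are hypothetical.

```python
import math
from typing import Sequence


def nearest_training_distance(
    x: Sequence[float], training: Sequence[Sequence[float]]
) -> float:
    """Euclidean distance from x to the closest training descriptor."""
    return min(math.dist(x, t) for t in training)


def distribution_match(
    x: Sequence[float],
    training: Sequence[Sequence[float]],
    flag_threshold: float,
    block_threshold: float,
) -> str:
    """Compare the distance score against thresholds from the validity envelope."""
    d = nearest_training_distance(x, training)
    if d <= flag_threshold:
        return "pass"
    if d <= block_threshold:
        return "flag"    # proceed with reduced confidence
    return "block"


# Toy two-dimensional descriptors standing in for stored training statistics.
training_descriptors = [(0.0, 0.0), (1.0, 0.0)]
```

Note that the function never touches the network itself: it needs only the input descriptor and the stored training descriptors, which is what lets the symbolic engine treat the model as a black box.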
+
+**The oracle handles model selection.**
+
+Which model to use for a given problem — traditional force field or learned neural network? The LLM oracle handles this, informed by the provenance store. The store tells the LLM what models are available, what they are validated for, and how they perform on relevant benchmarks. The LLM recommends. The gate checks the recommendation against the validity envelope before execution.
+
+This is where the architecture connects to the real world of model selection that computational scientists face daily. There is no single best force field or neural network architecture for all problems. The choice depends on the molecule class, the property of interest, the required accuracy, and the computational budget. The LLM, with its broad knowledge of the literature, is well-suited to making this recommendation — not by reasoning about the models from first principles, but by knowing which models are preferred for which use cases from training data.
+
+**The full picture: three kinds of empirical model.**
+
+The provenance store now handles three data types:
+
+| Model type | Example | Parameters | Validation method | Gate check |
+|---|---|---|---|---|
+| Symbolic equation + fitted parameters | AMBER force field | Interpretable (spring constants, partial charges) | Per-parameter: source experiment, confidence interval | Validity envelope: temperature, solvent, molecule class |
+| Trained neural network | ANI-2x | Inscrutable (8M weights) | Per-dataset: benchmark MAE, held-out test set | Distribution match: is input like training data? |
+| Hybrid (learned correction to symbolic model) | Δ-ML corrections to DFT | Partially interpretable corrections + network weights | Per-benchmark + per-component | Both envelope + distribution match |
+
+All three are handled by the same provenance store, the same gate predicates, and the same LLM oracle. The only new infrastructure required is the **distribution match check** for neural network models — a piece of statistical machinery that computes how similar the current input is to the model's training distribution.
+
+**Where this fits in the stage plan.**
+
+- **Stage 0-1**: The provenance store is not yet operational (Stage 1 introduces at most a prototype). Neural network models are loaded as black boxes with no systematic validity checking. This is current practice in computational science: the user is responsible for knowing whether a model applies to their problem.
+
+- **Stage 2**: The provenance store begins operation. Initially it handles traditional symbolic-fitted models because they have clear provenance chains and validity envelopes. Neural network models require the distribution match infrastructure, which is a separate development track.
+
+- **Stage 3**: The distribution match infrastructure is operational. The gate can check whether an input is within a neural network's training distribution. The provenance store holds training dataset descriptions, validation benchmarks, and distribution summary statistics for each supported neural network model.
+
+- **Stage 4+**: Neural network models are loaded into the same address space as the symbolic engine and the provenance store. The distribution match check runs at the level of the evaluation loop itself. The gate's validity check becomes a fast native predicate — no querying a separate data store, just reading a hash table and computing a distance in the same process.
+
+**The summary.**
+
+Neural network models trained on empirical data are not a problem for the knowledge-layers architecture. They fit into the existing framework:
+
+- **The provenance store** tracks training data sources, validation benchmarks, and distribution statistics — instead of parameter sources and confidence intervals.
+- **The gate** checks domain match and training distribution coverage — instead of validity envelopes and parameter regimes.
+- **The symbolic engine** does not need to understand the network — it treats it as a black box with a known domain, the same way it treats a force field.
+- **The LLM oracle** handles model selection — recommending which neural network or traditional model fits the user's problem, informed by the provenance store's benchmark records.
+
+The new infrastructure required is not large — a distribution match function and a training dataset descriptor in the provenance store. Everything else is existing mechanism applied to a new data type.
--- a/projects/passepartout/architecture/knowledge-layers/practical-implications.org
+++ b/projects/passepartout/architecture/knowledge-layers/practical-implications.org
@@ -0,0 +1,101 @@
+:PROPERTIES:
+:CREATED:  [2026-06-04 Thu]
+:ID:       5c6d7e8f-9a0b-1c2d-3e4f-5a6b7c8d9e0f
+:END:
+#+title: Practical Implications of the Knowledge-Layers Architecture
+#+filetags: :architecture:knowledge-layers:implications:design:
+
+What the knowledge-layers model (deductive proofs, provenance-tracked empirical models, and a probabilistic oracle) means for design, engineering, science, software, and trust. These are concrete consequences, not abstract possibilities.
+
+---
+
+* Design World Aware of Its Physics
+
+The most vivid implication: a design environment where the constraint solver IS the physics engine, and every parameter carries its epistemic status.
+
+Currently, design software pretends material properties are true numbers. You pick "steel" from a dropdown, the system shows Young's modulus = 200 GPa, and you design to that single value. But that value is an average across 50 samples from different suppliers. The actual value for your specific part is between 190 and 210 GPa, and the software never tells you.
+
+With provenance-tracked empirical models, every parameter in the constraint network carries its source, confidence interval, and validity envelope. The constraint solver propagates uncertainty automatically. The designer sees distributions, not single numbers:
+
+- The assembled clearance at a joint: 0.03-0.08mm, not 0.05mm flat
+- The confidence this design meets specification under rated load: 95%, with a breakdown (material uncertainty 3%, manufacturing tolerance 1.5%, load model 0.5%)
+- Material selection as a query with confidence thresholds: "yield strength > 250 MPa, validated at 400°C, at least 3 independent measurements from peer-reviewed sources"
+- Tolerance stack-up as an automatic consequence of provenance — the ISO tolerance grades of each component propagate through the constraint network
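
The clearance range in the first bullet can be reproduced with worst-case interval arithmetic. The bore and shaft dimensions below are hypothetical numbers chosen so that the result matches the 0.03-0.08 mm range quoted above; a real constraint solver would propagate distributions rather than hard bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed range [low, high] in millimetres."""
    low: float
    high: float

    def __sub__(self, other: "Interval") -> "Interval":
        # Worst case: smallest minus largest, and largest minus smallest.
        return Interval(self.low - other.high, self.high - other.low)


hole = Interval(10.02, 10.05)    # hypothetical bore diameter, mm
shaft = Interval(9.97, 9.99)     # hypothetical shaft diameter, mm

# Tightest fit is 10.02 - 9.99, loosest fit is 10.05 - 9.97.
clearance = hole - shaft
```

The designer never enters the clearance range directly: it falls out of the component tolerances, which is the sense in which stack-up is an automatic consequence of provenance.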
+
+The gate constrains what the designer can even specify. Design a seal for 500°C continuous operation. The provenance store reports: "This material's empirical model is validated to 300°C. Above that, the only data is a single 1973 paper with a 2x extrapolation factor and no confidence interval." The gate flags it. The designer must explicitly accept the risk (logged to the provenance chain) or select a material with better empirical coverage.
+
+Manufacturing feedback closes the loop: as-manufactured dimensions and measured friction coefficients write back to the provenance store. The next design iteration has tighter confidence intervals. Datasheet revisions propagate retroactively: a bearing manufacturer's 2025 revision shows a lower load rating than the 2022 datasheet; the gate re-checks all designs using that bearing and flags any that now fall below required safety margins.
+
+This is what "design world aware of its physics" means in architectural terms: the constraint solver enforces geometric consistency (deductive layer), the provenance store enforces parametric validity (empirical layer), and the gate enforces that neither can be violated without explicit override.
+
+[[id:329bd4fb-702a-4a2b-9c63-69281aacb83a][Knowledge Layers]] — the three-layer architecture that makes this possible
+
+* Ten Practical Powers
+
+What the system can do with all three layers operating together that a conventional system cannot:
+
+*1. It can tell you how wrong every result might be.*
+Every output carries an uncertainty budget: binding affinity -9.2 ± 1.4 kcal/mol, broken down by source (force field ±0.8, solvation ±0.5, sampling ±0.3, scoring ±0.6). Computational chemistry packages do not typically do this today: they output a precise-looking number and leave uncertainty to the scientist's judgment.
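+
+The example does not say how the four contributions combine into the quoted total. As a hedged illustration, the Python sketch below computes two standard conventions: root-sum-square for independent sources and a linear sum as the fully correlated upper bound. The function names are hypothetical. The quoted ±1.4 falls between the two results, which would be consistent with partially correlated sources, though that reading is an assumption.
+
```python
import math

# Contributions quoted in the example above, in kcal/mol.
budget = {
    "force field": 0.8,
    "solvation": 0.5,
    "sampling": 0.3,
    "scoring": 0.6,
}


def combine_independent(terms):
    """Root-sum-square: appropriate when the error sources are independent."""
    return math.sqrt(sum(value ** 2 for value in terms.values()))


def combine_worst_case(terms):
    """Linear sum: an upper bound reached when sources are fully correlated."""
    return sum(terms.values())


independent_total = combine_independent(budget)
worst_case_total = combine_worst_case(budget)
print(f"independent: ±{independent_total:.2f} kcal/mol")
print(f"worst case:  ±{worst_case_total:.2f} kcal/mol")
```
+
+Whichever rule a real implementation uses, recording it alongside the budget would let a reader reproduce the total from its parts.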
+
+*2. It can prevent you from using a model outside its valid range.*
+A force field parameterized for soluble proteins at room temperature gives plausible-looking numbers for a membrane protein at body temperature. The gate catches this: "This force field was validated for aqueous solutions of soluble proteins at 273-373K. Your simulation involves a lipid bilayer. Three parameters are outside their validated range. Confidence reduction: 40% if you proceed."
+
+*3. It can detect when a model has been superseded.*
+Every model version is tracked. When a superseded force field is used, the gate flags: "AMBER ff99 was superseded by ff14SB in 2014 and ff19SB in 2019. The newer parameter sets improve backbone dihedral prediction by 30%. Migrate?"
+
+*4. It can compare predictions to experiments automatically.*
+Every computational prediction matched to an experimental measurement builds a systematic bias profile: "This force field consistently underestimates binding affinity for charged ligands by 0.5-1.0 kcal/mol." These profiles accumulate across all computations, making future predictions more interpretable.
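+
+A bias profile of this kind can be sketched as a mean signed error per ligand class. The Python below is illustrative only: the records are invented numbers, and ~bias_profile~ is a hypothetical helper, not part of any existing package.
+
```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ligand class, predicted, measured) binding free energies
# in kcal/mol. These are invented values for illustration.
records = [
    ("charged", -8.0, -8.7),
    ("charged", -9.5, -10.3),
    ("charged", -7.2, -8.0),
    ("neutral", -6.1, -6.0),
    ("neutral", -7.4, -7.5),
]


def bias_profile(rows):
    """Mean signed error (predicted minus measured) per ligand class."""
    errors = defaultdict(list)
    for ligand_class, predicted, measured in rows:
        errors[ligand_class].append(predicted - measured)
    return {ligand_class: mean(values) for ligand_class, values in errors.items()}


profile = bias_profile(records)
# A positive bias means the prediction is less negative than the measurement,
# i.e. the model underestimates binding strength for that class.
for ligand_class, bias in profile.items():
    print(f"{ligand_class}: {bias:+.2f} kcal/mol")
```
+
+With the invented data, the charged class shows a bias inside the 0.5-1.0 kcal/mol band described above, while the neutral class shows essentially none.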
+
+*5. It can red-team its own reasoning.*
+The LLM proposes a conclusion. ACL2 checks the formal steps. The provenance store checks model validity. If all three agree, the result is as reliable as the system can make it. If they disagree: "The mathematics checks out, but the models supporting it are outside their validated range. Your conclusion may be mathematically correct but physically unsupported."
+
+*6. It can build a community knowledge graph of what works.*
+Through the social protocol, instances share validated parameter sets. A force field validated by one instance for ethanol and another for DMSO accumulates a validity envelope broader than either alone. The network effect compounds.
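+
+The broadening of a validity envelope can be pictured as a set union over what each instance reports. This Python sketch is a simplification under assumptions: ~SharedValidation~ and ~merge~ are hypothetical names, "ff-example" is a placeholder rather than a real force field, and both reports are assumed to be trusted equally.
+
```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SharedValidation:
    """What one instance reports having validated for a parameter set."""
    parameter_set: str
    solvents: FrozenSet[str]


def merge(a: SharedValidation, b: SharedValidation) -> SharedValidation:
    """Union the validated solvents reported for the same parameter set."""
    if a.parameter_set != b.parameter_set:
        raise ValueError("cannot merge validations of different parameter sets")
    return SharedValidation(a.parameter_set, a.solvents | b.solvents)


# "ff-example" is a placeholder name, not a real force field.
instance_a = SharedValidation("ff-example", frozenset({"ethanol"}))
instance_b = SharedValidation("ff-example", frozenset({"DMSO"}))
combined = merge(instance_a, instance_b)
print(sorted(combined.solvents))
```
+
+A real protocol would also need to carry who validated what, so that the merged envelope keeps its provenance rather than flattening it.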
+
+*7. It can generate a defensible record for regulatory submission.*
+Every simulation carries a full provenance chain: model version and source, parameter validation, solver settings, gate checks passed, uncertainty budget. For FDA/EMA-regulated industries, this is the difference between a simulation used for guidance and one accepted as evidence.
+
+*8. It can be wrong honestly.*
+Every result carries its epistemic label: "deductively proven (ACL2-verified)," "empirically validated within validity envelope," or "extrapolation outside validated range — low confidence, for hypothesis generation only." The system does not ask the user to trust it. It shows what it knows and how it knows it.
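+
+One way to model the three labels is as an ordered enumeration where a result inherits the weakest label among the steps it depends on. The weakest-link rule is an assumption made for this sketch, not something stated above, and ~EpistemicLabel~ and ~weakest_link~ are hypothetical names.
+
```python
from enum import IntEnum
from typing import Iterable


class EpistemicLabel(IntEnum):
    """Ordered from weakest to strongest epistemic status."""
    EXTRAPOLATED = 0
    VALIDATED = 1
    PROVEN = 2


DESCRIPTIONS = {
    EpistemicLabel.PROVEN: "deductively proven (ACL2-verified)",
    EpistemicLabel.VALIDATED: "empirically validated within validity envelope",
    EpistemicLabel.EXTRAPOLATED: (
        "extrapolation outside validated range: "
        "low confidence, for hypothesis generation only"
    ),
}


def weakest_link(step_labels: Iterable[EpistemicLabel]) -> EpistemicLabel:
    """Assumed rule: a result is only as strong as its weakest step."""
    labels = list(step_labels)
    if not labels:
        raise ValueError("a result must depend on at least one step")
    return min(labels)


# A proven equation fed with validated parameters yields a validated result.
result = weakest_link([EpistemicLabel.PROVEN, EpistemicLabel.VALIDATED])
print(DESCRIPTIONS[result])
```
+
+Under this rule, a mathematically verified derivation that uses an out-of-envelope parameter would still be labeled as an extrapolation.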
+
+*9. It can refuse an unsound instruction.*
+"I will not run this simulation. The requested temperature (500K) exceeds the force field's validated range (273-373K). The solvent (hexane) has no validated parameters. The simulation will produce numerically precise but physically meaningless results." The override exists but is recorded, and the result is tagged with its true confidence.
+
+*10. It can connect mathematics to reality without faking it.*
+A finite element analysis of a bridge: "The equations are verified against classical mechanics (layer 4). The material parameters come from ASTM standard tests (layers 8-9, validity envelope: -20°C to 60°C, validated by 200+ measurements). The load calculations carry ±3% uncertainty." The bridge is not proven safe — no software can prove a physical structure is safe — but the chain from mathematical foundation to empirical measurement is fully transparent.
+
+* Schafmeister and Clasp: Existence Proof
+
+Christian Schafmeister's work at Temple University is the strongest existence proof for the knowledge-layers architecture. He created [[https://github.com/clasp-developers/clasp][Clasp]], a Common Lisp implementation that interoperates with C++ libraries via LLVM compilation, specifically to design spiroligomers — shape-programmable molecules that bind proteins as therapeutics.
+
+Why this proves the architecture:
+
+1. *Lisp is already used for molecular design.* Schafmeister's team runs computational chemistry pipelines from within a Lisp environment, funded by the NIH and NSF. The interactivity and homoiconicity that the knowledge-layers architecture relies on are the same properties that make this work possible.
+
+2. *The single-address-space model is not a retro fantasy.* Clasp shows that C++ libraries can run inside a Lisp image, not alongside it. The Lisp-machine model is a practical architecture in use today for computationally demanding scientific work.
+
+3. *Schafmeister's pipeline spans the entire knowledge tree.* QM calculations (layer 8) feed force field parameterization (layer 9), which feeds MD simulations (layer 10), which feed binding free energy predictions (layer 11), which feed docking studies (layer 12), which guide experimental design (layer 14). Every layer's output is an input to the layer above, and every layer has a different epistemic status, ranging from provable QM through empirically parameterized force fields to heuristic design rules.
+
+The main difference in direction: Schafmeister brought C++ into Lisp to access the scientific computing ecosystem. The knowledge-layers architecture replaces C++ libraries with verified Lisp-native alternatives. The principle — one representation, one address space, no translation boundaries — is the same.
+
+* Truth as an Epistemic Property, Not a Brand
+
+The deepest shift the knowledge-layers model enables: computation becomes epistemically transparent.
+
+Currently, computational results are trusted based on popularity. "Everyone uses this software" is the epistemic warrant. The knowledge-layers model replaces this with an explicit chain: this equation was verified against classical mechanics, these parameters come from a specific experimental paper, this validity envelope covers the conditions you specified. Trust moves from "the tool is popular" to "the chain is traceable."
+
+This changes the economics of computational trust. A result that is deductively proven can be used as a building block for further proofs — its truth is inherited by any derivation. A result that is empirically validated is useful for design decisions with known risk. A result that is an LLM extrapolation is useful only for hypothesis generation. Computational results become differentiated products, not interchangeable commodities. Provenance quality is the differentiator.
+
+For reproducibility: the provenance chain is a complete specification. Every computation is fully described by its model, its parameters, its validity envelope, and its gate checks. Reproducing the computation means loading the same chain and running it. No more "we used the AMBER force field" without version, parameter set, cutoff scheme, or solvation model.
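+
+If the chain is a complete specification, two runs can be compared by comparing a digest of their chains. The Python sketch below shows one way to do that with a canonical JSON encoding and SHA-256. The field names, the 10.0 Å cutoff, and the gate-check identifiers are illustrative assumptions, not Passepartout's actual schema.
+
```python
import hashlib
import json


def chain_digest(chain: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the provenance chain."""
    # sort_keys makes the encoding independent of dictionary key order.
    canonical = json.dumps(chain, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Field names and values are illustrative assumptions.
chain = {
    "model": {"family": "AMBER", "parameter_set": "ff14SB"},
    "solvation_model": "TIP3P",
    "cutoff_angstrom": 10.0,
    "gate_checks": ["temperature-in-envelope", "solvent-validated"],
}

# Same content with top-level keys in reverse order: the digest is unchanged.
reordered = dict(reversed(list(chain.items())))
# Changing any recorded setting changes the digest.
changed = {**chain, "cutoff_angstrom": 8.0}

print(chain_digest(chain) == chain_digest(reordered))
print(chain_digest(chain) == chain_digest(changed))
```
+
+A digest only shows that two chains are identical. The chain itself still has to be stored in full for the computation to be rerun.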
+
+For regulatory science: a regulator can read the output and see exactly what was computed, with what models, under what conditions, with what uncertainty. Review shifts from auditing the company's process to auditing the computation's chain.
+
+For education: students develop epistemic hygiene as a side effect of using the system. Every computation they run shows them whether the result is proven, validated, or generated — making visible the distinction that current software hides behind uniformly precise-looking numbers.
+
+-----
+
+See also:
+- [[id:329bd4fb-702a-4a2b-9c63-69281aacb83a][Knowledge Layers]] — the architecture that makes these powers possible
+- [[id:1c3ec48b-446c-50d2-b53e-126a81f5143f][Architecture]] — the gate, subsystems, and stage plan
+- [[id:f4e5d6c7-b8a9-0c1d-2e3f-4a5b6c7d8e9f][Schafmeister and Clasp]] — Lisp in computational nanotechnology
+- ideas/lisp-geometry-engine — the geometry engine as concrete illustration of design-world-aware-of-physics