What the World Model Idea Means — Plain Language

An explanation of the world model idea and its implications, written for someone who does not want to parse architectural jargon.

The basic idea in one sentence.

Some things you can prove are true (like 2+2=4), and some things you can only model approximately because you have to fit the numbers to real-world measurements (like how strongly a particular molecule bends or twists). Most practically useful knowledge is the second kind.

The deductive vs. empirical split.

There is a floor in the building of human knowledge. Below the floor, everything is provable from first principles. This includes:

Logic (if A implies B and A is true, then B is true)
Mathematics (calculus, linear algebra, number theory)
Fundamental physics, once its laws are taken as axioms (Newton's laws, the Schrödinger equation, Maxwell's equations)

In theory, a computer with enough power can prove things at this level. Given the right axioms and enough time, it can derive any statement that is provable from them. This is the kind of work ACL2 does: it proves that a piece of code is correct for all possible inputs.

Above the floor, things change. The equations have a known form (bonds stretch like springs, molecules attract each other at long range and repel at short range), but the precise numbers in those equations have to be fitted to experimental measurements or to expensive approximate calculations. In practice, you cannot derive a spring constant exactly from first principles; you have to measure it or fit it.
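To make "fitted to measurements" concrete, here is a minimal sketch of estimating a spring constant by least squares. The function name and the data points are made up for illustration; they are not real measurements and not part of Passepartout.

```python
# Illustrative only: the displacement/force pairs below are made-up numbers,
# not real measurements.

def fit_spring_constant(displacements, forces):
    """Least-squares estimate of k in Hooke's law F = -k * x (line through the origin)."""
    numerator = sum(x * f for x, f in zip(displacements, forces))
    denominator = sum(x * x for x in displacements)
    if denominator == 0:
        raise ValueError("need at least one non-zero displacement")
    # Minimising sum((F + k*x)^2) over k gives k = -sum(x*F) / sum(x^2).
    return -numerator / denominator


displacements = [0.01, 0.02, 0.03]  # hypothetical, in metres
forces = [-0.5, -1.0, -1.5]         # hypothetical, in newtons
k = fit_spring_constant(displacements, forces)
print(f"fitted spring constant: {k:.1f} N/m")
```

The form of the equation is fixed in advance; only the number k comes from the data, and a different set of measurements would give a different k.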

This includes:

Chemistry (how strongly does this bond bend? how fast does this reaction happen?)
Biology (how tightly does this drug bind to this protein? how fast does this enzyme work?)
Engineering (how much weight can this beam hold before it cracks?)
Medicine (does this drug work? what dose is safe?)
Materials science (how strong is this alloy at high temperature?)
Climate science, geology, pharmacology, and almost every other applied science

The critical observation.

Most of what humans actually care about lives above the floor. Pure mathematics is beautiful and foundational, but nobody builds a bridge, cures a disease, or designs a drug using only proofs from first principles. Every practical domain works with approximate models that are useful but not deductively certain.

Passepartout's verification engine can handle the stuff below the floor. It can prove that a numerical integration routine is correct, that a sorting algorithm works, that an algebraic simplification is valid. But above the floor, "verification" means something different — not "proven correct from axioms" but "the implementation correctly executes the model, and the model's parameters are traceable to experimental data."

The three parts of a useful computation.

In the deductive zone (below the floor), every computation has two parts:

The algorithm (how you compute it)
The verification (the proof that the algorithm is correct)

In the empirical zone (above the floor), every useful computation has three parts:

The equations (the known mathematical form of the model)
The parameters (the numbers fitted to experimental data)
The validity envelope (the range of conditions where the model is reliable)

The equations can be verified — ACL2 can prove that your force field code correctly implements Hooke's law. The parameters cannot be verified; they can only be validated against experimental data. The validity envelope cannot be proven either — it is a learned or declared boundary that says "we checked this model works for these kinds of molecules at these temperatures; outside that range, we don't know."
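The three parts can be kept visibly separate in code. The sketch below is one possible layout, not Passepartout's actual design: the class names, fields, and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidityEnvelope:
    """Declared range of conditions the model has been checked against."""
    min_temperature_k: float
    max_temperature_k: float
    system_types: frozenset

    def covers(self, temperature_k, system_type):
        in_range = self.min_temperature_k <= temperature_k <= self.max_temperature_k
        return in_range and system_type in self.system_types


@dataclass(frozen=True)
class BondModel:
    spring_constant: float      # parameter: fitted to data, validated but not proven
    rest_length: float          # parameter: fitted to data
    envelope: ValidityEnvelope  # validity envelope: declared, not proven

    def energy(self, length):
        # Equation: harmonic bond energy E = 1/2 * k * (r - r0)^2.
        # This is the part a prover could check against Hooke's law.
        return 0.5 * self.spring_constant * (length - self.rest_length) ** 2


# Hypothetical numbers, chosen only to exercise the code.
envelope = ValidityEnvelope(273.0, 373.0, frozenset({"protein", "nucleic acid"}))
model = BondModel(spring_constant=300.0, rest_length=1.5, envelope=envelope)
print(model.energy(1.6))
print(model.envelope.covers(350.0, "lipid membrane"))
```

Only the body of `energy` is a candidate for formal proof. The two parameter fields and the envelope are data that need sources rather than proofs.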

What this means for Passepartout.

First, the architecture needs three subsystems, not two.

The neurosymbolic split (probabilistic brain + deterministic prover) only covers the deductive zone. The empirical zone needs a third subsystem — a provenance tracker that stores where every parameter came from, what its confidence interval is, and what range of conditions it has been validated for.

This subsystem does not prove anything. It curates. It ensures that when Passepartout simulates a molecule, every force constant, every partial charge, every solvation parameter has a source that can be checked. If the same parameter was determined by two different experiments with different results, the system can report both and flag the uncertainty.
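A provenance record of this kind might look like the following sketch. The class names, the two-sigma disagreement rule, and the example values are assumptions made for illustration; the source does not specify how conflicts are detected.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Measurement:
    value: float
    uncertainty: float  # one standard deviation, same units as value
    source: str         # citation or dataset identifier


@dataclass
class ParameterRecord:
    name: str
    measurements: list = field(default_factory=list)

    def add(self, measurement):
        self.measurements.append(measurement)

    def conflicts(self, n_sigma=2.0):
        """Return pairs of measurements that differ by more than n_sigma
        combined standard deviations (an assumed rule, not a standard)."""
        found = []
        for i, a in enumerate(self.measurements):
            for b in self.measurements[i + 1:]:
                combined = (a.uncertainty ** 2 + b.uncertainty ** 2) ** 0.5
                if abs(a.value - b.value) > n_sigma * combined:
                    found.append((a, b))
        return found


# Hypothetical entries: neither the values nor the sources are real.
record = ParameterRecord("example bond force constant")
record.add(Measurement(310.0, 5.0, "hypothetical experiment A"))
record.add(Measurement(340.0, 5.0, "hypothetical experiment B"))
for a, b in record.conflicts():
    print(f"{record.name}: {a.source} and {b.source} disagree")
```

The record never chooses a winner. It keeps both values with their sources and reports the disagreement, which is the curating role described above.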

Second, the gate gets a new job.

The gate currently asks "is this action safe?" — should this shell command run, should this file be written, should this network message be sent. With the world model insight, the gate also asks "is this model valid for the context?" — this force field was parameterized for soluble proteins; you are applying it to a membrane protein. The answer may be "block" or "allow with reduced confidence" or "flag for human review."

This is not a security check. It is a scientific integrity check. But it uses the same mechanism — a policy evaluated before the computation proceeds.
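As a rough sketch, a validity check at the gate could be a small function evaluated before the simulation starts. The function name, its arguments, and the mapping from mismatches to outcomes are all hypothetical; a real policy would be set by the people who own the model.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ALLOW_REDUCED_CONFIDENCE = "allow with reduced confidence"
    FLAG_FOR_REVIEW = "flag for human review"
    BLOCK = "block"


def check_model_validity(validated_systems, temperature_range_k, system_type, temperature_k):
    """Compare the requested context with what the model was validated for.

    The mapping below is an arbitrary illustration of a policy:
    everything matches -> allow; only the temperature is off -> reduced
    confidence; only the system type is off -> human review; both off -> block.
    """
    low, high = temperature_range_k
    system_ok = system_type in validated_systems
    temperature_ok = low <= temperature_k <= high
    if system_ok and temperature_ok:
        return Decision.ALLOW
    if system_ok:
        return Decision.ALLOW_REDUCED_CONFIDENCE
    if temperature_ok:
        return Decision.FLAG_FOR_REVIEW
    return Decision.BLOCK


# A force field validated for soluble proteins, applied to a membrane protein.
decision = check_model_validity({"soluble protein"}, (273.0, 373.0), "membrane protein", 310.0)
print(decision.value)  # prints "flag for human review"
```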

Third, self-improvement splits into two speeds.

Fast loop (below the floor): Passepartout generates a new algorithm, verifies it with ACL2, and hot-reloads it. This is what the Mathematica-bootstrapping scenario describes — days to generate thousands of provably correct functions. This loop runs autonomously.
Slow loop (above the floor): Passepartout makes a prediction using an empirical model, gets experimental data back (either by performing an experiment or reading a paper), and updates the model's parameters or validity envelope. This loop requires real-world feedback. It cannot run autonomously — it needs data from the physical world.
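One step of the slow loop can be sketched as folding a new measurement into an existing parameter estimate. The source does not say how Passepartout would update parameters; the inverse-variance weighted mean used here is a standard statistical technique swapped in for illustration, and it assumes the two estimates are independent.

```python
def combine_estimates(value_a, sigma_a, value_b, sigma_b):
    """Inverse-variance weighted mean of two independent estimates.

    Returns the combined value and its standard deviation.
    """
    if sigma_a <= 0 or sigma_b <= 0:
        raise ValueError("uncertainties must be positive")
    weight_a = 1.0 / sigma_a ** 2
    weight_b = 1.0 / sigma_b ** 2
    value = (weight_a * value_a + weight_b * value_b) / (weight_a + weight_b)
    sigma = (1.0 / (weight_a + weight_b)) ** 0.5
    return value, sigma


# Hypothetical numbers: a stored parameter and a new measurement of it.
stored_value, stored_sigma = 100.0, 2.0
new_value, new_sigma = 104.0, 2.0
updated_value, updated_sigma = combine_estimates(stored_value, stored_sigma, new_value, new_sigma)
print(updated_value, updated_sigma)
```

The update itself is cheap. What makes the loop slow is the `new_value` argument, which has to come from an experiment or a paper rather than from computation.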

Both loops matter. The fast loop makes Passepartout mathematically powerful. The slow loop makes it useful for real-world science and engineering.

What this does not mean.

This does not mean Passepartout cannot handle empirical science. It means Passepartout handles it differently — with explicit uncertainty, provenance tracking, and validity boundaries, instead of pretending the model is deductively certain.

This is actually a design advantage. Most scientific software treats its parameters as if they were provably correct. Force field parameters typically ship as flat files with little machine-readable provenance. Passepartout would be able to say: "I am running the AMBER force field. The bond-stretching parameters come from a 1995 paper by Cornell et al., validated against 50 small molecules. The partial charges come from the RESP fitting procedure, applied to HF/6-31G* calculations. The validity envelope covers proteins and nucleic acids in aqueous solution at 273-373 K. Your simulation involves a lipid membrane at 350 K. The temperature is inside the validated range, but lipid membranes are not among the validated systems. The results may be qualitatively correct but the quantitative predictions should be treated with caution."

Few, if any, existing chemistry packages do this. A system that can is more useful than one that cannot, even if the underlying simulation is the same.

The broader implication.

The deductive/empirical floor is not a weakness in Passepartout's design. It is a correct description of how knowledge actually works in the physical world. Most systems pretend everything is deductively certain and hide their assumptions. Passepartout would make the assumptions explicit, trace every number to its source, and report uncertainty alongside every result.

This is what it means to build a system that does not lie to you.
