Who Is Closest to Passepartout?

The Four Architectural Properties
The Academic Landscape
The Gap
What This Means
References

A survey of academic researchers whose work overlaps with Passepartout's architecture along specific dimensions. The conclusion: no academic group combines all four architectural properties that define Passepartout's design. The closest groups each hold one or two pieces; none combine all.

The Four Architectural Properties

LLM-level generator with full creative freedom — the generator synthesizes entire implementations from specifications, not individual tactic steps or hole-fillings.
Theorem-prover verification with complete functional correctness — the verifier checks all execution paths against the full spec, not bounded verification via SMT solvers.
Asymmetric authority — the symbolic component (prover) is the final authority and cannot be overridden by the neural component.
Counterexample-guided retry loop — when the prover rejects an implementation, it returns a concrete counterexample that the generator uses to reformulate.

The Academic Landscape

LLM + Theorem Prover Loops

Researcher	Institution	System	Match	Divergence
Sean Welleck	CMU	ImProver 2	Self-improving LMs generating proof steps verified by Lean	Generator fills tactic holes in existing proofs, not full implementations. Camp B.
Timon Gehr	ETH	COPRA, Thor	LLM interacts with theorem prover kernel	Same constraint: tactic-level. Neural component generates one move at a time.
Kaiyu Yang	Princeton	LINC	Neural network learns symbolic rules, prover checks consistency	Neural component is a learner discovering rules from data, not a generator synthesizing from spec. Different abstraction level.

All three are Camp B in the loop taxonomy (constrained generator + complete verifier). None gives the LLM freedom to synthesize full implementations. Welleck's ImProver is the closest in spirit — the loop iterates, the prover is authoritative — but the scope of what the generator produces is orders of magnitude smaller than what Passepartout's design requires.

Synthesis + Verification (non-LLM)

Researcher	Institution	System	Match	Divergence
Armando Solar-Lezama	MIT	Sketch	Synthesis-aided verification: partial program → solver fills holes → assertions checked	Generator is constraint-based SAT/SMT, not an LLM. Verification is bounded (solver capacity).
Emina Torlak	UW	Rosette	Solver-aided languages for synthesis + verification	Same constraints as Sketch. Bounded, non-LLM.
Swarat Chaudhuri	UT Austin	Neurosymbolic Programming	Neural networks guide program synthesis, symbolic analysis verifies	Uses SMT for bounded verification, not theorem prover for complete. Neural-symbolic are symmetric collaborators, not asymmetric authority.

Chaudhuri is the closest overall academic neighbor. His group explicitly works on combining neural and symbolic components, with symbolic verification of neural-generated candidates. But the verification is bounded (SMT), not complete (theorem prover), and the loop does not have Passepartout's asymmetric authority design.

Lisp as Infrastructure for Verification

Researcher	Institution	System	Match	Divergence
Christian Schafmeister	Temple	Clasp	Common Lisp through LLVM; interactive Lisp for serious computation	Lisp infrastructure, not a neurosymbolic loop. No ACL2 integration.
Kaufmann, Moore	UT Austin / Retired	ACL2	The theorem prover itself	Pure symbolic verification. No LLM loop.

Schafmeister is aligned with Passepartout on the "why Lisp" question — interactive development, uniform representation, C++ interop for performance — but does not work on agentic verification loops.

Autonomous Code Modification Loops

Researcher	Institution	System	Match	Divergence
Kevin Ellis	Cornell	DreamCoder	Neural program synthesis loop: generate → execute → learn	Verifier is interpreter (does it run?), not prover (is it correct?). Camp A.
Andrej Karpathy	Anthropic	autoresearch	LLM modifies code, runs experiments, keeps/discards based on metric	Verifier is val_bpb — a single empirical number. No specification, no formal guarantee. Camp C.

Both prove the viability of the autonomous loop concept but use the weakest possible verifiers (execution and empirical metrics).

The Bitter Lesson / Temporal Credit Assignment (Sutton)

Researcher	Institution	System	Match	Divergence
Richard Sutton	Alberta / Keen Technologies	TD learning, eligibility traces, Alberta Plan	The fundamental problem in verification — an action was checked, but the consequence plays out hours later; was the action correct? — is the same problem TD learning solves in RL: assigning credit to actions based on delayed outcomes. Sutton's temporal credit assignment work is the theory you would need to extend Passepartout from per-action gates to trajectory-level verification. His Bitter Lesson (scale beats engineered knowledge at sufficient compute) is the most commonly cited argument against the symbolic verification approach Passepartout bets on.	The Bitter Lesson is not anti-knowledge — it says methods that improve with more computation eventually dominate. Passepartout's gate is a deliberately small engineered knowledge system that won't benefit from more compute (the ACL2 lemmas don't get more correct with more hardware). That's acceptable because the gate is a narrow bottleneck (permit/deny). The LLM layer inside the gate does benefit from scale. The architecture already respects the Bitter Lesson by placing the scalable piece where scale helps and the non-scalable piece where deductive certainty matters. Sutton's Alberta Plan (world model + reward + learning algorithm) parallels Passepartout's Stage 6 (world model + gate + verified fine-tuning), but Sutton's agents learn by pure reward while Passepartout's learn by reward constrained by verified policy. Sutton would likely argue that a learned safety policy at scale would outcompete the gate. Passepartout's bet is that access control, message authentication, and compliance should never be probabilistic, even at infinite scale.

Integrate-Symbolic-Into-Neural (Garcez)

Researcher	Institution	System	Match	Divergence
Artur d'Avila Garcez	City, Univ. of London	NeSy frameworks, NSL	Pioneer of neural-symbolic computation since 1990s. Book: "Neural-Symbolic Cognitive Reasoning." Runs NeSy workshop series.	His approach integrates symbolic knowledge into neural networks (logic regularization, knowledge distillation). Symbolic rules are a training signal, not a runtime verifier. The neural component can override symbolic constraints through the loss landscape. No asymmetric authority, no theorem prover, no complete verification. His camp is "make neural networks behave more symbolically." Passepartout's camp is "make neural networks accountable to symbolic verification." Opposite architectural philosophy.

Garcez's position in the design space is closest to Camp A (no independent verifier). The symbolic rules guide learning but do not veto outputs at runtime. His work is foundational for the field of neural-symbolic computation, but his architectural philosophy is the inverse of Passepartout's. He wants the symbolic inside the neural. Passepartout wants them separate with the symbolic holding authority.

Theorist of the Hybrid Thesis (Marcus)

Researcher	Institution	System	Match	Divergence
Gary Marcus	NYU Emeritus, Robust.AI	None (theorist/critic)	Longest-standing public advocate for hybrid AI. Since "The Algebraic Mind" (2001) and "Rebooting AI" (2019), he has argued deep learning alone cannot achieve systematicity, composition, or reasoning. He identified the need for the approach Passepartout implements. As of May 2026, he is publicly asking why LLM agent frameworks are not using LEAN as a theorem-prover verifier — the same engineering gap Passepartout occupies.	He does not propose a specific architecture or loop design. His background is cognitive science and developmental psychology, not formal verification. The "symbolic component" he advocates is abstract — structured knowledge representations, not ACL2 theorem proving. He has no answer to the cost/feasibility question (the "Better is Cheaper" argument is Passepartout's contribution, not Marcus's). He is a theorist of the problem, not an architect of the solution — though his May 2026 tweet shows he is now engaging with the engineering question directly.

Marcus occupies a category that does not appear in the loop taxonomy (Camps A-D) because he does not define a loop. He identifies the need for hybrid AI with genuine symbolic authority. Passepartout is the engineering response to the thesis Marcus has been arguing since before most of the field would admit the limitations existed. His May 2026 tweet asking "they aren't using LEAN in one of those many tools?" is the theorist noticing the empty cell Passepartout was designed to fill.

The Gap

Property	Passepartout	Closest academic	Academic's limiter
Generator freedom	Full synthesis from spec	ImProver (Welleck)	Fills tactic holes only
Verification completeness	Complete (theorem prover)	Sketch (Solar-Lezama)	Bounded (SMT)
Asymmetric authority	Neural cannot override prover	Neurosymbolic Prog (Chaudhuri)	Symmetric collaboration
Counterexample feedback	Structured from prover to LLM	ImProver (Welleck)	Pass/fail at tactic level
Two symbolic layers	Gates + prover independent	None	No second layer exists

No academic group combines all four properties. The closest — Chaudhuri — has three of five (neural + symbolic + verification) but fails on verification completeness (SMT not ACL2), asymmetric authority (symmetric not asymmetric), and the two-layer gate design.

What This Means

The gap is either:

A genuinely empty cell in the design space. The combination is novel, the components have not converged in one system before, and Passepartout's design is early.
A sign that the combination is not as valuable as the components. No major academic lab has invested in this specific loop because the cost of writing complete formal specifications exceeds the benefit of complete formal verification, given the alternative of bounded verification (SMT) with cheaper spec costs.

The way to distinguish (1) from (2) is to build the architecture and measure whether the spec-writing cost is amortized over enough synthesized implementations to justify it. Passepartout's answer is: yes, because specs are written once and implementations are generated for every deployment context. The academic literature has not tested this claim.

References

Neurosymbolic Loop Architectures — the taxonomy that positions these comparisons
Neurosymbolic AI Paper Library — papers referenced above are in the local library

12 KiB Raw Blame History