Engelmann, Bernard, and the Predictive Brain
Bernard Andrews’ critique of Engelmann raises important questions about how meaning, concepts, and understanding are formed. Much of that discussion takes place at a philosophical level. What I want to do here is bring in a perspective that I think has particular explanatory power: Predictive Processing and Active Inference.
Predictive Processing and Active Inference are not just interesting lenses or metaphors. They are increasingly widely accepted frameworks in contemporary cognitive science and neuroscience that aim to explain how perception, action, and learning are implemented in real brains. Rather than starting from abstract assumptions about meaning or rules, they start from the constraints of biological systems that must act in the world under uncertainty.
For an education-focused introduction to predictive processing and active inference, see my Substack post “Teachers Are Prediction Error Managers.” For wider context, these introductory talks are worth exploring: “Is Reality a Controlled Hallucination?” (Anil Seth), “How the brain shapes reality” (Andy Clark), and “Active Inference in the Brain” (Karl Friston).
At the core of these frameworks is the idea that the brain is constantly generating predictions about incoming sensory input and updating those predictions in response to error. Learning, perception, and understanding are all aspects of this same process. Active Inference extends this idea to action, treating behaviour as a way of testing and improving predictions rather than as a separate system bolted on to cognition.
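For readers who want to see the bare mechanics, here is a minimal sketch in Python of that basic loop: generate a prediction, compare it with what arrives, and let the precision-weighted error decide how far the belief moves. The numbers are invented and the single scalar belief is nothing like the hierarchical models in the literature; it is only meant to make the loop visible.

```python
# A minimal sketch of prediction-error-driven updating.
# Illustrative only: real predictive-processing models are hierarchical
# and far richer than this single scalar belief.

belief = 0.0             # current estimate of some hidden quantity
belief_precision = 1.0   # confidence in that estimate (1 / variance)
sensory_precision = 4.0  # how much incoming evidence is trusted

observations = [2.0, 1.8, 2.2, 1.9, 2.1]  # made-up sensory samples

for obs in observations:
    prediction_error = obs - belief
    # Precision-weighted gain: trusted evidence moves the belief more,
    # a confident belief moves less.
    gain = sensory_precision / (sensory_precision + belief_precision)
    belief += gain * prediction_error
    belief_precision += sensory_precision  # each consistent sample firms up the belief
    print(f"observed {obs:.1f} -> error {prediction_error:+.2f}, belief now {belief:.2f}")
```

The belief converges on the regularity in the input, and as its precision grows, one-off surprises shift it less and less.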
What makes this perspective particularly powerful is that it connects multiple levels of explanation. It links neural architecture, perception and action, learning over time, social interaction, and language within a single framework. It also aligns with a growing body of empirical work in neuroscience, psychology, and computational modelling.
With that framing in place, I believe Bernard’s critique of Engelmann can be re-read in a more productive way.
Underdetermined in theory, good enough in practice
Bernard is right to point out that Engelmann sometimes sounds as if, with the right sequence of examples and non-examples, meaning can be fixed in a logically watertight way. Quine’s gavagai, Goodman’s grue, and Kripke’s quus all show that this cannot literally be true. No finite set of examples can rule out every possible interpretation.
Where the discussion risks being misleading is in treating this logical result as if it were a practical problem for learning and teaching.
The underdetermination of meaning holds only at the extremes. It is a limit case. It does not follow that meaning remains radically indeterminate in ordinary learning situations.
In practice, a small number of well-chosen examples and non-examples can narrow the space of plausible interpretations extremely quickly. After that point, the remaining alternatives are technically possible, but practically irrelevant. They do not survive further experience.
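To make that concrete, here is a toy Python sketch of how fast a short sequence of examples and non-examples prunes the space of rules a learner might plausibly form for a concept like “between”. The candidate interpretations are ones I have invented for illustration; this is not Engelmann’s own formalism.

```python
# Toy illustration: examples and non-examples rapidly narrow the space of
# interpretations a learner might form for the concept "between".
# The candidate "rules" below are invented for illustration.

candidates = {
    "strictly inside the two endpoints":       lambda x, lo, hi: lo < x < hi,
    "inside or touching an endpoint":          lambda x, lo, hi: lo <= x <= hi,
    "simply greater than the lower endpoint":  lambda x, lo, hi: x > lo,
    "simply less than the upper endpoint":     lambda x, lo, hi: x < hi,
}

# (value, lower, upper, teacher's verdict): a short example/non-example sequence
teaching_sequence = [
    (5, 3, 8, True),    # example: 5 is between 3 and 8
    (3, 3, 8, False),   # non-example: 3 is not between 3 and 8
    (9, 3, 8, False),   # non-example: 9 is not between 3 and 8
    (2, 3, 8, False),   # non-example: 2 is not between 3 and 8
]

surviving = dict(candidates)
for x, lo, hi, verdict in teaching_sequence:
    surviving = {name: rule for name, rule in surviving.items()
                 if rule(x, lo, hi) == verdict}
    print(f"after ({x} between {lo} and {hi}? {verdict}): {list(surviving)}")
```

Infinitely many stranger rules would still fit those four items, so the philosophical point stands. The practical point is that, for a real learner, the plausible alternatives have collapsed after a handful of well-chosen cases.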
From a Predictive Processing perspective, this is exactly what effective instruction does. It does not aim for logical certainty. It reshapes the learner’s expectations so that one interpretation becomes overwhelmingly more successful than the others.
At that point, the learner has a generative model with a very high chance of working across future situations. That is what we normally mean by understanding.
Seen this way, Engelmann’s expansion sequences make a great deal of sense. They are not about achieving philosophical closure. They deliberately hack the brain’s natural statistical learning machinery (its sensitivity to sameness, difference, and regularity) to rapidly engineer practical, useful updates to learners’ generative models. In other words, they exploit mechanisms the brain already uses to learn from experience, but in a more targeted and accelerated way. Crucially, these initial learning episodes are not sufficient on their own. They require repeated successful re-use over time, through spaced retrieval and application, to establish salience and stabilise these generative model updates for long-term use.
Norms versus priors: the rattle example
Bernard argues that Engelmann is wrong to think meaning comes from sensory examples, and that meaning instead depends on norms of correct use. There is something right about this, but it leans too heavily on social norms and not enough on development and mechanism.
Take the rattle example. A baby does not initially “see a rattle”. But that is not because the baby lacks a social norm. It is because the baby does not yet have the necessary component priors to model a rattle as a stable object.
Those priors include expectations about object persistence, how grasping relates to sound, typical object behaviour, and the relationship between action and consequence.
Through repeated embodied interaction, models that predict these regularities reduce prediction error and are strengthened. Models that do not, fade away.
This process happens across cultures. Babies in different societies converge on similar object-level discriminations long before they know the words for them. Language and social correction come later and refine use, but they are not the starting point.
Norms matter, but they are better understood as high-level constraints that sit inside a generative model, not as something separate from learning.
The uncomfortable truth about “understanding”
There is an uncomfortable point sitting underneath all of this, and it is usually left unsaid.
I think I actually agree with Bernard on one important thing here: in practice, our judgements about whether someone “understands” something are almost always made against social norms of what counts as understanding in that domain. We decide that a student understands addition, or photosynthesis, or a grammatical rule, because they can do the kinds of things that are conventionally taken to demonstrate understanding.
In that sense, understanding is unavoidably a normative judgement.
However, if we try to define understanding as a purely individual property, independent of those social norms, things become much harder. What exactly would it mean, inside a single person, to “possess” understanding over and above the ability to respond, act, explain, and predict successfully? It quickly becomes unclear what we are even pointing to.
This is where Predictive Processing helps, because it dissolves the problem rather than trying to solve it.
From a Predictive Processing and Active Inference perspective, what we call “understanding” is nothing more and nothing less than having an effective generative model. There is no extra mental layer beyond that. There are only predictions, and how accurate those predictions turn out to be over time.
To understand something, on this view, is to have a model that reliably generates the right predictions, answers, or actions in the situations that matter. When those predictions fail, prediction error increases and the model is revised. When they succeed, the model stabilises.
The difference between shallow performance and deep understanding is not that one is probabilistic and the other is not. Both are. The difference is that a shallow model only works in narrow, familiar contexts, while a deeper model remains reliable when the context shifts.
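A deliberately crude caricature of that difference (mine, not Bernard’s or Engelmann’s): two “models” of addition, one that has memorised the practised items and one that has internalised the general procedure. Both are assessed by the same standard, the answers they generate; they only come apart when the items change.

```python
# A caricature of "shallow" versus "deep" understanding of addition.
# Both models are judged in exactly the same way: by whether their answers hold up.

practised_items = {(2, 3): 5, (4, 4): 8, (7, 2): 9}   # items met during practice

def shallow_model(a, b):
    """Memorised answers: reliable only on familiar items (None otherwise)."""
    return practised_items.get((a, b))

def deeper_model(a, b):
    """A general procedure: keeps working when the context shifts."""
    return a + b

for a, b in [(2, 3), (4, 4), (6, 7), (13, 29)]:
    print(f"{a} + {b}: shallow -> {shallow_model(a, b)}, deeper -> {deeper_model(a, b)}")
```

On familiar items the two are indistinguishable. On novel items only the deeper model keeps predicting successfully, and that continued success is all the extra “depth” amounts to on this view.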
The checkerboard illusion and the power of priors
Bernard uses the checkerboard shadow illusion to argue that perception cannot be explained purely in causal or mechanistic terms, and that judgements of “sameness” depend on the norms we are applying. In his account, the illusion shows that there is no single, norm-independent fact of the matter about whether the squares are the same colour. Whether they are treated as the same or different depends on the rules of the practice we are engaged in.
That use of the illusion is well taken. But the same illusion is also widely used in contemporary cognitive science for a different purpose: to illustrate Predictive Processing and the power of prior expectations.
From a Predictive Processing perspective, the illusion arises because the visual system is inferring the most likely causes of the sensory input. Given our experience of the world, the most probable interpretation is that a stable surface is being viewed under uneven illumination, with a shadow cast by an object above it. Situations in which it would matter to identify the two squares as exactly the same shade are rare. Navigating surfaces and interpreting objects under variable lighting are far more common.
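A toy calculation, with numbers I have invented rather than taken from any published model, makes the logic of that inference visible: once a shadow is assumed, the very same luminance at the eye implies a much lighter surface, and long experience makes the shadow interpretation by far the more probable one.

```python
# A toy sketch of perceiving surface colour as inference about causes.
# All numbers are invented; real models of lightness perception are far richer.

luminance_a = 120   # light reaching the eye from square A (in direct light)
luminance_b = 120   # light reaching the eye from square B (under the cylinder)

# Two candidate interpretations of square B, weighted by prior experience.
interpretations = {
    "B lies in a shadow cast by the cylinder": {"illumination": 0.5, "prior": 0.95},
    "B is lit exactly like A":                 {"illumination": 1.0, "prior": 0.05},
}

for name, h in interpretations.items():
    # Inferred surface lightness = sensed luminance / assumed illumination.
    inferred_surface = luminance_b / h["illumination"]
    print(f"{name} (prior {h['prior']:.2f}): "
          f"B's surface comes out as {inferred_surface:.0f}, A's as {luminance_a}")
```

Identical input, different inferred causes: the high-prior interpretation wins, and that is what we see.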
A particularly revealing feature of the illusion is that it cannot be undone by a conscious change of intent or norm. Even when we explicitly try to operate under a different rule set, for example by telling ourselves that the task is to identify squares of exactly the same colour, perception remains locked into the checkerboard pattern of light and dark squares. Knowing the correct answer does not change how the image appears.
This shows that the perception itself is not governed by norms in the first instance. It is governed by priors that are extremely strong and precise, shaped by long-term interaction with environments where lighting varies and surfaces remain stable.
What this clarifies is that the real work is being done by priors. These priors are, in effect, the individual’s internal “norms”: statistically learned expectations about how the world is structured and how sensory input is usually caused. They are probabilistic predictions derived from experience, not explicit conventions.
Social norms still matter, but their role is secondary. They are incorporated into the generative model only to the extent that they help us predict more accurately in the future, including predicting what others will do and how they will respond. They do not exist to ensure accurate representation of the world as it is “in itself”, nor to enforce conformity for its own sake. They persist because they work.
From this perspective, sameness, cognition, and understanding are not primarily social achievements. They are properties of an individual’s generative model, shaped by experience, refined by error, and stabilised by predictive success.
Rethinking Engelmann and “faultless communication”
Where Engelmann arguably overreached was in talking about faultless communication as something that could be achieved in principle. No instructional sequence can eliminate every possible alternative inference.
However, the concept of faultless communication in instructional design remains extremely powerful.
Its value is not that it delivers logical certainty, but that it forces the instructional designer to attend carefully to the structure of the learner’s sensory experience. It demands precision in the selection and sequencing of examples, non-examples, prompts, and feedback, so that the inferences learners are most likely to make, and the generative model updates they are most likely to form, are productive and effective ones.
Seen through a Predictive Processing lens, faultless communication is best understood as an aspiration to minimise unhelpful prediction error and reduce the space of plausible but fragile interpretations. It is about shaping experience so that one class of model consistently outperforms the alternatives.
This stands in contrast to more inquiry-led approaches to instruction. Inquiry-based designers are still structuring the learner’s sensory experience, but they deliberately allow a much wider space of possible inferences. That wider space may sometimes be valuable, but it also increases the likelihood that learners will settle on partial, unstable, or context-bound models that require later correction.
From this perspective, Engelmann’s insistence on faultless communication is not naïve, but strategic. It reflects a commitment to guiding learners toward robust generative models from the outset, rather than relying on later revision to repair misconceptions that were predictable consequences of underspecified experience.
It is also worth noting that Engelmann’s ideas are not just of historical interest. There is strong contemporary work focused on translating them into practical, classroom-ready approaches. In particular, Kris Boulton, with his Unstoppable Learning, is doing important work on how Engelmann’s principles can be applied thoughtfully and flexibly in real teaching contexts.
A point of agreement
Stepping back, I think there is clear common ground here. Whatever the theoretical disagreements, Engelmann’s instructional methods are extremely powerful and effective for learning. They work, often strikingly well, especially when compared with less rigorous instructional designs that leave too much of the learner’s experience underspecified.
Where Bernard and I agree is that Engelmann’s success cannot be explained by appeals to faultless communication in a strict philosophical sense. Where Predictive Processing and Active Inference add value is in explaining why those methods work so reliably. By tightly controlling the learner’s sensory experience, Engelmann’s designs constrain the space of possible inferences and guide learners toward internal priors that are well aligned with the structure of the task and, ultimately, with the structure of the world.
Learning, on this view, is neither the transmission of meaning nor mere conformity to social norms, but the gradual stabilisation of individual predictive models through cycles of prediction and error correction. These models come to align with reality not by mirroring it directly, but through a process that has been described as “controlled hallucination”, in which perception and understanding persist because they continue to make successful predictions.
Engelmann’s great contribution was not a final theory of meaning, but a practical mastery of how experience can be shaped to support robust, transferable understanding.

