Date: 2025-05-10

Author’s Preface

This essay arises from a long-standing unease with how statistical reasoning—particularly Bayesian inference—is represented in academic, applied, and popular discourse. At first glance, Bayesian methods appear to formalize something intuitive: the process of learning from experience and adjusting beliefs in light of new evidence. But the more one examines the framework, the more conceptual difficulties appear. Are the numbers meaningful? What, if anything, do they represent about the world? And is the Bayesian framework merely a calculation engine, or is it making a claim about cognition, belief, or rationality?

This is an inquiry into the relationship between formal systems and the empirical world. My background includes exposure to various mathematical domains and training in experimental psychology, yet I have consistently found statistical theory—and particularly probabilistic reasoning—conceptually difficult, ambiguous, and often internally incoherent. This essay is an attempt to trace the source of that difficulty, to clarify the limits of formal inference, and to push back against the conflation of formal coherence with epistemic validity.

Introduction

Bayesian reasoning is frequently presented as the rational ideal for updating beliefs in light of evidence. It begins with a prior judgment about a hypothesis, receives new information, and delivers a posterior probability—an allegedly improved estimate. This framing has found wide adoption in fields as diverse as machine learning, cognitive science, epidemiology, and even philosophy of science.

But what exactly is being updated? Is it a belief? A model? A state of knowledge? The claim that Bayesian outputs represent “degrees of belief” has become ubiquitous, but it raises foundational questions. Can belief be represented numerically? Does a posterior probability genuinely reflect anything about the real world, or is it just an artifact of prior assumptions? Is Bayesian inference a formal computation or a cognitive model? And if the latter, is it a good one?

This essay argues that Bayesianism, while formally elegant, suffers from a persistent category mistake: the conflation of mathematical calculation with human belief. It explores the conceptual, semantic, and epistemological weaknesses of the Bayesian framework—especially the normalization requirement, the interpretation of priors, and the elusive meaning of probability itself.

Discussion

Overview

At the highest level of generality, the core idea is intuitive and widely accepted: we can begin with some initial sense, whether vague or informed, of where the truth might lie, and then, as new evidence becomes available, update that view to better match the state of the world. This is a commonsense observation, not a mathematical claim. It’s how people adjust beliefs in everyday life: notice a pattern, revise a hunch, reconsider a judgment. The general principle is uncontroversial. It sounds obvious because it is. Who would argue against the idea of learning from evidence?

But this broadly shared intuition was later translated into a mathematical model, one usually traced to Thomas Bayes, whose essay on the subject was published posthumously in 1763. On this model, the informal idea of belief updating can be formalized through a precise statistical framework. The equation that now bears his name is intended to operationalize the process: a way to start with a “prior” probability, incorporate new data through a “likelihood,” and arrive at a “posterior” probability. What began as a commonsense insight was now encoded into a mathematical rule.

The resulting formula was, in appearance, simple:

P(H|E) = [P(E|H) × P(H)] / P(E)

Note: Read “P(…)” as “the probability of …,” a number between 0 and 1; “|” as “given that”; “H” as the hypothesis under consideration; and “E” as the newly observed evidence.

Here, the posterior probability P(H|E) is said to represent an improved estimate of the probability that a hypothesis H is true, given some new evidence E. The prior P(H) reflects one's initial assessment of the hypothesis. The likelihood P(E|H) captures how probable the new evidence would be if the hypothesis were true. The denominator P(E), the overall probability of the evidence, ensures that the posteriors across competing hypotheses sum to one and so form a proper probability distribution.
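As a minimal sketch of the arithmetic, assume the simplest possible case: only two hypotheses, H and its negation. The function name `posterior` and every input value below are hypothetical, chosen for illustration and not drawn from any data.

```python
def posterior(prior_h, like_e_given_h, like_e_given_not_h):
    """Return P(H|E) when the hypothesis space is just {H, not-H}."""
    # Numerator of the formula: P(E|H) * P(H)
    numerator = like_e_given_h * prior_h
    # Denominator P(E), expanded over the two hypotheses
    evidence = numerator + like_e_given_not_h * (1.0 - prior_h)
    return numerator / evidence


# Illustrative inputs only: P(H) = 0.30, P(E|H) = 0.80, P(E|not-H) = 0.20
p = posterior(0.30, 0.80, 0.20)
print(round(p, 3))  # 0.632
```

The computation itself is trivial. Everything of substance lies in where the three inputs come from, which is the concern of the paragraphs that follow.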

Despite the apparent simplicity, the equation conceals considerable complexity. Its application outside of textbook scenarios quickly becomes burdensome. In practice, using Bayes’ theorem often requires access to a host of quantities that are neither known nor reliably knowable. The prior probability must be assigned before any data are seen—but based on what? Prior experience? Intuition? Theoretical considerations? And the likelihood function must be specified in detail, requiring strong assumptions about how evidence behaves under each hypothesis.
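How much the choice of prior can matter is easy to show numerically. The sketch below holds the evidence fixed and varies only the prior. The helper `update`, the two-hypothesis setup, and all numbers are assumptions made for illustration.

```python
def update(prior_h, like_h, like_not_h):
    """P(H|E) for a binary hypothesis space; inputs are hypothetical."""
    numerator = like_h * prior_h
    return numerator / (numerator + like_not_h * (1.0 - prior_h))


# Identical evidence (P(E|H) = 0.80, P(E|not-H) = 0.20), three different priors
priors = [0.01, 0.30, 0.90]
results = {p: round(update(p, 0.80, 0.20), 3) for p in priors}
print(results)  # {0.01: 0.039, 0.3: 0.632, 0.9: 0.973}
```

The same evidence yields a posterior anywhere from roughly 4% to 97%, depending entirely on a quantity that had to be supplied before any data were seen.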

The denominator—the normalization term—is even more problematic. It requires summing or integrating over the probabilities of the evidence across all possible competing theoretical hypotheses, including many that may be ill-defined or completely unspecified. This step is mathematically necessary, but conceptually strained. It presumes a grasp of the full hypothesis space—a fiction in almost every real-world application.
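Written out, and assuming a finite set of mutually exclusive and exhaustive hypotheses H_1, …, H_n, the denominator is the law of total probability. The indexing and the symbol θ for a continuous parameter are notation introduced here for illustration, not the author's.

```latex
% Discrete hypothesis space
P(E) = \sum_{i=1}^{n} P(E \mid H_i)\, P(H_i),
\qquad \sum_{i=1}^{n} P(H_i) = 1

% Continuous hypothesis space, indexed by a parameter \theta
P(E) = \int P(E \mid \theta)\, p(\theta)\, d\theta
```

Every term in the sum presupposes a hypothesis stated precisely enough to receive both a prior and a likelihood. That is the requirement questioned above.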

Moreover, the updating procedure assumes a rigid structure: priors are fixed, evidence is processed in a narrowly defined way, and the output is treated as a more refined belief. But belief, in the ordinary sense, is not a product of calculation. It is shaped by context, language, interpretation, and prior knowledge that often defies formal representation.

So while Bayes’ theorem gives an elegant formula for how belief might be updated in light of evidence, its implementation relies on assumptions and inputs that are themselves beliefs—sometimes opaque, sometimes arbitrary. The idea of using data to improve understanding of the world remains valid. But the translation of that idea into formal probabilistic terms introduces fragility: hidden assumptions, computational complexity, and a persistent gap between numerical output and real-world interpretation.

In short, the common-sense principle of learning from evidence should not be confused with the mathematical machinery that claims to operationalize it. The elegance of the equation should not distract from the difficulty—and often, the impossibility—of ensuring that the parts correspond to anything stable or reliable in the world.

Conflation of the Objective with the Subjective

The idea that Bayesian updating is about “degrees of belief” is conceptually flawed. It represents a confusion between a formal computational method and a subjective psychological state. Bayesian statistics is a mathematical model—a system for manipulating probabilities under certain assumptions, often implemented with rigor and precision. But it is not a direct measure of belief, nor does it provide a window into the mental state of an individual. To equate the numerical output of a model with an internal mental judgment is to conflate fundamentally different domains.

This conflation has serious consequences. It leads to the impression that numbers derived from Bayesian calculations represent how much someone believes something, when in fact those numbers are entirely model-dependent. The model requires prior assumptions, data-conditioned likelihoods, and normalization across an abstract space of hypotheses. These are technical constructs, not mental contents. The resulting "posterior probability" is not a belief; it is a numerical value produced under a particular set of assumptions. While it may influence belief, it is not belief itself.

Moreover, treating Bayesian probabilities as degrees of belief lends an air of psychological legitimacy to what is essentially formal reasoning. It encourages the notion that Bayesian models can capture, represent, or regulate human belief formation. This is a powerful rhetorical move—but it is a category error. Human belief is rooted in language, context, culture, intuition, memory, and emotion. None of these are reducible to a single number, no matter how elegant the formula that generates it.

The Instability of Belief

Belief is not stable. It fluctuates with context, mood, social influence, and phrasing. Ordinary language gives us a rich but imprecise vocabulary for expressing belief: “certain,” “unsure,” “highly likely,” “maybe,” “could be,” “I doubt it.” These expressions are elastic, dependent on tone, situation, and conversational norms. They do not correspond neatly to numerical probabilities.

Attempts to quantify belief often result in unstable or whimsical judgments. Ask the same person, in the same situation, to assign a numerical probability to an event on different occasions, and the results are likely to vary widely. Even professional forecasters, when asked to give quantified judgments, display considerable inconsistency over time and between individuals. Numerical expressions of belief are artifacts of context and interpretation, not stable representations of internal cognitive states.

This instability stands in direct contrast to the rigidity of formal Bayesian updating. Bayesian calculations, given fixed inputs, produce fixed outputs. But human belief is not computational in this way. It is interpretive, conditional, and multi-layered. People often “believe” and “disbelieve” simultaneously in different senses or along different dimensions. That richness of belief does not survive the translation into a single number, nor should it be mistaken for one.

Why Does the Conflation Exist?

The persistent confusion between statistical modeling and belief formation likely arises from several converging factors. At first, referring to Bayesian outputs as "degrees of belief" may have been a convenient metaphor: a way to explain the idea of conditional updating in accessible terms. But over time, the metaphor hardened into a conceptual framework. The outputs of a formula came to be treated as the beliefs themselves.

This is a classic category mistake—confusing a model of something with the thing it purports to model. It is widespread in the statistical and philosophical literature, often going unremarked. One reason may be that the formalism of Bayesian inference provides a kind of intellectual comfort: it seems to offer a rational, rule-based alternative to the messiness of human cognition. But rationality in the model is not the same as rationality in the mind.

There is also a broader cultural tendency to treat mathematical models as authoritative, as though numerical precision ensures ontological or psychological accuracy. But just because a person can manipulate complex formulas does not mean they have clear or coherent ideas about belief, knowledge, or reasoning. Fluency in computation does not guarantee conceptual clarity.

Critics sometimes argue that Bayesian methods are flawed because they involve subjective priors. But this charge misses the point. All statistical modeling, Bayesian or otherwise, involves judgment: judgment about model structure, relevant variables, measurement techniques, data quality, and interpretive framing. The “subjectivity” of priors is not a defect unique to Bayesianism; it is a feature of modeling as such.

In short, the conflation persists because it is rhetorically powerful, culturally reinforced, and rarely challenged. But it obscures more than it reveals. Statistics is computation. Belief is cognition. The two may interact, but they are not the same.

The Bayesian Elephant in the Room

Do not ignore the Bayesian elephant in the room: normalization. This is not a minor technicality. It is a core structural feature of Bayesian inference, and it poses both practical and conceptual difficulties. At the heart of Bayes' theorem lies a fraction: the numerator is the product of the likelihood and the prior, and the denominator, the normalization constant, is the total probability of the observed evidence across all possible hypotheses. Its purpose is to ensure that the resulting posterior probabilities sum to one, as the axioms of probability require, so that they can be interpreted as a proper probability distribution.

In principle, this normalization seems innocuous. In practice, it is often impossible. Calculating the denominator requires knowledge of all hypotheses under consideration, including the likelihood of the observed evidence for each. But in most real-world problems, the hypothesis space is vast, unbounded, or ill-defined. No one can enumerate, much less compute, the full distribution of possibilities. This renders the normalization step practically infeasible.

Yet the problem is not merely practical—it is conceptual. The normalization constant is treated as if it corresponds to something in the real world, as if it anchors the model’s probability assignments to objective frequencies or structures. But that assumption collapses under scrutiny. The very idea of summing over “all possible hypotheses” presumes a level of access and definitional clarity that does not exist in complex domains. One cannot sum over what one cannot define. In such cases, the normalization constant becomes a formal artifact—with no tether to the external world.
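The dependence on the enumerated hypothesis space can be made concrete with a small sketch. Everything here is hypothetical: the hypothesis labels, the priors, the likelihoods, and the decision about how to re-spread prior mass once a third hypothesis is admitted.

```python
def normalize(priors, likelihoods):
    """Posterior over a finite, explicitly enumerated hypothesis space."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(E), summed only over hypotheses we listed
    return {h: joint[h] / evidence for h in joint}


# Two hypotheses considered
small = normalize({"H1": 0.5, "H2": 0.5},
                  {"H1": 0.8, "H2": 0.2})

# Same evidence, but a previously unconsidered H3 is now admitted
large = normalize({"H1": 0.4, "H2": 0.4, "H3": 0.2},
                  {"H1": 0.8, "H2": 0.2, "H3": 0.9})

print(round(small["H1"], 3), round(large["H1"], 3))  # 0.8 0.552
```

Nothing about the evidence changed between the two calls. The posterior for H1 moved only because the list of candidate hypotheses did, and the formalism offers no way of knowing whether that list is complete.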

This makes normalization a kind of mathematical fiction: a requirement of the formalism that does not correspond to empirical structure. It allows the Bayesian framework to generate probabilities that appear coherent internally but may not have meaningful interpretation. The result may be elegant, but it is only as meaningful as the assumptions and approximations that went into it. If those assumptions are opaque or arbitrary—as they often are—then the final output may be little more than numerically sanctioned speculation.

In this sense, the model is broken because it pretends to map onto a world it cannot fully represent. When normalization rests on assumptions that are untestable, ungrounded, or undefined, the resulting probabilities risk becoming formal illusions: precise answers to ill-posed questions. And if that is the case, then the model may tell us nothing of consequence.

My Personal Intellectual Strengths and Weaknesses

My main cognitive strength lies in conceptual analysis—the ability to examine abstract ideas, trace their implications, identify category errors, and critique foundational assumptions. This mode of reasoning has always come more easily than procedural or symbolic manipulation. I focus less on how to compute than on what the computation means.

Yet statistical reasoning—particularly in probability and inference—has remained the most conceptually difficult domain I’ve encountered in mathematics. This stands in contrast to other areas, such as algebra, matrix operations, and formal logic, which I’ve studied to functional competence.

The difficulty with statistics isn’t primarily computational. It’s semantic. I often find myself asking what a given probability actually means, or how a statistical construct maps onto real-world phenomena. These questions persist even when I can carry out the calculations. The ambiguity lies in interpretation, not procedure.

This raises three possibilities. Perhaps statistical reasoning requires cognitive skills distinct from those used in conceptual analysis. Perhaps the field itself contains unresolved conceptual ambiguities. Or perhaps the limitation is mine, and my difficulty reflects intellectual limits rather than structural incoherence in the field.

Still, I maintain that meaningful mathematical work depends on understanding what operations signify. The notion that formal syntax can stand apart from semantics—that rules alone confer meaning—is, in my view, untenable. It fails to address the fundamental question of what symbols refer to or how models relate to the world.

I assume there is literature—academic or otherwise—that explores the cognitive and conceptual demands of reasoning across different mathematical domains. If so, it would be highly relevant here.

Are Probability and Statistics Particularly Difficult to Understand?

It seems that a large number of people—including those with otherwise strong mathematical skills—struggle disproportionately with probability and statistics. Anecdotally and personally, I would suggest that this area of mathematics is especially difficult to grasp, not because the computations are complex (though they sometimes are), but because the concepts are often obscure, counterintuitive, or semantically unstable.

This has certainly been true in my experience. I eventually reached what I believed to be a working understanding of frequentist statistics during graduate study. But it was not an effortless process. This training was not undertaken in a mathematics department, but in the context of experimental psychology—where the conceptual foundations are rarely discussed in depth.

Despite having some facility with other areas of mathematics—including algebra, geometry, matrix operations, and others—I found statistical thinking to be singularly difficult. I suspect the reason lies in the kind of understanding required. In most mathematics, there is a clear mapping between operations and their implications. In statistics, particularly in the interpretation of probability, this mapping becomes unstable. One is not simply solving for values; one is interpreting what those values mean, and that interpretive layer is often vague or contested.

All mathematics demands more than rote computation. Understanding what a mathematical operation does, what it represents, and how it maps to reality—these are necessary. And it is precisely in the domain of probability and statistics that these challenges become most acute.

Maybe We Did Not Evolve to Do Statistical Thinking

There is reason to believe that probability and statistics run against the grain of how the human mind evolved to think. Our cognitive architecture appears to be well-suited to detecting patterns, forming narratives, and making causal inferences—but not to reasoning explicitly in terms of likelihoods or distributions.

Bayesian or frequentist reasoning requires abstract manipulations of uncertainty—operations that must be learned, not intuited. This is evidenced by the widespread cognitive biases documented in the psychological literature: base-rate neglect, the conjunction fallacy, misperceptions of randomness, overconfidence, and so on. These are not isolated errors but systematic failures of probabilistic reasoning.
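Base-rate neglect is the easiest of these to illustrate. The sketch below uses a screening-test scenario with invented numbers (a 1% base rate, 90% sensitivity, and a 9% false-positive rate) and counts cases in a hypothetical population rather than manipulating probabilities directly.

```python
# All figures are hypothetical and chosen only to make the arithmetic clean.
population = 10_000
sick = population * 1 // 100        # 1% base rate          -> 100 people
healthy = population - sick         #                       -> 9,900 people
true_pos = sick * 90 // 100         # 90% sensitivity       -> 90 positives
false_pos = healthy * 9 // 100      # 9% false-positive rate -> 891 positives

# Of everyone who tests positive, what fraction is actually sick?
p_sick_given_pos = true_pos / (true_pos + false_pos)
print(round(p_sick_given_pos, 3))  # 0.092
```

A common intuitive answer sits near the 90% sensitivity figure. The counted answer is about 9%, because the false positives among the many healthy people outnumber the true positives among the few sick ones.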

This suggests that the human brain did not evolve to reason statistically. Our ancestors needed to make quick, survival-relevant decisions in uncertain environments, not to calculate likelihood ratios or define null distributions. While individual differences certainly exist—and some people have an easier time mastering these concepts—the broader cognitive mismatch remains. Probabilistic reasoning is an acquired skill, not an innate one, and its acquisition is hampered by our native tendencies toward causal, story-driven interpretation.

In this light, the difficulty of learning probability and statistics may not reflect a personal shortcoming so much as a biological limitation—an evolutionary legacy that prioritizes intuitive heuristics over formal inference.

Mathematics Is a Language of Specialized Types

It is increasingly recognized that mathematics, with its various branches, functions as a set of highly specialized dialects of language. These dialects are concise, rule-governed systems that allow us to reason about domains where ordinary language lacks precision or expressive power. Each branch of mathematics creates its own conceptual vocabulary—algebra, calculus, set theory, logic—all offering compact frameworks for discussing structure, quantity, relation, and transformation.

Mathematics brings discipline, reduces ambiguity, and supplies internal inference rules. Unlike natural language, which is rich but often vague, mathematics enforces constraints that limit interpretive drift. For that reason, it can express ideas that would otherwise be inexpressible or unmanageable in ordinary discourse. But at its core, mathematics is still a form of language: a structured means of representing and reasoning about the world.

Even elementary arithmetic involves abstraction. While not highly abstract, it is nonetheless removed from immediate sensory experience. And evidence suggests that a rudimentary sense of number exists even in animals and in human infants prior to language acquisition. This points to a cognitive grounding for numerical concepts—but it also illustrates that mathematics as we know it is a system layered on top of this foundation, designed to express abstractions far beyond those accessible to natural cognition.

Mathematics Depends on Meaning and Calculation

Mathematics requires both the ability to perform calculations and the ability to interpret what those calculations mean. Formal manipulations—whether algebraic, statistical, or logical—are meaningless without semantic grounding. One must understand not just how to compute a result, but what the result refers to, implies, or represents in a larger context.

This dual dependence—on syntax and semantics—is fundamental. Mathematical expressions are structured strings of symbols governed by rules (syntax), but these symbols only acquire significance through interpretation (semantics). A formula means nothing unless one knows what the symbols refer to. To treat mathematics as purely syntactic is to strip it of its power to describe or reason about the world.

Syntax and Semantics are Joined at the Hip

To argue that syntax can exist and be manipulated independently of semantics is to misunderstand the nature of formal systems. Likewise, semantics cannot function without syntactic structure. They are mutually dependent. Every mathematical expression carries meaning only within a syntactic framework, and every operation performed within that framework assumes a semantic interpretation, whether explicit or implicit.

This is as true in statistical modeling as in any other branch of mathematics. The structure of a statistical test may be formally valid, but unless the elements of the model refer to something meaningful, the computation does not produce understanding. The model must be mapped to something in the world for the output to be interpreted as evidence, knowledge, or inference.

Are the Conceptual Models Underlying Statistics Flawed?

Statistical models purport to describe how the world behaves under conditions of uncertainty. But I am open to the possibility that the entire conceptual framework behind many of these models is deeply flawed—or at least, poorly matched to real-world inference. This is not a critique of the computations themselves, which may be internally consistent, but of the conceptual mapping from model to world.

Any model that claims to represent real phenomena must establish semantic contact with that domain. It cannot remain in the realm of pure abstraction. The model must refer to observable or meaningful quantities, relationships, or structures in the world. Otherwise, its outputs—however precise—are unanchored. All mapping is conceptual mapping. If the model does not map intelligibly to the domain it aims to describe, then the calculations may have no real-world relevance.

There is some evidence that such problems exist in various branches of statistics. Whether through misuse, misinterpretation, or foundational weakness, the gap between statistical formalism and empirical applicability is often wider than acknowledged.

Do Not Confuse Use of the Model With the Concepts Underlying the Model

It is critical not to confuse the usefulness of a model with the coherence of the conceptual framework that underpins it. A model might be operationally effective in some domains, but that does not validate its interpretive foundation.

Take, for example, null hypothesis significance testing in the frequentist tradition. The procedure rests on the idea of a “null hypothesis”—typically a claim of no effect or no association. But what does this mean? In real-world terms, uncorrelated events do occur, but that does not imply that the null hypothesis has deep conceptual substance. The issue is not whether the null sometimes holds in practice, but whether the entire framework of asserting and rejecting null hypotheses provides a meaningful model of reasoning.

Rejection of the null hypothesis is treated as informative, but it tells us nothing directly about the truth of the alternative hypothesis. This interpretive gap is routinely glossed over. The logical structure collapses under scrutiny. The premise is that we can model the world by setting up a straw hypothesis, calculating the probability of data under that straw, and then discarding it when the data are unlikely. But what is left standing? The model doesn’t say. Rejection is not confirmation. The entire framework is conceptually unstable, even if procedurally widespread.
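The gap between "the data are unlikely under the null" and "the null is unlikely" can be shown with a coin-flip sketch. The data (15 heads in 20 flips), the specific alternative (a heads probability of 0.75), and the equal prior weights are all assumptions introduced for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, k = 20, 15  # hypothetical data: 15 heads in 20 flips

# One-sided p-value: P(X >= 15 | H0), with H0 = fair coin
p_value = sum(binom_pmf(i, n, 0.5) for i in range(k, n + 1))

# P(H0 | data) cannot be computed without an explicit alternative and a prior.
# Assumed here: H1 = heads probability 0.75, and prior weight 0.5 on each.
like_h0 = binom_pmf(k, n, 0.5)
like_h1 = binom_pmf(k, n, 0.75)
post_h0 = (0.5 * like_h0) / (0.5 * like_h0 + 0.5 * like_h1)

print(round(p_value, 4), round(post_h0, 4))  # 0.0207 0.0681
```

The p-value is computed from the null alone and is silent about any alternative. The second number requires extra commitments that the significance test never states, and it changes whenever those commitments change.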

Do the Numbers Meaningfully Inform Us About the World?

Some claim that Bayesian statistics is a powerful formal tool for reasoning under uncertainty. But that claim cannot be accepted at face value. Its utility must be demonstrated empirically, not assumed by virtue of formal elegance. Several critical questions arise:

Do Bayesian methods actually improve our understanding of the world, even within a probabilistic framework? How would such improvement be known or measured? Is it empirical, or is it simply conjecture—an artifact of assuming that the model maps coherently onto the structure of reality?

Probabilistic mathematics is especially fraught in this regard. Unlike deterministic systems, where outputs can be definitively verified or falsified against observable outcomes, probabilistic systems traffic in likelihoods and distributions. This makes epistemic claims about the correspondence between model and world far more difficult to evaluate. For present purposes, we will take a pragmatic stance and assume there is an empirical world that models purport to describe. Without that, we're not doing science—we’re engaging in storytelling.

So the confusion arises: are we merely manipulating numbers, or are we making assertions about how the world works? Probability is not just arithmetic with uncertainty. It carries with it an implicit claim: that the world behaves in a fashion described by the structure of the model. That’s where the problem begins. For statistical models to be meaningful, there must be some valid mapping from the model to real-world phenomena. Otherwise, the numbers may be correct within the system, but say nothing of consequence about anything outside it.

Statisticians often respond that the mathematics is “formally correct.” But formal correctness is not the issue. Any expression that obeys the formation rules of the system in which it's written is well-formed, and any derivation that follows the system's inference rules is formally valid. Neither property guarantees meaning. An expression that divides by zero can be written down without violating any rule of syntax; the trouble appears only when one tries to evaluate it. One can construct endless symbol strings that are valid within a formal system and yet have no meaning at all.
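A programming language offers a loose analogy for the distinction between composing an expression and evaluating it. The sketch below uses Python's standard `ast` module. It illustrates the point in one particular formal system and is not a claim about mathematics in general.

```python
import ast

expr = "1 / 0"

# Parsing checks syntax only; this succeeds because the string is well-formed.
tree = ast.parse(expr, mode="eval")

# Evaluation is where the expression fails.
try:
    eval(compile(tree, "<expr>", "eval"))
    outcome = "evaluated"
except ZeroDivisionError:
    outcome = "ZeroDivisionError"

print(type(tree).__name__, outcome)  # Expression ZeroDivisionError
```

The parser accepts the string without complaint. Whether it denotes anything is a separate question that syntax alone does not settle.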

This is the central issue: mathematics, on its own, does not generate meaning. It generates results, and those results become meaningful only through interpretation. Intermediate steps in mathematical derivations rarely correspond directly to anything in the world. Only the inputs and outputs—when grounded—can be assigned meaning, and even then, that grounding requires an interpretive act. The mapping from symbolic manipulation to empirical relevance is always a second layer of work.

Moreover, there is a frequent error in conflating computation with meaning. Even simple symbolic systems are sometimes mistaken as inherently meaningful when they are in fact purely syntactic. A computation has semantic content only if someone assigns it. All symbols mean something only when interpreted—otherwise, they are just marks in a system governed by transformation rules.

Mathematics, then, is a form of language: a constrained, highly structured, and unusually precise language. It uses an unfamiliar vocabulary in many areas, but it can always be paraphrased in natural language—albeit at the cost of losing precision and gaining ambiguity. Its function is to express concepts concisely and with reduced interpretive slippage. But it does not automatically refer to the world. That mapping is not intrinsic; it is imposed.

Thus, mathematical models, including those used in statistics, must be understood as representations. They may function as thought experiments, analogies, or predictive devices. While it is not traditional to regard mathematics in these terms, such interpretations are legitimate. Models can offer insight—but only if the relationship between model and world is clear, justified, and critically examined. Without that, mathematical precision becomes a polished surface masking conceptual voids.

Summary

This essay critiques the widespread interpretation of Bayesian inference as a model of belief revision. While Bayesian statistics provides a formal method for updating probability estimates, its output is often misrepresented as a measure of subjective confidence or rational belief. That conflation is deeply problematic. Belief, as ordinarily experienced and expressed, is unstable, contextual, and linguistically vague. The outputs of probabilistic models, by contrast, follow deterministically from predefined assumptions.

Several conceptual problems are explored in detail: the difficulty of assigning and justifying priors; the ambiguous meaning of “likelihood”; the mathematical fiction of normalization over undefined hypothesis spaces; and the overall gap between formal computation and empirical grounding. The essay also examines the possibility that human cognition is poorly suited for statistical reasoning, and that the apparent authority of probabilistic models often obscures rather than clarifies their relationship to reality.

Ultimately, the essay challenges the claim that Bayesianism is a rational framework for belief. It argues instead that it is a formal tool, perhaps useful in constrained domains, but often misapplied or overinterpreted in complex, open-ended ones. Formal models may support reasoning, but they are not substitutes for it. Belief, like meaning, cannot be reduced to a number.

Reading List

Gigerenzer, G. (2002). Reckoning with risk: Learning to live with uncertainty. Penguin.

→ A widely readable critique of probabilistic reasoning and an exploration of how humans actually make decisions under uncertainty.

Hacking, I. (2001). An introduction to probability and inductive logic. Cambridge University Press.

→ A clear, accessible philosophical introduction to probability and its interpretation. Often available via university open-access syllabi or PDF repositories.

Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge University Press.

→ Though technical in parts, this work is often cited by Bayesian proponents and is available in full draft form online via https://bayes.wustl.edu/. It serves as a touchstone for understanding Bayesian epistemology and its assumptions.

Porter, T. M. (1995). Trust in numbers: The pursuit of objectivity in science and public life. Princeton University Press.

→ Offers a historical and philosophical account of how quantification became associated with rationality and authority. Excerpts and chapters are widely available via institutional archives.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.

→ Foundational paper in cognitive psychology; open access via https://www.science.org or institutional links. Illustrates systematic departures from probabilistic rationality in human judgment.

Sloman, S. A., & Fernbach, P. (2017). The knowledge illusion: Why we never think alone. Riverhead Books.

→ Though not strictly statistical, this work highlights cognitive limitations in individual reasoning, relevant to understanding belief formation and model use.