
Book Review: The Case Against Reality

Nima Arkani-Hamed shows up surprisingly rarely in popular science books. A major figure in my former field, Nima is extremely quotable (frequent examples include “spacetime is doomed” and “the universe is not a crappy metal”), but those quotes don’t seem to have quite reached the popular physics mainstream. He’s been interviewed in books by physicists, and has a major role in one popular physics book that I’m aware of. From this scattering of mentions, I was quite surprised to hear of another book where he makes an appearance: not a popular physics book at all, but a popular psychology book: Donald Hoffman’s The Case Against Reality. Naturally, this meant I had to read it.

Then, I saw the first quote on the back cover…or specifically, who was quoted.

Seeing that, I settled in for a frustrating read.

A few pages later, I realized that this, despite his endorsement, is not a Deepak Chopra kind of book. Hoffman is careful in some valuable ways. Specifically, he has a philosopher’s care, bringing up objections and potential holes in his arguments. As a result, the book wasn’t frustrating in the way I expected.

It was even more frustrating, actually. But in an entirely different way.

When a science professor writes a popular book, the result is often a kind of ungainly Frankenstein. The arguments we want to make tend to be better-suited to shorter pieces, like academic papers, editorials, and blog posts. To make these into a book, we have to pad them out. We stir together all the vaguely related work we’ve done, plus all the best-known examples from other people’s work, trying (often not all that hard) to make the whole sound like a cohesive story. Read enough examples, and you start to see the joints between the parts.

Hoffman is ostensibly trying to tell a single story. His argument is that the reality we observe, of objects in space and time, is not the true reality. It is a convenient reality, one that has led to our survival, but evolution has not (and as he argues, cannot) let us perceive the truth. Instead, he argues that the true reality is consciousness: a world made up of conscious beings interacting with each other, with space, time, and all the rest emerging as properties of those interactions.

That certainly sounds like it could be one, cohesive argument. In practice, though, it is three, and they don’t fit together as well as he’d hope.

Hoffman is trained as a psychologist. As such, one of the arguments is psychological: that research shows that we mis-perceive the world in service of evolutionary fitness.

Hoffman is a cognitive scientist, and while many cognitive scientists are trained as psychologists, others are trained as philosophers. As such, one of his arguments is philosophical: that the contents of consciousness can never be explained by relations between material objects, and that evolution, and even science, systematically lead us astray.

Finally, Hoffman has evidently been listening to and reading the work of some physicists, like Nima and Carlo Rovelli. As such, one of his arguments is physical: that physicists believe that space and time are illusions and that consciousness may be fundamental, and that the conclusions of the book lead to his own model of the basic physical constituents of the world.

The book alternates between these three arguments, so rather than in chapter order, I thought it would be better to discuss each argument in its own section.

The Psychological Argument

Sometimes, when two academics get into a debate, they disagree about what’s true. Two scientists might argue about whether an experiment was genuine, whether the statistics back up a conclusion, or whether a speculative theory is actually consistent. These are valuable debates, and worth reading about if you want to learn something about the nature of reality.

Sometimes, though, two debating academics agree on what’s true, and just disagree on what’s important. These debates are, at best, relevant to other academics and funders. They are not generally worth reading for anybody else, and are often extremely petty and dumb.

Hoffman’s psychological argument, regrettably, is of the latter kind. He would like to claim it’s the former, and to do so he marshals a host of quotes from respected scientists that claim that human perception is veridical: that what we perceive is real, courtesy of an evolutionary process that would have killed us off if it wasn’t. From that perspective, every psychological example Hoffman gives is a piece of counter-evidence, a situation where evolution doesn’t just fail to show us the true nature of reality, but actively hides reality from us.

The problem is that, if you actually read the people Hoffman quotes, they’re clearly not making the extreme point he claims. These people are psychologists, and all they are arguing is that perception is veridical in a particular, limited way. They argue that we humans are good at estimating distances or positions of objects, or that we can see a wide range of colors. They aren’t making some sort of philosophical point about those distances or positions or colors being how the world “really is”, nor are they claiming that evolution never makes humans mis-perceive.

Instead, they, and thus Hoffman, are arguing about importance. When studying humans, is it more useful to think of us as perceiving the world as it is? Or is it more useful to think of evolution as tricking us? Which happens more often?

The answers to each of those questions have to be “it depends”. Neither answer can be right all the time. At most then, this kind of argument can convince one academic to switch from researching in one way to researching in another, by saying that right now one approach is a better strategy. It can’t tell us anything more.

If the argument Hoffman is trying to get across here doesn’t matter, are there other reasons to read this part?

Popular psychology books tend to re-use a few common examples. There are some good ones, so if you haven’t read such a book you probably should read a couple, just to hear about them. For example, Hoffman tells the story of the split-brain patients, which is definitely worth being aware of.

(Those of you who’ve heard that story may be wondering how the heck Hoffman squares it with his idea of consciousness as fundamental. He actually does have a (weird) way to handle this, so read on.)

The other examples come from Hoffman’s research, and other research in his sub-field. There are stories about what optical illusions tell us about our perception, about how evolution primes us to see different things as attractive, and about how advertisers can work with attention.

These stories would at least be a source of a few more cool facts, but I’m a bit wary. The elephant in the room here is the replication crisis. Paper after paper in psychology has turned out to be a statistical mirage, accidental successes that fail to replicate in later experiments. This can happen without any deceit on the part of the psychologist; it’s just a feature of how statistics are typically done in the field.

Some psychologists make a big deal about the replication crisis: they talk about the statistical methods they use, and what they do to make sure they’re getting a real result. Hoffman talks a bit about tricks to rule out other explanations, but mostly doesn’t focus on this kind of thing. This doesn’t mean he’s doing anything wrong: it might just be that it’s off-topic. But it makes it a bit harder to trust him, compared to other psychologists who do make a big deal about it.
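To make the “statistical mirage” point a bit more concrete, here is a minimal sketch of one textbook way it can arise: running many tests and reporting whichever one clears the usual significance threshold. This is my own illustration, not something from the book, and it assumes independent tests of an effect that doesn’t exist; the function name and the 0.05 threshold are just illustrative choices.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests of a
    nonexistent effect comes out 'significant' at threshold alpha."""
    # Each test stays (correctly) non-significant with probability 1 - alpha.
    return 1 - (1 - alpha) ** n_tests


# Assumed, illustrative numbers of tests per study.
for n in (1, 5, 20):
    print(n, round(chance_of_false_positive(n), 3))
```

With a single test the chance of a spurious hit is the advertised 5%, but by twenty tests it is closer to two in three, with no dishonesty required.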

The Philosophical Argument

Hoffman structures his book around two philosophical arguments, one that appears near the beginning and another that, as he presents it, is the core thesis of the book. He calls both of these arguments theorems, a naming choice sure to irritate mathematicians and philosophers alike, but the mathematical content in either is for the most part not the point: in each case, the philosophical setup is where the arguments get most of their strength.

The first of these arguments, called The Scrambling Theorem, is set up largely as background material: not his core argument, but just an entry into the overall point he’s making. I found it helpful as a way to get at his reasoning style, the sorts of things he cares about philosophically and the ones he doesn’t.

The Scrambling Theorem is meant to weigh in on the debate over a thought experiment called the Inverted Spectrum, which in turn weighs on the philosophical concept of qualia. The Inverted Spectrum asks us to imagine someone who sees the spectrum of light inverted compared to how we see it, so that green becomes red and red becomes green, without anything different about their body or brain. Such a person would learn to refer to colors the same ways that we do, still referring to red blood even though they see what we see when we see green grass. Philosophers argue that, because we can imagine this, the “qualia” we see in color, like red or green, are distinct from their practical role: they are images in the mind’s eye that can be compared across minds, but do not correspond to anything we have yet characterized scientifically in the physical world.

As a response, other philosophers argued that you can’t actually invert the spectrum. Colors aren’t really a wheel: we can distinguish, for example, more colors between red and blue than between green and yellow. Just flipping colors around would have detectable differences that would have to have physical implications; you can’t just swap qualia and nothing else.

The Scrambling Theorem is in response to this argument. Hoffman argues that, while you can’t invert the spectrum, you can scramble it. By swapping not only the colors, but the relations between them, you can arrange any arbitrary set of colors however else you’d like. You can declare that green not only corresponds to blood and not grass, but that it has more colors between it and yellow, perhaps by stealing them from the other side of the color wheel. If you’re already allowed to swap colors and their associations around, surely you can do this too, and change order and distances between them.

Believe it or not, I think Hoffman’s argument is correct, at least in its original purpose. You can’t respond to the Inverted Spectrum just by saying that colors are distributed differently on different sides of the color wheel. If you want to argue against the Inverted Spectrum, you need a better argument.

Hoffman’s work happens to suggest that better argument. Because he frames this argument in the language of mathematics, as a “theorem”, Hoffman’s argument is much more general than the summary I gave above. He is arguing that not merely can you scramble colors, but anything you like. If you want to swap electrons and photons, you can: just make your photons interact with everything the way electrons did, and vice versa. As long as you agree that the things you are swapping exist, according to Hoffman, you are free to exchange them and their properties any way you’d like.

This is because, to Hoffman, things that “actually exist” cannot be defined just in terms of their relations. An electron is not merely a thing that repels other electrons and is attracted to protons and so on, it is a thing that “actually exists” out there in the world. (Or, as he will argue, it isn’t really. But that’s because in the end he doesn’t think electrons exist.)

(I’m tempted to argue against this with a mathematical object like group elements. Surely the identity element of a group is defined by its relations? But I think he would argue identity elements of groups don’t actually exist.)

In the end, Hoffman is coming from a particular philosophical perspective, one common in modern philosophers of metaphysics, the study of the nature of reality. From this perspective, certain things exist, and are themselves by necessity. We cannot ask what if a thing were not itself. For example, in this perspective it is nonsense to ask what if Superman was not Clark Kent, because the two names refer to the same actually existing person.

(If, you know, Superman actually existed.)

Despite the name of the book, Hoffman is not actually making a case against reality in general. He very much seems to believe in this type of reality, in the idea that there are certain things out there that are real, independent of any purely mathematical definition of their properties. He thinks they are different things than you think they are, but he definitely thinks there are some such things, and that it’s important and scientifically useful to find them.

Hoffman’s second argument is, as he presents it, the core of the book. It’s the argument that’s supposed to show that the world is almost certainly not how we perceive it, even through scientific instruments and the scientific method. Once again, he calls it a theorem: the Fitness Beats Truth theorem.

The Fitness Beats Truth argument begins with a question: why should we believe what we see? Why do we expect that the things we perceive should be true?

In Hoffman’s mind, the only answer is evolution. If we perceived the world inaccurately, we would die out, replaced by creatures that perceived the world better than we did. You might think we also have evidence from biology, chemistry, and physics: we can examine our eyes, test them against cameras, see how they work and what they can and can’t do. But to Hoffman, all of this evidence may be mistaken, because to learn biology, chemistry, and physics we must first trust that we perceive the world correctly to begin with. Evolution, though, doesn’t rely on any of that. Even if we aren’t really bundles of cells replicating through DNA and RNA, we should still expect something like evolution, some process by which things differ, are selected, and reproduce their traits differently in the next generation. Such things are common enough, and general enough, that one can (handwavily) expect them through pure reason alone.

But, says Hoffman’s psychology experience, evolution tricks us! We do mis-perceive, and systematically, in ways that favor our fitness over reality. And so Hoffman asks, how often should we expect this to happen?

The Fitness Beats Truth argument thinks of fitness as randomly distributed: some parts of reality historically made us more fit, some less. This distribution could match reality exactly, so that for any two things that are actually different, they will make us fit in different ways. But it doesn’t have to. There might easily be things that are really very different from each other, but which are close enough from a fitness perspective that to us they seem exactly the same.

The “theorem” part of the argument is an attempt to quantify this. Hoffman imagines a pixelated world, and asks how likely it is that a random distribution of fitness matches a random distribution of pixels. This gets extremely unlikely for a world of any reasonable size, for pretty obvious reasons. Thus, Hoffman concludes: in a world with evolution, we should almost always expect it to hide something from us. The world, if it has any complexity at all, has an almost negligible probability of being as we perceive it.
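For a feel for why this gets unlikely so fast, here is a toy version of the counting. It is a sketch based on my summary above, not Hoffman’s actual formulation: I assume each world state is independently assigned one of a fixed set of payoff values uniformly at random, and I say fitness “matches” truth when no two distinct states share a payoff. The function name and both parameters are hypothetical.

```python
from math import perm


def prob_fitness_tracks_truth(n_states: int, n_payoffs: int) -> float:
    """Chance that a uniformly random payoff assignment gives every
    distinct world state its own distinct payoff (toy assumption)."""
    # Assignments with no repeated payoff, divided by all assignments.
    # perm() returns 0 when there are fewer payoffs than states.
    return perm(n_payoffs, n_states) / n_payoffs ** n_states


# Assumed toy sizes: as many available payoff values as world states.
for n in (2, 5, 10, 20):
    print(n, prob_fitness_tracks_truth(n, n))
```

Even with only ten states, fewer than one random assignment in a thousand keeps all of them distinguishable, and the fraction keeps shrinking as the world grows. Note that the independence assumption is doing all the work here.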

On one level, this is all kind of obvious. Evolution does trick us sometimes, just as it tricks other animals. But Hoffman is trying to push this quite far, to say that ultimately our whole picture of reality, not just our eyes and ears and nose but everything we see with microscopes and telescopes and calorimeters and scintillators, all of that might be utterly dramatically wrong. Indeed, we should expect it to be.

In this house, we tend to dismiss the Cartesian Demon. If you have an argument that makes you doubt literally everything, then it seems very unlikely you’ll get anything useful from it. Unlike with Descartes’s Demon, though, Hoffman thinks we won’t be tricked forever. The tricks evolution plays on us mattered in our ancestral environment, but over time we move to stranger and stranger situations. Eventually, our fitness will depend on something new, and we’ll need to learn something new about reality.

This means that ultimately, despite the skeptical cast, Hoffman’s argument fits with the way science already works. We are, very much, trying to put ourselves in new situations and test whether our evolved expectations still serve us well or whether we need to perceive things anew. That is precisely what we in science are always doing, every day. And as we’ll see in the next section, whatever new things we have to learn have no particular reason to be what Hoffman thinks they should be.

But while it doesn’t really matter, I do still want to make one counter-argument to Fitness Beats Truth. Hoffman considers a random distribution of fitness, and asks what the chance is that it matches truth. But fitness isn’t independent of truth, and we know that not just from our perception, but from deeper truths of physics and mathematics. Fitness is correlated with truth, fitness often matches truth, for one key reason: complex things are harder than simple things.

Imagine a creature evolving an eye. They have a reason, based on fitness, to need to know where their prey is moving. If evolution was a magic wand, and chemistry trivial, it would let them see their prey, and nothing else. But evolution is not magic, and chemistry is not trivial. The easiest thing for this creature to see is patches of light and darkness. There are many molecules that detect light, because light is a basic part of the physical world. To detect just prey, you need something much more complicated, molecules and cells and neurons. Fitness imposes a cost, and it means that the first eyes that evolve are spots, detecting just light and darkness.

Hoffman asks us not to assume that we know how eyes work, that we know how chemistry works, because we got that knowledge from our perceptions. But the nature of complexity and simplicity, entropy and thermodynamics and information, these are things we can approach through pure thought, as much as evolution. And those principles tell us that it will always be easier for an organism to perceive the world as it truly is than not, because the world is most likely simple and it is most likely the simplest path to perceive it directly. When benefits get high enough, when fitness gets strong enough, we can of course perceive the wrong thing. But if there is only a small fitness benefit to perceiving something incorrectly, then simplicity will win out. And by asking simpler and simpler questions, we can make real durable scientific progress towards truth.

The Physical Argument

So if I’m not impressed by the psychology or the philosophy, what about the part that motivated me to read the book in the first place, the physics?

Because this is, in a weird and perhaps crackpot way, a physics book. Hoffman has a specific idea, more specific than just that the world we perceive is an evolutionary illusion, more specific than that consciousness cannot be explained by the relations between physical particles. He has a proposal, based on these ideas, one that he thinks might lead to a revolutionary new theory of physics. And he tries to argue that physicists, in their own way, have been inching closer and closer to his proposal’s core ideas.

Hoffman’s idea is that the world is made, not of particles or fields or anything like that, but of conscious agents. You and I are, in this picture, certainly conscious agents, but so are the sources of everything we perceive. When we reach out and feel a table, when we look up and see the Sun, those are the actions of some conscious agent intruding on our perceptions. Unlike panpsychists, who believe that everything in the world is conscious, Hoffman doesn’t believe that the Sun itself is conscious, or is made of conscious things. Rather, he thinks that the Sun is an evolutionary illusion that rearranges our perceptions in a convenient way. The perceptions still come from some conscious thing or set of conscious things, but unlike in panpsychism they don’t live in the center of our solar system, or in any other place (space and time also being evolutionary illusions in this picture). Instead, they could come from something radically different that we haven’t imagined yet.

Earlier, I mentioned split brain patients. For anyone who thinks of conscious beings as fundamental, split brain patients are a challenge. These are people who, as a treatment for epilepsy, had the bridge between the two halves of their brain severed. The result is eerily as if their consciousness was split in two. While they only express one train of thought, that train of thought seems to only correspond to the thoughts of one side of their brain, controlling only half their body. The other side, controlling the other half of their body, appears to have different thoughts, different perceptions, and even different opinions, which are made manifest when instead of speaking they use that side of their body to gesture and communicate. While some argue that these cases are over-interpreted and don’t really show what they’re claimed to, Hoffman doesn’t. He accepts that these split-brain patients genuinely have their consciousness split in two.

Hoffman thinks this isn’t a problem because for him, conscious agents can be made up of other conscious agents. Each of us is conscious, but we are also supposed to be made up of simpler conscious agents. Our perceptions and decisions are not inexplicable, but can be explained in terms of the interactions of the simpler conscious entities that make us up, each one communicating with the others.

Hoffman speculates that everything is ultimately composed of the simplest possible conscious agents. For him, a conscious agent must do two things: perceive, and act. So the simplest possible agent perceives and acts in the simplest possible way. They perceive a single bit of information: 0 or 1, true or false, yes or no. And they take one action, communicating a different bit of information to another conscious agent: again, 0 or 1, true or false, yes or no.
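As a rough sketch of how little there is to such an agent, here is the description above turned into code. It is my own simplification, not Hoffman’s formal model: I assume the agents are deterministic, so each one is just a rule for which bit to send after perceiving a 0 or a 1. The policy names and the exchange function are made up for illustration.

```python
# All deterministic one-bit agents (an assumption of this sketch):
# each entry is (bit sent after perceiving 0, bit sent after perceiving 1).
POLICIES = {
    "always_0": (0, 0),
    "copy": (0, 1),
    "flip": (1, 0),
    "always_1": (1, 1),
}


def act(policy: str, perceived_bit: int) -> int:
    """The single action: send one bit, chosen from one perceived bit."""
    return POLICIES[policy][perceived_bit]


def exchange(policy_a: str, policy_b: str, first_bit: int, rounds: int) -> list:
    """Two agents pass a bit back and forth; return the bits sent."""
    sent = []
    bit = first_bit
    for step in range(rounds):
        policy = policy_a if step % 2 == 0 else policy_b
        bit = act(policy, bit)
        sent.append(bit)
    return sent


print(exchange("copy", "flip", 0, 4))
```

Under that assumption there are only four possible agents, so anything interesting has to come from how they are wired together.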

Hoffman thinks that this could be the key to a new theory of physics. Instead of thinking about the world as composed of particles and fields, think about it as composed of these simple conscious agents, each one perceiving and communicating one bit at a time.

Hoffman thinks this, in part, because he sees physics as already going in this direction. He’s heard that “spacetime is doomed”, he’s heard that quantum mechanics is contextual and has no local realism, he’s heard that quantum gravity researchers think the world might be a hologram and space-time has a finite number of bits. This all “rhymes” enough with his proposal that he’s confident physics has his back.

Hoffman is trained in psychology. He seems to know his philosophy, at least enough to engage with the literature there. But he is absolutely not a physicist, and it shows. Time and again it seems like he relies on “pop physics” accounts that superficially match his ideas without really understanding what the physicists are actually talking about.

He keeps up best when it comes to interpretations of quantum mechanics, a field where concepts from philosophy play a meaningful role. He covers the reasons why quantum mechanics keeps philosophers up at night: Bell’s Theorem, which shows that a theory that matches the predictions of quantum mechanics cannot both be “realist”, with measurements uncovering pre-existing facts about the world, and “local”, with things only influencing each other at less than the speed of light; the broader notion of contextuality, where measured results are dependent on which other measurements are made; and the various experiments showing that both of these properties hold in the real world.

These two facts, and their implications, have spawned a whole industry of interpretations of quantum mechanics, where physicists and philosophers decide which side of various dilemmas to take and how to describe the results. Hoffman quotes a few different “non-realist” interpretations: Carlo Rovelli’s Relational Quantum Mechanics, Quantum Bayesianism/QBism, Consistent Histories, and whatever Chris Fields is into. These are all different from one another, which Hoffman is aware of. He just wants to make the case that non-realist interpretations are reasonable, that the physicists collectively are saying “maybe reality doesn’t exist” just like he is.

The problem is that Hoffman’s proposal is not, in the quantum mechanics sense, non-realist. Yes, Hoffman thinks that the things we observe are just an “interface”, that reality is really a network of conscious agents. But in order to have a non-realist interpretation, you need to also have other conscious agents not be real. That’s easily seen from the old “Wigner’s friend” thought experiment, where you put one of your friends in a Schrödinger’s cat-style box. Just as Schrödinger’s cat can be both alive and dead, your friend can both have observed something and not have observed it, or observed something and observed something else. The state of your friend’s mind, just like everything else in a non-realist interpretation, doesn’t have a definite value until you measure it.

Hoffman’s setup doesn’t, and can’t, work that way. His whole philosophical project is to declare that certain things exist and others don’t: the sun doesn’t exist, conscious agents do. In a non-realist interpretation, the sun and other conscious agents can both be useful descriptions, but ultimately nothing “really exists”. Science isn’t a catalogue of what does or doesn’t “really exist”, it’s a tool to make predictions about your observations.
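For readers who want to see what Bell’s Theorem, mentioned above, rules out, here is a small sketch using the standard CHSH form of the argument. This is the textbook version, not anything specific to Hoffman’s book. The “local realist” side assumes each particle carries pre-existing answers of +1 or -1 for both possible measurement settings; the quantum side uses the usual singlet-state correlation, minus the cosine of the angle between the two analyzers. The function names and the particular angles are my choices.

```python
from itertools import product
from math import cos, pi, sqrt


def chsh(E):
    """CHSH combination of correlations for settings 0 and 1 on each side."""
    return E(0, 0) - E(0, 1) + E(1, 0) + E(1, 1)


def local_realist_bound():
    """Largest |S| when both sides hold pre-existing answers of +1 or -1."""
    best = 0
    for a0, a1, b0, b1 in product((-1, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        s = chsh(lambda i, j: alice[i] * bob[j])
        best = max(best, abs(s))
    return best


# Analyzer angles for each side (a standard choice that maximizes |S|).
ALICE_ANGLES = (0.0, pi / 2)
BOB_ANGLES = (pi / 4, 3 * pi / 4)


def singlet_correlation(i, j):
    """Quantum prediction for a spin singlet: minus cos of the relative angle."""
    return -cos(ALICE_ANGLES[i] - BOB_ANGLES[j])


print(local_realist_bound())           # the local realist limit
print(abs(chsh(singlet_correlation)))  # the quantum prediction
print(2 * sqrt(2))                     # for comparison
```

Every assignment of pre-existing answers gives a value of at most 2, and random mixtures of such assignments can’t do better, while the quantum prediction reaches 2√2. That gap is what the experiments test.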

Hoffman gets even more confused when he gets to quantum gravity. He starts out with a common misconception: that the Planck length represents the “pixels” of reality, sort of like the pixels of your computer screen, which he uses to support his “interface” theory of consciousness. This isn’t really the right way to think about the Planck length, though, and certainly isn’t what the people he’s quoting have in mind. The Planck length is a minimum scale in that space and time stop making sense as one approaches it, but that’s not necessarily because space and time are made up of discrete pixels. Rather, it’s because as you get closer to the Planck length, space and time stop being the most convenient way to describe things. For a relatively simple example of how this can work, see my post here.

From there, he reflects on holography: the discovery that certain theories in physics can be described equally well by what is happening on their boundary as by their interior, the way that a 2D page can hold all the information for an apparently 3D hologram. He talks about the Bekenstein bound, the conjecture that there is a maximum amount of information needed to describe a region of space, proportional not to the volume of the region but to its area. For Hoffman, this feels suspiciously like human vision: if we see just a 2D image of the world, could that image contain all the information needed to construct that world? Could the world really be just what we see?

In a word, no.

On the physics side, the Bekenstein bound is a conjecture, and one that doesn’t always hold. A more precise version that seems to hold more broadly, called the Bousso bound, works by demanding the surface have certain very specific geometric properties in space-time, properties not generally shared by the retinas of our eyes.

But it even fails in Hoffman’s own context, once we remember that there are other types of perception than vision. When we hear, we don’t detect a 2D map, but a 1D set of frequencies, put in “stereo” by our ears. When we feel pain, we can feel it in any part of our body, essentially a 3D picture since it goes inwards as well. Nothing about human perception uniquely singles out a 2D surface.
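For a sense of the numbers behind the area scaling discussed above, here is a back-of-the-envelope sketch. It uses the commonly quoted area form of the bound, an entropy of at most the area divided by four Planck lengths squared, converted to bits, and applies it to an ordinary sphere. As noted above, the precise statement needs more care about which surfaces count, so treat this as an order-of-magnitude illustration. The constants are standard SI values and the function name is my own.

```python
from math import log, pi, sqrt

HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s

# Planck length: the scale built from hbar, G and c.
PLANCK_LENGTH = sqrt(HBAR * G / C**3)


def max_bits_for_sphere(radius_m: float) -> float:
    """Area form of the bound for a sphere: A / (4 l_P^2), in bits."""
    area = 4 * pi * radius_m**2
    return area / (4 * PLANCK_LENGTH**2 * log(2))


print(PLANCK_LENGTH)
print(max_bits_for_sphere(1.0))
# Doubling the radius multiplies the bound by 4 (area), not 8 (volume).
print(max_bits_for_sphere(2.0) / max_bits_for_sphere(1.0))
```

For a sphere one meter in radius the bound comes out at something like 10^70 bits, and it grows with the surface area rather than the volume.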

There is actually something in physics much closer to what Hoffman is imagining, but it trades on a principle Hoffman aspires to get rid of: locality. We’ve known since Einstein that you can’t change the world around you faster than the speed of light. Quantum mechanics doesn’t change that, despite what you may have heard. More than that, simultaneity is relative: two distant events might be at the same time in your reference frame, but for someone else one of them might be first, or the other one might be. There is no one universal answer.

Because of that, if you want to think about things happening one by one, cause following effect, actions causing consequences, then you can’t think of causes or actions as spread out in space. You have to think about what happens at a single point: the location of an imagined observer.

Once you have this concept, you can ask whether describing the world in terms of this single observer works just as well as describing it in terms of a wide open space. And indeed, it actually can do well, at least under certain conditions. But once again, this really isn’t how Hoffman is doing things: he has multiple observers all real at the same time, communicating with each other in a definite order.
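The relativity of simultaneity that this argument leans on is easy to check with the standard Lorentz transformation. The sketch below is my own illustration, in units where the speed of light is 1: two events happen at the same time but different places in one frame, and observers moving in opposite directions disagree about which came first. The event coordinates and velocities are arbitrary choices.

```python
from math import sqrt


def time_in_moving_frame(t: float, x: float, v: float) -> float:
    """Lorentz-transformed time of an event, in units where c = 1.
    v is the other observer's velocity along x, with |v| < 1."""
    gamma = 1 / sqrt(1 - v * v)
    return gamma * (t - v * x)


# Two events, simultaneous in our frame: (time, position).
event_a = (0.0, 0.0)
event_b = (0.0, 1.0)

for v in (0.5, -0.5):
    ta = time_in_moving_frame(*event_a, v)
    tb = time_in_moving_frame(*event_b, v)
    order = "B first" if tb < ta else "A first"
    print(v, order)
```

An observer moving toward B sees B happen first, and one moving away sees A happen first, so any proposal with a single definite order of communication has to say whose order it is.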

In general, a lot of researchers in quantum gravity think spacetime is doomed. They think things are better described in terms of objects with other properties and interactions, with space and time as just convenient approximations for a more complicated reality. They get this both from observing properties of the theories we already have, and from thought experiments showing where those theories cause problems.

Nima, the most catchy of these quotable theorists, is approaching the problem from the direction of scattering amplitudes: the calculations we do to find the probability of observations in particle physics. Each scattering amplitude describes a single observation: what someone far away from a particle collision can measure, independent of any story of what might have “actually happened” to the particles in between. Nima’s goal is to describe these amplitudes purely in terms of those observations, to get rid of the “story” that shows up in the middle as much as possible.

The other theorists have different goals, but have this in common: they treat observables as their guide. They look at the properties that a single observer’s observations can have, and try to take a fresh view, independent of any assumptions about what happens in between.

This key perspective, this key insight, is what Hoffman is missing throughout this book. He has read what many physicists have to say, but he does not understand why they are saying it. His book is titled The Case Against Reality, but he merely trades one reality for another. He stops short of the more radical, more justified case against reality: that “reality”, that thing philosophers argue about and that makes us think we can rule out theories based on pure thought, is itself the wrong approach: that instead of trying to characterize an idealized real world, we are best served by focusing on what we can do.

One thing I didn’t do here is a full critique of Hoffman’s specific proposal, treating it as a proposed theory of physics. That would involve quite a bit more work, on top of what has turned out to be a very long book review. I would need to read not just his popular description, but the actual papers where he makes his case and lays out the relevant subtleties. Since I haven’t done that, I’ll end with a few questions: things that his proposal will need to answer if it aspires to be a useful idea for physics.

  • Are the networks of conscious agents he proposes Turing-complete? In other words, can they represent any calculation a computer can do? If so, they aren’t a useful idea for physics, because you could imagine a network of conscious agents to reproduce any theory you want. The idea wouldn’t narrow things down to get us closer to a useful truth. This was also one of the things that made me uncomfortable with the Wolfram Physics Project.
  • What are the conditions that allow a network of simple conscious agents to make up a bigger conscious agent? Do those conditions depend meaningfully on the network’s agents being conscious, or do they just have to pass messages? If the latter, then Hoffman is tacitly admitting you can make a conscious agent out of non-conscious agents, even if he insists this is philosophically impossible.
  • How do you square this network with relativity and quantum mechanics? Is there a set time, an order in which all the conscious agents communicate with each other? If so, how do you square that with the relativity of simultaneity? Are the agents themselves supposed to be able to be put in quantum states, or is quantum mechanics supposed to emerge from a theory of classical agents?
  • How does evolution fit in here? A big part of Hoffman’s argument was supported by the universality of the evolutionary algorithm. In order for evolution to matter for your simplest agents, they need to be able to be created or destroyed. But then they have more than two actions: not just 0 and 1, but 0, 1, and cease to exist. So you could have an even simpler agent that has just two bits.

Generalize

What’s the difference between a model and an explanation?

Suppose you cared about dark matter. You observe that things out there in the universe don’t quite move the way you would expect. There is something, a consistent something, that changes the orbits of galaxies and the bending of light, the shape of the early universe and the spiderweb of super-clusters. How do you think about that “something”?

One option is to try to model the something. You want to use as few parameters as possible, so that your model isn’t just an accident, but will actually work to predict new data. You want to describe how it changes gravity, on all the scales you care about. Your model might be very simple, like the original MOND, and just describe a modification to Newtonian gravity, since you typically only need Newtonian gravity to model many of these phenomena. (Though MOND itself can’t account for all the things attributed to dark matter, so it had to be modified.) You might have something slightly more complicated, proposing some “matter” but not going into much detail about what it is, just enough for your model to work.

If you were doing engineering, a model like that would be a fine thing to have. If you were building a spaceship and wanted to figure out what its destination would look like after a long journey, you’d need a model of dark matter like this, one that predicted how galaxies move and light bends, to do the job.

But a model like that isn’t an explanation. And the reason why is that explanations generalize.

In practice, you often just need Newtonian gravity to model how galaxies move. But if you want to model more dramatic things, the movement of the whole universe or the area around a black hole, then you need general relativity as well. So to generalize to those areas, you can’t just modify Newtonian gravity. You need an explanation, one that tells you not just how Newton’s equations change, but how Einstein’s equations change.

In practice, you can get by with a simple model of dark matter, one that doesn’t tell you very much, and just adds a new type of matter. But if you want to model quantum gravity, you need to know how this new matter interacts, not just at baseline with gravity, but with everything else. You need to know how the new matter is produced, whether it gets its mass from the Higgs boson or from something else, whether it falls into the same symmetry groups as the Standard Model or totally new ones, how it arises from tangled-up strings and multi-dimensional membranes. You need not just a model, but an explanation, one that tells you not just roughly what kind of particle you need, but how it changes our models of particle physics overall.

Physics, at its best, generalizes. Newton’s genius wasn’t that he modeled gravity on Earth, but that he unified it with gravity in the solar system. By realizing that gravity was universal, he proposed an explanation that led to much more progress than the models of predecessors like Kepler. Later, Einstein’s work on general relativity led to similar progress.

We can’t always generalize. Sometimes, we simply don’t know enough. But if we’re not engineering, then we don’t need a model, and generalizing should, at least in the long run, be our guiding hope.

Theorems About Reductionism

A reductionist would say that the behavior of the big is due to the behavior of the small. Big things are made up of small things, so anything the big things do must be explicable in terms of what the small things are doing. It may be very hard to explain things this way: you wouldn’t want to describe the economy in terms of motion of carbon atoms. But in principle, if you could calculate everything, you’d find the small things are enough: there are no fundamental “new rules” that only apply to big things.

A physicist reductionist would have to amend this story. Zoom in far enough, and it doesn’t really make sense to talk about “small things”, “big things”, or even “things” at all. The world is governed by interactions of quantum fields, ripples spreading and colliding and changing form. Some of these ripples (like the ones we call “protons”) are made up of smaller things…but ultimately most aren’t. They just are what they are.

Still, a physicist can rescue the idea of reductionism by thinking about renormalization. If you’ve heard of renormalization, you probably think of it as a trick physicists use to hide inconvenient infinite results in their calculations. But an arguably better way to think about it is as a kind of “zoom” dial for quantum field theories. Starting with a theory, we can use renormalization to “zoom out”, ignoring the smallest details and seeing what picture emerges. As we “zoom”, different forces will seem to get stronger or weaker: electromagnetism matters less when we zoom out, the strong nuclear force matters more.

(Why then, is electromagnetism so much more important in everyday life? The strong force gets so strong as we zoom out that we stop seeing individual particles, and only see them bound into protons and neutrons. Electromagnetism is like this too, binding electrons and protons into neutral atoms. In both cases, it can be better, once we’ve zoomed out, to use a new description: you don’t want to do chemistry keeping track of the quarks and gluons.)
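To make the “zoom dial” a bit more concrete, here is a minimal sketch using the textbook one-loop formulas for how the electromagnetic and strong couplings change with energy. High energy corresponds to zoomed in, low energy to zoomed out. The specific choices here are my own simplifying assumptions, not precise physics: only the electron loop is counted for electromagnetism, the number of quark flavors is held fixed at five, and the QCD scale `LAMBDA_QCD` is set to a round illustrative value.

```python
import math

ALPHA_EM_0 = 1 / 137.036   # fine-structure constant at the electron mass scale
M_ELECTRON = 0.000511      # electron mass in GeV
LAMBDA_QCD = 0.2           # assumed QCD scale in GeV (illustrative value only)


def alpha_em(q_gev):
    """One-loop QED coupling at energy q_gev, counting only the electron loop."""
    log = math.log(q_gev / M_ELECTRON)
    return ALPHA_EM_0 / (1 - (2 * ALPHA_EM_0 / (3 * math.pi)) * log)


def alpha_strong(q_gev, n_flavors=5):
    """One-loop QCD coupling at energy q_gev, with a fixed number of quark flavors."""
    b0 = 33 - 2 * n_flavors
    return 12 * math.pi / (b0 * math.log(q_gev**2 / LAMBDA_QCD**2))


# Walk from zoomed in (high energy) to zoomed out (low energy).
for q in (100.0, 10.0, 2.0):
    print(f"Q = {q:6.1f} GeV  alpha_em = {alpha_em(q):.5f}  alpha_s = {alpha_strong(q):.3f}")
```

Running this, the electromagnetic coupling shrinks slightly as the energy drops while the strong coupling grows, which is the qualitative behavior described above. The one-loop strong-coupling formula stops being trustworthy as the energy approaches `LAMBDA_QCD`, which is roughly where the description in terms of protons and neutrons takes over.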

A physicist reductionist, then, would expect renormalization to always go “one way”. As we “zoom out”, we should find that our theories in a meaningful sense get simpler and simpler. Maybe they’re still hard to work with: it’s easier to think about gluons and quarks when zoomed in than the zoo of different nuclear particles we need to consider when zoomed out. But at each step, we’re ignoring some details. And if you’re a reductionist, you shouldn’t expect “zooming out” to show you anything truly fundamentally new.

Can you prove that, though?

Surprisingly, yes!

In 2011, Zohar Komargodski and Adam Schwimmer proved a result called the a-theorem. “The a-theorem” is probably the least google-able theorem in the universe, which has probably made it hard to popularize. It is named after a quantity, labeled “a”, that gives a particular way to add up energy in a quantum field theory. Komargodski and Schwimmer proved that, when you do the renormalization procedure and “zoom out”, then this quantity “a” will always get smaller.

Why does this say anything about reductionism?

Suppose you have a theory that violates reductionism. You zoom out, and see something genuinely new: a fact about big things that isn’t due to facts about small things. If you had a theory like that, then you could imagine “zooming in” again, and using your new fact about big things to predict something about the small things that you couldn’t before. You’d find that renormalization doesn’t just go “one way”: with new facts able to show up at every scale, zooming out isn’t necessarily ignoring more and zooming in isn’t necessarily ignoring less. It would depend on the situation which way the renormalization procedure would go.

The a-theorem puts a stop to this. It says that, when you “zoom out”, there is a number that always gets smaller. In some ways it doesn’t matter what that number is (as long as you’re not cheating and using the scale directly). In this case, it is a number that loosely counts “how much is going on” in a given space. And because it always decreases when you do renormalization, it means that renormalization can never “go backwards”. You can never renormalize back from your “zoomed out” theory to the “zoomed in” one.
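In symbols, the statement is usually written as below. The labels are not used elsewhere in this post: “UV” (ultraviolet) stands for the zoomed-in theory and “IR” (infrared) for the zoomed-out one, and the inequality is understood for four-dimensional theories that flow between two scale-invariant endpoints.

```latex
a_{\mathrm{UV}} > a_{\mathrm{IR}}
```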

The a-theorem, like every theorem, is based on assumptions. Here, the assumptions are mostly that quantum field theory works in the normal way, that the theory we’re dealing with is not a totally new type of theory instead. One assumption I find interesting is the assumption of locality, that no signals can travel faster than the speed of light. On a naive level, this makes a lot of sense to me. If you can send signals faster than light, then you can’t control your “zoom lens”. Physics in a small area might be changed by something happening very far away, so you can’t “zoom in” in a way that lets you keep including everything that could possibly be relevant. If you have signals that go faster than light, you could transmit information between different parts of big things without them having to “go through” small things first. You’d screw up reductionism, and have surprises show up on every scale.

Personally, I find it really cool that it’s possible to prove a theorem that says something about a seemingly philosophical topic like reductionism. Even with assumptions (and even with the above speculations about the speed of light), it’s quite interesting that one can say anything at all about this kind of thing from a physics perspective. I hope you find it interesting too!

Cause and Effect and Stories

You can think of cause and effect as the ultimate story. The world is filled with one damn thing happening after another, but to make sense of it we organize it into a narrative: this happened first, and it caused that, which caused that. We tie this to “what if” stories, stories about things that didn’t happen: if this hadn’t happened, then it wouldn’t have caused that, so that wouldn’t have happened.

We also tell stories about cause and effect. Physicists use cause and effect as a tool, a criterion to make sense of new theories: does this theory respect cause and effect, or not? And just like everything else in science, there is more than one story they tell about it.

As a physicist, how would you think about cause and effect?

The simplest and most obvious requirement is that effects should follow their causes. Cause and effect shouldn’t go backwards in time: the cause should come before the effect.

This all sounds sensible, until you remember that in physics “before” and “after” are relative. If you try to describe the order of two distant events, your description will be different than someone moving with a different velocity. You might think two things happened at the same time, while they think one happened first, and someone else thinks the other happened first.

You’d think this makes a total mess of cause and effect, but actually everything remains fine, as long as nothing goes faster than the speed of light. If someone could travel between two events slower than the speed of light, then everybody will agree on their order, and so everyone can agree on which one caused the other. Cause and effect only get screwed up if they can happen faster than light.

(If the two events are two different times you observed something, then cause and effect will always be fine, since you yourself can’t go faster than the speed of light. So nobody will contradict what you observe, they just might interpret it differently.)
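If you want to see this for yourself, here is a small sketch using the standard Lorentz transformation in one dimension of space, in units where the speed of light is 1. The function names and the example events are made up for illustration. Each event is a pair (t, x), and each observer is labeled by their velocity v.

```python
import math


def boosted_time(t, x, v):
    """Time of the event (t, x) according to an observer moving at velocity v (c = 1)."""
    gamma = 1 / math.sqrt(1 - v * v)
    return gamma * (t - v * x)


def order_seen(event_a, event_b, v):
    """Say which of two events comes first for an observer moving at velocity v."""
    ta = boosted_time(*event_a, v)
    tb = boosted_time(*event_b, v)
    if math.isclose(ta, tb, abs_tol=1e-12):
        return "simultaneous"
    return "A first" if ta < tb else "B first"


A = (0.0, 0.0)
B_far = (1.0, 2.0)    # too far away for anything slower than light to get from A to B
B_near = (2.0, 1.0)   # close enough that a slower-than-light traveler could make it

for v in (-0.8, 0.0, 0.5, 0.8):
    print(f"v = {v:+.1f}  far pair: {order_seen(A, B_far, v):12s}  near pair: {order_seen(A, B_near, v)}")
```

For the far pair, the three answers “A first”, “simultaneous”, and “B first” all show up depending on the observer. For the near pair, every observer agrees that A came first, so there is no ambiguity about which event could have caused the other.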

So if you want to make sure that your theory respects cause and effect, you’d better be sure that nothing goes faster than light. It turns out, this is not automatic! In general relativity, an effect called Shapiro time delay makes light take longer to pass a heavy object than to go through empty space. If you modify general relativity, you can accidentally get a theory with a Shapiro time advance, where light arrives sooner than it would through empty space. In such a theory, at least some observers will see effects happen before their causes!

Once you know how to check this, as a physicist, there are two kinds of stories you can tell. I’ve heard different people in the field tell both.

First, you can say that cause and effect should be a basic physical principle. Using this principle, you can derive other restrictions, demands on what properties matter and energy can have. You can carve away theories that violate these rules, making sure that we’re testing for theories that actually make sense.

On the other hand, there are a lot of stories about time travel. Time travel screws up cause and effect in a very direct way. When Harry Potter and Hermione travel back in time at the end of Harry Potter and the Prisoner of Azkaban, they cause the event that saves Harry’s life earlier in the book. Science fiction and fantasy are full of stories like this, and many of them are perfectly consistent. How can we be so sure that we don’t live in such a world?

The other type of story positions the physics of cause and effect as a search for evidence. We’re looking for physics that violates cause and effect, because if it exists, then on some small level it should be possible to travel back in time. By writing down the consequences of cause and effect, we get to describe what evidence we’d need to see it breaking down, and if we see it, whole new possibilities open up.

These are both good stories! And like all other stories in science, they only capture part of what the scientists are up to. Some people stick to one or the other, some go between them, driven by the actual research, not the story itself. Like cause and effect itself, the story is just one way to describe the world around us.

On Stubbornness and Breaking Down

In physics, we sometimes say that an idea “breaks down”. What do we mean by that?

When a theory “breaks down”, we mean that it stops being accurate. Newton’s theory of gravity is excellent most of the time, but for objects under strong enough gravity or high enough speed its predictions stop matching reality and a new theory (relativity) is needed. This is the sense in which we say that Newtonian gravity breaks down for the orbit of Mercury, or breaks down much more severely in the area around a black hole.

When a symmetry is “broken”, we mean that it stops holding true. Most of physics looks the same when you flip it in a mirror, a property called parity symmetry. Take a pile of electric and magnetic fields, currents and wires, and you’ll find their mirror reflection is also a perfectly reasonable pile of electric and magnetic fields, currents and wires. This isn’t true for all of physics, though: the weak nuclear force isn’t the same when you flip it in a mirror. We say that the weak force breaks parity symmetry.

What about when a more general “idea” breaks down? What about space-time?

In order for space-time to break down, there needs to be a good reason to abandon the idea. And depending on how stubborn you are about it, that reason can come at different times.

You might think of space-time as just Einstein’s theory of general relativity. In that case, you could say that space-time breaks down as soon as the world deviates from that theory. In that view, any modification to general relativity, no matter how small, corresponds to space-time breaking down. You can think of this as the “least stubborn” option, the one with barely any stubbornness at all, that will let space-time break down with a tiny nudge.

But if general relativity breaks down, a slightly more stubborn person could insist that space-time is still fine. You can still describe things as located at specific places and times, moving across curved space-time. They just obey extra forces, on top of those built into the space-time.

Such a person would be happy as long as general relativity was a good approximation of what was going on, but they might admit space-time has broken down when general relativity becomes a bad approximation. If there are only small corrections on top of the usual space-time picture, then space-time would be fine, but if those corrections got so big that they overwhelmed the original predictions of general relativity then that’s quite a different situation. In that situation, space-time may have stopped being a useful description, and it may be much better to describe the world in another way.

But we could imagine an even more stubborn person who still insists that space-time is fine. Ultimately, our predictions about the world are mathematical formulas. No matter how complicated they are, we can always subtract a piece off of those formulas corresponding to the predictions of general relativity, and call the rest an extra effect. That may be a totally useless thing to do that doesn’t help you calculate anything, but someone could still do it, and thus insist that space-time still hasn’t broken down.

To convince such a person, space-time would need to break down in a way that made some important concept behind it invalid. There are various ways this could happen, corresponding to different concepts. For example, one unusual proposal is that space-time is non-commutative. If that were true then, in addition to the usual Heisenberg uncertainty principle between position and momentum, there would be an uncertainty principle between different directions in space-time. That would mean that you can’t define the position of something in all directions at once, which many people would agree is an important part of having a space-time!
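Written out, the comparison looks roughly like this. The first line is the usual relation between position and momentum, and the second is the non-commutative proposal. The symbol θ is a label I am introducing here for the fixed set of numbers that would measure how non-commutative space-time is. It does not appear elsewhere in this post.

```latex
[\hat{x}, \hat{p}] = i\hbar
\quad\Longrightarrow\quad
\Delta x \, \Delta p \ge \tfrac{\hbar}{2}

[\hat{x}^\mu, \hat{x}^\nu] = i\,\theta^{\mu\nu}
\quad\Longrightarrow\quad
\Delta x^\mu \, \Delta x^\nu \ge \tfrac{1}{2}\,\lvert \theta^{\mu\nu} \rvert
```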

Ultimately, physics is concerned with practicality. We want our concepts not just to be definable, but to do useful work in helping us understand the world. Our stubbornness should depend on whether a concept, like space-time, is still useful. If it is, we keep it. But if the situation changes, and another concept is more useful, then we can confidently say that space-time has broken down.

Machine Learning, Occam’s Razor, and Fundamental Physics

There’s a saying in physics, attributed to the famous genius John von Neumann: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

Say you want to model something, like some surprising data from a particle collider. You start with some free parameters: numbers in your model that aren’t decided yet. You then decide those numbers, “fixing” them based on the data you want to model. Your goal is for your model not only to match the data, but to predict something you haven’t yet measured. Then you can go out and check, and see if your model works.

The more free parameters you have in your model, the easier this can go wrong. More free parameters make it easier to fit your data, but that’s because they make it easier to fit any data. Your model ends up not just matching the physics, but matching the mistakes as well: the small errors that crop up in any experiment. A model like that may look like it’s a great fit to the data, but its predictions will almost all be wrong. It wasn’t just fit, it was overfit.
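Here is a toy version of that failure, as a sketch with made-up numbers. The “true physics” is a straight line, and the “experiment” adds a fixed alternating error of plus or minus 0.5 to each measurement. One model is a straight line with two free parameters. The other is a polynomial with one free parameter per data point, which is enough to pass exactly through every measurement, mistakes included.

```python
def fit_line(xs, ys):
    """Least-squares straight line: a model with two free parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point: one free parameter per data point."""
    def model(x):
        total = 0.0
        for k, (xk, yk) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != k:
                    weight *= (x - xj) / (xk - xj)
            total += yk * weight
        return total
    return model


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def truth(x):
    return 2 * x + 1


# "Measured" data: the true line plus a fixed alternating error of +/- 0.5.
train_x = [float(k) for k in range(8)]
train_y = [truth(x) + (0.5 if k % 2 == 0 else -0.5) for k, x in enumerate(train_x)]

# New points the models have not seen, compared against the true line.
test_x = [k + 0.5 for k in range(7)]
test_y = [truth(x) for x in test_x]

line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)

for name, model in (("line", line), ("interpolant", wiggly)):
    print(name,
          "| error on fitted data:", round(mean_squared_error(model, train_x, train_y), 4),
          "| error on new points:", round(mean_squared_error(model, test_x, test_y), 4))
```

The eight-parameter polynomial matches the fitted data perfectly and the line does not, but on the new points the polynomial is far worse: it has faithfully reproduced the errors, and swings wildly between the measurements to do so.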

We have statistical tools that tell us when to worry about overfitting, when we should be impressed by a model and when it has too many parameters. We don’t actually use these tools correctly, but they still give us a hint of what we actually want to know, namely, whether our model will make the right predictions. In a sense, these tools form the mathematical basis for Occam’s Razor, the idea that the best explanation is often the simplest one, and Occam’s Razor is a critical part of how we do science.

So, did you know machine learning was just modeling data?

All of the much-hyped recent advances in artificial intelligence, GPT and Stable Diffusion and all those folks, at heart they’re all doing this kind of thing. They start out with a model (with a lot more than five parameters, arranged in complicated layers…), then use data to fix the free parameters. Unlike most of the models physicists use, they can’t perfectly fix these numbers: there are too many of them, so they have to approximate. They then test their model on new data, and hope it still works.

Increasingly, it does, and impressively well, so well that the average person probably doesn’t realize this is what it’s doing. When you ask one of these AIs to make an image for you, what you’re doing is asking what image the model predicts would show up captioned with your text. It’s the same sort of thing as asking an economist what their model predicts the unemployment rate will be when inflation goes up. The machine learning model is just way, way more complicated.

As a physicist, the first time I heard about this, I had von Neumann’s quote in the back of my head. Yes, these machines are dealing with a lot more data, from a much more complicated reality. They literally are trying to fit elephants, even elephants wiggling their trunks. Still, the sheer number of parameters seemed fishy here. And for a little bit things seemed even more fishy, when I learned about double descent.

Suppose you start increasing the number of parameters in your model. Initially, your model gets better and better. Your predictions have less and less error, your error descends. Eventually, though, the error increases again: you have too many parameters so you’re over-fitting, and your model is capturing accidents in your data, not reality.

In machine learning, weirdly, this is often not the end of the story. Sometimes, your prediction error rises, only to fall once more, in a double descent.

For a while, I found this deeply disturbing. The idea that you can fit your data, start overfitting, and then keep overfitting, and somehow end up safe in the end, was terrifying. The way some of the popular accounts described it, like you were just overfitting more and more and that was fine, was baffling, especially when they seemed to predict that you could keep adding parameters, keep fitting tinier and tinier fleas on the elephant’s trunk, and your predictions would never start going wrong. It would be the death of Occam’s Razor as we know it, more complicated explanations beating simpler ones off to infinity.

Luckily, that’s not what happens. And after talking to a bunch of people, I think I finally understand this enough to say something about it here.

The right way to think about double descent is as overfitting prematurely. You do still expect your error to eventually go up: your model won’t be perfect forever, at some point you will really overfit. It might take a long time, though: machine learning people are trying to model very complicated things, like human behavior, with giant piles of data, so very complicated models may often be entirely appropriate. In the meantime, due to a bad choice of model, you can accidentally overfit early. You will eventually overcome this, pushing past with more parameters into a model that works again, but for a little while you might convince yourself, wrongly, that you have nothing more to learn.

(You can even mitigate this by tweaking your setup, potentially avoiding the problem altogether.)

So Occam’s Razor still holds, but with a twist. The best model is simple enough, but no simpler. And if you’re not careful enough, you can convince yourself that a too-simple model is as complicated as you can get.

Image from Astral Codex Ten

I was reminded of all this recently by some articles by Sabine Hossenfelder.

Hossenfelder is a critic of mainstream fundamental physics. The articles were her restating a point she’s made many times before, including in (at least) one of her books. She thinks the people who propose new particles and try to search for them are wasting time, and the experiments motivated by those particles are wasting money. She’s motivated by something like Occam’s Razor, the need to stick to the simplest possible model that fits the evidence. In her view, the simplest models are those in which we don’t detect any more new particles any time soon, so those are the models she thinks we should stick with.

I tend to disagree with Hossenfelder. Here, I was oddly conflicted. In some of her examples, it seemed like she had a legitimate point. Others seemed like she missed the mark entirely.

Talk to most astrophysicists, and they’ll tell you dark matter is settled science. Indeed, there is a huge amount of evidence that something exists out there in the universe that we can’t see. It distorts the way galaxies rotate, lenses light with its gravity, and wiggled the early universe in pretty much the way you’d expect matter to.

What isn’t settled is whether that “something” interacts with anything else. It has to interact with gravity, of course, but everything else is in some sense “optional”. Astroparticle physicists use satellites to search for clues that dark matter has some other interactions: perhaps it is unstable, sometimes releasing tiny signals of light. If it did, it might solve other problems as well.

Hossenfelder thinks this is bunk (in part because she thinks those other problems are bunk). I kind of do too, though perhaps for a more general reason: I don’t think nature owes us an easy explanation. Dark matter isn’t obligated to solve any of our other problems, it just has to be dark matter. That seems in some sense like the simplest explanation, the one demanded by Occam’s Razor.

At the same time, I disagree with her substantially more on collider physics. At the Large Hadron Collider so far, all of the data is reasonably compatible with the Standard Model, our roughly half-century old theory of particle physics. Collider physicists search that data for subtle deviations, one of which might point to a genuine discrepancy, a hint of something beyond the Standard Model.

While my intuitions say that the simplest dark matter is completely dark, they don’t say that the simplest particle physics is the Standard Model. Back when the Standard Model was proposed, people might have said it was exceptionally simple because it had a property called “renormalizability”, but these days we view that as less important. Physicists like Ken Wilson and Steven Weinberg taught us to view theories as a kind of series of corrections, like a Taylor series in calculus. Each correction encodes new, rarer ways that particles can interact. A renormalizable theory is just the first term in this series. The higher terms might be zero, but they might not. We even know that some terms cannot be zero, because gravity is not renormalizable.

The two cases on the surface don’t seem that different. Dark matter might have zero interactions besides gravity, but it might have other interactions. The Standard Model might have zero corrections, but it might have nonzero corrections. But for some reason, my intuition treats the two differently: I would find it completely reasonable for dark matter to have no extra interactions, but very strange for the Standard Model to have no corrections.

I think part of where my intuition comes from here is my experience with other theories.

One example is a toy model called sine-Gordon theory. In sine-Gordon theory, this Taylor series of corrections is a very familiar Taylor series: the sine function! If you go correction by correction, you’ll see new interactions and more new interactions. But if you actually add them all up, something surprising happens. Sine-Gordon turns out to be a special theory, one with “no particle production”: unlike in normal particle physics, in sine-Gordon particles can neither be created nor destroyed. You would never know this if you did not add up all of the corrections.
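The purely mathematical half of this point is easy to check numerically. The sketch below has nothing to do with the physics of sine-Gordon theory itself: it only adds up the Taylor series of the sine function term by term (the function name is my own) and compares the partial sums with the full function.

```python
import math


def sine_partial_sum(x, n_terms):
    """Sum the first n_terms of the Taylor series x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    for n in range(n_terms):
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
    return total


x = 10.0
for n_terms in (1, 2, 3, 5, 10, 30):
    print(f"{n_terms:2d} terms: {sine_partial_sum(x, n_terms):14.6f}")
print(f"full sine: {math.sin(x):14.6f}")
```

With only a few terms, the sum at x = 10 is in the hundreds or thousands and swings in sign from one correction to the next. Nothing about those early terms hints that the complete series stays between -1 and 1 for every x. That property only appears once all the corrections are included.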

String theory itself is another example. In string theory, elementary particles are replaced by strings, but you can think of that stringy behavior as a series of corrections on top of ordinary particles. Once again, you can try adding these things up correction by correction, but once again the “magic” doesn’t happen until the end. Only in the full series does string theory “do its thing”, and fix some of the big problems of quantum gravity.

If the real world really is a theory like this, then I think we have to worry about something like double descent.

Remember, double descent happens when our models can prematurely get worse before getting better. This can happen if the real thing we’re trying to model is very different from the model we’re using, like the example in this explainer that tries to use straight lines to match a curve. If we think a model is simpler because it puts fewer corrections on top of the Standard Model, then we may end up rejecting a reality with infinite corrections, a Taylor series that happens to add up to something quite nice. Occam’s Razor stops helping us if we can’t tell which models are really the simple ones.

The problem here is that every notion of “simple” we can appeal to here is aesthetic, a choice based on what makes the math look nicer. Other sciences don’t have this problem. When a biologist or a chemist wants to look for the simplest model, they look for a model with fewer organisms, fewer reactions…in the end, fewer atoms and molecules, fewer of the building-blocks given to those fields by physics. Fundamental physics can’t do this: we build our theories up from mathematics, and mathematics only demands that we be consistent. We can call theories simpler because we can write them in a simple way (but we could write them in a different way too). Or we can call them simpler because they look more like toy models we’ve worked with before (but those toy models are just a tiny sample of all the theories that are possible). We don’t have a standard of simplicity that is actually reliable.

From the Wikipedia page for dark matter halos

There is one other way out of this pickle. A theory that is easier to write down is under no obligation to be true. But it is more likely to be useful. Even if the real world is ultimately described by some giant pile of mathematical parameters, if a simple theory is good enough for the engineers then it’s a better theory to aim for: a useful theory that makes peoples’ lives better.

I kind of get the feeling Hossenfelder would make this objection. I’ve seen her argue on Twitter that scientists should always be able to say what their research is good for, and her Guardian article has this suggestive sentence: “However, we do not know that dark matter is indeed made of particles; and even if it is, to explain astrophysical observations one does not need to know details of the particles’ behaviour.”

Ok yes, to explain astrophysical observations one doesn’t need to know the details of dark matter particles’ behavior. But taking a step back, one doesn’t actually need to explain astrophysical observations at all.

Astrophysics and particle physics are not engineering problems. Nobody out there is trying to steer a spacecraft all the way across a galaxy, navigating the distribution of dark matter, or creating new universes and trying to make sure they go just right. Even if we might do these things some day, it will be so far in the future that our attempts to understand them won’t just be quaint: they will likely be actively damaging, confusing old research in dead languages that the field will be better off ignoring to start from scratch.

Because of that, usefulness is also not a meaningful guide. It cannot tell you which theories are more simple, which to favor with Occam’s Razor.

Hossenfelder’s highest-profile recent work falls afoul of one or the other of her principles. Her work on the foundations of quantum mechanics could genuinely be useful, but there’s no reason aside from claims of philosophical beauty to expect it to be true. Her work on modeling dark matter is at least directly motivated by data, but is guaranteed to not be useful.

I’m not pointing this out to call Hossenfelder a hypocrite, as some sort of ad hominem or tu quoque. I’m pointing this out because I don’t think it’s possible to do fundamental physics today without falling afoul of these principles. If you want to hold out hope that your work is useful, you don’t have a great reason besides a love of pretty math: otherwise, anything useful would have been discovered long ago. If you just try to model existing data as best you can, then you’re making a model for events far away or locked in high-energy particle colliders, a model no-one else besides other physicists will ever use.

I don’t know the way through this. I think if you need to take Occam’s Razor seriously, to build on the same foundations that work in every other scientific field…then you should stop doing fundamental physics. You won’t be able to make it work. If you still need to do it, if you can’t give up the sub-field, then you should justify it on building capabilities, on the kind of “practice” Hossenfelder also dismisses in her Guardian piece.

We don’t have a solid foundation, a reliable notion of what is simple and what isn’t. We have guesses and personal opinions. And until some experiment uncovers some blinding flash of new useful meaningful magic…I don’t think we can do any better than that.

Shape the Science to the Statistics, Not the Statistics to the Science

In theatre, and more generally in writing, the advice is always to “show, don’t tell”. You could just tell your audience that Long John Silver is a ruthless pirate, but it works a lot better to show him marching a prisoner off the plank. Rather than just informing with words, you want to make things as concrete as possible, with actions.

There is a similar rule in pedagogy. Pedagogy courses teach you to be explicit about your goals, planning a course by writing down Intended Learning Outcomes. (They never seem amused when I ask about the Unintended Learning Outcomes.) At first, you’d want to write down outcomes like “students will understand calculus” or “students will know what a sine is”. These, however, are hard to judge, and thus hard to plan around. Instead, the advice is to write outcomes that correspond to actions you want the students to take, things you want them to be capable of doing: “students can perform integration by parts”, “students can decide correctly whether to use a sine or cosine”. Again and again, the best way to get the students to know something is to get them to do something.

Jay Daigle recently finished a series of blog posts on how scientists use statistics to test hypotheses. I recommend it: it’s a great introduction to the concepts scientists use to reason about data, as well as a discussion of how they often misuse those concepts and what they can do better. I have a bit of a different perspective on one of the “takeaways” of the post, and I wanted to highlight that here.

The center of Daigle’s point is a tool, widely used in science, called Neyman-Pearson Hypothesis Testing. Neyman-Pearson is a tool for making decisions involving a threshold for significance, which scientists compare against a number called a p-value. If you follow the procedure, only acting when you find a p-value below 0.05, then you will only be wrong 5% of the time: specifically, that will be your rate of false positives, the percent of the time you conclude some action works when it really doesn’t.

A core problem, from Daigle’s perspective, is that scientists use Neyman-Pearson for the wrong purpose. Neyman-Pearson is a tool for making decisions, not a test that tells you whether or not a specific claim is true. It tells you “on average, if I approve drugs when their p-value is below 0.05, I will approve only 5% of the drugs that don’t work”. That’s great if you can estimate how bad it is to deny a drug that should be approved, how bad it is to approve a drug that should be denied, and calculate out on average how often you can afford to be wrong. It doesn’t tell you anything about the specific drug, though. It doesn’t tell you “every drug with a p-value below 0.05 works”. It certainly doesn’t tell you “a drug with a p-value of 0.051 almost works” or “a drug with a p-value of 0.001 definitely works”. It just doesn’t give you that information.
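
To make the distinction concrete, here is a rough simulation of the kind of decision rule described above. Everything in it is my own illustrative assumption rather than anything from Daigle’s posts: the function names, the one-sided z-test with known unit variance, the trial size of 50, and the effect size of 0.5. It approves a drug whenever its p-value is below the threshold, then reports two different numbers: the false positive rate among drugs that don’t work, and the share of approved drugs that don’t work.

```python
import math
import random


def p_value_one_sided(sample_mean, n, sigma=1.0):
    """One-sided p-value for a z-test of 'true effect is 0' vs 'true effect > 0'."""
    z = sample_mean * math.sqrt(n) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


def run_trial(rng, true_effect, n=50):
    """Simulate one trial: n patients with unit-variance noise around true_effect."""
    mean = sum(rng.gauss(true_effect, 1.0) for _ in range(n)) / n
    return p_value_one_sided(mean, n)


def simulate(rng, n_drugs, fraction_working, effect=0.5, alpha=0.05):
    """Approve every drug with p < alpha and summarize the mistakes.

    Returns (false positive rate among drugs that don't work,
             share of approved drugs that don't work).
    """
    useless = approved_useless = approved_working = 0
    for _ in range(n_drugs):
        works = rng.random() < fraction_working
        p = run_trial(rng, effect if works else 0.0)
        if not works:
            useless += 1
        if p < alpha:
            if works:
                approved_working += 1
            else:
                approved_useless += 1
    false_positive_rate = approved_useless / useless
    useless_share = approved_useless / (approved_working + approved_useless)
    return false_positive_rate, useless_share


if __name__ == "__main__":
    rng = random.Random(0)
    for fraction in (0.1, 0.5):
        fpr, share = simulate(rng, 10000, fraction)
        print(f"fraction working={fraction}: "
              f"false positive rate={fpr:.3f}, useless share of approvals={share:.3f}")
```

In this toy setup the false positive rate stays close to the threshold however many of the candidate drugs actually work, while the share of approved drugs that are useless moves around with that fraction. The procedure controls the first number, which is a property of the decision rule, and says nothing on its own about any one approved drug.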

In later posts, Daigle suggests better tools, which he argues map better to what scientists want to do, as well as general ways scientists can do better. Section 4 in particular focuses on the idea that one thing scientists need to do is ask better questions. He uses a specific example from cognitive psychology, a study that tests whether describing someone’s face makes you worse at recognizing it later. That’s a clear scientific question, one that can be tested statistically. That doesn’t mean it’s a good question, though. Daigle points out that questions like this have a problem: it isn’t clear what the result actually tells us.

Here’s another example of the same problem. In grad school, I knew a lot of social psychologists. One was researching a phenomenon called extended contact. Extended contact is meant to be a foil to another phenomenon called direct contact, both having to do with our views of other groups. In direct contact, making a friend from another group makes you view that whole group better. In extended contact, making a friend who has a friend from another group makes you view the other group better.

The social psychologist was looking into a concrete-sounding question: which of these phenomena, direct or extended contact, is stronger?

At first, that seems like it has the same problem as Daigle’s example. Suppose one of these effects is larger: what does that mean? Why do we care?

Well, one answer is that these aren’t just phenomena: they’re interventions. If you know one phenomenon is stronger than another, you can use that to persuade people to be more accepting of other groups. The psychologist’s advisor even had a procedure to make people feel like they made a new friend. Armed with that, it’s definitely useful to know whether extended contact or direct contact is better: whichever one is stronger is the one you want to use!

You do need some “theory” behind this, of course. You need to believe that, if a phenomenon is stronger in your psychology lab, it will be stronger wherever you try to apply it in the real world. It probably won’t be stronger every single time, so you need some notion of how much stronger it needs to be. That in turn means you need to estimate costs: what it costs if you pick the weaker one instead, how much money you’re wasting or harm you’re doing.

You’ll notice this is sounding a lot like the requirements I described earlier, for Neyman-Pearson. That’s no accident: as you try to make your science more and more clearly defined, it will get closer and closer to a procedure to make a decision, and that’s exactly what Neyman-Pearson is good for.

So in the end I’m quite a bit more supportive of Neyman-Pearson than Daigle is. That doesn’t mean it isn’t being used wrong: most scientists are using it wrong. Instead of calculating a p-value each time they make a decision, they do it at the end of a paper, misinterpreting it as evidence that one thing or another is “true”. But I think that what these scientists need to do is not change their statistics, but change their science. If they focused their science on making concrete decisions, they would actually be justified in using Neyman-Pearson…and their science would get a lot better in the process.

Einstein-Years

Scott Aaronson recently published an interesting exchange on his blog Shtetl-Optimized, between him and cognitive psychologist Steven Pinker. The conversation was about AI: Aaronson is optimistic (though not insanely so), Pinker is pessimistic (again, not insanely so). While fun reading, the whole thing would normally be a bit too off-topic for this blog, except that Aaronson’s argument ended up invoking something I do know a bit about: how we make progress in theoretical physics.

Aaronson was trying to respond to an argument of Pinker’s, that super-intelligence is too vague and broad to be something we could expect an AI to have. Aaronson asks us to imagine an AI that is nothing more or less than a simulation of Einstein’s brain. Such a thing isn’t possible today, and might not even be efficient, but it has the advantage of being something concrete we can all imagine. Aaronson then suggests imagining that AI sped up a thousandfold, so that in one year it covers a thousand years of Einstein’s thought. Such an AI couldn’t solve every problem, of course. But in theoretical physics, surely such an AI could be safely described as super-intelligent: an amazing power that would change the shape of physics as we know it.

I’m not as sure of this as Aaronson is. We don’t have a machine that generates a thousand Einstein-years to test, but we do have one piece of evidence: the 76 Einstein-years the man actually lived.

Einstein is rightly famous as a genius in theoretical physics. His annus mirabilis resulted in five papers that revolutionized the field, and the next decade saw his theory of general relativity transform our understanding of space and time. Later, he explored what general relativity was capable of and framed challenges that deepened our understanding of quantum mechanics.

After that, though…not so much. For Einstein-decades, he tried to work towards a new unified theory of physics, and as far as I’m aware made no useful progress at all. I’ve never seen someone cite work from that period of Einstein’s life.

Aaronson mentions simulating Einstein “at his peak”, and it would be tempting to assume that the unified theory came “after his peak”, when age had weakened his mind. But while that kind of thing can sometimes be an issue for older scientists, I think it’s overstated. I don’t think careers peak early because of “youthful brains”, and with the exception of genuine dementia I don’t think older physicists are that much worse-off cognitively than younger ones. The reason so many prominent older physicists go down unproductive rabbit-holes isn’t because they’re old. It’s because genius isn’t universal.

Einstein made the progress he did because he was the right person to make that progress. He had the right background, the right temperament, and the right interests to take others’ mathematics and take them seriously as physics. As he aged, he built on what he found, and that background in turn enabled him to do more great things. But eventually, the path he walked down simply wasn’t useful anymore. His story ended, driven to a theory that simply wasn’t going to work, because given his experience up to that point that was the work that interested him most.

I think genius in physics is in general like that. It can feel very broad because a good genius picks up new tricks along the way, and grows their capabilities. But throughout, you can see the links: the tools mastered at one age that turn out to be just right for a new pattern. For the greatest geniuses in my field, you can see the “signatures” in their work, hints at why they were just the right genius for one problem or another. Give one a thousand years, and I suspect the well would eventually run dry: the state of knowledge would no longer be suitable for even their breadth.

…of course, none of that really matters for Aaronson’s point.

A century of Einstein-years wouldn’t have found the Standard Model or String Theory, but a century of physicist-years absolutely did. If instead of a simulation of Einstein, your AI was a simulation of a population of scientists, generating new geniuses as the years go by, then the argument works again. Sure, such an AI would be much more expensive, much more difficult to build, but the first one might have been as well. The point of the argument is simply to show such a thing is possible.

The core of Aaronson’s point rests on two key traits of technology. Technology is replicable: once we know how to build something, we can build more of it. Technology is scalable: if we know how to build something, we can try to build a bigger one with more resources. Evolution can tap into both of these, but not reliably: just because it’s possible to build a mind a thousand times better at some task doesn’t mean it will.

That is why the possibility of AI leads to the possibility of super-intelligence. If we can make a computer that can do something, we can make it do that something faster. That something doesn’t have to be “general”, you can have programs that excel at one task or another. For each such task, with more resources you can scale things up: so anything a machine can do now, a later machine can probably do better. Your starting-point doesn’t necessarily even have to be efficient, or a good algorithm: bad algorithms will take longer to scale, but could eventually get there too.

The only question at that point is “how fast?” I don’t have the impression that’s settled. The achievements that got Pinker and Aaronson talking, GPT-3 and DALL-E and so forth, impressed people by their speed, by how soon they got to capabilities we didn’t expect them to have. That doesn’t mean that something we might really call super-intelligence is close: that has to do with the details, with what your target is and how fast you can actually scale. And it certainly doesn’t mean that another approach might not be faster! (As a total outsider, I can’t help but wonder if current ML is in some sense trying to fit a cubic with straight lines.)

It does mean, though, that super-intelligence isn’t inconceivable, or incoherent. It’s just the recognition that technology is a master of brute force, and brute force eventually triumphs. If you want to think about what happens in that “eventually”, that’s a very important thing to keep in mind.

The Most Anthropic of All Possible Worlds

Today, we’d call Leibniz a mathematician, a physicist, and a philosopher. As a mathematician, Leibniz turned calculus into something his contemporaries could actually use. As a physicist, he championed a doomed theory of gravity. In philosophy, he seems to be most remembered for extremely cheaty arguments.

Free will and determinism? Can’t it just be a coincidence?

I don’t blame him for this. Faced with a tricky philosophical problem, it’s enormously tempting to just blaze through with an answer that makes every subtlety irrelevant. It’s a temptation I’ve succumbed to time and time again. Faced with a genie, I would always wish for more wishes. On my high school debate team, I once forced everyone at a tournament to switch sides with some sneaky definitions. It’s all good fun, but people usually end up pretty annoyed with you afterwards.

People were annoyed with Leibniz too, especially with his solution to the problem of evil. If you believe in a benevolent, all-powerful god, as Leibniz did, why is the world full of suffering and misery? Leibniz’s answer was that even an all-powerful god is constrained by logic, so if the world contains evil, it must be logically impossible to make the world any better: indeed, we live in the best of all possible worlds. Voltaire famously made fun of this argument in Candide, dragging a Leibniz-esque Professor Pangloss through some of the most creative miseries the eighteenth century had to offer. It’s possibly the most famous satire of a philosopher, easily beating out Aristophanes’ The Clouds (which is also great).

Physicists can also get accused of cheaty arguments, and probably the most mocked is the idea of a multiverse. While it hasn’t had its own Candide, the multiverse has been criticized by everyone from bloggers to Nobel prizewinners. Leibniz wanted to explain the existence of evil, physicists want to explain “unnaturalness”: the fact that the kinds of theories we use to explain the world can’t seem to explain the mass of the Higgs boson. To explain it, these physicists suggest that there are really many different universes, separated widely in space or built in to the interpretation of quantum mechanics. Each universe has a different Higgs mass, and ours just happens to be the one we can live in. This kind of argument is called “anthropic” reasoning. Rather than the best of all possible worlds, it says we live in the world best-suited to life like ours.

I called Leibniz’s argument “cheaty”, and you might presume I think the same of the multiverse. But “cheaty” doesn’t mean “wrong”. It all depends what you’re trying to do.

Leibniz’s argument and the multiverse both work by dodging a problem. For Leibniz, the problem of evil becomes pointless: any evil might be necessary to secure a greater good. With a multiverse, naturalness becomes pointless: with many different laws of physics in different places, the existence of one like ours needs no explanation.

In both cases, though, the dodge isn’t perfect. To really explain any given evil, Leibniz would have to show why it is secretly necessary in the face of a greater good (and Pangloss spends Candide trying to do exactly that). To explain any given law of physics, the multiverse needs to use anthropic reasoning: it needs to show that that law needs to be the way it is to support human-like life.

This sounds like a strict requirement, but in both cases it’s not actually so useful. Leibniz could (and Pangloss does) come up with an explanation for pretty much anything. The problem is that no-one actually knows which aspects of the universe are essential and which aren’t. Without a reliable way to describe the best of all possible worlds, we can’t actually test whether our world is one.

The same problem holds for anthropic reasoning. We don’t actually know what conditions are required to give rise to people like us. “People like us” is very vague, and dramatically different universes might still contain something that can perceive and observe. While it might seem that there are clear requirements, so far there hasn’t been enough for people to do very much with this type of reasoning.

However, for both Leibniz and most of the physicists who believe anthropic arguments, none of this really matters. That’s because the “best of all possible worlds” and “most anthropic of all possible worlds” aren’t really meant to be predictive theories. They’re meant to say that, once you are convinced of certain things, certain problems don’t matter anymore.

Leibniz, in particular, wasn’t trying to argue for the existence of his god. He began the argument convinced that a particular sort of god existed: one that was all-powerful and benevolent, and set in motion a deterministic universe bound by logic. His argument is meant to show that, if you believe in such a god, then the problem of evil can be ignored: no matter how bad the universe seems, it may still be the best possible world.

Similarly, the physicists convinced of the multiverse aren’t really getting there through naturalness. Rather, they’ve become convinced of a few key claims: that the universe is rapidly expanding, leading to a proliferating multiverse, and that the laws of physics in such a multiverse can vary from place to place, due to the huge landscape of possible laws of physics in string theory. If you already believe those things, then the naturalness problem can be ignored: we live in some randomly chosen part of the landscape hospitable to life, which can be anywhere it needs to be.

So despite their cheaty feel, both arguments are fine…provided you agree with their assumptions. Personally, I don’t agree with Leibniz. For the multiverse, I’m less sure. I’m not confident the universe expands fast enough to create a multiverse, I’m not even confident it’s speeding up its expansion now. I know there’s a lot of controversy about the math behind the string theory landscape, about whether the vast set of possible laws of physics are as consistent as they’re supposed to be…and of course, as anyone must admit, we don’t know whether string theory itself is true! I don’t think it’s impossible that the right argument comes around and convinces me of one or both claims, though. These kinds of arguments, “if assumptions, then conclusion”, are the kind of thing that seems useless for a while…until someone convinces you of the assumptions, and they matter once again.

So in the end, despite the similarity, I’m not sure the multiverse deserves its own Candide. I’m not even sure Leibniz deserved Candide. But hopefully by understanding one, you can understand the other just a bit better.

The Only Speed of Light That Matters

A couple weeks back, someone asked me about a Veritasium video with the provocative title “Why No One Has Measured The Speed Of Light”. Veritasium is a science popularization YouTube channel, and usually a fairly good one…so it was a bit surprising to see it make a claim usually reserved for crackpots. Many, many people have measured the speed of light, including Ole Rømer all the way back in 1676. To argue otherwise seems like it demands a massive conspiracy.

Veritasium wasn’t proposing a conspiracy, though, just a technical point. Yes, many experiments have measured the speed of light. However, the speed they measure is in fact a “two-way speed”, the speed that light takes to go somewhere and then come back. They leave open the possibility that light travels differently in different directions, and only has the measured speed on average: that there are different “one-way speeds” of light.

The loophole is clearest using some of the more vivid measurements of the speed of light, timing how long it takes to bounce off a mirror and return. It’s less clear using other measurements of the speed of light, like Rømer’s. Rømer measured the speed of light using the moons of Jupiter, noticing that the time they took to orbit appeared to change based on whether Jupiter was moving towards or away from the Earth. For this measurement Rømer didn’t send any light to Jupiter…but he did have to make assumptions about the moons’ orbits, using them like a distant clock. Those assumptions also leave the door open to a loophole, one where the different one-way speeds of light are compensated by different speeds for distant clocks. You can watch the Veritasium video for more details about how this works, or see the Wikipedia page for the mathematical details.

When we think of the speed of light as the same in all directions, in some sense we’re making a choice. We’ve chosen a convention, called the Einstein synchronization convention, that lines up distant clocks in a particular way. We didn’t have to choose that convention, though we prefer to (the math gets quite a bit more complicated if we don’t). And crucially for any such choice, it is impossible for any experiment to tell the difference.
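
For the mirror case, the cancellation is short enough to write out. The parameter $\epsilon$ below is the usual Reichenbach notation for the synchronization convention, which I haven’t used elsewhere in this post; take it as an assumption of this sketch. $L$ is the distance to the mirror, $c$ is the measured two-way speed, and $\epsilon = 1/2$ corresponds to the Einstein convention.

```latex
c_{\text{out}} = \frac{c}{2\epsilon}, \qquad
c_{\text{back}} = \frac{c}{2(1-\epsilon)}, \qquad 0 < \epsilon < 1

t_{\text{round trip}}
  = \frac{L}{c_{\text{out}}} + \frac{L}{c_{\text{back}}}
  = \frac{2\epsilon L}{c} + \frac{2(1-\epsilon) L}{c}
  = \frac{2L}{c}
```

The $\epsilon$ drops out of the round-trip time, so a mirror experiment returns the same number whichever convention you pick. This only covers the bounce-off-a-mirror setup; a measurement like Rømer’s also needs the matching change in how distant clocks are read.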

So far, Veritasium is doing fine here. But if the video was totally fine, I wouldn’t have written this post. The technical argument is fine, but the video screws up its implications.

Near the end of the video, the host speculates whether this ambiguity is a clue. What if a deeper theory of physics could explain why we can’t tell the difference between different synchronizations? Maybe that would hint at something important.

Well, it does hint at something important, but not something new. What it hints at is that “one-way speeds” don’t matter. Not for light, or really for anything else.

Think about measuring the speed of something, anything. There are two ways to do it. One is to time it against something else, like the signal in a wire, and assume we know that speed. Veritasium shows an example of this, measuring the speed of a baseball that hits a target and sends a signal back. The other way is to send it somewhere with a clock we trust, and compare it to our clock. Each of these requires that something goes back and forth, even if it’s not the same thing each time. We can’t measure the one-way speed of anything because we’re never in two places at once. Everything we measure, every conclusion we come to about the world, rests on something “two-way”: our actions go out, our perceptions go in. Even our depth perception is an inference from our ancestors, whose experience seeing food and traveling to it calibrated our notion of distance.

Synchronization of clocks is a convention because the external world is a convention. What we have really, objectively, truly, are our perceptions and our memories. Everything else is a model we build to fill the gaps in between. Some features of that model are essential: if you change them, you no longer match our perceptions. Other features, though, are just convenience: ways we arrange the model to make it easier to use, to make it not “sound dumb”, to tell a coherent story. Synchronization is one of those things: the notion that you can compare times in distant places is convenient, but as relativity already tells us in other contexts, not necessary. It’s part of our storytelling, not an essential part of our model.