Tag Archives: philosophy of science

Lack of Recognition Is a Symptom, Not a Cause

Science is all about being first. Once a discovery has been made, discovering the same thing again is redundant. At best, you can improve the statistical evidence…but for a theorem or a concept, you don’t even have that. This is why we make such a big deal about priority: the first person to discover something did something very valuable. The second, no matter how much effort and insight went into their work, did not.

Because priority matters, for every big scientific discovery there is a priority dispute. Read about science’s greatest hits, and you’ll find people who were left in the wings despite their accomplishments, people who arguably found key ideas and key discoveries earlier than the people who ended up famous. That’s why the idea Peter Higgs is best known for, the Higgs mechanism,

“is therefore also called the Brout–Englert–Higgs mechanism, or Englert–Brout–Higgs–Guralnik–Hagen–Kibble mechanism, Anderson–Higgs mechanism, Anderson–Higgs–Kibble mechanism, Higgs–Kibble mechanism by Abdus Salam and ABEGHHK’tH mechanism (for Anderson, Brout, Englert, Guralnik, Hagen, Higgs, Kibble, and ‘t Hooft) by Peter Higgs.”

Those who don’t get the fame don’t get the rewards. Scientists who receive less recognition than they deserve get fewer grants and worse positions, losing out on the career outcomes that go to the person famous for the discovery, even if they made that discovery first.

…at least, that’s the usual story.

You can start to see the problem when you notice a contradiction: if a discovery has already been made, what would bring someone to re-make it?

Sometimes, people actually “steal” discoveries, finding something that isn’t widely known and re-publishing it without acknowledging the author. More often, though, the re-discoverer genuinely didn’t know. That’s because, in the real world, we don’t all know about a discovery as soon as it’s made. It has to be communicated.

At minimum, this means you need enough time to finish ironing out the kinks of your idea, write up a paper, and disseminate it. In the days before the internet, dissemination might involve mailing pre-prints to universities across the ocean. It’s relatively easy, in such a world, for two people to get started discovering the same thing, write it up, and even publish it before they learn about the other person’s work.

Sometimes, though, something gets rediscovered long after the original paper should have been available. In those cases, the problem isn’t time, it’s reach. Maybe the original paper was written in a way that hid its implications. Maybe it was published in a way only accessible to a smaller community: either a smaller part of the world, like papers that were only available to researchers in the USSR, or a smaller research community. Maybe the time hadn’t come yet, and the whole reason why the result mattered had yet to really materialize.

For a result like that, a lack of citations isn’t really the problem. Rather than someone who struggles because their work is overlooked, these are people whose work is overlooked, in a sense, because they are struggling: because their work is having a smaller impact on the work of others. Acknowledging them later can do something, but it can’t change the fact that this was work published for a smaller community, yielding smaller rewards.

And ultimately, it isn’t just priority we care about, but impact. The first European to make contact with the New World was arguably Leif Erikson, but we don’t call the massive exchange of plants and animals between the Old and New World the “Erikson Exchange”: we call it the Columbian Exchange. Leif Erikson being “first” matters much less, historically speaking, than Columbus changing the world. Similarly, in science, being the first to discover something is meaningless if that discovery doesn’t change how other people do science, and the person who manages to cause that change is much more valuable than someone who does the same work but doesn’t manage the same reach.

Am I claiming that it’s fair when scientists get famous for other peoples’ discoveries? No, it’s definitely not fair. It’s not fair because most of the reasons one might have lesser reach aren’t under one’s control. Soviet scientists (for the most part) didn’t choose to be based in the USSR. People who make discoveries before they become relevant don’t choose the time in which they were born. And while you can get better at self-promotion with practice, there’s a limited extent to which often-reclusive scientists should be blamed for their lack of social skills.

What I am claiming is that addressing this isn’t a matter of scrupulously citing the “original” discoverer after the fact. That’s a patch, and a weak one. If we want to get science closer to the ideal, where each discovery only has to be made once, then we need to work to increase reach for everyone. That means finding ways to speed up publication, to let people quickly communicate preliminary ideas with a wide audience and change the incentives so people aren’t penalized when others take up those ideas. It means enabling conversations between different fields and sub-fields, building shared vocabulary and opportunities for dialogue. It means making a community that rewards in-person hand-shaking less and careful online documentation more, so that recognition isn’t limited to the people with the money to go to conferences and the social skills to schmooze their way through them. It means anonymity when possible, and openness when we can get away with it.

Lack of recognition and redundant effort are both bad, and they both stem from the same failures to communicate. Instead of fighting about who deserves fame, we should work to make sure that science is truly global and truly universal. We can aim for a future where no-one’s contribution goes unrecognized, and where anything that is known to one is known to all.

Does Science Require Publication?

Seen on Twitter:

As is traditional, twitter erupted into dumb arguments over this. Some made fun of Yann LeCun for implying that Elon Musk will be forgotten, which, whatever Musk’s other faults, seems unlikely. Science popularizer Sabine Hossenfelder pointed out that there are two senses of “publish” getting confused here: publish as in “make public” and publish as in “put in a scientific journal”. The latter tends to be necessary for scientists in practice, but is not required in principle. (The way journals work has changed a lot over just the last century!) The former, Sabine argued, is still 100% necessary.

Plenty of people on twitter still disagreed (this always happens). It got me thinking a bit about the role of publication in science.

When we talk about what science requires or doesn’t require, what are we actually talking about?

“Science” is a word, and like any word its meaning is determined by how it is used. Scientists use the word “science” of course, as do schools and governments and journalists. But if we’re getting into arguments about what does or does not count as science, then we’re asking a philosophical question, the kind philosophers of science spend their careers on: what counts as science, and what doesn’t.

What do philosophers of science want? Many things, but a big one is to explain why science works so well. Over a few centuries, humanity went from understanding the world in terms of familiar materials and living creatures to decomposing them into molecules and atoms and cells and proteins. In doing this, we radically changed what we were capable of: building computers far beyond the reach of any blacksmith, curing diseases that previously couldn’t even be told apart. And while other human endeavors have seen some progress over this time (democracy, human rights…), science’s accomplishment demands an explanation.

Part of that explanation, I think, has to include making results public. Alchemists were interested in many of the things later chemists were, and had started to get some valuable insights. But alchemists were fearful of what their knowledge would bring (especially the ones who actually thought they could turn lead into gold). What they did publish was mostly written in code. As such, the pieces of progress they made didn’t build up, didn’t aggregate, didn’t become overall progress. It was only when a new scientific culture emerged, when natural philosophers and physicists and chemists started writing to each other as clearly as they could, that knowledge began to build on itself.

Some on twitter pointed out the example of the Manhattan project during World War II. A group of scientists got together and made progress on something almost entirely in secret. Does that not count as science?

I’m willing to bite this bullet: I don’t think it does! When the Soviets tried to replicate the bomb, they mostly had to start from scratch, aside from some smuggled atomic secrets. Today, nations trying to build their own bombs know more, but they still must reinvent most of it. We may think this is a good thing; we may not want more countries to make progress in this way. But I don’t think we can deny that it genuinely does slow progress!

At the same time, to contradict myself a bit: I think there can be science that happens within a particular community. The scientists of the Manhattan project didn’t publish in journals the Soviets could read. But they did write internal reports, they did publish to each other. I don’t think science by its nature has to include the whole of humanity (if it does, then perhaps studying the inside of black holes really is unscientific). You probably can do science sticking to just your own little world. But it will be slower. Better, for progress’s sake, if you can include people from across the world.

Generalizing a Black Box Theory

In physics and in machine learning, we have different ways of thinking about models.

A model in physics, like the Standard Model, is a tool to make predictions. Using statistics and a whole lot of data (from particle physics experiments), we fix the model’s free parameters (like the mass of the Higgs boson). The model then lets us predict what we’ll see next: when we turn on the Large Hadron Collider, what will the data look like? In physics, when a model works well, we think that model is true, that it describes the real way the world works. The Standard Model isn’t the ultimate truth: we expect that a better model exists that makes better predictions. But it is still true, in an in-between kind of way. There really are Higgs bosons, even if they’re a result of some more mysterious process underneath, just like there really are atoms, even if they’re made out of protons, neutrons, and electrons.
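As a toy illustration of that workflow (my own sketch, with made-up numbers, nothing to do with real Standard Model fits), here is the “fix a free parameter from data, then predict” loop in miniature:

```python
import numpy as np

# Toy version of the physics workflow: one free parameter ("mass"),
# fixed by statistics on noisy data, then used to predict new data.
rng = np.random.default_rng(0)

true_mass = 125.0                                     # unknown to the "experimenter"
measurements = true_mass + rng.normal(0, 2.0, 1000)   # noisy "collider" data

fitted_mass = measurements.mean()                     # statistics fix the free parameter

def predict_next_run(n_events):
    """Predict what the next batch of data should look like."""
    return fitted_mass + rng.normal(0, 2.0, n_events)

print(f"fitted mass: {fitted_mass:.2f}")
print("predicted data:", predict_next_run(3).round(2))
```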

A model in machine learning, like the Large Language Model that fuels ChatGPT, is also a tool to make predictions. Using statistics and a whole lot of data (from text on the internet, or images, or databases of proteins, or games of chess…) we fix the model’s free parameters (called weights, numbers for the strengths of connections between metaphorical neurons). The model then lets us predict what we’ll see next: when a text begins “Q: How do I report a stolen card? A:”, how does it end?

So far, that sounds a lot like physics. But in machine learning, we don’t generally think these models are true, at least not in the same way. The thing producing language isn’t really a neural network like a Large Language Model. It’s the sum of many human brains, many internet users, spread over many different circumstances. Each brain might be sort of like a neural network, but they’re not like the neural networks sitting on OpenAI’s servers. A Large Language Model isn’t true in some in-between kind of way, like atoms or Higgs bosons. It just isn’t true. It’s a black box, a machine that makes predictions, and nothing more.

But here’s the rub: what do we mean by true?

I want to be a pragmatist here. I don’t want to get stuck in a philosophical rabbit-hole, arguing with metaphysicians about what “really exists”. A true theory should be one that makes good predictions, that lets each of us know, based on our actions, what we should expect to see. That’s why science leads to technology, why governments and companies pay people to do it: because the truth lets us know what will happen, and make better choices. So if Large Language Models and the Standard Model both make good predictions, why is only one of them true?

Recently, I saw Dan Elton of More is Different make the point that there is a practical reason to prefer the “true” explanations: they generalize. A Large Language Model might predict what words come next in a text. But it doesn’t predict what happens when you crack someone’s brain open and see how the neurons connect to each other, even if that person is the one who made the text. A good explanation, a true model, can be used elsewhere. The Standard Model tells you what data from the Large Hadron Collider will look like, but it also tells you what data from the muon g-2 experiment will look like. It also, in principle, tells you things far away from particle physics: what stars look like, what atoms look like, what the inside of a nuclear reactor looks like. A black box can’t do that, even if it makes great predictions.

It’s a good point. But thinking about it, I realized things are a little murkier.

You can’t generalize a Large Language Model to tell you how human neurons are connected. But you can generalize it in other ways, and people do. There’s a huge industry in trying to figure out what GPT and its relatives “know”. How much math can they do? How much do they know about geography? Can they predict the future?

These generalizations don’t work the way that they do in physics, or the rest of science, though. When we generalize the Standard Model, we aren’t taking a machine that makes particle physics predictions and trying to see what those particle physics predictions can tell us. We’re taking something “inside” the machine, the fields and particles, and generalizing that, seeing how the things around us could be made of those fields and those particles. In contrast, when people generalize GPT, they typically don’t look inside the “black box”. They use the Large Language Model to make predictions, and see what those predictions “know about”.

On the other hand, we do sometimes generalize scientific models that way too.

If you’re simulating the climate, or a baby star, or a colony of bacteria, you typically aren’t using your simulation like a prediction machine. You don’t plug in exactly what is going on in reality, then ask what happens next. Instead, you run many simulations with different conditions, and look for patterns. You see how a cloud of sulfur might cool down the Earth, or how baby stars often form in groups, leading them to grow up into systems of orbiting black holes. Your simulation is kind of like a black box, one that you try out in different ways until you uncover some explainable principle, something your simulation “knows” that you can generalize.
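Here is a minimal sketch of that style of work, with an entirely invented toy model (the warming drift, the cooling coefficient, and the noise are placeholders for illustration, not taken from any real climate code):

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_climate(sulfur, years=50):
    """Invented toy model: temperature drifts upward each year,
    minus a made-up cooling term proportional to injected sulfur."""
    temp = 15.0
    for _ in range(years):
        temp += 0.02 - 0.05 * sulfur + rng.normal(0, 0.1)
    return temp

# Instead of plugging in reality and asking what happens next,
# run many simulations under different conditions and look for a pattern.
for sulfur in [0.0, 0.5, 1.0]:
    finals = [toy_climate(sulfur) for _ in range(200)]
    print(f"sulfur={sulfur}: average final temperature {np.mean(finals):.2f}")
```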

And isn’t nature that kind of black box, too? When we do an experiment, aren’t we just doing what the Large Language Models are doing, prompting the black box in different ways to get an idea of what it knows? Are scientists who do experiments that picky about finding out what’s “really going on”, or do they just want a model that works?

We want our models to be general, and to be usable. Building a black box can’t be the whole story, because a black box, by itself, isn’t general. But it can certainly be part of the story. Going from the black box of nature to the black box of a machine lets you run tests you couldn’t previously do, lets you investigate faster and ask stranger questions. With a simulation, you can blow up stars. With a Large Language Model, you can ask, for a million social media comments, whether the average internet user would call them positive or negative. And if you make sure to generalize, and try to make better decisions, then it won’t be just the machine learning. You’ll be learning too.
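For concreteness, the “million comments” kind of use looks something like this, assuming the Hugging Face transformers library and its default sentiment pipeline (the comments and the scale here are placeholders):

```python
from transformers import pipeline

# Use a pretrained model as a black box: prompt it with comments,
# read off positive/negative labels, then look for patterns in the output.
classifier = pipeline("sentiment-analysis")

comments = [
    "This update is fantastic, thank you!",
    "Worst change you have ever made.",
]  # in practice, a million scraped comments

for comment, result in zip(comments, classifier(comments)):
    print(result["label"], round(result["score"], 3), comment)
```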

Book Review: The Case Against Reality

Nima Arkani-Hamed shows up surprisingly rarely in popular science books. A major figure in my former field, Nima is extremely quotable (frequent examples include “spacetime is doomed” and “the universe is not a crappy metal”), but those quotes don’t seem to quite have reached the popular physics mainstream. He’s been interviewed in books by physicists, and has a major role in one popular physics book that I’m aware of. From this scattering of mentions, I was quite surprised to hear of another book where he makes an appearance: not a popular physics book at all, but a popular psychology book: Donald Hoffman’s The Case Against Reality. Naturally, this meant I had to read it.

Then, I saw the first quote on the back cover…or specifically, who was quoted: Deepak Chopra.

Seeing that, I settled in for a frustrating read.

A few pages later, I realized that this, despite his endorsement, is not a Deepak Chopra kind of book. Hoffman is careful in some valuable ways. Specifically, he has a philosopher’s care, bringing up objections and potential holes in his arguments. As a result, the book wasn’t frustrating in the way I expected.

It was even more frustrating, actually. But in an entirely different way.

When a science professor writes a popular book, the result is often a kind of ungainly Frankenstein. The arguments we want to make tend to be better-suited to shorter pieces, like academic papers, editorials, and blog posts. To make these into a book, we have to pad them out. We stir together all the vaguely related work we’ve done, plus all the best-known examples from other peoples’ work, trying (often not all that hard) to make the whole sound like a cohesive story. Read enough examples, and you start to see the joints between the parts.

Hoffman is ostensibly trying to tell a single story. His argument is that the reality we observe, of objects in space and time, is not the true reality. It is a convenient reality, one that has led to our survival, but evolution has not (and as he argues, cannot) let us perceive the truth. Instead, he argues that the true reality is consciousness: a world made up of conscious beings interacting with each other, with space, time, and all the rest emerging as properties of those interactions.

That certainly sounds like it could be one, cohesive argument. In practice, though, it is three, and they don’t fit together as well as he’d hope.

Hoffman is trained as a psychologist. As such, one of the arguments is psychological: that research shows that we mis-perceive the world in service of evolutionary fitness.

Hoffman is a cognitive scientist, and while many cognitive scientists are trained as psychologists, others are trained as philosophers. As such, one of his arguments is philosophical: that the contents of consciousness can never be explained by relations between material objects, and that evolution, and even science, systematically lead us astray.

Finally, Hoffman has evidently been listening to and reading the work of some physicists, like Nima and Carlo Rovelli. As such, one of his arguments is physical: that physicists believe that space and time are illusions and that consciousness may be fundamental, and that the conclusions of the book lead to his own model of the basic physical constituents of the world.

The book alternates between these three arguments, so rather than go in chapter order, I thought it would be better to discuss each argument in its own section.

The Psychological Argument

Sometimes, when two academics get into a debate, they disagree about what’s true. Two scientists might argue about whether an experiment was genuine, whether the statistics back up a conclusion, or whether a speculative theory is actually consistent. These are valuable debates, and worth reading about if you want to learn something about the nature of reality.

Sometimes, though, two debating academics agree on what’s true, and just disagree on what’s important. These debates are, at best, relevant to other academics and funders. They are not generally worth reading for anybody else, and are often extremely petty and dumb.

Hoffman’s psychological argument, regrettably, is of the latter kind. He would like to claim it’s the former, and to do so he marshals a host of quotes from respected scientists that claim that human perception is veridical: that what we perceive is real, courtesy of an evolutionary process that would have killed us off if it wasn’t. From that perspective, every psychological example Hoffman gives is a piece of counter-evidence, a situation where evolution doesn’t just fail to show us the true nature of reality, but actively hides reality from us.

The problem is that, if you actually read the people Hoffman quotes, they’re clearly not making the extreme point he claims. These people are psychologists, and all they are arguing is that perception is veridical in a particular, limited way. They argue that we humans are good at estimating distances or positions of objects, or that we can see a wide range of colors. They aren’t making some sort of philosophical point about those distances or positions or colors being how the world “really is”, nor are they claiming that evolution never makes humans mis-perceive.

Instead, they, and thus Hoffman, are arguing about importance. When studying humans, is it more useful to think of us as perceiving the world as it is? Or is it more useful to think of evolution as tricking us? Which happens more often?

The answers to each of those questions have to be “it depends”. Neither answer can be right all the time. At most then, this kind of argument can convince one academic to switch from researching in one way to researching in another, by saying that right now one approach is a better strategy. It can’t tell us anything more.

If the argument Hoffman is trying to get across here doesn’t matter, are there other reasons to read this part?

Popular psychology books tend to re-use a few common examples. There are some good ones, so if you haven’t read such a book you probably should read a couple, just to hear about them. For example, Hoffman tells the story of the split-brain patients, which is definitely worth being aware of.

(Those of you who’ve heard that story may be wondering how the heck Hoffman squares it with his idea of consciousness as fundamental. He actually does have a (weird) way to handle this, so read on.)

The other examples come from Hoffman’s research, and other research in his sub-field. There are stories about what optical illusions tell us about our perception, about how evolution primes us to see different things as attractive, and about how advertisers can work with attention.

These stories would at least be a source of a few more cool facts, but I’m a bit wary. The elephant in the room here is the replication crisis. Paper after paper in psychology has turned out to be a statistical mirage, accidental successes that fail to replicate in later experiments. This can happen without any deceit on the part of the psychologist, it’s just a feature of how statistics are typically done in the field.

Some psychologists make a big deal about the replication crisis: they talk about the statistical methods they use, and what they do to make sure they’re getting a real result. Hoffman talks a bit about tricks to rule out other explanations, but mostly doesn’t focus on this kind of thing. This doesn’t mean he’s doing anything wrong: it might just be that it’s off-topic. But it makes it a bit harder to trust him, compared to other psychologists who do make a big deal about it.

The Philosophical Argument

Hoffman structures his book around two philosophical arguments, one that appears near the beginning and another that, as he presents it, is the core thesis of the book. He calls both of these arguments theorems, a naming choice sure to irritate mathematicians and philosophers alike, but the mathematical content in either is for the most part not the point: in each case, the philosophical setup is where the arguments get most of their strength.

The first of these arguments, called The Scrambling Theorem, is set up largely as background material: not his core argument, but just an entry into the overall point he’s making. I found it helpful as a way to get at his reasoning style, the sorts of things he cares about philosophically and the ones he doesn’t.

The Scrambling Theorem is meant to weigh in on the debate over a thought experiment called the Inverted Spectrum, which in turn weighs on the philosophical concept of qualia. The Inverted Spectrum asks us to imagine someone who sees the spectrum of light inverted compared to how we see it, so that green becomes red and red becomes green, without anything different about their body or brain. Such a person would learn to refer to colors the same ways that we do, still referring to red blood even though they see what we see when we see green grass. Philosophers argue that, because we can imagine this, the “qualia” we see in color, like red or green, are distinct from their practical role: they are images in the mind’s eye that cannot be compared across minds, and do not correspond to anything we have yet characterized scientifically in the physical world.

As a response, other philosophers argued that you can’t actually invert the spectrum. Colors aren’t really a symmetric wheel: we can distinguish, for example, more colors between red and blue than between green and yellow. Just flipping colors around would produce detectable differences with physical implications; you can’t just swap qualia and nothing else.

The Scrambling Theorem is in response to this argument. Hoffman argues that, while you can’t invert the spectrum, you can scramble it. By swapping not only the colors, but the relations between them, you can arrange any arbitrary set of colors however else you’d like. You can declare that green not only corresponds to blood and not grass, but that it has more colors between it and yellow, perhaps by stealing them from the other side of the color wheel. If you’re already allowed to swap colors and their associations around, surely you can do this too, and change order and distances between them.

Believe it or not, I think Hoffman’s argument is correct, at least in its original purpose. You can’t respond to the Inverted Spectrum just by saying that colors are distributed differently on different sides of the color wheel. If you want to argue against the Inverted Spectrum, you need a better argument.

Hoffman’s work happens to suggest that better argument. Because he frames this argument in the language of mathematics, as a “theorem”, Hoffman’s argument is much more general than the summary I gave above. He is arguing that not merely can you scramble colors, but anything you like. If you want to swap electrons and photons, you can: just make your photons interact with everything the way electrons did, and vice versa. As long as you agree that the things you are swapping exist, according to Hoffman, you are free to exchange them and their properties any way you’d like.

This is because, to Hoffman, things that “actually exist” cannot be defined just in terms of their relations. An electron is not merely a thing that repels other electrons and is attracted to protons and so on, it is a thing that “actually exists” out there in the world. (Or, as he will argue, it isn’t really. But that’s because in the end he doesn’t think electrons exist.)

(I’m tempted to argue against this with a mathematical object like group elements. Surely the identity element of a group is defined by its relations? But I think he would argue identity elements of groups don’t actually exist.)

In the end, Hoffman is coming from a particular philosophical perspective, one common in modern philosophers of metaphysics, the study of the nature of reality. From this perspective, certain things exist, and are themselves by necessity. We cannot ask what if a thing were not itself. For example, in this perspective it is nonsense to ask what if Superman was not Clark Kent, because the two names refer to the same actually existing person.

(If, you know, Superman actually existed.)

Despite the name of the book, Hoffman is not actually making a case against reality in general. He very much seems to believe in this type of reality, in the idea that there are certain things out there that are real, independent of any purely mathematical definition of their properties. He thinks they are different things than you think they are, but he definitely thinks there are some such things, and that it’s important and scientifically useful to find them.

Hoffman’s second argument is, as he presents it, the core of the book. It’s the argument that’s supposed to show that the world is almost certainly not how we perceive it, even through scientific instruments and the scientific method. Once again, he calls it a theorem: the Fitness Beats Truth theorem.

The Fitness Beats Truth argument begins with a question: why should we believe what we see? Why do we expect that the things we perceive should be true?

In Hoffman’s mind, the only answer is evolution. If we perceived the world inaccurately, we would die out, replaced by creatures that perceived the world better than we did. You might think we also have evidence from biology, chemistry, and physics: we can examine our eyes, test them against cameras, see how they work and what they can and can’t do. But to Hoffman, all of this evidence may be mistaken, because to learn biology, chemistry, and physics we must first trust that we perceive the world correctly to begin with. Evolution, though, doesn’t rely on any of that. Even if we aren’t really bundles of cells replicating through DNA and RNA, we should still expect something like evolution, some process by which things differ, are selected, and reproduce their traits differently in the next generation. Such things are common enough, and general enough, that one can (handwavily) expect them through pure reason alone.

But, says Hoffman’s psychology experience, evolution tricks us! We do mis-perceive, and systematically, in ways that favor our fitness over reality. And so Hoffman asks, how often should we expect this to happen?

The Fitness Beats Truth argument thinks of fitness as randomly distributed: some parts of reality historically made us more fit, some less. This distribution could match reality exactly, so that for any two things that are actually different, they will make us fit in different ways. But it doesn’t have to. There might easily be things that are really very different from each other, but which are close enough from a fitness perspective that to us they seem exactly the same.

The “theorem” part of the argument is an attempt to quantify this. Hoffman imagines a pixelated world, and asks how likely it is that a random distribution of fitness matches a random distribution of pixels. This gets extremely unlikely for a world of any reasonable size, for pretty obvious reasons. Thus, Hoffman concludes: in a world with evolution, we should almost always expect it to hide something from us. The world, if it has any complexity at all, has an almost negligible probability of being as we perceive it.
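To get a feel for why such matching probabilities collapse so quickly, here is a toy version of the counting (my own illustration, much cruder than Hoffman’s actual argument): if each of N world states is randomly assigned one of N fitness values, the chance that no two states get lumped together, so that fitness preserves every distinction, is N!/N^N, which dies off exponentially with N.

```python
from math import factorial

# Probability that a random assignment of N fitness values to N world
# states is one-to-one (preserves every distinction): N! / N**N.
for n in [5, 10, 20, 50]:
    print(n, factorial(n) / n**n)
```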

On one level, this is all kind of obvious. Evolution does trick us sometimes, just as it tricks other animals. But Hoffman is trying to push this quite far, to say that ultimately our whole picture of reality, not just our eyes and ears and nose but everything we see with microscopes and telescopes and calorimeters and scintillators, all of that might be utterly dramatically wrong. Indeed, we should expect it to be.

In this house, we tend to dismiss the Cartesian Demon. If you have an argument that makes you doubt literally everything, then it seems very unlikely you’ll get anything useful from it. Unlike with Descartes’s Demon, though, Hoffman thinks we won’t be tricked forever. The tricks evolution plays on us mattered in our ancestral environment, but over time we move to stranger and stranger situations. Eventually, our fitness will depend on something new, and we’ll need to learn something new about reality.

This means that ultimately, despite the skeptical cast, Hoffman’s argument fits with the way science already works. We are, very much, trying to put ourselves in new situations and test whether our evolved expectations still serve us well or whether we need to perceive things anew. That is precisely what we in science are always doing, every day. And as we’ll see in the next section, whatever new things we have to learn have no particular reason to be what Hoffman thinks they should be.

But while it doesn’t really matter, I do still want to make one counter-argument to Fitness Beats Truth. Hoffman considers a random distribution of fitness, and asks what the chance is that it matches truth. But fitness isn’t independent of truth, and we know that not just from our perception, but from deeper truths of physics and mathematics. Fitness is correlated with truth, fitness often matches truth, for one key reason: complex things are harder than simple things.

Imagine a creature evolving an eye. It has a reason, based on fitness, to need to know where its prey is moving. If evolution were a magic wand, and chemistry trivial, it would let the creature see its prey, and nothing else. But evolution is not magic, and chemistry is not trivial. The easiest thing for this creature to see is patches of light and darkness. There are many molecules that detect light, because light is a basic part of the physical world. To detect just prey, you need something much more complicated: molecules and cells and neurons. That complexity imposes a fitness cost, and it means that the first eyes to evolve are simple eyespots, detecting just light and darkness.

Hoffman asks us not to assume that we know how eyes work, that we know how chemistry works, because we got that knowledge from our perceptions. But the nature of complexity and simplicity, entropy and thermodynamics and information, these are things we can approach through pure thought, as much as evolution. And those principles tell us that it will always be easier for an organism to perceive the world as it truly is than not, because the relevant features of the world are most likely simple, and the simplest strategy is most likely to perceive them directly. When the benefits get high enough, when the fitness pressure gets strong enough, we can of course perceive the wrong thing. But if there is only a small fitness benefit to perceiving something incorrectly, then simplicity will win out. And by asking simpler and simpler questions, we can make real, durable scientific progress towards truth.

The Physical Argument

So if I’m not impressed by the psychology or the philosophy, what about the part that motivated me to read the book in the first place, the physics?

Because this is, in a weird and perhaps crackpot way, a physics book. Hoffman has a specific idea, more specific than just that the world we perceive is an evolutionary illusion, more specific than that consciousness cannot be explained by the relations between physical particles. He has a proposal, based on these ideas, one that he thinks might lead to a revolutionary new theory of physics. And he tries to argue that physicists, in their own way, have been inching closer and closer to his proposal’s core ideas.

Hoffman’s idea is that the world is made, not of particles or fields or anything like that, but of conscious agents. You and I are, in this picture, certainly conscious agents, but so are the sources of everything we perceive. When we reach out and feel a table, when we look up and see the Sun, those are the actions of some conscious agent intruding on our perceptions. Unlike panpsychists, who believe that everything in the world is conscious, Hoffman doesn’t believe that the Sun itself is conscious, or is made of conscious things. Rather, he thinks that the Sun is an evolutionary illusion that rearranges our perceptions in a convenient way. The perceptions still come from some conscious thing or set of conscious things, but unlike in panpsychism they don’t live in the center of our solar system, or in any other place (space and time also being evolutionary illusions in this picture). Instead, they could come from something radically different that we haven’t imagined yet.

Earlier, I mentioned split brain patients. For anyone who thinks of conscious beings as fundamental, split brain patients are a challenge. These are people who, as a treatment for epilepsy, had the bridge between the two halves of their brain severed. The result is eerily as if their consciousness was split in two. While they only express one train of thought, that train of thought seems to only correspond to the thoughts of one side of their brain, controlling only half their body. The other side, controlling the other half of their body, appears to have different thoughts, different perceptions, and even different opinions, which are made manifest when instead of speaking they use that side of their body to gesture and communicate. While some argue that these cases are over-interpreted and don’t really show what they’re claimed to, Hoffman doesn’t. He accepts that these split-brain patients genuinely have their consciousness split in two.

Hoffman thinks this isn’t a problem because for him, conscious agents can be made up of other conscious agents. Each of us is conscious, but we are also supposed to be made up of simpler conscious agents. Our perceptions and decisions are not inexplicable, but can be explained in terms of the interactions of the simpler conscious entities that make us up, each one communicating with the others.

Hoffman speculates that everything is ultimately composed of the simplest possible conscious agents. For him, a conscious agent must do two things: perceive, and act. So the simplest possible agent perceives and acts in the simplest possible way. They perceive a single bit of information: 0 or 1, true or false, yes or no. And they take one action, communicating a different bit of information to another conscious agent: again, 0 or 1, true or false, yes or no.

Hoffman thinks that this could be the key to a new theory of physics. Instead of thinking about the world as composed of particles and fields, think about it as composed of these simple conscious agents, each one perceiving and communicating one bit at a time.

Hoffman thinks this, in part, because he sees physics as already going in this direction. He’s heard that “spacetime is doomed”, he’s heard that quantum mechanics is contextual and has no local realism, he’s heard that quantum gravity researchers think the world might be a hologram and space-time has a finite number of bits. This all “rhymes” enough with his proposal that he’s confident physics has his back.

Hoffman is trained in psychology. He seems to know his philosophy, at least enough to engage with the literature there. But he is absolutely not a physicist, and it shows. Time and again it seems like he relies on “pop physics” accounts that superficially match his ideas without really understanding what the physicists are actually talking about.

He keeps up best when it comes to interpretations of quantum mechanics, a field where concepts from philosophy play a meaningful role. He covers the reasons why quantum mechanics keeps philosophers up at night: Bell’s Theorem, which shows that a theory matching the predictions of quantum mechanics cannot be both “realist” (with measurements uncovering pre-existing facts about the world) and “local” (with things only influencing each other at less than the speed of light); the broader notion of contextuality, where measured results depend on which other measurements are made; and the various experiments showing that these quantum predictions really do hold in the real world.
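For concreteness, here is the standard quantitative form of Bell’s theorem (the CHSH inequality, included as background, not something Hoffman spells out). Combining correlations between measurements with settings a, a′ on one side and b, b′ on the other,

\[ S = E(a,b) - E(a,b') + E(a',b) + E(a',b'), \]

any local realist theory must satisfy |S| ≤ 2, while quantum mechanics allows values up to 2√2, and experiments come down firmly on the quantum side.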

These two facts, and their implications, have spawned a whole industry of interpretations of quantum mechanics, where physicists and philosophers decide which side of various dilemmas to take and how to describe the results. Hoffman quotes a few different “non-realist” interpretations: Carlo Rovelli’s Relational Quantum Mechanics, Quantum Bayesianism/QBism, Consistent Histories, and whatever Chris Fields is into. These are all different from one another, which Hoffman is aware of. He just wants to make the case that non-realist interpretations are reasonable, that the physicists collectively are saying “maybe reality doesn’t exist” just like he is.

The problem is that Hoffman’s proposal is not, in the quantum mechanics sense, non-realist. Yes, Hoffman thinks that the things we observe are just an “interface”, that reality is really a network of conscious agents. But in order to have a non-realist interpretation, you need to also have other conscious agents not be real. That’s easily seen from the old “Wigner’s friend” thought experiment, where you put one of your friends in a Schrodinger’s cat-style box. Just as Schrodinger’s cat can be both alive and dead, your friend can both have observed something and not have observed it, or observed something and observed something else. The state of your friend’s mind, just like everything else in a non-realist interpretation, doesn’t have a definite value until you measure it.

Hoffman’s setup doesn’t, and can’t, work that way. His whole philosophical project is to declare that certain things exist and others don’t: the sun doesn’t exist, conscious agents do. In a non-realist interpretation, the sun and other conscious agents can both be useful descriptions, but ultimately nothing “really exists”. Science isn’t a catalogue of what does or doesn’t “really exist”, it’s a tool to make predictions about your observations.

Hoffman gets even more confused when he gets to quantum gravity. He starts out with a common misconception: that the Planck length represents the “pixels” of reality, sort of like the pixels of your computer screen, which he uses to support his “interface” theory of consciousness. This isn’t really the right way to think about the Planck length, though, and certainly isn’t what the people he’s quoting have in mind. The Planck length is a minimum scale in the sense that space and time stop making sense as one approaches it, but that’s not necessarily because space and time are made up of discrete pixels. Rather, it’s because as you get closer to the Planck length, space and time stop being the most convenient way to describe things. For a relatively simple example of how this can work, see my post here.
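(For reference, the Planck length is the combination of fundamental constants

\[ \ell_P = \sqrt{\frac{\hbar G}{c^3}} \approx 1.6 \times 10^{-35}\ \mathrm{m}, \]

roughly the scale at which quantum effects of gravity are expected to become important.)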

From there, he reflects on holography: the discovery that certain theories in physics can be described equally well by what is happening on their boundary as by their interior, the way that a 2D page can hold all the information for an apparently 3D hologram. He talks about the Bekenstein bound, the conjecture that there is a maximum amount of information needed to describe a region of space, proportional not to the volume of the region but to its area. For Hoffman, this feels suspiciously like human vision: if we see just a 2D image of the world, could that image contain all the information needed to construct that world? Could the world really be just what we see?

In a word, no.

On the physics side, the Bekenstein bound is a conjecture, and one that doesn’t always hold. A more precise version that seems to hold more broadly, called the Bousso bound, works by demanding the surface have certain very specific geometric properties in space-time, properties not generally shared by the retinas of our eyes.
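For reference, the area-scaling bound being invoked here (often called the holographic bound) says, schematically, that the maximum entropy, and so the maximum information, in a region is set by the area A of its boundary in units of the Planck area:

\[ S_{\mathrm{max}} \sim \frac{A}{4\,\ell_P^2}. \]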

But it even fails in Hoffman’s own context, once we remember that there are other types of perception than vision. When we hear, we don’t detect a 2D map, but a 1D set of frequencies, put in “stereo” by our ears. When we feel pain, we can feel it in any part of our body, essentially a 3D picture since it goes inwards as well. Nothing about human perception uniquely singles out a 2D surface.

There is actually something in physics much closer to what Hoffman is imagining, but it trades on a principle Hoffman aspires to get rid of: locality. We’ve known since Einstein that you can’t change the world around you faster than the speed of light. Quantum mechanics doesn’t change that, despite what you may have heard. More than that, simultaneity is relative: two distant events might be at the same time in your reference frame, but for someone else one of them might be first, or the other one might be; there is no one universal answer.

Because of that, if you want to think about things happening one by one, cause following effect, actions causing consequences, then you can’t think of causes or actions as spread out in space. You have to think about what happens at a single point: the location of an imagined observer.

Once you have this concept, you can ask whether describing the world in terms of this single observer works just as well as describing it in terms of a wide open space. And indeed, it actually can do well, at least under certain conditions. But once again, this really isn’t how Hoffman is doing things: he has multiple observers all real at the same time, communicating with each other in a definite order.

In general, a lot of researchers in quantum gravity think spacetime is doomed. They think things are better described in terms of objects with other properties and interactions, with space and time as just convenient approximations for a more complicated reality. They get this both from observing properties of the theories we already have, and from thought experiments showing where those theories cause problems.

Nima, the most catchy of these quotable theorists, is approaching the problem from the direction of scattering amplitudes: the calculations we do to find the probability of observations in particle physics. Each scattering amplitude describes a single observation: what someone far away from a particle collision can measure, independent of any story of what might have “actually happened” to the particles in between. Nima’s goal is to describe these amplitudes purely in terms of those observations, to get rid of the “story” that shows up in the middle as much as possible.

The other theorists have different goals, but have this in common: they treat observables as their guide. They look at the properties that a single observer’s observations can have, and try to take a fresh view, independent of any assumptions about what happens in between.

This key perspective, this key insight, is what Hoffman is missing throughout this book. He has read what many physicists have to say, but he does not understand why they are saying it. His book is titled The Case Against Reality, but he merely trades one reality for another. He stops short of the more radical, more justified case against reality: that “reality”, that thing philosophers argue about and that makes us think we can rule out theories based on pure thought, is itself the wrong approach: that instead of trying to characterize an idealized real world, we are best served by focusing on what we can do.

One thing I didn’t do here is a full critique of Hoffman’s specific proposal, treating it as a proposed theory of physics. That would involve quite a bit more work, on top of what has turned out to be a very long book review. I would need to read not just his popular description, but the actual papers where he makes his case and lays out the relevant subtleties. Since I haven’t done that, I’ll end with a few questions: things that his proposal will need to answer if it aspires to be a useful idea for physics.

  • Are the networks of conscious agents he proposes Turing-complete? In other words, can they represent any calculation a computer can do? If so, they aren’t a useful idea for physics, because you could imagine a network of conscious agents to reproduce any theory you want. The idea wouldn’t narrow things down to get us closer to a useful truth. This was also one of the things that made me uncomfortable with the Wolfram Physics Project.
  • What are the conditions that allow a network of simple conscious agents to make up a bigger conscious agent? Do those conditions depend meaningfully on the network’s agents being conscious, or do they just have to pass messages? If the latter, then Hoffman is tacitly admitting you can make a conscious agent out of non-conscious agents, even if he insists this is philosophically impossible.
  • How do you square this network with relativity and quantum mechanics? Is there a set time, an order in which all the conscious agents communicate with each other? If so, how do you square that with the relativity of simultaneity? Are the agents themselves supposed to be able to be put in quantum states, or is quantum mechanics supposed to emerge from a theory of classical agents?
  • How does evolution fit in here? A big part of Hoffman’s argument was supported by the universality of the evolutionary algorithm. In order for evolution to matter for your simplest agents, they need to be able to be created or destroyed. But then they have more than two actions: not just 0 and 1, but 0, 1, and cease to exist. So you could have an even simpler agent with just two actions.

Generalize

What’s the difference between a model and an explanation?

Suppose you cared about dark matter. You observe that things out there in the universe don’t quite move the way you would expect. There is something, a consistent something, that changes the orbits of galaxies and the bending of light, the shape of the early universe and the spiderweb of super-clusters. How do you think about that “something”?

One option is to try to model the something. You want to use as few parameters as possible, so that your model isn’t just an accident, but will actually work to predict new data. You want to describe how it changes gravity, on all the scales you care about. Your model might be very simple, like the original MOND, and just describe a modification to Newtonian gravity, since you typically only need Newtonian gravity to model many of these phenomena. (Though MOND itself can’t account for all the things attributed to dark matter, so it had to be modified.) You might have something slightly more complicated, proposing some “matter” but not going into much detail about what it is, just enough for your model to work.
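(For the curious: Milgrom’s original MOND is a one-parameter modification of exactly this kind. Newtonian gravity is untouched at ordinary accelerations, but below a tiny acceleration scale, written a_0 and measured to be about 1.2 × 10⁻¹⁰ m/s², the effective acceleration becomes

\[ a \approx \sqrt{a_N\, a_0} \qquad \text{for } a_N \ll a_0, \]

where a_N is the Newtonian value, which is enough to produce flat galaxy rotation curves with v⁴ = G M a_0.)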

If you were doing engineering, a model like that is a fine thing to have. If you were building a spaceship and wanted to figure out what its destination would look like after a long journey, you’d need a model of dark matter like this, one that predicted how galaxies move and light bends, to do the job.

But a model like that isn’t an explanation. And the reason why is that explanations generalize.

In practice, you often just need Newtonian gravity to model how galaxies move. But if you want to model more dramatic things, the movement of the whole universe or the area around a black hole, then you need general relativity as well. So to generalize to those areas, you can’t just modify Newtonian gravity. You need an explanation, one that tells you not just how Newton’s equations change, but how Einstein’s equations change.

In practice, you can get by with a simple model of dark matter, one that doesn’t tell you very much, and just adds a new type of matter. But if you want to model quantum gravity, you need to know how this new matter interacts, not just at baseline with gravity, but with everything else. You need to know how the new matter is produced, whether it gets its mass from the Higgs boson or from something else, whether it falls into the same symmetry groups as the Standard Model or totally new ones, how it arises from tangled-up strings and multi-dimensional membranes. You need not just a model, but an explanation, one that tells you not just roughly what kind of particle you need, but how it changes our models of particle physics overall.

Physics, at its best, generalizes. Newton’s genius wasn’t that he modeled gravity on Earth, but that he unified it with gravity in the solar system. By realizing that gravity was universal, he proposed an explanation that led to much more progress than the models of predecessors like Kepler. Later, Einstein’s work on general relativity led to similar progress.

We can’t always generalize. Sometimes, we simply don’t know enough. But if we’re not engineering, then a model alone isn’t enough, and generalizing should, at least in the long run, be our guiding hope.

What’s in a Subfield?

A while back, someone asked me what my subfield, amplitudeology, is really about. I wrote an answer to that here, a short-term and long-term perspective that line up with the stories we often tell about the field. I talked about how we try to figure out ways to calculate probabilities faster, first for understanding the output of particle colliders like the LHC, then more recently for gravitational wave telescopes. I talked about how the philosophy we use for that carries us farther, how focusing on the minimal information we need to make a prediction gives us hope that we can generalize and even propose totally new theories.

The world doesn’t follow stories, though, not quite so neatly. Try to define something as simple as the word “game” and you run into trouble. Some games have a winner and a loser, in some games everyone is on one team, and some games don’t have winners or losers at all. Games can involve physical exercise, computers, boards and dice, or just people telling stories. They can be played for fun, or for money, silly or deadly serious. Most have rules, but some don’t even have that. Instead, games are linked by history: a series of resemblances, people saying that “this” is a game because it’s kind of like “that”.

A subfield isn’t just a word, it’s a group of people. So subfields aren’t defined just by resemblance. Instead, they’re defined by practicality.

To ask what amplitudeology is really about, think about why you might want to call yourself an amplitudeologist. It could be a question of goals, certainly: you might care a lot about making better predictions for the LHC, or you could have some other grand story in mind about how amplitudes will save the world. Instead, though, it could be a matter of training: you learned certain methods, certain mathematics, a certain perspective, and now you apply it to your research, even if it goes further afield from what was considered “amplitudeology” before. It could even be a matter of community, joining with others who you think do cool stuff, even if you don’t share exactly the same goals or the same methods.

Calling yourself an amplitudeologist means you go to their conferences and listen to their talks, means you look to them to collaborate and pay attention to their papers. Those kinds of things define a subfield: not some grand mission statement, but practical questions of interest, what people work on and know and where they’re going with that. Like every other word, amplitudeology doesn’t have one story; it has a practical meaning that shifts and changes with time. That’s the way subfields should be: useful to the people who practice them.

Theorems About Reductionism

A reductionist would say that the behavior of the big is due to the behavior of the small. Big things are made up of small things, so anything the big things do must be explicable in terms of what the small things are doing. It may be very hard to explain things this way: you wouldn’t want to describe the economy in terms of motion of carbon atoms. But in principle, if you could calculate everything, you’d find the small things are enough: there are no fundamental “new rules” that only apply to big things.

A physicist reductionist would have to amend this story. Zoom in far enough, and it doesn’t really make sense to talk about “small things”, “big things”, or even “things” at all. The world is governed by interactions of quantum fields, ripples spreading and colliding and changing form. Some of these ripples (like the ones we call “protons”) are made up of smaller things…but ultimately most aren’t. They just are what they are.

Still, a physicist can rescue the idea of reductionism by thinking about renormalization. If you’ve heard of renormalization, you probably think of it as a trick physicists use to hide inconvenient infinite results in their calculations. But an arguably better way to think about it is as a kind of “zoom” dial for quantum field theories. Starting with a theory, we can use renormalization to “zoom out”, ignoring the smallest details and seeing what picture emerges. As we “zoom”, different forces will seem to get stronger or weaker: electromagnetism matters less when we zoom out, while the strong nuclear force matters more.
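In equations, the “zoom” dial is the renormalization scale μ, and how each coupling responds to it is captured by its beta function. At one loop, schematically,

\[ \mu\frac{de}{d\mu} = +\frac{e^3}{12\pi^2}, \qquad \mu\frac{dg_s}{d\mu} = -\frac{g_s^3}{16\pi^2}\left(11 - \tfrac{2}{3}n_f\right), \]

with n_f the number of quark flavors: the electromagnetic coupling e shrinks as μ is lowered (zooming out), while the strong coupling g_s grows.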

(Why, then, is electromagnetism so much more important in everyday life? The strong force gets so strong as we zoom out that we stop seeing individual particles, and only see them bound into protons and neutrons. Electromagnetism is like this too, binding electrons and protons into neutral atoms. In both cases, it can be better, once we’ve zoomed out, to use a new description: you don’t want to do chemistry keeping track of the quarks and gluons.)
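
To make the “zoom” dial slightly more concrete, here is a rough sketch of my own of the textbook one-loop running of the electromagnetic and strong couplings. The starting values and the fixed number of quark flavours are illustrative simplifications, not a precision calculation; the point is only the direction: one coupling shrinks and the other grows as we zoom out to lower energies.

```python
# Rough one-loop running of couplings: a toy "zoom dial".
# Convention: d(alpha)/d(ln mu) = (b0 / 2pi) * alpha^2, so
# 1/alpha(mu) = 1/alpha(mu0) - (b0 / 2pi) * ln(mu / mu0).
import math

def run_coupling(alpha0, mu0, mu, b0):
    """One-loop running of a coupling from scale mu0 to scale mu."""
    return 1.0 / (1.0 / alpha0 - (b0 / (2.0 * math.pi)) * math.log(mu / mu0))

# QED with one unit-charge fermion: b0 = 4/3 > 0, so the coupling shrinks as we zoom out.
# QCD with five quark flavours: b0 = -(11 - 2*5/3) < 0, so the coupling grows as we zoom out.
b0_qed = 4.0 / 3.0
b0_qcd = -(11.0 - 2.0 * 5.0 / 3.0)

mu0 = 91.0             # GeV, roughly the Z mass: our "zoomed in" starting point
alpha_qed0 = 1 / 128   # rough electromagnetic coupling at that scale
alpha_qcd0 = 0.118     # rough strong coupling at that scale

for mu in [91.0, 10.0, 2.0]:  # zooming out: lower and lower energies
    a_em = run_coupling(alpha_qed0, mu0, mu, b0_qed)
    a_s = run_coupling(alpha_qcd0, mu0, mu, b0_qcd)
    print(f"mu = {mu:5.1f} GeV:  alpha_EM ~ {a_em:.5f}   alpha_s ~ {a_s:.3f}")
```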

A physicist reductionist, then, would expect renormalization to always go “one way”. As we “zoom out”, we should find that our theories, in a meaningful sense, get simpler and simpler. Maybe they’re still hard to work with: it’s easier to think about gluons and quarks when zoomed in than the zoo of different nuclear particles we need to consider when zoomed out. But at each step, we’re ignoring some details. And if you’re a reductionist, you shouldn’t expect “zooming out” to show you anything truly fundamentally new.

Can you prove that, though?

Surprisingly, yes!

In 2011, Zohar Komargodski and Adam Schwimmer proved a result called the a-theorem. “The a-theorem” is probably the least google-able theorem in the universe, which has probably made it hard to popularize. It is named after a quantity, labeled “a”, that gives a particular way to add up energy in a quantum field theory. Komargodski and Schwimmer proved that, when you do the renormalization procedure and “zoom out”, then this quantity “a” will always get smaller.

Why does this say anything about reductionism?

Suppose you have a theory that violates reductionism. You zoom out, and see something genuinely new: a fact about big things that isn’t due to facts about small things. If you had a theory like that, then you could imagine “zooming in” again, and using your new fact about big things to predict something about the small things that you couldn’t before. You’d find that renormalization doesn’t just go “one way”: with new facts able to show up at every scale, zooming out isn’t necessarily ignoring more and zooming in isn’t necessarily ignoring less. It would depend on the situation which way the renormalization procedure would go.

The a-theorem puts a stop to this. It says that, when you “zoom out”, there is a number that always gets smaller. In some ways it doesn’t matter what that number is (as long as you’re not cheating and using the scale directly). In this case, it is a number that loosely counts “how much is going on” in a given space. And because it always decreases when you do renormalization, it means that renormalization can never “go backwards”. You can never renormalize back from your “zoomed out” theory to the “zoomed in” one.
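
For those who want to see a formula, the statement can be written compactly. Schematically (conventions for the normalization vary, so take this as a sketch rather than the exact expression in the paper), “a” appears as a coefficient in the trace anomaly of a four-dimensional theory, and the theorem compares its values at the “zoomed in” (UV) and “zoomed out” (IR) ends of the renormalization flow:

\[
\langle T^{\mu}{}_{\mu} \rangle \;\sim\; c \, W_{\mu\nu\rho\sigma} W^{\mu\nu\rho\sigma} \;-\; a \, E_4 , \qquad a_{\mathrm{UV}} \;>\; a_{\mathrm{IR}} .
\]

Here W is the Weyl tensor and E₄ the four-dimensional Euler density.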

The a-theorem, like every theorem, is based on assumptions. Here, the assumptions are mostly that quantum field theory works in the normal way, that the theory we’re dealing with isn’t some totally new type of theory instead. One assumption I find interesting is the assumption of locality, that no signals can travel faster than the speed of light. On a naive level, this makes a lot of sense to me. If you can send signals faster than light, then you can’t control your “zoom lens”. Physics in a small area might be changed by something happening very far away, so you can’t “zoom in” in a way that lets you keep including everything that could possibly be relevant. If you have signals that go faster than light, you could transmit information between different parts of big things without it having to “go through” small things first. You’d screw up reductionism, and have surprises show up on every scale.

Personally, I find it really cool that it’s possible to prove a theorem that says something about a seemingly philosophical topic like reductionism. Even with assumptions (and even with the above speculations about the speed of light), it’s quite interesting that one can say anything at all about this kind of thing from a physics perspective. I hope you find it interesting too!

Physics’ Unique Nightmare

Halloween is coming up, so let’s talk about the most prominent monster of the physics canon, the nightmare scenario.

Not to be confused with the D&D Nightmare, which once was a convenient source of infinite consumable items for mid-level characters.

Right now, thousands of physicists search for more information about particle physics beyond our current Standard Model. They comb through data from the Large Hadron Collider for signs of new particles and unexpected behavior, they try to detect a wide range of possible dark matter particles, and they make very precise measurements in search of subtle deviations. And in the back of their minds, almost all of those physicists wonder if they’ll find anything at all.

It’s not that we think the Standard Model is right. We know it has problems, deep mathematical issues that make it give nonsense answers and an apparent big mismatch with what we observe about the motion of matter and light in the universe. (You’ve probably heard this mismatch called dark matter and dark energy.)

But none of those problems guarantee an answer soon. The Standard Model will eventually fail, but it may fail only for very difficult and expensive experiments, not a Large Hadron Collider but some sort of galactic-scale Large Earth Collider. It might be that none of the experiments or searches or theories those thousands of physicists are working on will tell them anything they didn’t already know. That’s the nightmare scenario.

I don’t know another field that has a nightmare scenario quite like this. In most fields, one experiment or another might fail, not just not giving the expected evidence but not teaching anything new. But most experiments teach us something new. We don’t have a theory, in almost any field, that has the potential to explain every observation up to the limits of our experiments, but which we still hope to disprove. Only the Standard Model is like that.

And while thousands of physicists are exposed to this nightmare scenario, the majority of physicists aren’t. Physics isn’t just the science of the reductionistic laws of the smallest constituents of matter. It’s also the study of physical systems, from the bubbling chaos of nuclear physics to the formation of planets and galaxies and black holes, from the properties of materials to the movement of bacteria on a petri dish and bees in a hive. It’s also the development of new methods, from better control of individual atoms and quantum states to powerful new tricks for calculation. For some, it can be the discovery, not of reductionistic laws of the smallest scales, but of general laws of the largest scales, of how systems with many different origins can show echoes of the same behavior.

Over time, more and more of those thousands of physicists break away from the nightmare scenario, “waking up” to new questions of these kinds. For some, motivated by puzzles and skill and the beauty of physics, the change is satisfying, a chance to work on ideas that are moving forward, connected with experiment or grounded in evolving mathematics. But if your motivation is really tied to those smallest scales, to that final reductionistic “why”, then such a shift won’t be satisfying, and this is a nightmare you won’t wake up from.

Me, I’m not sure. I’m a tool-builder, and I used to tell myself that tool-builders are always needed. But I find I do care, in the end, what my tools are used for. And as we approach the nightmare scenario, I’m not at all sure I know how to wake up.

Cause and Effect and Stories

You can think of cause and effect as the ultimate story. The world is filled with one damn thing happening after another, but to make sense of it we organize it into a narrative: this happened first, and it caused that, which caused that. We tie this to “what if” stories, stories about things that didn’t happen: if this hadn’t happened, then it wouldn’t have caused that, so that wouldn’t have happened.

We also tell stories about cause and effect. Physicists use cause and effect as a tool, a criterion to make sense of new theories: does this theory respect cause and effect, or not? And just like everything else in science, there is more than one story they tell about it.

As a physicist, how would you think about cause and effect?

The simplest and most obvious requirement is that effects should follow their causes. Cause and effect shouldn’t go backwards in time: the cause should come before the effect.

This all sounds sensible, until you remember that in physics “before” and “after” are relative. If you try to describe the order of two distant events, your description can differ from that of someone moving with a different velocity. You might think two things happened at the same time, while they think one happened first, and someone else thinks the other happened first.

You’d think this makes a total mess of cause and effect, but actually everything remains fine, as long as nothing goes faster than the speed of light. If something could travel between two events going slower than the speed of light, then everybody will agree on their order, and so everyone can agree on which one caused the other. Cause and effect only get screwed up if influences can travel faster than light.

(If the two events are two different times you observed something, then cause and effect will always be fine, since you yourself can’t go faster than the speed of light. So nobody will contradict what you observe, they just might interpret it differently.)
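
To see why that works, here is a small sketch of my own, with made-up event coordinates in units where the speed of light is 1: a standard Lorentz boost of the time coordinate shows that events which something slower than light could connect keep their order for every observer, while events too far apart for that do not.

```python
# Do all observers agree on which of two events came first? Units: c = 1.
import math

def boost_time(t, x, v):
    """Time coordinate of the event (t, x) for an observer moving at velocity v (|v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)

def orders_seen(event_a, event_b, velocities):
    """For each observer velocity, report which event that observer sees happen first."""
    results = []
    for v in velocities:
        ta = boost_time(*event_a, v)
        tb = boost_time(*event_b, v)
        results.append("A first" if ta < tb else "B first" if tb < ta else "same time")
    return results

velocities = [-0.9, 0.0, 0.9]

# Timelike separation: B happens 1 unit of time after A, only 0.5 units of distance away.
# Something slower than light could connect them, so every observer agrees A comes first.
print(orders_seen((0.0, 0.0), (1.0, 0.5), velocities))   # ['A first', 'A first', 'A first']

# Spacelike separation: B happens 0.5 units of time after A, but 2 units of distance away.
# Nothing slower than light could connect them, and observers disagree about the order.
print(orders_seen((0.0, 0.0), (0.5, 2.0), velocities))   # the order depends on the observer
```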

So if you want to make sure that your theory respects cause and effect, you’d better be sure that nothing goes faster than light. It turns out, this is not automatic! In general relativity, an effect called Shapiro time delay makes light take longer to pass a heavy object than to go through empty space. If you modify general relativity, you can accidentally get a theory with a Shapiro time advance, where light arrives sooner than it would through empty space. In such a theory, at least some observers will see effects happen before their causes!
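
For scale, the standard general relativity result, in the usual approximation where both endpoints sit at distances r₁ and r₂ from the mass M that are large compared to the closest-approach distance b, is roughly

\[
\Delta t \;\approx\; \frac{2GM}{c^{3}} \, \ln\!\left(\frac{4\, r_1 r_2}{b^{2}}\right),
\]

which is always positive: light passing the mass arrives late, never early. The modified theories mentioned above are the ones where the corresponding quantity can come out negative.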

Once you know how to check this, as a physicist, there are two kinds of stories you can tell. I’ve heard different people in the field tell both.

First, you can say that cause and effect should be a basic physical principle. Using this principle, you can derive other restrictions, demands on what properties matter and energy can have. You can carve away theories that violate these rules, making sure that we’re testing for theories that actually make sense.

On the other hand, there are a lot of stories about time travel. Time travel screws up cause and effect in a very direct way. When Harry Potter and Hermione travel back in time at the end of Harry Potter and the Prisoner of Azkaban, they cause the event that saves Harry’s life earlier in the book. Science fiction and fantasy are full of stories like this, and many of them are perfectly consistent. How can we be so sure that we don’t live in such a world?

The other type of story positions the physics of cause and effect as a search for evidence. We’re looking for physics that violates cause and effect, because if it exists, then on some small level it should be possible to travel back in time. By writing down the consequences of cause and effect, we get to describe what evidence we’d need to see it breaking down, and if we see it, whole new possibilities open up.

These are both good stories! And like all other stories in science, they only capture part of what the scientists are up to. Some people stick to one or the other, some go between them, driven by the actual research, not the story itself. Like cause and effect itself, the story is just one way to describe the world around us.

Stories Backwards and Forwards

You can always start with “once upon a time”…

I come up with tricks to make calculations in particle physics easier. That’s my one-sentence story, or my most common one. If I want to tell a longer story, I have more options.

Here’s one longer story:

I want to figure out what Nature is telling us. I want to take all the data we have access to that has anything to say about fundamental physics, every collider and gravitational wave telescope and ripple in the overall structure of the universe, and squeeze it as hard as I can until something comes out. I want to make sure we understand the implications of our current best theories as well as we can, to as high precision as we can, because I want to know whether they match what we see.

To do that, I am starting with a type of calculation I know how to do best. That’s both because I can make progress with it, and because it will be important for making these inferences, for testing our theories. I am following a hint in a theory that definitely does not describe the real world, one that is both simpler to work with and surprisingly complex, one that has a good track record, both for me and others, for advancing these calculations. And at the end of the day, I’ll make our ability to infer things from Nature that much better.

Here’s another:

Physicists, unknowingly, proposed a kind of toy model, one often simpler to work with but not necessarily simpler to describe. Using this model, they pursued increasingly elaborate calculations, and time and time again, those calculations surprised them. The results were not random, not a disorderly mess of everything they could plausibly have gotten. Instead, they had structure, symmetries and patterns and mathematical properties that the physicists can’t seem to explain. If we can explain them, we will advance our knowledge of models and theories and ideas, geometry and combinatorics, learning more about the unexpected consequences of the rules we invent.

We can also help the physicists advance physics, of course. That’s a happy accident, but one that justifies the money and time, showing the rest of the world that understanding consequences of rules is still important and valuable.

These seem like very different stories, but they’re not so different. They change in order, physics then math or math then physics, backwards and forwards. By doing that, they change in emphasis, in where they’re putting glory and how they’re catching your attention. But at the end of the day, I’m investigating mathematical mysteries, and I’m advancing our ability to do precision physics.

(Maybe you think that my motivation must lie with one of these stories and not the other. One is “what I’m really doing”, the other is a lie made up for grant agencies.
Increasingly, I don’t think people work like that. If we are at heart stories, we’re retroactive stories. Our motivation day to day doesn’t follow one neat story or another. We move forward, we maybe have deep values underneath, but our accounts of “why” can and will change depending on context. We’re human, and thus as messy as that word entails.)

I can tell more than two stories if I want to. I won’t here. But this is largely what I’m working on at the moment. In applying for grants, I need to get the details right, to sprinkle the right references and the right scientific arguments, but the broad story is equally important. I keep shuffling that story, a pile of not-quite-literal index cards, finding different orders and seeing how they sound, imagining my audience and thinking about what stories would work for them.