Tag Archives: philosophy of science

Of Cows and Razors

Last week’s post came up on Reddit, where a commenter made a good point. I said that one of the mysteries of neutrinos is that they might not get their mass from the Higgs boson. This is true, but the commenter rightly points out it’s true of other particles too: electrons might not get their mass from the Higgs. We aren’t sure. The lighter quarks might not get their mass from the Higgs either.

When talking physics with the public, we usually say that electrons and quarks all get their mass from the Higgs. That’s how it works in our Standard Model, after all. But even though we’ve found the Higgs boson, we can’t be 100% sure that it functions the way our model says. That’s because there are aspects of the Higgs we haven’t been able to measure directly. We’ve measured how it affects the heaviest quark, the top quark, but measuring its interactions with other particles will require a bigger collider. Until we have those measurements, the possibility remains open that electrons and quarks get their mass another way. It would be a more complicated way: we know the Higgs does a lot of what the model says, so if it deviates in another way we’d have to add more details, maybe even more undiscovered particles. But it’s possible.

If I wanted to defend the idea that neutrinos are special here, I would point out that neutrino masses, unlike electron masses, are not part of the Standard Model. For electrons, we have a clear “default” way for them to get mass, and that default is in a meaningful way simpler than the alternatives. For neutrinos, every alternative is complicated in some fashion: either adding undiscovered particles, or unusual properties. If we were to invoke Occam’s Razor, the principle that we should always choose the simplest explanation, then for electrons and quarks there is a clear winner. Not so for neutrinos.

I’m not actually going to make this argument. That’s because I’m a bit wary of using Occam’s Razor when it comes to questions of fundamental physics. Occam’s Razor is a good principle to use, if you have a good idea of what’s “normal”. In physics, you don’t.

To illustrate, I’ll tell an old joke about cows and trains. Here’s the version from The Curious Incident of the Dog in the Night-Time:

There are three men on a train. One of them is an economist and one of them is a logician and one of them is a mathematician. And they have just crossed the border into Scotland (I don’t know why they are going to Scotland) and they see a brown cow standing in a field from the window of the train (and the cow is standing parallel to the train). And the economist says, ‘Look, the cows in Scotland are brown.’ And the logician says, ‘No. There are cows in Scotland of which at least one is brown.’ And the mathematician says, ‘No. There is at least one cow in Scotland, of which one side appears to be brown.’

One side of this cow appears to be very fluffy.

If we want to be as careful as possible, the mathematician’s answer is best. But we expect not to have to be so careful. Maybe the economist’s answer, that Scottish cows are brown, is too broad. But we could imagine an agronomist who states “There is a breed of cows in Scotland that is brown”. And I suggest we should find that pretty reasonable. Essentially, we’re using Occam’s Razor: if we want to explain seeing a brown half-cow from a train, the simplest explanation would be that it’s a member of a breed of cows that are brown. It would be less simple if the cow were unique, a brown mutant in a breed of black and white cows. It would be even less simple if only one side of the cow were brown, and the other were another color.

When we use Occam’s Razor in this way, we’re drawing from our experience of cows. Most of the cows we meet are members of some breed or other, with similar characteristics. We don’t meet many mutant cows, or half-colored cows, so we think of those options as less simple, and less likely.

But what kind of experience tells us which option is simpler for electrons, or neutrinos?

The Standard Model is a type of theory called a Quantum Field Theory. We have experience with other Quantum Field Theories: we use them to describe materials, metals and fluids and so forth. Still, it seems a bit odd to say that if something is typical of these materials, it should also be typical of the universe. As another physicist in my sub-field, Nima Arkani-Hamed, likes to say, “the universe is not a crappy metal!”

We could also draw on our experience from other theories in physics. This is a bit more productive, but has other problems. Our other theories are invariably incomplete; that’s why we come up with new theories in the first place…and with so few theories, compared to breeds of cows, it’s unclear that we really have a good basis for experience.

Physicists like to brag that we study the most fundamental laws of nature. Ordinarily, this doesn’t matter as much as we pretend: there’s a lot to discover in the rest of science too, after all. But here, it really makes a difference. Unlike other fields, we don’t know what’s “normal”, so we can’t really tell which theories are “simpler” than others. We can make aesthetic judgements, on the simplicity of the math or the number of fields or the quality of the stories we can tell. If we want to be principled and forgo all of that, then we’re left staring into an abyss, a world of bare observations and parameter soup.

If a physicist looks out a train window, will they say that all the electrons they see get their mass from the Higgs? Maybe, still. But they should be careful about it.

Digging for Buried Insight

The scientific method, as we usually learn it, starts with a hypothesis. The scientist begins with a guess, and asks a question with a clear answer: true, or false? That guess lets them design an experiment, observe the consequences, and improve our knowledge of the world.

But where did the scientist get the hypothesis in the first place? Often, through some form of exploratory research.

Exploratory research is research done, not to answer a precise question, but to find interesting questions to ask. Each field has its own approach to exploration. A psychologist might start with interviews, asking broad questions to find narrower questions for a future survey. An ecologist might film an animal, looking for changes in its behavior. A chemist might measure many properties of a new material, seeing if any stand out. Each approach is like digging for treasure, not sure of exactly what you will find.

Mathematicians and theoretical physicists don’t do experiments, but we still need hypotheses. We need an idea of what we plan to prove, or what kind of theory we want to build: like other scientists, we want to ask a question with a clear, true/false answer. And to find those questions, we still do exploratory research.

What does exploratory research look like, in the theoretical world? Often, it begins with examples and calculations. We can start with a known method, or a guess at a new one, a recipe for doing some specific kind of calculation. Recipe in hand, we proceed to do the same kind of calculation for a few different examples, covering different sorts of situations. Along the way, we notice patterns: maybe the same steps happen over and over, or the result always has some feature.

We can then ask, do those same steps always happen? Does the result really always have that feature? We have our guess, our hypothesis, and our attempt to prove it is much like an experiment. If we find a proof, our hypothesis was true. On the other hand, we might not be able to find a proof. Instead, exploring, we might find a counterexample – one where the steps don’t occur, the feature doesn’t show up. That’s one way to learn that our hypothesis was false.
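To make that loop concrete, here is a toy sketch (my own illustration, not an example from actual research): Euler’s famous polynomial n² + n + 41 gives a prime for every small n you try, which tempts you into a hypothesis, and a short search turns up the counterexample.

```python
# Toy version of "explore, spot a pattern, hunt for a counterexample".
def is_prime(m):
    return m > 1 and all(m % d for d in range(2, int(m**0.5) + 1))

# Exploration: compute a few examples and notice they're all prime.
print([n * n + n + 41 for n in range(8)])   # 41, 43, 47, 53, 61, 71, 83, 97

# Hypothesis: "n^2 + n + 41 is always prime." Now look for a counterexample.
counterexample = next(n for n in range(1000) if not is_prime(n * n + n + 41))
print(counterexample)                        # 40: since 40*40 + 40 + 41 = 41*41
```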

This kind of exploration is essential to discovery. As scientists, we all have to eventually ask clear yes/no questions, to submit our beliefs to clear tests. But we can’t start with those questions. We have to dig around first, to observe the world without a clear plan, to get to a point where we have a good question to ask.

Who Is, and Isn’t, Counting Angels on a Pinhead

How many angels can dance on the head of a pin?

It’s a question famous for its sheer pointlessness. While probably no-one ever had that exact debate, “how many angels fit on a pin” has become a metaphor, first for a host of old theology debates that went nowhere, and later for any academic study that seems like a waste of time. Occasionally, physicists get accused of doing this: typically string theorists, but also people who debate interpretations of quantum mechanics.

Are those accusations fair? Sometimes yes, sometimes no. In order to tell the difference, we should think about what’s wrong, exactly, with counting angels on the head of a pin.

One obvious answer is that knowing the number of angels that fit on a needle’s point is useless. Wikipedia suggests that was the origin of the metaphor in the first place, a pun on “needle’s point” and “needless point”. But this answer is a little too simple, because this would still be a useful debate if angels were real and we could interact with them. “How many angels fit on the head of a pin” is really a question about whether angels take up space, whether two angels can be at the same place at the same time. Asking that question about particles led physicists to bosons and fermions, which among other things led us to invent the laser. If angelology worked, perhaps we would have angel lasers as well.

Be not afraid of my angel laser

“If angelology worked” is key here, though. Angelology didn’t work: it didn’t lead to angel-based technology. And while Medieval people couldn’t have known that for certain, maybe they could have guessed. When people accuse academics of “counting angels on the head of a pin”, they’re saying those academics should be able to guess that their work is destined for uselessness.

How do you guess something like that?

Well, one problem with counting angels is that nobody doing the counting had ever seen an angel. Counting angels on the head of a pin implies debating something you can’t test or observe. That can steer you off-course pretty easily, into conclusions that are either useless or just plain wrong.

This can’t be the whole of the problem though, because of mathematics. We rarely accuse mathematicians of counting angels on the head of a pin, but the whole point of math is to proceed by pure logic, without an experiment in sight. Mathematical conclusions can sometimes be useless (though we can never be sure; some ideas are just ahead of their time), but we don’t expect them to be wrong.

The key difference is that mathematics has clear rules. When two mathematicians disagree, they can look at the details of their arguments, make sure every definition is as clear as possible, and discover which one made a mistake. Working this way, what they build is reliable. Even if it isn’t useful yet, the result is still true, and so may well be useful later.

In contrast, when you imagine Medieval monks debating angels, you probably don’t imagine them with clear rules. They might quote contradictory bible passages, argue everyday meanings of words, and win based more on who sounded poetic and authoritative than on whose argument actually held up. Picturing a debate over how many angels can fit on the head of a pin, it seems more like Calvinball than like mathematics.

This then, is the heart of the accusation. Saying someone is just debating how many angels can dance on a pin isn’t merely saying they’re debating the invisible. It’s saying they’re debating in a way that won’t go anywhere, a debate without solid basis or reliable conclusions. It’s saying, not just that the debate is useless now, but that it will likely always be useless.

As an outsider, you can’t just dismiss a field because it can’t do experiments. What you can and should do is dismiss a field that can’t produce reliable knowledge. This can be hard to judge, but a key sign is to look for these kinds of Calvinball-style debates. Do people in the field seem to argue the same things with each other, over and over? Or do they make progress and open up new questions? Do the people talking seem to be just the famous ones? Or are there cases of young and unknown researchers who happen upon something important enough to make an impact? Do people just list prior work in order to state their counter-arguments? Or do they build on it, finding consequences of others’ trusted conclusions?

A few corners of string theory do have this Calvinball feel, as do a few of the debates about the fundamentals of quantum mechanics. But if you look past the headlines and blogs, most of each of these fields seems more reliable. Rather than interminable back-and-forth about angels and pinheads, these fields are quietly accumulating results that, one way or another, will give people something to build on.

Theoretical Uncertainty and Uncertain Theory

Yesterday, Fermilab’s Muon g-2 experiment announced a new measurement of the magnetic moment of the muon, a number which describes how muons interact with magnetic fields. It might seem like a small technical detail, but physicists have been very excited about this measurement, because it’s a detail the Standard Model seems to get wrong, making it a potential hint of new undiscovered particles. Quanta magazine has a great piece on the announcement, which explains more than I will here, but the upshot is that there are two different calculations on the market that attempt to predict the magnetic moment of the muon. One of them, using older methods, disagrees with the experiment. The other, with a new approach, agrees. The question then becomes, which calculation was wrong? And why?

What does it mean for a prediction to match an experimental result? The simple, wrong, answer is that the numbers must be equal: if you predict “3”, the experiment has to measure “3”. The reason why this is wrong is that in practice, every experiment and every prediction has some uncertainty. If you’ve taken a college physics class, you’ve run into this kind of uncertainty in one of its simplest forms, measurement uncertainty. Measure with a ruler, and you can only confidently measure down to the smallest divisions on the ruler. If you measure 3cm, but your ruler has ticks only down to a millimeter, then what you’re measuring might be as large as 3.1cm or as small as 2.9 cm. You just don’t know.

This uncertainty doesn’t mean you throw up your hands and give up. Instead, you estimate the effect it can have. You report, not a measurement of 3cm, but of 3cm plus or minus 1mm. If the prediction was 2.9cm, then you’re fine: it falls within your measurement uncertainty.

Measurements aren’t the only thing that can be uncertain. Predictions have uncertainty too, theoretical uncertainty. Sometimes, this comes from uncertainty on a previous measurement: if you make a prediction based on that experiment that measured 3cm plus or minus 1mm, you have to take that plus or minus into account and estimate its effect (we call this propagation of errors). Sometimes, the uncertainty comes instead from an approximation you’re making. In particle physics, we sometimes approximate interactions between different particles with diagrams, beginning with the simplest diagrams and adding on more complicated ones as we go. To estimate the uncertainty there, we estimate the size of the diagrams we left out, the more complicated ones we haven’t calculated yet. Other times, that approximation doesn’t work, and we need to use a different approximation, treating space and time as a finite grid where we can do computer simulations. In that case, you can estimate your uncertainty based on how small you made your grid. The new approach to predicting the muon magnetic moment uses that kind of approximation.
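If you want to see how propagation of errors works in practice, here is a minimal sketch (a toy example of my own, not tied to the muon calculation): take a measured quantity with its uncertainty, push it through a calculation many times, and read off the spread of the results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy propagation of errors: a measured length of 3.0 cm with a 0.1 cm
# uncertainty, fed into a prediction that happens to depend on its square.
length = rng.normal(loc=3.0, scale=0.1, size=100_000)   # cm
prediction = length**2                                   # cm^2

# The spread of the outputs is the propagated (theoretical) uncertainty.
print(f"{prediction.mean():.2f} +/- {prediction.std():.2f} cm^2")
# Roughly 9.0 +/- 0.6, matching the usual rule delta(x^2) ~ 2 x delta(x).
```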

There’s a common thread in all of these uncertainty estimates: you don’t expect to be too far off on average. Your measurements won’t be perfect, but they won’t all be screwed up in the same way either: chances are, they will randomly be a little below or a little above the truth. Your calculations are similar: whether you’re ignoring complicated particle physics diagrams or the spacing in a simulated grid, you can treat the difference as something small and random. That randomness means you can use statistics to talk about your errors: you have statistical uncertainty. When you have statistical uncertainty, you can estimate, not just how far off you might get, but how likely it is you ended up that far off. In particle physics, we have very strict standards for this kind of thing: to call something new a discovery, we demand that it is so unlikely that it would only show up randomly under the old theory roughly one in a million times. The muon magnetic moment isn’t quite up to our standards for a discovery yet, but the new measurement brought it closer.
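That “roughly one in a million” discovery standard is usually quoted as “five sigma”. Assuming Gaussian statistics, you can translate between sigmas and probabilities in a couple of lines; the four-sigma line below is just there for comparison.

```python
from scipy.stats import norm

# Chance of the old theory randomly fluctuating at least five sigma (one-sided):
print(norm.sf(5))   # ~2.9e-7, a few parts in ten million
# For comparison, a four-sigma excess is far more likely to be a fluke:
print(norm.sf(4))   # ~3.2e-5
```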

The two dueling predictions for the muon’s magnetic moment both estimate some amount of statistical uncertainty. It’s possible that the two calculations just disagree due to chance, and that better measurements or a tighter simulation grid would make them agree. Given their estimates, though, that’s unlikely. That takes us from the realm of theoretical uncertainty, and into uncertainty about the theoretical. The two calculations use very different approaches. The new calculation tries to compute things from first principles, using the Standard Model directly. The risk is that such a calculation needs to make assumptions, ignoring some effects that are too difficult to calculate, and one of those assumptions may be wrong. The older calculation is based more on experimental results, using different experiments to estimate effects that are hard to calculate but that should be similar between different situations. The risk is that the situations may be less similar than expected, their assumptions breaking down in a way that the bottom-up calculation could catch.

None of these risks are easy to estimate. They’re “unknown unknowns”, or rather, “uncertain uncertainties”. And until some of them are resolved, it won’t be clear whether Fermilab’s new measurement is a sign of undiscovered particles, or just a (challenging!) confirmation of the Standard Model.

Reality as an Algebra of Observables

Listen to a physicist talk about quantum mechanics, and you’ll hear the word “observable”. Observables are, intuitively enough, things that can be observed. They’re properties that, in principle, one could measure in an experiment, like the position of a particle or its momentum. They’re the kinds of things linked by uncertainty principles, where the better you know one, the worse you know the other.

Some physicists get frustrated by this focus on measurements alone. They think we ought to treat quantum mechanics, not like a black box that produces results, but as information about some underlying reality. Instead of just observables, they want us to look for “beables”: not just things that can be observed, but things that something can be. From their perspective, the way other physicists focus on observables feels like giving up, like those physicists are abandoning their sacred duty to understand the world. Others, like the Quantum Bayesians or QBists, disagree, arguing that quantum mechanics really is, and ought to be, a theory of how individuals get evidence about the world.

I’m not really going to weigh in on that debate; I still don’t feel like I know enough to even write a decent summary. But I do think that one of the instincts on the “beables” side is wrong. If we focus on observables in quantum mechanics, I don’t think we’re doing anything all that unusual. Even in other parts of physics, we can think about reality purely in terms of observations. Doing so isn’t a dereliction of duty: often, it’s the most useful way to understand the world.

When we try to comprehend the world, we always start alone. From our time in the womb, we have only our senses and emotions to go on. With a combination of instinct and inference we start assembling a consistent picture of reality. Philosophers called phenomenologists (not to be confused with the physicists called phenomenologists) study this process in detail, trying to characterize how different things present themselves to an individual consciousness.

For my point here, these details don’t matter so much. That’s because in practice, we aren’t alone in understanding the world. Based on what others say about the world, we conclude they perceive much like we do, and we learn by their observations just as we learn by our own. We can make things abstract: instead of the specifics of how individuals perceive, we think about groups of scientists making measurements. At the end of this train lie observables: things that we as a community could in principle learn, and share with each other, ignoring the details of how exactly we measure them.

If each of these observables were unrelated, just scattered points of data, then we couldn’t learn much. Luckily, they are related. In quantum mechanics, some of these relationships are the uncertainty principles I mentioned earlier. Others relate measurements at different places, or at different times. The fancy way to refer to all these relationships is as an algebra: loosely, it’s something you can “do algebra with”, like you did with numbers and variables in high school. When physicists and mathematicians want to do quantum mechanics or quantum field theory seriously, they often talk about an “algebra of observables”, a formal way of thinking about all of these relationships.
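For a feel of what such an algebra looks like, here is a minimal sketch using the simplest quantum system, a spin-1/2 particle (my choice of example, not one from the post): the observables are spin measurements along three axes, and the algebra is the set of relations between them.

```python
import numpy as np

# Spin measurements along x, y, z for a spin-1/2 particle (Pauli matrices).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# One relation in the algebra: x-spin and y-spin don't commute,
# and their commutator is (up to a factor) the z-spin observable.
commutator = sx @ sy - sy @ sx
print(np.allclose(commutator, 2j * sz))   # True: [sx, sy] = 2i sz

# Relations like this are exactly what uncertainty principles are made of:
# pin down the x-spin completely and the y-spin becomes maximally uncertain.
```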

Focusing on those two things, observables and how they are related, isn’t just useful in the quantum world. It’s an important way to think in other areas of physics too. If you’ve heard people talk about relativity, the focus on measurement screams out, in thought experiments full of abstract clocks and abstract yardsticks. Without this discipline, you find paradoxes, only to resolve them when you carefully track what each person can observe. More recently, physicists in my field have had success computing the chance particles collide by focusing on the end result, the actual measurements people can make, ignoring what might happen in between to cause that measurement. We can then break measurements down into simpler measurements, or use the structure of simpler measurements to guess more complicated ones. While we typically have done this in quantum theories, that’s not really a limitation: the same techniques make sense for problems in classical physics, like computing the gravitational waves emitted by colliding black holes.

With this in mind, we really can think of reality in those terms: not as a set of beable objects, but as a set of observable facts, linked together in an algebra of observables. Paring things down to what we can know in this way is more honest, and it’s also more powerful and useful. Far from a betrayal of physics, it’s the best advantage we physicists have in our quest to understand the world.

Inevitably Arbitrary

Physics is universal…or at least, it aspires to be. Drop an apple anywhere on Earth, at any point in history, and it will accelerate at roughly the same rate. When we call something a law of physics, we expect it to hold everywhere in the universe. It shouldn’t depend on anything arbitrary.

Sometimes, though, something arbitrary manages to sneak in. Even if the laws of physics are universal, the questions we want to answer are not: they depend on our situation, on what we want to know.

The simplest example is when we have to use units. The mass of an electron is the same here as it is on Alpha Centauri, the same now as it was when the first galaxies formed. But what is that mass? We could write it as 9.1093837015×10⁻³¹ kilograms, if we wanted to, but kilograms aren’t exactly universal. Their modern definition is at least based on physical constants, but with some pretty arbitrary numbers. It defines the Planck constant as 6.62607015×10⁻³⁴ Joule-seconds. Chase that number back, and you’ll find references to the Earth’s circumference and the time it takes to turn round on its axis. The mass of the electron may be the same on Alpha Centauri, but they’d never write it as 9.1093837015×10⁻³¹ kilograms.
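To see how the same mass looks under a different arbitrary choice, here is a quick sketch converting it into the units particle physicists actually use (standard constant values, nothing specific to this post):

```python
# The same electron mass, written with a different arbitrary choice of units.
m_e = 9.1093837015e-31   # kilograms
c = 299792458.0          # speed of light, m/s
eV = 1.602176634e-19     # joules per electron-volt

# Rest energy m*c^2, converted from joules to electron-volts:
print(f"{m_e * c**2 / eV / 1e6:.4f} MeV")   # ~0.5110 MeV
```

Neither number is more universal than the other; they just rest on different arbitrary choices.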

Units aren’t the only time physics includes something arbitrary. Sometimes, like with units, we make a choice of how we measure or calculate something. We choose coordinates for a plot, a reference frame for relativity, a zero for potential energy, a gauge for gauge theories and regularization and subtraction schemes for quantum field theory. Sometimes, the choice we make is instead what we measure. To do thermodynamics we must choose what we mean by a state, to call two substances water even if their atoms are in different places. Some argue a perspective like this is the best way to think about quantum mechanics. In a different context, I’d argue it’s why we say coupling constants vary with energy.

So what do we do, when something arbitrary sneaks in? We have a few options. I’ll illustrate each with the mass of the electron:

  • Make an arbitrary choice, and stick with it: There’s nothing wrong with measuring an electron in kilograms, if you’re consistent about it. You could even use ounces. You just have to make sure that everyone else you compare with is using the same units, or be careful to convert.
  • Make a “natural” choice: Why not set the speed of light and Planck’s constant to one? They come up a lot in particle physics, and all they do is convert between length and time, or time and energy. That way you can use the same units for all of them, and use something convenient, like electron-Volts. They even have electron in the name! Of course they also have “Volt” in the name, and Volts are as arbitrary as any other metric unit. A “natural” choice might make your life easier, but you should always remember it’s still arbitrary.
  • Make an efficient choice: This isn’t always the same as the “natural” choice. The units you choose have an effect on how difficult your calculation is. Sometimes, the best choice for the mass of an electron is “one electron-mass”, because it lets you calculate something else more easily. This is easier to illustrate with other choices: for example, if you have to pick a reference frame for a collision, picking one in which one of the objects is at rest, or where they move symmetrically, might make your job easier.
  • Stick to questions that aren’t arbitrary: No matter what units we use, the electron’s mass will be arbitrary. Its ratios to other masses won’t be though. No matter where we measure, dimensionless ratios like the mass of the muon divided by the mass of the electron, or the mass of the electron divided by the value of the Higgs field, will be the same. If we can make sure to ask only this kind of question, we can avoid arbitrariness. Note that we can think of even a mass in “kilograms” as this kind of question: what’s the ratio of the mass of the electron to “this arbitrary thing we’ve chosen”? In practice though, you want to compare things in the same theory, without the historical baggage of metric.

This problem may seem silly, and if we just cared about units it might be. But at the cutting-edge of physics there are still areas where the arbitrary shows up. Our choices of how to handle it, or how to avoid it, can be crucial to further progress.

Which Things Exist in Quantum Field Theory

If you ever think metaphysics is easy, learn a little quantum field theory.

Someone asked me recently about virtual particles. When talking to the public, physicists sometimes explain the behavior of quantum fields with what they call “virtual particles”. They’ll describe forces coming from virtual particles going back and forth, or a bubbling sea of virtual particles and anti-particles popping out of empty space.

The thing is, this is a metaphor. What’s more, it’s a metaphor for an approximation. As physicists, when we draw diagrams with more and more virtual particles, we’re trying to use something we know how to calculate with (particles) to understand something tougher to handle (interacting quantum fields). Virtual particles, at least as you’re probably picturing them, don’t really exist.

I don’t really blame physicists for talking like that, though. Virtual particles are a metaphor, sure, a way to talk about a particular calculation. But so is basically anything we can say about quantum field theory. In quantum field theory, it’s pretty tough to say which things “really exist”.

I’ll start with an example, neutrino oscillation.

You might have heard that there are three types of neutrinos, corresponding to the three “generations” of the Standard Model: electron-neutrinos, muon-neutrinos, and tau-neutrinos. Each is produced in particular kinds of reactions: electron-neutrinos, for example, get produced by beta-plus decay, when a proton turns into a neutron, an anti-electron, and an electron-neutrino.

Leave these neutrinos alone though, and something strange happens. Detect what you expect to be an electron-neutrino, and it might have changed into a muon-neutrino or a tau-neutrino. The neutrino oscillated.

Why does this happen?

One way to explain it is to say that electron-neutrinos, muon-neutrinos, and tau-neutrinos don’t “really exist”. Instead, what really exists are neutrinos with specific masses. These don’t have catchy names, so let’s just call them neutrino-one, neutrino-two, and neutrino-three. What we think of as electron-neutrinos, muon-neutrinos, and tau-neutrinos are each some mix (a quantum superposition) of these “really existing” neutrinos, specifically the mixes that interact nicely with electrons, muons, and tau leptons respectively. When you let them travel, it’s these neutrinos that do the traveling, and due to quantum effects that I’m not explaining here you end up with a different mix than you started with.
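The standard way to make this explanation quantitative is the two-flavor oscillation formula, sketched below; the mixing angle and mass-squared difference I plug in are purely illustrative numbers, not a fit to real data.

```python
import numpy as np

def survival_probability(L_km, E_GeV, sin2_2theta=0.85, dm2_eV2=2.5e-3):
    """Two-flavor approximation: the chance a neutrino is still detected as
    its original flavor after travelling L_km kilometers at energy E_GeV."""
    phase = 1.267 * dm2_eV2 * L_km / E_GeV
    return 1.0 - sin2_2theta * np.sin(phase) ** 2

print(survival_probability(L_km=0, E_GeV=1))     # 1.0: no distance, no change
print(survival_probability(L_km=500, E_GeV=1))   # < 1: the mix has drifted
```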

This probably seems like a perfectly reasonable explanation. But it shouldn’t. Because if you take one of these mass-neutrinos, and interact with an electron, or a muon, or a tau, then suddenly it behaves like a mix of the old electron-neutrinos, muon-neutrinos, and tau-neutrinos.

That’s because both explanations are trying to chop the world up in a way that can’t be done consistently. There aren’t electron-neutrinos, muon-neutrinos, and tau-neutrinos, and there aren’t neutrino-ones, neutrino-twos, and neutrino-threes. There’s a mathematical object (a vector space) that can look like either.

Whether you’re comfortable with that depends on whether you think of mathematical objects as “things that exist”. If you aren’t, you’re going to have trouble thinking about the quantum world. Maybe you want to take a step back, and say that at least “fields” should exist. But that still won’t do: we can redefine fields, add them together or even use more complicated functions, and still get the same physics. The kinds of things that exist can’t be like this. Instead you end up invoking another kind of mathematical object, equivalence classes.

If you want to be totally rigorous, you have to go a step further. You end up thinking of physics in a very bare-bones way, as the set of all observations you could perform. Instead of describing the world in terms of “these things” or “those things”, the world is a black box, and all you’re doing is finding patterns in that black box.

Is there a way around this? Maybe. But it requires thought, and serious philosophy. It’s not intuitive, it’s not easy, and it doesn’t lend itself well to 3d animations in documentaries. So in practice, whenever anyone tells you about something in physics, you can be pretty sure it’s a metaphor. Nice, describable, non-mathematical things typically don’t exist.

Science as Hermeneutics: Closer Than You’d Think

This post is once again inspired by a Ted Chiang short story. This time, it’s “The Evolution of Human Science”, which imagines a world in which super-intelligent “metahumans” have become incomprehensible to the ordinary humans they’ve left behind. Human scientists in that world practice “hermeneutics”: instead of original research, they try to interpret what the metahumans are doing, reverse-engineering their devices and observing their experiments.

Much like a blogger who, out of ideas, cribs them from books.

It’s a thought-provoking view of what science in the distant future could become. But it’s also oddly familiar.

You might think I’m talking about machine learning here. It’s true that in recent years people have started using machine learning in science, with occasionally mysterious results. There are even a few cases of physicists using machine-learning to suggest some property, say of Calabi-Yau manifolds, and then figuring out how to prove it. It’s not hard to imagine a day when scientists are reduced to just interpreting whatever the AIs throw at them…but I don’t think we’re quite there yet.

Instead, I’m thinking about my own work. I’m a particular type of theoretical physicist. I calculate scattering amplitudes, formulas that tell us the probabilities that subatomic particles collide in different ways. We have a way to calculate these, Feynman’s famous diagrams, but they’re inefficient, so researchers like me look for shortcuts.

How do we find those shortcuts? Often, it’s by doing calculations the old, inefficient way. We use older methods, look at the formulas we get, and try to find patterns. Each pattern is a hint at some new principle that can make our calculations easier. Sometimes we can understand the pattern fully, and prove it should hold. Other times, we observe it again and again and tentatively assume it will keep going, and see what happens if it does.

Either way, this isn’t so different from the hermeneutics scientists practice in the story. Feynman diagrams already “know” every pattern we find, like the metahumans in the story who already know every result the human scientists can discover. But that “knowledge” isn’t in a form we can understand or use. We have to learn to interpret it, to read between the lines and find underlying patterns, to end up with something we can hold in our own heads and put into action with our own hands. The truth may be “out there”, but scientists can’t be content with that. We need to get the truth “in here”. We need to interpret it for ourselves.

Unification That Does Something

I’ve got unification on the brain.

Recently, a commenter asked me what physicists mean when they say two forces unify. While typing up a response, I came across this passage, in a science fiction short story by Ted Chiang.

Physics admits of a lovely unification, not just at the level of fundamental forces, but when considering its extent and implications. Classifications like ‘optics’ or ‘thermodynamics’ are just straitjackets, preventing physicists from seeing countless intersections.

This passage sounds nice enough, but I feel like there’s a misunderstanding behind it. When physicists seek after unification, we’re talking about something quite specific. It’s not merely a matter of two topics intersecting, or describing them with the same math. We already plumb intersections between fields, including optics and thermodynamics. When we hope to find a unified theory, we do so because it does something. A real unified theory doesn’t just aid our calculations, it gives us new ways to alter the world.

To show you what I mean, let me start with something physicists already know: electroweak unification.

There’s a nice series of posts on the old Quantum Diaries blog that explains electroweak unification in detail. I’ll be a bit vaguer here.

You might have heard of four fundamental forces: gravity, electromagnetism, the strong nuclear force, and the weak nuclear force. You might have also heard that two of these forces are unified: the electromagnetic force and the weak nuclear force form something called the electroweak force.

What does it mean that these forces are unified? How does it work?

Zoom in far enough, and you don’t see the electromagnetic force and the weak force anymore. Instead you see two different forces, which I’ll call “W” and “B”. You’ll also see the Higgs field. And crucially, you’ll see the “W” and “B” forces interact with the Higgs.

The Higgs field is special because it has what’s called a “vacuum” value. Even in otherwise empty space, there’s some amount of “Higgs-ness” in the background, like the color of a piece of construction paper. This background Higgs-ness is in some sense an accident, just one stable way the universe happens to sit. In particular, it picks out an arbitrary kind of direction: parts of the “W” and “B” forces happen to interact with it, and parts don’t.

Now let’s zoom back out. We could, if we wanted, keep our eyes on the “W” and “B” forces. But that gets increasingly silly. As we zoom out we won’t be able to see the Higgs field anymore. Instead, we’ll just see different parts of the “W” and “B” behaving in drastically different ways, depending on whether or not they interact with the Higgs. It will make more sense to talk about mixes of the “W” and “B” fields, to distinguish the parts that are “lined up” with the background Higgs and the parts that aren’t. It’s like using “port” and “starboard” on a boat. You could use “north” and “south”, but that would get confusing pretty fast.

My cabin is on the west side of the ship…unless we’re sailing east….

What are those “mixes” of the “W” and “B” forces? Why, they’re the weak nuclear force and the electromagnetic force!
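If you want to see that mixing in numbers, here is a rough sketch (textbook-style, with approximate coupling values I’m supplying purely for illustration): the background Higgs generates a mass matrix for the “B” and “W” fields, and diagonalizing it picks out one massless mix (the photon) and one heavy mix (the Z boson).

```python
import numpy as np

# Rough textbook values: weak coupling, hypercharge coupling, Higgs background (GeV).
g, g_prime, v = 0.65, 0.36, 246.0

# Mass-squared matrix for the neutral (B, W3) fields, generated by the Higgs background.
mass_sq = (v**2 / 4) * np.array([[g_prime**2, -g * g_prime],
                                 [-g * g_prime, g**2]])

masses_sq, mixes = np.linalg.eigh(mass_sq)
print(np.sqrt(np.clip(masses_sq, 0.0, None)).round(1))   # ~[ 0.  91.4]
# One mix stays massless (the photon); the other gets a mass close to the measured
# Z boson mass. The columns of `mixes` say how much B and W3 go into each mix.
```

Change the background Higgs value, and these mixes change with it, which is exactly the sense in which this kind of unification “does something”.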

This, broadly speaking, is the kind of unification physicists look for. It doesn’t have to be a “mix” of two different forces: most of the models physicists imagine start with a single force. But the basic ideas are the same: that if you “zoom in” enough you see a simpler model, but that model is interacting with something that “by accident” picks a particular direction, so that as we zoom out different parts of the model behave in different ways. In that way, you could get from a single force to all the different forces we observe.

That “by accident” is important here, because that accident can be changed. That’s why I said earlier that real unification lets us alter the world.

To be clear, we can’t change the background Higgs field with current technology. The biggest collider we have can just make a tiny, temporary fluctuation (that’s what the Higgs boson is). But one implication of electroweak unification is that, with enough technology, we could. Because those two forces are unified, and because that unification is physical, with a physical cause, it’s possible to alter that cause, to change the mix and change the balance. This is why this kind of unification is such a big deal, why it’s not the sort of thing you can just chalk up to “interpretation” and ignore: when two forces are unified in this way, it lets us do new things.

Mathematical unification is valuable. It’s great when we can look at different things and describe them in the same language, or use ideas from one to understand the other. But it’s not the same thing as physical unification. When two forces really unify, it’s an undeniable physical fact about the world. When two forces unify, it does something.

The Parameter Was Inside You All Along

Sabine Hossenfelder had an explainer video recently on how to tell science from pseudoscience. This is a famously difficult problem, so naturally we have different opinions. I actually think the picture she draws is reasonably sound. But while it is a good criterion to tell whether you yourself are doing pseudoscience, it’s surprisingly tricky to apply it to other people.

Hossenfelder argues that science, at its core, is about explaining observations. To tell whether something is science or pseudoscience you need to ask, first, if it agrees with observations, and second, if it is simpler than those observations. In particular, a scientist should prefer models with fewer parameters. If your model has so many parameters that you can fit any observation, you’re not being scientific.

This is a great rule of thumb, one that as Hossenfelder points out forms the basis of a whole raft of statistical techniques. It does rely on one tricky judgement, though: how many parameters does your model actually have?

Suppose I’m one of those wacky theorists who propose a whole new particle to explain some astronomical mystery. Hossenfelder, being more conservative in these things, proposes a model with no new particles. Neither of our models fit the data perfectly. Perhaps my model fits a little better, but after all it has one extra parameter, from the new particle. If we want to compare our models, we should take that into account, and penalize mine.
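One standard way to formalize that penalty is the Akaike information criterion, AIC = 2k - 2 ln(L), where k is the number of parameters and L the best-fit likelihood. The sketch below uses invented parameter counts and likelihoods purely for illustration.

```python
def aic(num_params, log_likelihood):
    # Akaike information criterion: lower is better; each parameter costs 2.
    return 2 * num_params - 2 * log_likelihood

# My model: one extra particle, so one extra parameter, and a slightly better fit.
aic_new_particle = aic(num_params=4, log_likelihood=-10.0)
# The rival model: no new particle, slightly worse fit.
aic_no_new_particle = aic(num_params=3, log_likelihood=-10.5)

print(aic_new_particle, aic_no_new_particle)   # 28.0 vs 27.0
# Here the slightly better fit doesn't make up for the extra parameter.
```

Notice, though, that the formula only knows about the parameters you tell it about.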

Here’s the question, though: how do I know that Hossenfelder didn’t start out with more particles, and got rid of them to get a better fit? If she did, she had more parameters than I did. She just fit them away.

The problem here is closely related to one called the look-elsewhere effect. Scientists don’t publish everything they try. An unscrupulous scientist can do a bunch of different tests until one of them randomly works, and just publish that one, making the result look meaningful when really it was just random chance. Even if no individual scientist is unscrupulous, a community can do the same thing: many scientists testing many different models, until one accidentally appears to work.
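Here is a toy simulation of that community-wide version of the effect (my own illustration; the hundred tests and the correction method are arbitrary choices): run many tests on pure noise, publish only the best one, and it looks meaningful until you account for all the tests that were tried.

```python
import numpy as np

rng = np.random.default_rng(42)

# 100 independent tests of models that are all wrong: under pure noise,
# each test's p-value is just a uniform random number between 0 and 1.
num_tests = 100
p_values = rng.uniform(size=num_tests)

best_p = p_values.min()
print(f"headline p-value: {best_p:.4f}")    # typically around 0.01, looks impressive

# Correcting for how many tests were actually run (a simple Bonferroni factor):
print(f"corrected p-value: {min(best_p * num_tests, 1.0):.2f}")   # usually unremarkable
```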

As a scientist, you mostly know if your motivations are genuine. You know if you actually tried a bunch of different models or had good reasons from the start to pick the one you did. As someone judging other scientists, you often don’t have that luxury. Sometimes you can look at prior publications and see all the other attempts someone made. Sometimes they’ll even tell you explicitly what parameters they used and how they fit them. But sometimes, someone will swear up and down that their model is just the most natural, principled choice they could have made, and they never considered anything else. When that happens, how do we guard against the look-elsewhere effect?

The normal way to deal with the look-elsewhere effect is to consider, not just whatever tests the scientist claims to have done, but all tests they could reasonably have done. You need to count all the parameters, not just the ones they say they varied.

This works in some fields. If you have an idea of what’s reasonable and what’s not, you have a relatively manageable list of things to look at. You can come up with clear rules for which theories are simpler than others, and people will agree on them.

Physics doesn’t have it so easy. We don’t have any pre-set rules for what kind of model is “reasonable”. If we want to parametrize every “reasonable” model, the best we can do is use what are called Effective Field Theories, theories that try to describe every possible type of new physics in terms of its effect on the particles we already know. Even there, though, we need assumptions. The most popular effective field theory, called SMEFT, assumes the forces of the Standard Model keep their known symmetries. You get a different model if you relax that assumption, and even that model isn’t the most general: for example, it still keeps relativity intact. Try to make the most general model possible, and you end up waist-deep in parameter soup.

Subjectivity is a dirty word in science…but as far as I can tell it’s the only way out of this. We can try to count parameters when we can, and use statistical tools…but at the end of the day, we still need to make choices. We need to judge what counts as an extra parameter and what doesn’t, which possible models to compare to and which to ignore. That’s going to be dependent on our scientific culture, on fashion and aesthetics, there just isn’t a way around that. The best we can do is own up to our assumptions, and be ready to change them when we need to.