# The Unpublishable Dirty Tricks of Theoretical Physics

As the saying goes, it is better not to see laws or sausages being made. You’d prefer to see the clean package on the outside than the mess behind the scenes.

The same is true of science. A good paper tells a nice, clean story: a logical argument from beginning to end, with no extra baggage to slow it down. That story isn’t a lie: for any decent paper in theoretical physics, the conclusions will follow from the premises. Most of the time, though, it isn’t how the physicist actually did it.

The way we actually make discoveries is messy. It involves looking for inspiration in all the wrong places: pieces of old computer code and old problems, trying to reproduce this or that calculation with this or that method. In the end, once we find something interesting enough, we can reconstruct a clearer, cleaner story, something actually fit to publish. We hide the original mess partly for career reasons (easier to get hired if you tell a clean, heroic story), partly to be understood (a paper that embraced the mess of discovery would be a mess to read), and partly just due to that deep human instinct to not let others see us that way.

The trouble is, some of that “mess” is useful, even essential. And because it’s never published or put into textbooks, the only way to learn it is word of mouth.

A lot of these messy tricks involve numerics. Many theoretical physics papers derive things analytically, writing out equations in symbols. It’s easy to make a mistake in that kind of calculation, either writing something wrong on paper or as a bug in computer code. To correct mistakes, many things are checked numerically: we plug in numbers to make sure everything still works. Sometimes this means using an approximation, trying to make sure two things cancel to some large enough number of decimal places. Sometimes instead it’s exact: we plug in prime numbers, and can much more easily see if two things are equal, or if something is rational or contains a square root. Sometimes numerics aren’t just used to check something, but to find a solution: exploring many options in an easier numerical calculation, finding one that works, and doing it again analytically.
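To make the two kinds of check concrete, here’s a toy version in Python. The identity is one I made up for illustration (a simple partial-fraction relation), and the names `lhs` and `rhs` are just placeholders for “the expression you derived” and “the form you hope it simplifies to”. Treat it as a sketch of the idea, not anyone’s actual workflow.

```python
from fractions import Fraction
import random

def lhs(x, y):
    # The "derived" expression, as it might come out of a long calculation.
    return 1 / (x * (x + y))

def rhs(x, y):
    # The candidate simplified form (partial fractions in x).
    return (1 / x - 1 / (x + y)) / y

# Approximate check: random floating-point inputs, agreement to many digits.
random.seed(0)
x, y = random.uniform(1, 2), random.uniform(1, 2)
approx_ok = abs(lhs(x, y) - rhs(x, y)) < 1e-12

# Exact check: plug in distinct primes as exact rationals, so there is no
# rounding error and accidental cancellations are unlikely.
px, py = Fraction(7), Fraction(11)
exact_ok = lhs(px, py) == rhs(px, py)

print(approx_ok, exact_ok)
```

The floating-point version can only ever say “these agree to twelve digits”. The version with primes and exact fractions says they are equal at that point, full stop.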

“Ansatze” are also common: our fancy word for an educated guess. These we sometimes admit, when they’re at the core of a new scientific idea. But the more minor examples go un-mentioned. If a paper shows a nice clean formula and proves it’s correct, but doesn’t explain how the authors got it…probably, they used an ansatz. This trick can go hand-in-hand with numerics as well: make a guess, check it matches the right numbers, then try to see why it’s true.
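Here’s what “make a guess, check it matches the right numbers” can look like in miniature. This is a toy example of my own choosing, not taken from any paper: the “hard” quantity is the sum of the first n cubes, and the ansatz is that it’s a degree-four polynomial in n with rational coefficients. The helper `solve_exact` is a hypothetical name for a small exact linear solver written just for this sketch.

```python
from fractions import Fraction

def sum_of_cubes(n):
    # The "hard" quantity, computed by brute force.
    return sum(k**3 for k in range(1, n + 1))

def solve_exact(matrix, rhs):
    # Gauss-Jordan elimination over the rationals, so no rounding error.
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Ansatz: the answer is c0 + c1*n + c2*n^2 + c3*n^3 + c4*n^4.
# Five unknowns, so fit them using five data points.
points = range(1, 6)
coeffs = solve_exact([[n**p for p in range(5)] for n in points],
                     [sum_of_cubes(n) for n in points])

def guess(n):
    return sum(c * n**p for p, c in enumerate(coeffs))

# Check the guess against data that was not used in the fit.
matches = all(guess(n) == sum_of_cubes(n) for n in range(6, 50))
print(coeffs, matches)
```

The fit gives coefficients 0, 0, 1/4, 1/2, 1/4, and the guess keeps matching beyond the fitted points. That isn’t a proof, but it tells you what to try to prove.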

The messy tricks can also involve the code itself. In my field we often use “computer algebra” systems, programs to do our calculations for us. These systems are programming languages in their own right, and we need to write computer code for them. That code gets passed around informally, but almost never standardized. Mathematical concepts that come up again and again can be implemented very differently by different people, some much more efficiently than others.

I don’t think it’s unreasonable that we leave “the mess” out of our papers. They would certainly be hard to understand otherwise! But it’s a shame we don’t publish our dirty tricks somewhere, even in special “dirty tricks” papers. Students often start out assuming everything is done the clean way, and start doubting themselves when they notice it’s much too slow to make progress. Learning the tricks is a big part of learning to be a physicist. We should find a better way to teach them.

# Calculations of the Past

Last week was a birthday conference for one of the pioneers of my sub-field, Ettore Remiddi. I wasn’t there, but someone who was pointed me to some of the slides, including a talk by Stefano Laporta. For those of you who didn’t see my post a few weeks back, Laporta was one of Remiddi’s students, who developed one of the most important methods in our field and then vanished, spending ten years on an amazingly detailed calculation. Laporta’s talk covers more of the story, about what it was like to do precision calculations in that era.

“That era”, the 1990s through the 2000s, witnessed an enormous speedup in computers, and a corresponding speedup in what was possible. Laporta worked with Remiddi on the three-loop electron anomalous magnetic moment, something Remiddi had been working on since 1969. When Laporta joined in 1989, twenty-one of the seventy-two diagrams needed had still not been computed. They would polish them off over the next seven years, before Laporta dove into four loops. Twenty years later, he had that four-loop result to over a thousand digits.

One fascinating part of the talk is seeing how the computational techniques change over time, as language replaces language and computer clusters get involved. As a student, Laporta learns a lesson we all often need: that to avoid mistakes, we need to do as little by hand as possible, even for something as simple as copying a one-line formula. Looking at his review of others’ calculations, it’s remarkable how many theoretical results had to be dramatically corrected a few years down the line, and how much still might depend on theoretical precision.

Another theme was one of Remiddi suggesting something and Laporta doing something entirely different, and often much more productive. Whether it was using the arithmetic-geometric mean for an elliptic integral instead of Gaussian quadrature, or coming up with his namesake method, Laporta spent a lot of time going his own way, and Remiddi quickly learned to trust him.

There’s a lot more in the slides that’s worth reading, including a mention of one of this year’s Physics Nobelists. The whole thing is an interesting look at what it takes to press precision to the utmost, and dedicate years to getting something right.

# Of p and sigma

Ask a doctor or a psychologist if they’re sure about something, and they might say “it has p<0.05”. Ask a physicist, and they’ll say it’s a “5 sigma result”. On the surface, they sound like they’re talking about completely different things. As it turns out, they’re not quite that different.

Whether it’s a p-value or a sigma, what scientists are giving you is shorthand for a probability. The p-value is the probability itself, while sigma tells you how many standard deviations something is away from the mean on a normal distribution. For people not used to statistics this might sound very complicated, but it’s not so tricky in the end. There’s a graph, called a normal distribution, and you can look at how much of it is above a certain point, measured in units called standard deviations, or “sigmas”. That gives you your probability.

What are these numbers a probability of? At first, you might think they’re a probability of the scientist being right: of the medicine working, or the Higgs boson being there.

That would be reasonable, but it’s not how it works. Scientists can’t measure the chance they’re right. All they can do is compare models. When a scientist reports a p-value, what they’re doing is comparing to a kind of default model, called a “null hypothesis”. There are different null hypotheses for different experiments, depending on what the scientists want to test. For the Higgs, scientists looked at pairs of photons detected by the LHC. The null hypothesis was that these photons were created by other parts of the Standard Model, like the strong force, and not by a Higgs boson. For medicine, the null hypothesis might be that people get better on their own after a certain amount of time. That’s hard to estimate, which is why medical experiments use a control group: a similar group without the medicine, to see how much they get better on their own.

Once we have a null hypothesis, we can use it to estimate how likely it is that it produced the result of the experiment. If there was no Higgs, and all those photons just came from other particles, what’s the chance there would still be a giant pile of them at one specific energy? If the medicine didn’t do anything, what’s the chance the control group did that much worse than the treatment group?

Ideally, you want a small probability here. In medicine and psychology, you’re looking for a 5% probability, for p<0.05. In physics, you need 5 sigma to make a discovery, which corresponds to a one in 3.5 million probability. If the probability is low, then you can say that it would be quite unlikely for your result to happen if the null hypothesis was true. If you’ve got a better hypothesis (the Higgs exists, the medicine works), then you should pick that instead.
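If you want to convert between the two yourself, a few lines of Python will do it. This is a minimal sketch that assumes the one-sided convention (only the upper tail of the normal distribution counts), which is the convention that reproduces the “one in 3.5 million” figure. The function names are mine.

```python
import math
from statistics import NormalDist

def sigma_to_p(n_sigma):
    # Probability in the upper tail of a standard normal beyond n_sigma.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

def p_to_sigma(p):
    # Inverse of the above: how many sigmas leave probability p in the tail.
    return NormalDist().inv_cdf(1 - p)

print(1 / sigma_to_p(5))   # roughly 3.5 million
print(p_to_sigma(0.05))    # roughly 1.6 sigma
```

So the p<0.05 threshold corresponds to a bit more than one and a half sigma, which gives a sense of how far apart the two standards are.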

Note that this probability still uses a model: it’s the probability of the result given that the model is true. It isn’t the probability that the model is true, given the result. That probability is more important to know, but trickier to calculate. To get from one to the other, you need to include more assumptions: about how likely your model was to begin with, given everything else you know about the world. Depending on those assumptions, even the tiniest p-value might not show that your null hypothesis is wrong.
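To see how much those extra assumptions matter, here is a small sketch using Bayes’ theorem with two competing hypotheses. All the numbers are invented for illustration, and for simplicity it treats “the probability of the result given the model” as a single number for each hypothesis, which glosses over how p-values are really defined.

```python
def posterior_null(prior_null, p_data_given_null, p_data_given_alt):
    # Bayes' theorem with two competing hypotheses:
    # P(null | data) = P(data | null) P(null) / P(data)
    prior_alt = 1 - prior_null
    evidence = p_data_given_null * prior_null + p_data_given_alt * prior_alt
    return p_data_given_null * prior_null / evidence

# Same data, two different starting assumptions (illustrative numbers only).
open_minded = posterior_null(0.5, 0.01, 0.5)       # null and alternative equally likely
skeptical = posterior_null(0.999999, 0.01, 0.5)    # alternative is a one-in-a-million shot

print(open_minded)  # about 0.02: the null now looks unlikely
print(skeptical)    # above 0.99: the null still looks almost certain
```

The result was equally improbable under the null hypothesis in both cases. What changed is how plausible the alternative was to begin with.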

In practice, unfortunately, we usually can’t estimate all of those assumptions in detail. The best we can do is guess their effect, in a very broad way. That usually just means accepting a threshold for p-values, declaring some a discovery and others not. That limitation is part of why medicine and psychology demand p-values below 0.05, while physicists demand 5 sigma results. Medicine and psychology have some assumptions they can rely on: that people function like people, that biology and physics keep working. Physicists don’t have those assumptions, so we have to be extra-strict.

Ultimately, though, we’re all asking the same kind of question. And now you know how to understand it when we do.

# Searching for Stefano

On Monday, Quanta magazine released an article on a man who transformed the way we do particle physics: Stefano Laporta. I’d tipped them off that Laporta would make a good story: someone who came up with the bread-and-butter algorithm that fuels all of our computations, then vanished from the field for ten years, returning at the end with a 1,100-digit masterpiece. There’s a resemblance to Searching for Sugar Man, fans and supporters baffled that their hero is living in obscurity.

If anything, I worry I under-sold the story. When Quanta interviewed me, it was clear they were looking for ties to well-known particle physics results: was Laporta’s work necessary for the Higgs boson discovery, or linked to the controversy over the magnetic moment of the muon? I was careful, perhaps too careful, in answering. The Higgs, to my understanding, didn’t require so much precision for its discovery. As for the muon, the controversial part is a kind of calculation that wouldn’t use Laporta’s methods, while the un-controversial part was found numerically by a group that doesn’t use his algorithm either.

With more time now, I can make a stronger case. I can trace Laporta’s impact, show who uses his work and for what.

In particle physics, we have a lovely database called INSPIRE that lists all our papers. Here is Laporta’s page, his work sorted by number of citations. When I look today, I find his most cited paper, the one that first presented his algorithm, at the top, with a delightfully apt 1,001 citations. Let’s listen to a few of those 1,001 tales, and see what they tell us.

Once again, we’ll sort by citations. The top paper, “Higgs boson production at hadron colliders in NNLO QCD”, is from 2002. It computes the chance that a particle collider like the LHC could produce a Higgs boson. It in turn has over a thousand citations, headlined by two from the ATLAS and CMS collaborations: “Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC” and “Observation of a New Boson at a Mass of 125 GeV with the CMS Experiment at the LHC”. Those are the papers that announced the discovery of the Higgs, each with more than twelve thousand citations. Later in the list, there are design reports: discussions of why the collider experiments are built a certain way. So while it’s true that the Higgs boson could be seen clearly from the data, Laporta’s work still had a crucial role: with his algorithm, we could reassure experimenters that they really found the Higgs (not something else), and even more importantly, help them design the experiment so that they could detect it.

The next paper tells a similar story. A different calculation, with almost as many citations, feeding again into planning and prediction for collider physics.

The next few touch on my own corner of the field. “New Relations for Gauge-Theory Amplitudes” triggered a major research topic in its own right, one with its own conference series. Meanwhile, “Iteration of planar amplitudes in maximally supersymmetric Yang-Mills theory at three loops and beyond” served as a foundation for my own career, among many others. None of this would have happened without Laporta’s algorithm.

After that, more applications: fundamental quantities for collider physics, pieces of math that are used again and again. In particular, they are referenced again and again by the Particle Data Group, who collect everything we know about particle physics.

Further down still, and we get to specific code: FIRE and Reduze, programs made by others to implement Laporta’s algorithm, each with many uses in its own right.

All that, just from one of Laporta’s papers.

His ten-year magnum opus is more recent, and has fewer citations: checking now, just 139. Still, there are stories to tell there too.

I mentioned 1,100 digits earlier, and this might confuse some of you. The most precise prediction in particle physics, the magnetic behavior of the electron, has ten digits of precision. Laporta’s calculation didn’t change that, because what he calculated isn’t the only contribution: he calculated Feynman diagrams with four “loops”, which is its own approximation, one limited in precision by what might be contributed by more loops. The current result has Feynman diagrams with five loops as well (known to much less than 1,100 digits), but the diagrams with six or more are unknown, and can only be estimated. The result also depends on measurements, which themselves can’t reach 1,100 digits of precision.

So why would you want 1,100 digits, then? In a word, mathematics. The calculation involves exotic types of numbers called periods, more complicated cousins of numbers like pi. These numbers are related to each other, often in complicated and surprising ways, ways which are hard to verify without such extreme precision. An older result of Laporta’s inspired the physicist David Broadhurst and mathematician Anton Mellit to conjecture new relations between this type of numbers, relations that were only later proven using cutting-edge mathematics. The new result has inspired mathematicians too: Oliver Schnetz found hints of a kind of “numerical footprint”, special types of numbers tied to the physics of electrons. It’s a topic I’ve investigated myself, something I think could lead to much more efficient particle physics calculations.

In addition to being inspired by Laporta’s work, Broadhurst has advocated for it. He was the one who first brought my attention to Laporta’s story, with a moving description of welcoming him back to the community after his ten-year silence, writing a letter to help him get funding. I don’t have all the details of the situation, but the impression I get is that Laporta had virtually no academic support for those ten years: no salary, no students, having to ask friends elsewhere for access to computer clusters.

When I ask why someone with such an impact didn’t have a professorship, the answer I keep hearing is that he didn’t want to move away from his home town of Bologna. If you aren’t an academic, that won’t sound like much of an explanation: Bologna has a university after all, the oldest in the world. But that isn’t actually a guarantee of anything. Universities hire rarely, according to their own mysterious agenda. I remember another colleague whose wife worked for a big company. They offered her positions in several cities, including New York. They told her that, since New York has many universities, surely her husband could find a job at one of them? We all had a sad chuckle at that.

For almost any profession, a contribution like Laporta’s would let you live anywhere you wanted. That’s not true for academia, and it’s to our loss. By demanding that each scientist be able to pick up and move, we’re cutting talented people out of the field, filtering by traits that have nothing to do with our contributions to knowledge. I don’t know Laporta’s full story. But I do know that doing the work you love in the town you love isn’t some kind of unreasonable request. It’s a request academia should be better at fulfilling.

Each year, the Niels Bohr International Academy has a series of public talks. Part of Copenhagen’s Folkeuniversitet (“people’s university”), they attract a mix of older people who want to keep up with modern developments and young students looking for inspiration. I gave a talk a few days ago, as part of this year’s program. The last time I participated, back in 2017, I covered a topic that comes up a lot on this blog: “The Quest for Quantum Gravity”. This year, I was asked to cover something more unusual: “The Unreasonable Effectiveness of Mathematics in the Natural Sciences”.

Some of you might notice that title is already taken: it’s a famous lecture by the physicist Wigner, from 1959. Wigner posed an interesting question: why is advanced mathematics so useful in physics? Time and time again, mathematicians develop an idea purely for its own sake, only for physicists to find it absolutely indispensable to describe some part of the physical world. Should we be surprised that this keeps working? Suspicious?

I talked a bit about this: some of the answers people have suggested over the years, and my own opinion. But like most public talks, the premise was mostly a vehicle for cool examples: physicists through history bringing in new math, and surprising mathematical facts like the ones I talked about a few weeks back at Culture Night. Because of that, I was actually a bit unprepared to dive into the philosophical side of the topic (despite it being in principle a very philosophical topic!). When one of the audience members brought up mathematical Platonism, I floundered a bit, not wanting to say something that was too philosophically naive.

Well, if there’s anywhere I can be naive, it’s my own blog. I even have a label for Amateur Philosophy posts. So let’s do one.

Mathematical Platonism is the idea that mathematical truths “exist”: that they’re somewhere “out there” being discovered. On the other side, one might believe that mathematics is not discovered, but invented. For some reason, a lot of people with the latter opinion seem to think this has something to do with describing nature (for example, an essay a few years back by Lee Smolin defines mathematics as “the study of systems of evoked relationships inspired by observations of nature”).

I’m not a mathematical Platonist. I don’t even like to talk about which things do or don’t “exist”. But I also think that describing mathematics in terms of nature is missing the point. Mathematicians aren’t physicists. While there may have been a time when geometers argued over lines in the sand, these days mathematicians’ inspiration isn’t usually the natural world, at least not in the normal sense.

Instead, I think you can’t describe mathematics without describing mathematicians. A mathematical fact is, deep down, something a mathematician can say without other mathematicians shouting them down. It’s an allowed move in what my hazy secondhand memory of Wittgenstein wants to call a “language game”: something that gets its truth from a context of people interpreting and reacting to it, in the same way a move in chess matters only when everyone is playing by its rules.

This makes mathematics sound very subjective, and we’re used to the opposite: the idea that a mathematical fact is as objective as they come. The important thing to remember is that even with this kind of description, mathematics still ends up vastly less subjective than any other field. We care about subjectivity between different people: if a fact is “true” for Brits and “false” for Germans, then it’s a pretty limited fact. Mathematics is special because the “rules of its game” aren’t rules of one group or another. They’re rules that are in some sense our birthright. Any human who can read and write, or even just act and perceive, can act as a Turing Machine, a universal computer. With enough patience and paper, anything that you can prove to one person you can prove to another: you just have to give them the rules and let them follow them. It doesn’t matter how smart you are, or what you care about most: if something is mathematically true for others, it is mathematically true for you.

Some would argue that this is evidence for mathematical Platonism, that if something is a universal truth it should “exist”. Even if it does, though, I don’t think it’s useful to think of it in that way. Once you believe that mathematical truth is “out there”, you want to try to characterize it, to say something about it besides that it’s “out there”. You’ll be tempted to have an opinion on the Axiom of Choice, or the Continuum Hypothesis. And the whole point is that those aren’t sensible things to have opinions on, that having an opinion about them means denying the mathematical proofs that they are, in the “standard” axioms, undecidable. Whatever is “out there”, it has to include everything you can prove with every axiom system, whichever non-standard ones you can cook up, because mathematicians will happily work on any of them. The whole point of mathematics, the thing that makes it as close to objective as anything can be, is that openness: the idea that as long as an argument is good enough, as long as it can convince anyone prepared to wade through the pages, then it is mathematics. Nothing, so long as it can convince in the long-run, is excluded.

If we take this definition seriously, there are some awkward consequences. You could imagine a future in which every mind, everyone you might be able to do mathematics with, is crushed under some tyrant, forced to agree to something false. A real philosopher would dig into this corner case, and try to salvage the definition or throw it out. I’m not a real philosopher though. So all I can say is that while I don’t think that tyrant gets to define mathematics, I also don’t think there are good alternatives to my argument. Our only access to mathematics, and to truth in general, is through the people who pursue it. I don’t think we can define one without the other.

# In Uppsala for Elliptics 2021

I’m in Uppsala in Sweden this week, at an actual in-person conference.

Elliptics started out as a series of small meetings of physicists trying to understand how to make sense of elliptic integrals in calculations of colliding particles. It grew into a full-fledged yearly conference series. I organized last year, which naturally was an online conference. This year though, the stage was set for Uppsala University to host in person.

I should say mostly in person. It’s a hybrid conference, with some speakers and attendees joining on Zoom. Some couldn’t make it because of travel restrictions, or just wanted to be cautious about COVID. But seemingly just as many had other reasons, like teaching schedules or just long distances, that kept them from coming in person. We’re all wondering if this will become a long-term trend, where the flexibility of hybrid conferences lets people attend no matter their constraints.

The hybrid format worked better than expected, but there were still a few kinks. The audio was particularly tricky: it seemed like each day the organizers needed a new microphone setup to take questions. It’s always a little harder to understand someone on Zoom, especially when you’re sitting in an auditorium rather than focused on your own screen. Still, technological experience should make this work better in future.

Content-wise, the conference began with a “mini-school” of pedagogical talks on particle physics, string theory, and mathematics. I found the mathematical talks by Erik Panzer particularly nice: it’s a topic I still feel quite weak on, and he laid everything out in a very clear way. It seemed like a nice touch to include a “school” element in the conference, though I worry it ate too much into the time.

The rest of the content skewed more mathematical, and more string-theoretic, than these conferences have in the past. The mathematical content ranged from intriguing (including an interesting window into what it takes to get high-quality numerics) to intimidatingly obscure (large commutative diagrams, category theory on the first slide). String theory was arguably under-covered in prior years, but it felt over-covered this year. With the particle physics talks focusing either on general properties with perhaps some connections to elliptics, or on N=4 super Yang-Mills, it felt like we were missing the more “practical” talks from past conferences, where someone was computing something concrete in QCD and told us what they needed. Next year is in Mainz, so maybe those talks will reappear.

# Outreach Talk on Math’s Role in Physics

Tonight is “Culture Night” in Copenhagen, the night when the city throws open its doors and lets the public in. Museums and hospitals, government buildings and even the Freemasons, all have public events. The Niels Bohr Institute does too, of course: an evening of physics exhibits and demos, capped off with a public lecture by Denmark’s favorite bow-tie wearing weirder-than-usual string theorist, Holger Bech Nielsen. In between, there are a number of short talks by various folks at the institute, including yours truly.

In my talk, I’m going to try and motivate the audience to care about math. Math is dry of course, and difficult for some, but we physicists need it to do our jobs. If you want to be precise about a claim in physics, you need math simply to say what you want clearly enough.

Since you guys likely don’t overlap with my audience tonight, it should be safe to give a little preview. I’ll be using a few examples, but this one is the most complicated:

I’ll be telling a story I stole from chapter seven of the web serial Almost Nowhere. (That link is to the first chapter, by the way, in case you want to read the series without spoilers. It’s very strange, very unique, and at least in my view quite worth reading.) You follow a warrior carrying a spear around a globe along two different paths. The warrior tries to keep the spear always pointing in the same direction, but finds that the two paths leave the spear pointing in different directions when they meet. The story illustrates that such a simple concept as “what direction you are pointing” isn’t actually so simple: if you want to think about directions in curved space (like the surface of the Earth, but also, like curved space-time in general relativity) then you need more sophisticated mathematics (a notion called parallel transport) to make sense of it.

It’s kind of an advanced concept for a public talk. But seeing it show up in Almost Nowhere inspired me to try to get it across. I’ll let you know how it goes!
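For the curious, the spear story can be checked with a few lines of Python. This is a minimal sketch under my own assumptions: the globe is a unit sphere, each leg of the journey is a great-circle arc, and the particular route (from the equator to the north pole, either directly or after a quarter-turn along the equator) is one I picked for convenience. The function names are made up for this example. Along a great-circle arc, carrying the spear “without turning it” amounts to an ordinary rotation about the axis perpendicular to the arc, which is what `transport` does.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transport(spear, start, end):
    # Carry a tangent vector along the great-circle arc from start to end
    # on a unit sphere, using Rodrigues' rotation formula.
    # Assumes start and end are unit vectors, neither equal nor opposite.
    axis = cross(start, end)
    sin_t = math.sqrt(dot(axis, axis))
    cos_t = dot(start, end)
    k = tuple(x / sin_t for x in axis)
    k_cross_s = cross(k, spear)
    k_dot_s = dot(k, spear)
    return tuple(s * cos_t + c * sin_t + x * k_dot_s * (1 - cos_t)
                 for s, c, x in zip(spear, k_cross_s, k))

equator_a = (1.0, 0.0, 0.0)    # on the equator, longitude 0
equator_b = (0.0, 1.0, 0.0)    # on the equator, longitude 90 degrees
north_pole = (0.0, 0.0, 1.0)
spear = (0.0, 0.0, 1.0)        # at equator_a, pointing due north

# Path 1: straight up the meridian to the pole.
direct = transport(spear, equator_a, north_pole)

# Path 2: a quarter of the way around the equator, then up to the pole.
detour = transport(transport(spear, equator_a, equator_b),
                   equator_b, north_pole)

angle_deg = math.degrees(math.acos(dot(direct, detour)))
print(angle_deg)  # 90 degrees between the two spears
```

Both warriors did their best to never turn the spear, and they still end up disagreeing by a right angle. The mismatch depends on how much of the globe’s surface the two paths enclose, which is the kind of thing parallel transport keeps track of.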

By the way, if you are interested in learning the kinds of mathematics you need for theoretical physics, and you happen to be a Bachelor’s student planning to pursue a PhD, then consider the Perimeter Scholars International Master’s Program! It’s a one-year intensive at the Perimeter Institute in Waterloo, Ontario, in Canada. In a year it gives you a crash course in theoretical physics, giving you tools that will set you ahead of other beginning PhD students. I’ve witnessed it in action, and it’s really remarkable how much the students learn in a year, and what they go on to do with it. Their early registration deadline is on November 15, just a month away, so if you’re interested you may want to start thinking about it.

# Digging for Buried Insight

The scientific method, as we usually learn it, starts with a hypothesis. The scientist begins with a guess, and asks a question with a clear answer: true, or false? That guess lets them design an experiment, observe the consequences, and improve our knowledge of the world.

But where did the scientist get the hypothesis in the first place? Often, through some form of exploratory research.

Exploratory research is research done, not to answer a precise question, but to find interesting questions to ask. Each field has its own approach to exploration. A psychologist might start with interviews, asking broad questions to find narrower questions for a future survey. An ecologist might film an animal, looking for changes in its behavior. A chemist might measure many properties of a new material, seeing if any stand out. Each approach is like digging for treasure, not sure of exactly what you will find.

Mathematicians and theoretical physicists don’t do experiments, but we still need hypotheses. We need an idea of what we plan to prove, or what kind of theory we want to build: like other scientists, we want to ask a question with a clear, true/false answer. And to find those questions, we still do exploratory research.

What does exploratory research look like, in the theoretical world? Often, it begins with examples and calculations. We can start with a known method, or a guess at a new one, a recipe for doing some specific kind of calculation. Recipe in hand, we proceed to do the same kind of calculation for a few different examples, covering different sorts of situation. Along the way, we notice patterns: maybe the same steps happen over and over, or the result always has some feature.

We can then ask, do those same steps always happen? Does the result really always have that feature? We have our guess, our hypothesis, and our attempt to prove it is much like an experiment. If we find a proof, our hypothesis was true. On the other hand, we might not be able to find a proof. Instead, exploring, we might find a counterexample – one where the steps don’t occur, the feature doesn’t show up. That’s one way to learn that our hypothesis was false.
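A classic toy example of this loop, sketched in Python: compute a few examples, spot a pattern, then keep exploring to see whether it survives. The example is my choice rather than anything from my own research: Euler’s polynomial n² + n + 41, which seems to produce nothing but primes.

```python
import math

def is_prime(m):
    # Trial division; fine for the small numbers used here.
    if m < 2:
        return False
    return all(m % d for d in range(2, math.isqrt(m) + 1))

def candidate(n):
    # Euler's polynomial.
    return n * n + n + 41

# Exploration: the first few examples all give primes, suggesting a hypothesis.
explored = [is_prime(candidate(n)) for n in range(10)]

# Testing the hypothesis: keep going until something breaks.
first_failure = next(n for n in range(1000) if not is_prime(candidate(n)))
print(all(explored), first_failure, candidate(first_failure))
```

The pattern holds for every n from 0 through 39, then fails at n = 40, where the value is 1681 = 41 × 41. One counterexample is enough to tell us the hypothesis “always prime” was false.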

This kind of exploration is essential to discovery. As scientists, we all have to eventually ask clear yes/no questions, to submit our beliefs to clear tests. But we can’t start with those questions. We have to dig around first, to observe the world without a clear plan, to get to a point where we have a good question to ask.

# Who Is, and Isn’t, Counting Angels on a Pinhead

How many angels can dance on the head of a pin?

It’s a question famous for its sheer pointlessness. While probably no-one ever had that exact debate, “how many angels fit on a pin” has become a metaphor, first for a host of old theology debates that went nowhere, and later for any academic study that seems like a waste of time. Occasionally, physicists get accused of doing this: typically string theorists, but also people who debate interpretations of quantum mechanics.

Are those accusations fair? Sometimes yes, sometimes no. In order to tell the difference, we should think about what’s wrong, exactly, with counting angels on the head of a pin.

One obvious answer is that knowing the number of angels that fit on a needle’s point is useless. Wikipedia suggests that was the origin of the metaphor in the first place, a pun on “needle’s point” and “needless point”. But this answer is a little too simple, because this would still be a useful debate if angels were real and we could interact with them. “How many angels fit on the head of a pin” is really a question about whether angels take up space, whether two angels can be at the same place at the same time. Asking that question about particles led physicists to bosons and fermions, which among other things led us to invent the laser. If angelology worked, perhaps we would have angel lasers as well.

“If angelology worked” is key here, though. Angelology didn’t work: it didn’t lead to angel-based technology. And while Medieval people couldn’t have known that for certain, maybe they could have guessed. When people accuse academics of “counting angels on the head of a pin”, they’re saying those academics should be able to guess that their work is destined for uselessness.

How do you guess something like that?

Well, one problem with counting angels is that nobody doing the counting had ever seen an angel. Counting angels on the head of a pin implies debating something you can’t test or observe. That can steer you off-course pretty easily, into conclusions that are either useless or just plain wrong.

This can’t be the whole of the problem though, because of mathematics. We rarely accuse mathematicians of counting angels on the head of a pin, but the whole point of math is to proceed by pure logic, without an experiment in sight. Mathematical conclusions can sometimes be useless (though we can never be sure: some ideas are just ahead of their time), but we don’t expect them to be wrong.

The key difference is that mathematics has clear rules. When two mathematicians disagree, they can look at the details of their arguments, make sure every definition is as clear as possible, and discover which one made a mistake. Working this way, what they build is reliable. Even if it isn’t useful yet, the result is still true, and so may well be useful later.

In contrast, when you imagine Medieval monks debating angels, you probably don’t imagine them with clear rules. They might quote contradictory bible passages, argue everyday meanings of words, and win based more on who was poetic and authoritative than who really won the argument. Picturing a debate over how many angels can fit on the head of a pin, it seems more like Calvinball than like mathematics.

This, then, is the heart of the accusation. Saying someone is just debating how many angels can dance on a pin isn’t merely saying they’re debating the invisible. It’s saying they’re debating in a way that won’t go anywhere, a debate without solid basis or reliable conclusions. It’s saying not just that the debate is useless now, but that it will likely always be useless.

As an outsider, you can’t just dismiss a field because it can’t do experiments. What you can and should do is dismiss a field that can’t produce reliable knowledge. This can be hard to judge, but a key sign is to look for these kinds of Calvinball-style debates. Do people in the field seem to argue the same things with each other, over and over? Or do they make progress and open up new questions? Do the people talking seem to be just the famous ones? Or are there cases of young and unknown researchers who happen upon something important enough to make an impact? Do people just list prior work in order to state their counter-arguments? Or do they build on it, finding consequences of others’ trusted conclusions?

A few corners of string theory do have this Calvinball feel, as do a few of the debates about the fundamentals of quantum mechanics. But if you look past the headlines and blogs, most of each of these fields seems more reliable. Rather than interminable back-and-forth about angels and pinheads, these fields are quietly accumulating results that, one way or another, will give people something to build on.

A reader pointed me to Stephen Wolfram’s one-year update of his proposal for a unified theory of physics. I was pretty squeamish about it one year ago, and now I’m even less interested in wading into the topic. But I thought it would be worth saying something, and rather than say something specific, I realized I could say something general. I thought I’d talk a bit about how we judge good and bad research in theoretical physics.

In science, there are two things we want out of a new result: we want it to be true, and we want it to be surprising. The first condition should be obvious, but the second is also important. There’s no reason to do an experiment or calculation if it will just tell us something we already know. We do science in the hope of learning something new, and that means that the best results are the ones we didn’t expect.

(What about replications? We’ll get there.)

If you’re judging an experiment, you can measure both of these things with statistics. Statistics lets you estimate how likely an experiment’s conclusion is to be true: was there a large enough sample? Strong enough evidence? It also lets you judge how surprising the experiment is, by estimating how likely it would be to happen given what was known beforehand. Did existing theories and earlier experiments make the result seem likely, or unlikely? While you might not have considered replications surprising, from this perspective they can be: if a prior experiment seems unreliable, successfully replicating it can itself be a surprising result.
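To make "surprising" a little more concrete, here is a minimal sketch using Shannon surprisal, one common way (not the only one) to turn "how likely did this seem beforehand?" into a number. The function name `surprisal_bits` and the prior probabilities below are made up for illustration; in practice, estimating those priors is the hard part.

```python
import math


def surprisal_bits(prior_probability: float) -> float:
    """Return the Shannon surprisal, in bits, of an outcome that was
    assigned the given probability beforehand. Rarer outcomes score higher."""
    if not 0.0 < prior_probability <= 1.0:
        raise ValueError("prior_probability must be in (0, 1]")
    return -math.log2(prior_probability)


# Hypothetical priors, chosen only for illustration.
expected_result = 0.9     # existing theory already made this outcome likely
shaky_replication = 0.1   # the earlier experiment looked unreliable

print(f"expected result:   {surprisal_bits(expected_result):.2f} bits")
print(f"shaky replication: {surprisal_bits(shaky_replication):.2f} bits")
```

With these assumed numbers, the successful replication of a shaky experiment scores as far more surprising than a result everyone already expected, which is the sense in which a replication can be surprising.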

If instead you’re judging a theoretical result, these measures get more subtle. There aren’t always good statistical tools to test them. Nonetheless, you don’t have to rely on vague intuitions either. You can be fairly precise, both about how true a result is and how surprising it is.

We get our results in theoretical physics through mathematical methods. Sometimes, this is an actual mathematical proof: guaranteed to be true, no statistics needed. Sometimes, it resembles a proof, but falls short: vague definitions and unstated assumptions mar the argument, making it less likely to be true. Sometimes, the result uses an approximation. In those cases we do get to use some statistics, estimating how good the approximation may be. Finally, a result can’t be true if it contradicts something we already know. This could be a logical contradiction in the result itself, but if the result is meant to describe reality (note: not always the case), it might contradict the results of a prior experiment.

What makes a theoretical result surprising? And how precise can we be about that surprise?

Theoretical results can be surprising in the light of earlier theory. Sometimes, this gets made precise by a no-go theorem, a proof that some kind of theoretical result is impossible to obtain. If a result finds a loophole in a no-go theorem, that can be quite surprising. Other times, a result is surprising because it’s something no-one else was able to do. To be precise about that kind of surprise, you need to show that the result is something others wanted to do, but couldn’t. Maybe someone else made a conjecture, and only you were able to prove it. Maybe others did approximate calculations, and now you can do them more precisely. Maybe a question was controversial, with different people arguing for different sides, and you have a more conclusive argument. This is one of the better reasons to include a long list of references in a paper: not to pad your friends’ citation counts, but to show that your accomplishment is surprising, that others might have wanted to achieve it but had to settle for something lesser.

In general, this means that showing whether a theoretical result is good (not merely true, but surprising and new) links you up to the rest of the theoretical community. You can put in all the work you like on a theory of everything, and make it as rigorous as possible, but if all you did was reproduce a sub-case of someone else’s theory then you haven’t accomplished all that much. If you put your work in context, comparing and contrasting it with what others have done before, then we can start getting precise about how much we should be surprised, and get an idea of what your result is really worth.