# Shape the Science to the Statistics, Not the Statistics to the Science

In theatre, and more generally in writing, the advice is always to “show, don’t tell”. You could just tell your audience that Long John Silver is a ruthless pirate, but it works a lot better to show him marching a prisoner off the plank. Rather than just informing with words, you want to make things as concrete as possible, with actions.

There is a similar rule in pedagogy. Pedagogy courses teach you to be explicit about your goals, planning a course by writing down Intended Learning Outcomes. (They never seem amused when I ask about the Unintended Learning Outcomes.) At first, you’d want to write down outcomes like “students will understand calculus” or “students will know what a sine is”. These, however, are hard to judge, and thus hard to plan around. Instead, the advice is to write outcomes that correspond to actions you want the students to take, things you want them to be capable of doing: “students can perform integration by parts”, “students can decide correctly whether to use a sine or cosine”. Again and again, the best way to get the students to know something is to get them to do something.

Jay Daigle recently finished a series of blog posts on how scientists use statistics to test hypotheses. I recommend it: it’s a great introduction to the concepts scientists use to reason about data, as well as a discussion of how they often misuse those concepts and what they can do better. I have a bit of a different perspective on one of the “takeaways” of the series, and I wanted to highlight that here.

The center of Daigle’s point is a tool, widely used in science, called Neyman-Pearson Hypothesis Testing. Neyman-Pearson is a tool for making decisions using a threshold for significance, a threshold compared against a number that scientists call a p-value. If you follow the procedure, only acting when you find a p-value below 0.05, then you will only be wrong 5% of the time: specifically, that will be your rate of false positives, the percent of the time you conclude some action works when it really doesn’t.

A core problem, from Daigle’s perspective, is that scientists use Neyman-Pearson for the wrong purpose. Neyman-Pearson is a tool for making decisions, not a test that tells you whether or not a specific claim is true. It tells you “on average, if I approve drugs when their p-value is below 0.05, only 5% of them will fail”. That’s great if you can estimate how bad it is to deny a drug that should be approved, how bad it is to approve a drug that should be denied, and calculate out on average how often you can afford to be wrong. It doesn’t tell you anything about the specific drug, though. It doesn’t tell you “every drug with a p-value below 0.05 works”. It certainly doesn’t tell you “a drug with a p-value of 0.051 almost works” or “a drug with a p-value of 0.001 definitely works”. It just doesn’t give you that information.
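To make the decision procedure concrete, here’s a minimal simulation sketch in Python. The setup (a one-sided z-test on “drugs” with no real effect, and all the function names) is my own illustration, not something taken from Daigle’s posts:

```python
import math
import random

def p_value(sample_mean, n, sigma=1.0):
    """One-sided p-value for H0 'no effect', assuming known noise sigma (z-test)."""
    z = sample_mean * math.sqrt(n) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for a standard normal Z

def approve(data, alpha=0.05):
    """The Neyman-Pearson decision: act only when p falls below the threshold."""
    n = len(data)
    return p_value(sum(data) / n, n) < alpha

random.seed(0)
n_trials, n_obs = 10_000, 50
false_positives = 0
for _ in range(n_trials):
    # A drug with no real effect: every measurement is pure noise.
    data = [random.gauss(0.0, 1.0) for _ in range(n_obs)]
    if approve(data):
        false_positives += 1

rate = false_positives / n_trials
print(f"false positive rate: {rate:.3f}")  # hovers near the 0.05 threshold
```

The point of the simulation is exactly the one at issue here: the 5% is a property of the procedure averaged over many decisions, not a verdict on any single drug.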

In later posts, Daigle suggests better tools, which he argues map better to what scientists want to do, as well as general ways scientists can do better. Section 4 in particular focuses on the idea that one thing scientists need to do is ask better questions. He uses a specific example from cognitive psychology, a study that tests whether describing someone’s face makes you worse at recognizing it later. That’s a clear scientific question, one that can be tested statistically. That doesn’t mean it’s a good question, though. Daigle points out that questions like this have a problem: it isn’t clear what the result actually tells us.

Here’s another example of the same problem. In grad school, I knew a lot of social psychologists. One was researching a phenomenon called extended contact. Extended contact is meant to be a foil to another phenomenon called direct contact, both having to do with our views of other groups. In direct contact, making a friend from another group makes you view that whole group better. In extended contact, making a friend who has a friend from another group makes you view the other group better.

The social psychologist was looking into a concrete-sounding question: which of these phenomena, direct or extended contact, is stronger?

At first, that seems like it has the same problem as Daigle’s example. Suppose one of these effects is larger: what does that mean? Why do we care?

Well, one answer is that these aren’t just phenomena: they’re interventions. If you know one phenomenon is stronger than another, you can use that to persuade people to be more accepting of other groups. The psychologist’s advisor even had a procedure to make people feel like they made a new friend. Armed with that, it’s definitely useful to know whether extended contact or direct contact is better: whichever one is stronger is the one you want to use!

You do need some “theory” behind this, of course. You need to believe that, if a phenomenon is stronger in your psychology lab, it will be stronger wherever you try to apply it in the real world. It probably won’t be stronger every single time, so you need some notion of how much stronger it needs to be. That in turn means you need to estimate costs: what it costs if you pick the weaker one instead, how much money you’re wasting or harm you’re doing.

You’ll notice this is sounding a lot like the requirements I described earlier, for Neyman-Pearson. That’s no accident: as you try to make your science more and more clearly defined, it will get closer and closer to a procedure for making a decision, and that’s exactly what Neyman-Pearson is good for.

So in the end I’m quite a bit more supportive of Neyman-Pearson than Daigle is. That doesn’t mean it isn’t being used wrong: most scientists are using it wrong. Instead of calculating a p-value each time they make a decision, they do it at the end of a paper, misinterpreting it as evidence that one thing or another is “true”. But I think that what these scientists need to do is not change their statistics, but change their science. If they focused their science on making concrete decisions, they would actually be justified in using Neyman-Pearson…and their science would get a lot better in the process.

# A Scale of “Sure-Thing-Ness” for Experiments

No experiment is a sure thing. No matter what you do, what you test, what you observe, there’s no guarantee that you find something new. Even if you do your experiment correctly and measure what you planned to measure, nature might not tell you anything interesting.

Still, some experiments are more sure than others. Sometimes you’re almost guaranteed to learn something, even if it wasn’t what you hoped, while other times you just end up back where you started.

The first, and surest, type of experiment is a voyage into the unknown. When nothing is known about your target, no expectations, and no predictions, then as long as you successfully measure anything you’ll have discovered something new. This can happen if the thing you’re measuring was only recently discovered. If you’re the first person who manages to measure the reaction rates of an element, or the habits of an insect, or the atmosphere of a planet, then you’re guaranteed to find out something you didn’t know before.

If you don’t have a total unknown to measure, then you want to test a clear hypothesis. The best of these are the theory killers, experiments which can decisively falsify an idea. History’s most famous experiments take this form, like the measurement of the perihelion of Mercury to test General Relativity or Pasteur’s tests of spontaneous generation. When you have a specific prediction and not much wiggle room, an experiment can teach you quite a lot.

“Not much wiggle room” is key, because these tests can all too easily become theory modifiers instead. If you can tweak your theory enough, then your experiment might not be able to falsify it. Something similar applies when you have a number of closely related theories. Even if you falsify one, you can just switch to another similar idea. In those cases, testing your theory won’t always teach you as much: you have to get lucky and see something that pins your theory down more precisely.

Finally, you can of course be just looking. Some experiments are just keeping an eye out, in the depths of space or the precision of quantum labs, watching for something unexpected. Experiments like that might never see anything, and never rule anything out, but they can still sometimes be worthwhile.

There’s some fuzziness to these categories, of course. Often when scientists argue about whether an experiment is worth doing they’re arguing about which category to place it in. Would a new collider be a “voyage into the unknown” (new energy scales we’ve never measured before), a theory killer/modifier (supersymmetry! but which one…) or just “just looking”? Is your theory of cosmology specific enough to be “killed”, or merely “modified”? Is your wacky modification of quantum mechanics something that can be tested, or merely “just looked” for?

For any given experiment, it’s worth keeping in mind what you expect, and what would happen if you’re wrong. In science, we can’t do every experiment we want. We have to focus our resources and try to get results. Even if it’s never a sure thing.

By A. Physicist

…because it disagrees with precision electroweak measurements

…………………………………..with bounds from ATLAS and CMS

…………………………………..with the power spectrum of the CMB

…………………………………..with Eötvös experiments

…because it isn’t gauge invariant

………………………….Lorentz invariant

………………………….diffeomorphism invariant

………………………….background-independent, whatever that means

…because it violates unitarity

…………………………………locality

…………………………………causality

…………………………………observer-independence

…………………………………technical naturalness

…………………………………international treaties

…………………………………cosmic censorship

…because you screwed up the calculation

…because you didn’t actually do the calculation

…because I don’t understand the calculation

…because you predict too many magnetic monopoles

……………………………………too many proton decays

……………………………………too many primordial black holes

…………………………………..remnants, at all

…because it’s fine-tuned

…because it’s suspiciously finely-tuned

…because it’s finely tuned to be always outside of experimental bounds

…because you’re misunderstanding quantum mechanics

…………………………………………………………..black holes

………………………………………………………….effective field theory

…………………………………………………………..thermodynamics

…………………………………………………………..the scientific method

…because Condensed Matter would contribute more to Chinese GDP

…because the approximation you’re making is unjustified

…………………………………………………………………………is not valid

…………………………………………………………………………is wildly overoptimistic

………………………………………………………………………….is just kind of lazy

…because there isn’t a plausible UV completion

…because you care too much about the UV

…because it only works in polynomial time

…………………………………………..exponential time

…………………………………………..factorial time

…because even if it’s fast it requires more memory than any computer on Earth

…because it requires more bits of memory than atoms in the visible universe

…because it has no meaningful advantages over current methods

…because it has meaningful advantages over my own methods

…because it can’t just be that easy

…because it’s not the kind of idea that usually works

…because it’s not the kind of idea that usually works in my field

…because it isn’t canonical

…because it’s ugly

…because it’s baroque

…because it ain’t baroque, and thus shouldn’t be fixed

…because only a few people work on it

…because far too many people work on it

…because clearly it will only work for the first case

……………………………………………………………….the first two cases

……………………………………………………………….the first seven cases

……………………………………………………………….the cases you’ve published and no more

…because I know you’re wrong

…because I strongly suspect you’re wrong

…because I strongly suspect you’re wrong, but saying I know you’re wrong looks better on a grant application

…….in a blog post

…because I’m just really pessimistic about something like that ever actually working

…because I’d rather work on my own thing, that I’m much more optimistic about

…because if I’m clear about my reasons

……and what I know

…….and what I don’t

……….then I’ll convince you you’re wrong.

……….or maybe you’ll convince me?

# The Opposite of Witches

On Halloween I have a tradition of posts about spooky topics, whether traditional Halloween fare or things that spook physicists. This year it’s a little of both.

Mage: The Ascension is a role-playing game set in a world in which belief shapes reality. Players take the role of witches and warlocks, casting spells powered by their personal paradigms of belief. The game allows for pretty much any modern-day magic-user you could imagine, from Wiccans to martial artists.

Even stereotypical green witches, probably

Despite all the options, I was always more interested in the game’s villains, the witches’ opposites, the Technocracy.

The Technocracy answers an inevitable problem with any setting involving modern-day magic: why don’t people notice? If reality is powered by belief, why does no-one believe in magic?

In the Technocracy’s case, the answer is a vast conspiracy of mages with a scientific bent, manipulating public belief. Much like the witches and warlocks of Mage are a grab-bag of every occult belief system, the Technocracy combines every oppressive government conspiracy story you can imagine, all with the express purpose of suppressing the supernatural and maintaining scientific consensus.

This quote is from another game by the same publisher, but it captures the attitude of the Technocracy, and the magnitude of what is being claimed here:

Do not believe what the scientists tell you. The natural history we know is a lie, a falsehood sold to us by wicked old men who would make the world a dull gray prison and protect us from the dangers inherent to freedom. They would have you believe our planet to be a lonely starship, hurtling through the void of space, barren of magic and in need of a stern hand upon the rudder.

Close your mind to their deception. The time before our time was not a time of senseless natural struggle and reptilian rage, but a time of myth and sorcery. It was a time of legend, when heroes walked Creation and wielded the very power of the gods. It was a time before the world was bent, a time before the magic of Creation lessened, a time before the souls of men became the stunted, withered things they are today.

It can be a fun exercise to see how far doubt can take you, how much of the scientific consensus you can really be confident of and how much could be due to a conspiracy. Believing in the Technocracy would be the most extreme version of this, but Flat-Earthers come pretty close. Once you’re doubting whether the Earth is round, you have to imagine a truly absurd conspiracy to back it up.

On the other extreme, there are the kinds of conspiracies that barely take a conspiracy at all. Big experimental collaborations, like ATLAS and CMS at the LHC, keep a tight handle on what their members publish. (If you’re curious just how tight, here’s a talk by a law professor about, among other things, the Constitution of CMS. Yes, it has one!) An actual conspiracy would still be outed in about five minutes, but you could imagine something subtler, the experiment sticking to “safe” explanations and refusing to publish results that look too unusual, on the basis that they’re “probably” wrong. Worries about that sort of thing can leave actual physicists spooked.

There’s an important dividing line with doubt: too much and you risk invoking a conspiracy more fantastical than the science you’re doubting in the first place. The Technocracy doesn’t just straddle that line, it hops past it off into the distance. Science is too vast, and too unpredictable, to be controlled by some shadowy conspiracy.

Or maybe that’s just what we want you to think!

# What’s in a Conjecture? An ER=EPR Example

A few weeks back, Caltech’s Institute of Quantum Information and Matter released a short film titled Quantum is Calling. It’s the second in what looks like it will become a series of pieces featuring Hollywood actors popularizing ideas in physics. The first used the game of Quantum Chess to talk about superposition and entanglement. This one, featuring Zoe Saldana, is about a conjecture by Juan Maldacena and Leonard Susskind called ER=EPR. The conjecture speculates that pairs of entangled particles (as investigated by Einstein, Podolsky, and Rosen) are in some sense secretly connected by wormholes (or Einstein-Rosen bridges).

The film is fun, but I’m not sure ER=EPR is established well enough to deserve this kind of treatment.

At this point, some of you are nodding your heads for the wrong reason. You’re thinking I’m saying this because ER=EPR is a conjecture.

I’m not saying that.

The fact of the matter is, conjectures play a very important role in theoretical physics, and “conjecture” covers a wide range. Some conjectures are supported by incredibly strong evidence, just short of mathematical proof. Others are wild speculations, “wouldn’t it be convenient if…” ER=EPR is, well…somewhere in the middle.

Most popularizers don’t spend much effort distinguishing things in this middle ground. I’d like to talk a bit about the different sorts of evidence conjectures can have, using ER=EPR as an example.

Our friendly neighborhood space octopus

The first level of evidence is motivation.

At its weakest, motivation is the “wouldn’t it be convenient if…” line of reasoning. Some conjectures never get past this point. Hawking’s chronology protection conjecture, for instance, points out that physics (and to some extent logic) has a hard time dealing with time travel, and wouldn’t it be convenient if time travel was impossible?

For ER=EPR, this kind of motivation comes from the black hole firewall paradox. Without going into it in detail, arguments suggested that the event horizons of older black holes would resemble walls of fire, incinerating anything that fell in, in contrast with Einstein’s picture in which passing the horizon has no obvious effect at the time. ER=EPR provides one way to avoid this argument, making event horizons subtle and smooth once more.

Motivation isn’t just “wouldn’t it be convenient if…” though. It can also include stronger arguments: suggestive comparisons that, while they could be coincidental, when put together draw a stronger picture.

In ER=EPR, this comes from certain similarities between the type of wormhole Maldacena and Susskind were considering, and pairs of entangled particles. Both connect two different places, but both do so in an unusually limited way. The wormholes of ER=EPR are non-traversable: you cannot travel through them. Entangled particles can’t be traveled through (as you would expect), but more generally can’t be communicated through: there are theorems to prove it. This is the kind of suggestive similarity that can begin to motivate a conjecture.

(Amusingly, the plot of the film breaks this in both directions. Keanu Reeves can neither steal your cat through a wormhole, nor send you coded messages with entangled particles.)

Nor live forever as the portrait in his attic withers away

Motivation is a good reason to investigate something, but a bad reason to believe it. Luckily, conjectures can have stronger forms of evidence. Many of the strongest conjectures are correspondences, supported by a wealth of non-trivial examples.

In science, the gold standard has always been experimental evidence. There’s a reason for that: when you do an experiment, you’re taking a risk. Doing an experiment gives reality a chance to prove you wrong. In a good experiment (a non-trivial one) the result isn’t obvious from the beginning, so that success or failure tells you something new about the universe.

In theoretical physics, there are things we can’t test with experiments, either because they’re far beyond our capabilities or because the claims are mathematical. Despite this, the overall philosophy of experiments is still relevant, especially when we’re studying a correspondence.

“Correspondence” is a word we use to refer to situations where two different theories are unexpectedly computing the same thing. Often, these are very different theories, living in different dimensions with different sorts of particles. With the right “dictionary”, though, you can translate between them, doing a calculation in one theory that matches a calculation in the other one.

Even when we can’t do non-trivial experiments, then, we can still have non-trivial examples. When the result of a calculation isn’t obvious from the beginning, showing that it matches on both sides of a correspondence takes the same sort of risk as doing an experiment, and gives the same sort of evidence.

Some of the best-supported conjectures in theoretical physics have this form. AdS/CFT is technically a conjecture: a correspondence between string theory in a hyperbola-shaped space and my favorite theory, N=4 super Yang-Mills. Despite being a conjecture, the wealth of nontrivial examples is so strong that it would be extremely surprising if it turned out to be false.

ER=EPR is also a correspondence, between entangled particles on the one hand and wormholes on the other. Does it have nontrivial examples?

Some, but not enough. Originally, it was based on one core example, an entangled state that could be cleanly matched to the simplest wormhole. Since then, new examples have been added, covering wormholes with electric fields and higher spins. The full “dictionary” is still unclear, with some pairs of entangled particles being harder to describe in terms of wormholes. So while this kind of evidence is being built up, it isn’t yet as solid as that behind our best-supported conjectures.

I’m fine with people popularizing this kind of conjecture. It deserves blog posts and press articles, and it’s a fine idea to have fun with. I wouldn’t be uncomfortable with the Bohemian Gravity guy doing a piece on it, for example. But for the second installment of a star-studded series like the one Caltech is doing…it’s not really there yet, and putting it there gives people the wrong idea.

I hope I’ve given you a better idea of the different types of conjectures, from the most fuzzy to those just shy of certain. I’d like to do this kind of piece more often, though in future I’ll probably stick with topics in my sub-field (where I actually know what I’m talking about 😉 ). If there’s a particular conjecture you’re curious about, ask in the comments!

# Model-Hypothesis-Experiment: Sure, Just Not All the Same Person!

At some point, we were all taught how science works.

The scientific method gets described differently in different contexts, but it goes something like this:

First, a scientist proposes a model, a potential explanation for how something out in the world works. They then create a hypothesis, predicting some unobserved behavior that their model implies should exist. Finally, they perform an experiment, testing the hypothesis in the real world. Depending on the results of the experiment, the model is either supported or rejected, and the scientist begins again.

It’s a handy picture. At the very least, it’s a good way to fill time in an introductory science course before teaching the actual science.

But science is a big area. And just as no two sports have the same league setup, no two areas of science use the same method. While the central principles behind the method still hold (the idea that predictions need to be made before experiments are performed, the idea that in order to test a model you need to know something it implies that other models don’t, the idea that the question of whether a model actually describes the real world should be answered by actual experiments…), the way they are applied varies depending on the science in question.

In particular, in high-energy particle physics, we do roughly follow the steps of the method: we propose models, we form hypotheses, and we test them out with experiments. We just don’t expect the same person to do each step!

In high energy physics, models are the domain of Theorists. Occasionally referred to as “pure theorists” to distinguish them from the next category, theorists manipulate theories (some intended to describe the real world, some not). “Manipulate” here can mean anything from modifying the principles of the theory to see what works, to attempting to use the theory to calculate some quantity or another, to proving that the theory has particular properties. There’s quite a lot to do, and most of it can happen without ever interacting with the other areas.

Hypotheses, meanwhile, are the province of Phenomenologists. While theorists often study theories that don’t describe the real world, phenomenologists focus on theories that can be tested. A phenomenologist’s job is to take a theory (either proposed by a theorist or another phenomenologist) and calculate its consequences for experiments. As new data comes in, phenomenologists work to revise their theories, computing just how plausible the old proposals are given the new information. While phenomenologists often work closely with those in the next category, they also do large amounts of work internally, honing calculation techniques and looking through models to find explanations for odd behavior in the data.

That data comes, ultimately, from Experimentalists. Experimentalists run the experiments. With experiments as large as the Large Hadron Collider, they don’t actually build the machines in question. Rather, experimentalists decide how the machines are to be run, then work to analyze the data that emerges. Data from a particle collider or a neutrino detector isn’t neatly labeled by particle. Rather, it involves a vast set of statistics, energies and charges observed in a variety of detectors. An experimentalist takes this data and figures out what particles the detectors actually observed, and from that what sorts of particles were likely produced. Like the other areas, much of this process is self-contained. Rather than being concerned with one theory or another, experimentalists will generally look for general signals that could support a variety of theories (for example, leptoquarks).

If experimentalists don’t build the colliders, who does? That’s actually the job of an entirely different class of scientists, the Accelerator Physicists. Accelerator physicists not only build particle accelerators, they study how to improve them, with research just as self-contained as the other groups.

So yes, we build models, form hypotheses, and construct and perform experiments to test them. And we’ve got very specialized, talented people who focus on each step. That means a lot of internal discussion, and many papers published that only belong to one step or another. For our subfield, it’s the best way we’ve found to get science done.

# Breakthrough or Crackpot?

Suppose that you have an idea. Not necessarily a wonderful, awful idea, but an idea that seems like it could completely change science as we know it. And why not? It’s been done before.

My advice to you is to be very, very careful. Because if you’re not careful, your revolutionary idea might force you to explain much, much more than you expect.

Let’s consider an example. Suppose you believe that the universe is only six thousand years old, in contrast to the 13.772 ± 0.059 billion years that scientists who study the subject have calculated. And furthermore, imagine that you’ve gone one step further: you’ve found evidence!

Being no slouch at this sort of thing, you read the Wikipedia article linked above, and you figure you’ve got two problems to deal with: extrapolations from the expansion of the universe, and the cosmic microwave background. Let’s say your new theory is good enough that you can address both of these: you can explain why calculations based on both of these methods give 14 billion years, while you still assert that the universe is only six thousand years old. You’ve managed to explain away all of the tests that scientists used to establish the age of the universe. If you can manage that, you’re done, right?

Not quite. Explaining all the direct tests may seem like great progress, but it’s only the first step, because the age of the universe can show up indirectly as well. No stars have been observed that are 13.772 billion years old, but every star whose age has been calculated has been found to be older than six thousand years! And even if you can explain why every attempt to measure a star’s age turned out wrong, there’s more to it than that, because the age of stars is a very important part of how astronomers model stellar behavior. Every time astronomers make a prediction about a star, whether estimating its size, its brightness, or its color, and that prediction turns out correct, they’re using the fact that the star is (some specific number) much, much older than six thousand years. And because almost everything we can see in space either is made of stars, or orbits a star, or once was a star, changing the age of the universe means you have to explain all those results too. If you propose that the age of the universe is only six thousand years, you need to explain not only the cosmic microwave background, not only the age of stars, but almost every single successful prediction made in the last fifty years of astronomy, none of which would have been successful if the age of the universe were only six thousand years.

Daunting, isn’t it?

Oh, we’re not done yet!

See, it’s not just astronomy you have to contend with, because the age of the Earth specifically is also calculated to be much larger than six thousand years. And just as astronomers use the age of stars to make successful predictions about their other properties, geologists use the age of rock formations to make their own predictions. And the same is true for species of animals and plants, studied through genetic drift with known rates over time, or fossils with known ages. So in proposing that the universe is only six thousand years old, you need to explain not just two pieces of evidence, but the majority of successful predictions made in three distinct disciplines over the last fifty years. Is your evidence that the universe is only six thousand years old good enough to outweigh all of that?

This is one of the best ways to tell a genuine scientific breakthrough from ideas that can be indelicately described as crackpot. If your idea questions something that has been used to make successful predictions for decades, then it becomes your burden of proof to explain why all those results were successful, and chances are, you can’t fulfill that burden.

This test can be applied quite widely. As another example, homeopathic medicine relies on the idea that if you dilute a substance (medicine or poison) drastically then rather than getting weaker it will suddenly become stronger, sometimes with the reverse effect. While you might at first think this could be confirmed or denied merely by testing homeopathic medicines themselves, the principle would also have to apply to any other dilution, meaning that a homeopath needs to explain everything from the success of water treatment plants that wash out all but tiny traces of contaminants to high school chemistry experiments involving diluting acid to observe its pH.
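To get a feel for just how drastic these dilutions are, here’s a back-of-the-envelope sketch. A “30C” potency (thirty successive 1:100 dilutions) is a standard homeopathic preparation; the function itself is my own illustration:

```python
AVOGADRO = 6.022e23  # molecules in one mole of a substance

def molecules_after_dilution(moles, c_potency):
    """Expected molecules of the original substance left after repeated
    1:100 dilutions (each step is one "C" of homeopathic potency)."""
    return moles * AVOGADRO / 100 ** c_potency

# Start from a full mole of the substance and dilute to a 30C preparation.
remaining = molecules_after_dilution(1.0, 30)
print(remaining)  # around 6e-37: effectively zero molecules survive
```

Which is the burden-of-proof test in action: if a dose with essentially zero molecules still had strong effects, so would the trace contaminants left behind by any water treatment plant.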

This is why scientific revolutions are hard! If you want to change the way we look at the world, you need to make absolutely sure you aren’t invalidating the success of prior researchers. In fact, the successes of past research constrain new science so much that it sometimes is possible to make predictions just from these constraints!

So whenever you think you’ve got a breakthrough, ask yourself: how much does this mean I have to explain? What is my burden of proof?

# A Theorist’s Theory

Part One of a Series on N=4 Super Yang-Mills Theory

In my last post, I called Wikipedia’s explanation of N=4 super Yang-Mills theory only “half-decent”. It’s not particularly bad, though it could use more detail. What it isn’t, and what I wanted, was an explanation that would make sense to a general audience (i.e., you guys!).

Well, if you want something done right, you have to quote that cliché. Or, well, do it yourself.

This is the first in a series of articles that will explain N=4 super Yang-Mills theory. In this series I will take that phrase apart bit by bit, explaining as I go. And because I’m perverse and out to confuse you, I’ll start with the last bit and work my way up.

N=4 Super Yang-Mills Theory

Now as a relatively well-educated person, you may be grumbling at this point. “I know what a theory is!”

“A scientific theory is a well-substantiated explanation of some aspect of the natural world, based on a body of facts that have been repeatedly confirmed through observation and experiment.”

Ah. It appears you’ve been talking to the biologists again. This is exactly why we needed this post. Let’s have a chat.

To be clear, when a biologist says that something (evolution, say, or germ theory) is a theory, this is exactly what they mean. They are describing an idea that has been repeatedly tested and that actually describes the real world. Most other scientists work the same way: geologists (plate tectonics theory), chemists (molecular orbital theory), even most physicists (big bang theory). But this isn’t what theoretical physicists mean when they say theory. In contrast, most things that theorists call theories have no experimental evidence, and usually aren’t even meant to describe the real world.

Unlike the AAAS definition above, theoretical physicists don’t have a formal definition of their usage of theory. If we did, it might go something like this:

“A theory (in theoretical physics) consists of a list of quantum fields, their properties, and how they interact. These fields do not need to be ones that exist in the natural world, but they do have to be (relatively) mathematically consistent. To study a theory is then to consider the interactions of a specific list of quantum fields, without taking into account any other fields that might otherwise interfere.”

Note that there are ways to get around parts of this definition. The (2,0) theory is famously mysterious because we don’t know how to write down the interactions between its fields, but even there we have an implicit definition of how the fields interact built into the theory’s definition, and the challenge is to make that definition explicit. Other theories stretch the definition of a quantum field, or cover a range of different properties. Still, all of them fit the basic template: define some mathematical entities, and describe how they interact.

With that definition in hand, some of you are already asking the next question: “What are the quantum fields of N=4 super Yang-Mills? How do they interact?”

Tune in to the next installment to find out!