Monthly Archives: July 2025

Value in Formal Theory Land

What makes a physics theory valuable?

You may think that a theory’s job is to describe reality, to be true. If that’s the goal, we have a whole toolbox of ways to assess its value. We can check if it makes predictions and if those predictions are confirmed. We can assess whether the theory can cheat to avoid the consequences of its predictions (falsifiability) and whether its complexity is justified by the evidence (Occam’s razor, and statistical methods that follow from it).

But not every theory in physics can be assessed this way.

Some theories aren’t even trying to be true. Others may hope to have evidence some day, but are clearly not there yet, either because the tests are too hard or the theory hasn’t been fleshed out enough.

Some people specialize in theories like these. We sometimes say they’re doing “formal theory”, working with the form of theories rather than whether they describe the world.

Physics isn’t mathematics. Work in formal theory is still supposed to help describe the real world. But that help might take a long time to arrive. Until then, how can formal theorists know which theories are valuable?

One option is surprise. After years tinkering with theories, a formal theorist will have some idea of which sorts of theories are possible and which aren’t. Some of this is intuition and experience, but sometimes it comes in the form of an actual “no-go theorem”, a proof that a specific kind of theory cannot be consistent.

Intuition and experience can be wrong, though. Even no-go theorems are fallible, both because they have assumptions which can be evaded and because people often assume they go further than they do. So some of the most valuable theories are valuable because they are surprising: because they do something that many experienced theorists think is impossible.

Another option is usefulness. Here I’m not talking about technology: these are theories that may or may not describe the real world and can’t be tested in feasible experiments, so nobody is using them for technology! But they can certainly be used by other theorists. They can show better ways to make predictions from other theories, or better ways to check other theories for contradictions. They can be a basis that other theories are built on.

I remember, back before my PhD, hearing about the consistent histories interpretation of quantum mechanics. I hadn’t heard much about it, but I did hear that it allowed calculations that other interpretations didn’t. At the time, I thought this was an obvious improvement: surely, if you can’t choose based on observations, you should at least choose an interpretation that is useful. In practice, it doesn’t quite live up to the hype. The things it allows you to calculate are things other interpretations would say don’t make sense to ask, questions like “what was the history of the universe” instead of observations you can test like “what will I see next?” But still, being able to ask new questions has proven useful to some, and kept a community interested.

Often, formal theories are judged on vaguer criteria. There’s a notion of explanatory power, of making disparate effects more intuitively part of the same whole. There’s elegance, or beauty, which is the theorist’s Occam’s razor, favoring ideas that do more with less. And there’s pure coolness, where a bunch of nerds are going to lean towards ideas that let them play with wormholes and multiverses.

But surprise, and usefulness, feel more solid to me. If you can find someone who says “I didn’t think this was possible”, then you’ve almost certainly done something valuable. And if you can’t do that, “I’d like to use this” is an excellent recommendation too.

Hype, Incentives, and Culture

To be clear, hype isn’t just lying.

We have a word for when someone lies to convince someone else to pay them, and that word is fraud. Most of what we call hype doesn’t reach that bar.

Instead, hype lives in a gray zone of affect and metaphor.

Some hype is pure affect. It’s about the subjective details, it’s about mood. “This is amazing” isn’t a lie, or at least, isn’t a lie you can check. They might really be amazed!

Some hype relies on metaphor. A metaphor can’t really be a lie, because a metaphor is always incomplete. But a metaphor can certainly be misleading. It can associate something minor with something important, or add emotional valence that isn’t really warranted.

Hype lives in a gray zone…and precisely because it lives in a gray zone, not everything that looks like hype is intended to be hype.

We think of hype as a consequence of incentives. Scientists hype their work to grant committees to get grants, and hype it more to the public for prestige. Companies hype their products to sell them, and their business plans to draw in investors.

But what looks like hype can also be language, and culture.

To many people in the rest of the world, the way Americans talk about almost everything is hype. Everything is bigger and nicer and cooler. This isn’t because Americans are under some sort of weird extra career incentives, though. It’s just how they expect to talk, how they learned to talk, how everyone around them normally talks.

Similarly, people in different industries are used to talking differently. Depending on what work you do, you interpret different metaphors in different ways. What might seem like an enthusiastic endorsement in one industry might be dismissive in another.

In the end, it takes two to communicate: a speaker, and an audience. Speakers want to get their audience excited, and hopefully, if they don’t just want to hype, to help them understand something of the truth. That means understanding how the audience communicates enthusiasm, and how it differs from the speaker’s own way. It means understanding language, and culture.

Did the South Pole Telescope Just Rule Out Neutrino Masses? Not Exactly, Followed by My Speculations

Recently, the South Pole Telescope’s SPT-3G collaboration released new measurements of the cosmic microwave background, the leftover light from the formation of the first atoms. By measuring this light, cosmologists can infer the early universe’s “shape”: how it rippled on different scales as it expanded into the universe we know today. They compare this shape to mathematical models, equations and simulations which tie together everything we know about gravity and matter, and try to see what it implies for those models’ biggest unknowns.

Some of the most interesting such unknowns are neutrino masses. We know that neutrinos have mass because they transform as they move, from one type of neutrino to another. Those transformations let physicists measure the differences between neutrino masses (strictly, between their squares), but by themselves, they don’t say what the actual masses are. All we know from particle physics, at this point, is a minimum: in order for the neutrinos to differ in mass enough to transform in the way they do, the total mass of the three flavors of neutrino must be at least 0.06 electron-Volts.

(Divided by the speed of light squared to get the right units, if you’re picky about that sort of thing. Physicists aren’t.)
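
If you want to see where that 0.06 comes from, here’s a quick back-of-the-envelope version in Python. The mass-squared splittings I plug in are rounded, commonly quoted values rather than the latest fits, so treat this as a sketch of the logic, not a precise bound:

```python
import math

# Rounded oscillation results (approximate values, for illustration only):
dm21_sq = 7.5e-5   # "solar" mass-squared splitting, in eV^2
dm31_sq = 2.5e-3   # "atmospheric" mass-squared splitting, in eV^2

# Lightest possible case in the normal hierarchy: the lightest neutrino is massless.
m1 = 0.0
m2 = math.sqrt(dm21_sq)   # ~0.009 eV
m3 = math.sqrt(dm31_sq)   # ~0.05 eV

print(f"minimum total mass: {m1 + m2 + m3:.3f} eV")   # roughly 0.06 eV
```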

Neutrinos also influenced the early universe, shaping it in a noticeably different way than heavier particles that bind together into atoms, like electrons and protons, did. That effect, observed in the cosmic microwave background and in the distribution of galaxies in the universe today, lets cosmologists calculate a maximum: if neutrinos are more massive than a certain threshold, they could not have the effects cosmologists observe.

Over time, as measurements have improved, this maximum has decreased. Now, the South Pole Telescope has added more data to the pool, and combining it with prior measurements…well, I’ll quote their paper:

Ok, it’s probably pretty hard to understand what that means if you’re not a physicist. To explain:

  1. There are two different hypotheses for how neutrino masses work, called “hierarchies”. In the “normal” hierarchy, the neutrinos go in the same order as the charged particles they interact with through the weak nuclear force: electron neutrinos are lighter than muon neutrinos, which are lighter than tau neutrinos. In the “inverted” hierarchy, they come in the opposite order, and the electron neutrino is the heaviest. Both of these are consistent with the particle-physics data.
  2. Confidence is a statistics thing, which could take a lot of unpacking to define correctly. To give a short but likely tortured-sounding explanation: when you rule out a hypothesis at a certain confidence level, you’re saying that, if that hypothesis were true, there would only be a 100%-minus-that-confidence chance of seeing something as extreme as what you actually observed. (There’s a toy numerical illustration right after this list.)
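
To make that less tortured, here’s a toy illustration of the frequency language in Python. The numbers are made up and have nothing to do with the actual SPT analysis; the point is just what “ruled out at 97.9% confidence” is claiming:

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend our "observation" landed about 2 standard deviations away from
# what the hypothesis predicts (a made-up number, purely for illustration).
observed = 2.03

# Simulate a million fake experiments assuming the hypothesis IS true...
fake_experiments = rng.normal(size=1_000_000)

# ...and ask how often they look at least as extreme as the observation.
fraction = np.mean(fake_experiments >= observed)
print(f"data this extreme shows up about {fraction:.1%} of the time,")
print(f"so the hypothesis is disfavored at roughly {1 - fraction:.1%} confidence")
```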

So, what are the folks at the South Pole Telescope saying? They’re saying that if you put all the evidence together (that’s roughly what that pile of acronyms at the beginning means), then the result would be incredibly uncharacteristic for either hypothesis for neutrino masses. If you had “normal” neutrino masses, you’d only see these cosmological observations 2.1% of the time. And if you had inverted neutrino masses instead, you’d only see these observations 0.01% of the time!

That sure makes it sound like neither hypothesis is correct, right? Does it actually mean that?

I mean, it could! But I don’t think so. Here I’ll start speculating on the possibilities, from least likely in my opinion to most likely. This is mostly my bias talking, and shouldn’t be taken too seriously.

5. Neutrinos are actually massless

This one is really unlikely. The evidence from particle physics isn’t just quantitative, but qualitative. I don’t know if it’s possible to write down a model that reproduces the results of neutrino oscillation experiments without massive neutrinos, and if it is, it would be a very bizarre model that would almost certainly break something else. This is essentially a non-starter.

4. This is a sign of interesting new physics

I mean, it would be nice, right? I’m sure there are many proposals at this point, tweaks that add a few extra fields with some hard-to-notice effects to explain the inconsistency. I can’t rule this out, and unlike the last point, there isn’t anything about it that seems impossible. But we’ve had a lot of odd observations over the years, and so far none of them have turned out to be new physics.

3. Someone did statistics wrong

This happens more often. Any argument like this is a statistical argument, and while physicists keep getting better at statistics, they’re not professional statisticians. Sometimes there’s a genuine misunderstanding that goes into testing a model, and once it gets resolved the problem goes away.

2. The issue will go away with more data

The problem could also just…go away. 97.9% confidence sounds huge…but in physics, the standards are higher: you need 99.99994% confidence (the famous “five sigma”) to announce a new discovery. Physicists do a lot of experiments and observations, and sometimes, they see weird things! As the measurements get more precise, we may well see the disagreement melt away, with cosmology and particle physics both pointing to the same range for neutrino masses. It’s happened to many other measurements before.
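
If you want to check those two numbers against each other, the conversion between “sigmas” and confidence levels is a quick calculation. This is my own arithmetic, using the usual Gaussian conventions (one-sided for the exclusion, two-sided for the discovery threshold):

```python
from scipy.stats import norm

# 97.9% confidence corresponds to roughly a 2-sigma effect (one-sided)...
print(norm.isf(1 - 0.979))     # ~2.0

# ...while the 5-sigma discovery threshold (two-sided) is about 99.99994%.
print(1 - 2 * norm.sf(5))      # ~0.9999994
```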

1. We’re reaching the limits of our current approach to cosmology

This is probably not actually the most likely possibility, but it’s my list, what are you going to do?

There are basic assumptions behind how most theoretical physicists do cosmology. These assumptions are reasonably plausible, and seem to be needed to do anything at all. But they can be relaxed. Our universe looks like it’s homogeneous on the largest scales: the same density on average, in every direction you look. But the way that gets enforced in the mathematical models is very direct, and it may be that a different, more indirect, approach has more flexibility. I’ll probably be writing about this more in future, hopefully somewhere journalistic. But there are some very cool ideas floating around, gradually getting fleshed out more and more. It may be that the answer to many of the mysteries of cosmology right now is not new physics, but new mathematics: a new approach to modeling the universe.

Bonus Info on the LHC and Beyond

Three of my science journalism pieces went up last week!

(This is a total coincidence. One piece was a general explainer “held in reserve” for a nice slot in the schedule, one was a piece I drafted in February, while the third I worked on in May. In journalism, things take as long as they take.)

The shortest piece, at Quanta Magazine, was an explainer about the two types of particles in physics: bosons, and fermions.

I don’t have a ton of bonus info here, because of how tidy the topic is, so just two quick observations.

First, I have the vague impression that Bose, bosons’ namesake, is “claimed” by both modern-day Bangladesh and India. I had friends in grad school who were proud of their fellow physicist from Bangladesh, but while he did his most famous work in Dhaka, he was born and died in Calcutta. Since both cities were part of British India for most of his life, these things likely get complicated.

Second, at the end of the piece I mention a “world on a wire” where fermions and bosons are the same. One example of such a “wire” is a string, like in string theory. One thing all young string theorists learn is “bosonization”: the idea that, in a 1+1-dimensional world like a string, you can re-write any theory with fermions as a theory with bosons, as well as vice versa. This has important implications for how string theory is set up.
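
For the curious, the bosonization dictionary looks, very roughly, like the following. This is only the schematic shape; normalizations, normal-ordering, and conventions vary between references, so don’t take the constants seriously:

```latex
% Schematic 1+1-dimensional bosonization (conventions vary):
% each chirality of the fermion is an exponential of the boson field \phi,
% and the fermion current becomes a derivative of \phi.
\psi_{\pm}(x) \;\sim\; :\! e^{\pm i \beta \phi(x)} \!: ,
\qquad
\bar{\psi}\gamma^{\mu}\psi \;\sim\; \epsilon^{\mu\nu}\partial_{\nu}\phi
```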

Next, in Ars Technica, I had a piece about how LHC physicists are using machine learning to untangle the implications of quantum interference.

As a journalist, it’s really easy to fall into a trap where you give the main person you interview too much credit: after all, you’re approaching the story from their perspective. I tried to be cautious about this, only to be stymied when literally everyone else I interviewed praised Aishik Ghosh to the skies and credited him with being the core motivating force behind the project. So I shrugged my shoulders and followed suit. My understanding is that he has been appropriately rewarded and will soon be a professor at Georgia Tech.

I didn’t list the inventors of the NSBI method that Ghosh and co. used, but names like Kyle Cranmer and Johann Brehmer tend to get bandied about. It’s a method that was originally explored for a more general goal, trying to characterize what the Standard Model might be missing, while the work I talk about in the piece takes it in a new direction, closer to the typical things the ATLAS collaboration looks for.

I also did not say nearly as much as I was tempted to about how the ATLAS collaboration publishes papers, which was honestly one of the most intriguing parts of the story for me. There is a huge amount of review that goes on inside ATLAS before one of their papers reaches the outside world, way more than there ever is in a journal’s peer review process. This is especially true for “physics papers”, where ATLAS is announcing a new conclusion about the physical world, as ATLAS’s reputation stands on those conclusions being reliable. That means starting with an “internal note” that’s hundreds of pages long (and sometimes over a thousand), an editorial board that manages the editing process, disseminating the paper to the entire collaboration for comment, and getting specific experts and institute groups within the collaboration to read through the paper in detail.

The process is a bit less onerous for “technical papers”, which describe a new method, not a new conclusion about the world. Still, it’s cumbersome enough that for those papers, often scientists don’t publish them “within ATLAS” at all, instead releasing them independently.

The results I reported on are special because they involved a physics paper and a technical paper, both within the ATLAS collaboration process. Instead of just working with partial or simplified data, they wanted to demonstrate the method on a “full analysis”, with all the computation and human coordination that requires. Normally, ATLAS wouldn’t go through the whole process of publishing a physics paper without basing it on new data, but this was different: the method had the potential to be so powerful that the more precise results would be worth stating as physics results alone.

(Also, for the people in the comments worried about training a model on old data: that’s not what they did. In physics, they don’t try to train a neural network to predict the results of colliders; such a model wouldn’t tell us anything useful. They run colliders to tell us whether what they see matches the analytic predictions of the Standard Model. The neural network is trained to predict not what the experiment will say, but what the Standard Model will say, as we can usually only figure that out through time-consuming simulations. So it’s trained on (new) simulations, not on experimental data.)
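
To give a flavor of the general idea, here is the textbook “likelihood-ratio trick” behind simulation-based inference, not ATLAS’s actual pipeline; the toy “simulators” and the simple classifier below are my own stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-ins for two simulators: events generated under the Standard
# Model prediction vs. under some alternative parameter point.
x_sm  = rng.normal(loc=0.0, scale=1.0, size=(50_000, 1))
x_alt = rng.normal(loc=0.3, scale=1.0, size=(50_000, 1))

# Train a classifier to tell the two simulated samples apart.
X = np.vstack([x_sm, x_alt])
y = np.concatenate([np.zeros(len(x_sm)), np.ones(len(x_alt))])
clf = LogisticRegression().fit(X, y)

# The classifier output s(x) estimates p(alt | x); with equal-sized samples,
# s / (1 - s) estimates the likelihood ratio p(x | alt) / p(x | SM),
# which is the quantity the statistical analysis actually needs.
s = clf.predict_proba(np.array([[0.15]]))[:, 1]
print("estimated likelihood ratio at x = 0.15:", s / (1 - s))
```

Real analyses replace the toy simulators with full detector simulation and the logistic regression with a neural network, but the logic is the same: the network learns from simulation, and only afterwards is it evaluated on real collider data.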

Finally, on Friday I had a piece in Physics Today about the European Strategy for Particle Physics (or ESPP), and in particular, plans for the next big collider.

Before I even started working on this piece, I saw a thread by Patrick Koppenburg on some of the 263 documents submitted for the ESPP update. While my piece ended up mostly focused on the big circular collider plan that most of the field is converging on (the future circular collider, or FCC), Koppenburg’s thread was more wide-ranging, meant to illustrate the breadth of ideas under discussion. Some of that discussion is about the LHC’s current plans, like its “high-luminosity” upgrade that will see it gather data at much higher rates up until 2040. Some of it is assessing broader concerns, which it may surprise some of you to learn includes sustainability: yes, there are more or less sustainable ways to build giant colliders.

The most fun part of the discussion, though, concerns all of the other collider proposals.

Some report progress on new technologies. Muon colliders are the most famous of these, but there are other proposals that would specifically help with a linear collider. I never did end up understanding what Cooled Copper Colliders are all about, beyond that they let you get more energy in a smaller machine without super-cooling. If you know about them, chime in in the comments! Meanwhile, plasma wakefield acceleration could accelerate electrons on a wave of plasma. This has the disadvantage that you want to collide electrons and positrons, and if you try to stick a positron in plasma it will happily annihilate with the first electron it meets. So what do you do? You go half-and-half, with the HALHF project: speed up the electron with a plasma wakefield, accelerate the positron normally, and have them meet in the middle.

Others are backup plans, or “budget options”, where CERN could get somewhat better measurements of some parameters if they can’t stir up the funding to measure the things they really want. They could put electrons and positrons into the LHC tunnel instead of building a new one, for a weaker machine that could still study the Higgs boson to some extent. They could use a similar experiment to produce Z bosons instead, which could serve as a bridge to a different collider project. Or, they could collide the LHC’s proton beam with an electron beam, for an experiment that mixes advantages and disadvantages of some of the other approaches.

While working on the piece, one resource I found invaluable was this colloquium talk by Tristan du Pree, where he goes through the 263 submissions and digs up a lot of interesting numbers and commentary. Read the slides for quotes from the different national inputs and “solo inputs” with comments from particular senior scientists. I used that talk to get a broad impression of what the community was feeling, and it was interesting how well it was reflected in the people I interviewed. The physicist based in Switzerland felt the most urgency for the FCC plan, while the Dutch sources were more cautious, with other Europeans firmly in the middle.

Going over the FCC report itself, one thing I decided to leave out of the discussion was the cost-benefit analysis. There’s the potential for a cute sound-bite there, “see, the collider is net positive!”, but I’m pretty skeptical of the kind of analysis they’re doing there, even if it is standard practice for government projects. Between the biggest benefits listed being industrial benefits to suppliers and early-career researcher training (is a collider unusually good for either of those things, compared to other ways we spend money?) and the fact that about 10% of the benefit is the science itself (where could one possibly get a number like that?), it feels like whatever reasoning is behind this is probably the kind of thing that makes rigor-minded economists wince. I wasn’t able to track down the full calculation though, so I really don’t know, maybe this makes more sense than it looks.

I think a stronger argument than anything along those lines is a much more basic point, about expertise. Right now, we have a community of people trying to do something that is not merely difficult, but fundamental. This isn’t like sending people to space, where many of the engineering concerns will go away when we can send robots instead. This is fundamental engineering progress in how to manipulate the forces of nature (extremely powerful magnets, high voltages) and process huge streams of data. Pushing those technologies to the limit seems like it’s going to be relevant, almost no matter what we end up doing. That’s still not putting the science first and foremost, but it feels a bit closer to an honest appraisal of what good projects like this do for the world.