You may have heard that the muon g-2 problem has been solved.
Muons are electrons’ heavier cousins. As spinning charged particles, they are magnetic, with the strength of that magnetism characterized by a number denoted “g”. The simplest quantum theory of such a particle, written down by Dirac, says this number should be exactly 2, but the full quantum theory tweaks it. The leftover part, “g-2”, can be measured, and predicted, with extraordinary precision, which ought to make it an ideal test: if our current understanding of particle physics, called the Standard Model, is subtly wrong, the difference might be noticeable there.
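For those curious about the quantity physicists actually quote, it is the “anomalous magnetic moment”, the fractional amount by which g exceeds 2:

a_\mu = \frac{g-2}{2} \approx 0.00116592\ldots

The whole dispute plays out in digits several places further along than the ones shown here.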
And for a while, it looked like such a difference was indeed noticeable. Extremely precise experiments over the last thirty years have consistently found a number slightly different from the extremely precise calculations, different enough that it seemed quite unlikely to be due to chance.
Now, the headlines are singing a different tune.
What changed?
Those headlines might make you think the change was an experimental result, a new measurement that changed the story. It wasn’t, though. There is a new, more precise measurement, but it agrees with the old measurements.
So the change has to be in the calculations, right? They did a new calculation, corrected a mistake or just pushed up their precision, and found that the Standard Model matches the experiment after all?
…sort of, but again, not really. The group of theoretical physicists associated with the experiment did release new, more accurate calculations. But it wasn’t the new calculations, by themselves, that made a difference. Instead, it was a shift in what kind of calculations they used…or even more specifically, what kind of calculations they trusted.
Parts of the calculation of g-2 can be done with Feynman diagrams, those photogenic squiggles you see on physicists’ blackboards. That part is very precise, and not especially controversial. However, Feynman diagrams only work well when forces between particles are comparatively weak. They’re great for electromagnetism, even better for the weak nuclear force. But for the strong nuclear force, the one that holds protons and neutrons together, you often need a different method.
For g-2, that used to be done via a “data-driven” method. Physicists measured a variety of processes involving particles affected by the strong nuclear force, and used those measurements to infer how the strong force would affect g-2. By getting a consistent picture from different experiments, they were reasonably confident that they had the right numbers.
Back in 2020, though, a challenger came on the scene with another method. Called lattice QCD, this method involves building gigantic computer simulations of the effect of the strong force. People have been doing lattice QCD since the 1970s, and the simulations have been getting better and better, until in 2020 a group managed to calculate the piece of g-2 that had until then been handled by the data-driven method.
The lattice group found a very different result than what had been found previously. Instead of a wild disagreement with experiment, their calculation agreed. According to them, everything was fine, the muon g-2 was behaving exactly as the Standard Model predicted.
For some of us, that’s where the mystery ended. Clearly, something must be wrong with the data-driven method, not with the Standard Model. No more muon puzzle.
But the data-driven method wasn’t just a guess, it was being used for a reason. A significant group of physicists found the arguments behind it convincing. Now, there was a new puzzle: figuring out why the data-driven method and lattice QCD disagree.
Five years later, has that mystery been solved? Is that, finally, what the headlines are about?
Again, not really, no.
The theorists associated with the experiment have decided to trust lattice QCD, not the data-driven method. But they don’t know what went wrong, exactly.
Instead, they’ve highlighted cracks in the data-driven method. The way the data-driven method works, it brings together different experiments to try to get a shared picture. But that shared picture has started to fall apart. A new measurement by a different experiment doesn’t fit into the system: the data-driven method now “has tensions”, as physicists say. It’s no longer possible to combine all the experiments into a shared picture the way they used to. Meanwhile, lattice QCD has gotten even better, reaching even higher precision. From the perspective of the theorists associated with the muon g-2 experiment, switching methods is now clearly the right call.
But does that mean they solved the puzzle?
If you were confident that lattice QCD is the right approach, then the puzzle was already solved in 2020. All that changed was the official collaboration finally acknowledging that.
And if you were confident that the data-driven method was the right approach, then the puzzle is even worse. Now, there are tensions within the method itself…but still no explanation of what went wrong! If you had good reasons to think the method should work, you still have those good reasons. Now you’re just…more puzzled.
I am reminded of another mystery, a few years back, when an old experiment announced a dramatically different measurement for the mass of the W boson. At the time, I argued that the big mystery was not how the W boson’s mass had changed (it hadn’t), but how they came to be so confident in a result so different from what others, also confidently, had found. In physics, our confidence is encoded in numbers, estimated and measured and tested and computed. If we’re not estimating that confidence correctly…then that’s the real mystery, the real puzzle. One much more important to solve.
Also, I had two more pieces out this week! In Quanta I have a short explainer about bosons and fermions, while at Ars Technica I have a piece about machine learning at the LHC. I may have a “bonus info” post on the latter at some point, I have to think about whether I have enough material for it.