The Niels Bohr Institute is hosting a conference this week on New Ideas in Cosmology. I’m no cosmologist, but it’s a pretty cool field, so as a local I’ve been sitting in on some of the talks. So far they’ve had a selection of really interesting speakers with quite a variety of interests, including a talk by Roger Penrose with his trademark hand-stippled drawings.
One thing that has impressed me has been the “interdisciplinary” feel of the conference. By all rights this should be one “discipline”, cosmology. But in practice, each speaker came at the subject from a different direction. They all had a shared core of knowledge, common models of the universe they all compare to. But the knowledge they brought to the subject varied: some had deep knowledge of the mathematics of gravity, others worked with string theory, or particle physics, or numerical simulations. Each talk, aware of the varied audience, was a bit “colloquium-style”, introducing a framework before diving into the latest research. Each speaker knew enough to talk to the others, but not so much that they couldn’t learn from them. It’s been unexpectedly refreshing, a real interdisciplinary conference done right.
I’m at a conference this week of a very particular type: a birthday conference. When folks in my field turn 60, their students and friends organize a special conference for them, celebrating their research legacy. With COVID restrictions just loosening, my advisor Michael Douglas is getting a last-minute conference. And as one of the last couple students he graduated at Stony Brook, I naturally showed up.
The conference, Mikefest, is at the Institut des Hautes Études Scientifiques, just outside of Paris. Mike was a big supporter of the IHES, putting in a lot of fundraising work for them. Another big supporter, James Simons, was Mike’s employer for a little while after his time at Stony Brook. The conference center we’re meeting in is named for him.
I wasn’t involved in organizing the conference, so it was interesting seeing differences between this and other birthday conferences. Other conferences focus on the birthday prof’s “family tree”: their advisor, their students, and some of their postdocs. We’ve had several talks from Mike’s postdocs, and one from his advisor, but only one from a student. Including him and me, three of Mike’s students are here: another two have had their work mentioned but aren’t speaking or attending.
Most of the speakers have collaborated with Mike, but only for a few papers each. All of them emphasized a broader debt though, for discussions and inspiration outside of direct collaboration. The message, again and again, is that Mike’s work has been broad enough to touch a wide range of people. He’s worked on branes and the landscape of different string theory universes, pure mathematics and computation, neuroscience and recently even machine learning. The talks generally begin with a few anecdotes about Mike, before pivoting into research talks on the speakers’ recent work. The recent-ness of the work is perhaps another difference from some birthday conferences: as one speaker said, this wasn’t just a celebration of Mike’s past, but a “welcome back” after his return from the finance world.
One thing I don’t know is how much this conference might have been limited by coming together on short notice. For other birthday conferences impacted by COVID (and I’m thinking of one in particular), it might be nice to have enough time to have most of the birthday prof’s friends and “academic family” there in person. As-is, though, Mike seems to be having fun regardless.
For overambitious apes like us, adding integers is the easiest thing in the world. Take one berry, add another, and you have two. Each remains separate, you can lay them in a row and count them one by one, each distinct thing adding up to a group of distinct things.
Other things in math are less like berries. Add two real numbers, like pi and the square root of two, and you get another real number, bigger than the first two, something you can write in an infinite messy decimal. You know in principle you can separate it out again (subtract pi, get the square root of two), but you can’t just stare at it and see the parts. This is less like adding berries, and more like adding fluids. Pour some water into some other water, and you certainly have more water. You don’t have “two waters”, though, and you can’t tell which part started as which.
Some things in math look like berries, but are really like fluids. Take a polynomial, say $5x^2 + 6x + 8$. It looks like three types of things, like three berries: five $x^2$, six $x$, and eight $1$. Add another polynomial with its own $x^2$, $x$, and constant terms, and the illusion continues. You’ve just added more $x^2$, more $x$, more $1$, like adding more strawberries, blueberries, and raspberries.
But those berries were a choice you made, and not the only one. You can rewrite that first polynomial, for example saying $5(x+1)^2 - 4(x+1) + 7$. That’s the same thing, you can check. But now it looks like five $(x+1)^2$, negative four $(x+1)$, and seven $1$. It’s different numbers of different things, blackberries or gooseberries or something. And you can do this in many ways, infinitely many in fact. The polynomial isn’t really a collection of berries, for all it looked like one. It’s much more like a fluid, a big sloshing mess you can pour into buckets of different sizes. (Technically, it’s a vector space. Your berries were a basis.)
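If you’d like to pour the fluid between buckets yourself, here is a minimal Python sketch. It assumes, purely for illustration, the polynomial 5x^2 + 6x + 8 and the shifted “berries” 1, (x+1), (x+1)^2. The function names are hypothetical helpers, not part of any library, and coefficients are stored lowest power first.

```python
def to_shifted_basis(coeffs, shift):
    """Rewrite sum(c_k * x**k) as sum(d_k * (x + shift)**k).

    coeffs and the returned list are ordered lowest power first.
    Works by repeatedly dividing by (x + shift) and keeping remainders.
    """
    work = list(coeffs)
    root = -shift  # dividing by (x + shift) means evaluating at x = -shift
    shifted = []
    while work:
        acc = 0
        partial = []
        for c in reversed(work):  # synthetic division, highest power first
            acc = acc * root + c
            partial.append(acc)
        shifted.append(partial.pop())  # the remainder is the next coefficient
        work = partial[::-1]           # the quotient, lowest power first
    return shifted


def to_standard_basis(shifted, shift):
    """Expand sum(d_k * (x + shift)**k) back into powers of x."""
    result = []
    for d in reversed(shifted):
        # Multiply the running polynomial by (x + shift), then add d.
        multiplied = [0] * (len(result) + 1) if result else []
        for k, c in enumerate(result):
            multiplied[k] += shift * c
            multiplied[k + 1] += c
        result = multiplied
        if result:
            result[0] += d
        else:
            result = [d]
    return result


standard = [8, 6, 5]                      # 8 + 6x + 5x^2 (assumed example)
shifted = to_shifted_basis(standard, 1)   # coefficients of 1, (x+1), (x+1)^2
print(shifted)
print(to_standard_basis(shifted, 1))
```

Changing `shift` gives a different set of buckets for the same polynomial, which is the point: the coefficients depend on the basis, the polynomial does not.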
Even smart, advanced students can get tripped up on this. You can be used to treating polynomials as a fluid, and forget that directions in space are a fluid, one you can rotate as you please. If you’re used to directions in space, you’ll get tripped up by something else. You’ll find that types of particles can be more fluid than berry, the question of which quark is which not as simple as how many strawberries and blueberries you have. The laws of physics themselves are much more like a fluid, which should make sense if you take a moment, because they are made of equations, and equations are like a fluid.
So my fellow overambitious apes, do be careful. Not many things are like berries in the end. A whole lot are like fluids.
The W boson is a fundamental particle, part of the Standard Model of particle physics. It is what we call a “force-carrying boson”, a particle related to the weak nuclear force in the same way photons are related to electromagnetism. Unlike photons, W bosons are “heavy”: they have a mass. We can’t usually predict masses of particles, but the W boson is a bit different, because its mass comes from the Higgs boson in a special way, one that ties it to the masses of other particles like the Z boson. The upshot is that if you know the mass of a few other particles, you can predict the mass of the W.
And according to a recent publication, that prediction is wrong. A team analyzed results from an old experiment called the Tevatron, the biggest predecessor of today’s Large Hadron Collider. They treated the data with groundbreaking care, mindbogglingly even taking into account the shape of the machine’s wires. And after all that analysis, they found that the W bosons detected by the Tevatron had a different mass than the mass predicted by the Standard Model.
How different? Here’s where precision comes in. In physics, we decide whether to trust a measurement with a statistical tool. We calculate how likely the measurement would be, if it was an accident. In this case: how likely it would be that, if the Standard Model was correct, the measurement would still come out this way? To discover a new particle, we require this chance to be about one in 3.5 million, or in our jargon, five sigma. That was the requirement for discovering the Higgs boson. This super-precise measurement of the W boson doesn’t have five sigma…it has seven sigma. That means, if we trust the analysis team, then a measurement like this could come accidentally out of the Standard Model only about one in a trillion times.
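For the curious, the conversion from sigmas to “one in so many” can be sketched in a few lines of Python. This assumes the common one-sided convention and perfectly Gaussian errors, which real analyses only approximate. The function name is a hypothetical helper.

```python
import math


def sigma_to_probability(n_sigma):
    """One-sided tail probability of a normal distribution beyond n_sigma."""
    # erfc is the complementary error function; the factor 0.5 keeps one tail.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


for n in (5, 7):
    p = sigma_to_probability(n)
    print(f"{n} sigma: p = {p:.3g}, about one in {1 / p:,.0f}")
```

Under these assumptions, five sigma lands near one in 3.5 million, and seven sigma a bit short of one in a trillion.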
Ok, should we trust the analysis team?
If you want to know that, I’m the wrong physicist to ask. The right physicists are experimental particle physicists. They do analyses like that one, and they know what can go wrong. Everyone I’ve heard from in that field emphasized that this was a very careful group, who did a lot of things impressively right…but there is still room for mistakes. One pointed out that the new measurement isn’t just inconsistent with the Standard Model, but with many previous measurements too. Those measurements are less precise, but still precise enough that we should be a bit skeptical. Another went into more detail about specific clues as to what might have gone wrong.
I’m not an experimentalist or a phenomenologist. I’m an “amplitudeologist”. I work not on the data, or the predictions, but the calculational tools used to make those predictions, called “scattering amplitudes”. And that gives me a different view on the situation.
See, in my field, precision is one of our biggest selling points. If you want theoretical predictions to match precise experiments, you need our tricks to compute them. We believe (and argue to grant agencies) that this precision will be important: if a precise experiment and a precise prediction disagree, it could be the first clue to something truly new. New solid evidence of something beyond the Standard Model would revitalize all of particle physics, giving us a concrete goal and killing fruitless speculation.
This result shakes my faith in that a little. Probably, the analysis team got something wrong. Possibly, all previous analyses got something wrong. Either way, a lot of very careful smart people tried to estimate their precision, got very confident…and got it wrong.
If some future analysis digs down deep in precision, and finds another deviation from the Standard Model, should we trust it? What if it’s measuring something new, and we don’t have the prior experiments to compare to?
(This would happen if we build a new even higher-energy collider. There are things the collider could measure, like the chance one Higgs boson splits into two, that we could not measure with any earlier machine. If we measured that, we couldn’t compare it to the Tevatron or the LHC, we’d have only the new collider to go on.)
Statistics are supposed to tell us whether to trust a result. Here, they’re not doing their job. And that creates the scary possibility that some anomaly shows up, some real deviation deep in the sigmas that hints at a whole new path for the field…and we just end up bickering about who screwed it up. Or the equally scary possibility that we find a seven-sigma signal of some amazing new physics, build decades of new theories on it…and it isn’t actually real.
We don’t just trust statistics. We also trust the things normal people trust. Do other teams find the same result? (I hope that they’re trying to get to this same precision here, and see what went wrong!) Does the result match other experiments? Does it make predictions, which then get tested in future experiments?
All of those are heuristics of course. Nothing can guarantee that we measure the truth. Each trick just corrects for some of our biases, some of the ways we make mistakes. We have to hope that’s good enough, that if there’s something to see we’ll see it, and if there’s nothing to see we won’t. Precision, my field’s raison d’être, can’t be enough to convince us by itself. But it can help.
If I can teach one lesson to all of you, it’s this: be precise. In physics, we try to state what we mean as precisely as we can. If we can’t state something precisely, that’s a clue: maybe what we’re trying to state doesn’t actually make sense.
Someone recently reached out to me with a question about black holes. He was confused about how they were described, about what would happen when you fall into one versus what we could see from outside. Part of his confusion boiled down to a question: “is the center really an infinitely small point?”
According to user4552, the reason this question is confusing is that the usual setup of general relativity cannot answer it. In general relativity, singularities like the singularity in the middle of a black hole aren’t treated as points, or collections of points: they’re not part of space-time at all. So you can’t count their dimensions, you can’t see whether they’re “really” infinitely small points, or surfaces, or lines…
This might surprise people (like me) who have experience with simpler equations for these things, like the Schwarzschild metric. The Schwarzschild metric describes space-time around a black hole, and in the usual coordinates it sure looks like the singularity is at a single point where r=0, just like the point where r=0 is a single point in polar coordinates in flat space. The thing is, though, that’s just one sort of coordinates. You can re-write a metric in many different sorts of coordinates, and the singularity in the center of a black hole might look very different in those coordinates. In general relativity, you need to stick to things you can say independent of coordinates.
Ok, you might say, so the usual mathematics can’t answer the question. Can we use more unusual mathematics? If our definition of dimensions doesn’t tell us whether the singularity is a point, maybe we just need a new definition!
According to user4552, people have tried this…and it only sort of works. There are several different ways you could define the dimension of a singularity. They all seem reasonable in one way or another. But they give different answers! Some say they’re points, some say they’re three-dimensional. And crucially, there’s no obvious reason why one definition is “right”. The question we started with, “is the center really an infinitely small point?”, looked like a perfectly reasonable question, but it actually wasn’t: the question wasn’t precise enough.
This is the real problem. The problem isn’t that our question was undefined, after all, we can always add new definitions. The problem was that our question didn’t specify well enough the definitions we needed. That is why the question doesn’t have an answer.
Once you understand the difference, you see these kinds of questions everywhere. If you’re baffled by how mass could have come out of the Big Bang, or how black holes could radiate particles in Hawking radiation, maybe you’ve heard a physicist say that energy isn’t always conserved. Energy conservation is a consequence of symmetry, specifically, symmetry in time. If your space-time itself isn’t symmetric (the expanding universe making the past different from the future, a collapsing star making a black hole), then you shouldn’t expect energy to be conserved.
I sometimes hear people object to this. They ask, is it really true that energy isn’t conserved when space-time isn’t symmetric? Shouldn’t we just say that space-time itself contains energy?
And well yes, you can say that, if you want. It isn’t part of the usual definition, but you can make a new definition, one that gives energy to space-time. In fact, you can make more than one new definition…and like the situation with the singularity, these definitions don’t always agree! Once again, you asked a question you thought was sensible, but it wasn’t precise enough to have a definite answer.
Keep your eye out for these kinds of questions. If scientists seem to avoid answering the question you want, and keep answering a different question instead…it might be their question is the only one with a precise answer. You can define a method to answer your question, sure…but it won’t be the only way. You need to ask precise enough questions to get good answers.
Last week, I gave the opening lectures for a course on scattering amplitudes, the things we compute to find probabilities in particle physics. After the first class, one of the students asked me if two different descriptions of these amplitudes, one called CHY and the other called the amplituhedron, were related. There does happen to be a connection, but it’s a bit subtle and indirect, not the sort of thing the student would have been thinking of. Why, then, did he think they might be related? Well, he explained, both descriptions are geometric.
If you’ve been following this blog for a while, you’ve seen me talk about misunderstandings. There are a lot of subtle ways a smart student can misunderstand something, ways that can be hard for a teacher to recognize. The right question, or the right explanation, can reveal what’s going on. Here, I think the problem was that there are multiple meanings of geometry.
One of the descriptions the student asked about, CHY, is related to string theory. It describes scattering particles in terms of the path of a length of string through space and time. That path draws out a surface called a world-sheet, showing all the places the string touches on its journey. And that picture, of a wiggly surface drawn in space and time, looks like what most people think of as geometry: a “shape” in a pretty normal sense, which here describes the physics of scattering particles.
The other description, the amplituhedron, also uses geometric objects to describe scattering particles. But the “geometric objects” here are much more abstract. A few of them are familiar: straight lines, the area between them forming shapes on a plane. Most of them, though, are generalizations of this: instead of lines on a plane, they have higher dimensional planes in higher dimensional spaces. These too get described as geometry, even though they aren’t the “everyday” geometry you might be familiar with. Instead, they’re a “natural generalization”, something that, once you know the math, is close enough to that “everyday” geometry that it deserves the same name.
This week, two papers presented a totally different kind of geometric description of particle physics. In those papers, “geometric” has to do with differential geometry, the mathematics behind Einstein’s theory of general relativity. The descriptions are geometric because they use the same kinds of building blocks as that theory, a metric that bends space and time. This kind of geometry is also a natural generalization of the everyday notion, but in yet another way.
All of these notions of geometry do have some things in common, of course. Maybe you could even write down a definition of “geometry” that includes all of them. But they’re different enough that if I tell you that two descriptions are “geometric”, it doesn’t tell you all that much. It definitely doesn’t tell you the two descriptions are related.
It’s a reasonable misunderstanding, though. It comes from a place where, used to “everyday” geometry, you expect two “geometric descriptions” of something to be similar: shapes moving in everyday space, things you can directly compare. Instead, a geometric description can be many sorts of shape, in many sorts of spaces, emphasizing many sorts of properties. “Geometry” is just a really broad term.
As the saying goes, it is better not to see laws or sausages being made. You’d prefer to see the clean package on the outside than the mess behind the scenes.
The same is true of science. A good paper tells a nice, clean story: a logical argument from beginning to end, with no extra baggage to slow it down. That story isn’t a lie: for any decent paper in theoretical physics, the conclusions will follow from the premises. Most of the time, though, it isn’t how the physicist actually did it.
The way we actually make discoveries is messy. It involves looking for inspiration in all the wrong places: pieces of old computer code and old problems, trying to reproduce this or that calculation with this or that method. In the end, once we find something interesting enough, we can reconstruct a clearer, cleaner story, something actually fit to publish. We hide the original mess partly for career reasons (easier to get hired if you tell a clean, heroic story), partly to be understood (a paper that embraced the mess of discovery would be a mess to read), and partly just due to that deep human instinct to not let others see us that way.
The trouble is, some of that “mess” is useful, even essential. And because it’s never published or put into textbooks, the only way to learn it is word of mouth.
A lot of these messy tricks involve numerics. Many theoretical physics papers derive things analytically, writing out equations in symbols. It’s easy to make a mistake in that kind of calculation, either writing something wrong on paper or as a bug in computer code. To correct mistakes, many things are checked numerically: we plug in numbers to make sure everything still works. Sometimes this means using an approximation, trying to make sure two things cancel to some large enough number of decimal places. Sometimes instead it’s exact: we plug in prime numbers, and can much more easily see if two things are equal, or if something is rational or contains a square root. Sometimes numerics aren’t just used to check something, but to find a solution: exploring many options in an easier numerical calculation, finding one that works, and doing it again analytically.
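As a toy illustration of the exact kind of check, here is a minimal Python sketch. The identity being tested (a simple partial-fraction decomposition) and all the names are stand-ins chosen for this example, not taken from any actual calculation. The idea is just to plug prime numbers into both sides using exact rational arithmetic, so that equality means equality rather than “close enough”.

```python
from fractions import Fraction

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23]


def lhs(s, t):
    """Partial-fraction form of the toy expression."""
    return 1 / (s * (s + t)) + 1 / (t * (s + t))


def rhs(s, t):
    """Compact form we hope is equal to lhs."""
    return 1 / (s * t)


def agree_at_primes(f, g, primes=PRIMES):
    """Compare f and g exactly at pairs of distinct primes."""
    points = [(Fraction(p), Fraction(q)) for p, q in zip(primes, primes[1:])]
    return all(f(s, t) == g(s, t) for s, t in points)


print(agree_at_primes(lhs, rhs))                       # the true identity
print(agree_at_primes(lhs, lambda s, t: 1 / (s + t)))  # a deliberately wrong guess
```

Primes are a convenient choice because they rarely produce accidental cancellations. Agreement at a handful of points is evidence, not a proof, which is why the analytic derivation still matters.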
“Ansatze” are also common: our fancy word for an educated guess. These we sometimes admit, when they’re at the core of a new scientific idea. But the more minor examples go un-mentioned. If a paper shows a nice clean formula and proves it’s correct, but doesn’t explain how the authors got it…probably, they used an ansatz. This trick can go hand-in-hand with numerics as well: make a guess, check it matches the right numbers, then try to see why it’s true.
The messy tricks can also involve the code itself. In my field we often use “computer algebra” systems, programs to do our calculations for us. These systems are programming languages in their own right, and we need to write computer code for them. That code gets passed around informally, but almost never standardized. Mathematical concepts that come up again and again can be implemented very differently by different people, some much more efficiently than others.
I don’t think it’s unreasonable that we leave “the mess” out of our papers. They would certainly be hard to understand otherwise! But it’s a shame we don’t publish our dirty tricks somewhere, even in special “dirty tricks” papers. Students often start out assuming everything is done the clean way, and start doubting themselves when they notice it’s much too slow to make progress. Learning the tricks is a big part of learning to be a physicist. We should find a better way to teach them.
Last week was a birthday conference for one of the pioneers of my sub-field, Ettore Remiddi. I wasn’t there, but someone who was pointed me to some of the slides, including a talk by Stefano Laporta. For those of you who didn’t see my post a few weeks back, Laporta was one of Remiddi’s students, who developed one of the most important methods in our field and then vanished, spending ten years on an amazingly detailed calculation. Laporta’s talk covers more of the story, about what it was like to do precision calculations in that era.
“That era”, the 90’s through 2000’s, witnessed an enormous speedup in computers, and a corresponding speedup in what was possible. Laporta worked with Remiddi on the three-loop electron anomalous magnetic moment, something Remiddi had been working on since 1969. When Laporta joined in 1989, twenty-one of the seventy-two diagrams needed had still not been computed. They would polish them off over the next seven years, before Laporta dove into four loops. Twenty years later, he had that four-loop result to over a thousand digits.
One fascinating part of the talk is seeing how the computational techniques change over time, as language replaces language and computer clusters get involved. As a student, Laporta learns a lesson we all often need: that to avoid mistakes, we need to do as little by hand as possible, even for something as simple as copying a one-line formula. Looking at his review of others’ calculations, it’s remarkable how many theoretical results had to be dramatically corrected a few years down the line, and how much still might depend on theoretical precision.
Another theme was one of Remiddi suggesting something and Laporta doing something entirely different, and often much more productive. Whether it was using the arithmetic-geometric mean for an elliptic integral instead of Gaussian quadrature, or coming up with his namesake method, Laporta spent a lot of time going his own way, and Remiddi quickly learned to trust him.
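To give a flavor of the arithmetic-geometric mean trick, here is a minimal Python sketch. It is not Laporta’s code or his integral: it uses the textbook relation K(k) = π / (2·AGM(1, √(1−k²))) for the complete elliptic integral of the first kind, and it compares against a plain midpoint rule as a stand-in for a quadrature method (not Gaussian quadrature itself). All function names are hypothetical.

```python
import math


def agm(a, b, max_iter=60):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * abs(a):
            break
        # Replace the pair by its arithmetic and geometric means.
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def elliptic_k_agm(k):
    """Complete elliptic integral of the first kind, modulus 0 <= k < 1."""
    return math.pi / (2 * agm(1.0, math.sqrt(1 - k * k)))


def elliptic_k_midpoint(k, n=2000):
    """Same integral by a midpoint rule over theta in [0, pi/2]."""
    h = (math.pi / 2) / n
    return h * sum(
        1 / math.sqrt(1 - (k * math.sin((i + 0.5) * h)) ** 2) for i in range(n)
    )


print(elliptic_k_agm(0.5))
print(elliptic_k_midpoint(0.5))
```

The AGM loop settles in a handful of iterations, each roughly doubling the number of correct digits, which hints at why it is attractive when you want very high precision.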
There’s a lot more in the slides that’s worth reading, including a mention of one of this year’s Physics Nobelists. The whole thing is an interesting look at what it takes to press precision to the utmost, and dedicate years to getting something right.
Ask a doctor or a psychologist if they’re sure about something, and they might say “it has p<0.05”. Ask a physicist, and they’ll say it’s a “5 sigma result”. On the surface, they sound like they’re talking about completely different things. As it turns out, they’re not quite that different.
Whether it’s a p-value or a sigma, what scientists are giving you is shorthand for a probability. The p-value is the probability itself, while sigma tells you how many standard deviations something is away from the mean on a normal distribution. For people not used to statistics this might sound very complicated, but it’s not so tricky in the end. There’s a graph, called a normal distribution, and you can look at how much of it is above a certain point, measured in units called standard deviations, or “sigmas”. That gives you your probability.
What are these numbers a probability of? At first, you might think they’re a probability of the scientist being right: of the medicine working, or the Higgs boson being there.
That would be reasonable, but it’s not how it works. Scientists can’t measure the chance they’re right. All they can do is compare models. When a scientist reports a p-value, what they’re doing is comparing to a kind of default model, called a “null hypothesis”. There are different null hypotheses for different experiments, depending on what the scientists want to test. For the Higgs, scientists looked at pairs of photons detected by the LHC. The null hypothesis was that these photons were created by other parts of the Standard Model, like the strong force, and not by a Higgs boson. For medicine, the null hypothesis might be that people get better on their own after a certain amount of time. That’s hard to estimate, which is why medical experiments use a control group: a similar group without the medicine, to see how much they get better on their own.
Once we have a null hypothesis, we can use it to estimate how likely it is that it produced the result of the experiment. If there was no Higgs, and all those photons just came from other particles, what’s the chance there would still be a giant pile of them at one specific energy? If the medicine didn’t do anything, what’s the chance the control group did that much worse than the treatment group?
Ideally, you want a small probability here. In medicine and psychology, you’re looking for a probability below 5%, that is, p<0.05. In physics, you need 5 sigma to make a discovery, which corresponds to a one in 3.5 million probability. If the probability is low, then you can say that it would be quite unlikely for your result to happen if the null hypothesis was true. If you’ve got a better hypothesis (the Higgs exists, the medicine works), then you should pick that instead.
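To put the two conventions on the same scale, here is a minimal Python sketch that turns a probability into a number of sigmas. It assumes a one-sided tail of a normal distribution, which is one common convention but not the only one. The function name is a hypothetical helper.

```python
from statistics import NormalDist


def probability_to_sigma(p):
    """Number of sigmas whose one-sided normal tail has probability p."""
    # inv_cdf(p) is the (negative) point with probability p below it;
    # by symmetry, flipping the sign gives the point with p above it.
    return -NormalDist().inv_cdf(p)


for label, p in [("p = 0.05", 0.05), ("one in 3.5 million", 1 / 3.5e6)]:
    print(f"{label}: about {probability_to_sigma(p):.2f} sigma")
```

Under that assumption, the medical threshold of p<0.05 sits at a little over one and a half sigma, while the physics threshold sits at five.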
On Monday, Quanta magazine released an article on a man who transformed the way we do particle physics: Stefano Laporta. I’d tipped them off that Laporta would make a good story: someone who came up with the bread-and-butter algorithm that fuels all of our computations, then vanished from the field for ten years, returning at the end with an 1,100 digit masterpiece. There’s a resemblance to Searching for Sugar Man, fans and supporters baffled that their hero is living in obscurity.
If anything, I worry I under-sold the story. When Quanta interviewed me, it was clear they were looking for ties to well-known particle physics results: was Laporta’s work necessary for the Higgs boson discovery, or linked to the controversy over the magnetic moment of the muon? I was careful, perhaps too careful, in answering. The Higgs, to my understanding, didn’t require so much precision for its discovery. As for the muon, the controversial part is a kind of calculation that wouldn’t use Laporta’s methods, while the un-controversial part was found numerically by a group that doesn’t use his algorithm either.
With more time now, I can make a stronger case. I can trace Laporta’s impact, show who uses his work and for what.
In particle physics, we have a lovely database called INSPIRE that lists all our papers. Here is Laporta’s page, his work sorted by number of citations. When I look today, I find his most cited paper, the one that first presented his algorithm, at the top, with a delightfully apt 1,001 citations. Let’s listen to a few of those 1,001 tales, and see what they tell us.
After that, more applications: fundamental quantities for collider physics, pieces of math that are used again and again. In particular, they are referenced again and again by the Particle Data Group, who collect everything we know about particle physics.
Further down still, and we get to specific code: FIRE and Reduze, programs made by others to implement Laporta’s algorithm, each with many uses in its own right.
All that, just from one of Laporta’s papers.
His ten-year magnum opus is more recent, and has fewer citations: checking now, just 139. Still, there are stories to tell there too.
So why would you want 1,100 digits, then? In a word, mathematics. The calculation involves exotic types of numbers called periods, more complicated cousins of numbers like pi. These numbers are related to each other, often in complicated and surprising ways, ways which are hard to verify without such extreme precision. An older result of Laporta’s inspired the physicist David Broadhurst and mathematician Anton Mellit to conjecture new relations between numbers of this type, relations that were only later proven using cutting-edge mathematics. The new result has inspired mathematicians too: Oliver Schnetz found hints of a kind of “numerical footprint”, special types of numbers tied to the physics of electrons. It’s a topic I’ve investigated myself, something I think could lead to much more efficient particle physics calculations.
In addition to being inspired by Laporta’s work, Broadhurst has advocated for it. He was the one who first brought my attention to Laporta’s story, with a moving description of welcoming him back to the community after his ten-year silence, writing a letter to help him get funding. I don’t have all the details of the situation, but the impression I get is that Laporta had virtually no academic support for those ten years: no salary, no students, having to ask friends elsewhere for access to computer clusters.
When I ask why someone with such an impact didn’t have a professorship, the answer I keep hearing is that he didn’t want to move away from his home town of Bologna. If you aren’t an academic, that won’t sound like much of an explanation: Bologna has a university after all, the oldest in the world. But that isn’t actually a guarantee of anything. Universities hire rarely, according to their own mysterious agenda. I remember another colleague whose wife worked for a big company. They offered her positions in several cities, including New York. They told her that, since New York has many universities, surely her husband could find a job at one of them? We all had a sad chuckle at that.
For almost any profession, a contribution like Laporta’s would let you live anywhere you wanted. That’s not true for academia, and it’s to our loss. By demanding that each scientist be able to pick up and move, we’re cutting talented people out of the field, filtering by traits that have nothing to do with our contributions to knowledge. I don’t know Laporta’s full story. But I do know that doing the work you love in the town you love isn’t some kind of unreasonable request. It’s a request academia should be better at fulfilling.