
Machine Learning, Occam’s Razor, and Fundamental Physics

There’s a saying in physics, attributed to the famous genius John von Neumann: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

Say you want to model something, like some surprising data from a particle collider. You start with some free parameters: numbers in your model that aren’t decided yet. You then decide those numbers, “fixing” them based on the data you want to model. Your goal is for your model not only to match the data, but to predict something you haven’t yet measured. Then you can go out and check, and see if your model works.

The more free parameters you have in your model, the easier this can go wrong. More free parameters make it easier to fit your data, but that’s because they make it easier to fit any data. Your model ends up not just matching the physics, but matching the mistakes as well: the small errors that crop up in any experiment. A model like that may look like it’s a great fit to the data, but its predictions will almost all be wrong. It wasn’t just fit, it was overfit.
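To make this concrete, here is a minimal Python sketch. It is a toy, not anything from a real collider analysis: it assumes a made-up “true” law (a straight line), ten measurements that are each off by a deliberately nasty alternating error of ±0.1, and two models, a line with two free parameters and a polynomial with ten. All names and numbers are illustrative assumptions.

```python
# Toy illustration of overfitting (all data here is synthetic and hypothetical).

def true_law(x):
    """The 'physics' we pretend not to know: a straight line."""
    return 2.0 * x + 1.0

# Ten measurements on [0, 1], each off by an alternating error of +/-0.1.
xs = [i / 9 for i in range(10)]
ys = [true_law(x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]

def fit_line(xs, ys):
    """Two free parameters: ordinary least-squares slope and intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept

def fit_interpolant(xs, ys):
    """Ten free parameters: the degree-9 polynomial through every point
    (Lagrange form), so it matches the errors as well as the physics."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return model

def mean_squared_error(model, points, targets):
    return sum((model(x) - t) ** 2 for x, t in zip(points, targets)) / len(points)

line = fit_line(xs, ys)
wiggly = fit_interpolant(xs, ys)

# "New measurements": error-free values halfway between the old points.
new_xs = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
new_ys = [true_law(x) for x in new_xs]

print("fit error, line:      ", mean_squared_error(line, xs, ys))
print("fit error, polynomial:", mean_squared_error(wiggly, xs, ys))
print("prediction error, line:      ", mean_squared_error(line, new_xs, new_ys))
print("prediction error, polynomial:", mean_squared_error(wiggly, new_xs, new_ys))
```

The ten-parameter polynomial matches the original measurements perfectly, errors and all, and then predicts the new points far worse than the humble line does.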

We have statistical tools that tell us when to worry about overfitting, when we should be impressed by a model and when it has too many parameters. We don’t actually use these tools correctly, but they still give us a hint of what we actually want to know, namely, whether our model will make the right predictions. In a sense, these tools form the mathematical basis for Occam’s Razor, the idea that the best explanation is often the simplest one, and Occam’s Razor is a critical part of how we do science.
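One tool of this kind is the Akaike information criterion, or AIC (I’m picking it as an example; it is far from the only option). The sketch below assumes a least-squares fit with Gaussian errors, in which case the AIC can be written, up to a constant, as n ln(RSS/n) + 2k for n data points, a residual sum of squares RSS, and k free parameters. Lower scores are better. The fits and their numbers are invented for illustration.

```python
import math

def aic_least_squares(n_points, rss, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive
    constant: n * ln(RSS / n) + 2k. Lower scores are preferred."""
    return n_points * math.log(rss / n_points) + 2 * n_params

# Hypothetical fits to the same 10 data points.
simple = aic_least_squares(n_points=10, rss=1.0, n_params=2)
elephant = aic_least_squares(n_points=10, rss=0.9, n_params=9)

# The 9-parameter model fits slightly better (smaller RSS), but the
# 2k penalty outweighs the gain, so the simpler model scores lower.
print(simple, elephant)
```

The penalty term is the Occam’s Razor part: extra parameters have to buy a real improvement in the fit, or the score gets worse.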

So, did you know machine learning was just modeling data?

All of the much-hyped recent advances in artificial intelligence, GPT and Stable Diffusion and all those folks, at heart they’re all doing this kind of thing. They start out with a model (with a lot more than five parameters, arranged in complicated layers…), then use data to fix the free parameters. Unlike most of the models physicists use, they can’t perfectly fix these numbers: there are too many of them, so they have to approximate. They then test their model on new data, and hope it still works.

Increasingly, it does, and impressively well, so well that the average person probably doesn’t realize this is what it’s doing. When you ask one of these AIs to make an image for you, what you’re doing is asking what image the model predicts would show up captioned with your text. It’s the same sort of thing as asking an economist what their model predicts the unemployment rate will be when inflation goes up. The machine learning model is just way, way more complicated.

As a physicist, the first time I heard about this, I had von Neumann’s quote in the back of my head. Yes, these machines are dealing with a lot more data, from a much more complicated reality. They literally are trying to fit elephants, even elephants wiggling their trunks. Still, the sheer number of parameters seemed fishy here. And for a little bit things seemed even more fishy, when I learned about double descent.

Suppose you start increasing the number of parameters in your model. Initially, your model gets better and better. Your predictions have less and less error: your error descends. Eventually, though, the error increases again: you have too many parameters so you’re overfitting, and your model is capturing accidents in your data, not reality.

In machine learning, weirdly, this is often not the end of the story. Sometimes, your prediction error rises, only to fall once more, in a double descent.

For a while, I found this deeply disturbing. The idea that you can fit your data, start overfitting, and then keep overfitting, and somehow end up safe in the end, was terrifying. The way some of the popular accounts described it, like you were just overfitting more and more and that was fine, was baffling, especially when they seemed to predict that you could keep adding parameters, keep fitting tinier and tinier fleas on the elephant’s trunk, and your predictions would never start going wrong. It would be the death of Occam’s Razor as we know it, more complicated explanations beating simpler ones off to infinity.

Luckily, that’s not what happens. And after talking to a bunch of people, I think I finally understand this enough to say something about it here.

The right way to think about double descent is as overfitting prematurely. You do still expect your error to eventually go up: your model won’t be perfect forever, at some point you will really overfit. It might take a long time, though: machine learning people are trying to model very complicated things, like human behavior, with giant piles of data, so very complicated models may often be entirely appropriate. In the meantime, due to a bad choice of model, you can accidentally overfit early. You will eventually overcome this, pushing past with more parameters into a model that works again, but for a little while you might convince yourself, wrongly, that you have nothing more to learn.

(You can even mitigate this by tweaking your setup, potentially avoiding the problem altogether.)

So Occam’s Razor still holds, but with a twist. The best model is simple enough, but no simpler. And if you’re not careful enough, you can convince yourself that a too-simple model is as complicated as you can get.

Image from Astral Codex Ten

I was reminded of all this recently by some articles by Sabine Hossenfelder.

Hossenfelder is a critic of mainstream fundamental physics. The articles were her restating a point she’s made many times before, including in (at least) one of her books. She thinks the people who propose new particles and try to search for them are wasting time, and the experiments motivated by those particles are wasting money. She’s motivated by something like Occam’s Razor, the need to stick to the simplest possible model that fits the evidence. In her view, the simplest models are those in which we don’t detect any more new particles any time soon, so those are the models she thinks we should stick with.

I tend to disagree with Hossenfelder. Here, I was oddly conflicted. In some of her examples, it seemed like she had a legitimate point. Others seemed like she missed the mark entirely.

Talk to most astrophysicists, and they’ll tell you dark matter is settled science. Indeed, there is a huge amount of evidence that something exists out there in the universe that we can’t see. It distorts the way galaxies rotate, lenses light with its gravity, and wiggled the early universe in pretty much the way you’d expect matter to.

What isn’t settled is whether that “something” interacts with anything else. It has to interact with gravity, of course, but everything else is in some sense “optional”. Astroparticle physicists use satellites to search for clues that dark matter has some other interactions: perhaps it is unstable, sometimes releasing tiny signals of light. If it did, it might solve other problems as well.

Hossenfelder thinks this is bunk (in part because she thinks those other problems are bunk). I kind of do too, though perhaps for a more general reason: I don’t think nature owes us an easy explanation. Dark matter isn’t obligated to solve any of our other problems, it just has to be dark matter. That seems in some sense like the simplest explanation, the one demanded by Occam’s Razor.

At the same time, I disagree with her substantially more on collider physics. At the Large Hadron Collider so far, all of the data is reasonably compatible with the Standard Model, our roughly half-century old theory of particle physics. Collider physicists search that data for subtle deviations, one of which might point to a general discrepancy, a hint of something beyond the Standard Model.

While my intuitions say that the simplest dark matter is completely dark, they don’t say that the simplest particle physics is the Standard Model. Back when the Standard Model was proposed, people might have said it was exceptionally simple because it had a property called “renormalizability”, but these days we view that as less important. Physicists like Ken Wilson and Steven Weinberg taught us to view theories as a kind of series of corrections, like a Taylor series in calculus. Each correction encodes new, rarer ways that particles can interact. A renormalizable theory is just the first term in this series. The higher terms might be zero, but they might not. We even know that some terms cannot be zero, because gravity is not renormalizable.

The two cases on the surface don’t seem that different. Dark matter might have zero interactions besides gravity, but it might have other interactions. The Standard Model might have zero corrections, but it might have nonzero corrections. But for some reason, my intuition treats the two differently: I would find it completely reasonable for dark matter to have no extra interactions, but very strange for the Standard Model to have no corrections.

I think part of where my intuition comes from here is my experience with other theories.

One example is a toy model called sine-Gordon theory. In sine-Gordon theory, this Taylor series of corrections is a very familiar Taylor series: the sine function! If you go correction by correction, you’ll see new interactions and more new interactions. But if you actually add them all up, something surprising happens. Sine-Gordon turns out to be a special theory, one with “no particle production”: unlike in normal particle physics, in sine-Gordon particles can neither be created nor destroyed. You would never know this if you did not add up all of the corrections.
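Here is a small Python sketch of the general point, using the sine function itself rather than the full sine-Gordon theory (which is well beyond a few lines of code). Every truncation of the Taylor series is a polynomial that eventually grows without bound, while the complete sum never leaves the range from -1 to 1. The function name and sample values are purely illustrative.

```python
import math

def sine_taylor(x, n_terms):
    """Sum the first n_terms of the Taylor series
    sin(x) = x - x^3/3! + x^5/5! - ..."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range(n_terms)
    )

x = 10.0
for n_terms in (1, 3, 5, 40):
    print(n_terms, sine_taylor(x, n_terms))
print("full function:", math.sin(x))
```

With only a few terms, the result at x = 10 is wildly larger than 1. Only once enough terms are included does the sum settle down to the tame value of the sine function: the nice behavior is a property of the whole series, not of any finite piece of it.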

String theory itself is another example. In string theory, elementary particles are replaced by strings, but you can think of that stringy behavior as a series of corrections on top of ordinary particles. Once again, you can try adding these things up correction by correction, but once again the “magic” doesn’t happen until the end. Only in the full series does string theory “do its thing”, and fix some of the big problems of quantum gravity.

If the real world really is a theory like this, then I think we have to worry about something like double descent.

Remember, double descent happens when our models can prematurely get worse before getting better. This can happen if the real thing we’re trying to model is very different from the model we’re using, like the example in this explainer that tries to use straight lines to match a curve. If we think a model is simpler because it puts fewer corrections on top of the Standard Model, then we may end up rejecting a reality with infinite corrections, a Taylor series that happens to add up to something quite nice. Occam’s Razor stops helping us if we can’t tell which models are really the simple ones.

The problem here is that every notion of “simple” we can appeal to here is aesthetic, a choice based on what makes the math look nicer. Other sciences don’t have this problem. When a biologist or a chemist wants to look for the simplest model, they look for a model with fewer organisms, fewer reactions…in the end, fewer atoms and molecules, fewer of the building-blocks given to those fields by physics. Fundamental physics can’t do this: we build our theories up from mathematics, and mathematics only demands that we be consistent. We can call theories simpler because we can write them in a simple way (but we could write them in a different way too). Or we can call them simpler because they look more like toy models we’ve worked with before (but those toy models are just a tiny sample of all the theories that are possible). We don’t have a standard of simplicity that is actually reliable.

From the Wikipedia page for dark matter halos

There is one other way out of this pickle. A theory that is easier to write down is under no obligation to be true. But it is more likely to be useful. Even if the real world is ultimately described by some giant pile of mathematical parameters, if a simple theory is good enough for the engineers then it’s a better theory to aim for: a useful theory that makes people’s lives better.

I kind of get the feeling Hossenfelder would make this objection. I’ve seen her argue on twitter that scientists should always be able to say what their research is good for, and her Guardian article has this suggestive sentence: “However, we do not know that dark matter is indeed made of particles; and even if it is, to explain astrophysical observations one does not need to know details of the particles’ behaviour.”

Ok yes, to explain astrophysical observations one doesn’t need to know the details of dark matter particles’ behavior. But taking a step back, one doesn’t actually need to explain astrophysical observations at all.

Astrophysics and particle physics are not engineering problems. Nobody out there is trying to steer a spacecraft all the way across a galaxy, navigating the distribution of dark matter, or creating new universes and trying to make sure they go just right. Even if we might do these things some day, it will be so far in the future that our attempts to understand them won’t just be quaint: they will likely be actively damaging, confusing old research in dead languages that the field will be better off ignoring to start from scratch.

Because of that, usefulness is also not a meaningful guide. It cannot tell you which theories are more simple, which to favor with Occam’s Razor.

Hossenfelder’s highest-profile recent work falls afoul of one or the other of her principles. Her work on the foundations of quantum mechanics could genuinely be useful, but there’s no reason aside from claims of philosophical beauty to expect it to be true. Her work on modeling dark matter is at least directly motivated by data, but is guaranteed to not be useful.

I’m not pointing this out to call Hossenfelder a hypocrite, as some sort of ad hominem or tu quoque. I’m pointing this out because I don’t think it’s possible to do fundamental physics today without falling afoul of these principles. If you want to hold out hope that your work is useful, you don’t have a great reason besides a love of pretty math: otherwise, anything useful would have been discovered long ago. If you just try to model existing data as best you can, then you’re making a model for events far away or locked in high-energy particle colliders, a model no-one else besides other physicists will ever use.

I don’t know the way through this. I think if you need to take Occam’s Razor seriously, to build on the same foundations that work in every other scientific field…then you should stop doing fundamental physics. You won’t be able to make it work. If you still need to do it, if you can’t give up the sub-field, then you should justify it on building capabilities, on the kind of “practice” Hossenfelder also dismisses in her Guardian piece.

We don’t have a solid foundation, a reliable notion of what is simple and what isn’t. We have guesses and personal opinions. And until some experiment uncovers some blinding flash of new useful meaningful magic…I don’t think we can do any better than that.

The Folks With the Best Pictures

Sometimes I envy astronomers. Particle physicists can write books full of words and pages of colorful graphs and charts, and the public won’t retain any of it. Astronomers can mesmerize the world with a single picture.

NASA just released the first images from its James Webb Space Telescope. They’re impressive, and not merely visually: in twelve hours, they probe deeper than the Hubble Space Telescope managed in weeks on the same patch of sky, as well as gathering data that can show what kinds of molecules are present in the galaxies.

(If you’re curious how the James Webb images compare to Hubble ones, here’s a nice site comparing them.)

Images like this enter the popular imagination. The Hubble telescope’s deep field has appeared on essentially every artistic product one could imagine. As of writing this, searching for “Hubble” on Etsy gives almost 5,000 results. “JWST”, the acronym for the James Webb Space Telescope, already gives over 1,000, including several on the front page that already contain just-released images. Despite the Large Hadron Collider having operated for over a decade, searching “LHC” also leads to just around 1,000 results…and a few on the front page are actually pictures of the JWST!

It would be great as particle physicists to have that kind of impact…but I think we shouldn’t stress ourselves too much about it. Ultimately astronomers will always have this core advantage. Space is amazing, visually stunning and mind-bogglingly vast. It has always had a special place for human cultures, and I’m happy for astronomers to inherit that place.

Don’t Trust the Experiments, Trust the Science

I was chatting with an astronomer recently, and this quote by Arthur Eddington came up:

“Never trust an experimental result until it has been confirmed by theory.”

Arthur Eddington

At first, this sounds like just typical theorist arrogance, thinking we’re better than all those experimentalists. It’s not that, though, or at least not just that. Instead, it’s commenting on a trend that shows up again and again in science, but rarely makes the history books. Again and again an experiment or observation comes through with something fantastical, something that seems like it breaks the laws of physics or throws our best models into disarray. And after a few months, when everyone has checked, it turns out there was a mistake, and the experiment agrees with existing theories after all.

You might remember a recent example, when a lab claimed to have measured neutrinos moving faster than the speed of light, only for it to turn out to be due to a loose cable. Experiments like this aren’t just a result of modern hype: as Eddington’s quote shows, they were also common in his day. In general, Eddington’s advice is good: when an experiment contradicts theory, theory tends to win in the end.

This may sound unscientific: surely we should care only about what we actually observe? If we defer to theory, aren’t we putting dogma ahead of the evidence of our senses? Isn’t that the opposite of good science?

To understand what’s going on here, we can use an old philosophical argument: David Hume’s argument against miracles. David Hume wanted to understand how we use evidence to reason about the world. He argued that, for miracles in particular, we can never have good evidence. In Hume’s definition, a miracle was something that broke the established laws of science. Hume argued that, if you believe you observed a miracle, there are two possibilities: either the laws of science really were broken, or you made a mistake. The thing is, laws of science don’t just come from a textbook: they come from observations as well, many many observations in many different conditions over a long period of time. Some of those observations establish the laws in the first place, others come from the communities that successfully apply them again and again over the years. If your miracle was real, then it would throw into doubt many, if not all, of those observations. So the question you have to ask is: is it more likely those observations were wrong? Or that you made a mistake? Put another way, your evidence is only good enough for a miracle if it would be a bigger miracle if you were wrong.

Hume’s argument always struck me as a little bit too strict: if you rule out miracles like this, you also rule out new theories of science! A more modern approach would use numbers and statistics, weighing the past evidence for a theory against the precision of the new result. Most of the time you’d reach the same conclusion, but sometimes an experiment can be good enough to overthrow a theory.
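As a rough sketch of what “using numbers and statistics” could look like, here is Bayes’ rule applied to a surprising result. This is one simple way to formalize the idea, not a description of how any real collaboration weighs evidence, and the function name and every probability below are invented for illustration.

```python
def probability_theory_is_wrong(prior_wrong, p_result_if_wrong, p_result_if_right):
    """Bayes' rule: chance the established theory is wrong, given a result
    that contradicts it.

    prior_wrong:       belief that the theory fails, before the experiment
    p_result_if_wrong: chance of seeing this result if the theory does fail
    p_result_if_right: chance of seeing it anyway (mistake, loose cable, fluke)
    """
    wrong = prior_wrong * p_result_if_wrong
    right = (1.0 - prior_wrong) * p_result_if_right
    return wrong / (wrong + right)

# A well-tested theory (one-in-a-million prior of failing) versus an
# experiment with a one-in-a-thousand chance of an undetected error.
print(probability_theory_is_wrong(1e-6, 1.0, 1e-3))

# The same theory versus a far more careful experiment.
print(probability_theory_is_wrong(1e-6, 1.0, 1e-9))
```

With these made-up numbers, the first experiment leaves only about a 0.1% chance that the theory is wrong, so the theory wins. The second pushes that chance to about 99.9%: it would be the bigger miracle if the experiment were mistaken.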

Still, theory should always sit in the background, a kind of safety net for when your experiments screw up. It does mean that when you don’t have that safety net you need to be extra-careful. Physics is an interesting case of this: while we have “the laws of physics”, we don’t have any established theory that tells us what kinds of particles should exist. That puts physics in an unusual position, and it’s probably part of why we have such strict standards of statistical proof. If you’re going to be operating without the safety net of theory, you need that kind of proof.
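For a sense of scale, particle physicists conventionally ask for a “five sigma” result before claiming a discovery. The sketch below converts a number of sigmas into the chance of a fluctuation at least that large, assuming purely Gaussian noise and a one-sided test. Real analyses involve much more than this, so treat it as a back-of-the-envelope illustration.

```python
import math

def one_sided_p_value(n_sigma):
    """Chance that Gaussian noise alone fluctuates upward by at least
    n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

for n_sigma in (2, 3, 5):
    print(n_sigma, one_sided_p_value(n_sigma))
```

Five sigma works out to roughly one chance in 3.5 million that noise alone would produce the signal.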

This post was also inspired by some biological examples. The examples are politically controversial, so since this is a no-politics blog I won’t discuss them in detail. (I’ll also moderate out any comments that do.) All I’ll say is that I wonder if in that case the right heuristic is this kind of thing: not to “trust scientists” or “trust experts” or even “trust statisticians”, but just to trust the basic, cartoon-level biological theory.

The Big Bang: What We Know and How We Know It

When most people think of the Big Bang, they imagine a single moment: a whole universe emerging from nothing. That’s not really how it worked, though. The Big Bang refers not to one event, but to a whole scientific theory. Using Einstein’s equations and some simplifying assumptions, we physicists can lay out a timeline for the universe’s earliest history. Different parts of this timeline have different evidence: some are meticulously tested, others we even expect to be wrong! It’s worth talking through this timeline and discussing what we know about each piece, and how we know it.

We can see surprisingly far back in time. As we look out into the universe, we see each star as it was when the light we see left it: longer ago the further the star is from us. Looking back, we see changes in the types of stars and galaxies: stars formed without the metals that later stars produced, galaxies made of those early stars. We see the universe become denser and hotter, until eventually we reach the last thing we can see: the cosmic microwave background, a faint light that fills our view in every direction. This light represents a change in the universe, the emergence of the first atoms. Before this, there were ions: free nuclei and electrons, forming a hot plasma. That plasma constantly emitted and absorbed light. As the universe cooled, the ions merged into atoms, and light was free to travel. Because of this, we cannot see back beyond this point. Our model gives detailed predictions for this curtain of light: its temperature, and even the ways it varies in intensity from place to place, which in turn let us hone our model further.

In principle, we could “see” a bit further. Light isn’t the only thing that travels freely through the universe. Neutrinos are almost massless, and pass through almost everything. Like the cosmic microwave background, the universe should have a cosmic neutrino background. This would come from much earlier, from an era when the universe was so dense that neutrinos regularly interacted with other matter. We haven’t detected this neutrino background yet, but future experiments might. Gravitational waves, meanwhile, can also pass through almost any obstacle. There should be gravitational wave backgrounds as well, from a variety of eras in the early universe. Once again these haven’t been detected yet, but more powerful gravitational wave telescopes may yet see them.

We have indirect evidence a bit further back than we can see things directly. In the heat of the early universe the first protons and neutrons were merged via nuclear fusion, becoming the first atomic nuclei: isotopes of hydrogen, helium, and lithium. Our model lets us predict the proportions of these, how much helium and lithium per hydrogen atom. We can then compare this to the oldest stars we see, and see that the proportions are right. In this way, we know something about the universe from before we can “see” it.

We get surprised when we look at the universe on large scales, and compare widely separated regions. We find those regions are surprisingly similar, more than we would expect from randomness and the physics we know. Physicists have proposed different explanations for this. The most popular, cosmic inflation, suggests that the universe expanded very rapidly, accelerating so that a small region of similar matter was blown up much larger than the ordinary Big Bang model would have, projecting those similarities across the sky. While many think this proposal fits the data best, we still aren’t sure it’s the right one: there are alternate proposals, and it’s even controversial whether we should be surprised by the large-scale similarity in the first place.

We understand, in principle, how matter can come from “nothing”. This is sometimes presented as the most mysterious part of the Big Bang, the idea that matter could spontaneously emerge from an “empty” universe. But to a physicist, this isn’t very mysterious. Matter isn’t actually conserved, mass is just energy you haven’t met yet. Deep down, the universe is just a bunch of rippling quantum fields, with different ones more or less active at different times. Space-time itself is just another field, the gravitational field. When people say that in the Big Bang matter emerged from nothing, all they mean is that energy moved from the gravitational field to fields like the electron and quark, giving rise to particles. As we wind the model back, we can pretty well understand how this could happen.

If we extrapolate, winding Einstein’s equations back all the way, we reach a singularity: the whole universe, according to those equations, would have emerged from a single point, a time when everything was zero distance from everything else. This assumes, though, that Einstein’s equations keep working all the way back that far. That’s probably wrong, though. Einstein’s equations don’t include the effect of quantum mechanics, which should be much more important when the universe is at its hottest and densest. We don’t have a complete theory of quantum gravity yet (at least, not one that can model this), so we can’t be certain how to correct these equations. But in general, quantum theories tend to “fuzz out” singularities, spreading out a single point over a wider area. So it’s likely that the universe didn’t actually come from just a single point, and our various incomplete theories of quantum gravity tend to back this up.

So, starting from what we can see, we extrapolate back to what we can’t. We’re quite confident in some parts of the Big Bang theory: the emergence of the first galaxies, the first stars, the first atoms, and the first elements. Go back far enough and things get more mysterious: we have proposals but no definite answers. And if you try to wind back up to the beginning, you find we still don’t have the right kind of theory to answer the question. That’s a task for the future.

Black Holes, Neutron Stars, and the Power of Love

What’s the difference between a black hole and a neutron star?

When a massive star nears the end of its life, it starts running out of nuclear fuel. Without the support of a continuous explosion, the star begins to collapse, crushed under its own weight.

What happens then depends on how much weight that is. The most massive stars collapse completely, into the densest form anything can take: a black hole. Einstein’s equations say a black hole is a single point, infinitely dense: get close enough and nothing, not even light, can escape. A quantum theory of gravity would change this, but not a lot: a quantum black hole would still be as dense as quantum matter can get, still equipped with a similar “point of no return”.
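For a black hole that isn’t spinning, Einstein’s equations put that point of no return at the Schwarzschild radius, r = 2GM/c². A quick Python sketch, using rounded values for the physical constants (the function and variable names are just for illustration):

```python
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # mass of the Sun, kg (rounded)

def schwarzschild_radius_m(mass_kg):
    """Radius of the 'point of no return' for a non-spinning black hole."""
    return 2.0 * G * mass_kg / C**2

for solar_masses in (1, 10):
    radius_km = schwarzschild_radius_m(solar_masses * M_SUN) / 1000.0
    print(f"{solar_masses} solar masses: {radius_km:.1f} km")
```

A black hole with the mass of the Sun would have a point of no return only about 3 kilometers from its center, and the radius grows in direct proportion to the mass.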

A slightly less massive star collapses, not to a black hole, but to a neutron star. Matter in a neutron star doesn’t collapse to a single point, but it does change dramatically. Each electron in the old star is crushed together with a proton until it becomes a neutron, a forced reversal of the more familiar process of beta decay. Instead of a ball of hydrogen and helium, the star then ends up like a single atomic nucleus, one roughly the size of a city.

Not kidding about the “city” thing…and remember, this is more massive than the Sun

Now, let me ask a slightly different question: how do you tell the difference between a black hole and a neutron star?

Sometimes, you can tell this through ordinary astronomy. Neutron stars do emit light, unlike black holes, though for most neutron stars this is hard to detect. In the past, astronomers would use other objects instead, looking at light from matter falling in, orbiting, or passing by a black hole or neutron star to estimate its mass and size.

Now they have another tool: gravitational wave telescopes. Maybe you’ve heard of LIGO, or its European cousin Virgo: massive machines that do astronomy not with light but by detecting ripples in space and time. In the future, these will be joined by an even bigger setup in space, called LISA. When two black holes or neutron stars collide they “ring” the fabric of space and time like a bell, sending out waves in every direction. By analyzing the frequency of these waves, scientists can learn something about what made them: in particular, whether the waves were made by black holes or neutron stars.

One big difference between black holes and neutron stars lies in something called their “Love numbers”. From far enough away, you can pretend both black holes and neutron stars are single points, like fundamental particles. Try to get more precise, and this picture starts to fail, but if you’re smart you can include small corrections and keep things working. Some of those corrections, called Love numbers, measure how much one object gets squeezed and stretched by the other’s gravitational field. They’re called Love numbers not because they measure how hug-able a neutron star is, but after the mathematician who first proposed them, A. E. H. Love.

What can we learn from Love numbers? Quite a lot. More impressively, there are several different types of questions Love numbers can answer. There are questions about our theories, questions about the natural world, and questions about fundamental physics.

You might have heard that black holes “have no hair”. A black hole in space can be described by just two numbers: its mass, and how much it spins. A star is much more complicated, with sunspots and solar flares and layers of different gases in different amounts. For a black hole, all of that is compressed down to nothing, reduced to just those two numbers and nothing else.

With that in mind, you might think a black hole should have zero Love numbers: it should be impossible to squeeze it or stretch it. This is fundamentally a question about a theory, Einstein’s theory of relativity. If we took that theory for granted, and didn’t add anything to it, what would the consequences be? Would black holes have zero Love number, or not?

It turns out black holes do have zero Love number, if they aren’t spinning. If they are, things are more complicated: a few calculations made it look like spinning black holes also had zero Love number, but just last year a more detailed proof showed that this doesn’t hold. Somehow, despite having “no hair”, you can actually “squeeze” a spinning black hole.

(EDIT: Folks on twitter pointed out a wrinkle here: more recent papers are arguing that spinning black holes actually do have zero Love number as well, and that the earlier papers confused Love numbers with a different effect. All that is to say this is still very much an active area of research!)

The physics behind neutron stars is in principle known, but in practice hard to understand. When they are formed, almost every type of physics gets involved: gas and dust, neutrino blasts, nuclear physics, and general relativity holding it all together.

Because of all this complexity, the structure of neutron stars can’t be calculated from “first principles” alone. Finding it out isn’t a question about our theories, but a question about the natural world. We need to go out and measure how neutron stars actually behave.

Love numbers are a promising way to do that. Love numbers tell you how an object gets squeezed and stretched in a gravitational field. Learning the Love numbers of neutron stars will tell us something about their structure: namely, how squeezable and stretchable they are. Already, LIGO and Virgo have given us some information about this, and ruled out a few possibilities. In future, more sensitive gravitational wave telescopes will show much more.
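To give a sense of scale, here is a minimal sketch of how a Love number is commonly packaged for gravitational wave work: as a dimensionless "tidal deformability", Λ = (2/3) k2 (R c² / (G M))^5, where k2 is the quadrupolar Love number. The formula is the standard definition, but the function name and the input values (a 1.4 solar mass star, a 12 km radius, k2 = 0.1) are round numbers assumed purely for illustration, not measurements.

```python
# Illustrative sketch: convert a quadrupolar Love number k2 into the
# dimensionless tidal deformability Lambda = (2/3) * k2 * (R c^2 / (G M))^5.
# All input values below are assumed round numbers, not measured ones.

G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg


def tidal_deformability(k2, mass_kg, radius_m):
    """Dimensionless tidal deformability for a given Love number k2."""
    compactness = G * mass_kg / (radius_m * C**2)  # G M / (R c^2)
    return (2.0 / 3.0) * k2 / compactness**5


# Hypothetical neutron star: 1.4 solar masses, 12 km radius, k2 = 0.1.
neutron_star = tidal_deformability(0.1, 1.4 * M_SUN, 12e3)

# Non-spinning black hole in general relativity: k2 = 0, so Lambda = 0.
# The radius used here is the Schwarzschild radius, 2 G M / c^2.
black_hole = tidal_deformability(0.0, 1.4 * M_SUN, 2 * G * 1.4 * M_SUN / C**2)

print(f"neutron star: Lambda ~ {neutron_star:.0f}")
print(f"black hole:   Lambda = {black_hole:.0f}")
```

With these assumed inputs the neutron star comes out at a few hundred, while the black hole is exactly zero. The fifth power of the radius is the important part: a modest change in how big the star is shifts the result a lot, which is why measuring it says something about the star's structure.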

Returning to black holes, you might wonder what happens if we don’t stick to Einstein’s theory of relativity. Physicists expect that relativity has to be modified to account for quantum effects, to make a true theory of quantum gravity. We don’t quite know how to do that yet, but there are a few proposals on the table.

Asking for the true theory of quantum gravity isn’t just a question about some specific part of the natural world, it’s a question about the fundamental laws of physics. Can Love numbers help us answer it?

Maybe. Some theorists think that quantum gravity will change the Love numbers of black holes. Fewer, but still some, think they will change enough to be detectable with future gravitational wave telescopes like LISA. I get the impression this is controversial, both because of the different proposals involved and the approximations used to understand them. Still, it’s fun that Love numbers can answer so many different types of questions, and teach us so many different things about physics.

Unrelated: For those curious about what I look/sound like, I recently gave a talk of outreach advice for the Max Planck Institute for Physics, and they posted it online here.

What Tells Your Story

I watched Hamilton on Disney+ recently. With GIFs and songs from the show all over social media for the last few years, there weren’t many surprises. One thing that nonetheless struck me was the focus on historical evidence. The musical Hamilton is based on Ron Chernow’s biography of Alexander Hamilton, and it preserves a surprising amount of the historian’s care for how we know what we know, hidden within the show’s other themes. From the refrain of “who tells your story”, to the importance of Eliza burning her letters with Hamilton (not just the emotional gesture but the “gap in the narrative” it created for historians), to the song “The Room Where It Happens” (which looked from GIFsets like it was about Burr’s desire for power, but is mostly about how much of history is hidden in conversations we can only partly reconstruct), the show keeps the puzzle of reasoning from incomplete evidence front-and-center.

Any time we try to reason about the past, we are faced with these kinds of questions. They don’t just apply to history, but to the so-called historical sciences as well, sciences that study the past. Instead of asking “who” told the story, such scientists must keep in mind “what” is telling the story. For example, paleontologists reason from fossils, and thus are limited by what does and doesn’t get preserved. As a result, after a century of studying dinosaurs, only in the last twenty years did it become clear they had feathers.

Astronomy, too, is a historical science. Whenever astronomers look out at distant stars, they are looking at the past. And just like historians and paleontologists, they are limited by what evidence happened to be preserved, and what part of that evidence they can access.

These limitations lead to mysteries, and often controversies. Before LIGO, astronomers had an idea of what the typical mass of a black hole was. After LIGO, a new slate of black holes has been observed, with much higher mass. It’s still unclear why.

Try to reason about the whole universe, and you end up asking similar questions. When we see the movement of “standard candle” stars, is that because the universe’s expansion is accelerating, or are the stars moving as a group?

Push far enough back and the evidence doesn’t just lead to controversy, but to hard limits on what we can know. No matter how good our telescopes are, we won’t see light older than the cosmic microwave background: before that background was emitted the universe was filled with plasma, which would have absorbed any earlier light, erasing anything we could learn from it. Gravitational waves may one day let us probe earlier, and make discoveries as surprising as feathered dinosaurs. But there is yet a stronger limit to how far back we can go, beyond which any evidence has been so diluted that it is indistinguishable from random noise. We can never quite see into “the room where it happened”.

It’s gratifying to see questions of historical evidence in a Broadway musical, in the same way it was gratifying to hear fractals mentioned in a Disney movie. It’s important to think about who, and what, is telling the stories we learn. Spreading that lesson helps all of us reason better.

QCD Meets Gravity 2020

I’m at another Zoom conference this week, QCD Meets Gravity. This year it’s hosted by Northwestern.

The view of the campus from wonder.me

QCD Meets Gravity is a conference series focused on the often-surprising links between quantum chromodynamics on the one hand and gravity on the other. By thinking of gravity as the “square” of forces like the strong nuclear force, researchers have unlocked new calculation techniques and deep insights.

Last year’s conference was very focused on one particular topic, trying to predict the gravitational waves observed by LIGO and Virgo. That’s still a core topic of the conference, but it feels like there is a bit more diversity in topics this year. We’ve seen a variety of talks on different “squares”: new theories that square to other theories, and new calculations that benefit from “squaring” (even surprising applications to the Navier-Stokes equation!). There are talks on subjects from String Theory to Effective Field Theory, and even a talk on a very different way that “QCD meets gravity”, in collisions of neutron stars.

With still a few more talks to go, expect me to say a bit more next week, probably discussing a few in more detail. (Several people presented exciting work in progress!) Until then, I should get back to watching!

Truth Doesn’t Have to Break the (Word) Budget

Imagine you saw this headline:

Scientists Say They’ve Found the Missing 40 Percent of the Universe’s Matter

It probably sounds like they’re talking about dark matter, right? And if scientists found dark matter, that could be a huge discovery: figuring out what dark matter is made of is one of the biggest outstanding mysteries in physics. Still, maybe that 40% number makes you a bit suspicious…

Now, read this headline instead:

Astronomers Have Finally Found Most of The Universe’s Missing Visible Matter

Visible matter! Ah, what a difference a single word makes!

These are two articles, the first from this year and the second from 2017, talking about the same thing. Leave out dark matter and dark energy, and the rest of the universe is made of ordinary protons, neutrons, and electrons. We sometimes call that “visible matter”, but that doesn’t mean it’s easy to spot. Much of it lingers in threads of gas and dust between galaxies, making it difficult to detect. These two articles are about astronomers who managed to detect this matter in different ways. But while the articles cover the same sort of matter, one headline is a lot more misleading.

Now, I know science writing is hard work. You can’t avoid misleading your readers, if only a little, because you can never include every detail. Introduce too many new words and you’ll use up your “vocabulary budget” and lose your audience. I also know that headlines get tweaked by editors at the last minute to maximize “clicks”, and that news that doesn’t get enough “clicks” dies out, replaced by news that does.

But that second headline? It’s shorter than the first. They were able to fit that crucial word “visible” in, without breaking the budget. And while I don’t have the data, I doubt the first headline was that much more viral. They could have afforded to get this right, if they wanted to.

Read each article further, and you see the same pattern. The 2020 article does mention visible matter in the first sentence at least, so they don’t screw that one up completely. But another important detail never gets mentioned.

See, you might be wondering, if one of these articles is from 2017 and the other is from 2020, how are they talking about the same thing? If astronomers found this matter already in 2017, how did they find it again in 2020?

There’s a key detail that the 2017 article mentions and the 2020 article leaves out. Here’s a quote from the 2017 article, with the phrase I want to emphasize at the very end:

We now have our first solid piece of evidence that this matter has been hiding in the delicate threads of cosmic webbing bridging neighbouring galaxies, right where the models predicted.

This “missing” matter was expected to exist, was predicted by models to exist. It just hadn’t been observed yet. In 2017, astronomers detected some of this matter indirectly, through its effect on the Cosmic Microwave Background. In 2020, they found it more directly, through X-rays shot out from the gases themselves.

Once again, the difference is just a short phrase. By saying “right where the models predicted”, the 2017 article clears up an important point, that this matter wasn’t a surprise. And all it took was five words.

These little words and phrases make a big difference. If you’re writing about science, you will always face misunderstandings. But if you’re careful and clever, you can clear up the most obvious ones. With just a few well-chosen words, you can have a much better piece.

Congratulations to Roger Penrose, Reinhard Genzel, and Andrea Ghez!

The 2020 Physics Nobel Prize was announced last week, awarded to Roger Penrose for his theorems about black holes and Reinhard Genzel and Andrea Ghez for discovering the black hole at the center of our galaxy.

Of the three, I’m most familiar with Penrose’s work. People had studied black holes before Penrose, but only the simplest of situations, like an imaginary perfectly spherical star. Some wondered whether black holes in nature were limited in this way, if they could only exist under perfectly balanced conditions. Penrose showed that wasn’t true: he proved mathematically that black holes not only can form, they must form, in very general situations. He’s also worked on a wide variety of other things. He came up with “twistor space”, an idea intended for a new theory of quantum gravity that ended up as a useful tool for “amplitudeologists” like me to study particle physics. He discovered a set of four types of tiles such that if you tiled a floor with them the pattern would never repeat. And he has some controversial hypotheses about quantum gravity and consciousness.

I’m less familiar with Genzel and Ghez, but by now everyone should be familiar with what they found. Genzel and Ghez led two teams that peered into the center of our galaxy. By carefully measuring the way stars moved deep in the core, they figured out something we now teach children: that our beloved Milky Way has a dark and chewy center, an enormous black hole around which everything else revolves. These appear to be a common feature of galaxies, and many others have been shown to orbit black holes as well.

Like last year, I find it a bit odd that the Nobel committee decided to lump these two prizes together. Both discoveries concern black holes, so they’re more related than last year’s laureates, but the contexts are quite different: it’s not as if Penrose predicted the black hole in the center of our galaxy. Usually the Nobel committee avoids mathematical work like Penrose’s, except when it’s tied to a particular experimental discovery. It doesn’t look like anyone has gotten a Nobel Prize for discovering that black holes exist, so maybe that’s the intent of this one…but Genzel and Ghez were not the first people to find evidence of a black hole. So overall I’m confused. I’d say that Penrose deserved a Nobel Prize, and that Genzel and Ghez did as well, but I’m not sure why they needed to split one with each other.

4gravitons, Spinning Up

I had a new paper out last week, with Michèle Levi and Andrew McLeod. But to explain it, I’ll need to clarify something about our last paper.

Two weeks ago, I told you that Andrew and Michèle and I had written a paper, predicting what gravitational wave telescopes like LIGO see when black holes collide. You may remember that LIGO doesn’t just see colliding black holes: it sees colliding neutron stars too. So why didn’t we predict what happens when neutron stars collide?

Actually, we did. Our calculation doesn’t just apply to black holes. It applies to neutron stars too. And not just neutron stars: it applies to anything of roughly the right size and shape. Black holes, neutron stars, very large grapefruits…

LIGO’s next big discovery

That’s the magic of Effective Field Theory, the “zoom lens” of particle physics. Zoom out far enough, and any big, round object starts looking like a particle. Black holes, neutron stars, grapefruits, we can describe them all using the same math.

Ok, so we can describe both black holes and neutron stars. Can we tell the difference between them?

In our last calculation, no. In this one, yes!

Effective Field Theory isn’t just a zoom lens, it’s a controlled approximation. That means that when we “zoom out” we don’t just throw out anything “too small to see”. Instead, we approximate it, estimating how big of an effect it can have. Depending on how precise we want to be, we can include more and more of these approximated effects. If our estimates are good, we’ll include everything that matters, and get a good approximation for what we’re trying to observe.
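The idea of a controlled approximation can be seen in a much simpler setting. The sketch below is a Newtonian toy, not the calculation in our paper: two equal masses sit a distance a on either side of the origin, and we ask for their gravitational potential at a point far away along the same line. Zoomed all the way out, the pair looks like one point particle with the total mass. Each correction is suppressed by another power of (a/r)², so we can estimate the size of whatever we leave out. The function names and the choice of units (G times each mass set to 1) are assumptions made for the example.

```python
# Toy "zoom out" expansion (Newtonian, units with G * m = 1 per mass).
# Two equal masses sit at +a and -a on a line; we observe at distance r > a
# on the same line. This mimics the logic of a controlled approximation only.

def exact_potential(r, a):
    """Exact potential of the two masses at distance r along the axis."""
    return -(1.0 / (r - a) + 1.0 / (r + a))


def approx_potential(r, a, order):
    """Point-particle term plus `order` corrections in powers of (a/r)^2."""
    eps = (a / r) ** 2
    return -(2.0 / r) * sum(eps**n for n in range(order + 1))


def estimated_error(r, a, order):
    """Relative size of the first term we dropped."""
    return (a / r) ** (2 * (order + 1))


r, a = 10.0, 1.0
exact = exact_potential(r, a)

results = []
for order in range(4):
    approx = approx_potential(r, a, order)
    actual = abs(approx - exact) / abs(exact)   # true relative error
    results.append((order, actual, estimated_error(r, a, order)))
    print(f"order {order}: actual error {actual:.1e}, "
          f"estimate {estimated_error(r, a, order):.1e}")
```

In this toy the estimate matches the true error, because the series is a simple geometric one. In a real calculation the match is only approximate, but the logic is the same: decide how precise you need to be, then keep exactly as many corrections as that precision demands.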

At the precision of our last calculation, a black hole and a neutron star still look exactly the same. Our new calculation aims for a bit higher precision though. (For the experts: we’re at a higher order in spin.) The higher precision means that we can actually see the difference: our result changes for two colliding black holes versus two colliding grapefruits.

So does that mean I can tell you what happens when two neutron stars collide, according to our calculation? Actually, no. That’s not because we screwed up the calculation: it’s because some of the properties of neutron stars are unknown.

The Effective Field Theory of neutron stars has what we call “free parameters”, unknown variables. People have tried to estimate some of these (called “Love numbers” after the mathematician A. E. H. Love), but they depend on the details of how neutron stars work: what stuff they contain, how that stuff is shaped, and how it can move. To find them out, we probably can’t just calculate: we’ll have to measure, observe an actual neutron star collision and see what the numbers actually are.

That’s one of the purposes of gravitational wave telescopes. It’s not (as far as I know) something LIGO can measure. But future telescopes, with more precision, should be able to. By watching two colliding neutron stars and comparing to a high-precision calculation, physicists will better understand what those neutron stars are made of. In order to do that, they will need someone to do that high-precision calculation. And that’s why people like me are involved.