Tag Archives: particle physics

At Ars Technica Last Week, With a Piece on How Wacky Ideas Become Big Experiments

I had a piece last week at Ars Technica about the path ideas in physics take to become full-fledged experiments.

My original idea for the story was a light-hearted short news piece. A physicist at the University of Kansas, Steven Prohira, had just posted a proposal for wiring up a forest to detect high-energy neutrinos, using the trees like giant antennas.

As I chatted with experts, what at first seemed silly started feeling like a hook for something more. Prohira has a strong track record, and the experts I talked to took his idea seriously. They had significant doubts, but I was struck by how answerable those doubts were: rather than dismissing the whole enterprise, they had in mind a list of questions one could actually test. I wrote a blog post laying out that impression here.

The editor at Ars was interested, so I dug deeper. Prohira’s story became a window on a wider-ranging question: how do experiments happen? How does a scientist convince the community to work on a project, and the government to fund it? How do ideas get tested before these giant experiments get built?

I tracked down researchers from existing experiments and got their stories. They told me how detecting particles from space takes ingenuity, with wacky ideas involving the natural world being surprisingly common. They walked me through tales of prototypes and jury-rigging and feasibility studies and approval processes.

The highlights of those tales ended up in the piece, but there was a lot I couldn’t include. In particular, I had a long chat with Sunil Gupta about the twists and turns taken by the GRAPES experiment in India. Luckily for you, some of the most interesting stories have already been covered, for example their measurement of the voltage of a thunderstorm or repurposing used building materials to keep costs down. I haven’t yet found his story about stirring wavelength-shifting chemicals all night using a propeller mounted on a power drill, but I suspect it’s out there somewhere. If not, maybe it can be the start of a new piece!

A Tale of Two Experiments

Before I begin, two small announcements:

First: I am now on bluesky! Instead of having a separate link in the top menu for each social media account, I’ve changed the format so now there are social media buttons in the right-hand sidebar, right under the “Follow” button. Currently, they cover tumblr, twitter, and bluesky, but there may be more in future.

Second, I’ve put a bit more technical advice on my “Open Source Grant Proposal” post, so people interested in proposing similar research can have some ideas about how best to pitch it.

Now, on to the post:


Gravitational wave telescopes are possibly the most exciting research program in physics right now. Big, expensive machines with more on the way in the coming decades, gravitational wave telescopes need both precise theoretical predictions and high-quality data analysis. For some, gravitational wave telescopes have the potential to reveal genuinely new physics, to probe deviations from general relativity that might be related to phenomena like dark matter, though so far no such deviations have been conclusively observed. In the meantime, they’re teaching us new consequences of known physics. For example, the unusual population of black holes observed by LIGO has motivated those who model star clusters to consider processes in which the motion of three stars or black holes is related to each other, discovering that these processes are more important than expected.

Particle colliders are probably still exciting to the general public, but for many there is a growing sense of fatigue and disillusionment. Current machines like the LHC are big and expensive, and proposed future colliders would be even costlier and take decades to come online, in addition to requiring a huge amount of effort from the community in terms of precise theoretical predictions and data analysis. Some argue that colliders still might uncover genuinely new physics, deviations from the standard model that might explain phenomena like dark matter, but as no such deviations have yet been conclusively observed people are increasingly skeptical. In the meantime, most people working on collider physics are focused on learning new consequences of known physics. For example, by comparing observed results with theoretical approximations, people have found that certain high-energy processes usually left out of calculations are actually needed to get a good agreement with the data, showing that these processes are more important than expected.

…ok, you see what I did there, right? Was that fair?

There are a few key differences, with implications to keep in mind:

First, collider physics is significantly more expensive than gravitational wave physics. LIGO took about $300 million to build and spends about $50 million a year. The LHC took about $5 billion to build and costs $1 billion a year to run. That cost still puts both well below several other government expenses that you probably consider frivolous (please don’t start arguing about which ones in the comments!), but it does mean collider physics demands a bit of a stronger argument.

Second, the theoretical motivation to expect new fundamental physics out of LIGO is generally considered much weaker than for colliders. A large part of the theoretical physics community thought that they had a good argument why they should see something new at the LHC. In contrast, most theorists have been skeptical of the kinds of modified gravity theories that have dramatic enough effects that one could measure them with gravitational wave telescopes, with many of these theories having other pathologies or inconsistencies that made people wary.

Third, the general public finds astrophysics cooler than particle physics. Somehow, telling people “pairs of black holes collide more often than we thought because sometimes a third star in the neighborhood nudges them together” gets people much more excited than “pairs of quarks collide more often than we thought because we need to re-sum large logarithms differently”, even though I don’t think there’s a real “principled” difference between them. Neither reveals new laws of nature, both are upgrades to our ability to model how real physical objects behave, neither is useful to know for anybody living on Earth in the present day.

With all this in mind, my advice to gravitational wave physicists is to try, as much as possible, not to lean on stories about dark matter and modified gravity. You might learn something, and it’s worth occasionally mentioning that. But if you don’t, you run a serious risk of disappointing people. And you have such a big PR advantage if you just lean on new consequences of bog standard GR, that those guys really should get the bulk of the news coverage if you want to keep the public on your side.

Replacing Space-Time With the Space in Your Eyes

Nima Arkani-Hamed thinks space-time is doomed.

That doesn’t mean he thinks it’s about to be destroyed by a supervillain. Rather, Nima, like many physicists, thinks that space and time are just approximations to a deeper reality. In order to make sense of gravity in a quantum world, seemingly fundamental ideas, like that particles move through particular places at particular times, will probably need to become more flexible.

But while most people who think space-time is doomed research quantum gravity, Nima’s path is different. Nima has been studying scattering amplitudes, formulas used by particle physicists to predict how likely particles are to collide in particular ways. He has been trying to find ways to calculate these scattering amplitudes without referring directly to particles traveling through space and time. In the long run, the hope is that knowing how to do these calculations will help suggest new theories beyond particle physics, theories that can’t be described with space and time at all.

Ten years ago, Nima figured out how to do this in a particular theory, one that doesn’t describe the real world. For that theory he was able to find a new picture of how to calculate scattering amplitudes based on a combinatorial, geometric space with no reference to particles traveling through space-time. He gave this space the catchy name “the amplituhedron”. In the years since, he found a few other “hedra” describing different theories.

Now, he’s got a new approach. The new approach doesn’t have the same kind of catchy name: people sometimes call it surfaceology, or curve integral formalism. Like the amplituhedron, it involves concepts from combinatorics and geometry. It isn’t quite as “pure” as the amplituhedron: it uses a bit more from ordinary particle physics, and while it avoids specific paths in space-time it does care about the shape of those paths. Still, it has one big advantage: unlike the amplituhedron, Nima’s new approach looks like it can work for at least a few of the theories that actually describe the real world.

The amplituhedron was mysterious. Instead of space and time, it described the world in terms of a geometric space whose meaning was unclear. Nima’s new approach also describes the world in terms of a geometric space, but this space’s meaning is a lot more clear.

The space is called “kinematic space”. That probably still sounds mysterious. “Kinematic” in physics refers to motion. In the beginning of a physics class when you study velocity and acceleration before you’ve introduced a single force, you’re studying kinematics. In particle physics, kinematic refers to the motion of the particles you detect. If you see an electron going up and to the right at a tenth the speed of light, those are its kinematics.

Kinematic space, then, is the space of observations. By saying that his approach is based on ideas in kinematic space, what Nima is saying is that it describes colliding particles not based on what they might be doing before they’re detected, but on mathematics that asks questions only about facts about the particles that can be observed.

(For the experts: this isn’t quite true, because he still needs a concept of loop momenta. He’s getting the actual integrands from his approach, rather than the dual definition he got from the amplituhedron. But he does still have to integrate one way or another.)

Quantum mechanics famously has many interpretations. In my experience, Nima’s favorite interpretation is the one known as “shut up and calculate”. Instead of arguing about the nature of an indeterminately philosophical “real world”, Nima thinks quantum physics is a tool to calculate things people can observe in experiments, and that’s the part we should care about.

From a practical perspective, I agree with him. And I think if you have this perspective, then ultimately, kinematic space is where your theories have to live. Kinematic space is nothing more or less than the space of observations, the space defined by where things land in your detectors, or if you’re a human and not a collider, in your eyes. If you want to strip away all the speculation about the nature of reality, this is all that is left over. Any theory, of any reality, will have to be described in this way. So if you think reality might need a totally new weird theory, it makes sense to approach things like Nima does, and start with the one thing that will always remain: observations.

I Ain’t Afraid of No-Ghost Theorems

In honor of Halloween this week, let me say a bit about the spookiest term in physics: ghosts.

In particle physics, we talk about the universe in terms of quantum fields. There is an electron field for electrons, a gluon field for gluons, a Higgs field for Higgs bosons. The simplest fields, for the simplest particles, can be described in terms of just a single number at each point in space and time, a value describing how strong the field is. More complicated fields require more numbers.

Most of the fundamental forces have what we call vector fields. They’re called this because they are often described with vectors, lists of numbers that identify a direction in space and time. But these vectors actually contain too many numbers.

These extra numbers have to be tidied up in some way in order to describe vector fields in the real world, like the electromagnetic field or the gluon field of the strong nuclear force. There are a number of tricks, but the nicest is usually to add some extra particles called ghosts. Ghosts are designed to cancel out the extra numbers in a vector, leaving the right description for a vector field. They’re set up mathematically such that they can never be observed, they’re just a mathematical trick.

Mathematical tricks aren’t all that spooky (unless you’re scared of mathematics itself, anyway). But in physics, ghosts can take on a spookier role as well.

In order to do their job cancelling those numbers, ghosts need to function as a kind of opposite to a normal particle, a sort of undead particle. Normal particles have kinetic energy: as they go faster and faster, they have more and more energy. Said another way, it takes more and more energy to make them go faster. Ghosts have negative kinetic energy: the faster they go, the less energy they have.

If ghosts are just a mathematical trick, that’s fine, they’ll do their job and cancel out what they’re supposed to. But sometimes, physicists accidentally write down a theory where the ghosts aren’t just a trick cancelling something out, but real particles you could detect, without anything to hide them away.

In a theory where ghosts really exist, the universe stops making sense. The universe defaults to the lowest energy it can reach. If making a ghost particle go faster reduces its energy, then the universe will make ghost particles go faster and faster, and make more and more ghost particles, until everything is jam-packed with super-speedy ghosts unto infinity, never-ending because it’s always possible to reduce the energy by adding more ghosts.

The absence of ghosts, then, is a requirement for a sensible theory. People prove theorems showing that their new ideas don’t create ghosts. And if your theory does start seeing ghosts…well, that’s the spookiest omen of all: an omen that your theory is wrong.

Transforming Particles Are Probably Here to Stay

It can be tempting to imagine the world in terms of lego-like building-blocks. Atoms stick together protons, neutrons, and electrons, and protons and neutrons are made of stuck-together quarks in turn. And while atoms, despite the name, aren’t indivisible, you might think that if you look small enough you’ll find indivisible, unchanging pieces, the smallest building-blocks of reality.

Part of that is true. We might, at some point, find the smallest pieces, the things everything else is made of. (In a sense, it’s quite likely we’ve already found them!) But those pieces don’t behave like lego blocks. They aren’t indivisible and unchanging.

Instead, particles, even the most fundamental particles, transform! The most familiar example is beta decay, a radioactive process where a neutron turns into a proton, emitting an electron and a neutrino. This process can be explained in terms of more fundamental particles: the neutron is made of three quarks, and one of those quarks transforms from a “down quark” to an “up quark”. But the explanation, as far as we can tell, doesn’t go any deeper. Quarks aren’t unchanging, they transform.

Beta decay! Ignore the W, which is important but not for this post.

There’s a suggestion I keep hearing, both from curious amateurs and from dedicated crackpots: why doesn’t this mean that quarks have parts? If a down quark can turn into an up quark, an electron, and a neutrino, then why doesn’t that mean that a down quark contains an up quark, an electron, and a neutrino?

The simplest reason is that this isn’t the only way a quark transforms. You can also have beta-plus decay, where an up quark transforms into a down quark, emitting a neutrino and the positively charged anti-particle of the electron, called a positron.

Also, ignore the directions of the arrows, that’s weird particle physics notation that doesn’t matter here.

So to make your idea work, you’d somehow need each down quark to contain an up quark plus some other particles, and each up quark to contain a down quark plus some other particles.

Can you figure out some complicated scheme that works like that? Maybe. But there’s a deeper reason why this is the wrong path.

Transforming particles are part of a broader phenomenon, called particle production. Reactions in particle physics can produce new particles that weren’t there before. This wasn’t part of the earliest theories of quantum mechanics that described one electron at a time. But if you want to consider the quantum properties of not just electrons, but the electric field as well, then you need a more complete theory, called a quantum field theory. And in those theories, you can produce new particles. It’s as simple as turning on the lights: from a wiggling electron, you make light, which in a fully quantum theory is made up of photons. Those photons weren’t “part of” the electron to start with, they are produced by its motion.

If you want to avoid transforming particles, to describe everything in terms of lego-like building-blocks, then you want to avoid particle production altogether. Can you do this in a quantum field theory?

Actually, yes! But your theory won’t describe the whole of the real world.

In physics, we have examples of theories that don’t have particle production. These example theories have a property called integrability. They are theories we can “solve”, doing calculations that aren’t possible in ordinary theories. The name comes from the fact that the oldest such theories in classical mechanics were solved using integrals.

Normal particle physics theories have conserved charges. Beta decay conserves electric charge: you start out with a neutral particle, and end up with one particle with positive charge and another with negative charge. It also conserves other things, like “electron-number” (the electron has electron-number one, the neutrino that comes out with it has electron-number minus one), energy, and momentum.
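If you'd like to check that bookkeeping yourself, here is a minimal sketch. The table of charges is standard, but the names and the `conserves` helper are mine, made up for illustration. I've also labeled the particle that comes out with the electron an "antineutrino", since that is the one with electron-number minus one.

```python
from fractions import Fraction

# (electric charge, electron-number) for each particle.
# Names and layout are illustrative, not from any particular library.
PARTICLES = {
    "neutron": (Fraction(0), 0),
    "proton": (Fraction(1), 0),
    "down": (Fraction(-1, 3), 0),
    "up": (Fraction(2, 3), 0),
    "electron": (Fraction(-1), 1),
    "positron": (Fraction(1), -1),
    "neutrino": (Fraction(0), 1),
    "antineutrino": (Fraction(0), -1),
}

def totals(names):
    """Add up electric charge and electron-number for a list of particles."""
    charge = sum(PARTICLES[n][0] for n in names)
    electron_number = sum(PARTICLES[n][1] for n in names)
    return charge, electron_number

def conserves(before, after):
    """True if both charges are the same before and after the reaction."""
    return totals(before) == totals(after)

# Beta decay, at the level of the neutron and at the level of quarks.
beta_minus = conserves(["neutron"], ["proton", "electron", "antineutrino"])
beta_minus_quarks = conserves(["down"], ["up", "electron", "antineutrino"])

# Beta-plus decay at the level of quarks.
beta_plus_quarks = conserves(["up"], ["down", "positron", "neutrino"])

# A made-up reaction that drops the antineutrino: electric charge still
# balances, but electron-number does not.
forbidden = conserves(["neutron"], ["proton", "electron"])
```

The last line is the interesting one: leave out the antineutrino and electric charge still balances, but electron-number doesn't, so the check fails.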

Integrable theories have those charges too, but they have more. In fact, they have an infinite number of conserved charges. As a result, you can show that in these theories it is impossible to produce new particles. It’s as if each particle’s existence is its own kind of conserved charge, one that can never be created or destroyed, so that each collision just rearranges the particles, never makes new ones.

But while we can write down these theories, we know they can’t describe the whole of the real world. In an integrable theory, when you build things up from the fundamental building-blocks, their energy follows a pattern. Compare the energy of a bunch of different combinations, and you find a characteristic kind of statistical behavior called a Poisson distribution.

Look at the distribution of energies of nuclei of atoms, and you’ll find a very different kind of behavior. It’s called a Wigner-Dyson distribution, and it indicates the opposite of integrability: chaos. Chaos is behavior that can’t be “solved” like integrable theories, behavior that has to be approached by simulations and approximations.
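To get a feel for the difference, here is a rough numerical sketch. It rests on assumptions of mine, not anything specific to nuclei: the "integrable" levels are just independent random numbers, and the "chaotic" levels are eigenvalues of a random symmetric matrix, which is the standard toy model behind Wigner-Dyson statistics. Instead of plotting the full distributions, it computes a single number, the average ratio of neighboring gaps between levels. That number should come out around 0.39 for independent levels and around 0.53 for the random matrix.

```python
import numpy as np

def mean_spacing_ratio(levels):
    """Average of min(gap, next gap) / max(gap, next gap) over sorted levels."""
    gaps = np.diff(np.sort(levels))
    ratios = np.minimum(gaps[:-1], gaps[1:]) / np.maximum(gaps[:-1], gaps[1:])
    return ratios.mean()

def independent_levels(n, rng):
    """Stand-in for an integrable system: levels with no correlations."""
    return rng.uniform(0.0, 1.0, size=n)

def random_matrix_levels(n, rng):
    """Stand-in for a chaotic system: eigenvalues of a random symmetric matrix."""
    a = rng.normal(size=(n, n))
    h = (a + a.T) / 2.0
    return np.linalg.eigvalsh(h)

rng = np.random.default_rng(0)
r_poisson = np.mean(
    [mean_spacing_ratio(independent_levels(200, rng)) for _ in range(50)]
)
r_wigner_dyson = np.mean(
    [mean_spacing_ratio(random_matrix_levels(200, rng)) for _ in range(50)]
)

print(f"independent levels:   {r_poisson:.2f}")
print(f"random-matrix levels: {r_wigner_dyson:.2f}")
```

The random-matrix number is higher because its levels "repel": two of them are rarely found very close together, so neighboring gaps tend to be more similar in size.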

So if you really want there to be un-changing building-blocks, if you think that’s really essential? Then you should probably start looking at integrable theories. But I wouldn’t hold my breath if I were you: the real world seems pretty clearly chaotic, not integrable. And probably, that means particle production is here to stay.

The Bystander Effect for Reviewers

I probably came off last week as a bit of an extreme “journal abolitionist”. This week, I wanted to give a couple caveats.

First, as a commenter pointed out, the main journals we use in my field are run by nonprofits. Physical Review Letters, the journal where we publish five-page papers about flashy results, is run by the American Physical Society. The Journal of High Energy Physics, where we publish almost everything else, is run by SISSA, the International School for Advanced Studies in Trieste. (SISSA does use Springer, a regular for-profit publisher, to do the actual publishing.)

The journals are also funded collectively, something I pointed out here before but might not have been obvious to readers of last week’s post. There is an agreement, SCOAP3, where research institutions band together to pay the journals. Authors don’t have to pay to publish, and individual libraries don’t have to pay for subscriptions.

And this is a lot better than the situation in other fields, yeah! Though I’d love to quantify how much. I haven’t been able to find a detailed breakdown, but SCOAP3 pays around 1200 EUR per article published. What I’d like to do (but not this week) is to compare this to what other fields pay, as well as to publishing that doesn’t have the same sort of trapped audience, and to online-only free journals like SciPost. (For example, publishing actual physical copies of journals at this point is sort of a vanity thing, so maybe we should compare costs to vanity publishers?)

Second, there’s reviewing itself. Even without traditional journals, one might still want to keep peer review.

What I wanted to understand last week was what peer review does right now, in my field. We read papers fresh off the arXiv, before they’ve gone through peer review. Authors aren’t forced to update the arXiv with the journal version of their paper: if they want to keep another version up, even one that was rejected by the reviewers, they’re free to do so, and most of us wouldn’t notice. And the sort of in-depth review that happens in peer review also happens without it. When we have journal clubs and nominate someone to present a recent paper, or when we try to build on a result or figure out why it contradicts something we thought we knew, we go through the same kind of in-depth reading that (in the best cases) reviewers do.

But I think I’ve hit upon something review does that those kinds of informal things don’t. It gets us to speak up about it.

I presented at a journal club recently. I read through a bombastic new paper, figured out what I thought was wrong with it, and explained it to my colleagues.

But did I reach out to the author? No, of course not, that would be weird.

Psychologists talk about the bystander effect. If someone collapses on the street, and you’re the only person nearby, you’ll help. If you’re one of many, you’ll wait and see if someone else helps instead.

I think there’s a bystander effect for correcting people. If someone makes a mistake and publishes something wrong, we’ll gripe about it to each other. But typically, we won’t feel like it’s our place to tell the author. We might get into a frustrating argument, there wouldn’t be much in it for us, and it might hurt our reputation if the author is well-liked.

(People do speak up when they have something to gain, of course. That’s why when you write a paper, most of the people emailing you won’t be criticizing the science: they’ll be telling you you need to cite them.)

Peer review changes the expectations. Suddenly, you’re expected to criticize, it’s your social role. And you’re typically anonymous, you don’t have to worry about the consequences. It becomes a lot easier to say what you really think.

(It also becomes quite easy to say lazy stupid things, of course. This is why I like setups like SciPost, where reviews are made public even when the reviewers are anonymous. It encourages people to put some effort in, and it means that others can see that a paper was rejected for bad reasons and put less stock in the rejection.)

I think any new structure we put in place should keep this feature. We need to preserve some way to designate someone a critic, to give someone a social role that lets them let loose and explain why someone else is wrong. And having these designated critics around does help my field. The good criticisms get implemented in the papers, the authors put the new versions up on arXiv. Reviewing papers for journals does make our science better…even if none of us read the journal itself.

Why Journals Are Sticky

An older professor in my field has a quirk: every time he organizes a conference, he publishes all the talks in a conference proceeding.

In some fields, this would be quite normal. In computer science, where progress flows like a torrent, new developments are announced at conferences long before they have the time to be written up carefully as a published paper. Conference proceedings are summaries of what was presented at the conference, published so that anyone can catch up on the new developments.

In my field, this is rarer. A few results at each conference will be genuinely new, never-before-published discoveries. Most, though, are talks on older results, results already available online. Writing them up again in summarized form as a conference proceeding seems like a massive waste of time.

The cynical explanation is that this professor is doing this for the citations. Each conference proceeding one of his students publishes is another publication on their CV, another work that they can demand people cite whenever someone uses their ideas or software, something that puts them above others’ students without actually doing any extra scientific work.

I don’t think that’s how this professor thinks about it, though. He certainly cares about his students’ careers, and will fight for them to get cited as much as possible. But he asks everyone at the conference to publish a proceeding, not just his students. I think he’d argue that proceedings are helpful, that they can summarize papers in new ways and make them more accessible. And if they give everyone involved a bit more glory, if they let them add new entries to their CV and get fancy books on their shelves, so much the better for everyone.

My guess is, he really believes something like that. And I’m fairly sure he’s wrong.

The occasional conference proceeding helps, but only because it makes us more flexible. Sometimes, it’s important to let others know about a new result that hasn’t been published yet, and we let conference proceedings go into less detail than a full published paper, so this can speed things up. Sometimes, an old result can benefit from a new, clearer explanation, which normally couldn’t be published without it being a new result (or lecture notes). It’s good to have the option of a conference proceeding.

But there is absolutely no reason to have one for every single talk at a conference.

Between the cynical reason and the explicit reason, there’s the banal one. This guy insists on conference proceedings because they were more useful in the past, because they’re useful in other fields, and because he’s been doing them himself for years. He insists on them because to him, they’re a part of what it means to be a responsible scientist.

And people go along with it. Because they don’t want to get into a fight with this guy, certainly. But also because it’s a bit of extra work that could give a bit of a career boost, so what’s the harm?

I think something similar to this is why academic journals still work the way they do.

In the past, journals were the way physicists heard about new discoveries. They would get each edition in the mail, and read up on new developments. The journal needed to pay professional copyeditors and printers, so they needed money, and they got that money from investors by being part of for-profit companies that sold shares.

Now, though, physicists in my field don’t read journals. We publish our new discoveries online on a non-profit website, formatting them ourselves with software that uses the same programming skills we use in the rest of our professional lives. We then discuss the papers in email threads and journal club meetings. When a paper is wrong, or missing something important, we tell the author, and they fix it.

Oh, and then after that we submit the papers to the same for-profit journals and the same review process that we used to use before we did all this, listing the journals that finally accept the papers on our CVs.

Why do we still do that?

Again, you can be cynical. You can accuse the journals of mafia-ish behavior, you can tie things back to the desperate need to publish in high-ranked journals to get hired. But I think the real answer is a bit more innocent, and human, than that.

Imagine that you’re a senior person in the field. You may remember the time before we had all of these nice web-based publishing options, when journals were the best way to hear about new developments. More importantly than that, though, you’ve worked with these journals. You’ve certainly reviewed papers for them, everyone in the field does that, but you may have also served as an editor, tracking down reviewers and handling communication between the authors and the journal. You’ve seen plenty of cases where the journal mattered, where tracking down the right reviewers caught a mistake or shot down a crackpot’s ambitions, where the editing cleaned something up or made a work appear more professional. You think of the journals as having high standards, standards you have helped to uphold: when choosing between candidates for a job, you notice that one has several papers in Physical Review Letters, and remember papers you’ve rejected for not meeting what you intuited were that journal’s standards. To you, journals are a key part of being a responsible scientist.

Does any of that make journals worth it, though?

Well, that depends on costs. It depends on alternatives. It depends not merely on what the journals catch, but on how often they do it, and how much would have been caught on its own. It depends on whether the high standards you want to apply to job applicants are already being applied by the people who write their recommendation letters and establish their reputations.

And you’re not in a position to evaluate any of that, of course. Few people are, who don’t spend a ton of time thinking about scientific publishing.

And thus, for the non-senior people, there’s not much reason to push back. One hears a few lofty speeches about Elsevier’s profits, and dreams about the end of the big for-profit journals. But most people aren’t cut out to be crusaders or reformers, especially when they signed up to be scientists. Most people are content not to annoy the most respected people in their field by telling them that something they’ve spent an enormous amount of time on is now pointless. Most people want to be seen as helpful by these people, to not slack off on work like reviewing that they argue needs doing.

And most of us have no reason to think we know that much better, anyway. Again, we’re scientists, not scientific publishing experts.

I don’t think it’s good practice to accuse people of cognitive biases. Everyone thinks they have good reasons to believe what they believe, and the only way to convince them is to address those reasons.

But the way we use journals in physics these days is genuinely baffling. It’s hard to explain, it’s the kind of thing people have been looking quizzically at for years. And this kind of explanation is the only one I’ve found that matches what I’ve seen. Between the cynical explanation and the literal arguments, there’s the basic human desire to do what seems like the responsible thing. That tends to explain a lot.

The Machine Learning for Physics Recipe

Last week, I went to a conference on machine learning for physics. Machine learning covers a huge variety of methods and ideas, several of which were on full display. But again and again, I noticed a pattern. The people who seemed to be making the best use of machine learning, the ones who were the most confident in their conclusions and getting the most impressive results, the ones who felt like they had a whole assembly line instead of just a prototype, all of them were doing essentially the same thing.

This post is about that thing. If you want to do machine learning in physics, these are the situations where you’re most likely to see a benefit. You can do other things, and they may work too. But this recipe seems to work over and over again.

First, you need simulations, and you need an experiment.

Your experiment gives you data, and that data isn’t easy to interpret. Maybe you’ve embedded a bunch of cameras in the Antarctic ice, and your data tells you when they trigger and how bright the light is. Maybe you’ve surrounded a particle collision with layers of silicon, and your data tells you how much electric charge the different layers absorb. Maybe you’ve got an array of telescopes focused on a black hole far far away, and your data are pixels gathered from each telescope.

You want to infer, from your data, what happened physically. Your cameras in the ice saw signs of a neutrino: you want to know how much energy it had and where it was coming from. Your silicon is absorbing particles: what kind are they, and what processes did they come from? The black hole might have the rings predicted by general relativity, but it might have weirder rings from a variant theory.

In each case, you can’t just calculate the answer you need. The neutrino streams past, interacting with the ice and camera positions in unpredictable ways. People can write down clean approximations for particles in the highest-energy part of a collision, but once they start cooling down the process becomes so messy that no straightforward formula describes them. Your array of telescopes fuzz and pixellate and have to be assembled together in a complicated way, so that there is no one guaranteed answer you can find to establish what they saw.

In each case, though, you can use simulations. If you specify in advance the energy and path of the neutrino, you can use a computer to predict how much light your cameras should see. If you know what particles you started with, you can run sophisticated particle physics code to see what “showers” of particles you eventually find. If you have the original black hole image, you can fuzz and pixellate and take it apart to match what your array of telescopes will do.

The problem is, for the experiments, you can’t specify any of these things, and you don’t know them in advance. And simulations, while cheaper than experiments, aren’t cheap. You can’t run a simulation for every possible input and then check them against the experiments. You need to fill in the gaps: run some simulations and then use some theory, some statistical method or human-tweaked guess, to figure out how to interpret your experiments.

Or, you can use Machine Learning. You train a machine learning model, one well-suited to the task (anything from the old standby of boosted decision trees to an old fad of normalizing flows to the latest hotness of graph neural networks). You run a bunch of simulations, as many as you can reasonably afford, and you use that data for training, making a program that matches the input data you want to find with its simulated results. This program will be less reliable than your simulations, but it will run much faster. If it’s reliable enough, you can use it instead of the old human-made guesses and tweaks. You now have an efficient, reliable way to go from your raw experiment data to the physical questions you actually care about.
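To make the recipe concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `simulate` function stands in for an expensive physics simulation, the two detector "readings" and their noise levels are invented, and a simple k-nearest-neighbour average stands in for whichever model a real analysis would use.

```python
import math
import random


def simulate(energy, rng):
    """Toy stand-in for an expensive simulation (entirely hypothetical).

    Maps a true energy to two noisy detector readings.
    """
    light = 3.0 * math.sqrt(energy) + rng.gauss(0.0, 0.1)
    charge = 0.5 * energy + rng.gauss(0.0, 0.2)
    return (light, charge)


def train(n_sims, rng):
    """Run a limited budget of simulations and keep (readings, energy) pairs."""
    data = []
    for _ in range(n_sims):
        energy = rng.uniform(1.0, 10.0)
        data.append((simulate(energy, rng), energy))
    return data


def predict(model, readings, k=10):
    """k-nearest-neighbour regression: average the energies of the
    k simulated events whose readings are closest to the measured ones."""
    nearest = sorted(model, key=lambda pair: math.dist(pair[0], readings))[:k]
    return sum(energy for _, energy in nearest) / k


rng = random.Random(0)
model = train(2000, rng)

# Stand-in for a real measurement. In an actual experiment the true
# energy would be unknown; here it is kept only to check the answer.
true_energy = 5.0
measured = simulate(true_energy, rng)
estimate = predict(model, measured)
print(f"estimated energy: {estimate:.2f} (true value {true_energy})")
```

The simulation runs "forwards", from energy to readings. The trained model runs "backwards", from readings to energy, which is the direction you need for the experiment.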

Crucially, each of the elements in this recipe is essential.

You need a simulation. If you just have an experiment with no simulation, then you don’t have a way to interpret the results, and training a machine to reproduce the experiment won’t tell you anything new.

You need an experiment. If you just have simulations, training a machine to reproduce them also doesn’t tell you anything new. You need some reason to want to predict the results of the simulations, beyond just seeing what happens in between, which the machine can’t tell you.

And you need to not have anything better than the simulation. If you have a theory where you can write out formulas for what happens, then you don’t need machine learning: you can interpret the experiments more easily without it. This also applies if you’ve carefully designed your experiment to measure something easy to interpret, like the ratio of rates of two processes that should be exactly the same.

These aren’t the only things you need. You also need to do the whole thing carefully enough that you understand your uncertainties well: not just what the machine predicts but how often it gets it wrong, and whether it’s likely to do something strange when you use it on the actual experiment. But if you can do that, you have a reliable recipe, one many people have followed successfully before. You have a good chance of making things work.
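A toy version of that kind of check might look like the sketch below. As before, the simulation is invented and a k-nearest-neighbour model is only an assumed stand-in for the real thing. The idea is to test the trained model on fresh simulations it never saw, measure how far off it tends to be, and then probe what it does on an input outside the range it was trained on.

```python
import math
import random
import statistics


def simulate(energy, rng):
    """Toy simulation (hypothetical): one noisy reading that grows with energy."""
    return 3.0 * math.sqrt(energy) + rng.gauss(0.0, 0.1)


def make_dataset(n, rng, lo=1.0, hi=10.0):
    """Simulate n events with energies drawn uniformly from [lo, hi]."""
    energies = [rng.uniform(lo, hi) for _ in range(n)]
    return [(simulate(e, rng), e) for e in energies]


def predict(model, reading, k=10):
    """k-nearest-neighbour regression on the single reading."""
    nearest = sorted(model, key=lambda pair: abs(pair[0] - reading))[:k]
    return sum(e for _, e in nearest) / k


rng = random.Random(1)
model = make_dataset(2000, rng)
held_out = make_dataset(300, rng)  # fresh simulations, never used for training

# How wrong is the model, and how often is it badly wrong?
residuals = [predict(model, reading) - energy for reading, energy in held_out]
bias = statistics.mean(residuals)
spread = statistics.pstdev(residuals)
miss_rate = sum(abs(r) > 2 * spread for r in residuals) / len(residuals)
print(f"bias {bias:+.3f}, spread {spread:.3f}, beyond 2 spreads: {miss_rate:.1%}")

# Does it do something strange outside its training range (1 to 10)?
outside = predict(model, simulate(15.0, rng))
print(f"true energy 15.0, estimate {outside:.2f}")
```

This particular model can only average energies it has already seen, so its estimate for the out-of-range event can never exceed 10. A real experiment that wandered outside the simulated range would get a confident-looking but wrong answer, which is exactly the kind of strangeness you need to rule out.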

This isn’t the only way physicists can use machine learning. There are people looking into something more akin to what’s called unsupervised learning, where you look for strange events in your data as clues for what to investigate further. And there are people like me, trying to use machine learning on the mathematical side, to guess new formulas and new heuristics. There is likely promise in many of these approaches. But for now, they aren’t a recipe.

HAMLET-Physics 2024

Back in January, I announced I was leaving France and leaving academia. Since then, it hasn’t made much sense for me to go to conferences, even the big conference of my sub-field or the conference I organized.

I did go to a conference this week, though. I had two excuses:

  1. The conference was here in Copenhagen, so no travel required.
  2. The conference was about machine learning.

HAMLET-Physics, or How to Apply Machine Learning to Experimental and Theoretical Physics, had the additional advantage of having an amusing acronym. Thanks to generous support by Carlsberg and the Danish Data Science Academy, they could back up their choice by taking everyone on a tour of Kronborg (better known in the English-speaking world as Elsinore).

This conference’s purpose was to bring together physicists who use machine learning, machine learning-ists who might have something useful to say to those physicists, and other physicists who don’t use machine learning yet but have a sneaking suspicion they might have to at some point. As a result, the conference was super-interdisciplinary, with talks by people addressing very different problems with very different methods.

Interdisciplinary conferences are tricky. It’s easy for the different groups of people to just talk past each other: everyone shows up, gives the same talk they always do, socializes with the same friends they always meet, then leaves.

There were a few talks that fit that mold, and were so technical only a few people understood. But most were better. The majority of the speakers did really well at presenting their work in a way that would be understandable and even exciting to people outside their field, while still having enough detail that we all learned something. I was particularly impressed by Thea Aarrestad’s keynote talk on Tuesday, a really engaging view of how machine learning can be used under the extremely tight time constraints LHC experiments need to decide whether to record incoming data.

For the social aspect, the organizers had a cute/gimmicky/machine-learning-themed solution. Based on short descriptions and our public research profiles, they clustered attendees, plotting the connections between them. They then used ChatGPT to write conversation prompts between any two people on the basis of their shared interests. In practice, this turned out to be amusing but totally unnecessary. We were drawn to speak to each other not by conversation prompts, but by a drive to learn from each other. “Why do you do it that way?” was a powerful conversation-starter, as was “what’s the best way to do this?” Despite the different fields, the shared methodologies gave us strong reasons to talk, and meant that people were very rarely motivated to pick one of ChatGPT’s “suggestions”.

Overall, I got a better feeling for how machine learning is useful in physics (and am planning a post on that in future). I also got some fresh ideas for what to do myself, and a bit of a picture of what the future holds in store.

At Quanta This Week, With a Piece on Vacuum Decay

I have a short piece at Quanta Magazine this week, about a physics-y end of the world as we know it called vacuum decay.

For science-minded folks who want to learn a bit more: I have a sentence in the article mentioning other uncertainties. In case you’re curious what those uncertainties are:

Gamma (\gamma) here is the decay rate; its inverse gives the time it takes for a cubic gigaparsec of space to experience vacuum decay. The three uncertainties come from experiments: they reflect the limits of our current knowledge of the Higgs mass, the top quark mass, and the strength of the strong force.

Occasionally, you see futurology-types mention “uncertainties in the exponent” to argue that some prediction (say, how long it will take till we have human-level AI) is so uncertain that estimates barely even make sense: it might be 10 years, or 1000 years. I find it fun that for vacuum decay, because of that \log_{10}, there is actually uncertainty in the exponent! Vacuum decay might happen in as few as 10^{411} years or as many as 10^{1333} years, and that’s the result of an actual, reasonable calculation!
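To get a feel for how big that range is, here is a small sketch in Python using only the two bounds quoted above. The variable names are my own. The comparison works with the exponents directly, since the numbers themselves are far too large for an ordinary floating-point value.

```python
# Bounds quoted above, as powers of ten (in years)
low_exp, high_exp = 411, 1333

# The ratio of the two bounds is 10 to the difference of the exponents
span = high_exp - low_exp
print(f"the two bounds differ by a factor of 10^{span}")

# For comparison: "10 years or 1000 years" is a factor of 10^2
everyday_span = 3 - 1
print(f"10 vs 1000 years differ by a factor of 10^{everyday_span}")

# Python integers can hold 10^1333 exactly, but a double-precision
# float cannot, which is why we compare exponents instead
try:
    float(10 ** high_exp)
    fits_in_float = True
except OverflowError:
    fits_in_float = False
print(f"10^{high_exp} fits in a float: {fits_in_float}")
```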

For physicist readers, I should mention that I got a lot out of reading some slides from a 2016 talk by Matthew Schwartz. Not many details of the calculation made it into the piece, but the slides were helpful in dispelling a few misconceptions that could have gotten into the piece. There’s an instinct to think about the situation in terms of the energy, to think about how difficult it is for quantum uncertainty to get you over the energy barrier to the next vacuum. There are methods that sort of look like that, if you squint, but that’s not really how you do the calculation, and there end up being a lot of interesting subtleties in the actual story. There were also a few numbers that it was tempting to put on the plots in the article, but that turned out to be gauge dependent!

Another thing I learned from those slides is how far you can actually take the uncertainties mentioned above. The higher-energy Higgs vacuum is pretty dang high-energy, to the point where quantum gravity effects could potentially matter. And at that point, all bets are off. The calculation, with all those nice uncertainties, is a calculation within the framework of the Standard Model. All of the things we don’t yet know about high-energy physics, especially quantum gravity, could freely mess with this. The universe as we know it could still be long-lived, but it could be a lot shorter-lived as well. That in turn makes this calculation a lot more of a practice-ground to hone techniques, rather than an actual estimate you can rely on.