The Machine Learning for Physics Recipe

Last week, I went to a conference on machine learning for physics. Machine learning covers a huge variety of methods and ideas, several of which were on full display. But again and again, I noticed a pattern. The people who seemed to be making the best use of machine learning, the ones who were the most confident in their conclusions and getting the most impressive results, the ones who felt like they had a whole assembly line instead of just a prototype, all of them were doing essentially the same thing.

This post is about that thing. If you want to do machine learning in physics, these are the situations where you’re most likely to see a benefit. You can do other things, and they may work too. But this recipe seems to work over and over again.

First, you need simulations, and you need an experiment.

Your experiment gives you data, and that data isn’t easy to interpret. Maybe you’ve embedded a bunch of cameras in the antarctic ice, and your data tells you when they trigger and how bright the light is. Maybe you’ve surrounded a particle collision with layers silicon, and your data tells you how much electric charge the different layers absorb. Maybe you’ve got an array of telescopes focused on a black hole far far away, and your data are pixels gathered from each telescope.

You want to infer, from your data, what happened physically. Your cameras in the ice saw signs of a neutrino, you want to know how much energy it had and where it was coming from. Your silicon is absorbing particles, what kind are they and what processes did they come from? The black hole might have the rings predicted by general relativity, but it might have weirder rings from a variant theory.

In each case, you can’t just calculate the answer you need. The neutrino streams past, interacting with the ice and camera positions in unpredictable ways. People can write down clean approximations for particles in the highest-energy part of a collision, but once they start cooling down the process becomes so messy that no straightforward formula describes them. Your array of telescopes fuzz and pixellate and have to be assembled together in a complicated way, so that there is no one guaranteed answer you can find to establish what they saw.

In each case, though, you can use simulations. If you specify in advance the energy and path of the neutrino, you can use a computer to predict how much light your cameras should see. If you know what particles you started with, you can run sophisticated particle physics code to see what “showers” of particles you eventually find. If you have the original black hole image, you can fuzz and pixellate and take it apart to match what your array of telescopes will do.

The problem is, for the experiments, you can’t anticipate, and you don’t know in advance. And simulations, while cheaper than experiments, aren’t cheap. You can’t run a simulation for every possible input and then check them against the experiments. You need to fill in the gaps, run some simulations and then use some theory, some statistical method or human-tweaked guess, to figure out how to interpret your experiments.

Or, you can use Machine Learning. You train a machine learning model, one well-suited the task (anything from the old standby of boosted decision trees to an old fad of normalizing flows to the latest hotness of graph neural networks). You run a bunch of simulations, as many as you can reasonably afford, and you use that data for training, making a program that matches the input data you want to find with its simulated results. This program will be less reliable than your simulations, but it will run much faster. If it’s reliable enough, you can use it instead of the old human-made guesses and tweaks. You now have an efficient, reliable way to go from your raw experiment data to the physical questions you actually care about.

Crucially, each of the elements in this recipe is essential.

You need a simulation. If you just have an experiment with no simulation, then you don’t have a way to interpret the results, and training a machine to reproduce the experiment won’t tell you anything new.

You need an experiment. If you just have simulations, training a machine to reproduce them also doesn’t tell you anything new. You need some reason to want to predict the results of the simulations, beyond just seeing what happens in between which the machine can’t tell you.

And you need to not have anything better than the simulation. If you have a theory where you can write out formulas for what happens then you don’t need machine learning, you can interpret the experiments more easily without it. This applies if you’ve carefully designed your experiment to measure something easy to interpret, like the ratio of rates of two processes that should be exactly the same.

These aren’t the only things you need. You also need to do the whole thing carefully enough that you understand well your uncertainties, not just what the machine predicts but how often it gets it wrong, and whether it’s likely to do something strange when you use it on the actual experiment. But if you can do that, you have a reliable recipe, one many people have followed successfully before. You have a good chance of making things work.

This isn’t the only way physicists can use machine learning. There are people looking into something more akin to what’s called unsupervised learning, where you look for strange events in your data as clues for what to investigate further. And there are people like me, trying to use machine learning on the mathematical side, to guess new formulas and new heuristics. There is likely promise in many of these approaches. But for now, they aren’t a recipe.

Leave a comment! If it's your first time, it will go into moderation.