Tag Archives: theoretical physics

Bonus Material for “How Hans Bethe Stumbled Upon Perfect Quantum Theories”

I had an article last week in Quanta Magazine. It’s a piece about something called the Bethe ansatz, a method in mathematical physics that was discovered by Hans Bethe in the 1930’s, but which only really started being understood and appreciated around the 1960’s. Since then it’s become a key tool, used in theoretical investigations in areas from condensed matter to quantum gravity. In this post, I thought I’d say a bit about the story behind the piece and give some bonus material that didn’t fit.

When I first decided to do the piece I reached out to Jules Lamers. We were briefly office-mates when I worked in France, where he was giving a short course on the Bethe ansatz and the methods that sprung from it. It turned out he had also been thinking about writing a piece on the subject, and we considered co-writing for a bit, but that didn’t work for Quanta. He helped me a huge amount with understanding the history of the subject and tracking down the right sources. If you’re a physicist who wants to learn about these things, I recommend his lecture notes. And if you’re a non-physicist who wants to know more, I hope he gets a chance to write a longer popular-audience piece on the topic!

If you clicked through to Jules's lecture notes, you'd notice that “Bethe ansatz” doesn't appear in the title. Instead, you'd see the phrase “quantum integrability”. In classical physics, an “integrable” system is one where you can calculate what will happen by doing integrals, essentially letting you “solve” the system completely. Systems you can describe with the Bethe ansatz are solvable in a more complicated quantum sense, so they get called “quantum integrable”. There's a whole research field that studies these quantum integrable systems.

My piece ended up rushing through the history of the field. After talking about Bethe's original discovery, I jumped ahead to ice. The Bethe ansatz was first used to think about ice in the 1960's, but the developments I mentioned leading up to it, where experimenters noticed extra variability and theorists explained it with the positions of hydrogen atoms, happened earlier, in the 1930's. (Thanks to the commenter who pointed out that this was confusing!) Baxter gets a starring role in this section and was important in tying things together, but other people (Lieb and Sutherland) were involved earlier, showing that the Bethe ansatz could indeed be used for thin sheets of ice. This era had a bunch of other big names that I didn't have space to talk about: C. N. Yang makes an appearance, and while Faddeev comes up later, I didn't mention that he had a starring role in the 1970's in understanding the connection to classical integrability and proposing a mathematical structure to explain what links all these different integrable theories together.

I vaguely gestured at black holes and quantum gravity, but didn’t have space for more than that. The connection there is to a topic you might have heard of before if you’ve read about string theory, called AdS/CFT, a connection between two kinds of world that are secretly the same: a toy model of gravity called Anti-de Sitter space (AdS) and a theory without gravity that looks the same at any scale (called a Conformal Field Theory, or CFT). It turns out that in the most prominent example of this, the theory without gravity is integrable! In fact, it’s a theory I spent a lot of time working with back in my research days, called N=4 super Yang-Mills. This theory is kind of like QCD, and in some sense it has integrability for similar reasons to those that Feynman hoped for and Korchemsky and Faddeev found. But it actually goes much farther, outside of the high-energy approximation where Korchemsky and Faddeev’s result works, and in principle seems to include everything you might want to know about the theory. Nowadays, people are using it to investigate the toy model of quantum gravity, hoping to get insights about quantum gravity in general.

One thing I didn’t get a chance to mention at all is the connection to quantum computing. People are trying to build a quantum computer with carefully-cooled atoms. It’s important to test whether the quantum computer functions well enough, or if the quantum states aren’t as perfect as they need to be. One way people have been testing this is with the Bethe ansatz: because it lets you calculate the behavior of special systems perfectly, you can set up your quantum computer to model a Bethe ansatz, and then check how close to the prediction your results are. You know that the theoretical result is complete, so any failure has to be due to an imperfection in your experiment.

I gave a quick teaser to a very active field, one that has fascinated a lot of prominent physicists and been applied in a wide variety of areas. I hope I’ve inspired you to learn more!

Integration by Parts, Evolved

I posted what may be my last academic paper today, about a project I’ve been working on with Matthias Wilhelm for most of the last year. The paper is now online here. For me, the project has been a chance to broaden my horizons, learn new skills, and start to step out of my academic comfort zone. For Matthias, I hope it was grant money well spent.

I wanted to work on something related to machine learning, for the usual trendy employability reasons. Matthias was already working with machine learning, but was interested in pursuing a different question.

When is machine learning worthwhile? Machine learning methods are heuristics, unreliable methods that sometimes work well. You don’t use a heuristic if you have a reliable method that runs fast enough. But if all you have are heuristics to begin with, then machine learning can give you a better heuristic.

Matthias noticed a heuristic embedded deep in how we do particle physics, and guessed that we could do better. In particle physics, we use pictures called Feynman diagrams to predict the probabilities for different outcomes of collisions, comparing those predictions to observation to look for evidence of new physics. Each Feynman diagram corresponds to an integral, and for each calculation there are hundreds, thousands, or even millions of those integrals to do.

Luckily, physicists don’t actually have to do all those integrals. It turns out that most of them are related, by a slightly more advanced version of that calculus class mainstay, integration by parts. Using integration by parts you can solve a list of equations, finding out how to write your integrals in terms of a much smaller list.
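To make that concrete, here is the textbook one-loop example (standard material, not taken from our paper): an integration-by-parts identity relating “bubble” integrals with different powers of their propagators.

```latex
% One-loop bubble integrals in D dimensions, with propagator powers a and b:
I(a,b) \;=\; \int \! d^D k \, \frac{1}{(k^2)^a \, \big((k+p)^2\big)^b}
% In dimensional regularization the integral of a total derivative vanishes:
0 \;=\; \int \! d^D k \; \frac{\partial}{\partial k^\mu}
        \left( \frac{k^\mu}{(k^2)^a \, \big((k+p)^2\big)^b} \right)
% Working out the derivative gives a relation between different integrals:
(D - 2a - b)\, I(a,b) \;-\; b\, I(a-1,\,b+1) \;+\; b\, p^2\, I(a,\,b+1) \;=\; 0
% Relations like this let you trade complicated integrals for a small set of
% "master" integrals; real calculations involve huge systems of such equations.
```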

How big a list of equations do you need, and which ones? Twenty-five years ago, Stefano Laporta proposed a “golden rule” to choose, based on his own experience, and people have been using it (more or less, with their own tweaks) since then.

Laporta’s rule is a heuristic, with no proof that it is the best option, or even that it will always work. So we probably shouldn’t have been surprised when someone came up with a better heuristic. Watching talks at a December 2023 conference, Matthias saw a presentation by Johann Usovitsch on a curious new rule. The rule was surprisingly simple, just one extra condition on top of Laporta’s. But it was enough to reduce the number of equations by a factor of twenty.

That’s great progress, but it’s also a bit frustrating. Over almost twenty-five years, no-one had guessed this one simple change?

Maybe, thought Matthias and I, we need to get better at guessing.

We started out thinking we’d try reinforcement learning, a technique where a machine is trained by playing a game again and again, changing its strategy when that strategy brings it a reward. We thought we could have the machine learn to cut away extra equations, getting rewarded if it could cut more while still getting the right answer. We didn’t end up pursuing this very far before realizing another strategy would be a better fit.

What is a rule, but a program? Laporta’s golden rule and Johann’s new rule could both be expressed as simple programs. So we decided to use a method that could guess programs.

One method stood out for sheer trendiness and audacity: FunSearch. FunSearch is a type of algorithm called a genetic algorithm, which tries to mimic evolution. It makes a population of different programs, “breeds” them with each other to create new programs, and periodically selects out the ones that perform best. That's not the trendy or audacious part, though: people have been doing that sort of genetic programming for a long time.

The trendy, audacious part is that FunSearch generates these programs with a Large Language Model, or LLM (the type of technology behind ChatGPT). Using an LLM trained to complete code, FunSearch presents the model with two programs labeled v0 and v1 and asks it to complete v2. In general, program v2 will have some traits from v0 and v1, but also a lot of variation due to the unpredictable output of LLMs. The inventors of FunSearch used that variation as the raw material for evolution, evolving programs that found better solutions to math problems.
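As a rough illustration (the function names here are hypothetical, not FunSearch's actual code), the “breeding” step looks something like this: show the model two parent programs as versions 0 and 1, and ask it to write version 2.

```python
# Minimal sketch of FunSearch-style crossover via an LLM. The names
# build_prompt, propose_child, and llm_complete are illustrative only.

def build_prompt(parent_a: str, parent_b: str) -> str:
    """Show two parent heuristics as v0 and v1, leaving v2 for the model to write."""
    header = '"""Heuristics for scoring candidate equations."""\n\n'
    return (
        header
        + "def priority_v0(equation):\n" + parent_a + "\n\n"
        + "def priority_v1(equation):\n" + parent_b + "\n\n"
        + "def priority_v2(equation):\n"
        + '    """Improved version of priority_v0 and priority_v1."""\n'
    )

def propose_child(llm_complete, parent_a: str, parent_b: str) -> str:
    """Ask a code-completion model for the body of priority_v2.

    `llm_complete` stands in for whatever completion API is available; its
    unpredictable output supplies the variation needed for evolution.
    """
    return llm_complete(build_prompt(parent_a, parent_b))
```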

We decided to try FunSearch on our problem, modifying it a bit to fit the case. We asked it to find a shorter list of equations, giving a better score for a shorter list but a penalty if the list wasn’t able to solve the problem fully.
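The scoring we used was conceptually simple. A sketch of the idea (illustrative, not our exact code): reward shorter lists, but heavily penalize any list that can't actually reduce all the target integrals.

```python
# Illustrative fitness function for evolved equation-selection rules.
# `reduces_all_targets` is a stand-in for an IBP-reduction backend that reports
# whether the chosen equations suffice to solve for the target integrals.

def score(equations, targets, reduces_all_targets, penalty=10**6):
    """Higher is better: the shortest list that still solves the whole system wins."""
    if not reduces_all_targets(equations, targets):
        return -penalty          # unusable lists are heavily penalized
    return -len(equations)       # otherwise, fewer equations is better
```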

Some tinkering and headaches later, it worked! After a few days and thousands of program guesses, FunSearch was able to find a program that reproduced the new rule Johann had presented. A few hours more, and it even found a rule that was slightly better!

But then we started wondering: do we actually need days of GPU time to do this?

An expert on heuristics we knew had insisted, at the beginning, that we try something simpler. The approach we tried then didn’t work. But after running into some people using genetic programming at a conference last year, we decided to try again, using a Python package they used in their work. This time, it worked like a charm, taking hours rather than days to find good rules.

This was all pretty cool, a great opportunity for me to cut my teeth on Python programming and its various attendant skills. And it's been inspiring, with Matthias drawing together more people interested in seeing just how much these kinds of heuristic methods can do. I should be clear, though, that so far I don't think our result is useful. We did better than the state of the art on an example, but only slightly, and in a way that I'd guess doesn't generalize. And we needed quite a bit of overhead to do it. Ultimately, while I suspect there's something useful to find in this direction, it's going to require more collaboration, both with people using the existing methods who know better what the bottlenecks are, and with experts in these, and other, kinds of heuristics.

So I’m curious to see what the future holds. And for the moment, happy that I got to try this out!

How Small Scales Can Matter for Large Scales

For a certain type of physicist, nothing matters more than finding the ultimate laws of nature for its tiniest building-blocks, the rules that govern quantum gravity and tell us where the other laws of physics come from. But because they know very little about those laws at this point, they can predict almost nothing about observations on the larger distance scales we can actually measure.

“Almost nothing” isn’t nothing, though. Theoretical physicists don’t know nature’s ultimate laws. But some things about them can be reasonably guessed. The ultimate laws should include a theory of quantum gravity. They should explain at least some of what we see in particle physics now, explaining why different particles have different masses in terms of a simpler theory. And they should “make sense”, respecting cause and effect, the laws of probability, and Einstein’s overall picture of space and time.

All of these are assumptions, of course. Further assumptions are needed to derive any testable consequences from them. But a few communities in theoretical physics are willing to take the plunge, and see what consequences their assumptions have.

First, there’s the Swampland. String theorists posit that the world has extra dimensions, which can be curled up in a variety of ways to hide from view, with different observable consequences depending on how the dimensions are curled up. This list of different observable consequences is referred to as the Landscape of possibilities. Based on that, some string theorists coined the term “Swampland” to represent an area outside the Landscape, containing observations that are incompatible with quantum gravity altogether, and tried to figure out what those observations would be.

In principle, the Swampland includes the work of all the other communities on this list, since a theory of quantum gravity ought to be consistent with other principles as well. In practice, people who use the term focus on consequences of gravity in particular. The earliest such ideas argued from thought experiments with black holes, finding results that seemed to demand that gravity be the weakest force for at least one type of particle. Later researchers would more frequently use string theory as an example, looking at what kinds of constructions people had been able to make in the Landscape to guess what might lie outside of it. They’ve used this to argue that dark energy might be temporary, and to try to figure out what traits new particles might have.

Second, I should mention naturalness. When talking about naturalness, people often use the analogy of a pen balanced on its tip. While possible in principle, it must have been set up almost perfectly, since any small imbalance would cause it to topple, and that perfection demands an explanation. Similarly, in particle physics, things like the mass of the Higgs boson and the strength of dark energy seem to be carefully balanced, so that a small change in how they were set up would lead to a much heavier Higgs boson or much stronger dark energy. The need for an explanation for the Higgs’ careful balance is why many physicists expected the Large Hadron Collider to discover additional new particles.
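To make the “careful balance” concrete, here is the standard schematic estimate (textbook material, not something from this post): the measured Higgs mass is the sum of an input parameter and quantum corrections that grow with the scale of new physics.

```latex
% Schematic form of the hierarchy problem: m_0 is the "bare" input,
% \Lambda the scale up to which the corrections run, c a number of order one.
m_h^2 \;\simeq\; m_0^2 \;+\; \frac{c}{16\pi^2}\,\Lambda^2 \;+\; \cdots
% If \Lambda is near the Planck mass, the two terms on the right must cancel
% to roughly thirty decimal places to leave the observed m_h \approx 125\,\text{GeV}:
% the pen balanced on its tip.
```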

As I’ve argued before, this kind of argument rests on assumptions about the fundamental laws of physics. It assumes that the fundamental laws explain the mass of the Higgs, not merely by giving it an arbitrary number but by showing how that number comes from a non-arbitrary physical process. It also assumes that we understand well how physical processes like that work, and what kinds of numbers they can give. That’s why I think of naturalness as a type of argument, much like the Swampland, that uses the smallest scales to constrain larger ones.

Third is a host of constraints that usually go together: causality, unitarity, and positivity. Causality comes from cause and effect in a relativistic universe. Because two distant events can appear to happen in different orders depending on how fast you’re going, any way to send signals faster than light is also a way to send signals back in time, causing all of the paradoxes familiar from science fiction. Unitarity comes from quantum mechanics. If quantum calculations are supposed to give the probability of things happening, those probabilities should make sense as probabilities: for example, they should never go above one.

You might guess that almost any theory would satisfy these constraints. But if you extend a theory to the smallest scales, some theories that otherwise seem sensible end up failing this test. Actually linking things up takes other conjectures about the mathematical form theories can have, conjectures that seem more solid than the ones underlying Swampland and naturalness constraints but that still can’t be conclusively proven. If you trust the conjectures, you can derive restrictions, often called positivity constraints when they demand that some set of observations is positive. There has been a renaissance in this kind of research over the last few years, including arguments that certain speculative theories of gravity can’t actually work.
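A classic example of this kind of constraint (standard in the literature, not specific to this post): for a single scalar field with a leading higher-derivative correction, causality and unitarity of whatever completes the theory at small scales force the correction's coefficient to be positive.

```latex
% Low-energy theory of a scalar \phi with a higher-derivative correction,
% suppressed by a heavy scale \Lambda:
\mathcal{L} \;=\; \tfrac{1}{2}\,(\partial\phi)^2 \;+\; \frac{c}{\Lambda^4}\,\big[(\partial\phi)^2\big]^2 \;+\; \cdots
% A dispersion relation for forward two-to-two scattering, valid if the full
% theory respects causality and unitarity, implies the positivity constraint
c \;>\; 0 .
```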

Which String Theorists Are You Complaining About?

Do string theorists have an unfair advantage? Do they have an easier time getting hired, for example?

In one of the perennial arguments about this on Twitter, Martin Bauer posted a bar chart of faculty hires in the US by sub-field. The chart was compiled by Erich Poppitz from data in the US particle physics rumor mill, a website where people post information about who gets hired where for the US’s quite small number of permanent theoretical particle physics positions at research universities and national labs. The data covers 1994 to 2017, and shows one year, 1999, when there were more string theorists hired than all other topics put together. The years around then also had many string theorists hired, but the proportion starts falling around the mid 2000’s…around when Lee Smolin wrote a book, The Trouble With Physics, arguing that string theorists had strong-armed their way into academic dominance. After that, the percentage of string theorists falls, oscillating between a tenth and a quarter of total hires.

Judging from that, you get the feeling that string theory’s critics are treating a temporary hiring fad as if it was a permanent fact. The late 1990’s were a time of high-profile developments in string theory that excited a lot of people. Later, other hiring fads dominated, often driven by experiments: I remember when the US decided to prioritize neutrino experiments and neutrino theorists had a much easier time getting hired, and there seem to be similar pushes now with gravitational waves, quantum computing, and AI.

Thinking about the situation in this way, though, ignores what many of the critics have in mind. That’s because the “string” column on that bar chart is not necessarily what people think of when they think of string theory.

If you look at the categories on Poppitz's bar chart, you'll notice something odd. “String” is itself a category. Another category, “lattice”, refers to lattice QCD, a method to find the dynamics of quarks numerically. The third category, though, is a combination of three things: “ph/th/cosm”.

“Cosm” here refers to cosmology, another sub-field. “Ph” and “th” though aren’t really sub-fields. Instead, they’re arXiv categories, sections of the website arXiv.org where physicists post papers before they submit them to journals. The “ph” category is used for phenomenology, the type of theoretical physics where people try to propose models of the real world and make testable predictions. The “th” category is for “formal theory”, papers where theoretical physicists study the kinds of theories they use in more generality and develop new calculation methods, with insights that over time filter into “ph” work.

“String”, on the other hand, is not an arXiv category. When string theorists write papers, they’ll put them into “th” or “ph” or another relevant category (for example “gr-qc”, for general relativity and quantum cosmology). This means that when Poppitz distinguishes “ph/th/cosm” from “string”, he’s being subjective, using his own judgement to decide who counts as a string theorist.

So who counts as a string theorist? The simplest thing to do would be to check if their work uses strings. Failing that, they could use other tools of string theory and its close relatives, like Calabi-Yau manifolds, M-branes, and holography.

That might be what Poppitz was doing, but if he was, he was probably missing a lot of the people critics of string theory complain about. He even misses many people who describe themselves as string theorists. In an old post of mine I go through the talks at Strings, string theory’s big yearly conference, giving them finer-grained categories. The majority don’t use anything uniquely stringy.

Instead, I think critics of string theory have two kinds of things in mind.

First, most of the people who made their reputations on string theory are still in academia, and still widely respected. Some of them still work on string theory topics, but many now work on other things. Because they’re still widely respected, their interests have a substantial influence on the field. When one of them starts looking at connections between theories of two-dimensional materials, you get a whole afternoon of talks at Strings about theories of two-dimensional materials. Working on those topics probably makes it a bit easier to get a job, but also, many of the people working on them are students of these highly respected people, who just because of that have an easier time getting a job. If you’re a critic of string theory who thinks the founders of the field led physics astray, then you probably think they’re still leading physics astray even if they aren’t currently working on string theory.

Second, for many other people in physics, string theorists are their colleagues and friends. They'll make fun of trends that seem overhyped and under-thought, like research on the black hole information paradox or the swampland, or hopes that a slightly tweaked version of supersymmetry will show up soon at the LHC. But they'll happily use ideas developed in string theory when they prove handy, using supersymmetric theories to test new calculation techniques, string theory's extra dimensions to inspire and ground new ideas for dark matter, or the math of strings themselves as interesting shortcuts to particle physics calculations. String theory is available as a reference to these people in a way that other quantum gravity proposals aren't. That's partly due to familiarity and shared language (I remember a talk at Perimeter where string theorists wanted to learn from practitioners from another area and the discussion got bogged down by how they were using the word “dimension”), but partly due to skepticism of the various alternate approaches. Most people have some idea in their heads of deep problems with various proposals: screwing up relativity, making nonsense out of quantum mechanics, or over-interpreting limited evidence. The most commonly believed criticisms are usually wrong, with objections long known to practitioners of the alternate approaches, and so those people tend to think they're being treated unfairly. But the wrong criticisms are often simplified versions of correct criticisms, passed down by the few people who dig deeply into these topics, criticisms that the alternative approaches don't have good answers to.

The end result is that while string theory itself isn’t dominant, a sort of “string friendliness” is. Most of the jobs aren’t going to string theorists in the literal sense. But the academic world string theorists created keeps turning. People still respect string theorists and the research directions they find interesting, and people are still happy to collaborate and discuss with string theorists. For research communities people are more skeptical of, it must feel very isolating, like the world is still being run by their opponents. But this isn’t the kind of hegemony that can be solved by a revolution. Thinking that string theory is a failed research program, and people focused on it should have a harder time getting hired, is one thing. Thinking that everyone who respects at least one former string theorist should have a harder time getting hired is a very different goal. And if what you’re complaining about is “string friendliness”, not actual string theorists, then that’s what you’re asking for.

The Nowhere String

Space and time seem as fundamental as anything can get. Philosophers like Immanuel Kant thought that they were inescapable, that we could not conceive of the world without space and time. But increasingly, physicists suspect that space and time are not as fundamental as they appear. When they try to construct a theory of quantum gravity, physicists find puzzles, paradoxes that suggest that space and time may just be approximations to a more fundamental underlying reality.

One piece of evidence that quantum gravity researchers point to is duality. Dualities are pairs of theories that seem to describe different situations, including with different numbers of dimensions, but that are secretly indistinguishable, connected by a “dictionary” that lets you interpret any observation in one world in terms of an equivalent observation in the other world. By itself, duality doesn't mean that space and time aren't fundamental: as I explained in a blog post a few years ago, it could still be that one “side” of the duality is a true description of space and time, and the other is just a mathematical illusion. To show definitively that space and time are not fundamental, you would want to find a situation where they “break down”, where you can go from a theory that has space and time to a theory that doesn't. Ideally, you'd want a physical means of going between them: some kind of quantum field that, as it shifts, changes the world between space-time and not space-time.

What I didn’t know when I wrote that post was that physicists already knew about such a situation in 1993.

Back when I was in pre-school, famous string theorist Edward Witten was trying to understand something that others had described as a duality, and realized there was something more going on.

In string theory, particles are described by lengths of vibrating string. In practice, string theorists like to think about what it’s like to live on the string itself, seeing it vibrate. In that world, there are two dimensions, one space dimension back and forth along the string and one time dimension going into the future. To describe the vibrations of the string in that world, string theorists use the same kind of theory that people use to describe physics in our world: a quantum field theory. In string theory, you have a two-dimensional quantum field theory stuck “inside” a theory with more dimensions describing our world. You see that this world exists by seeing the kinds of vibrations your two-dimensional world can have, through a type of quantum field called a scalar field. With ten scalar fields, ten different ways you can push energy into your stringy world, you can infer that the world around you is a space-time with ten dimensions.
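For the curious, the bosonic part of this two-dimensional theory can be written very compactly (a standard formula, quoted here in conformal gauge for a flat ten-dimensional space-time, not anything special to this post):

```latex
% Ten scalar fields X^\mu(\sigma,\tau), \mu = 0,\dots,9, live on the two-dimensional
% worldsheet; their values are the string's position in ten-dimensional space-time.
S \;=\; \frac{1}{4\pi\alpha'} \int d^2\sigma \;\; \partial_a X^\mu \, \partial^a X_\mu
% \alpha' sets the string's tension; \sigma runs along the string, \tau is its time.
```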

String theory has “extra” dimensions beyond the three of space and one of time we’re used to, and these extra dimensions can be curled up in various ways to hide them from view, often using a type of shape called a Calabi-Yau manifold. In the late 80’s and early 90’s, string theorists had found a similarity between the two-dimensional quantum field theories you get folding string theory around some of these Calabi-Yau manifolds and another type of two-dimensional quantum field theory related to theories used to describe superconductors. People called the two types of theories dual, but Witten figured out there was something more going on.

Witten described the two types of theories in the same framework, and showed that they weren’t two equivalent descriptions of the same world. Rather, they were two different ways one theory could behave.

The two behaviors were connected by something physical: the value of a quantum field called a modulus field. This field can be described by a number, and that number can be positive or negative.

When the modulus field is a large positive number, then the theory behaves like string theory twisted around a Calabi-Yau manifold. In particular, the scalar fields have many different values they can take, values that are smoothly related to each other. These values are nothing more or less than the position of the string in space and time. Because the scalars can take many values, the string can sit in many different places, and because the values are smoothly related to each other, the string can smoothly move from one place to another.

When the modulus field is a large negative number, then the theory is very different. What people thought of as the other side of the duality, a theory like the theories used to describe superconductors, is the theory that describes what happens when the modulus field is large and negative. In this theory, the scalars can no longer take many values. Instead, they have one option, one stable solution. That means that instead of there being many different places the string could sit, describing space, there are no different places, and thus no space. The string lives nowhere.
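For readers who want a peek at the machinery: in Witten's setup (his gauged linear sigma model; what follows is my simplified paraphrase of the standard construction), the modulus is a parameter r, and the scalar fields have to satisfy a constraint roughly like this:

```latex
% Scalars \phi_1,\dots,\phi_n of charge +1 and one scalar p of charge -n,
% constrained (up to gauge transformations) by the D-term condition:
|\phi_1|^2 + \cdots + |\phi_n|^2 \;-\; n\,|p|^2 \;=\; r
% r \gg 0: the \phi_i can take many values, sweeping out a Calabi-Yau shape,
%          a "geometric" phase where the string has somewhere to sit.
% r \ll 0: p is forced to be nonzero and the \phi_i settle into a single
%          solution, a Landau-Ginzburg-like phase with no space at all.
```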

These are two very different situations, one with space and one without. And they’re connected by something physical. You could imagine manipulating the modulus field, using other fields to funnel energy into it, pushing it back and forth from a world with space to a world of nowhere. Much more than the examples I was aware of, this is a super-clear example of a model where space is not fundamental, but where it can be manipulated, existing or not existing based on physical changes.

We don’t know whether a model like this describes the real world. But it’s gratifying to know that it can be written down, that there is a picture, in full mathematical detail, of how this kind of thing works. Hopefully, it makes the idea that space and time are not fundamental sound a bit more reasonable.

Replacing Space-Time With the Space in Your Eyes

Nima Arkani-Hamed thinks space-time is doomed.

That doesn’t mean he thinks it’s about to be destroyed by a supervillain. Rather, Nima, like many physicists, thinks that space and time are just approximations to a deeper reality. In order to make sense of gravity in a quantum world, seemingly fundamental ideas, like that particles move through particular places at particular times, will probably need to become more flexible.

But while most people who think space-time is doomed research quantum gravity, Nima’s path is different. Nima has been studying scattering amplitudes, formulas used by particle physicists to predict how likely particles are to collide in particular ways. He has been trying to find ways to calculate these scattering amplitudes without referring directly to particles traveling through space and time. In the long run, the hope is that knowing how to do these calculations will help suggest new theories beyond particle physics, theories that can’t be described with space and time at all.

Ten years ago, Nima figured out how to do this in a particular theory, one that doesn't describe the real world. For that theory he was able to find a new picture of how to calculate scattering amplitudes based on a combinatorial, geometric space with no reference to particles traveling through space-time. He gave this space the catchy name “the amplituhedron”. In the years since, he found a few other “hedra” describing different theories.

Now, he’s got a new approach. The new approach doesn’t have the same kind of catchy name: people sometimes call it surfaceology, or curve integral formalism. Like the amplituhedron, it involves concepts from combinatorics and geometry. It isn’t quite as “pure” as the amplituhedron: it uses a bit more from ordinary particle physics, and while it avoids specific paths in space-time it does care about the shape of those paths. Still, it has one big advantage: unlike the amplituhedron, Nima’s new approach looks like it can work for at least a few of the theories that actually describe the real world.

The amplituhedron was mysterious. Instead of space and time, it described the world in terms of a geometric space whose meaning was unclear. Nima’s new approach also describes the world in terms of a geometric space, but this space’s meaning is a lot more clear.

The space is called “kinematic space”. That probably still sounds mysterious. “Kinematic” in physics refers to motion. In the beginning of a physics class when you study velocity and acceleration before you’ve introduced a single force, you’re studying kinematics. In particle physics, kinematic refers to the motion of the particles you detect. If you see an electron going up and to the right at a tenth the speed of light, those are its kinematics.

Kinematic space, then, is the space of observations. By saying that his approach is based on ideas in kinematic space, what Nima is saying is that it describes colliding particles not based on what they might be doing before they’re detected, but on mathematics that asks questions only about facts about the particles that can be observed.
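Concretely, kinematic space is coordinatized by quantities built directly from the detected momenta, such as the standard Mandelstam invariants (generic notation, not anything specific to Nima's papers):

```latex
% For n massless particles with measured momenta p_1,\dots,p_n:
s_{ij} \;=\; (p_i + p_j)^2, \qquad p_i^2 = 0, \qquad p_1 + \cdots + p_n = 0
% The invariants s_{ij} package the energies and angles of the particles you
% detect; a scattering amplitude is just a function on this space.
```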

(For the experts: this isn’t quite true, because he still needs a concept of loop momenta. He’s getting the actual integrands from his approach, rather than the dual definition he got from the amplituhedron. But he does still have to integrate one way or another.)

Quantum mechanics famously has many interpretations. In my experience, Nima’s favorite interpretation is the one known as “shut up and calculate”. Instead of arguing about the nature of an indeterminately philosophical “real world”, Nima thinks quantum physics is a tool to calculate things people can observe in experiments, and that’s the part we should care about.

From a practical perspective, I agree with him. And I think if you have this perspective, then ultimately, kinematic space is where your theories have to live. Kinematic space is nothing more or less than the space of observations, the space defined by where things land in your detectors, or if you’re a human and not a collider, in your eyes. If you want to strip away all the speculation about the nature of reality, this is all that is left over. Any theory, of any reality, will have to be described in this way. So if you think reality might need a totally new weird theory, it makes sense to approach things like Nima does, and start with the one thing that will always remain: observations.

I Ain’t Afraid of No-Ghost Theorems

In honor of Halloween this week, let me say a bit about the spookiest term in physics: ghosts.

In particle physics, we talk about the universe in terms of quantum fields. There is an electron field for electrons, a gluon field for gluons, a Higgs field for Higgs bosons. The simplest fields, for the simplest particles, can be described in terms of just a single number at each point in space and time, a value describing how strong the field is. More complicated fields require more numbers.

Most of the fundamental forces have what we call vector fields. They’re called this because they are often described with vectors, lists of numbers that identify a direction in space and time. But these vectors actually contain too many numbers.

These extra numbers have to be tidied up in some way in order to describe vector fields in the real world, like the electromagnetic field or the gluon field of the strong nuclear force. There are a number of tricks, but the nicest is usually to add some extra particles called ghosts. Ghosts are designed to cancel out the extra numbers in a vector, leaving the right description for a vector field. They're set up mathematically such that they can never be observed: they're just a mathematical trick.
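For those who like formulas, here is the standard shape of the trick in a covariant gauge (the textbook Faddeev-Popov form, nothing special to this post): the ghost fields appear in the Lagrangian alongside the vector field and the gauge-fixing term.

```latex
% Gauge-fixed Yang-Mills with Faddeev-Popov ghosts c^a and antighosts \bar{c}^a:
\mathcal{L} \;=\; -\tfrac{1}{4}\, F^a_{\mu\nu} F^{a\,\mu\nu}
\;-\; \frac{1}{2\xi}\,\big(\partial^\mu A^a_\mu\big)^2
\;+\; \bar{c}^a \,\big(-\partial^\mu D_\mu^{ab}\big)\, c^b
% The ghost term cancels the contributions of the vector's unphysical
% components; the ghosts themselves never show up as particles you can detect.
```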

Mathematical tricks aren’t all that spooky (unless you’re scared of mathematics itself, anyway). But in physics, ghosts can take on a spookier role as well.

In order to do their job cancelling those numbers, ghosts need to function as a kind of opposite to a normal particle, a sort of undead particle. Normal particles have kinetic energy: as they go faster and faster, they have more and more energy. Said another way, it takes more and more energy to make them go faster. Ghosts have negative kinetic energy: the faster they go, the less energy they have.
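In equations, the difference is just a sign (schematically):

```latex
% An ordinary scalar has positive kinetic energy:
\mathcal{L}_{\text{normal}} \;=\; +\tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi \;-\; \tfrac{1}{2}\,m^2\phi^2
% A ghost has a wrong-sign kinetic term, so going faster lowers its energy:
\mathcal{L}_{\text{ghost}} \;=\; -\tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi \;-\; \tfrac{1}{2}\,m^2\phi^2
```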

If ghosts are just a mathematical trick, that’s fine, they’ll do their job and cancel out what they’re supposed to. But sometimes, physicists accidentally write down a theory where the ghosts aren’t just a trick cancelling something out, but real particles you could detect, without anything to hide them away.

In a theory where ghosts really exist, the universe stops making sense. The universe defaults to the lowest energy it can reach. If making a ghost particle go faster reduces its energy, then the universe will make ghost particles go faster and faster, and make more and more ghost particles, until everything is jam-packed with super-speedy ghosts unto infinity, never-ending because it’s always possible to reduce the energy by adding more ghosts.

The absence of ghosts, then, is a requirement for a sensible theory. People prove theorems showing that their new ideas don’t create ghosts. And if your theory does start seeing ghosts…well, that’s the spookiest omen of all: an omen that your theory is wrong.

Congratulations to John Hopfield and Geoffrey Hinton!

The 2024 Physics Nobel Prize was announced this week, awarded to John Hopfield and Geoffrey Hinton for using physics to propose foundational ideas in the artificial neural networks used for machine learning.

If the picture above looks off-center, it’s because this is the first time since 2015 that the Physics Nobel has been given to two, rather than three, people. Since several past prizes bundled together disparate ideas in order to make a full group of three, it’s noteworthy that this year the committee decided that each of these people deserved 1/2 the prize amount, without trying to find one more person to water it down further.

Hopfield was trained as a physicist, working in the broad area known as “condensed matter physics”. Condensed matter physicists use physics to describe materials, from semiconductors to crystals to glass. Over the years, Hopfield started using this training less for the traditional subject matter of the field and more to study the properties of living systems. He moved from a position in the physics department of Princeton to chemistry and biology at Caltech. While at Caltech he started studying neuroscience and proposed what are now known as Hopfield networks as a model for how neurons store memory. Hopfield networks have very similar properties to a more traditional condensed matter system called a “spin glass”, and from what he knew about those systems Hopfield could make predictions for how his networks would behave. Those networks would go on to be a major inspiration for the artificial neural networks used for machine learning today.
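As a rough sketch of how such a network stores memories (the textbook version, not Hopfield's own code): patterns are written into a symmetric weight matrix, and a corrupted pattern is recalled by letting neurons flip until the network's energy stops dropping.

```python
import numpy as np

# Minimal textbook Hopfield network: Hebbian storage plus asynchronous recall.

def store(patterns: np.ndarray) -> np.ndarray:
    """Store +/-1 patterns (one per row) as a symmetric weight matrix."""
    n = patterns.shape[1]
    w = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(w, 0.0)  # no neuron talks to itself
    return w / n

def recall(w: np.ndarray, state: np.ndarray, sweeps: int = 10) -> np.ndarray:
    """Update neurons one at a time; each flip can only lower the energy,
    so the state settles into the nearest stored memory."""
    state = state.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(state)):
            state[i] = 1 if w[i] @ state >= 0 else -1
    return state
```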

Hinton was not trained as a physicist, and in fact has said that he didn't pursue physics in school because the math was too hard! Instead, he got a bachelor's degree in psychology, and a PhD in the then-nascent field of artificial intelligence. In the 1980's, shortly after Hopfield published his network, Hinton proposed a network inspired by a closely related area of physics, one that describes temperature in terms of the statistics of moving particles. His network, called a Boltzmann machine, would be modified and made more efficient over the years, eventually becoming a key part of how artificial neural networks are “trained”.
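The physics connection is the Boltzmann distribution itself: the machine assigns every configuration of its binary units an “energy”, and the probability of a configuration falls off exponentially with that energy, just as in statistical mechanics (schematic form):

```latex
% Energy of a configuration s of binary units, with weights w_{ij} and biases b_i:
E(s) \;=\; -\sum_{i<j} w_{ij}\, s_i s_j \;-\; \sum_i b_i\, s_i
% Probability of that configuration at "temperature" T:
P(s) \;=\; \frac{e^{-E(s)/T}}{\sum_{s'} e^{-E(s')/T}}
```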

These people obviously did something impressive. Was it physics?

In 2014, the Nobel prize in physics was awarded to the people who developed blue LEDs. Some of these people were trained as physicists, some weren’t: Wikipedia describes them as engineers. At the time, I argued that this was fine, because these people were doing “something physicists are good at”, studying the properties of a physical system. Ultimately, the thing that ties together different areas of physics is training: physicists are the people who study under other physicists, and go on to collaborate with other physicists. That can evolve in unexpected directions, from more mathematical research to touching on biology and social science…but as long as the work benefits from being linked to physics departments and physics degrees, it makes sense to say it “counts as physics”.

By that logic, we can probably call Hopfield’s work physics. Hinton is more uncertain: his work was inspired by a physical system, but so are other ideas in computer science, like simulated annealing. Other ideas, like genetic algorithms, are inspired by biological systems: does that mean they count as biology?

Then there’s the question of the Nobel itself. If you want to get a Nobel in physics, it usually isn’t enough to transform the field. Your idea has to actually be tested against nature. Theoretical physics is its own discipline, with several ideas that have had an enormous influence on how people investigate new theories, ideas which have never gotten Nobels because the ideas were not intended, by themselves, to describe the real world. Hopfield networks and Boltzmann machines, similarly, do not exist as physical systems in the real world. They exist as computer simulations, and it is those computer simulations that are useful. But one can simulate many ideas in physics, and that doesn’t tend to be enough by itself to get a Nobel.

Ultimately, though, I don’t think this way of thinking about things is helpful. The Nobel isn’t capable of being “fair”, there’s no objective standard for Nobel-worthiness, and not much reason for there to be. The Nobel doesn’t determine which new research gets funded, nor does it incentivize anyone (except maybe Brian Keating). Instead, I think the best way of thinking about the Nobel these days is a bit like Disney.

When Disney was young, its movies had to stand or fall on their own merits. Now, with so many iconic movies in its history, Disney movies are received in the context of that history. Movies like Frozen or Moana aren’t just trying to be a good movie by themselves, they’re trying to be a Disney movie, with all that entails.

Similarly, when the Nobel was young, it was just another award, trying to reward things that Alfred Nobel might have thought deserved rewarding. Now, though, each Nobel prize is expected to be “Nobel-like”, an analogy between each laureate and the laureates of the past. When new people are given Nobels the committee is on some level consciously telling a story, saying that these people fit into the prize’s history.

This year, the Nobel committee clearly wanted to say something about AI. There is no Nobel prize for computer science, or even a Nobel prize for mathematics. (Hinton already has the Turing award, the most prestigious award in computer science.) So to say something about AI, the Nobel committee gave rewards in other fields. In addition to physics, this year’s chemistry award went in part to the people behind AlphaFold2, a machine learning tool to predict what shapes proteins fold into. For both prizes, the committee had a reasonable justification. AlphaFold2 genuinely is an amazing advance in the chemistry of proteins, a research tool like nothing that came before. And the work of Hopfield and Hinton did lead ideas in physics to have an enormous impact on the world, an impact that is worth recognizing. Ultimately, though, whether or not these people should have gotten the Nobel doesn’t depend on that justification. It’s an aesthetic decision, one that (unlike Disney’s baffling decision to make live-action remakes of their most famous movies) doesn’t even need to impress customers. It’s a question of whether the action is “Nobel-ish” enough, according to the tastes of the Nobel committee. The Nobel is essentially expensive fanfiction of itself.

And honestly? That’s fine. I don’t think there’s anything else they could be doing at this point.

The Bystander Effect for Reviewers

I probably came off last week as a bit of an extreme “journal abolitionist”. This week, I wanted to give a couple caveats.

First, as a commenter pointed out, the main journals we use in my field are run by nonprofits. Physical Review Letters, the journal where we publish five-page papers about flashy results, is run by the American Physical Society. The Journal of High-Energy Physics, where we publish almost everything else, is run by SISSA, the International School for Advanced Studies in Trieste. (SISSA does use Springer, a regular for-profit publisher, to do the actual publishing.)

The journals are also funded collectively, something I pointed out here before but might not have been obvious to readers of last week’s post. There is an agreement, SCOAP3, where research institutions band together to pay the journals. Authors don’t have to pay to publish, and individual libraries don’t have to pay for subscriptions.

And this is a lot better than the situation in other fields, yeah! Though I’d love to quantify how much. I haven’t been able to find a detailed breakdown, but SCOAP3 pays around 1200 EUR per article published. What I’d like to do (but not this week) is to compare this to what other fields pay, as well as to publishing that doesn’t have the same sort of trapped audience, and to online-only free journals like SciPost. (For example, publishing actual physical copies of journals at this point is sort of a vanity thing, so maybe we should compare costs to vanity publishers?)

Second, there’s reviewing itself. Even without traditional journals, one might still want to keep peer review.

What I wanted to understand last week was what peer review does right now, in my field. We read papers fresh off the arXiv, before they've gone through peer review. Authors aren't forced to update the arXiv with the journal version of their paper: if they want to keep a different version, even one that was rejected by the reviewers, they're free to do so, and most of us wouldn't notice. And the sort of in-depth review that happens in peer review also happens without it. When we have journal clubs and nominate someone to present a recent paper, or when we try to build on a result or figure out why it contradicts something we thought we knew, we go through the same kind of in-depth reading that (in the best cases) reviewers do.

But I think I’ve hit upon something review does that those kinds of informal things don’t. It gets us to speak up about it.

I presented at a journal club recently. I read through a bombastic new paper, figured out what I thought was wrong with it, and explained it to my colleagues.

But did I reach out to the author? No, of course not, that would be weird.

Psychologists talk about the bystander effect. If someone collapses on the street, and you’re the only person nearby, you’ll help. If you’re one of many, you’ll wait and see if someone else helps instead.

I think there’s a bystander effect for correcting people. If someone makes a mistake and publishes something wrong, we’ll gripe about it to each other. But typically, we won’t feel like it’s our place to tell the author. We might get into a frustrating argument, there wouldn’t be much in it for us, and it might hurt our reputation if the author is well-liked.

(People do speak up when they have something to gain, of course. That’s why when you write a paper, most of the people emailing you won’t be criticizing the science: they’ll be telling you you need to cite them.)

Peer review changes the expectations. Suddenly, you're expected to criticize: it's your social role. And you're typically anonymous, so you don't have to worry about the consequences. It becomes a lot easier to say what you really think.

(It also becomes quite easy to say lazy stupid things, of course. This is why I like setups like SciPost, where reviews are made public even when the reviewers are anonymous. It encourages people to put some effort in, and it means that others can see that a paper was rejected for bad reasons and put less stock in the rejection.)

I think any new structure we put in place should keep this feature. We need to preserve some way to designate someone a critic, to give someone a social role that lets them let loose and explain why someone else is wrong. And having these designated critics around does help my field. The good criticisms get implemented in the papers, the authors put the new versions up on arXiv. Reviewing papers for journals does make our science better…even if none of us read the journal itself.

HAMLET-Physics 2024

Back in January, I announced I was leaving France and leaving academia. Since then, it hasn’t made much sense for me to go to conferences, even the big conference of my sub-field or the conference I organized.

I did go to a conference this week, though. I had two excuses:

  1. The conference was here in Copenhagen, so no travel required.
  2. The conference was about machine learning.

HAMLET-Physics, or How to Apply Machine Learning to Experimental and Theoretical Physics, had the additional advantage of having an amusing acronym. Thanks to generous support by Carlsberg and the Danish Data Science Academy, they could back up their choice by taking everyone on a tour of Kronborg (better known in the English-speaking world as Elsinore).

This conference’s purpose was to bring together physicists who use machine learning, machine learning-ists who might have something useful to say to those physicists, and other physicists who don’t use machine learning yet but have a sneaking suspicion they might have to at some point. As a result, the conference was super-interdisciplinary, with talks by people addressing very different problems with very different methods.

Interdisciplinary conferences are tricky. It’s easy for the different groups of people to just talk past each other: everyone shows up, gives the same talk they always do, socializes with the same friends they always meet, then leaves.

There were a few talks that fit that mold, and were so technical that only a few people understood them. But most were better. The majority of the speakers did really well at presenting their work in a way that would be understandable and even exciting to people outside their field, while still having enough detail that we all learned something. I was particularly impressed by Thea Aarestad's keynote talk on Tuesday, a really engaging view of how machine learning can be used under the extremely tight time constraints within which LHC experiments need to decide whether to record incoming data.

For the social aspect, the organizers had a cute/gimmicky/machine-learning-themed solution. Based on short descriptions and our public research profiles, they clustered attendees, plotting the connections between them. They then used ChatGPT to write conversation prompts between any two people on the basis of their shared interests. In practice, this turned out to be amusing but totally unnecessary. We were drawn to speak to each other not by conversation prompts, but by a drive to learn from each other. “Why do you do it that way?” was a powerful conversation-starter, as was “what’s the best way to do this?” Despite the different fields, the shared methodologies gave us strong reasons to talk, and meant that people were very rarely motivated to pick one of ChatGPT’s “suggestions”.

Overall, I got a better feeling for how machine learning is useful in physics (and am planning a post on that in future). I also got some fresh ideas for what to do myself, and a bit of a picture of what the future holds in store.