Monthly Archives: April 2025

Antimatter Isn’t Magic

You’ve heard of antimatter, right?

For each type of particle, there is a rare kind of evil twin with the opposite charge, called an anti-particle. When an anti-proton meets a proton, they annihilate each other in a giant blast of energy.

I see a lot of questions online about antimatter. One recurring theme is people asking a very general question: how does antimatter work?

If you’ve just heard the pop physics explanation, antimatter probably sounds like magic. What about antimatter lets it destroy normal matter? Does it need to touch? How long does it take? And what about neutral particles like neutrons?

You find surprisingly few good explanations of this online, but I can explain why. Physicists like me don’t expect antimatter to be confusing in this way, because to us, antimatter isn’t doing anything all that special. When a particle and an antiparticle annihilate, they’re doing the same thing that any other pair of particles do when they do…basically anything else.

Instead of matter and antimatter, let’s talk about one of the oldest pieces of evidence for quantum mechanics, the photoelectric effect. Scientists shone light at a metal, and found that if the wavelength of the light was short enough, electrons would spring free, causing an electric current. If the wavelength was too long, the metal wouldn’t emit any electrons, no matter how much light they shone. Einstein won his Nobel Prize for the explanation: the light hitting the metal comes in particle-sized pieces, called photons, whose energy is determined by the wavelength of the light, with shorter wavelengths meaning more energy per photon. If the individual photons don’t have enough energy to get an electron to leave the metal, then no electron will escape, no matter how many photons you use.
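To make the threshold concrete, here is a minimal sketch in Python. It uses the standard relation between a photon’s energy and its wavelength, E = hc/λ, and compares that energy to the minimum energy needed to free an electron (the metal’s “work function”). The function names and the work function value of 2.3 eV are illustrative assumptions, not measurements of any particular metal.

```python
# Photon energy from wavelength: E = h * c / wavelength.
PLANCK_EV_S = 4.135667696e-15   # Planck constant, in eV * s
LIGHT_SPEED_M_S = 299_792_458   # speed of light, in m / s


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon, in electron-volts."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_EV_S * LIGHT_SPEED_M_S / wavelength_m


def ejects_electron(wavelength_nm: float, work_function_ev: float) -> bool:
    """True if one photon carries enough energy to free one electron.

    The number of photons never enters: only the energy per photon matters.
    """
    return photon_energy_ev(wavelength_nm) > work_function_ev


# Hypothetical metal: the 2.3 eV work function is an assumed value.
WORK_FUNCTION_EV = 2.3

violet_works = ejects_electron(400, WORK_FUNCTION_EV)  # short wavelength
red_works = ejects_electron(700, WORK_FUNCTION_EV)     # long wavelength
```

With these assumed numbers, 400 nm violet light clears the threshold and 700 nm red light doesn’t, however bright the red light is.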

What happens to the photons after they hit the metal?

They go away. We say they are absorbed: an electron absorbs a photon and speeds up, increasing its kinetic energy so it can escape.

But we could just as easily say the photon is annihilated, if we wanted to.

In the photoelectric effect, you start with one electron and one photon, they come together, and you end up with one electron and no photon. In proton-antiproton annihilation, you start with a proton and an antiproton, they come together, and you end up with no protons or antiprotons, but instead “energy”…which in practice, usually means two photons.

That’s all that happens, deep down at the root of things. The laws of physics are rules about inputs and outputs. Start with these particles, they come together, you end up with these other particles. Sometimes one of the particles stays the same. Sometimes particles seem to transform, and different kinds of particles show up. Sometimes some of the particles are photons, and you think of them as “just energy”, and easy to absorb. But particles are particles, and nothing is “just energy”. Absorption, decay, annihilation: each one is just another type of what we call an interaction.

What makes annihilation of matter and antimatter seem unique comes down to charges. Interactions have to obey the laws of physics: they conserve energy, they conserve momentum, and they conserve charge.

So why can an antiproton and a proton annihilate to pure photons, while two protons can’t? A proton and an antiproton have opposite charges, which add up to zero, and a photon has zero charge. You could combine two protons to make something else, but it would have to have the same charge as two protons.

What about neutrons? A neutron has no electric charge, so you might think it wouldn’t need antimatter. But a neutron has another type of charge, called baryon number. In order to annihilate one, you’d need an anti-neutron, which would still have zero electric charge but would have the opposite baryon number. (By the way, physicists have been making anti-neutrons since 1956.)

On the other hand, photons actually have no charge. Neither do Higgs bosons. So one Higgs boson can become two photons, without annihilating with anything else. Each of these particles can be called its own antiparticle: a photon is also an antiphoton, a Higgs is also an anti-Higgs.
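The bookkeeping in the last few paragraphs can be sketched as a small Python check. It only tallies the two charges mentioned above, electric charge and baryon number, before and after an interaction. It says nothing about energy or momentum, which a real interaction also has to conserve, and the table and function names are my own illustrative choices.

```python
# (electric charge, baryon number) for the particles discussed above.
CHARGES = {
    "proton": (1, 1),
    "antiproton": (-1, -1),
    "neutron": (0, 1),
    "antineutron": (0, -1),
    "photon": (0, 0),
    "higgs": (0, 0),
}


def total_charges(particles):
    """Sum electric charge and baryon number over a list of particle names."""
    electric = sum(CHARGES[name][0] for name in particles)
    baryon = sum(CHARGES[name][1] for name in particles)
    return electric, baryon


def charges_conserved(inputs, outputs):
    """True if both charges match before and after.

    This is a necessary condition only: energy and momentum are not checked.
    """
    return total_charges(inputs) == total_charges(outputs)


two_photons = ["photon", "photon"]
proton_antiproton_ok = charges_conserved(["proton", "antiproton"], two_photons)
two_protons_ok = charges_conserved(["proton", "proton"], two_photons)
two_neutrons_ok = charges_conserved(["neutron", "neutron"], two_photons)
neutron_antineutron_ok = charges_conserved(["neutron", "antineutron"], two_photons)
higgs_ok = charges_conserved(["higgs"], two_photons)
```

Two neutrons pass the electric charge test but fail on baryon number, which is why it takes an anti-neutron to annihilate a neutron. The lone Higgs passes both, with no partner needed.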

Because particle-antiparticle annihilation follows the same rules as other interactions between particles, it also takes place via the same forces. When a proton and an antiproton annihilate each other, they typically do this via the electromagnetic force. This is why you end up with light, which is an electromagnetic wave. Like everything in the quantum world, this annihilation isn’t certain. It has a chance to happen, proportional to the strength of the interaction force involved.

What about neutrinos? They also appear to have a kind of charge, called lepton number. That might not really be a conserved charge, and neutrinos might be their own antiparticles, like photons. However, they are much less likely to be annihilated than protons and antiprotons, because they don’t have electric charge, and thus their interaction doesn’t depend on the electromagnetic force, but on the much weaker weak nuclear force. A weaker force means a less likely interaction.

Antimatter might seem like the stuff of science fiction. But it’s not really harder to understand than anything else in particle physics.

(I know, that’s a low bar!)

It’s just interactions. Particles go in, particles go out. If it follows the rules, it can happen; if it doesn’t, it can’t. Antimatter is no different.

I’ve Felt Like a Hallucinating LLM

ChatGPT and its kin work by using Large Language Models, or LLMs.

A climate model is a pile of mathematics and code, honed on data from the climate of the past. Tell it how the climate starts out, and it will give you a prediction for what happens next.

Similarly, a language model is a pile of mathematics and code, honed on data from the texts of the past. Tell it how a text starts, and it will give you a prediction for what happens next.

We have a rough idea of what a climate model can predict. The climate has to follow the laws of physics, for example. Similarly, a text should follow the laws of grammar, the order of verbs and nouns and so forth. The creators of the earliest, smallest language models figured out how to do that reasonably well.
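To give a feel for “tell it how a text starts, and it will predict what happens next”, here is a toy sketch in Python. It is a bigram counter, which just remembers which word most often followed each word in its training text. This is far simpler than any LLM and is not how ChatGPT works internally; the function names and the sample text are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words followed it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the word that most often followed `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Made-up training text, standing in for "the texts of the past".
TOY_TEXT = "the hero sets out and the hero saves the world and the hero returns home"
model = train_bigram_model(TOY_TEXT)

after_the = predict_next(model, "the")        # "hero" followed "the" most often
after_sets = predict_next(model, "sets")      # only "out" ever followed "sets"
after_dragon = predict_next(model, "dragon")  # never seen, so no prediction
```

Even this tiny model gives the typical continuation rather than a true one: ask it what follows “the” and it says “hero”, because that is what usually came next.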

Texts do more than just follow grammar, though. They can describe the world. And LLMs are both surprisingly good and surprisingly bad at that. They can do a lot when used right, answering test questions most humans would struggle with. But they also “hallucinate”, confidently saying things that have nothing to do with reality.

If you want to understand why large language models make both good predictions and bad, you shouldn’t just think about abstract “texts”. Instead, think about a specific type of text: a story.

Stories follow grammar, most of the time. But they also follow their own logic. The hero sets out, saves the world, and returns home again. The evil queen falls from the tower at the climax of the final battle. There are three princesses, and only the third can break the spell.

We aren’t usually taught this logic, like we’re taught physics or grammar. We learn it from experience, from reading stories and getting used to patterns. It’s the logic, not of how a story must go, but of how a story typically goes. And that question, of what typically comes next, is exactly the question LLMs are designed to answer.

It’s also a question we sometimes answer.

I was a theatre kid, and I loved improv in particular. Some of it was improv comedy, the games and skits you might have seen on “Whose Line Is It Anyway?” But some of it was more…hippy stuff.

I’d meet up with a group on Saturdays. One year we made up a creation myth, half-rehearsed and half-improvised, a collection of gods and primordial beings. The next year we moved the story forward. Civilization had risen…and fallen again. We played a group of survivors gathered around a campfire, wary groups wondering what came next.

We plotted out characters ahead of time. I was the “villain”, or the closest we had to one. An enforcer of the just-fallen empire, the oppressor embodied. While the others carried clubs, staves, and farm implements, I was the only one with a real weapon: a sword.

(Plastic in reality, but the audience knew what to do.)

In the arguments and recriminations of the story, that sword set me apart, a constant threat that turned my character from contemptible to dangerous, that gave me a seat at the table even as I antagonized and stirred the pot.

But the story had another direction. The arguments pushed and pulled, and gradually the survivors realized that they would not survive if they did not put their grievances to rest, if they did not seek peace. So, one man stepped forward, and tossed his staff into the fire.

The others followed. One by one, clubs and sticks and menacing tools were cast aside. And soon, I was the only one armed.

If I had been behaving logically, if I had followed my character’s interests, I would have “won” there. I had gotten what I wanted: now there was no check on my power.

But that wasn’t what the story wanted. Improv is a game of fast decisions and fluid invention. We follow our instincts, and our instincts are shaped by experience. The stories of the past guide our choices, and must often be the only guide: we don’t have time to edit, or to second-guess.

And I felt the story, and what it wanted. It was a command that transcended will, that felt like it left no room for an individual actor making an individual decision.

I cast my sword into the fire.

The instinct that brought me to do that is the same instinct that guides authors when they say that their characters write themselves, when their story goes in an unexpected direction. It’s an instinct that can be tempered and counteracted, with time and effort, because it can easily lead to nonsense. It’s why every good book needs an editor, why improv can be as repetitive as it is magical.

And it’s been the best way I’ve found to understand LLMs.

An LLM telling a story tells a typical story, based on the data used to create it. In the same way, an LLM giving advice gives typical advice, to some extent in content but more importantly in form, advice that is confident and mentions things advice often mentions. An LLM writing a biography will write a typical biography, which may not be your biography, even if your biography was one of those used to create it, because it tries to predict how a biography should go based on all the other biographies. And all of these predictions and hallucinations are very much the kind of snap judgement that disarmed me.

These days, people are trying to build on top of LLMs and make technology that does more, that can edit and check its decisions. For the most part, they’re building these checks out of LLMs. Instead of telling one story, of someone giving advice on the internet, they tell two stories: the advisor and the editor, one giving the advice and one correcting it. They have to tell these stories many times, broken up into many parts, to approximate something other than the improv actor’s first instincts, and that’s why software that does this is substantially more expensive than more basic software that doesn’t.
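The advisor-and-editor setup can be sketched roughly as follows, in Python. The `advisor` and `editor` arguments are hypothetical stand-ins for calls to a language model, not any real product’s API, and the toy versions below are hard-coded so the example runs on its own. The point is only to count how many calls the loop makes compared to a single first-instinct answer.

```python
def run_with_editor(advisor, editor, prompt, max_rounds=3):
    """Draft an answer, then let an editor accept it or ask for a revision.

    Returns the final draft and the number of model calls that were made.
    """
    draft = advisor(prompt)
    calls = 1
    for _ in range(max_rounds):
        complaint = editor(draft)
        calls += 1
        if complaint is None:  # the editor has no objection, so stop
            break
        draft = advisor(f"{prompt}\nRevise to address: {complaint}")
        calls += 1
    return draft, calls


def toy_advisor(prompt):
    # Stand-in for an LLM call: confident at first, hedged when asked to revise.
    if "Revise" in prompt:
        return "You might try restarting it, but I am not certain."
    return "Definitely restart it."


def toy_editor(draft):
    # Stand-in for a second LLM call that flags overconfident wording.
    return "too confident" if "Definitely" in draft else None


answer, calls = run_with_editor(toy_advisor, toy_editor, "My printer is stuck.")
```

In this toy run, one first-instinct answer turns into four calls: a draft, a critique, a revision, and a final check. Real systems repeat this across many parts of a task, which is where the extra cost comes from.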

I can’t say how far they’ll get. Models need data to work well, decisions need reliability to be good, computers need infrastructure to compute. But if you want to understand what’s at an LLM’s beating heart, think about the first instincts you have in writing or in theatre, in stories or in play. Then think about a machine that just does that.

Lambda-CDM Is Not Like the Standard Model

A statistician will tell you that all models are wrong, but some are useful.

Particle physicists have an enormously successful model called the Standard Model, which describes the world in terms of seventeen quantum fields, giving rise to particles from the familiar electron to the challenging-to-measure Higgs boson. The model has nineteen parameters, numbers that aren’t predicted by the model itself but must be found by doing experiments and finding the best statistical fit. With those numbers as input, the model is extremely accurate, aside from the occasional weird discrepancy.

Cosmologists have their own very successful standard model that they use to model the universe as a whole. Called ΛCDM, it describes the universe in terms of three things: dark energy, denoted with a capital lambda (Λ), cold dark matter (CDM), and ordinary matter, all interacting with each other via gravity. The model has six parameters, which must be found by observing the universe and finding the best statistical fit. When those numbers are input, the model is extremely accurate, though there have recently been some high-profile discrepancies.

These sound pretty similar. You model the world as a list of things, fix your parameters based on nature, and make predictions. Wikipedia has a nice graphic depicting the quantum fields of the Standard Model, and you could imagine a similar graphic for ΛCDM.

A graphic like that would be misleading, though.

ΛCDM doesn’t just propose a list of fields and let them interact freely. Instead, it tries to model the universe as a whole, which means it carries assumptions about how matter and energy are distributed, and how space-time is shaped. Some of this is controlled by its parameters, and by tweaking them one can model a universe that varies in different ways. But other assumptions are baked in. If the universe had a very different shape, caused by a very different distribution of matter and energy, then we would need a very different model to represent it. We couldn’t use ΛCDM.

The Standard Model isn’t like that. If you collide two protons together, you need a model of how quarks are distributed inside protons. But that model isn’t the Standard Model: it’s a separate model used for that particular type of experiment. The Standard Model is supposed to be the big picture, the stuff that exists and affects every experiment you can do.

That means the Standard Model is supported in a way that ΛCDM isn’t. The Standard Model describes many different experiments, and is supported by almost all of them. When an experiment disagrees, it has specific implications for part of the model only. For example, neutrinos have mass, which was not predicted by the Standard Model, but it proved easy for people to modify the model to fit. We know the Standard Model is not the full picture, but we also know that any deviations from it must be very small. Large deviations would contradict other experiments, or more basic principles like probabilities needing to be no greater than one.

In contrast, ΛCDM is really just supported by one experiment. We have one universe to observe. We can gather a lot of data, measuring it from its early history to the recent past. But we can’t run it over and over again under different conditions, and our many measurements are all measuring different aspects of the same thing. That’s why unlike in the Standard Model, we can’t separate out assumptions about the shape of the universe from assumptions about what it contains. Dark energy and dark matter are on the same footing as distribution of fluctuations and homogeneity and all those shape-related words, part of one model that gets fit together as a whole.

And so while both the Standard Model and ΛCDM are successful, that success means something different. It’s hard to imagine that we find new evidence and discover that electrons don’t exist, or quarks don’t exist. But we may well find out that dark energy doesn’t exist, or that the universe has a radically different shape. The statistical success of ΛCDM is impressive, and it means any alternative has a high bar to clear. But adopting such an alternative doesn’t have to mean rethinking everything, the way adopting an alternative to the Standard Model would.

I Have a Theory

“I have a theory,” says the scientist in the book. But what does that mean? What does it mean to “have” a theory?

First, there’s the everyday sense. When you say “I have a theory”, you’re talking about an educated guess. You think you know why something happened, and you want to check your idea and get feedback. A pedant would tell you you don’t really have a theory, you have a hypothesis. It’s “your” hypothesis, “your theory”, because it’s what you think happened.

The pedant would insist that “theory” means something else. A theory isn’t a guess, even an educated guess. It’s an explanation with evidence, tested and refined in many different contexts in many different ways, a whole framework for understanding the world, the most solid knowledge science can provide. Despite the pedant’s insistence, that isn’t the only way scientists use the word “theory”. But it is a common one, and a central one. You don’t really “have” a theory like this, though, except in the sense that we all do. These are explanations with broad consensus, things you either know of or don’t. They don’t belong to one person or another.

Except, that is, if one person takes credit for them. We sometimes say “Darwin’s theory of evolution”, or “Einstein’s theory of relativity”. In that sense, we could say that Einstein had a theory, or that Darwin had a theory.

Sometimes, though, “theory” doesn’t mean this standard official definition, even when scientists say it. And that changes what it means to “have” a theory.

For some researchers, a theory is a lens with which to view the world. This happens sometimes in physics, where you’ll find experts who want to think about a situation in terms of thermodynamics, or in terms of a technique called Effective Field Theory. It happens in mathematics, where some choose to analyze an idea with category theory not to prove new things about it, but just to translate it into category theory lingo. It’s most common, though, in the humanities, where researchers often specialize in a particular “interpretive framework”.

For some, a theory is a hypothesis, but also a pet project. There are physicists who come up with an idea (maybe there’s a variant of gravity with mass! maybe dark energy is changing!) and then focus their work around that idea. That includes coming up with ways to test whether the idea is true, showing the idea is consistent, and understanding what variants of the idea could be proposed. These ideas are hypotheses, in that they’re something the scientist thinks could be true. But they’re also ideas with many moving parts that motivate work by themselves.

Taken to the extreme, this kind of “having” a theory can go from healthy science to political bickering. Instead of viewing an idea as a hypothesis you might or might not confirm, it can become a platform to fight for. Instead of investigating consistency and proposing tests, you focus on arguing against objections and disproving your rivals. This sometimes happens in science, especially in more embattled areas, but it happens much more often with crackpots, where people who have never really seen science done can decide it’s time for their idea, right or wrong.

Finally, sometimes someone “has” a theory that isn’t a hypothesis at all. In theoretical physics, a “theory” can refer to a complete framework, even if that framework isn’t actually supposed to describe the real world. Some people spend time focusing on a particular framework of this kind, understanding its properties in the hope of getting broader insights. By becoming an expert on one particular theory, they can be said to “have” that theory.

Bonus question: in what sense do string theorists “have” string theory?

You might imagine that string theory is an interpretive framework, like category theory, with string theorists coming up with the “string version” of things others understand in other ways. This, for the most part, doesn’t happen. Without knowing whether string theory is true, there isn’t much benefit in just translating other things to string theory terms, and people for the most part know this.

For some, string theory is a pet project hypothesis. There is a community of people who try to get predictions out of string theory, or who investigate whether string theory is consistent. It’s not a huge number of people, but it exists. A few of these people can get more combative, or make unwarranted assumptions based on dedication to string theory in particular: for example, you’ll see the occasional argument that because something is difficult in string theory it must be impossible in any theory of quantum gravity. You see a spectrum in the community, from people for whom string theory is a promising project to people for whom it is a position that needs to be defended and argued for.

For the rest, the question of whether string theory describes the real world takes a back seat. They’re people who “have” string theory in the sense that they’re experts, and they use the theory primarily as a mathematical laboratory to learn broader things about how physics works. If you ask them, they might still say that they hypothesize string theory is true. But for most of these people, that question isn’t central to their work.