Understanding Is Translation

Kernighan’s Law states, “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” People sometimes make a similar argument about philosophy of mind: “The attempt of the mind to analyze itself [is] an effort analogous to one who would lift himself by his own bootstraps.”

Both points operate on a shared kind of logic. They picture understanding something as modeling it in your mind, with every detail clear. If you’ve already used all your mind’s power to design code, you won’t be able to model when it goes wrong. And modeling your own mind is clearly nonsense, you would need an even larger mind to hold the model.

The trouble is, this isn’t really how understanding works. To understand something, you don’t need to hold a perfect model of it in your head. Instead, you translate it into something you can more easily work with. Like explanations, these translations can be different for different people.

To understand something, I need to know the algorithm behind it. I want to know how to calculate it, the pieces that go in and where they come from. I want to code it up, to test it out on odd cases and see how it behaves, to get a feel for what it can do.

Others need a more physical picture. They need to know where the particles are going, or how energy and momentum are conserved. They want entropy to be increased, action to be minimized, scales to make sense dimensionally.

Others in turn are more mathematical. They want to start with definitions and axioms. To understand something, they want to see it as an example of a broader class of thing, groups or algebras or categories, to fit it into a bigger picture.

Each of these are a kind of translation, turning something into code-speak or physics-speak or math-speak. They don’t require modeling every detail, but when done well they can still explain every detail.

So while yes, it is good practice not to write code that is too “smart”, and too hard to debug…it’s not impossible to debug your smartest code. And while you can’t hold an entire mind inside of yours, you don’t actually need to do that to understand the brain. In both cases, all you need is a translation.

A Scale of “Sure-Thing-Ness” for Experiments

No experiment is a sure thing. No matter what you do, what you test, what you observe, there’s no guarantee that you find something new. Even if you do your experiment correctly and measure what you planned to measure, nature might not tell you anything interesting.

Still, some experiments are more sure than others. Sometimes you’re almost guaranteed to learn something, even if it wasn’t what you hoped, while other times you just end up back where you started.

The first, and surest, type of experiment, is a voyage into the unknown. When nothing is known about your target, no expectations, and no predictions, then as long as you successfully measure anything you’ll have discovered something new. This can happen if the thing you’re measuring was only recently discovered. If you’re the first person who manages to measure the reaction rates of an element, or the habits of an insect, or the atmosphere of a planet, then you’re guaranteed to find out something you didn’t know before.

If you don’t have a total unknown to measure, then you want to test a clear hypothesis. The best of these are the theory killers, experiments which can decisively falsify an idea. History’s most famous experiments take this form, like the measurement of the perihelion of Mercury to test General Relativity or Pasteur’s tests of spontaneous generation. When you have a specific prediction and not much wiggle room, an experiment can teach you quite a lot.

“Not much wiggle room” is key, because these tests can all to easily become theory modifiers instead. If you can tweak your theory enough, then your experiment might not be able to falsify it. Something similar applies when you have a number of closely related theories. Even if you falsify one, you can just switch to another similar idea. In those cases, testing your theory won’t always teach you as much: you have to get lucky and see something that pins your theory down more precisely.

Finally, you can of course be just looking. Some experiments are just keeping an eye out, in the depths of space or the precision of quantum labs, watching for something unexpected. That kind of experiment might never see anything, and never rule anything out, but they can still sometimes be worthwhile.

There’s some fuzziness to these categories, of course. Often when scientists argue about whether an experiment is worth doing they’re arguing about which category to place it in. Would a new collider be a “voyage into the unknown” (new energy scales we’ve never measured before), a theory killer/modifier (supersymmetry! but which one…) or just “just looking”? Is your theory of cosmology specific enough to be “killed”, or merely “modified”? Is your wacky modification of quantum mechanics something that can be tested, or merely “just looked” for?

For any given experiment, it’s worth keeping in mind what you expect, and what would happen if you’re wrong. In science, we can’t do every experiment we want. We have to focus our resources and try to get results. Even if it’s never a sure thing.

Math Is the Art of Stating Things Clearly

Why do we use math?

In physics we describe everything, from the smallest of particles to the largest of galaxies, with the language of mathematics. Why should that one field be able to describe so much? And why don’t we use something else?

The truth is, this is a trick question. Mathematics isn’t a language like English or French, where we can choose whichever translation we want. We use mathematics because it is, almost by definition, the best choice. That is because mathematics is the art of stating things clearly.

An infinite number of mathematicians walk into a bar. The first orders a beer. The second orders half a beer. The third orders a quarter. The bartender stops them, pours two beers, and says “You guys should know your limits.”

That was an (old) joke about infinite series of numbers. You probably learned in high school that if you add up one plus a half plus a quarter…you eventually get two. To be a bit more precise:

$\sum_{i=0}^\infty \frac{1}{2^i} = 1+\frac{1}{2}+\frac{1}{4}+\ldots=2$

We say that this infinite sum limits to two.

But what does it actually mean for an infinite sum to limit to a number? What does it mean to sum infinitely many numbers, let alone infinitely many beers ordered by infinitely many mathematicians?

You’re asking these questions because I haven’t yet stated the problem clearly. Those of you who’ve learned a bit more mathematics (maybe in high school, maybe in college) will know another way of stating it.

You know how to sum a finite set of beers. You start with one beer, then one and a half, then one and three-quarters. Sum $N$ beers, and you get

$\sum_{i=0}^N \frac{1}{2^i}$

What does it mean for the sum to limit to two?

Let’s say you just wanted to get close to two. You want to get $\epsilon$ close, where epsilon is the Greek letter we use for really small numbers.

For every $\epsilon>0$ you choose, no matter how small, I can pick a (finite!) $N$ and get at least that close. That means that, with higher and higher $N$, I can get as close to two as a I want.

As it turns out, that’s what it means for a sum to limit to two. It’s saying the same thing, but more clearly, without sneaking in confusing claims about infinity.

These sort of proofs, with $\epsilon$ (and usually another variable, $\delta$) form what mathematicians view as the foundations of calculus. They’re immortalized in story and song.

And they’re not even the clearest way of stating things! Go down that road, and you find more mathematics: definitions of numbers, foundations of logic, rabbit holes upon rabbit holes, all from the effort to state things clearly.

That’s why I’m not surprised that physicists use mathematics. We have to. We need clarity, if we want to understand the world. And mathematicians, they’re the people who spend their lives trying to state things clearly.

The Teaching Heuristic for Non-Empirical Science

Science is by definition empirical. We discover how the world works not by sitting and thinking, but by going out and observing the world. But sometimes, all the observing we can do can’t possibly answer a question. In those situations, we might need “non-empirical science”.

The blog Slate Star Codex had a series of posts on this topic recently. He hangs out with a crowd that supports the many-worlds interpretation of quantum mechanics: the idea that quantum events are not truly random, but instead that all outcomes happen, the universe metaphorically splitting into different possible worlds. These metaphorical universes can’t be observed, so no empirical test can tell the difference between this and other interpretations of quantum mechanics: if we could ever know the difference, it would have to be for “non-empirical” reasons.

What reasons are those? Slate Star Codex teases out a few possible intuitions. He points out that we reject theories that have “unnecessary” ideas. He imagines a world where chemists believe that mixing an acid and a base also causes a distant star to go supernova, and a creationist world where paleontologists believe fossils are placed by the devil. In both cases, there might be no observable difference between their theories and ours, but because their theories have “extra pieces” (the distant star, the devil), we reject them for non-empirical reasons. Slate Star Codex asks if this supports many-worlds: without the extra assumption that quantum events randomly choose one outcome, isn’t quantum mechanics simpler?

I agree with some of this. Science really does use non-empirical reasoning. Without it, there’s no reason not to treat the world as a black box, a series of experiments with no mechanism behind it. But while we do reject theories with unnecessary ideas, that isn’t our only standard. We also need our theories to teach us about the world.

Ultimately, we trust science because it allows us to do things. If we understand the world, we can interact with it: we can build technology, design new experiments, and propose new theories. With this in mind, we can judge scientific theories by how well they help us do these things. A good scientific theory is one that gives us more power to interact with the world. It can do this by making correct predictions, but it can also do this by explaining things, making it easier for us to reason about them. Beyond empiricism, we can judge science by how well it teaches us.

This gives us an objection to the “supernova theory” of Slate Star Codex’s imagined chemists: it’s much more confusing to teach. To teach chemistry in that world you also have to teach the entire life cycle of stars, a subject that students won’t use in any other part of the course. The creationists’ “devil theory” of paleontology has the same problem: if their theory really makes the right predictions they’d have to teach students everything our paleontologists do: every era of geologic history, every theory of dinosaur evolution, plus an extra course in devil psychology. They end up with a mix that only makes it harder to understand the subject.

Many-worlds may seem simpler than other interpretations of quantum mechanics, but that doesn’t make it more useful, or easier to teach. You still need to teach students how to predict the results of experiments, and those results will still be random. If you teach them many-worlds, you need to add more discussion much earlier on, advanced topics like self-localizing uncertainty and decoherence. You need a quite extensive set of ideas, many of which won’t be used again, to justify rules another interpretation could have introduced much more simply. This would be fine if those ideas made additional predictions, but they don’t: like every interpretation of quantum mechanics, you end up doing the same experiments and building the same technology in the end.

I’m not saying I know many-worlds is false, or that I know another interpretation is true. All I’m saying is that, when physicists criticize many-worlds, they’re not just blindly insisting on empiricism. They’re rejecting many-worlds, in part, because all it does is make their work harder. And that, more than elegance or simplicity, is how we judge theories.

QCD and Reductionism: Stranger Than You’d Think

Earlier this year, I made a list of topics I wanted to understand. The most abstract and technical of them was something called “Wilsonian effective field theory”. I still don’t understand Wilsonian effective field theory. But while thinking about it, I noticed something that seemed weird. It’s something I think many physicists already understand, but that hasn’t really sunk in with the public yet.

There’s an old problem in particle physics, described in many different ways over the years. Take our theories and try to calculate some reasonable number (say, the angle an electron turns in a magnetic field), and instead of that reasonable number we get infinity. We fix this problem with a process called renormalization that hides that infinity away, changing the “normalization” of some constant like a mass or a charge. While renormalization first seemed like a shady trick, physicists eventually understood it better. First, we thought of it as a way to work around our ignorance, that the true final theory would have no infinities at all. Later, physicists instead thought about renormalization in terms of scaling.

Imagine looking at the world on a camera screen. You can zoom in, or zoom out. The further you zoom out, the more details you’ll miss: they’re just too small to be visible on your screen. You can guess what they might be, but your picture will be different depending on how you zoom.

In particle physics, many of our theories are like that camera. They come with a choice of “zoom setting”, a minimum scale where they still effectively tell the whole story. We call theories like these effective field theories. Some physicists argue that these are all we can ever have: since our experiments are never perfect, there will always be a scale so small we have no evidence about it.

In general, theories can be quite different at different scales. Some theories, though, are especially nice: they look almost the same as we zoom in to smaller scales. The only things that change are the mass of different particles, and their charges.

One theory like this is Quantum Chromodynamics (or QCD), the theory of quarks and gluons. Zoom in, and the theory looks pretty much the same, with one crucial change: the force between particles get weaker. There’s a number, called the “coupling constant“, that describes how strong a force is, think of it as sort of like an electric charge. As you zoom in to quarks and gluons, you find you can still describe them with QCD, just with a smaller coupling constant. If you could zoom “all the way in”, the constant (and thus the force between particles) would be zero.

This makes QCD a rare kind of theory: one that could be complete to any scale. No matter how far you zoom in, QCD still “makes sense”. It never gives contradictions or nonsense results. That doesn’t mean it’s actually true: it interacts with other forces, like gravity, that don’t have complete theories, so it probably isn’t complete either. But if we didn’t have gravity or electricity or magnetism, if all we had were quarks and gluons, then QCD could have been the final theory that described them.

And this starts feeling a little weird, when you think about reductionism.

Philosophers define reductionism in many different ways. I won’t be that sophisticated. Instead, I’ll suggest the following naive definition: Reductionism is the claim that theories on larger scales reduce to theories on smaller scales.

Here “reduce to” is intentionally a bit vague. It might mean “are caused by” or “can be derived from” or “are explained by”. I’m gesturing at the sort of thing people mean when they say that biology reduces to chemistry, or chemistry to physics.

What happens when we think about QCD, with this intuition?

QCD on larger scales does indeed reduce to QCD on smaller scales. If you want to ask why QCD on some scale has some coupling constant, you can explain it by looking at the (smaller) QCD coupling constant on a smaller scale. If you have equations for QCD on a smaller scale, you can derive the right equations for a larger scale. In some sense, everything you observe in your larger-scale theory of QCD is caused by what happens in your smaller-scale theory of QCD.

But this isn’t quite the reductionism you’re used to. When we say biology reduces to chemistry, or chemistry reduces to physics, we’re thinking of just a few layers: one specific theory reduces to another specific theory. Here, we have an infinite number of layers, every point on the scale from large to small, each one explained by the next.

Maybe you think you can get out of this, by saying that everything should reduce to the smallest scale. But remember, the smaller the scale the smaller our “coupling constant”, and the weaker the forces between particles. At “the smallest scale”, the coupling constant is zero, and there is no force. It’s only when you put your hand on the zoom nob and start turning that the force starts to exist.

It’s reductionism, perhaps, but not as we know it.

Now that I understand this a bit better, I get some of the objections to my post about naturalness a while back. I was being too naive about this kind of thing, as some of the commenters (particularly Jacques Distler) noted. I believe there’s a way to rephrase the argument so that it still works, but I’d have to think harder about how.

I also get why I was uneasy about Sabine Hossenfelder’s FQXi essay on reductionism. She considered a more complicated case, where the chain from large to small scale could be broken, a more elaborate variant of a problem in Quantum Electrodynamics. But if I’m right here, then it’s not clear that scaling in effective field theories is even the right way to think about this. When you have an infinite series of theories that reduce to other theories, you’re pretty far removed from what most people mean by reductionism.

Finally, this is the clearest reason I can find why you can’t do science without an observer. The “zoom” is just a choice we scientists make, an arbitrary scale describing our ignorance. But without it, there’s no way to describe QCD. The notion of scale is an inherent and inextricable part of the theory, and it doesn’t have to mean our theory is incomplete.

Experts, please chime in here if I’m wrong on the physics here. As I mentioned at the beginning, I still don’t think I understand Wilsonian effective field theory. If I’m right though, this seems genuinely weird, and something more of the public should appreciate.

The Changing Meaning of “Explain”

This is another “explanations are weird” post.

I’ve been reading a biography of James Clerk Maxwell, who formulated the theory of electromagnetism. Nowadays, we think about the theory in terms of fields: we think there is an “electromagnetic field”, filling space and time. At the time, though, this was a very unusual way to think, and not even Maxwell was comfortable with it. He felt that he had to present a “physical model” to justify the theory: a picture of tiny gears and ball bearings, somehow occupying the same space as ordinary matter.

Maxwell didn’t think space was literally filled with ball bearings. He did, however, believe he needed a picture that was sufficiently “physical”, that wasn’t just “mathematics”. Later, when he wrote down a theory that looked more like modern field theory, he still thought of it as provisional: a way to use Lagrange’s mathematics to ignore the unknown “real physical mechanism” and just describe what was observed. To Maxwell, field theory was a description, but not an explanation.

This attitude surprised me. I would have thought physicists in Maxwell’s day could have accepted fields. After all, they had accepted Newton.

In his time, there was quite a bit of controversy about whether Newton’s theory of gravity was “physical”. When rival models described planets driven around by whirlpools, Newton simply described the mathematics of the force, an “action at a distance”. Newton famously insisted hypotheses non fingo, “I feign no hypotheses”, and insisted that he wasn’t saying anything about why gravity worked, merely how it worked. Over time, as the whirlpool models continued to fail, people gradually accepted that gravity could be explained as action at a distance.

You’d think that this would make them able to accept fields as well. Instead, by Maxwell’s day the options for a “physical explanation” had simply been enlarged by one. Now instead of just explaining something with mechanical parts, you could explain it with action at a distance as well. Indeed, many physicists tried to explain electricity and magnetism with some sort of gravity-like action at a distance. They failed, though. You really do need fields.

The author of the biography is an engineer, not a physicist, so I find his perspective unusual at times. After discussing Maxwell’s discomfort with fields, the author says that today physicists are different: instead of insisting on a physical explanation, they accept that there are some things they just cannot know.

At first, I wanted to object: we do have physical explanations, we explain things with fields! We have electromagnetic fields and electron fields, gluon fields and Higgs fields, even a gravitational field for the shape of space-time. These fields aren’t papering over some hidden mechanism, they are the mechanism!

Are they, though?

Fields aren’t quite like the whirlpools and ball bearings of historical physicists. Sometimes fields that look different are secretly the same: the two “different explanations” will give the same result for any measurement you could ever perform. In my area of physics, we try to avoid this by focusing on the measurements instead, building as much as we can out of observable quantities instead of fields. In effect we’re going back yet another layer, another dose of hypotheses non fingo.

Physicists still ask for “physical explanations”, and still worry that some picture might be “just mathematics”. But what that means has changed, and continues to change. I don’t think we have a common standard right now, at least nothing as specific as “mechanical parts or action at a distance, and nothing else”. Somehow, we still care about whether we’ve given an explanation, or just a description, even though we can’t define what an explanation is.

In Life and in Science, Test

Think of a therapist, and you might picture a pipe-smoking Freudian, interrogating you about repressed feelings. These days, you’re more likely to meet a more modern form of therapy, like cognitive behavioral therapy (or CBT for short). CBT focuses on correcting distorted thoughts and maladaptive behaviors: basically, helping you reason through your problems. It’s supposed to be one of the types of therapy that has the most actual scientific evidence behind it.

What impresses me about CBT isn’t just the scientific evidence for it, but the way it tries to teach something like a scientific worldview. If you’re depressed or anxious, a common problem is obsessive thoughts about what others think of you. Maybe you worry that everyone is just putting up with you out of pity, or that you’re hopelessly behind your peers. For many scientists, these are familiar worries.

The standard CBT advice for these worries is as obvious as it is scary: if you worry what others think of you, ask!

This is, at its heart, a very scientific thing to do. If you’re curious about something, and you can test it, just test it! Of course, there are risks to doing this, both in your personal life and in your science, but typical CBT advice applies surprisingly well to both.

If you constantly ask your friends what they think about you, you end up annoying them. Similarly, if you perform the same experiment over and over, you can keep going until you get the result you want. In both cases, the solution is to commit to trusting your initial results: just like scientists pre-registering a study, if you ask your friends what they think you need to trust them and not second-guess what they say. If they say they’re happy with you, trust that. If they criticize, take their criticism seriously and see if you can improve.

Even then, you may be tempted to come up with reasons why you can’t trust what your friends say. You’ll come up with reasons why they might be forced to be polite, while they secretly still hate you. Similarly, as a scientist you can always come up with theories that get around the evidence: no matter what you observe, a complicated enough chain of logic can make it consistent with anything you want. In both cases, the solution is a dose of Occam’s Razor: don’t fixate on an extremely complicated explanation when a simpler one already fits. If your friends say they like you, they probably do.